Figure 1.
(A) The framework of ViraLM. The input sequence is tokenized and fed into the transformer block. The binary classification layer then aggregates the output of the transformer block to generate the final prediction. (B) Performance of each tool on contigs of various lengths, where the negative samples consist only of prokaryotic sequences (bacteria, archaea, plasmids). (C) Comparison of performance on prokaryotic (bacteria, archaea, plasmids) and eukaryotic (fungi, protozoa, insects, bats, and humans) genomes. (D) Performance of each tool in distinguishing viruses from eukaryotic contigs (fungi, protozoa, insects, bats, and humans). (E) Sensitivity of virus identification on contigs with various protein densities, grouped by contig length. X-axis: percentage of identified viruses (sensitivity). Y-axis: number of proteins.

