Model name | Model size | Model task | Model name | Model size | Model task
---|---|---|---|---|---
BioBERT | 110 M/340 M | Biomedical text mining (NER, RE, QA) | ProGen | 1.2B | Stability prediction, remote homology detection, secondary structure prediction |
BioELECTRA | 109 M | Biomedical text mining (NER, RE, QA) | ProGen2 | 6.4B | Functional sequence generation, protein fitness prediction |
BLURB | Unknown | Biomedical NLP benchmark (QA, NER, parsing, etc.) | CLAPE-DB | Unknown | Protein–ligand-binding site prediction |
BioBART | 139 M/400 M | Biomedical text generation (dialogue, summarization, NER) | Geneformer | 30 M | Sequence-based prediction |
Med-PaLM | 12B/84B/562B | Medical question answering | scGPT | Unknown | Multibatch integration, multi-omic integration, cell-type annotation, genetic perturbation prediction, gene network inference |
MSA | 30 M/100 M | Arabic NLP tasks (NER, POS tagging, sentiment analysis, etc.) | ESM-1b | 650 M | Supervised prediction of mutational effect and secondary structure |
GMAI | Unknown | Generalist medical AI (multimodal tasks) | AlphaFold2 | 21 M | Protein structure prediction |
DNABERT | 110 M | DNA sequence prediction (promoters, TFBSs, splice sites) | AlphaFold3 | 93 M | Protein structure prediction, structure of protein–protein interaction prediction |
Enformer | Unknown | Gene expression prediction | RGN2 | 110 M | Protein design and analysis of allelic variation or disease mutations |
HyenaDNA | 7 M | Genomic sequence modeling (regulatory elements, chromatin profiles) | Uni-Mol | 1.1B | 3D position recovery, masked atom prediction, molecular property prediction |
Nucleotide Transformer | 500 M ~ 2.5B | DNA sequence analysis | RNA-FM | 99.52 M | RNA secondary structure prediction, distance regression task |
ProteinBERT | 16 M | Bidirectional language modeling of protein sequences, Gene Ontology (GO) annotation prediction | UNI-RNA | 25 M/85 M/169 M/400 M | RNA structure and function prediction |
ProtGPT | 1.6 M/25.2 M | Protein sequence generation | RNA-MSM | Unknown | RNA structure and function prediction |
ProtGPT2 | 738 M | Protein sequence generation, structural similarity detection, stability prediction | Bingo | 8 ~ 15 M | Filling in randomly masked amino acids, generating residue-level feature matrix and protein contact map |
xTrimoPGLM | 100B | Protein understanding and generation | scFoundation | 100 M | Gene expression enhancement, tissue drug response prediction, single-cell drug response classification, single-cell perturbation prediction |
DNABERT-2 | 117 M | DNA sequence prediction | scHyena | Unknown | Cell type classification, scRNA-seq imputation |
scBERT | Unknown | Single-cell RNA sequencing analysis | ProtST | 650 M | Unimodal mask prediction, multimodal representation alignment, multimodal mask prediction |