A comprehensive review and performance evaluation of bioinformatics tools for HLA class I peptide-binding prediction
