Abstract

Major histocompatibility complex (MHC) class II molecules play a pivotal role in antigen presentation and CD4+ T cell responses. Accurate prediction of the immunogenicity of MHC class II-associated antigens is critical for vaccine design and cancer immunotherapy. However, current computational methods are limited by insufficient training data and algorithmic constraints, and the rules that govern which peptides are truly recognized by existing T cell receptors remain poorly understood. Here, we build a transfer learning-based long short-term memory model, named ‘TLimmuno2’, to predict whether an epitope-MHC class II complex can elicit a T cell response. By leveraging binding affinity data, TLimmuno2 shows superior performance compared with existing models on independent validation datasets. TLimmuno2 can identify truly immunogenic neoantigens in real-world cancer immunotherapy data. The identification of a significant MHC class II neoantigen-mediated immunoediting signal in The Cancer Genome Atlas pan-cancer dataset further suggests the robustness of TLimmuno2 in identifying truly immunogenic neoantigens that are undergoing negative selection during cancer evolution. Overall, TLimmuno2 is a powerful tool for predicting the immunogenicity of MHC class II-presented epitopes and could promote the development of personalized immunotherapies.

INTRODUCTION

Major histocompatibility complex (MHC) class II molecules play a pivotal role in the adaptive immune system. They are glycoprotein complexes on the surface of professional antigen-presenting cells that display short antigenic peptides to CD4+ T cells. Genome alterations are major driving forces of human cancer evolution, and neoantigens derived from cancer genome alterations are ideal cancer-specific drug targets. CD4+ T cell recognition of neoantigens has been observed across diverse human tumor types and in animal models [1–4]. Evidence also indicates that CD4+ T cell responses to MHC II-restricted neoantigens are required for robust responses to immune-checkpoint inhibitors [5] and that neoantigen vaccines can enhance CD4+ T cell responses [6]. These results underscore the clinical relevance of MHC II-restricted neoepitopes for cancer immunotherapy.

However, identifying CD4+ T cell epitopes with sufficient immunogenicity is still a technical bottleneck. Many studies have demonstrated that deep learning can be used to address this problem [7–9]. Accurate prediction of immunogenicity depends mainly on two aspects: whether the peptide can bind to the MHC molecule and whether the peptide–MHC (pMHC) complex can elicit an immune response. NetMHCIIpan [10, 11] and MixMHC2pred [12] were developed to predict the binding affinity (BA) between MHC and peptides. By using mass spectrometry datasets of peptides eluted from MHC II together with appropriate deep learning algorithms, both achieve strong performance in pMHC BA prediction. However, pMHC II BA prediction alone is not sufficient to infer immunogenicity, and several tools have been developed to address this gap. The IEDB online tool (CD4episcore) [13] was developed to predict MHC allele-independent CD4+ T cell immunogenicity at the population level. A CNN-long short-term memory (LSTM) model, named Deepitope [14], was later developed and showed improved performance compared with the IEDB online tool. Repitope [15] is a framework that computes physicochemical properties by mimicking the thermodynamics of interactions between pMHC and public T cell receptors (TCRs) to discriminate immunogenicity. However, owing to the lack of sufficient training data, these tools cannot accurately quantify the ability of a pMHC II complex to elicit T cell responses.

In this study, we employed transfer learning to train a model, named TLimmuno2, which can accurately predict the immunogenicity of pMHC class II complexes. Rather than relying solely on existing pMHC II immunogenicity data, TLimmuno2 transfers pMHC II BA information to predict the immunogenicity of the peptide, because the interaction between the MHC and the peptide is required to provoke T cell responses. In addition, far more data are available for pMHC II affinity than for pMHC II immunogenicity [16]. Training on a large dataset first and then transferring the model to a smaller dataset can effectively improve the generalization ability of the model [17]. TLimmuno2 was applied to human tumor sequencing data, and the existence of a human leukocyte antigen (HLA) II neoantigen-mediated immunoediting signal was demonstrated in The Cancer Genome Atlas (TCGA) pan-cancer dataset. Overall, TLimmuno2 addresses the long-standing problem of predicting the immunogenicity of MHC II peptides and can serve as a useful tool for precision cancer immunotherapy.

RESULTS

Development of TLimmuno2

We divided the task of learning MHC class II peptide immunogenicity into two steps. First, we trained a pMHC II BA model using an LSTM network to obtain the latent embedding of the pMHC complex. Second, we transferred the BA model, constructed additional LSTM networks and used their outputs as input to a deep neural network (DNN). We employed fine-tuning to finalize the prediction model for the MHC II immunogenicity of peptides. The details of the model structure are shown in Figure 1A and in Methods.

Figure 1. The TLimmuno2 model. (A) The architecture of the TLimmuno2 model. (B) The ROC curves of the peptide-HLA II BA model, NetMHCIIpan and MixMHC2pred on the independent dataset from Andreatta et al. (C) Summary of the data used to develop the TLimmuno2 model. (D) Allelic coverage in the immunogenic and non-immunogenic peptides tested in cancer neoepitope studies for the 15 most frequent HLA-II alleles. ‘Other’ indicates the cumulative frequency of the remaining alleles.

To integrate BA with the other parts of the model, we first constructed the BA model using a deep LSTM neural network. The same BA data used for training NetMHCIIpan [11] were used to train our model; these data comprise 107 008 measurements of pMHC BA covering 71 types of human class II MHCs, and we transformed BA into a binary label (positive or negative; see Methods or Supplementary Table S1). More details of these data are shown in Supplementary Figure S1. The inputs of this model are the peptide sequence and the MHC sequence. The output is a continuous score used to infer whether the peptide binds to the MHC molecule. We used 5-fold cross-validation, and the training results are shown in Supplementary Figure S2A and B. Although the output layer of the BA model is dedicated to predicting peptide-MHC binding, the layers before the output should contain important biological information regarding the interactions between the peptide and MHC. We used an independent dataset from Andreatta et al. [18], which contains 10 399 data points, to compare the performance of the BA model, NetMHCIIpan and MixMHC2pred (Supplementary Table S2). The area under the curve (AUC) of the receiver operating characteristic (ROC) is used as the evaluation index for model comparisons. The BA model (AUC: 0.7742) performs better than MixMHC2pred (AUC: 0.6482) and comparably to NetMHCIIpan (AUC: 0.7965) (Figure 1B). Using the IEDB-recommended thresholds of 2% rank and 0.426 (500 nM, log50k-transformed; see Methods), we find that the BA model shows improved performance on the F1 score and the distance to the optimal point [19, 20] (Supplementary Figure S2C). We also show the results for different peptide lengths and different MHC II alleles (Supplementary Figure S2D and E). These results show that our BA model can make robust, reliable pMHC II BA predictions. Therefore, we can transfer the BA model (with its last output layer removed), and the features encoded by this model can be used to represent the interaction between peptide and MHC and subsequently be applied in immunogenicity prediction.

We then used transfer learning to construct our immunogenicity prediction model. We compiled a dataset of 6408 peptides (3930 immunogenic and 2478 non-immunogenic) derived from the Immune Epitope Database (IEDB) [16] that were experimentally tested for T cell immunogenicity in multiple studies in humans (Methods, Supplementary Table S3). Approximately twice as many peptides (12 793), randomly selected from the human proteome, were added to the training set as negatives to better match the real situation in which non-immunogenic peptides greatly outnumber immunogenic ones (Figure 1C). We observe similar MHC II allele coverage in the immunogenic and non-immunogenic peptides (Figure 1D). Details of the data cleaning process are shown in Methods and Supplementary Figure S3. We used two additional multi-layer LSTMs to encode MHC and peptide information, respectively, and concatenated their outputs with the latent embedding (10-dimensional) of the pMHC complex obtained from the pre-trained BA model. This vector was then fed into a three-layer fully connected neural network to obtain the final immunogenicity prediction.

We named the final transfer learning-based MHC II immunogenicity prediction network ‘TLimmuno2’. TLimmuno2 outputs a score between 0 and 1 as well as an immunogenicity percentile rank. For each pMHC II pair, the percentile rank is computed against a dataset of 90 000 13- to 21-mer peptides (10 000 for each length) randomly selected from the human proteome. A smaller rank or a higher score indicates stronger immunogenicity.
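The percentile rank described above can be sketched as follows. This is a minimal illustration and not the authors' implementation: it assumes the rank is the percentage of background peptides whose score is at least as high as the query score, and the function name `percentile_rank` and the tiny background list are hypothetical.

```python
from bisect import bisect_left


def percentile_rank(score, sorted_background):
    """Percentage of background scores that are >= `score` (lower = stronger).

    Assumption: the rank definition is inferred from the text, which only
    states that a smaller rank indicates stronger immunogenicity.
    """
    if not sorted_background:
        raise ValueError("background score list is empty")
    n_lower = bisect_left(sorted_background, score)  # scores strictly below
    n_total = len(sorted_background)
    return 100.0 * (n_total - n_lower) / n_total


# Hypothetical background: model scores of random proteome peptides
# for one MHC II allele (the paper uses 90 000 peptides per pMHC II pair).
background = sorted([0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0])
print(percentile_rank(0.95, background))  # 10.0
```

Sorting the background once and using binary search keeps each lookup logarithmic in the background size, which matters when many pMHC II pairs are ranked against the same background.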

TLimmuno2 outperforms existing methods in pMHC II immunogenicity prediction

We first divided the IEDB dataset into training and testing sets at a ratio of 9:1. To verify the stability of the model, 5-fold cross-validation was applied in the training process of TLimmuno2. We used the output score to calculate the ROC, and TLimmuno2 shows a highly stable AUC (average: 0.8727) across the validation folds (Figure 2A). TLimmuno2 reaches AUC = 0.9909 in the training set and AUC = 0.8649 in the testing set (Figure 2B). Transfer learning has proven its value in multiple studies [9], and we set out to test whether TLimmuno2 achieves higher predictive accuracy than existing methods. We therefore first compared TLimmuno2’s performance with the existing methods IEDB tool (CD4episcore), NetMHCIIpan BA/EL and Repitope in the testing set. Some published methods, such as Deepitope [14] and FIONA [21], do not have publicly available tools, so they are not included in the subsequent comparison. MixMHC2pred showed poor performance compared with the other tools and was likewise not considered in the subsequent analysis. As shown in Figure 2C, TLimmuno2 improves on existing tools by a large margin.

Figure 2. TLimmuno2 produces stable predictions and outperforms existing methods. (A) The ROC curves of TLimmuno2 on 5-fold cross-validation. (B) The ROC curves of TLimmuno2 on training process. (C) The ROC curve of TLimmuno2, IEDB (CD4episcore), Repitope, NetMHCIIpan BA and EL in the testing set. (D) The ROC curve of TLimmuno2, IEDB (CD4episcore), Repitope, NetMHCIIpan BA and EL in the neoepitope dataset by using the 15-mer method. (E) The ROC curve of TLimmuno2, NetMHCIIpan BA and EL in the neoepitope dataset by using the k-mer method.

To further compare the performance of TLimmuno2 with existing methods in predicting immunogenic peptides, we searched the published literature to collect neoepitopes that were tested experimentally for CD4+ T cell immunogenicity [12] (Supplementary Table S4). Experimentally tested neoepitopes were usually sequences of 20–25 amino acids [10, 12], so we used the 15-mer and k-mer methods (Methods, Supplementary Figure S4) to assign a score to a given neoepitope. The IEDB tool and Repitope are evaluated only with the 15-mer method and not with the k-mer method, because the IEDB tool only allows one to submit peptides with a length greater than 15 and the feature calculation process of Repitope is time-consuming and resource-intensive, making it impractical for the k-mer method, which covers >200 000 pMHC combinations. In these comparisons, TLimmuno2 shows the best performance with both the 15-mer method and the k-mer method (Figure 2D and E). Because the neoepitope dataset is imbalanced (Supplementary Figure S5C), the AUC of the ROC may not be sufficient to quantify the performance of the prediction models [22]. We therefore also compared model performance using the AUC of the precision-recall (PR) curve, and TLimmuno2 shows improved performance compared with existing methods in these additional analyses (Supplementary Figure S5A and B).

The model architecture has a significant impact on TLimmuno2

To further investigate the importance of model structure to TLimmuno2 performance, we built models with other architectures and compared their performance. We constructed four traditional machine learning classification models [AdaBoost, K Nearest Neighbor (KNN), Random Forest (RF) and support vector machine (SVM)] and tuned their hyperparameters by cross-validation. In addition, we constructed DNN and convolutional neural network (CNN) deep learning models. All models were trained using the same data and validated on our curated neoepitope dataset. As Figure 3A shows, TLimmuno2 outperforms the DNN, the CNN and the traditional machine learning methods, indicating that our model structure is better suited to the problem of immunogenicity prediction.

Figure 3. The model architecture has a significant impact on TLimmuno2. (A) AUC values of TLimmuno2, AdaBoost, KNN, RF, SVM, CNNs and DNNs in the neoepitope dataset by using the 15-mer method (left) and the k-mer method (right). (B) The ROC curve of TLimmuno2, TLimmuno2_noba (left) and TLimmuno2_onlyba (right). TLimmuno2_noba: delete the transfer layer in TLimmuno2 model; TLimmuno2_onlyba: delete the LSTM layers in TLimmuno2 model. P-values were calculated via a two-sided DeLong’s test. The data used here are the neoepitope dataset by using the k-mer method.

Similarly, to demonstrate the impact of transfer learning on TLimmuno2, we constructed two ablated models: ‘only BA’ and ‘without BA’. In the only BA model, the input layer contains only the transferred output of the BA model, whereas in the without BA model, we masked the BA model so that only the LSTM layers remained. In independent validation datasets, the performance of both models is significantly lower than that of TLimmuno2 (Figure 3B). These results demonstrate that the transfer learning approach improves the predictive power of TLimmuno2 in pMHC II immunogenicity prediction.

TLimmuno2 can learn important features that determine immunogenicity

To further demonstrate the discriminative power of TLimmuno2, we collected 52 MHC II neoepitopes and their corresponding wild-type peptides from published papers [4, 23–36] (Supplementary Table S5). These neoepitopes have been experimentally shown to elicit T cell responses. For each neoepitope and its corresponding wild-type peptide, we predicted the immunogenicity rank with TLimmuno2. As shown in Figure 4A, the predicted immunogenicity ranks of the mutant peptides are significantly lower than those of the wild-type peptides, which demonstrates the ability of TLimmuno2 to recognize truly immunogenic neoepitopes even when the neoepitope differs from its wild-type peptide by only one amino acid.

Figure 4. TLimmuno2 can learn important features that determine immunogenicity. (A) The rank of neoepitopes and their corresponding wild-type peptides predicted by TLimmuno2. These peptide data are collected from the literature and verified by experiments. P-values are calculated with the paired two-sided Wilcoxon rank-test. (B, C) Ascending importance rank of each position under Ala-scanning (B) and zero-setting (C). The position with the largest performance drop receives the highest ranking across 100 simulations. Dot size corresponds to the frequency with which each position is assigned the denoted rank, and the numbers on the x-axis indicate the amino acid positions.

We performed in silico mutational analyses to investigate whether TLimmuno2 was able to learn the molecular characteristics of the peptide. By changing the amino acid residue at each position, we compared the differences in the predicted immunogenicity values; the larger the decrease, the more important the position. We did this with two methods: zero-setting and Ala-scanning. Zero-setting is a method commonly used in computer vision, in which the feature at a specific location is set to zero by masking. However, setting all features to zero is drastic and not biologically interpretable. Ala-scanning is a method borrowed from alanine scanning in biochemical experiments. We simulated this process 100 times, and an ascending ranking was performed each time to highlight the most salient positions. Both methods show that P4 (residue 4), P5 and P6 are very important for pMHC II immunogenicity prediction (Figure 4B). This result is consistent with results for MHC class I molecules [37], and these positions have been reported to be essential for interacting with the TCR [38, 39]. The results of Ala-scanning are more stable than those of zero-setting, because Ala also has biological functions. Furthermore, the BA model does not show the same trend (Supplementary Figure S6), which indicates that different features have been learned for pMHC II BA prediction and pMHC II immunogenicity prediction. Overall, these analyses demonstrate that TLimmuno2 can learn meaningful biological features for immunogenicity prediction.
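The Ala-scanning procedure can be sketched at the sequence level as follows. This is an illustrative sketch only: `score_fn` stands in for any immunogenicity predictor, the toy scorer and its position weights are invented for demonstration, and the repeated 100-simulation sampling described in the text is not modelled here.

```python
def ala_scan(peptide, allele, score_fn):
    """Score drop caused by replacing each position, in turn, with alanine."""
    baseline = score_fn(peptide, allele)
    drops = []
    for i in range(len(peptide)):
        mutant = peptide[:i] + "A" + peptide[i + 1:]
        drops.append(baseline - score_fn(mutant, allele))
    return drops


def positions_by_importance(drops):
    """1-based positions ordered from the largest score drop to the smallest."""
    order = sorted(range(len(drops)), key=lambda i: drops[i], reverse=True)
    return [i + 1 for i in order]


# Hypothetical toy predictor: only non-alanine residues at 0-based
# indices 3, 4 and 5 contribute to the score. Not the TLimmuno2 model.
TOY_WEIGHTS = {3: 0.3, 4: 0.5, 5: 0.2}


def toy_score(peptide, allele):
    return sum(w for i, w in TOY_WEIGHTS.items() if peptide[i] != "A")


drops = ala_scan("WWWWWWWWWWWWW", "toy-allele", toy_score)
print(positions_by_importance(drops)[:3])  # [5, 4, 6]
```

Zero-setting differs in that it acts on the encoded feature matrix rather than on the sequence: the row for one position is replaced with zeros before the matrix is passed to the model.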

Neoantigen vaccine and immunoediting signal detection

To further validate TLimmuno2 and demonstrate the value of our model as a knowledge discovery tool, we evaluated the physiological relevance of TLimmuno2’s predictions. Most current cancer vaccine platforms preferentially screen candidate neoepitopes for vaccine production by selecting highly expressed neoepitopes with high BA to MHC alleles. However, despite this rigorous candidate selection, many vaccine peptides do not elicit T cell responses after vaccination. Therefore, we compared the performance of TLimmuno2 and NetMHCIIpan using personalized melanoma neoepitopes with corresponding immune response data [40]. As shown in Figure 5A, TLimmuno2 performs better than NetMHCIIpan in accuracy, recall and F1 score when 2% rank is used as the cutoff. The TopK metric (K = 20 or 50; the number of positive samples among the K samples with the lowest predicted ranks) is also used to evaluate the performance of different models, and TLimmuno2 shows improved performance compared with NetMHCIIpan on these TopK metrics.
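The TopK metric defined above can be written in a few lines. The sketch below follows the definition in the text (positives among the K samples with the lowest predicted rank); the function name and the example values are hypothetical.

```python
def top_k_hits(ranks, labels, k):
    """Count positive samples among the k samples with the lowest predicted rank.

    ranks  : predicted percentile ranks (lower = more likely immunogenic)
    labels : 1 for an experimentally confirmed response, 0 otherwise
    """
    if len(ranks) != len(labels):
        raise ValueError("ranks and labels must have the same length")
    order = sorted(range(len(ranks)), key=lambda i: ranks[i])
    return sum(labels[i] for i in order[:k])


# Hypothetical example: five candidate neoepitopes.
example_ranks = [0.5, 10.0, 1.2, 30.0, 2.0]
example_labels = [1, 0, 0, 1, 1]
print(top_k_hits(example_ranks, example_labels, k=3))  # 2
```

Unlike accuracy or F1, this metric needs no rank cutoff, which makes it a natural fit for vaccine design, where only a fixed number of peptides can be synthesized.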

Figure 5. Application of TLimmuno2 in neoantigen vaccine design and immunoediting signal detection. (A) Comparison between TLimmuno2 and NetMHCIIpan using an experimentally validated melanoma neoepitope dataset, with the performance on different metrics (left) and the number of true-positive predictions overlapping with each algorithm’s top 20 or top 50 predictions (right). (B) Density plots showing the distributions of R² (left) and P-value (right) (Student’s t-test) in comparing the rank of MHC I or MHC II binding affinities and MHC II immunogenicity in each sample of TCGA dataset. (C) Distribution of ESCCF in pan-cancer (left) and in specific cancer type (right) after removing samples with antigenic and driver mutations located in the same gene. The neoepitopes were predicted by TLimmuno2. The P-value was calculated from simulated median ESCCF distributions (Supplementary Figure S7).

The interactions between immune cells and tumor cells are reflected as immunoediting, which could mediate the negative selection of DNA alterations encoding highly immunogenic peptides [3, 41]. It is still unknown whether MHC class II-presented neoantigens undergo immunoediting-related negative selection during cancer evolution. Here, we applied TLimmuno2 to predict the immunogenicity of neopeptides arising from missense point mutations in TCGA samples. We predicted the immunogenicity rank of all neoepitopes with TLimmuno2 and the BA rank with NetMHCpan [11] or NetMHCIIpan. HLA typing information for all TCGA samples was obtained from Li et al. [42]. To evaluate the extent to which TLimmuno2 learned a signal that is also learned by the BA predictors, we measured the Pearson correlation between immunogenicity rank and BA rank. The correlations are positive and significant but low in magnitude, with a median Pearson R² = 0.0968 in the comparison between MHC I BA rank and MHC II immunogenicity rank and a median Pearson R² = 0.2057 in the comparison between MHC II BA rank and MHC II immunogenicity rank (Figure 5B). Overall, this analysis suggests that TLimmuno2 is at least partially non-redundant with the BA predictors.
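For reference, the per-sample statistic reported above can be computed as in the sketch below. It assumes that R² is simply the square of the Pearson correlation coefficient between the two rank vectors of one sample; the function names and example values are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


def r_squared(x, y):
    """Squared Pearson correlation (assumed meaning of 'Pearson R2')."""
    return pearson_r(x, y) ** 2


# Hypothetical ranks for the neoepitopes of one tumor sample.
immunogenicity_rank = [1.0, 5.0, 20.0, 40.0, 80.0]
ba_rank = [3.0, 2.0, 35.0, 30.0, 60.0]
print(round(r_squared(immunogenicity_rank, ba_rank), 3))
```

Applying `r_squared` to each sample and taking the median over samples would give a summary of the kind quoted in the text.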

We then detected immunoediting signals in TCGA samples using the previously published CCF enrichment score (ESCCF) method [43]. ESCCF reflects the difference in the cancer cell fraction (CCF) distribution between immunogenic and non-immunogenic mutations. Theoretically, immunogenic mutations undergo immune-based negative selection, and consequently, the CCF of immunogenic mutations is decreased compared with that of non-immunogenic mutations. Thus, ESCCF values can reflect the strength of immune-based negative selection. We analyzed all missense point mutations in TCGA using TLimmuno2 and set the cutoff to 2% to obtain neoantigens. In the TCGA pan-cancer cohort, when samples with antigenic and driver mutations lying in the same gene are removed, the observed median ESCCF is −0.024 (n = 5524). By comparing the simulated median ESCCF distribution with the observed ESCCF score in the TCGA dataset, we conclude that there is a significant (P-value < 0.0005) MHC II neoantigen-mediated immunoediting-elimination signal in the TCGA pan-cancer dataset (Supplementary Figure S7). This significant immunoediting-elimination signal is also observed in several specific cancer types, including adrenocortical carcinoma and thymoma (Figure 5C). These data support the existence of an MHC class II neoantigen-mediated immunoediting-elimination signal in the TCGA dataset and also demonstrate the robustness of TLimmuno2 in identifying truly immunogenic peptides that are undergoing negative selection during cancer evolution.

DISCUSSION

Much experimental evidence has demonstrated the potential clinical value of MHC class II-presented epitopes [2, 5], which can play an important role in immunotherapy. However, binding-affinity-based methods are not sufficient to accurately predict whether a peptide can elicit downstream immune responses. Owing to the lack of sufficient training data, existing tools for MHC class II epitope immunogenicity prediction cannot accurately quantify the ability of a pMHC II complex to elicit T cell responses. Here, we propose a transfer learning-based model, TLimmuno2, for high-confidence epitope-MHC II immunogenicity prediction. TLimmuno2 shows improved prediction accuracy and precision compared with existing models on independent validation datasets. By applying TLimmuno2 to predict neoantigens, we further demonstrate the existence of a significant MHC class II neoantigen-mediated immunoediting signal in the TCGA pan-cancer dataset.

Transfer learning is a technique that uses the similarity of data, models and tasks to apply knowledge in a certain field to a new field, and the application of transfer learning has significantly enhanced the performance of various models [44–47]. In the process of transfer learning, the data are divided into target data and source data. The target data refer to the data required for specific tasks with a small amount of data, while the source data are usually larger in scale and show similarity to the target data. The interaction between MHC and peptide is a key process in eliciting T cell responses. In this study, we innovatively transfer BA as the source data to the prediction of immunogenicity, which can overcome the problem that insufficient immunogenicity data seriously affect the performance of the model. By comparing with other machine learning and deep learning algorithms, we found that transfer learning significantly improved the performance of our model.

This study also has some limitations. The T cell response is a complex biological process that involves not only the interaction between MHC and peptide [7]; peptide processing and presentation [48, 49] and the interaction between pMHC and TCR are also important [50]. When training an immunogenicity model, HLA–peptide pairs are considered immunogenic if they have been validated to elicit T cell activation in at least one experiment. This assumption simplifies the evaluation of potential immunogenicity, because whether a pMHC-matched TCR exists in the body is unknown. Moreover, with alterations in the tumor microenvironment, previously immunogenic neoantigens might no longer be able to elicit a T cell response [6]. Many peptide features have been reported to be important factors in determining immunogenicity, such as the differential agretopicity index (DAI) [51] and expression level. Some of these features are specific to tumor neoantigens (e.g. DAI) and cannot be applied in this model training. Some additional peptide features (e.g. peptide expression) are not included in this model because of the limited availability of training data. These complex factors could be considered in the further optimization of epitope immunogenicity prediction.

The aim of the BA model constructed in this study was to extract the pMHC II BA information for subsequent transfer learning. This BA model was trained using data that measure the in vitro binding between peptide and MHC II with experiments, and many in vivo antigen processing and presentation features, such as protein internalization and protease digestion, have not been considered in these in vitro experiments [52]. The function of this BA model has been demonstrated in the subsequent immunogenicity prediction through transfer learning in this study. However, this BA model itself does not show superior performance compared with other available tools in immunogenicity prediction or mass-spectrometry immunopeptidome data prediction due to the limited BA model training process, and this could be a direction for future improvement.

Overall, we demonstrate that the immunogenicity of a given peptide and MHC class II molecule can be accurately predicted by machine learning methods, which suggests a reliable direction for future research. We expect TLimmuno2 to propel tumor immunogenomics research and also enhance the design and implementation of precision immunotherapy. TLimmuno2 is freely available at https://github.com/XSLiuLab/TLimmuno2.

METHODS

BA training dataset collection and processing

The same BA data used for training NetMHCIIpan4.0 were used to train our BA model; these data comprise 107 008 measurements of pMHC BA covering 71 types of human class II MHCs. IC50 values were transformed with the log50k method, 1 − log(IC50)/log(50 000), and a threshold of 500 nM was used. This means that peptides with log50k-transformed BA values greater than 0.426 are classified as MHC binders.
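The transformation and threshold above can be checked with a short sketch. The function names are hypothetical, and clipping IC50 values to the range 1–50 000 nM is an assumption made here so that the output stays within [0, 1]; the text does not state how out-of-range values were handled.

```python
import math

MAX_IC50_NM = 50_000.0  # upper bound used by the log50k transform


def log50k(ic50_nm):
    """Map an IC50 in nM onto [0, 1]; stronger binders score higher."""
    # Assumption: values are clipped to [1, 50 000] nM before transforming.
    clipped = min(max(ic50_nm, 1.0), MAX_IC50_NM)
    return 1.0 - math.log(clipped) / math.log(MAX_IC50_NM)


BINDER_THRESHOLD = log50k(500.0)  # approximately 0.426


def is_binder(ic50_nm):
    """Binary label: True when the transformed value exceeds the threshold."""
    return log50k(ic50_nm) > BINDER_THRESHOLD


print(round(BINDER_THRESHOLD, 3))  # 0.426
```

Because the transform is monotonically decreasing in IC50, thresholding the transformed value at 0.426 is equivalent to requiring IC50 < 500 nM.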

IEDB immunogenicity dataset collection and processing

The immunogenicity dataset was downloaded from IEDB on 15 May 2022. The following keywords were used: linear peptide; T cell; MHC II; include positive and negative; human and any disease. We then applied strict criteria for data cleaning. First, data instances without explicit four-digit MHC alleles were discarded. Second, peptide length was restricted to 13–21 residues. Third, data without explicit experimental information (no information on the number of subjects tested/responded) were removed. Fourth, we retained only data instances from the following five experimental types for TLimmuno2 training: ‘51 chromium’, ‘ELISA’, ‘ELISPOT’, ‘ICS’ and ‘multimer/tetramer’; these experimental types provide high-quality immunogenicity data for pMHC pairs. Finally, we obtained 3930 positive and 2478 negative data instances for model training (Supplementary Table S3).
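The cleaning criteria above amount to a record-level filter, sketched below. The record field names (`allele`, `peptide`, `assay`, `subjects_tested`, `subjects_responded`) and the regular expression used to detect four-digit typing are assumptions for illustration and do not reflect the actual IEDB export schema.

```python
import re

# Assay types retained for training, as listed in the text.
ALLOWED_ASSAYS = {"51 chromium", "ELISA", "ELISPOT", "ICS", "multimer/tetramer"}

# Assumption: four-digit typing appears as e.g. "HLA-DRB1*01:01".
FOUR_DIGIT_ALLELE = re.compile(r"\*\d{2}:\d{2}")


def keep_record(record):
    """Return True when a record passes all four cleaning criteria."""
    has_four_digit_allele = bool(FOUR_DIGIT_ALLELE.search(record.get("allele", "")))
    has_valid_length = 13 <= len(record.get("peptide", "")) <= 21
    has_subject_counts = (
        record.get("subjects_tested") is not None
        and record.get("subjects_responded") is not None
    )
    has_allowed_assay = record.get("assay") in ALLOWED_ASSAYS
    return (has_four_digit_allele and has_valid_length
            and has_subject_counts and has_allowed_assay)


# Hypothetical records for demonstration only.
records = [
    {"allele": "HLA-DRB1*01:01", "peptide": "AFLRFLAIPPTAGIL",
     "assay": "ELISPOT", "subjects_tested": 10, "subjects_responded": 4},
    {"allele": "HLA-DR1", "peptide": "AFLRFLAIPPTAGIL",
     "assay": "ELISPOT", "subjects_tested": 10, "subjects_responded": 4},
    {"allele": "HLA-DRB1*04:01", "peptide": "AFLRFLAIPP",
     "assay": "ICS", "subjects_tested": 5, "subjects_responded": 0},
]
cleaned = [r for r in records if keep_record(r)]
print(len(cleaned))  # 1
```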

Generation of random artificial negative peptides

In the collected peptide immunogenicity dataset, we found that the percentage of immunogenic peptides is higher than that of non-immunogenic peptides. One potential reason for this is that data without response information (number of subjects tested/responded) in the IEDB dataset were removed during the data cleaning process. To deal with this situation, we used the whole set of human protein sequences downloaded from the universal protein knowledgebase (UniProt) database [53] to generate random artificial negative peptides. It should be noted that even though this is a widely used method for generating negative controls, the generated peptides are not perfect negatives, and some of them could still be immunogenic. For each peptide length of each MHC allele, we generated twice as many random peptides as negative instances to construct the IEDB training dataset.
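One way to generate such negatives is sketched below. It is a sketch under stated assumptions rather than the authors' procedure: random windows are drawn uniformly from a list of protein sequences, the 2:1 ratio is applied within each (allele, length) group of the experimental instances, and no check is made that a sampled peptide is absent from the positive set. All names and the toy sequences are hypothetical.

```python
import random
from collections import Counter


def sample_random_peptides(proteome, length, n, rng):
    """Draw n random windows of the given length from a list of proteins."""
    eligible = [protein for protein in proteome if len(protein) >= length]
    if not eligible:
        raise ValueError("no protein is long enough for the requested length")
    peptides = []
    for _ in range(n):
        protein = rng.choice(eligible)
        start = rng.randrange(len(protein) - length + 1)
        peptides.append(protein[start:start + length])
    return peptides


def make_negatives(instances, proteome, ratio=2, seed=0):
    """Return `ratio` random (peptide, allele) negatives per experimental instance,
    matched on allele and peptide length."""
    rng = random.Random(seed)
    group_sizes = Counter((allele, len(peptide)) for peptide, allele in instances)
    negatives = []
    for (allele, length), count in sorted(group_sizes.items()):
        for peptide in sample_random_peptides(proteome, length, ratio * count, rng):
            negatives.append((peptide, allele))
    return negatives


# Hypothetical toy inputs (not real UniProt or IEDB data).
toy_proteome = [
    "MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ",
    "GSHMASMTGGQQMGRDLYDDDDKDPSSRSAAGTIWEFEAVKQ",
]
toy_instances = [
    ("AFLRFLAIPPTAGIL", "DRB1_0101"),
    ("PKYVKQNTLKLATGM", "DRB1_0101"),
    ("GELIGILNAAKVP", "DRB1_0401"),
]
toy_negatives = make_negatives(toy_instances, toy_proteome)
print(len(toy_negatives))  # 6
```

Matching negatives on allele and length prevents the model from separating the classes using either property alone.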

Benchmark analysis on the BA models

The BA model was tested on an independent dataset from which we removed all peptide–MHC pairs that were also present in the training data of the BA model; IC50 values were converted to binary labels with the same method as for the training data. We compared the performance of the BA model, NetMHCIIpan4.0 BA (downloaded from IEDB) and MixMHC2pred. The area under the receiver operating characteristic curve (AUC of the ROC) was used as the performance measure.
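For reference, the AUC of the ROC for binary labels equals the probability that a randomly chosen positive is scored above a randomly chosen negative, with ties counted as one half. The study computed AUCs with the sklearn package; the standard-library sketch below is only an illustration of the quantity being compared, and the function name is an assumption.

```python
def roc_auc(labels, scores):
    """AUC of the ROC as the probability that a positive outranks a negative."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(pos) * len(neg))


# Toy example: three of the four positive/negative pairs are ordered correctly.
print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
```

The pairwise form is quadratic in the number of samples, which is acceptable for illustration but slower than the rank-based implementations used in practice.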

Benchmark dataset for TLimmuno2

Independent datasets, including a neoepitope dataset (Supplementary Table S4), were used to test the performance of the TLimmuno2 model. The neoepitope dataset, which contains neoepitopes experimentally tested for CD4+ T cell immunogenicity, was retrieved from the literature [12]. These neoepitopes are usually sequences of 20–25 amino acids associated with multiple MHC alleles, and we processed them with two methods: the 15-mer method and the k-mer method (Supplementary Figure S4). In the 15-mer method, each neoepitope sequence is split into 15-mer subsequences, and each subsequence is scored or ranked. In the k-mer method, each neoepitope is cut into 13–21-mer subsequences. For either method, the prediction for a neoepitope is the maximum score or the minimum rank over all of its subsequences. For a neoepitope with multiple MHC types, we used the maximum score or minimum rank across the MHC types as the prediction for that neoepitope.
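The two splitting schemes and the aggregation step can be sketched as follows. The sliding-window interpretation of the 15-mer method, the function names and the `score_fn` callback (a stand-in for any predictor that scores a peptide–allele pair) are assumptions made for illustration.

```python
def kmers(seq, k):
    """All contiguous subsequences of length k (empty if seq is shorter than k)."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def split_15mer(seq):
    """15-mer method: every 15-residue window of the neoepitope."""
    return kmers(seq, 15)


def split_kmer(seq, k_min=13, k_max=21):
    """k-mer method: every window of length 13 to 21."""
    return [sub for k in range(k_min, k_max + 1) for sub in kmers(seq, k)]


def neoepitope_score(seq, alleles, score_fn, splitter=split_15mer):
    """Maximum predicted score over all subsequences and all MHC alleles.

    For rank-based predictions, replace max with min.
    Raises ValueError if the sequence is too short to yield any subsequence.
    """
    return max(score_fn(sub, allele) for allele in alleles for sub in splitter(seq))


neoepitope = "AFLRFLAIPPTAGILKRWGT"  # toy 20-residue sequence
print(len(split_15mer(neoepitope)), len(split_kmer(neoepitope)))
```

For a 20-residue neoepitope, the 15-mer method yields 6 subsequences and the k-mer method yields 36, so the k-mer method requires several times more predictions per neoepitope.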

Peptide and allele representation

We used the BLOSUM62 substitution matrix to encode peptides and MHC molecules as numeric vectors. Peptide length was limited to 13–21, and all peptides were padded with the pad character ‘X’ to the maximum length of 21. For example, the peptide ‘AFLRFLAIPPTAGIL’ is converted to ‘AFLRFLAIPPTAGILXXXXXX’, which yields a 21 × 21 encoding matrix. We used ‘pseudosequences’ to represent the MHC class II alleles [54]. A pseudosequence consists of the amino acids in contact with the peptide, and only 34 polymorphic residues are included. As with the peptide encoding, each amino acid was transformed into a 21-dimensional vector using the BLOSUM62 substitution matrix, giving a 34 × 21 matrix for each MHC class II allele.
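The padding step and the resulting matrix shapes can be illustrated with a short sketch. It assumes that the 21 dimensions correspond to the 20 standard amino acids plus the pad character. To keep the example self-contained, one-hot rows are used as a stand-in for the BLOSUM62 rows, so the values are not those used by TLimmuno2; only the shapes are meaningful. The 34-residue string is a made-up example, not a real pseudosequence.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
ALPHABET = AMINO_ACIDS + "X"  # assumed: 20 residues plus pad character = 21 symbols
MAX_LEN = 21


def pad_peptide(peptide, max_len=MAX_LEN, pad="X"):
    """Right-pad a 13-21-residue peptide with the pad character."""
    if not 13 <= len(peptide) <= max_len:
        raise ValueError("peptide length must be between 13 and 21")
    return peptide + pad * (max_len - len(peptide))


def encode(sequence, rows):
    """Stack one 21-dimensional row per residue."""
    return [list(rows[aa]) for aa in sequence]


# Stand-in rows (one-hot), NOT the BLOSUM62 values.
PLACEHOLDER_ROWS = {
    aa: [1 if i == j else 0 for j in range(len(ALPHABET))]
    for i, aa in enumerate(ALPHABET)
}

padded = pad_peptide("AFLRFLAIPPTAGIL")
peptide_matrix = encode(padded, PLACEHOLDER_ROWS)              # 21 x 21
toy_pseudosequence = "ACDEFGHIKLMNPQRSTVWYACDEFGHIKLMNPQ"      # 34 made-up residues
mhc_matrix = encode(toy_pseudosequence, PLACEHOLDER_ROWS)      # 34 x 21
print(padded, len(peptide_matrix), len(mhc_matrix))
```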

Peptide MHC II BA model structure

Unlike NetMHCIIpan4.0, our pMHC II BA model is built on LSTM layers. Antigen sequences and MHC pseudosequences are used as inputs. The antigen and the MHC are each passed through two LSTM layers with output sizes of 128 and 64, respectively. The resulting antigen and MHC LSTM outputs are concatenated to form a 128-dimensional vector. This is followed by three dense layers with 100, 50 and 10 neurons, each activated by the ReLU function. The output layer is a single-neuron dense layer activated by the sigmoid function.

TLimmuno2 model structure

We employed transfer learning to incorporate BA information by extracting the penultimate layer of the BA model. LSTM layers were also used to extract antigen and MHC information. The three resulting numerical vectors were concatenated into a single layer, followed by two dense layers with 400 and 200 neurons, respectively, each activated by the ReLU function. The final layer is a single neuron activated by the sigmoid function.
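The data flow of the transfer step can be traced with a dependency-free sketch. It is not the TLimmuno2 implementation: the weights are random and untrained, placeholder vectors stand in for the LSTM outputs, the penultimate BA layer is taken to be the 10-neuron dense layer, and the 64-dimensional antigen and MHC vectors fed to the TLimmuno2 head are assumed sizes. Only the layer widths stated in the text (100, 50, 10 and 400, 200) come from the source.

```python
import math
import random

rng = random.Random(0)


def dense(n_in, n_out):
    """Randomly initialised weights and zero biases for one dense layer."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def forward(layer, x, activation):
    """Apply one dense layer followed by an element-wise activation."""
    weights, biases = layer
    return [activation(sum(w * v for w, v in zip(row, x)) + b)
            for row, b in zip(weights, biases)]


def relu(z):
    return max(0.0, z)


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


# BA model head: 128 -> 100 -> 50 -> 10 -> 1
BA_HIDDEN = [dense(128, 100), dense(100, 50), dense(50, 10)]
BA_OUT = dense(10, 1)


def ba_model(lstm_concat):
    """Return (predicted BA, penultimate 10-dimensional activation)."""
    h = lstm_concat
    for layer in BA_HIDDEN:
        h = forward(layer, h, relu)
    return forward(BA_OUT, h, sigmoid)[0], h


# TLimmuno2 head: [BA penultimate (10) + antigen (64) + MHC (64)] -> 400 -> 200 -> 1
TL_HIDDEN = [dense(10 + 64 + 64, 400), dense(400, 200)]
TL_OUT = dense(200, 1)


def tlimmuno2(ba_lstm_concat, antigen_vec, mhc_vec):
    """Concatenate transferred BA features with antigen and MHC vectors."""
    _, ba_features = ba_model(ba_lstm_concat)  # transferred BA representation
    h = ba_features + antigen_vec + mhc_vec    # list concatenation
    for layer in TL_HIDDEN:
        h = forward(layer, h, relu)
    return forward(TL_OUT, h, sigmoid)[0]


# Placeholder inputs standing in for LSTM outputs.
ba_input = [rng.random() for _ in range(128)]
antigen_vec = [rng.random() for _ in range(64)]
mhc_vec = [rng.random() for _ in range(64)]
score = tlimmuno2(ba_input, antigen_vec, mhc_vec)
print(score)
```

In a trained system the BA weights would come from the pre-trained BA model rather than random initialisation, which is what lets the immunogenicity model reuse information learned from the larger BA dataset.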

Model training process

We first partitioned the data into a training set and a testing set at a ratio of 9:1 and then performed 5-fold cross-validation within the training set to evaluate model robustness. For cross-validation, the training set was randomly partitioned into five non-overlapping subsets; the process was repeated five times, with each subset used once as the validation set while the remaining subsets were used for training. Finally, we trained the model on the training set and used the testing set to measure model performance. The performance of the model was further evaluated and compared using independent datasets.
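The partitioning scheme can be sketched with index lists. The function names, the fixed random seed and the use of 100 toy indices are illustrative assumptions; the source does not state whether the split was stratified, and this sketch does not stratify.

```python
import random


def train_test_split(indices, test_fraction=0.1, seed=0):
    """Shuffle indices and hold out a fraction as the testing set (9:1 by default)."""
    rng = random.Random(seed)
    shuffled = list(indices)
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def k_fold(indices, k=5):
    """Yield (train, validation) index lists; each fold is the validation set once."""
    folds = [indices[i::k] for i in range(k)]  # k non-overlapping subsets
    for i in range(k):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, folds[i]


train_idx, test_idx = train_test_split(range(100))
splits = list(k_fold(train_idx))
print(len(train_idx), len(test_idx), [len(val) for _, val in splits])
```

With 100 toy instances this gives 90 training and 10 testing indices, and each of the five validation folds contains 18 indices.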

Immunoediting signal detection

Precompiled curated somatic mutations [mutation annotation format (MAF)] for TCGA cohorts covering 33 cancer types were downloaded from UCSC Xena [55], and missense variants were selected for downstream analysis. We converted the MAF files into variant call format (VCF) to obtain the mutated protein sequences and split these into 15-mer subsequences. Immunogenicity rank was calculated by TLimmuno2, and BA rank was calculated by NetMHCIIpan. Antigen presentation by HLA-II molecules on human cells involves three loci on chromosome 6 (DR, DQ and DP), and we restricted the analysis to HLA-DR. HLA typing information for all TCGA samples was obtained from Li et al. [42]. The cancer cell fraction (CCF) data and gene expression data were collected as described in published work. Using the previously described enrichment score–CCF method (ESCCF) [43], we detected the immunoediting-elimination signal in the TCGA dataset using neoantigens predicted with TLimmuno2.

Statistical analysis

All computations and statistical analyses were carried out in the R and Python computing environments. All P-values are two-sided unless otherwise noted. The AUCs of the ROC and PR curves were calculated with the sklearn Python package. The P-value in Figure 3B was calculated by DeLong’s test. The P-value in Figure 4A was calculated by the Wilcoxon rank test. The P-value in Figure 5B was calculated by Student’s t-test. The P-value in Figure 5C was calculated from the positive or negative region of the empirical null distribution.

Key Points
  • The transfer learning concept is applied to epitope–MHC II immunogenicity prediction.

  • TLimmuno2 shows superior performance compared with existing models.

  • An MHC II neoantigen-mediated immunoediting signal can be identified by TLimmuno2.

  • TLimmuno2 is available as an open-source Python package with command line and library interfaces.

ACKNOWLEDGEMENTS

We thank ShanghaiTech University High Performance Computing Public Service Platform for computing services. We thank the multi-omics facility, molecular and cell biology core facility of ShanghaiTech University for technical help.

FUNDING

Shanghai Science and Technology Commission (21ZR1442400); National Natural Science Foundation of China (31771373); ShanghaiTech University (startup funding).

DATA AVAILABILITY

The Python 3 and R code of TLimmuno2 used in this study is publicly available at https://github.com/XSLiuLab/TLimmuno2. The pipeline for the CCF enrichment score method is available on GitHub at https://github.com/XSLiuLab/Immunoediting [43]. The data used in this article are publicly available at https://github.com/XSLiuLab/TLimmuno2. The source code and workflow for reproducing all figures in this study are provided at https://xsliulab.github.io/TLimmuno2/.

CONTRIBUTIONS

G.W. collected the data, developed the TLimmuno2 tool and drafted the manuscript. T.W., W.N., K.D., X.S., J.W., C.W., J.C. and D.X. participated in critical project discussions and contributed resources. X.-S.L. conceptualized the idea, designed and supervised the study and wrote the manuscript.

Guangshuai Wang is a PhD student at School of Life Science and Technology, ShanghaiTech University, Shanghai 201203, China.

Tao Wu is a PhD student at School of Life Science and Technology, ShanghaiTech University, Shanghai 201203, China.

Wei Ning is a master's student at School of Life Science and Technology, ShanghaiTech University, Shanghai 201203, China.

Kaixuan Diao is a PhD student at School of Life Science and Technology, ShanghaiTech University, Shanghai 201203, China.

Xiaoqin Sun is an engineer at School of Life Science and Technology, ShanghaiTech University, Shanghai 201203, China.

Jinyu Wang is a PhD student at School of Life Science and Technology, ShanghaiTech University, Shanghai 201203, China.

Chenxu Wu is a PhD student at School of Life Science and Technology, ShanghaiTech University, Shanghai 201203, China.

Jing Chen is a PhD student at School of Life Science and Technology, ShanghaiTech University, Shanghai 201203, China.

Dongliang Xu is a professor at Department of Urology, Shuguang Hospital, Shanghai University of Traditional Chinese Medicine, Shanghai 200120, China.

Xue-Song Liu is a professor at School of Life Science and Technology, ShanghaiTech University, Shanghai 201203, China.

References

1. Khodadoust MS, Olsson N, Wagar LE, et al. Antigen presentation profiling reveals recognition of lymphoma immunoglobulin neoantigens. Nature 2017;543:723–7.

2. Linnemann C, van Buuren MM, Bies L, et al. High-throughput epitope discovery reveals frequent recognition of neo-antigens by CD4+ T cells in human melanoma. Nat Med 2015;21:81–5.

3. Schreiber RD, Old LJ, Smyth MJ. Cancer immunoediting: integrating immunity's roles in cancer suppression and promotion. Science 2011;331:1565–70.

4. Tran E, Ahmadzadeh M, Lu YC, et al. Immunogenicity of somatic mutations in human gastrointestinal cancers. Science 2015;350:1387–90.

5. Alspach E, Lussier DM, Miceli AP, et al. MHC-II neoantigens shape tumour immunity and response to immunotherapy. Nature 2019;574:696–701.

6. Hu Z, Leet DE, Allesoe RL, et al. Personal neoantigen vaccines induce persistent memory T cell responses and epitope spreading in patients with melanoma. Nat Med 2021;27:515–25.

7. Buckley PR, Lee CH, Ma R, et al. Evaluating performance of existing computational models in predicting CD8+ T cell pathogenic epitopes and cancer neoantigens. Brief Bioinform 2022;23:bbac141.

8. Schmidt J, Smith AR, Magnin M, et al. Prediction of neo-epitope immunogenicity reveals TCR recognition determinants and provides insight into immunoediting. Cell Rep Med 2021;2:100194.

9. Wang F, Wang H, Wang L, et al. MHCRoBERTa: pan-specific peptide-MHC class I binding prediction through transfer learning with label-agnostic protein sequences. Brief Bioinform 2022;23:bbab595.

10. Reynisson B, Barra C, Kaabinejadian S, et al. Improved prediction of MHC II antigen presentation through integration and motif deconvolution of mass spectrometry MHC eluted ligand data. J Proteome Res 2020;19:2304–15.

11. Reynisson B, Alvarez B, Paul S, et al. NetMHCpan-4.1 and NetMHCIIpan-4.0: improved predictions of MHC antigen presentation by concurrent motif deconvolution and integration of MS MHC eluted ligand data. Nucleic Acids Res 2020;48:W449–54.

12. Racle J, Michaux J, Rockinger GA, et al. Robust prediction of HLA class II epitopes by deep motif deconvolution of immunopeptidomes. Nat Biotechnol 2019;37:1283–6.

13. Dhanda SK, Karosiene E, Edwards L, et al. Predicting HLA CD4 immunogenicity in human populations. Front Immunol 2018;9:1369.

14. Trevizani R, Custódio FL. Deepitope: prediction of HLA-independent T-cell epitopes mediated by MHC class II using a convolutional neural network. Artif Intell Life Sci 2022;2:100038.

15. Ogishi M, Yotsuyanagi H. Quantitative prediction of the landscape of T cell epitope immunogenicity in sequence space. Front Immunol 2019;10:827.

16. Vita R, Mahajan S, Overton JA, et al. The Immune Epitope Database (IEDB): 2018 update. Nucleic Acids Res 2019;47:D339–43.

17. Taroni JN, Grayson PC, Hu Q, et al. MultiPLIER: a transfer learning framework for transcriptomics reveals systemic features of rare disease. Cell Syst 2019;8:380–94 e384.

18. Andreatta M, Trolle T, Yan Z, et al. An automated benchmarking platform for MHC class II binding prediction methods. Bioinformatics 2018;34:1522–8.

19. Fernandez EA, Valtuille R, Presedo JM, et al. Comparison of different methods for hemodialysis evaluation by means of ROC curves: from artificial intelligence to current methods. Clin Nephrol 2005;64:205–13.

20. Nibeyro G, Girotti MR, Prato L, et al. MHC-I binding affinity derived metrics fail to predict tumor specific neoantigen immunogenicity. bioRxiv 2022.03.14.484285. https://doi.org/10.1101/2022.03.14.484285.

21. Xu S, Wang X, Fei C. A highly effective system for predicting MHC-II epitopes with immunogenicity. Front Oncol 2022;12:888556.

22. Saito T, Rehmsmeier M. The precision-recall plot is more informative than the ROC plot when evaluating binary classifiers on imbalanced datasets. PloS One 2015;10:e0118432.

23. Li F, Deng L, Jackson KR, et al. Neoantigen vaccination induces clinical and immunologic responses in non-small cell lung cancer patients harboring EGFR mutations. J Immunother Cancer 2021;9:e002531.

24. Sha H, Liu Q, Xie L, et al. Case report: pathological complete response in a lung metastasis of phyllodes tumor patient following treatment containing peptide neoantigen nano-vaccine. Front Oncol 2022;12:800484.

25. Deniger DC, Pasetto A, Robbins PF, et al. T-cell responses to TP53 “hotspot” mutations and unique neoantigens expressed by human ovarian cancers. Clin Cancer Res 2018;24:5562–73.

26. Novellino L, Renkvist N, Rini F, et al. Identification of a mutated receptor-like protein tyrosine phosphatase κ as a novel, class II HLA-restricted melanoma antigen. J Immunol 2003;170:6363–70.

27. Deng L, Langley RJ, Brown PH, et al. Structural basis for the recognition of mutant self by a tumor-specific, MHC class II-restricted T cell receptor. Nat Immunol 2007;8:398–408.

28. Schumacher T, Bunse L, Pusch S, et al. A vaccine targeting mutant IDH1 induces antitumour immunity. Nature 2014;512:324–7.

29. Assadipour Y, Zacharakis N, Crystal JS, et al. Characterization of an immunogenic mutation in a patient with metastatic triple-negative breast cancer. Clin Cancer Res 2017;23:4347–53.

30. Zacharakis N, Chinnasamy H, Black M, et al. Immune recognition of somatic mutations leading to complete durable regression in metastatic breast cancer. Nat Med 2018;24:724–30.

31. Meng Q, Valentini D, Rao M, et al. Neoepitope targets of tumour-infiltrating lymphocytes from patients with pancreatic cancer. Br J Cancer 2019;120:97–108.

32. Malekzadeh P, Pasetto A, Robbins PF, et al. Neoantigen screening identifies broad TP53 mutant immunogenicity in patients with epithelial cancers. J Clin Invest 2019;129:1109–14.

33. Leko V, McDuffie LA, Zheng Z, et al. Identification of neoantigen-reactive tumor-infiltrating lymphocytes in primary bladder cancer. J Immunol 2019;202:3458–67.

34. Liu S, Matsuzaki J, Wei L, et al. Efficient identification of neoantigen-specific T-cell responses in advanced human ovarian cancer. J Immunother Cancer 2019;7:156.

35. Zeng Y, Zhang W, Li Z, et al. Personalized neoantigen-based immunotherapy for advanced collecting duct carcinoma: case report. J Immunother Cancer 2020;8:8.

36. Ding Z, Li Q, Zhang R, et al. Personalized neoantigen pulsed dendritic cell vaccine for advanced lung cancer. Signal Transduct Target Ther 2021;6:26.

37. Li G, Iyer B, Prasath VBS, et al. DeepImmuno: deep learning-empowered prediction and generation of immunogenic peptides for T-cell immunity. Brief Bioinform 2021;22:bbab160.

38. Wucherpfennig KW, Call MJ, Deng L, et al. Structural alterations in peptide-MHC recognition by self-reactive T cell receptors. Curr Opin Immunol 2009;21:590–5.

39. Rudolph MG, Stanfield RL, Wilson IA. How TCRs bind MHCs, peptides, and coreceptors. Annu Rev Immunol 2006;24:419–66.

40. Ott PA, Hu Z, Keskin DB, et al. An immunogenic personal neoantigen vaccine for patients with melanoma. Nature 2017;547:217–21.

41. O'Donnell JS, Teng MWL, Smyth MJ. Cancer immunoediting and resistance to T cell-based immunotherapy. Nat Rev Clin Oncol 2019;16:151–67.

42. Li X, Zhou C, Chen K, et al. Benchmarking HLA genotyping and clarifying HLA impact on survival in tumor immunotherapy. Mol Oncol 2021;15:1764–82.

43. Wu T, Wang G, Wang X, et al. Quantification of neoantigen-mediated immunoediting in cancer evolution. Cancer Res 2022;82:2226–38.

44. Maghdid HS, Asaad AT, Ghafoor KZ, et al. Diagnosing COVID-19 pneumonia from X-ray and CT images using deep learning and transfer learning algorithms. In: Multimodal image exploitation and learning 2021. 2021:99–110.

45. Yang Y, Li XF, Wang P, et al. Multi-source transfer learning via ensemble approach for initial diagnosis of Alzheimer's disease. IEEE J Transl Eng Health Med 2020;8:1–10.

46. Gao Y, Cui Y. Author correction: deep transfer learning for reducing health care disparities arising from biomedical data inequality. Nat Commun 2020;11:6444.

47. Farahani A, Pourshojae B, Rasheed K, et al. A concise review of transfer learning. In: International Conference on Computational Science and Computational Intelligence (CSCI). 2020:344–51.

48. Chen B, Khodadoust MS, Olsson N, et al. Predicting HLA class II antigen presentation through integrated deep learning. Nat Biotechnol 2019;37:1332–43.

49. Wang S, He Z, Wang X, et al. Antigen presentation and tumor immunogenicity in cancer immunotherapy response prediction. Elife 2019;8:e49020.

50. Lu T, Zhang Z, Zhu J, et al. Deep learning-based prediction of the T cell receptor-antigen binding specificity. Nat Mach Intell 2021;3:864–75.

51. Duan F, Duitama J, Al Seesi S, et al. Genomic and bioinformatic profiling of mutational neoepitopes reveals new rules to predict anticancer immunogenicity. J Exp Med 2014;211:2231–48.

52. Alvarez B, Reynisson B, Barra C, et al. NNAlign_MA; MHC peptidome deconvolution for accurate MHC binding motif characterization and improved T-cell epitope predictions. Mol Cell Proteomics 2019;18:2459–77.

53. UniProt Consortium. UniProt: the universal protein knowledgebase in 2021. Nucleic Acids Res 2021;49:D480–9.

54. Karosiene E, Rasmussen M, Blicher T, et al. NetMHCIIpan-3.0, a common pan-specific MHC class II prediction method including all three human MHC class II isotypes, HLA-DR, HLA-DP and HLA-DQ. Immunogenetics 2013;65:711–24.

55. Wang S, Xiong Y, Zhao L, et al. UCSCXenaShiny: an R/CRAN package for interactive analysis of UCSC Xena data. Bioinformatics 2021;38:527–9.

This article is published and distributed under the terms of the Oxford University Press, Standard Journals Publication Model (https://dbpia.nl.go.kr/journals/pages/open_access/funder_policies/chorus/standard_publication_model)