Abstract

Summary: Over the last decade, immunoinformatics has made significant progress. Computational approaches, in particular the prediction of T-cell epitopes using machine learning methods, are at the core of modern vaccine design. Large-scale analyses and the integration or comparison of different methods are becoming increasingly important. We have developed FRED, an extendable, open source software framework for key tasks in immunoinformatics. In this, its first version, FRED offers easily accessible prediction methods for MHC binding and antigen processing as well as general infrastructure for the handling of antigen sequence data and epitopes. FRED is implemented in Python in a modular way and allows the integration of external methods.

Availability: FRED is freely available for download at http://www-bs.informatik.uni-tuebingen.de/Software/FRED.

Contact:  [email protected]

1 INTRODUCTION

The detection of T-cell epitopes is a critical step in vaccine design and a key problem in immunoinformatics. Experimental studies to detect epitopes are expensive and time consuming. Computational methods can reduce this experimental effort and thereby facilitate epitope detection (DeLuca and Blasczyk, 2007). Many computational methods (based on, e.g. position-specific scoring matrices, various machine learning methods or structural information) have been developed for this task. Many of them are freely available through the internet; however, few standalone implementations exist. While web-based predictions are easy to use on a small scale, reliance on web interfaces severely hampers large-scale predictions and makes a direct comparison of individual methods difficult. The development of flexible prediction and analysis pipelines that can handle large amounts of data and combine prediction methods is becoming increasingly important. These pipelines include extensive and flexible pre- and post-processing in addition to the application of a prediction method. The web-based methods available today do not offer tools for flexible data processing. Since there is no uniform interface to access these methods, it is difficult to include them in automated prediction pipelines. One option for providing convenient and coherent interfaces to immunoinformatics tools is through web services (Halling-Brown et al., 2009); however, speed and availability tend to limit this approach, particularly for large-scale studies.

Here, we present FRED, a software framework for computational immunomics that provides a uniform interface for a variety of prediction methods and support for the implementation of custom-tailored prediction pipelines. FRED offers methods for extensive data processing as well as methods to assess and compare the performance of the prediction methods. This makes it a powerful platform for the rapid development of new algorithms and the analysis of large datasets.

2 METHODS

2.1 Implementation

FRED provides methods for sequence input, sequence preprocessing, filtering and display of the results. The general organization of FRED is depicted in Figure 1. The individual prediction methods are accessed internally via a consistent interface. FRED can handle polymorphic sequences (e.g. for the study of single nucleotide polymorphisms (SNPs) in an epitope context) and makes it possible to access different methods simultaneously and to combine, compare or benchmark them. FRED is easily extendable with user-defined prediction methods or methods for filtering results. FRED is implemented in Python (release 2.6) (www.python.org). All additional software required for FRED is freely available and installation packages are included in the FRED package. The prediction methods currently available in FRED are listed in Table 1.
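
The idea of a consistent interface across prediction methods can be illustrated with a minimal sketch. It is not FRED's actual class hierarchy: the names `Predictor`, `MatrixPredictor` and `compare`, the allele label `HLA-X` and the toy matrix are all hypothetical, and the code assumes a current Python 3 interpreter rather than the Python 2.6 release mentioned above.

```python
from abc import ABC, abstractmethod


class Predictor(ABC):
    """Hypothetical common interface; not FRED's real API."""

    name = "abstract"

    @abstractmethod
    def score(self, peptide, allele):
        """Return a numeric score for one peptide/allele pair."""

    def predict(self, peptides, allele):
        """Score many peptides and return (peptide, score) pairs."""
        return [(p, self.score(p, allele)) for p in peptides]


class MatrixPredictor(Predictor):
    """Position-specific scoring matrix: sum of per-position residue scores."""

    name = "toy-pssm"

    def __init__(self, matrices):
        # matrices: {allele: [{residue: score}, ...]}, one dict per position
        self.matrices = matrices

    def score(self, peptide, allele):
        matrix = self.matrices[allele]
        if len(peptide) != len(matrix):
            raise ValueError("peptide length does not match matrix length")
        # Residues missing from a position's dict contribute 0.0.
        return sum(pos.get(aa, 0.0) for pos, aa in zip(matrix, peptide))


def compare(predictors, peptides, allele):
    """Run several predictors on the same input, keyed by predictor name."""
    return {p.name: p.predict(peptides, allele) for p in predictors}


# Made-up 3-mer matrix for a made-up allele, for illustration only.
TOY = {"HLA-X": [{"A": 1.0}, {"L": 2.0, "M": 1.5}, {"V": 3.0}]}
results = compare([MatrixPredictor(TOY)], ["ALV", "AMV", "GGG"], "HLA-X")
print(results)
```

Because every predictor exposes the same `predict` call, a comparison or benchmark only needs to loop over predictor objects; adding a method means adding one subclass.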

Fig. 1. FRED is organized into four major parts: sequence input, application of prediction methods, filtering of the results and model testing.

Table 1. Prediction methods currently integrated in FRED

Method                      References
MHC binding:
  SYFPEITHI                 Rammensee et al. (1999)
  SVMHC                     Dönnes and Kohlbacher (2006)
  BIMAS                     Parker et al. (1994)
  NetMHCpan^a               Nielsen et al. (2007)
  NetMHC^a                  Buus et al. (2003)
  Hammer                    Sturniolo et al. (1999)
  NetMHCIIpan^a             Nielsen et al. (2008)
Proteasomal cleavage:
  PCM method from WAPP      Dönnes and Kohlbacher (2005)
TAP transport:
  SVMTAP                    Dönnes and Kohlbacher (2005)
  Additive matrix method    Doytchinova et al. (2004)

^a Installation of external software is required. Due to licensing issues, we could not include the standalone versions of these methods in the FRED package.

3 APPLICATIONS

Tutorial and documentation: with the FRED package, we provide examples that demonstrate how FRED can be used to solve typical tasks in computational immunomics with short and simple scripts. A detailed tutorial is available on the project's web site. It explains how to implement prediction pipelines, offers more detailed information on the functionality of FRED and addresses problems such as choosing the right prediction method or threshold. We additionally provide detailed documentation of the code.

Vaccine design: the selection of peptides for epitope-based vaccines is a typical application for large-scale predictions of MHC binding peptides. A short and simple FRED program (shown as a figure in the original article, not reproduced here) implements a typical scenario for the selection of conserved peptide candidates for a vaccine against a virus. The scenario is based on the paper by Toussaint et al. (2008): a set of sequences of the hepatitis C virus core protein from four different subtypes is used. All peptides that occur in at least 90% of the input sequences are considered candidates for conserved epitopes. Predictions are made for 29 HLA alleles using the BIMAS method (Parker et al., 1994).
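
The conservation filter in this scenario can be sketched in plain Python. This is a minimal illustration and not the FRED program from the article: the function names `kmers` and `conserved_peptides`, the default peptide length of 9 and the toy sequences are assumptions made here, and no FRED API is used.

```python
def kmers(sequence, k=9):
    """Return the set of all length-k substrings of a sequence."""
    return {sequence[i:i + k] for i in range(len(sequence) - k + 1)}


def conserved_peptides(sequences, k=9, min_fraction=0.9):
    """Return, sorted, the peptides found in at least min_fraction of sequences."""
    counts = {}
    for seq in sequences:
        # kmers() returns a set, so each sequence counts a peptide at most once.
        for pep in kmers(seq, k):
            counts[pep] = counts.get(pep, 0) + 1
    total = len(sequences)
    return sorted(p for p, c in counts.items() if c / total >= min_fraction)


# Toy input (not real HCV core sequences); k=3 keeps the example readable.
toy = ["MSTNP", "MSTNK", "ASTNP", "MSTNP"]
candidates = conserved_peptides(toy, k=3, min_fraction=0.75)
print(candidates)
```

In the scenario described above, the resulting candidates would then be passed to an MHC binding predictor such as BIMAS once per HLA allele, with the threshold set to 0.9 and a peptide length suited to the alleles of interest.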

Integration of new methods and performance evaluation: epitope prediction is still a very active field, with new methods continuously being developed. Methods that are not implemented in Python can be plugged in via command-line calls. FRED provides a number of standard measures to compare different prediction methods and to evaluate their performance with respect to experimental values (Matthews Correlation Coefficient, accuracy, sensitivity, specificity, area under the ROC curve, correlation and rank correlation). Different prediction methods can thus be compared with ease.
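
For reference, the listed measures can be computed from predicted scores and experimental labels as in the following sketch. It uses the standard textbook definitions in plain Python; the function names and the toy data are assumptions made here and do not reflect FRED's own evaluation code.

```python
import math


def confusion(labels, scores, threshold):
    """Count TP, FP, TN, FN; a score >= threshold is a positive call."""
    tp = fp = tn = fn = 0
    for y, s in zip(labels, scores):
        called = s >= threshold
        if called and y:
            tp += 1
        elif called:
            fp += 1
        elif y:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def classification_measures(labels, scores, threshold):
    """Accuracy, sensitivity, specificity and Matthews correlation coefficient."""
    tp, fp, tn, fn = confusion(labels, scores, threshold)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": tp / (tp + fn) if tp + fn else 0.0,
        "specificity": tn / (tn + fp) if tn + fp else 0.0,
        "mcc": (tp * tn - fp * fn) / denom if denom else 0.0,
    }


def auc(labels, scores):
    """ROC AUC as the fraction of positive/negative pairs ranked correctly;
    tied pairs count one half."""
    pos = [s for y, s in zip(labels, scores) if y]
    neg = [s for y, s in zip(labels, scores) if not y]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def ranks(values):
    """1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            result[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return result


def spearman(x, y):
    """Rank correlation: Pearson correlation of the rank vectors."""
    return pearson(ranks(x), ranks(y))


# Toy data: 1 = experimentally confirmed binder, 0 = non-binder.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.1]
measures = classification_measures(labels, scores, threshold=0.5)
print(measures, auc(labels, scores), spearman(scores, labels))
```

The threshold-based measures depend on the chosen cut-off, whereas the AUC and the correlation coefficients use the full score ranking, which is why reporting both kinds is useful when comparing methods.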

Web server development: using FRED as the basis for new applications in computational immunomics leads to a significant reduction of development time and allows the convenient combination of new methods with existing ones. An example of an application based on FRED is EpiToolKit (www.epitoolkit.org, Feldhahn et al., 2008; Toussaint and Kohlbacher, 2009). Only the web-based user interface and the data management in the web server had to be newly implemented. The prediction functionality of EpiToolKit is provided entirely by FRED. Because it is written in Python, FRED can be integrated seamlessly into web servers and content management systems such as Plone (http://www.plone.org/).

4 CONCLUSIONS

FRED is a valuable tool for performing large-scale analyses in immunoinformatics with different prediction methods and is also a software framework for the development of novel immunoinformatics methods. Ease of use, extendability and openness make it an ideal tool for addressing complex immunoinformatics problems in an uncomplicated manner.

Funding: Deutsche Forschungsgemeinschaft (SFB 685/B1).

Conflict of Interest: none declared.

REFERENCES

Buus S, et al. Sensitive quantitative predictions of peptide-MHC binding by a ‘Query by Committee’ artificial neural network approach. Tissue Antigens, 2003, vol. 62 (pg. 378-384)
DeLuca DS, Blasczyk R. The immunoinformatics of cancer immunotherapy. Tissue Antigens, 2007, vol. 70 (pg. 265-271)
Dönnes P, Kohlbacher O. Integrated modeling of the major events in the MHC class I antigen processing pathway. Protein Sci., 2005, vol. 14 (pg. 2132-2140)
Dönnes P, Kohlbacher O. SVMHC: a server for prediction of MHC-binding peptides. Nucleic Acids Res., 2006, vol. 34 (pg. W194-W197)
Doytchinova I, et al. Transporter associated with antigen processing preselection of peptides binding to the MHC: a bioinformatic evaluation. J. Immunol., 2004, vol. 173 (pg. 6813-6819)
Feldhahn M, et al. EpiToolKit--a web server for computational immunomics. Nucleic Acids Res., 2008, vol. 36 (pg. W519-W522)
Halling-Brown M, et al. Computational grid framework for immunological applications. Philos. Trans. R. Soc. A, 2009, vol. 367 (pg. 2705-2716)
Nielsen M, et al. NetMHCpan, a method for quantitative predictions of peptide binding to any HLA-A and -B locus protein of known sequence. PLoS ONE, 2007, vol. 2 (pg. e796)
Nielsen M, et al. Quantitative predictions of peptide binding to any HLA-DR molecule of known sequence: NetMHCIIpan. PLoS Comput. Biol., 2008, vol. 4 (pg. e1000107)
Parker KC, et al. Scheme for ranking potential HLA-A2 binding peptides based on independent binding of individual peptide side-chains. J. Immunol., 1994, vol. 152 (pg. 163-175)
Rammensee H, et al. SYFPEITHI: database for MHC ligands and peptide motifs. Immunogenetics, 1999, vol. 50 (pg. 213-219)
Sturniolo T, et al. Generation of tissue-specific and promiscuous HLA ligand databases using DNA microarrays and virtual HLA class II matrices. Nat. Biotechnol., 1999, vol. 17 (pg. 555-561)
Toussaint NC, Kohlbacher O. OptiTope--a web server for the selection of an optimal set of peptides for epitope-based vaccines. Nucleic Acids Res., 2009, vol. 37 (pg. W617-W622)
Toussaint NC, et al. A mathematical framework for the selection of an optimal set of peptides for epitope-based vaccines. PLoS Comput. Biol., 2008, vol. 4 (pg. e1000246)

Author notes

Associate Editor: John Quackenbush

This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.0/uk/) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.