Haleh Yasrebi, SurvJamda: an R package to predict patients' survival and risk assessment using joint analysis of microarray gene expression data, Bioinformatics, Volume 27, Issue 8, April 2011, Pages 1168–1169, https://doi.org/10.1093/bioinformatics/btr103
Abstract
Summary: SurvJamda (Survival prediction by joint analysis of microarray data) is an R package that uses joint analysis of microarray gene expression data to predict patients' survival and assess their risk. Joint analysis can be performed by merging datasets or by meta-analysis in order to increase the sample size and improve survival prognosis. The prognostic performance obtained from the combined datasets can be assessed to determine which feature selection approach, joint analysis method and bias estimation technique provide the most robust prognosis for a given set of datasets.
Availability: The survJamda package is available at the Comprehensive R Archive Network, http://cran.r-project.org.
Contact: [email protected]
1 INTRODUCTION
The survJamda package was developed for survival prediction and risk assessment based on microarray data. It allows datasets to be analyzed jointly through data merging or meta-analysis. Data merging combines the data into one set prior to analysis, whereas meta-analysis integrates only the results of the individual analyses. In addition to different joint analysis methods, survJamda contains various feature selection approaches and bias estimation techniques, which enable the user to determine which combination of methods provides the most robust prediction for a given set of datasets.
A few other R packages, such as ipdmeta (Broeze et al., 2009) and survcomp (Haibe-Kains et al., 2008), have been created for joint analysis; they are more specifically concerned with meta-analysis of censored data with time-to-event outcomes.
2 DATA
The functions and algorithms implemented in survJamda can be tested on the datasets of the survJamda.data package. SurvJamda.data, created by the author, is an 18 Mb data package containing three breast cancer datasets, GSE3143, GSE1992 and GSE4335, which were analyzed in Yasrebi et al. (2009). SurvJamda.data is also available from the Comprehensive R Archive Network.
3 METHODS
3.1 Feature selection
Top-ranking (Yasrebi et al., 2009). The multiple hypothesis testing correction implemented in the p.adjust function in the R stats package can also be applied to the top-ranking method.
User-defined method.
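As an illustration of combining top-ranking with a multiple hypothesis testing correction, the following is a minimal sketch in Python rather than R, and it does not use any survJamda function. It assumes that one p-value per gene is already available (for instance from a univariate survival model, which is an assumption here) and applies the Benjamini-Hochberg adjustment, which corresponds to the "BH" option of p.adjust. The function names, gene names and p-values are hypothetical.

```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    # Visit the p-values from largest to smallest.
    order = sorted(range(m), key=lambda i: pvalues[i], reverse=True)
    adjusted = [0.0] * m
    running_min = 1.0
    for k, i in enumerate(order):
        rank = m - k  # rank of this p-value in ascending order (1 = smallest)
        # Enforce monotonicity by carrying the minimum down from larger ranks.
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def top_ranking(pvalues_by_gene, n_top):
    """Return the n_top (gene, adjusted p-value) pairs with the smallest adjusted p-values."""
    genes = list(pvalues_by_gene)
    adjusted = bh_adjust([pvalues_by_gene[g] for g in genes])
    ranked = sorted(zip(genes, adjusted), key=lambda pair: pair[1])
    return ranked[:n_top]


# Hypothetical per-gene p-values, for illustration only.
example = {"geneA": 0.01, "geneB": 0.04, "geneC": 0.03, "geneD": 0.005}
selected = top_ranking(example, n_top=2)
```

The adjustment changes the reported p-values but not the ordering of the genes, so it matters mainly when a significance threshold, rather than a fixed number of top genes, is used for selection.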
3.2 Joint analysis methods
Merging method:
ComBat (Johnson et al., 2007).
Z-score normalization. This method is applied in two ways:
Z-score1 normalization: in this approach, all datasets are Z-score normalized (Larsen et al., 2000) prior to their selection for the training and testing sets and their combination into one set (Yasrebi et al., 2009).
Z-score2 normalization: in this method, the datasets are initially selected for the training and testing sets. Then, the datasets composing the training set are merged together and Z-score normalized. The testing set is also Z-score normalized independently and separately from the training set.
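The difference between the two Z-score variants can be sketched as follows. This is an illustrative Python sketch under stated assumptions, not the package's R implementation: each dataset is represented as a list of samples (rows) with one value per gene (columns), and normalization is assumed to be applied per gene across samples using the sample standard deviation. The function names and toy values are hypothetical.

```python
from statistics import mean, stdev


def zscore(dataset):
    """Z-score each gene (column) across the samples (rows) of one dataset.

    Assumes every gene has non-zero variance; a constant column would
    cause a division by zero.
    """
    columns = list(zip(*dataset))
    centers = [mean(col) for col in columns]
    scales = [stdev(col) for col in columns]
    return [
        [(value - c) / s for value, c, s in zip(sample, centers, scales)]
        for sample in dataset
    ]


def zscore1(train_sets, test_set):
    """Normalize every dataset on its own, then merge the training datasets."""
    train = [sample for ds in train_sets for sample in zscore(ds)]
    return train, zscore(test_set)


def zscore2(train_sets, test_set):
    """Merge the training datasets first, then normalize the merged set.

    The testing set is normalized separately from the training set.
    """
    merged = [sample for ds in train_sets for sample in ds]
    return zscore(merged), zscore(test_set)


# Toy datasets (3 samples x 2 genes each), for illustration only.
ds_a = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]
ds_b = [[11.0, 5.0], [12.0, 6.0], [13.0, 7.0]]
ds_test = [[0.0, 1.0], [2.0, 3.0]]

train1, test1 = zscore1([ds_a, ds_b], ds_test)
train2, test2 = zscore2([ds_a, ds_b], ds_test)
```

In this sketch the testing set is treated identically by both variants; only the training set differs, because Z-score1 removes each training dataset's own location and scale before merging, whereas Z-score2 uses the pooled location and scale.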
Meta-analysis. The inverse normal method (Hedges et al., 1985) is used for meta-analysis.
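For readers unfamiliar with the inverse normal method, a minimal Python sketch is given below. It assumes that each dataset contributes a one-sided p-value for the same gene, and that the datasets are either weighted equally or by user-supplied weights; how survJamda derives the p-values and weights is not stated here, so both are assumptions. The function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def inverse_normal_combine(pvalues, weights=None):
    """Combine one-sided p-values with the inverse normal method.

    Each p-value must lie strictly between 0 and 1, otherwise the
    inverse normal transform is undefined and an error is raised.
    Returns the combined z-score and the combined one-sided p-value.
    """
    std_normal = NormalDist()  # standard normal: mean 0, sd 1
    if weights is None:
        weights = [1.0] * len(pvalues)  # equal weights (assumption)
    # Transform each p-value to a z-score; small p gives large positive z.
    z_scores = [std_normal.inv_cdf(1.0 - p) for p in pvalues]
    # The weighted sum is rescaled so that it is standard normal under the null.
    combined_z = sum(w * z for w, z in zip(weights, z_scores)) / sqrt(
        sum(w * w for w in weights)
    )
    combined_p = 1.0 - std_normal.cdf(combined_z)
    return combined_z, combined_p


# Two hypothetical datasets, each giving p = 0.05 for the same gene.
z, p = inverse_normal_combine([0.05, 0.05])
```

Two moderately significant results pointing in the same direction yield a smaller combined p-value than either alone, which is the sense in which meta-analysis integrates only the results of the individual datasets.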
3.3 Validation frameworks
Cross validation (CV) nested in 10 iterations.
Independent validation.
Pair-wise mode: two datasets are selected at a time, one of which is used as the training set and the other as the testing set. This process is iterated until all datasets are used as the training and testing sets (Yasrebi et al., 2009).
Leave one dataset out: all datasets except one are merged together to form the training set and the left-out set is used as the testing set. Similarly, this process is iterated until all datasets are used as the training and testing sets (Yasrebi et al., 2009).
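The two independent validation modes differ only in how the training and testing sets are drawn from the available datasets. The sketch below, in Python, enumerates the splits for the three datasets of survJamda.data. It interprets the pair-wise mode as every ordered pair of distinct datasets, which is an assumption, and the function names are hypothetical.

```python
from itertools import permutations


def pairwise_splits(names):
    """Every ordered (training datasets, testing dataset) pair of distinct datasets."""
    return [([train], test) for train, test in permutations(names, 2)]


def leave_one_dataset_out_splits(names):
    """Each dataset is held out once; all the others form the training set."""
    return [
        ([name for name in names if name != held_out], held_out)
        for held_out in names
    ]


datasets = ["GSE3143", "GSE1992", "GSE4335"]
pairwise = pairwise_splits(datasets)
lodo = leave_one_dataset_out_splits(datasets)
```

With three datasets, this yields six pair-wise splits and three leave-one-dataset-out splits; in the latter, the training datasets would be merged with one of the merging methods before model fitting.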
3.4 Performance measures
Survival prediction is measured by the time-dependent area under the receiver operating characteristic curve (Heagerty et al., 2000), and risk assessment is measured by the hazard ratio (Yasrebi et al., 2009).
Concordance index (Haibe-Kains et al., 2008).
Brier score (Haibe-Kains et al., 2008).
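To make one of these measures concrete, the following Python sketch computes a concordance index for right-censored data in the style of Harrell's C. It is an illustration only and may differ in detail from the estimator used by the package: a pair of patients is treated as comparable when the patient with the shorter follow-up time had an observed event, pairs with tied times are ignored, and tied risk scores count as half concordant. The function name and toy data are hypothetical.

```python
def concordance_index(times, events, risk_scores):
    """Fraction of comparable patient pairs ranked correctly by the risk score.

    times       -- follow-up time of each patient
    events      -- 1 if the event was observed, 0 if the patient was censored
    risk_scores -- predicted risk; higher risk should mean shorter survival
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # Comparable pair: patient i has an observed event strictly
            # before patient j's follow-up time.
            if events[i] and times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data: four patients, the third one censored.
times = [2, 4, 6, 8]
events = [1, 1, 0, 1]
c_index = concordance_index(times, events, [0.9, 0.7, 0.5, 0.1])
```

A value of 0.5 corresponds to a risk score that carries no information about the ordering of survival times, and values closer to 1 indicate better discrimination.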
Conflict of Interest: none declared.
REFERENCES
Author notes
Associate Editor: Joaquin Dopazo