ANPELA: analysis and performance assessment of the label-free quantification workflow for metaproteomic studies

Overview on 18 quantification tools together with their corresponding mode of acquisition, acquisition technique and quantification metric. APEX, absolute protein expression index; Average-Top3, average intensity of the three highest intense peptides; emPAI, exponentially modified protein abundance index; iBAQ, intensity-based absolute quantification; IRP, internal reference peptide; MaxLFQ, delayed normalization and maximal peptide ratio extraction; NoC, intensity of nonconflicting peptides; NSAF, normalized spectral abundance factors; PAI, protein abundance index; Raw, raw spectral counts; Summed-Top3, summed intensity of the three highest intense peptides; TIC, total ion current; TPA, total protein approach; TRIC, transfer of identification confidence

Mode ofacquisition	Acquisition technique	Representativequantification tools	Quantification metric provided
Data-dependent acquisition (DDA)	Spectral counts (MS2)	Abacus	NSAF [47, 64, 130]
		Census	Raw [65]
		DTASelect	Raw [67]
		IRMa-hEIDI	Raw [129]
		MaxQuant	emPAI; APEX [47, 69, 130]
		MFPaQ	PAI [131]
		ProteinProphet	Raw [132]
		Thermo Proteome Discoverer	NSAF [33, 47, 130]
		Scaffold	emPAI; NSAF; Total spectra; Weighted spectra [133]
	Peak intensity (MS1)	MaxQuant	MaxLFQ; Summed-Top3; iBAQ; TPA [69]
		MFPaQ	TIC; Average TIC [134]
		OpenMS	TIC; Top three TIC [135]
		PEAKS	Average-Top3; Summed-Top3 [136]
		Progenesis	Average-Top3; NoC [33, 137]
		Proteios SE	Summed-Top3 [48,73]
		Scaffold	Average-Top3; iBAQ; TIC [138]
		Skyline	Average-Top3 [139]
		Thermo Proteome Discoverer	Minora algorithm [140, 141]
		Census	Weighted overall intensities [142]
Data-independent acquisition (DIA)	SWATH-MS	DIA-Umpire	Data-centric iBAQ abundance measures and ‘Top N’ metrics [65]
		OpenSWATH	Peptide-centric TRIC algorithm [143, 144]
		PeakView	Peptide-centric extracted ion chromatograms alignment [145]
		Spectronaut	Peptide-centric ‘target-decoy’ search strategy [75]
		Skyline	Peptide-centric IRP method [146]

Mode ofacquisition	Acquisition technique	Representativequantification tools	Quantification metric provided
Data-dependent acquisition (DDA)	Spectral counts (MS2)	Abacus	NSAF [47, 64, 130]
		Census	Raw [65]
		DTASelect	Raw [67]
		IRMa-hEIDI	Raw [129]
		MaxQuant	emPAI; APEX [47, 69, 130]
		MFPaQ	PAI [131]
		ProteinProphet	Raw [132]
		Thermo Proteome Discoverer	NSAF [33, 47, 130]
		Scaffold	emPAI; NSAF; Total spectra; Weighted spectra [133]
	Peak intensity (MS1)	MaxQuant	MaxLFQ; Summed-Top3; iBAQ; TPA [69]
		MFPaQ	TIC; Average TIC [134]
		OpenMS	TIC; Top three TIC [135]
		PEAKS	Average-Top3; Summed-Top3 [136]
		Progenesis	Average-Top3; NoC [33, 137]
		Proteios SE	Summed-Top3 [48,73]
		Scaffold	Average-Top3; iBAQ; TIC [138]
		Skyline	Average-Top3 [139]
		Thermo Proteome Discoverer	Minora algorithm [140, 141]
		Census	Weighted overall intensities [142]
Data-independent acquisition (DIA)	SWATH-MS	DIA-Umpire	Data-centric iBAQ abundance measures and ‘Top N’ metrics [65]
		OpenSWATH	Peptide-centric TRIC algorithm [143, 144]
		PeakView	Peptide-centric extracted ion chromatograms alignment [145]
		Spectronaut	Peptide-centric ‘target-decoy’ search strategy [75]
		Skyline	Peptide-centric IRP method [146]

Table 1

Mode ofacquisition	Acquisition technique	Representativequantification tools	Quantification metric provided
Data-dependent acquisition (DDA)	Spectral counts (MS2)	Abacus	NSAF [47, 64, 130]
		Census	Raw [65]
		DTASelect	Raw [67]
		IRMa-hEIDI	Raw [129]
		MaxQuant	emPAI; APEX [47, 69, 130]
		MFPaQ	PAI [131]
		ProteinProphet	Raw [132]
		Thermo Proteome Discoverer	NSAF [33, 47, 130]
		Scaffold	emPAI; NSAF; Total spectra; Weighted spectra [133]
	Peak intensity (MS1)	MaxQuant	MaxLFQ; Summed-Top3; iBAQ; TPA [69]
		MFPaQ	TIC; Average TIC [134]
		OpenMS	TIC; Top three TIC [135]
		PEAKS	Average-Top3; Summed-Top3 [136]
		Progenesis	Average-Top3; NoC [33, 137]
		Proteios SE	Summed-Top3 [48,73]
		Scaffold	Average-Top3; iBAQ; TIC [138]
		Skyline	Average-Top3 [139]
		Thermo Proteome Discoverer	Minora algorithm [140, 141]
		Census	Weighted overall intensities [142]
Data-independent acquisition (DIA)	SWATH-MS	DIA-Umpire	Data-centric iBAQ abundance measures and ‘Top N’ metrics [65]
		OpenSWATH	Peptide-centric TRIC algorithm [143, 144]
		PeakView	Peptide-centric extracted ion chromatograms alignment [145]
		Spectronaut	Peptide-centric ‘target-decoy’ search strategy [75]
		Skyline	Peptide-centric IRP method [146]

Mode ofacquisition	Acquisition technique	Representativequantification tools	Quantification metric provided
Data-dependent acquisition (DDA)	Spectral counts (MS2)	Abacus	NSAF [47, 64, 130]
		Census	Raw [65]
		DTASelect	Raw [67]
		IRMa-hEIDI	Raw [129]
		MaxQuant	emPAI; APEX [47, 69, 130]
		MFPaQ	PAI [131]
		ProteinProphet	Raw [132]
		Thermo Proteome Discoverer	NSAF [33, 47, 130]
		Scaffold	emPAI; NSAF; Total spectra; Weighted spectra [133]
	Peak intensity (MS1)	MaxQuant	MaxLFQ; Summed-Top3; iBAQ; TPA [69]
		MFPaQ	TIC; Average TIC [134]
		OpenMS	TIC; Top three TIC [135]
		PEAKS	Average-Top3; Summed-Top3 [136]
		Progenesis	Average-Top3; NoC [33, 137]
		Proteios SE	Summed-Top3 [48,73]
		Scaffold	Average-Top3; iBAQ; TIC [138]
		Skyline	Average-Top3 [139]
		Thermo Proteome Discoverer	Minora algorithm [140, 141]
		Census	Weighted overall intensities [142]
Data-independent acquisition (DIA)	SWATH-MS	DIA-Umpire	Data-centric iBAQ abundance measures and ‘Top N’ metrics [65]
		OpenSWATH	Peptide-centric TRIC algorithm [143, 144]
		PeakView	Peptide-centric extracted ion chromatograms alignment [145]
		Spectronaut	Peptide-centric ‘target-decoy’ search strategy [75]
		Skyline	Peptide-centric IRP method [146]

Table 2

Abbreviated names of the subsequent processing methods used in this study together with their corresponding full names and brief descriptions

	Processing method (abbreviation)	Brief description of each method
Transformation	Box–Cox Transformation (BOX)	Transforming data based on linearity, normality and homoscedasticity assumption [76]
	Cube Root Transformation (CUB)	Improving normality distribution of simple count data [77]
	Log Transformation (LOG)	Almost routinely carried out for reaching a more symmetricdistribution [78]
	None (NON)	No transformation method applied
	Power Transformation (POW)	Stabilizing variance and making data more normal distribution-like [79]
Normalization	Auto Scaling (ATO)	Scaling protein intensities based on the standard deviation of OMICdata [80]
	Cyclic Loess (CYC)	Estimating a regression surface using multivariate smoothingprocedure [81]
	EigenMS (EIG)	Preserving true differences based on treatment effect and singularvalue decomposition [50]
	Linear Baseline Scaling (LIN)	Mapping each spectrum to the baseline based on a constant linearrelationship [82]
	Locally Weighted Scatterplot Smoothing (LOW)	Normalizing a proteomic data set by compensating for non-linearbias [83]
	Mean Normalization (MEA)	Normalizing the data by mean value of all signals to eliminatebackground effect [84]
	Median Absolute Deviation (MAD)	A measure of the spread of the data and used to estimate the samplestandard deviation [85]
	Median Normalization (MED)	Scaling the samples so that they have the same median [86]
	None (NON)	No normalization method applied
	Pareto Scaling (PAR)	Reducing the weight of large fold changes in protein intensities bystandard deviation [87]
	Probabilistic Quotient Normalization (PQN)	Transforming the spectra based on an overall estimation on the mostprobable dilution [88]
	Quantile Normalization (QUA)	Achieving the same distribution of protein intensities across allsamples [82]
	Robust Linear Regression (RLR)	Used for transference to rescale one reference interval to anotherscale [50]
	Total Ion Current (TIC)	Summing all the separate ion currents carried by the ions of differentm/z [89]
	Trimmed Mean of M Values (TMM)	Estimating scale factors between samples for differential expressionanalysis [90]
	Variance Stabilization Normalization (VSN)	A non-linear method for keeping the variance constant over the entiredata range [50]
	Z-score normalization (ZSC)	Normalizing data based on the mean and standard deviation [92]
Imputation	Background Imputation (BAK)	Simulating the situation where protein values are missing [44]
	Bayesian Principal Component Imputation (BPC)	Providing capacity to auto-select the parameters used in theestimation [44]
	Censored Imputation (CEN)	Imputing non-missing completely at random missing values by lowestintensity value [44]
	K-nearest Neighbor Imputation (KNN)	Imputing values based on K proteins similar to the proteins withmissing values [93]
	Local Least Squares Imputation (LLS)	Missing value imputation by a linear combination of similar genesidentified by KNN [44]
	None (NON)	No imputation/filtering method applied
	Singular Value Decomposition (SVD)	Estimating the missing values based on a linear consideration [93]
	Zero Imputation (ZER)	Replacing the missing values with zeros deemed to the simplestimputation method [94]

	Processing method (abbreviation)	Brief description of each method
Transformation	Box–Cox Transformation (BOX)	Transforming data based on linearity, normality and homoscedasticity assumption [76]
	Cube Root Transformation (CUB)	Improving normality distribution of simple count data [77]
	Log Transformation (LOG)	Almost routinely carried out for reaching a more symmetricdistribution [78]
	None (NON)	No transformation method applied
	Power Transformation (POW)	Stabilizing variance and making data more normal distribution-like [79]
Normalization	Auto Scaling (ATO)	Scaling protein intensities based on the standard deviation of OMICdata [80]
	Cyclic Loess (CYC)	Estimating a regression surface using multivariate smoothingprocedure [81]
	EigenMS (EIG)	Preserving true differences based on treatment effect and singularvalue decomposition [50]
	Linear Baseline Scaling (LIN)	Mapping each spectrum to the baseline based on a constant linearrelationship [82]
	Locally Weighted Scatterplot Smoothing (LOW)	Normalizing a proteomic data set by compensating for non-linearbias [83]
	Mean Normalization (MEA)	Normalizing the data by mean value of all signals to eliminatebackground effect [84]
	Median Absolute Deviation (MAD)	A measure of the spread of the data and used to estimate the samplestandard deviation [85]
	Median Normalization (MED)	Scaling the samples so that they have the same median [86]
	None (NON)	No normalization method applied
	Pareto Scaling (PAR)	Reducing the weight of large fold changes in protein intensities bystandard deviation [87]
	Probabilistic Quotient Normalization (PQN)	Transforming the spectra based on an overall estimation on the mostprobable dilution [88]
	Quantile Normalization (QUA)	Achieving the same distribution of protein intensities across allsamples [82]
	Robust Linear Regression (RLR)	Used for transference to rescale one reference interval to anotherscale [50]
	Total Ion Current (TIC)	Summing all the separate ion currents carried by the ions of differentm/z [89]
	Trimmed Mean of M Values (TMM)	Estimating scale factors between samples for differential expressionanalysis [90]
	Variance Stabilization Normalization (VSN)	A non-linear method for keeping the variance constant over the entiredata range [50]
	Z-score normalization (ZSC)	Normalizing data based on the mean and standard deviation [92]
Imputation	Background Imputation (BAK)	Simulating the situation where protein values are missing [44]
	Bayesian Principal Component Imputation (BPC)	Providing capacity to auto-select the parameters used in theestimation [44]
	Censored Imputation (CEN)	Imputing non-missing completely at random missing values by lowestintensity value [44]
	K-nearest Neighbor Imputation (KNN)	Imputing values based on K proteins similar to the proteins withmissing values [93]
	Local Least Squares Imputation (LLS)	Missing value imputation by a linear combination of similar genesidentified by KNN [44]
	None (NON)	No imputation/filtering method applied
	Singular Value Decomposition (SVD)	Estimating the missing values based on a linear consideration [93]
	Zero Imputation (ZER)	Replacing the missing values with zeros deemed to the simplestimputation method [94]

Table 2

Abbreviated names of the subsequent processing methods used in this study together with their corresponding full names and brief descriptions

	Processing method (abbreviation)	Brief description of each method
Transformation	Box–Cox Transformation (BOX)	Transforming data based on linearity, normality and homoscedasticity assumption [76]
	Cube Root Transformation (CUB)	Improving normality distribution of simple count data [77]
	Log Transformation (LOG)	Almost routinely carried out for reaching a more symmetricdistribution [78]
	None (NON)	No transformation method applied
	Power Transformation (POW)	Stabilizing variance and making data more normal distribution-like [79]
Normalization	Auto Scaling (ATO)	Scaling protein intensities based on the standard deviation of OMICdata [80]
	Cyclic Loess (CYC)	Estimating a regression surface using multivariate smoothingprocedure [81]
	EigenMS (EIG)	Preserving true differences based on treatment effect and singularvalue decomposition [50]
	Linear Baseline Scaling (LIN)	Mapping each spectrum to the baseline based on a constant linearrelationship [82]
	Locally Weighted Scatterplot Smoothing (LOW)	Normalizing a proteomic data set by compensating for non-linearbias [83]
	Mean Normalization (MEA)	Normalizing the data by mean value of all signals to eliminatebackground effect [84]
	Median Absolute Deviation (MAD)	A measure of the spread of the data and used to estimate the samplestandard deviation [85]
	Median Normalization (MED)	Scaling the samples so that they have the same median [86]
	None (NON)	No normalization method applied
	Pareto Scaling (PAR)	Reducing the weight of large fold changes in protein intensities bystandard deviation [87]
	Probabilistic Quotient Normalization (PQN)	Transforming the spectra based on an overall estimation on the mostprobable dilution [88]
	Quantile Normalization (QUA)	Achieving the same distribution of protein intensities across allsamples [82]
	Robust Linear Regression (RLR)	Used for transference to rescale one reference interval to anotherscale [50]
	Total Ion Current (TIC)	Summing all the separate ion currents carried by the ions of differentm/z [89]
	Trimmed Mean of M Values (TMM)	Estimating scale factors between samples for differential expressionanalysis [90]
	Variance Stabilization Normalization (VSN)	A non-linear method for keeping the variance constant over the entiredata range [50]
	Z-score normalization (ZSC)	Normalizing data based on the mean and standard deviation [92]
Imputation	Background Imputation (BAK)	Simulating the situation where protein values are missing [44]
	Bayesian Principal Component Imputation (BPC)	Providing capacity to auto-select the parameters used in theestimation [44]
	Censored Imputation (CEN)	Imputing non-missing completely at random missing values by lowestintensity value [44]
	K-nearest Neighbor Imputation (KNN)	Imputing values based on K proteins similar to the proteins withmissing values [93]
	Local Least Squares Imputation (LLS)	Missing value imputation by a linear combination of similar genesidentified by KNN [44]
	None (NON)	No imputation/filtering method applied
	Singular Value Decomposition (SVD)	Estimating the missing values based on a linear consideration [93]
	Zero Imputation (ZER)	Replacing the missing values with zeros deemed to the simplestimputation method [94]

	Processing method (abbreviation)	Brief description of each method
Transformation	Box–Cox Transformation (BOX)	Transforming data based on linearity, normality and homoscedasticity assumption [76]
	Cube Root Transformation (CUB)	Improving normality distribution of simple count data [77]
	Log Transformation (LOG)	Almost routinely carried out for reaching a more symmetricdistribution [78]
	None (NON)	No transformation method applied
	Power Transformation (POW)	Stabilizing variance and making data more normal distribution-like [79]
Normalization	Auto Scaling (ATO)	Scaling protein intensities based on the standard deviation of OMICdata [80]
	Cyclic Loess (CYC)	Estimating a regression surface using multivariate smoothingprocedure [81]
	EigenMS (EIG)	Preserving true differences based on treatment effect and singularvalue decomposition [50]
	Linear Baseline Scaling (LIN)	Mapping each spectrum to the baseline based on a constant linearrelationship [82]
	Locally Weighted Scatterplot Smoothing (LOW)	Normalizing a proteomic data set by compensating for non-linearbias [83]
	Mean Normalization (MEA)	Normalizing the data by mean value of all signals to eliminatebackground effect [84]
	Median Absolute Deviation (MAD)	A measure of the spread of the data and used to estimate the samplestandard deviation [85]
	Median Normalization (MED)	Scaling the samples so that they have the same median [86]
	None (NON)	No normalization method applied
	Pareto Scaling (PAR)	Reducing the weight of large fold changes in protein intensities bystandard deviation [87]
	Probabilistic Quotient Normalization (PQN)	Transforming the spectra based on an overall estimation on the mostprobable dilution [88]
	Quantile Normalization (QUA)	Achieving the same distribution of protein intensities across allsamples [82]
	Robust Linear Regression (RLR)	Used for transference to rescale one reference interval to anotherscale [50]
	Total Ion Current (TIC)	Summing all the separate ion currents carried by the ions of differentm/z [89]
	Trimmed Mean of M Values (TMM)	Estimating scale factors between samples for differential expressionanalysis [90]
	Variance Stabilization Normalization (VSN)	A non-linear method for keeping the variance constant over the entiredata range [50]
	Z-score normalization (ZSC)	Normalizing data based on the mean and standard deviation [92]
Imputation	Background Imputation (BAK)	Simulating the situation where protein values are missing [44]
	Bayesian Principal Component Imputation (BPC)	Providing capacity to auto-select the parameters used in theestimation [44]
	Censored Imputation (CEN)	Imputing non-missing completely at random missing values by lowestintensity value [44]
	K-nearest Neighbor Imputation (KNN)	Imputing values based on K proteins similar to the proteins withmissing values [93]
	Local Least Squares Imputation (LLS)	Missing value imputation by a linear combination of similar genesidentified by KNN [44]
	None (NON)	No imputation/filtering method applied
	Singular Value Decomposition (SVD)	Estimating the missing values based on a linear consideration [93]
	Zero Imputation (ZER)	Replacing the missing values with zeros deemed to the simplestimputation method [94]

Criteria and corresponding assessment metrics for evaluating the performance of LFQ workflow

Besides the criterion ‘precision’ (intragroup variations) primarily applied for assessing the performance of LFQ [50], a variety of criteria have been developed and applied in the LFQ-related studies [20, 51, 52]. Due to the independent nature of these criteria [47], they are known to complement one another and highly recommended to be collectively employed for ‘thoroughly’ evaluating specific LFQ workflow [46, 47, 52]. Particularly, two independent criteria, ‘precision’ and ‘accuracy’, were collectively considered to enable the in-depth evaluation of LFQ [20, 44, 50], and three independent criteria were further integrated for the first time to comparatively evaluate various LFQ methods [47]. Moreover, an appropriate LFQ is expected to retain or even increase the difference in proteomic data between two distinct sample classes [59, 95]. In the meantime, the reproducibility of identified markers is also recognized as another independent criterion for assessing the performance of LFQ [52]. Thus, five independent criteria were used here to assess the performance of an LFQ workflow in ANPELA (each measured by a variety of assessment metrics; Supplementary Method S3).

(a) Precision of LFQ based on the proteomes among replicates

Different acquisition techniques, various tools for preprocessing raw proteomics data and diverse approaches for data processing (transformation, normalization, filtering and missing value imputation) could profoundly affect the precision of LFQ [20], which was assessed by pooled coefficient of variation (PCV) of the reported protein intensities among replicates [20, 96]. PCV was thus adopted as an assessment metric reflecting LFQ’s ability to reduce variations among replicates [97] and to enhance technical reproducibility [56]. A lower value of PCV indicates a more complete reduction of unwanted signals (denoting improved LFQ precision) [98].

(b) Classification ability of LFQ between distinct sample groups

An appropriate LFQ is expected to retain or even increase the difference in metaproteomic data between two distinct sample groups [59, 95]. Two-way clustering of proteins identified from these two groups is thus used as an assessment metric to assess LFQ classification ability [95, 99]. First, the total number of protein intensities in each sample is reduced by feature selection. Then proteins (rows) and samples (columns) are clustered by their similarities in a protein intensity profile. Details can be found in Supplementary Method S3.

(c) Differential expression analysis based on reproducibility optimization

To avoid overfitting/confounding in LFQ, distribution of P-values of protein intensities between two distinct sample groups is examined [100, 101]. Unbiased variation is expected for the majority of proteins (without clear differentiation of expression), with an obvious peak in the [0.00, 0.05] interval corresponding to proteins with differential intensities [101]. Moreover, a volcano plot of proteins with differential intensity provides a glimpse of the total number of differentially expressed proteins [102]. In a metaproteomic study exploring mechanisms underlying complex biological process, the limited number of differentially expressed proteins may result in false discovery [103]. Thus, the statistical differences (P-value) of protein intensities between distinct groups are first estimated by reproducibility-optimized test statistic (ROTS) [61, 104]. Distribution of P-values is then provided, and a skewed distribution may indicate overfitting and/or confounding [105].

(d) Reproducibility of the identified protein markers among different data sets

Consistency score (CS) is a popular assessment metric representing the reproducibility of identified markers [52, 106] and is used to assess the level of consistency among the protein markers discovered from the various portions of a given data set [107]. A higher CS denotes a more stable discovery of markers [52]. First, a random sampling is conducted on the studied data set to generate multiple sub-data sets. Then each protein is ranked based on its significance (q-value) and absolute fold change. Finally, those top-ranked proteins in each sub-data set are selected as markers before the calculation of the CS (Supplementary Method S3).

(e) Accuracy of LFQ based on spiked and background proteins

Extra experimental data (e.g. spiked proteins) are frequently generated and used as a reference to validate or adjust LFQ’s performances [20, 96], and it is therefore straightforward to know the expected log fold changes (logFCs) for both spiked and background proteins (the expected logFC for background proteins is zero) [102]. First, the differentially expressed proteins are identified by ROTS. Then the true positive rate (TPR) for the successful discovery of spiked protein is calculated. A higher TPR corresponds to a greater accuracy. Moreover, the logFCs of protein intensity (for both spiked and background proteins) between distinct sample groups are measured, and the correlation between measured and expected logFCs is then assessed by mean squared error. A higher level of correlation indicates a more accurate LFQ workflow [102].

Each criterion above evaluated the LFQ performance based on its own underlying theory, and the combination of multiple criteria resulted in a systematic assessment. The assessment results are directly provided as tables or illustrated as plots online, all of which can be downloaded from the ANPELA web page. More information can be found in Supplementary Method S3.

Details of web server implementation

Official (https://idrblab.org/anpela/) and mirror (http://idrb.zju.edu.cn/anpela/) sites of ANPELA are deployed on a server running ‘Cent OS Linux v7.0 operating system’, ‘Apache HTTP web server v2.2.15’ and ‘Apache Tomcat servlet container’ (http://httpd.apache.org). The web interface was constructed by R Package v3.4.1 (Shiny v0.13.1) running on the Shiny-server v1.4.1.759 (http://www.rstudio.com/shiny). A number of R packages were utilized in the background processes including affy, AUC, DiffCorr, DT, e1071, fastlo, ggfortify, ggsci, ggplot2, gplots, impute, limma, LPE, metabolomics, MetNorm, NOISeq, pcaMethods, png, RcmdrMisc, ROTS, rmarkdown, ropls, shiny, shinyBS, shinydashboard, shinyRGL, statTarget and vsn. ANPELA website can be readily accessed by all users with no login requirement and by a variety of popular web browsers including Google Chrome, Mozilla Firefox, Safari and Internet Explorer (10 or later). For using the stand-alone version of ANPELA downloaded directly from the official and mirror sites, the installation of R environment is required before its running. The ‘User Manual’ of the stand-alone version of tool was provided in Supplementary Method S4.

The input formats required by ANPELA

The input files required by ANPELA are the standard output raw files of the 18 quantification tools, together with a label file indicating the category of each sample. The standard output raw files of quantification tools are described in the ‘Tutorial’ panel of ANPELA. Alternatively, a standard format designed by ANPELA is also accepted for assessing the in-house tool or predefined analytical workflow, which should be in ‘csv’ format with the dimension of m × n (m and n indicate the numbers of metaproteomic features and microbial samples, respectively). Additionally, an additional file providing the concentrations of known proteins (such as spiked proteins) is required to assess the performances of LFQ based on the last criterion. Specifically, the extra file should contain the ‘class of samples’ and the ‘Sample ID’. The ‘Sample ID’ should be unique and defined by the preference of the ANPELA user, and the ‘class of samples’ refers to the group of ‘Sample ID’. The input sample file can be found in the ‘Tutorial’ panel of ANPELA.

The formats and visualization of ANPELA output files

The output files of ANPELA include the following: (1) a variety of statistical measures (e.g. coefficient of variation, median absolute deviation, CS and PCV value), (2) the histograms, boxplots and matrixplots before and after transformation, normalization, filtering and missing value imputation, (3) Venn diagrams assessing LFQ’s reproducibility, (4) a map of PCV distribution of protein intensities among replicates, (5) the P-value distribution, heatmap and volcano plot based on the identified markers, (6) boxplots of the deviations between the measured and expected logFCs of spiked and background proteins and (7) bar chart and receiver operating characteristic curve measuring quantification accuracy of both spiked and background proteins. The resulting LFQ files and assessment documents in a compressed ZIP format can be downloaded from ANPELA.

Results and discussion

The standard workflow of ANPELA

The main components of the ANPELA framework include four steps: (α) uploading of the raw metaproteomic data. Various file formats generated by all 18 quantification tools can be accepted, and the users are asked to upload specific files containing the data generated by those tools, together with a label file indicating the class of each sample. If the users want to process their own data before the ANPELA analysis, they can upload the data in a unified format defined by ANPELA. (β) data transformation and normalization. Transformation aims at reducing the impacts of very large value of intensities and making them more comparable or normally distributed [108, 109], and normalization can remove variability during the separate sample preparations and Mass Spectrometry (MS) runs [46, 50, 78]. (γ) data filtering and imputation. In total, eight popular approaches of metaproteomic studies were integrated. Missing values are frequently encountered in metaproteomic data sets and can greatly hamper subsequent OMIC analysis (with no calculation result for some extreme cases) [83]. (δ) multifaceted performance assessment. Each LFQ workflow is collectively evaluated from five perspectives. A series of assessment metrics (represented by digital numbers or statistical plots) are provided. Figure 1 illustrates the standard workflow of ANPELA, and a detailed demo on its usage can be found in the ‘Tutorial’ panel.

Figure 1

The standard workflow of ANPELA: (α) uploading the raw metaproteomic data preprocessed by 18 quantification tools; (β) data transformation and normalization; (γ) data filtering and imputation; and (δ) the multifaceted performance evaluation from five different perspectives.

As known, a web server is accessible online and can work regardless of users’ operating system and platform, providing advantage over stand-alone applications in terms of accessibility and software update. However, the web server has the disadvantage of being slower than the stand-alone applications due to the time cost of web connection and the shared nature of computational resources [110]. Therefore, the function of ANPELA was realized by not only an online web server but also a local version downloadable directly from ANPELA sites. On one hand, the online web server provided the unique function of evaluating the performance of the whole LFQ workflow from multiple perspectives; on the other hand, the downloaded local tool enabled the discovery of the optimal LFQ(s) by comprehensively ranking all (more than 500) possible LFQ workflows. The official (https://idrblab.org/anpela/) and mirror (http://idrb.zju.edu.cn/anpela/) sites of ANPELA could be readily assessed. Additionally, the source code of the local tool could be downloaded from the home page of these sites and run under R environment without extra installation. The ‘User Manual’ of the local version of ANPELA was also provided in Supplementary Method S4.

ANPELA facilitates the discovery of well-performing LFQ workflows for metaproteomic study

To comprehensively evaluate the level of dependence of LFQ workflows on different metaproteomic data sets, PRIDE database [111] was screened by searching the keywords including ‘Microbiota’, ‘Microbiome’ and ‘Metaproteomic’, which resulted in 106 metaproteomics-related records. Then the corresponding studies of the records were systematically reviewed. By considering several additional criteria [LFQ, the availability of raw intensity data files and the protein database or library to search against, the well-defined parameters (such as isolation scheme and range of retention time) and the clear description on distinct sample groups], eight representative metaproteomic data sets were finally selected for further analysis: PXD006224 [42] (60 metabolic phase & 24 equilibrium phase fecal samples), PXD002882 [112] (21 Crohn’s disease patients & 10 healthy individuals), PXD006129 [113] (14 western style diet & 14 chow-fed mice), PXD006070 [114] (9 corn & 9 grass silage-based samples), PXD003028 [115] (8 people before & 8 people after their breakfast), PXD000987 [116] (8 transverse colon & 8 descending colon samples), PXD005929 [117] (3 surface-exposed & 3 whole cell extracts) and PXD006810 [118] (3 NleB1-infected and 3 wild-type cells). Detailed information of these benchmark data sets could be found in Supplementary Table S3.

Herein, the PCV of protein intensities among replicates for each data set was calculated and applied to assess the ‘precision’ of LFQ. A lower PCV value denoted a more complete reduction of the unwanted signal, which indicated improved LFQ ‘precision’ [20, 98]. The performance of each workflow could be categorized into four groups based on the following PCV values: superior (<0.14) [20, 119], good (0.14~0.3) [119, 120], fair (0.3~0.7) [121] and poor (>0.7) [121]. For 20 representative workflows (as shown in Table 3), their performances on different data sets varied significantly. Taking the first workflow (BOX-MAD-SVD) shown in Table 3 as an example, its performance was superior, good, fair and poor in 1, 2, 4 and 1 data sets, respectively. The variation in LFQ performances among eight metaproteomic data sets are further illustrated in Supplementary Figures S1 and S2. As shown, there was a clear data set-dependent characteristic for the analyzed representative workflows. Thus, for a particular metaproteomic study, it was necessary to assess the performance of each LFQ to discover the well-performing ones, and ANPELA was developed to provide this unique function by assessing each workflow as a whole analysis chain.

Table 3

Evaluation of different LFQ workflows based on the value of PCV of eight metaproteomic benchmark data sets. The performance of a workflow was categorized into four groups based on the following PCV values: superior (<0.14, double underline), good (0.14~0.3, single underline), fair (0.3~0.7, dotted underline) and poor (>0.7, no underline). The workflow abbreviations are provided in Table 2, and those benchmark data sets are in descending order by their total number of samples (the number of cases versus that of the controls, as listed in the brackets under each data set ID)

Workflow	PXD006224 (60:24)	PXD002882 (21:10)	PXD006129 (14:14)	PXD006070 (9:9)	PXD003028 (8:8)	PXD000987 (4:4)	PXD005929 (3:3)	PXD006810 (3:3)
BOX-MAD-SVD	0.05	0.15	0.47	0.38	0.44	0.64	0.27	1.31
BOX-EIG-KNN	0.26	0.48	0.32	0.19	0.26	0.22	0.10	0.28
BOX-QUA-BAK	0.43	0.76	0.65	0.54	0.54	0.46	0.22	0.64
BOX-VSN-CEN	0.09	0.02	0.14	0.09	0.10	0.09	0.04	0.43
CUB-EIG-KNN	0.29	0.52	0.36	0.21	0.29	0.25	0.11	0.31
CUB-MAD-SVD	0.05	0.15	0.47	0.38	0.44	0.64	0.27	1.31
CUB-RLR-CEN	0.75	0.93	0.85	0.59	0.61	0.53	0.24	0.55
CUB-VSN-BAK	0.11	0.18	0.15	0.76	0.12	0.11	0.05	0.60
LOG-EIG-CEN	0.10	0.24	0.13	0.12	0.12	0.09	0.03	0.12
LOG-PQN-ZER	0.90	0.95	1.27	1.07	0.87	0.68	0.29	1.34
LOG-TIC-BAK	0.21	0.31	0.23	0.45	0.21	0.17	0.09	0.72
LOG-VSN-SVD	0.06	0.15	0.48	0.39	0.45	0.65	0.27	1.31
NON-EIG-KNN	1.08	1.55	1.07	0.69	0.96	0.82	0.32	0.82
NON-MAD-SVD	0.05	0.15	0.47	0.38	0.44	0.64	0.27	1.30
NON-VSN-BAK	0.11	0.23	0.17	0.12	0.12	0.11	0.04	0.14
NON-ZSC-CEN	0.64	1.60	0.94	1.68	1.39	1.56	0.44	1.06
POW-CYC-BAK	1.25	2.95	2.46	1.20	4.55	10.83	1.00	6.08
POW-LOW-BAK	2.26	0.41	4.71	4.26	4.08	4.41	15.87	6.64
POW-TMM-ZER	1.68	1.07	1.11	1.47	4.07	2.65	11.16	1.07
POW-VSN-CEN	0.19	0.26	0.28	0.25	0.32	0.43	0.14	0.33

Workflow	PXD006224 (60:24)	PXD002882 (21:10)	PXD006129 (14:14)	PXD006070 (9:9)	PXD003028 (8:8)	PXD000987 (4:4)	PXD005929 (3:3)	PXD006810 (3:3)
BOX-MAD-SVD	0.05	0.15	0.47	0.38	0.44	0.64	0.27	1.31
BOX-EIG-KNN	0.26	0.48	0.32	0.19	0.26	0.22	0.10	0.28
BOX-QUA-BAK	0.43	0.76	0.65	0.54	0.54	0.46	0.22	0.64
BOX-VSN-CEN	0.09	0.02	0.14	0.09	0.10	0.09	0.04	0.43
CUB-EIG-KNN	0.29	0.52	0.36	0.21	0.29	0.25	0.11	0.31
CUB-MAD-SVD	0.05	0.15	0.47	0.38	0.44	0.64	0.27	1.31
CUB-RLR-CEN	0.75	0.93	0.85	0.59	0.61	0.53	0.24	0.55
CUB-VSN-BAK	0.11	0.18	0.15	0.76	0.12	0.11	0.05	0.60
LOG-EIG-CEN	0.10	0.24	0.13	0.12	0.12	0.09	0.03	0.12
LOG-PQN-ZER	0.90	0.95	1.27	1.07	0.87	0.68	0.29	1.34
LOG-TIC-BAK	0.21	0.31	0.23	0.45	0.21	0.17	0.09	0.72
LOG-VSN-SVD	0.06	0.15	0.48	0.39	0.45	0.65	0.27	1.31
NON-EIG-KNN	1.08	1.55	1.07	0.69	0.96	0.82	0.32	0.82
NON-MAD-SVD	0.05	0.15	0.47	0.38	0.44	0.64	0.27	1.30
NON-VSN-BAK	0.11	0.23	0.17	0.12	0.12	0.11	0.04	0.14
NON-ZSC-CEN	0.64	1.60	0.94	1.68	1.39	1.56	0.44	1.06
POW-CYC-BAK	1.25	2.95	2.46	1.20	4.55	10.83	1.00	6.08
POW-LOW-BAK	2.26	0.41	4.71	4.26	4.08	4.41	15.87	6.64
POW-TMM-ZER	1.68	1.07	1.11	1.47	4.07	2.65	11.16	1.07
POW-VSN-CEN	0.19	0.26	0.28	0.25	0.32	0.43	0.14	0.33

Table 3

Workflow	PXD006224 (60:24)	PXD002882 (21:10)	PXD006129 (14:14)	PXD006070 (9:9)	PXD003028 (8:8)	PXD000987 (4:4)	PXD005929 (3:3)	PXD006810 (3:3)
BOX-MAD-SVD	0.05	0.15	0.47	0.38	0.44	0.64	0.27	1.31
BOX-EIG-KNN	0.26	0.48	0.32	0.19	0.26	0.22	0.10	0.28
BOX-QUA-BAK	0.43	0.76	0.65	0.54	0.54	0.46	0.22	0.64
BOX-VSN-CEN	0.09	0.02	0.14	0.09	0.10	0.09	0.04	0.43
CUB-EIG-KNN	0.29	0.52	0.36	0.21	0.29	0.25	0.11	0.31
CUB-MAD-SVD	0.05	0.15	0.47	0.38	0.44	0.64	0.27	1.31
CUB-RLR-CEN	0.75	0.93	0.85	0.59	0.61	0.53	0.24	0.55
CUB-VSN-BAK	0.11	0.18	0.15	0.76	0.12	0.11	0.05	0.60
LOG-EIG-CEN	0.10	0.24	0.13	0.12	0.12	0.09	0.03	0.12
LOG-PQN-ZER	0.90	0.95	1.27	1.07	0.87	0.68	0.29	1.34
LOG-TIC-BAK	0.21	0.31	0.23	0.45	0.21	0.17	0.09	0.72
LOG-VSN-SVD	0.06	0.15	0.48	0.39	0.45	0.65	0.27	1.31
NON-EIG-KNN	1.08	1.55	1.07	0.69	0.96	0.82	0.32	0.82
NON-MAD-SVD	0.05	0.15	0.47	0.38	0.44	0.64	0.27	1.30
NON-VSN-BAK	0.11	0.23	0.17	0.12	0.12	0.11	0.04	0.14
NON-ZSC-CEN	0.64	1.60	0.94	1.68	1.39	1.56	0.44	1.06
POW-CYC-BAK	1.25	2.95	2.46	1.20	4.55	10.83	1.00	6.08
POW-LOW-BAK	2.26	0.41	4.71	4.26	4.08	4.41	15.87	6.64
POW-TMM-ZER	1.68	1.07	1.11	1.47	4.07	2.65	11.16	1.07
POW-VSN-CEN	0.19	0.26	0.28	0.25	0.32	0.43	0.14	0.33

Workflow	PXD006224 (60:24)	PXD002882 (21:10)	PXD006129 (14:14)	PXD006070 (9:9)	PXD003028 (8:8)	PXD000987 (4:4)	PXD005929 (3:3)	PXD006810 (3:3)
BOX-MAD-SVD	0.05	0.15	0.47	0.38	0.44	0.64	0.27	1.31
BOX-EIG-KNN	0.26	0.48	0.32	0.19	0.26	0.22	0.10	0.28
BOX-QUA-BAK	0.43	0.76	0.65	0.54	0.54	0.46	0.22	0.64
BOX-VSN-CEN	0.09	0.02	0.14	0.09	0.10	0.09	0.04	0.43
CUB-EIG-KNN	0.29	0.52	0.36	0.21	0.29	0.25	0.11	0.31
CUB-MAD-SVD	0.05	0.15	0.47	0.38	0.44	0.64	0.27	1.31
CUB-RLR-CEN	0.75	0.93	0.85	0.59	0.61	0.53	0.24	0.55
CUB-VSN-BAK	0.11	0.18	0.15	0.76	0.12	0.11	0.05	0.60
LOG-EIG-CEN	0.10	0.24	0.13	0.12	0.12	0.09	0.03	0.12
LOG-PQN-ZER	0.90	0.95	1.27	1.07	0.87	0.68	0.29	1.34
LOG-TIC-BAK	0.21	0.31	0.23	0.45	0.21	0.17	0.09	0.72
LOG-VSN-SVD	0.06	0.15	0.48	0.39	0.45	0.65	0.27	1.31
NON-EIG-KNN	1.08	1.55	1.07	0.69	0.96	0.82	0.32	0.82
NON-MAD-SVD	0.05	0.15	0.47	0.38	0.44	0.64	0.27	1.30
NON-VSN-BAK	0.11	0.23	0.17	0.12	0.12	0.11	0.04	0.14
NON-ZSC-CEN	0.64	1.60	0.94	1.68	1.39	1.56	0.44	1.06
POW-CYC-BAK	1.25	2.95	2.46	1.20	4.55	10.83	1.00	6.08
POW-LOW-BAK	2.26	0.41	4.71	4.26	4.08	4.41	15.87	6.64
POW-TMM-ZER	1.68	1.07	1.11	1.47	4.07	2.65	11.16	1.07
POW-VSN-CEN	0.19	0.26	0.28	0.25	0.32	0.43	0.14	0.33

Besides metaproteomics, it was of great interest to evaluate the LFQ workflow on further proteomics data sets. As newly emerging technique [122], the ‘sequential windowed acquisition of all theoretical fragment ion mass spectra’ (SWATH-MS) was reported to provide much more comprehensive detection and accurate quantitation of proteins compared to traditional acquisition techniques [122]. Therefore, seven SWATH-MS proteomic data sets were collected from the PRIDE database [111] following the similar process of metaproteomic data sets collection. These benchmark data sets included PXD001064 [123] (72 monozygotic & 44 dizygotic twins blood samples), PXD003972 [124] (20 wild-type & 20 GRB2 knock-in mice), PXD004880 [125] (18 ‘Down syndrome’ patients & 18 healthy individuals plasma samples), PXD000672 [126] (18 tumorous & 18 non-tumorous kidney tissue biopsies), PXD006106 [127] (10 formaldehyde-treated & 10 -untreated HeLa), PXD003278 [128] (6 siRNA-treated & 6 PRPF8-depleted Cal51 cell samples) and PXD002952 [20] (3 yeast 30% versus Escherichia coli 5% & 3 yeast 15% versus E. coli 20% mixtures). Detailed information of these 7 data sets could be found in Supplementary Table S3, and the performances of the same set of 20 representative LFQ workflows in Table 3 on these 7 SWATH-MS proteomic data sets were provided in Supplementary Table S4. Similar to the findings of metaproteomic data sets, the performances of a given LFQ on seven different sets of SWATH-MS data varied greatly. Taking the first workflow (BOX-MAD-SVD) as an example, it performed ‘superior’, ‘good’, ‘fair’ and ‘poor’ in 5, 0, 1 and 1 data sets, respectively (Supplementary Table S4), which showed similar data set-dependent characteristic as for those metaproteomic data sets.

As shown in Table 3 and Supplementary Table S4, some workflows were found to provide mainly ‘superior’, ‘good’, ‘fair’ or ‘poor’ PCV performance. These findings inspired us to further investigate whether there was any general guideline on the selection of consistently well- or poorly performed workflows across various data sets. To achieve this, those eight metaproteomic and seven SWATH-MS proteomic data sets were further analyzed to assess the ‘robustness’ of LFQ performance. First, the ‘robustness’ of LFQ performance between any 2 of 15 studied data sets (defined by the amounts of overlapped workflows with the same level of performances between two data sets) was calculated. Since the number of 2 combinations from those 15 studied data sets (⁠|${C}_{15}^2$|⁠) equaled to 105, 4 error bars (in dark gray) were drawn in Supplementary Figure S3A to indicate the amounts of the overlapped workflows of consistently ‘superior’, ‘good’, ‘fair’ or ‘poor’ performance between any 2 of those 15 studied data sets. Then the ‘robustness’ of LFQ performance among 8 metaproteomic and among all 15 studied data sets was calculated, and the amounts of the overlapped workflows performing consistently ‘superior’, ‘good’, ‘fair’ or ‘poor’ were shown in Supplementary Figure S3A (blue bars for the metaproteomic data sets and orange bars for all 15 studied data sets). Obviously, with the accumulation of the analyzed data sets (from 2 to 8 to 15 in Supplementary Figure S3A), the amounts of overlapped LFQs decreased substantially (from ~161 to 53 to 33 and from ~242 to 129 to 97 for ‘superior’ and ‘poor’, respectively), and there was no overlapped workflow consistently performing ‘good’ and ‘fair’ across 8 and 15 data sets. These indicated a dramatic reduction of the ‘robustness’ of LFQ performance with the accumulation of the analyzed data sets.

As shown in Supplementary Figure S3A, there were 33 and 97 workflows performing consistently ‘superior’ and ‘poor’ across all 15 studied data sets. To investigate the probabilities of each processing method appearing in the consistently ‘superior’ performing workflows, Supplementary Figure S3B was drawn. The occurrence probabilities of some processing methods were found to be higher than the others. Moreover, Supplementary Figure S3C was provided to show the frequency of each method in consistently ‘poor’ performing workflows, and some methods were found to appear more frequently than others. However, considering the dramatically reduced amounts of overlapped workflows (from ~161 to 33 for ‘superior’, from ~242 to 97 for ‘poor’) with the accumulation of the analyzed data sets (Supplementary Figure S3A), it should be very difficult to summarize a general guideline for selecting the consistently well- or poorly performing workflows, especially when more data sets are included for further analysis. Thus, for a specific data set, an independent performance assessment on various LFQ workflows is required, and tools like ANPELA are recommended.

Feasibility of discovering the LFQ performing consistently well under two distinct criteria

Five well-established criteria [20, 59–62] (each with distinct underlying theory) are now available to evaluate the performance of LFQ workflows. Herein, the benchmark data set PXD006224 [42] was collected to analyze and compare the assessment results based on different criteria. As demonstrated in Figure 2A and B, two distinct criteria, (a) precision and (d) reproducibility, were utilized to assess and rank the performance of 560 LFQ workflows. In particular, Figure 2A ranked the LFQ workflows based on their ‘precision’ (measured by PCV value). The cut-off (≤0.7) for a ‘desired’ PCV [121] resulted in 144 workflows with good performance (light orange). Figure 2B ordered the LFQ workflows by their ‘reproducibility’ (measured by ‘CS’). Both figures demonstrated significant variations among the performances of different LFQ workflows (from |${e}^{-6.9}$| to |${e}^{3.7}$| for PCVs; from 38.6 to 106.5 for ‘CSs’). Due to these variations, the performance of each LFQ workflow should be assessed before any metaproteomic study, and ANPELA could be a handy tool for providing such important information.

Figure 2

Performance assessment of LFQ workflows with the benchmark microbiome data set (PXD006224 [42]) from multiple perspectives. (A) precisions of different LFQ workflows were assessed by PCVs. Cut-off (≤0.7) for ‘desired’ PCVs [121] resulted in 144 workflows of good performance (in light orange background). (B) reproducibility of various workflows was evaluated by the CSs. (C) performance of workflows was collectively assessed by both PCVs and ‘CSs’. The dots indicate the rank of each LFQ workflow, as evaluated by PCVs and ‘CSs’, with the possible workflow used in Tilocca’s pioneering study [42] shown as green dots. The dots accumulated within the purple dash line square were considered as the well-performing workflows under both criteria (a) precision and (d) reproducibility. (D) the top 50 overall ranked workflows by both criteria together with their ranks by each independent criterion.

Moreover, the performance of LFQ collectively assessed by both criteria, (a) precision and (d) reproducibility, was shown in Figure 2C. As illustrated, among those 144 highest-ranked workflows by PCV value, only 44 (30.6%, accumulated within the purple dash line square of Figure 2C) were found to perform well under the criterion (d) reproducibility. Considering the large number of possible workflows (560 in total, shown as dots in Figure 2C), only a small fraction (~7.9%) could be considered as the well-performing ones by both criteria. Furthermore, the workflow used in Tilocca’s pioneering study [42] was identified and mapped in Figure 2C (green dots). Since Tilocca’s study mentioned only the transformation method applied, all workflows (96 in total) containing that method were highlighted as green dot. Eight (8.3%) out of these 96 were considered as well performing by both criteria (within purple dash line square). Supplementary Table S5 provided a full list of LFQ workflows ranked independently by the ‘precision’ or ‘reproducibility’. As shown, the ranking results of two criteria were inconsistent with each other. Due to the independent nature of these two criteria [47], the collective consideration of both criteria was proposed here to give an overall ranking to each workflow. Particularly, the overall ranking was defined by the sum of the ranks for two criteria (the smaller the sum is, the higher the LFQ workflow ranks). Taking the LFQ workflow (LOG-EIG-KNN) in Supplementary Table S5 as an example, its independent ranks by both criteria were not so high (ranked 47th and 19th by criterion (a) and (d), respectively), but it was the one of the best overall ranking due to its well-balanced performances between two criteria. Figure 2D gave the top 50 overall ranked workflows together with their ranks by independent criteria (the workflows highlighted in green indicated those LFQs used in Tilocca’s study [42]).

Similar to the above analysis, the same strategy was applied to another SWATH-MS-based benchmark data set PXD001064 [123]. The performance ranks by two distinct criteria, (a) precision and (d) reproducibility, were illustrated in Supplementary Figure S4A and B, and the performances of LFQs collectively evaluated by both criteria were shown in Supplementary Figure S4C. Similar to metaproteomic data set PXD006224 analyzed above, among the 280 highest-ranked workflows by one criterion (PCV values), only 14 (5.0%, accumulated in the purple dash line square of Supplementary Figure S4C) were found to be highly ranked by another (‘CSs’). That is to say, among all 560 workflows (the dots in Supplementary Figure S4C), only very small fraction (~2.5%) could be considered as the well-performing ones by both criteria. The original workflows (five in total) used in Liu’s study [123] for analyzing PXD001064 data set were highlighted as green dots in Supplementary Figure S4C. As shown, three out of these five were suggested by criterion (a) to be the ‘desired’ workflows, but none of the five workflows was found to be well performing under both criteria (no green dot in purple dash line square). Supplementary Figure S4D gave the top 50 overall ranked workflows together with their ranks by each independent criterion. In summary, based on the analysis on two benchmark data sets (one metaproteomic and another SWATH-MS proteomic data sets), the selection of the most suitable criterion for a specific biological problem should be determined by the innate nature of the given data set. As these criteria complement one another [20, 52, 59–62], it is essential to collectively employ them for assessing the performance of LFQ workflow, which further made ANPELA distinguished from other available tools.

ANPELA enables the performance assessment of LFQ workflows from multiple perspectives

The assessment and subsequent performance ranking from multiple perspectives in ANPELA was realized by a variety of metrics. These metrics included PCV value for criterion (a) precision [20], classification accuracy for criterion (b) classification ability [95], level of uniformity of P-value distribution for criterion (c) differential expression analysis [101], CS value for criterion (d) reproducibility [107] and degree of correlation between the measured and expected logFCs of protein intensity for criterion (e) accuracy [50]. Based on these metrics, the performances of 560 LFQ workflows could be ranked separately, and five ranking numbers were assigned to each workflow by the five corresponding criteria. Due to the independent nature of the five criteria [47], the collective consideration of multiple criteria was proposed in this study and realized in ANPELA for providing the overall ranking to all 560 workflows. Particularly, the overall ranking of a given workflow was defined by the sum of multiple ranking numbers under multiple criteria (the smaller the sum is, the higher the LFQ workflow ranks). Taking the metaproteomic benchmark data set PXD006129 (including 14 western style diet & 14 chow-fed mice) [113] as an example, the overall ranking of the performances of 560 workflows on this particular data set was calculated and the top 100 ranked LFQs were provided in Figure 3A. Since there is no spiked protein in this data set, only four criteria (a), (b), (c) and (d) were collectively assessed. As shown, the performance ranking under each criterion was represented by colored rectangles (the top-ranked LFQ was colored by exact blue. With the increase of ranking number, the color changed gradually toward orange with the last-ranked LFQ shown in exact orange). Based on the comprehensive ranking shown in Figure 3A, the selection range of LFQ workflows could be significantly narrowed down, and it is thus possible for the readers to discover the optimal workflows for their own data set.

Figure 3

Overall ranking of the performances of LFQ workflows on the metaproteomic data sets (A) PXD006129 with 14 western style diet and 14 chow-fed mice and (B) PXD002099 containing 48 UPS1 proteins spiked with five different concentrations. Since there is no spiked protein in data set PXD006129, only four criteria (a), (b), (c) and (d) were collectively assessed. Since the sample sizes of both case and control in PXD002099 were very small (3 versus 3), it was inappropriate to assess the quantification performance base on criterion (d) reproducibility, and only four criteria (a), (b), (c) and (e) were considered. The performance ranking under each criterion was represented by colored rectangles (the top-ranked LFQ was colored by exact blue). With the increase of ranking number, the color changed gradually toward orange with the last-ranked LFQ shown in exact orange.

Figure 4

The accuracy of 20 representative LFQ workflows based on (A) spiked and (B) background proteins for two distinct data sets of different protein concentration (10 versus 25 fmol/μL) in PXD002099. The workflows were ordered by the level of deviations between the median and expected logFCs of spiked proteins. Expected logFCs for the background and spiked proteins were 0 and 1.32, respectively. The workflow abbreviation was provided in Table 2. ‘Unnorm.’ denotes unnormalized data.

Because of the tremendous computational workload required for assessing 560 workflows, it is too time- and resource-consuming to make the comprehensive assessment service online. Therefore, an alternative way of enabling the evaluation on user’s local computer was provided as a stand-alone ANPELA. This version could be downloaded from both official (https://idrblab.org/anpela/) and mirror (http://idrb.zju.edu.cn/anpela/) sites and provided the same sets of assessment metrics and plots as that of the online ANPELA. Moreover, this local version tool could further provide an overall performance ranking for all 560 workflows in the format of not only a figure (shown in Figure 3) but also a ‘CSV’ table (including detail results of assessing metrics, ranks by independent criteria and overall ranking). The exemplar input/output files could be simultaneously downloaded together with the source code of the stand-alone tool from ANPELA websites. The installation of R environment was required before running ANPELA and was provided in ‘User Manual’ (downloadable from websites) and Supplementary Method S4. Both ‘User Manual’ and exemplar files could help users to get familiar with this tool as soon as possible.

ANPELA validates the accuracy of LFQ workflows based on spiked and background proteins

Spiked protein is frequently adopted to evaluate the performance of an LFQ workflow [52, 129]. A workflow with desirable accuracy should not only maintain unbiased variation in background proteins between sample groups (the expected logFC between the case and control equals to zero) but also make the variation in spiked proteins biased by viewing the expected abundance ratios as golden standard [50, 101]. Herein, the ability of ANPELA to validate the accuracy of LFQ workflows based on spiked proteins was analyzed using benchmark data PXD002099 with known ‘ground truth’ of variant proteins (48 UPS1 proteins spiked into yeast proteome digest with five different concentrations 2, 4, 10, 25 and 50 fmol/μL) [61]. A random combination of any 2 of these different concentrations could therefore result in 10 pairs of sample group with distinct concentration. Taking the pair of 10 versus 25 fmol/μL as an example, the accuracy of the 20 representative LFQ workflows were determined by considering the spiked (Figure 4A) and background (Figure 4B) proteins. All workflows were found to be able to maintain the unbiased variations of background proteins between sample groups, but only some workflows could make the variation in spiked proteins biased according to the expected abundance ratio (⁠|${\mathrm{logFC}}_{expected}=1.32$| in this case). With reference to the logFC of unnormalized data (the gray boxplot in Figure 4A), several workflows demonstrated good performance in guaranteeing biased variation in spiked proteins. The variance stabilization normalization (VSN) described in Valikangas’s study [50] as ‘performed well in differential expression analysis’ was identified well performing (the 6th ranked as shown in Figure 4) in this study. The well-performing workflows (in Figure 4) also included LOG-MED-NON, LOG-RLR-BAK, NON-EIG-ZER, NON-RLR-NON and LOG-CYC-BAK. For the remaining nine pairs of sample group, the accuracy of the same set of 20 representative workflows were analyzed and illustrated in Supplementary Figures S5–S13. As shown, the accuracy of different representative workflows for a specific concentration varied greatly, and the accuracy ranks of a particular workflow for different pairs also changed substantially. Table 4 gave an overview of the LFQ workflows performing better than unnormalization (indicated by circle). No LFQ workflow was found performing consistently better across all pairs of sample group, and the number of these well-performing workflows for different data set pairs varied from 1 to 15.

Table 4

Overview of the performance of 20 representative LFQ workflows assessed by criterion (e) accuracy. The LFQ workflows performing better and worse than unnormalization were represented by circle and cross, respectively

Workflow	2 versus 4	2 versus 10	2 versus 25	2 versus 50	4 versus 10	4 versus 25	4 versus 50	10 versus 25	10 versus 50	25 versus 50
NON-VSN-NON	×	Ο	Ο	Ο	Ο	Ο	Ο	Ο	Ο	Ο
LOG-CYC-BAK	×	Ο	Ο	Ο	Ο	Ο	Ο	Ο	×	Ο
LOG-MED-NON	×	Ο	Ο	Ο	Ο	Ο	Ο	Ο	×	Ο
LOG-PQN-SVD	×	Ο	Ο	Ο	Ο	Ο	Ο	×	Ο	Ο
LOG-RLR-BAK	×	Ο	Ο	Ο	Ο	Ο	Ο	Ο	×	×
NON-RLR-NON	×	Ο	Ο	Ο	Ο	Ο	Ο	Ο	×	×
NON-EIG-ZER	×	×	Ο	Ο	Ο	×	×	Ο	×	Ο
BOX-MEA-BPC	×	Ο	×	×	Ο	×	×	×	×	×
BOX-VSN-BAK	×	Ο	×	×	Ο	×	×	×	×	×
BOX-ZSC-CEN	×	Ο	×	×	Ο	×	×	×	×	×
CUB-MEA-SVD	×	Ο	×	×	Ο	×	×	×	×	×
LOG-EIG-CEN	Ο	×	×	×	×	×	×	×		Ο
BOX-CYC-KNN	×	×	×	×	Ο	×	×	×	×	×
CUB-QUA-NON	×	×	×	×	Ο	×	×	×	×	×
CUB-TMM-SVD	×	×	×	×	Ο	×	×	×	×	×
LOG-MAD-ZER	×	×	×	×	Ο	×	×	×	×	×
CUB-MAD-BAK	×	×	×	×	×	×	×	×	×	×
LOG-TIC-CEN	×	×	×	×	×	×	×	×	×	×
POW-TIC-ZER	×	×	×	×	×	×	×	×	×	×

Workflow	2 versus 4	2 versus 10	2 versus 25	2 versus 50	4 versus 10	4 versus 25	4 versus 50	10 versus 25	10 versus 50	25 versus 50
NON-VSN-NON	×	Ο	Ο	Ο	Ο	Ο	Ο	Ο	Ο	Ο
LOG-CYC-BAK	×	Ο	Ο	Ο	Ο	Ο	Ο	Ο	×	Ο
LOG-MED-NON	×	Ο	Ο	Ο	Ο	Ο	Ο	Ο	×	Ο
LOG-PQN-SVD	×	Ο	Ο	Ο	Ο	Ο	Ο	×	Ο	Ο
LOG-RLR-BAK	×	Ο	Ο	Ο	Ο	Ο	Ο	Ο	×	×
NON-RLR-NON	×	Ο	Ο	Ο	Ο	Ο	Ο	Ο	×	×
NON-EIG-ZER	×	×	Ο	Ο	Ο	×	×	Ο	×	Ο
BOX-MEA-BPC	×	Ο	×	×	Ο	×	×	×	×	×
BOX-VSN-BAK	×	Ο	×	×	Ο	×	×	×	×	×
BOX-ZSC-CEN	×	Ο	×	×	Ο	×	×	×	×	×
CUB-MEA-SVD	×	Ο	×	×	Ο	×	×	×	×	×
LOG-EIG-CEN	Ο	×	×	×	×	×	×	×		Ο
BOX-CYC-KNN	×	×	×	×	Ο	×	×	×	×	×
CUB-QUA-NON	×	×	×	×	Ο	×	×	×	×	×
CUB-TMM-SVD	×	×	×	×	Ο	×	×	×	×	×
LOG-MAD-ZER	×	×	×	×	Ο	×	×	×	×	×
CUB-MAD-BAK	×	×	×	×	×	×	×	×	×	×
LOG-TIC-CEN	×	×	×	×	×	×	×	×	×	×
POW-TIC-ZER	×	×	×	×	×	×	×	×	×	×

Table 4

Workflow	2 versus 4	2 versus 10	2 versus 25	2 versus 50	4 versus 10	4 versus 25	4 versus 50	10 versus 25	10 versus 50	25 versus 50
NON-VSN-NON	×	Ο	Ο	Ο	Ο	Ο	Ο	Ο	Ο	Ο
LOG-CYC-BAK	×	Ο	Ο	Ο	Ο	Ο	Ο	Ο	×	Ο
LOG-MED-NON	×	Ο	Ο	Ο	Ο	Ο	Ο	Ο	×	Ο
LOG-PQN-SVD	×	Ο	Ο	Ο	Ο	Ο	Ο	×	Ο	Ο
LOG-RLR-BAK	×	Ο	Ο	Ο	Ο	Ο	Ο	Ο	×	×
NON-RLR-NON	×	Ο	Ο	Ο	Ο	Ο	Ο	Ο	×	×
NON-EIG-ZER	×	×	Ο	Ο	Ο	×	×	Ο	×	Ο
BOX-MEA-BPC	×	Ο	×	×	Ο	×	×	×	×	×
BOX-VSN-BAK	×	Ο	×	×	Ο	×	×	×	×	×
BOX-ZSC-CEN	×	Ο	×	×	Ο	×	×	×	×	×
CUB-MEA-SVD	×	Ο	×	×	Ο	×	×	×	×	×
LOG-EIG-CEN	Ο	×	×	×	×	×	×	×		Ο
BOX-CYC-KNN	×	×	×	×	Ο	×	×	×	×	×
CUB-QUA-NON	×	×	×	×	Ο	×	×	×	×	×
CUB-TMM-SVD	×	×	×	×	Ο	×	×	×	×	×
LOG-MAD-ZER	×	×	×	×	Ο	×	×	×	×	×
CUB-MAD-BAK	×	×	×	×	×	×	×	×	×	×
LOG-TIC-CEN	×	×	×	×	×	×	×	×	×	×
POW-TIC-ZER	×	×	×	×	×	×	×	×	×	×

Workflow	2 versus 4	2 versus 10	2 versus 25	2 versus 50	4 versus 10	4 versus 25	4 versus 50	10 versus 25	10 versus 50	25 versus 50
NON-VSN-NON	×	Ο	Ο	Ο	Ο	Ο	Ο	Ο	Ο	Ο
LOG-CYC-BAK	×	Ο	Ο	Ο	Ο	Ο	Ο	Ο	×	Ο
LOG-MED-NON	×	Ο	Ο	Ο	Ο	Ο	Ο	Ο	×	Ο
LOG-PQN-SVD	×	Ο	Ο	Ο	Ο	Ο	Ο	×	Ο	Ο
LOG-RLR-BAK	×	Ο	Ο	Ο	Ο	Ο	Ο	Ο	×	×
NON-RLR-NON	×	Ο	Ο	Ο	Ο	Ο	Ο	Ο	×	×
NON-EIG-ZER	×	×	Ο	Ο	Ο	×	×	Ο	×	Ο
BOX-MEA-BPC	×	Ο	×	×	Ο	×	×	×	×	×
BOX-VSN-BAK	×	Ο	×	×	Ο	×	×	×	×	×
BOX-ZSC-CEN	×	Ο	×	×	Ο	×	×	×	×	×
CUB-MEA-SVD	×	Ο	×	×	Ο	×	×	×	×	×
LOG-EIG-CEN	Ο	×	×	×	×	×	×	×		Ο
BOX-CYC-KNN	×	×	×	×	Ο	×	×	×	×	×
CUB-QUA-NON	×	×	×	×	Ο	×	×	×	×	×
CUB-TMM-SVD	×	×	×	×	Ο	×	×	×	×	×
LOG-MAD-ZER	×	×	×	×	Ο	×	×	×	×	×
CUB-MAD-BAK	×	×	×	×	×	×	×	×	×	×
LOG-TIC-CEN	×	×	×	×	×	×	×	×	×	×
POW-TIC-ZER	×	×	×	×	×	×	×	×	×	×

Figure 5

Numbers and percentages of the LFQs performing better than unnormalization among all workflows assessed by criterion (e) accuracy.

10.2174/1381612824666181102125638

Supplementary Table S6 provided the degree of correlation (between the measured and expected logFCs of protein intensity) of LFQ workflow for the data sets of different concentrations (2 versus 4 fmol/μL, 2 versus 10 fmol/μL, 2 versus 25 fmol/μL, 2 versus 50 fmol/μL, 4 versus 10 fmol/μL, 4 versus 25 fmol/μL, 4 versus 50 fmol/μL, 10 versus 25 fmol/μL, 10 versus 50 fmol/μL and 25 versus 50 fmol/μL). The number in each cell of Supplementary Table S6 indicated the median deviation of spiked proteins ± standard deviation. The number highlighted in green and bold was the workflow of better accuracy than unnormalization. As shown, no LFQ workflow was discovered to perform consistently better than unnormalization across all 10 pairs of sample group, and the majority of the workflows performed well only in very limited pairs (~75% of the 560 were well performing in no more than 2 pairs; Figure 5). The inconsistency of the performances among different pairs indicated the data set-dependent nature of LFQ workflow, and a case-by-case performance assessment for a given data set is thus required. In sum, ANPELA demonstrated good capacity for validating the accuracy of an LFQ workflow by preserving the true biological variation of spiked proteins, which could be used to discover the optimal workflow for a metaproteomic study with the gold standard of the expected protein abundance ratio.

Besides ‘accuracy’, other four criteria could be collectively considered in ANPELA. A metaproteomic benchmark data set PXD002099 (48 UPS1 proteins spiked into the yeast proteome with five different concentrations) [61] was therefore collected as example. The overall ranking of the performances of 560 workflows on this data set was calculated and the top 100 ranked LFQs were provided in Figure 3B. Since the sample sizes of the cases and controls were small (3 versus 3), it was inappropriate to assess the quantification performance using criterion (d) reproducibility, and only criteria (a), (b), (c) and (e) were considered. Based on the ranking in Figure 3B, the selection range of LFQs was greatly narrowed down by collectively considering those independent criteria.

Conclusions

ANPELA allows the users to directly analyze the standard output files of popular quantification tools, and its multifaceted strategies for performance assessments could collectively improve the reproducibility, precision and accuracy of LFQ in metaproteomic studies. ANPELA has a unique ability to evaluate the performance of the whole workflow and enables the identification of the optimal LFQ(s) by the comprehensive performance ranking of all 560 workflows. Thus, it has great potential for applications in metaproteomic and other studies requiring LFQ techniques, as many features are shared among proteomic studies. To enhance the accessibility of ANPELA, a mirror site (http://idrb.zju.edu.cn/anpela/) was constructed as the backup of the official ANPELA site (https://idrblab.org/anpela/).

Author Contributions

F.Z. conceived the idea and supervised the work. J.T., J.F. and Y.W. performed the research. J.T., B.L., Y.L. and J.F. constructed the web server, developed the software and wrote the scripts. J.T., J.F., Y.W., Y.L., Q.Y., X.C., J.H., X.L., Y.C. and W.X. prepared and analyzed the data. F.Z. wrote the manuscript.

Key Points

An online tool ANPELA is presented to enable performance assessment of the whole LFQ workflow (collective assessment by five well-established criteria with distinct underlying theories).
ANPELA enables the identification of the optimal LFQ(s) based on the comprehensive performance ranking of all 560 workflows.
ANPELA not only automatically detects the diverse formats of data generated by all quantification tools but also provides the most complete set of processing methods among the available web servers and stand-alone tools.
Systematic validation using metaproteomic benchmarks revealed ANPELA’s capabilities. It is publicly accessible at https://idrblab.org/anpela/.

Funding

National Natural Science Foundation of China (81872798); National Key Research and Development Program of China (2018YFC0910500); Innovation Project on Industrial Generic Key Technologies of Chongqing (cstc2015zdcy-ztzx120003); Fundamental Research Funds for the Central Universities (2018QNA7023, 10611CDJXZ238826, 2018CDQYSG0007, CDJZR14468801, CDJKXB14011).

Jing Tang, Jianbo Fu, Yunxia Wang, Bo Li, Yinghong Li, Qingxia Yang, Xuejiao Cui, Jiajun Hong, Xiaofeng Li are students of the College of Pharmaceutical Sciences in Zhejiang University, China and jointly cultivated by the School of Pharmaceutical Sciences in Chongqing University, China. They are interested in the area of bioinformatics and proteomics-based target identification.

Feng Zhu is a professor of the College of Pharmaceutical Sciences in Zhejiang University, China. He got his PhD degree from the National University of Singapore, Singapore. His research group (https://idrblab.org/) has been working in the fields of bioinformatics, OMIC-based drug discovery, system biology and medicinal chemistry. Welcome to visit his personal website (https://idrblab.org/Peoples.php) at any time.

References

Widder

Allen

Pfeiffer

, et al.

Challenges in microbial ecology: building predictive understanding of community function and dynamics

ISME J

2016

;

2557

–

2568

Cheng

Ning

Zhang

, et al.

MetaLab: an automated pipeline for metaproteomic data analysis

Microbiome

2017

;

157

Lynch

Pedersen

The human intestinal microbiome in health and disease

N Engl J Med

2016

;

375

2369

–

2379

Mallick

Franzosa

, et al.

Experimental design and quantitative analysis of microbial community multiomics

Genome Biol

2017

;

228

Martinez

Andriantsitohaina

Extracellular vesicles in metabolic syndrome

Circ Res

2017

;

120

1674

–

1686

Castrillo

Teixeira

Paredes

, et al.

Root microbiota drive direct integration of phosphate stress and immunity

Nature

2017

;

543

513

–

518

Philippot

Raaijmakers

Lemanceau

, et al.

Going back to the roots: the microbial ecology of the rhizosphere

Nat Rev Microbiol

2013

;

789

–

799

Liu

Williams

Geisler-Lee

, et al.

Impact of wastewater effluent containing aged nanoparticles and other components on biological activities of the soil microbiome, Arabidopsis plants, and earthworms

Environ Res

2018

;

164

197

–

203

Maier

Tamm

, et al.

Photoautotrophic organisms control microbial abundance, diversity, and physiology in different types of biological soil crusts

ISME J

2018

;

1032

–

1046

10.

Broberg

Doonan

Mundt

, et al.

Integrated multi-omic analysis of host-microbiota interactions in acute oak decline

Microbiome

2018

;

11.

Wang

Kong

, et al.

Environmental microbial community proteomics: status, challenges and perspectives

Int J Mol Sci

2016

;

E1275

12.

Zhang

Ning

Mayne

, et al.

In vitro metabolic labeling of intestinal microbiota for quantitative metaproteomics

Anal Chem

2016

;

6120

–

6125

13.

Brunkwall

Orho-Melander

The gut microbiome as a target for prevention and treatment of hyperglycaemia in type 2 diabetes: from current human evidence to future possibilities

Diabetologia

2017

;

943

–

951

14.

Eymann

Lassek

Wegner

, et al.

Symbiotic interplay of fungi, algae, and bacteria within the lung lichen Lobaria pulmonaria L. Hoffm. as assessed by state-of-the-art metaproteomics

J Proteome Res

2017

;

2160

–

2173

15.

Tanca

Abbondio

Palomba

, et al.

Potential and active functions in the gut microbiota of a healthy human cohort

Microbiome

2017

;

16.

Kleiner

Dong

Hinzke

, et al.

Metaproteomics method to determine carbon sources and assimilation pathways of species in microbial communities

Proc Natl Acad Sci U S A

2018

;

115

E5576

–

E5584

17.

Zhang

Deeke

Ning

, et al.

Metaproteomics reveals associations between microbiome and intestinal extracellular vesicle proteins in pediatric inflammatory bowel disease

Nat Commun

2018

;

2873

18.

Korpela

Salonen

Vepsalainen

, et al.

Probiotic supplementation restores normal microbiota composition and function in antibiotic-treated and in caesarean-born infants

Microbiome

2018

;

182

19.

von

Bergen

Jehmlich

Taubert

, et al.

Insights from quantitative metaproteomics and protein-stable isotope probing into microbial ecology

ISME J

2013

;

1877

–

1885

20.

Navarro

Kuharev

Gillet

, et al.

A multicenter study benchmarks software tools for label-free proteome quantification

Nat Biotechnol

2016

;

1130

–

1136

21.

Lichtman

Sonnenburg

Elias

Monitoring host responses to the gut microbiota

ISME J

2015

;

1908

–

1915

22.

Ferrer

Raczkowska

Martinez-Martinez

, et al.

Phenotyping of gut microbiota: focus on capillary electrophoresis

Electrophoresis

2017

;

2275

–

2286

23.

Tang

Zhang

, et al.

Computational advances in the label-free quantification of cancer proteomics data

Curr Pharm Des

2018

. doi:

24.

El-Rami

Nelson

Proteomic approach for extracting cytoplasmic proteins from Streptococcus sanguinis using mass spectrometry

J Mol Biol Res

2017

;

–

25.

Kleiner

Young

Shah

, et al.

Metaproteomics reveals abundant transposase expression in mutualistic endosymbionts

MBio

2013

;

e00223

–

e00213

26.

Adams

Chourey

, et al.

Systematic comparison of label-free, metabolic labeling, and isobaric chemical labeling for quantitative proteomics on LTQ Orbitrap Velos

J Proteome Res

2012

;

1582

–

1590

27.

Cretu

Prassas

Saraon

, et al.

Identification of psoriatic arthritis mediators in synovial fluid by quantitative mass spectrometry

Clin Proteomics

2014

;

28.

Barschke

Oeckl

Steinacker

, et al.

Proteomic studies in the discovery of cerebrospinal fluid biomarkers for amyotrophic lateral sclerosis

Expert Rev Proteomics

2017

;

769

–

777

29.

Huang

Yang

Luo

, et al.

SWATH enables precise label-free quantification on proteome scale

Proteomics

2015

;

1215

–

1223

30.

Ivanov

Lobas

Karpov

, et al.

Comparison of false discovery rate control strategies for variant peptide identifications in shotgun proteogenomics

J Proteome Res

2017

;

1936

–

1943

31.

Cappadona

Baker

Cutillas

, et al.

Current challenges in software solutions for mass spectrometry-based quantitative proteomics

Amino Acids

2012

;

1087

–

1108

32.

Wang

Shi

Qian

, et al.

The clinical impact of recent advances in LC-MS for cancer biomarker discovery and verification

Expert Rev Proteomics

2016

;

–

114

33.

Al Shweiki

Monchgesang

Majovsky

, et al.

Assessment of label-free quantification in discovery proteomics and impact of technological factors and natural variability of protein abundance

J Proteome Res

2017

;

1410

–

1424

34.

Shen

, et al.

IonStar enables high-precision, low-missing-data proteomics quantification in large biological cohorts

Proc Natl Acad Sci U S A

2018

;

115

E4767

–

E4776

35.

Lyutvinskiy

Yang

Rutishauser

, et al.

In silico instrumental response correction improves precision of label-free proteomics and accuracy of proteomics-based predictive models

Mol Cell Proteomics

2013

;

2324

–

2331

36.

Vandenkoornhuyse

Quaiser

Duhamel

, et al.

The importance of the microbiome of the plant holobiont

New Phytol

2015

;

206

1196

–

1206

37.

Legewie

Bluthgen

Schafer

, et al.

Ultrasensitization: switch-like regulation of cellular signaling by transcriptional induction

PLoS Comput Biol

2005

;

e54

38.

Gatto

Hansen

Hoopmann

, et al.

Testing and validation of computational methods for mass spectrometry

J Proteome Res

2016

;

809

–

814

39.

Ding

, et al.

dbDEPC: a database of differentially expressed proteins in human cancers

Nucleic Acids Res

2010

;

D658

–

D664

40.

Chen

Zhang

Wang

, et al.

Proteomics progresses in microbial physiology and clinical antimicrobial therapy

Eur J Clin Microbiol Infect Dis

2017

;

403

–

413

41.

Lassek

Burghartz

Chaves-Moreno

, et al.

A metaproteomics approach to elucidate host and pathogen protein expression during catheter-associated urinary tract infections (CAUTIs)

Mol Cell Proteomics

2015

;

989

–

1008

42.

Tilocca

Burbach

Heyer

CME

, et al.

Dietary changes in nutritional studies shape the structural and functional composition of the pigs’ fecal microbiome-from days to weeks

Microbiome

2017

;

144

43.

Heyer

Kohrs

Reichl

, et al.

Metaproteomics of complex microbial communities in biogas plants

J Microbial Biotechnol

2015

;

749

–

763

44.

Valikangas

Suomi

Elo

A comprehensive evaluation of popular proteomics software workflows for label-free proteome quantification and imputation

Brief Bioinform

2017

. doi: https://doi.org/10.1093/bib/bbx054.

45.

Khoonsari

Haggmark

Lonnberg

, et al.

Analysis of the cerebrospinal fluid proteome in Alzheimer’s disease

PLoS One

2016

;

e0150672

46.

Weiss

Peddada

, et al.

Normalization and microbial differential abundance strategies depend upon data characteristics

Microbiome

2017

;

47.

Bubis

Levitsky

Ivanov

, et al.

Comparative evaluation of label-free quantification methods for shotgun proteomics

Rapid Commun Mass Spectrom

2017

;

606

–

612

48.

Ning

Fermin

Nesvizhskii

Comparative analysis of different label-free mass spectrometry based protein abundance estimates and their correlation with RNA-Seq gene expression data

J Proteome Res

2012

;

2261

–

2271

49.

Chawade

Sandin

Teleman

, et al.

Data processing has major impact on the outcome of quantitative label-free LC-MS analysis

J Proteome Res

2015

;

676

–

687

50.

Valikangas

Suomi

Elo

A systematic evaluation of normalization methods in quantitative label-free proteomics

Brief Bioinform

2018

;

–

PubMed

51.

Hawley

Brewer

Norbeck

, et al.

Metaproteomics reveals differential modes of metabolic coupling among ubiquitous oxygen minimum zone microbes

Proc Natl Acad Sci U S A

2014

;

111

11395

–

11400

52.

Kashyap

Chia

Nelson

, et al.

Microbiome at the frontier of personalized medicine

Mayo Clin Proc

2017

;

1855

–

1864

53.

Tyanova

Temu

Sinitcyn

, et al.

The Perseus computational platform for comprehensive analysis of (prote)omics data

Nat Methods

2016

;

731

–

740

54.

Proietti

Zakrzewski

Watkins

, et al.

Mining, visualizing and comparing multidimensional biomolecular data using the Genomics Data Miner (GMine) web-server

Sci Rep

2016

;

38178

55.

Hoekman

Breitling

Suits

, et al.

msCompare: a framework for quantitative analysis of label-free LC-MS data for comparative candidate biomarker studies

Mol Cell Proteomics

2012

;

(

M111.015974

56.

Chawade

Alexandersson

Levander

Normalyzer: a tool for rapid evaluation of normalization methods for omics data sets

J Proteome Res

2014

;

3114

–

3120

57.

Webb-Robertson

Matzke

Jacobs

, et al.

A statistical selection strategy for normalization procedures in LC-MS proteomics experiments through dataset-dependent ranking of normalization scaling factors

Proteomics

2011

;

4736

–

4741

58.

Weiner

Sidoli

Diskin

, et al.

GiaPronto: a one-click graph visualization software for proteomics datasets

Mol Cell Proteomics

2018

;

1426

–

1431

59.

Williams

Lemieux

Hassis

, et al.

Quantitative proteomic analyses of mammary organoids reveals distinct signatures after exposure to environmental chemicals

Proc Natl Acad Sci U S A

2016

;

113

E1343

–

E1351

60.

Collins

Hunter

Liu

, et al.

Multi-laboratory assessment of reproducibility, qualitative and quantitative performance of SWATH-mass spectrometry

Nat Commun

2017

;

291

61.

Pursiheimo

Vehmas

Afzal

, et al.

Optimization of statistical methods impact on quantitative proteomics data

J Proteome Res

2015

;

4118

–

4126

62.

Dowle

Wilson

Thomas

Comparing the diagnostic classification accuracy of iTRAQ, peak-area, spectral-counting, and emPAI methods for relative quantification in expression proteomics

J Proteome Res

2016

;

3550

–

3562

63.

Zhu

Yang

, et al.

Clinical success of drug targets prospectively predicted by in silico study

Trends Pharmacol Sci

2018

;

229

–

231

64.

Fermin

Basrur

Yocum

, et al.

Abacus: a computational tool for extracting and pre-processing spectral count data for label-free quantitative proteomic analysis

Proteomics

2011

;

1340

–

1345

65.

Park

Venable

, et al.

A quantitative analysis software tool for mass spectrometry-based proteomics

Nat Methods

2008

;

319

–

322

66.

Tsou

Avtonomov

Larsen

, et al.

DIA-Umpire: comprehensive computational framework for data-independent acquisition proteomics

Nat Methods

2015

;

258

–

264

67.

Tabb

McDonald

Yates

, III.

DTASelect and Contrast: tools for assembling and comparing protein identifications from shotgun proteomics

J Proteome Res

2002

;

–

68.

Ramus

Hovasse

Marcellin

, et al.

Spiked proteomic standard dataset for testing label-free quantitative software and statistical methods

Data Brief

2016

;

286

–

294

69.

Cox

Mann

MaxQuant enables high peptide identification rates, individualized p.p.b.-range mass accuracies and proteome-wide protein quantification

Nat Biotechnol

2008

;

1367

–

1372

70.

Sturm

Bertsch

Gropl

, et al.

OpenMS—an open-source software framework for mass spectrometry

BMC Bioinformatics

2008

;

163

71.

Zhang

Hendrie

, et al.

PEAKS: powerful software for peptide de novo sequencing by tandem mass spectrometry

Rapid Commun Mass Spectrom

2003

;

2337

–

2342

72.

Nesvizhskii

Keller

Kolker

, et al.

A statistical model for identifying proteins by tandem mass spectrometry

Anal Chem

2003

;

4646

–

4658

73.

Hakkinen

Vincic

Mansson

, et al.

The proteios software environment: an extensible multiuser platform for management and analysis of proteomics data

J Proteome Res

2009

;

3037

–

3043

74.

MacLean

Tomazela

Shulman

, et al.

Skyline: an open source document editor for creating and analyzing targeted proteomics experiments

Bioinformatics

2010

;

966

–

968

75.

Bruderer

Bernhardt

Gandhi

, et al.

Extending the limits of quantitative proteome profiling with data-independent acquisition and application to acetaminophen-treated three-dimensional liver microtissues

Mol Cell Proteomics

2015

;

1400

–

1410

76.

Sakia

The Box–Cox transformation technique—a review

J R Stat Soc Ser D Stat

1992

;

169

–

178

77.

Kwok

Leung

Control of the misuse of testosterone in castrated horses based on an international threshold in plasma

Drug Test Anal

2015

;

414

–

419

78.

De Livera

Dias

De Souza

, et al.

Normalizing and integrating metabolomics data

Anal Chem

2012

;

10768

–

10776

79.

Ghosh

, et al.

Estimating spatiotemporal variability of ambient air pollutant concentrations with a hierarchical model

Atmos Environ (1994)

2013

;

–

80.

Gromski

Hollywood

, et al.

The influence of scaling metabolomics data on model classification accuracy

Metabolomics

2015

;

684

–

695

81.

Kultima

Nilsson

Scholz

, et al.

Development and evaluation of normalization methods for label-free relative quantification of endogenous peptides

Mol Cell Proteomics

2009

;

2285

–

2295

82.

Kohl

Klein

Hochrein

, et al.

State-of-the art data normalization methods improve NMR-based metabolomic analysis

Metabolomics

2012

;

146

–

160

83.

Karpievitch

Dabney

Smith

Normalization and missing value imputation for label-free LC-MS analysis

BMC Bioinformatics

2012

;

(

Suppl 16

84.

Callister

Barry

Adkins

, et al.

Normalization approaches for removing systematic biases associated with mass spectrometry and label-free proteomics

J Proteome Res

2006

;

277

–

286

85.

Andjelkovic

Thompson

Changes in gene expression in maize kernel in response to water and salt stress

Plant Cell Rep

2006

;

–

86.

Matzke

Waters

Metz

, et al.

Improved quality control processing of peptide-centric LC-MS proteomics data

Bioinformatics

2011

;

2866

–

2872

87.

van den

Berg

Hoefsloot

Westerhuis

, et al.

Centering, scaling, and transformations: improving the biological information content of metabolomics data

BMC Genomics

2006

;

142

88.

Dieterle

Ross

Schlotterbeck

, et al.

Probabilistic quotient normalization as robust method to account for dilution of complex biological mixtures. Application in 1H NMR metabonomics

Anal Chem

2006

;

4281

–

4290

89.

Gaspari

Chiesa

Nicastri

, et al.

Proteome speciation by mass spectrometry: characterization of composite protein mixtures in milk replacers

Anal Chem

2016

;

11568

–

11574

90.

Lin

Golovnina

Chen

, et al.

Comparison of normalization and differential expression analyses using RNA-seq data from 726 individual Drosophila melanogaster

BMC Genomics

2016

;

91.

Tang

Yang

, et al.

Performance evaluation and online realization of data-driven normalization methods used in LC/MS based untargeted metabolomics analysis

Sci Rep

2016

;

38881

92.

Geiger

Wehner

Schaab

, et al.

Comparative proteomic analysis of eleven common cell lines reveals ubiquitous but varying expression of most proteins

Mol Cell Proteomics

2012

;

M111.014050

93.

Troyanskaya

Cantor

Sherlock

, et al.

Missing value estimation methods for DNA microarrays

Bioinformatics

2001

;

520

–

525

94.

Gan

Liew

Yan

Microarray missing data imputation based on a set theoretic framework and biological knowledge

Nucleic Acids Res

2006

;

1608

–

1619

95.

Griffin

Long

, et al.

Label-free, normalized quantification of complex mass spectrometry data for proteomic analysis

Nat Biotechnol

2010

;

–

96.

Kuharev

Navarro

Distler

, et al.

In-depth evaluation of software tools for data-independent acquisition based label-free quantification

Proteomics

2015

;

3140

–

3151

97.

Chignell

Park

Lacerda

CMR

, et al.

Label-free proteomics of a defined, binary co-culture reveals diversity of competitive responses between members of a model soil microbial system

Microb Ecol

2018

;

701

–

719

98.

Muller

Fischer

Chen

, et al.

On the reproducibility of label-free quantitative cross-linking/mass spectrometry

J Am Soc Mass Spectrom

2018

;

405

–

412

99.

Dubinkina

Tyakht

Odintsova

, et al.

Links of gut microbiota composition with alcohol dependence syndrome and alcoholic liver disease

Microbiome

2017

;

141

100.

Arroyo

Gines

Prostaglandins and the treatment of hepatorenal syndrome in cirrhosis

J Hepatol

1990

;

142

–

144

101.

Risso

Ngai

Speed

, et al.

Normalization of RNA-seq data using factor analysis of control genes or samples

Nat Biotechnol

2014

;

896

–

902

102.

Komatsu

Han

Nanjo

, et al.

Label-free quantitative proteomic analysis of abscisic acid effect in early-stage soybean under flooding

J Proteome Res

2013

;

4769

–

4784

103.

Blaise

Data-driven sample size determination for metabolic phenotyping studies

Anal Chem

2013

;

8943

–

8950

104.

Suomi

Seyednasrollah

Jaakkola

, et al.

ROTS: an R package for reproducibility-optimized statistical testing

PLoS Comput Biol

2017

;

e1005562

105.

Min

Lee

Cheon

, et al.

In-depth proteomic analysis of Glycine max seeds during controlled deterioration treatment reveals a shift in seed metabolism

J Proteomics

2017

;

169

125

–

135

106.

Tang

Yang

, et al.

NOREVA: normalization and evaluation of MS-based metabolomics data

Nucleic Acids Res

2017

;

W162

–

W170

107.

Wang

Gardiner

Cairns

Optimal consistency in microRNA expression analysis using reference-gene-based normalization

Mol Biosyst

2015

;

1235

–

1240

108.

Zevin

Xie

Birse

, et al.

Microbiome composition and function drives wound-healing impairment in the female genital tract

PLoS Pathog

2016

;

e1005889

109.

Xia

Wishart

Web-based inference of biological patterns, functions and pathways from metabolomic data using MetaboAnalyst

Nat Protoc

2011

;

743

–

760

110.

Lee

Saha

Yusufi

, et al.

Web-based applications for building, managing and analysing kinetic models of biological systems

Brief Bioinform

2009

;

–

111.

Vizcaino

Csordas

del-Toro

, et al.

2016 update of the PRIDE database and its related tools

Nucleic Acids Res

2016

;

D447

–

D456

112.

Mottawea

Chiang

Muhlbauer

, et al.

Altered intestinal microbiota-host mitochondria crosstalk in new onset Crohn’s disease

Nat Commun

2016

;

13419

113.

Schroeder

Birchenough

GMH

Stahlman

, et al.

Bifidobacteria or fiber protects against diet-induced microbiota-mediated colonic mucus deterioration

Cell Host Microbe

2018

;

–

114.

Deusch

Camarinha-Silva

Conrad

, et al.

A structural and functional elucidation of the rumen microbiome influenced by various diets and microenvironments

Front Microbiol

2017

;

1605

115.

Grassl

Kulak

Pichler

, et al.

Ultra-deep and quantitative saliva proteome reveals dynamics of the oral microbiome

Genome Med

2016

;

116.

van der

Post

Hansson

Membrane protein profiling of human colon reveals distinct regional differences

Mol Cell Proteomics

2014

;

2277

–

2287

117.

Poppleton

Duchateau

Hourdel

, et al.

Outer membrane proteome of Veillonella parvula: a diderm firmicute of the human microbiome

Front Microbiol

2017

;

1215

118.

Scott

Giogha

Pollock

, et al.

The bacterial arginine glycosyltransferase effector NleB preferentially modifies Fas-associated death domain protein (FADD)

J Biol Chem

2017

;

292

17337

–

17350

119.

Chong

Gan

Pham

, et al.

Isobaric tags for relative and absolute quantitation (iTRAQ) reproducibility: implication of multiple injections

J Proteome Res

2006

;

1232

–

1240

120.

La Delfa

Freeman

Petruzzi

, et al.

Equations to predict female manual arm strength based on hand location relative to the shoulder

Ergonomics

2014

;

254

–

261

121.

Simula

Cannizzaro

Marin

, et al.

Two-dimensional gel proteome reference map of human small intestine

Proteome Sci

2009

;

122.

Anjo

Santa

Manadas

SWATH-MS as a tool for biomarker discovery: from basic research to clinical applications

Proteomics

2017

;

1600278

10.1007/s10792-017-0791-0

123.

Liu

Buil

Collins

, et al.

Quantitative variability of 342 plasma proteins in a human twin population

Mol Syst Biol

2015

;

786

124.

Caron

Roncagalli

Hase

, et al.

Precise temporal profiling of signaling complexes in primary cells using SWATH mass spectrometry

Cell Rep

2017

;

3219

–

3226

125.

Sullivan

Evans

Pandey

, et al.

Trisomy 21 causes changes in the circulating proteome indicative of chronic autoinflammation

Sci Rep

2017

;

14818

126.

Guo

Kouvonen

Koh

, et al.

Rapid mass spectrometric conversion of tissue biopsy samples into permanent quantitative digital proteome maps

Nat Med

2015

;

407

–

413

127.

Tan

SLW

Chadha

Liu

, et al.

A class of environmental and endogenous toxins induces BRCA2 haploinsufficiency and genome instability

Cell

2017

;

169

1105

–

1118

128.

Liu

Gonzalez-Porta

Santos

, et al.

Impact of alternative splicing on the human proteome

Cell Rep

2017

;

1229

–

1241

129.

Ramus

Hovasse

Marcellin

, et al.

Benchmarking quantitative label-free LC-MS data processing workflows using a complex spiked proteomic standard dataset

J Proteomics

2016

;

132

–

130.

McIlwain

Mathews

Bereman

, et al.

Estimating relative abundances of proteins from shotgun proteomics data

BMC Bioinformatics

2012

;

308

131.

Mouton-Barbosa

Roux-Dalvai

Bouyssie

, et al.

In-depth exploration of cerebrospinal fluid by combining peptide ligand library treatment and label-free protein quantification

Mol Cell Proteomics

2010

;

1006

–

1021

132.

von

Haller

Donohoe

, et al.

The application of new software tools to quantitative protein profiling via isotope-coded affinity tag (ICAT) and tandem mass spectrometry: II. Evaluation of tandem mass spectrometry methodologies for large-scale protein analysis, and the application of statistical tools for data analysis and interpretation

Mol Cell Proteomics

2003

;

428

–

442

133.

Kosanam

Thai

Zhang

, et al.

Diabetes induces lysine acetylation of intermediary metabolism enzymes in the kidney

Diabetes

2014

;

2432

–

2439

134.

Roux-Dalvai

Gonzalez de Peredo

Simo

, et al.

Extensive analysis of the cytoplasmic proteome of human erythrocytes using the peptide ligand library technology and advanced mass spectrometry

Mol Cell Proteomics

2008

;

2254

–

2269

135.

Weisser

Nahnsen

Grossmann

, et al.

An automated pipeline for high-throughput label-free quantitative proteomics

J Proteome Res

2013

;

1628

–

1644

136.

Zhang

Xin

Shan

, et al.

PEAKS DB: de novo sequencing assisted database search for sensitive and accurate peptide identification

Mol Cell Proteomics

2012

;

M111. 010587

137.

Brownridge

Xia

, et al.

A software toolkit and interface for performing stable isotope labeling and top3 quantification using Progenesis LC-MS

OMICS

2012

;

489

–

495

138.

Searle

Scaffold: a bioinformatic tool for validating MS/MS-based proteomic studies

Proteomics

2010

;

1265

–

1269

139.

Schilling

Rardin

MacLean

, et al.

Platform-independent and label-free quantitation of proteomic data using MS1 extracted ion chromatograms in skyline: application to protein acetylation and phosphorylation

Mol Cell Proteomics

2012

;

202

–

214

140.

Kaur

Sooraj

, et al.

Comparative evaluation of the aqueous humor proteome of primary angle closure and primary open angle glaucomas and age-related cataract eyes

Int Ophthalmol

2018

. doi: