List of machine learning-based methods for predicting changes in the thermodynamic stability of proteins upon mutation, developed or updated in the last 15 years and freely available to users via web servers. All the web servers listed here were available at the time of submission of the review.
Name and original reference | Input data | Input data and optional settings | Type of approach | Output | Availability |
---|---|---|---|---|---|
I-Mutant 2.0 [21, 23] | Protein structure or sequence | Temperature; pH | SVM-based web server, trained on a dataset derived from ProTherm | Direction of the free energy change and its value for either all possible mutations of a particular residue or only for a specific mutation | http://gpcr2.biocomp.unibo.it/cgi/predictors/I-Mutant2.0/I-Mutant2.0.cgi |
MUpro [26] | Protein sequence; protein structure if available | – | SVM- and neural network-based predictors, trained on the same dataset as I-Mutant 2.0 | Value of the energy change predicted by support vector machine regression (recommended). Sign of the energy change predicted by support vector machine and neural network classifiers | http://mupro.proteomics.ics.uci.edu/ |
I-Mutant 3.0 [31] | Protein structure or sequence | Temperature; pH | SVM-based web server, trained on a dataset derived from ProTherm | ΔΔG value and binary classification (ΔΔG ≥ 0, ΔΔG < 0) or ternary classification (ΔΔG < −0.5, −0.5 ≤ ΔΔG ≤ 0.5, ΔΔG > 0.5) | http://gpcr2.biocomp.unibo.it/cgi/predictors/I-Mutant3.0/I-Mutant3.0.cgi |
mCSM [46] | Protein structure | Single mutation, mutation list or systematic mutations on a single residue | Graph-based distance patterns among atoms to represent the residue environment and a ‘pharmacophore count’ vector to account for the atom changes introduced by the mutation. The resulting signature vector is used to train predictive machine learning methods in regression and classification tasks | Prediction of the direction of the change in stability and of its numerical value. Also prediction of the change in affinity of protein–protein and protein–DNA complexes upon mutation | http://biosig.unimelb.edu.au/mcsm/ |
NeEMO [43] | Protein structure | Temperature; pH; amino acid to be substituted (one or more, manually edited) | Calculation of residue–residue interaction networks where nodes represent residues and edges represent different types of physicochemical bonds. These graphs are used to train a neural network for the prediction of stability changes | Prediction of ΔΔG changes upon point mutations | http://protein.bio.unipd.it/neemo/ |
AUTO-MUTE 2.0 (Stability changes tool) [32, 44] | Protein structure (no multiple models, no gaps; no alternative conformations for alpha-carbon atoms) | Temperature; pH; amino acid to be substituted (single or systematic) | Two supervised classification models (random forest and SVM) to predict only the sign of ΔΔG; two regression models (tree regression and SVM regression) to predict the actual value of ΔΔG | Either predicted sign of ΔΔG along with a confidence level or predicted value of ΔΔG with other information about structural features of the mutant | http://binf2.gmu.edu/automute/AUTO-MUTE_Stability_ddG.html |
INPS-MD [47, 54] | Protein sequence (INPS) or structure (INPS3D) | – | Support vector regression trained on descriptors encoding mutation type (in particular, substitution score, hydrophobicity score, mutability index of native residue, molecular weights of native and mutant residues) and evolutionary information (INPS). Addition of structural features such as relative solvent accessibility of native residue and local energy difference calculated by a contact potential (INPS3D) | Changes in ΔG values upon residue substitution in the protein sequence | https://inpsmd.biocomp.unibo.it/welcome/default/index |
EASE-MM [51] | Protein sequence of a single-domain monomeric protein | – | Combination of five specialized SVM models to predict ΔΔG of mutations. Each SVM combines a different set of features encoding evolutionary conservation, amino acid parameters and predicted structural properties such as secondary structures and different levels of accessible surface areas | Predicted ΔΔG and stability class: ΔΔGu in (−inf, −1), destabilizing; ΔΔGu in (−1, −0.5), likely destabilizing; ΔΔGu in (−0.5, 0.5), neutral; ΔΔGu in (0.5, 1), likely stabilizing; ΔΔGu in (1, +inf), stabilizing. Predicted secondary structure and relative accessible surface area of the mutation site | https://sparks-lab.org/server/ease-mm/ |
STRUM [53] | Protein sequence | Single variation or multiple variation | Gradient boosting regression approach using different features: sequence profile scores for evolutionary information, structural profile scores and different energy functions providing accurate environment information | Predicted ΔΔG of single-point mutation | http://zhanglab.ccmb.med.umich.edu/STRUM/ |
PON-tstab [63] | Protein sequence | Temperature; pH; single variation or multiple variation | Random forests tool based on similarity features, conservation features, amino acid features, variation type features, neighborhood features, and other sequence-based protein features | Predicted ΔΔG of single-point mutation and predicted probability | http://structure.bmc.lu.se/PON-Tstab/ |
DeepDDG [64] | Protein structure | Single mutations or a list of mutations | Neural network-based predictor in which the parameters are shared for each target residue–neighbor residue pair | Prediction of the change in folding free energy upon mutation (ΔΔG) | http://protein.org.cn/ddg.html |
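
Several of the servers above report a stability class alongside the predicted ΔΔG value. As a rough illustration of how the thresholds listed for I-Mutant 3.0 and EASE-MM partition a predicted value, the following Python sketch may help. It is not code from either server: the function names and the I-Mutant class labels are hypothetical, and because the table gives the EASE-MM ranges as open intervals, the assignment of the exact boundary values (−1, −0.5, 0.5, 1) is an assumption made here.

```python
def imutant3_binary(ddg: float) -> str:
    """Binary split listed for I-Mutant 3.0: ddG >= 0 versus ddG < 0.

    The returned labels are hypothetical names, not the server's wording.
    """
    return "non-negative" if ddg >= 0 else "negative"


def imutant3_ternary(ddg: float) -> str:
    """Ternary split listed for I-Mutant 3.0.

    ddG < -0.5, -0.5 <= ddG <= 0.5, ddG > 0.5 (labels are hypothetical).
    """
    if ddg < -0.5:
        return "below -0.5"
    if ddg > 0.5:
        return "above 0.5"
    return "within +/-0.5"


def ease_mm_class(ddg_u: float) -> str:
    """Five stability classes listed for EASE-MM.

    Assumption: exact boundary values fall into the class closer to
    neutral, since the table does not say where they belong.
    """
    if ddg_u < -1.0:
        return "destabilizing"
    if ddg_u < -0.5:
        return "likely destabilizing"
    if ddg_u <= 0.5:
        return "neutral"
    if ddg_u <= 1.0:
        return "likely stabilizing"
    return "stabilizing"


if __name__ == "__main__":
    # Illustrative values only; these are not predictions from any server.
    for value in (-1.8, -0.7, 0.1, 0.8, 2.3):
        print(value, imutant3_binary(value), imutant3_ternary(value), ease_mm_class(value))
```

Note that the sign convention of ΔΔG can differ between tools, so a class label from one server should not be compared with a value from another without checking how each defines the sign.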