Abstract
The conformational stability of more than 1500 protein mutants was modelled by a proteometric approach using the amino acid sequence autocorrelation (AASA) vector formalism. The AASA vectors were weighted by 48 amino acid/residue properties selected from the AAindex database. Genetic algorithm-optimised support vector machines (GA-SVM), trained with subsets of AASA descriptors, yielded predictive classification and regression models of the unfolding Gibbs free energy change (ΔΔG). In cross-validation experiments, the function-mapping (regression) models explained about 50% of the ΔΔG variance, and the binary SVM models correctly predicted about 80% of the ΔΔG signs. Test set predictions showed adequate accuracies of about 70% for stable single and double point mutants. Conformational stability depended on medium- and long-range autocorrelations, in the mutant sequences, of general structural, physico-chemical and thermodynamic properties related to the protein hydration process. A preliminary version of the predictor is available online at http://gibk21.bse.kyutech.ac.jp/llamosa/ddG-AASA/ddG_AASA.html.
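As an illustration of the kind of descriptor summarised above, the following minimal sketch computes a property-weighted sequence autocorrelation vector. It assumes a Moreau-Broto-style average of property products at each sequence lag, which may differ from the exact AASA definition used in this work. The function name `autocorrelation_vector`, the example peptide and the choice of a single hydropathy scale (Kyte-Doolittle values) are illustrative assumptions, not part of the published method.

```python
from typing import Dict, List

# Kyte-Doolittle hydropathy scale, used here only as an example property.
# The study weights its vectors with 48 AAindex properties; this single
# scale is an illustrative assumption.
HYDROPATHY: Dict[str, float] = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def autocorrelation_vector(
    sequence: str, prop: Dict[str, float], max_lag: int
) -> List[float]:
    """Return one autocorrelation value per lag 1..max_lag.

    Assumed definition (Moreau-Broto style, hypothetical for this paper):
    the mean of p(i) * p(i + lag) over all residue pairs separated by
    `lag` positions along the sequence.
    """
    values = [prop[aa] for aa in sequence]
    n = len(values)
    vector: List[float] = []
    for lag in range(1, max_lag + 1):
        if lag >= n:
            # No residue pairs exist at this separation.
            vector.append(0.0)
            continue
        total = sum(values[i] * values[i + lag] for i in range(n - lag))
        vector.append(total / (n - lag))
    return vector


# Hypothetical wild-type and single point mutant (A -> D at position 4).
wild_type = "MKTAYIAKQR"
mutant = "MKTDYIAKQR"
wt_vec = autocorrelation_vector(wild_type, HYDROPATHY, max_lag=5)
mut_vec = autocorrelation_vector(mutant, HYDROPATHY, max_lag=5)
```

Under these assumptions, one such vector per property would be concatenated into the descriptor pool from which a feature-selection step (the genetic algorithm in this work) picks the subset passed to the SVM.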
Acknowledgements
The authors would like to acknowledge Professor Akinori Sarai, who provided useful information for preparing the revised manuscript. The authors would also like to thank the anonymous referees, whose useful comments improved the quality of the manuscript. This research was financially supported by the Cuban Ministerio de Ciencia, Tecnología y Medio Ambiente (CITMA) through a grant to M. Fernandez (Grant No. 20104102).