Abstract
We present a novel approach that combines mathematical methods and artificial neural networks to predict the transmembrane regions of transmembrane proteins from protein sequence information alone. We focus on a data-driven model based on a non-linear modelling method, the counter-propagation artificial neural network, and on mathematical descriptors that encode the sequence information of transmembrane proteins with known three-dimensional structures. The developed model shows promise in predicting protein transmembrane regions, with an error below 10% for the external validation set. In combination with available experimental data, the model can provide a better understanding of transmembrane proteins.
Acknowledgments
The Slovenian Ministry of Higher Education, Science and Technology is acknowledged for financial support through the research grants P1-0017 and J1-2151-0104. Professors Sabina Passamonti and Milan Randić are gratefully acknowledged for valuable discussions and advice regarding bilitranslocase characteristics and applications of graph theory in protein coding.