Research Articles

An enhanced methodology for predicting protein-protein interactions between human and hepatitis C virus via ensemble learning algorithms

Pages 10592-10602 | Received 05 Jan 2021, Accepted 17 Jun 2021, Published online: 11 Jul 2021
