
An ensemble method for multi-type Gram-negative bacterial secreted protein prediction by integrating different PSSM-based features

Pages 181-194 | Received 06 Dec 2018, Accepted 20 Jan 2019, Published online: 11 Feb 2019
 

ABSTRACT

In Gram-negative bacteria, a wide range of proteins are secreted by highly specialized secretion systems. These secreted proteins play essential roles in the bacterial response to the environment and in several physiological processes, such as adhesion, pathogenicity, adaptation and survival. Identifying secreted proteins in Gram-negative bacteria may therefore assist in understanding the secretion mechanism and in developing new antimicrobial strategies. Because a single-feature model is unlikely to capture this information comprehensively, three kinds of feature models were used in this paper to represent protein samples: composition analysis, correlation analysis and a smoothing encoding method, each applied to position-specific scoring matrix (PSSM) profiles. A support vector machine-based ensemble method using these hybrid features was developed to predict multi-type Gram-negative bacterial secreted proteins. Our method achieves overall accuracies of 97.09% and 96.51% using an independent dataset test and a jackknife test on a public test dataset, which are 3.49% and 2.32% higher, respectively, than results obtained by other methods. These results show the effectiveness and stability of the proposed ensemble method. It is anticipated that our method will provide useful information for further research on bacterial secreted proteins and secretion systems.
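
To make the idea of PSSM-based features concrete, the following is a minimal sketch of two of the feature families named in the abstract, composition and smoothing. The abstract does not give the exact definitions, so the formulations here (column averaging for composition, a centred sliding-window row sum for smoothing) are assumptions based on commonly used variants, and the function names `pssm_composition` and `smooth_pssm` are hypothetical. A real PSSM has 20 columns, one per amino acid; the toy matrix uses 2 columns for readability.

```python
from typing import List

Matrix = List[List[float]]


def pssm_composition(pssm: Matrix) -> List[float]:
    """Average each PSSM column over the sequence length.

    Assumed formulation: one value per column, so a 20-column PSSM
    yields a fixed-length 20-dimensional vector for any sequence length.
    """
    if not pssm:
        raise ValueError("PSSM must have at least one row")
    length = len(pssm)
    n_cols = len(pssm[0])
    return [sum(row[j] for row in pssm) / length for j in range(n_cols)]


def smooth_pssm(pssm: Matrix, window: int = 3) -> Matrix:
    """Replace each row by the sum of the rows in a centred window.

    Assumed formulation: the window is truncated at the sequence ends,
    which is equivalent to padding the matrix with zero rows.
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    length = len(pssm)
    n_cols = len(pssm[0]) if pssm else 0
    smoothed = []
    for i in range(length):
        start, stop = max(0, i - half), min(length, i + half + 1)
        smoothed.append(
            [sum(pssm[k][j] for k in range(start, stop)) for j in range(n_cols)]
        )
    return smoothed


# Toy PSSM: 3 residues (rows) by 2 score columns.
toy = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
print(pssm_composition(toy))  # [3.0, 4.0]
print(smooth_pssm(toy, 3))    # [[4.0, 6.0], [9.0, 12.0], [8.0, 10.0]]
```

The composition vector has a fixed length regardless of sequence length, which is what allows proteins of different sizes to be passed to a classifier such as a support vector machine. The smoothed matrix keeps one row per residue and would need a further fixed-length encoding step, which is not sketched here.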

Acknowledgements

The authors would like to thank the anonymous reviewers and editor for their helpful comments on our manuscript.

Disclosure statement

No potential conflict of interest was reported by the authors.

Supplemental Material

Supplementary material for this article can be accessed at: https://doi.org/10.1080/1062936X.2019.1573438

Additional information

Funding

This work was supported by the National Natural Science Foundation of China (grant number 61602100); the Natural Science Foundation of Hebei Province (grant number F2016407082); the Fundamental Research Funds for the Central Universities (grant number N172304038); the Science Research Foundation of Hebei Normal University of Science & Technology (grant number 2018YB012); the Research Foundation of Qinhuangdao Engineering and Technology Center for Information Agriculture (grant number 201705B017); and the Doctoral Foundation of Northeastern University at Qinhuangdao (grant number XNB201613).
