Abstract
Facing the explosion of newly generated protein sequences in the postgenomic age, we are challenged to develop computational methods for the fast and accurate identification of their subcellular localization and other attributes. This review summarizes recent methodological developments, with a focus on artificial neural networks, statistical learning and support vector machines, fuzzy logic-based algorithms, evidence-theory-based algorithms and the ensemble classifier approach. The different descriptors used to represent protein samples are also outlined. In addition, a series of recently established web servers based on various ensemble classifiers is briefly introduced.
Financial disclosure
The authors have no relevant financial interests, including employment, consultancies, honoraria, stock ownership or options, expert testimony, grants or patents received or pending, or royalties related to this manuscript.