Abstract
Aim: Breast cancers at different stages differ markedly in both phenotypic and molecular patterns. The developmental stage is an essential factor in clinical decisions about treatment plans, but stage prediction has usually been formulated as a classification problem, which ignores the sequential ordering of the stages. Materials & methods: This study proposed a regression-based procedure to detect stage biomarkers of breast cancer. Biomarkers were detected by the Lasso and Ridge algorithms. Results & conclusion: A combination of the Lasso and Ridge regression algorithms achieved the best performance, with classification accuracy (Acc) equal to 0.8294 and regression goodness-of-fit (R²) equal to 0.7810. The 265 biomarker genes were enriched for the signal peptide-based secretion function, with a Bonferroni-corrected p-value of 6.9408e-3 and a false discovery rate (FDR) of 1.1614e-2.
Supplementary data
To view the supplementary data that accompany this paper please visit the journal website at: https://www.tandfonline.com/doi/suppl/10.2217/bmm-2018-0305
Financial & competing interests disclosure
This work was supported by the Strategic Priority Research Program of the Chinese Academy of Sciences (XDB13040400), the Jilin Provincial Key Laboratory of Big Data Intelligent Computing (20180622002JC), the Education Department of Jilin Province (JJKH20180145KJ) and the Startup Grant of Jilin University. This work was also partially supported by the Bioknow MedAI Institute (BMCPP-2018-001) and the High Performance Computing Center of Jilin University, China. The authors have no other relevant affiliations or financial involvement with any organization or entity with a financial interest in or financial conflict with the subject matter or materials discussed in the manuscript apart from those disclosed.
No writing assistance was utilized in the production of this manuscript.
Acknowledgements
The authors greatly appreciate the constructive comments from the editor and the two anonymous reviewers.