Machine learning model for classification of predominantly allergic and non-allergic asthma among preschool children with asthma hospitalization

Piyush Bhardwaj, MS, Ashish Tyagi, MD, DNB, Shashank Tyagi, MD, DNB, Joana Antão, MS & Qichen Deng, PhD
Pages 487-495 | Received 30 Nov 2021, Accepted 26 Mar 2022, Published online: 07 Apr 2022
 

Abstract

Objective

Asthma is the most frequent chronic airway illness in preschool children and is difficult to diagnose because of the disease’s heterogeneity. This study aimed to compare different machine learning models and identify the most effective one for classifying two forms of asthma in preschool children (predominantly allergic asthma and non-allergic asthma) using a minimum number of features.

Methods

The Frankfurt dataset contained asthma-related information on 205 patients. After pre-processing, 127 of them (70 with non-allergic asthma and 57 with predominantly allergic asthma) were retained for the final analysis. The random forest algorithm and the chi-square test were used to select the key features from a total of 63 features. Six machine learning models (random forest, extreme gradient boosting, support vector machines, adaptive boosting, extra tree classifier, and logistic regression) were then trained and tested using 10-fold stratified cross-validation.
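As an illustration of what stratified 10-fold cross-validation means for a cohort of this size, the following is a minimal sketch using only the Python standard library. It is not the authors' code: the function name `stratified_folds`, the label strings, and the seed are assumptions made for this example, and only the class counts (70 and 57) come from the text above.

```python
import random
from collections import Counter, defaultdict


def stratified_folds(labels, n_folds=10, seed=0):
    """Assign each sample index to one of n_folds folds, class by class.

    Indices of each class are shuffled and dealt round-robin, so every
    fold receives a near-equal share of each class.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    folds = [[] for _ in range(n_folds)]
    offset = 0
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        for position, idx in enumerate(indices):
            folds[(offset + position) % n_folds].append(idx)
        # Continue dealing where the previous class stopped, which keeps
        # the total fold sizes balanced as well.
        offset += len(indices)
    return folds


# Class counts from the Methods paragraph; label strings are illustrative.
labels = ["non_allergic"] * 70 + ["allergic"] * 57
folds = stratified_folds(labels, n_folds=10, seed=0)

for fold_id, test_idx in enumerate(folds):
    test_set = set(test_idx)
    train_idx = [i for i in range(len(labels)) if i not in test_set]
    counts = Counter(labels[i] for i in test_idx)
    print(fold_id, len(train_idx), len(test_idx), dict(counts))
```

Each fold serves once as the test set while the remaining nine folds form the training set. With 127 patients, every test fold holds 12 or 13 patients, and the allergic to non-allergic ratio in each fold stays close to the ratio in the full cohort.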

Results

Among all features, age, weight, C-reactive protein, eosinophilic granulocytes, oxygen saturation, pre-medication inhaled corticosteroid + long-acting beta2-agonist (PM-ICS + LABA), PM-other (other pre-medication), H-Pulmicort/celestamine (Pulmicort/celestamine during hospitalization), and H-azithromycin (azithromycin during hospitalization) were found to be highly important. The support vector machine with a linear kernel differentiated predominantly allergic asthma from non-allergic asthma with the highest accuracy (77.8%), a precision of 0.81, a true positive rate of 0.73, a true negative rate of 0.81, an F1 score of 0.81, and a ROC-AUC score of 0.79. Logistic regression was the second-best classifier, with an overall accuracy of 76.2%.
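For readers less familiar with these metrics, the sketch below shows how accuracy, precision, true positive rate, true negative rate, and F1 score follow from the four cells of a binary confusion matrix. The counts are hypothetical and are not the study's confusion matrix; the abstract does not state which class was treated as positive or how the metrics were averaged across folds, so the example is not meant to reproduce the reported figures. ROC-AUC is left out because it requires per-patient scores rather than counts.

```python
def safe_div(numerator, denominator):
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def classification_metrics(tp, fp, tn, fn):
    """Compute common binary classification metrics from confusion counts."""
    precision = safe_div(tp, tp + fp)          # share of predicted positives that are correct
    tpr = safe_div(tp, tp + fn)                # true positive rate (sensitivity, recall)
    tnr = safe_div(tn, tn + fp)                # true negative rate (specificity)
    f1 = safe_div(2 * precision * tpr, precision + tpr)  # harmonic mean of precision and TPR
    accuracy = safe_div(tp + tn, tp + fp + tn + fn)
    return {
        "accuracy": accuracy,
        "precision": precision,
        "tpr": tpr,
        "tnr": tnr,
        "f1": f1,
    }


# Hypothetical counts chosen for easy arithmetic (100 patients in total).
metrics = classification_metrics(tp=40, fp=5, tn=45, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Because F1 is the harmonic mean of precision and true positive rate for one class, it always lies between those two values when all three are computed for the same positive class.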

Conclusion

Predominantly allergic and non-allergic asthma can be classified using machine learning approaches based on nine features.

Supplemental data for this article are available online at www.tandfonline.com/ijas.

Acknowledgements

We thank Stefan Zielen, Sven Kluge, Helena Donath, Katherina Blümchen, Jordis Trischler, and Johannes Schulze from Klinikum Goethe University (KGU) for providing the Frankfurt dataset for the present study.

Authors’ contribution

Piyush Bhardwaj: Conceptualization, Investigation, Software, Writing: Original Draft; Ashish Tyagi: Supervision, Writing: Review & Editing; Shashank Tyagi: Supervision, Writing: Review & Editing; Joana Antão: Writing: Review & Editing; Qichen Deng: Supervision, Writing: Review & Editing.

Declaration of interest

The authors declare no conflict of interest and there has been no financial support for this work.

Funding

This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.
