Abstract
Objective
To predict amyotrophic lateral sclerosis (ALS) progression using machine learning (ML) across varying observation and prediction window lengths.
Methods
We used demographic, clinical, and laboratory parameters from 5030 patients in the Pooled Resource Open-Access ALS Clinical Trials (PRO-ACT) database to classify ALS disease progression as fast (a decline of at least 1.5 points per month in the ALS Functional Rating Scale-Revised (ALSFRS-R)) or non-fast, using Extreme Gradient Boosting (XGBoost) and Bayesian Long Short-Term Memory (BLSTM) models. XGBoost identified predictors of progression, while BLSTM provided a confidence level for each prediction.
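As an illustration of the fast/non-fast label defined above, the following is a minimal sketch only. It assumes the monthly decline is estimated from two ALSFRS-R scores and the number of days between them; the abstract does not state how the rate was computed. The function names and the 30.44-day month are hypothetical choices made for this example.

```python
FAST_THRESHOLD = 1.5    # ALSFRS-R points lost per month (threshold from the abstract)
DAYS_PER_MONTH = 30.44  # assumption: average calendar month length


def monthly_decline(score_start: float, score_end: float, days_elapsed: float) -> float:
    """Return ALSFRS-R points lost per month between two visits.

    Hypothetical two-point estimate; a positive value means the score fell.
    """
    if days_elapsed <= 0:
        raise ValueError("days_elapsed must be positive")
    months = days_elapsed / DAYS_PER_MONTH
    return (score_start - score_end) / months


def is_fast_progressor(score_start: float, score_end: float, days_elapsed: float) -> bool:
    """Label a patient as fast if the decline is at least the threshold."""
    return monthly_decline(score_start, score_end, days_elapsed) >= FAST_THRESHOLD
```

For example, a drop from 40 to 34 points over roughly three months (about 2 points per month) would be labelled fast under these assumptions, whereas a drop from 40 to 38 over the same period would not.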
Results
ML models achieved an area under the receiver operating characteristic curve (AUROC) of 0.570-0.748 and were non-inferior to clinician assessments. Performance was similar across observation lengths of a single visit, 3, 6, or 12 months, and on a holdout validation dataset, but was better for longer prediction lengths. Twenty-one important predictors were identified; the top three were days since disease onset, past ALSFRS-R, and forced vital capacity. Nonstandard predictors included phosphorus, chloride, and albumin. BLSTM performed better on the samples about which it was most confident. Patient screening by the models may reduce hypothetical Phase II/III clinical trial sizes by 18.3%.
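For readers unfamiliar with the AUROC metric reported above, the sketch below computes it as the probability that a randomly chosen fast progressor receives a higher predicted score than a randomly chosen non-fast progressor, with ties counted as one half. This is the standard rank-based definition, not the authors' evaluation code; the function name and the toy inputs are hypothetical.

```python
from typing import Sequence


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Rank-based AUROC: fraction of (positive, negative) pairs ordered correctly.

    labels: 1 for fast progression, 0 for non-fast.
    scores: predicted scores, higher meaning more likely fast.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # tie counts as half
    return wins / (len(pos) * len(neg))
```

A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which places the reported range of 0.570-0.748 between the two.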
Conclusion
Similar accuracies across ML models using different observation lengths suggest that a clinical trial observation period could be shortened to a single visit and clinical trial sizes reduced. Confidence levels provided by BLSTM gave additional information on the trustworthiness of predictions, which could aid decision-making. The identified predictors of ALS progression are potential biomarkers and therapeutic targets for further research.
Keywords:
Acknowledgements
Our thanks to Dr James Berry, Massachusetts General Hospital, and the NEALS consortium for kindly providing the Celecoxib trial dataset. Our thanks also to all contributors to the PRO-ACT database. Finally, we thank all patients who have contributed their information to the patient databases and enabled us to work towards improving patient care and understanding ALS through our research using these databases.
Competing interests
The authors report there are no competing interests to declare.
Data availability statement
All PRO-ACT patient data used in this paper are publicly available in the PRO-ACT database at https://ncri1.partners.org/proact. The “Trial of celecoxib in Amyotrophic Lateral Sclerosis (ALS)” dataset (Citation40) is provided through DTUA2021A009305 with Massachusetts General Hospital/Harvard Medical School and the NEALS consortium.
Declaration of interest
The authors report no conflicts of interest. The authors alone are responsible for the content and writing of this article.