Review

Application of Machine Learning Algorithms for Asthma Management with mHealth: A Clinical Review

Pages 855-873 | Published online: 29 Jun 2022

Abstract

Background

Asthma is a variable long-term condition. Currently, there is no cure for asthma and the focus is, therefore, on long-term management. Mobile health (mHealth) is promising for chronic disease management but to be able to realize its potential, it needs to go beyond simply monitoring. mHealth therefore needs to leverage machine learning to provide tailored feedback with personalized algorithms. There is a need to understand the extent of machine learning that has been leveraged in the context of mHealth for asthma management. This review aims to fill this gap.

Methods

We searched PubMed for peer-reviewed studies that applied machine learning to data derived from mHealth for asthma management in the last five years. We selected studies that included some human data other than that routinely collected in primary care and used at least one machine learning algorithm.

Results

Out of 90 studies, we identified 22 relevant studies that were then further reviewed. Broadly, existing research efforts can be categorized into three types: 1) technology development, 2) attack prediction, 3) patient clustering. Using data from a variety of devices (smartphones, smartwatches, peak flow meters, electronic noses, smart inhalers, and pulse oximeters), most applications used supervised learning algorithms (logistic regression, decision trees, and related algorithms) while a few used unsupervised learning algorithms. The vast majority used traditional machine learning techniques, but a few studies investigated the use of deep learning algorithms.

Discussion

In the past five years, many studies have successfully applied machine learning to asthma mHealth data. However, most have been developed on small datasets with internal validation at best. Small sample sizes and lack of external validation limit the generalizability of these studies. Future research should collect data that are more representative of the wider asthma population and focus on validating the derived algorithms and technologies in a real-world setting.

Introduction

Asthma is a variable long-term condition, affecting 339 million people worldwide,Citation1 often with diurnal, seasonal and life-time differences in symptoms and disease burden. Although, for many, asthma symptoms are controlled most of the time, some have on-going poor control and all are at risk of attacks which, at best, are inconvenient and at worst can result in hospitalization or even death.Citation2 Currently, there is no cure for asthma, therefore the focus of management is on improving symptom control and reducing the risk of attacks. Asthma is an umbrella term encompassing a range of phenotypes so personalization of management strategies is essential.

Monitoring is one of the pillars of management, allowing patients to correctly assess their health and take appropriate action. Mobile health or mHealth is commonly defined as the practice of using mobile technologies in medical care. This can range from using text reminders for medical appointments to healthcare telephone helplines to using home monitoring systems and wearable devices.Citation3 mHealth encompasses many streams of data, most of which are produced faster than a single human can comprehend; machine learning is ideal for processing this amount of data to produce actionable information and personalized feedback.

Machine learning involves using computers and algorithms to process large amounts of data (many observations and many variables) and identify patterns without explicit human programming.Citation4 It has provided insights into a very wide range of applications, including genomics,Citation5Citation7 images,Citation8Citation10 sound recordings,Citation11,Citation12 vital signs,Citation13 and electronic health records data collected in primary,Citation14,Citation15 secondary,Citation16 and tertiary care.Citation17 Machine learning is an umbrella term, consisting of tools and techniques that use data to learn how to perform a given task, but the algorithms generally fall into two classes, supervised and unsupervised learning. Supervised learning finds a mathematical function to link the data with known labels and is suitable for tasks that have a well-defined goal. Unsupervised learning, on the other hand, describes patterns and structures in the data without following the lead of labels or categories defined by a human. More details about machine learning algorithms are provided in the Supplementary Material – Machine Learning.

Currently, most mHealth interventions that have been implemented in healthcare have focused on reminders and communications.Citation3 Areas of asthma management that machine learning and mHealth can support include monitoring,Citation18 personalizing care,Citation19 providing education,Citation20 understanding patterns in the population to better target care,Citation21 and predicting asthma attacks using a multitude of data sources.Citation22 Broadly, existing research efforts can be categorized into three types: 1) technology development, 2) attack prediction, 3) patient clustering.

This clinical review will provide a critical overview of the current research that has leveraged machine learning in the context of mHealth for remote asthma management, its shortcomings, challenges, the extent of readiness for deployment, and future research recommendations.

Methods

We carried out a clinical review and searched PubMed for applications of machine learning to mHealth for asthma management, based on the following inclusion criteria: 1) full text available; 2) available in English; 3) published in the last 5 years; 4) including at least one machine learning algorithm; 5) including data collected from humans; 6) including data other than electronic health records; 7) peer reviewed. We excluded systematic reviews, commentaries, and preprints. The terms used to search title and abstract are listed in Table 1. Terms in the same column were joined by the OR operator and the search terms in different columns were joined by the AND operator. Publications in the past five years equated to publications between 1st January 2017 and 30th July 2021.
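The OR-within-column, AND-between-columns rule described above can be sketched in a few lines of Python. This is only an illustration: `build_query` is a hypothetical helper, and the terms shown are placeholders chosen for the example, not the actual search terms from Table 1.

```python
def build_query(columns):
    """Join terms within a column with OR, then join the columns with AND."""
    groups = []
    for terms in columns:
        # Quote multi-word terms so they are treated as phrases.
        quoted = [f'"{t}"' if " " in t else t for t in terms]
        groups.append("(" + " OR ".join(quoted) + ")")
    return " AND ".join(groups)


# Placeholder terms for illustration only (assumed, not taken from Table 1).
example_columns = [
    ["asthma"],
    ["machine learning", "deep learning"],
    ["mHealth", "smartphone", "wearable"],
]

query = build_query(example_columns)
print(query)
```

Each inner list plays the role of one column of the search table, so adding a synonym widens the search while adding a column narrows it.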

Table 1 Search Strategy

Results

Search Results

With our search terms, we found 90 papers available via PubMed published in the last 5 years. After reviewing the abstracts of all the papers against the inclusion and exclusion criteria, 22 papers were identified and further reviewed in this study (see Figure 1).

Figure 1 Article selection.

We classified the studies in three areas: technology development, attack prediction, and patient clustering. Technology development refers to contexts where machine learning is central to developing a new monitoring tool,Citation23Citation33 such as in cough and wheeze analysis. Attack prediction refers to studies that use machine learning to predict an asthma event (typically an attack) usually using mHealth data.Citation34Citation42 Patient clustering refers to studies which subtype the asthma population using unsupervised learning algorithms.Citation43,Citation44 See Table 2 for a summary of the papers.

Table 2 Summary of Studies

Most applications of machine learning for asthma management in mHealth involve collecting self-reported data to form the ground truth of a patient’s asthma condition, and some objective data either using smartphones or mobile monitoring devices, or both. Frequently, a validated measure of asthma control is collected (eg, Asthma Control Questionnaire (ACQ)Citation45 or Asthma Control Test (ACT)Citation46) in mHealth studies. Using around five questions about the symptoms experienced by patients, the questionnaires determine whether patients’ asthma is controlled or uncontrolled.

Many methods and devices for monitoring different aspects of a person have been studied individually and in combination. Machine learning can be applied to breath monitoring,Citation37,Citation41 sleep monitoring,Citation23,Citation34Citation36,Citation38,Citation39,Citation42 cough and wheeze,Citation24,Citation26,Citation27,Citation29Citation31,Citation36 lung function monitoring,Citation23,Citation25,Citation33Citation35,Citation38,Citation40 adherence monitoring,Citation32,Citation35,Citation38,Citation43 and environment monitoring.Citation39,Citation40,Citation44 However, studies had different outcome measures; hence, it is difficult to conduct a direct comparison between studies.

Technology Development

Developing monitoring tools was a goal for 11 of the included studies. These include identifying sleeping postures from wearable respiration sensor data,Citation23 activity detection using smartwatches,Citation28 home breathing monitoring,Citation25,Citation33 and activeCitation24,Citation27,Citation29,Citation31 and passive cough and wheeze detection.Citation26,Citation30 Many of the identified studies on technology development applied digital signal processing (DSP) to process the raw signals collected via sensors, a necessary step before the application of machine learning.

TwoCitation27,Citation28 out of 11 studies included data from children and fiveCitation23,Citation25,Citation27,Citation28,Citation32 out of 11 studies included data from adults; however, none of the 11 studies developing monitoring tools had specifically investigated data from a senior population. Some of the studies on adults were conducted purely with healthy adults who could mimic a wide range of breathing patterns.

Sleep Posture

Among patients with asthma, posture (such as standing vs supine) can influence respiratory behavior.Citation55 However, there is conflicting evidence as to whether sleeping posture has a significant effect on respiratory behavior.Citation55Citation57 Identifying the posture in which a respiratory measurement was taken can be useful when studying posture-related instabilities.

Using two wearable sensors located at the abdomen and chest, four postures (standing and three sleeping) were identified with high accuracy. However, the ability to correctly identify postures from sensor data was dependent on knowing to which individual the data belonged. Using this information, classifier accuracy jumped from 21.9% to 99.5%; adapting this method for asthma management will therefore require more research or the inclusion of a calibration stage.Citation23

Activity Detection

Smartwatches are increasingly used by the general public, healthy individuals, and elite athletes to measure their health. This has promoted technology development, so that the sensors are more reliable, affordable, and comparable between brands.Citation58 Motion data (triaxial accelerometry and gyroscopic data) commonly collected by smartwatches were used in activity detection, which could improve the capabilities of passive monitoring, potentially replacing the need to ask questions about activity. Using DSP to process the raw signals and supervised learning (gradient boosted tree classification) on two datasets, various activities like standing, sitting, and walking were identified from the signals of the wrist-worn device with promising accuracy.Citation28

A comparison of algorithms trained on two datasets, one in adults and one in children, found that activity detection performed better in adults, but this was confounded by the adults performing tightly prescribed movements and the children recording more natural movements.Citation28

Breathing Monitoring

Breathing monitoring and detecting difficulties in breathing could help potentially identify asthma attacks early. Tools that have been proposed for home monitoring include portable sleep diagnostic devices to monitor breathing,Citation25 and radar to measure chest movement.Citation33 Using deep learning and features from a pulse oximeter, there were accurate predictions of the respiratory waveforms.Citation25 Likewise, applying supervised learning (XGBoost) on features extracted from chest movement recorded by the radar gave promising accuracy of identifying different breathing patterns.Citation33

Cough Monitoring

Like sleep monitoring, wheeze and cough are widely captured as a measure of asthma control and included in validated asthma questionnaires. However, there are also studies combining mHealth and machine learning to develop new tools for monitoring wheeze and cough, both activelyCitation24,Citation27,Citation29,Citation31 and passively.Citation26,Citation30 Recording and analyzing voluntary coughs and respiratory sounds from people with different respiratory diseases could provide a tool to assist diagnosis. Although separating wet (cough with phlegm) and dry coughs was successful, there were varying levels of performance when making a diagnosis using recordings alone.Citation24,Citation29,Citation31 Using voluntary cough recordings, one study classified individuals as healthy, having asthma, having chronic obstructive pulmonary disease (COPD), or having comorbid asthma and COPD with an accuracy of 93.3%.Citation24 In contrast, another study using cough type to distinguish healthy people from those with respiratory disease had a much lower performance, with AUC of 67.8%.Citation31 New DSP methods (an essential step to be able to extract relevant information from raw sound signals) have shown promise in wheeze and cough detection from digital stethoscope recordings.Citation26,Citation27,Citation30

Inhaler Technique Monitoring

Measuring adherence to medication is widely studied in asthma research. In addition to measuring when patients took medication, measuring how the inhalers were used and checking for correct technique is another application of mHealth and machine learning. Regression models of DSP processed adult audio recordings from the INhaler Compliance Assessment (INCA) device were found to accurately estimate the inhaler inhalation flow profile with 91% accuracy.Citation32 This objective measure of inhaler technique could help patients improve how they take their medication.

Attack Prediction

Machine learning was applied to several different mHealth data sources to predict asthma attacks and change in symptoms. The data included volatile organic compounds,Citation37,Citation41 sleep quality,Citation36,Citation39,Citation42 peak flow,Citation34,Citation35,Citation38,Citation40 preventer medication adherence,Citation35,Citation38 and environmental triggers.Citation39,Citation40 TwoCitation34,Citation40 of the nine studies included data collected from children or teenagers, and adults, but the population was considered as a whole in both cases. Three studiesCitation37,Citation41,Citation42 focused on children with asthma, four studiesCitation35,Citation36,Citation38,Citation39 focused on adults with asthma, and none of the studies focused on seniors. The performance of the algorithms was unlikely to have been affected by the age group of the study population.

Breath Analysis

Volatile organic compounds (VOCs), stemming from indoor pollutants, that are present in the breath of patients could be used to understand the development of asthma attacks, but evidence is inconsistent.Citation59 Gas chromatography–mass spectrometry (GC-MS) is the gold standard in VOC analysis, but electronic nose (e-Nose) could be a portable alternative. The e-Nose can detect and recognize individual chemical compounds in mixtures of chemical vapors.

The VOCs in exhaled breath of children were analyzed using both supervised and unsupervised learning.Citation37,Citation41 Supervised learning methods (penalized logistic models and random forest) were used to identify the most important VOCs for attack prediction. Classifiers were trained to identify which VOCs would predict an upcoming asthma attack or worsening control. The study reported good performance, with sensitivity and specificity between 70% and 90%, and an AUC upwards of 80%. Furthermore, unsupervised learning (principal component analysis (PCA)) was used to pre-process the data to form combinations of VOCs for attack prediction and for visualizing high-dimensional data in a two-dimensional graph.Citation37,Citation41

Sleep Monitoring

Aligned with the clinical recognition of exaggerated diurnal variation causing sleep disturbance as a sign of poorly controlled asthma,Citation60,Citation61 disturbance to sleep was widely used as a potential predictor of worsening asthma. Many studies captured night symptoms and sleep quality using questionnaires,Citation34,Citation35,Citation38 but some collected objective sleep data using devices.Citation36,Citation39,Citation42 Out of 25 features used to predict asthma attacks with daily (symptom diary-like) questionnaires about asthma, night symptoms-related features were two of the four most predictive features.Citation35 Also, night-time waking was selected as one of three basic variables used for prediction.Citation34 When the objective data were combined with machine learning algorithms (random forest, generalized linear mixed models, regression), smartphone recordings were used to analyze nocturnal coughs,Citation36 fitness tracker activity data were related to sleep wakening,Citation39 and bed sensors were used to predict asthma control.Citation42 The usefulness of sensors for predicting self-reported asthma control is unclear: nocturnal cough and sleep quality alone achieved a balanced accuracy of no more than 70% in predicting attacks,Citation36 whereas fitness tracker data predicted sleep wakening with an AUC of 77%,Citation39 and bed sensor data predicted reports of asthma symptoms with an accuracy of 87.4%.Citation42
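Several of the figures quoted in this section (sensitivity, specificity, accuracy, balanced accuracy) are derived from the same four confusion-matrix counts. The sketch below shows how they relate. The counts are invented for illustration and do not come from any of the reviewed studies, and `classification_metrics` is a hypothetical helper name.

```python
def classification_metrics(tp, fn, tn, fp):
    """Compute common binary classification metrics from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)   # share of true attacks that were flagged
    specificity = tn / (tn + fp)   # share of attack-free periods correctly cleared
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    balanced_accuracy = (sensitivity + specificity) / 2
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "balanced_accuracy": balanced_accuracy,
    }


# Hypothetical counts: 10 attack days and 90 attack-free days.
metrics = classification_metrics(tp=4, fn=6, tn=86, fp=4)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With these made-up counts the plain accuracy is high even though most attacks are missed, which is why balanced accuracy is often the more informative figure when attacks are rare.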

Lung Function Monitoring

Falling peak expiratory flow (PEF) is a major indicator of asthma attacks. Peak flow meters are sometimes used by patients at home to take objective measurements that inform whether action needs to be taken. Spirometers also measure lung function, but in more detail than peak flow meters.Citation62 Action plans use a threshold of 80% of a person’s best PEF to determine that action needs to be taken, and urgent action is required if a person’s PEF falls below 60%.Citation60 A drop in PEF and/or a change in symptom score are widely used in asthma action plans to determine self-management in response to deterioration.Citation63 Smart peak flow meters enable patients to measure and track their PEF, and are often linked with a mobile app to function.
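The two PEF thresholds mentioned above translate directly into a simple rule. The following is a minimal sketch under stated assumptions: the function name and zone labels are hypothetical, and treating a reading of exactly 80% as "no action" is an assumption, since the text does not say how boundary values are handled. It is not a clinical tool.

```python
def pef_zone(current_pef, personal_best_pef):
    """Classify a PEF reading as a percentage of the person's best PEF.

    Thresholds follow the text (80% and 60% of best PEF); the labels and the
    handling of boundary values are assumptions made for illustration.
    """
    if personal_best_pef <= 0:
        raise ValueError("personal_best_pef must be positive")
    percent_of_best = 100 * current_pef / personal_best_pef
    if percent_of_best < 60:
        return "urgent action"
    if percent_of_best < 80:
        return "action needed"
    return "no action"


# Hypothetical readings (L/min) for a person whose best PEF is 500 L/min.
for reading in (450, 350, 250):
    print(reading, pef_zone(reading, 500))
```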

Measuring PEF to monitor lung function is commonplace in asthma studies. This could be either reporting the results from a traditional peak flow meter,Citation34,Citation35,Citation40 or using a smart peak flow meter that sends the data through a computer or smartphone.Citation38 PEF measurements are used both as predictors of asthma attacks and for defining severity and informing management. Using daily diaries and PEF measurements to predict worsening condition with supervised learning (adaptive Bayesian network) achieved a performance of 100.0% accuracy, sensitivity, and specificity.Citation38

Adherence Monitoring

Adherence to regular preventative medication is sometimes captured by questionnaire and used as a predictor for asthma attacks.Citation35,Citation38 Although clinically important, the two studies did not identify the adherence to controller medication as an important predictive feature in their methods. In contrast, and consistent with clinical recommendations, features based on the use of short-acting reliever medication were two of the four most predictive features.Citation35

Environment Monitoring

Some common asthma triggers in the environment, such as pollen, meteorological change, and air pollution (eg, particulate matter, carbon monoxide (CO), nitrogen dioxide (NO2)), could be monitored to reduce risk of exposure to known triggers. Also, recording asthma triggers encountered, such as viral infections, passive smoke, and pets, could give a better understanding of a person’s asthma and their symptoms.Citation64Citation66 Connecting data from pollution monitoring stations and meteorology stations with patient health records provides a wealth of information for analysis.

Furthermore, combining physicians’ knowledge using a rule-based classifier (analogous to a decision tree created based on knowledge) with conventional supervised learning techniques (multinomial logistic regression, SVM, random forest, extreme gradient boosting, KNN, decision tree, Gaussian naïve Bayesian) created an accurate (sensitivity of 88.3% and precision of 89.4%) ensemble learning algorithm for predicting levels of asthma control.Citation40 Based on the joined dataset, the most important features for prediction were lung function and symptoms: PEF in the morning and before bedtime, ACT score, and shortness of breath in the last 24 hours. Although environmental features were not ranked highly, daily NO2 concentration and daily temperatures were useful.Citation40 Further, a home environment measuring device has also been shown to be useful in predicting self-reported asthma-specific wakening.Citation39

Patient Clustering

Two studiesCitation43,Citation44 used unsupervised learning to form data-driven clusters using data collected via mHealth. One study investigated clusters in children with asthma,Citation43 while the other focused on data collected by adults with asthma.Citation44

Adherence Monitoring

In addition to capturing adherence to regular controller medication via questionnaires, there have also been in-depth studies of medication adherence. Smart inhalers are devices that objectively measure how inhaler medication is taken, as an alternative to self-report. Monitoring can be applied to the long-acting controller inhaler or the short-acting reliever inhaler, or both. By analyzing electronic inhaler monitoring data of controller medication with unsupervised learning algorithms (PCA and k-means), asthma patients were characterized by multi-dimensional inhaler adherence measures, which formed three groups: poor (on average 16% of their prescribed doses), moderate (averaged 60% of dose), and good (averaged 91% of dose) adherence.Citation43 Furthermore, comparison with clusters formed by another data-driven method (decision trees) yielded similar results.Citation43
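To give a feel for how k-means groups patients by adherence, here is a deliberately simplified one-dimensional sketch. It is not the method of the cited study, which applied PCA and k-means to multi-dimensional adherence measures. The adherence percentages and starting centroids are invented for illustration, and `kmeans_1d` is a hypothetical helper.

```python
def kmeans_1d(values, initial_centroids, max_iter=100):
    """Lloyd's algorithm for one-dimensional data with given starting centroids."""
    centroids = list(initial_centroids)
    clusters = []
    for _ in range(max_iter):
        # Assignment step: attach each value to its nearest centroid.
        clusters = [[] for _ in centroids]
        for v in values:
            nearest = min(range(len(centroids)), key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        new_centroids = [
            sum(c) / len(c) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
        if new_centroids == centroids:
            break  # assignments can no longer change
        centroids = new_centroids
    return centroids, clusters


# Made-up adherence values (% of prescribed doses taken), one per patient.
adherence = [10, 15, 20, 22, 55, 60, 62, 65, 88, 90, 92, 95]
centroids, clusters = kmeans_1d(adherence, initial_centroids=[0, 50, 100])
print(centroids)
print(clusters)
```

The cluster means are the kind of per-group average adherence figure reported above, although in practice the result depends on the starting centroids and on the number of clusters chosen.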

Environment Monitoring

Like many daily questionnaires, recording encounters with asthma triggers can be difficult and lead to missing data. To tackle this, probability-based imputation with consensus clustering was developed as a method of imputing the missing data and clustering patients, which can be used to subtype asthma patients for personalized alerts based on their triggers.Citation44 Using the imputation method, three patient clusters were formed using the daily asthma symptom data. The characteristics of each cluster were investigated on four clinical, three demographic, and three trigger features. Cluster 1, with the highest average day symptom level, had patients who frequently reported pollen and heat as their triggers. On the other hand, cluster 3, with the lowest average day symptoms, was characterized more by patients citing air quality as their trigger.Citation44 Prospectively, weather forecasts could be useful in predicting the risk of a future asthma attack for patients who are sensitive to environmental triggers such as sudden temperature changes or high pollen levels.

Discussion

This review has described a range of machine learning applications being used to support asthma management, in the areas of developing novel technology,Citation23Citation33 predicting acute attacks at an individual level,Citation34Citation42 and informing understanding of asthma phenotypes by clustering patients within populations.Citation43,Citation44 There were examples of successful application of machine learning to achieve a novel task (such as attack prediction from sleep quality, control prediction from exhaled breath, and characterizing asthma patients by medication adherence)Citation36,Citation37,Citation42,Citation43 or to improve existing methodology by using fewer resources for similar or better performance (such as smartphone-based passive monitoring of coughs).Citation24,Citation26,Citation27,Citation30,Citation31,Citation40,Citation41

Most of the machine learning algorithms applied were easily interpretable,Citation26Citation32,Citation34Citation39 a desirable characteristic to help easily understand the decision process in a clinical context. However, a few studies applied more complex but less interpretable machine learning algorithms.Citation24,Citation25,Citation40

Developing Novel Technology: Proof-of-Concept with Clinical Potential

Using machine learning, new home monitoring tools were under development, including for activity detection, breath monitoring, cough monitoring, and inhaler technique monitoring.Citation23Citation33 Most studies were in the proof-of-concept stage and although they were developed on selected small populations, many had achieved promising performance.Citation23Citation25 An initial challenge, before considering the clinical potential of novel technology, is to process the incoming data so that background noise is removed and clear signals emerge.Citation29 This was the focus of several of the papers that described development of new methods to filter the signal data.Citation26,Citation27,Citation29 Before using the novel technology to monitor asthma at home, validation studies should be conducted in a real-world environment.

Prediction of Attacks: Supporting Individual Self-Management

Asthma is a variable condition,Citation67 and central to supported self-management is the ability to recognize early evidence of deterioration and to take appropriate timely action to prevent a serious attack.Citation68,Citation69 A key aim of many of the machine learning papers was to use a wide variety of data sources to identify an individual’s risk of uncontrolled asthma and to improve prediction of asthma attacks.Citation34Citation42 All the predictors explored (asthma symptoms, PEF, VOCs, fractional exhaled nitric oxide (FeNO), heart rate, respiratory rate, sleep quality, medication adherence, and environment) showed promise, though it was widely discussed that combining multiple varied data sources could help improve asthma attack prediction.Citation28,Citation34,Citation35,Citation38,Citation40 Importantly, the prediction algorithms were developed retrospectively and require external validation in different datasets before they can be used in clinical practice. Besides the need for external validation, future studies should also consider evaluating the algorithms by comparison to existing effective “action plans” in clinical practice.

Clustering Patients: Informing Phenotypes and Targeting Care

Contemporary understanding of asthma as an umbrella term describing a heterogeneous group of conditionsCitation70 has increased interest in identifying phenotypes of asthma amenable to specific treatments or carrying specific risks of poor symptom control and/or acute attacks. Using unsupervised learning algorithms, progress has been made on forming patient clusters representing natural patterns spotted in the data.Citation43 Understanding phenotypes not only has value in terms of individual risk and targeting care to “treatable traits” but can inform health service delivery as appropriate care can be targeted on high-risk populations.Citation71 However, many of the studies used relatively small datasets, often of populations selected for frequent symptoms or willingness to monitor, with limited generalizability to the whole asthma population.Citation23Citation25,Citation31,Citation36,Citation37,Citation39,Citation41Citation43 Future research should consider larger sample sizes that can better represent the general asthma population.

Machine Learning Applied to Asthma Management: Challenges

Tailored Data Collection

The performance of machine learning algorithms largely depends on the input data; hence, the sample size and data pre-processing methods must be considered in conjunction with the performance metrics. Most data used to train the machine learning algorithms in this review had small sample sizes, and sometimes used narrow inclusion criteria to collect the data.Citation23Citation25,Citation31,Citation36,Citation37,Citation39,Citation41Citation43 For example, a common exclusion criterion for asthma studies is “other respiratory disease”,Citation23,Citation37,Citation41,Citation43,Citation44 which makes for a homogeneous dataset (which may be easier to analyze) but it reduces the likelihood of the results being generalizable. It also overlooks the possibility that the conditions excluded may be part of the phenotype. Even within asthma, different individuals have different medication regimes, which complicates the analysis,Citation43 but selection according to a specific regime (say prescribed combination controller medication) will only give information on a selected population. Importantly, in longitudinal studies where participant retention is a factor, different individuals may provide different amounts of data for analysis, which will skew analysis towards patients who are more engaged with the study, more adherent to data collection, possibly influenced by the characteristics of their asthma.Citation42,Citation44

Secondary Analysis of Existing Datasets

To tackle the problem of small sample sizes, some studies have conducted secondary analysis on data that were collected for a different purpose.Citation27,Citation34 Eight studies (36%) were based on data that were publicly available or available on request.Citation26Citation28,Citation30,Citation34,Citation35,Citation43,Citation44 This makes for efficient use of data, but the aims (and thus eligibility) of the original dataset may not match the aims of the new analysis thereby making the interpretation of the results more challenging.

Missing Data

How each analysis handled missing data is important for understanding the differences between studies.Citation35,Citation40,Citation42,Citation44 If the amount of missing data is small, removing the cases with missing data is an option. Alternatively, imputing the missing values is a method that avoids losing data, but is a major challenge when there is a low response rate or the data are not missing at randomCitation44,Citation72,Citation73 (eg, people with frequent attacks may monitor more regularly than those who rarely have symptoms). Other methods to handle missing data include interpolation into regular spacing or creating summary windows,Citation35 which can then be analyzed using regular methods. However, each method of handling missing data carries its own assumptions (for example, assuming that people with missing inhaler data have the same inhaler usage rate as people who reported whether or not they used their inhaler).
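Two of the approaches mentioned above, interpolation onto regular spacing and summary windows, can be sketched as follows. This is an illustrative sketch only: the daily PEF values are invented, the helper names are hypothetical, and both functions implicitly assume that the gaps are unrelated to the patient's condition, which, as noted, may not hold.

```python
def interpolate_linear(series):
    """Fill interior None gaps by linear interpolation between known neighbours."""
    filled = list(series)
    known = [i for i, v in enumerate(series) if v is not None]
    for left, right in zip(known, known[1:]):
        step = (series[right] - series[left]) / (right - left)
        for i in range(left + 1, right):
            filled[i] = series[left] + step * (i - left)
    return filled  # leading and trailing gaps are left as None


def window_means(series, width):
    """Summarise each non-overlapping window by the mean of its observed values."""
    means = []
    for start in range(0, len(series), width):
        observed = [v for v in series[start:start + width] if v is not None]
        means.append(sum(observed) / len(observed) if observed else None)
    return means


# Made-up daily PEF readings (L/min); None marks a day with no measurement.
daily_pef = [400, None, None, 430, 420, None, 410, None]
print(interpolate_linear(daily_pef))
print(window_means(daily_pef, width=4))
```

Interpolation keeps the daily resolution but invents values, whereas summary windows keep only observed values at the cost of a coarser time scale.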

Low Event Rate in the Dataset

For many people with less severe asthma, attacks are infrequent leading to large “class imbalance”. In some populations, the imbalance can be upwards of 90%.Citation26,Citation34Citation36,Citation38,Citation40 Data analysis sampling techniques, such as Synthetic Minority Oversampling TEchnique (SMOTE),Citation74 have been applied to balance out the classes by essentially multiplying the minority class, which allows machine learning techniques to function properly. For example, oversampling techniques can be used to artificially enlarge the number of asthma attacks such that the data now has 50% attacks and 50% controlled asthma.
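The rebalancing idea can be illustrated with plain random oversampling, which duplicates minority-class records until the classes are equal in size. Note that this is a simpler technique than SMOTE, which synthesizes new minority examples rather than copying existing ones. The records below are invented and `random_oversample` is a hypothetical helper.

```python
import random


def random_oversample(records, label_key="attack", seed=0):
    """Duplicate minority-class records at random until both classes are equal in size."""
    rng = random.Random(seed)
    positives = [r for r in records if r[label_key]]
    negatives = [r for r in records if not r[label_key]]
    minority, majority = sorted([positives, negatives], key=len)
    if not minority:
        raise ValueError("cannot oversample an empty class")
    # Draw with replacement from the minority class to close the gap.
    extra = [rng.choice(minority) for _ in range(len(majority) - len(minority))]
    return majority + minority + extra


# Made-up data: 100 monitored days, of which 5 are attack days.
records = [{"day": d, "attack": d % 20 == 0} for d in range(100)]
balanced = random_oversample(records)
attack_share = sum(r["attack"] for r in balanced) / len(balanced)
print(len(balanced), attack_share)
```

A common precaution is to oversample only the training portion of the data, so that copies of the same record do not appear in both the training and the evaluation sets.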

Inconsistent Output Definitions During Modelling

Different studies of asthma attack predictions had different definitions of an asthma attack and outcome measures. This included using patient symptoms,Citation36,Citation37,Citation39Citation42 self-reported asthma attack treatment,Citation34,Citation35 and spirometry measurements.Citation38,Citation39 Although sometimes similar, the different definitions cannot be used in direct comparison.Citation73 Furthermore, some outcomes were easier to model based on the input data, thus leading to over-optimistic performance results. For example, Finkelstein and Jeong used 21 daily measures, including symptoms and PEF, to predict asthma attacks.Citation38 However, the asthma attacks were defined as the PEF zone on day 8, which is directly related to one of the input features, namely PEF on day 7. Consequently, it is not sufficient to assess any study based solely on the performance metrics without the broader context.
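One way to put such results in context is to compare them with a trivial baseline that predicts tomorrow's PEF zone to be the same as today's. The sketch below uses an invented 14-day sequence with hypothetical zone labels. It does not reproduce any study's data. It only shows that, when the outcome is closely tied to an input, a rule with no learning at all can already score well.

```python
def persistence_accuracy(zones):
    """Accuracy of predicting each day's zone as the previous day's zone."""
    pairs = list(zip(zones, zones[1:]))
    hits = sum(previous == current for previous, current in pairs)
    return hits / len(pairs)


# Made-up sequence of daily PEF zones; the labels are illustrative only.
zones = ["green"] * 6 + ["yellow"] * 3 + ["green"] * 5
baseline = persistence_accuracy(zones)
print(f"persistence baseline accuracy: {baseline:.3f}")
```

Under these assumptions, a model's reported accuracy is only informative to the extent that it exceeds this kind of baseline.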

External Validation

For external validation, the “new” dataset must be similar to the training dataset in at least the key parameters for the machine learning algorithms to be compared meaningfully. Ideally, and especially for health data, the methods should be robust and comparable even if there are slight differences in the data. It is highly challenging to externally validate machine learning models, partly due to major differences in inclusion criteria and outcome definitions, and most often due to lack of access to comparable data.Citation26,Citation30,Citation41 Slight differences in the wording of questions or in device choice can create datasets that are similar yet not directly comparable, and hence not suitable for external validation (for example, acute attacks might be measured as “needing an oral steroid course” or “unscheduled care” and might be assessed over a year or a few months). In the context of mHealth, this requires similar devices to be used, but rapidly advancing technology may make this a challenge. However, this may change in the future as devices become validated and widely used (much as validated questionnaires and guidelines have allowed studies to be comparable).

None of the machine learning algorithms in the 22 studies had been externally validated; they had been validated internally only.

Data Quality

Conducting data collection in controlled environments enables cleaner data to be collected and analyzed.Citation27,Citation29 However, real-world settings will most likely lead to reduced data quality. Consequently, it is important that a given model’s performance is evaluated for use by actual patients in their day-to-day lives.Citation32,Citation33

Future Direction

Machine learning algorithms depend on their input data. Since most existing studies are based on relatively small sample sizes and often selected populations, the next natural step is to validate the results in larger – and more representative – populations.Citation25,Citation39,Citation43 Future research should consider adding other data sources to existing models, collecting multi-dimensional data using several devices and data sources simultaneously to provide a more complete picture of a person and their environment, whilst also assessing the utility of individual devices.Citation25,Citation28,Citation34,Citation35,Citation38,Citation40 Studies like MyAirCoachCitation22 and Biomedical REAl-Time Health Evaluation (BREATHE)Citation51 that combine several sources of data longitudinally are important for future development of mHealth technologies for asthma.

The data used to train the machine learning models were collected from children, teenagers, and adults, and from patients with asthma, COPD, and other respiratory diseases, some exclusively and others in combination. Although any variation in the performance of algorithms trained on data from different age groups was unlikely to be directly related to age, it remains to be seen whether a model developed for one population can perform comparably in a new or more general population.

Expanding the functionality of the technologies developed, improving their performance, and validating results against other devices are further areas for future research.Citation23,Citation24,Citation27,Citation31,Citation33,Citation37,Citation41 For example, wheeze detection could be extended to other breath sounds,Citation27 expanding its application to other respiratory diseases. Cough detection could be applied to more difficult data, such as a mix of multiple individuals and background noise,Citation24 much like the “cocktail party problem” in machine learning. Developments in image recognition and video analysis using machine learning are promisingCitation8–Citation10 and could be applied to enhance inhaler technique monitoring.

The data generated by mHealth devices for home monitoring are increasingly reliable and validated against existing gold-standard equipment.Citation58,Citation75,Citation76 However, the validity of the information created by machine learning analysis has not yet reached the standards required by health services. Many more large-scale studies, akin to clinical trials, will be required to test the outputs of real-time analysis using mHealth and machine learning algorithms deployed in the real world.Citation23,Citation28–Citation30,Citation34,Citation42 Although training machine learning models often requires a large amount of computing power, the resulting models may be easy to use and can be deployed and run on a mobile phone.

An ideal asthma management system combining machine learning and mHealth would intelligently utilize both active and passive monitoring and be validated with clinical trials. Passive monitoring requires minimal input from the patient, such as wearing a smartwatch or switching on a sleep monitoring device, capturing data without interfering with the patient’s daily life. In contrast, active monitoring requires more input from the patient but could provide more detailed information about a person’s condition, such as measuring peak flow or answering questions about asthma control. Using machine learning to infer when active monitoring is required based on passive monitoring data would minimize the need for intrusive data collection, while not reducing the attention given to patients.Citation36,Citation40 Most importantly, systems must be evaluated clinically to ensure clinical (and cost) effectiveness and safety.

Strengths and Limitations

A reproducible search strategy was implemented using PubMed, a free search engine, to search for the latest developments in applications of machine learning algorithms, with the focus placed only on the past five years. The interdisciplinary team who interpreted the papers consisted of practicing clinicians (covering both primary and secondary care) and applied machine learning experts. However, this is not a systematic review, and it was challenging to directly compare studies and algorithms due to their diverse contexts.

Conclusion

Recent developments in applying machine learning to asthma management have tested a wide range of functionalities using mHealth devices. The algorithms have demonstrated promising results, but they have only been assessed with internal validation at best. Further, the algorithms were mostly developed on small datasets and in selected populations. Consequently, the likely performance of these algorithms in the general population in a real-world environment is unknown. Future research should include external validation with large sample sizes and a focus on combining multiple, diverse sources of data.

Abbreviations

ACT, Asthma Control Test; ACQ, Asthma Control Questionnaire; AUC, area under the ROC curve; BYOT, bring your own technology; COPD, chronic obstructive pulmonary disease; DSP, digital signal processing; FeNO, fractional exhaled nitric oxide; FN, false negative; FP, false positive; GINA, Global Initiative for Asthma; kNN, k-nearest neighbors; LSTM, long short-term memory; mHealth, mobile health; PCA, principal component analysis; PEF, peak expiratory flow; PPG, photoplethysmogram; RCT, randomized controlled trial; ROC, receiver operating characteristic; SVM, support vector machine; TN, true negative; TP, true positive; VOC, volatile organic compound.

Acknowledgement

This work is funded by Asthma+Lung UK as part of the Asthma UK Centre for Applied Research [AUK-AC-2018-01].

Disclosure

The authors report no conflicts of interest in this work.

References

  • Global Asthma Network. The global asthma report 2018. Global Asthma Network; 2018.
  • Reddel HK, Taylor DR, Bateman ED, et al. An official American Thoracic Society/European Respiratory Society statement: asthma control and exacerbations. Am J Respir Crit Care Med. 2009;180(1):59–99. doi:10.1164/rccm.200801-060ST
  • World Health Organization. mHealth: new horizons for health through mobile technology. WHO Press; 2011. Available from: http://www.who.int/about/. Accessed September 3, 2021.
  • Samuel AL. Some studies in machine learning using the game of checkers. IBM J Res Dev. 1959;3(3):210–229. doi:10.1147/rd.33.0210
  • Zhou Y, Zhao L, Zhou N, et al. Predictive big data analytics using the UK biobank data. Sci Rep. 2019;9(1):6012. doi:10.1038/s41598-019-41634-y
  • Senior AW, Evans R, Jumper J, et al. Improved protein structure prediction using potentials from deep learning. Nature. 2020;577(7792):706–710. doi:10.1038/s41586-019-1923-7
  • Caravagna G, Giarratano Y, Ramazzotti D, et al. Detecting repeated cancer evolution from multi-region tumor sequencing data. Nat Methods. 2018;15(9):707–714. doi:10.1038/s41592-018-0108-x
  • Gornale SS, Patravali PU, Manza RR. Detection of osteoarthritis using knee x-ray image analyses: a machine vision based approach. Int J Comput Appl. 2016;145(1):20–26. doi:10.5120/ijca2016910544
  • Falcini F, Lami G, Costanza AM. Deep learning in automotive software. IEEE Softw. 2017;34(3):56–63. doi:10.1109/MS.2017.79
  • Giarratano Y, Bianchi E, Gray C, et al. Automated segmentation of optical coherence tomography angiography images: benchmark data and clinically relevant metrics. Transl Vis Sci Technol. 2020;9(13):5. doi:10.1167/tvst.9.13.5
  • Palaniappan R, Sundaraj K, Ahamed NU. Machine learning in lung sound analysis: a systematic review. Biocybern Biomed Eng. 2013;33(3):129–135. doi:10.1016/j.bbe.2013.07.001
  • Li R, Jiang J-Y, Wu X, Hsieh C-C, Stolcke A. Speaker identification for household scenarios with self-attention and adversarial training. In: Interspeech 2020, ISCA: 2020: 2272–2276.
  • Shah SA, Velardo C, Farmer A, Tarassenko L. Exacerbations in chronic obstructive pulmonary disease: identification and prediction using a digital health system. J Med Internet Res. 2017;19(3):e69. doi:10.2196/jmir.7207
  • Hill NR, Ayoubkhani D, McEwan P, et al. Predicting atrial fibrillation in primary care using machine learning. PLoS One. 2019;14(11):e0224582. doi:10.1371/JOURNAL.PONE.0224582
  • Wang Z, Shah AD, Tate AR, Denaxas S, Shawe-Taylor J, Hemingway H. Extracting diagnoses and investigation results from unstructured text in electronic health records by semi-supervised machine learning. PLoS One. 2012;7(1):e30412. doi:10.1371/journal.pone.0030412
  • Shah SA. Vital sign monitoring and data fusion for paediatric triage. [PhD Thesis]; 2012. Available from: https://ora.ox.ac.uk/objects/uuid:80ae66e3-849b-4df1-b064-f9eb7530200d. Accessed October 25, 2021.
  • Shah SA, Brown P, Gimeno H, Lin J-P, McClelland VM. Application of machine learning using decision trees for prognosis of deep brain stimulation of globus pallidus internus for children with dystonia. Front Neurol. 2020;11:825. doi:10.3389/fneur.2020.00825
  • Menni C, Valdes AM, Freidin MB, et al. Real-time tracking of self-reported symptoms to predict potential COVID-19. Nat Med. 2020;26:1037–1040. doi:10.1038/s41591-020-0916-2
  • Berry SE, Valdes AM, Drew DA, et al. Human postprandial responses to food and potential for precision nutrition. Nat Med. 2020;26(6):964–973. doi:10.1038/s41591-020-0934-0
  • North M, Bourne S, Green B, et al. A randomised controlled feasibility trial of E-health application supported care vs usual care after exacerbation of COPD: the RESCUE trial. Npj Digit Med. 2020. doi:10.1038/s41746-020-00347-7
  • Horne E, Tibble H, Sheikh A, Tsanas A. Challenges of clustering multimodal clinical data: review of applications in asthma subtyping. JMIR Med Informatics. 2020;8(5):e16452. doi:10.2196/16452
  • Honkoop PJ, Simpson A, Bonini M, et al. MyAirCoach: the use of home-monitoring and mHealth systems to predict deterioration in asthma control and the occurrence of asthma exacerbations; study protocol of an observational study. BMJ Open. 2017;7(1):e013935. doi:10.1136/bmjopen-2016-013935
  • Chen A, Zhang J, Zhao L, et al. Machine-learning enabled wireless wearable sensors to study individuality of respiratory behaviors. Biosens Bioelectron. 2020;173:112799. doi:10.1016/j.bios.2020.112799
  • Vatanparvar K, Nemati E, Nathan V, Rahman MM, Kuang J. CoughMatch – subject verification using cough for personal passive health monitoring. In: 2020 42nd Annual International Conference of the IEEE Engineering in Medicine & Biology Society (EMBC), IEEE: 2020: 5689–5695.
  • Prinable J, Jones P, Boland D, Thamrin C, McEwan A. Derivation of breathing metrics from a photoplethysmogram at rest: machine learning methodology. JMIR mHealth uHealth. 2020;8(7):e13737. doi:10.2196/13737
  • Adhi Pramono RX, Anas Imtiaz S, Rodriguez-Villegas E. Automatic cough detection in acoustic signal using spectral features. In: 2019 41st Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), IEEE: 2019: 7153–7156.
  • Chen H, Yuan X, Li J, Pei Z, Zheng X. Automatic multi-level in-exhale segmentation and enhanced generalized S-transform for wheezing detection. Comput Methods Programs Biomed. 2019;178:163–173. doi:10.1016/j.cmpb.2019.06.024
  • Li K, Habre R, Deng H, et al. Applying multivariate segmentation methods to human activity recognition from wearable sensors’ data. JMIR mHealth uHealth. 2019;7(2):e11201. doi:10.2196/11201
  • Azam MA, Shahzadi A, Khalid A, Anwar SM, Naeem U. Smartphone based human breath analysis from respiratory sounds. In: 2018 40th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), IEEE: 2018: 445–448.
  • Adhi Pramono RX, Anas Imtiaz S, Rodriguez-Villegas E. Automatic identification of cough events from acoustic signals. In: 2019 41st Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), IEEE: 2019: 217–220.
  • Infante C, Chamberlain DB, Kodgule R, Fletcher RR. Classification of voluntary coughs applied to the screening of respiratory disease. In: 2017 39th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), IEEE: 2017: 1413–1416.
  • Taylor TE, Lacalle Muls H, Costello RW, Reilly RB. Estimation of inhalation flow profile using audio-based methods to assess inhaler medication adherence. Kou YR, ed. PLoS One. 2018;13(1):e0191330. doi:10.1371/journal.pone.0191330
  • Purnomo AT, Lin D-B, Adiprabowo T, Hendria WF. Non-contact monitoring and classification of breathing pattern for the supervision of people infected by COVID-19. Sensors. 2021;21(9):3172. doi:10.3390/s21093172
  • Zhang O, Minku LL, Gonem S. Detecting asthma exacerbations using daily home monitoring and machine learning. J Asthma. 2021;58(11):1518–1527. doi:10.1080/02770903.2020.1802746
  • Tsang KCH, Pinnock H, Wilson AM, Ahmar Shah S. Application of machine learning to support self-management of asthma with mHealth. In: 2020 42nd Annual International Conference of the IEEE Engineering in Medicine & Biology Society (EMBC), IEEE: 2020: 5673–5677. doi:10.1109/EMBC44109.2020.9175679
  • Tinschert P, Rassouli F, Barata F, et al. Nocturnal cough and sleep quality to assess asthma control and predict attacks. J Asthma Allergy. 2020;13:669–678. doi:10.2147/JAA.S278155
  • Tenero L, Sandri M, Piazza M, Paiola G, Zaffanello M, Piacentini G. Electronic nose in discrimination of children with uncontrolled asthma. J Breath Res. 2020;14(4):046003. doi:10.1088/1752-7163/ab9ab0
  • Finkelstein J, Jeong I. Machine learning approaches to personalize early prediction of asthma exacerbations. Ann N Y Acad Sci. 2017;1387(1):153–165. doi:10.1111/nyas.13218
  • Castner J, Jungquist CR, Mammen MJ, Pender JJ, Licata O, Sethi S. Prediction model development of women’s daily asthma control using fitness tracker sleep disruption. Heart Lung. 2020;49(5):548–555. doi:10.1016/j.hrtlng.2020.01.013
  • Khasha R, Sepehri MM, Mahdaviani SA. An ensemble learning method for asthma control level detection with leveraging medical knowledge-based classifier and supervised learning. J Med Syst. 2019;43(6):158. doi:10.1007/s10916-019-1259-8
  • van Vliet D, Smolinska A, Jöbsis Q, et al. Can exhaled volatile organic compounds predict asthma exacerbations in children? J Breath Res. 2017;11(1):016016. doi:10.1088/1752-7163/aa5a8b
  • Huffaker MF, Carchia M, Harris BU, et al. Passive nocturnal physiologic monitoring enables early detection of exacerbations in children with asthma: a proof-of-concept study. Am J Respir Crit Care Med. 2018;198(3):320–328. doi:10.1164/rccm.201712-2606OC
  • Tibble H, Chan A, Mitchell EA, et al. A data-driven typology of asthma medication adherence using cluster analysis. Sci Rep. 2020;10(1):14999. doi:10.1038/s41598-020-72060-0
  • Tignor N, Wang P, Genes N, et al. Methods for clustering time series data acquired from mobile health apps. In: Biocomputing 2017, World Scientific: 2017: 300–311.
  • Juniper EF, O’Byrne PM, Guyatt GH, Ferrie PJ, King DR. Development and validation of a questionnaire to measure asthma control. Eur Respir J. 1999;14(4):902–907. doi:10.1034/j.1399-3003.1999.14d29.x
  • Nathan RA, Sorkness CA, Kosinski M, et al. Development of the asthma control test: a survey for assessing asthma control. J Allergy Clin Immunol. 2004;113(1):59–65. doi:10.1016/j.jaci.2003.09.008
  • Adhi Pramono RX, Imtiaz SA, Rodriguez-Villegas E. A cough-based algorithm for automatic diagnosis of pertussis. PLoS One. 2016;11(9):e0162128. doi:10.1371/journal.pone.0162128
  • Rocha BM, Filos D, Mendes L, et al. Α respiratory sound database for the development of automated classification. In: IFMBE Proceedings. Vol 66, Singapore: Springer: 2018: 33–37.
  • Ward JJ. Rale lung sounds 3.1 professional edition. Respir Care. 2005;50(10):1385–1388.
  • Anguita D, Ghio A, Oneto L, Parra X, Reyes-Ortiz JL. A public domain dataset for human activity recognition using smartphones. In: 2013 21st European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning, (ESANN): 2013.
  • Bui AAT, Hosseini A, Rocchio R, et al. Biomedical REAl-Time Health Evaluation (BREATHE): toward an mHealth informatics platform. JAMIA Open. 2020;3(2):190–200. doi:10.1093/jamiaopen/ooaa011
  • Atienza T, Aquino T, Fernández M, et al. Budesonide/formoterol maintenance and reliever therapy via turbuhaler versus fixed-dose budesonide/formoterol plus terbutaline in patients with asthma: phase III study results. Respirology. 2013;18(2):354–363. doi:10.1111/RESP.12009
  • Chan Y-FY, Wang P, Rogers L, et al. The asthma mobile health study, a large-scale clinical observational study using ResearchKit. Nat Biotechnol. 2017;35(4):354–362. doi:10.1038/nbt.3826
  • Chan AHY, Stewart AW, Harrison J, Camargo CA, Black PN, Mitchell EA. The effect of an electronic monitoring device with audiovisual reminder function on adherence to inhaled corticosteroids and school attendance in children with asthma: a randomised controlled trial. Lancet Respir Med. 2015;3(3):210–219.
  • Katz S, Arish N, Rokach A, Zaltzman Y, Marcus E-L. The effect of body position on pulmonary function: a systematic review. BMC Pulm Med. 2018;18(1):159. doi:10.1186/s12890-018-0723-4
  • Kera T, Maruyama H. The effect of posture on respiratory activity of the abdominal muscles. J Physiol Anthropol Appl Human Sci. 2005;24(4):259–265. doi:10.2114/jpa.24.259
  • Penzel T, Möller M, Becker HF, Knaack L, Peter JH. Effect of sleep position and sleep stage on the collapsibility of the upper airways in patients with sleep apnea. Sleep. 2001;24(1):90–95. doi:10.1093/sleep/24.1.90
  • Price K, Bird SR, Lythgo N, Raj IS, Wong JYL, Lynch C. Validation of the Fitbit One, Garmin Vivofit and Jawbone UP activity tracker in estimation of energy expenditure during treadmill walking and running. J Med Eng Technol. 2017;41(3):208–215. doi:10.1080/03091902.2016.1253795
  • Nurmatov UB, Tagiyeva N, Semple S, Devereux G, Sheikh A. Volatile organic compounds and risk of asthma and allergy: a systematic review. Eur Respir Rev. 2015;24(135):92–101. doi:10.1183/09059180.00000714
  • Scottish Intercollegiate Guidelines Network/ British Thoracic Society. SIGN 158 British guideline on the management of asthma. BTS/SIGN; 2019. Available from: https://www.brit-thoracic.org.uk/document-library/guidelines/asthma/btssign-guideline-for-The-management-of-asthma-2019/. Accessed June 17, 2022.
  • Clark TJ, Hetzel MR. Diurnal variation of asthma. Br J Dis Chest. 1977;71(2):87–92.
  • Moore VC. Spirometry: step by step. Breathe. 2012;8(3):232–240. doi:10.1183/20734735.0021711
  • Honkoop PJ, Taylor DR, Smith AD, Snoeck-Stroband JB, Sont JK. Early detection of asthma exacerbations by using action points in self-management plans. Eur Respir J. 2013;41(1):53–59. doi:10.1183/09031936.00205911
  • Gautier C, Charpin D. Environmental triggers and avoidance in the management of asthma. J Asthma Allergy. 2017;10:47–56. doi:10.2147/JAA.S121276
  • Fang W, Zhang Y, Li S, et al. Effects of air pollutant exposure on exacerbation severity in asthma patients with or without reversible airflow obstruction. J Asthma Allergy. 2021;14:1117–1127. doi:10.2147/JAA.S328652
  • Baldacci S, Maio S, Cerrai S, et al. Allergy and asthma: effects of the exposure to particulate matter and biological allergens. Respir Med. 2015;109(9):1089–1104. doi:10.1016/J.RMED.2015.05.017
  • Global Initiative for Asthma (GINA). Global strategy for asthma management and prevention; 2021. Available from: https://ginasthma.org/gina-reports/. Accessed June 17, 2021.
  • Pinnock H, Parke HL, Panagioti M, et al. Systematic meta-review of supported self-management for asthma: a healthcare perspective. BMC Med. 2017;15(1):64. doi:10.1186/s12916-017-0823-7
  • Pearce G, Parke HL, Pinnock H, et al. The PRISMS taxonomy of self-management support: derivation of a novel taxonomy and initial testing of its utility. J Health Serv Res Policy. 2016;21(2):73–82. doi:10.1177/1355819615602725
  • Pavord ID, Beasley R, Agusti A, et al. After asthma: redefining airways diseases. Lancet. 2018;391(10118):350–400. doi:10.1016/S0140-6736(17
  • Morjaria JB, Polosa R. Recommendation for optimal management of severe refractory asthma. J Asthma Allergy. 2010;3:43–56. doi:10.2147/jaa.s6710
  • Sterne JAC, White IR, Carlin JB, et al. Multiple imputation for missing data in epidemiological and clinical research: potential and pitfalls. BMJ. 2009;338(7713):157–160. doi:10.1136/BMJ.B2393
  • Ghassemi M, Naumann T, Schulam P, Beam AL, Chen IY, Ranganath R. A review of challenges and opportunities in machine learning for health. In: AMIA Joint Summits on Translational Science. Vol 2020, American Medical Informatics Association: 2020: 191–200.
  • Blagus R, Lusa L. SMOTE for high-dimensional class-imbalanced data. BMC Bioinform. 2013;14(1):106. doi:10.1186/1471-2105-14-106
  • Nelson BW, Allen NB. Accuracy of consumer wearable heart rate measurement during an ecologically valid 24-hour period: intraindividual validation study. JMIR mHealth uHealth. 2019;7(3):e10828. doi:10.2196/10828
  • VanZeller C, Williams A, Pollock I. Comparison of bench test results measuring the accuracy of peak flow meters. BMC Pulm Med. 2019;19(1):74. doi:10.1186/s12890-019-0837-3