Review

Demystification of artificial intelligence for respiratory clinicians managing patients with obstructive lung diseases

Pages 1207-1219 | Received 13 Jul 2023, Accepted 04 Jan 2024, Published online: 25 Jan 2024

ABSTRACT

Introduction

Asthma and chronic obstructive pulmonary disease (COPD) are leading causes of morbidity and mortality worldwide. Despite all available diagnostics and treatments, these conditions pose a significant individual, economic and social burden. Artificial intelligence (AI) promises to support clinical decision-making processes by optimizing diagnosis and treatment strategies of these heterogeneous and complex chronic respiratory diseases. Its capabilities extend to predicting exacerbation risk, disease progression and mortality, providing healthcare professionals with valuable insights for more effective care. Nevertheless, the knowledge gap between respiratory clinicians and data scientists remains a major constraint for wide application of AI and may hinder future progress. This narrative review aims to bridge this gap and encourage AI deployment by explaining its methodology and added value in asthma and COPD diagnosis and treatment.

Areas covered

This review offers an overview of the fundamental concepts of AI and machine learning, outlines the key steps in building a model, provides examples of their applicability in asthma and COPD care, and discusses barriers to their implementation.

Expert opinion

Machine learning can advance our understanding of asthma and COPD, enabling personalized therapy and better outcomes. Further research and validation are needed to ensure the development of clinically meaningful and generalizable models.

1. Introduction

Artificial Intelligence (AI) has grown strikingly over the last decade, driven by recent technological advancements in computational power and increased acquisition and (real-time) availability of large volumes of different types of data. There is no consensus on the definition of AI [Citation1]. For this review, the definition proposed in the University of Helsinki’s Elements of AI course was adopted, which defines AI as systems that can execute specific tasks autonomously and adaptively. Broadly, AI includes machine learning (ML), rule-based expert systems and supporting technologies (Figure 1). AI is leading to a paradigm shift in clinical practice, optimizing processes and curtailing medical errors [Citation2]. In respiratory medicine, it is primarily applied to optimize analyses of chest computed tomography scans and conventional chest radiographs, supporting the diagnosis of a wide range of health conditions, such as lung cancer [Citation3].

Figure 1. Artificial intelligence hierarchy. Artificial intelligence comprises a variety of methods, including supporting technologies, rule-based expert systems and machine learning. Machine learning can be further divided into supervised, unsupervised and reinforcement learning. KNN – K-nearest neighbors; OLS – ordinary least squares; SVM – support vector machine.

Asthma and chronic obstructive pulmonary disease (COPD) are highly prevalent chronic airway diseases, placing a substantial burden on individuals, healthcare systems and societies [Citation4,Citation5]. AI offers great promise in addressing challenges associated with the diagnosis and management of these heterogeneous and complex diseases by i) analyzing simultaneously different types of data including demographics, lifestyle, patient-reported outcome measures, pulmonary and extrapulmonary features; and ii) finding linear and non-linear relationships and complex patterns. Consequently, it may aid in gaining insight into disease heterogeneity [Citation6,Citation7] and identifying patients at risk of exacerbations or premature death [Citation8,Citation9], allowing for timely treatment adjustments and improved health outcomes and costs.

Despite the mounting evidence showcasing AI advantages within respiratory medicine, its integration in clinical practice remains scarce. This is partly due to the unawareness of AI potential and clinicians’ limited understanding of these techniques [Citation10]. Bridging the knowledge gap may help build trust in AI, consequently boosting its deployment in respiratory medicine [Citation10]. This narrative review aims to improve clinicians’ understanding of AI and provide examples of its implementation in the diagnosis and management of chronic airway diseases.

2. Demystifying machine learning

ML, a subfield of AI, encompasses three main methods: supervised learning (deep and shallow), unsupervised learning, and reinforcement learning (Figure 1). For simplicity, this section introduces ML starting with linear regression.

A linear regression model is the relationship y = f(x) between a dependent (or response) variable y and one or more independent (or explanatory) variables x, as shown in Equation (1):

$$\hat{y}_i = \beta_0 + \beta_1 x_{i,1} + \beta_2 x_{i,2} + \dots + \beta_m x_{i,m} \tag{1}$$

Here, $x_1, \dots, x_m$ denote the explanatory variables, which can be clinical characteristics (e.g. age or forced expiratory volume in the first second (FEV1)). $\hat{y}_i$ denotes the response values predicted by the model (e.g. the predicted number of days until the next exacerbation-related hospitalization) as opposed to $y_i$, which are the true observed response values (e.g. the actual number of days until the exacerbation-related hospitalization).

The model’s coefficients $\beta_0, \dots, \beta_m$ are traditionally fitted by ordinary least squares, which selects $\beta_0, \dots, \beta_m$ such that the mean squared difference between observed and predicted/expected values is minimized (e.g. the smallest difference between the predicted and observed number of days until the next exacerbation-related hospitalization) (Figure S1).
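As a minimal sketch of ordinary least squares, the Python snippet below fits a one-feature version of Equation (1) using the closed-form solution. The data values and the names `fit_ols`, `fev1_pct` and `days_obs` are made up for illustration and do not come from any study cited in this review.

```python
def fit_ols(x, y):
    """Fit y ≈ b0 + b1 * x by ordinary least squares (one feature)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Closed-form OLS solution for a single explanatory variable
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    return b0, b1


def mean_squared_error(observed, predicted):
    """Average squared difference between observed and predicted values."""
    return sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)


# Hypothetical data, invented for this example only
fev1_pct = [30.0, 40.0, 50.0, 60.0, 70.0]      # x: FEV1 (% predicted)
days_obs = [62.0, 78.0, 101.0, 119.0, 140.0]   # y: observed days until hospitalization

b0, b1 = fit_ols(fev1_pct, days_obs)
days_pred = [b0 + b1 * x for x in fev1_pct]
print(f"intercept={b0:.2f}, slope={b1:.2f}, "
      f"MSE={mean_squared_error(days_obs, days_pred):.2f}")
```

Any other choice of intercept and slope would give a larger mean squared error on these five points, which is what "fitted by ordinary least squares" means in practice.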

Linear regression adjusted for confounders on observational data has been a common approach for hypothesis testing and causal explanation [Citation11]. In the AI/ML era, emphasis has shifted to predictive models, which do not claim to test a causal hypothesis. Instead, they are correlational models, fitted on observational data, that claim to predict an outcome (e.g. the number of days until the next exacerbation-related hospitalization). Many clinical tasks are predictive in nature, such as identifying patients at increased risk of a poor outcome [Citation12] or making a diagnosis from radiological images [Citation13]. It is noteworthy that causality cannot be inferred from predictive models due to confounders. While randomized controlled trials are the gold standard for establishing causality, they are not always feasible [Citation14]. With the increasing availability of data, there is a growing interest in estimating causal effects from observational data using ML (also known as causal ML) [Citation14–17]. In contrast to traditional ML, causal ML combines data-driven methods with causal inference to determine the effect of a variable on an outcome while considering the complex interplay between all variables [Citation15–17]. This usually requires domain knowledge of the relationships between variables involved to guide the modeling process and deduce causal effects [Citation15–17]. Causal ML holds significant promise in healthcare, particularly for tasks that extend beyond prediction (e.g. deciding which intervention is likely to result in the best outcome [Citation16]). In such cases, clinicians seek to understand what would happen to the outcome if different decisions were made [Citation16]. In a recent application, causal ML has been used to estimate the treatment effects of dual therapy fluticasone furoate/vilanterol on mortality and exacerbation rate in people with COPD [Citation18]. 
Healthcare professionals may, therefore, identify responses to different treatments prior to an intervention by incorporating causal ML in their practice.

2.1. Forms of machine learning

2.1.1. Supervised learning

Predictive modeling is the underlying concept of supervised learning and consists of models, fitted on data, that predict response values y (labels in ML) from independent variables x (features in ML) [Citation19]. Linear regression is the simplest form of supervised learning. There are many other types of algorithms, such as decision trees, support vector machines, and neural networks (Table S1). All these types of models are similar in form and function to the linear regression model in Equation (1), in that they are mathematical equations used to predict y values from x values. Supervised learning involves fitting a mathematical equation of the form y = f(x) based on the training data, in which both the response yi and the features xi have been observed. The purpose of the model is to make predictions for new observations: the features x are observed, and the equation is then used to compute a y value.

Problems in which the y variable is categorical are called classification (Figure 2a). For instance, support vector machines and artificial neural networks have been applied to discriminate asthmatics from healthy subjects (y = 1 for asthma or y = 0 for healthy), using lung sound signals as predictors (features, x) [Citation20]. Conversely, if y is quantitative, the problem is called regression (Figure 2b). For example, lung function in asthma (y = FEV1% predicted) has been predicted from breathing and speech audio (features, x) using linear regression, random forest, and support vector machine [Citation21].
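The difference between classification and regression can be sketched with a single algorithm. The snippet below uses k-nearest neighbors (KNN, listed in Figure 1) on one hypothetical feature: for classification it takes a majority vote over the neighbors' class labels, and for regression it averages their numeric labels. The feature `wheeze_score` and all values are invented for illustration and are not data from the cited studies.

```python
from collections import Counter


def nearest_labels(train_x, train_y, query, k):
    """Return the labels of the k training points closest to `query`."""
    by_distance = sorted(zip(train_x, train_y), key=lambda pair: abs(pair[0] - query))
    return [label for _, label in by_distance[:k]]


def knn_classify(train_x, train_y, query, k=3):
    """Classification: categorical y, predicted by majority vote."""
    votes = Counter(nearest_labels(train_x, train_y, query, k))
    return votes.most_common(1)[0][0]


def knn_regress(train_x, train_y, query, k=3):
    """Regression: quantitative y, predicted by averaging."""
    labels = nearest_labels(train_x, train_y, query, k)
    return sum(labels) / len(labels)


# Hypothetical training data (one feature, two kinds of label)
wheeze_score = [0.1, 0.2, 0.3, 0.7, 0.8, 0.9]        # x
has_asthma = [0, 0, 0, 1, 1, 1]                      # categorical y (1 = asthma)
fev1_pct = [95.0, 92.0, 90.0, 70.0, 65.0, 60.0]      # quantitative y

new_patient = 0.75
print("predicted class:", knn_classify(wheeze_score, has_asthma, new_patient))
print("predicted FEV1 %:", knn_regress(wheeze_score, fev1_pct, new_patient))
```

The same features feed both tasks; only the type of label, and therefore the way neighbor labels are combined, changes.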

Figure 2. Graphical representation of (a) classification and (b) regression problems of supervised learning; (c) clustering analysis and (d) reinforcement learning.

Deep learning takes model complexity to an extreme. A deep learning model is built around artificial neural networks, which are essentially mathematical equations of the form y = f(x) [Citation19]. In deep learning, artificial neural network models have multiple layers, where the output y of one layer is the input x for the next layer [Citation22]. The resulting model is essentially still a mathematical equation of the form y = f(x), but with many more terms than the model in Equation (1). Deep neural networks have achieved high predictive performances in detecting asthma from sociodemographic variables, clinical data, biochemical results, lung function, and bronchial challenge test [Citation23], and in detecting COPD from lung sounds [Citation24]. Such algorithms are superior at handling many features, and therefore outperform shallow algorithms in applications such as image and audio processing [Citation25].
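To make the idea of stacked layers concrete, here is a small sketch of a forward pass through a two-layer network, in which the output of the first layer becomes the input of the second. The weights are arbitrary numbers chosen for illustration (a real network would learn them from data), and the ReLU and sigmoid functions are common choices assumed here, not ones specified in this review.

```python
import math


def dense(inputs, weights, biases):
    """One layer: each output is a weighted sum of all inputs plus a bias."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def relu(values):
    """Non-linearity applied between layers."""
    return [max(0.0, v) for v in values]


def sigmoid(v):
    """Squash a number into the range (0, 1), usable as a probability."""
    return 1.0 / (1.0 + math.exp(-v))


# Arbitrary illustrative weights (hypothetical, not learned)
W1, B1 = [[0.5, -0.2], [-0.3, 0.8]], [0.0, 0.1]   # layer 1: 2 inputs -> 2 hidden units
W2, B2 = [[1.0, -1.0]], [0.0]                     # layer 2: 2 hidden units -> 1 output


def forward(x):
    hidden = relu(dense(x, W1, B1))        # output y of layer 1 ...
    output = dense(hidden, W2, B2)[0]      # ... is the input x of layer 2
    return sigmoid(output)


print(forward([1.0, 2.0]))
```

Written out in full, `forward` is still a single equation y = f(x); adding layers only adds terms to it.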

2.1.2. Unsupervised learning

Unsupervised learning is another form of ML, where x data (features) are available but no y values (labels) [Citation19]. It consists of exploratory techniques, such as clustering (Figure 2c) and dimensionality reduction, which aim to find structure in data, or suggest a representation of data with fewer dimensions (i.e. fewer x variables), respectively. Table S1 provides a brief overview of the algorithms used for these purposes. Clustering techniques have been of great value to better understand asthma and COPD heterogeneity. For example, a hierarchical cluster analysis was conducted to explore COPD phenotypes based on 13 comorbidities (features x, no label y) [Citation26].
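As an illustration of clustering without labels, the sketch below implements a basic k-means loop: assign each point to its nearest centroid, then move each centroid to the mean of its members. The two features (number of comorbidities, exacerbations per year), the data points and the starting centroids are hypothetical, and k-means is used here only because it is simple to write down; it is not the hierarchical method of the study cited above.

```python
def squared_distance(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))


def assign(points, centroids):
    """Label each point with the index of its nearest centroid."""
    labels = []
    for p in points:
        distances = [squared_distance(p, c) for c in centroids]
        labels.append(distances.index(min(distances)))
    return labels


def kmeans(points, centroids, iterations=10):
    for _ in range(iterations):
        labels = assign(points, centroids)
        updated = []
        for j, old in enumerate(centroids):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                # Move the centroid to the mean of its members
                updated.append(tuple(sum(dim) / len(members) for dim in zip(*members)))
            else:
                updated.append(old)   # keep an empty cluster's centroid in place
        centroids = updated
    return assign(points, centroids), centroids


# Hypothetical patients: (number of comorbidities, exacerbations per year)
patients = [(1, 0), (2, 1), (1, 1), (6, 3), (7, 4), (6, 4)]
labels, centers = kmeans(patients, centroids=[patients[0], patients[3]])
print(labels, centers)
```

No outcome variable is supplied anywhere; the grouping emerges from the features alone, which is the defining property of unsupervised learning.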

2.1.3. Reinforcement learning

In reinforcement learning [Citation27], the task is not to find a y = f(x) relationship in the data, but to learn a suitable action in each set of possible scenarios. In contrast to supervised and unsupervised learning, reinforcement learning does not fit a model based on historical data; instead, the procedure creates new data by interacting with the system in which it tries to learn suitable actions (Figure 2d). Reinforcement learning can be used for optimizing ventilation regimes in critically ill patients [Citation28]. A ventilation strategy (tidal volume, positive end-expiratory pressure, and fraction of inspired oxygen) is proposed by the model based on the patient’s current characteristics (e.g. demographics, vitals, laboratory results, and medication). The algorithm learns the optimal ventilation regime by analyzing patients’ health status and observing the result of its actions on patient survival [Citation28]. Reinforcement learning holds great potential in informing clinical decision-making due to its ability to address sequential decision-making problems, e.g. when treatment requires continuous adjustments considering changes in the patient’s health status [Citation29]. Examples of successful research applications using reinforcement learning in healthcare practice [Citation29] include the optimization of cancer treatment [Citation29,Citation30] and of multimorbidity management in patients with type 2 diabetes [Citation31].
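The learn-by-interaction loop can be sketched with tabular Q-learning on a toy problem. Everything below is an assumption made for illustration: an abstract chain of four states in which the last one is a desirable terminal state, two actions, and a reward of 1 for reaching the goal. To keep the example reproducible, every action is tried in every state in turn rather than explored at random. It is not the ventilation model of the cited study.

```python
GOAL = 3                   # terminal (desired) state; states 0, 1, 2 are non-terminal
ACTIONS = (0, 1)           # 0 = step down, 1 = step up
ALPHA, GAMMA = 0.5, 0.9    # learning rate and discount factor (illustrative values)


def step(state, action):
    """Toy environment: return (next_state, reward, done)."""
    next_state = state + 1 if action == 1 else max(state - 1, 0)
    done = next_state == GOAL
    return next_state, (1.0 if done else 0.0), done


# Q[state][action]: current estimate of the long-term reward of an action
Q = [[0.0, 0.0] for _ in range(GOAL)]

for _ in range(50):                          # repeated rounds of interaction
    for state in range(GOAL):
        for action in ACTIONS:
            next_state, reward, done = step(state, action)
            future = 0.0 if done else GAMMA * max(Q[next_state])
            # Q-learning update: move the estimate toward reward + discounted future value
            Q[state][action] += ALPHA * (reward + future - Q[state][action])

# Learned policy: the best-valued action in each non-terminal state
policy = [max(ACTIONS, key=lambda a: Q[s][a]) for s in range(GOAL)]
print(policy)
```

The data used for learning (state, action, reward, next state) are generated by calling `step`, that is, by acting on the system, rather than being read from a historical dataset.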

3. Rule-based expert systems

Rule-based algorithms translate expert knowledge, like the implicit rules used by physicians in diagnosis, into explicit if/then/else statements. Note that this is fundamentally different from ML: ML algorithms are mathematical relationships derived from data, not deduced from expert knowledge. Rule-based expert systems can support clinical decision making by providing recommendations or warnings from a set of rules based on a patient’s information. For example, an expert system has been developed by pulmonologists for supporting the diagnosis of obstructive lung disease in primary care [Citation32]. Complex tasks like natural language processing and image recognition are challenging to handle with rule-based methods. As a result, ML has become the predominant approach, contributing to recent successes in AI.
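A rule-based system can be as simple as a few explicit if/then/else statements. The sketch below is purely illustrative: the field names, the 0.70 FEV1/FVC cut-off and the 12% reversibility cut-off are assumptions chosen for the example, and the rules are not those of the expert system cited above nor a clinical recommendation.

```python
def suggest(patient):
    """Apply hand-written rules to one patient's spirometry results.

    `patient` is a dict with hypothetical keys:
    fev1 and fvc (litres) and reversibility_pct (% change after bronchodilator).
    """
    ratio = patient["fev1"] / patient["fvc"]
    if ratio >= 0.70:                          # illustrative threshold
        return "no airflow obstruction detected"
    elif patient["reversibility_pct"] >= 12:   # illustrative threshold
        return "obstruction with reversibility: consider asthma"
    else:
        return "fixed obstruction: consider COPD"


print(suggest({"fev1": 2.0, "fvc": 4.0, "reversibility_pct": 5}))
```

Every rule here was written by a person; nothing was estimated from data, which is the key contrast with the ML models described in Section 2.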

4. Supporting technologies

AI also includes supporting technologies such as sensors, machine-human interaction, and knowledge representation, as well as integrated technologies in which ML is applied, such as robotics and autonomous cars. Examples of supporting technologies are the digital AI-powered stethoscope for remote monitoring of respiratory sounds [Citation33], continuous remote monitoring of oxygen saturation levels using pulse oximetry [Citation34] or physical activity sensors [Citation35].

5. Phases of ML model development

ML models for medical purposes have advanced, but respiratory clinicians are often unfamiliar with how they are built. This section describes the main steps for building ML models.

5.1. Data cleaning

ML algorithms are data-driven methods capable of integrating extensive and multimodal data, i.e. different types of data including clinical and omics data, text, audio (e.g. lung sounds) and imaging (e.g. computed tomography scans). Healthcare data often contains missing values, inconsistencies, or errors, leading to inaccurate analysis and unreliable results that can jeopardize patient safety. It is imperative to ensure data is complete, accurate, and consistent before training a model. Data cleaning includes checking data accuracy, format, uniformity, de-duplication, handling of missing values and outliers [Citation36].
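The cleaning steps listed above can be sketched on a handful of records. In this hypothetical example, duplicates are removed by patient identifier, implausible values are treated as missing, and missing values are replaced by the median of the observed ones. The field names, the plausible range of 5 to 150 and the choice of median imputation are assumptions for illustration; appropriate rules depend on the dataset.

```python
from statistics import median


def clean(records, low=5.0, high=150.0):
    # 1. De-duplicate: keep the first record for each patient id
    seen, unique = set(), []
    for record in records:
        if record["id"] not in seen:
            seen.add(record["id"])
            unique.append(dict(record))      # copy so the input is not modified

    # 2. Outliers: values outside a plausible range are treated as missing
    for record in unique:
        value = record["fev1_pct"]
        if value is not None and not (low <= value <= high):
            record["fev1_pct"] = None

    # 3. Missing values: impute with the median of the observed values
    observed = [r["fev1_pct"] for r in unique if r["fev1_pct"] is not None]
    fill = median(observed)
    for record in unique:
        if record["fev1_pct"] is None:
            record["fev1_pct"] = fill
    return unique


# Hypothetical raw records with a duplicate, a missing value and a data-entry error
raw = [
    {"id": 1, "fev1_pct": 55.0},
    {"id": 1, "fev1_pct": 55.0},
    {"id": 2, "fev1_pct": None},
    {"id": 3, "fev1_pct": 620.0},
    {"id": 4, "fev1_pct": 75.0},
    {"id": 5, "fev1_pct": 65.0},
]
cleaned = clean(raw)
print(cleaned)
```

In a real project each of these decisions (which duplicates to keep, what counts as implausible, how to impute) should be documented, since they can change the fitted model.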

5.2. Model selection

The selection of an ML algorithm is task dependent. If the aim is to predict an outcome such as a diagnosis (i.e. presence or absence of a disease) or prognosis (e.g. risk of death), supervised learning algorithms may be useful for these types of tasks. Conversely, if the goal is to uncover patterns in data where there is no predefined outcome variable, such as clustering individuals based on a set of characteristics, unsupervised learning methods should be employed. For tasks involving a sequence of decisions such as the management of chronic diseases with a sequential set of interventions to manage disease progression and severity, reinforcement learning becomes more appropriate. Other aspects should also be taken into consideration, such as number of features, model complexity, interpretability, and performance.

5.2.1. Number of features and model complexity

Including a large set of features may substantially increase model complexity and risk of overfitting. An ideal model has the right complexity to capture the y = f(x) relationship, but not the random noise [Citation19] (Figure S2A). Underfitted models fail to capture the y = f(x) relationship and thereby produce less accurate predictions (Figure S2B). In contrast, overfitted models not only model the y = f(x) relationship, but also capture sample-specific random noise [Citation19] (Figure S2C), leading to a small error in the training set, but a large error in new data [Citation19].
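The contrast between underfitting, a suitable fit and overfitting can be shown with three toy models on made-up data: a constant (too simple), a straight line, and a "memorizer" that returns the label of the closest training point (too flexible). The data were invented so that the underlying trend is linear with noise added; none of it comes from the cited sources.

```python
def mse(observed, predicted):
    return sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)


# Hypothetical data: a linear trend plus noise
train_x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
train_y = [5.0, 1.0, 9.0, 5.0, 13.0, 9.0]
test_x = [1.4, 2.4, 3.4, 4.4, 5.4]           # new, unseen observations
test_y = [1.8, 5.8, 5.8, 9.8, 9.8]

# Fit a straight line by ordinary least squares
mean_x = sum(train_x) / len(train_x)
mean_y = sum(train_y) / len(train_y)
slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(train_x, train_y))
         / sum((x - mean_x) ** 2 for x in train_x))
intercept = mean_y - slope * mean_x


def constant_model(x):       # underfit: ignores x entirely
    return mean_y


def linear_model(x):         # matches the complexity of the true relationship
    return intercept + slope * x


def memorizer(x):            # overfit: reproduces the noise of the nearest training point
    nearest = min(range(len(train_x)), key=lambda i: abs(train_x[i] - x))
    return train_y[nearest]


results = {}
for name, model in [("constant", constant_model), ("linear", linear_model),
                    ("memorizer", memorizer)]:
    train_error = mse(train_y, [model(x) for x in train_x])
    test_error = mse(test_y, [model(x) for x in test_x])
    results[name] = (train_error, test_error)
    print(f"{name:10s} train MSE={train_error:6.2f}  test MSE={test_error:6.2f}")
```

The memorizer has zero error on the training set yet the largest error on the new data, while the line has the lowest error on new data; this is the pattern described for overfitted versus well-specified models.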

5.2.2. Predictive performance and interpretability

Predictive performance and interpretability are important aspects that influence the choice of a ML algorithm [Citation36]. In healthcare, both accuracy and interpretability are required. Interpretable models, such as linear regression, enable easy understanding of how independent variables contribute to the response, but may show low predictive ability when input variables have non-linear relationships. In linear regression, a positive coefficient indicates that the mean of the dependent variable tends to increase as the value of the independent variable increases, whereas a negative coefficient implies the opposite. More complex models, such as artificial neural networks, may yield higher predictive performances at the cost of lower interpretability [Citation19], for which they are known as ‘black box’ algorithms [Citation37].

5.3. Validation

Model validation is the process of evaluating an ML model on a test set to ensure its precision before using it in real-life applications [Citation38]. There are two forms of validation: internal and external validation [Citation39]. Internal validation is performed in individuals that have the same origin as the ones used for model training [Citation39]. External validation consists of testing the model in a fully independent sample to determine its generalizability [Citation39].
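One common way to carry out internal validation is k-fold cross-validation, in which the available sample is split into k parts and each part serves once as the test set while the rest is used for training. The review does not name a specific procedure, so the sketch below is an assumption about how internal validation might be organized; `k_fold_indices` is a hypothetical helper, and in practice the records would usually be shuffled first.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_idx, test_idx) pairs; each sample is in a test fold exactly once."""
    indices = list(range(n_samples))
    # Spread any remainder over the first folds so sizes differ by at most one
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, test_idx
        start += size


folds = list(k_fold_indices(n_samples=10, k=3))
for train_idx, test_idx in folds:
    print("train:", train_idx, "test:", test_idx)
```

External validation has no equivalent shortcut: it requires a separate, independently collected sample, which cannot be produced by re-splitting the development data.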

5.4. Performance metrics

Performance metrics are required to measure the performance of an ML model. Most regression metrics are based on the difference between the observed and the predicted values, such as mean absolute error and root mean squared error [Citation40]. Because both mean absolute error and root mean squared error are scale dependent, there is no fixed threshold below which a model is considered appropriate. Instead, the lower the error, the better the model’s performance.
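Both regression metrics are short formulas. The sketch below computes them for four made-up pairs of observed and predicted values; the numbers are illustrative only.

```python
import math


def mean_absolute_error(observed, predicted):
    """Average of the absolute differences."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


def root_mean_squared_error(observed, predicted):
    """Square root of the average squared difference; penalizes large errors more."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed))


# Hypothetical values, e.g. days until the next hospitalization
observed = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 37.0]

mae = mean_absolute_error(observed, predicted)
rmse = root_mean_squared_error(observed, predicted)
print(f"MAE={mae:.2f}, RMSE={rmse:.2f}")
```

Both results are expressed in the units of the outcome (here, days), which is why their values cannot be compared across outcomes measured on different scales.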

In classification, most metrics are obtained from a confusion matrix, which is a cross-tabulation of the true and the predicted classes (Table S2) [Citation40]. Several metrics can be calculated from the confusion matrix, such as sensitivity, specificity, and accuracy (Table S3) [Citation40]. The model’s performance improves as accuracy, sensitivity, and specificity converge toward one. A good discrimination model should have a high accuracy (>70%) [Citation41] and a good trade-off between specificity and sensitivity (sum of specificity and sensitivity greater than 1.5) [Citation42].
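As a worked sketch, the snippet below builds the four cells of a confusion matrix from made-up true and predicted classes (1 = disease, 0 = no disease) and derives sensitivity, specificity and accuracy from them. The labels are invented for illustration.

```python
def confusion_counts(true, pred):
    """Return (TP, FP, TN, FN) for binary labels coded as 1 and 0."""
    tp = sum(1 for t, p in zip(true, pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(true, pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(true, pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(true, pred) if t == 1 and p == 0)
    return tp, fp, tn, fn


def classification_metrics(true, pred):
    tp, fp, tn, fn = confusion_counts(true, pred)
    return {
        "sensitivity": tp / (tp + fn),              # share of diseased correctly detected
        "specificity": tn / (tn + fp),              # share of healthy correctly ruled out
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }


# Hypothetical true and predicted classes for ten individuals
true_class = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
pred_class = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]

metrics = classification_metrics(true_class, pred_class)
print(confusion_counts(true_class, pred_class), metrics)
```

In this example the sum of sensitivity and specificity is about 1.42, so the hypothetical model would fall short of the 1.5 rule of thumb mentioned above despite reaching 70% accuracy.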

Another commonly reported metric is the area under the Receiver Operating Characteristic (ROC) Curve (AUC). The AUC can also be denoted as C-statistic. It is important to recognize that C-statistics involve distinct formulations in time-to-event analysis compared to the standard binary outcome. An AUC value of one indicates perfect classification, whereas an AUC value of 0.5 indicates that the model predicts no better than chance [Citation19]. AUC scores above 0.7 are considered acceptable [Citation43]. The AUC, however, can be misleading when classes are imbalanced [Citation40]. In these situations, the area under the precision-recall curve is recommended [Citation40].
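For a binary outcome, the AUC can be read as the probability that a randomly chosen positive case receives a higher predicted score than a randomly chosen negative case. The sketch below computes it directly from that pairwise definition (ties count as one half) on made-up scores. It covers the standard binary case only, not the time-to-event formulations of the C-statistic.

```python
def auc(labels, scores):
    """AUC as the share of positive-negative pairs ranked correctly (ties count 0.5)."""
    positives = [s for lab, s in zip(labels, scores) if lab == 1]
    negatives = [s for lab, s in zip(labels, scores) if lab == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical labels (1 = event) and predicted risk scores
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]

print(f"AUC={auc(labels, scores):.3f}")
```

A model that gives every individual the same score ranks no pair correctly or incorrectly and therefore obtains 0.5, the "no better than chance" value.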

6. Role of ML in asthma

Examples of articles implementing ML in asthma care are summarized in Table 1.

Table 1. Summary of characteristics of the included studies applying machine learning for asthma diagnosis and management.

6.1. Diagnosis

Asthma diagnosis remains a major problem due to disease heterogeneity and non-specificity of the symptoms. In fact, respiratory symptoms vary over time and in intensity, which may lead to misattribution to other respiratory diseases, especially in the absence of lung function testing [Citation44]. Therefore, diagnosis of asthma is often challenging, and attempts have been made to identify potential diagnostic markers using ML.

Sociodemographic variables, clinical data, spirometry parameters, biochemical findings, and bronchial tests [Citation23]; carbon dioxide waveforms [Citation45]; respiratory sounds [Citation20]; and Raman spectra from blood sera samples [Citation46] have been considered in the development of ML-based diagnostic tools for asthma. Accuracies exceeded 90% using deep neural networks [Citation23] and support vector machine [Citation20,Citation45,Citation46].

6.2. Phenotypes

Asthma phenotyping is becoming increasingly important as it provides a foundation to understand disease etiology, heterogeneity and ultimately guide treatment [Citation7]. Various clustering approaches, predominantly hierarchical, k-means, and k-medoids algorithms, have been employed to investigate asthma phenotypes in adults [Citation7]. These approaches have considered a diverse range of variables, including sociodemographic, clinical, pathophysiological, lung function, behavioral, medication, and healthcare utilization data [Citation7].

Based on readily available variables in clinical practice, recent research has identified three phenotypes using the k-medoids method [Citation47]. The cluster solution was independently replicated in another sample [Citation47]. Another recent study considered biomarkers from routine blood tests to explain asthma heterogeneity using the k-means algorithm. The three identified clusters had different risks of asthma exacerbations [Citation48].

6.3. Management

Electronic health records have gained increased popularity as sources of valuable information and have already been considered for predicting asthma exacerbations [Citation49–51]. Exacerbations, defined as the need for oral corticosteroids, an asthma-related emergency department visit, or hospitalization, were moderately predicted with a deep neural network (AUC = 0.70) [Citation50]. Similarly, a boosting algorithm was used to predict non-severe exacerbations, emergency department visits or hospitalization [Citation49]. Emergency department visits and hospitalizations were better predicted (AUC = 0.88 and AUC = 0.85, respectively) than non-severe exacerbations (AUC = 0.71) [Citation49]. Similar accuracies were obtained when predicting emergency department visits or hospitalizations in asthma (AUC = 0.86) using another boosting algorithm [Citation51]. Identified predictors for exacerbations were history of non-severe exacerbations requiring oral glucocorticoid bursts, severe asthma, age, number of hospital visits and number of systemic corticosteroids prescriptions [Citation49,Citation51].

Monitoring asthma control levels and symptoms can guide clinical decision-making. An artificial neural network model was proposed to remotely monitor asthma symptoms using indexes from a digital AI-powered stethoscope [Citation33]. Included were individuals of all age groups, both asthmatics and non-asthmatics, with and without adventitious respiratory sounds [Citation33]. The model that best distinguished abnormal from normal respiratory sounds in people with asthma reached an AUC of 0.94 using intensity scores of wheezes and rhonchi [Citation33]. Recently, a novel approach combined a rule-based expert model and ML to predict asthma control level from demographics, clinical characteristics, lung function and environmental factors, yielding an excellent predictive power (accuracy = 91.7%) [Citation52].

ML has also shown potential in identifying patients who may be responsive to standard care or who may require more personalized treatment. A random forest algorithm predicted treatment failure after an exacerbation and the prescription of systemic corticosteroids during exacerbation in individuals with asthma and COPD [Citation53]. Patients who were readmitted, required treatment adjustment, or died over the 30-day follow-up period were deemed unresponsive [Citation53]. The model achieved a good performance for treatment failure (AUC = 0.81), with the scores of the visual analogue scale for breathlessness and sputum purulence being the most predictive [Citation53]. A satisfactory performance (AUC = 0.69) was found for the prescription of systemic corticosteroids using only the presence of wheezing and the percentage of blood eosinophils as input [Citation53].

7. Role of ML in COPD

Examples of the applicability of ML in COPD care are described in Table 2.

Table 2. Summary of characteristics of the included studies applying machine learning for COPD diagnosis and management.

7.1. Diagnosis

Several attempts have been made to diagnose COPD using different types of data. Data from cardiopulmonary exercise tests [Citation54], chest computed tomography scans [Citation55,Citation56], respiratory sounds [Citation24] coupled with spirometry [Citation57], volatile organic compounds [Citation58,Citation59], and electronic health records [Citation60] have been used to distinguish people with COPD from controls. Accuracies ranged from 76.7% to 100% using support vector machine [Citation54,Citation57,Citation58] or boosting algorithms [Citation59]. The model developed based on real-world data also reached an excellent predictive performance (precision-recall AUC = 0.93) [Citation60]. Deep learning applied to lung sounds (sensitivity = 0.93, specificity = 0.93) [Citation24] and chest computed tomography scans (AUC >0.85) [Citation55,Citation56] also showed high discriminatory power in the detection of COPD.

The forced oscillation technique parameters have been used to build classifiers that discriminate between different levels of airflow limitation in people with COPD [Citation61]. K-nearest neighbors and random forest yielded excellent performances (AUC ≥0.90) in control versus all people with COPD and control versus patients with moderate, severe, and very severe airflow limitation [Citation61].

7.2. Phenotypes

Over the last decade, considerable research was performed to identify clinically relevant COPD phenotypes using unsupervised learning, mainly through hierarchical and k-means algorithms [Citation6].

COPD heterogeneity has also been explored using k-means [Citation62], k-medoids [Citation63] and hierarchical [Citation64] algorithms considering multidimensional data [Citation62] as well as clinically relevant and easily accessible variables [Citation63,Citation64]. Clusters were either prospectively validated with clinically relevant outcomes [Citation62,Citation64] or assessed for stability over time [Citation63]. Other studies demonstrated that people with COPD have distinct comorbidity [Citation26] and lung function profiles [Citation65] using self-organizing maps followed by hierarchical clustering. There are, however, several studies reporting different cluster solutions, which reflect the use of different sample sizes and characteristics, choice of input variables and clustering algorithms [Citation6].

7.3. Management

Currently, risk classification of COPD is mainly based on the history of previous exacerbations and hospitalizations, which is often based on patient recall and characterized by underreporting of events [Citation5]. Early identification of at-risk patients may allow timely treatment adjustment, prevent disease progression, and reduce the burden on the healthcare system. Recent research has applied ML to identify risk factors for hospital readmission in people with COPD [Citation66]. Top predictors were hospitalization in the previous two years, older age, being male, number of comorbidities, and longer length of hospital stay [Citation66].

Exacerbation prediction based on remote monitoring of respiratory sounds has also been reported [Citation67]. A support vector machine model was able to predict the onset of an exacerbation in 75.8% of cases, on average 5 ± 1.9 days in advance [Citation67].

ML has also been applied to identify physically inactive people with COPD who could benefit from physical activity promotion interventions [Citation68]. Patients were divided into two categories based on their daily walking duration and intensity level: extremely inactive or overactive. The random forest algorithm, which was trained to distinguish extremely inactive from overactive using nonphysical activity related data, yielded an overall AUC of 0.84 [Citation68].

Prediction models have also been developed to identify individuals at risk of mortality using decision tree algorithm [Citation12], Cox regression [Citation69], and deep neural networks [Citation55]. The tree-based model developed based on age, spirometry parameters, dyspnea, physical activity, and number of hospital admissions in the previous 2 years reached comparable performances when compared to established mortality prediction models (AUC0.7) [Citation12]. Similarly, a Cox regression model built using imaging, spirometry, and clinical data to predict all-cause mortality in people with COPD [Citation69], outperformed body-mass index, airflow obstruction, dyspnea, exercise capacity (BODE) index, modifications of BODE index and age, dyspnea, and airflow obstruction index [Citation69]. The top predictors were the 6-minute walk distance, FEV1%-predicted, age, and pulmonary artery-to-aorta ratio [Citation69]. In contrast, a deep learning model exhibited only moderate predictive capability for the risk of death, using computed tomography imaging data (C-statistic = 0.6) [Citation55].

Moreover, ML has been used to identify high-cost patients based on medical insurance data [Citation70]. A good predictive performance was reached with a boosting algorithm (AUC = 0.80). Relevant predictors were cost-related variables, age, region, gender, type of insurance, number of comorbidities, emphysema, hypertension, heart disease, and Charlson Comorbidity Index scores [Citation70].

8. Implementation challenges

AI deployment into outpatient or bedside care for asthma and COPD has been hindered not only by the knowledge gap but also by other hurdles. Studies often lack external validation, which may lead to overoptimistic predictive performances [Citation71]. Cluster validity assessment is crucial to confirm the reproducibility of the proposed classifications and their relevance in guiding asthma and COPD care. Studies have also frequently included small samples, which limits the models’ generalizability. Hence, studies with larger samples, as well as external validation of findings, are crucial steps prior to ML implementation within the clinical workflow [Citation71]. It is also imperative to compare the predictive ability across models and conduct prospective clinical trials to assess efficacy at improving patient care [Citation72].

ML also raises concerns regarding data protection [Citation36]. Data collection and processing must comply with ethical guidelines and data privacy legislation throughout model development and implementation [Citation36].

Another major issue arises when black-box algorithms are employed. The lack of interpretability delays ML acceptance, as clinicians are unable to double-check for errors or biases and provide an explanation to the patient [Citation36,Citation72].

Accountability is also a source of ongoing debate in AI implementation in healthcare, as poor decision-making may jeopardize patient safety [Citation72]. Currently, no broad consensus exists regarding responsibility for the model’s decisions since healthcare professionals, data scientists, and software developers lack complete control [Citation72].

9. Conclusion

This review covers fundamental concepts of AI/ML and the main steps of ML model development, providing a simple guide for clinicians to improve their understanding of AI/ML, create trust, and accelerate ML deployment in clinical practice. This paper has also provided examples of AI/ML for diagnosis, phenotyping, and management of people with asthma and COPD. AI may aid in making informed decisions about diagnosis and management by incorporating large amounts of data and drawing connections that healthcare professionals may overlook. AI has the potential to enable fast and accurate diagnosis and to support early identification of at-risk patients. It also provides valuable information for treatment adjustments and personalized medicine, which may prevent worsening of patients’ health and save time and resources.

However, low-quality evidence, data privacy, interpretability, and accountability need to be addressed for AI to become a reality in healthcare. Research with larger sample sizes and a thorough validation of findings are crucial to ensure unbiased, generalizable, and clinically relevant ML models.

10. Expert opinion

This review introduces ML in user-friendly language and gives examples of its application in the context of respiratory medicine. It describes how ML can support the diagnosis and management of chronic respiratory diseases such as asthma and COPD, which are characterized by considerable heterogeneity and complexity. It also serves as a guide for clinicians to improve their understanding and build trust in the use of ML in healthcare. Nevertheless, clinicians need to be aware of some methodological issues to be considered in future studies with ML. Adequate sample size, high-quality data and thorough validation are critical requirements for developing robust ML models. A sufficient sample size and representativeness of the target population are crucial to ensure generalizability and reduce the risk of overfitting and bias toward a particular group. Electronic health records have become one of the most important sources of data in clinical research. However, data collected in clinical practice are often unstructured, non-standardized and incomplete, which can significantly affect the performance of the model if not properly addressed. It is necessary to report errors that occur and describe the pre-processing steps used to correct them. Establishing the performance of a model should include not only internal, but also external validation. Many models have demonstrated excellent predictive performance during the development phase, only to fail when tested on an independent sample. Therefore, extensive validation is needed to ensure generalizability and avoid potentially harmful models for patients. Interpretability should also be highlighted as an important aspect in the development of ML models. Simpler models allow for better transparency of the decision-making process, which can help detect and reduce algorithmic bias and clarify some of the ethical issues associated with the use of AI (e.g. accountability), while promoting trust among healthcare professionals and patients. Post-hoc explainability methods have been developed to facilitate understanding of the reasons for the predictions of black-box models. However, there is still room for improvement [Citation73,Citation74]. In addition, the (dis)advantages of incorporating ML models into standard care need to be carefully weighed from both a patient-centered and an economic perspective [Citation36], which is only possible if data scientists work closely with healthcare professionals and other stakeholders to identify and resolve important issues early on. Such efforts can lead to models that are clinically useful and operationally feasible. Finally, it is of utmost importance to follow the guidelines for development (e.g. Cross-Industry Standard Process for Data Mining – CRISP-DM [Citation75,Citation76]) and reporting of ML projects (e.g. Transparent Reporting of a Multivariable Prediction Model for Individual Prognosis or Diagnosis – TRIPOD [Citation77]).

The potential impact of reinforcement learning in the management of chronic diseases is also noteworthy. In reinforcement learning, data are constantly fed into the system and incorporated into the model through continuous updates. Unlike a static model, which is trained exactly once, a reinforcement learning model aims to choose, at each point in time, the action that maximizes long-term reward, guided by constant feedback from the environment. Reinforcement learning can become a powerful tool for improving chronic respiratory disease management and represents a promising avenue for future advances in the field. Moreover, with its ability to identify treatment responses before intervention, causal ML can support clinicians in selecting the most effective treatment for each patient, and therefore may also represent a promising avenue to advance personalized medicine in chronic respiratory diseases.
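The idea of choosing actions that maximize long-term rather than immediate reward can be sketched with tabular Q-learning, one common reinforcement learning algorithm. The example below is a toy under stated assumptions: the two states ("stable", "unstable"), the two actions ("keep", "adjust") and all reward values are hypothetical, carry no clinical meaning, and are not taken from any study cited in this review.

```python
import random

STATES = ["stable", "unstable"]   # hypothetical patient states
ACTIONS = ["keep", "adjust"]      # hypothetical management actions

def step(state, action):
    """Hypothetical deterministic environment: returns (next_state, reward)."""
    if state == "stable":
        return "stable", (1.0 if action == "keep" else 0.0)
    if action == "adjust":
        return "stable", -2.0     # costly now, but restores stability
    return "unstable", -1.0       # cheaper now, but the state stays unstable

def train(episodes=5000, steps=20, alpha=0.1, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration."""
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in STATES for a in ACTIONS}
    for _ in range(episodes):
        state = rng.choice(STATES)
        for _ in range(steps):
            # Explore with probability epsilon, otherwise act greedily.
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)
            else:
                action = max(ACTIONS, key=lambda a: q[(state, a)])
            next_state, reward = step(state, action)
            # Feedback from the environment updates the value estimate.
            best_next = max(q[(next_state, a)] for a in ACTIONS)
            q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
            state = next_state
    return q

q_table = train()
policy = {s: max(ACTIONS, key=lambda a: q_table[(s, a)]) for s in STATES}
print(policy)
```

In the "unstable" state, "keep" has the better immediate reward (-1 versus -2), yet the learned policy should prefer "adjust" because it leads back to the state with positive ongoing reward. This is the sense in which the agent optimizes long-term reward instead of the next step only.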

Article highlights

  • Artificial intelligence and machine learning can learn from big data, identify non-linear relationships, and uncover connections that healthcare professionals may overlook.

  • Machine learning may improve the diagnosis and prognosis of asthma and chronic obstructive pulmonary disease by providing valuable support to physicians in clinical decision-making.

  • A better understanding of machine learning by clinicians may overcome their resistance to the use of machine learning models and facilitate the widespread adoption of these techniques in healthcare.

  • Future research should focus on conducting studies with larger samples and thoroughly validating machine learning models to ensure their generalizability and safety.

  • Reinforcement learning and causal machine learning hold promise as future avenues for the management of chronic respiratory diseases.

Abbreviations

AI = Artificial Intelligence

AUC = Area under the Receiver Operating Characteristic Curve

BODE = Body mass index, airflow obstruction, dyspnea, exercise capacity

COPD = Chronic obstructive pulmonary disease

FEV1 = Forced expiratory volume in one second

ML = Machine learning

ROC = Receiver Operating Characteristic

Declaration of interest

FME Franssen has obtained research grants from AstraZeneca, outside the scope of the current study. FME Franssen has obtained consultancy fees from MSD for advisory boards outside the scope of the current study. FME Franssen has received speakers’ fees from AstraZeneca, Boehringer Ingelheim, GlaxoSmithKline, Chiesi and Novartis. MA Spruit has obtained research grants from the Netherlands Lung Foundation and Stichting Astma Bestrijding, outside the scope of the current study. MA Spruit has obtained research grants from AstraZeneca, TEVA, Chiesi and Boehringer Ingelheim for the current study. MA Spruit has obtained consultancy fees from AstraZeneca and Boehringer Ingelheim for advisory boards outside the scope of the current study. All research grants and consultancy fees were paid to Ciro. The authors have no other relevant affiliations or financial involvement with any organization or entity with a financial interest in or financial conflict with the subject matter or materials discussed in the manuscript apart from those disclosed.

Reviewer disclosures

A reviewer on this manuscript received an honorarium from Expert Review of Respiratory Medicine for their review work. Peer reviewers on this manuscript have no other relevant financial relationships or otherwise to disclose.

Supplementary material

Supplemental data for this article can be accessed online at https://doi.org/10.1080/17476348.2024.2302940

Additional information

Funding

Part of the work of J Antão and Q Deng is financially supported by AstraZeneca, Boehringer Ingelheim, Chiesi and TEVA.

References

  • Prins C, Sheikh H, Schrijvers E, et al. Mission AI: The New System Technology. The Hague: The Netherlands Scientific Council for Government Policy (WRR); 2021.
  • Hazarika I. Artificial intelligence: opportunities and implications for the health workforce. Int Health. 2020 Jul 1;12(4):241–245. doi: 10.1093/inthealth/ihaa007
  • Diagnostic Image Analysis Group. AI for radiology: an implementation guide. Available from: https://grand-challenge.org/aiforradiology/?subspeciality=Chest&modality=All&search
  • Global Initiative for Asthma (GINA). 2022 GINA report - global strategy for asthma management and prevention. 2022.
  • Global Initiative for Chronic Obstructive Lung Disease (GOLD). GOLD report - global strategy for the diagnosis, management and prevention of COPD. 2022.
  • Nikolaou V, Massaro S, Fakhimi M, et al. COPD phenotypes and machine learning cluster analysis: a systematic review and future research agenda. Respir Med. 2020 Sep;171:106093. doi: 10.1016/j.rmed.2020.106093
  • Cunha F, Amaral R, Jacinto T, et al. A systematic review of asthma phenotypes derived by data-driven methods. Diagnostics. 2021 Apr 2;11(4):644. doi: 10.3390/diagnostics11040644
  • Exarchos KP, Beltsiou M, Votti CA, et al. Artificial intelligence techniques in asthma: a systematic review and critical appraisal of the existing literature. Eur Respir J. 2020 Sep;56(3):2000521. doi: 10.1183/13993003.00521-2020
  • Exarchos KP, Aggelopoulou A, Oikonomou A, et al. Review of artificial intelligence techniques in chronic obstructive lung disease. IEEE J Biomed Health Inform. 2022 Dec 16;26(5):2331–2338. doi: 10.1109/JBHI.2021.3135838
  • Pucchio A, Eisenhauer EA, Moraes FY. Medical students need artificial intelligence and machine learning training. Nat Biotechnol. 2021 Mar 01;39(3):388–389.
  • Shmueli G. To explain or to predict? Stat Sci. 2010;25(3):289–310. doi: 10.1214/10-STS330
  • Esteban C, Arostegui I, Moraza J, et al. Development of a decision tree to assess the severity and prognosis of stable COPD. Eur Respir J. 2011;38(6):1294–300. doi: 10.1183/09031936.00189010
  • Chen S, Suzuki K, MacMahon H. Development and evaluation of a computer-aided diagnostic scheme for lung nodule detection in chest radiographs by means of two-stage nodule enhancement with support vector classification. Med Phys. 2011 Apr;38(4):1844–58. doi: 10.1118/1.3561504
  • Olier I, Zhan Y, Liang X, et al. Causal inference and observational data. BMC Med Res Methodol. 2023 Oct 11;23(1):227. doi: 10.1186/s12874-023-02058-5
  • Bica I, Alaa AM, Lambert C, et al. From real-world patient data to individualized treatment effects using machine learning: Current and future methods to address underlying challenges. Clin Pharmacol Ther. 2021 Jan;109(1):87–100.
  • Prosperi M, Guo Y, Sperrin M, et al. Causal inference and counterfactual prediction in machine learning for actionable healthcare. Nat Mach Intell. 2020 Jun 01;2(7):369–375. doi: 10.1038/s42256-020-0197-y
  • Sanchez P, Voisey JP, Xia T, et al. Causal machine learning for healthcare and precision medicine. R Soc Open Sci. 2022 Aug;9(8):220638.
  • Verstraete K, Gyselinck I, Huts H, et al. Estimating individual treatment effects on COPD exacerbations by causal machine learning on randomised controlled trials. Thorax. 2023 Oct;78(10):983–989.
  • James G, Witten D, Hastie T, Tibshirani R. An introduction to statistical learning. New York: Springer; 2013.
  • Islam MA, Bandyopadhyaya I, Bhattacharyya P, et al. Multichannel lung sound analysis for asthma detection. Comput Methods Programs Biomed. 2018 Jun;159:111–123.
  • Alam MZ, Simonetti A, Brillantino R, et al. Predicting pulmonary function from the analysis of voice: a machine learning approach. Front Digit Health. 2022;4:750226. doi: 10.3389/fdgth.2022.750226
  • LeCun Y, Bengio Y, Hinton G. Deep learning. Nature. 2015 May 01;521(7553):436–444.
  • Tomita K, Nagao R, Touge H, et al. Deep learning facilitates the diagnosis of adult asthma. Allergol Int. 2019 Oct;68(4):456–461.
  • Srivastava A, Jain S, Miranda R, et al. Deep learning based respiratory sound analysis for detection of chronic obstructive pulmonary disease. PeerJ Comput Sci. 2021;7:e369. doi: 10.7717/peerj-cs.369
  • Janiesch C, Zschech P, Heinrich K. Machine learning and deep learning. Electron Markets. 2021 Sep 01;31(3):685–695.
  • Vanfleteren LE, Spruit MA, Groenen M, et al. Clusters of comorbidities based on validated objective measurements and systemic inflammation in patients with chronic obstructive pulmonary disease. Am J Respir Crit Care Med. 2013 Apr 1;187(7):728–735. doi: 10.1164/rccm.201209-1665OC
  • Sutton RS, Barto AG. Reinforcement learning: an introduction. 2nd ed. Cambridge: The MIT Press; 2018.
  • Peine A, Hallawa A, Bickenbach J, et al. Development and validation of a reinforcement learning algorithm to dynamically optimize mechanical ventilation in critical care. NPJ Digit Med. 2021 Feb 19;4(1):32. doi: 10.1038/s41746-021-00388-6
  • Yu C, Liu J, Nemati S, et al. Reinforcement learning in healthcare: a survey. ACM Comput Surv. 2023;55(1):1–36. doi: 10.1145/3477600
  • Padmanabhan R, Meskin N, Haddad WM. Reinforcement learning-based control of drug dosing for cancer chemotherapy treatment. Math Biosci. 2017 Nov;293:11–20. doi: 10.1016/j.mbs.2017.08.004
  • Zheng H, Ryzhov IO, Xie W, et al. Personalized multimorbidity management for patients with type 2 diabetes using reinforcement learning of electronic health records. Drugs. 2021 Mar;81(4):471–482.
  • Braido F, Santus P, Corsico AG, et al. Chronic obstructive lung disease “expert system”: validation of a predictive tool for assisting diagnosis. Int J Chron Obstruct Pulmon Dis. 2018;13:1747–1753. doi: 10.2147/COPD.S165533
  • Hafke-Dys H, Kuźnar-Kamińska B, Grzywalski T, et al. Artificial Intelligence Approach to the monitoring of respiratory sounds in asthmatic patients. Front Physiol. 2021;12:745635. doi: 10.3389/fphys.2021.745635
  • Buekers J, Theunis J, De Boever P, et al. Wearable finger pulse oximetry for continuous oxygen saturation measurements during daily home routines of patients with chronic obstructive pulmonary disease (COPD) over one week: observational study. JMIR mHealth uHealth. 2019 Jun 6;7(6):e12866. doi: 10.2196/12866
  • Mesquita R, Spina G, Pitta F, et al. Physical activity patterns and clusters in 1001 patients with COPD. Chron Respir Dis. 2017 Aug;14(3):256–269.
  • de Hond AAH, Leeuwenberg AM, Hooft L, et al. Guidelines and quality criteria for artificial intelligence-based prediction models in healthcare: a scoping review. NPJ Digit Med. 2022 Jan 10;5(1):2. doi: 10.1038/s41746-021-00549-7
  • Choi RY, Coyner AS, Kalpathy-Cramer J, et al. Introduction to machine learning, neural networks, and deep learning. Transl Vis Sci Technol. 2020;9(2):14. doi: 10.1167/tvst.9.2.14
  • Badillo S, Banfai B, Birzele F, et al. An introduction to machine learning. Clin Pharmacol Ther. 2020 Apr;107(4):871–885.
  • Steyerberg EW. Clinical prediction models: a practical approach to development, validation, and updating. 2nd ed. Switzerland: Springer; 2019.
  • Naser MZ, Alavi AH. Error metrics and performance fitness indicators for artificial intelligence and machine learning in engineering and sciences. Archit Struct Constr. 2023 Nov 24;3(4):499–517.
  • Congalton RG. A review of assessing the accuracy of classifications of remotely sensed data. Remote Sens Environ. 1991 Jul 01;37(1):35–46.
  • Power M, Fell G, Wright M. Principles for high-quality, high-value testing. Evid Based Med. 2013 Feb;18(1):5–10. doi: 10.1136/eb-2012-100645
  • Hosmer DW, Lemeshow S. Applied Logistic Regression. 2nd ed. (New York): John Wiley and Sons; 2000.
  • MacNeil J, Loves RH, Aaron SD. Addressing the misdiagnosis of asthma in adults: where does it go wrong? Expert Rev Respir Med. 2016 Nov;10(11):1187–1198. doi: 10.1080/17476348.2016.1242415
  • Singh OP, Palaniappan R, Malarvili M. Automatic quantitative analysis of human respired carbon dioxide waveform for asthma and non-asthma classification using support vector machine. IEEE Access. 2018;6:55245–55256. doi: 10.1109/ACCESS.2018.2871091
  • Ullah R, Khan S, Ali H, et al. A comparative study of machine learning classifiers for risk prediction of asthma disease. Photodiagnosis Photodyn Ther. 2019 Dec;28:292–296.
  • Kisiel MA, Zhou X, Sundh J, et al. Data-driven questionnaire-based cluster analysis of asthma in Swedish adults. NPJ Prim Care Respir Med. 2020 Apr 6;30(1):14. doi: 10.1038/s41533-020-0168-0
  • Oh JH, Ahn KM, Chung SJ, et al. Usefulness of routine blood test-driven clusters for predicting acute exacerbation in patients with asthma. Respir Med. 2020 Aug;170:106042.
  • Zein JG, Wu CP, Attaway AH, et al. Novel Machine Learning Can Predict Acute Asthma Exacerbation. Chest. 2021 May;159(5):1747–1757.
  • Xiang Y, Ji H, Zhou Y, et al. Asthma exacerbation prediction and risk factor analysis based on a time-sensitive, attentive neural network: retrospective cohort study. J Med Internet Res. 2020 Jul 31;22(7):e16981. doi: 10.2196/16981
  • Luo G, He S, Stone BL, et al. Developing a model to predict hospital encounters for asthma in asthmatic patients: secondary analysis. JMIR Med Inform. 2020 Jan 21;8(1):e16080. doi: 10.2196/16080
  • Khasha R, Sepehri MM, Mahdaviani SA. An ensemble learning method for asthma control level detection with leveraging medical knowledge-based classifier and supervised learning. J Med Syst. 2019 Apr 26;43(6):158. doi: 10.1007/s10916-019-1259-8
  • Halner A, Beer S, Pullinger R, et al. Predicting treatment outcomes following an exacerbation of airways disease. PLoS One. 2021;16(8):e0254425. doi: 10.1371/journal.pone.0254425
  • Inbar O, Inbar O, Reuveny R, et al. A machine learning approach to the interpretation of cardiopulmonary exercise tests: development and validation. Pulm Med. 2021;2021:5516248. doi: 10.1155/2021/5516248
  • González G, Ash SY, Vegas-Sánchez-Ferrero G, et al. Disease staging and prognosis in smokers using deep learning in chest computed tomography. Am J Respir Crit Care Med. 2018 Jan 15;197(2):193–203. doi: 10.1164/rccm.201705-0860OC
  • Tang LYW, Coxson HO, Lam S, et al. Towards large-scale case-finding: training and validation of residual networks for detection of chronic obstructive pulmonary disease using low-dose CT. Lancet Digit Health. 2020 May;2(5):e259–e267.
  • Haider NS, Singh BK, Periyasamy R, et al. Respiratory sound based classification of chronic obstructive pulmonary disease: a risk stratification approach in machine learning paradigm. J Med Syst. 2019 Jun 28;43(8):255. doi: 10.1007/s10916-019-1388-0
  • Van Berkel JJ, Dallinga JW, Möller GM, et al. A profile of volatile organic compounds in breath discriminates COPD patients from controls. Respir Med. 2010 Apr;104(4):557–563.
  • B VA, Subramoniam M, Mathew L. Detection of COPD and lung cancer with electronic nose using ensemble learning methods. Clin Chim Acta. 2021 Dec 01;523:231–238.
  • Mariani S, Metting E, Lahr MMH, et al. Developing an ML pipeline for asthma and COPD: the case of a Dutch primary care service. Int J Intell Syst. 2021;36(11):6763–6790. doi: 10.1002/int.22568
  • Amaral JL, Lopes AJ, Faria AC, et al. Machine learning algorithms and forced oscillation measurements to categorise the airway obstruction severity in chronic obstructive pulmonary disease. Comput Methods Programs Biomed. 2015 Feb;118(2):186–197.
  • Garcia-Aymerich J, Gómez FP, Benet M, et al. Identification and prospective validation of clinically relevant chronic obstructive pulmonary disease (COPD) subtypes. Thorax. 2011 May;66(5):430–7.
  • Marques A, Souto-Miranda S, Machado A, et al. COPD profiles and treatable traits using minimal resources: identification, decision tree and stability over time. Respir Res. 2022 Feb 14;23(1):30. doi: 10.1186/s12931-022-01954-6
  • Burgel P-R, Paillasseur J-L, Janssens W, et al. A simple algorithm for the identification of clinical COPD phenotypes. Eur Respir J. 2017;50(5):1701034. doi: 10.1183/13993003.01034-2017
  • Augustin IML, Spruit MA, Houben-Wilke S, et al. The respiratory physiome: clustering based on a comprehensive lung function assessment in patients with COPD. PLoS One. 2018;13(9):e0201593. doi: 10.1371/journal.pone.0201593
  • Cavailles A, Melloni B, Motola S, et al. Identification of patient profiles with high risk of hospital re-admissions for acute COPD exacerbations (AECOPD) in France using a machine learning model. Int J Chron Obstruct Pulmon Dis. 2020;15:949–962. doi: 10.2147/COPD.S236787
  • Fernandez-Granero MA, Sanchez-Morillo D, Leon-Jimenez A. Computerised analysis of telemonitored respiratory sounds for predicting acute exacerbations of COPD. Sensors. 2015;15(10):26978–26996. doi: 10.3390/s151026978
  • Aguilaniu B, Hess D, Kelkel E, et al. A machine learning approach to predict extreme inactivity in COPD patients using non-activity-related clinical data. PLoS One. 2021;16(8):e0255977. doi: 10.1371/journal.pone.0255977
  • Moll M, Qiao D, Regan EA, et al. Machine learning and prediction of all-cause mortality in COPD. Chest. 2020 Sep;158(3):952–964.
  • Luo L, Li J, Lian S, et al. Using machine learning approaches to predict high-cost chronic obstructive pulmonary disease patients in China. Health Inform J. 2020 Sep;26(3):1577–1598.
  • Ramspek CL, Jager KJ, Dekker FW, et al. External validation of prognostic models: what, why, how, when and where? Clin Kidney J. 2021;14(1):49–58. doi: 10.1093/ckj/sfaa188
  • Aung YYM, Wong DCS, Ting DSW. The promise of artificial intelligence: a review of the opportunities and challenges of artificial intelligence in healthcare. Br Med Bull. 2021 Sep 10;139(1):4–15. doi: 10.1093/bmb/ldab016
  • Ghassemi M, Oakden-Rayner L, Beam AL. The false hope of current approaches to explainable artificial intelligence in health care. Lancet Digit Health. 2021 Nov;3(11):e745–e750. doi: 10.1016/S2589-7500(21)00208-9
  • Rudin C. Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead. Nat Mach Intell. 2019 May 01;1(5):206–215.
  • Martínez-Plumed F, Contreras-Ochando L, Ferri C, et al. CRISP-DM Twenty Years Later: from data mining processes to data science trajectories. IEEE Trans Knowl Data Eng. 2021;33(8):3048–3061. doi: 10.1109/TKDE.2019.2962680
  • Chapman P, Clinton J, Kerber R, et al. CRISP-DM 1.0: step-by-step data mining guide. USA: SPSS Inc; 2000.
  • Collins GS, Reitsma JB, Altman DG, et al. Transparent reporting of a multivariable prediction model for individual prognosis or diagnosis (TRIPOD): the TRIPOD statement. BMJ. 2015 Jan 7;350(4):g7594. doi: 10.1136/bmj.g7594