
When AI Is Wrong: Addressing Liability Challenges in Women’s Healthcare


ABSTRACT

Healthcare professionals can leverage artificial intelligence (AI) to provide better care for their patients. However, it is also necessary to consider that AI algorithms operate according to historical diagnostic data, which often include evidence gathered mainly from men. The biases of prior practices and the perpetuation of exclusionary processes toward women can lead to inaccurate medical decisions. The ramifications of such errors show that the incorrect use of AI raises critical questions about who should be responsible for potential incidents. This study provides an analysis of the role of AI in affecting women’s healthcare and an overview of the liability implications of AI mistakes. Finally, this work presents a framework for algorithmic auditing to ensure that AI data are collected and stored according to secure, legal, and fair practices.

Introduction

Artificial Intelligence (AI) plays an increasingly important role in healthcare, where sophisticated algorithms perform specific tasks in an automated manner. AI algorithms can examine, evaluate, and even provide solutions to complicated health problems. AI is already capable of delivering significant outcomes in several areas, including the diagnosis and treatment of diseases and other conditions. To support medical staff in these areas, AI systems are designed to assist in decision-making and make recommendations based on analyses of large amounts of healthcare data (e.g., medical images, electronic health records (EHRs), and symptom data).

The development of medicine follows scientific principles and rigorous practices, and for this reason, AI can be suitable for practical applications in this field. However, while this aspect is beneficial from a technical point of view, other factors need to be considered in practice. There are several challenges for these technologies that, if not addressed or corrected in a timely manner, might impede their effective performance in settings outside of the development environment in which they originated.

Therefore, the use of AI in healthcare raises several critical questions regarding who should be responsible for potential incidents. This study aims to provide an analysis of the role of AI in affecting women’s healthcare and an overview of the liability implications caused by AI mistakes. The analysis involved an overview of the literature on this topic and an examination of the main issues pertaining to this area from a forward and a backward perspective. First, the paper discusses how a decision reached by an AI system is established and affects the various stakeholders (forward analysis). Second, it analyzes the roots of AI bias in medical practices for women’s care (backward analysis). Finally, it proposes a framework for algorithmic auditing to ensure that AI data are collected and stored according to secure, legal, and fair practices. Since this is an ongoing work, the findings elaborated in this paper are partly preliminary and represent a foundation for further empirical research on the topic.

Literature review

Artificial intelligence in healthcare is receiving attention from sectoral practitioners and scientists. For example, major biopharmaceutical companies are increasingly using AI technologies to improve their procedures and discover new remedies. Among them, US pharmaceutical giant Pfizer uses IBM Watson, a machine learning-based system, to aid in discovering immuno-oncology therapies.Citation1 Sanofi, another leading healthcare company based in France, has agreed to use Exscientia’s AI technology to look for treatments for metabolic diseases.Citation1,Citation2 Similarly, Genentech, a Roche Group subsidiary, is employing an AI system from GNS Healthcare, a big data analytics organization, to help in its hunt for new cancer treatments.Citation1 Following these promising results, the majority of academic studies agree that AI has the potential to significantly enhance medical treatment and practice,Citation3–5 benefiting both patients and healthcare professionals.Citation6–8 According to Mun et al.,Citation9 radiology was one of the earliest medical sectors to embrace digital technologies, including AI. Over the last 30 years, the development of digital imaging technologies, picture archiving and communication systems (PACS), and teleradiology has revolutionized radiology services and enabled specialists to work in a more advanced digital environment. However, the emergence of AI systems has improved other medical areas as well. Rong et al.,Citation5 for example, examined the most recent advances in the use of AI in biomedicine, such as disease diagnosis and prognosis, living assistance, biomedical information processing, and biomedical research. DavenportCitation4 also investigated other categories of applications, such as patient engagement and adherence, and administrative activities. Applying AI in medicine can help clinicians and organizations move from traditional medical solutions to evidence-based adaptive healthcare.Citation10,Citation11

However, while there is a growing body of scientific literature describing the value of these techniques, several academic works also outline the practical issues with using AI in clinical practice. Some authors,Citation12,Citation13 for instance, have provided detailed guidance on the most typical risks of medical AI systems. Currently, there is one major theme in the literature on the challenging integration of AI in healthcare: the difficulty of adapting AI technology from the development environment to the actual application context.Citation12–14 For example, Panch et al.Citation15 refer to this problem as “the inconvenient truth,” noting that AI algorithms are not operating as they should. According to the authors, this issue stems from two limitations. First, AI advancements may not be in line with real-life medical procedures. For example, one related issue some authors have investigated is the limited ability of AI systems to function in novel conditions.Citation16–18 AI models may fail to account for unforeseen changes that can occur in the medical environment, thus affecting diagnoses and new statistical comparisons. Therefore, with the emergence of new medical trends or unexpected diseases (e.g., the advent of SARS-CoV-2), results deriving from AI tools may not be completely reliable and valid.
Second, most healthcare organizations do not have the necessary infrastructure for training algorithms and ensuring that they perform consistently across different patient groups. Several authorsCitation19–21 have discussed AI issues affecting underrepresented populations, such as women.Citation22 In the past, women’s health has mainly focused on the topics of reproduction and childbearing; due to their “complex biology,” women have often been excluded from clinical trials on other health conditions or tests on the effects of treatments. For example, according to Miller and Cronin-Golomb,Citation23 more men than women receive a diagnosis for Parkinson’s disease (PD). Additionally, several medications, including anesthetics and cardiovascular treatments, have primarily been tested on males despite the evident differences between male and female physiology.Citation24,Citation25 When AI systems absorb these prejudices and biases, algorithms automatically perpetuate gender distinctions and inequities, causing adverse health consequences for women. For instance, an algorithm that learns the “typical” symptoms of a cardiac problem might misdiagnose many women. In addition to presenting risks for female patients’ safety, such failures, once recognized, can be exploited by cybercriminals or used to spread misinformation. According to Reddy et al.,Citation26 privacy breaches represent another concern associated with AI in healthcare. AI systems rarely have protections against privacy issues and, therefore, may cause psychological and reputational harm to female patients. To address these issues, Bates et al.,Citation7 for example, emphasize the necessity of implementing improved methodological best practices, such as external validation, proactive learning algorithms to compensate for biases, and increased algorithmic robustness. Unlike previous technologies, AI is continually evolving, and there is no clear description of its operational process, functioning, and purpose. According to Asan et al.,Citation27 this lack of clarity also erodes trust, a critical component of human relationships, including interactions with AI. Understanding the mechanics of trust between AI and humans is critical, especially in healthcare, where lives are at stake. As a result, there is a growing need to establish appropriate criteria for the design, implementation, and assessment of clinical AI systems so that they work in the best interests of their intended clinical end users.Citation28

Methodology

Studying the developments and implications of AI in healthcare is of central importance in the field of algorithmic fairness and health analytics. However, current methods are either limited to specific technical analyses or focused on broad examinations of artificial intelligence for biomedicine and healthcare. Therefore, this paper introduces a methodology for comprehensively investigating the issue defined in this work. More specifically, it involves applying Forward-Backward (FB) analysis, a strategy for generating solutions or deducing information.Citation29–31 The FB method consists of splitting the analysis of a problem into two opposite directions, “forward” and “backward,” so as to cover multiple perspectives. There is a significant body of literature on the study and application of this approach in different fields, such as the computing sciences,Citation32–34 and communications.Citation35,Citation36 For example, Janowitz and SolowCitation29 described the FB method as an efficient technique for proving mathematical theorems. The idea underlying the authors’ description of the FB method is to “attack” a theorem from both sides (forward and backward) in an attempt to link them and produce a logical argument of proof. More generally, the forward and backward processes of this approach are as follows:

  • Forward. The forward process comprises the logical processing of information from existing data until an endpoint terminates its progress.Citation29 In this type of reasoning, the analysis starts by evaluating existing assumptions, variables, and circumstances before deducing new information. The progressive manipulation of knowledge contributes to the achievement of the endpoint.

  • Backward. The backward process involves analyzing the endpoint obtained in the “forward” process and moving backward to comprehend the steps that led to that point. The aim of the backward process is to foresee or recognize potential failures in order to understand the phenomenon or theory as a whole and develop appropriate responses to potential problems.Citation30

For the purpose of this study, this work used a modification of the FB method as follows. First, it examined the potential consequences (forward) and the affected parties involved in the AI decision-making process. Then it used the information obtained in the “forward analysis” to evaluate possible causes and corrective actions to avoid negative impacts on women’s health (backward). The choice of this method is motivated by the novelty of the field. Most information regarding bias in AI healthcare derives from posterior implications of AI-based decisions. The lack of data on the connection between cause and effect in AI decisions makes it difficult to tackle the problem directly and conduct an a priori analysis. Therefore, this method provided a path to understanding data bias in AI healthcare from multiple perspectives. The following section discusses the findings resulting from the application of this methodology.

Forward perspective: decision-making and stakeholders’ liability

According to Colson,Citation37 the decision-making process in an AI environment leverages the information contained in the data to determine which decisional path is most effective. Phillips-WrenCitation38 explained this procedure by describing the four main phases implemented in AI systems: intelligence, design, choice, and implementation. The intelligence phase involves gathering information to develop an understanding of the problem. The design phase entails determining decision-making criteria to construct a model and investigate alternatives. The choice phase involves the selection of the decision. Finally, the implementation phase involves learning in real-time and adjusting to new potential data. The process continues in a sequential manner with loops between the various phases. The decision provided by AI has ramifications for who should be responsible for any mistakes made during these steps. There are two main categories of stakeholders in this process:

  • The first category of stakeholders comprises the users (usually clinicians). These stakeholders take over the last step of the decision-making process, using AI suggestions to assist them in their diagnoses or test results. Most of the AI tools medical practitioners use belong to the class of Clinical Decision Support Systems (CDSSs). These systems build upon fuzzy logic, artificial neural networks, Bayesian networks, and general machine-learning techniques. Therefore, CDSSs have intrinsic potential; for example, they may be able to process and learn adaptively from massive volumes of data in a relatively short time. However, unlike traditional systems, CDSSs are highly complex, and their functioning is generally hard to explain.Citation39 This complexity raises concerns about the misuse of AI tools and the consequent attribution of responsibility. In particular, attributing responsibility for misdiagnosis or incorrect treatment due to the use of a CDSS faces the difficulty of identifying those involved in AI-driven decision-making processes.Citation40 Depending on the hospital’s management structure and contractual arrangements, physicians, nurses, medical groups, or all of them may be held responsible for poor medical treatment outcomes. In such cases, various attributions of responsibility may accumulate, especially when medical and organizational difficulties converge.Citation40

  • The second category of stakeholders comprises those involved in developing an AI system (algorithm designers, programmers, manufacturers, etc.). For example, AI developers may be held accountable for ensuring that AI algorithms operate securely and adequately according to functional specifications provided by a deployer. In turn, deployers (e.g., those who deploy AI services or bring AI tools to the market) may be responsible for the provision of legitimate and lawful services as well as the correctness of the data integrated into the AI system.

From a legal standpoint, these stakeholders are subject to different liability issues depending on the cause of the potential damage, which can be one of the following:

  • Damage caused when AI was in use. Broadly, clinicians can be liable under medical malpractice, which involves “a failure to exercise due care”.Citation41 In the context of AI, clinicians may be liable for “failing to critically evaluate AI/ML recommendations”.Citation42 Additionally, medical staff may be subject to other negligence theories. Some legal experts have argued that this type of liability may also apply to health systems that do not adequately examine an AI system before its clinical use.Citation42,Citation43 For example, a medical institution serving a predominantly female population could be liable for injuries resulting from an AI system trained on a biased dataset, since such a system can provide medical solutions that do not conform to the current standard of care for that specific group. Another critical legal concept is vicarious liability, which involves holding an organization or a clinician liable for someone else’s actions; vicarious liability can apply when healthcare organizations, clinician groups, or employers are responsible for their employees’ or other associates’ actions.Citation42 For example, a clinician could be vicariously liable for the negligence of a nurse who incorrectly interpreted data from the AI system.

  • Damage caused during the design, implementation, or production of the AI system. Conversely, developers and deployers may be subject to products liability, which covers “injuries that result from poor design, failure to warn about risks, or manufacturing defects”.Citation42 However, the current legal doctrines only apply to the conduct of humans in the field of AI.Citation44 Therefore, it is still unclear how these doctrines can apply to AI technology itself if it acts autonomously or if the decision-making process does not depend entirely on the medical provider. One solution is to learn from other fields of application. For instance, Gerke, Minssen, and CohenCitation45 give an example of AI bias against women in hiring and tie bias to informed consent and transparency. The same concept may apply to AI in women’s healthcare: if a clinician uses an AI tool that potentially contains biases against women’s health, it would be necessary to disclose this, just as it would be in a hiring situation.

Table 1 summarizes the liability elements and stakeholders associated with the two types of damage described above.

Table 1. Liability.

Backward perspective: behind the algorithm

AI systems tend to be very hard to explain, and so are their recommendations. In such cases, AI algorithms act as a “black box,” whose underlying reasoning is “hidden” in the implementation.Citation46 These inscrutable models may also hide biases that, if not detected and rectified, could be amplified in the resulting procedures. One approach to opening this black box consists of retroactively identifying where prejudices originate and then understanding how to eliminate them (the backward perspective). In particular, studies have shown that analyzing biases in the early phases of AI data training may provide insights into what kind of data flowed into the model and may reveal potential inequities in current clinical practice.Citation27,Citation47 Therefore, to explore the origin of biases toward women in algorithms, it is useful to refer to the basic data lifecycle model presented by Stobierski.Citation48 This model involves eight steps, which are as follows:

  1. Generation: Data generation is the process of creating data before the data life cycle begins in order to trigger subsequent actions. Data generation may derive from the organization, its customers, or third parties.

  2. Collection: Following data generation, it is necessary to determine what information is relevant and the best method for collecting it. Generally, this type of information involves forms, surveys, analysis results, interviews, and direct observations.

  3. Processing: Following the collection of data, it is necessary to process them. Data processing can refer to several activities, such as data cleaning (i.e., cleaning and combining large data sets), data compression (i.e., creating a condensed representation of data), and data encryption (i.e., protecting the confidentiality of data by translating them into encoded information).

  4. Storage: Following the processing of data, it is necessary to store them safely. This step generally involves the creation of databases or datasets stored in the cloud, on servers, or on a form of physical storage.

  5. Management: Data management is the process of organizing, monitoring, maintaining, and locating data over the life of a data project. Even though it is an important step in the data lifecycle, it is important to remember that it is a continuous activity that takes place from the start to the end of a project.

  6. Analysis: Data analysis is the process of extracting relevant information from unstructured data. To perform these analyses, data analysts use several methods, such as data mining, and statistical modeling.

  7. Visualization: Data visualization is the process of building visual representations of data, making it easier to communicate data analysis results to a larger audience.

  8. Interpretation: Data interpretation is the process of making sense of the analysis results and drawing conclusions that can inform subsequent decisions.

The lessons learned and insights gained from one step often influence the next, allowing the last step of the data lifecycle to link back into the first. These steps therefore provide a realistic structure for the life cycle of a data project. For this reason, this paper analyzed common types of bias in relation to each stage of this model (Figure 1).

Figure 1. Bias in the AI data lifecycle.

Generally, biases originate from historical data (e.g., existing practices or studies), which tend to reflect existing societal prejudices.Citation49 For instance, most studies on Human Immunodeficiency Virus (HIV) and Autism contain data on male patients;Citation22 if a company builds an AI-driven diagnostic system for female patients using these studies without taking appropriate mitigating steps, the tool is likely to have a bias against female patients due to historical bias in the original data. Data bias can also be introduced when the data are processed without fully understanding the specific context. For example, processing a demographic dataset containing weight measurements may introduce bias to the system if differences in men’s and women’s weight are not considered.
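
To make the effect of such historical imbalance concrete, the following minimal sketch (in Python, with synthetic data and parameters that are illustrative assumptions, not drawn from the studies cited above) trains a model on a cohort that is 95% male and then compares its accuracy on balanced male and female test groups.

```python
# Hypothetical illustration of historical bias: a model trained on a
# male-dominated cohort performs worse for female patients.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)

def make_patients(n, sex):
    """Generate synthetic cases whose symptom-outcome link differs by sex."""
    x = rng.normal(size=(n, 3))
    w = np.array([1.5, -1.0, 0.5]) if sex == "M" else np.array([-0.5, 1.2, 1.0])
    y = (x @ w + rng.normal(scale=0.5, size=n) > 0).astype(int)
    return x, y

# Historical training data: 95% male, 5% female patients.
xm, ym = make_patients(950, "M")
xf, yf = make_patients(50, "F")
model = LogisticRegression().fit(np.vstack([xm, xf]), np.concatenate([ym, yf]))

# Balanced evaluation reveals the gap hidden by the skewed training set.
xm_test, ym_test = make_patients(500, "M")
xf_test, yf_test = make_patients(500, "F")
print("accuracy (men):  ", accuracy_score(ym_test, model.predict(xm_test)))
print("accuracy (women):", accuracy_score(yf_test, model.predict(xf_test)))
```

Because the model mostly sees the male symptom-to-outcome relationship during training, its accuracy on the female test group is markedly lower, mirroring the bias mechanism described above.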

Personal biases can also appear during the data collection or interpretation process.Citation47 The people who collect or interpret the dataset may have a personal bias due to cultural stereotypes or differences. For example, those who gather data on a specific treatment may induce an unconscious selectivity against women.

There may also be issues of statistical bias in the data management and analysis process.Citation46 Generally, this phase involves different activities, ranging from the organization of data to the implementation of analyses. According to Gutbezahl,Citation50 “statistical bias is anything that leads to a systematic difference between the true parameters of a population and the statistics used to estimate those parameters.” If management and analysis procedures contain this type of bias, they can generate results that do not accurately represent the female population. For example, biases can result from the analysis of electronic health records (EHRs) that do not contain information on women.Citation22
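
As a simple illustration of how such representation gaps can be surfaced before analysis, the sketch below (column names, the reference split, and the tolerance are hypothetical assumptions) compares the sex distribution of an EHR extract with the target patient population.

```python
# Minimal representation check: flag when a cohort under-represents a group.
import pandas as pd

ehr = pd.DataFrame({
    "patient_id": range(8),
    "sex": ["M", "M", "M", "M", "M", "M", "F", "F"],
    "outcome": [1, 0, 1, 1, 0, 1, 0, 1],
})

share = ehr["sex"].value_counts(normalize=True)
reference = {"M": 0.5, "F": 0.5}  # assumed split in the target population

for group, expected in reference.items():
    observed = share.get(group, 0.0)
    if abs(observed - expected) > 0.10:  # tolerance chosen for illustration
        print(f"Representation warning: {group} is {observed:.0%} of the cohort "
              f"vs. {expected:.0%} in the target population")
```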

In the data visualization process, biases are rare but possible. For example, creating a large, labeled training set may be necessary for developing certain models. If the developers have unconscious biases or inadequate cultural training, their mistakes could infiltrate the data labeling process: developers might mislabel a critical indicator if they have had limited exposure to female-specific symptoms.

Finally, the storage phase is not directly subject to biases. However, if healthcare organizations store biased data insecurely (e.g., unencrypted), cyber attackers may exploit them for malicious purposes (e.g., reputational damages).
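
A minimal sketch of the mitigation suggested here, encrypting records before they are written to storage, is shown below. It assumes the widely used `cryptography` package and deliberately simplifies key management, which in practice would rely on a dedicated key vault.

```python
# Illustrative encryption-at-rest sketch using the cryptography package.
from cryptography.fernet import Fernet

key = Fernet.generate_key()          # in practice, keep this in a key vault, not in code
cipher = Fernet(key)

record = b'{"patient_id": 42, "sex": "F", "diagnosis": "..."}'
token = cipher.encrypt(record)       # this ciphertext is what gets written to disk or the cloud

# Only holders of the key can recover the plaintext record.
assert cipher.decrypt(token) == record
```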

It is also worth noting that biases can arise in areas other than the data lifecycle, such as forecasting, labeling, and reporting. However, detecting biases in data with a comprehensive set of metrics provides a solid base for limiting the damage.
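
One way to keep such checks systematic is to pair every lifecycle stage with the bias question raised in this section, as in the hypothetical mapping sketched below (stage names follow the model above; the questions are a condensed paraphrase, not an exhaustive checklist).

```python
# Hypothetical mapping used to track whether each stage's bias review is done.
LIFECYCLE_BIAS_CHECKS = {
    "generation":    "Do the source studies rest on historical, male-dominated cohorts?",
    "collection":    "Could collectors' personal or cultural biases skew sampling?",
    "processing":    "Is sex-specific context (e.g., weight differences) preserved?",
    "storage":       "Are the data encrypted and access-controlled?",
    "management":    "Are organization and maintenance procedures statistically sound?",
    "analysis":      "Do the records (e.g., EHRs) adequately represent women?",
    "visualization": "Were labels created by staff trained on female-specific symptoms?",
}

def pending_reviews(completed: set[str]) -> list[str]:
    """Return the lifecycle stages whose bias check is still undocumented."""
    return [stage for stage in LIFECYCLE_BIAS_CHECKS if stage not in completed]

print(pending_reviews({"generation", "storage"}))
```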

Framework

To ensure that gender bias does not influence the diagnosis and management of treatments and diseases, it is necessary to monitor and test AI algorithms throughout a systematic and documented process (also known as the auditing process). One of the most commonly adopted approaches to this type of auditing is the SMACTR method (Scoping, Mapping, Artifact Collection, Testing, and Reflection).Citation51

  • Scoping. This stage involves the examination of the use cases and related applications to highlight areas of concern, identifying possible damage and social impact that will require consideration in subsequent audit phases.

  • Mapping. This stage entails creating a map of internal stakeholders and identifying collaborators for the audit’s implementation.

  • Artifact Collection. This stage involves developing an audit checklist and gathering the necessary documentation to initiate the audit.

  • Testing. This stage entails conducting tests to ensure that the requirements are in accordance with the organization’s stated ethical standards.

  • Reflection. This stage involves comparing the audit’s results with the expected results and creating the final risk analysis and mitigations.

Based on the previous (forward and backward) analyses and inspired by the auditing principles defined in the SMACTR method, this paper proposes an Algorithm Auditing Framework that considers liability, security, and unbiased practices.Citation52 In particular, the development of this framework focused on the third phase of the SMACTR method (Artifact Collection), which involves the elaboration of a checklist and an inventory of the documents necessary to start the audit. The analysis of this stage produced a high-level specification model (the Algorithm Auditing Framework) outlining the main areas and elements to audit in an AI system, as shown in Table 2.

Table 2. Algorithm auditing framework.

Results and discussion

The first area to audit is liability. Even though AI algorithms themselves cannot be held liable, the company designing and deploying the algorithms must comply with relevant laws and regulations. Given the potential applications of AI in healthcare, AI developers should be familiar with specific compliance requirements, such as those of the Health Insurance Portability and Accountability Act (HIPAA).Citation53 HIPAA, a US law passed in 1996, safeguards electronic data, including patient health information.Citation54 Although HIPAA generally applies to insurance companies, hospitals, and health organizations, developers may also be subject to this regulation depending on their interactions with health organizations and whether electronic patient health information is accessible to them. The European Union also recognizes regulation as critical to the development of AI technologies that users can trust. The General Data Protection Regulation (GDPR) is among the first instruments to consider algorithmic decision-making’s impacts on the “fundamental rights and freedoms of natural persons” and to tackle possible AI abuses.Citation46 In addition, a necessary liability verification also involves identifying the roles and responsibilities within the AI data lifecycle.Citation55 For example, if a company uses external consultants, it must verify whether they have sufficient knowledge and qualifications to handle AI data.

The second auditing area is technique. Technical professionals involved in the algorithm design and deployment should employ the necessary mechanisms to ensure data security and protection. They also need to apply explainability and traceability methods to explain AI to the user in an understandable way (explainability), and keep track of where the attributes for a decision established by AI come from (traceability).
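
As a hedged illustration of these two practices, the sketch below uses scikit-learn’s permutation importance as one possible explainability measure and a simple provenance record for traceability; the feature names, model, and data are assumptions for demonstration only.

```python
# Illustrative explainability (permutation importance) and traceability sketch.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 3))
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)
feature_names = ["troponin", "age", "bmi"]          # hypothetical attributes

model = RandomForestClassifier(random_state=0).fit(X, y)

# Explainability: which attributes drive the model's decisions?
imp = permutation_importance(model, X, y, n_repeats=10, random_state=0)
for name, score in zip(feature_names, imp.importances_mean):
    print(f"{name}: {score:.3f}")

# Traceability: record where the decision's inputs and model came from.
provenance = {
    "model_version": "demo-0.1",
    "training_data": "synthetic (illustration only)",
    "features_used": feature_names,
}
print(provenance)
```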

The third auditing area is fairness. Any decision-making system can contain biases toward particular elements, and it is necessary to assess them for fairness. In the context of AI, fairness verification entails determining if biases are present according to pre-established criteria. For example, one approach to defining fairness measurements is to use an indicator called “individual fairness.”Citation56 This criterion generally finds application in economics to examine income distribution and inequality and assumes that similar people require equal treatment. Similarly, individual fairness can also be useful in determining equality among members of gender groups in AI healthcare systems.
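
The sketch below gives one possible, simplified reading of such an individual-fairness check: patients with nearly identical clinical features should receive similar risk scores, and pairs that violate this are flagged for review (feature values, scores, and tolerances are illustrative assumptions).

```python
# Minimal individual-fairness style check: similar patients, similar scores.
import numpy as np

def fairness_violations(clinical: np.ndarray, scores: np.ndarray,
                        x_tol: float = 0.1, y_tol: float = 0.05) -> list:
    """Return index pairs of clinically similar patients with diverging scores."""
    violations = []
    for i in range(len(scores)):
        for j in range(i + 1, len(scores)):
            similar = np.linalg.norm(clinical[i] - clinical[j]) < x_tol
            if similar and abs(scores[i] - scores[j]) > y_tol:
                violations.append((i, j))
    return violations

# Patients 0 and 1 share nearly identical clinical features but received very
# different risk scores from the model, so the pair is flagged for review.
clinical = np.array([[0.50, 1.20], [0.51, 1.19], [2.00, 0.10]])
scores = np.array([0.80, 0.35, 0.10])
print(fairness_violations(clinical, scores))   # -> [(0, 1)]
```

In practice, the similarity metric itself must be chosen carefully so that it does not encode the very biases under audit.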

Implementing fair AI algorithms also requires high-quality data (e.g., recent statistics on women’s side effects from a treatment). Even if an algorithm’s logic is right, the outcomes might be inaccurate if the input is incorrect. As a result, low-quality data might produce low-quality algorithm outputs. These inaccuracies might lead to a breach of fundamental rights, especially those related to privacy and data protection. Additionally, to ensure that an AI system designed specifically for a female patient population is fair, it is also necessary to verify whether cultural and social barriers are inherent in the system and that technical employees have received appropriate training.Citation57
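
To connect the three audit areas discussed above, the following sketch shows one way (an assumed structure, not the content of Table 2) to encode the framework’s liability, technique, and fairness items as an artifact-collection checklist and report which items still lack evidence.

```python
# Illustrative artifact-collection checklist covering the three audit areas.
AUDIT_CHECKLIST = {
    "liability": [
        "Applicable regulations identified (e.g., HIPAA, GDPR)",
        "Roles and responsibilities across the data lifecycle documented",
        "External consultants vetted for data-handling qualifications",
    ],
    "technique": [
        "Security and protection mechanisms for data in place",
        "Explainability method documented for end users",
        "Decision attributes traceable to their data sources",
    ],
    "fairness": [
        "Bias criteria defined (e.g., individual fairness)",
        "Input data quality checked against the female patient population",
        "Staff trained on cultural and social barriers in the system",
    ],
}

def unresolved_items(evidence: dict) -> dict:
    """Return checklist items that still lack supporting audit evidence."""
    return {area: [item for item in items if item not in evidence.get(area, [])]
            for area, items in AUDIT_CHECKLIST.items()}

# Example: only the liability evidence has been collected so far.
print(unresolved_items({"liability": AUDIT_CHECKLIST["liability"]}))
```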

Implications for research

Despite recent advances in AI and increased investigation into how AI failures may hurt patients, few studies have specifically examined how AI affects women’s healthcare. This study intended to contribute to the understanding of this issue by offering a comprehensive review of the main challenges in this field as well as the many stakeholders involved. In addition, this work demonstrated that evaluating biases from different perspectives can provide insights into the possible challenges in developing and using AI systems. The FB technique utilized in this study provided a multidimensional view of AI decision-making through the analysis of possible outcomes (forward) as well as probable causes and subsequent remedial steps to avoid adverse effects on women’s health (backward). In particular, the most significant research value in the FB approach lies in the “backward” interpretation of the analysis. Generally, research methods employed to examine complex problems involve progressive reasoning, which entails solving tasks of increasing complexity by building on top of previously acquired knowledge. Progressive reasoning can be effective because it provides a solid foundation for reaching conclusions and has no restrictions on the evidence produced from it. However, as stated by Partridge and Hussain,Citation58 AI problems are “intractable,” meaning that they “cannot be solved within the bounds of finite time and computational resources.” Thus, AI problems may have different elements of representation and explanation and may require problem-solving techniques that proceed along several dimensions. As a result, the use of the FB technique in this study can aid in performing different evaluations of AI issues, making it simpler for researchers to discover emerging obstacles as well as divergences between theory and practice.

Implications for practice

As AI becomes more prevalent in healthcare, it is critical to assess AI algorithms through a systematic and documented process. For this reason, the proposed Algorithm Auditing Framework presents a comprehensive procedure for reducing gender bias in AI algorithm development. In particular, the framework prioritizes three key auditing areas for assessing the causes of bias in data and algorithms: liability, technique, and fairness. These areas are crucial for establishing whether AI-induced risks (e.g., those associated with liability exposure, AI development techniques, data, or models) can materialize and negatively influence potential stakeholders. Therefore, the audit process is a critical phase in the development of AI systems, as the outcome may have implications for their actual deployment in the healthcare setting. Thus, the framework intends to help bridge the practical gap between the development and implementation of AI systems in healthcare by providing critical reflections on the possible failures of a system’s algorithms and their related impact. However, while auditing AI for bias might be a valuable strategy, there may still be limits, mainly relating to the auditor’s ability to remain objective and unbiased while completing auditing activities. To compensate for this limitation, it would be necessary to constantly review and monitor the aspects influencing audit quality, such as the auditor’s integrity and background. Overall, regulators, IT experts, and legal professionals could use this framework as a guide to understanding the multiple risks resulting from the use of AI systems.

Conclusions

When building AI systems for healthcare, it is essential that they do not contain biases in their algorithms. However, for this to happen, firstly, it is necessary to understand and recognize bias; secondly, it is crucial to ensure that the datasets on which algorithms operate do not integrate biased information. More specifically, the main finding resulting from this study is that the extent to which AI affects medical outcomes for female patients (and, consequently, liability issues) depends on two factors:

  • The first factor is the degree of bias integrated into the AI dataset lifecycle. As shown in the FB analysis and, in particular, in the backward perspective section, an AI system’s decision-making process uses the information in the data training set to evaluate the most suitable decisional path. However, the presence of a biased dataset lifecycle might introduce inaccuracies in this type of data and alter decisions. Thus, biases can propagate because of the complex nature of AI models, and their implications can rise to significant degrees within the system. Therefore, the use of auditing methods, such as the one outlined in this study, can be beneficial in systematically evaluating the strengths, limitations, and uses of datasets and models. In addition, they can increase transparency, allowing for the early detection of possible problems and opportunities for improved development.

  • The second factor is the level of human interaction of users (e.g., medical staff) with the information introduced in and produced by the AI’s processing system. As shown in this study, in many circumstances, AI systems can reduce humans’ interpretation of data due to their capacity to increase accuracy and complete tasks within shorter time frames. However, while these algorithms can certainly produce advanced results, they may fail to account for the complexities of the circumstances surrounding the use of AI tools and their related repercussions. As a result, establishing the necessary level of human interaction in a system is critical.

While these factors are essential in tackling the bias problem, the path to successfully implementing equitable AI solutions in healthcare will likely remain challenging. AI technology and interdisciplinary research have achieved considerable progress in recent years; however, more investment in addressing the complex concerns of AI bias (e.g., algorithm fairness, safety, governance, and supervision) may be necessary. These efforts require interdisciplinary engagement and collaboration among experts who can grasp the nuances of each application area in the AI development and implementation process, including data scientists, healthcare providers, and regulators. Finally, as the AI field evolves and the related applications grow, a fundamental aspect of this interdisciplinary approach is comprehensively assessing and evaluating the role of AI decision-making. For example, some key considerations may include examining instances in which automated decision-making is acceptable (and, therefore, suitable for application scenarios) and the related risks of failure. However, using a combination of AI systems and human judgment can reduce bias and turn AI into an opportunity rather than a challenge.

Disclosure statement

No potential conflict of interest was reported by the author(s).

References

  • Fleming N. How artificial intelligence is changing drug discovery. Nature. 2018;557(7707):S55–S57. doi:10.1038/d41586-018-05267-x.
  • Shaheen MY. Applications of Artificial Intelligence (AI) in healthcare: a review. Sci Prepr. Published online 2021. doi:10.14293/S2199-1006.1.SOR-.PPVRY8K.v1.
  • Sim KM. Bilattices and reasoning in artificial intelligence: concepts and foundations. Artif Intell Rev. 2001;15(3):219–40. doi:10.1023/A:1011049617655.
  • Davenport T, Kalakota R. The potential for artificial intelligence in healthcare. Futur Healthc J. 2019;6(2):94–98. doi:10.7861/futurehosp.6-2-94.
  • Rong G, Mendez A, Bou Assi E, Zhao B, Sawan M. Artificial intelligence in healthcare: review and prediction case studies. Engineering. 2020;6(3):291–301. doi:10.1016/j.eng.2019.08.015.
  • Tulk Jesso S, Kelliher A, Sanghavi H, Martin T, Henrickson Parker S. Inclusion of clinicians in the development and evaluation of clinical artificial intelligence tools: a systematic literature review. Front Psychol. 2022:13. doi:10.3389/fpsyg.2022.830345.
  • Bates DW, Auerbach A, Schulam P, Wright A, Saria S. Reporting and implementing interventions involving machine learning and artificial intelligence. Ann Intern Med. 2020;172(11_Supplement):S137–S144. doi:10.7326/M19-0872.
  • Schwendicke F, Krois J. Better reporting of studies on artificial intelligence: CONSORT-AI and beyond. J Dent Res. 2021;100(7):677–80. doi:10.1177/0022034521998337.
  • Mun SK, Wong KH, Lo SCB, Li Y, Bayarsaikhan S. Artificial intelligence for the future radiology diagnostic service. Front Mol Biosci. 2021;7:512. doi:10.3389/fmolb.2020.614258.
  • Shah NR. Health care in 2030: will artificial intelligence replace physicians? Ann Intern Med. 2019;170(6):407–08. doi:10.7326/M19-0344.
  • Sanford T, Harmon SA, Turkbey EB, Kesani D, Tuncer S, Madariaga M, Yang C, Sackett J, Mehralivand S, Yan P, et al. Deep-learning-based artificial intelligence for PI-RADS classification to assist multiparametric prostate MRI interpretation: a development study. J Magn Reson Imaging. 2020;52(5):1499–507. doi:10.1002/jmri.27204.
  • Schriger DL, Elder JW, Cooper RJ. Structured clinical decision aids are seldom compared with subjective physician judgment, and are seldom superior. Ann Emerg Med. 2017;70(3):338–344.e3. doi:10.1016/j.annemergmed.2016.12.004.
  • Wears RL, Berg M. Computer technology and clinical work: still waiting for godot. J Am Med Assoc. 2005;293(10):1261–63. doi:10.1001/jama.293.10.1261.
  • Hu Y, Jacob J, Parker GJM, Hawkes DJ, Hurst JR, Stoyanov D. The challenges of deploying artificial intelligence models in a rapidly evolving pandemic. Nat Mach Intell. 2020;2(6):298–300. doi:10.1038/s42256-020-0185-2.
  • Panch T, Mattie H, Celi LA. The “inconvenient truth” about AI in healthcare. Npj Digit Med. 2019;2(1):1–3. doi:10.1038/s41746-019-0155-4.
  • Doyen S, Dadario NB. 12 plagues of AI in healthcare: a practical guide to current issues with using machine learning in a medical context. Front Digit Heal. 2022;74. doi:10.3389/FDGTH.2022.765406.
  • Roberts M, Driggs D, Thorpe M, Gilbey J, Yeung M, Ursprung S, Aviles-Rivero AI, Etmann C, McCague C, Beer L, et al. Common pitfalls and recommendations for using machine learning to detect and prognosticate for COVID-19 using chest radiographs and CT scans. Nat Mach Intell. 2021;3(3):199–217. doi:10.1038/s42256-021-00307-0.
  • Wynants L, Van Calster B, Collins GS, Riley RD, Heinze G, Schuit E, Bonten MMJ, Dahly DL, Damen JAA, Debray TPA, et al. Prediction models for diagnosis and prognosis of covid-19: systematic review and critical appraisal. BMJ. 2020;369. doi:10.1136/bmj.m1328.
  • Bentley AR, Callier S, Rotimi CN. Diversity and inclusion in genomic research: why the uneven progress? J Community Genet. 2017;8(4):255–66. doi:10.1007/s12687-017-0316-6.
  • Werling DM, Geschwind DH. Sex differences in autism spectrum disorders. Curr Opin Neurol. 2013;26(2):146–53. doi:10.1097/WCO.0b013e32835ee548.
  • Chen IY, Szolovits P, Ghassemi M. Can AI help reduce disparities in general medical and mental health care? AMA J Ethics. 2019;21(2):167–79. doi:10.1001/amajethics.2019.167.
  • Cirillo D, Catuara-Solarz S, Morey C, Guney E, Subirats L, Mellino S, Gigante A, Valencia A, Rementeria MJ, Chadha AS, et al. Sex and gender differences and biases in artificial intelligence for biomedicine and healthcare. Npj Digit Med. 2020;3(1):1–11. doi:10.1038/s41746-020-0288-5.
  • Miller IN, Cronin-Golomb A. Gender differences in Parkinson’s disease: clinical characteristics and cognition. Mov Disord. 2010;25(16):2695–703. doi:10.1002/MDS.23388.
  • Regitz-Zagrosek V. Sex and gender differences in health. science & society series on sex and science. EMBO Rep. 2012;13(7):596–603. doi:10.1038/embor.2012.87.
  • Ferretti MT, Iulita MF, Cavedo E, Chiesa PA, Schumacher Dimech A, Santuccione Chadha A, Baracchi F, Girouard H, Misoch S, Giacobini E, et al. Sex differences in Alzheimer disease — the gateway to precision medicine. Nat Rev Neurol. 2018;14(8):457–69. doi:10.1038/s41582-018-0032-9.
  • Reddy S, Allan S, Coghlan S, Cooper P. A governance model for the application of AI in health care. J Am Med Informatics Assoc. 2020;27(3):491–97. doi:10.1093/jamia/ocz192.
  • Asan O, Bayrak AE, Choudhury A. Artificial intelligence and human trust in healthcare: focus on clinicians. J Med Internet Res. 2020;22(6):e15154. doi:10.2196/15154.
  • Wiens J, Saria S, Sendak M, Ghassemi M, Liu VX, Doshi-Velez F, Jung K, Heller K, Kale D, Saeed M, et al. Do no harm: a roadmap for responsible machine learning for health care. Nat Med. 2019;25(9):1337–40. doi:10.1038/s41591-019-0548-6.
  • Janowitz MF, Solow D. How to read and do proofs. Am Math Mon. 1993;100(2):197. doi:10.2307/2323794.
  • DeSanctis G, Gallupe RB. Decision support systems: concepts and resources for managers. Manage Sci. 1987;20(4):80–81. Accessed May 16, 2022. https://books.google.com/books?hl=en&id=9NA6QMcte3cC&oi=fnd&pg=PR9&dq=+%22Decision+Support+Systems%22&ots=DPpsrwMvBa&sig=NgG5F6ya69eWvn6SsgY3SVG9pgw
  • Aggarwal CC. Data mining: the textbook. Springer International Publishing; 2015. p. 746. Accessed May 16, 2022. https://books.google.com/books/about/Data_Mining.html?hl=it&id=cfNICAAAQBAJ
  • Bonettini S, Porta F, Ruggiero V. A variable metric forward-backward method with extrapolation. SIAM J Sci Comput. 2016;38(4):A2558–A2584. doi:10.1137/15M1025098.
  • Dixit A, Sahu DR, Gautam P, Som T, Yao JC. An accelerated forward-backward splitting algorithm for solving inclusion problems with applications to regression and link prediction problems. J Nonlinear Var Anal. 2021;5(1):79–101. doi:10.23952/JNVA.5.2021.1.06.
  • He B, Yuan X. Forward–backward-based descent methods for composite variational inequalities. Optimization Methods and Software . 2013;28(4):706–24. doi:10.1080/10556788.2011.645033.
  • Mullen M, Brennan C, Downes T. A hybridized forward backward method applied to electromagnetic wave scattering problems. IEEE Trans Antennas Propag. 2009;57(6):1846–50. doi:10.1109/TAP.2009.2019994.
  • Holliday D, Deraad LL, St-Cyr GJ. Forward-backward method for scattering from imperfect conductors. IEEE Trans Antennas Propag. 1998;46(1):101–07. doi:10.1109/8.655456.
  • Colson E. What AI-driven decision making looks like. Harvard Business Review. Published 2019. Accessed 1 June 2022. https://hbr.org/2019/07/what-ai-driven-decision-making-looks-like
  • Phillips-Wren G. AI tools in decision making support systems: a review. Int J Artif Intell Tools. 2012;21(2):2. doi:10.1142/S0218213012400052.
  • Amann J, Vetter D, Blomberg SN, Christensen HC, Coffee M, Gerke S, Gilbert TK, Hagendorff T, Holm S, Livne M, et al. To explain or not to explain?—Artificial intelligence explainability in clinical decision support systems. PLOS Digit Heal. 2022;1(2):e0000016. doi:10.1371/journal.pdig.0000016.
  • Bleher H, Braun M. Diffused responsibility: attributions of responsibility in the use of AI-driven clinical decision support systems. AI Ethics. 2022;1:1. doi:10.1007/s43681-022-00135-x.
  • Hafemeister TL, Gulbrandsen RM. The fiduciary obligation of physicians to “Just Say No” if an “Informed” patient demands services that are not medically indicated. Vol 39.; 2009. Accessed 3 June 2022. http://www.pewinternet.org/pdfs/PIP_Health_
  • Maliha G, Gerke S, Cohen IG, Parikh RB. Artificial intelligence and liability in medicine: balancing safety and innovation. Milbank Q. 2021 April 6;99(3):629–47. Published online. doi:10.1111/1468-0009.12504.
  • Lysaght T, Lim HY, Xafis V, Ngiam KY. AI-assisted decision-making in healthcare: the application of an ethics framework for big data in health and research. Asian Bioeth Rev. 2019;11(3):299–314. doi:10.1007/s41649-019-00096-0.
  • Kamensky S. Artificial intelligence and technology in health care: overview and possible legal implications. DePaul J Health Care L. 2020;21.
  • Gerke S, Minssen T, Cohen G. Ethical and legal challenges of artificial intelligence-driven healthcare. In: Bohr A, Memarzadeh K, editors. Artificial intelligence in healthcare. 1st ed. Elsevier; 2020. p. 295–336. doi:10.1016/b978-0-12-818438-7.00012-5.
  • Yu R, Alì GS. What’s inside the black box? AI Challenges for lawyers and researchers. In: Legal information management. Vol. 19. Cambridge, UK: Cambridge University Press (CUP); 2019. p. 2–13. doi:10.1017/s1472669619000021.
  • Sperling E. How hardware can bias AI data. Semiconductor Engineering. Published 2019. Accessed 15 May 2022. https://semiengineering.com/where-data-gets-biased/
  • Stobierski T. 8 steps in the data life cycle. Harvard Business School. Published 2021. Accessed 1 June 2022. https://online.hbs.edu/blog/post/data-life-cycle
  • Zhou M, Guo J, Chen N, Ma M, Dong S, Li Y, Fang J, Zhang Y, Zhang Y, Bao J, et al. Effects of message framing and time discounting on health communication for optimum cardiovascular disease and stroke prevention (EMT-OCSP): a protocol for a pragmatic, multicentre, observer-blinded, 12-month randomised controlled study. BMJ Open. 2021;11(3):3. doi:10.1136/bmjopen-2020-043450.
  • Gutbezahl J. 5 types of statistical biases to avoid in your analyses. Harvard Business School - HBS Online. Published 2017. Accessed 30 May 2022. https://online.hbs.edu/blog/post/types-of-statistical-bias
  • Garbin C, Marques MO. Tools to improve reporting, increase transparency, and reduce failures in machine learning applications in health care. Radiol Artif Intell. 2022;4(2):2. doi:10.1148/RYAI.210127.
  • Raji ID, Smart A, White RN, Mitchell M, Gebru T, Hutchinson B, Smith-Loud J, Theron D, Barnes P. Closing the AI accountability gap: defining an end-to-end framework for internal algorithmic auditing. In: Proceedings of the 2020 Conference on Fairness, Accountability, and Transparency (FAT* 2020); Barcelona, Spain. Published online 2020. p. 33–44. doi:10.1145/3351095.3372873.
  • Johnston AC, Warkentin M. Information privacy compliance in the healthcare industry. Inf Manag & Comput Secur. 2008;16(1):5–19. doi:10.1108/09685220810862715.
  • Johnston MB, Roper L. HIPAA becomes reality: compliance with new privacy, security, and electronic transmission standards. West VA Law Rev. 2000 Accessed August 14, 2021; 103. https://heinonline.org/HOL/Page?handle=hein.journals/wvb103&id=553&div=&collection=
  • Auditability checklist | auditing machine learning algorithms. Accessed 1 June 2022. https://www.auditingalgorithms.net/AuditabilityChecklist.html
  • Mukherjee D, Yurochkin M, Banerjee M, Sun Y. Two simple ways to learn individual fairness metrics from data. In: Proceedings of the 37th International Conference on Machine Learning (ICML 2020); Virtual Event. 2020. p. 7054–64. Accessed June 3, 2022. https://github.com/
  • Olteanu A, Castillo C, Diaz F, Kıcıman E. Social data: biases, methodological pitfalls, and ethical boundaries. Front Big Data. 2019;2(13):13. doi:10.3389/fdata.2019.00013.
  • Partridge D, Hussain KM. Artificial intelligence and business management. Norwood, NJ: Ablex Publishing Corporation; 1992. Accessed June 7, 2022. https://books.google.com/books?id=3_zkZYwj43sC&pgis=1