
Ironies of artificial intelligence

Pages 1656-1668 | Received 14 Apr 2023, Accepted 21 Jul 2023, Published online: 06 Aug 2023

Abstract

Bainbridge’s Ironies of Automation was a prescient description of automation-related challenges for human performance that have characterised much of the 40 years since its publication. Today a new wave of automation based on artificial intelligence (AI) is being introduced across a wide variety of domains and applications. Not only are Bainbridge’s original warnings still pertinent for AI, but AI’s very nature and focus on cognitive tasks has introduced many new challenges for people who interact with it. Five ironies of AI are presented, including difficulties with understanding AI and forming adaptations to it, opaqueness in AI limitations and biases that can drive human decision biases, and difficulties in assessing AI reliability, even as AI remains insufficiently intelligent for many of its intended applications. Future directions are provided to create more human-centered AI applications that can address these challenges.

Practitioner summary:

Artificial Intelligence (AI) creates many new challenges for human interaction. Five ironies of AI are discussed that limit its ultimate success, and future directions are provided to create more human-centered AI applications that can address these challenges.

Introduction

Lisanne Bainbridge’s Citation1983 paper, the Ironies of Automation (Bainbridge Citation1983), was a telling and prescient summary of the many challenges that arise from automation. She pointed out the ways in which automation, paradoxically, makes the human’s job more crucial and more difficult, rather than easier and less essential as so many engineers believe. Not only does automation introduce new design errors into the control of systems, but it creates very different jobs that have many new problems, with the result that people may be less able to perform when needed. They need to be more skilled to understand and operate the automation, while simultaneously the automation leads to skill atrophy. Additional system complexity is introduced, as well as vigilance problems that interfere with people’s ability to oversee the automation. And while manual workload may be decreased much of the time, cognitive workload is often increased at critical times.

After 40 years, Bainbridge’s keen observations continue to hold true as the use of automation has increased across many domains, including aviation, air traffic control, automated process control, drilling, and transportation systems. Problems with loss of skills needed for manual performance and decision-making have been reported in aviation (Jacobson Citation2010; National Transportation Safety Board Citation2010; Wiener and Curry Citation1980), information automation (Volz et al. Citation2016), and vehicle automation (Nordhoff et al. Citation2023), among others. Inadequate training on automation has been found to be a critical problem associated with many aviation automation accidents (Funk et al. Citation1999), and attention to automation training has often been lacking (Strauch Citation2017). Increased cognitive workload, particularly at critical times, has been shown to be problematic, making it quite challenging for pilots to take over manual control when needed and leading to numerous aviation accidents involving automation (Endsley and Strauch Citation1997; Federal Aviation Administration Citation2019; Funk et al. Citation1999).

While automation can be very useful when it works well, of critical concern is that, when automation is not perfect, people struggle to compensate for its shortcomings. A wide body of evidence has amassed showing that automation decreases people’s ability to intervene when needed because it puts them out-of-the-loop (OOTL). One study found 26 automation-related accidents among major air carriers between 1972 and 2013 in which pilots were significantly challenged in understanding what the automation was doing and interacting with it correctly to avoid the resulting accident (Gawron Citation2019). Another review found that two of the biggest factors underlying automation accidents and incidents are inadequate understanding of automation and poor transparency of the behaviour of the automation (Funk et al. Citation1999). These problems are not limited to aviation; similar automation-related OOTL failures have contributed to major accidents in power grid operations (U.S. Canada Power System Outage Task Force Citation2004) and oil drilling (National Academy of Engineering and National Research Council Citation2012), for example.

OOTL-related accidents have been shown to be due to the loss of situation awareness (SA) that occurs when people monitor automated systems, which arises from a combination of poor information displays, vigilance and complacency problems, and loss of cognitive engagement (Endsley and Kiris Citation1995). Summarising a wide body of research examining this phenomenon over the past 30 years, I posited an automation conundrum which states that:

The more automation is added to a system, and the more reliable and robust that automation is, the less likely that human operators overseeing the automation will be aware of critical information and able to take over manual control when needed. More automation refers to automation use for more functions, longer durations, higher levels of automation, and automation that encompasses longer task sequences. (Endsley Citation2017b)

In related work, Chris Wickens described the lumberjack effect, which states that while automation can create better performance in routine situations, it also can lead to more catastrophic outcomes in non-routine situations where automation is often ill suited (Sebok and Wickens Citation2017). Overall, working with automated systems has proven challenging for people. Not only has automation often been limited in terms of its reliability and the range of situations it can handle, but simultaneously it negatively affects the degree to which people have been able to compensate for its deficiencies when needed.

New ironies with artificial intelligence

In the newest wave of automation, the use of artificial intelligence (AI) software is being touted as the means for overcoming the past limitations of traditional software automation. AI is a form of automation that is directed at highly perceptual and cognitive tasks (National Academies of Sciences Engineering and Medicine Citation2021; U. S. Air Force Citation2015). The goal of AI is to perform tasks that would normally require human intelligence, such as recognising speech or objects in visual scenes, making decisions and solving problems. AI applications potentially have the ‘ability to reason, discover meaning, generalize, or learn from past experience’ (Copeland Citation2020), although current instantiations may only do small portions of this.

Earlier versions of AI software relied on knowledge engines with extensive rules for behaviour or cases that had to be derived and encoded manually. Today’s AI instead uses machine learning (ML) algorithms (such as neural networks) to detect and recognise patterns in data, largely benefiting from the availability of big data sets that are digitised to produce useful products. For example, digital twins rely heavily on ML to process large sets of data for understanding complex systems, performing fault diagnosis, and predicting system failures in manufacturing and aerospace applications (Rathore et al. Citation2021). In other cases, such as autonomous vehicles, extensive sets of sensors need to be employed to gather environmental information for processing by the AI. And in some cases, generative AI techniques are actually creating new content (such as articles or pictures) by recombining or iterating based on input examples.
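To make the contrast with hand-coded knowledge engines concrete, the sketch below trains a small neural network classifier on a labelled data set so that the mapping from inputs to outputs is learned from examples rather than written as explicit rules. The data set, model size, and library choices are illustrative assumptions only, not drawn from the applications cited above.

```python
# Minimal sketch of the ML paradigm: patterns are learned from labelled data
# rather than encoded as hand-written rules. Data set and model choices are
# illustrative assumptions only.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import accuracy_score

# Synthetic "sensor readings" with known class labels stand in for a real data set.
X, y = make_classification(n_samples=2000, n_features=20, n_informative=8, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=0)

# A small feed-forward neural network learns the mapping from features to labels.
model = MLPClassifier(hidden_layer_sizes=(32, 16), max_iter=500, random_state=0)
model.fit(X_train, y_train)

# Performance is only measurable on data resembling the training distribution;
# behaviour in genuinely novel situations (edge cases) remains unknown.
print("held-out accuracy:", accuracy_score(y_test, model.predict(X_test)))
```

Note that the resulting model contains no inspectable rules; its behaviour is implicit in learned weights, which is one root of the opaqueness discussed below.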

As an advanced form of automation, many of the same challenges and ironies identified by Bainbridge equally apply to the job of working with an AI system. Deskilling, increases in system complexity, increases in cognitive workload, and decreases in SA remain fundamental problems to be addressed with AI systems. As long as people need to interact with and potentially correct for deficiencies of the AI system, these fundamental human interaction challenges remain. Changes in the mechanism for creating the automated behaviours do not in any way alleviate previous automation related concerns that result from the fundamental changes in the roles of people along with co-occurring changes to workload, SA and engagement.

While AI can be considered an advanced (and potentially more capable) form of automation, its reliance on learning algorithms also creates additional challenges for the people who interact with these systems. Most notably, AI based on learning techniques suffers from opaqueness; it can be quite difficult for people to understand how the AI works and what features of the environment or situation it is basing its performance on (or is oblivious to) (Endsley Citation2017b; Huang and Endsley Citation1997). In addition, the learning algorithms themselves create a new kind of challenge for people—that of maintaining a mental model of a constantly changing and evolving system (U. S. Air Force Citation2015). Thus, in addition to Bainbridge’s initial ironies of automation, several additional ironies are created with AI. I will first discuss these ironies, followed by a discussion of inter-related approaches for mitigating them.

Irony 1: Artificial intelligence is still not that intelligent

AI has come a long way in recent years. New generative AI systems, such as ChatGPT, can create quite reasonable text based on a natural language generator and behind-the-scenes data sets that may be useful in many tasks. Initial results indicate that people may have trouble determining whether an essay was generated by a human or the AI, suggesting potentially Turing-level intelligence. A deep examination, however, shows that even this impressive feat is limited. Its answers to questions may be inaccurate or biased and it lacks an understanding of context (Rudolph, Tan, and Tan Citation2023). It does not possess any ‘common sense, reasoning, intuition, or creativity’ according to the ChatGPT website. While possessing an excellent natural language capability, it is still only capable of rephrasing the information in its underlying database and predicting what words are most likely to come next. And it often may simply make up information if it does not know something (i.e. hallucinations) (Alkaissi and McFarlane Citation2023; Sallam Citation2023).
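To illustrate in the simplest possible terms what ‘predicting what words are most likely to come next’ means, the toy sketch below builds a bigram next-word predictor from a tiny example corpus. It is a deliberately crude, hypothetical analogue of the statistical prediction underlying large language models, not a description of ChatGPT’s actual architecture.

```python
# Toy illustration of next-word prediction from observed text statistics.
# Real large language models use learned neural representations over vast
# corpora; this bigram counter is only a minimal, hypothetical analogue.
from collections import Counter, defaultdict

corpus = "the autopilot was engaged and the autopilot was monitored by the crew".split()

# Count which word follows which in the training text.
following = defaultdict(Counter)
for current_word, next_word in zip(corpus, corpus[1:]):
    following[current_word][next_word] += 1

def predict_next(word):
    """Return the most frequent continuation seen in training, if any."""
    candidates = following.get(word)
    return candidates.most_common(1)[0][0] if candidates else None

print(predict_next("the"))        # -> "autopilot": the most common continuation seen
print(predict_next("turbulence")) # -> None: nothing can be said about unseen words
```

Even at this toy scale, the limitation is visible: the predictor can only recombine what it has already seen, and says nothing about whether the continuation is true.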

While ChatGPT is growing and improving rapidly, one may ask: is this intelligence? Does intelligence require more than gathering and regurgitating information (even if that information is presented well)? An intelligent system is defined as one that recognises situations, adapts to changes, generates solutions to even novel problems, and can act to optimise performance (Copeland Citation2020). It can be argued that generative AI still does not provide these capabilities; it just provides an excellent mimic.

Although programs such as AlphaGo can actually outperform humans via ML (Silver et al. Citation2017), this is a highly constrained problem without the wide variety of edge cases and novel situations that can occur in real-world settings. ML is excellent for data analysis when there is a large data set of situations to learn from; however, it is far less capable of managing the novel situations and edge cases where human intelligence is better able to perform.

AI systems are also being proposed for far more difficult tasks with which they still struggle. Even though AI can operate fairly easily on the basis of already digitised data, AI that requires sensors and machine vision for information inputs still has problems with accuracy (Akhtar and Mian Citation2018; Feng et al. Citation2019) and is easily confused by noisy data. This creates significant difficulties for any system that must operate in the real (as opposed to virtual) world, such as flying, driving, military operations and power systems.

Most importantly, Pearl and Mackenzie (Citation2018) point out that ML based AI systems lack a model of causation that is critical for predicting future events, simulating potential actions, or appropriately generalising to new situations. Decision making requires not only understanding what is currently happening, but also projecting what is likely to happen, or what could possibly happen; therefore, this is a serious limitation on AI performance. Ultimately, truly intelligent AI will require capabilities which exceed that provided by current pattern-recognition based ML approaches. Intelligence will require the ability to perform well in untrained situations, the ability to recognise the limits of its own abilities and perform gracefully in such cases, the ability to manage uncertainty including noisy data, and the ability to predict future events and apply general knowledge to new situations appropriately.

Irony 2: the more intelligent and adaptive the AI, the less able people are to understand the system

Mental models constitute an important mechanism that people use for understanding the functioning of any system, including automation and AI (Manktelow and Jones Citation1987; Rouse and Morris Citation1985; Wilson and Rutherford Citation1989). Mental models are used for gathering and interpreting information to form SA by directing attention to relevant information and by providing the mechanisms needed for integrating information to create comprehension and projection of future events (Endsley Citation1995).

Challenges in developing accurate mental models of automated systems are well documented (Mogford Citation1997; Mumaw, Sarter, and Wickens Citation2001; Phillips et al. Citation2011; Silva and Hansman Citation2015). Automation tends to be inherently complex and opaque, with deep logic structures, making it quite difficult for people to develop good mental models and SA (Endsley Citation1995). This leads to unexpected ‘automation surprise’ occurring in infrequently encountered conditions (McClumpha and James Citation1994; Vakil and Hansman Citation1997; Woods and Sarter Citation2000).

It is even more difficult to develop an accurate mental model of an AI system than of traditional automation. First, because modern AI systems are generally developed through ML, distinct logic or rules are not apparent. The way in which the system will perform in different circumstances is inherently opaque and not open to inspection. Even the developers of AI systems are often unaware of what features of the situation it is using and how, and of its limitations and capabilities for different circumstances (Ferreira and Monteiro Citation2020). Therefore, mental models of how AI works must be developed through experience, with users creating their own representations of AI capabilities and limitations to guide their expectations of what it will do in different situations, often incorrectly it turns out (Druce et al. Citation2021).

In addition, maintaining an accurate mental model of AI is made much more difficult by the fact that it can change over time (Endsley Citation2017b; U. S. Air Force Citation2015). One of the advantages of using ML approaches to AI is that it can constantly improve and extend as it encounters new situations. This, however, only makes the challenge of maintaining an accurate mental model of the AI more difficult. If the logic and capabilities of the AI change over time, users may not even be aware of these changes or able to update their internal models appropriately, and training for AI systems is often lacking (Casner and Hutchins Citation2019; Endsley Citation2017a). Without accurate and up-to-date mental models of how AI works, the ability of people to predict and understand AI actions is severely limited, decreasing their ability to adopt effective strategies for dealing with any AI deficiencies.

Irony 3: the more capable the AI, the poorer people’s self-adaptive behaviours for compensating for shortcomings

People can often compensate for systems with limited reliability by paying more attention to operations at times when the system may have difficulties (Bagheri and Jamieson Citation2004). When people believe their SA is poor, they tend to behave adaptively, acting conservatively to avoid negative outcomes while gathering more information to improve their SA (Endsley Citation1995). A key challenge arises in that not only can automation and AI directly reduce SA, but they also compromise the very mechanisms that people rely on when they need to improve their SA.

People actually have significant problems in assessing the accuracy of their own SA. A recent meta-analysis of 37 studies found very poor correlation between objective measures of SA and subjective measures of SA (Endsley Citation2020). There are a number of reasons for this disconnect, including limited feedback, Dunning-Kruger effects (the tendency of some people to over-estimate their abilities), the use of different brain structures for confidence assessments, selective information sampling, lack of knowledge of missing information, poor mental models, and the probabilistic link between SA and outcomes that limits learning (Endsley Citation2020). People often simply do not know what they do not know.

With respect to AI, the degree to which people may not know when their SA is low limits their ability to direct more attention to overseeing the AI or taking over in situations that it is ill-suited for. A recent study of Tesla drivers found that 46% believed that the AI autopilot improved their SA, while simultaneously 45% reported increased complacency, 21% reported over-reliance on the autopilot, and 15% reported mind-wandering and fatigued driving (Nordhoff et al. Citation2023). Even though many drivers seemed aware of some of the negative effects of the autopilot that could degrade their SA, this was at direct odds with their belief that SA was improved.

This disconnect can be partially explained by the observation that SA may actually be better at times with an autopilot, due to an increased ability to look around, even though it may be worse at other times due to complacency, poor engagement and limited information on autopilot functioning (Endsley Citation2017a). That is, SA becomes more variable. However, because people are also poor at determining when their SA is low, they are unable to effectively alter their information gathering behaviours when needed (e.g. introduce more scanning or interrupt competing tasking) in order to compensate for these deficiencies. Complacency, mind-wandering, and an increase in engaging in competing tasks become commonplace problems (Carsten et al. Citation2012; Hergeth et al. Citation2016; Ma and Kaber Citation2005; Sethumadhavan Citation2009).

Even as AI decreases SA, simultaneously behaviours typically used to increase SA also become compromised. While this problem exists with all automation, it can be argued that it is worse with AI. First, because AI is a more capable and higher level of automation, able to operate in a wider set of circumstances than traditional automation, SA can become even more degraded per the automation conundrum (Endsley Citation2017b). Therefore, it is likely to happen more frequently with AI systems. Secondly, the inherent opaqueness of AI systems will exacerbate the inability of people to accurately calibrate their understanding of system operations.

Irony 4: the more intelligent the AI, the more obscure it is, and the less able people are to determine its limitations and biases and when to use the AI

The challenge of AI bias has received considerable attention. These biases often are introduced inadvertently through the provision of limited or statistically biased training sets (i.e. limited representativeness of problems) that create biases towards certain sets of conclusions (Daugherty and Wilson Citation2018; Gianfrancesco et al. Citation2018; West, Whittaker, and Crawford Citation2019), as well as artefacts that creep into the development of the AI algorithms (Northcutt, Athalye, and Mueller Citation2021). These biases create a shortcoming in the performance of AI in that it will perform more poorly or inaccurately in situations that are different than what it has been trained on. Problems with AI systems introducing racial or gender bias in hiring decisions or facial recognition have been widely publicised, for example (Howard and Borenstein Citation2018; Osoba and Welser Citation2017). As a more general case, bias can be considered any use of an AI system in situations outside of its training (National Academies of Sciences Engineering and Medicine Citation2021), i.e. an over-generalisation that occurs when AI trained to operate in certain conditions is applied in other conditions.

People are often expected to be able to compensate for AI shortcomings, like bias, substituting their own knowledge and judgement in cases where the AI may be deficient. Paradoxically, however, AI makes it very difficult to do so. First, these biases tend to be hidden due to the opaque nature of ML. Even the developers of AI systems may not know what biases have inadvertently been introduced in the training process. Furthermore, the users of AI systems are generally a different set of people than the developers of the AI, and therefore are even less likely to understand the limitations of its training or what situations it should be limited to.

For example, AI based recommendation systems have been found to promote misinformation and conspiracy theories (Hussein, Juneja, and Mitra Citation2020; Mortimer Citation2017). As people search for information or view certain web sites on social media, the AI recommender systems encourage additional related content. People may mistakenly view the recommended ‘similar’ content as having undeserved authenticity. They are not aware of how the AI system works, what it bases its recommendations on, or the risks inherent in its output. These biases are hidden.

Secondly, humans do not form their decisions independently from AI, but are directly influenced by the provision of recommendations or assessments from the AI (Endsley and Jones Citation2012). People tend to anchor on the recommendation of the AI system, and then gather information to agree or disagree with it, creating confirmation bias. AI biases therefore can directly compound human biases in the decision process, thereby reducing the reliability of the joint human-AI system. Further, the impact of the AI biases can vary depending on the format and framing of the AI system’s recommendations (Banbury et al. Citation1998; Endsley and Kiris Citation1994; Friesen et al. Citation2021; Selcon Citation1990). Rather than overcoming human decision bias, AI can make it worse through this process of anchoring and confirmation bias.

Because AI biases are generally invisible, unknown to both developers and users of systems, and able to affect human decision making quite surreptitiously, their negative effects can be insidious. People will therefore often be unable to detect and compensate for these biases (by choosing when to use the system or interjecting corrections, for example).

Irony 5: the more natural the AI communications, the less able people are to understand the trustworthiness of the AI

ChatGPT is a new AI chatbot developed to deliver the answers to people’s questions in an easy to understand, natural language format. It can provide detailed essays on a wide variety of topics, based on internet searches, that adhere relatively well to expected syntax and writing styles (Sinha, Burd, and DuPreez Citation2023). The problem, however, is that not only is ChatGPT unable to effectively differentiate accurate from inaccurate information, but it also delivers its answers in a confident and unnuanced way that removes critical cues (such as the source of the information) that might allow users to calibrate the accuracy or reliability of its answers.

A recent study provided ChatGPT with 100 questions associated with known misinformation topics (Brewster, Arvantis, and Sadeghi Citation2023). The chatbot generated false narratives for 80% of the topics, significantly exacerbating and promoting conspiracy theories and misinformation on the internet. While the algorithms used by AI such as ChatGPT can presumably improve to better filter false information (and steps have been taken to reduce some of these obvious sources of misinformation), the central challenge will remain that people need key information to help them determine how much confidence to place in the information provided by AI systems.

Typically, people rely on a number of cues to determine how much confidence to place in information provided by a system including the source of information, conditions underlying the generation of the data, its recency, the presence of incongruent or conflicting data, and noisiness or ambiguities in the data (Endsley and Jones Citation2012). Good design principles stress the importance of providing insight into these factors in system displays to support human understanding of information confidence (Endsley and Jones Citation2012).

However, these cues are hidden when the AI relies on natural language for its communications. Further, because the AI is integrating information from a wide variety of sources, the reliability of the aggregated outcome is unknown. Parts of what it provides may be accurate and other parts inaccurate. Conclusions may be only partially justified, and sources may be cherry picked or questionable, making conclusions misleading. Not only are these issues invisible to users, but many people may also mistakenly be overconfident in the output of AI systems (Garcia et al. Citation2022; Gillespie et al. Citation2023; Howard Citation2020), believing them to be more capable than they really are.

Therefore, while natural language interfaces for AI systems may be viewed as desirable, providing more human-friendly interactions, they also pose a simultaneous problem for assessments of their reliability and trustworthiness. Given that people may have many difficulties in determining when and how much to trust AI systems (Kaplan et al. Citation2023; Stanton and Jensen Citation2021), a problem that is exacerbated by misleading naming of AI systems (Forster et al. Citation2018; Liu et al. Citation2021; Teoh Citation2020) and media coverage of AI (Fast and Horvitz Citation2017; Kelley et al. Citation2021), addressing this problem can be quite complicated.

Overcoming the ironies of AI – future directions

Support for human interaction and oversight

The key implication of Irony 1 is that although AI systems may be very useful for certain applications, they will not take over functions and tasks completely. Rather, AI will need to be able to work with and facilitate human decision making as a key objective of its design (National Academies of Sciences Engineering and Medicine Citation2021). Given AI’s limitations, a central focus of AI system design must be facilitating effective human decision making whenever AI is involved. As the insertion of AI systems into many different applications appears bound to continue, at varying levels of capability, a new framework is needed for ensuring that these systems address the needs of the people who must interact with them. An initial set of requirements for more human-centered AI should include the following.

Attribution

AI systems must be explicitly identified as bots. Further, any product of a bot (e.g. generative language, videos, images, recommendations) must be labelled as such, along with the source of its products. This will help people to better calibrate their expectations and assumptions with regard to the veracity of its outputs. This will partially address Irony 5, improving people’s ability to assess the trustworthiness of AI outputs. Given the problems of potential over-trust, terms such as ‘intelligent’ and ‘autonomous’ are highly discouraged because they promote unrealistic expectations of system capabilities.

Explainability

AI systems must be equipped with explainability features that allow the people who interact with them to understand the system’s capabilities and limitations for performance (including what factors it does or does not consider in its assessments), partially addressing Irony 2. Because AI can be both opaque and changeable, developing and implementing effective AI explanation systems is important for helping people to develop accurate mental models of the AI. The benefits of AI explainability have been demonstrated in several studies (Bass, Baumgart, and Shepley Citation2013; Oduor and Wiebe Citation2008; Paleja et al. Citation2021).

Considerable work has been directed towards the challenge of creating explainable AI (Samek et al. Citation2019). Approaches include attempting to create surrogate models, examining how the model behaves under various perturbations, propagating model predictions, and providing meta-explanations through methods such as heatmaps to show what features the model is utilising (Samek and Müller Citation2019). For example, research has generated understandable rules from ML systems (Huang and Endsley Citation1997), language predicates from graph models (Kazhdan et al. Citation2023), and linguistic insights into classification criteria (Saranti et al. Citation2022).
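As a minimal sketch of the perturbation-based family of techniques mentioned above, the example below shuffles one input feature at a time and measures the resulting drop in model accuracy to estimate which features the model relies on (permutation importance). The data set and model are illustrative assumptions, and this is only one simple method among those surveyed by Samek et al.

```python
# Sketch of a perturbation-based explanation: permutation feature importance.
# Shuffling one feature at a time and measuring the loss in accuracy reveals
# which inputs the model actually relies on. Data and model are illustrative.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, n_features=10, n_informative=3, random_state=1)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=1)

model = RandomForestClassifier(random_state=1).fit(X_train, y_train)

# Repeatedly permute each feature on held-out data and record the accuracy drop.
result = permutation_importance(model, X_test, y_test, n_repeats=20, random_state=1)
for idx in np.argsort(result.importances_mean)[::-1]:
    print(f"feature {idx}: mean accuracy drop = {result.importances_mean[idx]:.3f}")
```

Even when such rankings are available, translating them into an operator’s working mental model remains the harder problem, as discussed next.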

These approaches have often been more useful for model developers than end users, however. People still need to manually construct an understanding of the AI over time, based for example on the features the AI matches in different cases, so as to determine how it might perform in different situations. They must work to generate their own mental models of the system’s capabilities, which can vary significantly across people, even with such aids (Druce et al. Citation2021). Druce et al. (Citation2021) also found that: a) to be effective, explanations need to be tailored to the situational circumstances in which the behaviour occurs; b) many AI systems are very brittle and their logic is not easily amenable to explanation; c) methods for providing causal explanations of AI behaviour are not straightforward; and d) people are prone to adopt highly inaccurate mental models of how AI systems work, and these models are highly resistant to correction. Effective methods for AI explainability still need further development.

Developing an understanding of causality has been shown to be critical for the development of appropriate trust in AI systems (Shin Citation2021). AI must be capable of communicating to the user why it makes particular recommendations or takes actions in each case, tailored to the needs of the user. Sanneman and Shah (Citation2022) provide a number of recommendations for improving AI explainability towards this end. Human-centered approaches to AI explanations need to consider the capabilities of the human receiver (e.g. expertise, bandwidth, prior knowledge and assumptions) as well as provide effective methods for explanation delivery.

Although explainability features can be useful for developing mental models of how the AI works and its potential limitations, the use of explainability features is limited for many applications where people need to make real-time decisions and do not have the time to delve into complex explanations. An overload situation can be easily created in which people do not have time to dig into the factors behind each AI action or recommendation.

In dynamic situations, real-time system transparency conveyed through the user interface is more likely to be useful (Endsley Citation2023). Real-time transparency also has the advantage of being able to reflect AI capability changes associated with ML that people may not have kept up with, and for providing an understanding of AI’s performance and capabilities within the current context.

Transparency

Effective mechanisms must be provided for helping people to understand the capabilities and limitations of AI systems so that they can adopt effective strategies for use, which will partially address Ironies 2 and 3. Two extensive meta-analyses on this topic demonstrated significant benefits of AI transparency with regard to reducing OOTL problems and improving both SA and trust (van de Merwe, Mallam, and Nazir Citation2022; Wickens et al. Citation2022).

The information needed to support AI transparency is shown in Table 1 in terms of three factors: (a) Taskwork SA – SA of the world and system that is needed to support task performance, (b) Agent SA – SA of the AI that is needed to provide effective oversight, and (c) Teamwork SA – SA needed for coordinating the functions and behaviours of the human and AI (Endsley Citation2023). An AI system must be transparent to users in terms of: the information it perceives, its integrated interpretations of that information, its projections of future events, its current state or mode, its goals and currently assigned functions, its ability to perform tasks in the current and upcoming situations, how well it is currently performing tasks, and its projected actions. Further, the source of information, the reliability of that information, and confidence in assessments should be explicitly communicated to users. Endsley (Citation2023) provides an approach to detailing the specific requirements for AI applications and developing effective formats for providing transparency.

Table 1. Shared SA requirements in human-AI teams—defining AI transparency needs (from Endsley Citation2023).
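As a rough sketch of how these transparency requirements might be surfaced in practice, the structure below groups the kinds of information an AI system could expose to its user interface on each update cycle. The field names and example values are assumptions for illustration, not a specification taken from Endsley (Citation2023) or Table 1.

```python
# Hypothetical structure for real-time AI transparency information, loosely
# organised around the taskwork / agent / teamwork SA factors described above.
# Field names and values are illustrative assumptions, not a published spec.
from dataclasses import dataclass, field
from typing import List

@dataclass
class AgentTransparency:
    perceived_information: List[str]      # what the AI currently senses
    interpretation: str                   # its integrated assessment of the situation
    projections: List[str]                # what it expects to happen next
    current_mode: str                     # active state or mode
    goals: List[str]                      # goals and currently assigned functions
    capability_now: float                 # self-assessed ability in the current situation (0-1)
    performance: float                    # how well it is currently performing (0-1)
    projected_actions: List[str]          # what it intends to do
    information_sources: List[str] = field(default_factory=list)  # provenance of inputs
    confidence: float = 0.0               # confidence in its assessments (0-1)

# A user interface could render this structure on each update cycle so that the
# operator's SA of the agent stays current as the situation and the AI change.
status = AgentTransparency(
    perceived_information=["lane markings degraded", "vehicle ahead at 40 m"],
    interpretation="reduced lane confidence due to rain",
    projections=["lane tracking may be lost within 30 s"],
    current_mode="lane keeping",
    goals=["maintain lane", "maintain 2 s headway"],
    capability_now=0.6,
    performance=0.8,
    projected_actions=["request driver takeover if confidence drops further"],
    information_sources=["forward camera", "radar"],
    confidence=0.55,
)
print(status.interpretation)
```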

Exposing bias

AI systems must be capable of exposing their own biases and learning limitations to users, addressing Irony 4. AI biases are largely the result of limitations in the generalisability of training sets, or non-causal patterns that may be present in those data sets. AI developers must carefully determine what biases and generalisability limitations exist, and document them for users. Approaches for detecting and mitigating AI bias are being explored (Kiyasseh et al. Citation2023; Mazijn et al. Citation2022; Srivastava and Rossi Citation2019). Because, over time, there is always the possibility that the AI will be used in situations that it is not suited for, it is important that the limits and biases of a given system be clearly indicated to end-users. In addition, a level of meta-awareness of its own limitations needs to be developed within AI systems.
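A minimal sketch of one form such meta-awareness could take is shown below: the system flags inputs that fall far outside the statistics of its training data, signalling that it is being used beyond the situations it was trained for. The distance-based check, threshold, and data are deliberately simple illustrative assumptions rather than an implementation of the bias-detection approaches cited above.

```python
# Crude sketch of flagging "outside the training distribution" inputs so that a
# system can warn users it is being used beyond its training. The threshold and
# data are illustrative assumptions; real approaches are far more sophisticated.
import numpy as np

rng = np.random.default_rng(0)
X_train = rng.normal(loc=0.0, scale=1.0, size=(5000, 8))   # training feature vectors

train_mean = X_train.mean(axis=0)
train_std = X_train.std(axis=0)

def out_of_distribution(x, z_threshold=4.0):
    """Flag inputs whose features deviate strongly from the training statistics."""
    z_scores = np.abs((x - train_mean) / train_std)
    return bool(np.any(z_scores > z_threshold))

familiar = rng.normal(0.0, 1.0, size=8)
novel = familiar.copy()
novel[0] = 12.0  # a feature value never seen in training

print(out_of_distribution(familiar))  # False: resembles the training data
print(out_of_distribution(novel))     # True: the system should expose this limitation to the user
```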

Meaningful control

People must be provided with meaningful control over AI systems (Boardman and Butcher Citation2019; Cavalcante Siebert et al. Citation2023), to include the ability to maintain situation awareness when overseeing the performance of systems controlled by AI, and the ability to take corrective actions over the AI in a timely manner, partially addressing Irony 3. The challenges of lowered cognitive engagement and human attention are fundamental problems. Even with effective AI transparency, if people are reliant on the AI and thus directing their attention elsewhere, they may not be cognisant of problems demanding their intervention, particularly in continuous control tasks such as driving or flying. Mechanisms for increased engagement (e.g. through divisions of functions that retain meaningful roles for people and allow them to maintain expertise) will be important, as well as the ability to effect meaningful control of operations (Boardman and Butcher Citation2019). AI systems will need to develop a level of meta-awareness (Kounev et al. Citation2017) that allows them to understand when situations are beyond their capabilities and they should ask people for help. Effective mechanisms for transferring control to people, which take into account the need for them to rebuild their SA of the situation, must be provided.

Training and skill retention

To address the previously identified irony of deskilling, methods for ensuring that people effectively develop and retain the cognitive skills required for performance (both alone and in conjunction with the AI) must be developed and implemented (Casner and Hutchins Citation2019; Paranjape et al. Citation2019). Relatively little research has been directed at the effect AI may have on the development and maintenance of needed skills. Without the ability to perform complex cognitive tasks, the ability of people to detect insufficient behaviour by the AI, or to assume those tasks themselves, is severely compromised. Training should encompass both the requirement to acquire the fundamental knowledge and skills associated with manual performance, as well as the ability to understand and work with the AI. Further, due to the changing nature of ML-based AI, refresher training will need to be instituted, so that changes in AI capabilities over time can be learned.

Joint testing of human-AI systems

The safety and reliability of AI systems when working in conjunction with people must be carefully tested before they are deployed in any safety critical environment. The capability of the AI for allowing people to understand it, and to detect and overcome problems in realistic conditions should be determined. Traditional metrics such as system usability and suitability for purpose should be augmented by more detailed metrics such as the SA and trust of the users when working with the AI system.

This testing should also determine unforeseen behaviours of the AI and the ability of people to detect and react appropriately, as well as unintended consequences (e.g. behavioural changes that occur when the AI is present). A recent study points to the need to consider how human behaviour, the environment and AI blind spots can align to create new types of performance challenges which should be the focus of joint human-AI testing (National Academies of Sciences Engineering and Medicine Citation2021).

Because of the changing nature of AI, it is also envisioned that a lifecycle approach to testing will be needed, wherein the impact of changes to the AI system over time will need to be tested and verified with human users prior to each deployment. Metrics such as AI sustainability and auditability become important in practice, speaking to the need for people to be able to identify, track and manage system changes as the context of operations changes (National Academies of Sciences Engineering and Medicine Citation2021). The ability of AI implementations to meet their objectives when working in realistic conditions with actual users is critical. An effective capability for co-developing and testing the AI system as part of a joint human-AI team is therefore essential to its success.

Improvement of AI capabilities

Lastly, the AI community continues to work to improve the capabilities of AI systems, making them more ‘intelligent’, addressing Irony 1. In addition to improving their robustness and ability to work in real-world situations, this includes efforts to integrate an understanding of causality into AI approaches (Pearl Citation2009; Pelivani and Cico Citation2021; Schölkopf et al. Citation2021), which would significantly improve their ability to do situational predictions for improved decision making. Work is also ongoing to develop AI systems with ‘self-awareness’ (Chatila et al. Citation2018; Kounev et al. Citation2017). This includes the ability to determine when the AI system is dealing with situations that are outside of its boundaries of operations. Such systems would be highly beneficial for avoiding some of the issues associated with AI bias and brittleness.

In addition, there is significant interest in developing AI systems with a higher level of capability for teaming with humans (Liang et al. Citation2019; McNeese et al. Citation2018; National Academies of Sciences Engineering and Medicine Citation2021). In conceiving of the AI as a team-mate with its human counterparts, the importance of mutual collaboration and assistance in adapting to demands becomes emphasised. Rather than one-way provision of information, human-AI teaming will require substantially more two-way information exchange, with the AI capable of monitoring human performance, aligning itself with human goals, and proactively providing information or assistive tasking, for example. Human-AI teaming would require a shared understanding of goals, plans and actions, and shared SA (Endsley Citation2017b, Citation2023; U. S. Air Force Citation2015). Both improved communications mechanisms as well as improved displays and sensing capabilities will likely be required to achieve this goal.

Conclusions

AI has achieved significant market penetration in a variety of areas, even though its current capabilities remain limited. Rapid technical advances, however, are likely, and with them will come increased challenges for human performance when overseeing and interacting with AI systems. Addressing these existing and new ironies of AI will be essential for achieving the expected benefits of AI and averting a repeat of the long history of problems seen with automation, which otherwise remain much too likely.

In the long term, significant changes to the job market are possible (with some jobs perhaps being made obsolete). However, in the foreseeable future, the prognosis is for AI to become incorporated into many existing jobs and human activities. In order to avoid the potential downsides of the AI ironies described, an active effort to create human-centered AI systems that incorporate the capabilities and approaches described here is essential.

Disclosure statement

No potential conflict of interest was reported by the author(s).

Additional information

Funding

The author(s) reported there is no funding associated with the work featured in this article.

References

  • Akhtar, N., and A. Mian. 2018. “Threat of Adversarial Attacks on Deep Learning in Computer Vision: A Survey.” IEEE Access. 6: 14410–14430. doi:10.1109/ACCESS.2018.2807385.
  • Alkaissi, H., and S. I. McFarlane. 2023. “Artificial Hallucinations in ChatGPT: implications in Scientific Writing.” Cureus 15 (2): e35179. doi:10.7759/cureus.35179.
  • Bagheri, N., and G. A. Jamieson. 2004. “The Impact of Context-Related Reliability on Automation Failure Detection and Scanning Behaviour.” Proceedings of the 2004 IEEE International Conference on Systems, Man and Cybernetics, 212-217. IEEE. doi:10.1109/ICSMC.2004.1398299.
  • Bainbridge, L. 1983. “Ironies of Automation.” Automatica 19 (6): 775–779. doi:10.1016/0005-1098(83)90046-8.
  • Banbury, S., S. Selcon, M. Endsley, T. Gorton, and K. Tatlock. 1998. “Being Certain about Uncertainty: How the Representation of System Reliability Affects Pilot Decision Making.” Proceedings of the Human Factors and Ergonomics Society 42nd Annual Meeting, 36–41. Santa Monica, CA: Human Factors and Ergonomics Society. doi:10.1177/154193129804200109.
  • Bass, E. J., L. A. Baumgart, and K. K. Shepley. 2013. “The Effect of Information Analysis Automation Display Content on Human Judgment Performance in Noisy Environments.” Journal of Cognitive Engineering and Decision Making 7 (1): 49–65. doi:10.1177/1555343412453461.
  • Boardman, M., and F. Butcher. 2019. “An Exploration of Maintaining Human Control in AI Enabled Systems and the Challenges of Achieving It.” Paper presented at the Workshop on Big Data Challenge-Situation Awareness and Decision Support, DSTL Porton Down.
  • Brewster, J., L. Arvantis, and M. Sadeghi. 2023. The next great misinformation superspreader: How ChatGPT could spread toxic misinformation at unprecedented scale, Misinformation Monitor. https://www.newsguardtech.com/misinformation-monitor/jan-2023/
  • Carsten, O., F. C. H. Lai, Y. Barnard, A. H. Jamson, and N. Merat. 2012. “Control Task Substitution in Semiautomated Driving: Does It Matter What Aspects Are Automated?” Human Factors 54 (5): 747–761. doi:10.1177/0018720812460246.
  • Casner, S. M., and E. L. Hutchins. 2019. “What Do we Tell the Drivers? Toward Minimum Driver Training Standards for Partially Automated Cars.” Journal of Cognitive Engineering and Decision Making 13 (2): 55–66. doi:10.1177/1555343419830901.
  • Cavalcante Siebert, Luciano, Maria Luce Lupetti, Evgeni Aizenberg, Niek Beckers, Arkady Zgonnikov, Herman Veluwenkamp, David Abbink, Elisa Giaccardi, Geert-Jan Houben, Catholijn M. Jonker, Jeroen van den Hoven, Deborah Forster, and Reginald L. Lagendijk. 2023. “Meaningful Human Control: Actionable Properties for AI System Development.” AI and Ethics 3 (1): 241–255. doi:10.1007/s43681-022-00167-3.
  • Chatila, Raja, Erwan Renaudo, Mihai Andries, Ricardo-Omar Chavez-Garcia, Pierre Luce-Vayrac, Raphael Gottstein, Rachid Alami, Aurélie Clodic, Sandra Devin, Benoît Girard, and Mehdi Khamassi. 2018. “Toward Self-Aware Robots.” Frontiers in Robotics and AI 5: 88. doi:10.3389/frobt.2018.00088.
  • Copeland, B. J. 2020. Artificial Intelligence: Britannica. Retrieved from https://www.britannica.com/technology/artificial-intelligence.
  • Daugherty, P. R., and H. J. Wilson. 2018. Human + Machine: Reimagining Work in the Age of AI. Boston, MA: Harvard Business Press.
  • Druce, J., J. Niehaus, V. Moody, D. Jensen, and M. L. Littman. 2021. “Brittle AI, Causal Confusion, and Bad Mental Models: Challenges and Successes in the XAI Program.” arXiv preprint arXiv:2106.05506.
  • Endsley, M. R. 1995. “Toward a Theory of Situation Awareness in Dynamic Systems.” Human Factors: The Journal of the Human Factors and Ergonomics Society 37 (1): 32–64. doi:10.1518/001872095779049543.
  • Endsley, M. R. 2017a. “Autonomous Driving Systems: A Preliminary Naturalistic Study of the Tesla Model S.” Journal of Cognitive Engineering and Decision Making 11 (3): 225–238. doi:10.1177/1555343417695197.
  • Endsley, M. R. 2017b. “From Here to Autonomy: Lessons Learned from Human-Automation Research.” Human Factors 59 (1): 5–27. doi:10.1177/0018720816681350.
  • Endsley, M. R. 2020. “The Divergence of Objective and Subjective Situation Awareness: A Meta-Analysis.” Journal of Cognitive Engineering and Decision Making 14 (1): 34–53. doi:10.1177/1555343419874248.
  • Endsley, M. R. 2023. “Supporting human-AI Teams: Transparency, Explainability, and Situation Awareness.” Computers in Human Behavior 140: 107574. doi:10.1016/j.chb.2022.107574.
  • Endsley, M. R., and D. G. Jones. 2012. Designing for Situation Awareness: An Approach to Human-Centered Design (2nd ed.). London: Taylor & Francis.
  • Endsley, Mica R., and Esin O. Kiris. 1994. “Information Presentation for Expert Systems in Future Fighter Aircraft.” The International Journal of Aviation Psychology 4 (4): 333–348. doi:10.1207/s15327108ijap0404_3.
  • Endsley, M. R., and E. O. Kiris. 1995. “The out-of-the-Loop Performance Problem and Level of Control in Automation.” Human Factors: The Journal of the Human Factors and Ergonomics Society 37 (2): 381–394. doi:10.1518/001872095779064555.
  • Endsley, M. R., and B. Strauch. 1997. “Automation and Situation Awareness: The Accident at Cali, Colombia.” Proceedings of the Ninth International Symposium on Aviation Psychology, 877–881. Columbus, OH: Ohio State University.
  • Fast, E., and E. Horvitz. 2017. “Long-Term Trends in the Public Perception of Artificial Intelligence.” Proceedings of the AAAI Conference on Artificial Intelligence. doi:10.1609/aaai.v31i1.10635.
  • Federal Aviation Administration. 2019. Joint Authorities Technical Review: Boeing 737 Max Flight Control System Observations, Findings and Recommendations. Washington, DC: Author.
  • Feng, X., Y. Jiang, X. Yang, M. Du, and X. Li. 2019. “Computer Vision Algorithms and Hardware Implementations: A Survey.” Integration 69: 309–320. doi:10.1016/j.vlsi.2019.07.005.
  • Ferreira, J. J., and M. S. Monteiro. 2020. “What Are People Doing about XAI User Experience? A Survey on AI Explainability Research and Practice.” Proceedings of the Design, User Experience, and Usability. Design for Contemporary Interactive Environments: 9th International Conference, DUXU 2020, Held as Part of the 22nd HCI International Conference, HCII 2020, Copenhagen, Denmark, July 19–24, 2020, Proceedings, Part II 22, 56–73. Springer.
  • Forster, Y., J. Kraus, S. Feinauer, and M. Baumann. 2018. “Calibration of Trust Expectancies in Conditionally Automated Driving by Brand, Reliability Information and Introductionary Videos: An Online Study.” Proceedings of the 10th International Conference on Automotive User Interfaces and Interactive Vehicular Applications, 118–128.
  • Friesen, D., C. Borst, M. Pavel, P. Masarati, and M. Mulder. 2021. “Design and Evaluation of a Constraint-Based Helicopter Display to Support Safe Path Planning.” Proceedings of the Nitros Safety Workshop, 9–11.
  • Funk, K., B. Lyall, J. Wilson, R. Vint, M. Niemczyk, C. Suroteguh, and G. Owen. 1999. “Flight Deck Automation Issues.” The International Journal of Aviation Psychology 9 (2): 109–123. doi:10.1207/s15327108ijap0902_2.
  • Garcia, K. R., S. Mishler, Y. Xiao, C. Wang, B. Hu, J. D. Still, and J. Chen. 2022. “Drivers’ Understanding of Artificial Intelligence in Automated Driving Systems: A Study of a Malicious Stop Sign.” Journal of Cognitive Engineering and Decision Making 16 (4): 237–251. doi:10.1177/15553434221117001.
  • Gawron, V. 2019. Automation in Aviation Accidents: Accident Analyses. McLean, VA: MITRE Corporation.
  • Gianfrancesco, M. A., S. Tamang, J. Yazdany, and G. Schmajuk. 2018. “Potential Biases in Machine Learning Algorithms Using Electronic Health Record Data.” JAMA Internal Medicine 178 (11): 1544–1547. doi:10.1001/jamainternmed.2018.3763.
  • Gillespie, N., S. Lockey, C. Curtis, J. Pool, and A. Akbari. 2023. Trust in Artificial Intelligence: A Global Study. The University of Queensland and KPMG Australia.
  • Hergeth, S., L. Lorenz, R. Vilimek, and J. F. Krems. 2016. “Keep Your Scanners Peeled: Gaze Behavior as a Measure of Automation Trust during Highly Automated Driving.” Human Factors 58 (3): 509–519. doi:10.1177/0018720815625744.
  • Howard, A. 2020. “Are we Trusting AI Too Much? Examining Human-Robot Interactions in the Real World.” Proceedings of the 2020 ACM/IEEE International Conference on Human-Robot Interaction, 1–1.
  • Howard, A., and J. Borenstein. 2018. “The Ugly Truth about Ourselves and Our Robot Creations: The Problem of Bias and Social Inequity.” Science and Engineering Ethics 24 (5): 1521–1536. doi:10.1007/s11948-017-9975-2.
  • Huang, S. H., and M. R. Endsley. 1997. “Providing Understanding of the Behavior of Feedforward Neural Networks.” IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics) 27 (3): 465–474. doi:10.1109/3477.584953.
  • Hussein, E., P. Juneja, and T. Mitra. 2020. “Measuring Misinformation in Video Search Platforms: An Audit Study on YouTube.” Proceedings of the ACM on Human-Computer Interaction 4 (CSCW1): 1–27. doi:10.1145/3392854.
  • Jacobson, S. 2010. “Aircraft Loss of Control Causal Factors and Mitigation Challenges.” Proceedings of the AIAA Guidance, Navigation, and Control Conference. doi:10.2514/6.2010-8007.
  • Kaplan, A. D., T. T. Kessler, J. C. Brill, and P. Hancock. 2023. “Trust in Artificial Intelligence: Meta-Analytic Findings.” Human Factors 65 (2): 337–359. doi:10.1177/00187208211013988.
  • Kazhdan, D., B. Dimanov, L. C. Magister, P. Barbiero, M. Jamnik, and P. Lio. 2023. GCI: A (G)raph (C)oncept (I)nterpretation Framework. arXiv preprint arXiv:2302.04899.
  • Kelley, P. G., Y. Yang, C. Heldreth, C. Moessner, A. Sedley, A. Kramm, … A. Woodruff. 2021. “Exciting, Useful, Worrying, Futuristic: Public Perception of Artificial Intelligence in 8 Countries.” Proceedings of the 2021 AAAI/ACM Conference on AI, Ethics, and Society, 627–637. doi:10.1145/3461702.3462605.
  • Kiyasseh, Dani, Jasper Laca, Taseen F. Haque, Maxwell Otiato, Brian J. Miles, Christian Wagner, Daniel A. Donoho, Quoc-Dien Trinh, Animashree Anandkumar, and Andrew J. Hung. 2023. “Human Visual Explanations Mitigate Bias in AI-Based Assessment of Surgeon Skills.” NPJ Digital Medicine 6 (1): 54. doi:10.1038/s41746-023-00766-2.
  • Kounev, S., P. Lewis, K. L. Bellman, N. Bencomo, J. Camara, A. Diaconescu, … S. Götz. 2017. The Notion of Self-Aware Computing. Self-Aware Computing Systems.
  • Liang, C., J. Proft, E. Andersen, and R. A. Knepper. 2019. “Implicit Communication of Actionable Information in Human-AI Teams.” Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems, 1–13. doi:10.1145/3290605.3300325.
  • Liu, P., Q. Fei, J. Liu, and J. Wang. 2021. “Naming is Framing: The Framing Effect of Technology Name on Public Attitude toward Automated Vehicles.” Public Understanding of Science (Bristol, England) 30 (6): 691–707. doi:10.1177/0963662520987806.
  • Ma, R., and D. Kaber. 2005. “Situation Awareness and Workload in Driving While Using Adaptive Cruise Control and a Cell Phone.” International Journal of Industrial Ergonomics 35 (10): 939–953. doi:10.1016/j.ergon.2005.04.002.
  • Manktelow, K., and J. Jones. 1987. “Principles from the Psychology of Thinking and Mental Models.” In Applying Cognitive Psychology to User-Interface Design, edited by M. M. Gardiner and B. Christie, 83–117. Chichester: Wiley and Sons.
  • Mazijn, C., C. Prunkl, A. Algaba, J. Danckaert, and V. Ginis. 2022. LUCID: Exposing Algorithmic Bias through Inverse Design. arXiv preprint arXiv:2208.12786.
  • McClumpha, A., and M. James. 1994. “Understanding Automated Aircraft.” In Human Performance in Automated Systems: Current Research and Trends, edited by M. Mouloua & R. Parasuraman, 183–190. Hillsdale, NJ: LEA.
  • McNeese, N. J., M. Demir, N. J. Cooke, and C. Myers. 2018. “Teaming with a Synthetic Teammate: Insights into Human-Autonomy Teaming.” Human Factors 60 (2): 262–273. doi:10.1177/0018720817743223.
  • Mogford, R. H. 1997. “Mental Models and Situation Awareness in Air Traffic Control.” The International Journal of Aviation Psychology 7 (4): 331–341. doi:10.1207/s15327108ijap0704_5.
  • Mortimer, K. 2017. “Understanding Conspiracy Online: Social Media and the Spread of Suspicious Thinking.” Dalhousie Journal of Interdisciplinary Management 13 (1): 1–16 doi:10.5931/djim.v13i1.6928.
  • Mumaw, R. J., N. Sarter, and C. D. Wickens. 2001. “Analysis of Pilots’ Monitoring and Performance on an Automated Flight Deck.” Proceedings of the 11th International Symposium on Aviation Psychology, Columbus, OH.
  • National Academies of Sciences Engineering and Medicine 2021. Human-AI Teaming: State-of-the-Art and Research Needs. Washington, DC: National Academies Press.
  • National Academy of Engineering and National Research Council 2012. Macondo Well Deepwater Horizon Blowout: Lessons for Improving Offshore Drilling Safety. Washington, DC: National Academies Press.
  • National Transportation Safety Board. 2010. Aviation Accident Report: Loss of Control on Approach, Colgan Air, Inc., Operating as Continental Connection Flight 3407, Bombardier DHC-8-400, N200WQ, Clarence Center, New York, February 12, 2009 (Tech. Rep. No. NTSB/AAR-10/01, PB2010-910401).
  • Nordhoff, S., J. D. Lee, S. C. Calvert, S. Berge, M. Hagenzieker, and R. Happee. 2023. “(Mis-)Use of Standard Autopilot and Full Self-Driving (FSD) Beta: Results from Interviews with Users of Tesla’s FSD Beta.” Frontiers in Psychology 14: 1101520. doi:10.3389/fpsyg.2023.1101520.
  • Northcutt, C. G., A. Athalye, and J. Mueller. 2021. Pervasive Label Errors in Test Sets Destabilize Machine Learning Benchmarks. arXiv preprint arXiv:2103.14749.
  • Oduor, K. F., and E. N. Wiebe. 2008. “The Effects of Automated Decision Algorithm Modality and Transparency on Reported Trust and Task Performance.” Proceedings of the Human Factors and Ergonomics Society Annual Meeting, 302–306. Los Angeles, CA: Sage. doi:10.1177/154193120805200422.
  • Osoba, O. A., and W. Welser IV. 2017. An Intelligence in Our Image: The Risks of Bias and Errors in Artificial Intelligence. RAND Corporation.
  • Paleja, R., M. Ghuy, N. Ranawaka Arachchige, R. Jensen, and M. Gombolay. 2021. “The Utility of Explainable AI in Ad Hoc Human-Machine Teaming.” Advances in Neural Information Processing Systems 34: 610–623.
  • Paranjape, K., M. Schinkel, R. N. Panday, J. Car, and P. Nanayakkara. 2019. “Introducing Artificial Intelligence Training in Medical Education.” JMIR Medical Education 5 (2): e16048. doi:10.2196/16048.
  • Pearl, J. 2009. Causality. Cambridge: Cambridge University Press.
  • Pearl, J., and D. Mackenzie. 2018. The Book of Why: The New Science of Cause and Effect. New York: Basic Books.
  • Pelivani, E., and B. Cico. 2021. “Toward Self-Aware Machines: Insights of Causal Reasoning in Artificial Intelligence.” Proceedings of the 2021 International Conference on Information Technologies (Infotech), 1–4. IEEE. doi:10.1109/Infotech52438.2021.9548511.
  • Phillips, E., S. Ososky, J. Grove, and F. Jentsch. 2011. “From Tools to Teammates: Toward the Development of Appropriate Mental Models for Intelligent Robots.” Proceedings of the Human Factors and Ergonomics Society Annual Meeting, 1491–1495. Los Angeles, CA: SAGE. doi:10.1177/1071181311551310.
  • Rathore, M. M., S. A. Shah, D. Shukla, E. Bentafat, and S. Bakiras. 2021. “The Role of AI, Machine Learning, and Big Data in Digital Twinning: A Systematic Literature Review, Challenges, and Opportunities.” IEEE Access 9: 32030–32052. doi:10.1109/ACCESS.2021.3060863.
  • Rouse, W. B., and N. M. Morris. 1985. On Looking into the Black Box: Prospects and Limits in the Search for Mental Models (DTIC #AD-A159080). Atlanta, GA: Center for Man-Machine Systems Research, Georgia Institute of Technology.
  • Rudolph, J., S. Tan, and S. Tan. 2023. “ChatGPT: Bullshit Spewer or the End of Traditional Assessments in Higher Education?” Journal of Applied Learning and Teaching 6 (1).
  • Sallam, M. 2023. “ChatGPT Utility in Healthcare Education, Research, and Practice: Systematic Review on the Promising Perspectives and Valid Concerns.” Healthcare 11 (6): 887. doi:10.3390/healthcare11060887.
  • Samek, W., G. Montavon, A. Vedaldi, L. K. Hansen, and K.-R. Müller. 2019. Explainable AI: Interpreting, Explaining and Visualizing Deep Learning (Vol. 11700). Springer Nature.
  • Samek, W., and K.-R. Müller. 2019. “Towards Explainable Artificial Intelligence.” In Explainable AI: Interpreting, Explaining and Visualizing Deep Learning, 5–22. Springer Nature.
  • Sanneman, L., and J. A. Shah. 2022. “The Situation Awareness Framework for Explainable AI (SAFE-AI) and Human Factors Considerations for XAI Systems.” International Journal of Human–Computer Interaction 38 (18-20): 1772–1788. doi:10.1080/10447318.2022.2081282.
  • Saranti, Anna, Miroslav Hudec, Erika Mináriková, Zdenko Takáč, Udo Großschedl, Christoph Koch, Bastian Pfeifer, Alessa Angerschmid, and Andreas Holzinger. 2022. “Actionable Explainable AI (AxAI): a Practical Example with Aggregation Functions for Adaptive Classification and Textual Explanations for Interpretable Machine Learning.” Machine Learning and Knowledge Extraction 4 (4): 924–953. doi:10.3390/make4040047.
  • Schölkopf, B., F. Locatello, S. Bauer, N. R. Ke, N. Kalchbrenner, A. Goyal, and Y. Bengio. 2021. “Toward Causal Representation Learning.” Proceedings of the IEEE 109 (5): 612–634. doi:10.1109/JPROC.2021.3058954.
  • Sebok, A., and C. D. Wickens. 2017. “Implementing Lumberjacks and Black Swans into Model-Based Tools to Support Human–Automation Interaction.” Human Factors 59 (2): 189–203. doi:10.1177/0018720816665201.
  • Selcon, S. J. 1990. “Decision Support in the Cockpit: Probably a Good Thing?” Proceedings of the Human Factors Society 34th Annual Meeting, 46–50. Santa Monica, CA: Human Factors Society. doi:10.1177/154193129003400111.
  • Sethumadhavan, A. 2009. “Effects of Automation Types on Air Traffic Controller Situation Awareness and Performance.” Proceedings of the Human Factors and Ergonomics Society 53rd Annual Meeting, 1–5. Santa Monica, CA: Human Factors and Ergonomics Society. doi:10.1177/154193120905300102.
  • Shin, D. 2021. “The Effects of Explainability and Causability on Perception, Trust, and Acceptance: Implications for Explainable AI.” International Journal of Human-Computer Studies 146: 102551. doi:10.1016/j.ijhcs.2020.102551.
  • Silva, S. S., and R. J. Hansman. 2015. “Divergence between Flight Crew Mental Model and Aircraft System State in Auto-Throttle Mode Confusion Accident and Incident Cases.” Journal of Cognitive Engineering and Decision Making 9 (4): 312–328. doi:10.1177/1555343415597344.
  • Silver, David, Julian Schrittwieser, Karen Simonyan, Ioannis Antonoglou, Aja Huang, Arthur Guez, Thomas Hubert, Lucas Baker, Matthew Lai, Adrian Bolton, Yutian Chen, Timothy Lillicrap, Fan Hui, Laurent Sifre, George van den Driessche, Thore Graepel, and Demis Hassabis. 2017. “Mastering the Game of Go without Human Knowledge.” Nature 550 (7676): 354–359. doi:10.1038/nature24270.
  • Sinha, S., L. Burd, and J. DuPreez. 2023. “How ChatGPT Could Revolutionize Academia.” IEEE Spectrum, February 22. https://spectrum.ieee.org/how-chatgpt-could-revolutionize-academia
  • Srivastava, B., and F. Rossi. 2019. “Rating AI Systems for Bias to Promote Trustable Applications.” IBM Journal of Research and Development 63 (4/5): 5:1–5:9. doi:10.1147/JRD.2019.2935966.
  • Stanton, B., and T. Jensen. 2021. Trust and Artificial Intelligence (Draft NIST-8332). Washington, DC: NIST.
  • Strauch, B. 2017. “The Automation-by-Expertise-by-Training Interaction: Why Automation-Related Accidents Continue to Occur in Sociotechnical Systems.” Human Factors 59 (2): 204–228. doi:10.1177/0018720816665459.
  • Teoh, E. R. 2020. “What’s in a Name? Drivers’ Perceptions of the Use of Five SAE Level 2 Driving Automation Systems.” Journal of Safety Research 72: 145–151. doi:10.1016/j.jsr.2019.11.005.
  • U.S. Air Force. 2015. Autonomous Horizons. Washington, DC: United States Air Force Office of the Chief Scientist.
  • U.S.-Canada Power System Outage Task Force. 2004. Final Report on the August 14, 2003 Blackout in the United States and Canada: Causes and Recommendations. http://certs.lbl.gov/pdf/b-f-web-part1.pdf
  • Vakil, S. S., and R. J. Hansman. 1997. “Predictability as a Metric of Automation Complexity.” Proceedings of the Human Factors and Ergonomics Society 41st Annual Meeting, 70–74. Santa Monica, CA: Human Factors and Ergonomics Society. doi:10.1177/107118139704100118.
  • van de Merwe, K., S. Mallam, and S. Nazir. 2022. “Agent Transparency, Situation Awareness, Mental Workload, and Operator Performance: A Systematic Literature Review.” Human Factors. doi:10.1177/00187208221077804.
  • Volz, K., E. Yang, R. Dudley, E. Lynch, M. Dropps, and M. C. Dorneich. 2016. “An Evaluation of Cognitive Skill Degradation in Information Automation.” Proceedings of the Human Factors and Ergonomics Society Annual Meeting, 191–195. Los Angeles, CA: SAGE. doi:10.1177/1541931213601043.
  • West, S. M., M. Whittaker, and K. Crawford. 2019. Discriminating Systems. AI Now.
  • Wickens, C. D., W. S. Helton, J. G. Hollands, and S. Banbury. 2022. Engineering Psychology and Human Performance (5th ed.). New York: Routledge.
  • Wiener, E. L., and R. E. Curry. 1980. “Flight Deck Automation: Promises and Problems.” Ergonomics 23 (10): 995–1011. doi:10.1080/00140138008924809.
  • Wilson, J. R., and A. Rutherford. 1989. “Mental Models: Theory and Application in Human Factors.” Human Factors: The Journal of the Human Factors and Ergonomics Society 31 (6): 617–634. doi:10.1177/001872088903100601.
  • Woods, D. D., and N. B. Sarter. 2000. “Learning from Automation Surprises and Going Sour Accidents.” In Cognitive Engineering in the Aviation Domain, 327–353.
