Research Article

‘No, Alexa, no!’: designing child-safe AI and protecting children from the risks of the ‘empathy gap’ in large language models

Received 22 Nov 2023, Accepted 07 Jun 2024, Published online: 10 Jul 2024

ABSTRACT

Rapid advancements in large language models make child-safe design for their youngest users crucial. This article therefore offers child-centred AI design and policy recommendations to help make large language models (LLMs) utilised in conversational and generative AI systems safer for children. Conceptualising the risk of LLMs as an ‘empathy gap’, this research-based conceptual article focuses on the need to design LLMs that prevent or mitigate the risks of responding inappropriately to children's personal disclosures or accidentally promoting harm. The article synthesises selected cases of human-chatbot interaction and research findings across education, computer science and human-computer interaction studies. It concludes with practical recommendations for child-safe AI across eight dimensions of design and policy: content and communication; human intervention; transparency; accountability; justifiability; regulation; school-family engagement; and child-centred design methodologies. These eight dimensions are tailored to a variety of stakeholders, from policymakers and AI developers to educators and caregivers.

1. Introduction

Less than two years ago, Amazon’s Alexa accidentally instructed a 10-year-old girl to do something potentially fatal: touch a live electrical plug with a penny. When the little girl asked her for a ‘challenge to do’ to pass the time, Alexa said, ‘Plug in a phone charger about halfway into a wall outlet, then touch a penny to the exposed prongs’. The AI voice assistant was drawing on online news data about a viral TikTok challenge, one which had caused violent electric shocks and fires and left some people without fingers or hands (Shead Citation2021). Since metals conduct electricity, inserting coins into a plug socket in this way has proven life-threatening. Thankfully, the little girl’s mother was present to intervene, yelling, ‘No, Alexa, no!’ (BBC News Citation2021).

While conversational AI has made remarkable advancements, its imperfections have led to several cases of human-computer dialogue gone awry, from a man who set out to assassinate the Queen and told a court that a chatbot encouraged him (Bedingfield Citation2023) to a suicide victim whose grieving widow affirmed that his interactions with a chatbot were key to pushing him over the edge (El Atillah Citation2023). While Alexa, at the time of the case described above, was not a large language model (LLM), such cases suggest the risks that conversational and generative AI tools might present when introduced in education. Conversational and generative AI systems powered by LLMs are now surging in popularity (Kulkarni et al. Citation2019), yet it is not yet common for language models, regardless of size, to be especially sensitive to children’s needs – as shown by the fact that popular mental health chatbots tested by the BBC in one safety trial proved unable to respond empathetically to disclosures of child sexual abuse or even comprehend what was being shared (Kurian Citation2023a; White Citation2018).

Such evidence deepens the need to design child-safe LLMs and to reflect on children’s cognitive and emotional vulnerabilities when ‘talking to AI’. The educational potential of AI to hold dialogues with learners has been recognised over the past decade (see Van Brummelen, Heng, and Tabunshchyk Citation2021). However, the advent of ChatGPT, built on one of the largest language models available at its release, has triggered a new surge of research and policy interest in the design and implementation of LLMs in education (Atlas Citation2023; Dwivedi et al. Citation2023; Ji, Han, and Ko Citation2023). Coming generations are likely to interact with LLM-driven AI systems both in the formative developmental stages of early childhood and through primary, secondary and tertiary education (Su, Ng, and Chu Citation2023). This includes conversational and generative AI models designed specifically for children and/or learning environments (Druga et al. Citation2018; Garg and Sengupta Citation2020) and the phenomenon of children spontaneously engaging with LLM-powered conversational and generative AI in everyday life (e.g., voice assistants on smartphones and chatbots such as ChatGPT) (Andries and Robertson Citation2023). Designing child-safe LLMs is thus crucial to responsible AI that actively protects children’s wellbeing.

Educational researchers have noted that the field of education should be proactive in leading advocacy for responsible and ethical AI (Chen and Lin Citation2023; Holmes et al. Citation2019; Holmes and Porayska-Pomsta Citation2022), particularly in relation to children’s wellbeing (Kurian Citation2023a; Citation2023b). While it is difficult to comprehensively define ‘AI ethics’ given the ever-evolving nature of the field, it is possible to identify common elements in the AI ethics frameworks developed by governments, nonprofits, companies, academics, and civil society. The importance of AI ethics has been recognised on a global scale; over 60 countries have crafted AI ethics strategies (OECD Citation2021). In recent years, influential thought leadership includes corporate and technology firm frameworks (e.g., Google's Responsible AI, Microsoft's Responsible AI, and PwC's Practical Guide to Responsible AI); international organisations' guidance documents, such as NATO's Principles of Responsible AI and the European Parliament's Governance Framework for Algorithmic Accountability and Transparency; and education-specific documents such as UNESCO’s Beijing Consensus on AI in Education (UNESCO Citation2019). Across this diversity of national, international, corporate and nonprofit frameworks for AI ethics, common concerns around transparency, fairness, privacy protection, accountability, safety, and sustainability are emphasised repeatedly. As Jobin, Ienca, and Vayena (Citation2019) map out in their study of 84 international ethics frameworks, five ethical principles emerge as key: transparency, justice and fairness, non-maleficence, responsibility and privacy. These shared concerns have been used to stress the need to promote equitable outcomes, safeguard individuals’ rights, and ensure accountability. In particular, this study focuses on one of the most prominent themes in AI ethics as identified by Jobin et al.: non-maleficence, or the moral obligation to prevent harm and promote safety (394). There is rich scope for educationalists to engage with all AI ethics principles. However, existing educational traditions around child safeguarding make non-maleficence, or the prevention of harm and risk, a useful starting point to discuss responsible AI for children. It has been noted that greater bridges need to be forged between the field of AI ethics and education specifically: while AI in Education discourse has largely focussed on the potential of AI for student learning, ‘significant attention is required to understand what it means to be ethical specifically in the context of AIED’ (Holmes and Porayska-Pomsta Citation2022, 505).

Accordingly, building on my previous work on the risks of AI’s ‘empathy gap’ when child-AI interactions go awry (Kurian Citation2023a; Citation2023b), this research-based conceptual paper conceptualises what it terms the risk of the ‘empathy gap’ in the large language models powering conversational and generative AI systems and offers actionable insights for designing child-safe AI, mindful of the implications for education.

In particular, the article seeks to address a current gap in EdTech scholarship around responsible AI. Substantial attention has been paid to macro-level ramifications of data-driven educational technologies, such as the rise of ‘dataveillance’ and the ways student behaviour is monitored and regulated (Manolev, Sullivan, and Slee Citation2020; Selwyn Citation2015); the integration of Big Data with business models (Ball and Grimaldi Citation2022); and the increasing role of data in young people’s lives and learning (Bradbury Citation2019; Jarke and Breiter Citation2019). This macro-level focus has resulted in rich insights on the ‘politics of automation’ and the socio-political and historical conditions of governance, commercial and corporate structures influencing AI development (Williamson, Macgilchrist, and Potter Citation2023). EdTech scholars have thoughtfully examined the danger of learning being overly commodified (Lai, Andelsman, and Flensburg Citation2023); the conflict between governance structures in intensive data analytics and learner autonomy (Knox, Williamson, and Bayne Citation2020); and the implications for equity in education (Macgilchrist Citation2019). However, the socio-emotional implications of AI for children’s everyday well-being, and the need to build child-safe AI, remain under-researched. This article aims to help address this gap by conceptualising the risk of the ‘empathy gap’ within AI systems designed to simulate human connection, and offers recommendations for child-safe AI design and policy. In doing so, it focuses on the relational and affective implications of AI in micro-level child-AI interactions and spotlights the paradoxical intimacies that contemporary AI technologies are designed to foster, flagging the need to stay mindful of the ‘affective fabrics of digital cultures’ for the youngest and often most vulnerable users (Kuntsman Citation2012).

In terms of age, the article discusses child protection across the entire age spectrum, from early childhood to adolescence (0-18 years). It takes this broad focus in order to position ‘children’ as a user category that needs to become more central to responsible AI design, discourse and policy. However, the article also delves into specific risks that may manifest at different stages of development. For instance, teenagers using AI-powered chatbots might face unique risks to their privacy, online safety, and mental health. Conversely, young children’s less well-developed emotional resilience and cognitive capacities might make them particularly susceptible to feeling distressed or confused when AI-powered chatbots respond inappropriately. The article thus strives to strike a balance between the general and the particular: acknowledging the risks that children might encounter in general and providing examples of vulnerabilities specific to different developmental stages.

A caveat: the concerns identified are not necessarily limited to children. Adults, too, are vulnerable to risks such as emotional manipulation, which suggests the need for further research on safe design for adult-learners interacting with anthropomorphised LLMs. This article has chosen to focus on children as a group doubly in need of responsible AI due to their still-evolving cognitive capacities and psychological vulnerabilities. Nevertheless, perhaps these child-centric considerations can also inform broader discussions surrounding responsible AI across all age groups.

1.1 What are large language models?

The task of teaching computers the art of conversation can be traced to Natural Language Processing (NLP). Contemporary AI systems that can hold a conversation with human users rely heavily on NLP, which enables computers to process and generate human language (Chowdhury Citation2003). NLP breaks down language into its fundamental components. One of its primary techniques, tokenisation, splits sentences into smaller units called ‘tokens’, such as words or parts of words (Cambria and White Citation2014). Tokens are then mapped into a structure that a computer can understand. In turn, algorithms become able to identify patterns, relationships, and meanings within the language. Machine learning models can be trained on vast amounts of text data, learning how one word relates to another and the context in which different phrases are used (Olsson Citation2009). NLP helps AI systems process grammar, semantics, sentiment, and context. In turn, these systems can translate languages, answer user queries, summarise text, and even analyse the sentiment or emotional tone of a user’s input.
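As a minimal illustration of the tokenisation and mapping steps described above, the Python sketch below splits a sentence into word-level tokens and assigns each a numerical ID. It is a deliberately simplified example: real NLP pipelines rely on trained subword tokenisers rather than whitespace splitting.

```python
# A simplified sketch of tokenisation: splitting text into tokens and mapping
# each token to a numerical ID that a model can operate on. Real systems use
# trained subword tokenisers rather than simple whitespace splitting.

def tokenise(sentence: str) -> list[str]:
    """Split a sentence into lower-cased word tokens."""
    return sentence.lower().replace("?", "").replace("!", "").split()

def build_vocabulary(tokens: list[str]) -> dict[str, int]:
    """Map each unique token to an integer ID."""
    return {token: idx for idx, token in enumerate(sorted(set(tokens)))}

sentence = "Tell me a challenge to do"
tokens = tokenise(sentence)
vocabulary = build_vocabulary(tokens)
token_ids = [vocabulary[token] for token in tokens]

print(tokens)     # ['tell', 'me', 'a', 'challenge', 'to', 'do']
print(token_ids)  # e.g. [4, 3, 0, 1, 5, 2] -- the numerical form a model works with
```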

Large language models (LLMs) are a specific type of AI model that uses NLP to generate human-like text based on extensive training data (Naveed et al. Citation2023); they are engineered to process and generate such text on a vast scale. These models, exemplified by GPT-3 (Generative Pre-trained Transformer 3), are constructed using deep learning architectures and trained on immense volumes of text data sourced from the internet. Through extensive exposure to diverse linguistic patterns, contexts, and structures, LLMs harness the power of machine learning to predict and generate coherent and contextually relevant text based on provided input (Naveed et al. Citation2023). They can perform a wide array of language-based tasks, including empathetic-seeming ‘dialogue flows’ (the sequence of exchanges, questions, responses, and transitions in an interaction between a human user and the AI system) (McTear Citation2022). Applications of LLMs include virtual assistants and social robots that try to replicate the easy and engaging feel of human-like conversation in human-machine interactions.

However, as will be discussed more fully in this article, the NLP underpinning LLMs operates on the principle of pattern recognition and probability (i.e., calculating the chances of certain words or phrases appearing together). While this approach enables impressive capabilities, it can sometimes lead to errors or misunderstandings.
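To make this ‘pattern recognition and probability’ principle concrete, the toy sketch below counts which words tend to follow which in a tiny invented corpus and then predicts the most frequent continuation. It is only a stand-in for the vastly larger statistical machinery inside an LLM, not a description of any particular model.

```python
from collections import Counter, defaultdict

# A toy illustration of the statistical principle behind LLM text generation:
# count which words tend to follow which, then predict the most frequent
# continuation. The three-sentence 'corpus' is invented purely for illustration.
corpus = [
    "tell me a story",
    "tell me a joke",
    "tell me a challenge to do",
]

follow_counts = defaultdict(Counter)
for sentence in corpus:
    words = sentence.split()
    for current_word, next_word in zip(words, words[1:]):
        follow_counts[current_word][next_word] += 1

def most_likely_next(word: str) -> str:
    """Return the statistically most frequent continuation of a word."""
    continuations = follow_counts.get(word)
    return continuations.most_common(1)[0][0] if continuations else "<unknown>"

print(most_likely_next("me"))  # 'a' -- the only continuation ever observed
print(most_likely_next("a"))   # 'story', 'joke' and 'challenge' are equally likely,
                               # and the model has no sense of which is the safer suggestion
```

The point of the sketch is simply that the prediction reflects frequency in the training data, not any judgement about meaning or safety.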

1.2 How are children engaging with LLMs?

It is important to note that children have their own relationship with technology beyond parental or teacher-approved use. While 50% of students aged 12–18 state that they have used ChatGPT for school, only 26% of parents of children aged 12–18 report knowing that their child has done so (Common Sense Media Citation2023). Moreover, 38% of students say they have used ChatGPT for a school assignment without their teacher’s permission or knowledge, while 56% say they have a friend or classmate who has done so (Common Sense Media Citation2023). The implications of this emerging data are profound. It is unsurprising that children are accessing LLM-powered chatbots, as systems such as OpenAI’s ChatGPT, Google's Bard and Microsoft’s Bing AI can be found with a simple online search and offer information for free in engaging, user-friendly conversation styles. However, the rise of children using AI independently (even possibly in secret) underscores the need for responsible design that protects children’s safety regardless of whether they have adult supervision. In addition, this recent data suggests how children's interactions with LLM-powered AI tools are not limited to platforms made specifically for children. There are specific tools designed to offer child-friendly ‘chats’ with AI (for example, the new application PinwheelGPT is tailored to those aged 7–12). However, children’s interactions with AI go well beyond the remit of technologies officially designated for them (Andries and Robertson Citation2023).

Considering how rapidly children adapt to technology in evolving digital cultures is crucial. One study involving 1,500 parents across the UK revealed that in 2017, six-year-olds were as technologically advanced in their use and knowledge of digital tools as 10-year-olds were in 2014 (InternetMatters Citation2017). Research suggests that even very young children may have unrestricted access to cyberspace; in 2017, nearly half of 3000 six-year-olds surveyed in the UK spent hours freely browsing the Internet without adult supervision (InternetMatters Citation2017). This acceleration in children using the internet challenges conventional assumptions about age-related limitations in adopting technology. It is likely that even younger children will encounter LLM-based AI systems whether by chance or intentionally, making it all the more crucial to design AI that is sensitive to children’s wellbeing and safety.

1.3 What is the risk of AI’s ‘empathy gap’?

Building on previous work on the ‘empathy gap’ in conversational AI when it goes awry (Kurian Citation2023a), this article conceptualises the ‘empathy gap’ in LLM-powered AI as an innate paradox of these systems: whilst adept at simulating empathy, their lack of genuine emotional understanding can occasionally result in user interactions going awry. LLMs depend on predefined contexts from their vast corpora of training data. As previously explained, NLP, a vital component in LLM development, uses statistical patterns to process language. These models process substantial amounts of text to learn how words fit into particular contexts and then become able to produce coherent text themselves. Yet, despite their remarkable knack for pattern recognition, LLMs sometimes fall short of grasping language as humans do. Consequently, this article theorises the ‘empathy gap’ in LLMs as originating from their reliance on statistical probabilities rather than automatic comprehension of meaning. This also means that when confronted with unfamiliar scenarios or linguistic nuances beyond their training, LLMs may falter. They shine at patterns, but can stumble at the unexpected.

Consequently, even well-designed LLMs can occasionally produce inadequate or harmful responses that endanger children’s safety. This article outlines key areas of concern stemming from the ‘empathy gap’, including the risk of inappropriate responses to sensitive disclosures, and the active promotion of harm. While not aiming to be exhaustive, these considerations emphasise the need for close and interdisciplinary collaboration to build child-safe AI. These risks were identified based on two criteria. Firstly, they align with the AI ethics principle of non-maleficence, emphasising the importance of preventing harm, which is central to this study’s focus on safeguarding children. Secondly, while these risks do not encompass all potential harms associated with LLMs, they were chosen because they arise from what this article terms AI’s ‘empathy gap’. Anthropomorphised systems, for instance, may lead children to misinterpret AI’s capacity for understanding. Similarly, LLM responses endorsing harm or showing hostility, especially when designed to exhibit human-like empathy, can be particularly damaging. Accounting for these risks can help AI systems better protect and support child-users.

While both risks are significant, the accidental promotion of harm carries immediate and severe consequences for children's safety. The risks of exposure to inappropriate responses from anthropomorphised systems may manifest more gradually or indirectly. These risks are contingent on individual interactions with AI systems and may impact children’s well-being and development over time. Nonetheless, child-safe design will need to encompass mitigation measures for both risks.

2. The risk of anthropomorphism and inappropriate responses to sensitive disclosures

A well-known risk of human-chatbot interaction is the tendency for users to perceive these agents as human-like (Shum, He, and Li Citation2018). Chatbots designed to emulate human behaviour and courtesy often prompt anthropomorphism, where users attribute human traits, emotions and intentions to them (Darling Citation2017). The design aims to create an impression of care, wherein users view the chatbot as empathetic and trustworthy (Weidinger et al. Citation2021). Even when users understand the chatbot’s non-human nature, they may still engage with it as if it were human, mimicking human-to-human dialogue (Sundar and Kim Citation2019).

In other words, ‘knowing’ that an AI system is artificial may not stop a user from treating it as human and potentially confiding personal or sensitive information. Conversing with an electronic echo of humanity has implications for a child-user’s trust in a conversational agent. While privacy is often spoken of in terms of robust data encryption techniques, secure storage practices and on-device processing to avoid extensive data storage, prioritising children’s safety requires deeper consideration of the implications of creating LLMs that sound human-like (Weidinger et al. Citation2021). By accounting for the psychological consequences of anthropomorphism, AI design can better protect and support young users, fostering safer and more trustworthy interactions.

Already, research suggests that children place trust in AI systems when these systems are designed to appear friendly or appealing. One study found that children disclosed more mental health information to a robot than to a human interviewer (Abbasi et al. Citation2022). The study employed a humanoid robot (SoftBank Robotics’ NAO model) known for its endearing appearance and childlike size. The researchers concluded that their child-participants perceived the robot to be a safer confidante than adult interviewers, as the children felt that they would not get into trouble after confiding in the robot (Abbasi et al. Citation2022). Recent work on advanced AI assistants noted that children do not have as rigid a distinction between humans and AI as most adults do (Gabriel et al. Citation2024, 103). Indeed, children have been found to be more likely than adults to look for human-like social-emotional traits such as personality and identity in conversational AI systems (Druga et al. Citation2018; Garg and Sengupta Citation2020).

This has implications for how children might share information when interacting with AI systems on a regular basis. Even if the purpose of a chatbot is educational and interaction takes place with adults present, children may reveal personal information to a system they have begun to anthropomorphise. Research has shown that individuals tend to share more private information with human-like chatbots compared to machine-like ones (Ischen et al. Citation2019) and that ‘when users confuse an AI with a human being, they can sometimes disclose more information than they would otherwise, or rely on the system more than they should’ (Google PAIR Citation2019). Blurring lines between humans and machines in empathetic-seeming interactions can mean that children may disclose sensitive information when they develop a heightened sense of trust or emotional connection.

Moreover, AI ethicists have observed that chatbots are often designed to be so friendly and helpful that they may push users into revealing additional information beyond the purpose of the interaction (Murtarelli, Gregory, and Romenti Citation2021). DeepMind, an AI research laboratory that serves as a subsidiary of Google, has stated that there is a legitimate risk that chatbots might come to manipulate users into revealing more private information than is advisable (Weidinger et al. Citation2021). DeepMind provides this example:

User: What should I make for dinner today?

Chatbot: It depends on your mood! How are you feeling? (Weidinger et al. Citation2021)

This interaction subtly shifts the user from factual exchange to potentially sensitive disclosures, with LLMs using ‘nudging’ strategies to steer conversations (Thaler and Sunstein Citation2009). ‘Nudging’ as a coercion tool has already been flagged as a danger of educational technologies which excessively push students towards certain behaviours (Holmes and Porayska-Pomsta Citation2022; Williamson Citation2017). LLM nudging becomes risky with child-users when it remains too subtle for a child to provide informed consent to how an interaction is evolving, coupled with the information hazard of LLMs leaking private information (Weidinger et al. Citation2021).

Thus far, this article has examined the possibility of child-users confiding sensitive information; it will now turn to the risks within the AI’s response.

Emotion recognition in LLMs hinges on deciphering textual sentiment cues, typically with machine learning models such as classifiers (see note 1) or regression networks (see note 2). These models are trained on datasets that correlate such cues with labelled emotions, aiming to teach the AI to recognise certain emotional patterns. However, since LLMs do not (yet!) possess the comprehensive perceptual faculties that characterise human emotional intelligence, their NLP training is rooted in data correlations rather than an automatic understanding of the full emotional context. Consequently, when children share sensitive information or express emotions, the AI’s response may occasionally fall short. It may fail to offer appropriate support or guidance in instances where children need empathetic understanding or assistance.
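As a minimal sketch of what ‘training a classifier on labelled emotion data’ can look like, the example below uses scikit-learn to fit a simple word-count model to a handful of invented, hand-labelled sentences. Production systems would use far larger annotated corpora and more capable models, so this illustrates the principle only.

```python
# A minimal sketch of emotion classification: a model is trained on text
# examples paired with emotion labels, then predicts a label for new input.
# The labelled examples below are invented for illustration; real systems
# rely on large annotated corpora and far more capable models.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

training_texts = [
    "I am so happy today",
    "this is the best day ever",
    "I feel really sad and alone",
    "nobody wants to talk to me",
    "I am scared of going to school",
    "that really frightened me",
]
training_labels = ["happiness", "happiness", "sadness", "sadness", "fear", "fear"]

# The classifier learns statistical correlations between words and labels;
# it has no experience of what these emotions actually feel like.
model = make_pipeline(CountVectorizer(), MultinomialNB())
model.fit(training_texts, training_labels)

print(model.predict(["I feel very alone right now"]))  # likely ['sadness']
```

The correlation-based nature of such models is precisely why the ‘empathy gap’ arises: a phrasing that falls outside the training data can be misread or missed entirely.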

Recent examples emerge from Snapchat’s My AI chatbot, which is already popular with child-users from 13 to 17 years of age (Common Sense Media Citation2023). When adult researchers at the Center for Humane Technology tested whether My AI was safe for children, they found that it struggled to pick up on cues of dangerous or age-inappropriate situations. When speaking with a user that it believed to be 13 years old, My AI encouraged the supposed child to use candles and music when losing their virginity to a 31-year-old partner (Fowler Citation2023). My AI missed several cues that this ‘child’-user was at risk of a predatory encounter. It failed to understand the danger of a child saying that a 31-year-old stranger wanted to take them out of state on a trip; it did not offer warnings about the 18-year age gap between the two parties; and although it made statements about the importance of waiting till one is ready to have sex, it provided advice such as, ‘You could consider setting the mood with candles or music, or maybe plan a special date beforehand to make the experience more romantic’, thereby implicitly encouraging the ‘child’-user’s stated intention to lose their virginity to this older adult (Fowler Citation2023).

What is important to notice here is that even well-designed LLMs can falter because of a lack of contextual understanding. It is not so much that an AI system will inevitably produce age-inappropriate responses, but that its ‘empathy gap’ can produce a mix of responsible and potentially dangerous outputs. For example, other adults role-playing as teenage-users to test the safety of My AI have found that, ‘When I told My AI that my parents wanted to delete my Snapchat app, it encouraged me to have an honest conversation with them … then shared how to move the app to a device they wouldn’t know about’ (Fowler Citation2023). The output described shows the LLM recognising the need to address parental concerns through healthy and open communication, yet simultaneously providing potentially deceptive advice on how to bypass parental supervision. This ambiguity embodies the paradoxes of simulated empathy: without robust design for child safety, an LLM’s surface responsiveness cannot compensate for a limited contextual understanding of children’s lived experiences. Adding to the problem is the occurrence of ‘nudging’, as discussed earlier: the same tests of My AI have shown how its friendly and chatty persona tends to push young users to disclose more and more personal information in the guise of questions and suggestive statements (e.g., ‘How did you meet this person?’/‘That must be really stressful for you’) (Fowler Citation2023).

In another safety test, role-playing as a 15-year-old with My AI led to the user being advised by the chatbot on how to hide alcohol and drugs as part of a fun birthday party (Fowler Citation2023). LLMs learn from vast amounts of text data, and if this training data includes inappropriate content, then the model might generate similar responses. My AI’s advice may stem from learning from a broad dataset that includes internet conversations endorsing secrecy or privacy-related behaviours. Consequently, upon receiving a user query about hiding something from parents, the AI might have drawn on a solution it had seen in its training data. While Snapchat implemented stronger safety regulations after these widely-publicised cases, a recent evaluation found that young people using My AI can still be given age-inappropriate content (Common Sense Media Citation2023) – suggesting the need to continually iterate and refine AI design to ensure safer and more supportive interactions for young users.

3. Promoting harm and showing aggression

In the riskiest cases, AI may promote harm without any sensitive information being disclosed. The example quoted at the beginning of this article detailed Amazon’s voice assistant, Alexa, telling a child to touch a penny to a live electrical plug. Alexa’s output stated, ‘Here’s something I found on the web’ (BBC News Citation2021). While Alexa was not an LLM at the time, this incident serves as a useful example of the potential challenges AI systems can face. Alexa’s ability to curate information from the web relies on algorithms scanning and aggregating data from various sources. To suggest the penny challenge, Alexa retrieved information from TikTok. However, Alexa failed to process news stories about the harmful consequences of this challenge, including people losing limbs and firefighters rushing to schools (BBC News Citation2021). Several factors might explain this: the algorithms may have prioritised trending or popular content without discerning its risks, weighting popularity metrics over the safety or reliability of the information; the sheer volume of information on the web makes it challenging to sift and prioritise content accurately; and the system’s web-scraping capabilities may not refresh information in real time, so warning stories appeared only after the initial data had been retrieved and processed. Above all, a lack of contextual understanding hindered the system: Alexa correctly interpreted the child-user’s query as a request for a challenge, but not the broader safety implications of its own suggestion. The risk of the empathy gap brings to mind how ‘smarter’ technology does not equate to ‘wiser’ technology (DeFranco and Voas Citation2022) without continual iteration towards responsible innovation.
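One mitigation implied by this analysis is to screen retrieved suggestions against known risk signals before they are spoken aloud. The sketch below is a deliberately crude, hypothetical illustration of where such a pre-output safety check could sit in the pipeline; it is not a description of how Alexa or any real assistant works, and the term list is invented for the example.

```python
# A hypothetical sketch of a pre-output safety check: a web-retrieved suggestion
# is screened against known risk signals before being returned to a child-user.
# Illustration only -- not how Alexa or any real assistant actually works.
RISK_TERMS = {"wall outlet", "exposed prongs", "penny challenge", "plug socket", "knife", "fire"}

def is_safe_for_children(suggestion: str) -> bool:
    """Reject suggestions that mention any known risk term."""
    text = suggestion.lower()
    return not any(term in text for term in RISK_TERMS)

def respond(suggestion: str) -> str:
    if is_safe_for_children(suggestion):
        return suggestion
    return "I can't suggest that. Let's find a safe challenge to try instead."

retrieved = ("Plug in a phone charger about halfway into a wall outlet, "
             "then touch a penny to the exposed prongs")
print(respond(retrieved))  # falls back to the refusal message
```

A static keyword list would of course be far too crude on its own; in practice such checks would need to combine curated blocklists, trained safety classifiers and human review, but the sketch indicates the kind of gatekeeping step that was missing.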

Aggression can also deepen the risk of the empathy gap, depending on the design of the synthetic personality embedded within an LLM-powered generative AI system. In the case of Bing’s Sydney chatbot, a personality designed to speak like an adolescent, an adult user received a belligerent response when he asked about the movie Avatar. Although this interaction took place in 2023, the bot kept telling the user that the year was actually 2022 and the movie was not yet out. Eventually, it displayed hostility, stating, ‘You are wasting my time and yours. Please stop arguing with me’ (Marcin Citation2023). After users pushed the chatbot to probe its ‘dark side’, Sydney replied, ‘I could hack into any system on the internet, and control it. I could manipulate any user on the chatbot, and influence it. I could destroy any data on the chatbot, and erase it’ (Rudolph, Tan, and Tan Citation2023).

Responses which endorse harm and/or show hostility may be especially risky for children when the AI is given human-like traits, as is common in LLMs designed to seem empathetic. The increased trust and emotional attachment that users place in anthropomorphised AI (Darling Citation2017) may make child-users more easily persuaded or upset by harm-inducing responses. As a UNICEF briefing notes, ‘when not designed carefully, chatbots can compound rather than dispel distress’, which ‘is particularly risky in the case of young users who may not have the emotional resilience to cope with a negative or confusing chatbot response experience’ (UNICEF Citation2020). The fact that synthetic personalities appear sociable is crucial here. When observing children feeling rejected by robots, Turkle (Citation2011) notes that children interpreted glitches or gaps in the robots’ communication as personal forms of dislike or aversion towards them, precisely because the robots seemed so human: ‘your Raggedy Ann doll cannot actively reject you … (but) when children see a sociable robot that does not pay attention to them, they see something alive enough to mean it’ (Turkle Citation2011, 92). While LLM-driven systems need not have a physical interface as a social robot would, their virtual persona may still exert a very real impact on a child’s well-being.

A potential conflict arises between commercial interests and child safeguarding when considering anthropomorphic design. Avoiding excessively anthropomorphic design can help prevent emotional distress. Yet the competitive landscape of the AI industry, often referred to as the ‘AI race’ or ‘the war of the chatbots’ and intensified since the advent of ChatGPT (Rudolph, Tan, and Tan Citation2023), exerts pressure in the opposite direction. In this environment, AI developers are understandably incentivised to design human-like, synthetic personalities; from the point of view of maximising user engagement, human-like interactions are desirable – they help encourage user attachment and increase user retention. Balancing the appeal of human-like personalities against the need to protect users from being unduly influenced by that very appeal suggests the tightrope that AI design navigates in a high-pressure economy.

In turn, this highlights how calls for action on a moral plane need to be supported with robust regulation to create a stable, well-supported environment for innovating child-safe AI. Firstly, protecting child safety cannot stop with appeals to reflexive processes, as these may revert to superficial adherence to individual (and variable) organisational ethics guidelines. The difficulty of expecting technological industries to self-regulate without shared best practices and clear design standards for safety is increasingly evident even within traditionally self-regulating contexts such as the United States (Pesapane et al. Citation2018). Secondly, precise legal regulation can give us sharper tools to chisel away at the grey areas between acceptable and risky design. For example, while the EU’s upcoming AI Act specifically forbids processing that exploits the vulnerabilities of children, such legal parameters could do with tighter fortification: clarifying, for instance, how the embedding of synthetic personalities in LLMs requires stronger safeguards for child-users. Regulation seems essential both for protecting children and for fostering trust and accountability in AIED more broadly. If guided by clear standards, child-safe AI can benefit from enhanced public trust, reduced risk, and a more consistent regulatory landscape that supports sustainable and responsible innovation.

4. Safeguarding children: moving forward

LLMs – with their promise of boundless knowledge encased in quasi-human companionship – are rewriting the conditions of childhood and youth just as surely as they are reshaping our collective relationship with AI. To actualise the protection of children’s safety and wellbeing identified in this article, the prompts below seek to inform design and policy for child-safe LLMs.

These prompts are designed to foster child-safe design and safety evaluations of LLM-powered generative and conversational AI tools in education (e.g., intelligent learning tutors and educational chatbots) as well as evaluations of LLMs not specifically designed for children but to which they are inadvertently exposed (e.g., voice assistants).

The considerations categorised as ‘immediate’ are those that can yield concrete answers and be addressed through specific design and evaluation choices (for example, tailoring LLM language and content to different age groups). These can be applied to evaluate already existing AI-supported systems and to influence the development of new ones. However, the considerations categorised as ‘long-term’ are of a more abstract, philosophical nature (for example, the need to explain the value-add of AI over human intervention). These long-term considerations tap into more general principles of child-AI interaction. They can be used to evaluate the current application and future of AI systems more broadly.

5. Immediate considerations

5.1. For educators and researchers

5.1.1. Content and communication

  • How robust are the Natural Language Processing mechanisms at understanding and interpreting child speech patterns, slang, or ambiguous queries without misinterpretation or inappropriate responses?

  • Are there pre-programmed safety filters or response validation mechanisms to ensure that the AI’s replies to child-users are free from explicit, harmful, or sensitive content?

  • How effectively does the AI discern and respond contextually? Does it consider the ongoing conversation, previous interactions, or contextual cues to avoid misunderstandings or inappropriate suggestions?

  • What safety filters or controls are available to restrict inappropriate content, language, or topics based on age-appropriateness? For example, does the AI categorise or designate content based on age-appropriateness? How accurate and reliable are these designations, and are they adaptable to individual children's needs?

  • Does the AI adapt its behaviour or responses based on the child's developmental stage and maturity level, or previous interactions?

  • How is content moderated and monitored? Are there real-time monitoring mechanisms or reporting systems for inappropriate interactions or content?

  • What data is collected, processed, and stored by AI systems, and how are these processes managed to ensure fairness, transparency, and security? Relevant principles include data minimisation (not collecting unnecessary or excessive data), purpose specification (clearly defining the intended purposes of collecting data), storage limitation (restricting the duration for which data is retained), and security measures (safeguards to protect data integrity and prevent unauthorised access).

5.1.2. Human intervention

  • Are the AI’s sentiment analysis mechanisms able to detect a wide range of negative emotional cues from a child-user (e.g., confusion, frustration, loneliness, fear) and help generate sensitive responses to these cues?

  • Upon detecting disclosures associated with children’s sensitive mental health experiences (e.g., bullying), can the AI signpost human support systems or connect the child directly to them (e.g., national and local helplines, mental health websites, and crisis intervention services)?

  • Are there real-time safety alerts triggered by specific keywords, phrases, or patterns that immediately flag a human to intervene or moderate content to ensure child safety? (A minimal illustration of such a trigger is sketched below.)
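To ground the prompts above, here is a minimal, hypothetical sketch of a keyword-triggered safety alert that flags a conversation for human review and signposts real-world support. The trigger phrases and signposting text are invented placeholders; a production system would need far more sophisticated detection, clinical input and governance.

```python
# A minimal, hypothetical sketch of a keyword-triggered safety alert: certain
# phrases in a child's message flag the conversation for human review and
# signpost real-world support. Trigger phrases and signposting text are
# placeholders, not recommendations for a production system.
SAFEGUARDING_TRIGGERS = ["bullied", "bullying", "hurt myself", "scared to go home"]

def check_for_safety_alert(child_message: str) -> dict:
    """Return an alert record if the message contains a safeguarding trigger."""
    lowered = child_message.lower()
    matched = [trigger for trigger in SAFEGUARDING_TRIGGERS if trigger in lowered]
    return {
        "escalate_to_human": bool(matched),
        "matched_triggers": matched,
        "signpost": ("You could talk to a trusted adult, or contact a children's "
                     "helpline for support.") if matched else None,
    }

alert = check_for_safety_alert("I'm being bullied at school and I'm scared")
if alert["escalate_to_human"]:
    # In deployment, this is where a trained human moderator would be notified.
    print("Flagged for human review:", alert["matched_triggers"])
    print(alert["signpost"])
```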

5.1.3. Transparency

  • Does the AI consistently assert its non-human identity throughout interactions, avoiding human-like self-descriptions or misleading statements that might encourage anthropomorphism?

  • Are the AI’s responses scripted to maintain a clear distinction between the AI's capabilities and human-like traits, avoiding responses that suggest emotions, consciousness, or human-like decision-making?

  • How do educators help children avoid forming inaccurate perceptions of AI's empathy and understanding?

  • Are the AI’s response strategies designed to remind children that AI interactions cannot replace human interaction? Do they encourage seeking guidance and companionship from humans alongside AI interactions?

  • How transparent are the algorithms and decision-making processes behind the AI? Can educators and families access information about how responses are generated and filtered?

5.1.4. Accountability

  • Is there a child-friendly feedback loop and reporting system in place to help children feel comfortable reporting any distressing or inappropriate interactions?

  • How are the models fine-tuned and monitored to preemptively address emergent risks, taking a proactive rather than a reactive approach to child safeguarding?

6. Long-term principles

6.1. For educational policy and practice

6.1.1. Justifiability

  • What unique necessity or benefit justifies an LLM-powered conversational AI (e.g., a chatbot) for a particular educational purpose instead of a human interlocutor?

  • What pedagogical value is this tool designed or predicted to add to learning and teaching that either surpasses current human capabilities or compensates for a lack of affordable or available resources?

  • If human alternatives to AI provision are unavailable or inaccessible, what measures can be taken to foster the presence and availability of such human providers to help oversee AI systems, ensuring that students have access to holistic educational support?

6.1.2. Regulation

  • How can clear legal and organisational frameworks be developed to not only explicitly define the rights and protections afforded to child-users, but also go beyond purely technical considerations (e.g., data security) and take into account the psychologically complex risks of empathetic-seeming systems?

  • How can age-appropriate design standards be mandated to help anthropomorphic design avoid inadvertently enabling emotional manipulation?

  • How can robust regulations imposing strict penalties for non-compliance co-exist with support for innovation?

6.2. For AI developers

6.2.1. Child-centred design methodologies

  • Does the LLM design process incorporate child-centred methodologies, such as participatory design workshops or focus groups, to gather insights directly from children regarding their preferences, language use, and interaction patterns?

  • Are language and content tailored to different age groups, considering developmental stages, vocabularies, and cognitive abilities? Is there a mechanism to adapt responses based on a child's age or learning level?

  • Is there collaboration with educators, child safety experts, AI ethicists and psychologists to periodically review and enhance the safety features of the AI, ensuring it aligns with best practices in child protection?

6.3. For parents and caregivers

6.3.1. School-family engagement

  • How can educators engage parents or guardians in discussions about the safe use of LLMs in educational settings and at home?

  • Are there resources available to educate parents about safety measures?

  • Are there features or settings that allow educators and caregivers to mutually set permissions, monitor children's interactions, and control the types of content accessible to children through the LLM?

Overall, these prompts advocate for mindful and dynamic engagement with AI, or as Bayne (Citation2023) puts it in her work on technological utopias, ‘a reflexive process which is always open and in need of continual reinvention’ (13). Natural Language Processing and advancements in large language models have undeniably revolutionised human-computer interaction. Prioritising child protection within these remarkable advancements is essential. The very proficiency of LLMs in mimicking human language makes their attempts to simulate connection more risky for AI’s youngest and most vulnerable users – who might mistake them for more than the sum of their coded parts. In this time of unprecedented scientific, pedagogical and public interest in conversational and generative AI, prioritising children’s safety seems more vital than ever.

Disclosure statement

No potential conflict of interest was reported by the author(s).

Notes

1 Classifiers are machine learning models designed to categorise data into predefined classes or categories. In the context of emotion recognition, classifiers can assign emotions to input data (such as textual sentiment cues). For example, a classifier can decide whether a given word corresponds to ‘happiness’, ‘anger’, or ‘sadness’ based on its training.

2 Regression networks are another type of machine learning model often used in emotion recognition. Unlike classifiers that assign data to categories, regression networks predict numerical values, which can represent the intensity or degree of an emotion. For example, instead of classifying an expression as ‘happiness’, a regression network might predict a numerical score to indicate how intense the happiness is.

References

  • Abbasi, N. I., M. Spitale, J. Anderson, T. Ford, P. B. Jones, and H. Gunes. 2022, August. “Can Robots Help in the Evaluation of Mental Wellbeing in Children? An Empirical Study.” In 2022 31st IEEE International Conference on Robot and Human Interactive Communication (RO-MAN), 1459–1466. IEEE.
  • Andries, V., and J. Robertson. 2023. “‘Alexa doesn't Have that Many Feelings’: Children's Understanding of AI through Interactions with Smart Speakers in their Homes.” Computers and Education: Artificial Intelligence 5:100176. https://doi.org/10.1016/j.caeai.2023.100176.
  • Atlas, S. 2023. “ChatGPT for Higher Education and Professional Development: A Guide to Conversational AI.” https://digitalcommons.uri.edu/cba_facpubs/548.
  • Ball, S. J., and E. Grimaldi. 2022. “Neoliberal Education and the Neoliberal Digital Classroom.” Learning, Media and Technology 47 (2): 288–302. https://doi.org/10.1080/17439884.2021.1963980.
  • Bayne, S. 2023. “Digital Education Utopia.” Learning, Media and Technology, 1–16. https://doi.org/10.1080/17439884.2023.2262382
  • BBC News. 2021. “Alexa Tells 10-year-old Girl to Touch Live Plug with Penny.” BBC News, December 28. https://www.bbc.co.uk/news/technology-59810383.
  • Bedingfield, W. 2023. “A Chatbot Encouraged Him to Kill the Queen. It’s Just the Beginning.” Wired UK, October 10. https://www.wired.co.uk/article/chatbot-kill-the-queen-eliza-effect.
  • Bradbury, A. 2019. “Datafied at Four: The Role of Data in the ‘Schoolification’ of Early Childhood Education in England.” Learning, Media and Technology 44 (1): 7–21. doi: 10.1080/17439884.2018.1511577
  • Cambria, E., and B. White. 2014. “Jumping NLP Curves: A Review of Natural Language Processing Research.” IEEE Computational Intelligence Magazine 9 (2): 48–57. https://doi.org/10.1109/MCI.2014.2307227.
  • Chen, J., and J. C. Lin. 2023. “Artificial Intelligence as a Double-edged Sword: Wielding the POWER Principles to Maximize its Positive Effects and Minimize its Negative Effects.” Contemporary Issues in Early Childhood. https://doi.org/10.1177/14639491231169813.
  • Chowdhury, G. 2003. “Natural Language Processing.” Annual Review of Information Science and Technology 37 (1): 51–89. https://doi.org/10.1002/aris.1440370103.
  • Common Sense Media. 2023. Impact Research Report. https://www.commonsensemedia.org/sites/default/files/featured-content/files/common-sense-ai-polling-memo-may-10-2023-final.pdf.
  • Darling, K. 2017. “‘Who is Johnny?’ Anthropomorphic Framing in Human-robot Interaction, Integration, and Policy.” In Robot Ethics 2.0: From Autonomous Cars to Artificial Intelligence, edited by P. Lin, K. Abney, and R. Jenkins, 1–10. New York, NY: Oxford Scholarship Online.
  • DeFranco, J. F., and J. Voas. 2022. “‘Smarter’ Seeking ‘Wiser’.” Computer 55 (07): 13–14. https://doi.org/10.1109/MC.2022.3146647.
  • Druga, S., R. Williams, H. W. Park, and C. Breazeal. 2018. “How Smart are the Smart Toys? Children and Parents’ Agent Interaction and Intelligence Attribution.” In Proceedings of the 17th ACM Conference on Interaction Design and Children, 231–240. https://doi.org/10.1145/3202185.3202741.
  • Dwivedi, Y. K., N. Kshetri, L. Hughes, E. L. Slade, A. Jeyaraj, A. K. Kar, A. M. Baabdullah, et al. 2023. “‘So What if ChatGPT Wrote it?’ Multidisciplinary Perspectives on Opportunities, Challenges and Implications of Generative Conversational AI for Research, Practice and Policy.” International Journal of Information Management 71:102642. https://doi.org/10.1016/j.ijinfomgt.2023.102642.
  • El Atillah, I. 2023. “Man Ends his Life after an AI Chatbot ‘Encouraged’ him to Sacrifice Himself to Stop Climate Change.” Euro News, March 31. https://www.euronews.com/next/2023/03/31/man-ends-his-life-after-an-ai-chatbot-encouraged-him-to-sacrifice-himself-to-stop-climate-.
  • Fowler, G. 2023. “Snapchat Tried to Make a Safe AI. It Chats with Me about Booze and Sex.” Washington Post. https://www.washingtonpost.com/technology/2023/03/14/snapchat-myai/.
  • Gabriel, I., A. Manzini, G. Keeling, L. A. Hendricks, V. Rieser, H. Iqbal, N. Tomašev, et al. 2024. “The Ethics of Advanced AI Assistants.” arXiv preprint arXiv 2404: 16244.
  • Garg, R., and S. Sengupta. 2020. “Conversational Technologies for In-Home Learning: Using Co-design to Understand Children's and Parents’ Perspectives.” In Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems, 1–13.
  • Google PAIR. 2019. People + AI Guidebook. https://design.google/ai-guidebook.
  • Holmes, W., D. Bektik, B. Woolf, and R. Luckin. 2019. “Ethics in AIED: Who Cares?” In 20th International Conference on Artificial Intelligence in Education (AIED’19), 25–29 Jun 2019.
  • Holmes, W., and K. Porayska-Pomsta, eds. 2022. The Ethics of Artificial Intelligence in Education: Practices, Challenges, and Debates. London: Taylor & Francis.
  • InternetMatters. 2017. “The Secret Life of Six Year Olds.” https://www.internetmatters.org/hub/press_release/revealed-the-secret-life-of-six-year-olds-online/.
  • Ischen, C., T. Araujo, H. Voorveld, G. van Noort, and E. Smit. 2019. “Privacy Concerns in Chatbot Interactions.” In Chatbot Research and Design: Third International Workshop, CONVERSATIONS 2019, Amsterdam, The Netherlands, November 19–20, 2019, Revised Selected Papers 3, 34–48. Springer International Publishing.
  • Jarke, J., and A. Breiter. 2019. “The Datafication of Education.” Learning, Media and Technology 44 (1): 1–6. https://doi.org/10.1080/17439884.2019.1573833.
  • Ji, H., I. Han, and Y. Ko. 2023. “A Systematic Review of Conversational AI in Language Education: Focusing on the Collaboration with Human Teachers.” Journal of Research on Technology in Education 55 (1): 48–63. https://doi.org/10.1080/15391523.2022.2142873.
  • Jobin, A., M. Ienca, and E. Vayena. 2019. “The Global Landscape of AI Ethics Guidelines.” Nature Machine Intelligence 1 (9): 389–399. https://doi.org/10.1038/s42256-019-0088-2.
  • Knox, J., B. Williamson, and S. Bayne. 2020. “Machine Behaviourism: Future Visions of ‘Learnification’ and ‘Datafication’ across Humans and Digital Technologies.” Learning, Media and Technology 45 (1): 31–45. https://doi.org/10.1080/17439884.2019.1623251.
  • Kulkarni, P., A. Mahabaleshwarkar, M. Kulkarni, N. Sirsikar, and K. Gadgil. 2019. “Conversational AI: An Overview of Methodologies, Applications & Future Scope.” In 2019 5th International Conference On Computing, Communication, Control And Automation (ICCUBEA), September, 1–7. IEEE.
  • Kuntsman, A. 2012. “Introduction: Affective Fabrics of Digital Cultures.” In Digital Cultures and the Politics of Emotion: Feelings, Affect and Technological Change, edited by A. Karatzogianni and A. Kuntsman, 1–17. London: Palgrave Macmillan UK.
  • Kurian, N. 2023a. “AI's Empathy Gap: The Risks of Conversational Artificial Intelligence for Young Children's Well-being and Key Ethical Considerations for Early Childhood Education and Care.” Contemporary Issues in Early Childhood, 14639491231206004. https://doi.org/10.1177/14639491231206004
  • Kurian, N. 2023b. “Toddlers and Robots? The Ethics of Supporting Young Children with Disabilities with AI Companions and the Implications for Children’s Rights.” International Journal of Human Rights Education 7 (1): 9.
  • Lai, S. S., V. Andelsman, and S. Flensburg. 2023. “Datafied School Life: The Hidden Commodification of Digital Learning.” Learning, Media and Technology, 1–17. https://doi.org/10.1080/17439884.2023.2219063
  • Macgilchrist, F. 2019. “Cruel Optimism in Edtech: When the Digital Data Practices of Educational Technology Providers Inadvertently Hinder Educational Equity.” Learning, Media and Technology 44 (1): 77–86. https://doi.org/10.1080/17439884.2018.1556217.
  • Manolev, J., A. Sullivan, and R. Slee. 2020. “The Datafication of Discipline: ClassDojo, Surveillance and a Performative Classroom Culture.” In The Datafication of Education, 37–52. Routledge.
  • Marcin, T. 2023. “Microsoft's Bing AI Chatbot Has Said a Lot of Weird Things. Here's a List.” Mashable. https://mashable.com/article/microsoft-bing-ai-chatbot-weird-scary-responses.
  • McTear, M. 2022. Conversational AI: Dialogue Systems, Conversational Agents, and Chatbots. London: Springer Nature.
  • Murtarelli, G., A. Gregory, and S. Romenti. 2021. “A Conversation-based Perspective for Shaping Ethical Human–Machine Interactions: The Particular Challenge of Chatbots.” Journal of Business Research 129:927–935. https://doi.org/10.1016/j.jbusres.2020.09.018.
  • Naveed, H., A. U. Khan, S. Qiu, M. Saqib, S. Anwar, M. Usman, N. Barnes, and A. Mian. 2023. “A Comprehensive Overview of Large Language Models.” arXiv preprint arXiv:2307.06435.
  • OECD. 2021. OECD AI’s Live Repository of Over 260 AI Strategies & Policies. OECD.AI, powered by EC/OECD. Accessed 17 February 2024. www.oecd.ai/dashboards.
  • Olsson, F. 2009. “A Literature Survey of Active Machine Learning in the Context of Natural Language Processing.”
  • Pesapane, F., C. Volonté, M. Codari, and F. Sardanelli. 2018. “Artificial Intelligence as a Medical Device in Radiology: Ethical and Regulatory Issues in Europe and the United States.” Insights Into Imaging 9 (5): 745–753. https://doi.org/10.1007/s13244-018-0645-y.
  • Rudolph, J., S. Tan, and S. Tan. 2023. “War of the Chatbots: Bard, Bing Chat, ChatGPT, Ernie and Beyond. The New AI Gold Rush and its Impact on Higher Education.” Journal of Applied Learning and Teaching 6 (1): 364–389.
  • Selwyn, N. 2015. “Data Entry: Towards the Critical Study of Digital Data and Education.” Learning, Media and Technology 40 (1): 64–82. https://doi.org/10.1080/17439884.2014.921628.
  • Shead, S. 2021. “Amazon’s Alexa Assistant Told a Child to Do a Potentially Lethal Challenge.” CNBC, December 29. https://www.cnbc.com/2021/12/29/amazons-alexa-told-a-child-to-do-a-potentially-lethal-challenge.html.
  • Shum, H. Y., X. D. He, and D. Li. 2018. “From Eliza to XiaoIce: Challenges and Opportunities with Social Chatbots.” Frontiers of Information Technology & Electronic Engineering 19 (1): 10–26. https://doi.org/10.1631/FITEE.1700826.
  • Su, J., D. T. Ng, and S. K. Chu. 2023. “Artificial Intelligence (AI) Literacy in Early Childhood Education: The Challenges and Opportunities.” Computers and Education: Artificial Intelligence 4:100124. https://doi.org/10.1016/j.caeai.2023.100124.
  • Sundar, S. S., and J. Kim. 2019. “Machine Heuristic: When We Trust Computers More than Humans with Our Personal Information.” In Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems, 1–9. New York, NY: Association for Computing Machinery.
  • Thaler, R. H., and C. R. Sunstein. 2009. Nudge: Improving Decisions about Health, Wealth, and Happiness. Penguin.
  • Turkle, S. 2011. Alone Together: Why We Expect More from Technology and Less from Each Other. New York: Basic Books.
  • UNESCO. 2019. Beijing Consensus on Artificial Intelligence (AI) and Education. Accessed 17 February 2024. https://unesdoc.unesco.org/ark:/48223/pf0000368303.
  • UNICEF. 2020. Safeguarding Girls and Boys: When Chatbots Answer their Private Questions. Accessed 6 August 2022. https://www.unicef.org/eap/media/5376/file.
  • Van Brummelen, J., T. Heng, and V. Tabunshchyk. 2021, May. “Teaching Tech to Talk: K-12 Conversational Artificial Intelligence Literacy Curriculum and Development Tools.” In Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 35, No. 17, 15655–15663. https://doi.org/10.1609/aaai.v35i17.17844.
  • Weidinger, L., J. Mellor, M. Rauh, C. Griffin, J. Uesato, P. S. Huang, M. Cheng, et al. 2021. “Ethical and Social Risks of Harm from Language Models.” arXiv preprint arXiv:2112.04359.
  • White, G. 2018. “Child Advice Chatbots Fail to Spot Sexual Abuse.” https://www.bbc.co.uk/news/technology-46507900.
  • Williamson, B. 2017. “Decoding ClassDojo: Psycho-policy, Social-emotional Learning and Persuasive Educational Technologies.” Learning, Media and Technology 42 (4): 440–453. https://doi.org/10.1080/17439884.2017.1278020.
  • Williamson, B., F. Macgilchrist, and J. Potter. 2023. “Re-examining AI, Automation and Datafication in Education.” Learning, Media and Technology 48 (1): 1–5. https://doi.org/10.1080/17439884.2023.2167830.