Information & Communications Technology in Education

Artificial intelligence pedagogical chatbots as L2 conversational agents

Article: 2327789 | Received 05 Jan 2024, Accepted 01 Mar 2024, Published online: 18 Mar 2024

Abstract

This paper reports on a mixed-methods study delving into EFL students’ experiences and perspectives on a text-based pedagogical chatbot. Utilizing chatbot-mediated interaction, a questionnaire survey, and focus group discussions, the study centers around the cognitive and affective domains of learning in relation to the chatbot’s affordances and limitations. Additionally, it investigates potential associations between L2 proficiency and perceptions of the chatbot. The sample (n = 143) consisted of undergraduate students from a Saudi university who engaged in guided and self-initiated interactions with the chatbot. By and large, the findings point to positive experiences concerning the chatbot’s intelligibility and comprehension. In terms of the interaction, the chatbot is perceived as supportive of L2 practice and writing development, interest-provoking, enhancing motivation, and alleviating writing anxiety. By contrast, certain demotivating factors are reported regarding the chatbot’s interactional and instructional abilities, including the lack of extended conversations, sensitivity to inaccurate language forms, and sporadic irrelevant responses. Moreover, the Mann-Whitney U test reveals that L2 proficiency does not affect overall views on the chatbot-mediated interaction, except for the aspect of usefulness for L2 practice, which high-intermediate students viewed significantly more positively. Pedagogical implications pertinent to the integration of chatbots in L2 learning are discussed.

1. Introduction

In the field of technology-enhanced language instruction, a wide array of promising and potentially efficacious tools have recently emerged (Alrajhi, Citation2020a). These tools are designed to offer assistance to users and facilitate communication beyond the boundaries of formal educational environments. The rapid and remarkable technological advancements in artificial intelligence (AI) have led to the advent of chatbots, defined as software that has the ability to utilize spoken and written language to simulate chat-like interactions with humans (Luo et al., Citation2019). Chatbots, as sophisticated software, can offer a plethora of opportunities for computer-assisted language learning (CALL). Their versatile capabilities allow them to cater to a wide range of users’ needs, including the ability to respond to inquiries and engage in interactive conversations (Copulsky, Citation2019). The utilization of chatbots in second language (L2) learning holds potential, as they offer opportunities for meaningful interactions that can contribute to L2 proficiency enhancement. Interaction with chatbots can be perceived as both a motivating factor and a pedagogical strategy aimed at facilitating L2 practice (Shin et al., Citation2021). Therefore, within the theoretical framework of L2 learning that emphasizes the pivotal role of interaction (Chapelle, Citation2005; Long, Citation1996), AI chatbots emerge as powerful tools (Fryer et al., Citation2020). Consequently, L2 researchers have undertaken endeavors to investigate various aspects of chatbots, such as learner interest (Fryer et al., Citation2019), learner experience of chatbots (Fryer & Carpenter, Citation2006), their impact on learning motivation (Fryer et al., Citation2017), and their overall usefulness in L2 learning (Yin & Satar, Citation2020).

Nonetheless, despite such endeavors, there are many gaps yet to be addressed. While several previous studies have examined oral-based interactions with chatbots, limited attention has been given to text-mediated interactions with these AI tools. Furthermore, the bulk of existing research has overlooked pedagogical chatbots and their potential to enhance interpersonal communication skills and facilitate L2 learning. Additionally, there is a notable paucity of studies investigating the effect of self-perceived L2 proficiency on learner attitude toward such pedagogical tools. Therefore, it is of utmost importance to delve into the perspectives of EFL learners regarding the effectiveness of online human-machine dialog in L2 education. According to Fryer et al. (Citation2020), although the advantages of chatbots have been substantiated across various domains, scholarly investigation into the potential of chatbot-based interaction within the field of language education remains limited. Thus, the present study endeavors to explore the experiences and perceptions of EFL students concerning the applicability of a text-based pedagogical chatbot in L2 learning. In so doing, it aims to contribute to the expanding body of literature on the plausible role of chatbots in foreign language education.

2. Literature review

2.1. AI chatbot

The remarkable surge in technological advancements in the twenty-first century has yielded the emergence of innovative forms of communication (Alrajhi, Citation2023), including the interaction between humans and AI chatbots, which has garnered considerable attention. According to Bailey (Citation2019), several terms have been widely employed to designate chatbots, including virtual assistants, instructional agents, and conversational agents. Chatbots manifest in various forms, appearing on different platforms and devices, including phone applications, virtual personal assistants, and websites (Melián-González et al., Citation2021). The purpose of chatbots is to engage users in natural-like language interactions, facilitating meaningful conversations. Harnessing the advancements in AI and natural language processing technologies, currently available forms of chatbots have remarkably reached higher levels of enhancement, elevating their capabilities in both voice-enabled and written exchanges (Shah et al., Citation2016).

The recent developments in chatbot technology, coupled with their refined realistic attributes, have influenced learner perception regarding the practicality and usefulness of this tool across educational and non-educational domains (Kwon et al., Citation2023). Pedagogical chatbots are perceived as tools capable of simulating human traits (Kwon et al., Citation2023). Several forms of chatbots possess anthropomorphic (i.e. human-like) attributes, displaying characteristics such as moral judgments and emotional experience (Golossenko et al., Citation2020). These anthropomorphic characteristics, mediated by the quality of interaction and perceived empathy, significantly shape the relation between users and AI tools (Pelau et al., Citation2021). Apart from factors like enjoyment and increased efficiency, the social and emotional aspects associated with the use of AI tools play a crucial role in user acceptance of such technologies (Pelau et al., Citation2021).

While the utilization of chatbots has proliferated across various domains, and the number of online chatbots is increasing rapidly (Dale, Citation2016), employing such tools as L2 learning companions is considered a recent notion (Fryer et al., Citation2020). One of the potential merits attributed to chatbots is pertinent to user satisfaction, as they can conveniently engage in interactions, unbound by temporal constraints (Winkler & Söllner, Citation2018). Consequently, the availability of chatbots has implications for L2 learning and practice beyond the confines of the traditional language classroom. In contrast, a notable constraint of chatbots lies in their limited capacity to navigate communication that takes a non-linear fashion (i.e. conversations where retrieval and integration of previously discussed topics or ideas in response to newly introduced topics pose a challenge to chatbots) (Grudin & Jacques, Citation2019).

Fryer et al. (Citation2020) argued that chatbots are innovative tools within the realm of L2 education, with their technological prowess continuing to advance progressively. While acknowledging that chatbots are not perfect conversers and do not possess full linguistic proficiency, Fryer and Carpenter (Citation2006) highlight their promising potential in fostering L2 learning. They expound upon several merits of chatbots that can be effectively harnessed in L2 education: (1) promoting reading skills through text reading; (2) supporting the use of a diverse range of L2 lexical items that are not typically utilized in the language classroom; (3) increasing L2 motivation and interest among learners; (4) providing a comfortable learning environment, which does not provoke anxiety as compared to human-human interactions; (5) providing long-lasting conversations, as chatbots are patient conversers that continuously generate L2 discourse; and (6) offering instantaneous feedback on language errors.

2.2. The interactionist framework

Learner interaction in the target language is pivotal to L2 acquisition (Long, Citation1996), which can encompass various modalities, including CALL-based dialogs (Chapelle, Citation2005). Therefore, communicating with chatbots, either in a written or spoken form, is well grounded in the interactionist theoretical framework (Long, Citation1996). Based on this approach to language learning, interaction in L2 affords learners the opportunity to engage in three fundamental processes that foster development: directing attention to linguistic form, receiving modified input, and negotiating intended meaning by modifying output (enhancing learner understanding of the association between language meaning and form), thereby highlighting their awareness of gaps in their linguistic knowledge (Chapelle, Citation2005). Consequently, chatbot-learner communication engenders extensive involvement in the aforesaid processes, thus maximizing the potential of this interaction to contribute to L2 development (Bibauw et al., Citation2022).

2.3. Chatbots as L2 conversational partners

Several studies have been undertaken to assess the suitability of chatbots as educational tools. For instance, Coniam (Citation2014) conducted a study to evaluate the effectiveness of five chatbots in terms of lexical and grammatical accuracy, as well as the quantity of language produced. The findings indicated that the chatbots demonstrated a high level of grammatical accuracy, ranging from 77% to 93%. Additionally, a substantial portion (69% to 78%) of the words generated by the chatbots belonged to the most commonly used 2000 words. However, there was considerable variability in meaning fit (47% to 71%) among the chatbots. Furthermore, the appropriateness of word meaning and grammar was relatively lower, ranging from 44% to 66%. Regarding the quantity of output, the overall word count varied significantly across the chatbots, ranging from 912 to 3258.

Smutny and Schreiberova (Citation2020) examined 47 chatbots and evaluated their instructional potential. Their findings indicated that the current status of such conversational systems does not qualify them as highly effective pedagogical tools. Moreover, the researchers proposed that particular types of chatbots appear more suitable for educational purposes. In another study, Alm and Nkomo (Citation2020) investigated L2 learners’ experience-based reviews of four recently developed chatbots designed for informal language learning contexts (e.g. Duolingo, Memrise). The outcomes unveiled users’ disposition to utilize such conversational companions, as chatbots piqued their interest. Nevertheless, the users noted inconvenience and disappointment when a mismatch between their learning goals and dialogs occurs.

Fryer et al. (Citation2017) probed into learners’ interest in foreign language tasks, where learners performed both human-based and chatbot-mediated oral interactions. The findings revealed a noteworthy distinction between the two forms of interactions. Unlike human-human interaction, learners’ interest in interacting with the chatbots declined significantly over time, suggesting that the initial allure and novelty of the first chatbot dialog task might have contributed to the high level of interest. In a more recent study, Yang et al. (Citation2022) employed a voice-enabled chatbot to assess its effect on the development of oral skills among Korean EFL learners. The findings showed increased motivation for oral interaction, and despite perceived limitations, learners expressed positive perceptions of the chatbot’s utility within the L2 context.

In another investigation, Thompson et al. (Citation2018) explored the effectiveness of chatbots in conversations, and delved into Japanese university learners’ perceptions of their interest in engaging in dialogues with humans as opposed to chatbots. The participants were divided into either a human-interaction group or a chatbot-interaction group. This study revealed a notable contrast between the two groups and found that dissimilar to the human-interaction group, the interest of the chatbot-interaction group declined (M = 4.18 to 3.84) over the course of the experimental duration. The researchers postulated that this decline might be ascribed to the artificial nature of conversations and unsuitability of the chatbots’ responses.

Gallacher et al. (Citation2018) carried out a study to explore the attitudes of Japanese EFL learners toward oral interaction with the chatbot Cleverbot, in comparison to dialogue with human interlocutors. The students had varying levels of L2 proficiency, and the chatbot was designed to receive oral input and generate written responses. The study revealed that while the students recognized the potential benefits of such interactions for L2 learning, they perceived the chatbot as a novel and unconventional conversation tool that is not as effective in communication as humans. Moreover, the findings illuminated the students’ views regarding both the advantages and disadvantages inherent in such interactions. The advantages, indicated by 202 responses, encompassed various aspects, including exposure to new lexical items, heightened learner interest, opportunities for autonomous learning, highly intelligible chatbot responses, the provision of immediate and occasionally reliable answers, and a straightforward mode of communication characterized by question-response exchanges. Conversely, the identified drawbacks, revealed by 248 responses, were primarily associated with the unsuitability and irrelevance of many of the chatbot’s responses, as well as its limited ability to generate questions in a progressive manner.

Fryer et al. (Citation2019) investigated the interest of Japanese EFL students in interacting with a chatbot named Cleverbot, in contrast to human interlocutors. The findings unveiled a significant relationship between students’ level of L2 proficiency and their inclination toward conversing with the chatbot. Furthermore, the learning gains resulting from chatbot-mediated interaction were strongly associated with students’ interest in interacting with the chatbot. In another study, Yang et al. (Citation2019) examined the potential of a chatbot named Ellie as an L2 speaking practice tool, using two different interaction tasks among young EFL students in Korea. Broadly speaking, the findings revealed that the participants engaged in meaningful interactions with the chatbot, utilizing various conversation strategies. Nonetheless, the study reported technical issues concerning the chatbot’s lengthy utterances and voice recognition functionality. Despite these difficulties, the majority of students exhibited positive perceptions of the chatbot as a conversational agent.

Yin and Satar (Citation2020) undertook a study to explore the potential of text-based chatbots in facilitating L2 learning, employing two different tools: Mitsuku, a non-pedagogical chatbot, and Tutor Mike, a pedagogical chatbot. The findings indicated that students with a high level of L2 competency exhibited minimal engagement in conversations with Tutor Mike and expressed dissatisfaction with both conversational agents. Additionally, it was observed that interacting with Tutor Mike can be most effective for students with a low level of L2 competency. In another recent investigation, Shin et al. (Citation2021) examined the feasibility of incorporating Mitsuku as a conversational tool for L2 students. Notably, the vocabulary used by the chatbot was deemed suitable for both high school and college students’ L2 proficiency levels. Moreover, the findings of the conversation sentiment analysis revealed that students held a positive emotional stance toward the interaction with the chatbot. Specifically, high school students demonstrated a higher level of enjoyment throughout the interaction, with a mean compound score of .93, as compared to college students, with a mean compound value of .86.

In their research, Kwon et al. (Citation2023) developed an educational chatbot to investigate the impact of 15-week chatbot-mediated written interactions on the development of EFL learners’ English writing. Furthermore, the study explored the participants’ attitudes toward the chatbot. The findings indicated that students who utilized the chatbot had significantly higher scores in the post-treatment test compared to those who received traditional instruction. Additionally, the students perceived the chatbot as fostering a more comfortable learning environment and facilitating the development of their L2 skills. These findings highlight the potential of chatbot-mediated written interactions in promoting successful L2 learning outcomes.

3. The significance of the current study

While the previous studies have made contributions to the literature on the use of AI chatbots in education, there are still research gaps that need to be addressed. First, it is necessary to subject the effectiveness of chatbot-assisted language learning to rigorous investigation (Kwon et al., Citation2023). To effectively harness the potential of intelligent conversational systems in L2 instruction, it is crucial to investigate the affordances of these tools (Fryer & Carpenter, Citation2006; Fryer et al., Citation2019). Second, given the proliferation of diverse and advanced chatbots currently available to learners, which are being developed with more sophistication and additional features and abilities, previous research findings on learner experience and perception might not be applicable to currently available chatbots. Consequently, findings associated with more recent forms of conversational systems can provide valuable insight into their efficacy for L2 learning. Third, previous research has examined learner perception of voice-enabled chatbots, while only a few studies have explored student attitudes toward text-based chatbots. Fourth, there is a dearth of L2 research that has delved into learner experience and perspectives on pedagogical chatbots, and the few existing studies have relied on small sample sizes or chatbots not widely accessible to L2 learners. Fifth, there is a paucity of studies examining the effect of self-perceived L2 proficiency on learner attitude toward pedagogical chatbots. Consequently, there is a need for further research that conducts a comprehensive investigation into text-based pedagogical chatbots. Such an endeavor would fortify the validity of research findings and broaden our understanding of the affordances and limitations of these tools specifically designed for instructional purposes. To this end, this research inquiry addresses the following questions:

  1. What are the views of EFL learners on the utilization of a text-based pedagogical chatbot as an L2 conversational agent?

  2. To what extent do learners’ perspectives on the pedagogical chatbot as a conversational partner differ according to their self-perceived L2 proficiency?

4. Methods

4.1. Participants and research setting

Utilizing convenience sampling, a cohort of one hundred and forty-three English majors (Age: M = 20.64 years, SD = 1.825), who speak Arabic as a first language, participated in this study (Table 1). This sampling technique not only brings about the inclusion of willing participants but also enables the recruitment of a sample that aligns with the criteria suitable for the objectives of the research inquiry (Dörnyei, Citation2007). The students, who had a prior L2 learning experience of eight years in general education, were enrolled in an undergraduate English program at a public university in Saudi Arabia (M = 2.49 years at university, SD = .999).

Table 1. Demographic information.

The participants were drawn from four classes, each of which met once a week for three hours. The students in these classes were taught by the author in a well-equipped language lab (e.g. a large interactive white board, computers, and internet access). Demographic information shows that perceived L2 proficiency of the participants varied, with the intermediate and high-intermediate levels dominating the proficiency levels. The students voluntarily participated in this study, and provided their informed consent prior to commencing the study procedures.

4.2. Procedures, instruments, and analysis

This study employed a mixed-methods approach to address the first research question, as triangulation of methods and data sources enhances the robustness and validity of research findings (Creswell, Citation2009). Additionally, a quantitative approach was adopted to answer the second research question. The research design encompassed four phases: (1) introducing the chatbot, (2) individualized interactions with the chatbot, (3) a questionnaire survey, and (4) focus group discussions. The sequential implementation of these phases (Figure 1) was conducted within regular class sessions in the lab.

Figure 1. Study procedures.


4.2.1. Chatbot-mediated interaction

This study employed a web-based AI pedagogical chatbot named Tutor Mike (https://www.rong-chang.com/tutor_mike.htm). The chatbot is designed to function as an English tutor, and can be used for English learning and practice. According to the eslfast website (www.eslfast.com/robot/english_tutor.htm), Tutor Mike is an award-winning chatbot (1st place in the 2018 selection contest and 2nd place in the 2016 and 2018 contests) in the International Loebner Prize contest in AI. According to the official website (https://www.rong-chang.com/tutor_know.htm), this chatbot can perform several functions, such as offering tips for learning English, teaching grammar and idiomatic expressions, inspecting spelling and grammatical mistakes, and providing answers relevant to a wide array of general knowledge topics.

The students had no previous experience of using the chatbot in L2, which was confirmed through a quick survey. Prior to engagement in the chatbot text-based interaction, the author introduced Tutor Mike to the students (15 mins) and displayed several examples of how to initiate a conversation with the chatbot. Subsequently, each student embarked on an individualized interaction with the chatbot during an online session. Utilizing the computers in the lab, the students were instructed to interact with the chatbot using an optional guided interaction, which was prepared by the author, followed by self-initiated interactions (lasting approximately 25 to 30 minutes). In the event that the students encountered unfamiliar vocabulary within Tutor Mike’s responses, they were instructed to promptly request clarification and further explanation from the chatbot. The purpose of the guided interaction was to familiarize the students with the context of the chatbot and facilitate the initiation and maintenance of communication. The guided interaction encompassed a diverse range of topics, including daily life, languages, school, countries and cities, vocabulary, and grammar. Moreover, it incorporated a series of stimulating inquiries, such as, “What languages do you speak?” “Do you know my country?” “Can you explain the present tense?” “What is the meaning of (vocabulary items)?”

4.2.2. Questionnaire survey

In order to explore the participants’ perspectives regarding the effectiveness of the text-based pedagogical chatbot, an online questionnaire was devised by the author. This questionnaire consisted of a comprehensive set of 20 items, which drew upon an extensive examination of the existing literature. The author conducted a thorough review of previous studies, comparing and evaluating the instruments utilized in those studies. By doing so, the author identified several gaps that needed to be addressed, particularly concerning the integration of pedagogical chatbots in L2 learning. Subsequently, the questionnaire underwent a rigorous evaluation process, wherein two experienced language instructors and researchers scrutinized its content, and it was modified and refined according to their feedback and suggestions. The questionnaire was administered via the Google Forms platform based on a five-point Likert scale (strongly disagree = 1 – strongly agree = 5). It aimed to delve into learners’ views on items pertinent to cognitive (e.g. L2 writing improvement and practice) and affective (motivation, anxiety, willingness to communicate, and interest) domains of learning, as well as affordances and limitations (e.g. ability to extend conversation, intelligibility, and linguistic ability) of the chatbot for L2 learning. Prior to eliciting responses, the author piloted the questionnaire with a sample of 61 students in order to assess its internal consistency (Bonett & Wright, Citation2015); the results revealed an excellent level of questionnaire item reliability, as indicated by Cronbach’s alpha coefficient (α = .901). Upon completion of the chatbot-mediated interaction session, the students filled out the questionnaire within a timeframe of 10 to 12 minutes via the computers in the lab. Following the data collection process, a rigorous data cleaning procedure was implemented. Invalid responses were identified and removed from the analysis. These included responses exhibiting straightlining tendencies, where participants consistently provided the same response throughout the questionnaire, as well as responses from participants who completed the survey significantly faster than the average response time for all students. The final dataset comprised 143 valid responses, which formed the basis of the subsequent analysis.
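The reliability coefficient reported above (Cronbach’s α = .901) can be illustrated with a minimal pure-Python implementation of coefficient alpha. This is an expository sketch, not the study’s actual analysis script (which presumably relied on a statistics package); the toy data below are hypothetical:

```python
def cronbach_alpha(items):
    """Coefficient alpha for k questionnaire items.

    items: list of k lists; items[i][j] is respondent j's score
    on item i (e.g. 1-5 Likert responses).
    """
    k = len(items)
    n = len(items[0])

    def sample_variance(xs):
        m = sum(xs) / len(xs)
        return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

    # Sum of the individual item variances.
    sum_item_var = sum(sample_variance(col) for col in items)
    # Variance of each respondent's total score across all items.
    totals = [sum(col[j] for col in items) for j in range(n)]
    total_var = sample_variance(totals)
    return (k / (k - 1)) * (1 - sum_item_var / total_var)
```

With α ≥ .9 conventionally read as excellent internal consistency, the pilot value of .901 sits just above that threshold.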

The analysis of the quantitative data was performed through the application of both descriptive and inferential statistics. Descriptive statistics were employed to obtain the frequency and percentage of responses, as well as to calculate mean values and standard deviations. Additionally, inferential statistics were utilized to investigate potential differences in perspectives on the efficacy of the chatbot across L2 proficiency levels. The assumption of normality of data distribution was assessed by the Shapiro-Wilk test. The results, however, indicated that the data were not normally distributed (p < .05). Consequently, the Mann-Whitney U test, a non-parametric alternative, was used to answer the second research question, given its robustness against violations of normality assumptions. In terms of the scale analysis procedures, mean values from 1.00 to 2.60 were considered indicative of negative views, mean values from 2.61 to 3.40 were deemed neutral, and mean values from 3.41 to 5 denoted positive perceptions. To validate the analysis of the effect of perceived L2 proficiency on learners’ perspectives, a preliminary analysis was conducted to identify the most prevalent proficiency groups. The results revealed two dominant groups (n = 111) that could be compared: the intermediate (n = 49) and high-intermediate (n = 62) groups. Subsequently, to balance the number of participants across the two groups, thirteen cases from the high-intermediate group were randomly excluded from the analysis.
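The two quantitative procedures described above, the Mann-Whitney U test and the scale cut-offs, can be sketched in pure Python. This is an illustrative reconstruction under the stated conventions, not the study’s actual analysis code; in practice the p-value would come from a statistics package rather than from the raw statistic alone:

```python
def mann_whitney_u(group_a, group_b):
    """U statistic for two independent samples, using average
    ranks for tied values. Returns the smaller of U_a and U_b."""
    pooled = sorted(group_a + group_b)
    ranks = {}
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j] == pooled[i]:
            j += 1
        # Tied values share the average of ranks i+1 .. j.
        ranks[pooled[i]] = (i + 1 + j) / 2
        i = j
    rank_sum_a = sum(ranks[x] for x in group_a)
    u_a = rank_sum_a - len(group_a) * (len(group_a) + 1) / 2
    u_b = len(group_a) * len(group_b) - u_a
    return min(u_a, u_b)


def interpret_mean(m):
    """Scale interpretation used in the study:
    1.00-2.60 negative, 2.61-3.40 neutral, 3.41-5 positive."""
    if m <= 2.60:
        return "negative"
    if m <= 3.40:
        return "neutral"
    return "positive"
```

Likert responses produce many ties, which is why the average-rank treatment matters here; the non-parametric test compares rank distributions rather than assuming normally distributed scores.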

4.2.3. Focus group discussions

To further elucidate and explicate the responses obtained from the questionnaire, audio-recorded focus group discussions were conducted in the lab, spanning approximately 64 minutes; informed consent for audio-recording was obtained in advance. Thirteen students (9% of the sample) were invited to partake in the discussions. They were asked open-ended questions, such as, “What do you think about chatting with the chatbot?”, “How do you evaluate the chatbot’s language and conversational abilities?”, and “What do you think about the effectiveness of the chatbot for all learners regardless of their L2 proficiency?”

Following meticulous transcription of the recorded discussions, a theme-based analysis (Braun & Clarke, Citation2006) was used for the qualitative data. The analysis was conducted based on a rigorous process to identify patterns and relationships among categories and emerging themes. This was done through (a) conducting multiple readings of the transcribed discussions, (b) meticulously identifying initial codes relevant to the objective of the study (the first research question), and (c) organizing the codes into unifying categories. The initial coding phase involved the allocation of codes to various aspects of the data, such as chatting, short responses, improvement, usage, language proficiency of the chatbot, attention, intermediate level, irrelevant answers, satisfaction, effectiveness, writing, and spelling. To ensure the reliability of the coding process, another researcher independently analyzed the data. Subsequently, an analysis of inter-coder reliability was conducted. The results indicated a substantial agreement of 92% on the assigned codes, while any disagreements were discussed until a consensus was reached. Drawing upon the code frequency and significance, several categories were identified, leading to the emergence of three main themes. The first theme pertains to the pedagogical chatbot and L2 learning. This theme encompasses categories such as usefulness, L2 writing practice, attention to linguistic forms, and willingness to utilize the chatbot in the future. The second theme is relevant to the affordances and limitations of the chatbot. This theme consists of categories such as effortlessness, responsiveness (or lack thereof), interest, motivation, instructional ability, question formation, and language quality. The third theme revolves around the potential of the chatbot in relation to different levels of L2 proficiency. This theme includes categories such as minimum L2 proficiency needed and suitability for intermediate learners.
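The 92% inter-coder agreement reported above corresponds to simple percent agreement between the two coders’ code assignments. A minimal sketch follows; it is illustrative only, and the code labels in the usage example are hypothetical:

```python
def percent_agreement(codes_a, codes_b):
    """Share of data segments to which two coders assigned
    the same code, expressed as a percentage."""
    if len(codes_a) != len(codes_b):
        raise ValueError("both coders must label the same segments")
    matches = sum(1 for a, b in zip(codes_a, codes_b) if a == b)
    return 100.0 * matches / len(codes_a)


# Hypothetical example: the coders agree on 3 of 4 segments (75%).
agreement = percent_agreement(
    ["chatting", "writing", "spelling", "usage"],
    ["chatting", "writing", "spelling", "attention"],
)
```

Segments on which the coders disagree (the last one here) would then be discussed until consensus, as the study describes; percent agreement is a simpler index than chance-corrected measures such as Cohen’s kappa.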

5. Findings

5.1. Learners’ views on the pedagogical chatbot as an L2 conversational agent

Among the twenty items in the questionnaire (Figure 2 and Table 2), there are fourteen items with mean values that fall within the positive range (M ≥ 3.41). Moreover, the results reveal that there are eleven items (1, 4, 5, 6, 7, 11, 14, 15, 17, 19, and 20) with greater mean values (M = 3.76–4.38) than the overall mean score (M = 3.71). These findings, in conjunction with the total mean value, underscore the students’ positive experience-based views regarding their interaction with the chatbot as a conversational agent.

Figure 2. Learners’ perspectives on the pedagogical chatbot.


Table 2. Descriptive statistics of learners’ perspectives on the pedagogical chatbot.

Specifically, ten items have the highest agreement scores. That is, the students had no difficulty understanding the chatbot’s English (item 20; M = 4.38, SD = .759), acknowledged the utility of the interaction in L2 writing development (item 7; M = 4.37, SD = .828), and perceived chatting with the chatbot as not demanding considerable effort (item 17; M = 4.35, SD = .744). Furthermore, the students held the view that the chatbot-mediated interaction facilitated their practice of the target language (item 6; M = 4.25, SD = .835), had positive views on the chatbot as increasing motivation to practice L2 writing (item 15; M = 4.10, SD = .984), and positively perceived the chatbot’s ability to understand their English language (item 19; M = 4.05, SD = .899). Moreover, the students perceived making writing mistakes during such interactions as not anxiety-provoking (item 14; M = 4.01, SD = 1.141), had positive views on the potential of the interaction to contribute to L2 improvement (item 5; M = 3.97, SD = 1.034), perceived the use of the chatbot as stimulating their interest (item 4; M = 3.95, SD = .974), and expressed a favorable disposition toward the experience of engaging in chatbot-mediated dialogs (item 1; M = 3.82, SD = .976).

On the other hand, the results suggest a sense of neutrality among the students concerning certain aspects germane to the interaction. Specifically, six items yield mean values that fall within the neutral range (M = 2.61–3.40). Put differently, the students exhibited an ambivalent stance regarding the chatbot’s capacity to maintain a conversation for an extended period of time (item 3; M = 2.78, SD = 1.224), and also in terms of the interaction as more efficacious for L2 learning than chatting with classmates (item 10; M = 2.78, SD = 1.312). Additionally, the students showed a neutral stance toward the relevance of the chatbot’s responses to their English expressions (item 18; M = 2.99, SD = 1.055), and toward feeling more comfortable and less anxious while interacting with the chatbot vis-à-vis conversing with humans (item 13; M = 3.17, SD = 1.358). Moreover, the students had neutral views on the chatbot’s ability to teach L2 grammar and vocabulary (item 9; M = 3.36, SD = .968), and were undecided with respect to their willingness to continue utilizing the chatbot for L2 learning and practice purposes (item 16; M = 3.40, SD = 1.290).

5.2. Perceived L2 proficiency and perspectives on the chatbot as a conversational partner

The results of the Mann-Whitney U test yield no significant differences between the views held by the two prevailing proficiency groups (intermediate and high-intermediate) in relation to nineteen items in the questionnaire (p > .05). However, the only significant difference that emerged is associated with the sixth item (U = 924.50, z = −2.155, p = .031). This finding suggests that the high-intermediate learners have significantly more positive views (Table 3) on the utility of the interaction with the chatbot as a means to practice L2.
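For readers wishing to reproduce this kind of per-item comparison, a Mann-Whitney U test on ordinal Likert responses can be run with SciPy. The sketch below uses fabricated illustrative data (the two response arrays are hypothetical, not the study’s data); only the procedure mirrors the analysis reported above:

```python
import numpy as np
from scipy.stats import mannwhitneyu

# Hypothetical 5-point Likert responses to item 6 ("useful for L2 practice")
# for the two proficiency groups; these values are invented for illustration.
intermediate = np.array([4, 3, 5, 4, 3, 4, 2, 4, 3, 5])
high_intermediate = np.array([5, 4, 5, 5, 4, 5, 3, 5, 4, 5])

# Two-sided test, appropriate for ordinal data with no normality assumption
u_stat, p_value = mannwhitneyu(intermediate, high_intermediate,
                               alternative="two-sided")
print(f"U = {u_stat:.2f}, p = {p_value:.3f}")
```

With the study’s actual data, a p-value below .05 on an item (as with item 6, p = .031) would indicate a significant between-group difference in the distribution of responses.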

Table 3. Ranks of the usefulness of the chatbot-mediated interactions for L2 practice.

5.3. Focus group discussions

This section reports the findings from the qualitative analysis, organized into three subsections. The first subsection delves into the pedagogical chatbot and L2 learning. It is followed by an exploration of the affordances and limitations of the chatbot. The third subsection is pertinent to the chatbot’s potential in relation to different L2 proficiency levels. Tables 4–7 present the key categories that emerged from the discussions, along with the corresponding frequency of students’ responses associated with each respective category.

Table 4. Thematic categories and corresponding frequencies of responses in the first theme.

Table 5. Thematic categories and corresponding frequencies of responses in the second theme.

Table 6. Thematic categories and corresponding frequencies of responses in the second theme.

Table 7. Thematic categories and corresponding frequencies of responses in the third theme.

5.3.1. The pedagogical chatbot and L2 learning

Broadly speaking, the majority of students (Table 4) held positive views on the chatbot as a useful L2 learning tool, albeit to some extent and for a limited period of time. These perspectives emanate from an awareness of the chatbot’s capacity to encourage writing practice, its potential to improve many L2 aspects, and the interaction as being analogous to real-life communication:

Student 10: I can use the chatbot to develop my level in an international language test because it makes me use vocabulary a lot and different ways to chat. I can work on writing, vocabulary, and many other things. Even with speaking, I can remember memories with the chatbot so I can use them when I speak. The topics with the chatbot are similar to real life communication, same questions. Chatting with the chatbot really helps to develop English.

S6: It is useful for a short time, not using Mike continuously…Chatting with Mike can prepare you for formal written communication with others. It teaches me, as I have knowledge of words so I can start a conversation with Mike, I can ask it about basic grammar, and it explains to me in a simple and easy way, so I can learn.

Interacting with the chatbot necessitated the students’ unwavering focus on the linguistic forms they generated in L2, a facet of the interaction that was perceived by some students as a favorable attribute. The chatbot was sensitive and unresponsive to erroneous language input, particularly in terms of spelling and grammatical structure, which prompted the students to concentrate more on their L2 accuracy:

S3: I always tried to give Mike an opportunity to understand me. If you do not deliver your ideas correctly, maybe Mike is not going to respond with what you want.

S1: You need to ask complete questions with correct spelling. Because if you type a misspelled word, you have to correct that mistake. It [Mike] is going to tell you the right spelling, and then you ask the question again.

Several students noted that they might consider using the chatbot in the future. Satisfaction and willingness to keep using the chatbot stems from a variety of reasons, including its potential to support L2 learning in general, and writing in particular:

S8: Maybe I use the bot to practice writing, because I felt it helped to improve my spelling. I feel it is good at this.

S2: I might use Tutor Mike later, maybe because of the mistakes in sentences. It motivates you to learn how to write sentences. It is nice. If you have simple mistakes in sentences, it tells you; you ask it. It is nice in writing and grammar.

5.3.2. Affordances and limitations of the chatbot

Additional reasons (Table 5) elucidated the students’ positive perceptions regarding the chatbot-mediated interactions, including effortlessness, responsiveness to language input without technical issues, and the novelty of the experience:

S9: The chatbot was fast [to respond] and easy to use. Some chatbots I used before were lagging a lot and take time and do not understand immediately.

S2: Trying Tutor Mike was a new thing, and this is the first time a chatbot answers correctly with no technical errors. Tutor Mike is useful in writing when you do not have someone to write to. You practice writing.

The initial stage of the interactions highly aroused the interest of some students. Nonetheless, as the chatbot’s discourse unfolded, some issues surfaced that culminated in a gradual decline in motivation to sustain the interactions. Such issues were manifested in the form of the chatbot’s responses to questions with other questions or its occasional lack of ability to comprehend the students’ output. These issues might instigate a diminishing predisposition toward continued utilization of the chatbot:

S1: Mike was responsive, and it was something new. Because I used it for the first time, so I was excited, but maybe over time, you start to feel bored by the experience. Mike does not motivate me to keep the conversation going.

S11: It is disappointing when Mike sometimes responds with a question or says that it does not understand. At first, you get excited to chat, but sometimes because of problems with forming questions, you do not like the experience and want to close the page. This is because of Mike and not the users, because users can learn from their mistakes.

The majority of students were highly satisfied with the chatbot’s language quality, and its conversational prowess appealed to some students. Some students indicated that, as a conversational partner, it can be useful for only a limited period of time. This is primarily due to drawbacks that emerged during the discourse, including very brief responses and an inability to sustain a conversation beyond a few short exchanges. Consequently, these limitations (Table 6) are considered negative aspects of such interactions:

S5: The chatbot’s language is excellent. Its responses were positive. I mean by positive is that it provides understandable answers in a funny way. So, you receive the responses directly; they are not difficult to understand. The chatbot does not make it difficult for you to understand the responses.

S4: Mike’s English is excellent, but it can’t keep chatting with you for a long time. Mike as a teacher may be good. But in conversation, I do not think it is good at all. Because it can’t do a full conversation. You ask a question and it just answers, and that’s it. So, you can’t chat with it; you can’t have a full conversation with it. It is like just it is there to answer questions.

Some students noted that the chatbot has the ability to teach some aspects of L2, including grammar, vocabulary, and some basics. However, other students reported that the chatbot’s instructional ability suffers from inherent limitations, and that it can help in L2 learning only to some extent, since it cannot address language-specific questions the way language teachers can:

S3: While you are chatting with Mike, you do not feel it is a bot; you feel it is a teacher. Mike’s language is excellent. It is excellent in vocabularies and even grammar. I asked it some questions about grammar, and it answered me in a simple and understandable way…It gave me an answer in two sentences that made me understand something I spend a long time in the past trying to understand.

S2: Tutor Mike improves skills, improves writing, improves how to write sentences with good grammar…It only helps in learning. It deserves the title of tutor, but it just helps in learning, as a secondary thing. It corrects you if you write wrong sentences or questions. It tells you that this is this, how you ask a question, and this motivates you. You have a problem in how to ask a question, so this motivates you to learn how to ask questions the right way.

S11: Mike does not give details. A teacher gives details, and you can ask a teacher about something you do not understand and the teacher explains it to you. It is excellent only for a specific period of time and a specific level to get experience. You benefit from just chatting.

One of the main concerns reported on the chatbot’s conversational ability pertains to the questions formed by the students during the interactions. Some students mentioned that question formation could negatively impact the chatbot’s responsiveness, even when questions were properly structured, leading to breakdowns in communication. Additionally, the chatbot’s irrelevant responses to questions emerged as a significant drawback of the interaction. Furthermore, some students noted that heightened focus on linguistic forms in such interactions might be inconvenient:

S11: The only problem was how to write questions. A human would understand those questions, but Mike does not. Mike can’t understand some question types at all. Sometimes, its responses were not related to my questions.

S12: Sometimes, questions were written in an excellent way, but Mike said: I do not understand your question, even when I changed question forms. This was awkward.

S13: How to ask questions. I wrote more than one question that I think are good, but the chatbot could not answer my questions from the question types I used.

S4: Attention to spelling is a negative thing; you focus on language and not the information you want to get from Mike. It is supposed to be a tutor. Mike was criticizing my spelling sometimes. When I ask a question and make a spelling mistake in one word, as a tutor you are supposed to correct my mistake and continue to answer my original question, but Mike just corrects spelling mistakes and does not get back to my question, so, I have to write it again. If you do not provide correct spelling and grammar, Mike could provide an unrelated answer.

5.3.3. The potential of the chatbot across different L2 proficiency levels

The students exhibited divergent perceptions (Table 7) concerning the most suitable L2 proficiency level for engaging in chatbot-mediated interactions. A fraction of students espoused the notion that the chatbot has the potential to support beginner language learners, serving as a source of motivation, correcting L2 errors, introducing new vocabulary, and furnishing easily comprehensible responses:

S3: Mike is useful for beginners because it motivates them and corrects their spelling. They can learn new words from Mike, and when asked about grammar, it answers in a simple way, without complications for beginners. You should simplify vocabulary and grammar for beginners so they can understand in a way that matches their level.

Conversely, many students underscored the chatbot’s suitability for intermediate-level learners, primarily due to the prerequisite of possessing a minimum threshold of L2 knowledge and adequate writing skills to facilitate effective communication with the chatbot:

S8: I do not think the bot is for beginners. It is for intermediates, for those who can converse and have some language to chat. I do not think beginners can do that. Maybe advanced students do not benefit from the bot, because it does not know information that they do not know. They can use more advanced ways.

S1: Mike is good for the intermediate level, because you should have background in language before you ask it questions, because it depends on writing. Once I tried to ask a question, but I had to write the question in a better way so it gives me a related answer.

6. Discussion

6.1. Perspectives on the pedagogical chatbot as an L2 conversational agent

The study aimed to delve into L2 learners’ experiences and views concerning the utilization of a pedagogical chatbot as a conversational partner. These views were subsumed under the cognitive and affective domains of learning, as well as the chatbot’s affordances and limitations. The findings shed light on multifaceted factors relevant to chatbot-mediated interactions. By and large, the positive experience-based views on the chatbot underscore its potential as a valuable L2 conversational agent. Notably, the most positive views are associated with the intelligibility of the chatbot (Gallacher et al., Citation2018; Shin et al., Citation2021), its capacity to support L2 learning and practice (Fryer & Carpenter, Citation2006; Fryer et al., Citation2019; Gallacher et al., Citation2018; Kwon et al., Citation2023; Yang et al., Citation2022), its potential to increase motivation and enhance L2 writing, and the effortlessness of engaging in chatbot-based dialogs. The mean values for the chatbot’s language proficiency, potential to support L2 practice, and capacity to foster L2 learning and development are 4.38, 4.25, and 3.97, respectively. These findings align with previous research conducted by Gallacher et al. (Citation2018), reporting similar advantages based on 202 responses from participants pertaining to the merits of chatbot-mediated communication. These advantages included a high level of intelligibility in the chatbot’s responses and exposure to new lexical items. Furthermore, Yang et al. (Citation2022) found that student engagement in voice-enabled communication with a chatbot led to increased motivation for oral interaction. Similarly, the present investigation reveals that the utilization of the pedagogical chatbot can contribute to heightened levels of learners’ motivation to engage in L2 writing practice, as reflected by a mean value of 4.10.

Given the chatbot’s sensitivity to language errors, occasional non-responsiveness prompted the students to focus more intently on linguistic accuracy, making necessary modifications to their output during communication. Aligned with the interactionist theoretical perspective (Chapelle, Citation2005; Long, Citation1996), this CALL-based dialog offered opportunities for extended engagement in attending to linguistic forms, refining output, and negotiating meaning, potentially leading to L2 improvement (Bibauw et al., Citation2022). Previous research (e.g. Gallacher et al., Citation2018) has revealed similar findings, indicating learners’ positive attitudes toward the effectiveness of chatbot utilization for L2 learning.

However, despite the chatbot’s appealing language quality (Coniam, Citation2014) and conversational ability, some students regarded its usefulness as merely temporary. This might be an outcome of the novelty effect (Fryer et al., Citation2017), associated with encountering a new L2 technology-based classroom experience. Through successive interactional attempts, the students became increasingly aware of the limitations in the chatbot’s conversational ability, such as its inability to comprehend all their output or to extend a conversation beyond a few exchanges, or even a brief exchange (Fryer et al., Citation2020). Furthermore, an awareness of the aforementioned drawbacks could lead to decreased motivation to continue such interactions (Fryer et al., Citation2019). In addition, the chatbot’s irrelevant responses may provoke feelings of boredom, thereby diminishing motivation and interest levels (Thompson et al., Citation2018). Similar findings were observed in the study conducted by Gallacher et al. (Citation2018), which identified the disadvantages of chatbot-mediated communication based on 248 responses from participants. The study highlighted the chatbot’s inability to generate questions in a progressive manner, as well as the unsuitability and irrelevance of many responses provided by the chatbot. In a related study by Kwon et al. (Citation2023), 43% of participants terminated the activity after attempting to rephrase their written output, while 24% chose not to continue the interaction with the chatbot. Additionally, 38% expressed a desire to terminate the activity when the AI tool failed to comprehend their output. Nevertheless, a discrepancy emerged from the results: while the students considered the chatbot capable of easily understanding their English, they also maintained neutral attitudes toward the relevance of the chatbot’s responses to their queries and reported frequent instances of communication breakdown. This finding suggests the students’ generally positive views of the chatbot’s comprehension ability. However, it concurrently reveals their dissatisfaction with particular occurrences of miscommunication.

Consistent with previous research conducted by Alm and Nkomo (Citation2020), Fryer and Carpenter (Citation2006), Gallacher et al. (Citation2018), and Shin et al. (Citation2021), the findings of this study indicate that the students expressed a favorable disposition toward the chatbot-based dialog experience, as it sparked their interest, with mean scores of 3.95 and 3.82, respectively. The findings of Shin et al. (Citation2021) further support these findings through conversation sentiment analysis, revealing a positive emotional stance among learners toward chatbot-mediated interaction, particularly high school learners who demonstrated a higher level of enjoyment, with average compound scores of .93 and .86. Nonetheless, for some students in the present study, this interest that was sparked during the initial phase of the interaction gradually waned over time. This finding aligns with Thompson et al.’s (Citation2018) research, which observed a decline in interest among students engaged in chatbot interactions throughout the experimental duration, as reflected by a decrease in mean scores from 4.18 to 3.84.

One finding reveals that interacting with the chatbot does not induce stress related to L2 mistakes (Fryer et al., Citation2020; Kwon et al., Citation2023). However, this finding is not in harmony with the students’ statements concerning the chatbot’s sensitivity to language errors and the requirement for careful attention to language forms to ensure successful interactions. Thus, the students might perceive the interaction as an anxiety-free activity (Fryer & Carpenter, Citation2006), where they feel comfortable even when making writing mistakes. Additionally, with accumulative awareness of the possibility of receiving unsatisfactory responses due to insufficient attention to the accuracy of their output, the students turned their attention to more accurate language without experiencing anxiety. In contrast, and contrary to the findings of Fryer and Carpenter (Citation2006), the students maintained a neutral stance toward chatbot-human interactions as less anxiety-provoking and stressful than human-human interaction. This perception might arise from the issues encountered during conversations, resulting in unnatural interactions (Thompson et al., Citation2018). Kwon et al. (Citation2023) also found that L2 learners exhibited neutral attitudes toward comfortableness while interacting with the chatbot. Moreover, for language learning purposes, the students considered such an interaction not more effective than chatting with peers (Gallacher et al., Citation2018). This observation might also point to the impact of the chatbot’s occasional unnatural output as well as limitations in its instructional ability.

Despite interacting with a pedagogical agent, some students questioned the chatbot’s instructional potential to explicitly elucidate linguistic aspects. Although the chatbot managed to generate satisfactory answers to some questions related to English grammar and vocabulary, it was conceived of as lacking the ability to function as a language instructor (Smutny & Schreiberova, Citation2020). This finding could be ascribed to irrelevant responses to several questions, and the students’ experiences with the chatbot seem to emphasize the lack of ability to answer language-related questions akin to an actual teacher, as succinct responses could not be fully satisfying.

The findings revealed a mixed disposition among the students regarding their inclination to utilize the chatbot in the future. Within the participant cohort, both willingness and unwillingness toward future interactions with the chatbot were evident. Willingness to engage in such interactions stems from the perceived capacity of the chatbot to provide support and facilitate improvement in L2 learning and use (Fryer et al., Citation2019). In line with these observations, Kwon et al. (Citation2023) reported a similar inclination among the majority of students who participated in chatbot-based written communication. Contrastingly, unwillingness is an outcome of the students’ experience-based awareness of various limitations of the chatbot. Given the English majors’ intensive and extensive involvement in L2, and their accumulated knowledge of best practices for language learning and effective communication (Alrajhi, Citation2020b), it is not unexpected that they were not highly satisfied with the conversational ability of such an innovative technology. Nonetheless, a synthesis of the findings suggests that the students’ perspectives on the interactions with the pedagogical chatbot are in line with the notion that such interactions can offer an array of opportunities for L2 learning.

6.2. Self-perceived L2 proficiency and views on the chatbot as a conversational partner

Most of the students’ views regarding the linguistic ability of the chatbot suggest that effective communication in AI-mediated conversations requires users to possess sufficient L2 knowledge and writing skills. On the other hand, some students perceived the chatbot as a source of motivation that is particularly beneficial for learners with lower L2 proficiency (Coniam, Citation2014; Yin & Satar, Citation2020), as it can provide easily comprehensible language, correct errors, and provide exposure to new vocabulary. The divergence of opinions might reflect differences in self-perceived competency, as different views could be associated with certain difficulties that the students encountered during the interactions, and how they felt regarding potential learning gains.

Contrary to the findings of Yin and Satar (Citation2020), the current study does not reveal significant differences between the two dominant groups (intermediate and high-intermediate learners) in terms of the effect of perceived L2 proficiency on attitudes toward the chatbot as a conversational partner. Yin and Satar (Citation2020) reported that some students with higher proficiency levels exhibited negative attitudes toward interacting with the pedagogical chatbot and were hesitant to engage in such interactions. In the present study, both groups expressed comparable levels of satisfaction with the chatbot. Notably, the only exception is that high-intermediate students exhibited significantly more positive views on the utility of the interaction for L2 practice (p = .031). This disparity in the findings could be ascribed to the notable difference in L2 proficiency levels between the participants in Yin and Satar’s (Citation2020) study, which consisted of undergraduate and graduate students. Conversely, the majority of participants in the current study reported intermediate or high-intermediate levels of proficiency. Consequently, the only variation in perceptions toward the chatbot in this study could be attributed to the subtle differences in L2 proficiency, a factor that influences human-machine interaction. Higher proficiency levels afford learners greater control and opportunities for successful and extended conversations with the chatbot.

7. Conclusion

This study explored the perspectives of EFL students regarding the utilization of a pedagogical chatbot as a conversational agent. The study sheds light on the positive perceptions surrounding the chatbot’s intelligibility, comprehension, supportive role in L2 practice and writing development, and potential to alleviate writing anxiety. However, the study also reveals certain drawbacks pertaining to the chatbot’s interactional and instructional abilities. These limitations include the lack of extended conversations, sensitivity to inaccurate language forms, and sporadic irrelevant responses. The learners’ perspectives unveil the potential of pedagogical chatbots in L2 education contexts (Coniam, Citation2014; Fryer & Carpenter, Citation2006; Gallacher et al., Citation2018; Kwon et al., Citation2023). Nevertheless, they also highlight the existing limitations of these AI tools (Gallacher et al., Citation2018; Grudin & Jacques, Citation2019; Smutny & Schreiberova, Citation2020; Thompson et al., Citation2018). It is evident that pedagogical chatbots have yet to reach a status where they provide learners with optimal L2 learning experiences. Furthermore, the study indicates that L2 proficiency does not affect overall perceptions of the chatbot-mediated interactions, with one notable exception. High-intermediate students demonstrated significantly more positive views regarding the chatbot’s usefulness for L2 practice. This suggests that while learners’ overall views are relatively similar, L2 proficiency level can influence their perceptions of such interactions (Fryer et al., Citation2019; Yin & Satar, Citation2020).

Several pedagogical implications can be derived from the findings. First, text-based chatbots can effectively stimulate L2 student engagement in writing practice, as foreign language learners generally exhibit positive attitudes toward novel learning tools (Alrajhi, Citation2020c). Thus, instructors can exploit this inclination by incorporating innovative CALL-based dialog systems into their instructional routine to promote L2 learning. However, such interest-provoking tools have several drawbacks. Curriculum designers and educators who aspire to incorporate AI chatbots into L2 instruction need to be aware that despite significant advancements in current conversational partner technology, further improvement is necessary to maximize their efficacy as pedagogical tools within the realm of L2 education (Coniam, Citation2014). Consequently, continuous use of chatbots for L2 purposes could result in waning interest and reduced motivation among students, which is attributable to the limitations of these tools. Second, to sustain L2 learner interest in such dialogs, L2 teachers can employ guided interactions and a diverse range of chatbot-mediated tasks. As indicated by the views on the chatbot, the students initially exhibited high interest when engaging in free-style interactions with the chatbot; however, the interest of some students gradually diminished. Therefore, interactions that lack predetermined tasks could eventually undermine the potential merits due to reduced interest.

Third, the students turned their attention more toward language form in order to effectively communicate with the chatbot. Therefore, L2 teachers can utilize such a text-based activity to focus on L2 accuracy, capitalizing on such attention to language structure. Fourth, the accurate English output generated by the chatbot offered learners opportunities for exposure to L2 input, thereby maximizing the potential of chatbot interactions to promote L2 learning. Fifth, high-quality interactions between users and pedagogical chatbots, coupled with their anthropomorphic attributes such as a conversational ability, can contribute to L2 teachers’ trust and acceptance of chatbots, fostering their willingness to embrace these emerging technologies in language learning. It is essential to recognize that apart from enjoyment and increased efficiency, the social and emotional dimensions pertinent to the use of AI tools play a pivotal role in user acceptance of such tools (Pelau et al., Citation2021). Moreover, the cultivation of relationships between learners and technological tools through chatbot-human interactions can offer substantial support, particularly in situations where direct human interactions may evoke anxiety (Kwon et al., Citation2023).

In light of these pedagogical implications, L2 educators are encouraged to consider the integration of chatbots into their instructional practices. However, there is a need to balance the novelty of such tools with careful task design, ensuring sustained motivation and interest among learners. Continued research and development in chatbot technology, coupled with effective teaching strategies, hold promise for enhancing L2 learning in the digital age.

8. Limitations of the study

This study acknowledges a few limitations that warrant consideration. First, the study relied on the utilization of one pedagogical chatbot within a single interaction session. Thus, the reported levels of motivation and interest exhibited by the students during the initial phase of the chatbot-mediated interactions may be ascribed to the novelty effect. Therefore, a future line of research could explore learner interaction with diverse chatbot models over an extended duration, which might yield different experiences and perspectives. The second limitation pertains to the findings obtained from self-reported data. While such data provide valuable insights into participants’ perceptions, it is advisable for future studies to incorporate objective measures as complementary indicators. These measures could include assessments of actual learning gains following a series of interaction sessions and evaluations of participants’ L2 proficiency levels. Third, most interactions with the chatbot were initiated by the students themselves, without predetermined learning objectives. This form of interaction could have contributed to diminished interest and motivation over time. Therefore, future studies could investigate learner experience and perception concerning structured interactions within pedagogical chatbot-mediated tasks that are guided by predefined objectives. Finally, a future avenue of research could involve the examination and evaluation of the potential of pedagogical chatbots across a broader range of L2 proficiency levels.

Disclosure statement

The author declares that there is no conflict of interest.

Additional information

Notes on contributors

Assim S. Alrajhi

Assim S. Alrajhi is an associate professor of applied linguistics in the Department of English Language and Literature, College of Languages and Humanities, at Qassim University in Saudi Arabia. His research interests include technology-enhanced language learning, L2 writing, and L2 vocabulary acquisition.

References

  • Alm, A., & Nkomo, L. M. (2020). Chatbot experiences of informal language learners: A sentiment analysis. International Journal of Computer-Assisted Language Learning and Teaching, 10(4), 51–65. https://doi.org/10.4018/IJCALLT.2020100104
  • Alrajhi, A. S. (2020a). EFL learners’ beliefs concerning the effects of accumulative gaming experiences on the development of their linguistic competence. Electronic Journal of Foreign Language Teaching, 17(2), 367–380. https://doi.org/10.56040/asla1731
  • Alrajhi, A. S. (2020b). English learners’ perceptions of video games as a medium for learning and integration into the English curriculum. MEXTESOL Journal, 44(4), 1–17.
  • Alrajhi, A. S. (2020c). Static infographics effects on the receptive knowledge of idiomatic expressions. Indonesian Journal of Applied Linguistics, 10(2), 315–326. https://doi.org/10.17509/ijal.v10i2.28596
  • Alrajhi, A. S. (2023). EFL learners’ perceptions and attitudinal fluctuations toward digital multimodal composition: A longitudinal approach. International Journal of Computer-Assisted Language Learning and Teaching, 13(1), 1–15. https://doi.org/10.4018/IJCALLT.317748
  • Bailey, D. (2019, August 27–29). Chatbots as conversational agents in the context of language learning [Paper presentation]. Proceedings of the Fourth Industrial Revolution and Education, South Korea. https://t.ly/vwp0
  • Bibauw, S., Van den Noortgate, W., François, F., & Desmet, P. (2022). Dialogue systems for language learning: A meta-analysis. Language Learning & Technology, 26(1), 1–24. https://hdl.handle.net/10125/73488
  • Bonett, D. G., & Wright, T. A. (2015). Cronbach’s alpha reliability: Interval estimation, hypothesis testing, and sample size planning. Journal of Organizational Behavior, 36(1), 3–15. https://doi.org/10.1002/job.1960
  • Braun, V., & Clarke, V. (2006). Using thematic analysis in psychology. Qualitative Research in Psychology, 3(2), 77–101. https://doi.org/10.1191/1478088706qp063oa
  • Chapelle, C. (2005). Interactionist SLA theory in CALL research. In J. L. Egbert & G. M. Petrie (Eds.), CALL research perspectives (pp. 53–64). Lawrence Erlbaum.
  • Coniam, D. (2014). The linguistic accuracy of chatbots: Usability from an ESL perspective. Text & Talk, 34(5), 545–567. https://doi.org/10.1515/text-2014-0018
  • Copulsky, J. (2019). Do conversational platforms represent the next big digital marketing opportunity? Applied Marketing Analytics, 4(4), 311–316.
  • Creswell, J. W. (2009). Research design: Qualitative, quantitative, and mixed methods approaches (3rd ed.). Sage Publications, Inc.
  • Dale, R. (2016). The return of the chatbots. Natural Language Engineering, 22(5), 811–817. https://doi.org/10.1017/S1351324916000243
  • Dörnyei, Z. (2007). Research methods in applied linguistics: Quantitative, qualitative, and mixed methodologies. Oxford University Press.
  • Fryer, L. K., Ainley, M., Thompson, A., Gibson, A., & Sherlock, Z. (2017). Stimulating and sustaining interest in a language course: An experimental comparison of chatbot and human task partners. Computers in Human Behavior, 75, 461–468. https://doi.org/10.1016/j.chb.2017.05.045
  • Fryer, L., & Carpenter, R. (2006). Bots as language learning tools. Language Learning & Technology, 10(3), 8–14. http://hdl.handle.net/10125/44068
  • Fryer, L. K., Coniam, D., Carpenter, R., & Lăpușneanu, D. (2020). Bots for language learning now: Current and future directions. Language Learning & Technology, 24(2), 8–22. http://hdl.handle.net/10125/44719
  • Fryer, L. K., Nakao, K., & Thompson, A. (2019). Chatbot learning partners: Connecting learning experiences, interest and competence. Computers in Human Behavior, 93, 279–289. https://doi.org/10.1016/j.chb.2018.12.023
  • Gallacher, A., Thompson, A., & Howarth, M. (2018). “My robot is an idiot!” – Students’ perceptions of AI in the L2 classroom. In P. Taalas, J. Jalkanen, L. Bradley & S. Thouësny (Eds.), Future-proof CALL: Language learning as exploration and encounters – Short papers from EUROCALL 2018 (pp. 70–76). Research-publishing.net. https://doi.org/10.14705/rpnet.2018.26.815
  • Golossenko, A., Pillai, K. G., & Aroean, L. (2020). Seeing brands as humans: Development and validation of a brand anthropomorphism scale. International Journal of Research in Marketing, 37(4), 737–755. https://doi.org/10.1016/j.ijresmar.2020.02.007
  • Grudin, J., & Jacques, R. (2019). Chatbots, humbots, and the quest for artificial general intelligence. In Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems. ACM. https://doi.org/10.1145/3290605.3300439
  • Kwon, S. K., Shin, D., & Lee, Y. (2023). The application of chatbot as an L2 writing practice tool. Language Learning & Technology, 27(1), 1–19. http://hdl.handle.net/10125/73541
  • Long, M. (1996). The role of the linguistic environment in second language acquisition. In W. C. Ritchie & T. K. Bhatia (Eds.), Handbook of second language acquisition (pp. 413–468). Academic Press.
  • Luo, X., Tong, S., Fang, Z., & Qu, Z. (2019). Frontiers: Machines vs. humans: The impact of artificial intelligence chatbot disclosure on customer purchases. Marketing Science, 38(6), 937–947. https://doi.org/10.1287/mksc.2019.1192
  • Melián-González, S., Gutiérrez-Taño, D., & Bulchand-Gidumal, J. (2021). Predicting the intentions to use chatbots for travel and tourism. Current Issues in Tourism, 24(2), 192–210. https://doi.org/10.1080/13683500.2019.1706457
  • Pelau, C., Dabija, D. C., & Ene, I. (2021). What makes an AI device human-like? The role of interaction quality, empathy and perceived psychological anthropomorphic characteristics in the acceptance of artificial intelligence in the service industry. Computers in Human Behavior, 122, 106855. https://doi.org/10.1016/j.chb.2021.106855
  • Shah, H., Warwick, K., Vallverdú, J., & Wu, D. (2016). Can machines talk? Comparison of Eliza with modern dialogue systems. Computers in Human Behavior, 58, 278–295. https://doi.org/10.1016/j.chb.2016.01.004
  • Shin, D., Kim, H., Lee, J. H., & Yang, H. (2021). Exploring the use of an artificial intelligence chatbot as second language conversation partners. Korean Journal of English Language and Linguistics, 21, 375–391. https://doi.org/10.15738/kjell.21.202104.375
  • Smutny, P., & Schreiberova, P. (2020). Chatbots for learning: A review of educational chatbots for the Facebook messenger. Computers & Education, 151, 103862. https://doi.org/10.1016/j.compedu.2020.103862
  • Thompson, A., Gallacher, A., & Howarth, M. (2018). Stimulating task interest: Human partners or chatbots? In P. Taalas, J. Jalkanen, L. Bradley & S. Thouësny (Eds.), Future-proof CALL: Language learning as exploration and encounters – Short papers from EUROCALL 2018 (pp. 302–306). Research-publishing.net. https://doi.org/10.14705/rpnet.2018.26.854
  • Winkler, R., & Söllner, M. (2018). Unleashing the potential of chatbots in education: A state-of-the-art analysis. In Academy of Management Annual Meeting (AOM), Chicago, USA.
  • Yang, H., Kim, H., Lee, J., & Shin, D. (2022). Implementation of an AI chatbot as an English conversation partner in EFL speaking classes. ReCALL, 34(3), 327–343. https://doi.org/10.1017/S0958344022000039
  • Yang, H., Kim, H., Shin, D., & Lee, J. H. (2019). A study on adopting AI-based chatbot in elementary English-speaking classes. Multimedia-Assisted Language Learning, 22(4), 184–205. https://doi.org/10.15702/mall.2019.22.4.184
  • Yin, Q., & Satar, M. (2020). English as a foreign language learner interaction with chatbots: Negotiation for meaning. International Online Journal of Education and Teaching (IOJET), 7(2), 390–410. http://iojet.org/index.php/IOJET/article/view/707