Information & Communications Technology in Education

Siri as an interactive pronunciation coach: its impact on EFL learners

Article: 2304245 | Received 30 Sep 2023, Accepted 06 Jan 2024, Published online: 14 Feb 2024

Abstract

Language is a powerful tool that facilitates communication, knowledge acquisition, and negotiation of meaning, but it is a complex skill for English as a foreign language (EFL) learners, who often encounter various challenges. Among the critical language skills, pronunciation holds a significant place. Because it is challenging, many EFL teachers find it necessary to teach it explicitly, so exploring innovative approaches to address this challenge is essential. This preliminary study investigated how Siri, a voice assistant tool, can be integrated with in-class instruction to improve the pronunciation of EFL learners. The study demonstrated that Siri, when combined with teacher instruction, could lead to better learning gains than in-class instruction and feedback alone. Forty EFL learners participated; twenty-five who did not have iPhones compatible with Siri were treated as the control group, whereas fifteen with iPhones compatible with Siri were treated as the experimental group. Each group was assigned an experienced English teacher to give instructions and feedback. The primary data collection method involved students reading the assigned tasks, recording them, and sending the recordings to their respective teachers for feedback and to their peers for peer review. Cohen’s kappa coefficient and an independent t-test were used to analyze the data. The results showed that Siri combined with in-class instruction improved the students’ pronunciation skills more than in-class instruction alone.

Introduction

Language is a powerful tool that allows people to communicate, access knowledge, and negotiate meaning. However, it is complex, and EFL learners experience various challenges in acquiring it (Mora-Flores, Materials, et al., Citation2014). One of the essential language skills is pronunciation (Gilakjani, Citation2016); however, most EFL teachers need to teach it explicitly (Yang et al., Citation2022). Pronunciation is also one of the most difficult skills for EFL teachers to teach and for learners to acquire. The fundamental principle underlying teaching and learning, including pronunciation and technology, is to empower students to transcend the traditional teacher-centric model and take on more autonomous and self-directed roles. By leveraging technology appropriately and fostering active collaboration between students, education can become more transformative and impactful (Rospigliosi, Citation2022).

According to Jarosz (Citation2019), learners should receive materials and instruments to improve their pronunciation. Jarosz also believes that good pronunciation skills are an essential element of the learners’ ability to produce intelligible speech, and emphasized that, alongside vocabulary and grammar, pronunciation encompasses the mechanical aspects of communication skills. The production of intelligible utterances is a prevalent problem among EFL learners. Kang (Citation2014) argues further that comprehensible pronunciation is one of the primary aims of pronunciation instruction and an integral component of communicative competence. Furthermore, Sullivan (Citation2010) maintains that achievable objectives that are practical and appropriate for learners’ communication needs should be set. Fluent communication skills, including pronunciation and writing, are needed to enable EFL learners to convey ideas and concepts in various contexts (Tsai, Citation2019). Therefore, EFL learners must speak English as understandably as possible to communicate and interact effectively across contexts and spaces.

Jarosz (Citation2019) also views pronunciation as the primary oral communication skill. Pronunciation instruction has grown in importance among EFL learners because of the awareness that the most pressing, justifiable, and sensible objective of teaching pronunciation is not to acquire perfect or native-like pronunciation but to produce intelligible and understandable speech. Nurani and Rosyada (Citation2015) stated that there is a need to balance pronunciation with other communication skills. They also expressed that teachers and technology play a significant role in helping students develop their pronunciation skills to produce comprehensible utterances. Although the ability to speak English is influenced by other sub-skills, including pragmatics, grammar, and vocabulary, Nurani and Rosyada view pronunciation as essential. With acceptable pronunciation, the student’s speech can be understood.

With poor pronunciation, however, the learner’s speech would be very difficult to understand, whether or not it contains other errors. García (Citation2017) also expressed that pronunciation is the foundation of oral communication and one of the basic skills for EFL learners. García added that without pronunciation there would be no verbal communication or spoken language. Therefore, the aim of helping EFL learners improve their pronunciation is to enable them to improve their oral communication (Prodanovska, Citation2017).

Integrating technology in classroom learning is one way to help EFL learners improve their pronunciation. Generally, technology has transformed all aspects of life, including teaching a language, either as a foreign or as a second one. In addition, technology and multimodal tools make learning heuristic and engaging, boosting productivity and interactivity in the learning process (Pargman & Jahnke, Citation2019). This study uses Siri, Apple’s mobile intelligent assistant, to enhance learners’ pronunciation skills through comprehensible commands.

The present study

The literature review has illuminated the critical role of pronunciation in language learning, especially for EFL learners. It emphasizes the challenges faced by both teachers and learners in teaching and acquiring proper pronunciation skills (Gilakjani, Citation2016). Additionally, the review underlines the potential of technology, such as voice assistant tools, to facilitate language learning (Pargman & Jahnke, Citation2019). In the context of voice assistant tools, the literature review shows that Siri, an artificial intelligence voice recognition software, has been widely used for practicing pronunciation (Moore, Citation2019). However, while some studies have explored technology-assisted pronunciation training (Zhu Zhang & Irwin, Citation2023), this research is unique in investigating how Siri can enhance pronunciation skills through comprehensible commands. Siri’s role in language training needs additional study (AbuSa’aleek, Citation2014).

EFL teachers have employed voice assistants and mobile-assisted language learning for years to improve sub-skills and language. This study, which builds on prior work, will examine how using Apple’s Siri in conjunction with in-class instructions can improve pronunciation learning among EFL students. However, there needs to be more research on using Apple’s Siri as a resource for learning English pronunciation. The rationale for conducting further research on Apple’s Siri as a resource for learning English pronunciation lies in the existing gap in the literature regarding the specific use of comprehensible commands with voice assistant tools. While technology-assisted pronunciation training has been explored, the literature lacks a detailed examination of Siri’s unique potential in facilitating pronunciation improvement through specific commands. Aligning with communicative language teaching principles and building on established concepts prioritizing intelligibility over native-like pronunciation, this study seeks to adapt traditional pronunciation instruction to the dynamic educational landscape of 2023. Recognizing Siri’s practical usage, accessibility, and integration of cutting-edge technology, the research aims to contribute valuable insights into how Siri, as an everyday tool, can effectively enhance pronunciation skills among EFL learners, addressing a crucial aspect that has not been comprehensively explored in current L2 literature.

Siri is a virtual assistant developed by Apple Inc. It is a voice-controlled intelligent assistant integrated into Apple’s operating systems, including iOS, iPadOS, macOS, watchOS, and HomePod devices. Siri uses natural language processing and voice recognition to interpret and respond to users’ commands and queries. Siri is one of the mobile intelligent assistants used today to help EFL learners improve their pronunciation skills. Considering this, the present study examined how Siri can be integrated with in-class instructions to improve pronunciation learning among EFL learners. To achieve this, the study is guided by the following research questions:

  1. To what extent can the integration of Siri into pronunciation learning be more effective than traditional classroom instruction in improving the pronunciation skills of Saudi EFL learners?

  2. What are the aspects of Apple’s Siri that Saudi EFL students find appealing and unappealing in the context of learning pronunciation?

Teaching pronunciation

Three approaches are recommended to help EFL teachers teach pronunciation or oral skills: the integrative approach, the analytic-linguistic approach, and the intuitive-imitative approach (Pennington & Rogerson-Revell, Citation2018). These approaches combine both modern techniques and traditional methods. The analytic-linguistic approach involves teachers providing clear instructions on pronunciation, such as articulatory descriptions, vocal charts, and phonetic alphabets. The intuitive-imitative approach involves learners listening to and imitating target language sounds without explicit instructions. Technologies such as computer-based programs, websites, audiotapes, and videos are used.

Researchers and teachers in the contemporary integrative approach view pronunciation as a critical communication component and not just an isolated sub-skill or drill. EFL learners rehearse pronunciation by engaging in meaningful task-based activities, and they use pronunciation-focused learning activities to foster pronunciation learning. There is more emphasis on the supra-segmental features, including intonation, rhythm, and stress, as practiced in extended discourse beyond the word and phoneme levels. Learners are taught pronunciation to help them meet their specific needs.

Various studies have examined new approaches to technology integration in language learning. For instance, Zhu Zhang and Irwin (Citation2023) performed a meta-analytic investigation of the impact of digital devices on ESL learners. Results showed that computer-mediated glosses significantly affected incidental vocabulary learning and pronunciation among ESL learners. In addition, Freynik et al. (Citation2012) reviewed various technologies and their efficacy in EFL classrooms and determined that technology-assisted pronunciation training effectively enhances pronunciation and interactivity among EFL learners. In line with these studies, Evers and Chen (Citation2020) examined the effects of an automatic speech recognition (ASR) system with peer feedback on pronunciation instruction for adults. Their study, conducted with working adults in Taiwan, used the Speechnotes ASR software. The results indicated a significant difference in pronunciation post-tests between the experimental and comparison groups, highlighting the efficacy of peer feedback in correcting pronunciation. Furthermore, the experimental group expressed higher satisfaction with the software than the comparison group. Unlike the studies by Evers and Chen (Citation2020) and Hirose and Kawai (Citation2000), which emphasized minimal pairs, this study focuses on using Siri to enhance learners’ pronunciation skills through comprehensible commands.

The use of intelligent personal assistants (IPAs) in enhancing language acquisition

The field of second language learning has witnessed a surge in interest surrounding the use of intelligent personal assistants (IPAs) to enhance language acquisition. Notably, Dizon et al.’s (Citation2017) case study investigated Amazon’s IPA, Alexa, in the context of EFL learning. This study found that Alexa provided indirect pronunciation feedback and conversational opportunities, demonstrating its potential in language instruction. Dizon et al. (Citation2022) expanded the study to L2 English students’ listening and speaking skills. The study showed considerable speaking proficiency gains, highlighting the need for more research. Additionally, Moussalli and Cardoso’s (2020) study on voice-controlled IPAs like Amazon Echo’s Alexa highlighted the adaptability of IPAs to learners’ accented speech, providing stress-free input exposure and output practice. These studies collectively underscore the promising role of IPAs in addressing the limitations of traditional classrooms and fostering language development among EFL learners.

Moving forward, Tai’s (Citation2022) investigation into Google Assistant on smartphones expanded the understanding of IPAs’ impact on oral proficiency outside the classroom. This study revealed that Google Assistant significantly improved EFL learners’ oral proficiency, offering high-quality input, multimodal feedback, and reduced anxiety in speaking English. Furthermore, Tai and Chen’s (Citation2022) research delved into feedback presentation modes, emphasizing the effectiveness of IPA-mediated interaction with multimodal feedback in enhancing adolescent EFL learners’ speaking proficiency. These findings collectively indicate the potential of IPAs in providing authentic language interactions, encouraging self-directed learning, and mitigating learners’ apprehensions, thereby contributing significantly to language learning outcomes.

Employing Siri in the classroom to enhance pronunciation

Speech Interpretation and Recognition Interface (Siri) is one of the modern mobile technologies that EFL learners can use to improve their pronunciation skills. Developed by Apple Inc., Siri is an artificial intelligence (AI) voice recognition software that allows iPhone users to operate the phone and its applications with ordinary spoken commands. The Siri mobile assistant is preloaded on any Apple device and functions as the user’s assistant. Some of the uses of Siri include reading and sending emails or texts, locating contacts, determining food calorie content, defining words, recognizing music, giving directions, and setting the alarm (Moore, Citation2019). Even though Siri was designed to equip Apple customers with a personal assistant, it has been widely used to practice pronunciation. To perform user requests, Siri needs to understand the utterances of the interlocutor. Therefore, teachers can use Siri to teach EFL learners any oral feature of language regarding intelligibility through negotiation for meaning (Ghanizadeh et al., Citation2015). However, these language features are limited to the languages available through the Apple device.

In the context of second language learning, the concepts of comprehensibility and intelligibility play a pivotal role in understanding how learners are perceived and understood by listeners. Comprehensibility refers to the extent to which a listener can understand the meaning intended by the speaker, encompassing not only individual sounds and words but also the overall message being conveyed (Derwing & Munro, Citation1997). On the other hand, intelligibility refers to the listener’s ability to recognize and interpret specific phonetic and linguistic elements, such as sounds, syllables, and words, in the speaker’s utterances. These concepts have been extensively explored in previous research, with studies like Nagle and Huensch (Citation2021) shedding light on the intricate relationship between accent, intelligibility, and comprehensibility in second language acquisition. In the realm of technology-assisted language learning, integrating voice assistant tools, such as Siri, into pronunciation instruction can enhance both comprehensibility and intelligibility. As these tools decipher learners’ speech patterns and provide feedback, they significantly improve pronunciation accuracy, ensuring that learners are both understood and capable of producing speech sounds accurately.

When it comes to the intelligibility of pronunciation, it’s critical to remember Crystal’s (Citation2017) argument that even if speakers use perfect grammar and vocabulary, a certain pronunciation threshold must be achieved to avoid communication breakdown. Similarly, Marlina and Giri (Citation2014) maintained that mutual intelligibility, not perfect pronunciation, should always be the ultimate goal of language learning. They pointed out that mutual intelligibility is vital because English is now an international phenomenon, and the number of speakers of English as a lingua franca far exceeds the number of native English speakers.

Based on Marlina and Giri’s arguments, Siri helps pronunciation through various steps. First, it evaluates speech to determine whether or not it is comprehensible. Siri will perform the requested function if it comprehends what the speaker is saying. If Siri fails to comprehend the speaker’s speech, however, it will search the web for answers or execute a different process, stating that it doesn’t understand. This means that understandable output does not always equal impeccable pronunciation. Like human interlocutors, Siri is context-aware and can interpret utterances containing minor errors that do not significantly interfere with meaning. Figure 1 illustrates the Marlina and Giri arguments.

Figure 1. Siri’s intelligibility pattern model and its involvement in the negotiation of meaning.


This study reports on an investigation of how Siri can be used to enhance EFL learners’ pronunciation. The researcher also discusses both the positive and negative affordances of Siri.

Speech recognition software corrective feedback

Recent research has shown the value and importance of using technology to aid in language acquisition, especially for enhancing pronunciation. Sheen (Citation2011) identifies two processes in which corrective feedback occurs. The first is natural conversation, in which the relationship between speakers lets them correct each other’s mistakes. The second occurs when the intended meaning fails to be conveyed. In both cases, the negotiation of meaning is integral. Siri helps in the first process by transforming interlocutors’ speech into written discourse: if the user pronounces the request correctly, Siri will spell it correctly. When the intended meaning fails to get across, speakers can still achieve their purpose by analyzing Siri’s reactions. If Siri performs the request successfully, the pronunciation is comprehensible; if it fails to carry out the request, the pronunciation was perhaps faulty.
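The implicit feedback described above, in which the learner compares the intended command with the text Siri displays, can be sketched in code. This is a minimal Python illustration, not part of the study and not a Siri API: it assumes the learner copies the on-screen transcription by hand, and the function names (`normalize`, `transcription_feedback`) and the sample command are hypothetical.

```python
import re
from difflib import SequenceMatcher


def normalize(text: str) -> list[str]:
    """Lowercase the text and split it into words, ignoring punctuation."""
    return re.findall(r"[a-z']+", text.lower())


def transcription_feedback(intended: str, transcribed: str) -> dict:
    """Compare the intended command with the transcription shown on screen.

    Assumption: the transcription is typed in by the learner or teacher;
    nothing here talks to Siri itself.
    """
    target = normalize(intended)
    heard = normalize(transcribed)
    matcher = SequenceMatcher(a=target, b=heard, autojunk=False)

    # Collect every stretch where the transcription departs from the target.
    misheard = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            misheard.append((" ".join(target[i1:i2]), " ".join(heard[j1:j2])))

    return {
        "intelligible": not misheard,   # True only for a word-for-word match
        "similarity": matcher.ratio(),  # 1.0 means identical word sequences
        "misheard": misheard,           # (intended, transcribed) pairs
    }


if __name__ == "__main__":
    report = transcription_feedback(
        "Set an alarm for three thirty",
        "Set an alarm for tree thirty",
    )
    print(report["misheard"])
```

A mismatch such as "three" transcribed as "tree" points the learner to the word to re-attempt, which mirrors the spelling-based error signal discussed in this section. A word-for-word match is a stricter criterion than Siri successfully carrying out the request, so the sketch covers only the transcription side of the feedback.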

Moreover, according to Kevin Jeisy (Citation2015), all pronunciation CALL systems must be able to explain to learners their errors, the severity of those errors, and how to correct them. Although Siri doesn’t explicitly satisfy all three requirements identified by Jeisy, it implicitly indicates errors (through spelling) and their severity (through intelligibility). However, it still needs to provide a means for EFL learners to correct their mistakes.

Zhu Zhang and Irwin (Citation2023) examined the effects of computer-mediated annotations on English as a Second Language students. The results showed that computer-mediated glosses greatly benefited ESL students’ incidental vocabulary acquisition and pronunciation. In addition, Freynik et al. (Citation2012) looked into the use of several technologies in EFL classes and came to a similar conclusion: EFL students benefited greatly from technology-assisted pronunciation instruction. In contrast to studies focusing on minimal pairs, the present research focuses on how students may improve their pronunciation by giving comprehensible commands to Siri, a voice recognition program.

AbuSa’aleek (Citation2014) studied the prevalence and effects of mobile-assisted language learning and voice assistant tools on English proficiency. Educators and researchers need to know how mobile technologies can be successfully incorporated into teaching and learning, and this study illuminated the benefits and downsides of such tools to help them do just that. The results emphasized the widespread use of cell phones in education and called for more research into the efficacy of mobile technology in ESL classes.

Hirose and Kawai (Citation2000) studied speech recognition software that could listen and provide feedback. They developed software to help in teaching Japanese as an L2. The software was specifically meant to aid teachers of the Japanese language in enhancing their pronunciation at the phoneme level. The researchers focused on the length of Japanese double-mora phonemes as opposed to single-mora phonemes. While using the software, the study participants were asked to recite minimal pairs containing the target vowel length. The software was based on native Japanese speakers’ perception of confusability concerning these vowel lengths. The researchers concluded that a learner quickly grasps the relevant vowel length cues; however, this does not necessarily mean that the learner can retain the knowledge for future use in a more informal setting.

Figure 2 illustrates how Hirose and Kawai’s (Citation2000) software models target phonemes, interpret users’ speech production, and provide corrective feedback. Figure 3, in turn, breaks down the basis of the software’s length measurement. For example, the words "koi" and "kooi" are distinguished by the duration (in milliseconds) of the o-sound. If the speaker takes 120 milliseconds to produce the o-sound, the result is a perfect pronunciation of "koi", whereas 250 milliseconds yields an excellent articulation of "kooi". If the duration of the o-sound falls between these benchmark times (120–250 ms), the software provides feedback to help the participant get closer to the target phoneme. These benchmark durations resulted from testing native Japanese speakers’ perception of speech samples with different phoneme lengths.
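The duration-based evaluation described above can be sketched as a short Python example. Only the two benchmark values (120 ms and 250 ms) come from the text; the function names, the linear position score, and the wording of the feedback are assumptions made for illustration and do not reproduce Hirose and Kawai’s actual system.

```python
SHORT_MS = 120.0  # benchmark duration for the o-sound in "koi" (from the text)
LONG_MS = 250.0   # benchmark duration for the o-sound in "kooi" (from the text)


def continuum_position(duration_ms: float) -> float:
    """Place a measured duration on the short-long continuum.

    Returns 0.0 at or below the short benchmark, 1.0 at or above the long
    benchmark, and a linearly interpolated value in between (an assumption).
    """
    raw = (duration_ms - SHORT_MS) / (LONG_MS - SHORT_MS)
    return max(0.0, min(1.0, raw))


def length_feedback(duration_ms: float, target: str) -> str:
    """Return a hypothetical corrective hint for the target word."""
    if target not in ("koi", "kooi"):
        raise ValueError("target must be 'koi' or 'kooi'")
    if target == "koi":
        # The short vowel is on target once it is at or below the benchmark.
        return "on target" if duration_ms <= SHORT_MS else "shorten the vowel"
    # The long vowel is on target once it reaches the long benchmark.
    return "on target" if duration_ms >= LONG_MS else "lengthen the vowel"


if __name__ == "__main__":
    for measured in (120.0, 185.0, 250.0):
        print(measured, continuum_position(measured),
              length_feedback(measured, "kooi"))
```

A duration of 185 ms sits halfway along the continuum, so under these assumptions a learner aiming at "kooi" would be told to lengthen the vowel, while a learner aiming at "koi" would be told to shorten it.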

Figure 2. Hirose and Kawai’s (Citation2000) system-user interaction (p. 134).


Figure 3. Hirose and Kawai’s (Citation2000) length of measurement and evaluation (p. 134).


Evers and Chen’s (Citation2020) and Hirose and Kawai’s (Citation2000) studies offered a much-appreciated insight into how software and speech recognition mobile applications work. Nevertheless, although decontextualized minimal pair drills can serve the same research purpose, they are not effective for language learning. The attention given to the target phonemes may assist learners in producing them; however, in natural speech production, this attention is divided between pronunciation and negotiation of meaning. Therefore, to assist in developing phoneme pronunciation skills that are applicable globally, studies must be conducted to show how language learners can negotiate for meaning through speech-recognition mobile applications and software in a communicative and contextualized way.

Considering the current research, it is clear that technologically based language learning treatments have proven effective in boosting vocabulary, enhancing pronunciation, and facilitating voice assistant tools for learning. However, further research and testing are required to evaluate Siri’s performance in language learning. This research hopes to fill this knowledge gap and contribute to developing tech-based approaches to language education by exploring how Siri may improve learners’ pronunciation abilities.

Hirose and Kawai’s (Citation2000) model reflects timeless pronunciation instruction concepts in 2023 AI-based learning. The 2000 software may be old, but the core concept of using speech recognition to improve pronunciation remains. Figures 2 and 3 show how phonemes and speech characteristics are carefully analyzed and measured for successful pronunciation instruction. Despite technical advances, speech recognition techniques for pronunciation enhancement remain a useful framework for new language learning experiences. Moreover, the limitations identified in earlier studies, such as the decontextualized nature of minimal pair drills, underscore the need for innovative applications of speech recognition technology. Hirose and Kawai’s basic principles can be used by Siri and other AI-driven tools to explore communicative and interactive language learning experiences. Siri’s capacity to understand natural language instructions and deliver contextually rich feedback supports communicative language training, which emphasizes accurate pronunciation, practical communication, and meaning negotiation. This study adapts theoretical frameworks to cutting-edge technology to gain insights into AI-driven tools that improve pronunciation skills in 2023’s dynamic educational landscape.

Pronunciation skills enhancement through Siri

Siri can be an appropriate pronunciation tool in tasks similar to real-life communication, as it is context-aware and can automatically interpret and correct small grammar mistakes. Siri can also help pronunciation because it primarily aims at intelligibility. Most EFL learners seek to achieve the accent of native English speakers, and Siri is an effective tool for them because it is accessible and most people, including EFL learners, are familiar with it. Some voice assistants are free of charge and preinstalled on students’ devices. Students are also more likely to use the apps outside the classroom due to their diverse functions, which do not explicitly focus on language features (Al Shamsi et al., Citation2022). Siri makes pronunciation practice easier, targeting comprehensibility and intelligibility, and it is also accent-sensitive. EFL teachers can use Siri to teach learners pronunciation based on British, American, Canadian, and Australian accents, and pronunciation practice with voice assistants such as Siri is naturally motivating.

The value of voice assistants such as Siri in language practice extends beyond pronunciation, as EFL students can also sharpen their listening skills. Initially, students may want to ask voice assistants such as Siri various "silly" questions. Siri will reward them with unexpected and humorous answers that they will be eager to comprehend (Al Shamsi et al., Citation2022). Teachers can give students tasks that require them to comprehend and respond correctly to the follow-up questions posed by Siri. Learners can pick up natural language and vocabulary from Siri, whose responses are composed of high-frequency expressions and meaningful words.

Siri is one of the essential voice-assistant tools that can assist and facilitate pronunciation teaching in EFL classrooms. Most of these tools collect the voice command and convert the sound into text (Zou et al., Citation2020). Once that is done, the next step involves comprehending what the learner is asking for, a process known as natural language processing. The tools then parse the syntactic structure of the text, extracting parts of speech such as verbs, adjectives, and nouns, as well as the general sentence intonation. Through this, Siri searches the Apple database for the right sound of the text.

It is important to note that no single study has focused on Siri and its impact on L2 pronunciation enhancement. Most of the published works deal with other voice assistants like Alexa or voice assistants as a collection. Zou et al. (Citation2020) focused on MALL rather than on Siri. The absence of Siri-focused L2 empirical investigations highlights a critical gap in the existing literature, underscoring the pressing need for rigorous research specifically delving into Siri’s efficacy as a pronunciation tool in L2 language learning contexts. Consequently, the present study emerges as a vital endeavor to contribute empirical insights, addressing this conspicuous gap and shedding light on the practical implications of integrating Siri into EFL classrooms for pronunciation enhancement.

In conclusion, the literature review provides a wide-ranging overview of pronunciation instruction for second language learners and highlights the development of various approaches. The three most common approaches to teaching pronunciation today are integrative, analytic-linguistic, and intuitive-imitative, combining cutting-edge research with time-tested practices. One standard method to improve articulation instruction is using technical instruments like computer programs, internet resources, audio recordings, and visual media. The research then moves on to a literature review of voice assistant tools as a helpful instrument for learning a new language. The paper elaborates on the merits of mobile learning tools, including their portability, interactivity, and ability to provide personalized feedback. Mobile learning tools may help students improve their language skills, including vocabulary, grammar, and pronunciation. However, the analysis highlights the challenges associated with mobile learning tools, such as potential disruptions, costs, and limitations compared to traditional learning options.

Scholars have conducted studies in various contexts to determine the usefulness of voice assistant tools, and they have shown that these tools positively impact students’ self-regulation skills, motivation for learning, and phonological competence. Recent studies have shown that mobile apps and platforms may greatly aid language learning, especially concerning improving phonetic articulation. Despite several drawbacks, the above analysis recognizes the promise of voice assistant tools for language study. The review highlights the relevance of carefully planning and implementing voice assistant tools to enhance learners’ linguistic proficiency in a practical and engaging way. The review also points to the need for further study to address the issues related to voice assistant tools and to research their full potential in language instruction. It lays the groundwork for further research into Siri’s role in enhancing speakers’ pronunciation skills and highlights the benefits of using technology in language instruction.

Methodology

Participants

To improve testing reliability, and given the study’s interest in pronunciation difficulties arising from language interference, a first-year class of English Language majors comprising forty Saudi EFL learners took part in the study. The study was conducted at a Saudi university in the central part of the country. The participants were males aged between 18 and 25 years. Female students were excluded from the study because of regional cultural norms and practices. Saudi Arabia’s traditional culture restricts unrelated men and women from interacting in public settings, including schools, and this investigation took place in a gender-segregated classroom and campus. The researcher therefore had limited access to female students on campus and in classes, owing to cultural and institutional barriers. Engaging female students in this project would have required different arrangements, permissions, and accommodations, which were logistically unrealistic.

Of the forty students, only fifteen had iPhone devices compatible with Siri. These fifteen students were selected as the experimental group for the study, while the remaining 25 were used as the control group. One teacher was assigned to teach pronunciation skills to each group: Teacher A was given the experimental group, while Teacher B was given the control group. Both groups received three hours of in-class instruction per week. The control group used a standard curriculum involving only teacher instructional methods, which included teaching pronunciation using textbooks and assessments. The experimental group used both Siri and teacher instruction to learn pronunciation. The control group results served as a yardstick to evaluate the efficacy of Siri in pronunciation learning. Each teacher provided feedback for their group.

Instruments

The study adopted semi-structured observations, interviews, and note-taking to collect data. These methods were used together with an instrument known as Tasks with Siri to elicit participants’ pronunciation. The instrument comprises guided practice exercises based on segmental phonemes likely to challenge Arabic EFL learners. The researcher identified these segmental phonemes through a contrastive analysis of the phonotactics and phoneme inventories of North American English and Arabic. Contrastive analysis is a linguistic approach used to identify the differences between the native language (L1) and the target language (L2). It helps researchers and educators anticipate challenges that learners might face based on differences in the phonetic systems, grammatical structures, and vocabulary of the two languages (Alwohaibi, Citation2019). In the case of Arabic speakers learning English, several studies have explored the challenges related to phonetic differences. Arabic and English have different phonological systems, including distinct consonant and vowel sounds (Alharbi & Aljutaily, Citation2020). Arabic, for instance, has sounds that do not exist in English, and vice versa. Consonant clusters, specific vowel sounds, and stress patterns often pose difficulties for Arabic learners of English.

The contrastive analysis revealed potential challenges with consonant clusters, phonemes, and word stress. In the tasks assigned to participants, the researcher ensured that all requests and questions contained the targeted problematic segmentals. This was done deliberately to make the purpose of the study less obvious and to lower the participants’ affective filters by ensuring they could complete at least some of the tasks. The researcher was mindful that, since the tasks only required the participants to read the sentences aloud and record them, English orthography could cause some challenges. This is because Arabic orthography corresponds closely to its phonemic inventory, whereas English orthography does not. In English, a grapheme can represent various phonemes, and these can overlap with the phonemes represented by other graphemes. Most Arabic-speaking learners of English tend to pronounce every grapheme they encounter when reading new words.

In addition, the researcher interviewed 10 participants after the study to determine (a) whether they understood the communication breakdowns that occurred while completing the tasks, (b) their likes and dislikes regarding Siri, and (c) their recommendations for using Siri for language learning in the future. The interviews were in-depth, semi-structured, and face-to-face, and the 10 students were selected based on their willingness to participate. The interviews were conducted mostly in Arabic because the participants were at a beginner level. The researcher then translated the responses into English to generate the themes. Each individual interview lasted about 15 to 25 minutes. The researcher followed up the individual interviews with a group interview (about 40 minutes) to gain more insight into the participants’ collective conceptions of Siri. The purpose of this approach was to enrich the data with the interviewees’ shared and divergent perspectives on their tech experiences, views, resources, practices, and so on. The analysis was thematic, using an inductive/deductive analysis of emergent themes (Bernard, Wutich, et al., Citation2016; Emerson et al., Citation2011; Maxwell, Citation2005), a strategy for identifying recurring themes, critical aspects, and explicit contradictions in the data.

Data collection

At the start of the study, all participants were given a list of words to define. The words contained the targeted problematic segmentals. The students took the first two hours of the lesson to record the definitions of the terms. The experimental group asked Siri to define the terms and recorded the definitions based on what they heard. For the control group, Teacher B dictated the same words, and the students wrote them down based on what they heard and recorded their definitions. Students made the recordings independently under their respective teachers’ supervision. The teachers evaluated the recordings for three types of errors: consonant clusters, phonemes, and word stress. They annotated the problems in the recordings on a copy of the text by circling the missing consonant clusters and using (X) to mark incorrect phonemes. This first test served as the pretest. The researcher used the annotations to verify the holistic score based on Kevin Jeisy’s (Citation2015) requirements that all pronunciation CALL systems show the types of errors, the severity of mistakes, and the intensity of pronunciation deficits.

In the second week of the Spring Semester, the participants were taught which areas to focus on to improve their pronunciation, based on the first week’s feedback. They were also trained to record their readings and email them to their teachers for comment. The experimental group was taught how to use Siri to improve their pronunciation of the terms. One of the experimental group’s strategies was to repeat a word until Siri correctly transcribed their pronunciation. In the third week, both teachers provided feedback to students with technical problems and encouraged students who had yet to bring their recordings to present them at the next meeting. Teachers’ feedback was limited to the definitions of the terms given to students. Sample checks indicated that in the first weeks of the semester, learners from both groups made severe errors affecting intelligibility. The majority of consonant clusters were mispronounced. The participants also made several word stress errors and omitted syllables. Teachers also noted critical mistakes in at least three phonemes. The students also had difficulty differentiating the /ɪ/ and /ɛ/ sounds.
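The repeat-until-transcribed strategy described above can be sketched in a few lines. This is only an illustration, not the study’s procedure or any Siri interface: the function name, the attempt limit, and the sample log are hypothetical, and the transcriptions are supplied as plain strings rather than obtained from a voice assistant.

```python
def attempts_until_recognized(target, transcriptions, max_attempts=10):
    """Return the 1-based attempt on which a transcription matched the target.

    `transcriptions` is the sequence of words the assistant reported hearing,
    one per attempt. Returns None if no attempt within the limit matched.
    """
    wanted = target.strip().lower()
    for attempt, heard in enumerate(transcriptions, start=1):
        if attempt > max_attempts:
            break
        if heard.strip().lower() == wanted:
            return attempt
    return None


# Hypothetical log of what was transcribed on successive tries at "bet".
log = ["big", "bit", "bet"]
print(attempts_until_recognized("bet", log))
```

A log of this kind would let a teacher see how many repetitions a learner needed for each target word, although the study itself does not report collecting such counts.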

During the fifth week, the teachers gave the participants new sentences (see section B of Appendix A) to read aloud and record. As usual, students from both classes were given one hour to record in class and send their responses to their respective teachers by the end of the lesson. Again, Teacher B asked the students to pay attention to /ɛı/ sounds. Those using Siri were also asked to pay attention to /ɛı/ sounds in addition to Siri’s corrective feedback. The week-five recordings showed improvement in students’ pronunciation. The control group had minor errors in phonemes and consonant clusters but demonstrated frequent incorrect word stress. The experimental group, on the other hand, showed minor errors in phonemes, occasional problems with consonant clusters, and occasional omission of syllables.

Learners were given two more weeks to practice their recordings and share them among themselves for comments before making the third recording. Students were asked to correct their pronunciation based on their colleagues’ feedback. Then, they were to upload the original and updated recordings to the discussion board for grading. The teachers emphasized grading to motivate students to improve their pronunciation. The respective groups worked cooperatively to provide effective feedback. Each recording was supposed to have at least two peer reviewers to improve the quality of feedback. On the original recordings, students made broadly similar comments. These included incorrect stress, confusion of /ɛı/ sounds, the omission of syllables, and the omission of the final consonant cluster sound. The third recording, however, showed noteworthy variation among the students. The experimental group had few phoneme and cluster errors but occasional incorrect word stress. The control group still had minor phoneme, word stress, and syllable omission errors.

Before making the final recording, the teachers met to discuss the most effective teaching practices that could be applied to both classes to help them improve. Finally, allowing time for post-test evaluation and discussion of the results, the study concluded with a post-test and an interview to determine (a) if students understood the communication breakdowns while completing the tasks, (b) personal displeasure and likes toward Siri and (c) recommendations for the use of Siri in the future for language learning.

The study sought to ensure the validity and reliability of its data collection instruments and analysis methods through a comprehensive approach. To assess participants’ pronunciation problems, an initial task required them to define a list of words, focusing on specific segmental aspects. These recordings were then independently evaluated by two raters using established criteria, and inter-observer reliability was assessed using Cohen’s kappa coefficient to ensure consistency in the evaluation process. The study incorporated pretest and post-test assessments, offering baseline data on participants’ pronunciation difficulties in areas like consonant clusters, phonemes, and word stress. Levene’s homogeneity test further validated the equality of group variances, strengthening the comparative analysis. Throughout the intervention, participants engaged in targeted pronunciation practice guided by feedback from teachers and Siri, leading to significant improvements in both the experimental and control groups. Moreover, qualitative data from participant interviews provided valuable insights, highlighting Siri’s convenience, availability, and diverse functionalities for language learning. However, participants also noted limitations, such as Siri’s occasional misinterpretation of spoken requests, emphasizing the need for continuous improvements in voice recognition technology.

The qualitative data analysis in this study was rigorous and comprehensive, involving semi-structured interviews with participants and established methods for thematic analysis. After in-depth, face-to-face interviews with individual students, a group interview gathered the participants’ different experiences with Siri. The interview replies were translated from Arabic to English, and themes were generated from the translations to ensure a systematic approach to understanding participants’ perspectives. To improve reliability, the study used a well-established method, thematic analysis, to find, analyze, and present themes in the qualitative data. Thematic analysis, a popular qualitative research method, organizes and transparently analyzes qualitative data to ensure that findings are grounded in participants’ responses. The study also followed thematic analysis procedures, including the use of multiple coders for reliability. The themes came from participants’ comments, supporting the authenticity and reliability of the conclusions. However, there was no member checking, in which participants review themes and interpretations to verify findings. Member checking could have strengthened the qualitative data analysis.

Task scoring analysis

Two raters scored the students’ recordings to improve the inter-observer reliability of the tasks. Cohen’s kappa coefficient was then used to measure the agreement between the raters’ scores. Tables 2 and 3 show the pretest results, and Tables 4 and 5 show the post-test results. As the tables show, the inter-observer reliability coefficients were strong for both the pretest and the post-test, indicating a high level of agreement between raters. SPSS version 26 was used for the statistical analyses. The researcher ran an independent t-test to determine the differences between the pretest and the post-test. The researcher used Levene’s homogeneity test to test the equality of the group variances.
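For readers unfamiliar with the agreement statistic named above, the following is a minimal sketch of Cohen’s kappa for two raters. It is not the SPSS procedure used in the study, and the ratings are hypothetical (1 = error marked, 0 = no error); the study’s actual rating scale is given in its tables.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who labelled the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must score the same non-empty set of items")
    n = len(rater_a)
    # Observed agreement: share of items given the same label by both raters.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement, from each rater's marginal label frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    p_e = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if p_e == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (p_o - p_e) / (1 - p_e)


# Hypothetical ratings of ten recordings by two raters.
rater_1 = [1, 1, 0, 1, 0, 0, 1, 1, 0, 1]
rater_2 = [1, 1, 0, 1, 0, 1, 1, 1, 0, 1]
print(round(cohen_kappa(rater_1, rater_2), 2))
```

In this made-up example the raters agree on nine of ten recordings, and chance agreement is 0.54, so kappa is (0.90 − 0.54) / (1 − 0.54), or about 0.78.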

Table 1. Participants’ demographic profiling.

Table 2. Inter-observer reliability coefficients for pretest for experimental group.

Table 3. Inter-observer reliability coefficients for pretest for control group.

Table 4. Inter-observer reliability coefficients for post-test for experimental group.

Table 5. Inter-observer reliability coefficients for post-test for the control group.

Results

The advantage of Siri over in-class instruction in pronunciation learning

The study used the independent sample t-test and descriptive statistics for pretest and post-test analyses to evaluate the difference between the two groups. The pretest descriptive statistics and homogeneity of variance results are presented in Tables 6 and 7.

Table 6. Descriptive statistics (Pretest).

Table 7. Homogeneity of variance (Pretest).

The pretest results indicate that consonant clusters had the highest average errors at 4.7, followed by word stress at 4.5 and phonemes at 4.2. The independent sample t-test showed that the groups did not differ to a statistically significant degree (p > .05), fulfilling the homogeneity criteria.
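The two checks reported here, equality of variances and an independent-samples comparison of means, can be sketched as follows. This is an illustration under stated assumptions rather than the study’s SPSS output: it uses the mean-centred form of Levene’s statistic and the pooled-variance (Student) t statistic, it omits p-values because those require F and t distribution functions, and the error counts are hypothetical.

```python
from statistics import mean, variance


def levene_w(group_a, group_b):
    """Levene's W statistic for two groups, using deviations from group means."""
    # Absolute deviation of every score from its own group's mean.
    devs = []
    for group in (group_a, group_b):
        centre = mean(group)
        devs.append([abs(x - centre) for x in group])
    n_total = len(group_a) + len(group_b)
    k = 2  # number of groups
    grand = mean([d for dev in devs for d in dev])
    between = sum(len(dev) * (mean(dev) - grand) ** 2 for dev in devs)
    within = sum((d - mean(dev)) ** 2 for dev in devs for d in dev)
    return (n_total - k) / (k - 1) * between / within


def pooled_t(group_a, group_b):
    """Student's t statistic and degrees of freedom, assuming equal variances."""
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / df
    se = (pooled_var * (1 / n_a + 1 / n_b)) ** 0.5
    return (mean(group_a) - mean(group_b)) / se, df


# Hypothetical pretest error counts per student (not the study's data).
experimental = [4, 5, 5, 6, 4]
control = [5, 4, 6, 5, 4, 5]
print(levene_w(experimental, control))
print(pooled_t(experimental, control))
```

A small W and a t statistic close to zero would be consistent with the pretest picture described above, in which the two groups start from comparable variances and means.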

Tables 8 and 9 show the post-test results. The post-test indicates a significant improvement in both groups. In the experimental group, the mean errors were 2.2 for consonant clusters, 2.4 for phonemes, and 2.3 for word stress. The control group also improved, with mean errors of 3.52 for consonant clusters, 2.68 for phonemes, and 2.64 for word stress. The statistical analysis using an independent sample t-test did not reveal significant differences, indicating that the post-test homogeneity requirements were met.
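As a quick arithmetic check on the post-test means quoted above, the gap between the groups for each error type can be computed directly. The figures are those reported in the text; the variable names are illustrative only.

```python
# Post-test mean errors as reported in the text above.
post_test_means = {
    "consonant clusters": {"experimental": 2.2, "control": 3.52},
    "phonemes": {"experimental": 2.4, "control": 2.68},
    "word stress": {"experimental": 2.3, "control": 2.64},
}

# Control mean minus experimental mean: a positive gap means the
# experimental group made fewer errors on average.
gaps = {
    error_type: round(means["control"] - means["experimental"], 2)
    for error_type, means in post_test_means.items()
}

for error_type, gap in gaps.items():
    print(f"{error_type}: {gap}")
```

With the reported means, the gaps come to 1.32 for consonant clusters, 0.28 for phonemes, and 0.34 for word stress, so the largest between-group difference at post-test is in consonant clusters.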

Table 8. Descriptive statistics (Post-test).

Table 9. Homogeneity of variance (Post-test).

As the study culminated (fourth recording), pronunciation differences between the two groups were noticeable, with the experimental group demonstrating a significant improvement in all three areas. In addition, students from both groups achieved their highest learning gain at the phoneme level. Overall, the experimental group achieved the highest learning gains for all types of errors, as summarized in Table 10.

Table 10. Learning gains.

Perception of learners concerning Siri as a tool for pronunciation practice

The interviewees’ replies shed important light on Siri’s effectiveness as a powerful tool for language learning. These reports highlight its affordability, ease of use, and ability to promote significant improvements in vocabulary, oral expression, listening comprehension, spelling, and speech. These conversations validate the importance of Siri as a teaching aid for language learning, an issue that will be further explored in the subsequent themes.

Siri’s impact on language learning and pronunciation improvement

The responses from Participants 1 and 4 show that they have a favorable and enthusiastic view of using Siri to help them learn a language.

Participant 1 (Sami, pseudonym): "Siri is a useful tool, and I couldn’t find a better tool that is always with me, available 24/7. Being tech-savvy, I always have my charged iPhone ready for various purposes. After getting to know Siri better, it has become an invaluable tool for accomplishing language learning tasks. For example, I asked Siri to help me define certain words and provide synonyms, and it did so perfectly, helping me expand my vocabulary. Not only that, but it also helped me use these words in sentences".

Sami starts out by stressing the usefulness of Siri and its constant availability, underscoring how convenient it is to have it on hand on a fully charged iPhone. His willingness to use Siri for a variety of purposes seems to be influenced by his tech-savvy disposition. What makes this comment stand out is his acknowledgment of Siri as a very helpful tool for language acquisition. He talks about how Siri has helped him grow his vocabulary by defining words and finding synonyms. He also emphasizes how Siri has helped him use these newly learned words in sentences, which is an important part of learning a language.

Participant 4 (Mohammed): "The more time I spend with Siri, the better my pronunciation and spelling become. I wholeheartedly recommend it to my colleagues and friends for daily use. My listening and speaking skills have improved significantly. I'm grateful to have been introduced to this free tech tool that I can always enjoy using".

Mohammed confirms that Siri has a beneficial effect on language learning. He credits greater engagement with the tool for his better spelling and pronunciation. He fervently recommends it to colleagues and friends, highlighting notable improvements in his speaking and listening abilities. His response also shows how much he values Siri as a fun and free language-learning tool.

Challenges and potential of Siri as a language learning tool

This theme addresses Siri’s complex role in the field of language learning. This investigation looks at the challenges involved in using Siri to help with language pronunciation. Siri has several limits, even though it provides a number of functions that can help with language learning, such as practice pronouncing words correctly and vocabulary support. This theme explores these areas, illuminating the difficulties that language learners could run into as well as the ways in which Siri can be used as an efficient tool. Through an analysis of the interplay between Siri’s potential and the challenges it poses, we may better comprehend its place in contemporary language instruction.

When the researcher asked the participants whether they liked Siri, they expressed their enthusiasm for the various functions the software could perform for them. They said Siri effectively enhanced their pronunciation and offered a range of tasks within one application, which made it fun to use. However, the participants also expressed some disappointment with the software. They said that Siri was too picky about pronunciation and quickly misinterpreted users’ requests rather than asking them what they had intended to say.

Another challenge is that Siri misunderstands learners’ spoken requests. Since Siri’s introduction in 2011, voice recognition has been its most significant challenge: the Apple assistant often fails when interpreting users’ commands. Although Siri correctly recognized the participants’ voices, there were cases when it got them halfway to the answer before abruptly running into its limitations. Amazon’s Alexa and Google’s Assistant are also imperfect; nonetheless, they reliably handle more advanced requests better than Siri (O’Kane, Citation2017). This is partly because Apple has been reluctant to upgrade its AI efforts.

When asked their thoughts on Siri’s misinterpretation of their voices in some cases, the participants said that the primary reason for that could be their struggle to pronounce vowels and the use of Arabic accents in pronunciation. They also believed Siri was too selective about pronunciation.

EFL learners’ perceptions of Siri in language education

This theme examines the perspectives and firsthand accounts of EFL students using Siri as a teaching aid. This theme offers a prism through which to look at how EFL students use Siri to improve their language skills and how they understand it. An early look at Siri’s potential for language learning can be found in the statements of Adel and Ali.

Participant 2 (Adel): "I wholeheartedly recommend using Siri to my fellow Arabic students who want to enhance their English speaking and listening skills. I always desired to have short conversations from time to time with native English speakers who could accurately spell and pronounce words or sentences. Siri is the best and most cost-effective choice for me. I tried subscribing to websites and apps offering similar features, but they required a significant upfront investment. With Siri, it’s free and available to you anytime".

Adel strongly recommends using Siri to learn English, particularly to enhance speaking and listening abilities. He said he had always wanted to have conversations with native speakers who could accurately spell and pronounce words or sentences. Because it is free, he regards Siri as the most economical option compared with paid language learning websites and apps. This testimonial highlights Siri’s usefulness in language learning as well as its affordability and ease of use.

Participant 3 (Ali): "Yes, Siri has helped me become more talkative and use the language in class and with others. Initially, I wasn’t sure how to benefit from this opportunity I had in my hands. But I've become a better speller and communicator since using Siri. Even my younger brother and sister started using Siri at home and are familiar with more English vocabulary. They, like me, have built a substantial vocabulary through interaction with Siri".

Adel said he had been using Siri daily to learn English spelling, practice pronunciation of characters, and check weather conditions. Sami said he used Siri to search the web, set alarms, and create reminders. Ali used Siri to look up holidays and friends’ birthdays and to do calculations. Mohammed, on the other hand, said he used Siri for messaging, emailing, and sending SMS; he explained that he had been using Siri about four times a week to call his contacts, search Google, and practice the pronunciation of place names. They also acknowledged using Siri for fun when they were bored or tired. Most millennials and teenagers, like the EFL learners in this study, carry their mobile devices everywhere and use them whenever they have free time. Therefore, Siri can extend classroom pronunciation learning beyond the classroom in a manner that EFL learners find motivating and fun. However, irrespective of users’ motivation for using the technology, Siri requires learners to speak often and to focus on both pronunciation and the meaning they seek to express so that it can perform those functions effectively.

Discussion

The present study’s results make a valuable addition to the current body of literature on voice assistant tools and mobile-assisted language learning by emphasizing the efficacy of incorporating Siri, a voice assistant technology, into pronunciation learning. The findings suggest that integrating Siri into the educational curriculum had a more significant effect on the students’ pronunciation skills than depending exclusively on traditional classroom instruction. Siri’s accessibility and convenience are among its notable benefits. Mobile devices equipped with Siri allow learners to practice their pronunciation skills flexibly, unconstrained by the limitations of the classroom setting. This flexibility enables individuals with time constraints to pursue language learning without attending conventional classes, thereby enhancing their pronunciation abilities. This finding is consistent with prior research highlighting the ease and flexibility of mobile devices in the context of learning languages (Kukulska-Hulme & Shield, Citation2008; Lei et al., Citation2022). In addition, this study underscores the significance of the user-friendliness and efficacy of Siri in facilitating pronunciation exercises. According to student reports, using Siri resulted in a streamlined learning experience due to its prompt and precise feedback. By giving clear spoken instructions, learners were able to engage with Siri and obtain immediate assessments of their pronunciation. This is corroborated by Liberman’s (Citation1995) study, which posits that oral communication is typically less challenging than written expression, and the integration of voice-activated assistants can augment the effectiveness of language acquisition. AbuSa’aleek (Citation2014) conducted a study to investigate the prevalence and effects of voice assistant tools on the development of English language proficiency.
AbuSa’aleek’s (Citation2014) study provided insights into voice assistant tools’ present outlook and prospects, underscoring the necessity for additional investigation into the advantages of mobile technologies in the context of English language instruction and acquisition. Incorporating AbuSa’aleek’s research outcomes into the current discourse offers supplementary comprehension regarding the capacity of mobile assistant tools for language acquisition.

In the realm of teaching pronunciation, the study aligns with modern language teaching approaches, combining traditional methods with innovative technologies. Three recommended approaches for EFL teachers—integrative, analytic-linguistic, and intuitive-imitative—provide a comprehensive framework for addressing pronunciation challenges (Pennington & Rogerson-Revell, Citation2018). The integrative approach emphasizes pronunciation as a crucial communication component, emphasizing supra-segmental features like intonation, rhythm, and stress. Technology integration in language learning, specifically pronunciation training, has shown positive outcomes in previous research. Studies by Zhu Zhang and Irwin (Citation2023) and Freynik et al. (Citation2012) demonstrated the efficacy of technology-assisted pronunciation instruction, aligning with the current study’s focus on Siri’s comprehensible commands. Siri’s role in enhancing pronunciation skills is multifaceted. Beyond pronunciation, Siri aids in improving listening skills, providing learners with authentic, contextual language experiences. By engaging with Siri’s diverse functionalities, learners can enhance their vocabulary, language comprehension, and oral communication. Its ability to provide instant corrective feedback fosters a supportive learning environment, guiding learners toward accurate pronunciation.

Although the utilization of Siri in pronunciation acquisition has demonstrated encouraging outcomes, it is imperative to recognize certain conceivable obstacles. Instructing individuals on the optimal utilization of Siri may necessitate an initial time allocation. This requirement poses a significant obstacle as individuals must invest time to become proficient in Siri’s voice commands and functionalities. Educators are forced to design instructional strategies that accommodate this learning curve, and learners must dedicate extra effort to grasp the intricacies of Siri’s features. Overcoming this challenge is vital to fully harnessing Siri’s potential as an effective tool for language education. Once learners attain proficiency in voice commands and functionalities, they can autonomously participate in pronunciation exercises, diminishing the necessity for substantial teacher supervision. Moreover, the participants in the experimental cohort conveyed their gratitude towards Siri’s uncomplicated and productive nature. The duration dedicated to honing their skills with Siri was considered worthwhile.

In comparison, while beneficial, conventional face-to-face teaching methods frequently entail a substantial allocation of time towards diverse tasks, including writing and honing pronunciation skills. The present study underscores the prospective efficiency gains that can be realized through integrating mobile devices and voice assistants such as Siri in the context of language acquisition. Through mobile technology, individuals can effectively maximize their practice time and further enhance their pronunciation abilities beyond the confines of the traditional classroom setting.

Siri, like any other voice assistant tool, has negative and positive affordances. The software is easily accessible and popular among iPhone users.

However, it provides feedback based on intelligibility, which depends on the quality of users’ voice inputs. There can be communication breakdowns with Siri, which may stem from pronunciation deficits. For instance, both teachers noted various problems in the pretest. The students had difficulty differentiating /ɪ/ and /ɛ/. Teacher A indicated that students did not supply Siri with enough context to enable it to understand their pronunciation. For example, the students were told to define the word "bet". The experimental group supplied Siri with an unintended pronunciation, making Siri transcribe the wrong word. Eleven students in the experimental group asked Siri to define the term "big" instead of "bet". Twenty-one students in the control group also recorded the definition of "significant". Other mistakes identified during the first two weeks are listed in Table 11.

Table 11. Pronunciation errors.

Letting the students repeat the pronunciation of the same word several times helped them produce the correct sound. Those using Siri could identify their mistakes and improve faster than those receiving classroom instruction alone. Teacher A said that Siri provided a conducive environment for learning in which students could practice independently without fear of judgment. The percentage of pronunciation errors among the students in the experimental group declined daily.

From this study, it is clear that the integration of advanced voice assistants like Siri into education showcases the potential of technology in enhancing learning experiences. By leveraging speech recognition and artificial intelligence, educators can provide personalized and interactive learning opportunities for students. The study offers valuable insights into the practical application of these technologies in language education, highlighting both the advantages and challenges associated with their implementation. As technology continues to advance, these approaches can significantly contribute to making education more accessible, engaging, and effective for learners around the world.

Siri’s integration into language instruction has significant pedagogical ramifications and supports language learning goals. The observed improvements in pronunciation among the experimental group, as evidenced by significant reductions in mean errors related to consonant clusters, phonemes, and word stress, underscore the potential of Siri as a valuable tool for enhancing language learning. The personalized and adaptable nature of Siri’s interaction, highlighted by participants who commended its constant availability and user-friendly interface, suggests that educators can leverage Siri to cater to individual learning styles. The experimental group’s pronounced growth in learning, particularly in phoneme levels, indicates that Siri can effectively contribute to targeted pronunciation practice. The emphasis on fluency and accuracy in authentic spoken interactions aligns with the communicative approach to language learning and is supported by participants who reported improvements in listening and speaking skills.

Furthermore, participants’ positive responses, emphasizing Siri’s role in expanding vocabulary, aiding in defining words, and providing synonyms, illuminate its potential for comprehensive language acquisition. The study implies that Siri not only facilitates pronunciation improvements but also supports broader language skills, including vocabulary development and contextual usage. The recommendation to use Siri for enhancing English speaking and listening skills, as expressed by participants like Adel, adds practical weight to its pedagogical relevance. However, challenges identified in the study, such as Siri’s occasional misinterpretation of user requests and struggles with certain accents, highlight the need for careful consideration and guidance in implementing Siri as a language learning tool. Educators should be aware of these limitations and provide additional support to address potential issues, ensuring a more effective integration into language instruction.

Limitations

This study of Siri in pronunciation learning among Saudi EFL learners has several limitations that should be considered when interpreting the results. First, the sample was relatively small: only 40 participants, of whom only 15 had iPhone devices compatible with Siri. This may restrict the generalizability of the findings to the broader population of EFL learners. Additionally, the exclusion of female students for cultural reasons further narrows the scope of the research, so caution is needed when applying the results to diverse learner groups. Furthermore, the study was conducted in a specific cultural context, Saudi Arabia, and involved Arabic-speaking learners. This cultural and linguistic context might have affected the learners’ attitudes, behaviors, and responses, making it difficult to extrapolate the findings to EFL learners from other backgrounds. The study also focused solely on pronunciation learning, omitting other critical language skills such as grammar, vocabulary, reading, and writing. Consequently, the effectiveness of Siri in enhancing these language skills remains unexplored, which limits a comprehensive understanding of its impact on language acquisition. Moreover, the study’s exclusive focus on iPhone devices compatible with Siri excludes learners who use Android devices or other platforms, which raises concerns about its applicability to learners using other voice assistant technologies. Finally, the study did not thoroughly investigate the technical challenges participants faced while using Siri, such as internet connectivity, device compatibility, or familiarity with the Siri interface. These challenges could significantly affect the overall learning experience but were not adequately explored.

Recommendations

The cost of iPhone devices presents a significant challenge in using Siri for language learning. Classroom teachers who want to integrate Siri into pronunciation teaching should consider whether their students’ devices are compatible with the software. If the number of students with incompatible devices is small, the teacher may pair them with classmates who have compatible devices. The fact that Siri users outperformed the control group suggests how important it is for teachers to integrate technology into language learning. However, 25 of the 40 participants (62.5%) lacked iPhone devices supporting Siri, which shows how difficult it is to depend on Siri for language learning. Fortunately, Android smartphones also offer popular speech recognition software, such as Google Assistant, that can substitute for or complement Siri when a class has a large number of students whose devices are incompatible with Siri; for instance, the approach used in this study could be adapted to Google Assistant, which is more widely accessible.

Teachers should also allow students to correct themselves. During the pretest, students were asked to share their recordings for comment. Most participants from both groups acknowledged that listening to and commenting on their peers’ recordings enhanced their pronunciation skills and helped them identify their weaknesses in English vowels. Rieber (2006) argues that when students review their peers’ assignments, they gain valuable time to revisit the assignment requirements and thus to correct their own problems. Rieber also suggested that learners react more positively to their peers’ comments than to comments made by the teacher.

Conclusion and further research

This study demonstrated that incorporating mobile learning technologies such as Siri into a pronunciation course improves EFL learners’ learning gains more than classroom instruction and feedback alone. The results may be applicable to other aspects of EFL learning that require teachers to provide extensive instruction and feedback. Smartphone software is more effective in education when class sizes are small and every student owns a smartphone compatible with the target application. However, this is not the case in countries with underperforming economies, where not all learners can afford high-end Apple devices. Because technology-oriented learning can produce social and learning inequalities, teachers must explore methods of integrating technology into education that ensure no student is left behind, as Rospigliosi (2021) emphasized. Future studies would benefit from establishing how Siri can improve pronunciation learning in developing EFL nations such as Afghanistan, Nepal, and Yemen, where students have limited access to high-end mobile devices. Future studies should also compare Siri’s learning outcomes with those of other voice assistant tools such as Google Assistant, Alexa, or Microsoft’s Cortana.

Acknowledgment

The researcher would like to thank the Deanship of Scientific Research at Majmaah University for supporting this research under project No. R-2023-701.

Disclosure statement

No potential conflict of interest was reported by the author(s).

Additional information

Notes on contributors

Saleh Mosleh Alharthi

Saleh Mosleh Alharthi is an Associate Professor in Applied Linguistics at Majmaah University who specializes in language learning, technology, and sociolinguistics. He holds an MA from The University of Colorado Boulder and a PhD from The University of Memphis. He teaches various undergraduate and graduate-level linguistics courses.

References

  • AbuSa’aleek, A. O. (2014). A review of emerging technologies: Mobile Assisted Language Learning (MALL). Asian Journal of Education and e-Learning, 2(6), 469–475.
  • Al Shamsi, J. H., Al-Emran, M., & Shaalan, K. (2022). Understanding key drivers affecting students’ use of artificial intelligence-based voice assistants. Education and Information Technologies, 27(6), 8071–8091. https://doi.org/10.1007/s10639-022-10947-3
  • Alharbi, B., & Aljutaily, M. (2020). On the perceptual accuracy of non-native phonemic contrasts: A case study of native Arabic speakers. Dil ve Dilbilimi Çalışmaları Dergisi, 16(4), 2003–2023. https://doi.org/10.17263/jlls.851031
  • Alwohaibi, H. (2019). Investigating native English speakers’ perception of novel Arabic phonemes after first exposure. SSRN Electronic Journal, 2019, 1447. https://doi.org/10.2139/ssrn.3451447
  • Bernard, H. R., Wutich, A., & Ryan, G. W. (2016). Analyzing qualitative data: Systematic approaches. SAGE Publications.
  • Cardoso, W., Liakin, D., & Liakina, N. (2013). Mobile speech recognition software: A tool for teaching second language pronunciation. OLBI Working Papers, 5, 1120. https://doi.org/10.18192/olbiwp.v5i0.1120
  • Crystal, D. (2017). Teaching original pronunciation. Oxford Scholarship Online. https://doi.org/10.1093/oso/9780190611040.003.0027
  • Derwing, T. M., & Munro, M. J. (1997). Accent, intelligibility, and comprehensibility. Studies in Second Language Acquisition, 19(1), 1–16. https://doi.org/10.1017/S0272263197001010
  • Dizon, G. (2017). Using intelligent personal assistants for second language learning: A case study of Alexa. TESOL Journal, 8(4), 811–830.
  • Dizon, G., Tang, D., & Yamamoto, Y. (2022). A case study of using Alexa for out-of-class, self-directed Japanese language learning. Computers and Education, 3, 100088. https://doi.org/10.1016/j.caeai.2022.100088
  • Emerson, R. M., Fretz, R. I., & Shaw, L. L. (2011). Writing ethnographic fieldnotes. University of Chicago Press.
  • Evers, K., & Chen, S. (2020). Effects of an automatic speech recognition system with peer feedback on pronunciation instruction for adults. Computer Assisted Language Learning, 35(8), 1869–1889. https://doi.org/10.1080/09588221.2020.1839504
  • Freynik, S., Richardson, D. L., Frank, V. M., Bowles, A. R., & Golonka, E. N. (2012). Technologies for foreign language learning: A review of technology types and their effectiveness. Computer Assisted Language Learning, 27(1), 70–105. https://doi.org/10.1080/09588221.2012.700315
  • García, N. M. (2017). Teaching English Literature in English as a Foreign Language (EFL) Classrooms. In Proceedings of the 5th Human and Social Sciences at the Common Conference. https://doi.org/10.18638/hassacc.2017.5.1.226
  • Ghanizadeh, A., Razavi, A., & Jahedizadeh, S. (2015). Technology-enhanced language learning (TELL): A review of resources and upshots. International Letters of Chemistry, Physics, and Astronomy, 54, 73–87. https://doi.org/10.56431/p-z6sj8g
  • Gilakjani, A. P. (2016). English pronunciation instruction: A literature review. International Journal of Research in English Education, 1(1), 1–6.
  • Haggag, H. M. (2018). Teaching phonetics using a mobile-based application in an EFL context. European Scientific Journal, 14(14), 189–204.
  • Hirose, K., & Kawai, G. (2000). Teaching the pronunciation of Japanese double-mora phonemes using speech recognition technology. Speech Communication, 30(2–3), 131–143. https://doi.org/10.1016/s0167-6393(99)00041-2
  • Hou, Z., & Aryadoust, V. (2021). A review of the methodological quality of quantitative mobile-assisted language learning research. System, 2021, 102568. https://doi.org/10.1016/j.system.2021.102568
  • Jarosz, A. (2019). English pronunciation in L2 instruction. Second Language Learning and Teaching. https://doi.org/10.1007/978-3-030-13892-9
  • Jeisy, K. (2015). Automatic pronunciation checker. Interspeech. https://doi.org/10.21437/interspeech.2012-257
  • Jeisy, K. (2015). Automatic Pronunciation Checker [Master’s thesis]. ETH Zurich. Retrieved from https://pub.tik.ee.ethz.ch
  • Johansson, E., & Cukalevska, M. (2021). The Impact of MALL on English Grammar Learning (Dissertation). Retrieved from https://urn.kb.se/resolve?urn=urn:nbn:se:mau:diva-40433
  • Kang, O. (2014). Learners’ perceptions toward pronunciation instruction in three circles of world Englishes. TESOL Journal, 6(1), 59–80. https://doi.org/10.1002/tesj.146
  • Kukulska-Hulme, A., & Shield, L. (2008). An overview of mobile assisted language learning: From Content delivery to supported collaboration and interaction. ReCALL, 20(3), 271–289. https://doi.org/10.1017/S0958344008000335
  • Lei, X., Fathi, J., Noorbakhsh, S., & Rahimi, M. (2022). The impact of mobile-assisted language learning on English as a foreign language learners’ vocabulary learning attitudes and self-regulatory capacity. Frontiers in Psychology, 13, 872922. https://doi.org/10.3389/fpsyg.2022.872922
  • Liberman, A. M. (1995). Why is speech so much easier than reading and writing? Retrieved from https://pdfs.semanticscholar.org
  • Marlina, R., & Giri, M. T. (2014). The pedagogy of English as an international language (EIL): More Reflections and dialogues. The Pedagogy of English as an International Language, 2014, 1–19. https://doi.org/10.1007/978-3-319-06127-6_1
  • Marlina, R., & Giri, R. A. (2014). The pedagogy of English as an international language: Perspectives from scholars, teachers, and students. Springer.
  • Maxwell, J. A. (2005). Qualitative research design: An interactive approach. Sage Publications.
  • Mohammadi, E., & Shirkamar, Z. S. (2018). Mobile-assisted language learning: Challenges and setbacks in developing countries (pp. 172–186). IGI Global.
  • Moore, Q. (2019, August 22). Siri commands list: How to use Siri for iPhone X, iPad: Siri App iOS 12 (2018). Retrieved from https://smartbro.co/siri-commands-list-how-to-use/
  • Mora-Flores, E., Materials, T. C., & Machado, A. (2014). Strategies for connecting content and language for English language learners in social studies. Shell Education Pub.
  • Nagle, C. L., & Huensch, A. (2021). Expanding the scope of L2 intelligibility research. Benjamins Current Topics, 2021, 51–73. https://doi.org/10.1075/bct.121.04nag
  • Nurani, S., & Rosyada, A. (2015). Improving English pronunciation of adult ESL learners through reading aloud assessments. Lingua Cultura, 9(2), 107. https://doi.org/10.21512/lc.v9i2.825
  • O’Kane, S. (2017, June 7). Apple still hasn’t fixed Siri’s biggest problem. Retrieved from https://www.theverge.com/2017/6/7/15742936/apple-siri-problems-voice-recognition-wwdc-2017.
  • Pargman, T. C., & Jahnke, I. (2019). Emergent practices and material conditions in learning and teaching with technologies. Springer.
  • Pennington, M. C., & Rogerson-Revell, P. (2018). Pronunciation in the classroom: Teachers and teaching methods. English Pronunciation Teaching and Research, 2018, 173–233. https://doi.org/10.1057/978-1-137-47677-7_4
  • Prodanovska, V. (2017). A study of proper pronunciation as a factor of successful communication. CBU International Conference Proceedings. 5. CBUNI.
  • Rieber, L. J. (2006). Using peer review to improve student writing in business courses. Journal of Education for Business, 81(6), 322–326. https://doi.org/10.3200/JOEB.81.6.322-326
  • Rospigliosi, P. A. (2021). The risk of algorithmic injustice for interactive learning environments. Interactive Learning Environments, 29(4), 523–526. https://doi.org/10.1080/10494820.2021.1940485
  • Rospigliosi, P. A. (2022). Why we need critical pedagogy in post-pandemic interactive learning environments. Interactive Learning Environments, 30(5), 779–781. https://doi.org/10.1080/10494820.2022.2079277
  • Sadun, E., & Sande, S. (2013). Talking to Siri: Learning the language of Apple’s intelligent assistant. Que Publishing.
  • Sheen, Y. (2011). Pedagogical perspectives on corrective feedback. Corrective Feedback, Individual Differences and Second Language Learning, 39–51. https://doi.org/10.1007/978-94-007-0548-7_3
  • Sullivan, M. (2010). Communication and pain behavior: Distinguishing between communication goals and communication value. PsycEXTRA Dataset. https://doi.org/10.1037/e524372011-050
  • Tai, T.-Y. (2022). Effects of intelligent personal assistants on EFL learners’ oral proficiency outside the classroom. Computer Assisted Language Learning, 2022, 1–30. https://doi.org/10.1080/09588221.2022.2075013
  • Tai, T.-Y., & Chen, H. H.-J. (2022). The impact of Intelligent Personal Assistants on adolescent EFL learners’ speaking proficiency. Computer Assisted Language Learning, 2022, 1–28. https://doi.org/10.1080/09588221.2022.2070219
  • Tsai, S. (2019). Using Google Translate in EFL drafts: A preliminary investigation. Computer Assisted Language Learning, 32(5–6), 510–526. https://doi.org/10.1080/09588221.2018.1527361
  • Tseng, W.-T., & Chen, S. (2022). The effects of MALL on L2 pronunciation learning: A meta-analysis. Journal of Educational Computing Research, 60(5), 662. https://doi.org/10.1177/07356331211058662
  • Yang, C. T. Y., Lai, S. L., & Chen, H. H. J. (2022). The impact of intelligent personal assistants on learners’ autonomous learning of second language listening and speaking. Interactive Learning Environments, 2022, 1–21. https://doi.org/10.1080/10494820.2022.2141266
  • Yousef, M., & Abduh, M. (2019). The effect of implementing MALL applications on learning pronunciation of English by EFL learners at Najran University. International Journal of Linguistics, 2019, 1945–5425.
  • Zhu, T., Zhang, Y., & Irwin, D. (2023). Second and foreign language vocabulary learning through digital reading: A meta-analysis. Education and Information Technologies, 2023, 696. https://doi.org/10.1007/s10639-023-11969-1
  • Zou, B., Li, H., & Yan, X. (2020). Students’ perspectives on using online sources and apps for EFL learning in the mobile-assisted language learning context. In Handbook of research on integrating technology into contemporary language learning and teaching. IGI Global.

Appendix A

Terms to define

Part A: To start, say, "Hey, Siri". Then ask Siri to define the following terms:

  1. Bet

  2. Ice cream

  3. Tasks

  4. Documents

  5. Gentle

  6. Lesson plan

  7. Table

  8. Splendid

  9. Priority

  10. Thanks

  11. News

Part B: Record the definition of the following terms:

  1. Surprise

  2. Willingness

  3. Intelligently

  4. Deficiency

  5. Influenced

  6. Structurally

  7. Development

  8. Shopping

Appendix B

Interview questions

  1. What didn’t you like about Siri?

  2. What did you like, at least, about Siri?

  3. Did you experience challenges with Siri understanding your utterances? What could be the reason(s)?

  4. Do you think EFL classes should adopt Siri in pronunciation learning? If so, why?

  5. Probing questions such as: tell me more about, what do you mean by this, in what sense, did you mean, can you clarify this, etc.