The trade-off between STEM knowledge acquisition and language learning in short-term CLIL implementations

Pages 338-361 | Received 22 Sep 2021, Accepted 29 Jun 2023, Published online: 13 Jul 2023

ABSTRACT

Bilingual education could solve many challenges introduced by an increasingly internationalised education system. Content and Language Integrated Learning (CLIL), in particular, may equip students with the necessary cultural and communicative skills to succeed in today’s academic environment. However, it is not yet clear how CLIL can be employed effectively in short-term educational contexts where full-term bilingual programmes are not feasible. We designed and assessed a one-day CLIL module for ninth graders at our university’s gene-technology lab. The assessment of our module with 252 grammar school students indicates that a CLIL module does not achieve the same learning success as an equivalent non-CLIL module. Even with additional language scaffolding material, full access to online dictionaries, and the availability of crucial workbook passages in their native language, CLIL students could not achieve the same short-term content learning success. We consequently argue that more attention should be paid to the inherent trade-off between language and content learning when carrying out short-term CLIL programmes. Moreover, we caution against using only content and language scaffolding to mediate this trade-off.

Introduction

Appropriate English language skills have become a requirement for most professions (Sardegna et al., Citation2017). In higher education, they are often needed for lectures, scientific publications, or essays and theses (Earls, Citation2013; Lanvers, Citation2018). Despite the high relevance of English as a lingua franca (Oktaviani & Fauzan, Citation2017), students often lack appropriate English language competencies (Zenner-Höffkes et al., Citation2021). In response, the European Commission (Citation2004) has promoted the introduction of Content and Language Integrated Learning (CLIL) in grammar school teaching (Finkbeiner & Fehling, Citation2006).

CLIL is understood to be an innovative approach to foreign language learning (Tarasenkova et al., Citation2020). It retains the focus on content learning but delivers content in both the native (L1) and the foreign language (L2) (Eurydice, Citation2005; Coyle et al., Citation2010). The strategic combination of the L1 and L2 increases L2 exposure and ensures that students understand the input regardless of their L2 proficiency (Krashen, Citation1982; Lin, Citation2015). The aim of CLIL is to ‘deepen awareness of both [the native] and target language’ through a plurilingual approach (Marsh et al., Citation2001, p. 16). Unlike in many immersion models (e.g. Wode, Citation1995), the L1 is an important didactic element in CLIL (Coyle et al., Citation2010).

Effectively combining the L1, L2, and scientific concepts in CLIL requires well-planned curricula and extensive scaffolding (Lin, Citation2015). Scaffolding is ‘a type of teacher assistance that helps students learn new skills, concepts, or levels of comprehension of material’ (Maybin et al., Citation1992, p. 188). It empowers students to construct knowledge from guided experience in a bottom-up approach independent of the instructor (Roehler & Cantlon, Citation1997). These qualities make scaffolding particularly suited for less structured and more interactive learning environments (Prawat & Floden, Citation1994; Korthagen & Lagerwerf, Citation1995). Prominent examples include experiments or discovery activities in science subjects (Lin, Citation2015; Marsh et al., Citation2001). In addition to content learning, these learning environments encourage discussion amongst students and require the use of scientific terminology in both L1 and L2 groups (Klieme et al., Citation2010; Lemke, Citation1990; Meyerhöffer & Dreesmann, Citation2019). This makes science subjects ideal testbeds for CLIL implementation (Rodenhauser & Preisfeld, Citation2014).

Despite the potential benefits of CLIL for science subjects, its adoption falls short in Germany. One reason for this slow uptake appears to be mixed results for content learning in many CLIL studies (e.g. De Dios Martínez Agudo, Citation2019; Piesche et al., Citation2016). These studies generally report positive effects of CLIL on language learning, but they also observe significantly lower scores for content learning (Koch & Bünder, Citation2006; Meyerhöffer & Dreesmann, Citation2019). Moreover, many grammar school teachers in Germany still study a combination of two science subjects, such as Biology and Chemistry, instead of a science subject and a foreign language, such as English. This leaves few teachers with the opportunity to teach CLIL classes (Sylvén, Citation2013; Vásquez et al., Citation2020). Additionally, an ambitious and crowded curriculum restricts the time that teachers can dedicate to experimentation and discovery activities. They tend to relegate science teaching to regular classrooms to maximise content learning, which leaves little room for CLIL (Itzek-Greulich et al., Citation2014).

Previous research has already tried to tackle this problem by offering science laboratories outside of regular classrooms. In Germany, these laboratories are typically located at universities, which allows teachers and their classes to spend a full day in an authentic learning environment focused on laboratory learning. However, only two of over 443 laboratories offer practical CLIL experimentation in a science subject (Schülerlabor Atlas, Citation2022). Whilst their results are promising and indicate that even one-day interventions can positively influence content and language learning (e.g. Buse et al., Citation2018; Rodenhauser & Preisfeld, Citation2014), more research is required.

Inspired by Rodenhauser and Preisfeld (Citation2014) and Buse et al. (Citation2018), we designed and piloted a one-day CLIL university laboratory module for ninth graders of Bavarian grammar schools with a focus on genetics. The module builds on an established non-CLIL laboratory module that successfully combines experimentation and model learning. We retained the overall structure of the module and used the previous non-CLIL learning group as a comparison group (Roth et al., Citation2020). In line with previous research, we expect a general trade-off between content knowledge and language learning. Drawing on cognitive load theory, we hypothesise that the cognitive capacity required for language learning will limit the acquisition of content knowledge, even when extensive content and language scaffolding is provided (e.g. Coyle, Citation2007; Grandinetti et al., Citation2013). To mediate this trade-off, CLIL modules need to reduce the amount of content knowledge to accommodate the cognitive effort required for learning content in another language. In the following, we first outline the relevant theoretical background, including the basic principles of content and foreign language learning, and examine how they can be successfully combined in CLIL education. Additionally, we explain the benefits of practical experimentation, which has already proven its value in content learning. We then elaborate on the different phases of our laboratory experimentation and explain how we measured content learning. We conclude with a critical discussion of our results and relate them to previous studies.

Theoretical background

Research context

In the fall of 2020/2021, over 180 000 students (61 000 female) enrolled in STEM undergraduate courses at German universities (Statistisches Bundesamt, Citation2021a). Another 22 000 students pursued their studies in various international study programmes and many students selected study programmes that require professional English language competences, such as European Studies, International Relations, and Information Systems (Statistisches Bundesamt, Citation2021b). Moreover, in the past two decades, higher education in Germany has developed an increasingly international curriculum. English has gained a strong foothold in this curriculum as the language of sciences, and most study programmes use peer-reviewed and internationally published articles as reference literature for their lectures (Earls, Citation2013; Lanvers, Citation2018). These articles are used as the foundation of term papers, scientific essays, or theses (Ammon, Citation2001; Gürtler & Kronewald, Citation2015). STEM undergraduate courses like Biology seminars, for instance, often require students to analyse scientific articles and present their findings to their peers in English (Gürtler & Kronewald, Citation2015). Yet, many students lack the required English language competencies (Zenner-Höffkes et al., Citation2021). To address this lack and promote a common language across the EU, the European Commission published several guidelines and briefs on how to best incorporate English language learning in schools across Europe (European Commission, Citation2003; European Commission, Citation2012; European Commission, Citation2017).

In over 18 EU member states, English has become a compulsory language. Most of these countries consider CLIL essential for improving English language practice without further straining the already crowded curriculum (European Commission, Citation2003; European Commission, Citation2012; European Commission, Citation2017). In Germany, for instance, CLIL has gained particular traction in grammar schools. Grammar schools aim to equip students with the necessary skills to access colleges or universities. Before students enter grammar schools, they have typically acquired advanced literacy skills in their native language so that CLIL instruction can be implemented effectively (Feddermann et al., Citation2021). CLIL is realised in either bilingual strands or singular CLIL modules. The modules do not have a fixed timeframe and can even be single lessons (Krechel, Citation2013).

Foreign language and content learning

Unlike immersion models, CLIL follows the Pluriliteracies Teaching for Learning (PTL) approach, which aims to develop scientific literacy and content learning in more than one language (Meyer et al., Citation2015; Meyer & Coyle, Citation2017). In CLIL, the native L1 and the foreign L2 receive equal attention in the development of scientific literacy and are of equal importance to content learning (Cammarata & Haley, Citation2017; Poza, Citation2016). CLIL is guided by two of Krashen’s (Citation1982) hypotheses on the importance of language for the construction of meaning. The maximum input hypothesis proposes maximum exposure to the L2 to promote second language learning. The comprehensible input hypothesis highlights meaningful exposure to content in the L2 combined with the occasional use of the L1 to support learning.

This complex and mutually dependent relationship between L1, L2, and content (Yore & Treagust, Citation2006) is reflected in Cummins’ (Citation1979) developmental interdependence hypothesis. He postulates that the L1 must exceed a certain threshold of proficiency to (1) enable higher-order thinking, (2) allow for adequate L2 acquisition, and (3) facilitate content learning in the L2. For L2 learning, Cummins (Citation1979) distinguishes between everyday language and scientific language. He identifies them as Basic Interpersonal Communicative Skills (BICS) and Cognitive Academic Language Proficiency (CALP) respectively. Scientific literacy, as a core achievement of CLIL (Meyerhöffer & Dreesmann, Citation2019), requires both BICS and CALP. Yet, combining their teaching can be challenging (Cummins, Citation1979).

One way to navigate this challenge is to engage students ‘in authentic communication through the use of hands-on tasks […] related to everyday experiences’ (Buxton et al., Citation2008, p. 501). Hands-on tasks require information to be presented in a coherent and accessible manner, which stimulates CALP and BICS simultaneously (Gonzalez-Howard & McNeill, Citation2016). Argumentation during hands-on tasks fosters scientific literacy in both the L1 and L2 (Gonzalez-Howard & McNeill, Citation2016; Oga-Baldwin, Citation2019; Walker & Sampson, Citation2013). For CLIL, this combination can mediate between external stimuli, internal processes and academic goals (Lam et al., Citation2012). Lin (Citation2015) incorporates this idea in a scaffolding strategy called the Multimodalities/Entextualization Cycle (MEC). The MEC encompasses three distinct phases of active student engagement with the language and content. The first phase uses multimodalities such as videos, diagrams, experiments, or discovery activities to create a ‘rich experiential context’. The second phase requires the student to combine everyday L1 or L2 to explore the topic and switch between multimodalities while ‘engag[ing] in reading and note-making’. The third phase encourages the student to use L1 and L2 scientific terminology to, for instance, explain and evaluate an experimental design with the help of language scaffolding material (Lin, Citation2010, Citation2015).

The ‘rich experiential context’ and the individual learning approaches of the MEC are rooted in the constructivist model of learning. Constructivism is a theory of learning in which learners actively construct their knowledge from experience. This framework for learning requires more interactive learning environments than the previous transmission model (Prawat & Floden, Citation1994; Korthagen & Lagerwerf, Citation1995) and is at the very heart of CLIL (Ting, Citation2010). Building on the Five Es – Engagement, Exploration, Explanation, Elaboration, and Evaluation (Bybee, Citation1997; Bybee & Powell, Citation1993) – the constructivist model of learning has the power to explain not only content learning but also language learning. Much like L2 acquisition (Gonzalez-Howard & McNeill, Citation2016; Lin, Citation2015; Oga-Baldwin, Citation2019; Walker & Sampson, Citation2013), constructivist learning requires engagement or motivation. Students need to be interested in the content but also need to actively construct meaning from their experience (Boddy et al., Citation2003).

Learning, however, is more than just an outcome of experience and one method does not work for all students (Hodson, Citation2014), especially in a language learning context (Gonzalez-Howard & McNeill, Citation2016). Experience is highly individual and may be influenced by factors like prior knowledge and exposure, or different learning styles (Lee et al., Citation2015; Lin, Citation2015). The individuality and heterogeneity of experience can be accounted for in different ways, such as using an inquiry learning approach (Vygotsky, Citation1971). This approach allows each student to choose their preferred learning style. Inquiry learning is commonly understood as a ‘bottom up’ approach that gives learners agency to create knowledge through observation and experimentation with the teacher acting as a guide (Donaldson & Allen-Handy, Citation2020; Rocard et al., Citation2007). This does not necessarily mean that learners can freely decide on learning content, since curricula predefine a clear set of learning goals. Students can, however, choose methods for their individual learning. They are given the agency to shape their own classroom experience and create meaning through social construction (Renninger et al., Citation2018).

Content and language integrated experimentation

Hands-on experiments in an educational setting can offer the engaging learning environment that CLIL requires and that is rooted in constructivist learning and inquiry (e.g. Buse et al., Citation2018; Lin, Citation2015). During the experiments, students can either follow a predefined laboratory procedure and discuss what they have learned, or transfer the procedure into another form of representation and design their own experiments (Gardner & Elliott, Citation2014; Tobin, Citation1990). Either way, the processes involved in experimentation satisfy requirements for the Five Es and engage the MEC for the development of scientific literacy in both the L1 and L2 (Bybee, Citation1997; Bybee & Powell, Citation1993; Lin, Citation2015). Hands-on experimentation can also lead to intense scientific discourse that helps students not only to transfer their content knowledge but also to practise appropriate scientific terminology interactively (Honeycutt-Swanson et al., Citation2014; Kelly, Citation2007). Moreover, it can engage students in science (Lovey & Riggs, Citation2019).

Whilst hands-on experiments have been shown to positively influence content learning across various settings (e.g. Fernández-Fontecha et al., Citation2020; Kelly, Citation2007; Mierdel & Bogner, Citation2019), the results for combinations with CLIL remain ambiguous. A study by Evnitskaya and Morton (Citation2011), for instance, highlights the benefits of putting students into the role of observers, constructors, and critiquers during experimentation. Using both L1 and L2, they can learn how to communicate their findings to different audiences. Studies by Rodenhauser and Preisfeld (Citation2014) and Buse et al. (Citation2018) show that CLIL does not provide substantial benefits over regular non-CLIL experimentation. On the contrary, Piesche et al. (Citation2016) found that CLIL negatively impacts content learning. Most of these studies based their insights on long-term observations of CLIL strands. Very few studies have investigated content learning (Meyerhöffer & Dreesmann, Citation2019) and even fewer have examined the feasibility of CLIL-based experimentation and scientific modelling in a one-day setting.

Objectives of the study

The present study seeks to explore the potential benefits of incorporating a Content and Language Integrated Learning (CLIL) approach in science education. By investigating the influence of a short-term CLIL science module during a hands-on laboratory experience, this research aims to shed light on how CLIL can enhance students' content learning. Furthermore, comparing the content learning outcomes between the CLIL group and the non-CLIL group will provide valuable insights into the effectiveness of CLIL as a teaching method. Thus, our study focuses on the following research questions:

  1. How does a short-term CLIL science module influence content learning during a hands-on laboratory?

  2. How does CLIL influence content learning outcomes when compared to the non-CLIL group?

The specific goals were three-fold:

  • to assess students’ overall content learning throughout the laboratory experience

  • to determine potential differences between non-CLIL and CLIL groups by comparing their content learning scores

  • to examine correlations between CLIL learners’ content learning performance and their Biology and English grades

Materials and methods

Intervention phases

Our study builds on a one-day gene-technology laboratory module which has been developed in various iterations since 2016 (Goldschmidt & Bogner, Citation2016; Langheinrich & Bogner, Citation2016; Mierdel & Bogner, Citation2019; Roth et al., Citation2020). The structure of the last iteration was the basis for our non-CLIL comparison group. We also retained its overall structure and experimental and modelling phases for the design of our CLIL module (Roth et al., Citation2020). However, we used English as the language of instruction (English texts and workbooks) and a separate vocabulary exercise book for language scaffolding. These small modifications allowed us to explore the effects of CLIL whilst ensuring treatment comparability (Table 1).

Table 1. Quasi-experimental intervention design.

The CLIL (and the non-CLIL) module was designed for ninth graders of Bavarian grammar schools and focused on the structure of DNA. Learning activities followed the Five Es of the constructivist model of learning (Bybee, Citation1997; Bybee & Powell, Citation1993) and each phase of the module focused on a different aspect of the structure and function of DNA. Short theoretical introductions provided the ‘hooks’ for further exploration in the experimentation and modelling phases. Students were encouraged to take notes on their observations during the experimentation and modelling phases. Open questions in the laboratory manual additionally required them to find explanations for their observations (ESM 1). A short recapitulation after each phase and a separate evaluation phase for the modelling activities were provided to help students contextualise their content knowledge.

The different phases of the module each reflected the three scaffolding cycles of the MEC (Lin, Citation2015). For the experimentation and modelling phases, the laboratory manual provided visuals, process models, and diagrams to make the topic more accessible independent of language proficiency (ESM 2). Group work on the different phases and open questions in the laboratory manual encouraged students to use both the L1 and L2 and train BICS and CALP (Cummins, Citation1979) while exploring the structure of DNA. English was used for instruction and was encouraged for communication between students. However, we included German translations (code-switching) for key vocabulary in the laboratory manual (Cheshire & Gardner-Chloros, Citation1998). An additional vocabulary exercise book, which included one page for each phase, provided the relevant scientific terminology. Students could use the vocabulary to answer the open questions or to discuss and evaluate the experimentation and modelling phases. The questions were in English, but answers were accepted in both English and German. This approach was consistent with a supportive lexical focus on form (FonF) in CLIL environments (Morton, Citation2015, p. 256).

The instructor, who has a background in both English and Biology, acted as a guide throughout the module and provided demonstrations of key experimental steps prior to experimentation. For the theoretical phases and for the final interpretation phase, the instructor used an interactive smartboard presentation to engage students and a poster on gel-electrophoresis with clozes to explain the procedure. Throughout the two months of interventions at the university laboratory, three to four classes participated on separate days each week. Students always remained in their respective classes and worked in pairs.

1. Pre-lab phase. Many students had little prior experience in laboratory procedures. In this phase, the instructor used an interactive smartboard presentation with visuals and demonstrations to familiarise students with the different laboratory techniques and concepts related to experimentation (e.g. Sarmouk et al., Citation2019).

2. DNA-related theoretical and experimental phases. We introduced each experimental phase with a short overview of the equipment, experimental procedures, and underlying concept. To engage students, we invited them to solve a murder mystery using evidence (saliva) similar to that which the assailant left on the coat of the victim’s spouse (DNA relevance). A closer look at the composition of saliva in an interactive smartboard presentation reactivated students’ prior knowledge about cells and introduced the concept of DNA. After DNA extraction, students received an introduction to gel-electrophoresis. With the help of a poster and a demonstration of the agarose gel preparation, the instructor explained the techniques of gel-electrophoresis.

3. Experimental Phases. We used an evidence-based, two-step approach in this phase (Roth et al., Citation2020): Students answered questions in their laboratory manuals and then worked in pairs to explore possible approaches to solve the problem before carrying out the experiments. On completion of the experiments, students took notes, preferably in English, about their observations and answered open questions to clarify the explanations of their observations. This reflective writing (Kovanović et al., Citation2018; Wilmes & Siry, Citation2019) encouraged the students to rethink and reassess the steps in experimental procedures (Mierdel & Bogner, Citation2019; for details, see below). Instead of simply following instructions to complete their tasks, they used their cognitive abilities to make sense of the experiments (Mierdel & Bogner, Citation2019).

4. Model-related Phases. After a short lunch break, the students entered the model-related phases. During these phases, they were given the opportunity to build on the knowledge of the experimentation phases and consolidate their knowledge through model building and evaluation (Bybee, Citation1997; Bybee & Powell, Citation1993). To help students understand modelling in science, we divided our model-related phases into two modelling and two evaluation phases. The modelling phases involved mental modelling based on the analysis of an original English letter from Crick to his son (Usher, Citation2013) and a modelling phase using craft materials. Model evaluation included both model evaluation-1 and model evaluation-2.

Our four model-related phases were adapted from the four stages of modelling defined by Justi and Gilbert (Citation2002). Students firstly gathered information on the structure of DNA. Based on an analysis of Crick’s letter, which contained metaphors describing the structure of DNA (Usher, Citation2013), they were given the opportunity to construct a mental model (model phase 1). This mental model served as a basis for a physical model built from craft materials (model phase 2). In the final phase, students identified limitations of their first model (Justi & Gilbert, Citation2002).

Students evaluated their model in a reciprocal self-evaluation mode (evaluation-1 phase) by comparing their craft DNA models with a paper-and-pencil version of the model. Additional open-ended questions in the laboratory manual on the structure of DNA supported this self-evaluation process (Roth et al., Citation2020). In model evaluation-2 phase, students used a commercially available DNA demonstration model to assess the quality of their hand-crafted DNA models.

5. Interpretation Phase. The interpretation phase began with the result of the modelling and evaluation phases. The instructor used an interactive smartboard presentation to review the different modelling phases and presented the original DNA model created by Watson and Crick. The presentation aimed to raise students’ awareness about the different DNA models, which often vary in their level of detail. The instructor showed the results of the gel-electrophoresis and explained the reason for the formation of the different bands on the electrophoresis gel.

Language scaffolding

We designed a language scaffolding exercise book with scaffolding exercises for each intervention phase (for examples, see ESM 3). It contained language-specific riddles in the form of different word search puzzles or crossword puzzles. Moreover, students were required to match definitions with words and select or provide their appropriate translations. Students were allowed to use English-English and English-German dictionaries when necessary.

In addition to the language scaffolding exercise book, we included short translations or explanations for key scientific vocabulary in the laboratory manuals and interactive smartboard presentations. We also used code-switching for our interactive poster with clozes to explain the procedure of gel-electrophoresis. The text on the poster was written in English. New scientific vocabulary, which required more in-depth explanations, was omitted from the text and printed on magnetic strips. The magnetic strips also included German translations of the scientific vocabulary. Where explanations in English were unclear, the instructor switched to German upon request.

Students were encouraged to approach the instructor or a junior assistant to ask questions. The research assistants were proficient in English and Biology, with a language level of at least C1 (requirement to graduate in English at German universities). The instructor and assistants would first try to rephrase explanations in English or use code-switching for specific terms or sentences. Students were given cards in green, yellow, and red to indicate how well they had understood the theoretical phases. Green indicated that the content had been well understood; yellow showed that additional explanations were necessary; and red mandated a recapitulation of the theoretical phase and/or code-switching.

Participants

A total of 252 ninth graders from Bavarian grammar schools participated in the intervention (girls 52.4%, boys 47.6%; SDGender = 6.2; MAge = 14.6, SDAge = 0.7). Seven classes took part in the non-CLIL intervention (n = 139) and eight classes in our CLIL intervention (n = 107). The non-CLIL group was divided into 70 student groups (68 two-person groups and one three-person group) and the CLIL treatment group into 53 student groups (51 two-person groups and one three-person group). To reduce the potential influence of school grades and previous knowledge, we calculated T0 knowledge scores and compared Biology grades for both groups. We found no significant difference (Mann–Whitney U test [MWU]: Z = −.725, p = .468) in grades between the groups. We did not perform any testing of prior language skills, as none of the students had considerable exposure to the English language outside of the classroom. Most participating teachers had a background in Biology and Chemistry; only one teacher had studied Biology and English. Information on the teachers’ background was provided in the application letter and could be verified by short resumes on the schools’ websites. The teachers either actively participated in the laboratory experiments or observed their students’ performance and involvement.
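As an illustration of the kind of group comparison reported above, the following Python sketch computes a Mann–Whitney U statistic with a tie-corrected normal approximation. It is a minimal sketch under stated assumptions: the grade lists are invented for demonstration and are not the study's data, the function names are hypothetical, and the sign convention for Z (based on the smaller U, without continuity correction) is an assumption, since the software and settings used in the study are not stated.

```python
from math import sqrt, erfc


def rank_with_ties(values):
    """Return 1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1  # mean of positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    return ranks


def mann_whitney_u(group_a, group_b):
    """Return (U, Z, two-sided p) using a tie-corrected normal approximation."""
    n1, n2 = len(group_a), len(group_b)
    combined = list(group_a) + list(group_b)
    ranks = rank_with_ties(combined)
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    u = min(u1, n1 * n2 - u1)  # assumption: report the smaller U, so Z <= 0
    n = n1 + n2
    tie_counts = {}
    for value in combined:
        tie_counts[value] = tie_counts.get(value, 0) + 1
    tie_term = sum(t ** 3 - t for t in tie_counts.values())
    variance = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if variance <= 0:  # all observations identical: no evidence of a difference
        return u, 0.0, 1.0
    z = (u - n1 * n2 / 2) / sqrt(variance)
    p = erfc(abs(z) / sqrt(2))  # two-sided p-value from the normal distribution
    return u, z, p


# Invented Biology grades (German scale, 1 = best) for two hypothetical groups.
non_clil_grades = [2, 3, 1, 2, 4, 3, 2]
clil_grades = [3, 2, 2, 4, 1, 3, 3]
u, z, p = mann_whitney_u(non_clil_grades, clil_grades)
print(f"U = {u:.1f}, Z = {z:.3f}, p = {p:.3f}")
```

A rank-based test of this kind suits ordinal school grades, which is presumably why it was preferred over a t-test here; for small samples, an exact test would be more appropriate than the normal approximation used in this sketch.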

Participation of schools and students was voluntary. To recruit volunteers, we sent invitations to neighbouring grammar schools within a 50-mile radius six months prior to the intervention. Teachers were asked not to teach DNA structure and function before student participation in the study. To comply with regulations of the Bavarian Ministry of Education and Cultural Affairs, we asked for written parental consent before the students participated in our study. Data collection was pseudo-anonymous and students could not be identified (Declaration of Helsinki, Citation2013). The design of the module and the questionnaires were pre-approved by the ethics committee of the Bavarian Ministry of Education and Cultural Affairs and received the reference number X.7-BO5106/149/10. The content of the module matches requirements of the state’s syllabus and follows national competency requirements (KMK, Citation2005). The 2018 non-CLIL module underlying our intervention was adapted in two design, evaluation, and development cycles with piloting groups to accommodate CLIL. The 2018 module classes did not take part in the 2020 CLIL intervention, and the non-CLIL and CLIL groups were treated as an independent variable (Cook & Campbell, Citation1979).

Variables

We tested content learning using an established content knowledge questionnaire for structural and functional characteristics of DNA (Langheinrich & Bogner, Citation2016; Mierdel & Bogner, Citation2019; Roth et al., Citation2020). The questionnaire contained 30 multiple-choice questions each with four distractors and one correct answer (for examples, see Table 2). In addition to the questionnaire, we distributed a cloze test to assess language learning, a questionnaire to measure student self-efficacy beliefs, and a questionnaire to capture creativity (Roth, Conradty, et al., Citation2022) at three different testing times – two weeks before participation (pre-test; T0), directly after the module (post-test; T1), and eight weeks after participation (retention-test; T2). The cloze test, which assessed reading skills, was taken in the L2 while all other questionnaires were in German. Based on studies like Serra (Citation2007) and Massler (Citation2011), we decided to use the content knowledge questionnaire in the L1 to measure the advantages of L1 cognition in CLIL beginners. A more recent study by Canz et al. (Citation2021) supports our approach and highlights the importance of students being able to communicate content knowledge in the L1 ‘to reach educational standards in the subject and do not experience disadvantages compared to monolingually taught students’ (p. 11). For the purpose of objectivity, students were not given the hypotheses underlying the study.

Table 2. Knowledge item examples.

The questionnaire used for content learning was designed and developed during three studies with more than 2000 student participants and their teachers (Langheinrich & Bogner, 2016; Mierdel & Bogner, 2019; Roth et al., 2020). Its items were created based on texts from schoolbooks and the course syllabus for genetics. The items’ difficulties were calculated with the help of pilot groups, and we replaced items that were too easy or too difficult. Eighteen items focused on structural aspects of DNA and 12 on functional aspects (Roth et al., 2020; Roth, Scharfenberg, 2022). Questions and respective multiple-choice answers were altered and randomly assigned after each testing period (T0, T1, T2) to prevent ‘automated’ responses.

We established content validity of the questions by comparing them with the state syllabus and the content of the module. Inter-item correlations below .20 (T0 = .08; T1 = .19; T2 = .18) indicated that the items were distinct and that they tested different areas of content knowledge. In combination with the complexity of the latent construct of cognitive achievement, the low inter-item correlations confirmed construct validity (Rost, 2004). The reliability of the questionnaire was determined by Cronbach’s alpha. Scores of .74 (T0), .76 (T1), and .78 (T2) exceeded the threshold of .70, which, according to Lienert and Raatz (1998), allows for differentiating groups. An additional calculation of item difficulties (percentage of correct answers, Bortz & Döring, 1995) indicated a range between 5% (high difficulty) and 90% (low difficulty). Comparisons between the different testing times showed that item difficulties improved from T0 to T1, particularly for the CLIL group (Figure 1).
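The two item statistics named above can be illustrated with a minimal sketch. It assumes dichotomously scored items (1 = correct, 0 = incorrect) and uses a small invented response matrix; the function names and the data are hypothetical and do not reproduce the study's analysis or results.

```python
from statistics import pvariance

def cronbach_alpha(responses):
    """Cronbach's alpha for a list of per-student lists of 0/1 item scores."""
    k = len(responses[0])  # number of items
    # Variance of each item across students
    item_vars = [pvariance([row[i] for row in responses]) for i in range(k)]
    # Variance of the students' sum scores
    total_var = pvariance([sum(row) for row in responses])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)

def item_difficulties(responses):
    """Item difficulty as the percentage of correct answers per item."""
    n = len(responses)
    k = len(responses[0])
    return [100 * sum(row[i] for row in responses) / n for i in range(k)]

# Hypothetical toy data: 5 students x 4 items (not data from the study)
responses = [
    [1, 1, 1, 0],
    [1, 0, 1, 0],
    [0, 0, 1, 0],
    [1, 1, 1, 1],
    [0, 0, 0, 0],
]

print("alpha:", round(cronbach_alpha(responses), 2))
print("difficulties (%):", item_difficulties(responses))
```

In this convention a high percentage marks an easy item and a low percentage a difficult one, which matches the range of 5% (high difficulty) to 90% (low difficulty) reported above.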

Figure 1. Item difficulties of monolingual and bilingual learners for knowledge items between T0 and T1.

We calculated sum scores and analysed these for improvements in content knowledge (T1 minus T0) and retention (T2 minus T0). Furthermore, we calculated the actual learning success with respect to the maximum attainable score (30 correct answers), (T1 − T0) × (T1/30), and the persistent learning success, (T2 − T0) × (T2/30) (Scharfenberg et al., 2007). Increases in content knowledge were thus weighted by the score students actually achieved, which makes learning success easier to compare. This weighting also accounts for students who exhibit a significant increase in knowledge yet low achieved scores, and vice versa. We calculated correlations between Biology and English grades and post-test (T1) scores for knowledge items using Spearman’s rho (Field, 2012).
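A brief worked example may clarify how the weighting behaves. The sketch below implements the two formulas as stated; the function names and the two students' scores are hypothetical and chosen only for illustration.

```python
MAX_SCORE = 30  # maximum attainable score (30 correct answers)

def actual_learning_success(t0, t1, max_score=MAX_SCORE):
    """(T1 - T0) x (T1 / max): gain weighted by the score achieved at T1."""
    return (t1 - t0) * (t1 / max_score)

def persistent_learning_success(t0, t2, max_score=MAX_SCORE):
    """(T2 - T0) x (T2 / max): retained gain weighted by the score at T2."""
    return (t2 - t0) * (t2 / max_score)

# Two hypothetical students with the same raw gain of 6 points
student_a = actual_learning_success(t0=6, t1=12)   # low achieved score
student_b = actual_learning_success(t0=18, t1=24)  # high achieved score

print(round(student_a, 2), round(student_b, 2))
```

Both students gain six points, but the student who reaches 24 of 30 receives twice the learning success value of the student who reaches 12 of 30, because the gain is scaled by the proportion of the maximum score actually achieved.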

Statistical analysis

We used nonparametric tests to analyse our data as first assessments showed a non-normal distribution of our variables (Kolmogorov–Smirnov test (Lilliefors modification): partially p < .001). Our assessment of subgroups, which often did not reach the threshold required for assuming a Gaussian distribution, supported our decision to use nonparametric tests (Lomax, 1986). We used boxplots to visualise our results. To analyse intra-group differences across the three testing times, we applied the Friedman test (F) to identify general differences and the Wilcoxon signed-rank test (W) to show changes between testing times. For differences between the non-CLIL and CLIL treatment groups, we used the Mann–Whitney U test (MWU). Moreover, we used Bonferroni corrections to eliminate marginally significant results that could simply have arisen by chance due to multiple testing (Field, 2012). Where the results remained significant after the corrections, we calculated effect sizes r (Lipsey & Wilson, 2001) and categorised them into small (> 0.1), medium (> 0.3), and large (> 0.5) effect sizes. We used Spearman’s rank correlations for correlation analyses and reported them as Spearman’s rho values.
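The inter-group procedure described above can be sketched in a few lines. The example is an illustration under stated assumptions, not the software or exact computation used in the study: it uses the normal approximation of the Mann–Whitney U test without tie or continuity correction, derives the effect size as r = |z| / √N, and applies a Bonferroni-adjusted significance threshold. All function names and the two score lists are hypothetical.

```python
import math

def average_ranks(values):
    """Rank values from 1 to n; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def mann_whitney_u(a, b):
    """Return (U, z, two-sided p, effect size r) via normal approximation.

    Simplification: no tie correction and no continuity correction.
    """
    n1, n2 = len(a), len(b)
    ranks = average_ranks(list(a) + list(b))
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    u = min(u1, n1 * n2 - u1)
    z = (u - n1 * n2 / 2) / math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided p-value
    r = abs(z) / math.sqrt(n1 + n2)       # effect size r = |z| / sqrt(N)
    return u, z, p, r

def bonferroni_alpha(alpha, n_tests):
    """Significance threshold after Bonferroni correction."""
    return alpha / n_tests

def effect_label(r):
    """Categories as used in the text: > 0.1 small, > 0.3 medium, > 0.5 large."""
    if r > 0.5:
        return "large"
    if r > 0.3:
        return "medium"
    if r > 0.1:
        return "small"
    return "negligible"

# Hypothetical gain scores for two groups (not data from the study)
group_1 = [1, 2, 3, 4, 5]
group_2 = [6, 7, 8, 9, 10]

u, z, p, r = mann_whitney_u(group_1, group_2)
significant = p < bonferroni_alpha(0.05, n_tests=4)
print(u, round(z, 2), round(p, 4), round(r, 2), effect_label(r), significant)
```

For real data with many ties or small groups, an established statistics package with tie correction or exact p-values would be the safer choice; the sketch is only meant to make the sequence of test, correction, and effect size explicit.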

Results

Intra-group analyses of content learning

Our intra-group analysis (F and W tests, Table 3) of students’ overall model-related and scientific knowledge of DNA indicated significant changes for the monolingual and bilingual treatment groups: their knowledge increased from T0 to T1 and then declined from T1 to T2, but never fell below T0 levels. These results suggest that both student groups were able to attain short-term and mid-term knowledge through participation in the intervention ().

Table 3. Content learning of non-CLIL and CLIL student groups.

Inter-group analyses of content learning

To mitigate the influence of differences in students’ prior knowledge at T0 and to determine students’ short-term increase in content knowledge and mid-term retention rates, we also calculated difference variables (Table 4, note a). Our calculations for improvements in knowledge (T1 − T0) and retention rates (T2 − T0) across all 30 knowledge-test items were based on sum scores (Field, 2012) (). Additional calculations aimed to determine learning success variables for actual learning success ((T1 − T0) × (T1/30)) and persistent learning success ((T2 − T0) × (T2/30)) (Scharfenberg et al., 2007). Both difference and learning success variables were then used to assess inter-group differences employing the Mann–Whitney U test (see Table 4, notes).

Table 4. Dependent variables for both non-CLIL and CLIL, analysed with regard to content knowledge scores, difference variables and learning success.

Overall knowledge

The assessment of the 30-item knowledge test revealed significant differences in improvements in knowledge and retention rates between non-CLIL and CLIL learners. Non-CLIL learners scored significantly higher with a medium-to-large effect size in both difference variables (Table 4, notes d/e).

While these results already provided a good indication of the students’ performance, they did not account for overall content knowledge. We therefore also calculated the learning success variables: actual learning success ((T1 − T0) × (T1/30)) and persistent learning success ((T2 − T0) × (T2/30)). We then calculated Mann–Whitney U scores for inter-group differences and Wilcoxon scores for differences between the testing times (Field, 2012). For short-term content learning, we identified significant differences with a small effect size (Table 4; notes h). Non-CLIL learners achieved higher short-term content learning scores than CLIL learners. The differences, however, did not persist over time (Table 4; notes i). For mid-term permanent content learning, both non-CLIL and CLIL learners achieved comparable content learning scores in the 30-item knowledge test (Figure 2).

Figure 2. Differences between monolingual and bilingual learners in content learning between testing times T0, T1, and T2; calculated improvements in content knowledge and retention rates as well as temporary and permanent learning success.

Correlation with Biology grades

There was no significant correlation in the CLIL group (rS = −.049, p = .621) between Biology grades and post-test (T1) scores. Yet, difference and learning success variables of the 30-item knowledge test revealed significant negative correlations (rS = −.171, p = .045) for non-CLIL learners. In effect, non-CLIL students with lower grades appear to have improved their difference and learning success scores in post-tests when compared to students with better grades. This did not hold true for CLIL learners.

Discussion

Data from our one-day outreach module suggest that it had a positive effect on content learning regardless of whether it was implemented as CLIL or non-CLIL. Both treatment groups profited from the combination of hands-on tasks and minds-on activities (), which corroborates findings from previous studies (e.g. Goldschmidt & Bogner, 2016; Langheinrich & Bogner, 2016; Mierdel & Bogner, 2020; Roth et al., 2020). An important determinant of this success appears to have been our choice not to use ‘cookbook’ procedures and to accommodate both cognitive and affective domains (Hofstein & Lunetta, 2004). In effect, the students learned not only to handle laboratory equipment confidently but also to identify problems, generate feasible hypotheses, analyse data, and develop models to explain their results (Carmel et al., 2020).

Yet, if not regularly taught in class, these ‘non-cookbook’ procedures can limit knowledge acquisition due to cognitive load (e.g. Mierdel & Bogner, 2020; Roth et al., 2020). We observed the negative impact of high cognitive load on knowledge processing in a previous study (Mierdel & Bogner, 2019). The study showed that a non-CLIL module based on outreach hands-on learning is already cognitively exhausting (e.g. Meissner & Bogner, 2012; Scharfenberg & Bogner, 2010). Adding conceptual learning in a foreign language can easily increase this exhaustion (Piesche et al., 2016; Rodenhauser & Preisfeld, 2014) and overstrain the working memory (Roussel et al., 2017). Although we did not measure cognitive load in the current study, we observed signs of exhaustion like a loss of focus and fatigue in the second half of the intervention. Students and teachers also made frequent remarks on how exhausted they were during and after the laboratory and how much effort the use of the L2 required. Moreover, the item difficulty was perceived to be much higher by the CLIL groups although the items were the same as for the non-CLIL group (Figure 1). This suggests that the germane load, which is crucial for memorising and learning, was impaired by high intrinsic and extraneous load (Sweller, 2015) resulting from the combination of content and language learning.

Initially poorer short-term knowledge scores for the CLIL groups (; ) suggest that our module mentally exhausted the students and exceeded their processing capabilities. These results confirm competing demands between language and content learning. Canz et al. (2021), for instance, show that greater experience in the L1 leads to an improvement in the processing of content as opposed to the L2. Poor skills in the active and passive language may negatively affect content learning. In our study, students received not only a maximum of L2 exposure but also high content input from experimentation and modelling activities. This may have been too challenging (Craik & Lockhart, 1972).

We also observed positive effects. For instance, the discursive activity of negotiating meaning in the process of doing science by talking science helped many of the students to understand fundamental scientific concepts (e.g. Evnitskaya & Morton, 2011; Kress et al., 2001). The combination of hands-on tasks with minds-on activities in CLIL learning (Glynn & Muth, 1994) seems to have supported knowledge acquisition (). Students conversed during the sessions, answered the open questions in the laboratory manual, and solved riddles in the vocabulary exercise book. Moreover, the instructor and research assistants rarely used code-switching except for the content knowledge test, which was consistently in the L1, to make CLIL beginners feel more confident. Besides positive effects on confidence, testing content knowledge in the L1 helps to better map students’ de-facto knowledge achievement (Canz et al., 2021; Massler, 2011) and shows their ability to transfer knowledge encoded in the L2 to the L1 (Canz et al., 2021).

With mid-term permanent content learning success, factors other than cognitive load appear to be at play as the differences between the CLIL and non-CLIL groups were significantly smaller (; ). A possible explanation may be that CLIL learners cognitively process content on several levels (Meyerhöffer & Dreesmann, 2019; Rodenhauser & Preisfeld, 2014). Although more challenging, many text passages in our CLIL module required repeated and precise reading, which may explain why a similar amount of knowledge remained anchored in long-term memory (e.g. Marian & Fausey, 2006; Rodenhauser & Preisfeld, 2014). Piesche et al. (2016) and Fernández-Sanjurjo et al. (2017) provide a similar explanation for knowledge acquisition in both native language and CLIL learners. Admiraal et al. (2006) and Haagen-Schützenhöfer et al. (2011) do not report significant differences between their CLIL and non-CLIL groups for all testing times. Instead, they ascribe their success to the choice of teaching methods, such as delivering the same content knowledge across several learning cycles in different forms of presentation (e.g. text, visuals, etc.).

We used this approach in our CLIL module and adopted linguistic, graphic, and interactive scaffolds (ESM 1; ESM 2) to support content and language learning. Appropriate scaffolding at both the language and content level can reduce the cognitive load but requires high-quality scaffolds (e.g. Fernández-Fontecha et al., 2020; Gottlieb, 2016). Short-term content knowledge acquisition, however, fell short of what we had hypothesised. This suggests that our scaffolds reduced neither the language-cognitive nor the content-cognitive demands (Grandinetti et al., 2013). Many students had difficulties learning the required vocabulary whilst at the same time understanding the genetic concepts and laboratory procedures. Their language may not have developed fast enough to realise positive content level effects of the scaffolds. CLIL short-term implementations may therefore require more than scaffolding to promote knowledge acquisition (e.g. Coyle, 2007; Grandinetti et al., 2013). Fernández-Fontecha et al. (2020), for instance, suggest multimodal scaffolding, which includes images, to present content in semiotic forms other than text.

Another approach could be to have students learn context-specific key vocabulary, such as experimental procedures and equipment, prior to participation in short-term CLIL implementations like our gene-technology lab. This would reduce cognitive load, and students would only need to put their recently learned vocabulary into context (McGuiness, 1999). At the same time, an increased focus on key vocabulary should not distract from the complex relationship between language and content learning (European Commission, 2004). To further explore this relationship in the science classroom, additional research will be required. One avenue worth investigating may be how deep learning, as opposed to more superficial content learning, could be fostered using CLIL in short-scale bilingual implementations. Ke et al. (2020), for instance, have found commonalities between high levels of discursive activity and deep-learning processes associated with scientific modelling. The exploration of this type of content knowledge could help us tackle the trade-off between content and language learning.

Limitations

Our study builds on a one-day outreach CLIL module with ninth graders who have little prior experience in hands-on experimentation and with the foreign language outside of classrooms. We also lacked commonly agreed standardised instruments for assessing CLIL content learning, which may impair adequate comparison (Dalton-Puffer, 2011). Additionally, the context-dependency of CLIL encumbers the extrapolation of our results to other contexts (Pérez-Cañado, 2012). Due to the diversity of possible CLIL implementations, the generalisation of our results is limited to short-term implementations of CLIL in science subjects (Fernández-Sanjurjo et al., 2017). To better understand this type of CLIL, researchers may need to describe the various implementation options more explicitly.

Conclusion

Previous studies have primarily focused on language learning or content learning in long-term CLIL modules (Meyerhöffer & Dreesmann, 2019). Our study of a short-term CLIL module contributes to understanding the relationship between language and science content learning at different levels. Although the CLIL group proved less successful, CLIL outreach learning yielded positive data for long-term learning in the content knowledge test. Whilst we cannot establish the exact reason for lower scores in the CLIL group, we think that either reducing the content in later trials or prior training of key vocabulary may help reduce cognitive load (Martin, 2015; Sweller, 2015).

Ethical statement

Hereby, we, Tamara Roth and Franz X. Bogner, consciously assure that for the manuscript ‘The trade-off between STEM knowledge acquisition and language learning in short-term CLIL implementations’ general ethical standards have been fulfilled and ‘all procedures performed in studies involving human participants were in accordance with the ethical standards of the institutional and/or national research committee and with the 1964 Helsinki declaration and its later amendments or comparable ethical standards.’

Consent statement

‘Informed consent was obtained from all individual participants included in the study.’

Acknowledgements

Special thanks to the teachers and students involved in this study for their cooperation. Also, many thanks to the University of Bayreuth together with the national BMBF's ‘Qualitätsoffensive Lehrerbildung’ (#01JA1901) and the Horizon 2020 Framework Program (grant number 665917), as well as the Luxembourg National Research Fund FNR and PayPal (P17/IS/13342933/PayPal-FNR/Chair in DFS/Gilbert Fridgen).

Disclosure statement

No potential conflict of interest was reported by the author(s).

Additional information

Funding

The funding of the research was a joint endeavor of the University of Bayreuth together with the national BMBF's ‘Qualitätsoffensive Lehrerbildung’ (#01JA1901) and the Horizon 2020 Framework Program (grant number 665917), as well as the Luxembourg National Research Fund FNR and PayPal (P17/IS/13342933/PayPal-FNR/Chair in DFS/Gilbert Fridgen).

References

  • Admiraal, W., Westhoff, G., & de Bot, K. (2006). Evaluation of bilingual secondary education in The Netherlands: Students’ language proficiency in English. Educational Research and Evaluation, 12(1), 75–93. https://doi.org/10.1080/13803610500392160
  • Ammon, U. (2001). English as a future language of science at German universities? A question of difficult consequences, posed by the decline of German as a language of science. In U. Ammon (Ed.), The dominance of English as a language of science (pp. 343–361). Mouton de Gruyter.
  • Boddy, N., Watson, K., & Aubusson, P. (2003). A trial of the Five Es: A referent model for constructivist teaching and learning. Research in Science Education, 33(1), 27–42. https://doi.org/10.1023/A:1023606425452
  • Bortz, J., & Döring, N. (1995). Forschungsmethoden und evaluation [Research methods and evaluation]. Springer.
  • Buse, M., Damerau, K., & Preisfeld, A. (2018). A scientific Out-of-school programme on neurobiology employing CLIL. Its impact on the cognitive acquisition and experimentation-related ability self-concepts. International Journal of Environmental & Science Education, 13(8), 647–660.
  • Buxton, C., Lee, O., & Santau, A. (2008). Promoting science Among English language learners: Professional development for today’s culturally and linguistically diverse classrooms. Journal of Science Teacher Education, 19(5), 495–511. https://doi.org/10.1007/s10972-008-9103-x
  • Bybee, R. W. (1997). Achieving scientific literacy: From purposes to practices. Heinemann.
  • Bybee, R. W., & Powell, J. C. (1993). Investigating diversity and limits, middle school science and technology. Dubuque. Kendall Publishing.
  • Cammarata, L., & Haley, C. (2017). Integrated content, language, and literacy instruction in a Canadian English immersion context: A professional development journey. International Journal of Bilingual Education and Bilingualism, 21(3), 1–17. https://doi.org/10.1080/13670050.2017.1386617
  • Canz, T., Piesche, N., Dallinger, S., & Jonkmann, K. (2021). Test-language effects in bilingual education: Evidence from CLIL classes in Germany. Learning and Instruction, 75, 1–14. https://doi.org/10.1016/j.learninstruc.2021.101499
  • Carmel, J. H., Herrington, D. G., Posey, L. A., Ward, J. S., Pollock, A. M., & Cooper, M. M. (2020). Helping students to “do science”: Characterizing scientific practices in general chemistry laboratory curricula. Journal of Chemical Education, 96(3), 423–434. https://doi.org/10.1021/acs.jchemed.8b00912
  • Cheshire, J., & Gardner-Chloros, P. (1998). Code-switching and the sociolinguistic gender pattern. International Journal of the Sociology of Language, 129(1), 5–34. https://doi.org/10.1515/ijsl.1998.129.5
  • Cook, T. D., & Campbell, D. (1979). Quasi-experimentation. Design & analysis issues for field settings. Rand McNally College Publishing Company.
  • Coyle, D. (2007). Content and language integrated learning: Towards a connected research agenda for CLIL pedagogies. International Journal of Bilingual Education and Bilingualism, 10(5), 543–562. https://doi.org/10.2167/beb459.0
  • Coyle, D., Hood, P., & Marsh, D. (2010). CLIL: Content and language integrated learning. Cambridge University Press.
  • Craik, F. I. M., & Lockhart, R. S. (1972). Levels of processing: A framework for memory research. Journal of Verbal Learning and Verbal Behavior, 11(6), 671–684. https://doi.org/10.1016/S0022-5371(72)80001-X
  • Cummins, J. (1979). Cognitive/academic language proficiency, linguistic interdependence, the optimum age question and some other matters. Working Papers on Bilingualism, 19, 121–129.
  • Dalton-Puffer, C. (2011). Content-and-language integrated learning: From practice to principles? Annual Review of Applied Linguistics, 31, 182–204. https://doi.org/10.1017/S0267190511000092
  • Declaration of Helsinki, World Medical Association. (2013). Journal of the American Medical Association, 310(20), 2191–2194. https://doi.org/10.1001/jama.2013.281053
  • De Dios Martínez Agudo, J. (2019). The impact of CLIL on English language competence in monolingual context: A longitudinal perspective. The Language Learning Journal, https://doi.org/10.1080/09571736.2019.1610030
  • Donaldson, J. P., & Allen-Handy, A. (2020). The nature and power of conceptualizations of learning. Educational Psychology Review, 32(2), 545–570. https://doi.org/10.1007/s10648-019-09503-2
  • Earls, C. W. (2013). Setting the Catherine wheel in motion: An exploration of ‘Englishization’ in the German higher education system. Language Problems and Language Planning, 37(2), 125–150. https://doi.org/10.1075/lplp.37.2.02ear
  • European Commission. (2003). Communication from the Commission to the Council, the European Parliament, the Economic and Social Committee and the Committee of the Regions. Retrieved July 3, 2023, from https://eur-lex.europa.eu/LexUriServ/LexUriServ.do?uri=COM:2003:0449:FIN:en:PDF
  • European Commission. (2004). Promoting language learning and linguistic diversity. An action plan 2004–06. Office for Official Publications of the European Communities.
  • European Commission. (2012). FAQs on multilingualism and language learning. http://europa.eu/rapid/press-release_MEMO-12-703_en.pdf
  • European Commission. (2017). Key data on teaching languages at school in Europe - Eurydice report. Retrieved July 3, 2023, from https://www.eurydice.si/publikacije/Key-Data-on-Teaching-Languages-at-School-in-Europe-2017-EN.pdf?_t=1554834232
  • Eurydice (Hg.). (2005). Content and language integrated learning (CLIL) at school in Europe. Eurydice, European Unit.
  • Evnitskaya, N., & Morton, T. (2011). Knowledge construction, meaning-making and interaction in CLIL science classroom communities of practice. Language and Education, 25(2), 109–127. https://doi.org/10.1080/09500782.2010.547199
  • Feddermann, M., Moeller, J., & Baumert, J. (2021). Effects of CLIL on second language learning: Disentangling selection, preparation, and CLIL-effects. Learning and Instruction, 74, 101459. https://doi.org/10.1016/j.learninstruc.2021.101459
  • Fernández-Fontecha, A., O’Halloran, K. L., Wignell, P., & Tan, S. (2020). Scaffolding CLIL in the science classroom via visual thinking: A systematic functional multimodal approach. Linguistics and Education, 55, 1–10. https://doi.org/10.1016/j.linged.2019.100788
  • Fernández-Sanjurjo, J., Fernández-Costales, A., & Arias Blanco, J. M. (2017). Analysing students’ content-learning in science in CLIL vs. Non-CLIL programmes: Empirical evidence from Spain. International Journal of Bilingual Education and Bilingualism, 22(6), 661–674. https://doi.org/10.1080/13670050.2017.1294142
  • Field, A. (2012). Discovering statistics using IBM SPSS statistics (4th ed.). SAGE.
  • Finkbeiner, C., & Fehling, S. (2006). Investigating the role of awareness and multiple perspectives in intercultural education. In P. Ruggiano Schmidt, & C. Finkbeiner (Eds.), ABC’s of cultural understanding and communication (pp. 93–110). Age Publishing.
  • Gardner, M. R., & Elliott, J. B. (2014). The immersive education laboratory: Understanding affordances, structuring experiences, and creating constructivist, collaborative processes in mixed-reality smart environments. EAI Endorsed Transaction on Future Intelligent Educational Environments, 1(1), 1–13. http://doi.org/10.4108/fiee.1.1.e6
  • Glynn, S. M., & Muth, K. D. (1994). Reading and writing to learn science: Achieving scientific literacy. Journal of Research in Science Teaching, 31(9), 1057–1073. https://doi.org/10.1002/tea.3660310915
  • Goldschmidt, M., & Bogner, F. X. (2016). Learning about genetic engineering in an outreach laboratory: Influence of motivation and gender on students’ cognitive achievement. International Journal of Science Education, 6(2), 166–187. https://doi.org/10.1080/21548455.2015.1031293
  • Gonzalez-Howard, M., & McNeill, K. L. (2016). Learning in a community of practice: Factors impacting English-learning students’ engagement in scientific argumentation. Journal of Research in Science Teaching, 53(4), 527–553. https://doi.org/10.1002/tea.21310
  • Gottlieb, M. (2016). Assessing English language learners: Bridges for language proficiency to academic achievement. Corwin.
  • Grandinetti, M., Langellotti, M., & Teresa Ting, Y. L. (2013). How CLIL can provide a pragmatic means to renovate science education – even in a sub-optimally bilingual context. International Journal of Bilingual Education and Bilingualism, 16(3), 354–374. https://doi.org/10.1080/13670050.2013.777390
  • Gürtler, K., & Kronewald, E. (2015). Internationalization and English-medium instruction in German higher education. In S. Dimova, A. K. Hultgren, & C. Jensen (Eds.), English-Medium instruction in European higher education (pp. 88–114). De Gruyter Mouton.
  • Haagen-Schützenhöfer, C., Mathelitsch, L., & Hopf, M. (2011). Fremdsprachiger Physikunterricht: Fremdsprachlicher Mehrwert auf Kosten fachlicher Leistungen? Auswirkungen fremdsprachenintegrierten Physikunterrichts auf fachliche Leistungen [Foreign language physics lessons: Added value for language skills at the cost of content achievement? Effects of content-language-integrated physics classes on content achievement]. Zeitschrift für Didaktik der Naturwissenschaften, 17, 223–260.
  • Hodson, D. (2014). Learning science, learning about science, doing science: Different goals demand different learning methods. International Journal of Science Education, 36(15), 2534–2553. https://doi.org/10.1080/09500693.2014.899722
  • Hofstein, A., & Lunetta, V. N. (2004). The laboratory in science education: Foundations for the twenty-first century. Science Education, 88(1), 28–54. https://doi.org/10.1002/sce.10106
  • Honeycutt-Swanson, L., Bianchini, J. A., & Lee, J. S. (2014). Engaging in argument and communicating information: A case study of English language learners and their science teacher in an urban high school. Journal of Research in Science Teaching, 51(1), 31–64. https://doi.org/10.1002/tea.21124
  • Itzek-Greulich, H., Flunger, B., Vollmer, C., Nagengast, B., Rehm, M., & Trautwein, U. (2014). The impact of a science center outreach lab workshop on German 9th graders’ achievement in science. In ESERA (Ed.), 10th Conference of the European Science Education Research Association Proceedings (pp. 97–106).
  • Justi, R. S., & Gilbert, J. K. (2002). Modelling, teachers’ views on the nature of modelling, and implications for the education of modellers. International Journal of Science Education, 24(4), 369–387. https://doi.org/10.1080/09500690110110142
  • Ke, L., Zangori, L., Sadler, T. D., & Friedrichsen, P. J. (2020). Integrating scientific modeling and socio-scientific reasoning to promote scientific literacy. In W. A. Powell (Ed.), Socioscientific issues-based instruction for scientific literacy development (pp. 31–56). IGI Global.
  • Kelly, G. J. (2007). Discourse in science classrooms. In S. K. Abell, & N. G. Lederman (Eds.), Handbook of research on science education (pp. 443–469). Lawrence Erlbaum.
  • Klieme, E., et al. (Eds.). (2010). PISA 2009: Bilanz nach einem Jahrzehnt [PISA 2009: Review after a decade]. Waxmann.
  • KMK. (2005). Beschlüsse der Kultusministerkonferenz – Bildungsstandards im Fach Biologie für den Mittleren Bildungsabschluss [Resolution of the Standing Conference of the Ministers of Education and Cultural Affairs of the Länder in the Federal Republic of Germany - standards of Biology education for secondary school]. Munich: Luchterhand.
  • Koch, A., & Bünder, W. (2006). Fachbezogener Wissenserwerb im Bilingualen Naturwissenschaftlichen Anfangsunterricht [Content-related knowledge achievement in bilingual science lessons]. Zeitschrift für Didaktik der Naturwissenschaften, 12, 67–76.
  • Korthagen, F., & Lagerwerf, B. (1995). Levels in learning. Journal of Research in Science Teaching, 32(1), 1011–1038. https://doi.org/10.1002/tea.3660321004
  • Kovanović, V., et al. (2018). Understanding the relationship between technology-use and cognitive presence in MOOCs. Proceedings of the Seventh International Conference on Learning Analytics and Knowledge, 582–583. New York: Association for Computing Machinery.
  • Krashen, S. (1982). Principles and practice in second language acquisition. Pergamon.
  • Krechel, H.-L. (2013). Organisationsformen und Modelle in weiterführenden Schulen. [Organisation and models of bilingual education]. In W. Hallet, & F. G. Königs (Eds.), Handbuch Bilingualer Unterricht. Content and language integrated learning (pp. 74–80). Friedrich Verlag.
  • Kress, G., Charalampos, T., Jewitt, C., & Ogborn, J. (2001). Multimodal teaching and learning: The rhetorics of the science classroom. Continuum.
  • Lam, S. F., Wong, B. P. H., Yang, H., & Liu, Y. (2012). Understanding student engagement with a contextual model. In S. L. Christenson, A. L. Reschly, & C. Wylie (Eds.), Handbook of research on student engagement (pp. 403–420). Springer US.
  • Langheinrich, J., & Bogner, F. X. (2016). Computer-related self-concept: The impact on cognitive achievement. Studies in Educational Evaluation, 50, 46–52. https://doi.org/10.1016/j.stueduc.2016.06.003
  • Lanvers, U. (2018). Public debates on the Englishization of education in Germany: A critical discourse analysis. European Journal of Language Policy, 10(1), 39–76. https://doi.org/10.3828/ejlp.2018.3
  • Lee, S., Kang, E., & Kim, H.-B. (2015). Exploring the impact of students’ learning approach on collaborative group modeling of blood circulation. Journal of Science Education and Technology, 24(2–3), 234–255. https://doi.org/10.1007/s10956-014-9509-5
  • Lemke, J. L. (1990). Talking science: Language, learning, and values. Ablex.
  • Lienert, G. A., & Raatz, U. (1998). Testaufbau und Testanalyse [Test setup and test analysis] (6th ed.). Psychologie Verlags Union.
  • Lin, A. M. Y. (2010). How to teach academic science language. Symposium on Language and Literacy in Science Learning Hong Kong.
  • Lin, A. M. Y. (2015). Conceptualising the potential role of L1 in CLIL. Language, Culture and Curriculum, 28(1), 74–89. https://doi.org/10.1080/07908318.2014.1000926
  • Lipsey, M. W., & Wilson, D. B. (2001). Practical meta-analysis. Sage.
  • Lomax, R. G. (1986). The effect of measurement error in structural equation modeling. The Journal of Experimental Education, 54(3), 157–162. https://doi.org/10.1080/00220973.1986.10806415
  • Lovey, B. R., & Riggs, K. M. (2019). Flipping the laboratory: Improving student engagement and learning outcomes in second year science courses. International Journal of Science Education, 41(1), 64–79. https://doi.org/10.1080/09500693.2018.1533663
  • Marian, V., & Fausey, C. M. (2006). Language-dependent memory in bilingual learning. Applied Cognitive Psychology, 20(8), 1025–1047. https://doi.org/10.1002/acp.1242
  • Marsh, D., Maljers, A., & Hartiala, A.-K. (2001). Profiling European CLIL classrooms. University of Jyväskylä.
  • Martin, M. A. (2015). Teacher education for content and language integrated learning: Insights from a current European debate. Revista Electrónica Interuniversitaria de Formación del Profesorado, 18(3), 153–168. https://doi.org/10.6018/reifop.18.3.210401
  • Massler, U. (2011). Assessment in CLIL Learning. In S. Ioannou-Georgiou, & P. Pavlou (Eds.), Guidelines for CLIL implementation in primary and pre-primary education (pp. 114–136). PROCLIL. http://www.iccoccaglio.gov.it/wp-content/uploads/2017/02/CLIL-for-Primary.pdf
  • Maybin, J., Mercer, N., & Stierer, B. (1992). Scaffolding: Learning in the classroom. In K. Norman (Ed.), Thinking voices. The work of the national oracy project (pp. 186–195). Hodder & Stoughton.
  • McGuiness, C. (1999). From Thinking Skills to Thinking Classrooms: A Review and Evaluation of Approaches for Developing Pupils’ Thinking. Research Report 115, DfEE: HMSO.
  • Meissner, B., & Bogner, F. X. (2012). Science teaching based on cognitive load theory: Engaged students, but cognitive deficiencies. Studies in Educational Evaluation, 38(4–5), 127–134. https://doi.org/10.1016/j.stueduc.2012.10.002
  • Meyer, O., & Coyle, D. (2017). Pluriliteracies teaching for learning: Conceptualizing progression for deeper learning in literacies development. European Journal of Applied Linguistics, 5(2), 199–222. https://doi.org/10.1515/eujal-2017-0006
  • Meyer, O., Coyle, D., Halbach, A., Schuck, K., & Ting, T. (2015). A pluriliteracies approach to content and language integrated learning – mapping learner progressions in knowledge construction and meaning-making. Language, Culture and Curriculum, 28(1), 41–57. https://doi.org/10.1080/07908318.2014.1000924
  • Meyerhöffer, N., & Dreesmann, D. C. (2019). English-bilingual biology for standard classes development, implementation and evaluation of an English-bilingual teaching unit in standard German high school classes. International Journal of Science Education, 41(10), 1366–1386. https://doi.org/10.1080/09500693.2019.1607620
  • Mierdel, J., & Bogner, F. X. (2019). Comparing the use of two different model approaches on students’ understanding of DNA models. Education Sciences, 9(2), 115–133. https://doi.org/10.3390/educsci9020115
  • Mierdel, J., & Bogner, F. X. (2020). Simply inGEN(E)ious! How creative DNA modeling can enrich classic hands-on experimentation. Journal of Microbiology & Biology Education, 21(2), 1–10. https://doi.org/10.1128/jmbe.v21i2.1923
  • Morton, T. (2015). Vocabulary explanations in CLIL classrooms: A conversation analysis perspective. The Language Learning Journal, 43(3), 256–270. https://doi.org/10.1080/09571736.2015.1053283
  • Oga-Baldwin, Q. W. L. (2019). Acting, thinking, feeling, making, collaborating: The engagement process in foreign language learning. System, 86, 1–10. https://doi.org/10.1016/j.system.2019.102128
  • Oktaviani, A., & Fauzan, A. (2017). Teacher perceptions about the importance of English for young learners. Linguistic, English Education and Art (LEEA) Journal, 1(1), 1–15. https://doi.org/10.31539/leea.v1i1.25
  • Pérez-Cañado, M. L. (2012). CLIL research in Europe: Past, present, and future. International Journal of Bilingual Education and Bilingualism, 15(3), 315–341. https://doi.org/10.1080/13670050.2011.630064
  • Piesche, N., et al. (2016). CLIL for all? A randomised controlled field experiment with sixth grade students on the effects of content and language integrated science learning. Learning and Instruction, 44, 108–116. https://doi.org/10.1016/j.learninstruc.2016.04.001
  • Poza, L. E. (2016). The language of ciencia: Translanguaging and learning in a bilingual science classroom. International Journal of Bilingual Education and Bilingualism, https://doi.org/10.1080/13670050.2015.1125849
  • Prawat, R. S., & Floden, R. E. (1994). Philosophical perspectives on constructivist views of learning. Educational Psychologist, 29(1), 37–48.
  • Renninger, K. A., Ren, Y., & Kern, H. (2018). Motivation, engagement, and interest: “In the end, it came down to you and how you think of the problem”. In C. E. Hmelo-Silver, S. R. Goldman, P. Reimann, & F. Fischer (Eds.), International handbook of the learning sciences (pp. 116–126). Routledge Ltd.
  • Rocard, M., Csermely, P., Jorde, D., Lenzen, D., Walberg-Henriksson, H., & Hemmo, V. (2007). Rocard report: “Science education now: A new pedagogy for the future of Europe”. EU 22845, European Commission.
  • Rodenhauser, A., & Preisfeld, A. (2014). Bilingual (German-English) molecular biology courses in an Out-of-School Lab on a University Campus: Cognitive and affective evaluation. International Journal of Environmental and Science Education, 10(1), 99–110.
  • Roehler, L. R., & Cantlon, D. J. (1997). Scaffolding: A powerful tool in social constructivist classrooms. In K. Hogan, & M. Pressley (Eds.), Scaffolding student learning: Instructional approaches and issues (pp. 6–42). Brookline Books.
  • Rost, J. (2004). Lehrbuch Testtheorie–Testkonstruktion [Textbook test theory–test construction] (2nd ed.). Hans Huber.
  • Roth, T., Conradty, C., & Bogner, F. X. (2022). The relevance of school self-concept and creativity for CLIL outreach learning. Studies in Educational Evaluation, 73, 1–12. https://doi.org/10.1016/j.stueduc.2022.101153
  • Roth, T., Scharfenberg, F.-J., & Bogner, F. X. (2022). Content and language integrated scientific modelling: A novel approach to model learning. Frontiers in Education, 7, 1–19. https://doi.org/10.3389/feduc.2022.922414
  • Roth, T., Scharfenberg, F.-J., Mierdel, J., & Bogner, F. X. (2020). Self-evaluative scientific modeling in an outreach gene technology laboratory. Journal of Science Education and Technology, 29(6), 725–739. https://doi.org/10.1007/s10956-020-09848-2
  • Roussel, S., Joulia, D., Tricot, A., & Sweller, J. (2017). Learning subject content through a foreign language should not ignore human cognitive architecture: A cognitive load theory approach. Learning and Instruction, 52, 69–79. https://doi.org/10.1016/j.learninstruc.2017.04.007
  • Sardegna, V. G., Lee, J., & Kusey, C. (2017). Self-efficacy, attitudes, and choice of strategies for English pronunciation learning. Language Learning, https://doi.org/10.1111/lang.12263
  • Sarmouk, C., et al. (2019). Pre-laboratory online learning resource improves preparedness and performance in pharmaceutical sciences practical classes. Innovations in Education and Teaching International, 1–12. https://doi.org/10.1080/14703297.2019.1604247
  • Scharfenberg, F.-J., & Bogner, F. X. (2010). Instructional efficiency of changing cognitive load in an out-of-school laboratory. International Journal of Science Education, 32(6), 829–844. https://doi.org/10.1080/09500690902948862
  • Scharfenberg, F.-J., Bogner, F. X., & Klautke, S. (2007). Learning in a gene technology laboratory with educational focus—Results of a teaching unit with authentic experiments. Biochemistry and Molecular Biology Education, 35(1), 28–39. https://doi.org/10.1002/bmb.1
  • Schülerlabor Atlas. (2022). Es konnten keine Einträge für den Filter gefunden werden [No entries could be found for the filter]. Retrieved July 3, 2023, from https://www.schuelerlabor-atlas.de/home/LabListe
  • Serra, C. (2007). Assessing CLIL at primary school: A longitudinal study. International Journal of Bilingual Education and Bilingualism, 10(5), 582–602. https://doi.org/10.2167/beb461.0
  • Statistisches Bundesamt. (2021a). Anzahl der MINT-Studienanfänger* an deutschen Hochschulen nach Geschlecht in den Studienjahren von 2007/2008 bis 2020/2021. URL: https://de.statista.com/statistik/daten/studie/28346/umfrage/anzahl-der-mint-studienanfaenger/
  • Statistisches Bundesamt. (2021b). Bildung und Kultur. Studierende an Hochschulen. URL: https://www.destatis.de/DE/Themen/Gesellschaft-Umwelt/Bildung-Forschung-Kultur/Hochschulen/Publikationen/Downloads-Hochschulen/studierende-hochschulen-endg-2110410217004.pdf?__blob=publicationFile
  • Sweller, J. (2015). In academe, what is learned and how is it learned? Current Directions in Psychological Science, 24(3), 190–194. https://doi.org/10.1177/0963721415569570
  • Sylvén, L. K. (2013). CLIL in Sweden – why does it not work? A metaperspective on CLIL across contexts in Europe. International Journal of Bilingual Education and Bilingualism, 16(3), 301–320. https://doi.org/10.1080/13670050.2013.777387
  • Tarasenkova, N., Akulenko, I., Kulish, I., & Nekoz, I. (2020). Preconditions and preparatory steps of implementing CLIL for future mathematics teachers. Universal Journal of Educational Research, 8(3), 971–982. https://doi.org/10.13189/ujer.2020.080332
  • Ting, T. (2010). CLIL appeals to how the brain likes its information: Examples from CLIL-(neuro)science. International CLIL Research Journal, 3, 1–18. https://hdl.handle.net/20.500.11770/139714
  • Tobin, K. G. (1990). Research on science laboratory activities. In pursuit of better questions and answers to improve learning. School Science and Mathematics, 90(5), 403–418. https://doi.org/10.1111/j.1949-8594.1990.tb17229.x
  • Usher, S. (2013). Letters of note. Correspondence deserving of a wider audience. Canongate.
  • Vásquez, V. P., Lancaster, N., & Bretones Callejas, C. (2020). Key issues in developing teachers’ competences for CLIL in Andalusia: Training, mobility and coordination. The Language Learning Journal, 40(1), 81–98.
  • Vygotsky, L. S. (1971). The psychology of art (Scripta Technica, Inc., Trans.). MIT Press.
  • Walker, J. P., & Sampson, V. (2013). Learning to argue and arguing to learn: Argument-driven inquiry as a way to help undergraduate chemistry students learn how to construct arguments and engage in argumentation during a laboratory course. Journal of Research in Science Teaching, 50(5), 561–596. https://doi.org/10.1002/tea.21082
  • Wilmes, S. E. D., & Siry, C. (2019). Science notebooks as interactional spaces in a multilingual classroom: Not just ideas on paper. Journal of Research in Science Teaching, 57(7), 999–1027. https://doi.org/10.1002/tea.21615
  • Wode, H. (1995). Lernen in der Fremdsprache. Grundzüge von Immersion und bilingualem Unterricht [Foreign language learning. Basic features of immersion and bilingual education]. Hueber (Forum Sprache).
  • Yore, L. D., & Treagust, D. F. (2006). Current realities and future possibilities: Language and science literacy – empowering research and informing instruction. International Journal of Science Education, 28(2–3), 291–314. https://doi.org/10.1080/09500690500336973
  • Zenner-Höffkes, L., Harris, R., Zirkle, C., & Pilz, M. (2021). A comparative study of the expectations of SME employers recruiting young people in Germany, Australia and the United States. International Journal of Training and Development, 25(2), 124–143. https://doi.org/10.1111/ijtd.12214