Educational Assessment & Evaluation

Pre-service language teachers’ grading literacy: perceptions of grading in three university programs in Finland and Sweden

Article: 2326724 | Received 29 Aug 2023, Accepted 29 Feb 2024, Published online: 13 Mar 2024

Abstract

The objective of this study is to expand teachers’ knowledge base of assessment in teaching by exploring pre-service language teachers’ construct of grading literacy during their pedagogical studies and immediately after. We conceptualize pre-service language teachers’ grading literacy with a literature-based flow model of decision-making comprising six major phases. These involve (1) mobilizing grading related knowledge, skills and dispositions, (2) distinguishing relevant assessment events, (3) identifying and filing evidence, (4) interpreting assessment rubrics, (5) making the grading decision, and (6) communicating it to stakeholders. The decision trajectory is constrained by individual and social factors. The research questions address pre-service teachers’ perceptions of the multiple components, primarily the evidence base of a grade. The data comprise survey responses (n = 131) and interview data (n = 26) from three universities, one in Finland and two in Sweden. Regarding the factorial structure, variables incorporating process and disposition aspects provided the strongest explanatory power. The most frequently attended grading evidence comprises written and oral tests. Pre-service teachers in Finland and Sweden differ with regard to most aspects of what evidence they base their grading on. The interview responses confirm pre-service teachers’ high awareness of the complex nature of grading and the responsibility attached to the endeavour. Based on the findings we propose recommendations for teacher education programs to broaden and enrich pre-service teachers’ grading practice to encompass processual and progress aspects. Increased attention should be paid to practical implementation of grading standards and guidelines in interpreting and justifying grading decisions.

Introduction

Teachers play a paramount role in determining and assigning grades, more or less guided by national standards. Teachers value fairness in grading above all, but their conceptions and interpretations of fairness diverge substantially (Anderson, Citation2018). Accordingly, teacher education plays a key role in establishing the foundations of pre-service teachers’ conceptions of grading and helping them to interpret and implement the goal standards to safeguard fairness and justice of pupils’ grades. The aim of this study is to explore pre-service language teachers’ construct of grading literacy and the development of its various components during their pedagogical studies.

There is ample research testifying to the challenges in teacher summative literacy, particularly regarding final grades. They are composed of an undefined combination of achievement, ability, behavior and effort (Randall & Engelhard, Citation2010; Riley & Ungerleider, Citation2019). Furthermore, the weight attached to multiple dimensions incorporated in a school grade may also vary greatly within the same school, and rarely comply with the recommended policies and written standards (Guskey & Link, Citation2019, p. 305; Meissel et al., Citation2017).

Despite the vagueness of compilation strategies, content and intended decisions, grades undeniably play a crucial role at educational crossroads. In Nordic countries they regulate advancement from compulsory basic education to upper secondary level and further to tertiary education. For these purposes fairness and justice issues are at stake. Nevertheless, studies specifically targeting pre-service teachers’ grading literacy are scarce. Grades are embedded in a variety of summative assessment knowledge and skills that pre-service teachers develop and value (Edwards, Citation2017; Hilden et al., Citation2022), and pre-service teachers are moderately positive regarding their confidence to apply them (DeLuca & Klinger, Citation2010; Kremmel & Harding, Citation2020). As underscored by previous researchers (e.g. DeLuca et al., Citation2013; Hildén & Fröjdendahl, Citation2018), teacher education has a focal role to play in enhancing overall assessment literacy, thereby contributing to the goal of grading consistency among teachers.

The inconsistencies in grading and the accompanying validity and reliability issues might arise from a lack of common understanding of grading among teachers. Knowledge, skills and dispositions of grading have been addressed in previous conceptualisations of assessment literacy (Fulcher, Citation2012; Pastore & Andrade, Citation2019; Taylor, Citation2013; Xu & Brown, Citation2016), but they have not been elaborated on in detail. Assigning valid and fair grades is a most challenging endeavor for in-service teachers, let alone practising pre-service teachers. Pinning down the theoretical assumptions of validation in a practice-oriented model (Bachman & Damböck, Citation2017; Bachman & Palmer, Citation2010) might contribute to systematizing the instruction of summative assessment in general and grading in particular.

Drawing on the models of teacher knowledge and assessment literacy, we propose a model for the specific subsection of grading literacy, depicted in Figure 1, as a basis for analyzing the data on pre-service teachers’ perceptions of grading in this study. The course of enactment of distinctive kinds of teacher knowledge to generate a grade might comprise the following phases: (1) mobilizing grading related knowledge, skills and dispositions, (2) distinguishing relevant assessment events for grading, (3) filing evidence of process, progress and product, (4) interpreting the rubrics of multiple curricular goals, (5) making a grading decision by weighing the goals, and (6) communicating the grading decision.

Figure 1. Components and processes of teacher grading literacy (inspired and modified from Bronfenbrenner (Citation1996), Shulman (Citation1986), Bachman and Palmer (Citation2010), Fulmer et al. (Citation2015), Kunnath (Citation2017), Pastore and Andrade (Citation2019), Yan et al. (Citation2021)).

The following tentative definition of grading literacy is explored in this study:

Teachers mobilize grading knowledge, skills and dispositions (including awareness of the impact of grades on students) that are mediated by personal factors such as self-efficacy and experience, to distinguish assessment events (including mode and content) and to identify and file evidence of process, progress and product for interpreting the curricular goal rubrics, and weighing them in making a decision on grades to be communicated to multiple stakeholders. The stages of the process are framed by micro, meso and macro level social constraints and determinants.

Issues of validity come into play through the entire process of teacher decision making. Teachers do not make their decisions in a vacuum, but in addition to their individual cognitive capacity, self-efficacy and experience of work and life, they are influenced by a wide array of social macro and micro factors (Bronfenbrenner, Citation1996; Kunnath, Citation2017; Yan et al., Citation2021). Fulmer et al. (Citation2015) also include a meso level of local administration and municipal policy in these determinants.

Among macro social constraints, the official curriculum and related guiding documents are among the most transparent and acknowledged, but there may also be more subtle societal or political priorities, such as favoring or suppressing certain groups based on their gender, origin or economic status (Shohamy, Citation2001a, Citation2001b). At the meso level of community and school, there may be particular local cultures and unspoken principles that the members follow more or less consciously. Factors such as the quality of leadership, interaction within the teacher community and accustomed ways of implementing the written curriculum are influential all the way down to classroom level and may substantially steer or dictate individual teachers’ grading policy (DeLuca, Citation2017). An example is the diverse interpretations of official assessment standards and guidelines indicated by the numerous studies cited in the literature review (e.g. Cheng & Sun, Citation2015; Thorsen, Citation2014).

Bachman and Palmer (Citation2010) maintain that making assessment decisions should start with considering consequences. In the proposed model, this idea is present in the first phase of mobilizing grading related knowledge, skills and dispositions. Since not everything that is taught or learned at school lends itself to grading, teachers need to select relevant assessment events and situations (Herppich et al., Citation2018) that reflect learning objectives to be incorporated into a grade (Mertler, Citation2005). Frequently these events entail written or spoken tests consisting of various task types and content, but other indications (e.g. completion of homework, participation in class activities) can also be attended to.

Guskey and Bailey (Citation2010) identify three main categories of grading evidence: evidence for product, process and progress. The product category refers to what students know or are able to do at a particular point in time. Processes encompass traits emerging over time, such as responsibility, effort, work habits, formative tests, class participation and attendance. Progress is exhibited by improvement over a period of time, value-added learning and educational growth (Muñoz & Guskey, Citation2015). Product evidence typically implies records of samples and the assessments assigned to them, while process evidence often relies on unsystematic impressions of a student being engaged or absent at lessons, and even student personal traits (Meissel et al., Citation2017). Detecting progress requires various measures, such as compiling evidence over time, using tools like portfolios or learning diaries as sources of assessment, and filing and storing performance samples to enable checking and revision, if required by guardians.

Turning the evidence into a grade requires teachers to interpret the evidence in the light of target rubrics in the curricula. Commonplace target sets in language subjects imply linguistic, interactional and cultural goals, as in the Finnish and Swedish national core curricula (Education Act, Citation2010; Finnish National Agency of Education, Citation2020). Teachers check the sufficiency and adequacy of the evidence available for each goal component and if they detect under-representation of the construct being graded, they need to gather additional evidence. In practice, the sufficiency and relevance should be considered at an earlier stage, as there is rarely time for corrective measures by the end of a study sequence.

The final operation of setting a grade may occur either based on definite rules on how to weigh the multiple components or depend more or less on teachers’ intuition and unconscious weighing. The latter tends to be the case even in systems that seek to provide exact guidance for gauging the multiple curricular goals (e.g. Tierney et al., Citation2011).

At the last stage of the model, teachers communicate the grading decision (i.e. the grade in a numeric or verbal guise) to stakeholders, primarily to the students and their guardians. In a broader sense, grades are filed and can be followed and inspected by multiple levels of authorities depending on national and local policy (Lundahl et al., Citation2017; Taylor, Citation2013).

In mainstream validity theory, the stepwise setup of the model bears resemblance to validity as interpretation and use (Chapelle, Citation2020; Chapelle et al., Citation2010; Messick, Citation1989). The gist of usage-based argumentation involves attesting and probing the solidity of each phase in terms of its capacity to account for content adequacy and accuracy, to enable generalization of the assessment events to language use domains defined in the curricula as well as extrapolation of the grade to reliably represent a student’s communicative language ability (Chapelle, Citation2020).

A thorough validation of the model is beyond the scope of this study, and is acknowledged as an indispensable focus of future scrutiny. In the current study the model is piloted as a framework to shed light on perceptions of pre-service teachers of the knowledge base and processes in general, and more specifically on the value they assign to the multiple evidence base to ground their grading decisions on.

Research questions

The aim of this study is to describe and analyze the state of pre-service teachers’ grading literacy at the end of their educational studies with regard to their overall perceptions but also more specifically to the events and evidence base they resort to when making grading decisions. In line with the conceptual model, we set out to investigate pre-service language teachers’ grading literacy guided by the following research questions (RQs):

  • RQ1. What assessment events and evidence base do pre-service teachers value in making grading decisions?

  • RQ2. What components of grading literacy do pre-service teachers address when articulating their beliefs about grading?

The first RQ targets the ideal evidence base of a school grade perceived by the pre-service teachers at three universities located in two Northern European countries. We also explore the factorial structure of the perceptions, and attend to the differences, if any, between the two countries. The second RQ is more comprehensive in nature and attempts to capture the pre-service teachers’ perceptions of the multiple components and processes and their beliefs related to them. A secondary aim of this study is to pilot the applicability of the model proposed in Figure 1 to chart and refine components of grading literacy.

Study context

This study was carried out at three universities, two in Sweden (A and B) and one in Finland (C), in 2019–2020. The reasons for choosing these two countries relate to their similar history and administrative structure (Finland was once a part of Sweden) as well as their current status as Nordic welfare countries enjoying stable and warm mutual relationships at both official and unofficial levels. Both countries have established and maintained an equity-based compulsory education system and have collaborated closely in educational policy and research, introducing and implementing the Common European Framework of Reference (CEFR) to various degrees in language education. At the same time, the format of subject teacher education programs differs substantially between the two countries (see below). All participants were prospective subject teachers of foreign languages, primarily English.

Table 1. Overview of language teacher education in Finland and Sweden.

The Finnish TE system

In Finland, pre-service teachers aiming at teaching in compulsory basic and upper secondary education first take their language studies (60 credits) at the Faculty of Arts, and then they enter the Faculty of Education for two semesters (four periods) if they pass an admission interview to pedagogical studies directed at subject teachers. Here they acquire a broad teacher certificate that qualifies them to teach any subject, in which they acquire a sufficient number of subject studies. The most frequent combination is, however, two languages.

Pedagogical studies for subject teachers include assessment education incorporated in the subject didactic section of the studies. In the autumn term 2019, formative assessment practices were introduced, and in the spring 2020, there was a study unit called ‘Assessment and development’ that was dedicated to summative assessment theory. It comprised 5 credits (1 credit = 27 working hours) and consisted of lectures and practical training in groups. Furthermore, some practical assessment training was provided during the two school practice periods.

The Finnish school curricula

The national core curriculum is a focal part of course literature in pedagogical studies in both countries. In Finland, the steering document includes a specific chapter on assessment that defines formative and summative purposes of educational assessment. The general instructions are complemented by a set of grading rubrics for grades 5, 7, 8, and 9 on a 4–10 scale issued in 2020. The previous version from 2014 (Finnish National Board of Education, Citation2014) included more superficial guidance for assessment and grading, and merely one grade description for good language proficiency, corresponding to grade 8 on a 4–10 point scale. At the point of data gathering for this study, both versions were introduced to pre-service teachers.

In the core curriculum 2014 and the revision 2020, the main focus is on formative assessment to promote learning. The general guide acknowledges principles of equity and equality, transparency, collaboration and participation. Assessment is further featured as a well-reasoned and logical activity, based entirely on the learning goals and criteria and being free from any bias pertaining to pupils’ individual traits, such as personality or temperament. Pupils are not compared with each other. Instead, they are entitled to show their proficiency in multiple ways and by diverse methods appropriate for their age and developmental stage (Finnish National Agency of Education, Citation2020).

Regarding final grading at the end of compulsory basic education, it should be based on the criteria for the following goal components selected to be assessed: growing into cultural diversity and language awareness, language learning skills and evolving language proficiency comprising interaction skills, text interpretation skills and production skills. The target levels of language proficiency have been linked to the Common European Framework of Reference since 2003 (Hilden & Takala, Citation2007). However, in the recent core curricula the linkage is slightly vaguer. National goals of school education (growth as a human being and membership in society, requisite knowledge and skills, promotion of knowledge and ability, equality and lifelong learning) and seven cross-curricular transversal competences are also subtly included in final grades, but they are considered as processual in nature and not described in detail at the school-leaving stage. Work and effort are embedded in a subject grade, yet without any further specification. Behavior is assessed as a separate entity, but it does not affect the subject grade and is not displayed in the final school report (Finnish National Board of Education, Citation2014, p. 52). A final grade yields a composite assessment in line with the learning goals and grading criteria. Attaining a higher level in one goal may compensate for failed or weaker performances in another goal (Finnish National Agency of Education, Citation2020, p. 10).

In upper secondary education, the grading principles are largely similar to the ones in basic compulsory education. Grades are awarded separately for each module and averaged to yield a composite grade for a completed syllabus in the final school-leaving certificate (Finnish National Board of Education, Citation2014, p. 246). The objective is that a student attains the language skills illustrated in the same scale as in basic education as well as general objectives such as cross-curricular transversal competencies. For oral language skills, a separate certificate may be awarded (Finnish National Board of Education, Citation2014, p. 242).

The Swedish TE system

In Sweden, pre-service teachers aiming at teaching in basic compulsory and upper secondary education choose either a longer language education program, which takes 4.5–5.5 years (up to 330 credits for the advanced level) depending on the subjects studied, or a shorter certificate program, which takes 1.5 years (90 credits) but requires the students to already have a relevant Bachelor’s (or higher) degree. The longer program leads to a Master’s degree in Education, while the shorter one is a Diploma course, not at the advanced level.

The course of studies may vary at different universities within Sweden but includes assessment either incorporated in the didactic section of the studies at relevant faculty and/or stand-alone courses on assessment at the Faculty of Education. During the longer program, the students can meet both types of courses. Most pre-service teachers come into contact with some form of practical assessment training during their school practice periods (which also vary in length depending on the TE program).

The Swedish school curricula

In the Swedish national core curricula, the goals to reach for classes in years 3, 6, 7, 8, and 9 are on a scale with grades designated as A–F. A–E are passing grades, with A being the highest. For each course there are knowledge requirements for grades E, C and A. Grades D and B are given when the pupil has met all the requirements for the lower grade and a considerable part of the requirements for the grade above (Education Act, Citation2010, p. 800). This grading system was slightly revised in 2022 so that a higher achievement in one criterion may compensate for a weaker performance in another.

In Sweden, the national core curriculum prioritizes the use of formative assessment to promote learning rather than providing standards for summative assessment decisions. Equity and equality, transparency, collaboration and participation are the main principles guiding the implementation of the curriculum. Assessment is further featured as a well-reasoned and logical activity, based entirely on the learning goals and criteria and being free from any bias pertaining to pupils’ individual traits, such as personality or temperament. Pupils are not to be compared with each other. Instead, they are entitled to show their proficiency in multiple ways and by diverse methods appropriate for their age and developmental stage (Education Act, Citation2010). Final grading at the end of compulsory education, as well as in the three English courses at the upper secondary level, should be based on the set national criteria. Behavior, effort and attitude are not to be considered in any subject grade. All language courses at both basic and upper secondary school are to some degree linked to the Common European Framework of Reference in a system of 7 steps (Oscarson, Citation2015).

Methods

In this study, a mixed method approach is used to collect and analyze the data in line with the multiple phases of the proposed model of grading (Figure 1). Both the survey and the interviews were carried out after the completion of pedagogical studies in the 2019–2020 academic year and were part of a project between three universities in Sweden and Finland in 2019–2021. The participants responded to the online survey anonymously (N = 131) and were also asked whether they would consider being interviewed. The volunteers (N = 26) gave their written consent in accordance with the European General Data Protection Regulation (GDPR). Approval by the Ethics Board was not required for this study based on the ethical review rules for educational research purposes in both countries.

Material

The scale used in this study (Appendix 1) is a part of the large survey on teacher assessment literacy in the project. The large survey comprised several sections under the themes of understanding, skills and dispositions of assessment literacy (e.g. grading). These themes were derived from current assessment literature and adapted to local educational contexts. The large survey consisted of 23 open ended and multiple-choice items. Before launching and disseminating the survey online to all graduate students of the TE programs, the questionnaire was piloted by 30 pre-service teachers (10 from each university) and checked for meaningfulness of the items.

The semi-structured interview guide (Appendix 1) was based on the same areas/themes as the survey. The interviews were carried out in English, Finnish or Swedish by three different researchers due to geographical restrictions and lasted between 45 minutes and 1.5 hours. The 26 interviews were then transcribed by administrative personnel and coded independently by the three researchers in a qualitative data analysis program under the different sections of theoretical understanding and knowledge, skills and dispositions.

Quantitative analyses

Phases 2 and 3 of the model (Figure 1) are primarily addressed through quantitative data, while the other stages of decision-making are explored qualitatively. The major source of quantitative data was a set of 14 Likert type items that aim to explore the degree of weight pre-service teachers assign to various measures in grading (Cronbach’s alpha: 0.886). The types of events and evidence included three product and eleven process variables in the model as categorized by Guskey and Bailey (Citation2010). Product variables comprised written, oral and national tests. The following options were counted as processual: Use of target language out-of-school, Use of oral (target) language in class, Attendance, Engagement in class, Students’ use of metacognitive strategies, In-class effort, Attitude towards learning, Homework/projects, Self-directed learning skills, Portfolio/samples of course work, and Cultural competence.
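The internal consistency figure reported above can be illustrated with a short computation. The following is a minimal sketch of Cronbach's alpha using the standard formula (number of items, sum of item variances, variance of total scores); the function name `cronbach_alpha` and the three-item toy ratings are hypothetical and are not the study's 14-item data.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(responses[0])
    items = list(zip(*responses))  # regroup scores so each tuple holds one item
    item_var_sum = sum(variance(item) for item in items)  # sample variances
    total_var = variance([sum(row) for row in responses])  # variance of sum scores
    return k / (k - 1) * (1 - item_var_sum / total_var)


# Hypothetical Likert ratings (5 respondents x 3 items), for illustration only.
toy_ratings = [
    [4, 5, 4],
    [2, 3, 2],
    [5, 5, 4],
    [1, 2, 1],
    [3, 3, 3],
]
alpha = cronbach_alpha(toy_ratings)
print(f"alpha = {alpha:.3f}")
```

Values closer to 1 indicate that the items vary together; the sketch uses sample variances throughout, which is one common convention.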

Qualitative analyses

Experiential approaches focus on what participants think, feel and do, and are underpinned by the theoretical assumption that language reflects reality (Terry et al., Citation2017). The qualitative analysis thus followed a traditional, deductive content approach, identifying evidence for each predetermined theme established in the larger project, which was then coded and reflected in the structure of the interview guide and questions. Addressing Phases 1–6 in the model, the interview guide included a specific question targeting grading: ‘Now, after having gone through teacher education, how comfortable are you with grading, analyzing and using the results of summative assessment? Areas of strengths, of improvement and/or of weaknesses?’ Furthermore, responses to other interview questions on summative assessment revealed data relevant for grading. Therefore, a word search was carried out to detect terms related to grading in English (grade*), Swedish (betyg*) and Finnish (arvos*), to identify responses related to grading in all the interview transcripts. The thematic units comprising one or several sentences or utterances were categorized according to the seven stages and processes of the model above (Figure 1) with regard to mediating social and personal factors when applicable. The classification was double-checked by all involved researchers, who confirmed their mutual understanding and coding reliability through consensus-oriented discussion.
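The word search described above can be approximated with a regular expression. The sketch below is an assumption about how such a search might be run, not the procedure of the qualitative analysis program used in the study; the name `grading_units` and the sample utterances are invented for illustration, while the three stems come from the text.

```python
import re

# Stems as given in the text: grade* (English), betyg* (Swedish), arvos* (Finnish).
GRADING_PATTERN = re.compile(r"\b(?:grade|betyg|arvos)\w*", re.IGNORECASE)


def grading_units(utterances):
    """Return the utterances containing at least one grading-related term."""
    return [u for u in utterances if GRADING_PATTERN.search(u)]


# Invented utterances, not quotes from the interview data.
sample = [
    "I am unsure how to justify the grades I give.",
    "Betygsättning är ett stort ansvar.",
    "Arvosanan antaminen on vaikeaa.",
    "We practised giving feedback on essays.",
]
hits = grading_units(sample)
print(len(hits))
```

Note that a literal stem search for `grade` does not match the word "grading"; a shorter stem such as `grad` would be needed to capture it, at the cost of also matching unrelated words like "gradually".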

Results

The first research question in this study aims to explore pre-service teachers’ preferred structure of a school grade in terms of indicating contents and types of assessment. In the literacy model (Figure 1) these portray categories 2 (distinguishing relevant assessment events in terms of mode and content) and 3 (identifying and filing evidence of process, progress, and product). In line with this purpose, a factor analysis was first run for the survey data across both countries (Table 2). Two of the items in the original scale (use of target language out of school and use of oral language in class) were omitted in this analysis since their factorial loadings in the initial factor analysis were below 0.5.

Table 2. Factor loadings and factorial structure of the two main components extracted.

The two-component solution explained 59.7% of the total variance. The components were labelled as processes and dispositions (1st) and tests (2nd). The processes and dispositions component (47% of total variance) involves students’ process related performance, dispositions and work samples, while the tests component (12.7% of total variance) accommodates written, oral and national tests. Four items in the processes and dispositions strand (metacognitive strategies, in-class effort, self-directed learning skills and cultural competence) also embody a natural progress aspect to be addressed until the attribute can be considered in grading.

An independent samples t-test was then run, showing that pre-service teachers in language TE programs in Sweden and Finland differ significantly in the weight they assign to the assessment processes and dispositions in grading, while they do not differ when it comes to tests (Table 3). The participants from the TE program in Finland assigned ‘high weight’ whereas the ones in the two TE programs in Sweden assigned ‘medium weight’ to processes and dispositions in grading.
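For readers less familiar with the test, the comparison of two country groups can be sketched as follows. This is a minimal illustration of Student's independent samples t statistic with pooled variance; whether the study assumed equal variances is not stated, so that choice is an assumption here, and the function name `independent_t` and the scores are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def independent_t(group_a, group_b):
    """Student's t statistic (pooled variance) and its degrees of freedom."""
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    # Pooled variance weights each group's sample variance by its own df.
    pooled_var = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / df
    standard_error = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(group_a) - mean(group_b)) / standard_error, df


# Hypothetical component scores for two groups, NOT the study's data.
group_finland = [4.0, 4.5, 3.5, 4.0, 5.0]
group_sweden = [3.0, 3.5, 2.5, 3.0, 4.0]
t_stat, df = independent_t(group_finland, group_sweden)
print(f"t({df}) = {t_stat:.2f}")
```

The resulting t value would then be compared against the t distribution with the given degrees of freedom to obtain a p-value, a step that statistical packages perform automatically.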

Table 3. Weight pre-service teachers assign to the two main components of measures (tests and processes/dispositions) in grading student performance by country (Sweden and Finland).

Further, at the item level, an independent samples t-test shows that the pre-service teachers in language TE programs in Sweden and Finland assign different levels of weight to most of the measures in grading except oral tests and the use of target language out of school (Table 4). Although there are significant differences between the two groups in the weights they assign to ‘written tests’ and ‘national tests’, these differences are not observed at the scale (component) level (Table 3) for two reasons: (1) pre-service teachers in the two TE programs in Sweden assign significantly more weight to national tests in grading than their counterparts in the TE program in Finland, and vice versa when it comes to written tests, and (2) the two groups do not differ significantly in the weight they assign to oral tests in grading. When the responses to these three items are computed into a ‘tests’ scale, these differences are averaged into means that are not significantly different.

Table 4. Weight pre-service teachers assign to the two main components of measures (tests and processes/dispositions) in grading by country (Sweden and Finland).

Among the components derived from the national core curricula in Finland and Sweden, the highest scores on weight were given to written tests, oral tests and the use of target language in class (Table 4). Written tests as the key constituent of a grade were primarily emphasized by the Finnish pre-service teachers. Both written and oral tests belonged initially to the product type of evidence. The tendency to favor written tests as grading evidence among teachers at higher stages of education accords with the findings by Guskey and Link (Citation2019) and Cheng and Sun (Citation2015). Notably, the participants of the current study were trained in particular as subject teachers, who would primarily instruct students aged 13–18 years.

Use of oral language in class was the most favored processual source that could be used in grade assignment, followed by portfolio samples and engagement in class. These processual achievement factors are typically prominent at lower stages of education (Cheng & Sun, Citation2015; Kunnath, Citation2017). Our findings are thus in accordance with Seden and Svaricek (Citation2018), who contended that ‘old-fashioned’ practices such as tests prevailed among English teachers, while peer-assessment and portfolios were far more rarely executed.

Attitude towards learning and target language use were the criteria with the lowest average value. As attitude should not be included in grading either in Finland or Sweden, this was apparently reflected in the responses of the majority of respondents. Language use out-of-school can for instance partly be captured by portfolio work but extramural learning is not in principle part of school grading, even if the borders between in-school and out-of-school learning are presently fading (Sundqvist & Sylvén, Citation2016).

The foundations of grading portrayed by the overall sample underscore tests, written and oral, as well as oral language use in class. These findings are also in accordance with those by DeLuca and Klinger (2010) and Vogt and Tsagari (2014).

Regarding differences between pre-service teachers studying in Finland vs. Sweden, significant differences with the largest effect sizes (Cohen’s d above 0.8) appeared in relation to the importance of the use of target language out of school, engagement in class, attitude towards learning and self-directed learning skills. The pre-service teachers in Finland put more value on all of these. In addition, they placed more value, with slightly lower but still large effect sizes, on the use of oral language in class, attendance, use of metacognitive skills, in-class effort, and attitude towards learning and self-directed learning skills. The use of national tests was the only variable that Swedish pre-service teachers prioritised more than their Finnish counterparts. The reason lies in the diverging policies regarding national tests in the two countries: national tests play a focal role throughout the Swedish educational system, while in Finland there are no official nationwide tests until the end of general upper secondary education. When it comes to written and oral tests and the role of homework and projects in grading, Finnish pre-service teachers again express higher appreciation, with differences of moderate effect size (Cohen’s d of 0.5–0.8).
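For readers less familiar with the effect size measure, the following is a minimal sketch of how Cohen’s d can be computed and labelled with the cut-offs quoted above (0.5 and 0.8). It assumes the common pooled-standard-deviation form of d; the article does not state which variant was used, and the ratings below are hypothetical, not the study's data.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Cohen's d (group_a minus group_b) using the pooled sample standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)


def effect_label(d):
    """Label |d| with the cut-offs quoted in the text (0.5 and 0.8)."""
    size = abs(d)
    if size > 0.8:
        return "large"
    if size >= 0.5:
        return "moderate"
    return "below 0.5"


# Hypothetical ratings for one grading component; NOT the study's data.
finland = [4, 5, 4, 5, 3, 4]
sweden = [3, 3, 4, 2, 3, 3]

d = cohens_d(finland, sweden)
print(round(d, 2), effect_label(d))
```

A positive d here means the first group (the hypothetical Finnish ratings) assigns more weight than the second; swapping the arguments flips the sign but not the label.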

The second research question targeted the components and processes pertaining to the grading literacy model. The interview responses yielded more in-depth information about the perceived factors that have an impact on grading decisions. Due to the comparatively low number of interviews, no statistical comparisons can be made, but categories are presented, exemplified and discussed.

Mobilizing grading related knowledge, skills and dispositions

Mobilizing teachers’ grading-related knowledge, skills and dispositions, including an awareness of the impact of grades on students’ futures, is inherently a social endeavour, yet mediated by personality and experiential accounts (phase 1). In the interviews, the pre-service teachers acknowledge their responsibility as assessors and show an awareness of potential bias. The dispositional sub-domain of grading competence is reflected in the responses, with particular focus on the importance of fairness and how collaborative assessment helps attain this.

A common response is that a teacher needs to be aware of their personal values and priorities in grading to achieve fairness. Pre-service teachers realized it was difficult to be objective when interpersonal relations came into play, ‘Well it is important that you don’t have any favourites and give them higher grades because you like them’ (A5.1), but also in cases when the pupil reflected the teacher’s own personality: ‘you see the behavior you like, because I like ambition, I am ambitious myself, so when I see that, of course I want to reward that, but that is not what we should reward’ (A6.1).

Collaborative grading decisions were assumed to promote teachers’ knowledge and skills but were also appreciated as evidence of collegial support and trust: ‘she was an experienced teacher, [who] had worked more than 30 years, and still wanted to sit down with me’ (A7.1). On the other hand, collegial pressure was tangible in other cases, ‘there was a strange atmosphere between the teachers, a bit uncomfortable. What if I give a grade and then the other teachers can come and … she gave too high or too low grades …’ (A5.1), something also noted by earlier researchers (DeBoer et al., 2007; Kunnath, 2017).

Knowledge of the complexity of grading and its partly controversial consequences was also observed in relation to the teacher’s own values, ‘a lot of what is hidden when it comes to grade inflation is possibly about how one acts during lessons and how a nice pupil will get a little higher grade and so on… but at the same time it has been shown that often these are the pupils who can manage higher education because they have learnt how you act and how you progress in a studious environment’ (A9X.1). This comment aligns with findings by Thorsen (2014), who voiced the idea of a common grade dimension that, apart from discipline-specific academic ability, may contribute to the predictive power of school grades.

Another notable dispositional aspect was a deep sense of social responsibility expressed in respect to the future of their students, ‘grades have a great importance in Sweden compared to many other countries’ (B7.1). This responsibility also extended to engaging pupils to take responsibility for their learning, supported by conveying the grading standards in a gentle way, ‘I would assemble some of [the criteria]. In good time, I would make it clear to myself what they are, and also to the students so that they know what to invest in’ (C1.1). A developing ethical stance can be observed.

Personal factors mediating the deployment of assessment-related knowledge seem to be of critical importance, as does whether the respondents themselves recognize their self-esteem, or lack thereof. Statements range from being ‘100% sure [of what I am doing]’ (A3.1) at one end to the understanding of assessment competence as a process that has just started, ‘I feel ready to start grading, but it does feel a little scary!’ (A8.1), at the other.

Likewise, an important finding is that the interviewed respondents understand the function of grading as feedback on their own work and a prompt to reflect on their teaching, aptly expressed by one pre-service teacher, ‘[you] are able to modify your own teaching based on the feedback and not only leave the results to the students, but try a little to observe “why this always results in good or bad results”’ (C2.1).

Distinguishing relevant assessment events

Distinguishing relevant assessment events to ground grading decisions on is partly determined by social factors, for example by micro conditions of school climate and collegial discourse (phase 2). A statement such as ‘Teacher L has given an A or B to Pupil X and Pupil Y and I mean that they don’t deserve it! I know that they are low achievers in my subject so I know that they should be just as poor in civics as he is in my subject’ (A5.1) is likely a reflection of the particular school environment, considering the fact that pupils may have various interests, talents, strengths and weaknesses in different subjects and therefore be able to perform at different levels.

Identifying and filing evidence

Identifying and filing evidence of process, progress and product belongs to phase 3 in the model. At the macro level, the social impact of national tests in Sweden is frequently stated as a factor steering the process of identifying and filing evidence for grading, but ‘there is a risk that the national tests control how we teach, too much, and that there is a skewed coverage of the curriculum and… and that you just test whatever will come on the national tests’ (A1.1). Considering that the national tests are supposed to reflect the curriculum, even if they cannot encompass everything, the pre-service teachers still express a sound scepticism and an understanding that tests cannot cover everything. Such reflections based on individual value judgements strongly influence the inclusion or exclusion of certain types of grading evidence, ‘Tests, you can certainly have tests to check if the pupils have learned what you want them to learn from your teaching… but when it comes to modern languages, for example where you have a grammar test that is graded, it is very far from the knowledge requirements’ (A10X.1).

Interpreting assessment rubrics

How to interpret the multiple goal rubrics incorporated into curricular standard descriptions was a persisting source of uncertainty and serious consideration for pre-service teachers (phase 4). Ambiguity of wording and difficulty of interpretation were conceived as cumbersome, ‘even with the most competent teaching staff, you cannot expect an equivalence because it is so unclear, and up to everyone to judge for themselves’ (B7.1). The respondents are evidently aware of the diversity of goals and how difficult it is to embed them into a single grade. Often partial tests are merely aggregated, ‘and it is difficult to test it in a summative manner if it isn’t some type of final test that controls what you know in the subject’ (A9X.3). Pre-service teachers also seem to identify their own and colleagues’ bias in choosing interpretations that coincide with their personal values, ‘My eyes have opened to how much of other factors we take into our assessment that probably shouldn’t be there’ (A7.1). Making a grading decision by explicit or unconscious weighing of the official goals likewise leads to a timely discourse on educational policy, exemplified by the many misunderstandings of the former normative grading system, where ‘you thought when only X number could get a high grade and the rest had to be satisfied with a lower one, and, also in connection with grade inflation’ (A7.1).

Making the grading decision

Making a grading decision implies several steps of technical and ideological reasoning (phase 5). Teachers’ knowledge-based dilemmas can concern technical implementation of grade compilation, especially regarding criteria-based and norm-referenced assignment and profiling the multiple skills. Some respondents do not perceive themselves as autonomous agents in assigning grades at this point, and are uncertain in their assessment role, ‘in the first period, I will adjust them and look at the previous grades and evaluate the skills according to them’ (C6.1), which indicates a lack of both experience and confidence. These respondents fail to specify the grounds for their grading and rely on decisions taken by their predecessors. This case bears resemblance to the findings by Svennberg et al. (2014). Likewise, unofficial individual preferences that mirror teachers’ micro-level experience and attitudes often surface in grading decisions, such as ‘I would very much like to be able to give a bit extra to a pupil who really makes an effort, compared to the pupil who maybe has the language from the beginning, […but] doesn’t do anything, has a bad attitude’ (A2.1), all the while knowing that these factors should not be taken into consideration.

Studying ethics is frequently considered as a valuable attribute for pre-service teachers, which is indirectly acknowledged in subject related learning outcomes as enabling factors (Riley & Ungerleider, 2019). Some decisions resort to teachers’ subject-related knowledge mediated by individual experience instead of official criteria, putting emphasis on, for example, ‘the oral tests because the most important thing is that you can produce speech and make yourself understood and succeed in that language. It’s really great if you know the grammar really well, but usually those who know it can also speak it’ (C6.1).

Communicating to stakeholders

In communicating grading decisions (phase 6) at a personal level, the respondents highly appreciated transparency as an axiomatic value, but also as a means to avoid conflict and misunderstanding, thinking it was important to ‘be very direct and clear about what you are doing, how many points you give, why the pupil received this grade’ (A2.1).

Moreover, communication by default involves both micro and meso levels of activity. There was a broad range of concerns articulated by respondents from all three universities explored. Fears related to the knowledge and skills of how to convey grading messages were frequently voiced by the interviewees in terms of ‘how can the grade be communicated to the children, and what is lost in translation’ (B1.1). Similarly, facing guardians raises emotions of insecurity, ‘to handle pupils and the pupils’ parents… I’m a little worried about that’ (A3.1).

The interviewed pre-service teachers also perceived obligations from local school administration or the leader of their own school, echoing the findings by Fulmer et al. (2015). Regulations which meant saving ‘our basis for grading for five years…[so] that a pupil should be able to call the school after four years and ask why they received the grade they did’ (A7.1), for example, were considered necessary for transparency but also daunting.

In previous research, multiple sources of external pressure from policy makers and parents are identified by Kunnath (2017) and Cheng et al. (2020). Similarly, this study shows that multiple external factors influence pre-service teachers’ grading decisions, and that how they perceive these factors may result in validity and reliability issues in their grading.

Discussion and implications

Mobilizing teachers’ grading-related knowledge, skills and dispositions, including an awareness of the impact of grades on students’ futures, is inherently a social endeavor, yet mediated by personality and experiential accounts. The results of this study show that pre-service teachers acknowledge their responsibility as assessors and the potential for bias, and consider it crucial to be alert to one’s personal values and priorities in grading. They also acknowledge the function of grading in providing feedback on their work. On the other hand, they voice their concern about students’ awareness and understanding of grading principles, as well as the complexity of grading exercises for the teacher.

In distinguishing relevant assessment events and identifying and filing evidence to ground grading decisions on, variables incorporating process and disposition aspects explained a major part of the pre-service teachers’ prevailing evidence basis for grading. Considering individual grading components, the quantitative findings (RQ1) suggest the primacy of tests as cornerstones of summative assessment decisions. Written tests prevail, accompanied by oral tests, while the role of national tests is prominent in the Swedish context. In Sweden the most rarely used evidence bases for a school grade were attendance, attitude and out-of-school use of target language. In Finland these were national tests, attitude and self-directed learning. Large or moderate differences in the perceptions of pre-service language teachers in the two countries appeared in most grading components. With the exception of national tests, the Finnish pre-service teachers valued all proposed instances higher than their Swedish peers, most of them at a significant level. The explanation may be found partly in the timing of the pedagogical studies and partly in their content. First, in Finland these study units are taken during one year and the survey and interviews were carried out right after the assessment course, while in Sweden the respondents to the survey may have had a longer time to digest their knowledge after they had completed similar courses. Second, national core curricula are used as steering documents and teaching material in both countries. The range and precision of these assessment guides vary considerably between the two countries: the Finnish curricula provide long and detailed instructions, while assessment only plays a modest part in the Swedish language syllabi (Dragemark-Oscarson et al., 2023). This condition may lead to higher awareness of assessment issues in Finland.

Ambiguity of wording and the diversity of goals to be integrated into a single grade were perceived as a challenge. The respondents realized that making a grading decision implies weighing the components explicitly or unconsciously, often reflecting teachers’ local experiences and attitudes instead of official criteria. Collaborative marking was appreciated in promoting individual grading competence. Communicating grading decisions to guardians was viewed as a somewhat intimidating task for pre-service teachers, especially in combination with perceived obligations from local administration.

By proposing a practice-oriented model of grading literacy in this study, we aimed to describe the flow of teachers’ grading decisions, and our analysis of quantitative and qualitative data showed that it served its purpose. Although it is based on a multitude of titles in previous literature, the current setup is a novelty to our knowledge. The tentative model may need adjustment by merging categories, since distinguishing assessment events and identifying and filing the evidence may not be separable. Discerning the knowledge categories and specifying the social factors by level and content might also be important for an elaborated model. However, given the usability of the model in research of grading literacy, adopting an amended version of it would contribute to systemizing the instruction of summative assessment in general and advancing grading literacy in particular, in the context of pre-service teacher education.

Although this study sheds light on the components of pre-service teachers’ grading literacy, we acknowledge limitations particularly pertaining to the amount of quantitative data. The number of interviews from each of the three universities varied, and not all respondents specifically mentioned grading. Therefore, replications with new cohorts are essential in establishing the solidity of the model and the activity it portrays.

In conclusion, the findings contribute to the discourse on assessment literacy needed by various stakeholder groups initiated by Taylor (2013) by discerning the domain of grading literacy in pre-service teachers’ evolving repertoire. It is beneficial for pre-service teachers if we manage to broaden and enrich their grading practice to encompass processual and progress aspects rather than merely resorting to work samples and tests. Teacher educators should pay increased attention to practical implementation of grading standards and guidelines to encourage and train pre-service teachers’ ability to interpret and justify their grading decisions. To do this, measures need to be taken by curriculum designers to clarify and concretize their message in practical terms. The uncertainty in making grading decisions, one of the most fundamental obligations of the teaching profession, is repeatedly voiced by in-service teachers as well. Teacher education should equip prospective language teachers as active professionals to critically evaluate the quality and clarity of administrative decisions and steering documents. To intensify and systematize assessment instruction to ensure that teachers are capable of making high-quality grading decisions, multi-method longitudinal studies are imperative in the Nordic context and beyond.

Disclosure statement

The authors report there are no competing interests to declare.

Additional information

Funding

This study was supported by Vetenskapsrådet.

Notes on contributors

Raili Hilden

Raili Hilden teaches foreign language didactics, conducts research in language assessment and leads a number of assessment related projects. She also works for the Finnish Matriculation Examination Board. Her scholarly contribution covers a broad range of research articles and popularized papers.

Anne Dragemark-Oscarson

Anne Dragemark-Oscarson works at the Department of Pedagogical, Curricular and Professional Studies, University of Gothenburg, where she has taught foreign language didactics, led and participated in several language assessment projects as well as worked with national testing. Her main research interests are formative assessment practices, self-assessment in particular, and teacher education.

Ali Yildirim

Ali Yildirim is a Professor of Pedagogical Work at the Department of Pedagogical, Curricular and Professional Studies, University of Gothenburg, where he leads the Didactic classroom studies research environment. His main research interests include curriculum studies, teacher education and professional development.

Birgitta Fröjdendahl

Birgitta Fröjdendahl, Senior Lecturer in Language Education, teaches and conducts research at the Department of Teaching and Learning, Stockholm University. Her main research interests include assessment, curriculum theory and leadership studies.

References

  • Anderson, L. W. (2018). A critique of grading: Policies, practices, and technical matters. Education Policy Analysis Archives, 26(49). Historical and Contemporary Perspectives on Educational Evaluation: Dialogues with the International Academy of Education, 49. https://doi.org/10.14507/epaa.26.3814
  • Bachman, L. F., & Damböck, B. (2017). Language assessment for classroom teachers. Oxford University Press.
  • Bachman, L. F., & Palmer, A. (2010). Language assessment in practice. Developing language assessments and justifying their use in the real world. Oxford University Press.
  • Bronfenbrenner, U. (1996). The ecology of human development: Experiments by nature and design. Harvard University Press.
  • Chapelle, C. A. (2020). Argument-based validation in testing and assessment. Sage Publications.
  • Chapelle, C. A., Enright, M. K., & Jamieson, J. (2010). Does an argument-based approach to validity make a difference? Educational Measurement: Issues and Practice, 29(1), 3–13. https://doi.org/10.1111/j.1745-3992.2009.00165.x
  • Cheng, L., DeLuca, C., Braund, H., Yan, W., & Rasooli, A. (2020). Teachers’ grading decisions and practices across cultures: Exploring the value, consistency, and construction of grades across Canadian and Chinese secondary schools. Studies in Educational Evaluation, 67, 100928. https://doi.org/10.1016/j.stueduc.2020.100928
  • Cheng, L., & Sun, Y. (2015). Interpreting the impact of the Ontario secondary school literacy test on second language students within an argument-based validation framework. Language Assessment Quarterly, 12(1), 50–66. https://doi.org/10.1080/15434303.2014.981334
  • DeBoer, B. V., Anderson, D. M., & Elfessi, A. M. (2007). Grading styles and instructor attitudes. College Teaching, 55(2), 57–64. https://doi.org/10.3200/CTCH.55.2.57-64
  • DeLuca, C., Chavez, T., Bellara, A., & Cao, C. (2013). Pedagogies for preservice assessment education: Supporting teacher candidates’ assessment literacy development. The Teacher Educator, 48(2), 128–142. https://doi.org/10.1080/08878730.2012.760024
  • DeLuca, C., & Klinger, D. A. (2010). Assessment literacy development: Identifying gaps in teacher candidates’ learning. Assessment in Education: Principles, Policy & Practice, 17(4), 419–438. https://doi.org/10.1080/0969594X.2010.516643
  • DeLuca, C. (2017). Examining variability in teachers’ approaches to classroom assessment: A latent class analysis study. AERA Online Paper Repository
  • Dragemark Oscarson, A., Fröjdendahl, B., & Hildén, R. (2023). Att läsa bedömningsuppdraget: Ett textanalytiskt exempel från tidigare svenska och finländska läroplaner. Educare, (2), 210–242. https://doi.org/10.24834/educare.2023.2.819
  • Education Act. (2010). (800). https://www.skolverket.se/download/18.47fb451e167211613ef398/1542791697007/swedishgrades_b).
  • Edwards, F. (2017). A rubric to track the development of secondary pre-service and novice teachers’ summative assessment literacy. Assessment in Education: Principles, Policy & Practice, 24(2), 205–227. https://doi.org/10.1080/0969594X.2016.1245651
  • Finnish National Agency of Education. (2015). National core curricula for general upper secondary education. Finnish National Board of Education.
  • Finnish National Agency of Education. (2020). Oppilaan oppimisen ja osaamisen arviointi perusopetuksessa. Perusopetuksen opetussuunnitelman perusteiden 2014 muutokset. [Evaluation of pupils’ learning and learning outcomes in basic education. Amendments to the core curricula for basic education 2014]. 10.2.2020.
  • Finnish National Board of Education. (2014). National core curricula for basic education. Finnish National Board of Education.
  • Fulcher, G. (2012). Assessment literacy for the language classroom. Language Assessment Quarterly, 9(2), 113–132. https://doi.org/10.1080/15434303.2011.642041
  • Fulmer, G. W., Lee, I. C. H., & Tan, K. H. K. (2015). Multi-level model of contextual factors and teachers’ assessment practices: An integrative review of research. Assessment in Education: Principles, Policy & Practice, 22(4), 475–494. https://doi.org/10.1080/0969594X.2015.1017445
  • Guskey, T. R., & Bailey, J. M. (2010). Developing standards-based report cards. Corwin Press.
  • Guskey, T. R., & Link, L. J. (2019). Exploring the factors teachers consider in determining students’ grades. Assessment in Education: Principles, Policy & Practice, 26(3), 303–320. https://doi.org/10.1080/0969594X.2018.1555515
  • Herppich, S., Praetorius, A.-K., Förster, N., Glogger-Frey, I., Karst, K., Leutner, D., Behrmann, L., Böhmer, M., Ufer, S., Klug, J., Hetmanek, A., Ohle, A., Böhmer, I., Karing, C., Kaiser, J., & Südkamp, A. (2018). Teachers’ assessment competence: Integrating knowledge-, process-, and product-oriented approaches into a competence-oriented conceptual model. Teaching and Teacher Education, 76, 181–193. https://doi.org/10.1016/j.tate.2017.12.001
  • Hildén, R., & Fröjdendahl, B. (2018). The dawn of assessment literacy – exploring the conceptions of Finnish student teachers in foreign languages. Apples - Journal of Applied Language Studies, 12(1), 1–24. https://doi.org/10.17011/apples/urn.201802201542
  • Hilden, R., Oscarson, A. D., Yildirim, A., & Fröjdendahl, B. (2022). Swedish and Finnish pre-service teachers’ perceptions of summative assessment practices. Languages (Basel), 7(1), 10. https://doi.org/10.3390/languages7010010
  • Hilden, R., & Takala, S. (2007). Relating descriptors of the Finnish school scale to the CEF overall scales for communicative activities. In Koskensalo, A., Smeds, J., Kaikkonen, P. & Kohonen, V. (Eds.), Foreign languages and multicultural perspectives in the European context. Fremdsprachen und multikulturelle Perspektiven im europäischen Kontext. Dichtung, Wahrheit und Sprache (pp. 291–300). LIT-Verlag.
  • Kremmel, B., & Harding, L. (2020). Towards a comprehensive, empirical model of language assessment literacy across stakeholder groups: Developing the language assessment literacy survey. Language Assessment Quarterly, 17(1), 100–120. https://doi.org/10.1080/15434303.2019.1674855
  • Kunnath, J. P. (2017). Teacher grading decisions: Influences, rationale, and practices. American Secondary Education, 45(3), 68–88.
  • Lundahl, C., Hultén, M., & Tveit, S. (2017). The power of teacher-assigned grades in outcome-based education. Nordic Journal of Studies in Educational Policy, 3(1), 56–66. https://doi.org/10.1080/20020317.2017.1317229
  • Meissel, K., Meyer, F., Yao, E. S., & Rubie-Davies, C. M. (2017). Subjectivity of teacher judgments: Exploring student characteristics that influence teacher judgments of student ability. Teaching and Teacher Education, 65, 48–60. https://doi.org/10.1016/j.tate.2017.02.021
  • Mertler, C. A. (2005). ERRATUM: Secondary teachers’ assessment literacy: Does classroom experience make a difference? American Secondary Education, 33(2), 76–73.
  • Messick, S. (1989). Meaning and values in test validation: The science and ethics of assessment. Educational Researcher, 18(2), 5–11. https://doi.org/10.2307/1175249
  • Muñoz, M. A., & Guskey, T. R. (2015). Standards-based grading and reporting will improve education. Phi Delta Kappan, 96(7), 64–68. https://doi.org/10.1177/0031721715579043
  • Oscarson, M. (2015). Bedömning på systemnivå-En komparativ studie av stegsystemet i språk i den svenska skolan och språknivåer i Europarådets Common European Framework of Reference. Educare, 2(2), 128–153. https://doi.org/10.24834/educare.2015.2.1135
  • Pastore, S., & Andrade, H. L. (2019). Teacher assessment literacy: A three-dimensional model. Teaching and Teacher Education, 84, 128–138. https://doi.org/10.1016/j.tate.2019.05.003
  • Randall, J., & Engelhard, G. (2010). Examining the grading practices of teachers. Teaching and Teacher Education, 26(7), 1372–1380. https://doi.org/10.1016/j.tate.2010.03.008
  • Riley, T., & Ungerleider, C. (2019). Imputed meaning: An exploration of how teachers interpret grades. Action in Teacher Education, 41(3), 212–228. https://doi.org/10.1080/01626620.2019.1574246
  • Seden, K., & Svaricek, R. (2018). Teacher subjectivity regarding assessment: exploring English as a foreign language teachers’ conceptions of assessment theories that influence student learning. Center for Educational Policy Studies Journal, 8(3), 119–139. https://doi.org/10.26529/cepsj.500
  • Shohamy, E. (2001a). Democratic assessment as an alternative. Language Testing, 18(4), 373–391. https://doi.org/10.1177/026553220101800404
  • Shohamy, E. (2001b). The power of tests: A critical perspective on the uses of language tests. Longman.
  • Shulman, L. S. (1986). Those who understand: Knowledge growth in teaching. Educational Researcher, 15(2), 4–14. https://doi.org/10.2307/1175860
  • Sundqvist, P., & Sylvén, L. K. (2016). Extramural English in teaching and learning: From theory and research to practice. Springer.
  • Svennberg, L., Meckbach, J., & Redelius, K. (2014). Exploring PE teachers’ ‘gut feelings’ An attempt to verbalise and discuss teachers’ internalised grading criteria. European Physical Education Review, 20(2), 199–214. https://doi.org/10.1177/1356336X13517437
  • Taylor, L. (2013). Communicating the theory, practice and principles of language testing to test stakeholders: Some reflections. Language Testing, 30(3), 403–412. https://doi.org/10.1177/0265532213480338
  • Terry, G., Hayfield, N., Clarke, V., & Braun, V. (2017). Thematic analysis. In Willig, C. & Stainton Rogers, W. (Eds.), The SAGE handbook of qualitative research in psychology (Chapter 2). https://doi.org/10.4135/9781526405555
  • Thorsen, C. (2014). Dimensions of norm-referenced compulsory school grades and their relative importance for the prediction of upper secondary school grades. Scandinavian Journal of Educational Research, 58(2), 127–146. https://doi.org/10.1080/00313831.2012.705322
  • Tierney, R. D., Simon, M., & Charland, J. (2011). Being fair: Teachers’ interpretations of principles for standards-based grading. The Educational Forum, 75(3), 210–227. https://doi.org/10.1080/00131725.2011.577669
  • Vogt, K., & Tsagari, D. (2014). Assessment literacy of foreign language teachers: Findings of a European study. Language Assessment Quarterly, 11(4), 374–402. https://doi.org/10.1080/15434303.2014.960046
  • Xu, Y., & Brown, G. T. L. (2016). Teacher assessment literacy in practice: A reconceptualization. Teaching and Teacher Education, 58, 149–162. https://doi.org/10.1016/j.tate.2016.05.010
  • Yan, Z., Li, Z., Panadero, E., Yang, M., Yang, L., & Lao, H. (2021). A systematic review on factors influencing teachers’ intentions and implementations regarding formative assessment. Assessment in Education: Principles, Policy & Practice, 28(3), 228–260. https://doi.org/10.1080/0969594X.2021.1884042

Appendix 1

Interview guide

Interviewee/respondent:

Interviewer:

Date:

Time, start:           end:

Dear student,

We are a group of researchers who are studying pre-service teachers’ summative assessment literacy and possible changes in their conceptions and practice during the first year of teaching. We would like to ask you questions about various aspects of assessment that deal with what you have learned in the TE programme, and what you think about using various strategies and techniques in the future. The interview will take about an hour.

Your responses are confidential and will be used only for research purposes. We would like to record this interview if it is OK with you. However, if you feel uncomfortable at any stage of this process, you can cancel the interview and the recording will be erased.

Do you have any questions?

Let us start with some background questions:

  • TE programme:

  • Subject(s):

  • CEFR level (A1, A2, B1, B2, C1 and C2):

  • Year in the programme:

  • Native language:

  • Previous teaching experience:

What do you think about various aspects of assessment:

  1. What does summative assessment mean to you? (conceptions/perspectives about purpose, scope). How do you differentiate between formative and summative assessment – their connections, implications for teaching and learning, students, schools?

  2. How have you developed your knowledge of summative assessment during teacher training?

    1. Which courses/modules in teacher education (TE) have covered assessment topics and competencies? In what ways and how? Do they suffice? Is anything missing? (Accounts of the courses can be given by the interviewer to assist the respondent. A list of courses can be attached.)

    2. What do you think about the goals of TE in relation to developing your understanding of and competency in assessment?

    3. Do you think the TE programme is clear or successful in relation to assessment literacy with the instruction of competencies and expectations through documents, course guides, etc.? If the answer is yes, how and to what degree? If no, what problems may this bring about in relation to your education?

  3. Experiences during school practica

    1. Can you describe how assessment issues were covered during school practica?

    2. How did you develop your knowledge of assessment during these courses? What helped you the most?

    3. How were you introduced to assessment practices at your practicum (VFU) school, e.g. by a general plan for how to approach, conduct and evaluate (formative and) summative assessment? Was the importance of alignment between formative and summative assessment strategies underlined or not?

    4. What about assessment issues in light of the school culture? Mention some of the approaches or attitudes to summative assessment that you have come across.

    5. What other sources of information are there in your development of assessment literacy? Seminars, conferences, web sources, projects, etc.

  4. Name some crucial aspects of summative assessment. How would you define critical areas of summative assessment and their importance? How do you assess your understanding of these aspects? Areas of strengths, of improvement and/or of weaknesses?

  5. At this stage of your teacher training, how comfortable are you with methods to assess students’ performance in your subject area (performance, process, other; teacher-made and others)? Areas of strengths, of improvement and/or of weaknesses?

  6. Now, after having gone through teacher education, how comfortable are you with grading, analyzing and using the results of summative assessment? Areas of strengths, of improvement and/or of weaknesses?

  7. What do you know about national and international testing in schools? How do you see them in relation to what you have learned on assessment in your TE programme?

  8. Are there any further comments or thoughts?

Many thanks for your participation!