Editorial

Current controversies in educational assessment

As the global education community adapts to life in a post-pandemic world, controversies in educational assessment continue to challenge researchers across countries and regions. Some of these controversies are linked to inequalities in education systems: students worldwide do not have access to the same resources, which continues to disadvantage them unfairly in how they are assessed. Perhaps the most dramatic development in this respect is that some countries continue to deny girls an education, Afghanistan being a recent example. This demonstrates how important it is to work even harder towards the UN Sustainable Development Goals, which aspire to a world of peace, prosperity, and dignity in which girls and women can live free from discrimination, take an active part in education, and sit the examinations that open the way to higher education and careers.

One of the OECD’s ambitions is to provide policymakers with evidence-based knowledge about their education systems and to enhance equality for all students through large-scale assessment studies such as PISA. Such an ambition depends upon trust in the assessments themselves and demands transparency in how concepts are measured and reported.

In the first paper of this issue, Zieger et al. (2022) discuss the so-called ‘conditioning model’, which is part of the OECD’s Programme for International Student Assessment (PISA). The aim of the paper is to examine this practice and use of the model, and its impact on the PISA results. PISA is widely used and cited globally after eight cycles of data collection in almost 100 countries within the first quarter of this century (Jerrim, 2023). Yet despite its prominence as the world’s largest and best-known comparative international education study, little is known about how student background variables are used when deriving students’ achievement scores. More specifically, Zieger et al. (this issue) demonstrate that the conditioning model is sensitive to which background variables are included. In fact, changes to how background variables are used lead to changes in the ranking of countries and in how they compare in PISA. This was particularly the case for the variables around socio-economic background that are used to measure inequality in education. The authors understandably suggest that this issue needs to be addressed further, both within and outside the OECD, and that comparisons of certain measures must be treated with caution.
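To make the idea concrete, below is a deliberately simplified sketch of how a conditioning model works and why the choice of background variables matters. It replaces PISA’s operational item response model with a normal-normal measurement model, and all variable names and numbers are invented for illustration; this is not the OECD’s implementation, only a minimal sketch of the conditioning principle.

    import numpy as np

    rng = np.random.default_rng(42)

    # Simulate students: a background variable, a latent ability, and a noisy score.
    n = 1000
    ses = rng.normal(size=n)                             # socio-economic status (invented)
    theta = 0.5 * ses + rng.normal(scale=0.8, size=n)    # 'true' ability
    score = theta + rng.normal(scale=0.6, size=n)        # observed test performance
    meas_var = 0.36                                      # measurement error variance (assumed known)

    def plausible_values(score, X, meas_var, n_pv=5):
        """Draw plausible values, conditioning ability on the background matrix X."""
        beta, *_ = np.linalg.lstsq(X, score, rcond=None)  # latent regression (simplified)
        prior_mean = X @ beta
        tau2 = np.var(score - prior_mean) - meas_var      # residual ability variance
        post_var = 1.0 / (1.0 / tau2 + 1.0 / meas_var)    # normal-normal posterior
        post_mean = post_var * (prior_mean / tau2 + score / meas_var)
        return rng.normal(post_mean, np.sqrt(post_var), size=(n_pv, len(score)))

    ones = np.ones((n, 1))
    pv_without = plausible_values(score, ones, meas_var)                       # no SES
    pv_with = plausible_values(score, np.column_stack([ones, ses]), meas_var)  # with SES

    # The estimated high/low-SES achievement gap depends on the conditioning set:
    low = ses < 0
    for label, pv in [("without SES", pv_without), ("with SES", pv_with)]:
        print(label, round(pv[:, ~low].mean() - pv[:, low].mean(), 3))

Running this shows the gap shrinking when SES is omitted from the conditioning set, because plausible values are pulled towards a common mean; this is the same mechanism behind the sensitivity Zieger et al. document at the country level.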

Debates around PISA and other international large-scale studies are not new, and the calculation of scores and rankings has been a controversial topic since the introduction of these studies (Goldstein, 2004). Nevertheless, the call for more openness around the use of different models and the impact they have on rankings must be addressed, as such studies depend upon the public’s trust.

The second paper describes the implementation of a vertical scaling design for numeracy tests given in grades 5 and 8 as part of the national testing system in Norway (Ræder et al., 2022). The design bridges the gap between grades 5 and 8 using linking tests tailored for grades 6 and 7. The researchers suggest that this approach is cost-effective, as there is no need for new item development. They further discuss the implications for creating vertical scales in the context of national assessment systems. Vertical scaling designs are used in the US and Australia, but less so in Europe. The study therefore offers a blueprint for other countries and contexts on how to link established tests administered in different grades that were originally designed to operate independently. It also demonstrates how innovative research can support assessment practice, to the benefit of students, teachers, and the public alike.
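The core linking step can be illustrated with a toy example. The sketch below applies simple mean-mean linking under a Rasch model, with invented item difficulties; the published design is considerably richer, using dedicated linking tests for grades 6 and 7 rather than a handful of shared items.

    import numpy as np

    # Hypothetical Rasch item difficulties from two separately calibrated tests.
    # The shared 'link' items stand in for the grade 6/7 linking tests.
    grade5 = {"g5_1": -1.2, "g5_2": -0.4, "link_1": 0.1, "link_2": 0.6}
    grade8 = {"g8_1": 0.9, "g8_2": 1.5, "link_1": 0.8, "link_2": 1.4}

    common = sorted(set(grade5) & set(grade8))
    # Mean-mean linking: the constant that aligns the common items' difficulties.
    shift = float(np.mean([grade8[i] - grade5[i] for i in common]))

    grade5_rescaled = {item: b + shift for item, b in grade5.items()}
    print(f"linking constant: {shift:.2f}")               # 0.75 with these numbers
    print({i: round(b, 2) for i, b in grade5_rescaled.items()})

Once both tests sit on a common scale, growth between grades can be expressed directly, which is what makes a vertical scale useful to teachers and policymakers.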

Kelly et al. (2022) discuss Comparative Judgement (CJ) as a new assessment form. CJ is gaining popularity in educational assessment (for a broader introduction, see McGrane, 2023). Still, Kelly et al. (this issue) argue that the case for CJ lacks clarity, suggesting that its advocates have not compiled compelling support for two of their central claims: (1) humans are better at comparative judgements than absolute judgements, and (2) comparative judgement is necessarily valid because it aggregates judgements made by experts in a naturalistic way. It is hoped that the concerns raised in the paper will be addressed by both researchers and practitioners engaging with CJ. The method has been a promising tool for many, but there is still work to do in developing CJ and securing more empirical evidence for its use.
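For readers new to CJ, the aggregation step the authors scrutinise is typically a Bradley-Terry-style model fitted to judges’ pairwise decisions. The following toy sketch, with invented judgement data, estimates script qualities using the classic Zermelo/minorisation-maximisation iteration; operational CJ systems add adaptive pairing and reliability checks on top of this.

    import numpy as np

    # Invented pairwise judgements: (winner, loser) indices for four scripts.
    judgements = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3),
                  (1, 0), (2, 1), (3, 2)]

    n_items = 4
    wins = np.zeros((n_items, n_items))
    for w, l in judgements:
        wins[w, l] += 1

    # Zermelo / MM iteration for Bradley-Terry strengths.
    p = np.ones(n_items)
    for _ in range(200):
        total_wins = wins.sum(axis=1)
        new_p = np.empty(n_items)
        for i in range(n_items):
            denom = sum((wins[i, j] + wins[j, i]) / (p[i] + p[j])
                        for j in range(n_items) if j != i)
            new_p[i] = total_wins[i] / denom
        p = new_p / new_p.sum()              # normalise: strengths are only relative

    quality = np.log(p) - np.log(p).mean()   # centred logit-scale quality estimates
    print(np.round(quality, 2))

Note that the model only converts relative decisions into a scale; whether those decisions are better than absolute judgements is precisely the empirical question Kelly et al. press advocates to answer.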

Crisp and Greatorex (2023) present a study analysing items used in science examinations after the GCSE (General Certificate of Secondary Education) reforms in England, which introduced new requirements for assessing the application of science. The reported study explored the nature of the contexts used in the reformed GCSE combined science examination. A qualitative coding frame was used, and the risk of introducing construct-irrelevant variance is discussed. The study is of importance beyond England, as it offers knowledge not only to test developers but also to the users of these assessments: teachers, students, and the research community.

Cai et al. (2022) report on a study conducted with 4,837 Hong Kong students, investigating the relationship between formative assessment strategies and reading achievement. Results showed a significant effect of formative assessment strategies for low- and medium-achieving readers, but not for high-achieving readers. Implications for future research and practice are discussed, both in the context of Hong Kong and beyond. The study is of particular interest because we still have few large-scale studies investigating formative assessment practices and their impact on achievement. Claims about its effect have long been made (Black & Wiliam, 1998; Mandouit & Hattie, 2023), but few studies have examined and documented how such practices can enhance students’ learning as measured on achievement tests.

The final paper, by Riegel et al. (2022), reports on a study in New Zealand in which the research team studied self-efficacy related to assessment, more specifically ‘comprehension and execution’ and ‘emotional regulation’, in low-stakes and high-stakes scenarios. The development and testing of a new measure is reported across two studies (N = 301, and Ns = 277 and 329, respectively). Potential uses of the new Measure of Assessment Self-Efficacy (MASE) are discussed. This journal has traditionally not published validation studies of measures and assessments, but given the increased focus on transparency in assessment, and the controversies discussed in the papers published in this issue, we welcome papers that validate existing measures or develop new and better assessments for the future. Only through innovation will we be able to tackle the current controversies in educational assessment.

Disclosure statement

No potential conflict of interest was reported by the author.

References

  • Black, P., & Wiliam, D. (1998). Assessment and classroom learning. Assessment in Education: Principles, Policy & Practice, 5(1), 7–74. https://doi.org/10.1080/0969595980050102
  • Cai, Y., Yang, M., & Yao, J. (2022). More is not always better: The nonlinear relationship between formative assessment strategies and reading achievement. Assessment in Education: Principles, Policy & Practice. https://doi.org/10.1080/0969594X.2022.2158304
  • Crisp, V., & Greatorex, J. (2023). The appliance of science: Exploring the use of context in reformed GCSE science examinations. Assessment in Education: Principles, Policy & Practice. https://doi.org/10.1080/0969594X.2022.2156980
  • Goldstein, H. (2004). International comparisons of student attainment: Some issues arising from the PISA study. Assessment in Education: Principles, Policy & Practice, 11(3), 319–330. https://doi.org/10.1080/0969594042000304618
  • Jerrim, J. (2023). Has Peak PISA passed? An investigation of interest in International Large-Scale Assessments across countries and over time. European Educational Research Journal. https://doi.org/10.1177/14749041231151793
  • Kelly, K. T., Richardson, M., & Isaacs, T. (2022). Critiquing the rationales for using comparative judgement: A call for clarity. Assessment in Education: Principles, Policy & Practice, 1–15. https://doi.org/10.1080/0969594X.2022.2147901
  • Mandouit, L., & Hattie, J. (2023). Revisiting “The Power of Feedback” from the perspective of the learner. Learning and Instruction, 84, 101718. https://doi.org/10.1016/j.learninstruc.2022.101718
  • McGrane, J. (2023). Comparative judgment. In International Encyclopedia of Education (4th ed., pp. 73–78). Elsevier. https://doi.org/10.1016/b978-0-12-818630-5.09023-0
  • Ræder, H. G., Andersson, B., & Olsen, R. V. (2022). Numeracy across grades – vertically scaling the Norwegian national numeracy tests. Assessment in Education: Principles, Policy & Practice, 1–21. https://doi.org/10.1080/0969594X.2022.2147483
  • Riegel, K., Evans, T., & Stephens, J. M. (2022). Development of the measure of assessment self-efficacy (MASE) for quizzes and exams. Assessment in Education: Principles, Policy & Practice, 1–17. https://doi.org/10.1080/0969594X.2022.2162481
  • Zieger, L. R., Jerrim, J., Anders, J., & Shure, N. (2022). Conditioning: How background variables can influence PISA scores. Assessment in Education: Principles, Policy & Practice, 1–21. https://doi.org/10.1080/0969594X.2022.2118665
