
Demythologising A level Exam Standards

Pages 875-906 | Received 01 Sep 2020, Accepted 28 Dec 2020, Published online: 06 Jan 2021
 

ABSTRACT

There are two major myths concerning A level exam standards in England. First, the Ancient Myth, which insists that standards were norm-referenced until the 1980s, when they transitioned to being criterion-referenced. Second, the Modern Myth, which insists that standards transitioned again, during the 2010s, to being based upon the comparable outcomes principle. The present paper debunks these myths, arguing that, except for the occasional use of comparable outcomes to bridge qualification reforms, A level standards have always been attainment-referenced, and that this has always been operationalised using a combination of methods, including both examiner judgement of exam performances and statistical expectations of cohort attainment. The paper also argues that what has changed significantly is the degree of confidence that the exam industry has placed in examiner judgement relative to statistical expectations, which has waxed and waned over time. When statistical expectations have prevailed, pass rates have tended to plateau, somewhat implausibly. When examiner judgement has prevailed, pass rates have tended to rise, also somewhat implausibly. These trends have given a false impression of principled transitions, which the paper dispels.

Acknowledgments

I am grateful to Dennis Opposs, Tim Oates, and an anonymous reviewer for helpful feedback on an earlier draft of this report; and to Cambridge Assessment for support in accessing documents and data from the UCLES archives.

Disclosure statement

No conflicts of interest to disclose.

Notes

1. Although the present report focuses on A level standards, most of its conclusions generalise to both O level and GCSE exams, which have operated in essentially the same way.

2. Pollard was actually discussing percentage requirements; so when referring to ‘number’ he was presumably assuming a similar cohort size.

3. For simplicity of exposition, the analysis in the present paper will be framed primarily in terms of the pass rate, i.e. the percentage of a cohort that is awarded grade E or above. This might be the cohort of candidates that sat a particular subject exam within a particular year within a particular board. Or it might be a higher-level cohort, e.g. the cohort of all candidates that sat a subject exam within a particular subject area, across all boards, within a particular year. Or it might even be at the highest level, i.e. the ‘cohort’ of all exams sat within a particular year, across all subject areas, across all boards. The sketch below makes these levels of aggregation concrete.
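
The following minimal Python sketch illustrates the pass-rate definition at the three levels of aggregation described above. The candidate records, field names, and figures are invented for illustration; they are not drawn from the paper's data.

```python
# Illustrative sketch only: records and field names are hypothetical.

PASSING_GRADES = {"A", "B", "C", "D", "E"}  # grade E or above counts as a pass

def pass_rate(candidates):
    """Percentage of a cohort awarded grade E or above."""
    if not candidates:
        return 0.0
    passes = sum(1 for c in candidates if c["grade"] in PASSING_GRADES)
    return 100.0 * passes / len(candidates)

# A hypothetical set of results (board, subject, year, grade).
results = [
    {"board": "UCLES", "subject": "Physics", "year": 1970, "grade": "B"},
    {"board": "UCLES", "subject": "Physics", "year": 1970, "grade": "U"},
    {"board": "JMB",   "subject": "Physics", "year": 1970, "grade": "E"},
    {"board": "JMB",   "subject": "History", "year": 1970, "grade": "A"},
]

# Board-level cohort: one subject, one year, one board.
board_cohort = [c for c in results
                if c["board"] == "UCLES" and c["subject"] == "Physics" and c["year"] == 1970]

# Subject-level cohort: one subject, one year, across all boards.
subject_cohort = [c for c in results if c["subject"] == "Physics" and c["year"] == 1970]

# Highest-level 'cohort': all exams sat in the year, all subjects, all boards.
national_cohort = [c for c in results if c["year"] == 1970]

print(pass_rate(board_cohort), pass_rate(subject_cohort), pass_rate(national_cohort))
```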

4. These data (and those relating to ) were collated from records in the Cambridge Assessment archives. They include candidates from the University of Cambridge Local Examinations Syndicate Summer examinations series (Home candidates only, Main syllabuses only).

5. Now, if norm-referencing were being applied as a matter of principle, then this ought, in theory, to be a perfectly straight line. In practice, though, it is impossible to engineer absolute precision when setting grade boundaries. For instance, if 72.3% of a subject cohort were to achieve a mark of 26 or more, while 67.3% of that cohort were to achieve a mark of 27 or more, then an exam board would no doubt designate 26 as the ‘70%’ pass mark, even though this would actually return a slightly higher percentage pass rate.
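
The selection rule described in this note can be sketched in a few lines of Python. The mark distribution below is invented; only the rule follows the note: the board designates the highest mark whose cumulative percentage still meets the nominal target, so the realised pass rate slightly overshoots it.

```python
# Hypothetical sketch of the boundary-setting arithmetic described in note 5.

def norm_referenced_boundary(cumulative_pct, target_pct):
    """Pick the highest mark at which the cumulative percentage of the cohort
    still meets or exceeds the target pass rate.

    cumulative_pct: dict mapping each mark to the percentage of the cohort
    achieving that mark or more.
    """
    eligible = [mark for mark, pct in cumulative_pct.items() if pct >= target_pct]
    return max(eligible) if eligible else min(cumulative_pct)

# As in the note: 72.3% achieve 26 or more, while 67.3% achieve 27 or more.
cumulative = {25: 78.1, 26: 72.3, 27: 67.3, 28: 61.0}

boundary = norm_referenced_boundary(cumulative, target_pct=70.0)
print(boundary)              # 26
print(cumulative[boundary])  # 72.3 -- slightly above the nominal 70% target
```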

6. See Newton (Citation2011) for other possible reasons, including misinterpreted recommendations from a Secondary Schools Examinations Council report.

7. Crofts was Secretary to the Joint Matriculation Board of the Northern Universities (JMB), from 1919 to 1941.

8. The precise details of how grade boundary marks are derived have differed over time and across exam boards. Differences across boards were more prevalent from the 1950s to the mid-1980s; after which, they were gradually brought into alignment, especially during the 1990s. However, the basic idea of empowering senior examiners to judge the quality of candidate performances was common across all boards (Robinson Citation2007; see Taylor and Opposs Citation2018 for a more up-to-date account).

9. I undertook this research as Director of the Cambridge Assessment Network, within the Assessment Research and Development Division of Cambridge Assessment.

10. The raw data were uploaded from photocopies of documents, prepared by the boards on an annual basis, which presented A level exam results, across all boards, broken down by gender. These documents, one for each year, were titled: ‘General Certificate of Education, Advanced level, Summer [year]’. Where alternative syllabuses were offered by a board, within a particular subject area, those results were aggregated. (The original documents were provided by the Cambridge Assessment Archive Team, 29 June 2011.)

11. Although these data come from a single exam board, similar trends can be seen at the national level, across all subjects and all exam boards (e.g. University of Buckingham Citation2010).

12. Alongside the debate over criterion-referencing, procedures for determining certain A level grade boundaries were changed, and brought into line across the boards. This change, which occurred in 1987, is often mistaken for the ‘transition’ to criterion-referencing (e.g. Shackleton Citation2014; Tattersall Citation2007; cf. Kingdon Citation1991; Newton Citation2011).

13. Although the scales tipped in favour of examiner judgement from the 1980s onwards, it is important to appreciate that the balance between judgement and statistics waxed and waned even during the ‘strangely escalating’ phase. For instance, the Chief Executive of the QCA believed that the A level exams crisis of 2002 was largely a consequence of failing to pay sufficient attention to examiner judgement. Subsequently, a spokesman for the QCA was quoted as saying: ‘When chairs of examiners look to set grade boundaries this year they will have to take account of a revised code of practice that clearly gives priority to examiner judgement not statistics.’ (Townsend and Bright Citation2003).

14. These practices evolved slightly differently across exam boards. For instance, although all boards set the pass/fail grade boundary using examiner judgement of performance evidence, this was not true of the other grade boundaries. While some of the boards set all grade boundaries using judgement, others reserved judgement for key grade boundaries (e.g. A/B, B/C and E/fail), and then ‘interpolated’ the remaining ones. These practices were aligned in 1987 (Kingdon Citation1991).

15. In 1944, Brereton was Assistant Secretary to UCLES. Jenkins was Secretary to the University of London University Entrance and School Examinations Council from 1945 to 1957. In 1953, Petch was Secretary to the JMB.

16. In 1980, Christopher was Secretary to the JMB.

17. This is a fairly recent consensus, formalised to facilitate the introduction of Curriculum 2000 A level exams. It is possible to find evidence of the principle having been applied prior to the turn of the century, although this evidence is limited. Indeed, insufficient understanding and use of this principle during the 1980s and 1990s is likely to have contributed significantly to grade inflation (Pollitt Citation1998; Newton Citation2020a, Citation2020b).

18. Nowadays, as noted earlier, the demographic composition of an exam cohort is judged in terms of a single prior attainment indicator, e.g. mean GCSE point score. The process is operationalised using prediction matrices.
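
As a rough illustration of the prediction-matrix idea, the sketch below weights a reference cohort's outcomes within each prior-attainment band by the current cohort's share of candidates in that band. This is a sketch under assumptions: the bands, reference outcomes, and cohort shares are invented for illustration and do not reproduce any published matrices.

```python
# Hypothetical prediction matrix: rows are prior-attainment bands (e.g. mean
# GCSE point-score bands); columns are cumulative percentages of the reference
# cohort achieving each grade or above within that band. All numbers invented.

reference_matrix = {
    #  band:      %A,   %B+,  %C+,  %D+,  %E+
    "high":   [45.0, 72.0, 90.0, 97.0, 99.5],
    "middle": [15.0, 38.0, 65.0, 85.0, 95.0],
    "low":    [ 3.0, 12.0, 30.0, 55.0, 78.0],
}

# Share of this year's subject entry falling in each prior-attainment band.
current_cohort_shares = {"high": 0.30, "middle": 0.50, "low": 0.20}

def predicted_outcomes(matrix, shares):
    """Weight each band's reference outcomes by the current cohort's composition."""
    n_grades = len(next(iter(matrix.values())))
    return [sum(shares[band] * matrix[band][g] for band in matrix)
            for g in range(n_grades)]

print(predicted_outcomes(reference_matrix, current_cohort_shares))
# e.g. predicted % achieving grade E or above:
#   0.30*99.5 + 0.50*95.0 + 0.20*78.0 = 92.95
```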

19. Examiner judgement still has a role, albeit a different one. It is used to check that applying the comparable outcomes principle does not lead to results that would lack credibility. If examiners for a particular subject exam were to raise credibility concerns, then this might necessitate a bespoke solution (Cresswell Citation2003).

20. Analyses by Benton (Citation2016) have helped to elucidate this matter. Whilst he acknowledged that comparable outcomes does make system improvement harder to track over time, he also explained how individual schools should still be able to demonstrate improvement; observing that grade boundaries under comparable outcomes are likely to be in the order of only a mark or so more severe than they would have been otherwise. Clearly, we must be careful not to mythologise the degree of impact of comparable outcomes.

21. See Coe (Citation2013) for a pessimistic, but not unreasonable answer.

22. This is why A levels changed from being ungraded (pass-fail) to being graded, in the early 1960s (Montgomery Citation1965).

Additional information

Funding

This work was supported by Ofqual.

Notes on contributors

Paul E. Newton

Dr Paul Newton is Research Chair at the Office of Qualifications and Examinations Regulation in England (Ofqual). Prior to joining Ofqual, Paul was Professor of Educational Assessment at the Institute of Education, University of London. Prior to that, he worked in a variety of assessment agencies, from the Associated Examining Board to Cambridge Assessment. Paul is a Fellow of the Association for Educational Assessment – Europe.
