References
- Baird , J.-A. , Beguin , A. , Black , P. , Pollitt , A. and Stanley , G. 2012 . “ The Reliability Programme Final Report of the Technical Advisory Group ” . In Ofqual’s Reliability Compendium , Edited by: Opposs , D. and He , Q. 771 – 838 . Coventry : Office of Examinations and Qualifications Regulation .
- Baker , E. L. , Ayres , P. , O’Neil , H. F. , Choi , K. , Sawyer , W. , Sylvester , R. M. and Carroll , B. 2008 . KS3 English Test Marker Study in Australia. Final report to the National Assessment Agency of England , London : National Assessment Agency .
- Bennett , R. E. , Gottesman , R. L. , Rock , D. A. and Cerullo , F. 1993 . Influence of Behaviour Perceptions and Gender on Teachers’ Judgements of Students’ Academic Skill . Journal of Educational Psychology , 85 : 347 – 356 .
- Bew, P. 2011. Independent Review of Key Stage 2 testing, assessment and accountability. Final Report to the British Government Department for Education.
- Black , P. , Harrison , C. , Hodgen , J. , Marshall , B. and Serret , N. 2010 . Validity in Teachers’ Summative Assessments . Assessment in Education , 17 : 215 – 232 .
- Black , P. , Harrison , C. , Hodgen , J. , Marshall , B. and Serret , N. 2011 . Can Teachers’ Summative Assessments Produce Dependable Results and Enhance Classroom Learning? . Assessment in Education , 18 : 451 – 469 .
- Brennan , R. L. 2001 . Generalizability Theory , New York , NY : Springer-Verlag .
- Burgess, S., and E. Greaves. 2009. Test Scores, Subjective Assessment and Stereotyping of Ethnic Minorities. Working paper 09/221, University of Bristol, Centre for Market and Public Organization.
- Cardinet , J. , Johnson , S. and Pini , G.-R. 2010 . Applying Generalizability Theory using EduG , New York , NY : Routledge .
- Cumming , J. J. and Maxwell , G. S. 2004 . Assessment in Australian Schools: Current Practice and Trends . Assessment in Education , 11 : 94 – 108 .
- Daugherty , R. 2007 . National Curriculum Assessment in Wales: Evidence-informed Policy? . Welsh Journal of Education , 14 : 62 – 77 .
- Daugherty, R. 2011. “Designing and Implementing a Teacher-based Assessment System: Where is the infrastructure?” Paper presented at the Oxford University Centre for Educational Assessment seminar Teachers’ judgments within systems of summative assessment: strategies for enhancing consistency, Oxford, June.
- Dhillon , D. 2005 . Teachers’ Estimates of Candidates’ Grades. Curriculum 2000 Advanced Level Qualifications . British Educational Research Journal , 31 : 69 – 88 .
- Estyn . 2010 . Evaluation of the Arrangements to Assure the Consistency of Teacher Assessment in the Core Subjects at Key Stage 2 and Key Stage 3 , Cardiff : Her Majesty’s Inspectorate for Education and Training in Wales .
- Gustafsson, J.-E., and G. Erickson. 2011. “To Trust or Not to Trust? Contrasting Findings from Teachers’ Assessments.” Paper presented at the annual conference of the Association for Educational Assessment – Europe, Belfast, Northern Ireland, November.
- Harlen , W. 2004a . A Systematic Review of the Evidence of the Impact on Students, Teachers and the Curriculum of the Process of Using Assessment by Teachers for Summative Purposes , London : EPPI-Centre, Social Science Research Unit, Institute of Education, University of London .
- Harlen , W. 2004b . A Systematic Review of the Evidence of the Reliability and Validity of Assessment by Teachers for Summative Purposes , London : EPPI-Centre, Social Science Research Unit, Institute of Education, University of London .
- Harlen , W. 2005 . Trusting Teachers’ Judgements: Research Evidence of the Reliability and Validity of Teachers’ Assessment Used for Summative Purposes . Research Papers in Education , 20 : 245 – 270 .
- Harlen , W. 2007 . Assessment of Learning , London : Sage .
- Harlen , W. and Deakin Crick , R. 2002 . A Systematic Review of the Impact of Summative Assessment and Tests on Students’ Motivation for Learning , London : EPPI-Centre, Social Science Research Unit, Institute of Education .
- Harlen , W. and Deakin Crick , R. 2003 . Testing and Motivation to Learn . Assessment in Education , 10 : 170 – 207 .
- Hauser-Cram , P. , Sirin , S. R. and Stipek , D. J. 2003 . When Teachers’ and Parents’ Values Differ: Teacher Ratings of Academic Competence in Children from Low-income Families . Journal of Educational Psychology , 95 : 813 – 820 .
- Hayward , E. L. 2007 . Curriculum, Pedagogies and Assessment in Scotland: The Quest for Social Justice. ‘Ah kent yir faither’ . Assessment in Education , 14 : 251 – 268 .
- Hutchinson , C. and Hayward , L. 2005 . The Journey so Far: Assessment for Learning in Scotland . The Curriculum Journal , 16 : 225 – 248 .
- Hutchison , D. and Benton , T. 2010 . Parallel Universes and Parallel Measures: Estimating the Reliability of Test Results , Coventry : Office of Qualifications and Examinations Regulation .
- Johnson, S. 2010. The Reliability of Writing in the 2009 Survey. Internal report produced for the Scottish Government.
- Johnson , S. 2011 . Assessing Learning in the Primary Classroom , London : Routledge .
- Johnson , S. 2012 . “ A Focus on Teacher Assessment Reliability in GCSE and GCE ” . In Ofqual’s Reliability Compendium , Edited by: Opposs , D. and He , Q. 365 – 416 . Coventry : Office of Qualifications and Examinations Regulation .
- Johnson, S., and L. Munro. 2008. “Teacher Judgements and Test Results: Should Teachers and Tests Agree?” Paper presented at the Annual Conference of the Association of Educational Assessment – Europe, Hissar, Bulgaria, November.
- Lafontaine , D. and Monseur , C. 2009 . Les évaluations des performances en mathématiques sont-elles influencées par le sexe de l’élève? . Mesure et Evaluation en Education , 32 : 71 – 98 .
- MacCann , R. G. and Stanley , G. 2010 . Classification Consistency When Scores are Converted to Grades: Examination Marks Versus Moderated School Assessments . Assessment in Education , 17 : 255 – 272 .
- Martinez , J. F. , Stecher , B. and Borko , H. 2009 . Classroom Assessment Practices, Teacher Judgments, and Student Achievement in Mathematics: Evidence from the ECLS . Educational Assessment , 14 : 78 – 102 .
- Maxwell, G. 2006. “Quality Management of School-based Assessments: Moderation of Teacher Judgements.” Paper presented at the 32nd IAEA Conference, Singapore, May.
- Meadows , M. and Billington , L. 2005 . A Review of the Literature on Marking Reliability , London : National Assessment Agency .
- Morgan , C. and Watson , A. 2002 . The Interpretive Nature of Teachers’ Assessment of Students’ Mathematics: Issues for Equity . Journal of Research in Mathematics Education , 33 : 78 – 110 .
- Murphy , D. J. , Bruce , D. A. , Mercer , S. W. and Eva , K. W. 2009 . The Reliability of Workplace-based Assessment in Postgraduate Medical Education and Training: A National Evaluation in General Practice in the United Kingdom . Advances in Health Sciences Education , 14 : 219 – 232 .
- Newton, P., and M. Meadows. 2011. “Special Issue: Marking Quality Within Test and Examination Systems.” Assessment in Education, 18: 213–216.
- Opposs , D. and He , Q. , eds. 2012 . Ofqual’s Reliability Compendium , Coventry : Office of Qualifications and Examinations Regulation .
- QCA . 2006 . A Review of GCSE Coursework , London : Qualifications and Curriculum Authority .
- QCDA . 2009 . Changes to GCSEs and the Introduction of Controlled Assessment for GCSEs , London : Qualifications and Curriculum Development Agency .
- QSA . 2010 . Moderation Handbook for Authority Subjects , Brisbane : Queensland Studies Authority .
- QSA. 2011. Random Sampling Project. 2011 Report on Random Sampling of Assessment in Authority subjects. Brisbane: Queensland Studies Authority.
- Ready , D. D. and Wright , D. L. 2011 . Accuracy and Inaccuracy in Teachers’ Perceptions of Young Children’s Cognitive Abilities: The Role of Child Background and Classroom Context . American Educational Research Journal , 48 : 335 – 360 .
- Reeves , D. J. , Boyle , W. F. and Christie , T. 2001 . The Relationship Between Teacher Assessments and Pupil Attainments in Standard Test Tasks at Key Stage 2, 1996–98 . British Educational Research Journal , 27 : 141 – 160 .
- Robinson , C. 2007 . “ Awarding Examination Grades: Current Processes ” . In Techniques for Monitoring the Comparability of Examination Standards , Edited by: Newton , P. , Baird , J.-A. , Goldstein , H. , Patrick , H. and Tymms , P. 97 – 123 . London : Qualifications and Curriculum Authority .
- Schoonen , R. 2005 . Generalizability of Writing Scores: An Application of Structural Equation Modelling . Language Testing , 22 : 1 – 30 .
- Shavelson , R. J. , Baxter , G. P. and Gao , X. 1993 . Sampling Variability of Performance Assessments . Journal of Educational Measurement , 30 : 215 – 232 .
- SSA. 2006. Scottish Survey of Achievement. 2005 English language and Core Skills – Practitioner’s Report. Edinburgh: Scottish Government.
- Stanley, G., R. MacCann, J. Gardner, L. Reynolds, and I. Wild. 2009. Review of Teacher Assessment: Evidence of What Works Best and Issues for Development. Oxford: University of Oxford Centre for Educational Assessment.
- Taylor, M. 1992. The Reliability of Judgements Made by Coursework Assessors. Associated Examining Board internal report.
- Thomas , S. , Madaus , G. E. , Raczek , A. E. and Smees , R. 1998 . Comparing Teacher Assessment and Standard Task Results in England: The Relationship Between Pupil Characteristics and Attainment . Assessment in Education , 5 : 213 – 246 .
- van Rijn , P. W. , Béguin , A. A. and Verstralen , H. H. F. M. 2012 . Educational Measurement Issues and Implications of High Stakes Decisions Making in Final Examinations in Secondary Education in the Netherlands . Assessment in Education , 19 : 117 – 136 .
- Wikstrom , C. 2006 . Education and Assessment in Sweden . Assessment in Education , 13 : 113 – 128 .
- Wiliam , D. 2001 . Validity, Reliability and all that Jazz . Education , 3–13 ( 29 ) : 17 – 21 .
- Wiliam , D. 2003 . National Curriculum Assessment: How to Make it Better . Research Papers in Education , 18 : 129 – 136 .
- Wilmut , J. 2005 . Experiences of Summative Teacher Assessment in the UK , London : Qualifications and Curriculum Authority .
- Wyatt-Smith , C. and Castleton , G. 2005 . Examining How Teachers Judge Student Writing: An Australian Case Study . Journal of Curriculum Studies , 37 : 131 – 154 .
- Wyatt-Smith , C. , Klenowski , V. and Gunn , S. 2010 . The Centrality of Teachers’ Judgement Practice in Assessment: A Study of Standards in Moderation . Assessment in Education , 17 : 59 – 75 .