References
- Arthur, W., Jr., Bennett, W., Jr., Stanush, P. L., & McNelly, T. L. (1998). Factors that influence skill decay and retention: A quantitative review and analysis. Human Performance, 11(1), 57–101. doi:10.1207/s15327043hup1101_3
- Bates, D., Mächler, M., Bolker, B., & Walker, S. (2015). Fitting linear mixed-effects models using lme4. Journal of Statistical Software, 67(1), 1–48.
- Bond, L. (1995). Unintended consequences of performance assessment: Issues of bias and fairness. Educational Measurement: Issues and Practice, 14(4), 21–24. doi:10.1111/j.1745-3992.1995.tb00885.x
- Braun, H. I. (1988). Understanding scoring reliability: Experiments in calibrating essay readers. Journal of Educational Statistics, 13, 1–18. doi:10.3102/10769986013001001
- Congdon, P. J., & McQueen, J. (2000). The stability of rater severity in large scale assessment programs. Journal of Educational Measurement, 37(2), 163–178. doi:10.1111/j.1745-3984.2000.tb01081.x
- Deane, P. (2011). Writing assessment and cognition (Research report 11-1). Princeton, NJ: Educational Testing Service.
- Ebbinghaus, H. (1964). Memory: A contribution to experimental psychology (H. A. Ruger & C. E. Bussenius, Trans.). New York, NY: Dover. (Original work published 1885)
- Farr, M. J. (1986). The long-term retention of knowledge and skills: A cognitive and instructional perspective (No. IDA-M-205). Alexandria, VA: Institute for Defense Analyses.
- Finn, B., & Roth, A. (2020). An interview study with rSAT raters on calibration and operational scoring practices. Manuscript in preparation.
- Finn, B., Wendler, C., Pedley, K., & Arslan, B. (2018). Does the time between scoring sessions impact scoring accuracy? (ETS Research Report 18-31). Princeton, NJ: Educational Testing Service.
- Healy, A. F., Clawson, D. M., McNamara, D. S., Marmie, W. R., Schneider, V. I., Rickard, T. C., … Bourne, L. E., Jr. (1993). The long-term retention of knowledge and skills. The Psychology of Learning and Motivation, 30, 135–164.
- Hintzman, D. L., & Ludlum, G. (1980). Differential forgetting of prototypes and old instances: Simulation by an exemplar-based classification model. Memory & Cognition, 8(4), 378–382. doi:10.3758/BF03198278
- Jastrzembski, T., Gluck, K., & Gunzelmann, G. (2006). Knowledge tracing and prediction of future trainee performance. In Proceedings of the 2006 Interservice/Industry Training, Simulation, and Education Conference (pp. 1498–1508). Orlando, FL: National Training Systems Association.
- Joe, J. N., Harmes, J. C., & Hickerson, C. A. (2011). Using verbal reports to explore rater perceptual processes in scoring: A mixed methods application to oral communication assessment. Assessment in Education: Principles, Policy & Practice, 18, 239–258.
- Lumley, T., & McNamara, T. F. (1995). Rater characteristics and rater bias: Implications for training. Language Testing, 12(1), 54–71. doi:10.1177/026553229501200104
- McClellan, C. A. (2010). Constructed-response scoring: Doing it right. R&D Connections, 13. Retrieved from https://www.ets.org/Media/Research/pdf/RD_Connections13.pdf
- Murre, J. M., & Dros, J. (2015). Replication and analysis of Ebbinghaus’ forgetting curve. PLoS One, 10(7), e0120644. doi:10.1371/journal.pone.0120644
- Myford, C. M., & Wolfe, E. W. (2009). Monitoring rater performance over time: A framework for detecting differential accuracy and differential scale category use. Journal of Educational Measurement, 46(4), 371–389. doi:10.1111/j.1745-3984.2009.00088.x
- Parke, C. S., Lane, S., & Stone, C. A. (2006). Impact of a state performance assessment program in reading and writing. Educational Research and Evaluation, 12, 239–269.
- Pavlik, P. I., Jr. (2007). Understanding and applying the dynamics of test practice and study practice. Instructional Science, 35, 407–441.
- Pavlik, P. I., Jr., & Anderson, J. R. (2003). An ACT-R model of the spacing effect. In F. Detje, D. Doerner, & H. Schaub (Eds.), Proceedings of the Fifth International Conference on Cognitive Modeling (pp. 177–182). Bamberg, Germany: Universitäts-Verlag Bamberg.
- Pavlik, P. I., Jr., & Anderson, J. R. (2005). Practice and forgetting effects on vocabulary memory: An activation-based model of the spacing effect. Cognitive Science, 29, 559–586.
- Pavlik, P. I., Jr., & Anderson, J. R. (2008). Using a model to compute the optimal schedule of practice. Journal of Experimental Psychology: Applied, 14, 101–117.
- R Development Core Team. (2008). R: A language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing. ISBN 3-900051-07-0. Retrieved from http://www.R-project.org
- Ricker-Pedley, K. L. (2011). An examination of the link between rater calibration performance and subsequent scoring accuracy in Graduate Record Examinations® (GRE®) Writing (Research report 11-1). Princeton, NJ: Educational Testing Service.
- Ricker-Pedley, K. L., & Li, H. (2010). Rater calibration and subsequent scoring performance [Internal manuscript]. Princeton, NJ: Educational Testing Service.
- Schmidt, R. A., & Bjork, R. A. (1992). New conceptualizations of practice: Common principles in three paradigms suggest new concepts for training. Psychological Science, 3, 207–217.
- van Rijn, D. H., van Maanen, L., & van Woudenberg, M. (2009). Passing the test: Improving learning gains by balancing spacing and testing effects. In Proceedings of the 9th International Conference of Cognitive Modeling, Manchester, UK.
- Wendler, C., Glazer, N., & Cline, F. (2019). Examining the calibration process for GRE raters (ETS GRE-19-01 & ETS RR-19-09). Princeton, NJ: Educational Testing Service. doi:10.1002/ets2.12245
- Wigglesworth, G. (1994). Patterns of rater behaviour in the assessment of an oral interaction test. Australian Review of Applied Linguistics, 17, 77–103.
- Wilson, K. M. (1982). GMAT and GRE aptitude test performance in relation to primary language and scores on TOEFL (Research report 82-28). Princeton, NJ: Educational Testing Service.
- Zhang, M. (2013). Contrasting automated and human scoring of essays. R&D Connections, 21, 2. Princeton, NJ: Educational Testing Service.