References
- Ackerman-Piek, D., & Massing, N. (2014). Interviewer behaviour and interviewer characteristics in PIAAC Germany. Methods, Data, Analyses, 8(2), 199–222.
- Cope, B., & Kalantzis, M. (2016). Big data comes to school: Implications for learning, assessment and research. AERA Open, 2(2), 1–19.
- D’Mello, S., & Kory, J. (2015). A review and meta-analysis of multimodal affect detection systems. ACM Computing Surveys, 47(3), 1–36.
- Du Bois, J. W., & Kärkkäinen, E. (2012). Taking a stance on emotion: Affect, sequence, and intersubjectivity in dialogic interaction. Text & Talk, 32(4), 433–451.
- Duranti, A. (1997). Linguistic anthropology. Cambridge, UK: Cambridge University Press.
- Eklöf, H. (2010). Skill and will: Test‐taking motivation and assessment quality. Assessment in Education: Principles, Policy & Practice, 17(4), 345–356.
- Ercikan, K., & Pellegrino, J. W. (Eds.). (2017). Validation of score meaning for the next generation of assessments. New York, NY: Routledge.
- Ericsson, K., & Simon, H. (1993). Protocol analysis: Verbal reports as data. Cambridge, MA: MIT Press.
- Frohlich, D., Drew, P., & Monk, A. (1994). Management of repair in human-computer interaction. Human-Computer Interaction, 9(3–4), 385–425.
- Goffman, E. (1963). The neglected situation. American Anthropologist, 66, 133–136.
- Goffman, E. (1981). Forms of talk. Philadelphia, PA: University of Pennsylvania Press.
- Goffman, E. (1983a). The interaction order. American Sociological Review, 48, 1–17.
- Goffman, E. (1983b). Felicity’s condition. American Journal of Sociology, 89, 1–53.
- Goldhammer, F. (2015). Measuring ability, speed, or both? Challenges, psychometric solutions, and what can be gained from experimental control. Measurement: Interdisciplinary Research and Perspectives, 13, 133–164.
- Goldhammer, F., Martens, T., Christoph, G., & Lüdtke, O. (2016). Test-taking engagement in PIAAC (OECD Education Working Papers). Paris, France: OECD.
- Goldhammer, F., Naumann, J., & Greiff, S. (2015). More is not always better: The relation between item response and item response time in Raven’s matrices. Journal of Intelligence, 3, 21–40.
- Goodwin, C. (2007). Environmentally coupled gestures. In S. D. Duncan, J. Cassell, & E. T. Levy (Eds.), Gesture and the dynamic dimension of language (pp. 195–212). Amsterdam, The Netherlands: John Benjamins.
- Goodwin, C., & Duranti, A. (1992). Rethinking context: An introduction. In A. Duranti, & C. Goodwin (Eds.), Rethinking context: Language as an interactive phenomenon (pp. 1–42). Cambridge, UK: Cambridge University Press.
- Goodwin, C., & Goodwin, M. H. (1992). Assessments and the construction of context. In A. Duranti, & C. Goodwin (Eds.), Rethinking context: Language as an interactive phenomenon (pp. 147–189). Cambridge, UK: Cambridge University Press.
- Goodwin, M. H., Cekaite, A., & Goodwin, C. (2012). Emotion as stance. In M.-L. Sorjonen, & A. Peräkylä (Eds.), Emotion in interaction. Oxford, UK: Oxford University Press.
- Greiff, S., Niepel, C., Scherer, R., & Martin, R. (2016). Understanding students’ performance in a computer-based assessment of complex problem solving: An analysis of behavioral data from computer-generated log files. Computers in Human Behavior, 61, 36–46.
- Hubley, A. M., & Zumbo, B. D. (2017). Response processes in the context of validity: Setting the stage. In B. D. Zumbo, & A. M. Hubley (Eds.), Understanding and investigating response processes in validation research (pp. 1–13). Cham, Switzerland: Springer.
- Hudlicka, E. (2003). To feel or not to feel: The role of affect in human–computer interaction. International Journal of Human-Computer Studies, 59, 1–32.
- Jeong, H. (2014). A comparative study of scores on computer-based and paper-based tests. Behaviour & Information Technology, 33(4), 410–422.
- Jerrim, J. (2016). PISA 2012: How do results for the paper and computer tests compare? Assessment in Education: Principles, Policy & Practice. Advance online publication.
- Job, V., Dweck, C. S., & Walton, G. M. (2010). Ego depletion, is it all in your head? Implicit theories about willpower affect self-regulation. Psychological Science, 21(11), 1686–1693.
- Kane, M. T. (2013). Validating the interpretations and uses of test scores. Journal of Educational Measurement, 50, 1–73.
- Kane, M. T. (2016). Explicating validity. Assessment in Education: Principles, Policy & Practice, 23(2), 198–211.
- Kane, M. T., & Mislevy, R. (2017). Validating score interpretations based on response processes. In K. Ercikan, & J. W. Pellegrino (Eds.), Validation of score meaning for the next generation of assessments (pp. 11–24). New York, NY: Routledge.
- Kendon, A. (2007). On the origins of modern gesture studies. In S. D. Duncan, J. Cassell, & E. T. Levy (Eds.), Gesture and the dynamic dimension of language (pp. 13–28). Amsterdam, The Netherlands: John Benjamins.
- Latour, B. (1992). Where are the missing masses? The sociology of a few mundane artefacts. In W. E. Bijker, & J. Law (Eds.), Shaping technology/building society: Studies in sociotechnical change (pp. 225–258). Cambridge, MA: MIT Press.
- Latour, B. (1999). Pandora’s hope: Essays on the reality of science studies. Cambridge, MA: Harvard University Press.
- Lee, H., & Haberman, S. J. (2016). Investigating test-taking behaviours using timing and process data. International Journal of Testing, 16(3), 240–267.
- Li, Z., Banerjee, J., & Zumbo, B. D. (2017). Response time data as validity evidence: Has it lived up to its promise and, if not, what would it take to do so? In B. D. Zumbo, & A. M. Hubley (Eds.), Understanding and investigating response processes in validation research (pp. 159–178). Cham, Switzerland: Springer.
- Maddox, B. (2014). Globalising assessment: An ethnography of literacy assessment, camels and fast food in the Mongolian Gobi. Comparative Education, 50, 474–489.
- Maddox, B. (2015). The neglected situation: Assessment performance and interaction in context. Assessment in Education: Principles, Policy & Practice, 22(4), 427–443.
- Maddox, B., Zumbo, B. D., Tay-Lim, B. S.-H., & Qu, D. (2015). An anthropologist among the psychometricians: Assessment events, ethnography and DIF in the Mongolian Gobi. International Journal of Testing, 14(2), 291–309.
- Maddox, B., & Zumbo, B. D. (2017). Observing testing situations: Validation as jazz. In B. D. Zumbo, & A. M. Hubley (Eds.), Understanding and investigating response processes in validation research (pp. 179–192). Cham, Switzerland: Springer.
- McNamara, T. (1997). “Interaction” in second language performance assessment: Whose performance? Applied Linguistics, 18, 446–466.
- McNamara, T., & Roever, C. (2006). Language testing: The social dimension. Malden, MA: Blackwell Publishing.
- McNeill, D. (1985). So you think gestures are non-verbal? Psychological Review, 92, 350–371.
- Messick, S. (1989). Validity. In R. L. Linn (Ed.), Educational measurement (3rd ed., pp. 13–103). New York, NY: American Educational Council and Macmillan.
- Monkaresi, H., Bosch, N., Calvo, R., & D’Mello, S. K. (2016). Automated detection of engagement using video-based estimation of facial expressions and heart rate. IEEE Transactions on Affective Computing, 8(1), 15–28.
- Newton, P. (2016). Macro- and micro-validation: Beyond the ‘five sources’ framework for classifying validation evidence and analysis. Practical Assessment, Research and Evaluation, 21(12).
- OECD. (2011). Interview procedures manual: PIAAC main study, 28 February 2011. Paris, France: OECD.
- OECD. (2016a). Skills matter: Further results from the Survey of Adult Skills. Paris, France: OECD.
- OECD. (2016b). Technical report of the Survey of Adult Skills (PIAAC) (2nd ed.). Paris, France: OECD.
- Orange, A., Gorin, J., Jia, Y., & Kerr, D. (2017). Collecting, analysing, and interpreting response time, eye tracking and log data. In K. Ercikan, & J. W. Pellegrino (Eds.), Validation of score meaning for the next generation of assessments (pp. 39–51). New York, NY: Routledge.
- Pepper, D., Hodgen, J., Lamesoo, K., Kõiv, P., & Tolboom, J. (2016). Think aloud: Using cognitive interviewing to validate the PISA assessment of student self-efficacy in mathematics. International Journal of Research & Method in Education, 15–28.
- Russell, M., Goldberg, A., & O’Connor, K. (2003). Computer-based testing and validity: A look back into the future. Assessment in Education: Principles, Policy & Practice, 10(3), 279–293.
- Sacks, H., Schegloff, E. A., & Jefferson, G. (1974). A simplest systematics for the organization of turn-taking for conversation. Language, 50(4), 696–735.
- Schegloff, E. A., & Sacks, H. (1973). Opening up closings. Semiotica, 8, 289–327.
- Schegloff, E. A. (1988). Goffman and the analysis of conversation. In P. Drew, & A. Wootton (Eds.), Erving goffman: Exploring the interaction order (pp. 89–135). Cambridge, UK: Polity Press.
- Sellar, S. (2014). A feel for numbers: Affect, data and education policy. Critical Studies in Education, 56(1), 131–146.
- Shear, B., & Zumbo, B. D. (2014). What counts as evidence: A review of validity studies in educational and psychological measurement. In B. D. Zumbo, & E. K. H. Chan (Eds.), Validity and validation in social, behavioural, and health sciences (pp. 91–112). Cham, Switzerland: Springer.
- Shohamy, E. (1993). The exercise of power and control in the rhetorics of testing. In A. Huhta, V. Kohonen, L. Kurki-Suonio, & S. Luoma (Eds.), Current developments and alternatives in language assessment (pp. 23–38). Tampere, Finland: Universities of Tampere and Jyväskylä.
- Stone, J., & Zumbo, B. D. (2016). Validity as a pragmatist project: A global concern with local application. In V. Aryadoust, & J. Fox (Eds.), Trends in language assessment research and practice (pp. 555–573). Newcastle, UK: Cambridge Scholars Publishing.
- Williamson, B. (2016). Digital education governance: Data visualisation, predictive analytics and ‘real-time’ policy instruments. Journal of Education Policy, 31(2), 123–144.
- Wise, S. (2006). An investigation of the differential effort received by items on a low-stakes computer-based test. Applied Measurement in Education, 19, 95–114.
- Zumbo, B. D. (2007). Three generations of differential item functioning (DIF) analyses: Considering where it has been, where it is now, and where it is going. Language Assessment Quarterly, 4, 223–233.
- Zumbo, B. D. (2009). Validity as contextualised and pragmatic explanation, and its implications for validation practice. In R. Lissitz (Ed.), The concept of validity: Revisions, new directions and applications (pp. 65–82). Charlotte, NC: Information Age Publishing.
- Zumbo, B. D., & Chan, E. K. H. (Eds.). (2014). Validity and validation in social, behavioural and health sciences. Cham, Switzerland: Springer.
- Zumbo, B. D., & Hubley, A. M. (2017). Bringing consequences and side effects of testing and assessment to the foreground. Assessment in Education: Principles, Policy & Practice, 23(2), 299–303.
- Zumbo, B. D., & Hubley, A. M. (Eds.). (2017). Understanding and investigating response processes in validation research. Springer.