References
- American Educational Research Association, American Psychological Association, & National Council on Measurement in Education. (2014). Standards for educational and psychological testing. Washington, DC: American Educational Research Association.
- Bates, D., Maechler, M., Bolker, B., & Walker, S. (2018). Package 'lme4'. R package version 1.1-17.
- Benjamini, Y., & Hochberg, Y. (1995). Controlling the false discovery rate: A practical and powerful approach to multiple testing. Journal of the Royal Statistical Society: Series B (Methodological), 57, 289–300.
- Boe, E. E., & May, H. M. (2002). Student task persistence in the Third International Mathematics and Science Study: A major source of achievement differences at the national, classroom, and student levels (Research Report No. 2002-TIMSS1). Philadelphia, PA: Center for Research and Evaluation in Social Policy, University of Pennsylvania.
- Bollen, K. A., & Jackman, R. W. (1990). Regression diagnostics: An expository treatment of outliers and influential cases. Modern Methods of Data Analysis, 13, 257–291.
- Brown, G. T. L., & Harris, L. R. (2016). Handbook of human and social conditions in assessment. New York, NY: Routledge.
- Debeer, D., Buchholz, J., Hartig, J., & Janssen, R. (2014). Student, school, and country differences in sustained test-taking effort in the 2009 PISA reading assessment. Journal of Educational and Behavioral Statistics, 39(6), 502–523. doi:10.3102/1076998614558485
- DeMars, C. E., Bashkov, B. M., & Socha, A. B. (2013). The role of gender in test-taking motivation under low-stakes conditions. Research and Practice in Assessment, 8, 69–82.
- Eklöf, H., Pavešič, B. J., & Grønmo, L. S. (2014). A cross-national comparison of reported effort and mathematics performance in TIMSS Advanced. Applied Measurement in Education, 27(1), 31–45. doi:10.1080/08957347.2013.853070
- Gneezy, U., List, J. A., Livingston, J. A., Sadoff, S., Qin, X., & Xu, Y. (2017). Measuring success in education: The role of effort on the test itself (NBER Working Paper No. w24004). Cambridge, MA: National Bureau of Economic Research.
- Goldhammer, F., Martens, T., Christoph, G., & Lüdtke, O. (2016). Test-taking engagement in PIAAC (OECD Education Working Papers, No. 133). Paris, France: OECD Publishing.
- Goldhammer, F., Martens, T., & Lüdtke, O. (2017). Conditioning factors of test-taking engagement in PIAAC: An exploratory IRT modelling approach considering person and item characteristics. Large-Scale Assessments in Education, 5(1), 1–25. doi:10.1186/s40536-017-0051-9
- Guo, H., Rios, J. A., Haberman, S., Liu, O. L., Wang, J., & Paek, I. (2016). A new procedure for detection of students’ rapid guessing responses using response time. Applied Measurement in Education, 29(3), 173–183. doi:10.1080/08957347.2016.1171766
- Hambleton, R. K., Merenda, P. F., & Spielberger, C. D. (2004). Adapting educational and psychological tests for cross-cultural assessment. Mahwah, NJ: Lawrence Erlbaum Associates.
- Hu, L.-T., & Bentler, P. M. (1999). Cutoff criteria for fit indexes in covariance structure analysis: Conventional criteria versus new alternatives. Structural Equation Modeling: A Multidisciplinary Journal, 6(1), 1–55. doi:10.1080/10705519909540118
- Huggins, A. C. (2014). The effect of differential item functioning in anchor items on population invariance of equating. Educational and Psychological Measurement, 74(4), 627–658. doi:10.1177/0013164413506222
- Kam, C. C. S., & Meyer, J. P. (2015). How careless responding and acquiescence response bias can influence construct dimensionality: The case of job satisfaction. Organizational Research Methods, 18(3), 512–541. doi:10.1177/1094428115571894
- Kim, S., & Moses, T. (2016). Investigating robustness of item response theory proficiency estimators to atypical response behaviors under two-stage multistage testing (ETS RR-16-22). Princeton, NJ: Educational Testing Service.
- Kong, X. J., Wise, S. L., & Bhola, D. S. (2007). Setting the response time threshold parameter to differentiate solution behavior from rapid-guessing behavior. Educational and Psychological Measurement, 67(4), 606–619. doi:10.1177/0013164406294779
- Kuznetsova, A., Brockhoff, P. B., & Christensen, R. H. B. (2018). Package 'lmerTest'. R package version 2.0.
- Liu, O. L., Frankel, L., & Roohr, K. C. (2014). Assessing critical thinking in higher education: Current state and directions for next-generation assessment. ETS Research Report Series, 2014(1), 1–23. doi:10.1002/ets2.12009
- Liu, O. L., Mao, L., Frankel, L., & Xu, J. (2016). Assessing critical thinking in higher education: The HEIghten approach and preliminary validity evidence. Assessment & Evaluation in Higher Education, 41(5), 677–694. doi:10.1080/02602938.2016.1168358
- Matsumoto, D., & van de Vijver, F. J. R. (2011). Introduction to the methodological issues associated with cross-cultural research. New York, NY: Cambridge University Press.
- Mittelhaëuser, M.-A., Béguin, A. A., & Sijtsma, K. (2015). The effect of differential motivation on IRT linking. Journal of Educational Measurement, 52(3), 339–358. doi:10.1111/jedm.12080
- Oliveri, M. E., & von Davier, M. (2011). Investigation of model fit and score scale comparability in international assessments. Psychological Test and Assessment Modeling, 53, 315–333.
- Oliveri, M. E., & von Davier, M. (2014). Toward increasing fairness in score scale calibrations employed in international large-scale assessments. International Journal of Testing, 14(1), 1–21.
- Organisation for Economic Co-operation and Development (OECD). (2013). Technical report of the Survey of Adult Skills (PIAAC). Paris, France: OECD Publishing.
- Osborne, J. W., & Blanchard, M. R. (2011). Random responding from participants is a threat to the validity of social science research results. Frontiers in Psychology, 1, 220. doi:10.3389/fpsyg.2010.00220
- Penk, C., & Schipolowski, S. (2015). Is it all about value? Bringing back the expectancy component to the assessment of test-taking motivation. Learning and Individual Differences, 42, 27–35. doi:10.1016/j.lindif.2015.08.002
- Pintrich, P. R., & Schunk, D. H. (2002). Motivation in education: Theory, research, and applications (2nd ed.). Columbus, OH: Merrill Prentice Hall.
- Pizmony-Levy, O., Harvey, J., Schmidt, W. H., Noonan, R., Engel, L., Feuer, M. J., … Torney-Purta, J. (2014). On the merits of, and myths about, international assessments. Quality Assurance in Education, 22(4), 319–338. doi:10.1108/QAE-07-2014-0035
- R Core Team. (2019). R: A language and environment for statistical computing [Computer software]. Vienna, Austria: R Foundation for Statistical Computing. Retrieved from http://www.R-project.org/
- Rios, J. A. (in press). Improving test-taking effort in low-stakes group-based educational testing: A meta-analysis of interventions. Applied Measurement in Education.
- Rios, J. A., Guo, H., Mao, L., & Liu, O. L. (2017). Evaluating the impact of noneffortful responses on aggregated scores: To filter unmotivated examinees or not? International Journal of Testing, 17(1), 74–104. doi:10.1080/15305058.2016.1231193
- Rios, J. A., & Liu, O. L. (2017). Online proctored versus unproctored low-stakes internet test administration: Is there differential test-taking behavior and performance? American Journal of Distance Education, 31, 226–241.
- Schnipke, D. L., & Scrams, D. J. (1997). Modeling item response times with a two-state mixture model: A new method of measuring speededness. Journal of Educational Measurement, 34(3), 213–232. doi:10.1111/j.1745-3984.1997.tb00516.x
- Sireci, S. G., Rios, J. A., & Powers, S. (2017). Comparing scores from tests administered in different languages. In N. J. Dorans & L. L. Cook (Eds.), Fairness in educational assessment and measurement (pp. 181–202). New York, NY: Routledge.
- van Barneveld, C. (2007). The effect of examinee motivation on test construction within an IRT framework. Applied Psychological Measurement, 31(1), 31–46. doi:10.1177/0146621606286206
- von Davier, M. (2006). Multidimensional latent trait modelling (MDLTM) [Software program]. Princeton, NJ: Educational Testing Service.
- Wigfield, A., & Eccles, J. S. (2000). Expectancy-value theory of achievement motivation. Contemporary Educational Psychology, 25(1), 68–81. doi:10.1006/ceps.1999.1015
- Wise, S. L. (2009). Strategies for managing the problem of unmotivated examinees in low-stakes testing programs. The Journal of General Education, 58(3), 152–166. doi:10.1353/jge.0.0042
- Wise, S. L. (2017). Rapid-guessing behavior: Its identification, interpretation, and implications. Educational Measurement: Issues and Practice, 36(4), 52–61. doi:10.1111/emip.12165
- Wise, S. L., & DeMars, C. E. (2005). Examinee motivation in low-stakes assessment: Problems and potential solutions. Educational Assessment, 10(1), 1–18. doi:10.1207/s15326977ea1001_1
- Wise, S. L., & DeMars, C. E. (2009). A clarification of the effects of rapid guessing on coefficient α: A note on Attali's "Reliability of speeded number-right multiple-choice tests". Applied Psychological Measurement, 33, 488–490. doi:10.1177/0146621607304655
- Wise, S. L., & DeMars, C. E. (2010). Examinee noneffort and the validity of program assessment results. Educational Assessment, 15(1), 27–41. doi:10.1080/10627191003673216
- Wise, S. L., & Kong, X. (2005). Response time effort: A new measure of examinee motivation in computer-based tests. Applied Measurement in Education, 18(2), 163–183. doi:10.1207/s15324818ame1802_2
- Wise, S. L., & Ma, L. (2012, April). Setting response time thresholds for a CAT item pool: The normative threshold method. Presented at the annual meeting of the National Council on Measurement in Education, Vancouver, Canada.
- Wise, S. L., Ma, L., Cronin, J., & Theaker, R. A. (2013, April). Student test-taking effort and the assessment of student growth in evaluating teacher effectiveness. Presented at the annual conference of the American Educational Research Association, San Francisco, CA.
- Wise, S. L., & Smith, L. F. (2016). The validity of assessment when students don't give good effort. In G. T. L. Brown & L. R. Harris (Eds.), Handbook of human and social conditions in assessment (pp. 204–220). New York, NY: Routledge.