
Who responds inconsistently to mixed-worded scales? Differences by achievement, age group, and gender

Pages 5–31 | Received 06 Oct 2023, Accepted 06 Feb 2024, Published online: 15 Feb 2024

References

  • Arias, V. B., Garrido, L. E., Jenaro, C., Martínez-Molina, A., & Arias, B. (2020). A little garbage in, lots of garbage out: Assessing the impact of careless responding in personality survey data. Behavior Research Methods, 52(6), 2489–2505. https://doi.org/10.3758/s13428-020-01401-8
  • Baumgartner, H., Weijters, B., & Pieters, R. (2018). Misresponse to survey questions: A conceptual framework and empirical test of the effects of reversals, negations, and polar opposite core concepts. Journal of Marketing Research, 55(6), 869–883. https://doi.org/10.1177/0022243718811848
  • Bedard, K., & Dhuey, E. (2006). The persistence of early childhood maturity: International evidence of long-run age effects. The Quarterly Journal of Economics, 121(4), 1437–1472. https://doi.org/10.1162/qjec.121.4.1437
  • Bolt, D., Wang, Y. C., Meyer, R. H., & Pier, L. (2020). An IRT mixture model for rating scale confusion associated with negatively worded items in measures of social-emotional learning. Applied Measurement in Education, 33(4), 331–348. https://doi.org/10.1080/08957347.2020.1789140
  • Brevik, L. M., Olsen, R. V., & Hellekjær, G. O. (2016). The complexity of second language reading: Investigating the L1-L2 relationship. Reading in a Foreign Language, 28(2), 161–182. https://hdl.handle.net/10125/66899
  • Bulut, H. C., & Bulut, O. (2022). Item wording effects in self-report measures and reading achievement: Does removing careless respondents help? Studies in Educational Evaluation, 72, 101126. https://doi.org/10.1016/j.stueduc.2022.101126
  • Chen, J., Steinmann, I., & Braeken, J. (in press). Competing explanations for inconsistent responding to a mixed-worded self-esteem scale: Cognitive abilities or personality? Personality and Individual Differences.
  • Cole, K. L., Turner, R. C., & Gitchel, W. D. (2019). A study of polytomous IRT methods and item wording directionality effects on perceived stress items. Personality and Individual Differences, 147, 63–72. https://doi.org/10.1016/j.paid.2019.03.046
  • Desa, D., van de Vijver, F. J. R., Carstens, R., & Schulz, W. (2018). Measurement invariance in international large-scale assessments: Integrating theory and method. In T. P. Johnson, B.-E. Pennell, I. A. L. Stoop, & B. Dorer (Eds.), Advances in comparative survey methods (pp. 879–910). John Wiley & Sons, Inc. https://doi.org/10.1002/9781118884997.ch40
  • DiStefano, C., & Motl, R. W. (2006). Further investigating method effects associated with negatively worded items on self-report surveys. Structural Equation Modeling: A Multidisciplinary Journal, 13(3), 440–464. https://doi.org/10.1207/s15328007sem1303_6
  • Ebbs, D., Wry, E., Wagner, J.-P., & Netten, A. (2020). Instrument translation and layout verification for TIMSS 2019. In M. O. Martin, M. von Davier, & I. V. S. Mullis (Eds.), Methods and procedures: TIMSS 2019 technical report (pp. 5.1–5.23). TIMSS & PIRLS International Study Center, Lynch School of Education and Human Development, Boston College and International Association for the Evaluation of Educational Achievement (IEA). https://timssandpirls.bc.edu/timss2019/methods/chapter-5.html
  • Foy, P., Fishbein, B., von Davier, M., & Yin, L. (2020). Implementing the TIMSS 2019 scaling methodology. In M. O. Martin, M. von Davier, & I. V. S. Mullis (Eds.), Methods and procedures: TIMSS 2019 technical report (pp. 12.1–12.146). TIMSS & PIRLS International Study Center, Lynch School of Education and Human Development, Boston College and International Association for the Evaluation of Educational Achievement (IEA). https://timssandpirls.bc.edu/timss2019/methods/chapter-12.html
  • García-Batista, Z. E., Guerra-Peña, K., Garrido, L. E., Cantisano-Guzmán, L. M., Moretti, L., Cano-Vindel, A., Arias, V. B., & Medrano, L. A. (2021). Using constrained factor mixture analysis to validate mixed-worded psychological scales: The case of the Rosenberg self-esteem scale in the Dominican Republic. Frontiers in Psychology, 12, 636693. https://doi.org/10.3389/fpsyg.2021.636693
  • Gnambs, T., & Schroeders, U. (2020). Cognitive abilities explain wording effects in the Rosenberg self-esteem scale. Assessment, 27(2), 404–418. https://doi.org/10.1177/1073191117746503
  • Hong, M., Steedle, J. T., & Cheng, Y. (2020). Methods of detecting insufficient effort responding: Comparisons and practical recommendations. Educational and Psychological Measurement, 80(2), 312–345. https://doi.org/10.1177/0013164419865316
  • Huang, F. L. (2016). Alternatives to multilevel modeling for the analysis of clustered data. The Journal of Experimental Education, 84(1), 175–196. https://doi.org/10.1080/00220973.2014.952397
  • Kam, C. C. S., & Chan, G. H. (2018). Examination of the validity of instructed response items in identifying careless respondents. Personality and Individual Differences, 129, 83–87. https://doi.org/10.1016/j.paid.2018.03.022
  • Kam, C. C. S., & Meyer, J. P. (2015). Implications of item keying and item valence for the investigation of construct dimensionality. Multivariate Behavioral Research, 50(4), 457–469. https://doi.org/10.1080/00273171.2015.1022640
  • Kazak, A. E. (2018). Editorial: Journal article reporting standards. American Psychologist, 73(1), 1–2. https://doi.org/10.1037/amp0000263
  • Koda, K. (2007). Reading and language learning: Crosslinguistic constraints on second language reading development. Language Learning, 57(s1), 1–44. https://doi.org/10.1111/0023-8333.101997010-i1
  • LaRoche, S., Joncas, M., & Foy, P. (2020). Sample design in TIMSS 2019. In M. O. Martin, M. von Davier, & I. V. S. Mullis (Eds.), Methods and procedures: TIMSS 2019 technical report (pp. 3.1–3.33). TIMSS & PIRLS International Study Center, Lynch School of Education and Human Development, Boston College and International Association for the Evaluation of Educational Achievement (IEA). https://timssandpirls.bc.edu/timss2019/methods/chapter-3.html
  • Lenzner, T., & Menold, N. (2016). GESIS survey guidelines: Question wording (Version 2.0). GESIS Leibniz Institute for the Social Sciences. https://doi.org/10.15465/gesis-sg_en_017
  • Likert, R. (1974). The method of constructing an attitude scale. In G. M. Maranell (Ed.), Scaling: A sourcebook for behavioral scientists (pp. 233–243). Aldine Publishing.
  • Lindwall, M., Barkoukis, V., Grano, C., Lucidi, F., Raudsepp, L., Liukkonen, J., & Thøgersen-Ntoumani, C. (2012). Method effects: The problem with negatively versus positively keyed items. Journal of Personality Assessment, 94(2), 196–204. https://doi.org/10.1080/00223891.2011.645936
  • Lumley, T. (2019). mitools: Tools for multiple imputation of missing data (Version 2.4) [Computer software]. https://cran.r-project.org/web/packages/mitools/index.html
  • Lumley, T. (2023). survey: Analysis of complex survey samples (Version 4.2) [Computer software]. https://cran.r-project.org/web/packages/survey/index.html
  • Marsh, H. W. (1996). Positive and negative global self-esteem: A substantively meaningful distinction or artifactors? Journal of Personality and Social Psychology, 70(4), 810–819. https://doi.org/10.1037/0022-3514.70.4.810
  • Marsh, H. W., Abduljabbar, A. S., Abu-Hilal, M. M., Morin, A. J. S., Abdelfattah, F., Leung, K. C., Xu, M. K., Nagengast, B., & Parker, P. (2013). Factorial, convergent, and discriminant validity of TIMSS math and science motivation measures: A comparison of Arab and Anglo-Saxon countries. Journal of Educational Psychology, 105(1), 108–128. https://doi.org/10.1037/a0029907
  • Martin, M. O., von Davier, M., & Mullis, I. V. S. (Eds.). (2020). Methods and procedures: TIMSS 2019 technical report. TIMSS & PIRLS International Study Center, Lynch School of Education and Human Development, Boston College and International Association for the Evaluation of Educational Achievement (IEA).
  • Meinck, S. (2020). Sampling, weighting, and variance estimation. In H. Wagemaker (Ed.), Reliability and validity of international large-scale assessment (Vol. 10, pp. 113–129). Springer International Publishing. https://doi.org/10.1007/978-3-030-53081-5_7
  • Melnick, S. A., & Gable, R. K. (1990). The use of negative item stems. Educational Research Quarterly, 14(3), 31–36.
  • Michaelides, M. P. (2019). Negative keying effects in the factor structure of TIMSS 2011 motivation scales and associations with reading achievement. Applied Measurement in Education, 32(4), 365–378. https://doi.org/10.1080/08957347.2019.1660349
  • Mullis, I. V. S., Martin, M. O., Foy, P., & Hooper, M. (2017). PIRLS 2016 international results in reading. TIMSS & PIRLS International Study Center, Lynch School of Education, Boston College and the International Association for the Evaluation of Educational Achievement (IEA). https://pirls2016.org/wp-content/uploads/structure/CompletePDF/P16-PIRLS-International-Results-in-Reading.pdf
  • Nunnally, J. C., & Bernstein, I. H. (1994). Psychometric theory (3rd ed.). McGraw-Hill.
  • OECD. (2019). PISA 2018 results (volume II): Where all students can succeed. OECD Publishing. https://doi.org/10.1787/b5fd1b8f-en
  • Patton, J. M., Cheng, Y., Hong, M., & Diao, Q. (2019). Detection and treatment of careless responses to improve item parameter estimation. Journal of Educational and Behavioral Statistics, 44(3), 309–341. https://doi.org/10.3102/1076998618825116
  • Peng, K., & Nisbett, R. E. (1999). Culture, dialectics, and reasoning about contradiction. American Psychologist, 54(9), 741–754. https://doi.org/10.1037/0003-066X.54.9.741
  • Podsakoff, P. M., MacKenzie, S. B., Lee, J.-Y., & Podsakoff, N. P. (2003). Common method biases in behavioral research: A critical review of the literature and recommended remedies. Journal of Applied Psychology, 88(5), 879–903. https://doi.org/10.1037/0021-9010.88.5.879
  • Quilty, L. C., Oakman, J. M., & Risko, E. (2006). Correlates of the Rosenberg self-esteem scale method effects. Structural Equation Modeling: A Multidisciplinary Journal, 13(1), 99–117. https://doi.org/10.1207/s15328007sem1301_5
  • R Core Team. (2022). R: A language and environment for statistical computing [Computer software]. R Foundation for Statistical Computing. https://www.R-project.org/
  • Rohde, T. E., & Thompson, L. A. (2007). Predicting academic achievement with cognitive ability. Intelligence, 35(1), 83–92. https://doi.org/10.1016/j.intell.2006.05.004
  • Roszkowski, M. J., & Soven, M. (2010). Shifting gears: Consequences of including two negatively worded items in the middle of a positively worded questionnaire. Assessment & Evaluation in Higher Education, 35(1), 113–130. https://doi.org/10.1080/02602930802618344
  • Rubin, D. B. (1987). Multiple imputation for nonresponse in surveys. Wiley.
  • Schmitt, N., & Stults, D. M. (1985). Factors defined by negatively keyed items: The result of careless respondents? Applied Psychological Measurement, 9(4), 367–373. https://doi.org/10.1177/014662168500900405
  • Schulz, W., & Carstens, R. (2020). Questionnaire development in international large-scale assessment studies. In H. Wagemaker (Ed.), Reliability and validity of international large-scale assessment (Vol. 10, pp. 61–83). Springer International Publishing. https://doi.org/10.1007/978-3-030-53081-5_5
  • Silm, G., Pedaste, M., & Täht, K. (2020). The relationship between performance and test-taking effort when measured with self-report or time-based instruments: A meta-analytic review. Educational Research Review, 31, 100335. https://doi.org/10.1016/j.edurev.2020.100335
  • Steedle, J. T., Hong, M., & Cheng, Y. (2019). The effects of inattentive responding on construct validity evidence when measuring social–emotional learning competencies. Educational Measurement: Issues and Practice, 38(2), 101–111. https://doi.org/10.1111/emip.12256
  • Steinmann, I., & Olsen, R. V. (2022). Equal opportunities for all? Analyzing within-country variation in school effectiveness. Large-Scale Assessments in Education, 10(1), 2. https://doi.org/10.1186/s40536-022-00120-0
  • Steinmann, I., Sánchez, D., van Laar, S., & Braeken, J. (2022). The impact of inconsistent responders to mixed-worded scales on inferences in international large-scale assessments. Assessment in Education: Principles, Policy & Practice, 29(1), 5–26. https://doi.org/10.1080/0969594X.2021.2005302
  • Steinmann, I., Strietholt, R., & Braeken, J. (2022). A constrained factor mixture analysis model for consistent and inconsistent respondents to mixed-worded scales. Psychological Methods, 27(4), 667–702. https://doi.org/10.1037/met0000392
  • Swain, S. D., Weathers, D., & Niedrich, R. W. (2008). Assessing three sources of misresponse to reversed Likert items. Journal of Marketing Research, 45(1), 116–131. https://doi.org/10.1509/jmkr.45.1.116
  • TIMSS & PIRLS International Study Center. (2019). TIMSS: Trends in International Mathematics and Science Study. https://timssandpirls.bc.edu/timss-landing.html
  • Tourangeau, R., Rips, L. J., & Rasinski, K. (2000). The psychology of survey response (1st ed.). Cambridge University Press. https://doi.org/10.1017/CBO9780511819322
  • van Buuren, S. (2011). Multiple imputation of multilevel data. In J. J. Hox & J. K. Roberts (Eds.), Handbook of advanced multilevel analysis (pp. 173–196). Routledge.
  • Viechtbauer, W. (2010). Conducting meta-analyses in R with the metafor package. Journal of Statistical Software, 36(3), 1–48. https://doi.org/10.18637/jss.v036.i03
  • von Davier, M., Gonzalez, E. J., & Mislevy, R. (2009). What are plausible values and why are they useful? In M. von Davier & D. Hastedt (Eds.), IERI Monograph Series: Issues and Methodologies in Large-Scale Assessments (Vol. 2, pp. 9–36). IERI.
  • Woods, C. M. (2006). Careless responding to reverse-worded items: Implications for confirmatory factor analysis. Journal of Psychopathology and Behavioral Assessment, 28(3), 186–191. https://doi.org/10.1007/s10862-005-9004-7
  • Zeng, B., Wen, H., & Zhang, J. (2020). How does the valence of wording affect features of a scale? The method effects in the undergraduate learning burnout scale. Frontiers in Psychology, 11, 585179. https://doi.org/10.3389/fpsyg.2020.585179