
Comparing OECD PISA Reading in English to Other Languages: Identifying Potential Sources of Non-Invariance


