Educational Psychology
An International Journal of Experimental Educational Psychology
Volume 36, 2016 - Issue 2

Exploring plausible causes of differential item functioning in the PISA science assessment: language, curriculum or culture

Pages 378-390 | Received 01 Feb 2014, Accepted 10 Jun 2014, Published online: 14 Aug 2014

References

  • Adams, R. J., Wilson, M., & Wang, W.-C. (1997). The multidimensional random coefficient multinomial logit model. Applied Psychological Measurement, 21(1), 1–23. doi:10.1177/0146621697211001
  • Afonso, A., & St. Aubyn, M. (2006). Cross-country efficiency of secondary education provision: A semi-parametric analysis with non-discretionary inputs. Economic Modelling, 23, 476–491. doi:10.1016/j.econmod.2006.02.003
  • Allalouf, A. (2000). Retaining translated verbal reasoning items by revising DIF items. Paper presented at the AERA, New Orleans, LA.
  • Allalouf, A., Hambleton, R. K., & Sireci, S. G. (1999). Identifying the causes of DIF in translated verbal items. Journal of Educational Measurement, 36, 185–198. doi:10.1111/jedm.1999.36.issue-3
  • Berliner, D. C. (1993). International comparisons of student achievement: A false guide for reform. National Forum, XXII, 25–29.
  • Buckley, J. (2009). Cross-national response styles in international educational assessments: Evidence from PISA 2006. Paper presented at the NCES PISA Research Conference, Washington, DC.
  • Chen, C., Lee, S., & Stevenson, H. W. (1995). Response style and cross-cultural comparisons of rating scales among East Asian and North American students. Psychological Science, 6, 170–175. doi:10.1111/j.1467-9280.1995.tb00327.x
  • Cheung, S. K. (1996). Reliability and factor structure of the Chinese version of the depression self-rating scale. Educational and Psychological Measurement, 56, 142–154. doi:10.1177/0013164496056001011
  • Clauser, B. E., & Mazor, K. M. (1998). Using statistical procedures to identify differential item functioning test items. Educational Measurement: Issues and Practice, 17, 31–44.
  • Cohen, A. S., & Kim, S.-H. (1993). A comparison of Lord’s chi square and Raju’s area measures in detection of DIF. Applied Psychological Measurement, 17, 39–52. doi:10.1177/014662169301700109
  • Einarsdóttir, S., & Rounds, J. (2009). Gender bias and construct validity in vocational interest measurement: Differential item functioning in the Strong Interest Inventory. Journal of Vocational Behavior, 74, 295–307. doi:10.1016/j.jvb.2009.01.003
  • Jenkins, S. P., Micklewright, J., & Schnepf, S. V. (2006). Social segregation in secondary schools: How does England compare with other countries? Paper presented at the IZA Discussion, Bonn, Germany.
  • Ercikan, K. (1998). Translation effects in international assessments. International Journal of Educational Research, 29, 543–553. doi:10.1016/S0883-0355(98)00047-0
  • Ercikan, K., & Koh, K. (2005). Examining the construct comparability of the English and French versions of TIMSS. International Journal of Testing, 5, 23–35. doi:10.1207/s15327574ijt0501_3
  • Fuchs, T., & Wößmann, L. (2006). What accounts for international differences in student performance? A re-examination using PISA data. Empirical Economics, 32, 433–464.
  • Grisay, A. (2003). Translation procedures in OECD/PISA 2000 international assessment. Language Testing, 20, 225–240. doi:10.1191/0265532203lt254oa
  • Grisay, A., & Monseur, C. (2007). Measuring the equivalence of item difficulty in the various versions of an international test. Studies in Educational Evaluation, 33, 69–86. doi:10.1016/j.stueduc.2007.01.006
  • Hambleton, R. K., & Kanjee, A. (1995). Increasing the validity of cross-cultural assessments: Use of improved methods for test adaptations. European Journal of Psychological Assessment, 11, 147–157. doi:10.1027/1015-5759.11.3.147
  • Hambleton, R. K., & Patsula, L. (1999). Increasing the validity of adapted tests: Myths to be avoided and guidelines for improving test adaptation practices. Journal of Applied Testing Technology, 1, 1–12.
  • Hambleton, R. K., & Rogers, H. J. (1989). Detecting potentially biased test items: Comparison of IRT area and Mantel-Haenszel methods. Applied Measurement in Education, 2, 313–334. doi:10.1207/s15324818ame0204_4
  • Holland, P. W., & Wainer, H. (1993). Differential item functioning. Hillsdale, NJ: Lawrence Erlbaum.
  • Johnson, D. H. (1999). The insignificance of statistical significance testing. The Journal of Wildlife Management, 63, 763–772. doi:10.2307/3802789
  • Kemp, A. C. (2000). Science educator’s views on the goal of scientific literacy for all: An interpretative review of the literature. Paper presented at the Annual Meeting of the National Association for Research in Science Teaching, New Orleans, LA.
  • Linn, R. L., & Baker, E. L. (1995). What do international assessments imply for world-class standards? Educational Evaluation and Policy Analysis, 17, 405–418. doi:10.3102/01623737017004405
  • Meulders, M., & Xie, Y. (2004). Person-by-item predictions. In P. De Boeck & M. Wilson (Eds.), Explanatory item response models (pp. 213–240). New York, NY: Springer. doi:10.1007/978-1-4757-3990-9
  • OECD. (2006). Assessing scientific, reading and mathematical literacy: A framework for PISA 2006. Paris: Author.
  • OECD. (2007). PISA 2006: Science competencies for tomorrow’s world (Vol. 1). Paris: Author.
  • OECD. (2009). PISA 2006 technical report. Paris: Author.
  • OECD. (2013). About PISA. Retrieved January, 2014, from http://www.oecd.org/pisa/aboutpisa/
  • Paek, I., & Wilson, M. (2011). Formulating the Rasch differential item functioning model under the marginal maximum likelihood estimation context and its comparison with Mantel-Haenszel procedure in short test and small sample conditions. Educational and Psychological Measurement, 71, 1023–1046. doi:10.1177/0013164411400734
  • Penfield, R. D. (2007). An approach for categorizing DIF in polytomous items. Applied Measurement in Education, 20, 335–355. doi:10.1080/08957340701431435
  • Price, L. R., & Oshima, T. C. (1998). Differential item functioning and language translation: A cross-national study with a test developed for certification. Paper presented at the AERA, San Diego, CA.
  • Schwab, C. J. (2007). What can we learn from PISA? Investigating PISA’s approach to scientific literacy (PhD). University of California, Berkeley, CA.
  • Shealy, R., & Stout, W. (1993). An item response theory model for test bias and differential item functioning. In P. W. Holland & H. Wainer (Eds.), Differential item functioning (pp. 197–239). Hillsdale, NJ: Lawrence Erlbaum.
  • Stark, S., Chernyshenko, O. S., & Drasgow, F. (2006). Detecting differential item functioning with confirmatory factor analysis and item response theory: Toward a unified strategy. Journal of Applied Psychology, 91, 1292–1306. doi:10.1037/0021-9010.91.6.1292
  • Swaminathan, H., & Rogers, H. J. (1990). Detecting differential item functioning using logistic regression procedures. Journal of Educational Measurement, 27, 361–370. doi:10.1111/jedm.1990.27.issue-4
  • The Hong Kong Education and Schooling System Explained. (2005). Retrieved 2010, from http://www.tuition.com.hk/education-system.htm
  • Van De Vijver, F. J. R., & Poortinga, Y. H. (1997). Towards an integrated analysis of bias in cross-cultural assessment. European Journal of Psychological Assessment, 13, 29–37. doi:10.1027/1015-5759.13.1.29
  • Walker, C. M., Zhang, B., & Surber, J. (2008). Using a multidimensional differential item functioning framework to determine if reading ability affects student performance in mathematics. Applied Measurement in Education, 21, 162–181. doi:10.1080/08957340801926201
  • Wang, W.-C. (2008). Assessment of differential item functioning. Journal of Applied Measurement, 9, 387–408.
  • Westbury, I. (1993). American and Japanese achievement … again. Educational Researcher, 22, 21–25.
  • Wilson, M. (2005). Constructing measures: An item response modeling approach. Mahwah, NJ: Lawrence Erlbaum Associates.
  • Wu, M., Adams, R. J., & Wilson, M. (1998). ConQuest. Hawthorn: ACER Press.
  • Xie, Y., & Wilson, M. (2008). Investigating DIF and extensions using an LLTM approach and also an individual differences approach: An international testing context. Psychology Science Quarterly, 50, 403–416.
  • Yildirim, H. H., & Berberoğlu, G. (2009). Judgmental and statistical DIF analyses of the PISA-2003 mathematics literacy items. International Journal of Testing, 9, 108–121. doi:10.1080/15305050902880736
