References
- Ackerman, T. A. 1992. A didactic explanation of item bias, item impact, and item validity from a multidimensional perspective. Journal of Educational Measurement 29 (1):67–91.
- Andrich, D., and C. Hagquist. 2012. Real and artificial differential item functioning. Journal of Educational and Behavioral Statistics 37 (3):387–416.
- Andrich, D., and C. Hagquist. 2015. Real and artificial differential item functioning in polytomous items. Educational and Psychological Measurement 75 (2):185–207.
- Ansley, T. N., and R. A. Forsyth. 1985. An examination of the characteristics of unidimensional IRT parameter estimates derived from two-dimensional data. Applied Psychological Measurement 9 (1):37–48.
- Bolt, D., and M. Gierl. 2006. Testing features of graphical DIF: Application of a regression correction to three nonparametric statistical approaches. Journal of Educational Measurement 43:313–33.
- Bolt, D., and W. Stout. 1996. Differential item functioning: Its multidimensional model and resulting SIBTEST detection procedure. Behaviormetrika 23 (1):67–96.
- Camilli, G., and L. A. Shepard. 1994. MMSS methods for identifying biased test items. Thousand Oaks, CA: Sage Publications.
- Candell, D. L., and F. Drasgow. 1988. An iterative procedure for linking metrics and assessing item bias in item response theory. Applied Psychological Measurement 12 (3):253–60.
- Clauser, B., K. Mazor, and R. Hambleton. 1993. The effects of purification of matching criterion on the identification of DIF using the Mantel-Haenszel procedure. Applied Measurement in Education 6 (4):269–80.
- Cohen, A. S., and S. Kim. 1993. A comparison of Lord’s χ² and Raju’s area measures on detection of DIF. Applied Psychological Measurement 17 (1):39–52.
- Fidalgo, A. M., G. J. Mellenbergh, and J. Muniz. 2000. Effects of amount of DIF, test length, and purification type on robustness and power of Mantel-Haenszel procedures. Methods of Psychological Research 5 (3):43–53.
- Finch, H. 2005. The MIMIC model as a method for detecting DIF: Comparison with Mantel-Haenszel, SIBTEST, and the IRT likelihood ratio. Applied Psychological Measurement 29 (4):278–95.
- French, B., and S. Maller. 2007. Iterative purification and effect size use with logistic regression for differential functioning detection. Educational and Psychological Measurement 67 (3):373–93.
- Gierl, M. J., A. Gotzmann, and K. Boughton. 2004. Performance of SIBTEST when the percentage of DIF items is large. Applied Measurement in Education 17 (3):241–64.
- Guilera, G., J. Gomez-Benito, M. D. Hidalgo, and J. Sanchez-Meca. 2013. Type I error and statistical power of the Mantel-Haenszel procedure for detecting DIF: A meta-analysis. Psychological Methods 18 (4):553–71.
- Han, K. T. 2007. WinGen: Windows software that generates IRT parameters and item responses. Applied Psychological Measurement 31 (5):457–59.
- Hidalgo, M., M. Lopez-Martinez, J. Gomez-Benito, and G. Guilera. 2016. A comparison of discriminant logistic regression and item response theory likelihood-ratio tests for differential item functioning (IRTLRDIF) in polytomous short tests. Psicothema 28 (1):83–88.
- Holland, P. W., and D. Thayer. 1988. Differential item performance and the Mantel-Haenszel procedure. In Test validity, eds. H. Wainer and H. I. Braun, 129–45. Hillsdale, NJ: Lawrence Erlbaum Associates.
- Jiang, H., and W. Stout. 1998. Improved type I error control and reduced estimation bias for DIF detection using SIBTEST. Journal of Educational and Behavioral Statistics 23 (4):291–322.
- Jodoin, M. G., and M. J. Gierl. 2001. Evaluating type I error and power rates using an effect size measure with the logistic regression procedure for DIF detection. Applied Measurement in Education 14 (4):329–49.
- Keiffer, E. A. 2011. Group-specific effects of matching subtest contamination on the identification of differential item functioning. Unpublished doctoral dissertation, University of Arkansas, Fayetteville, AR.
- Kopf, J., A. Zeileis, and C. Strobl. 2015. A framework for anchor methods and an iterative forward approach for DIF detection. Applied Psychological Measurement 39 (2):83–103.
- Lee, H., and K. F. Geisinger. 2016. The matching criterion purification for differential item functioning analyses in a large-scale assessment. Educational and Psychological Measurement 76 (1):141–63.
- Liu, Q. 2011. Item purification in differential item functioning using generalized linear mixed models. Unpublished doctoral dissertation, Florida State University, Tallahassee, FL.
- Miller, M. D., and T. C. Oshima. 1992. Effect of sample size, number of biased items, and magnitude of bias on a two-stage item bias estimation method. Applied Psychological Measurement 16 (4):381–88.
- Narayanan, P., and H. Swaminathan. 1994. Performance of the Mantel-Haenszel and simultaneous item bias procedures for detecting differential item functioning. Applied Psychological Measurement 18 (4):315–28.
- Oshima, T. C., and M. D. Miller. 1992. Multidimensionality and item bias in item response theory. Applied Psychological Measurement 16 (3):237–48.
- Park, D.-G., and G. J. Lautenschlager. 1990. Improving IRT item bias detection with iterative linking and ability scale purification. Applied Psychological Measurement 14 (2):163–73.
- Raju, N. S., W. J. van der Linden, and P. F. Fleer. 1995. IRT-based internal measures of differential functioning of items and tests. Applied Psychological Measurement 19 (4):353–68.
- Reckase, M. D. 1985. The difficulty of test items that measure more than one ability. Applied Psychological Measurement 9 (4):401–12.
- Roussos, L., and W. Stout. 1996. A multidimensionality-based DIF analysis paradigm. Applied Psychological Measurement 20 (4):355–71.
- Shealy, R. T., and W. F. Stout. 1993. A model-based standardization approach that separates true bias/DIF from group ability differences and detects test bias/DTF as well as item bias/DIF. Psychometrika 58 (2):159–94.
- Sireci, S. G., and J. A. Rios. 2013. Decisions that make a difference in detecting differential item functioning. Educational Research and Evaluation 19 (2–3):170–87.
- Su, Y.-H., and W.-C. Wang. 2005. Efficiency of the Mantel, generalized Mantel-Haenszel, and logistic discriminant function analysis methods in detecting differential item functioning in polytomous items. Applied Measurement in Education 18 (4):313–50.
- Wang, W.-C., C.-L. Shih, and G. W. Sun. 2012. The DIF-free-then-DIF strategy for the assessment of differential item functioning. Educational and Psychological Measurement 72 (4):687–708.
- Wang, W.-C., and Y.-H. Su. 2004. Effects of average signed area between two item characteristic curves and test purification procedures on the DIF detection via the Mantel-Haenszel method. Applied Measurement in Education 17 (2):113–44.
- Zwick, R., D. T. Thayer, and J. Mazzeo. 1997. Descriptive and inferential procedures for assessing differential item functioning in polytomous items. Applied Measurement in Education 10 (4):321–44.