Why Full, Partial, or Approximate Measurement Invariance Are Not a Prerequisite for Meaningful and Valid Group Comparisons

Pages 859-870 | Received 18 Nov 2022, Accepted 12 Mar 2023, Published online: 03 May 2023
