Abstract
Meaningful comparisons of means or relationships between latent constructs across groups require evidence that measurement is equivalent across the studied groups, a property known as measurement equivalence or invariance (ME/I). Testing usually involves evaluating a sequence of increasingly stringent models via confirmatory factor analysis, which typically assumes that the observed variables are continuous. When that assumption is not met, as is often the case with survey data, alternative methods exist that directly model the categorical nature of the data. Although well established, categorical ME/I models pose a number of complexities, and recommendations for their evaluation vary. To that end, we describe the current state of categorical ME/I and demonstrate an up-to-date method for model identification and invariance testing. In the tutorial, we illustrate a common approach to establishing ME/I via multiple-group confirmatory factor analysis using Mplus and the lavaan and semTools packages in R.
Notes
1 Within an item response theory (IRT) framework, measurement invariance is also known as an absence of differential item functioning (Hambleton & Rogers, 1989; Mellenbergh, 1994; Swaminathan & Rogers, 1990); however, we do not emphasize the IRT perspective here.
2 Although some scholars advocate for strict factorial invariance (Meredith, 1993) or equality of residual variances as a condition for comparing latent means (Deshon, 2004; Lubke & Dolan, 2003), in practice this level of invariance is rarely pursued, given that scalar invariance supports cross-group comparisons of manifest (or latent) variable means on the latent variable of interest (Hancock, 1997; Little, 1997; Thompson & Green, 2006).
3 Readers are directed to the original article for complete detail on all conditions posited by Wu and Estabrook (2016).
4 As presented in Figure 1 in Wu and Estabrook, there are six different ways that the model can be identified as a baseline model while remaining statistically equivalent. Superscript (g) denotes parameters that differ across groups. For the purposes of our tutorial, we select one path to identify the model and test for threshold invariance followed by loading invariance.
5 This finding was supported by French and Finch (2006) in the multivariate normal context.
6 We are not including all aspects of individual studies. For example, Cheung and Rensvold (2002) studied the impact of factor variances, strengths of factor loadings, and factor correlations, aspects not necessarily examined in other studies (which included other aspects).
7 We note that a large number of resources, including discussion sites and groups as well as supplemental documentation, are available for Mplus (http://www.statmodel.com/) and the lavaan package (http://lavaan.ugent.be/; https://groups.google.com/forum/#!forum/lavaan).
8 In the Appendix, we provide selected annotated output.
9 We note that the previously used standard chi-square difference test is not appropriate for categorical MG-CFA.
10 Conceptually, this is the same idea as in mean comparisons across multiple groups, where a statistical omnibus test is used to control for multiple pairwise comparisons.