ABSTRACT
In structural equation modeling (SEM) studies with categorical data, researchers often use the root mean square error of approximation (RMSEA), comparative fit index (CFI), or standardized root mean squared residual (SRMR) to compare rival models. Model selection based on ΔRMSEA, ΔCFI, or ΔSRMR is meaningful because (a) finding a better model is more scientific and easier than establishing a “good” model, (b) it avoids the problems with cutoffs for fit indices, (c) one is less likely to overlook other equally substantively plausible models, and (d) information criteria (e.g., AIC, BIC) are not applicable to categorical data SEM. In this paper, we propose point estimators and confidence intervals (CIs) for ΔRMSEA, ΔCFI, and ΔSRMR under categorical data. Our methods are applicable to nonnested models and do not need a true model. Simulation results show our point estimators and CIs are all trustworthy, whereas the bias is large when estimating ΔRMSEA (ΔCFI, ΔSRMR) based on the common estimators in the current literature for RMSEA (CFI, SRMR).
Notes
1 A saturated structure for thresholds simply means the number of model parameters for the thresholds is the same as the number of thresholds in data, namely, a case where are used to explain . However, this does not dictate , , …, , because the model parameters are still free to take on different values in model fitting.
2 All the code and files for our simulation studies are accessible at https://bit.ly/38sIWKz.
3 Models D, E, and G have the same degrees of freedom (see Table 7), and therefore none of them can be nested within another. Moreover, because they do not always have the same fit (see Table 7), they are not equivalent models either. Regarding the model pair FD, dD > dF means it is only possible for Model D to be nested within Model F, but the fact that Model D can fit better than Model F rules out the possibility of nesting. The same argument applies to model pair FG. The last pair is FE. Because dE > dF, if we can find a correlation matrix to which Model E fits better than does Model F, then we can rule out that Model E is nested within Model F. To find such an example, consider the correlation matrix we created in Misfit Situation 1 (see the online supplement in Footnote 2 for the values of this matrix). Fitting Model E to this matrix yields a set of parameter values and the corresponding population model-implied matrix. Model E fits this model-implied matrix perfectly, but Model F cannot. Because dE > dF and Model E can sometimes fit better than Model F, Model E is not nested within Model F.