Research Article

Evaluation of Model Fit in Structural Equation Models with Ordinal Missing Data: An Examination of the D2 Method

Pages 561-583 | Published online: 08 Oct 2019
 

Abstract

In many applied situations, the questionnaire items in measurement instruments do not approximate continuous, normally distributed variables but instead are ordinal. Properties of these instruments are often most accurately evaluated using structural equation models for ordinal data. However, most evaluations of instrument functioning need to overcome the problem of missing data. Multiple imputation is one approach to handling missing data, but no published article addresses the mechanism of pooling m tests of model fit across m imputed datasets for models with ordinal variables. This study conducts simulations to examine the feasibility of extending the D2 procedure to combining model fit information across multiply imputed datasets with ordinal variables. Our results suggest that the D2 procedure may be a reasonable procedure to use in this new context, so long as the analysis model also includes variables with little or no missing data that correlate with the incomplete ordinal variables.
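For readers unfamiliar with the pooling step, the following is a minimal sketch of how a D2 statistic can be computed from the m chi-square values obtained by fitting the same model to each imputed dataset. It assumes the formulation usually attributed to Li, Meng, Raghunathan, and Rubin (1991); the function name `pool_d2` and its arguments are illustrative and are not taken from the article or from any software package. The result is referred to an F distribution with `df` numerator and `nu` denominator degrees of freedom.

```python
import math


def pool_d2(chisq, df):
    """Pool chi-square statistics from m imputed datasets into D2.

    chisq : list of chi-square test statistics, one per imputed dataset
    df    : degrees of freedom of each chi-square test
    Returns (d2, df, nu), where d2 is compared against F(df, nu).
    Hypothetical helper for illustration only.
    """
    m = len(chisq)
    if m < 2:
        raise ValueError("D2 needs at least two imputed datasets")

    mean_chisq = sum(chisq) / m

    # Between-imputation variability is measured on the square-root scale.
    roots = [math.sqrt(w) for w in chisq]
    mean_root = sum(roots) / m
    var_root = sum((s - mean_root) ** 2 for s in roots) / (m - 1)

    # Relative increase in variance due to missing data.
    r = (1 + 1 / m) * var_root

    d2 = (mean_chisq / df - (m + 1) / (m - 1) * r) / (1 + r)

    # Denominator degrees of freedom; infinite when the statistics do not vary.
    nu = math.inf if r == 0 else df ** (-3 / m) * (m - 1) * (1 + 1 / r) ** 2
    return d2, df, nu
```

When the m statistics are identical, r is zero and D2 reduces to the common chi-square value divided by its degrees of freedom. A p-value would additionally require an F distribution function, which is left out here to keep the sketch dependency-free.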

Notes

1 Following Millsap and Tein (2004) and Millsap (2011), standardized threshold parameters refer to the threshold parameters for a continuous latent response variable that follows a standard normal distribution.

2 This correlation value was also used in the additional simulation studies of Forero and Maydeu-Olivares (2009), which examined the limited versus full information methods for the estimation of IRT graded response models (i.e., ordinal factor analysis), to represent moderate correlations between the dimensions. Forero and Maydeu-Olivares (2009) found that with seven binary indicators per factor and N = 500, the magnitude of the factor correlations (.60 versus .20) did not influence the accuracies of factor loading parameters and standard errors when using the ULS estimator to analyze complete data, though the FIML factor loading parameters and standard errors worsened as the magnitude of the factor correlations decreased.

3 The Satterthwaite approach (i.e., ULSMVS, WLSMVS) showed patterns very similar to those of the scale-shift approach (i.e., ULSMV, WLSMV) in the Shi et al. (2018) study, and hence was not examined in the current study.

4 This option is the default in Mplus, incorporates FIML to handle missing data, generated the most accurate standard errors in Maydeu-Olivares (2017), and generated goodness-of-fit statistics that performed identically to those of MLR with expected information in most situations studied in Maydeu-Olivares (2017).

5 While Maydeu-Olivares (2017) found that the goodness-of-fit statistic obtained from Mplus using the estimator option "MLMV" had the best performance when analyzing complete data on five-category items (treated as continuous), this estimation choice in Mplus does not incorporate FIML.

6 For more information on how the number of burn-in iterations for the multiple imputation step was determined in each condition, see Supplemental Material 1.

7 Operationalized as the proportion of replications with PSR dropping below 1.05 for all parameters after discarding the first half of the burn-in iterations. Relaxing this criterion to PSR < 1.10 resulted in zero additional replications with converged multiple imputation for all conditions except for four of the conditions with two response categories and asymmetric observed distributions. For those four conditions, the number of additional replications with converged multiple imputation based on the more relaxed criterion was 1, 2, 3, and 16, respectively. Given the trivial gain in the number of replications with converged multiple imputation by using a more relaxed criterion for the convergence of multiple imputation, we decided to continue to use the more conservative criterion of PSR < 1.05.

8 Defined as the proportion of replications that had a significant (p < .05) D2 statistic when a correct model was fitted to the imputed datasets, among replications that (1) had converged multiple imputation, and (2) had 20 or more imputed datasets that had the analysis model converged to a proper solution.
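The definition in Note 8 can be sketched as a small filter-then-count computation. The record fields (`mi_converged`, `n_usable`, `p_value`) and the function name are hypothetical, introduced only to illustrate the two eligibility conditions and the .05 significance threshold stated above.

```python
def empirical_rejection_rate(replications, alpha=0.05, min_usable=20):
    """Proportion of eligible replications with a significant D2 statistic.

    Each replication is a dict with hypothetical keys:
      mi_converged : bool, whether multiple imputation converged
      n_usable     : int, imputed datasets with a proper analysis-model solution
      p_value      : float, p-value of the pooled D2 statistic
    Returns None when no replication is eligible.
    """
    eligible = [
        rep for rep in replications
        if rep["mi_converged"] and rep["n_usable"] >= min_usable
    ]
    if not eligible:
        return None
    rejections = sum(1 for rep in eligible if rep["p_value"] < alpha)
    return rejections / len(eligible)
```

When the fitted model is correct, this proportion is the empirical Type 1 error rate; when the fitted model is misspecified, the same computation gives empirical power.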

9 Because with C = 5 the rejection rate of MLR-cont increased with the missing data rate and exceeded .075 for Correct Model 1 in most conditions with N = 500, we decided not to examine the statistical power of MLR-cont for Incorrect Model 1.

10 Note that many of the N = 300 conditions showed 100% or almost 100% statistical power to reject Incorrect Model 2 even with 30% missing data, especially for conditions with C = 4 or 5, so we did not present the statistical power to reject this incorrect model at N = 500. This is not surprising given the high statistical power of the DWLS and ULS robust tests of model fit to detect misspecification in the measurement part of the model in analyzing complete ordinal variables found in previous studies (e.g., Shi et al., 2018).

11 As a reference, in the complete data analysis of this simulation, the empirical Type 1 error rates of the ULSM and particularly WLSM test statistics tended to be slightly inflated, whereas those of the ULSMV and WLSMV test statistics were typically in the acceptable range. This is consistent with the findings in previous research on similar conditions (e.g., Shi et al., 2018).

12 In general, the empirical distributions (and the empirical Type 1 error rates) of D2,ULSM and D2,WLSM were similar, and the empirical distributions (and the empirical Type 1 error rates) of D2,ULSMV and D2,WLSMV were similar. Thus, we only presented the empirical distributions of D2,ULSM and D2,ULSMV. The general trend of results was similar across the N = 500 and N = 300 conditions, with the N = 300 conditions displaying more inflated Type 1 error rates. Moreover, at N = 500 the multiple imputation convergence rate was higher, and the rate of the analysis models for imputed datasets converging to a proper solution was higher (especially for conditions with C = 2). As a result, the number of imputed datasets used for the calculation of D2 statistics and the number of replications with 20 or more usable imputed datasets were both higher at N = 500 than at N = 300.

13 Other than that, the data generation model in Additional Simulation 1 was the same as the one used in the main simulation for conditions with N = 500, C = 2, and asymmetric threshold distributions.

14 Other than that, the data generation model in Additional Simulation 2 was the same as the one used in the main simulation for conditions with N = 500, C = 2, asymmetric threshold distributions, and factor loadings of .80, .80, .60, .60, and .60.

15 Generation of complete datasets in Additional Simulation 3 was otherwise the same as that in the main simulation conditions with N = 500, C = 2 and 5, and asymmetric threshold distributions.

16 Similarly, the empirical Type 1 error rates of the D2 statistics for Correct Model 2 when only the binary Y items were missing (50% of the analysis variables were incomplete binary items, with the rest being complete binary items) were more inflated than those for Correct Model 3 (50% of the analysis variables were incomplete binary items, with the rest being complete five-category items).

17 This was achieved by generating a standard normal random variable A for each X item, and if Ai (i = 1, 2, 3, 4, 5) was higher than the 90th percentile of a standard normal distribution, the corresponding Xi was set to be missing.
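The mechanism described in Note 17 can be illustrated with a short sketch: for every X value, an independent standard normal draw A is compared with the 90th percentile of the standard normal distribution, and the value is set to missing when A exceeds that cutoff. The function name `impose_missing`, the use of `None` as the missing-value marker, and the `seed` argument are assumptions made for this illustration.

```python
import random
from statistics import NormalDist


def impose_missing(rows, percentile=0.90, seed=None):
    """Set each value to None when its own N(0, 1) draw exceeds the cutoff.

    rows       : list of lists, one inner list of X item scores per respondent
    percentile : standard normal percentile used as the cutoff (0.90 in Note 17)
    seed       : optional seed for reproducibility (illustrative addition)
    """
    rng = random.Random(seed)
    cutoff = NormalDist().inv_cdf(percentile)  # about 1.2816 for 0.90
    result = []
    for row in rows:
        new_row = []
        for x in row:
            a = rng.gauss(0.0, 1.0)  # one independent draw A_i per X_i
            new_row.append(None if a > cutoff else x)
        result.append(new_row)
    return result
```

Because the draws are independent of the data, each X item is expected to be missing for roughly 10% of cases under this rule.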

18 Although at the time of this study the option to exclude imputed datasets that resulted in improper solutions of the analysis model from the calculation of the D2 statistics had not yet been implemented in the CRAN version of semTools, the development version of semTools (0.5–1.935) offers this option.

19 Only when the complete and incomplete analysis variables were all binary with asymmetric threshold distributions and missing data rate was 20% to 30%, did the D2 statistics show inflated Type 1 error rates with 50% incomplete analysis variables.

20 The few exceptions were in conditions with a mixture of medium and high loading values and 20% or 30% missing data for Correct Model 2 (100% incomplete analysis variables), which did not include the complete X items when calculating the D2 statistics but included them as missing data auxiliary variables when calculating the MLR-cont statistic.

21 The Chen et al. (2019) study on nested model comparison for SEM models with ordinal incomplete variables made the assumption of a correctly specified configural invariance model, but did not provide ways to test this important assumption.

22 This approach was originally proposed in the context of ML estimation of continuous data to examine the joint significance of a set of parameters, by using the pooled parameter estimates and pooled parameter covariance matrix to construct a multivariate Wald test. Liu et al. (2017) suggested that this may be extended to test the longitudinal measurement invariance of ordinal items (indicated by nonsignificance of changes in parameters) in the presence of missing data.

Additional information

Funding

This research was funded by the University of Houston New Faculty Research Program.
