Research Article

Correct Point Estimator and Confidence Interval for RMSEA Given Categorical Data

 

Abstract

RMSEA estimation given nonnormal continuous data is usually based on the mean-adjusted ($T_M$) or mean-variance-adjusted ($T_{MV}$) chi-square statistic, but a plain application of these statistics performs poorly. Savalei and colleagues gave a better way (the BSL method) to infer RMSEA using $T_M$ or $T_{MV}$. However, the BSL method is applicable to continuous data only. For categorical data, RMSEA inference is currently still based on a plain application of $T_M$ or $T_{MV}$, a practice that is already problematic with continuous data. In this paper, we first show that it is more meaningful to define RMSEA under unweighted least squares (ULS) than under weighted least squares (WLS) or diagonally weighted least squares (DWLS). Then, we propose a correct point estimator and confidence interval for RMSEA given categorical data and ULS. Simulation results show that our methods perform well while all the traditional methods break down.

Notes

1 Combining Equations (6) and (9), when the model is correct (i.e., all $\delta_j = 0$), the real distribution of $T_M$ is characterized by $\sum_{j=1}^{d} (\lambda_j/\bar{\lambda})\,\chi^2_1$.
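As a rough illustration of Note 1, the sketch below simulates a weighted sum of independent one-degree-of-freedom chi-square variates with weights $\lambda_j/\bar{\lambda}$. The eigenvalues, the function name `simulate_scaled_statistic`, and the number of draws are all hypothetical choices made for this example; they are not taken from the article. The point is only that such a mixture has mean $d$, like a chi-square with $d$ degrees of freedom, while its variance is $2\sum_j(\lambda_j/\bar{\lambda})^2$, which generally differs from $2d$.

```python
import numpy as np

def simulate_scaled_statistic(eigenvalues, n_draws=200_000, seed=0):
    """Draw from sum_j (lambda_j / mean(lambda)) * chi2_1 (illustrative only)."""
    rng = np.random.default_rng(seed)
    lam = np.asarray(eigenvalues, dtype=float)
    weights = lam / lam.mean()                      # lambda_j / lambda-bar
    chi2_draws = rng.chisquare(df=1, size=(n_draws, lam.size))
    return chi2_draws @ weights                     # one mixture value per draw

# Hypothetical eigenvalues; d = 4 in this example.
eigenvalues = [0.5, 1.0, 1.5, 3.0]
draws = simulate_scaled_statistic(eigenvalues)

d = len(eigenvalues)
weights = np.asarray(eigenvalues) / np.mean(eigenvalues)
theoretical_var = 2.0 * np.sum(weights ** 2)        # variance of the mixture

print("simulated mean:", draws.mean(), "vs d =", d)
print("simulated variance:", draws.var(), "vs mixture variance:", theoretical_var)
print("variance of a chi-square with d df:", 2 * d)
```

When all eigenvalues are equal, the weights are all 1 and the mixture reduces to an ordinary chi-square with $d$ degrees of freedom.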

2 Note that WLSM and WLSMV refer to the M-adjusted and MV-adjusted procedures for DWLS, not for WLS.

3 It is unclear to us how Savalei (2018) arrived at the recommendation against applying the BSL methods to categorical data. The argument in Savalei (2018) was that a correct fit function requires $\Pi^{-1}$ as the weight matrix, but the weight is not $\Pi^{-1}$ in ULS or DWLS (p. 425). However, for a scalar-valued function $F[S,\Sigma(\theta)]$ to be a legitimate fit function, only three properties are required: (a) $F[S,\Sigma(\theta)] \geq 0$ for any values of $\theta$; (b) $F[S,\Sigma(\theta)] = 0$ if and only if $S = \Sigma(\theta)$; (c) $F[S,\Sigma(\theta)]$ is twice differentiable with respect to both $S$ and $\Sigma(\theta)$ (e.g., Browne, 1984, p. 64). Both ULS and DWLS satisfy these requirements, and therefore the argument in Savalei (2018) is unconvincing. Nevertheless, our simulation results in a later section indicate that the BSL-M and BSL-MV CIs both have poor empirical coverage given categorical data, whereas the new CI we propose performs well. Now that a new CI with satisfactory coverage is available, we do not pursue the question of why the BSL CIs are inapplicable to categorical data.
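Properties (a) and (b) in Note 3 can be checked numerically for a quadratic-form discrepancy. The following is a minimal sketch, assuming the sample and model-implied statistics have been stacked into vectors; the helper `quadratic_fit`, the numeric values, and the diagonal weights are hypothetical and are not the article's implementation. ULS corresponds to an identity weight, and DWLS to a diagonal weight with positive entries.

```python
import numpy as np

def quadratic_fit(s, sigma, weight=None):
    """Quadratic-form discrepancy (s - sigma)' W (s - sigma).

    weight=None uses the identity matrix (ULS-style); a positive diagonal
    matrix gives a DWLS-style discrepancy. Illustrative helper only.
    """
    resid = np.asarray(s, dtype=float) - np.asarray(sigma, dtype=float)
    if weight is None:
        weight = np.eye(resid.size)
    return float(resid @ weight @ resid)

# Hypothetical sample statistics and model-implied values.
s = np.array([0.30, 0.25, 0.40])
sigma_model = np.array([0.28, 0.27, 0.35])

# Hypothetical positive variances used to build the DWLS-style weight.
dwls_weight = np.diag(1.0 / np.array([0.010, 0.020, 0.015]))

f_uls = quadratic_fit(s, sigma_model)                 # identity weight
f_dwls = quadratic_fit(s, sigma_model, dwls_weight)   # diagonal weight
f_exact = quadratic_fit(s, s, dwls_weight)            # S equals Sigma(theta)

print(f_uls, f_dwls, f_exact)
```

Because both weights are positive definite, the discrepancy is nonnegative and equals zero only when the residual vector is zero. Property (c) holds because the function is a polynomial in its arguments.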

4 A saturated model simply means that the number of model parameters is the same as the number of data elements. Suppose the data have thresholds $\tau_1, \tau_2, \ldots, \tau_K$ and the model has $\theta_1, \theta_2, \ldots, \theta_K$ as the parameters for thresholds; then $\tau(\theta)$ is saturated. However, this does not necessarily imply $\theta_1 = \tau_1$, $\theta_2 = \tau_2$, and so forth, because $\theta_1$ to $\theta_K$ can still freely take values in the model fitting process.

5 See https://bit.ly/2xEzeEB for the R code and simulation files.
