Abstract
The importance of sample size, although widely discussed in the literature on structural equation modeling (SEM), has not been widely recognized among applied SEM researchers. To narrow this gap, we focus on second language testing and learning studies and examine the following: (a) According to Monte Carlo analysis, is the sample size sufficient in terms of the precision and power of the parameters in a model? (b) How do the results from Monte Carlo sample size analysis compare with those from the N ≥ 100 rule and from the N:q ≥ 10 (ratio of sample size to free parameters) rule? Regarding (a), parameter bias, standard error bias, coverage, and power were overall satisfactory, suggesting that sample size for SEM models in second language testing and learning studies is generally appropriate. Regarding (b), both rules were often inconsistent with the Monte Carlo analysis, suggesting that they do not serve as guidelines for sample size. We encourage applied SEM researchers to perform Monte Carlo analyses to estimate the requisite sample size of a model.
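The four Monte Carlo criteria named in the abstract (parameter bias, standard error bias, coverage, and power) can be illustrated with a minimal sketch. It is not the procedure used in the article: it assumes a single regression slope as a stand-in for one SEM path coefficient, uses a normal-approximation critical value of 1.96, and all function and variable names (`simulate_slope`, `monte_carlo_summary`, `rules_of_thumb`) are hypothetical. The two rules of thumb are included only as simple checks on N and q.

```python
import math
import random
import statistics


def simulate_slope(n, beta, rng):
    """Generate one sample of size n from y = beta * x + e; return (estimate, standard error)."""
    x = [rng.gauss(0.0, 1.0) for _ in range(n)]
    y = [beta * xi + rng.gauss(0.0, 1.0) for xi in x]
    mean_x, mean_y = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx                        # ordinary least squares slope
    a = mean_y - b * mean_x              # intercept
    rss = sum((yi - a - b * xi) ** 2 for xi, yi in zip(x, y))
    se = math.sqrt(rss / (n - 2) / sxx)  # estimated standard error of the slope
    return b, se


def monte_carlo_summary(n, beta, reps=1000, seed=1, z_crit=1.96):
    """Summarize the four criteria over `reps` replications (beta must be nonzero)."""
    rng = random.Random(seed)
    draws = [simulate_slope(n, beta, rng) for _ in range(reps)]
    estimates = [b for b, _ in draws]
    std_errors = [se for _, se in draws]
    empirical_sd = statistics.stdev(estimates)
    # Coverage: share of replications whose interval contains the true value.
    covered = sum(1 for b, se in draws if abs(b - beta) <= z_crit * se)
    # Power: share of replications in which the estimate is significant.
    significant = sum(1 for b, se in draws if abs(b / se) > z_crit)
    return {
        "parameter_bias": (statistics.fmean(estimates) - beta) / beta,
        "se_bias": (statistics.fmean(std_errors) - empirical_sd) / empirical_sd,
        "coverage": covered / reps,
        "power": significant / reps,
    }


def rules_of_thumb(n, q):
    """Check the N >= 100 rule and the N:q >= 10 rule for q free parameters."""
    return {"n_ge_100": n >= 100, "n_to_q_ge_10": n / q >= 10}
```

Calling `monte_carlo_summary(n=200, beta=0.3)` returns the four criteria for that hypothetical sample size and effect, and `rules_of_thumb(200, 25)` shows how a model can satisfy one rule while failing the other. A full SEM analysis would replace the single slope with the entire set of model parameters.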
Acknowledgments
We would like to thank Gary J. Ockey, Koken Ozaki, and the two anonymous reviewers for their valuable comments on earlier versions of this article.
Notes
Of the 18 models we analyzed (see the Results and Discussion section), five were reported to have non-normal data: Four of the five models were analyzed by Satorra-Bentler robust maximum likelihood (ML) methods and one was analyzed by ML methods. These five models also had missing data, and the authors deleted the incomplete cases listwise. Another model containing missing data was analyzed by full information ML methods. For the remaining 12 models, it was unclear whether the data were normal and/or contained missing values. Nevertheless, the information we used in the Monte Carlo studies consists of parameter values, not the standard errors of the parameters or fit indices, and only the latter two are influenced by non-normality; our findings would therefore be robust even if those 12 models had been based on non-normal data. On the other hand, missing data affect parameter estimates (Allison, 2003), and our inability to model the pattern of missing data, if any, is a limitation of the current study.