Abstract
Yuan and Hayashi (2010) introduced 2 scatter plots for model and data diagnostics in structural equation modeling (SEM). However, generating the plots requires an in-depth understanding of their underlying technical details. This article develops and introduces an R package, semdiag, for easily drawing the 2 plots. With a model specified in EQS syntax, one needs to supply as few as 2 parameters to generate the 2 plots using the semdiag package. Two examples are provided to illustrate the use of the package. Multiple figures are used to explain the elements of data and model diagnostics. Advice on selecting proper estimation methods following the diagnostics is also given.
Notes
1. The semdiag package uses the REQS functions developed by Mair, Wu, and Bentler (2010) for running EQS (Bentler, 2008) within R. We thank Drs. Patrick Mair, Eric Wu, and Peter Bentler for allowing us to adapt the REQS functions as a part of the semdiag package.
2. The semdiag package also works with other SEM software; more detail is given in the concluding section.
3. More formal definitions of outliers using population distributions with saturated and structured models are given in Yuan and Bentler (2001) and Yuan and Hayashi (2010).
4. The likelihood ratio statistic is commonly referred to as the chi-square statistic, although its true distribution is seldom chi-square in practice.
5. Parallel to the regression literature, we do not call the Es latent variables in this article, although they are not observable in practice.
6. The data set is also built into the R package. To access the data, use data(N100) within R after loading the package.
7. Any name can be used for the EQS input file. However, the file name on Line 35 of Appendix A must match the input file name, apart from the file extension.
8. The idea proposed in Yuan and Hayashi (2010) might seem similar to jackknifing. However, the two approaches differ in basic ways. Jackknifing estimates the bias and standard error of a statistic by systematically recomputing the statistic while leaving out one or more observations at a time from the sample. The proposal in Yuan and Hayashi is to study the change in the LR statistic across a few selected clusters of observations, not to calculate its bias or standard error. Also, the number of observations in each cluster is determined by the data rather than fixed in advance as in jackknifing.
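For readers less familiar with the jackknife side of this contrast, the following is a minimal sketch of the standard delete-one jackknife in base R. It is not part of semdiag and does not implement the Yuan and Hayashi procedure; the function name `jackknife`, the helper `var_ml`, and the sample values are all hypothetical and chosen only for illustration.

```r
# Delete-one jackknife for a generic statistic.
# Illustrative only: not part of semdiag; names and data are made up.
jackknife <- function(x, stat) {
  n <- length(x)
  theta_hat <- stat(x)
  # Recompute the statistic n times, leaving out one observation each time.
  theta_loo <- vapply(seq_len(n), function(i) stat(x[-i]), numeric(1))
  theta_bar <- mean(theta_loo)
  list(
    estimate = theta_hat,
    # Jackknife estimate of bias.
    bias = (n - 1) * (theta_bar - theta_hat),
    # Jackknife estimate of the standard error.
    se = sqrt((n - 1) / n * sum((theta_loo - theta_bar)^2))
  )
}

# Hypothetical sample; the statistic is the variance with divisor n.
x <- c(2.1, 3.4, 1.9, 5.0, 4.2, 3.3, 2.8, 4.7)
var_ml <- function(v) mean((v - mean(v))^2)
jk <- jackknife(x, var_ml)

# Bias-corrected estimate.
jk$estimate - jk$bias
```

Here the number of observations left out (one) is fixed in advance and every observation is removed in turn, which is the feature that separates jackknifing from a procedure that examines only a few data-determined clusters.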