Abstract
Data in social and behavioral sciences typically contain measurement errors and do not have predefined metrics. Structural equation modeling (SEM) is widely used for the analysis of such data, where the scales of the manifest and latent variables are often subjective. This article studies how the model, parameter estimates, their standard errors (SEs), and the corresponding z-statistics are affected by the scales of the manifest and latent variables. Analytical and empirical results show that (1) the normal-distribution-based likelihood ratio statistic is scale-invariant with respect to scale changes of manifest and latent variables as well as to anchor changes of latent variables; (2) the normal-distribution-based maximum likelihood (NML) parameter estimates are scale-equivariant with respect to scale changes of manifest and latent variables as well as to anchor changes of latent variables; (3) the SEs following the NML method are parallel-scale-equivariant with respect to scale changes of the manifest and latent variables; and (4) the z-statistics are scale-invariant with respect to scale changes of the manifest and latent variables. However, only (1) and (2) hold if latent variables are rescaled by changing anchors. Nevertheless, parameters that are not directly related to latent variables with changing anchors are still scale-equivariant, and their z-statistics are still scale-invariant. The results are expected to advance understanding of SEM analysis and to facilitate the interpretation and comparison of results across studies, as in meta-analysis.
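As a rough illustration of properties (2) to (4) for manifest variables only, the sketch below fits a simple regression, which can be viewed as an SEM without latent variables, before and after rescaling the predictor. It is a minimal sketch under stated assumptions rather than the article's procedure: the function name `simple_regression`, the simulated data, and the rescaling factor `c` are all hypothetical, and anchor changes of latent variables are not covered.

```python
import math
import random


def simple_regression(x, y):
    """Return (slope, se, z) for the least-squares fit y = a + b*x + error."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    # ML-type error variance (divides by n); the slope SE follows from it.
    se = math.sqrt(rss / n / sxx)
    return slope, se, slope / se


random.seed(1)  # arbitrary seed, used only for reproducibility
x = [random.gauss(0, 1) for _ in range(200)]
y = [0.5 * xi + random.gauss(0, 1) for xi in x]

c = 10.0  # hypothetical rescaling factor, e.g. a change of units for x
b, se, z = simple_regression(x, y)
b_c, se_c, z_c = simple_regression([c * xi for xi in x], y)

print(b_c * c, b)    # the slope is rescaled by 1/c
print(se_c * c, se)  # the SE is rescaled by the same factor
print(z_c, z)        # the z-statistic is unchanged
```

In this special case the estimate and its SE change by the same factor when the predictor is rescaled, so their ratio, the z-statistic, does not change.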
Notes
1 A path coefficient is directly related to a variable if the arrow representing the path is from the variable or pointing to the variable. Otherwise, it is not directly related.
2 This is because the variance of an endogenous latent variable is not a parameter. Instead, it is a function of model parameters and is subject to prediction (see Bentler, 2006, p. 25).
3 Because […] and […] were used for other terminologies, we use […] and […] instead of […] and […] for measurement errors. We also use […] and […] instead of […] and […] for covariance matrices of measurement errors.
4 Note that the formulas of Dijkstra (1990) contain a typo: the transformations on error variances and on covariances between x and y need to be switched.
5 Note that Tml and z-statistics depend on the sample size although they do not depend on the scales of the involved variables.
6 For a parameter estimate $\hat\theta$ based on a sample of size N, the SNR is defined as $\tau = \theta/\sigma$, where θ and $\sigma$ are respectively the expected values of $\hat\theta$ and $\sqrt{N}\,\mathrm{SE}$ or their probability limits as N increases. The SNR is a generalization of Cohen’s d from a mean difference to individual parameter estimates, and can be consistently estimated by $\hat\tau = z/\sqrt{N}$, where z is the z-statistic for $\hat\theta$ (see Yuan & Fang, 2023).
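Notes 5 and 6 can be illustrated with the simplest parameter, a single mean. The sketch below assumes that the SNR estimate in Note 6 has the form z/√N, in which case it reduces to the sample mean divided by the sample standard deviation, a one-sample analogue of Cohen's d. The function name `z_and_snr` and the simulated population are hypothetical and serve only as an illustration.

```python
import math
import random


def z_and_snr(sample):
    """Return (z, z / sqrt(N)) for the mean of `sample`."""
    n = len(sample)
    mean = sum(sample) / n
    var = sum((v - mean) ** 2 for v in sample) / (n - 1)
    se = math.sqrt(var / n)  # standard error of the sample mean
    z = mean / se
    # Assumed form of the SNR estimate: z divided by sqrt(N).
    return z, z / math.sqrt(n)


random.seed(2)  # arbitrary seed, used only for reproducibility
population = [random.gauss(0.3, 1.0) for _ in range(10000)]

# z tends to grow with the sample size, while z / sqrt(N) equals
# mean / sd and so does not grow systematically with N.
for n in (100, 1000, 10000):
    z, snr = z_and_snr(population[:n])
    print(n, round(z, 2), round(snr, 3))
```

Under this assumption, dividing z by √N removes the dependence on sample size described in Note 5, which is what makes the quantity comparable across studies with different N.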