A Faster Procedure for Estimating SEMs Applying Minimum Distance Estimators With a Fixed Weight Matrix


Abstract

This study presents a separable nonlinear least squares (SNLLS) implementation of the minimum distance (MD) estimator employing a fixed-weight matrix for estimating structural equation models (SEMs). In contrast to the standard implementation of the MD estimator, in which the complete set of parameters is estimated using nonlinear optimization, the SNLLS implementation allows a subset of parameters to be estimated using (linear) least squares (LS). The SNLLS implementation possesses a number of benefits, such as faster convergence, better performance in ill-conditioned estimation problems, and fewer required starting values. The present work demonstrates that SNLLS, when applied to SEM estimation problems, significantly reduces the estimation time. Reduced estimation time makes SNLLS particularly useful in applications involving some form of resampling, such as simulation and bootstrapping.


1. Introduction

This study addresses the application of separable nonlinear least squares (SNLLS) when performing covariance structure analysis (CSA). SNLLS was first introduced by Golub and Pereyra (1973), who showed that for a certain class of nonlinear estimation problems, a subset of parameters can be estimated using numerically efficient least squares (LS). As will be discussed below, several studies have shown that parameter separation offers a number of numerical benefits, such as faster convergence, better performance when the estimation problem is ill-conditioned (i.e., problems in which the ratio between the largest and the smallest singular value of the covariance matrix is large), and fewer required starting values.

SNLLS is typically applied to problems involving some form of nonlinear regression analysis, but not exclusively so. A recent study by Kreiberg et al. (2021) suggested an SNLLS implementation of the minimum distance (MD) estimator for estimating confirmatory factor analysis (CFA) models. The motivation for the current study is to generalize the results in Kreiberg et al. (2021) by outlining an SNLLS implementation for estimating structural equation models (SEMs). This is important for several reasons. First, it makes SNLLS applicable to a wider range of models. Second, at this stage, little is known about the potential benefits of applying SNLLS in the context of CSA. The outlined SNLLS implementation may pave the way for future research on how to improve the numerical performance of CSA-based estimators.

To make the idea of SNLLS clearer, consider the familiar MD quadratic form objective function $F (ϑ) = {(s_{x} - σ_{x} (ϑ))}^{T} V (s_{x} - σ_{x} (ϑ)),$ (1) where $s_{x}$ and $σ_{x} (ϑ)$ are covariance vectors derived from the sample and the model, respectively, $ϑ$ is the parameter vector, and $V$ is a weighting matrix chosen by the user. We consider the case in which $V$ is a fixed matrix (i.e., when $V$ is not a function of $ϑ$). Such cases include well-known estimators such as unweighted least squares (ULS), generalized least squares (GLS), and weighted least squares (WLS). The standard implementation of Equation (1) is a one-step estimation procedure, here referred to as nonlinear least squares (NLLS), that involves the use of nonlinear optimization techniques. Estimation is performed by searching the parameter space for the value of $ϑ$ that minimizes Equation (1). In contrast, the SNLLS implementation of Equation (1) is a two-step estimation procedure that works by splitting $ϑ$ into two subsets. In the first step, one subset of parameters is estimated using nonlinear optimization. In the second step, based on the estimates obtained in the first step, the remaining subset of parameters is estimated using LS. As demonstrated in Kreiberg et al. (2021), SNLLS provides parameter estimates and a minimum objective function value identical to those obtained using NLLS. It follows that the asymptotic properties of the estimator are maintained. The presentation below outlines a general framework for how to accomplish parameter separation in the case of SEMs.

Over the years, SNLLS has become popular in applied research across a wide range of scientific disciplines. Golub and Pereyra (2003) compiled a list of real-world examples of SNLLS applications. Mullen (2008) subsequently provided a comprehensive overview of SNLLS for a number of applications in physics and chemistry. SNLLS has also proved useful in systems and control applications. For instance, Söderström et al. (2009), Söderström & Mossberg (2011), and Kreiberg et al. (2016) applied CSA to handle the errors-in-variables (EIV) estimation problem. The work in these studies showed how to implement the MD estimator using SNLLS.

Several studies have documented that the SNLLS implementation of nonlinear estimators offers a number of benefits. For instance, Sjöberg and Viberg (1997) evaluated the numerical performance of SNLLS when applied to neural-network minimization problems. Their main conclusions were that SNLLS provides faster convergence and performs better in cases in which the estimation problem is ill-conditioned. A recent study by Dattner et al. (2020) investigated the performance of SNLLS when applied to estimation problems involving ordinary differential equations (ODEs). Their simulations showed that SNLLS provides faster convergence as well as parameter estimates of similar or higher accuracy than what is achieved by traditional nonlinear procedures.

The remainder of this article is organized as follows. Section 2 establishes the notation used throughout the article. In this section, we provide a brief overview of the SEM framework and the associated MD estimator. Section 3 outlines how to modify the MD objective function to accommodate the SNLLS implementation of the estimator when applied to SEMs. Section 4 compares the numerical efficiency of SNLLS and NLLS when applied to real-world estimation problems. Finally, Section 5 presents some concluding remarks.

2. Background

2.1. Notation

Before presenting the SEM framework, it will be useful to introduce the following notation. Let $x$ be a $p \times 1$ zero-mean random vector, and let $Σ_{x}$ be the associated $p \times p$ covariance matrix given by $Σ_{x} = E [x x^{T}],$ (2) where $E$ is the expectation operator and the superscript $T$ denotes the transpose of a vector or a matrix. The number of nonredundant elements in $Σ_{x}$ is $h = 2^{- 1} p (p + 1),$ given that no restrictions other than symmetry are placed on the elements of $Σ_{x} .$ A covariance vector containing the nonredundant elements (i.e., the lower half of $Σ_{x}$ including the diagonal) is $σ_{x} = vech (Σ_{x}) .$ (3) In this expression, $vech$ is the operation of vectorizing the nonredundant elements of $Σ_{x} .$ Alternatively, $σ_{x}$ is obtained by $σ_{x} = K_{x}^{T} vec (Σ_{x}) .$ (4) Here, $vec$ is the operation of vectorizing the elements of a matrix by stacking its columns, and $K_{x}$ is a $p^{2} \times h$ matrix obtained from $K_{x} = L_{x} {(L_{x}^{T} L_{x})}^{- 1},$ (5) where $L_{x}$ is a $p^{2} \times h$ selection matrix containing only ones and zeros. This matrix has the additional usage $vec (Σ_{x}) = L_{x} σ_{x} .$ (6) In the case of symmetry, $L_{x}$ is referred to as the duplication matrix in the literature (see Magnus & Neudecker, 1999). The matrices $L_{x}$ and $K_{x}$ can be formed to handle covariance matrices with additional structure beyond symmetry. For instance, in the case that $Σ_{x}$ is diagonal, $K_{x}$ is constructed so that $σ_{x}$ contains only the elements on the diagonal of $Σ_{x} .$ Appendix A outlines a general framework for how to obtain $L_{x}$ and $K_{x}$ for various structures characterizing $Σ_{x} .$
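As a rough illustration of Equations (3)–(6), the following Python sketch builds $L_{x}$ for the symmetric case and checks numerically that $K_{x}^{T} vec (Σ_{x})$ reproduces $vech (Σ_{x})$. It assumes NumPy; the helper names `duplication_matrix`, `vec`, and `vech` are chosen for illustration only and do not come from the article.

```python
import numpy as np

def duplication_matrix(p):
    """Return the p^2 x h selection matrix L with vec(S) = L @ vech(S)."""
    h = p * (p + 1) // 2
    L = np.zeros((p * p, h))
    col = 0
    for j in range(p):                # column of the lower triangle
        for i in range(j, p):         # row, with i >= j
            L[j * p + i, col] = 1.0   # position of element (i, j) in vec(S)
            L[i * p + j, col] = 1.0   # position of its mirror element (j, i)
            col += 1
    return L

def vec(S):
    """Stack the columns of S into a single vector."""
    return S.reshape(-1, order="F")

def vech(S):
    """Stack the lower-triangular part of S column by column."""
    return np.concatenate([S[j:, j] for j in range(S.shape[0])])

p = 3
A = np.arange(1.0, 10.0).reshape(p, p)
Sigma = A @ A.T                        # an arbitrary symmetric matrix
L = duplication_matrix(p)
K = L @ np.linalg.inv(L.T @ L)         # Equation (5)
sigma = K.T @ vec(Sigma)               # Equation (4)
```

Here `sigma` agrees with `vech(Sigma)`, and `L @ sigma` recovers `vec(Sigma)` as in Equation (6).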

We now expand the previous notation. Let $x_{1}$ and $x_{2}$ be $p_{1} \times 1$ and $p_{2} \times 1$ zero-mean random vectors, respectively. A $p = p_{1} + p_{2}$ dimensional column vector is given by $x = {(x_{1}^{T} \;\; x_{2}^{T})}^{T} .$ (7) The associated $p \times p$ covariance matrix is $Σ_{x} = \begin{pmatrix} \underset{(p_{1} \times p_{1})}{Σ_{x_{1}}} & \underset{(p_{1} \times p_{2})}{Σ_{x_{2}, x_{1}}^{T}} \\ \underset{(p_{2} \times p_{1})}{Σ_{x_{2}, x_{1}}} & \underset{(p_{2} \times p_{2})}{Σ_{x_{2}}} \end{pmatrix},$ (8) where $Σ_{x_{1}} = E [x_{1} x_{1}^{T}], \quad Σ_{x_{2}} = E [x_{2} x_{2}^{T}], \quad Σ_{x_{2}, x_{1}} = E [x_{2} x_{1}^{T}] .$ (9)

As before, the vector consisting of the nonredundant elements of $Σ_{x}$ is given by $σ_{x} .$ However, for later use, it will be more convenient to work with the vector ${\tilde{σ}}_{x} = {(σ_{x_{1}}^{T} \;\; σ_{x_{2}}^{T} \;\; σ_{x_{2}, x_{1}}^{T})}^{T},$ (10) where $σ_{x_{1}} = K_{x_{1}}^{T} vec (Σ_{x_{1}}), \quad σ_{x_{2}} = K_{x_{2}}^{T} vec (Σ_{x_{2}}), \quad σ_{x_{2}, x_{1}} = vec (Σ_{x_{2}, x_{1}}) .$ (11)

Note that ${\tilde{σ}}_{x}$ contains the same elements as $σ_{x},$ but in a different order. The last equation in Equation (11) follows from the fact that there is no redundancy in $Σ_{x_{2}, x_{1}} .$ Appendix A shows how to derive the corresponding selection matrix ${\tilde{L}}_{x} .$ Then, by using Equations (4) and (5) with ${\tilde{L}}_{x}$ in place of $L_{x},$ we obtain the covariance vector ${\tilde{σ}}_{x} = {\tilde{K}}_{x}^{T} vec (Σ_{x}) .$ (12)

2.2. The SEM Framework

With the basic notation in place, we are ready to introduce the SEM framework, which consists of the following three equations (excluding constant terms): $η = B η + Γ ξ + δ,$ (13) $x_{1} = Λ_{1} η + ϵ_{1},$ (14) $x_{2} = Λ_{2} ξ + ϵ_{2} .$ (15) The first equation is the structural equation, which specifies the causal relationships among the latent variables. In this equation, $η$ and $ξ$ are respectively $p_{η} \times 1$ and $p_{ξ} \times 1$ random vectors, $δ$ is a $p_{η} \times 1$ random noise vector, and $B$ and $Γ$ are respectively $p_{η} \times p_{η}$ and $p_{η} \times p_{ξ}$ parameter matrices relating the latent random vectors. The last two equations are measurement equations. In these equations, $x_{1}$ and $x_{2}$ are respectively $p_{1} \times 1$ and $p_{2} \times 1$ observed random vectors, $ϵ_{1}$ and $ϵ_{2}$ are noise vectors of similar dimensions, and $Λ_{1}$ and $Λ_{2}$ are respectively $p_{1} \times p_{η}$ and $p_{2} \times p_{ξ}$ parameter matrices relating the observed and the latent random vectors. All random vectors are zero-mean.

It is assumed that $I - B$ , where I is the identity matrix, is nonsingular such that $η$ is uniquely determined by $ξ$ and $δ .$ It is further assumed that $δ$ and $ξ$ are mutually uncorrelated, and that $ϵ_{1}$ and $ϵ_{2}$ are mutually uncorrelated with $η$ and $ξ,$ respectively. The noise vectors $ϵ_{1}$ and $ϵ_{2}$ are allowed to correlate.

The specification additionally includes the following covariance matrices $Σ_{ξ} = E [ξ ξ^{T}], \quad Σ_{δ} = E [δ δ^{T}], \quad Σ_{ϵ_{1}} = E [ϵ_{1} ϵ_{1}^{T}], \quad Σ_{ϵ_{2}} = E [ϵ_{2} ϵ_{2}^{T}], \quad Σ_{ϵ_{2}, ϵ_{1}} = E [ϵ_{2} ϵ_{1}^{T}] .$ (16) The nonredundant elements of $Σ_{ξ},$ $Σ_{δ},$ $Σ_{ϵ_{1}},$ $Σ_{ϵ_{2}}$ and $Σ_{ϵ_{2}, ϵ_{1}}$ are given by the covariance vectors $σ_{ξ} = K_{ξ}^{T} vec (Σ_{ξ}), \quad σ_{δ} = K_{δ}^{T} vec (Σ_{δ}), \quad σ_{ϵ_{1}} = K_{ϵ_{1}}^{T} vec (Σ_{ϵ_{1}}), \quad σ_{ϵ_{2}} = K_{ϵ_{2}}^{T} vec (Σ_{ϵ_{2}}), \quad σ_{ϵ_{2}, ϵ_{1}} = vec (Σ_{ϵ_{2}, ϵ_{1}}) .$ (17) Let $ϑ$ be a parameter vector containing the free elements in $B,$ $Γ,$ $Λ_{1},$ $Λ_{2},$ $Σ_{ξ},$ $Σ_{δ},$ $Σ_{ϵ_{1}},$ $Σ_{ϵ_{2}}$ and $Σ_{ϵ_{2}, ϵ_{1}},$ and let $H = {(I - B)}^{- 1} .$ The covariance matrix implied by Equations (13)–(15) is $Σ_{x} (ϑ) = \begin{pmatrix} Λ_{1} H (Γ Σ_{ξ} Γ^{T} + Σ_{δ}) H^{T} Λ_{1}^{T} + Σ_{ϵ_{1}} & Λ_{1} H Γ Σ_{ξ} Λ_{2}^{T} + Σ_{ϵ_{2}, ϵ_{1}}^{T} \\ Λ_{2} Σ_{ξ} Γ^{T} H^{T} Λ_{1}^{T} + Σ_{ϵ_{2}, ϵ_{1}} & Λ_{2} Σ_{ξ} Λ_{2}^{T} + Σ_{ϵ_{2}} \end{pmatrix} .$ (18)
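The block structure of Equation (18) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: NumPy is assumed, the function name `implied_covariance` and all numerical values are hypothetical, and the small model (one $η$, one $ξ$, two indicators each) is not taken from the article.

```python
import numpy as np

def implied_covariance(B, Gamma, Lam1, Lam2, S_xi, S_delta, S_e1, S_e2, S_e21):
    """Model-implied covariance matrix of x = (x1', x2')', following Equation (18)."""
    H = np.linalg.inv(np.eye(B.shape[0]) - B)            # H = (I - B)^{-1}
    S_eta = H @ (Gamma @ S_xi @ Gamma.T + S_delta) @ H.T  # covariance of eta
    S11 = Lam1 @ S_eta @ Lam1.T + S_e1                    # upper-left block
    S21 = Lam2 @ S_xi @ Gamma.T @ H.T @ Lam1.T + S_e21    # lower-left block
    S22 = Lam2 @ S_xi @ Lam2.T + S_e2                     # lower-right block
    return np.block([[S11, S21.T], [S21, S22]])

# Hypothetical parameter values, chosen only to exercise the function.
B = np.array([[0.0]])
Gamma = np.array([[0.5]])
Lam1 = np.array([[1.0], [0.8]])
Lam2 = np.array([[1.0], [1.2]])
S_xi = np.array([[1.0]])
S_delta = np.array([[0.5]])
S_e1 = np.diag([0.3, 0.3])
S_e2 = np.diag([0.2, 0.2])
S_e21 = np.zeros((2, 2))

Sigma = implied_covariance(B, Gamma, Lam1, Lam2, S_xi, S_delta, S_e1, S_e2, S_e21)
```

The upper-right block is obtained as the transpose of the lower-left block, which keeps the returned matrix symmetric by construction.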

2.3. The MD Estimator

Suppose that a sample of data points $x_{i}$ (for $i = 1, \dots, N$) is available. An estimate of $Σ_{x}$ is then computed using $S_{x} = \frac{1}{N} \sum_{i = 1}^{N} x_{i} x_{i}^{T} .$ (19) Given $S_{x},$ the aim is to estimate the true parameter vector $ϑ_{0} .$ An estimate of $ϑ_{0}$ is obtained by $\hat{ϑ} = \underset{ϑ}{\arg \min} \; F (ϑ),$ (20) where $F (ϑ)$ is a scalar function that expresses the distance between the observed and the model-implied covariance structure. Below, we focus on the MD objective function given by $F (ϑ) = {(s_{x} - σ_{x} (ϑ))}^{T} V (s_{x} - σ_{x} (ϑ)) .$ (21) In this expression, $s_{x}$ and $σ_{x} (ϑ)$ are vectors containing the nonredundant elements of $S_{x}$ and $Σ_{x} (ϑ),$ respectively. That is, $s_{x} = K_{x}^{T} vec (S_{x}), \quad σ_{x} (ϑ) = K_{x}^{T} vec (Σ_{x} (ϑ)) .$ (22) Moreover, the matrix $V$ is a positive definite weighting matrix. Under suitable conditions, and for the right choice of $V,$ the MD estimator is consistent and asymptotically normal. Note that consistency does not depend on $V$ as long as $V$ converges in probability to a symmetric positive definite matrix.

Using a proper algorithm, Equation (21) is minimized by numerically searching the parameter space until some convergence criterion is satisfied. For the estimation problem to be feasible, it is a necessary condition that the number of elements in $s_{x} - σ_{x} (ϑ)$ is at least as large as the number of free parameters in $ϑ .$

3. Modifying the MD Quadratic Form Objective Function

Next, we outline how to modify the objective function in Equation (21) to accommodate the SNLLS implementation. To do so, we need some additional notation. Let $ϑ_{β, γ, λ}$ be a $t_{ϑ_{β, γ, λ}} \times 1$ vector containing the free elements in $B,$ $Γ,$ $Λ_{1},$ and $Λ_{2},$ and let $σ_{ξ, δ, ϵ}$ be a $t_{σ_{ξ, δ, ϵ}} \times 1$ vector containing the free elements in $Σ_{ξ},$ $Σ_{δ},$ $Σ_{ϵ_{1}},$ $Σ_{ϵ_{2}},$ and $Σ_{ϵ_{2}, ϵ_{1}} .$ The vector $σ_{ξ, δ, ϵ}$ is formed by $σ_{ξ, δ, ϵ} = {(σ_{ξ}^{T} \;\; σ_{δ}^{T} \;\; σ_{ϵ_{1}}^{T} \;\; σ_{ϵ_{2}}^{T} \;\; σ_{ϵ_{2}, ϵ_{1}}^{T})}^{T} .$ (23) The complete parameter vector now becomes $ϑ = {(ϑ_{β, γ, λ}^{T} \;\; σ_{ξ, δ, ϵ}^{T})}^{T} .$ (24) The key to applying SNLLS is the separation of parameters, which involves expressing the covariance vector in Equation (10) using ${\tilde{σ}}_{x} = G (ϑ_{β, γ, λ}) σ_{ξ, δ, ϵ} .$ (25) In this expression, $G (ϑ_{β, γ, λ})$ is a tall matrix-valued function (i.e., a matrix consisting of more rows than columns) assumed to have full column rank. In Appendix B, it is shown that $G (ϑ_{β, γ, λ})$ takes the general form $G (ϑ_{β, γ, λ}) = \begin{pmatrix} K_{x_{1}}^{T} (Λ_{1} H Γ \otimes Λ_{1} H Γ) L_{ξ} & K_{x_{1}}^{T} (Λ_{1} H \otimes Λ_{1} H) L_{δ} & K_{x_{1}}^{T} L_{ϵ_{1}} & 0 & 0 \\ K_{x_{2}}^{T} (Λ_{2} \otimes Λ_{2}) L_{ξ} & 0 & 0 & K_{x_{2}}^{T} L_{ϵ_{2}} & 0 \\ (Λ_{1} H Γ \otimes Λ_{2}) L_{ξ} & 0 & 0 & 0 & I \end{pmatrix},$ (26) where $\otimes$ is the Kronecker product, the 0s are zero matrices of compatible sizes, and $I$ is the identity matrix. It is now possible to write the objective function using $F (ϑ_{β, γ, λ}, σ_{ξ, δ, ϵ}) = {({\tilde{s}}_{x} - G (ϑ_{β, γ, λ}) σ_{ξ, δ, ϵ})}^{T} \tilde{V} ({\tilde{s}}_{x} - G (ϑ_{β, γ, λ}) σ_{ξ, δ, ϵ}),$ (27) where ${\tilde{s}}_{x}$ and $\tilde{V}$ correspond to $s_{x}$ and $V,$ respectively, but with their rows and columns rearranged according to the order in ${\tilde{σ}}_{x} .$ For a given value of $G (ϑ_{β, γ, λ}),$ the solution to the problem of minimizing Equation (27) w.r.t. $σ_{ξ, δ, ϵ}$ is a straightforward application of LS: ${\hat{σ}}_{ξ, δ, ϵ} (ϑ_{β, γ, λ}) = {(G^{T} (ϑ_{β, γ, λ}) \tilde{V} G (ϑ_{β, γ, λ}))}^{- 1} G^{T} (ϑ_{β, γ, λ}) \tilde{V} {\tilde{s}}_{x} .$ (28) Since ${\hat{σ}}_{ξ, δ, ϵ}$ depends on $ϑ_{β, γ, λ},$ it is necessary to outline how to obtain an estimate ${\hat{ϑ}}_{β, γ, λ}$ without directly involving $σ_{ξ, δ, ϵ} .$ Theorem 2.1 in Golub and Pereyra (1973) provides the justification for replacing $σ_{ξ, δ, ϵ}$ in Equation (27) with the right-hand side of Equation (28). Doing so leads to the modified objective function $F (ϑ_{β, γ, λ}) = {\tilde{s}}_{x}^{T} \tilde{V} {\tilde{s}}_{x} - {\tilde{s}}_{x}^{T} \tilde{V} G (ϑ_{β, γ, λ}) {(G^{T} (ϑ_{β, γ, λ}) \tilde{V} G (ϑ_{β, γ, λ}))}^{- 1} G^{T} (ϑ_{β, γ, λ}) \tilde{V} {\tilde{s}}_{x} .$ (29) Apart from some slight notational differences, the derivation of Equation (29) is similar to the derivation in Kreiberg et al. (2021). From the preceding presentation, it follows that SNLLS is a two-step procedure. In the first step, ${\hat{ϑ}}_{β, γ, λ}$ is obtained by minimizing Equation (29) using nonlinear optimization. In the second step, using ${\hat{ϑ}}_{β, γ, λ}$ from the first step, ${\hat{σ}}_{ξ, δ, ϵ}$ is obtained from Equation (28).
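The two-step procedure can be sketched on a deliberately small toy problem. The Python code below is only an illustration under several assumptions: NumPy is assumed; the model (one factor, three indicators, loadings $(1, θ, θ)$), the function names, and the numerical values are hypothetical; $V$ is the identity (ULS); and a hand-written golden-section search stands in for the quasi-Newton optimizer used later in the article, which is possible here only because the toy problem has a single nonlinear parameter.

```python
import numpy as np

def G_of(theta):
    """Toy G(theta) for the hypothetical model x = (1, theta, theta)' xi + eps."""
    t, t2 = theta, theta ** 2
    # Rows follow vech order of the 3 x 3 covariance matrix;
    # columns multiply var(xi), var(eps_1), var(eps_2), var(eps_3).
    return np.array([[1.0, 1.0, 0.0, 0.0],
                     [t,   0.0, 0.0, 0.0],
                     [t,   0.0, 0.0, 0.0],
                     [t2,  0.0, 1.0, 0.0],
                     [t2,  0.0, 0.0, 0.0],
                     [t2,  0.0, 0.0, 1.0]])

def ls_step(theta, s, V):
    """Step 2, Equation (28): weighted LS for the linear (covariance) parameters."""
    G = G_of(theta)
    return np.linalg.solve(G.T @ V @ G, G.T @ V @ s)

def profiled_F(theta, s, V):
    """Step 1 objective: Equation (27) evaluated at the LS solution.

    Algebraically this equals Equation (29); the residual form is used
    here only because it is less prone to cancellation error.
    """
    r = s - G_of(theta) @ ls_step(theta, s, V)
    return r @ V @ r

def golden_section(f, lo, hi, tol=1e-8):
    """Minimize a unimodal scalar function f on the interval [lo, hi]."""
    r = (np.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    c, d = b - r * (b - a), a + r * (b - a)
    while b - a > tol:
        if f(c) < f(d):
            b, d = d, c
            c = b - r * (b - a)
        else:
            a, c = c, d
            d = a + r * (b - a)
    return 0.5 * (a + b)

theta0 = 0.8
sigma0 = np.array([1.0, 0.4, 0.3, 0.5])
s = G_of(theta0) @ sigma0            # noise-free "sample" covariance vector
V = np.eye(6)                        # ULS weight matrix

theta_hat = golden_section(lambda t: profiled_F(t, s, V), 0.1, 3.0)  # step 1
sigma_hat = ls_step(theta_hat, s, V)                                  # step 2
```

Because the input vector is generated without noise, the search should recover `theta0` and the LS step should return `sigma0`. Only the single nonlinear parameter needs a search interval; the four covariance parameters need no starting values at all.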

The major benefit of the formulation in Equation (29) is that the minimization w.r.t. $ϑ_{β, γ, λ}$ represents a lower dimensional optimization problem. Thus, the computational load when minimizing $F (ϑ_{β, γ, λ})$ w.r.t. $ϑ_{β, γ, λ}$ is smaller, in some cases by a considerable margin, than when minimizing Equation (21) w.r.t. $ϑ .$ This is especially the case when the number of elements in $σ_{ξ, δ, ϵ}$ is large compared with the number of elements in $ϑ_{β, γ, λ} .$

4. Illustrations

This section provides two examples that illustrate the difference in numerical efficiency between the two implementations, SNLLS and NLLS, of the MD estimator when applied to SEMs. Numerical performance is assessed by studying the convergence of the optimizer and the time it takes the optimizer to reach its minimum. Since timing depends on other processes running on the device performing the estimation, it is recommended to compute the average estimation time over multiple runs. Estimation and timing are performed using Matlab (2020, version R2020b). The two implementations are compared under the following conditions:

Algorithm: The optimizer is a Quasi-Newton (QN) design applying the Broyden–Fletcher–Goldfarb–Shanno (BFGS) Hessian update mechanism (default in Matlab).
Gradient: For simplicity, the gradient is computed using a finite difference approach. The computation is based on a centered design, which is supposed to provide greater accuracy at the expense of being more time-consuming.
Tolerances: Tolerances are set to their default values (details are found in the Matlab documentation).
Starting values: Starting values are taken from the open-source R (R Core Team, 2021) package lavaan (Rosseel, 2012). The starting values for the free elements are as follows:
- $Λ_{1}$ and $Λ_{2}$ are computed using the non-iterative fabin 3 estimator (see Hägglund, 1982).
- $B$ and $Γ$ are set to zero.
- $Σ_{ξ}$ and $Σ_{δ}$ are set to zero except for the diagonal elements, which are set to 0.05.
- $Σ_{ϵ_{1}}$ and $Σ_{ϵ_{2}}$ are set to zero except for the diagonal elements, which are set to half the observed variance. For the examples below, no starting values are required for the elements in $Σ_{ϵ_{2}, ϵ_{1}} .$

Note that SNLLS only requires starting values for the parameter vector $ϑ_{β, γ, λ},$ whereas NLLS requires starting values for the complete parameter vector $ϑ .$

Estimator: The GLS estimator is used throughout the examples. The GLS estimator uses a weight matrix of the form $\tilde{V} = 2^{- 1} {\tilde{L}}_{x}^{T} (S^{- 1} \otimes S^{- 1}) {\tilde{L}}_{x} .$ (30)
Timing: In each example, the model is re-estimated 1000 times using the same empirical covariance matrix as input.
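To make the GLS weight matrix in Equation (30) concrete, the following Python sketch computes it for a small covariance matrix. It is a sketch under assumptions: NumPy is assumed, the function names are illustrative, the input matrices are made up, and the ordinary duplication matrix $L_{x}$ is used in place of the reordered ${\tilde{L}}_{x}$, so the rows and columns follow the order of $σ_{x}$ rather than ${\tilde{σ}}_{x}$.

```python
import numpy as np

def duplication_matrix(p):
    """Return the p^2 x h selection matrix L with vec(S) = L @ vech(S)."""
    h = p * (p + 1) // 2
    L = np.zeros((p * p, h))
    col = 0
    for j in range(p):
        for i in range(j, p):
            L[j * p + i, col] = 1.0   # element (i, j) in vec order
            L[i * p + j, col] = 1.0   # mirror element (j, i)
            col += 1
    return L

def gls_weight(S):
    """GLS weight matrix 0.5 * L' (S^{-1} kron S^{-1}) L, as in Equation (30)."""
    L = duplication_matrix(S.shape[0])
    S_inv = np.linalg.inv(S)
    return 0.5 * L.T @ np.kron(S_inv, S_inv) @ L

S = np.array([[2.0, 0.5],
              [0.5, 1.0]])            # hypothetical sample covariance matrix
V = gls_weight(S)
V_identity = gls_weight(np.eye(2))    # simple sanity case
```

With this weight, the quadratic form in Equation (21) coincides with the familiar trace expression $2^{- 1} tr \{ {[(S - Σ) S^{- 1}]}^{2} \}$, which offers a convenient check on any implementation.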

To ensure that our programming is correct, we compared the estimation results to the results obtained using lavaan.

4.1. Example 1

The first example considers a model for the medical illness of depression. The data ($N = 323$) used in this example are taken from Geiser (2012) and consist of six indicators of depression. In the data, $X_{1, 1}$ and $X_{1, 2}$ are indicators of the first-order common factor Depression State 1, $X_{1, 3}$ and $X_{1, 4}$ are indicators of the first-order common factor Depression State 2, and $X_{1, 5}$ and $X_{1, 6}$ are indicators of the first-order common factor Depression State 3. The three factors themselves are indicators of the second-order common trait factor Depression. The model additionally contains an indicator-specific factor labeled IS. Indicators $X_{1, 1},$ $X_{1, 2},$ $X_{1, 3},$ $X_{1, 5},$ and the factor Depression State 1 serve as marker variables. The path diagram illustrating the structure of the model is shown in Figure 1.

Figure 1. Geiser (2012).

Results of the estimation are presented in Table 1. As seen from the table, the number of iterations and function evaluations is (It, Fe) $=$ (23, 375) for SNLLS and (It, Fe) $=$ (145, 5439) for NLLS. As expected, the required computational load for minimizing $F (ϑ_{β, γ, λ})$ w.r.t. $ϑ_{β, γ, λ}$ is far less than the required load for minimizing $F (ϑ)$ w.r.t. $ϑ .$ Figure 2 shows the convergence profiles for the two implementations. From the figure, it is clear that the SNLLS objective function $F (ϑ_{β, γ, λ})$ starts at a point much closer to its minimum of 0.0109 than what is seen for the NLLS objective function $F (ϑ) .$ In terms of estimation time, the mean time is 0.0334 sec. for SNLLS and 0.1408 sec. for NLLS. Thus, SNLLS is faster by a factor of 0.1408/0.0334 $\approx$ 4.2. The results in this example clearly suggest that the SNLLS implementation is numerically more efficient than the standard NLLS implementation.

Figure 2. Convergence profile, Geiser (2012).

Table 1. Timing results, Geiser (2012).


4.2. Example 2

The second example considers a model for industrialization and political democracy. The model is taken from Bollen (1989) and has been used extensively in books, tutorials, etc. The data consist of 11 indicators of industrialization and political democracy for 75 countries ($N = 75$). In the data, $X_{1, 1}, \dots, X_{1, 4}$ are indicators of the common factor Political Democracy at time 1 (1960), $X_{1, 5}, \dots, X_{1, 8}$ are indicators of the common factor Political Democracy at time 2 (1965), and $X_{2, 1}, \dots, X_{2, 3}$ are indicators of the common factor Industrialization at time 1 (1960). Due to the repeated measurement design, the unique factors belonging to $X_{1, i}$ and $X_{1, i + 4}$ for $i = 1, \dots, 4$ are set to correlate. Additionally, the unique factors belonging to $X_{1, i}$ and $X_{1, i + 2}$ for $i = 2, 6$ are set to correlate. Indicators $X_{1, 1},$ $X_{1, 5}$ and $X_{2, 1}$ serve as marker variables. The path diagram of the model is shown in Figure 3.

Figure 3. Bollen (1989).

Results of the estimation are presented in Table 2. The results in this example generally confirm the results from the previous example. In this case, the number of iterations and function evaluations are (It, Fe) $=$ (26, 759) for SNLLS and (It, Fe) $=$ (230, 14742) for NLLS. Figure 4 shows the convergence profiles for the two implementations. The patterns in the figure resemble those in Figure 2. Considering the estimation time, the mean time is 0.1257 sec. for SNLLS and 0.8263 sec. for NLLS. In this case, SNLLS proves to be faster by a factor of 0.8263/0.1257 $=$ 6.5736.

Figure 4. Convergence profile, Bollen (1989).

Table 2. Timing results, Bollen (1989).


5. Concluding Remarks

In this study, we have presented an SNLLS implementation of the MD objective function for estimating SEMs. The outlined framework includes all necessary expressions for applying SNLLS, and represents a generalization of previously known results. Using examples from the SEM literature, we demonstrated that the computational load of applying SNLLS is considerably less than that of applying NLLS. Another benefit of SNLLS is that fewer starting values are required, which may mitigate potential problems due to the somewhat arbitrary choice of starting values for the covariance parameters.

The present work may have several interesting extensions. First, as shown by research, SNLLS may hold a potential for improving numerical performance in situations in which the estimation problem is ill-conditioned. Thus, an interesting case for future research would be to compare the numerical performance of SNLLS and NLLS under more challenging conditions in which the condition number of the observed covariance matrix is large. Second, the SNLLS implementation is not yet available for maximum likelihood (ML) estimation. Some initial work on this topic is underway. This work, combined with the previous point, may lead to an improved implementation of the ML estimator when applied to SEMs.

References

Bollen, K. A. (1989). Structural equations with latent variables. Wiley.
Dattner, I., Ship, H., & Voit, E. O. (2020). Separable nonlinear least-squares parameter estimation for complex dynamic systems. Complexity, 2020, 1–11. https://doi.org/10.1155/2020/6403641
Geiser, C. (2012). Data analysis with Mplus. Guilford Press.
Golub, G. H., & Pereyra, V. (1973). The differentiation of pseudo-inverses and nonlinear least squares problems whose variables separate. SIAM Journal on Numerical Analysis, 10, 413–432. https://doi.org/10.1137/0710036
Golub, G. H., & Pereyra, V. (2003). Separable nonlinear least squares: The variable projection method and its applications. Inverse Problems, 19, R1–R26. https://doi.org/10.1088/0266-5611/19/2/201
Hägglund, G. (1982). Factor analysis by instrumental variables methods. Psychometrika, 47, 209–222. https://doi.org/10.1007/BF02296276
Kreiberg, D., Marcoulides, K., & Olsson, U. H. (2021). A faster procedure for estimating CFA models applying minimum distance estimators with a fixed weight matrix. Structural Equation Modeling: A Multidisciplinary Journal, 28, 725–739. https://doi.org/10.1080/10705511.2020.1835484
Kreiberg, D., Söderström, T., & Yang-Wallentin, F. (2016). Errors-in-variables system identification using structural equation modeling. Automatica, 66, 218–230. https://doi.org/10.1016/j.automatica.2015.12.007
Magnus, J. R., & Neudecker, H. (1999). Matrix differential calculus with applications in statistics and econometrics. John Wiley & Sons.
MATLAB version R2020b. (2020). The MathWorks Inc.
Mullen, K. M. (2008). Separable nonlinear models: Theory, implementation and applications in physics and chemistry [Ph.D. thesis]. Vrije Universiteit Amsterdam.
R Core Team (2021). R: A language and environment for statistical computing. R Foundation for Statistical Computing. http://www.R-project.org/
Rosseel, Y. (2012). lavaan: An R package for structural equation modeling. Journal of Statistical Software, 48, 1–36. https://doi.org/10.18637/jss.v048.i02
Sjöberg, J., & Viberg, M. (1997). Separable non-linear least-squares minimization-possible improvements for neural net fitting. In Neural networks for signal processing VII. Proceedings of the 1997 IEEE signal processing society workshop (pp. 345–354). IEEE.
Söderström, T., & Mossberg, M. (2011). Accuracy analysis of a covariance matching approach for identifying errors-in-variables systems. Automatica, 47, 272–282. https://doi.org/10.1016/j.automatica.2010.10.046
Söderström, T., Mossberg, M., & Hong, M. (2009). A covariance matching approach for identifying errors-in-variables systems. Automatica, 45, 2018–2031. https://doi.org/10.1016/j.automatica.2009.05.010

Appendices

A. Deriving L and K

Let $x$ be an arbitrary $p \times 1$ random vector, and let $Σ_{x} = \{ σ_{i, j} \}$ be the associated $p \times p$ covariance matrix. The purpose of the following presentation is to introduce an algebraic framework that facilitates eliminating the redundancy originating from the structure of $Σ_{x} .$ To do so, let $K_{x}$ be a matrix such that $σ_{x} = K_{x}^{T} vec (Σ_{x}),$ (A1) where $σ_{x}$ is a covariance vector containing the nonredundant elements of $Σ_{x}$ and $K_{x}$ is a matrix obtained by $K_{x} = L_{x} {(L_{x}^{T} L_{x})}^{- 1} .$ (A2) In this expression, $L_{x}$ is a selection matrix (i.e., a matrix composed of zeros and ones).

Below, we propose a rather general framework that applies to any structure characterizing $Σ_{x} .$ Before presenting some examples on how to obtain $L_{x},$ it is necessary to introduce some additional notation. Let $E (u, v) = \{ e_{i, j} (u, v) \}$ denote a $p \times p$ matrix (for $i, j = 1, \dots, p$) with elements $e_{i, j} (u, v) = \begin{cases} 1 & \text{if } σ_{i, j} = σ_{u, v} \\ 0 & \text{otherwise} \end{cases} .$ (A3) Next, we demonstrate how to obtain $L_{x}$ for two standard cases and one case specialized for the SNLLS implementation.

Case 1:

As a start, consider the case in which $Σ_{x}$ is symmetric and no other restrictions are placed on its elements. The covariance vector containing the $2^{- 1} p (p + 1)$ nonredundant elements of $Σ_{x}$ is $σ_{x} = {(σ_{1, 1} \; \dots \; σ_{p, 1} \;\; σ_{2, 2} \; \dots \; σ_{p, 2} \;\; σ_{3, 3} \; \dots \; \dots \; σ_{p, p})}^{T} .$ (A4) Applying (A3), the matrix $L_{x}$ is formed by horizontally concatenating $2^{- 1} p (p + 1)$ vectors using $L_{x} = (vec (E (1, 1)) \; \dots \; vec (E (p, 1)) \;\; vec (E (2, 2)) \; \dots \; vec (E (p, 2)) \;\; vec (E (3, 3)) \; \dots \; \dots \; vec (E (p, p))) .$ (A5)

Case 2:

Now, consider the case in which $Σ_{x}$ is diagonal. The covariance vector containing the $p$ nonredundant elements of $Σ_{x}$ is given by

$σ_{x} = (σ_{1, 1} \;\; σ_{2, 2} \; \dots \; σ_{p, p})^{T}.$ (A6)

The matrix $L_{x}$ is now formed by horizontally concatenating $p$ vectors

$L_{x} = (\operatorname{vec}(E(1, 1)) \;\; \operatorname{vec}(E(2, 2)) \; \dots \; \operatorname{vec}(E(p, p))).$ (A7)

Before introducing the third and final case, it is necessary to expand the notation. Let $x_{1}$ and $x_{2}$ be respectively $p_{1} \times 1$ and $p_{2} \times 1$ random vectors, and let $x$ be a $p = p_{1} + p_{2}$ dimensional column vector obtained by stacking $x_{1}$ and $x_{2}$ in the following way

$x = (x_{1}^{T} \;\; x_{2}^{T})^{T}.$ (A8)

The associated $p \times p$ covariance matrix is given by

$Σ_{x} = \begin{pmatrix} \underset{(p_{1} \times p_{1})}{Σ_{x_{1}}} & \underset{(p_{1} \times p_{2})}{Σ_{x_{2}, x_{1}}^{T}} \\ \underset{(p_{2} \times p_{1})}{Σ_{x_{2}, x_{1}}} & \underset{(p_{2} \times p_{2})}{Σ_{x_{2}}} \end{pmatrix}.$ (A9)

Case 3:

As in the first case, suppose that no other restrictions, apart from symmetry, are placed on the elements of $Σ_{x}.$ Let the covariance vector containing the $2^{-1} p (p + 1)$ nonredundant elements of $Σ_{x}$ be given by

${\tilde{σ}}_{x} = (σ_{x_{1}}^{T} \;\; σ_{x_{2}}^{T} \;\; σ_{x_{2}, x_{1}}^{T})^{T},$ (A10)

where

$σ_{x_{1}} = (σ_{1, 1} \; \dots \; σ_{p_{1}, 1} \;\; σ_{2, 2} \; \dots \; σ_{p_{1}, 2} \;\; σ_{3, 3} \; \dots \; \dots \; σ_{p_{1}, p_{1}})^{T},$ (A11)

$σ_{x_{2}} = (σ_{p_{1} + 1, p_{1} + 1} \; \dots \; σ_{p, p_{1} + 1} \;\; σ_{p_{1} + 2, p_{1} + 2} \; \dots \; σ_{p, p_{1} + 2} \;\; σ_{p_{1} + 3, p_{1} + 3} \; \dots \; \dots \; σ_{p, p})^{T},$ (A12)

$σ_{x_{2}, x_{1}} = (σ_{p_{1} + 1, 1} \; \dots \; σ_{p, 1} \;\; σ_{p_{1} + 1, 2} \; \dots \; σ_{p, 2} \;\; σ_{p_{1} + 1, 3} \; \dots \; \dots \; σ_{p, p_{1}})^{T}.$ (A13)

Construct a matrix ${\tilde{L}}_{x}$ by horizontally concatenating three matrices

${\tilde{L}}_{x} = (({\tilde{L}}_{x})_{1, 1} \;\; ({\tilde{L}}_{x})_{1, 2} \;\; ({\tilde{L}}_{x})_{1, 3}),$ (A14)

where the submatrices are given by

$({\tilde{L}}_{x})_{1, 1} = (\operatorname{vec}(E(1, 1)) \; \dots \; \operatorname{vec}(E(p_{1}, 1)) \;\; \operatorname{vec}(E(2, 2)) \; \dots \; \operatorname{vec}(E(p_{1}, 2)) \;\; \operatorname{vec}(E(3, 3)) \; \dots \; \dots \; \operatorname{vec}(E(p_{1}, p_{1}))),$ (A15)

$({\tilde{L}}_{x})_{1, 2} = (\operatorname{vec}(E(p_{1} + 1, p_{1} + 1)) \; \dots \; \operatorname{vec}(E(p, p_{1} + 1)) \;\; \operatorname{vec}(E(p_{1} + 2, p_{1} + 2)) \; \dots \; \operatorname{vec}(E(p, p_{1} + 2)) \;\; \operatorname{vec}(E(p_{1} + 3, p_{1} + 3)) \; \dots \; \dots \; \operatorname{vec}(E(p, p))),$ (A16)

$({\tilde{L}}_{x})_{1, 3} = (\operatorname{vec}(E(p_{1} + 1, 1)) \; \dots \; \operatorname{vec}(E(p, 1)) \;\; \operatorname{vec}(E(p_{1} + 1, 2)) \; \dots \; \operatorname{vec}(E(p, 2)) \;\; \operatorname{vec}(E(p_{1} + 1, 3)) \; \dots \; \dots \; \operatorname{vec}(E(p, p_{1}))).$ (A17)

The numbers of columns in (A15), (A16), and (A17) are $2^{-1} p_{1} (p_{1} + 1),$ $2^{-1} p_{2} (p_{2} + 1),$ and $p_{2} p_{1},$ respectively.
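The block ordering of Case 3 can be sketched in the same way. The code below is an illustrative Python/NumPy sketch under the assumption that ${\tilde{K}}_{x}$ is obtained from ${\tilde{L}}_{x}$ exactly as in (A2); the function names are hypothetical and indices are zero-based. It checks that the resulting vector stacks the elements of $Σ_{x_{1}},$ $Σ_{x_{2}},$ and $Σ_{x_{2}, x_{1}}$ in the order of (A10).

```python
import numpy as np

def vec(M):
    """Stack the columns of M into one vector (column-major order)."""
    return M.reshape(-1, order="F")

def vech(S):
    """Lower-triangular elements of S, column by column, as in (A4)."""
    n = S.shape[0]
    return np.array([S[u, v] for v in range(n) for u in range(v, n)])

def E_sym(p, u, v):
    """E(u, v) of (A3) when symmetry is the only restriction."""
    E = np.zeros((p, p))
    E[u, v] = E[v, u] = 1.0
    return E

def partitioned_L(p1, p2):
    """L-tilde of (A14): columns ordered as x1 block, x2 block, cross block."""
    p = p1 + p2
    pairs = []
    pairs += [(u, v) for v in range(p1) for u in range(v, p1)]     # (A15)
    pairs += [(u, v) for v in range(p1, p) for u in range(v, p)]   # (A16)
    pairs += [(u, v) for v in range(p1) for u in range(p1, p)]     # (A17)
    return np.column_stack([vec(E_sym(p, u, v)) for u, v in pairs])

p1, p2 = 2, 3
p = p1 + p2
A = np.random.default_rng(0).normal(size=(p, p))
Sigma = A @ A.T                          # arbitrary symmetric test matrix

L_t = partitioned_L(p1, p2)
K_t = L_t @ np.linalg.inv(L_t.T @ L_t)   # assumed analogue of (A2)
sigma_t = K_t.T @ vec(Sigma)

expected = np.concatenate([
    vech(Sigma[:p1, :p1]),               # sigma_{x1}, (A11)
    vech(Sigma[p1:, p1:]),               # sigma_{x2}, (A12)
    vec(Sigma[p1:, :p1]),                # sigma_{x2,x1}, (A13)
])
print(np.allclose(sigma_t, expected))
```

For $p_{1} = 2$ and $p_{2} = 3,$ the three column blocks contain 3, 6, and 6 columns, matching the counts stated above.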

B. Deriving $G (ϑ_{β, γ, λ})$

The derivation below uses the following matrix identity

$\operatorname{vec}(ABC) = (C^{T} \otimes A) \operatorname{vec}(B),$ (B1)

where $A,$ $B,$ and $C$ are matrices of compatible sizes. In addition, we make use of the following relations

$\operatorname{vec}(Σ_{ξ}) = L_{ξ} σ_{ξ}, \quad \operatorname{vec}(Σ_{δ}) = L_{δ} σ_{δ}, \quad \operatorname{vec}(Σ_{ϵ_{1}}) = L_{ϵ_{1}} σ_{ϵ_{1}}, \quad \operatorname{vec}(Σ_{ϵ_{2}}) = L_{ϵ_{2}} σ_{ϵ_{2}}.$ (B2)

The model-implied covariance matrix is

$\begin{aligned} Σ_{x}(ϑ) &= \begin{pmatrix} Σ_{x_{1}}(ϑ) & Σ_{x_{2}, x_{1}}^{T}(ϑ) \\ Σ_{x_{2}, x_{1}}(ϑ) & Σ_{x_{2}}(ϑ) \end{pmatrix} \\ &= \begin{pmatrix} Λ_{1} H (Γ Σ_{ξ} Γ^{T} + Σ_{δ}) H^{T} Λ_{1}^{T} + Σ_{ϵ_{1}} & Λ_{1} H Γ Σ_{ξ} Λ_{2}^{T} + Σ_{ϵ_{2}, ϵ_{1}}^{T} \\ Λ_{2} Σ_{ξ} Γ^{T} H^{T} Λ_{1}^{T} + Σ_{ϵ_{2}, ϵ_{1}} & Λ_{2} Σ_{ξ} Λ_{2}^{T} + Σ_{ϵ_{2}} \end{pmatrix} \\ &= \begin{pmatrix} Λ_{1} H Γ Σ_{ξ} Γ^{T} H^{T} Λ_{1}^{T} + Λ_{1} H Σ_{δ} H^{T} Λ_{1}^{T} + Σ_{ϵ_{1}} & Λ_{1} H Γ Σ_{ξ} Λ_{2}^{T} + Σ_{ϵ_{2}, ϵ_{1}}^{T} \\ Λ_{2} Σ_{ξ} Γ^{T} H^{T} Λ_{1}^{T} + Σ_{ϵ_{2}, ϵ_{1}} & Λ_{2} Σ_{ξ} Λ_{2}^{T} + Σ_{ϵ_{2}} \end{pmatrix}. \end{aligned}$ (B3)

Applying SNLLS, the key is to express the model-implied covariance vector using the form

${\tilde{σ}}_{x}(ϑ) = (σ_{x_{1}}^{T}(ϑ) \;\; σ_{x_{2}}^{T}(ϑ) \;\; σ_{x_{2}, x_{1}}^{T}(ϑ))^{T} := G(ϑ_{β, γ, λ}) σ_{ξ, δ, ϵ}.$ (B4)

To do so, it is necessary to vectorize the individual blocks of (B3). Starting with the block $Σ_{x_{1}}(ϑ),$ we have

$\begin{aligned} σ_{x_{1}}(ϑ) &= K_{x_{1}}^{T} \operatorname{vec}(Σ_{x_{1}}(ϑ)) \\ &= K_{x_{1}}^{T} \operatorname{vec}(Λ_{1} H Γ Σ_{ξ} Γ^{T} H^{T} Λ_{1}^{T}) + K_{x_{1}}^{T} \operatorname{vec}(Λ_{1} H Σ_{δ} H^{T} Λ_{1}^{T}) + K_{x_{1}}^{T} \operatorname{vec}(Σ_{ϵ_{1}}) \\ &= K_{x_{1}}^{T} (Λ_{1} H Γ \otimes Λ_{1} H Γ) \operatorname{vec}(Σ_{ξ}) + K_{x_{1}}^{T} (Λ_{1} H \otimes Λ_{1} H) \operatorname{vec}(Σ_{δ}) + K_{x_{1}}^{T} \operatorname{vec}(Σ_{ϵ_{1}}). \end{aligned}$ (B5)

Using (B1) and (B2), it follows that

$\begin{aligned} σ_{x_{1}}(ϑ) &= K_{x_{1}}^{T} (Λ_{1} H Γ \otimes Λ_{1} H Γ) L_{ξ} σ_{ξ} + K_{x_{1}}^{T} (Λ_{1} H \otimes Λ_{1} H) L_{δ} σ_{δ} + K_{x_{1}}^{T} L_{ϵ_{1}} σ_{ϵ_{1}} \\ &= \begin{pmatrix} K_{x_{1}}^{T} (Λ_{1} H Γ \otimes Λ_{1} H Γ) L_{ξ} & K_{x_{1}}^{T} (Λ_{1} H \otimes Λ_{1} H) L_{δ} & K_{x_{1}}^{T} L_{ϵ_{1}} & 0 & 0 \end{pmatrix} σ_{ξ, δ, ϵ}. \end{aligned}$ (B6)

Next, we consider the block $Σ_{x_{2}}(ϑ).$ Using the same procedure as before, we have

$\begin{aligned} σ_{x_{2}}(ϑ) &= K_{x_{2}}^{T} \operatorname{vec}(Σ_{x_{2}}(ϑ)) \\ &= K_{x_{2}}^{T} \operatorname{vec}(Λ_{2} Σ_{ξ} Λ_{2}^{T}) + K_{x_{2}}^{T} \operatorname{vec}(Σ_{ϵ_{2}}) \\ &= K_{x_{2}}^{T} (Λ_{2} \otimes Λ_{2}) \operatorname{vec}(Σ_{ξ}) + K_{x_{2}}^{T} \operatorname{vec}(Σ_{ϵ_{2}}) \\ &= K_{x_{2}}^{T} (Λ_{2} \otimes Λ_{2}) L_{ξ} σ_{ξ} + K_{x_{2}}^{T} L_{ϵ_{2}} σ_{ϵ_{2}} \\ &= \begin{pmatrix} K_{x_{2}}^{T} (Λ_{2} \otimes Λ_{2}) L_{ξ} & 0 & 0 & K_{x_{2}}^{T} L_{ϵ_{2}} & 0 \end{pmatrix} σ_{ξ, δ, ϵ}. \end{aligned}$ (B7)

Finally, for the block $Σ_{x_{2}, x_{1}}(ϑ),$ it follows that

$\begin{aligned} σ_{x_{2}, x_{1}}(ϑ) &= \operatorname{vec}(Σ_{x_{2}, x_{1}}(ϑ)) \\ &= \operatorname{vec}(Λ_{2} Σ_{ξ} Γ^{T} H^{T} Λ_{1}^{T}) + \operatorname{vec}(Σ_{ϵ_{2}, ϵ_{1}}) \\ &= (Λ_{1} H Γ \otimes Λ_{2}) \operatorname{vec}(Σ_{ξ}) + \operatorname{vec}(Σ_{ϵ_{2}, ϵ_{1}}) \\ &= (Λ_{1} H Γ \otimes Λ_{2}) L_{ξ} σ_{ξ} + σ_{ϵ_{2}, ϵ_{1}} \\ &= \begin{pmatrix} (Λ_{1} H Γ \otimes Λ_{2}) L_{ξ} & 0 & 0 & 0 & I \end{pmatrix} σ_{ξ, δ, ϵ}. \end{aligned}$ (B8)

Putting the pieces together, we obtain

$\begin{pmatrix} σ_{x_{1}}(ϑ) \\ σ_{x_{2}}(ϑ) \\ σ_{x_{2}, x_{1}}(ϑ) \end{pmatrix} = \begin{pmatrix} K_{x_{1}}^{T} (Λ_{1} H Γ \otimes Λ_{1} H Γ) L_{ξ} & K_{x_{1}}^{T} (Λ_{1} H \otimes Λ_{1} H) L_{δ} & K_{x_{1}}^{T} L_{ϵ_{1}} & 0 & 0 \\ K_{x_{2}}^{T} (Λ_{2} \otimes Λ_{2}) L_{ξ} & 0 & 0 & K_{x_{2}}^{T} L_{ϵ_{2}} & 0 \\ (Λ_{1} H Γ \otimes Λ_{2}) L_{ξ} & 0 & 0 & 0 & I \end{pmatrix} σ_{ξ, δ, ϵ}.$ (B9)
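The assembled expression (B9) can be verified numerically against a direct evaluation of the blocks in (B3). The sketch below uses Python with NumPy and is only an illustration under stated assumptions, not the authors' implementation: the dimensions and parameter values are arbitrary, $H$ is taken to be $(I - B)^{-1}$ for a hypothetical coefficient matrix `B`, $Σ_{δ},$ $Σ_{ϵ_{1}},$ and $Σ_{ϵ_{2}}$ are assumed diagonal (Case 2), $Σ_{ξ}$ is assumed symmetric and otherwise unrestricted (Case 1), and $Σ_{ϵ_{2}, ϵ_{1}}$ is left unrestricted.

```python
import numpy as np

def vec(M):
    """Stack the columns of M into one vector (column-major order)."""
    return M.reshape(-1, order="F")

def L_sym(p):
    """Selection matrix for a symmetric p x p matrix (Case 1)."""
    cols = []
    for v in range(p):
        for u in range(v, p):
            E = np.zeros((p, p))
            E[u, v] = E[v, u] = 1.0
            cols.append(vec(E))
    return np.column_stack(cols)

def L_diag(p):
    """Selection matrix for a diagonal p x p matrix (Case 2)."""
    cols = []
    for u in range(p):
        E = np.zeros((p, p))
        E[u, u] = 1.0
        cols.append(vec(E))
    return np.column_stack(cols)

def KT(L):
    """K^T = (L^T L)^{-1} L^T, the transpose of (A2)."""
    return np.linalg.inv(L.T @ L) @ L.T

rng = np.random.default_rng(1)
n, m, p1, p2 = 2, 2, 4, 3            # illustrative sizes of xi, delta, x1, x2

Lam1, Lam2 = rng.normal(size=(p1, m)), rng.normal(size=(p2, n))
Gam = rng.normal(size=(m, n))
B = np.array([[0.0, 0.0], [0.4, 0.0]])
H = np.linalg.inv(np.eye(m) - B)     # assumption: H = (I - B)^{-1}

A = rng.normal(size=(n, n))
S_xi = A @ A.T                       # symmetric
S_de = np.diag(rng.uniform(0.5, 1.5, m))
S_e1 = np.diag(rng.uniform(0.5, 1.5, p1))
S_e2 = np.diag(rng.uniform(0.5, 1.5, p2))
S_e21 = 0.1 * rng.normal(size=(p2, p1))

L_xi, L_de, L_e1, L_e2 = L_sym(n), L_diag(m), L_diag(p1), L_diag(p2)
K1T, K2T = KT(L_sym(p1)), KT(L_sym(p2))
q1, q2, q3 = K1T.shape[0], K2T.shape[0], p2 * p1

# Stacked vector of variance parameters, ordered as the columns of (B9)
sigma = np.concatenate([KT(L_xi) @ vec(S_xi), KT(L_de) @ vec(S_de),
                        KT(L_e1) @ vec(S_e1), KT(L_e2) @ vec(S_e2),
                        vec(S_e21)])

W1, W2 = Lam1 @ H @ Gam, Lam1 @ H
Z = np.zeros
G = np.block([
    [K1T @ np.kron(W1, W1) @ L_xi, K1T @ np.kron(W2, W2) @ L_de,
     K1T @ L_e1, Z((q1, p2)), Z((q1, q3))],
    [K2T @ np.kron(Lam2, Lam2) @ L_xi, Z((q2, m)),
     Z((q2, p1)), K2T @ L_e2, Z((q2, q3))],
    [np.kron(W1, Lam2) @ L_xi, Z((q3, m)),
     Z((q3, p1)), Z((q3, p2)), np.eye(q3)],
])

# Direct evaluation of the blocks in (B3)
S11 = W1 @ S_xi @ W1.T + W2 @ S_de @ W2.T + S_e1
S22 = Lam2 @ S_xi @ Lam2.T + S_e2
S21 = Lam2 @ S_xi @ W1.T + S_e21
direct = np.concatenate([K1T @ vec(S11), K2T @ vec(S22), vec(S21)])

print(np.allclose(G @ sigma, direct))
```

The check makes explicit why the separation works: for fixed $Λ_{1},$ $Λ_{2},$ $Γ,$ and $H,$ the model-implied covariance vector is linear in $σ_{ξ, δ, ϵ}.$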
