
Non-negative variance component estimation for the partial EIV model by the expectation maximization algorithm

Pages 1278-1298 | Received 14 Feb 2020, Accepted 15 Jun 2020, Published online: 09 Jul 2020

Abstract

A difficulty in variance component estimation (VCE) is that the estimates may become negative, which is not acceptable in practice. This article presents two new methods for non-negative VCE that apply the expectation maximization (EM) algorithm to the partial errors-in-variables (EIV) model. The first searches for the desired solution with an unconstrained estimation criterion and shows statistically that, when the other existing VCE methods return negative estimates, the variance components have in fact moved to the edge of the parameter space. We concentrate on the formulation of this estimator and provide a non-negativity analysis for it. The second approach, which has greater computational efficiency, is a practical alternative to the existing VCE-type algorithms; it is easy to implement, and non-negative variance components are guaranteed automatically by introducing non-negativity constraints. Both algorithms are free from complex matrix inversions and reduce the computational complexity. The results show that our algorithms reproduce the estimates of the other VCE methods; the second approach estimates the parameters quickly and is practical for large-volume and multisource data processing.

1. Introduction

In geodetic data processing, least squares (LS) estimates become incorrect if the coefficient matrix is contaminated with random errors. To retain good estimation precision, iterative algorithms of total least squares (TLS) (Golub and Van Loan 1980; Schaffrin and Wieser 2008; Wang 2012; Wang and Zhao 2019) have been developed to solve observation equations in which both the observations and the coefficient matrix are polluted by errors. Such a functional model is the errors-in-variables (EIV) model (Fang 2011, 2013; Amiri-Simkooei and Jazaeri 2012; Jazaeri et al. 2014). The EIV model ignores the nonrandomness of elements in the coefficient matrix; as a result, the total number of unknown parameters is significantly increased. Xu et al. (2012) proposed an extended model, termed the partial EIV model, to avoid correcting the nonrandom elements, and obtained the statistical properties of a (weighted or ordinary) TLS estimator in the case of finite samples. Liu et al. (2013) further analyzed the structure of the partial EIV model and highlighted that this model greatly facilitates the precision estimation of subsequent estimates (Wang and Zhao 2018; Wang and Zou 2019; Wang and Ding 2020).

The functional models and the algorithms for numerically finding the LS or TLS solutions have been extensively studied. In practical applications, however, the prior stochastic model of the observations is often unreliable. This problem has attracted widespread attention in the monitoring of surface deformation, such as earthquakes, volcanic eruptions, crustal movement and groundwater extraction. Knowledge of an appropriate (co)variance matrix is a prerequisite for reasonable parameter estimation and subsequent precision analysis. The unknown variance components can be updated iteratively to weigh the contributions of different data sets, such as observed displacement data from InSAR and GPS, to obtain the final estimates; this is the ubiquitous task of variance component estimation (VCE). Several VCE methods exist for estimating heterogeneous variance components, such as the variance component estimation of Helmert type (Helmert 1907), best invariant quadratic unbiased estimation (BIQUE) (Koch 1999), minimum norm quadratic unbiased estimation (MINQUE) (Xu et al. 2006, 2007), iterative almost unbiased estimation (IAUE) (Hsu 1999; Vergos et al. 2012), least squares variance component estimation (LS-VCE) (Teunissen and Amiri-Simkooei 2008) and restricted maximum likelihood estimation (REML) (Koch 1986). The VCE methods have been widely applied to the so-called LS problem; heteroscedasticity of different types of observations is also a typical characteristic of the EIV model and of the partial EIV model. In an early work, Amiri-Simkooei (2013) applied the LS-VCE to the EIV model; this application can evaluate the unknown variance components for both the observations and the coefficient matrix. Considering that ill-posed TLS problems often occur in geodetic applications, Wang et al. (2019) added virtual observations and adopted the Helmert method to determine the ridge parameter for the partial EIV model; this ridge parameter is the weight ratio between the different measurements and reduces the influence of minor disturbances of the observations. Wang and Xu (2016) used the ratio of relative weights for estimating the variance components and showed that the Helmert method applied to the partial EIV model is also feasible.

A typical problem in the estimation principles of variance components is that not all existing VCE methods produce positive estimates. Lee and Khuri (2001), therefore, investigated the probability of the occurrence of negative estimates, which actually depends on the design used and on the true values of the variance components themselves. Thompson (1962) analyzed how the VCE methods themselves may lead to negative variance components. Moghtased-Azar et al. (2014) presented an alternative method for non-negative VCE that reparameterizes the variance components by a positive-valued function; this method succeeds in overcoming the phenomenon of negative variances in each iteration, but its convergence may be slow, or it may diverge, in the presence of an extremely inappropriate stochastic model. Based on the LS-VCE method, Amiri-Simkooei (2016) deduced a non-negative least squares variance component estimation (NNLS-VCE) approach using the sequential coordinate-wise non-negative least squares (NNLS) algorithm of Franc et al. (2005); this approach provides a precision of the variance component estimates that is never worse than that of the unconstrained algorithms. However, it is essentially a non-negative constraint approach, and the statistical information is not easily available. Jennrich and Sampson (1976) recommended replacing negative variance components with a certain positive value or adding non-negativity constraints to the VCE method.

As a result, to obtain non-negative estimates, various constraints, such as equality and inequality constraints, are imposed on the estimators (Jennrich and Sampson 1976; Groeneveld 1994; Moghtased-Azar et al. 2014; Amiri-Simkooei 2016). These constraints are rooted in prior information inherent to the variance components; consequently, parameter estimates with large biases may be improved. Nevertheless, compared with the large number of publications on non-negative VCE methods, the statistical aspects of non-negative variance component estimators have not received due attention, in particular in the case of a non-negative estimator without any constraints. On the other hand, the existing non-negative VCE methods either take a positive estimate or set it to zero; a uniform standard is lacking. Unlike the unconstrained VCE methods, these constrained methods cannot give a reasonable explanation and derivation of the problem of negative variances. To find a more intuitive and reliable procedure, the expectation maximization (EM) algorithm is developed here.

The EM algorithm was originally proposed by Dempster et al. (1977) and has been extensively studied (Lange et al. 1989; Peng 2009; Gupta and Chen 2010; Koch 2013). Variance component estimation derived by the EM algorithm for the linear mixed model is routinely applied to biological or longitudinal data. Laird and Ware (1982) applied two-stage random-effects models for maximum likelihood estimation using two probability distributions: the first for the response vectors of different individuals belonging to a single family, and the second for the random-effects parameters, which are specified as latent variables that vary across individuals. Calvin (1993) extended the method of Laird and Ware (1982) and developed an EM algorithm for REML estimation of the multivariate mixed model, in which the variance component matrices are estimated with unbalanced data; this algorithm is reasonably quick for moderately sized data. Foulley et al. (2000) investigated the class of random-effects components and estimated the parameters associated with random coefficient factors separately from those pertaining to the stationary time processes and measurement errors. In the geodetic literature, the EM algorithm is often adopted to solve the mean shift model or the variance inflation model for robust estimation (Peng 2009; Koch 2013; Koch and Kargoll 2013). Based on time series data sets and the EM algorithm, Kargoll et al. (2018) investigated the random deviations associated with autoregressive processes and deduced an iteratively reweighted LS approach capable of adaptive robust adjustment of parameters. Unfortunately, the EM algorithm can be slow to converge. Thus, various modifications of the EM algorithm, such as the expectation conditional maximization (ECM) algorithm and its extension, the ECM either (ECME) algorithm, were proposed to accelerate convergence in this setting (Little and Rubin 2002; Mclachlan and Krishnan 2008).

The traditional EM solutions are usually applied to the linear model for VCE or robust estimation. In this article, we derive two non-negative VCE approaches for the more general partial EIV model, namely the EM algorithm VCE (EM-VCE) and the modified EM algorithm of non-negative VCE (EM-NN-VCE). Although a coefficient matrix without random effects was considered by Laird and Ware (1982), estimation via the EM algorithm with a random coefficient matrix and measurement errors is derived here. More importantly, we extend the analysis of non-negativity for the EM-VCE solutions. Unlike the existing TLS methods, the random errors of the coefficient matrix are introduced as latent variables or missing data and handled through conditional expectations; the missing data are determined so as to facilitate the maximization problem. Both algorithms are free from complex matrix inversions, and the parameters and variance components are estimated jointly. The latter algorithm, which exhibits a smaller computational burden, is a practical alternative to the existing VCE-type algorithms. Additionally, the feasibility and superiority of the proposed algorithms over the methods of REML, positive-valued functions non-negative VCE (PVFs-VCE) and NNLS-VCE are verified by numerical examples.

This article is organized as follows: Section 2 introduces the structure of the partial EIV model and briefly describes the maximization problems of this model. In Section 3, we deduce the EM algorithm variance component estimation (EM-VCE) for the partial EIV model; the non-negativity of EM-VCE and the modified version of non-negative VCE are further presented. In Section 4, we perform two numerical examples to simultaneously estimate the variance components and fixed parameters using the proposed algorithms and the control methods. Finally, conclusions are drawn in Section 5.

2. Partial EIV model

The partial EIV model proposed by Xu et al. (2012) is expressed as

$$y = U\xi + e_y = (\xi^{\mathrm{T}} \otimes I_n)(h + B\bar{a}) + e_y, \qquad a = \bar{a} + e_a, \qquad \mathrm{vec}(A) = h + Ba, \tag{1}$$

where y denotes the n×1 vector of observations, A represents the n×m coefficient matrix with errors, U is the true matrix of A, ξ denotes the m×1 vector of fixed parameters, I_n represents the n×n identity matrix, h is the nm×1 constant vector consisting of the nonrandom elements and zeros of A, B represents the nm×t deterministic matrix related to A, a is the t×1 vector composed of the random elements of A, ā denotes the true vector of a, e_y represents the n×1 vector of observation errors and e_a is the t×1 error vector of the independent random elements of A. ⊗ stands for the Kronecker product (Grafarend and Schaffrin 1993), vec(·) denotes the operator that stacks each column of a matrix underneath the previous one, and vec⁻¹(·) is the inverse of the vec(·) operator, which reshapes the vector into the original matrix.
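To make the structure of Equation (1) concrete, the following minimal sketch (our own illustration; the toy matrix and the index choices are assumptions, not taken from the paper) shows how a coefficient matrix is split into the nonrandom vector h, the selection matrix B and the random elements a:

```python
import numpy as np

# Toy coefficient matrix: the first column is measured (random),
# the second is a fixed column of ones.
A = np.array([[2.0, 1.0],
              [3.5, 1.0],
              [5.1, 1.0]])
n, m = A.shape
vecA = A.flatten(order="F")            # vec(A): stack the columns

random_idx = np.arange(n)              # positions of the random elements
t = random_idx.size
a = vecA[random_idx]                   # t x 1 vector of random elements

B = np.zeros((n * m, t))               # nm x t deterministic matrix
B[random_idx, np.arange(t)] = 1.0      # B selects the random positions

h = vecA - B @ a                       # nonrandom elements and zeros
assert np.allclose(vecA, h + B @ a)    # vec(A) = h + B a, as in Eq. (1)
```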

We assume that the random error vectors e_y and e_a are independent of each other. Then the stochastic model of the partial EIV model is defined as

$$\begin{bmatrix} e_y \\ e_a \end{bmatrix} \sim \left( \begin{bmatrix} 0 \\ 0 \end{bmatrix}, \begin{bmatrix} D_y & 0 \\ 0 & D_a \end{bmatrix} \right), \tag{2}$$

where D_y and D_a are the variance-covariance matrices corresponding to e_y and e_a. Equation (1) is rearranged as follows:

$$y - A\xi = (\xi^{\mathrm{T}} \otimes I_n)Be_a + e_y. \tag{3}$$

This eliminates the unknown parameter vector ā in Equation (1), and the stochastic properties of the variables in that vector are expressed through the coefficient matrix. In probability statistics, e_a and e_y are random vectors that follow the normal distribution, and L = y − Aξ is also a random vector. If we define Z = (ξ^T ⊗ I_n)B, then the first and second central moments of the random vector L are given as

$$E(L) = 0, \tag{4}$$

$$D(L) = Z D_a Z^{\mathrm{T}} + D_y, \tag{5}$$

where D(L) is denoted by Σ_L. Considering that the observation vector y and the coefficient matrix A may have different variance components, we set the variance-covariance matrices D_y = σ₁Q_y and D_a = σ₂Q_a in Equation (2), where Q_y and Q_a are the known cofactor matrices. Then, defining the parameters θ = (ξ, σ₁, σ₂) and assuming L ~ N(0, Σ_L), the likelihood function l_M(θ; L) is identical to the probability density function f(L|θ) of L and can be expressed as

$$l_M(\theta; L) = f(L\,|\,\theta) = (2\pi)^{-0.5n}\,|\Sigma_L|^{-0.5} \exp\left\{-0.5\,(L - E(L))^{\mathrm{T}} \Sigma_L^{-1} (L - E(L))\right\}. \tag{6}$$
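As a small numerical sketch (the shapes and values are illustrative assumptions), Z and Σ_L of Equation (5) can be formed directly with the Kronecker product:

```python
import numpy as np

n = 3
xi = np.array([1.5, 3.0])                        # fixed parameters
B = np.vstack([np.eye(n), np.zeros((n, n))])     # first column of A random
Z = np.kron(xi[None, :], np.eye(n)) @ B          # Z = (xi^T kron I_n) B

sigma1, sigma2 = 0.5, 1.5                        # variance components
Qy, Qa = np.eye(n), np.eye(n)                    # cofactor matrices
Sigma_L = sigma1 * Qy + sigma2 * Z @ Qa @ Z.T    # D(L) of Eq. (5)
```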

To facilitate the calculation, we take the logarithm of the likelihood function in Equation (6):

$$\log l_M(\theta; L) = -0.5\left\{ n\log(2\pi) + \log|\Sigma_L| + (L - E(L))^{\mathrm{T}} \Sigma_L^{-1} (L - E(L)) \right\}. \tag{7}$$

The matrix Σ_L is a nonlinear function of the unknown parameters θ. Taking the partial derivatives of Equation (7) with respect to the parameters involves nonlinear objective functions of the observation errors and the coefficient matrix errors; thus, the iterative formulae obtained after differentiation are rather complicated.

3. Non-negative VCE for the partial EIV model

3.1. Formulation of EM-VCE

Facing the thorny task of maximizing the likelihood function l_M(θ; L), we introduce appropriate latent variables or missing data to bring the maximization problem into an equivalent and easier-to-handle form. Latent variables are treated as unobserved data whose values can be attributed through conditional expectations; such a latent variable plays an auxiliary role in the framework of maximum likelihood estimation. A general technique for addressing such missing-data problems is the EM algorithm (Dempster et al. 1977). It avoids the inversion of large matrices and its iterative process is stable; thus, it is frequently utilized as an iterative optimization method for maximum likelihood estimation or restricted (residual) maximum likelihood estimation. In the EM algorithm, a log-likelihood function log l_M(θ; y_c) is given with the unknown parameters θ, where y_c = (y_o^T, y_m^T)^T denotes the complete data, y_o the observations or incomplete data and y_m the latent variables or missing data. The parameters θ^(k+1) are estimated as

$$\theta^{(k+1)} = \arg\max_{\theta}\; E\!\left(\log l_M(\theta; y_o, y_m)\,\middle|\,y_o; \theta^{(k)}\right), \tag{8}$$

where k+1 (k = 0, 1, …, t_max) is the iteration step and t_max is the maximum number of iterations. To emphasize that the preceding conditional expectation is a function of the unknown parameters θ given the current parameters θ^(k), let Ψ be the domain of the missing data y_m. The conditional expectation in Equation (8) is denoted by the Q function Q(θ; θ^(k)), such that

$$Q(\theta; \theta^{(k)}) = \int_{\Psi} \log f(y_o, y_m\,|\,\theta)\, f(y_m\,|\,y_o; \theta^{(k)})\,\mathrm{d}y_m = E\!\left(\log l_M(\theta; y_c)\,\middle|\,y_o; \theta^{(k)}\right), \tag{9}$$

where f(y_o, y_m|θ) is the joint probability density function and f(y_m|y_o; θ^(k)) is the conditional probability density function of the missing data y_m given the observations y_o and the parameters θ^(k). The EM algorithm performs two steps in each iteration: the E-step determines the conditional expectation of the missing data, and the M-step maximizes this conditional expectation with respect to the unknown parameters θ.

For maximum likelihood estimation of the partial EIV model using the EM algorithm, the random error vector e_a is deemed the missing data and L is treated as the incomplete data; the complete data are then y_c = (L^T, e_a^T)^T. Given the independence and distributional properties of the variables, the complete data satisfy y_c ~ N(μ, Σ). Thereby, the stochastic model can be expressed as

$$\begin{bmatrix} L \\ e_a \end{bmatrix} \sim N\!\left( \begin{bmatrix} 0 \\ 0 \end{bmatrix}, \begin{bmatrix} \Sigma_L & Z D_a \\ D_a Z^{\mathrm{T}} & D_a \end{bmatrix} \right). \tag{10}$$

According to Mclachlan and Krishnan (2008), the likelihood function of y_c is represented as

$$l_M(\theta; y_c) = (2\pi)^{-0.5(n+t)}\,|\Sigma|^{-0.5} \exp\left(-0.5\,(y_c - \mu)^{\mathrm{T}} \Sigma^{-1} (y_c - \mu)\right). \tag{11}$$

Then, the log-likelihood function is written as

$$\log l_M(\theta; y_c) = -0.5(n+t)\log(2\pi) - 0.5\log|\Sigma| - 0.5\,(y_c - \mu)^{\mathrm{T}} \Sigma^{-1} (y_c - \mu). \tag{12}$$

Taking into account the particular structure of the inverse and determinant of the block matrix, following Searle et al. (1992), we obtain

$$|\Sigma| = |D_a|\,|\tilde{\Sigma}_L| = |D_a|\,\left|\Sigma_L - Z D_a D_a^{-1} D_a Z^{\mathrm{T}}\right| = |D_a|\,|D_y|, \tag{13}$$

$$\Sigma^{-1} = V_1 + V_2 = \begin{bmatrix} 0 & 0 \\ 0 & D_a^{-1} \end{bmatrix} + \begin{bmatrix} I_n \\ -Z^{\mathrm{T}} \end{bmatrix} \tilde{\Sigma}_L^{-1} \begin{bmatrix} I_n & -Z \end{bmatrix}. \tag{14}$$

Substituting Equations (13) and (14) into Equation (12), the log-likelihood function is rewritten as

$$\log l_M(\theta; y_c) = -0.5\left\{(n+t)\log(2\pi) + \log|\sigma_2 Q_a| + \log|\sigma_1 Q_y| + (y_c - \mu)^{\mathrm{T}}(V_1 + V_2)(y_c - \mu)\right\}. \tag{15}$$

From Equation (14), the aforementioned equation can be further simplified as

$$\begin{aligned} \log l_M(\theta; y_c) &= -0.5\left\{(n+t)\log(2\pi) + \log|\sigma_2 Q_a| + \log|\sigma_1 Q_y| + e_a^{\mathrm{T}}(\sigma_2 Q_a)^{-1} e_a \right. \\ &\qquad \left. + (L - Z e_a)^{\mathrm{T}}(\sigma_1 Q_y)^{-1}(L - Z e_a)\right\} \\ &= -0.5\left\{(n+t)\log(2\pi) + \log|\sigma_2 Q_a| + \log|\sigma_1 Q_y| \right. \\ &\qquad \left. + e_a^{\mathrm{T}}(\sigma_2 Q_a)^{-1} e_a + e_y^{\mathrm{T}}(\sigma_1 Q_y)^{-1} e_y\right\}. \tag{16} \end{aligned}$$

From the preceding derivation, the log-likelihood function of the complete data y_c may be split into

$$\log l_M(\theta; y_c) = \log f(e_y; \theta) + \log f(e_a; \theta). \tag{17}$$

Unlike the log-likelihood function defined in Equation (7), log l_M(θ; y_c) consists of two marginal distributions and leads to a maximization problem that is easier to address than the original problems described by Equations (6) and (7). The E-step obtains the conditional expectation over the observations and the random elements of the coefficient matrix. Thus,

$$Q(\theta; \theta^{(k)}) = E\!\left(\log l_M(\theta; y_c)\,\middle|\,y_o; \theta^{(k)}\right) = E\!\left(\log f(e_y; \theta)\,\middle|\,y_o; \theta^{(k)}\right) + E\!\left(\log f(e_a; \theta)\,\middle|\,y_o; \theta^{(k)}\right) = Q(\sigma_1; \theta^{(k)}) + Q(\sigma_2; \theta^{(k)}). \tag{18}$$

Obviously, estimating the variance components via maximum likelihood estimation requires the conditional distributions e_y|L and e_a|L. Diffey et al. (2017) noted that if the loss of degrees of freedom caused by estimating the parameters ξ is not considered, the estimates of the variance components will be biased, whereas the REML estimates are unbiased. A linear transformation therefore maps the vector L into a new vector κ, namely

$$\kappa = SL = \begin{bmatrix} S_1 L \\ S_2 L \end{bmatrix} = \begin{bmatrix} \kappa_1 \\ \kappa_2 \end{bmatrix}, \tag{19}$$

where S^T = (S_1^T, S_2^T) is a nonsingular matrix, and S_1^T and S_2^T are the n×(n−m) and n×m transformation matrices with full column rank satisfying S_1 U = 0 and S_2 U = I_m. To obtain the conditional distributions in Equation (18), the joint distribution of the new observations κ_1, e_a and e_y is of the form

$$\begin{bmatrix} \kappa_1 \\ e_a \\ e_y \end{bmatrix} \sim N\!\left( \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}, \begin{bmatrix} S_1 \Sigma_L S_1^{\mathrm{T}} & S_1 Z D_a & S_1 D_y \\ D_a Z^{\mathrm{T}} S_1^{\mathrm{T}} & D_a & 0 \\ D_y S_1^{\mathrm{T}} & 0 & D_y \end{bmatrix} \right). \tag{20}$$

The two conditional distributions simplify to

$$e_y\,|\,\kappa_1 \sim N\!\left(D_y C (y - E_A \xi),\; D_y - D_y C D_y\right), \tag{21}$$

$$e_a\,|\,\kappa_1 \sim N\!\left(D_a Z^{\mathrm{T}} C (y - E_A \xi),\; D_a - D_a Z^{\mathrm{T}} C Z D_a\right), \tag{22}$$

where E_A = A − U, and C is obtained as

$$C = S_1^{\mathrm{T}} (S_1 \Sigma_L S_1^{\mathrm{T}})^{-1} S_1 = \Sigma_L^{-1} - \Sigma_L^{-1} U (U^{\mathrm{T}} \Sigma_L^{-1} U)^{-1} U^{\mathrm{T}} \Sigma_L^{-1}. \tag{23}$$
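The two algebraic properties of C that the derivation relies on, C U = 0 and C Σ_L C = C, can be checked numerically; the sketch below (random data, our own illustration) verifies them for the explicit form in Equation (23):

```python
import numpy as np

rng = np.random.default_rng(1)
n, m = 8, 2
U = rng.standard_normal((n, m))                 # any full-rank 'true' matrix
Sigma_L = np.eye(n) + 0.3 * np.ones((n, n))     # any positive definite matrix

Si = np.linalg.inv(Sigma_L)
C = Si - Si @ U @ np.linalg.inv(U.T @ Si @ U) @ U.T @ Si   # Eq. (23)

assert np.allclose(C @ U, 0.0)            # C annihilates the column space of U,
                                          # so C(y - E_A xi) = C(y - A xi)
assert np.allclose(C @ Sigma_L @ C, C)    # idempotency with respect to Sigma_L
```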

The expectation of a quadratic form is

$$E(x^{\mathrm{T}} R x) = \mathrm{tr}(R \Sigma_{xx}) + \mu_x^{\mathrm{T}} R \mu_x, \tag{24}$$

where R is a known matrix, and μ_x and Σ_xx are the expectation and covariance matrix of the random variable x. This formula applies equally to conditional expectations. Thus, the Q function can be obtained from Equations (16) and (24):

$$Q(\sigma_1; \theta^{(k)}) = -0.5\left\{ n\log(2\pi) + \log|\sigma_1 Q_y| + e_y^{(k)\mathrm{T}} (\sigma_1 Q_y)^{-1} e_y^{(k)} + \frac{1}{\sigma_1}\,\mathrm{tr}\!\left(\hat{\sigma}_1^{(k)} I_n - \hat{\sigma}_1^{2(k)} C^{(k)} Q_y\right) \right\}, \tag{25}$$

$$Q(\sigma_2; \theta^{(k)}) = -0.5\left\{ t\log(2\pi) + \log|\sigma_2 Q_a| + e_a^{(k)\mathrm{T}} (\sigma_2 Q_a)^{-1} e_a^{(k)} + \frac{1}{\sigma_2}\,\mathrm{tr}\!\left(\hat{\sigma}_2^{(k)} I_t - \hat{\sigma}_2^{2(k)} Z^{(k)\mathrm{T}} C^{(k)} Z^{(k)} Q_a\right) \right\}, \tag{26}$$

where

$$e_y^{(k)} = E(e_y\,|\,\kappa_1; \theta^{(k)}) = \hat{\sigma}_1^{(k)} Q_y C^{(k)} \left(y - E_A^{(k)} \xi^{(k)}\right), \tag{27}$$

$$e_a^{(k)} = E(e_a\,|\,\kappa_1; \theta^{(k)}) = \hat{\sigma}_2^{(k)} Q_a Z^{(k)\mathrm{T}} C^{(k)} \left(y - E_A^{(k)} \xi^{(k)}\right). \tag{28}$$

Setting the partial derivatives of the Q function in Equations (25) and (26) with respect to the variance factors σ₁ and σ₂ equal to zero yields

$$\hat{\sigma}_1^{(k+1)} = \frac{1}{n}\left( e_y^{(k)\mathrm{T}} Q_y^{-1} e_y^{(k)} + \mathrm{tr}\!\left(\hat{\sigma}_1^{(k)} I_n - \hat{\sigma}_1^{2(k)} C^{(k)} Q_y\right) \right), \tag{29}$$

$$\hat{\sigma}_2^{(k+1)} = \frac{1}{t}\left( e_a^{(k)\mathrm{T}} Q_a^{-1} e_a^{(k)} + \mathrm{tr}\!\left(\hat{\sigma}_2^{(k)} I_t - \hat{\sigma}_2^{2(k)} Z^{(k)\mathrm{T}} C^{(k)} Z^{(k)} Q_a\right) \right). \tag{30}$$

Combining the stochastic information of the incomplete data L and e_a yields

$$e_a^{(k)+} = E(e_a\,|\,L) = \hat{\sigma}_2^{(k)} Q_a Z^{\mathrm{T}} \Sigma_L^{(k)-1} \left(y - A\xi^{(k)}\right), \tag{31}$$

where e_a^{(k)+} is the expectation given the observation vector L. Wang et al. (2016) indicated that the matrix U^{(k)} can be reconstructed once the vector e_a^{(k)+} of iteration step k is known, and the parameters ξ can then be estimated by an indirect adjustment. This is equivalent to directly setting the partial derivative of Equation (16) with respect to ξ^T to zero, which gives

$$\xi^{(k+1)} = \left( U^{(k)\mathrm{T}} (\sigma_1^{(k)} Q_y)^{-1} U^{(k)} \right)^{-1} U^{(k)\mathrm{T}} (\sigma_1^{(k)} Q_y)^{-1} y. \tag{32}$$

Fang et al. (2017) deduced a TLS solution of the same form as Equation (32) by Bayesian inference and programmed the iterative process for the EIV model; they found that its computational efficiency is low. Thus, starting from Equations (3) and (31), the error vector e_y^{(k)+} is written as

$$e_y^{(k)+} = y - A\xi^{(k)} - Z^{(k)} e_a^{(k)+} = y - A\xi^{(k)} - \hat{\sigma}_2^{(k)} Z^{(k)} Q_a Z^{(k)\mathrm{T}} \Sigma_L^{(k)-1} \left(y - A\xi^{(k)}\right) = \hat{\sigma}_1^{(k)} Q_y \Sigma_L^{(k)-1} \left(y - A\xi^{(k)}\right). \tag{33}$$

Substituting the equation y = U^{(k)}ξ^{(k)} + e_y^{(k)+} into Equation (32), the least-squares solution may be reconfigured as

$$\xi^{(k+1)} = \left( U^{(k)\mathrm{T}} \Sigma_L^{(k)-1} U^{(k)} \right)^{-1} U^{(k)\mathrm{T}} \Sigma_L^{(k)-1} \left(y - E_A^{(k)} \xi^{(k)}\right). \tag{34}$$

Equation (34) is an alternative form of the estimate of ξ with higher computational efficiency, and it is the form adopted by the EM-VCE. The flowchart of the EM-VCE algorithm for the partial EIV model is depicted in Figure 1.
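The following sketch chains Equations (23) and (27)-(34) into one EM-VCE loop. It is our own illustration, not the authors' reference implementation: the function name, the initial values σ̂_i^(0) = 1, the stopping rule and the direction of the reconstruction of U from e_a^{(k)+} (taken as ā = a + e_a^{(k)+}, which is the choice consistent with Equation (33)) are assumptions.

```python
import numpy as np

def em_vce(y, A, h, B, Qy, Qa, xi0, tol=1e-10, kmax=500):
    """EM-VCE sketch for the partial EIV model: jointly estimates xi, s1, s2."""
    n, m = A.shape
    t = B.shape[1]
    a = np.linalg.lstsq(B, A.flatten(order="F") - h, rcond=None)[0]
    xi, s1, s2 = xi0.astype(float), 1.0, 1.0
    ea_plus = np.zeros(t)
    for _ in range(kmax):
        U = (h + B @ (a + ea_plus)).reshape((n, m), order="F")  # rebuild U
        EA = A - U
        Z = np.kron(xi[None, :], np.eye(n)) @ B
        Sigma = s1 * Qy + s2 * Z @ Qa @ Z.T
        Si = np.linalg.inv(Sigma)
        C = Si - Si @ U @ np.linalg.inv(U.T @ Si @ U) @ U.T @ Si    # Eq. (23)
        r = y - EA @ xi
        ey = s1 * Qy @ C @ r                                        # Eq. (27)
        ea = s2 * Qa @ Z.T @ C @ r                                  # Eq. (28)
        s1_new = (ey @ np.linalg.solve(Qy, ey)
                  + np.trace(s1 * np.eye(n) - s1**2 * C @ Qy)) / n  # Eq. (29)
        s2_new = (ea @ np.linalg.solve(Qa, ea)
                  + np.trace(s2 * np.eye(t)
                             - s2**2 * Z.T @ C @ Z @ Qa)) / t       # Eq. (30)
        ea_plus = s2 * Qa @ Z.T @ Si @ (y - A @ xi)                 # Eq. (31)
        xi = np.linalg.solve(U.T @ Si @ U, U.T @ Si @ r)            # Eq. (34)
        converged = max(abs(s1_new - s1), abs(s2_new - s2)) < tol
        s1, s2 = s1_new, s2_new
        if converged:
            break
    return xi, s1, s2
```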

3.2. Non-negativity of EM-VCE

The existing VCE methods, such as the Helmert method, BIQUE, MINQUE, LS-VCE and REML, always have the potential to yield negative variances, which conflicts with the fact that the variance components satisfy σ_i ≥ 0. There are several reasons for the occurrence of negative variances (Sjöberg 2011; Moghtased-Azar et al. 2014; Amiri-Simkooei 2016; El Leithy et al. 2016): (1) an improper initial guess, namely a badly chosen set of initial variance components; (2) low redundancy in the functional model, that is, insufficient redundant observations (in fact, the precision of VCE can be improved by increasing the number of redundant observations); and (3) an improper stochastic model. To illustrate the non-negativity of the variance components estimated by the EM algorithm, following Searle et al. (1992), we start from Equation (23) and consider the structure of the covariance matrix Σ_L when Z = 0. That is,

$$P = S_1^{\mathrm{T}} (S_1 D_y S_1^{\mathrm{T}})^{-1} S_1. \tag{35}$$

Based on the matrix inversion formula (F + JGK)⁻¹ = F⁻¹ − F⁻¹J(G⁻¹ + KF⁻¹J)⁻¹KF⁻¹, we obtain

$$C = S_1^{\mathrm{T}} (S_1 \Sigma_L S_1^{\mathrm{T}})^{-1} S_1 = S_1^{\mathrm{T}} (S_1 D_y S_1^{\mathrm{T}} + S_1 Z D_a Z^{\mathrm{T}} S_1^{\mathrm{T}})^{-1} S_1 = P - P Z D_a (I_t + Z^{\mathrm{T}} P Z D_a)^{-1} Z^{\mathrm{T}} P. \tag{36}$$

Inserting Equation (36) into Equation (30), the second term of Equation (30) simplifies to

$$\mathrm{tr}\!\left(\sigma_2 I_t - \sigma_2^2 Z^{\mathrm{T}} C Z Q_a\right) = \sigma_2\,\mathrm{tr}\!\left(I_t - Z^{\mathrm{T}} S_1^{\mathrm{T}} (S_1 \Sigma_L S_1^{\mathrm{T}})^{-1} S_1 Z D_a\right) = \sigma_2\,\mathrm{tr}\!\left(I_t - Z^{\mathrm{T}} P Z D_a + Z^{\mathrm{T}} P Z D_a (I_t + Z^{\mathrm{T}} P Z D_a)^{-1} Z^{\mathrm{T}} P Z D_a\right) = \sigma_2\,\mathrm{tr}\!\left((I_t + Z^{\mathrm{T}} P Z D_a)^{-1}\right). \tag{37}$$

Taking the matrix I_t + Z^T P Z D_a into account with D_a = D_a^{1/2} D_a^{1/2}, the following identity is established:

$$D_a^{1/2} (I_t + Z^{\mathrm{T}} P Z D_a)^{-1} = (I_t + D_a^{1/2} Z^{\mathrm{T}} P Z D_a^{1/2})^{-1} D_a^{1/2} = (I_t + \tau^{\mathrm{T}} \tau)^{-1} D_a^{1/2}, \tag{38}$$

where τ = P^{1/2} Z D_a^{1/2}. From the properties of positive definiteness, τ^T τ and (I_t + τ^T τ)⁻¹ are positive semi-definite and positive definite matrices, respectively; hence

$$\mathrm{tr}\!\left((I_t + Z^{\mathrm{T}} P Z D_a)^{-1}\right) > 0. \tag{39}$$

Substituting Equation (28) into the first term of Equation (30) gives

$$e_a^{(k)\mathrm{T}} Q_a^{-1} e_a^{(k)} = \hat{\sigma}_2^{2(k)} \left(y - E_A^{(k)} \xi^{(k)}\right)^{\mathrm{T}} C^{(k)} Z^{(k)} Q_a Z^{(k)\mathrm{T}} C^{(k)} \left(y - E_A^{(k)} \xi^{(k)}\right), \tag{40}$$

where Q_a is a symmetric positive definite matrix, so e_a^{(0)T} Q_a^{-1} e_a^{(0)} > 0. At iteration step k+1, once any positive initial value σ̂_i^{(0)} is given and Equation (39) is taken into account, Equation (30) can be rewritten as

$$\hat{\sigma}_2^{(k+1)} = \frac{1}{t}\left( e_a^{(k)\mathrm{T}} Q_a^{-1} e_a^{(k)} + \hat{\sigma}_2^{(k)}\,\mathrm{tr}\!\left((I_t + Z^{\mathrm{T}} P Z D_a)^{-1}\right) \right) > 0. \tag{41}$$

By induction, non-negative estimates are ensured at every iteration of the aforementioned process: C^{(k)} remains a positive semi-definite matrix, and y − E_A^{(k)}ξ^{(k)} ≠ 0 is guaranteed. However, in actual situations, the variance components may converge to zero. According to Zhou et al. (2015), the EM algorithm, to some extent, acts like an interior point method, approaching the optimum from within the feasible region. The non-negativity of the variance factor σ₁ of the observations follows in the same way as that of σ₂: given any positive initial value σ̂_i^{(0)}, the estimate σ̂₁ has the same property.
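A quick numerical illustration of the key inequality (39) follows (random data; the choice P = I_n is a simplifying assumption standing in for the positive semi-definite matrix P of Equation (35)):

```python
import numpy as np

rng = np.random.default_rng(7)
n, t = 10, 4
Z = rng.standard_normal((n, t))
P = np.eye(n)                               # stands in for the PSD matrix P
Da = np.diag(rng.uniform(0.1, 2.0, t))      # positive definite D_a

W = np.linalg.inv(np.eye(t) + Z.T @ P @ Z @ Da)
print(np.trace(W) > 0)    # True: all eigenvalues of the inverse lie in (0, 1]
```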

3.3. Modified version of EM-VCE

In the implementation of EM-VCE, the approach requires rather many iterations. Thus, a modified algorithm that improves the convergence is desirable. The Fisher–Score algorithm is an iterative method for nonlinear functional models (Zhao et al. 2019). Its information matrix does not require evaluating the quadratic form of the observations in the Hessian matrix, so the algorithm converges quickly and is more robust to the initial guess than the Newton–Raphson algorithm. To obtain a more general derivation for the partial EIV model, the covariance matrix of its observations and coefficient matrix is expressed as

$$\Sigma_L = D_y + Z D_a Z^{\mathrm{T}} = \sigma_1 Q_y + \sigma_2 Z Q_a Z^{\mathrm{T}} = \sum_{i=1}^{2} \sigma_i T_i, \tag{42}$$

where T_i is a positive definite or positive semi-definite matrix corresponding to the variance factor σ_i (i = 1, 2). Calculating the partial derivatives from Equations (25)-(28) gives

$$\left.\frac{\partial Q(\sigma_i; \theta^{(k)})}{\partial \sigma_i}\right|_{\sigma_i = \hat{\sigma}_i^{(k)}} = -0.5\left\{ \mathrm{tr}(C^{(k)} T_i) - \left(y - E_A^{(k)} \xi^{(k)}\right)^{\mathrm{T}} C^{(k)} T_i C^{(k)} \left(y - E_A^{(k)} \xi^{(k)}\right) \right\}. \tag{43}$$

Taking the second partial derivative of the Q function with respect to σ̂_i^{(k)}, that is, differentiating Equation (43) once more, yields

$$\left.\frac{\partial^2 Q(\sigma_i; \theta^{(k)})}{\partial \sigma_i\,\partial \hat{\sigma}_i^{(k)}}\right|_{\sigma_i = \hat{\sigma}_i^{(k)}} = 0.5\,\mathrm{tr}(C^{(k)} T_i C^{(k)} T_i) - \left(y - E_A^{(k)} \xi^{(k)}\right)^{\mathrm{T}} C^{(k)} T_i C^{(k)} T_i C^{(k)} \left(y - E_A^{(k)} \xi^{(k)}\right). \tag{44}$$

Taking the expectation of Equation (44) gives

$$E\!\left( \left.\frac{\partial^2 Q(\sigma_i; \theta^{(k)})}{\partial \sigma_i\,\partial \hat{\sigma}_i^{(k)}}\right|_{\sigma_i = \hat{\sigma}_i^{(k)}} \right) = -0.5\,\mathrm{tr}(C^{(k)} T_i C^{(k)} T_i). \tag{45}$$

The Fisher–Score algorithm replaces the information matrix of the Newton–Raphson algorithm with its expectation. Thus, the iterative formulae of the modified EM-VCE are

$$\hat{\sigma}_1^{(k+1)} = \hat{\sigma}_1^{(k)} + \frac{\left(y - E_A^{(k)} \xi^{(k)}\right)^{\mathrm{T}} C^{(k)} T_1 C^{(k)} \left(y - E_A^{(k)} \xi^{(k)}\right) - \mathrm{tr}(C^{(k)} T_1)}{\mathrm{tr}(C^{(k)} T_1 C^{(k)} T_1)}, \tag{46}$$

$$\hat{\sigma}_2^{(k+1)} = \hat{\sigma}_2^{(k)} + \frac{\left(y - E_A^{(k)} \xi^{(k)}\right)^{\mathrm{T}} C^{(k)} T_2 C^{(k)} \left(y - E_A^{(k)} \xi^{(k)}\right) - \mathrm{tr}(C^{(k)} T_2)}{\mathrm{tr}(C^{(k)} T_2 C^{(k)} T_2)}. \tag{47}$$

Obviously, the numerator of the second term in Equations (46) and (47) is not always greater than or equal to zero. When a positive value σ̂_i^{(0)} is given as the initial guess, only the denominator of the fraction in the initial iteration is guaranteed to be greater than zero. As the iterative process proceeds, σ̂_i^{(k)} may render the matrix C^{(k)}T_iC^{(k)} non-positive definite, which implies that a negative variance appears. Therefore, following Sun et al. (2003), adding non-negativity constraints makes it possible to estimate non-negative variance components, with σ̂_i^{(k)+} = max(σ̂_i^{(k)}, 0) in the modified version. This mainly constrains one of the elements and does not affect the estimation of the remaining elements, which ensures non-negative estimates.
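One scoring step of the modified EM-NN-VCE can thus be written as below (a sketch with illustrative names; the projection max(σ̂, 0) implements the non-negativity constraint just described):

```python
import numpy as np

def score_step(sig, C, T, r):
    """One Fisher-Score update for a single variance factor, Eqs. (46)-(47).
    sig : current estimate sigma_i^(k);  C : matrix of Eq. (23);
    T   : T_i of Eq. (42);               r : residual vector y - E_A xi."""
    num = r @ C @ T @ C @ r - np.trace(C @ T)   # quadratic form minus trace
    den = np.trace(C @ T @ C @ T)               # expected information, Eq. (45)
    return max(sig + num / den, 0.0)            # non-negativity projection
```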

4. Numerical results and analysis

To verify the feasibility and effectiveness of the proposed algorithms, two examples, i.e. linear regression and the four-parameter planar coordinate transformation, are employed. Seven algorithms, namely the LS method, the TLS method, REML (Koch 1986), positive-valued functions non-negative variance component estimation (PVFs-VCE) (Moghtased-Azar et al. 2014), non-negative least squares variance component estimation (NNLS-VCE) (Amiri-Simkooei 2016) and the two proposed approaches, are used to estimate the variance components for the partial EIV model. All seven algorithms adopted in this article are shown in Table 1.

Table 1. The design of seven algorithms.

4.1. Linear regression

The linear regression model is

$$y - e_y = (x - e_x)\xi_1 + \xi_2, \tag{48}$$

where (x,y) represents the coordinates in a planar coordinate system, ex and ey are the corresponding coordinate errors and ξ1 and ξ2 are the linear regression model parameters.

To separate the random and nonrandom elements of the coefficient matrix, the vector h and the matrix B are given as

$$h = \begin{bmatrix} h_1 \\ h_2 \end{bmatrix}, \qquad B = \begin{bmatrix} 1 \\ 0 \end{bmatrix} \otimes I_n, \tag{49}$$

where h₁ is an n×1 vector whose elements are all zero, and h₂ is an n×1 vector whose elements are all one.
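For this regression model, h and B of Equation (49) can be built and checked as follows (a sketch using the simulation settings of the next paragraph):

```python
import numpy as np

n = 20
x = np.linspace(0.9, 11, n)               # true abscissas (the random column)
A = np.column_stack([x, np.ones(n)])      # coefficient matrix of Eq. (48)

h = np.concatenate([np.zeros(n), np.ones(n)])        # h1 = 0, h2 = 1
B = np.kron(np.array([[1.0], [0.0]]), np.eye(n))     # B = [1 0]^T kron I_n
assert np.allclose(A.flatten(order="F"), h + B @ x)  # vec(A) = h + B a
```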

For the simulated data, 20 values are generated at equal intervals between 0.9 and 11 as the true vector x of coordinates. The true parameters of the linear regression model are ξ = [1.5  3]^T, from which the true vector y of coordinates is obtained. Furthermore, random errors with zero mean and covariance matrices 1.5P_x⁻¹ and 0.5P_y⁻¹ are added to the true values (x_i, y_i) of the 20 coordinate pairs. The simulated coordinates and the corresponding weights are shown in Table 2. For the purpose of comparing the relative performance of the different VCE methods, this example estimates the variance components in a situation where the variances are positive; in other words, it verifies that the newly proposed methods are able to estimate the variance components at all, which lays the foundation for the following study of negative variances.

Table 2. Coordinate observations and corresponding weights.

With the given simulated data, the parameters and variance components are estimated using the seven schemes with a convergence threshold of ε = 10⁻¹⁰; the results are listed in Table 3.

Table 3. Linear regression results with different methods.

σ̂₁ and σ̂₂ are the variance component estimates for the observation errors and the coefficient matrix errors, respectively. ‖ξ̂ − ξ̄‖ denotes the norm of the difference between the parameter estimates and the true values. MAE is the mean absolute error and describes the precision of the model prediction; its weighted form is

$$\mathrm{MAE} = \sum_{i=1}^{n} P_{y_i} |\hat{e}_{y_i}| \Big/ \sum_{i=1}^{n} P_{y_i},$$

where ê_{y_i} is the individual model-prediction error and P_{y_i} is the individual weight of the observations. As seen from Table 3, there is a considerable difference in the estimated parameters ξ depending on whether a VCE method is applied: the parameters based on VCE are closer to the true values, and the MAE of the estimation results of the VCE methods is also smaller in comparison with the LS and TLS methods. Regarding the number of iterations, Scheme 4 needs 38, while Schemes 3 and 5 need 43. Regarding the two proposed approaches, the convergence of Scheme 6 is relatively slow, with 121 iterations; however, Scheme 7 requires only 20 iterations, which is more efficient. The convergence of the VCE for Schemes 3, 6 and 7 is given in Figure 2. According to the results of this example, the EM-VCE and the modified EM-NN-VCE accord with the three previous VCE methods.
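The weighted MAE used in Table 3 is straightforward to reproduce (a one-function sketch with illustrative names):

```python
import numpy as np

def weighted_mae(e_hat, P_y):
    """Weighted mean absolute error: sum(P_yi * |e_hat_i|) / sum(P_yi)."""
    return np.sum(P_y * np.abs(e_hat)) / np.sum(P_y)
```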

Figure 1. Flowchart showing the implementation of the EM-VCE method for the partial EIV model.

Figure 2. The convergence of the estimates of two variance factors σ1 and σ2 for the linear regression. (a) Variance factors σ1 and σ2 estimated by the REML method; (b) Variance factors σ1 and σ2 estimated by the EM-VCE method; and (c) Variance factors σ1 and σ2 estimated by the modified EM-NN-VCE method.

4.2. Planar coordinate transformation

To further demonstrate the validity of the proposed algorithms in the presence of negative variance, a planar coordinate transformation model is adopted as follows:

$$\begin{bmatrix} X_1 \\ Y_1 \\ \vdots \\ X_n \\ Y_n \end{bmatrix} - \begin{bmatrix} e_{X_1} \\ e_{Y_1} \\ \vdots \\ e_{X_n} \\ e_{Y_n} \end{bmatrix} = \left( \begin{bmatrix} x_1 & -y_1 & 1 & 0 \\ y_1 & x_1 & 0 & 1 \\ \vdots & \vdots & \vdots & \vdots \\ x_n & -y_n & 1 & 0 \\ y_n & x_n & 0 & 1 \end{bmatrix} - \begin{bmatrix} e_{x_1} & -e_{y_1} & 0 & 0 \\ e_{y_1} & e_{x_1} & 0 & 0 \\ \vdots & \vdots & \vdots & \vdots \\ e_{x_n} & -e_{y_n} & 0 & 0 \\ e_{y_n} & e_{x_n} & 0 & 0 \end{bmatrix} \right) \begin{bmatrix} \beta_1 \\ \beta_2 \\ \beta_3 \\ \beta_4 \end{bmatrix}, \tag{50}$$

where (x_i, y_i) are the coordinates in the start coordinate system and (X_i, Y_i) are the coordinates in the target coordinate system; e_{x_i}, e_{y_i}, e_{X_i} and e_{Y_i} are the errors of the corresponding coordinates; the subscript i denotes the ith point; and β₁, β₂, β₃ and β₄ are the coordinate transformation parameters.

Considering the planar coordinate transformation in the partial EIV model, the vector h and the fixed matrix B associated with the coefficient matrix are

$$h = \begin{bmatrix} h_1 \\ h_2 \\ h_3 \\ h_4 \end{bmatrix}, \qquad B = \begin{bmatrix} B_1 \\ B_2 \\ B_3 \\ B_4 \end{bmatrix}, \tag{51}$$

where h₁ and h₂ are the 2n×1 zero vectors, and

$$h_3 = 1_{n\times1} \otimes \begin{bmatrix}1\\0\end{bmatrix},\quad h_4 = 1_{n\times1} \otimes \begin{bmatrix}0\\1\end{bmatrix},\quad B_1 = I_n \otimes \begin{bmatrix}1&0\\0&1\end{bmatrix},\quad B_2 = I_n \otimes \begin{bmatrix}0&-1\\1&0\end{bmatrix},\quad B_3 = B_4 = I_n \otimes \begin{bmatrix}0&0\\0&0\end{bmatrix}.$$
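A sketch of these blocks (our own illustration) for n points, with the random-element vector ordered point-wise as a = (x₁, y₁, …, x_n, y_n)^T, is:

```python
import numpy as np

def transformation_h_B(n):
    """h and B of Eq. (51) for the four-parameter transformation."""
    h3 = np.kron(np.ones(n), np.array([1.0, 0.0]))   # 1_n kron [1 0]^T
    h4 = np.kron(np.ones(n), np.array([0.0, 1.0]))   # 1_n kron [0 1]^T
    h = np.concatenate([np.zeros(2 * n), np.zeros(2 * n), h3, h4])

    B1 = np.kron(np.eye(n), np.array([[1.0, 0.0], [0.0, 1.0]]))
    B2 = np.kron(np.eye(n), np.array([[0.0, -1.0], [1.0, 0.0]]))
    B3 = np.zeros((2 * n, 2 * n))
    B4 = np.zeros((2 * n, 2 * n))
    return h, np.vstack([B1, B2, B3, B4])

# With a = (x1, y1, ..., xn, yn)^T, h + B a reproduces vec(A) for the
# coefficient matrix of Eq. (50).
```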

Given 15 sets of true coordinates in the start coordinate system and 4 transformation parameters, the true coordinates of the target coordinate system are obtained by connecting the two coordinate systems. Then, random errors with zero mean and covariance matrices 1.5P_a⁻¹ and 0.5P_y⁻¹ are added to the two coordinate systems. A set of coordinates containing errors is listed in Table 4.

Table 4. Coordinate simulation values of two sets of coordinate systems.

Assuming that the true values of the transformation parameters are β = [0.9  0.6  10  15]^T, the weights of the two coordinate systems are as follows:

$$P_a = \mathrm{diag}([\,1\ 1\ 1\ 3\ 3\ 3\ 2\ 2\ 2\ 4\ 4\ 4\ 6\ 6\ 6\ 5\ 5\ 5\ 7\ 7\ 7\ 9\ 9\ 9\ 8\ 8\ 8\ 12\ 12\ 12\,]),$$

$$P_y = \mathrm{diag}([\,11\ 11\ 11\ 8\ 8\ 8\ 10\ 10\ 10\ 7\ 7\ 7\ 6\ 6\ 6\ 4\ 4\ 4\ 5\ 5\ 5\ 2\ 2\ 2\ 3\ 3\ 3\ 1\ 1\ 1\,]).$$

In Table 5, a negative variance component appears for Scheme 3. Scheme 5 imposes non-negativity constraints based on the LS-VCE theory, so its variance factors are non-negative. Scheme 4 fails to converge because of its improper stochastic model. Also, Scheme 3 lacks good estimation results in Table 5; its MAE is even worse than those of LS and TLS. The MAEs of the estimation results of the non-negative VCE methods differ negligibly owing to the small amplitude of the variance component obtained. In this article, the convergence threshold is set to ε = 10⁻¹⁰; the variance components of Scheme 6 are close to those of Schemes 5 and 7. The convergence of the variance components of Schemes 3, 5 and 7 is shown in Figure 3. Nineteen iterations are needed for the REML; however, only 10 and 11 iterations are required for the NNLS-VCE and the modified EM-NN-VCE, respectively.

Figure 3. The convergence of the estimates of two variance factors σ1 (left column) and σ2 (right column) for the coordinate transformation. (a) Variance factor σ1 estimated by the REML method; (b) Variance factor σ2 estimated by the REML method; (c) Variance factor σ1 estimated by the NNLS-VCE method; (d) Variance factor σ2 estimated by the NNLS-VCE method; (e) Variance factor σ1 estimated by the modified EM-NN-VCE method; and (f) Variance factor σ2 estimated by the modified EM-NN-VCE method.

Table 5. Coordinate transformation results with different methods.

In Table 6, different convergence thresholds are set for Scheme 6. As the threshold decreases, both variance components converge toward limiting values: the estimated variance factor σ̂₁ approaches zero, while the variance factor σ̂₂ converges to a positive value. This approach guarantees the non-negativity of the variance components. In fact, when the variance factor σ̂₁ approaches zero, the likelihood function l_M(θ; y_c) of the complete data y_c approaches infinity. In that case, the variance components estimated by EM-VCE tend to converge slowly, and maximizing the likelihood function is actually an ill-posed problem. Thereby, approximating this variance component by zero is feasible, and the EM-VCE can be applied to verify the justification of the existing non-negative VCE methods.

Table 6. Results of Scheme 6 with different convergence thresholds.

5. Conclusions

In some geodetic applications, negative variances may appear when correcting the stochastic model with variance factors. Many attempts, such as PVFs-VCE and NNLS-VCE, have been made to counteract this effect; in these methods the problem is avoided by introducing positive-valued functions or non-negativity constraints. To investigate the statistical properties of the solutions, however, an unconstrained approach is required. Based on the standard theory of the EM algorithm, we develop two new methods of non-negative VCE for the partial EIV model; an iterative TLS algorithm is obtained simultaneously. The EM-VCE can demonstrate statistically that, when negative estimates appear, the maximum likelihood estimation is essentially an ill-posed problem and the variance component estimates move to the edge of the parameter space. A modified version of the VCE is also presented, which improves the computational efficiency and guarantees non-negative estimates. Both algorithms are free from complex matrix inversions. Finally, two numerical examples, i.e. linear regression and the four-parameter planar coordinate transformation, are employed to verify the feasibility of the proposed algorithms. The results of the linear regression show that the proposed algorithms achieve more reasonable fixed-parameter estimates than the LS and TLS algorithms, and that the variance component estimates are identical to those of REML, PVFs-VCE and NNLS-VCE; the modified EM algorithm, namely EM-NN-VCE, is nearly fifty percent faster. In the second example, the two variance factors estimated by the proposed algorithms are simultaneously non-negative. This study provides a statistical explanation for the justifiability of the existing non-negative VCE methods.

Disclosure statement

No potential conflict of interest was reported by the authors.

Additional information

Funding

This research is supported by the National Natural Science Foundation of China (Nos. 41874001 and 41664001); Support Program for Outstanding Youth Talents in Jiangxi Province (No. 20162BCB23050); National Key Research and Development Program (No. 2016YFB0501405).

References

  • Amiri-Simkooei AR. 2013. Application of least squares variance component estimation to errors-in-variables models. J Geod. 87(10–12):935–944.
  • Amiri-Simkooei AR. 2016. Non-negative least-squares variance component estimation with application to GPS time series. J Geod. 90(5):451–466.
  • Amiri-Simkooei AR, Jazaeri S. 2012. Weighted total least squares formulated by standard least squares theory. J Geod Sci. 2(2):113–124.
  • Calvin JA. 1993. REML estimation in unbalanced multivariate variance components models using an EM algorithm. Biometrics. 49(3):691–701.
  • Dempster AP, Laird NM, Rubin DB. 1977. Maximum likelihood from incomplete data via the EM algorithm. J R Stat Soc Ser B. 39(1):1–38.
  • Diffey SM, Smith AB, Welsh AH, Cullis BR. 2017. A new REML (parameter expanded) EM algorithm for linear mixed models. Aust N Z J Stat. 59(4):433–448.
  • El Leithy HA, Abdel Wahed ZA, Abdallah MS. 2016. On non-negative estimation of variance components in mixed linear models. J Adv Res. 7(1):59–68.
  • Fang X. 2011. Weighted total least squares solutions for applications in geodesy. Hanover, Germany: Leibniz Universität Hannover.
  • Fang X. 2013. Weighted total least squares: necessary and sufficient conditions, fixed and random parameters. J Geod. 87(8):733–749.
  • Fang X, Li B, Alkhatib H, Zeng W, Yao Y. 2017. Bayesian inference for the errors-in-variables model. Stud Geophys Geod. 61(1):35–52.
  • Foulley JL, Jaffrézic F, Robert-Granié C. 2000. EM-REML estimation of covariance parameters in Gaussian mixed models for longitudinal data analysis. Genet Sel Evol. 32(2):129–141.
  • Franc V, Hlaváč V, Navara M. 2005. Sequential coordinate-wise algorithm for the non-negative least squares problem. In: Gagalowicz A, Philips W (eds) Computer analysis of images and patterns, Lecture Notes in Computer Science. p. 407–414.
  • Golub GH, Van Loan CF. 1980. An analysis of the total least squares problem. SIAM J Numer Anal. 17(6):883–893.
  • Grafarend EW, Schaffrin B. 1993. Ausgleichungsrechnung in linearen modellen. Mannheim, Germany: BI-Wissenschaftsverlag.
  • Groeneveld E. 1994. A reparameterization to improve numerical optimization in multivariate REML (co)variance component estimation. Genet Sel Evol. 26(6): 537–545.
  • Gupta MR, Chen Y. 2010. Theory and use of the EM algorithm. FNT Signal Process. 4(3):223–296.
  • Helmert FR. 1907. Die Ausgleichungsrechnung nach der Methode der kleinsten Quadrate. 2. Auflage. Leipzig, Germany: Teubner.
  • Hsu R. 1999. An alternative expression for the variance factors in using iterated almost unbiased estimation. J Geod. 73(4):173–179.
  • Jazaeri S, Amiri-Simkooei AR, Sharifi MA. 2014. Iterative algorithm for weighted total least squares adjustment. Surv Rev. 46(334):19–27.
  • Jennrich RI, Sampson PF. 1976. Newton-Raphson and related algorithms for maximum likelihood variance component estimation. Technometrics. 18(1):11–17.
  • Kargoll B, Omidalizarandi M, Loth I, Paffenholz J-A, Alkhatib H. 2018. An iteratively reweighted least-squares approach to adaptive robust adjustment of parameters in linear regression models with autoregressive and t-distributed deviations. J Geod. 92(3):271–297.
  • Koch KR. 1986. Maximum likelihood estimate of variance components. Bull Geod. 60(4):329–338.
  • Koch KR. 1999. Parameter estimation and hypothesis testing in linear models. Berlin, Germany: Springer-Verlag.
  • Koch KR. 2013. Robust estimation by expectation maximization algorithm. J Geod. 87(2):107–116.
  • Koch KR, Kargoll B. 2013. Expectation maximization algorithm for the variance-inflation model by applying the t-distribution. J Appl Geod. 7(3):217–225.
  • Laird NM, Ware JH. 1982. Random-effects models for longitudinal data. Biometrics. 38(4):963–974.
  • Lange KL, Little RJA, Taylor JMG. 1989. Robust statistical modeling using the t distribution. J Am Stat Assoc. 84(408):881–896.
  • Lee J, Khuri AI. 2001. Modeling the probability of a negative ANOVA estimate of a variance component. CSAB. 51(1–2):31–45.
  • Little RJA, Rubin DB. 2002. Statistical analysis with missing data. 2nd ed. Hoboken (NJ): Wiley.
  • Liu JN, Zeng WX, Xu PL. 2013. Overview of total least squares methods. Geom Inform Sci. 38(5):505–512.
  • Mclachlan GJ, Krishnan T. 2008. The EM algorithm and extensions. 2nd ed. Hoboken (NJ): Wiley.
  • Moghtased-Azar K, Tehranchi R, Amiri-Simkooei AR. 2014. An alternative method for non-negative estimation of variance components. J Geod. 88(5):427–439.
  • Peng J. 2009. Jointly robust estimation of unknown parameters and variance components based on expectation-maximization algorithm. J Surv Eng. 135(1):1–9.
  • Schaffrin B, Wieser A. 2008. On weighted total least-squares adjustment for linear regression. J Geod. 82(7):415–421.
  • Searle SR, Casella G, McCulloch CE. 1992. Variance components. New York (NY): Wiley.
  • Sjöberg LE. 2011. On the best quadratic minimum bias non-negative estimator of a two-variance component model. J Geod Sci. 1(3):280–285.
  • Sun Y, Sinha BK, Rosen DV, Meng Q. 2003. Nonnegative estimation of variance components in multivariate unbalanced mixed linear models with two variance components. J Stat Plan Infer. 115(1):215–234.
  • Teunissen PJG, Amiri-Simkooei AR. 2008. Least-squares variance component estimation. J Geod. 82(2):65–82.
  • Thompson WA, Jr. 1962. The problem of negative estimates of variance components. Ann Math Statist. 33(1):273–289.
  • Vergos GS, Tziavos IN, Sideris MG. 2012. On the determination of sea level changes by combining altimetric, tide gauge, satellite gravity and atmospheric observations. Heidelberg, Berlin: Springer.
  • Wang L. 2012. Properties of the total least squares estimation. Geod Geodyn. 3(4):39–46.
  • Wang L, Ding R. 2020. Inversion and precision estimation of earthquake fault parameters based on scaled unscented transformation and hybrid PSO/Simplex algorithm with GPS measurement data. Measurement. 153:107422.
  • Wang L, Wen G, Zhao Y. 2019. Virtual observation method and precision estimation for ill-posed partial EIV model. J Surv Eng. 145(4):04019010.
  • Wang L, Xu G. 2016. Variance component estimation for partial errors-in-variables models. Stud Geophys Geod. 60(1):35–55.
  • Wang L, Yu H, Chen X. 2016. An algorithm for partial EIV model. Acta Geod Cartographica Sin. 45(1):22–29.
  • Wang L, Zhao Y. 2018. Scaled unscented transformation for nonlinear error propagation:accuracy, sensitivity and applications. J Surv Eng. 144(1):04017022.
  • Wang L, Zhao Y. 2019. Second order approximating function method for precision estimation of total least squares. J Surv Eng. 145(1):04018011.
  • Wang L, Zou C. 2019. Accuracy analysis and applications of Sterling interpolation method for nonlinear function error propagation. Measurement. 146:55–64.
  • Xu P, Liu Y, Shen Y, Fukuda Y. 2007. Estimability analysis of variance and covariance components. J Geod. 81(9):593–602.
  • Xu P, Liu J, Shi C. 2012. Total least squares adjustment in partial errors-in-variables models: algorithm and statistical analysis. J Geod. 86(8):661–675.
  • Xu P, Shen Y, Fukuda Y, Liu Y. 2006. Variance component estimation in linear inverse ill-posed models. J Geod. 80(2):69–81.
  • Zhao J, Guo F, Li Q. 2019. Fisher-Score algorithm of WTLS estimation for PEIV model. Geom Inform Sci Wuhan Univ. 44(2):59–65.
  • Zhou H, Hu L, Zhou J, Lange K. 2015. MM algorithms for variance components models. J Comput Graph Stat. arXiv:1509.07426.