Abstract
Standard generalised method of moments (GMM) estimation was developed for nonsingular systems of moment conditions. However, many important economic models are characterised by singular systems of moment conditions. This paper shows that efficient GMM estimation of such models can be achieved by using a reflexive generalised inverse, in particular the Moore–Penrose generalised inverse, of the variance matrix of the sample moment conditions as the weighting matrix. We provide an estimator of the optimal weighting matrix and establish its consistency. Potential issues of using generalised inverses and some remedies are also discussed.
1. Introduction
Over the past several decades, a great deal of statistical effort has been devoted to inference for moment condition models, i.e., models where the linkage between parameter and data is specified through a set of moment restrictions (also known as estimating equations). Technically, a moment condition model specifies that the data generating process of the observations $X_1, \dots, X_n$ satisfies
(1) $\mathrm{E}\, g(X_i, \theta_0) = 0,$
where $g$ is a $K$-dimensional vector-valued known function, $\theta$ is the $p$-dimensional parameter of interest, and $\theta_0$ is its true value. The popularity of moment condition models is partially due to the fact that a parametric likelihood form may be too strong for many real applications or scientific theories. When the dimension of the parameter of interest equals the number of moment conditions, the parameter is said to be just-identified, and the classical method of moments can be applied for parameter estimation. In practice, a majority of the moment condition models investigated by applied researchers, such as models for asset pricing and dynamic panel data, are over-identified. The generalised method of moments (GMM) of Hansen (Citation1982) is one of the most popular techniques designed for the estimation of over-identified moment condition models (see, e.g., Hansen & West, Citation2002 and Hall, Citation2005).
Like many other classical statistical methods, GMM comes at the price of a set of regularity conditions that warrant its validity. Although in most applications those regularity conditions are not binding, some of them can be violated in interesting circumstances. This paper is concerned with efficient GMM estimation when one of the regularity conditions of standard GMM, namely that the covariance matrix of the moment vector evaluated at the true parameter be of full rank, is violated. A typical violation of this kind appears when the system of moment conditions is singular, i.e., some components of the moment functions are linear combinations of the others.
Singular systems of moment conditions exist in a wide variety of economic studies, such as the consumer expenditure function analysis (Barten, Citation1969, Citation1977), the market share analysis (Rao, Citation1972; Weiss, Citation1968), the production function estimation (Dhrymes, Citation1962), the translog utility function analysis (Berndt & Christensen, Citation1974), the linearised dynamic stochastic general equilibrium (DSGE) modelling (Bierens, Citation2007; Ireland, Citation2004), the errors-in-variables analysis with panel data (Biørn, Citation2000; Biørn & Klette, Citation1998; Wansbeek, Citation2001; Xiao, Shao, & Palta, Citation2010a, Citation2010b; Xiao, Shao, Xu, & Palta, Citation2007), the multivariate random-effects meta-analysis models (Chen, Hong, & Riley, Citation2014; Riley, Abrams, Lambert, Sutton, & Thompson, Citation2007) and the non-Gaussian ARMA models (Alessi, Barigozzi, & Capasso, Citation2011; Leeper, Walker, & Yang, Citation2013; Mountford & Uhlig, Citation2009; Velasco & Lobato, Citation2018).
In a linear regression model with known singular disturbance covariance matrix, Theil (Citation1971) showed that a generalised Aitken-like estimator using the Moore–Penrose generalised inverse is best linear unbiased. Following Theil (Citation1971), Kreijger and Neudecker (Citation1977) proposed two optimality criteria to obtain best linear unbiased estimators. Within the same context, Dhrymes and Schwarz (Citation1987) discussed the existence issue of the estimators using generalised inverses. Haupt and Oberhofer (Citation2006) proposed an estimator which does not use the generalised inverses and allows for additional exogenous restrictions, collinearities and generalised adding-up. Bierens and Swanson (Citation2000) and Bierens (Citation2007) suggested that one can obtain parameter estimate by maximising the information content of the singular system. Ireland (Citation2004) and Lai (Citation2008) proposed adding random noises to the singular system to implement maximum likelihood estimation.
In the GMM literature, White (Citation1986) showed that if the estimating function $g = (g_1', g_2')'$ is of the form such that: (i) the variance matrix of $g_1(X, \theta_0)$ is nonsingular and (ii) the components of $g_2$ are linear combinations of those of $g_1$, then the efficient GMM estimator is the minimiser of
(2) $Q_n(\theta) = \bar g_n(\theta)' \Omega^- \bar g_n(\theta),$
where $\bar g_n(\theta) = n^{-1}\sum_{i=1}^n g(X_i, \theta)$ and $\Omega^-$ is a reflexive generalised inverse of
(3) $\Omega = \mathrm{E}[g(X, \theta_0)\, g(X, \theta_0)'].$
However, in practice, the aforementioned representation of g is generally not readily obtainable (see, e.g., Schneeweiss, Citation2014; Velasco & Lobato, Citation2018; Xiao et al., Citation2010b).
The purpose of this article is to develop an efficient GMM estimator for a singular system of moment conditions with general form. An earlier effort appeared in Xiao (Citation2008), which proposed using reflexive generalised inverses to deal with the singularity. Schneeweiss (Citation2014) independently discussed similar ideas.
The rest of the paper is organised as follows. In Section 2, we briefly review the GMM methodology, the concepts of generalised inverses and some results of the reflexive generalised inverses. We present our main result in Section 3. Section 4 discusses further issues such as the estimation of optimal weighting matrix and the method of adding noises, and Section 5 concludes. Proofs of results are relegated to the Appendix.
2. GMM and generalised inverses
We first give a brief introduction to the standard GMM method. For a book-length detailed account, see Hall (Citation2005). For simplicity we assume that the data are i.i.d. Assume also that K>p, i.e., the model is over-identified. Since the number of restrictions on the parameter is greater than the dimension of the parameter, in general it is impossible to obtain an estimator of the parameter by the method of moments, i.e., by setting the sample moment $\bar g_n(\theta) = n^{-1}\sum_{i=1}^n g(X_i, \theta)$ equal to zero. The idea of GMM by Hansen (Citation1982) is to minimise a quadratic norm of $\bar g_n(\theta)$:
(4) $Q_n(\theta) = \bar g_n(\theta)' W_n \bar g_n(\theta),$
where $W_n$ is a positive semidefinite matrix. Under a set of regularity conditions, including that $\Omega$ is positive definite, and assuming $W_n$ converges in probability to a positive semi-definite matrix W, the minimiser $\hat\theta(W)$ of (Equation4) is a consistent estimator of $\theta_0$ and has limiting distribution
$\sqrt{n}\,(\hat\theta(W) - \theta_0) \to_d N(0, V(W)),$
where $V(W) = (G'WG)^{-1} G'W\Omega WG\, (G'WG)^{-1}$ with $G = \mathrm{E}[\partial g(X, \theta_0)/\partial\theta']$. The lower bound of $V(W)$ is achieved at $W = \Omega^{-1}$, i.e., $V(W) \ge V(\Omega^{-1}) = (G'\Omega^{-1}G)^{-1}$ in the sense that $V(W) - V(\Omega^{-1})$ is nonnegative definite, for any W. In practice, a consistent estimator of $\Omega^{-1}$ can be set as
(5) $\hat\Omega^{-1} = \Big[n^{-1}\sum_{i=1}^n g(X_i, \tilde\theta)\, g(X_i, \tilde\theta)'\Big]^{-1},$
where $\tilde\theta$ is a consistent estimator of $\theta_0$. A typical choice of $\tilde\theta$ is a GMM estimator with $W_n = I_K$, the identity matrix of order K. Note that $\hat\Omega^{-1}$ converges in probability to $\Omega^{-1}$ because $\hat\Omega$ converges in probability to $\Omega$ and, more importantly, $\Omega$ is positive definite.
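To make the two-step recipe above concrete, here is a minimal numerical sketch in Python; the data generating process, moment functions and sample size are hypothetical choices for illustration, not from the paper. A scalar parameter $\theta_0 = 2$ is estimated from the over-identified system $\mathrm{E}(X - \theta) = 0$, $\mathrm{E}(X^2 - \theta^2 - 1) = 0$ with $X \sim N(\theta_0, 1)$:

```python
import numpy as np
from scipy.optimize import minimize_scalar

rng = np.random.default_rng(0)
x = rng.normal(loc=2.0, scale=1.0, size=5000)  # X ~ N(theta_0, 1), theta_0 = 2

def g(theta):
    # K = 2 moment functions for the p = 1 parameter theta (over-identified)
    return np.column_stack([x - theta, x**2 - theta**2 - 1.0])

def Q(theta, W):
    gbar = g(theta).mean(axis=0)   # sample moment \bar g_n(theta)
    return gbar @ W @ gbar         # quadratic form (4)

# Step 1: preliminary estimator with W_n = I_K
theta1 = minimize_scalar(Q, args=(np.eye(2),), bounds=(0.0, 4.0), method="bounded").x

# Step 2: efficient estimator with W_n = Omega_hat^{-1} as in (5)
G1 = g(theta1)
Omega_hat = G1.T @ G1 / len(x)
theta2 = minimize_scalar(Q, args=(np.linalg.inv(Omega_hat),),
                         bounds=(0.0, 4.0), method="bounded").x
```

Both `theta1` and `theta2` should lie within sampling error of the true value 2.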
Next we review the concepts of generalised inverses of a matrix and some of their properties.
Definition 2.1
Let A be an $m \times n$ real matrix. An $n \times m$ real matrix $A^-$ may have one or all of the following properties:
(i) $A A^- A = A$;
(ii) $A^- A A^- = A^-$;
(iii) $(A A^-)' = A A^-$;
(iv) $(A^- A)' = A^- A$.
If $A^-$ satisfies (i), it is called a generalised inverse of A; if $A^-$ satisfies (i) and (ii), it is called a reflexive generalised inverse (or $\{1,2\}$-inverse) of A; if $A^-$ satisfies (i)–(iv), it is called the Moore–Penrose generalised inverse of A. The Moore–Penrose generalised inverse of a matrix A is unique and is denoted by $A^+$ hereafter.Footnote1
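The four Penrose conditions are easy to check numerically. The following sketch (with an arbitrary rank-deficient matrix chosen for illustration) verifies that `numpy.linalg.pinv` returns a matrix satisfying (i)–(iv):

```python
import numpy as np

rng = np.random.default_rng(1)
# An arbitrary rank-deficient 4 x 3 matrix: the third column is the sum of the first two
A = rng.normal(size=(4, 2))
A = np.hstack([A, A.sum(axis=1, keepdims=True)])

Ap = np.linalg.pinv(A)  # Moore-Penrose generalised inverse

assert np.allclose(A @ Ap @ A, A)        # (i)   A A^- A = A
assert np.allclose(Ap @ A @ Ap, Ap)      # (ii)  A^- A A^- = A^-
assert np.allclose((A @ Ap).T, A @ Ap)   # (iii) A A^- symmetric
assert np.allclose((Ap @ A).T, Ap @ A)   # (iv)  A^- A symmetric
```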
We list some of the important properties of the generalised inverses in the following two propositions, proofs of which can be obtained by direct verification and are therefore omitted.Footnote2 Proposition 2.1 states that when the matrix of interest has a natural factorisation with certain structure, some of its generalised inverses can be easily derived.
Proposition 2.1
(i) Let $\Omega = P D Q$, where $P$ and $Q$ are nonsingular square matrices, and let $D^-$ be a reflexive generalised inverse of $D$. Then $Q^{-1} D^- P^{-1}$ is a reflexive generalised inverse of $\Omega$.
(ii) Let $\Omega = B \Sigma B'$, where B is of full column rank and $\Sigma$ is nonsingular. Then any reflexive generalised inverse $\Omega^-$ of $\Omega$ satisfies $B' \Omega^- B = \Sigma^{-1}$. Moreover, we have $\Omega \Omega^- B = B$ and $B' \Omega^- \Omega = B'$.
Proposition 2.2 points out that the generalised inverses (including the reflexive generalised inverses) are not unique and can be obtained by using the singular value decomposition of the matrix.Footnote3
Proposition 2.2
Let $\Omega$ be an $m \times n$ real valued matrix with rank r>0. Suppose that the singular value decomposition of $\Omega$ is $\Omega = S \Lambda T'$, where S is $m \times m$ with $S'S = I_m$, T is $n \times n$ with $T'T = I_n$, and
$\Lambda = \begin{pmatrix} \Lambda_r & O \\ O & O \end{pmatrix},$
with $\Lambda_r$ the $r \times r$ diagonal matrix of the positive singular values of $\Omega$ and O the matrices of zeros. Then
(i)
(6) $\Omega^- = T \begin{pmatrix} \Lambda_r^{-1} & X \\ Y & Z \end{pmatrix} S'$
is a generalised inverse of $\Omega$, where X, Y and Z are arbitrary real valued matrices with appropriate dimensions.
(ii)
(7) $\Omega^- = T \begin{pmatrix} \Lambda_r^{-1} & X \\ Y & Y \Lambda_r X \end{pmatrix} S'$
is a reflexive generalised inverse of $\Omega$, where X, Y are arbitrary real valued matrices with appropriate dimensions.
(iii)
(8) $\Omega^+ = T \begin{pmatrix} \Lambda_r^{-1} & O \\ O & O \end{pmatrix} S'$
is the Moore–Penrose generalised inverse of $\Omega$.
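Proposition 2.2 can also be verified numerically. The sketch below builds a singular symmetric matrix $\Omega$ (an arbitrary rank-3 example), forms the reflexive generalised inverse (7) from its singular value decomposition with randomly chosen blocks X and Y, and checks the defining properties:

```python
import numpy as np

rng = np.random.default_rng(2)
# An arbitrary 5 x 5 symmetric matrix Omega with rank r = 3
H = rng.normal(size=(5, 3))
Omega = H @ H.T
r = 3

S, sv, Tt = np.linalg.svd(Omega)     # Omega = S diag(sv) T'
T = Tt.T

# Blocks of (7): Lambda_r^{-1}, free X and Y, and the (2,2) block Y Lambda_r X
Lr = np.diag(sv[:r])
X = rng.normal(size=(r, 2))
Y = rng.normal(size=(2, r))
M = np.block([[np.linalg.inv(Lr), X], [Y, Y @ Lr @ X]])
Ominus = T @ M @ S.T                 # reflexive generalised inverse of Omega

assert np.allclose(Omega @ Ominus @ Omega, Omega)    # generalised inverse
assert np.allclose(Ominus @ Omega @ Ominus, Ominus)  # reflexivity
```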
White's (Citation1986) result on GMM estimation with singular moment conditions can be stated as follows:
Theorem 2.1
(White, Citation1986)
Suppose there exists a matrix $\Delta$ such that $g(X, \theta) = \Delta g_1(X, \theta)$ and $\Omega = \Delta \Omega_1 \Delta'$, where $\Delta$ is of full column rank and $\Omega_1 = \mathrm{Var}(g_1(X, \theta_0))$ is $r \times r$ positive definite with $r < K$. Then
(i) For any reflexive generalised inverse $\Omega^-$ of $\Omega$, $G' \Omega^- G$ is independent of the choice of $\Omega^-$, and $V(\Omega^-) = (G' \Omega^- G)^{-1}$.
(ii) For any reflexive generalised inverse $\Omega^-$ of $\Omega$, and for any W, $V(W) \ge V(\Omega^-)$.
Hence $\Omega^-$ is an optimal weighting matrix. In practice, one may choose the Moore–Penrose generalised inverse $\Omega^+$ of $\Omega$, or a special reflexive generalised inverse such as $\mathrm{diag}(\Omega_1^{-1}, O)$ when $\Delta = (I_r, A')'$.
Remark
Note that $V(\Omega^-)$, the asymptotic covariance matrix of the optimal GMM estimator, does not depend on $\Delta$. The basic idea of Theorem 2.1 is as follows. Suppose we have two sets of instrumental variables, say $Z_1$ and $Z_2$, such that the components of $Z_1$ are linearly independent and $Z_2$ is a linear combination of $Z_1$, i.e., $Z_2 = \alpha' Z_1$ for some constant vector $\alpha$. Then one can ignore $Z_2$ and use $Z_1$ only as instruments; in doing so we achieve the same asymptotic efficiency as using $(Z_1', Z_2')'$. The limitation of this result is that to apply this method we have to sort all instrumental variables into two groups, such that the instrumental variables in one group are linear combinations of those in the other group. This can be very tedious in practice. For example, in panel data models there are often a very large number of instruments, and it is in general impossible to sort them out.
3. Main results
We now establish some basic results about random vectors with singular covariance matrices.
Lemma 3.1
Let Y be an $m \times 1$ random vector and r be the rank of the covariance matrix of Y. Suppose that r<m. Then
(i) There exists an r-dimensional subvector $Y_1$ of Y such that its covariance matrix $\mathrm{Var}(Y_1)$ is positive definite. The vector $Y_1$ is called an essential subvector of Y.
(ii) Let $Y_2$ be the $(m-r) \times 1$ vector consisting of the remaining components of Y. Then there exist an $(m-r) \times r$ constant matrix C and an $(m-r) \times 1$ constant vector d such that $Y_2 = C Y_1 + d$ w.p.1., where w.p.1. means 'with probability one'. Hence there exist an $m \times r$ constant matrix B of full column rank and an $m \times 1$ constant vector b such that $Y = B Y_1 + b$ w.p.1.
(iii) If $\mathrm{E}\,Y = 0$, then b in (ii) is the zero vector, i.e., $Y = B Y_1$ w.p.1.
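Lemma 3.1 can be illustrated numerically. In the sketch below (the linear dependence structure is a hypothetical example), a four-dimensional random vector has rank-2 covariance; its first two components form an essential subvector, and the matrix B and vector b of part (ii) are recovered by exact linear regression:

```python
import numpy as np

rng = np.random.default_rng(4)
n = 2000
# Hypothetical example: Y has m = 4 components but rank-2 covariance, since
# Y3 = Y1 + Y2 and Y4 = 2*Y1 - Y2 + 1 hold with probability one
y1 = rng.normal(size=n)
y2 = rng.normal(size=n)
Y = np.column_stack([y1, y2, y1 + y2, 2.0 * y1 - y2 + 1.0])

Gamma = np.cov(Y, rowvar=False)
r = np.linalg.matrix_rank(Gamma)
assert r == 2

# The first two components form an essential subvector: Var(Y1) is positive definite
Y1 = Y[:, :2]
assert np.all(np.linalg.eigvalsh(np.cov(Y1, rowvar=False)) > 1e-8)

# Recover B and b of part (ii) by exact linear regression of Y on (Y1, 1)
Xd = np.column_stack([Y1, np.ones(n)])
coef, *_ = np.linalg.lstsq(Xd, Y, rcond=None)
B, b = coef[:2].T, coef[2]
assert np.allclose(Y, Y1 @ B.T + b)    # Y = B Y1 + b with probability one
```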
Theorem 3.1 is the main result of this paper.
Theorem 3.1
Consider GMM estimation for model (Equation1) with $\Omega$ defined by (Equation3). Suppose it is known that the components of $g(X, \theta_0)$ are linearly dependent (with probability one). Then any reflexive generalised inverse of $\Omega$ is an optimal weighting matrix. In particular, we can use the Moore–Penrose generalised inverse $\Omega^+$.
Let $\Omega^-$ be an arbitrary reflexive generalised inverse of $\Omega$. Then the asymptotic variance matrix of the GMM estimator using $\Omega^-$ as the weighting matrix is $(G' \Omega^- G)^{-1}$, where $G = \mathrm{E}[\partial g(X, \theta_0)/\partial\theta']$. A natural question is whether $G' \Omega^- G$ is a constant matrix independent of the choice of $\Omega^-$. The answer is yes. To see this, suppose the essential subvector of $g(X, \theta_0)$ is $g_1(X, \theta_0)$, with $\Omega_1 = \mathrm{Var}(g_1(X, \theta_0))$. Let $G_1 = \mathrm{E}[\partial g_1(X, \theta_0)/\partial\theta']$ and let B be the full column rank matrix of Lemma 3.1 such that $g(X, \theta_0) = B g_1(X, \theta_0)$ w.p.1. Then we have $G = B G_1$ and $\Omega = B \Omega_1 B'$, hence
$G' \Omega^- G = G_1' B' \Omega^- B G_1 = G_1' \Omega_1^{-1} G_1,$
which is independent of the choice of $\Omega^-$, as the essential subvector and the corresponding matrices $G_1$ and $\Omega_1$ are unrelated to $\Omega^-$. We also see that $G' \Omega^- G$ remains the same if we use another essential subvector, as $G' \Omega^- G$ is unrelated to the choice of the essential subvectors. More details can be found in the proof of Theorem 3.1 in the Appendix.
Judging from the asymptotic distributions, we can see that GMM estimation using moment conditions (Equation1) and a reflexive generalised inverse as the weighting matrix is asymptotically equivalent to the efficient GMM using the moment conditions $\mathrm{E}\, g_1(X_i, \theta_0) = 0$. In some situations, one can figure out the essential subvector $g_1$, and then efficient GMM estimation can be based on $g_1$ directly. For instance, in the errors-in-variables analysis of panel data, Xiao et al. (Citation2010a) and Xiao et al. (Citation2010b) found that one can obtain $g_1$ by using the singular value decomposition. However, such a simple decomposition is not available in general, and it can be very inconvenient, if not impossible, to find the essential subvector $g_1$. Theorem 3.1 tells us that whatever this subvector is, the GMM estimator using any of the reflexive generalised inverses, and the Moore–Penrose generalised inverse in particular, as the weighting matrix will always have the same asymptotic variance as the efficient GMM based on $g_1$.
4. Further issues
4.1. Optimal weighting matrix estimation
Now we discuss consistent estimation of $\Omega^+$. Let $\tilde\theta$ be a consistent estimator of $\theta_0$ and $\hat\Omega = n^{-1}\sum_{i=1}^n g(X_i, \tilde\theta)\, g(X_i, \tilde\theta)'$. Then $\hat\Omega \to \Omega$ in probability under normal regularity conditions. A natural candidate estimator of $\Omega^+$ is $\hat\Omega^+$.
It is well known that if a sequence of nonsingular square matrices $\{A_n\}$ converges to a nonsingular square matrix A, then $A_n^{-1} \to A^{-1}$.Footnote4 However, if A is singular and $A_n \to A$, we may not necessarily have $A_n^+ \to A^+$.Footnote5 Assuming $A_n \to A$ and A is singular, a necessary and sufficient condition for $A_n^+ \to A^+$ is the following:
Theorem 4.1
(Stewart, Citation1969)
Let $\{A_n\}$ be a sequence of real $m \times n$ matrices converging to an $m \times n$ matrix A. Then $A_n^+ \to A^+$ if and only if $\mathrm{rank}(A_n) = \mathrm{rank}(A)$ for n large enough.
We now prove that $\hat\Omega^+$ is a consistent estimator of $\Omega^+$. By Theorem 4.1, we need only show that $\mathrm{rank}(\hat\Omega) = \mathrm{rank}(\Omega)$ when n is large enough. By Lemma 3.1, there exists a constant matrix B of full column rank such that, w.p.1., $g(X_i, \theta) = B g_1(X_i, \theta)$. Since the components of $g_1(X_i, \theta)$ are linearly independent for any $\theta$, $\mathrm{rank}(\hat\Omega) = \mathrm{rank}(\Omega)$ for any n. Therefore $\hat\Omega^+$ converges to $\Omega^+$ in probability.
Even though using generalised inverses is theoretically sound, it can be numerically unstable, i.e., a small perturbation of a singular matrix may result in a large deviation of its generalised inverses.Footnote6 Therefore, one must be cautious when using generalised inverses. We suggest that one should first try to find the essential subvector $g_1$. In case $g_1$ is not easily obtainable, the method introduced below can be used as an alternative to generalised inverses.
4.2. Imposing random noises
To avoid the potential bias caused by generalised inverses, we can add randomly generated noises to the system to make it nonsingular, as Bierens (Citation2007) and Lai (Citation2008) did in the maximum likelihood estimation of singular systems of equations. Specifically, let $\varepsilon_1, \dots, \varepsilon_n$ be i.i.d. $K \times 1$ random vectors generated from the multivariate normal distribution with mean zero and covariance matrix $\sigma^2 I_K$, and assume that $\varepsilon_1, \dots, \varepsilon_n$ are independent of $X_1, \dots, X_n$. Define $g^*(X_i, \varepsilon_i, \theta) = g(X_i, \theta) + \varepsilon_i$, for $i = 1, \dots, n$. Then $\theta_0$ is the solution of the set of moment conditions
(9) $\mathrm{E}\, g^*(X_i, \varepsilon_i, \theta) = 0.$
The set of moment conditions (Equation9) is nonsingular, since $\mathrm{Var}(g^*(X_i, \varepsilon_i, \theta_0)) = \Omega + \sigma^2 I_K$ is positive definite. Let $\hat\theta_\varepsilon$ be an efficient GMM estimator of $\theta_0$ based on (Equation9); then the asymptotic distribution of $\hat\theta_\varepsilon$ is $\sqrt{n}\,(\hat\theta_\varepsilon - \theta_0) \to_d N(0, (G'(\Omega + \sigma^2 I_K)^{-1} G)^{-1})$. Since $(G'(\Omega + \sigma^2 I_K)^{-1} G)^{-1} \ge (G' \Omega^+ G)^{-1}$ for any $\sigma^2 > 0$, $\hat\theta_\varepsilon$ is asymptotically less efficient than the efficient GMM estimator based on (Equation1). However, the loss of efficiency can be controlled, since $(G'(\Omega + \sigma^2 I_K)^{-1} G)^{-1} \to (G' \Omega^+ G)^{-1}$ as $\sigma^2 \to 0$. Similar to Lai (Citation2008), one can also generate m independent samples of $(\varepsilon_1, \dots, \varepsilon_n)$, obtain m GMM estimators $\hat\theta_\varepsilon^{(1)}, \dots, \hat\theta_\varepsilon^{(m)}$ and then construct a new estimator $\bar\theta = m^{-1}\sum_{j=1}^m \hat\theta_\varepsilon^{(j)}$.Footnote7 Since $\bar\theta$ combines the information in $\hat\theta_\varepsilon^{(1)}, \dots, \hat\theta_\varepsilon^{(m)}$, in theory it is asymptotically more efficient than any of the $\hat\theta_\varepsilon^{(j)}$. It is of interest to investigate the asymptotic distribution and finite sample performance of $\bar\theta$ in a future study.
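A minimal sketch of the noise-addition device (the model, noise level and moment functions are hypothetical choices): adding independent $N(0, \sigma^2 I_K)$ noise to a singular three-moment system makes the estimated variance matrix nonsingular, so the ordinary inverse can be used as the weighting matrix:

```python
import numpy as np
from scipy.optimize import minimize_scalar

rng = np.random.default_rng(7)
n = 5000
x = rng.normal(loc=1.5, scale=1.0, size=n)   # X ~ N(theta_0, 1), theta_0 = 1.5
sigma = 0.1                                  # hypothetical small noise level

# Independent N(0, sigma^2 I_K) noise added to a singular three-moment system
eps = sigma * rng.normal(size=(n, 3))

def g_star(theta):
    g1 = x - theta
    return np.column_stack([g1, x**2 - theta**2 - 1.0, 2.0 * g1]) + eps

def Q(theta, W):
    gbar = g_star(theta).mean(axis=0)
    return gbar @ W @ gbar

t1 = minimize_scalar(Q, args=(np.eye(3),), bounds=(0.0, 3.0), method="bounded").x
Gm = g_star(t1)
Omega_star = Gm.T @ Gm / n                   # nonsingular: estimates Omega + sigma^2 I
t2 = minimize_scalar(Q, args=(np.linalg.inv(Omega_star),),
                     bounds=(0.0, 3.0), method="bounded").x
```

With a small `sigma`, the estimate `t2` should remain close to the true value 1.5 at a modest efficiency cost.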
5. Concluding remarks
Since moment condition models do not require researchers to specify the likelihood function of the data generating process, they have been widely used by econometricians to model economic theories. Though it is desirable that the moment conditions constructed from economic theory be linearly independent, in practice this may not always be the case. Sometimes singularity is inherent in the model or is caused by singular transformations. In this paper, we extended efficient GMM estimation to linearly dependent moment condition models. The result can be viewed as a natural extension of the standard GMM theory, since the generalised inverse of a matrix is a natural extension of the matrix inverse. Though in theory using generalised inverses yields efficient GMM estimators, in practice one must be cautious in using them, in light of two concerns. First, using generalised inverses ignores the intrinsic structure of the moment conditions, which sometimes contains important information. Second, the generalised inverses of a singular matrix are numerically unstable, which can induce serious bias in the resulting GMM estimator. Therefore, when there is singularity in the system, a practical strategy is to obtain an essential moment vector and apply GMM to it. In case an essential moment vector is not available, we can add random noises to the moment conditions and obtain GMM estimators based on the new set of moment conditions, using generalised inverses only with discretion. The results in this paper might also shed light on other popular statistical methods (such as the empirical likelihood) for estimating equations with singularity.
Disclosure statement
No potential conflict of interest was reported by the author.
Additional information
Funding
Notes on contributors
Zhiguo Xiao
Dr. Zhiguo Xiao received his PhD in statistics from the University of Wisconsin-Madison. His research interest includes both theoretical and applied econometrics.
Notes
1 For the existence and uniqueness of the Moore–Penrose generalised inverse, see, e.g., Penrose (Citation1955) and Abadir and Magnus (Citation2005, pp. 284–285).
2 More results on reflexive generalised inverses can be found in Rao and Mitra (Citation1971), Rao (2001), Bapat (Citation2012), Fampa and Lee (Citation2018), and Xu, Fampa, and Lee (Citation2019).
3 Similar results for square matrices can be found in Bapat (Citation2012, pp. 47–48).
4 A sequence of real $m \times n$ matrices $\{A_n\}$ is said to converge to an $m \times n$ matrix A if $\|A_n - A\| \to 0$, where $\|\cdot\|$ is a matrix norm, such as the Euclidean (Frobenius) norm or the spectral norm.
5 For example, consider $A_n = \mathrm{diag}(1, 1/n)$ and $A = \mathrm{diag}(1, 0)$. Then $A_n \to A$. Since $A_n$ is invertible, $A_n^+ = A_n^{-1} = \mathrm{diag}(1, n)$. Hence $A_n^+ \not\to A^+ = \mathrm{diag}(1, 0)$.
6 For example, let A be a singular matrix with Moore–Penrose generalised inverse $A^+$. Adding a small number $\delta$ to the first two diagonal entries of A may change its rank, in which case the Moore–Penrose generalised inverse of the perturbed matrix contains entries of order $1/\delta$ and is far from $A^+$.
7 This idea is similar to bootstrap aggregating (bagging).
References
- Abadir, K. M., & Magnus, J. R. (2005). Matrix algebra. New York: Cambridge University Press.
- Alessi, L., Barigozzi, M., & Capasso, M. (2011). Non-fundamentalness in structural econometric models: A review. International Statistical Review, 79, 16–47. doi: 10.1111/j.1751-5823.2011.00131.x
- Bapat, R. B. (2012). Linear algebra and linear models (3rd ed.). London: Springer-Verlag.
- Barten, A. (1969). Maximum likelihood estimation of a complete system of demand equations. European Economic Review, 1, 7–73. doi: 10.1016/0014-2921(69)90017-8
- Barten, A. (1977). The system of consumer demand function approach: A review. Econometrica, 45, 23–51. doi: 10.2307/1913286
- Berndt, E. R., & Christensen, L. R. (1974). Testing for the existence of a consistent aggregate index of labor inputs. American Economics Review, 64, 391–403.
- Bierens, H. J. (2007). Econometric analysis of linearized singular dynamic stochastic general equilibrium models. Journal of Econometrics, 136, 595–627. doi: 10.1016/j.jeconom.2005.11.008
- Bierens, H. J., & Swanson, N. R. (2000). The econometric consequences of the ceteris paribus condition in economic theory. Journal of Econometrics, 95, 223–253. doi: 10.1016/S0304-4076(99)00038-X
- Biørn, E. (2000). Panel data with measurement errors: Instrumental variables and GMM procedures combining levels and differences. Econometric Reviews, 19(4), 391–424. doi: 10.1080/07474930008800480
- Biørn, E., & Klette, T. (1998). Panel data with errors-in-variables: Essential and redundant orthogonal conditions in GMM-estimation. Economics Letters, 59, 275–282. doi: 10.1016/S0165-1765(98)00053-6
- Chen, Y., Hong, C., & Riley, R. D. (2014). An alternative pseudolikelihood method for multivariate random-effects meta-analysis. Statistics in Medicine, 34(3), 361–380. doi: 10.1002/sim.6350
- Dhrymes, P. J. (1962). On devising unbiased estimators for the parameters of a Cobb–Douglas production function. Econometrica, 30, 297–304. doi: 10.2307/1910218
- Dhrymes, P. J., & Schwarz, S. (1987). On the existence of generalized inverse estimators in a singular system of equations. Journal of Forecasting, 6, 181–192. doi: 10.1002/for.3980060304
- Fampa, M., & Lee, J. (2018). On sparse reflexive generalized inverses. Operations Research Letters, 46(6), 605–610. doi: 10.1016/j.orl.2018.09.005
- Hall, A. (2005). Generalized method of moments. New York: Oxford University Press.
- Hansen, L. (1982). Large sample properties of generalized method of moments estimators. Econometrica, 50, 1029–1054. doi: 10.2307/1912775
- Hansen, B., & West, K. (2002). Generalized method of moments and macroeconomics. Journal of Business & Economic Statistics, 20, 460–469. doi: 10.1198/073500102288618603
- Haupt, H., & Oberhofer, W. (2006). Generalized adding-up in systems of regression equations. Economics Letters, 92, 263–269. doi: 10.1016/j.econlet.2006.03.001
- Ireland, P. N. (2004). A method for taking models to the data. Journal of Economic Dynamics and Control, 28, 1205–1226. doi: 10.1016/S0165-1889(03)00080-0
- Kreijger, R. G., & Neudecker, H. (1977). Exact linear restrictions on parameters in the general linear model with a singular covariance matrix. Journal of the American Statistical Association, 72, 430–432. doi: 10.1080/01621459.1977.10481014
- Lai, H. (2008). Maximum likelihood estimation of singular systems of equations. Economics Letters, 99, 51–54. doi: 10.1016/j.econlet.2007.05.027
- Leeper, E. M., Walker, T. B., & Yang, S.-C. S. (2013). Fiscal foresight and information flows. Econometrica, 81, 1115–1145. doi: 10.3982/ECTA8337
- Mountford, A., & Uhlig, H. (2009). What are the effects of fiscal policy shocks?. Journal of Applied Econometrics, 24, 960–992. doi: 10.1002/jae.1079
- Penrose, R. (1955). A generalized inverse for matrices. Mathematical Proceedings of the Cambridge Philosophical Society, 51, 406–413. doi: 10.1017/S0305004100030401
- Rao, C. R. (1972). Alternative econometric models of sales advertising relationships. Journal of Marketing Research, 9, 171–181. doi: 10.1177/002224377200900209
- Rao, C. R., & Mitra, S. K. (1971). Generalized inverse of matrices and its applications. New York: John Wiley & Sons.
- Riley, R., Abrams, K., Lambert, P., Sutton, A., & Thompson, J. (2007). An evaluation of bivariate random-effects meta-analysis for the joint synthesis of two correlated outcomes. Statistics in Medicine, 26(1), 78–97. doi: 10.1002/sim.2524
- Schneeweiss, H. (2014). The linear GMM model with singular covariance matrix due to the elimination of a nuisance parameter. Technical Report 165, Dept. Statistics, Univ. Munich.
- Stewart, G. W. (1969). On the continuity of the generalized inverse. SIAM Journal on Applied Mathematics, 17, 33–45. doi: 10.1137/0117004
- Theil, H. (1971). Principles of econometrics. New York: John Wiley & Sons.
- Velasco, C., & Lobato, I. N. (2018). Frequency domain minimum distance inference for possibly noninvertible and noncausal ARMA models. The Annals of Statistics, 46(2), 555–579. doi: 10.1214/17-AOS1560
- Wansbeek, T. J. (2001). GMM Estimation in panel data models with measurement error. Journal of Econometrics, 104, 259–268. doi: 10.1016/S0304-4076(01)00079-3
- Weiss, D. (1968). Determinants of market share. Journal of Marketing Research, 5, 290–295. doi: 10.1177/002224376800500307
- White, H. (1986). Instrumental variables analogs of generalized least squares estimators. Advances in Statistical Analysis and Statistical Computing, 1, 173–227.
- Xiao, Z. (2008). Generalized inverses in GMM estimation with redundant moment conditions. In Topics in Generalized Method of Moments Estimation with Application to Panel Data with Measurement Error, University of Wisconsin-Madison PhD thesis.
- Xiao, Z., Shao, J., & Palta, M. (2010a). GMM in linear regression for longitudinal data with multiple covariates measured with error. Journal of Applied Statistics, 37, 791–805. doi: 10.1080/02664760902890005
- Xiao, Z., Shao, J., & Palta, M. (2010b). Instrumental variable and GMM estimation of panel data with measurement error. Statistica Sinica, 20(4), 1725–1747.
- Xiao, Z., Shao, J., Xu, R., & Palta, M. (2007). Efficiency of GMM estimation in panel data models with measurement error. Sankhya: The Indian Journal of Statistics, 69, 101–118.
- Xu, L., Fampa, M., & Lee, J. (2019). Aspects of symmetry for sparse reflexive generalized inverses. arXiv:1903.05744v1[math.OC].
Appendix
Proofs of results
Proof of Lemma 3.1
Let $\Gamma = \mathrm{Var}(Y)$, and let $\Gamma = P \Lambda P'$ be the spectral decomposition of $\Gamma$, with $\Lambda = \mathrm{diag}(\lambda_1, \dots, \lambda_r, 0, \dots, 0)$ and $\lambda_1 \ge \dots \ge \lambda_r > 0$. Write $\Gamma = H H'$, where $H = P_1 \Lambda_r^{1/2}$, $P_1$ consists of the first r columns of P and $\Lambda_r = \mathrm{diag}(\lambda_1, \dots, \lambda_r)$; then H is $m \times r$ of full column rank. Choose r linearly independent rows of H, let $H_1$ be the $r \times r$ matrix formed by these rows, and let $Y_1$ be the corresponding subvector of Y. Then $\mathrm{Var}(Y_1) = H_1 H_1'$, which is nonsingular since $H_1$ is nonsingular. This shows that $\mathrm{Var}(Y_1)$ is positive definite and proves (i).
For (ii), let $Y_2$ be the $(m-r) \times 1$ vector of the remaining components of Y, and let $H_2$ consist of the remaining rows of H, so that $\mathrm{Var}(Y_2) = H_2 H_2'$ and $\mathrm{Cov}(Y_2, Y_1) = H_2 H_1'$. Let $C = H_2 H_1^{-1}$. Then
$\mathrm{Var}(Y_2 - C Y_1) = H_2 H_2' - C H_1 H_2' - H_2 H_1' C' + C H_1 H_1' C' = O,$
hence $Y_2 - C Y_1$ equals a constant vector d w.p.1., i.e., $Y_2 = C Y_1 + d$ w.p.1. Since $Y_1$ and $Y_2$ are subvectors of Y, there exists an $m \times m$ nonsingular permutation matrix A such that $Y = A (Y_1', Y_2')'$, i.e., $Y = A (I_r, C')' Y_1 + A (0', d')'$ w.p.1. Let $B = A (I_r, C')'$ and $b = A (0', d')'$. Since A is nonsingular and $(I_r, C')'$ is of full column rank, B is of full column rank.
For (iii), if $\mathrm{E}\,Y = 0$, then $\mathrm{E}\,Y_1 = 0$ since $Y_1$ is a subvector of Y; taking expectations in $Y = B Y_1 + b$ gives $b = 0$, i.e., $Y = B Y_1$ w.p.1.
Proof of Theorem 3.1
Let V(W) denote the asymptotic variance of the GMM estimator using weighting matrix W. Then $V(W) = (G'WG)^{-1} G'W\Omega WG\, (G'WG)^{-1}$. Let $\Omega^-$ be a reflexive generalised inverse of $\Omega$. Then we have $\Omega^- \Omega \Omega^- = \Omega^-$, hence
$V(\Omega^-) = (G' \Omega^- G)^{-1} G' \Omega^- \Omega \Omega^- G\, (G' \Omega^- G)^{-1} = (G' \Omega^- G)^{-1}.$
To establish $V(W) \ge V(\Omega^-)$, we just need to show that $G'W\Omega WG \ge (G'WG)(G' \Omega^- G)^{-1}(G'WG)$. By Lemma 3.1, there exist an essential subvector $g_1$ and a matrix B of full column rank such that $g(X, \theta_0) = B g_1(X, \theta_0)$ a.s., with $\Omega_1 = \mathrm{Var}(g_1(X, \theta_0))$ positive definite. Then $\Omega = B \Omega_1 B'$ and $G = B G_1$, with $G_1 = \mathrm{E}[\partial g_1(X, \theta_0)/\partial\theta']$. By Proposition 2.1, $B' \Omega^- B = \Omega_1^{-1}$, hence $G' \Omega^- G = G_1' \Omega_1^{-1} G_1$. Let $D = B'WB G_1$; then $G'WG = G_1' D$ and $G'W\Omega WG = D' \Omega_1 D$. So we just need to show that
$D' \Omega_1 D \ge D' G_1 (G_1' \Omega_1^{-1} G_1)^{-1} G_1' D,$
which follows from $G_1 (G_1' \Omega_1^{-1} G_1)^{-1} G_1' \le \Omega_1$, since
$\Omega_1^{-1/2} G_1 (G_1' \Omega_1^{-1} G_1)^{-1} G_1' \Omega_1^{-1/2}$
is idempotent and symmetric, hence bounded above by $I_r$.
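The inequality established in the proof can be spot-checked numerically; all matrices below are arbitrary illustrative draws, not from the paper:

```python
import numpy as np

rng = np.random.default_rng(9)
r, K, p = 2, 4, 2
B = rng.normal(size=(K, r))                  # full column rank with probability one
M = rng.normal(size=(r, r))
Omega1 = M @ M.T + r * np.eye(r)             # positive definite
Omega = B @ Omega1 @ B.T                     # singular K x K matrix
G1 = rng.normal(size=(r, p))
G = B @ G1

def V(W):
    # Sandwich form of the asymptotic variance V(W)
    GWG_inv = np.linalg.inv(G.T @ W @ G)
    return GWG_inv @ (G.T @ W @ Omega @ W @ G) @ GWG_inv

V_opt = np.linalg.inv(G1.T @ np.linalg.inv(Omega1) @ G1)  # (G' Omega^- G)^{-1}
assert np.allclose(V(np.linalg.pinv(Omega)), V_opt)

# For an arbitrary positive definite W, V(W) - V(Omega^-) is nonnegative definite
C = rng.normal(size=(K, K))
W = C @ C.T + np.eye(K)
diff = V(W) - V_opt
assert np.min(np.linalg.eigvalsh((diff + diff.T) / 2.0)) > -1e-8
```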