Abstract
Standard generalised method of moments (GMM) estimation was developed for nonsingular systems of moment conditions. However, many important economic models are characterised by singular systems of moment conditions. This paper shows that efficient GMM estimation of such models can be achieved by using a reflexive generalised inverse, in particular the Moore–Penrose generalised inverse, of the variance matrix of the sample moment conditions as the weighting matrix. We provide an estimator of the optimal weighting matrix and establish its consistency. Potential issues of using generalised inverses and some remedies are also discussed.
1. Introduction
Over the past several decades, a great deal of statistical effort has been devoted to inference for moment condition models, i.e., models where the linkage between parameter and data is specified through a set of moment restrictions (also known as estimating equations). Technically, a moment condition model specifies that the data generating process of the observations $X_1, \dots, X_n$ satisfies
(1) $\mathrm{E}\, g(X_i, \theta_0) = 0,$
where $g$ is a $K$-dimensional vector-valued known function, $\theta$ is the $p$-dimensional parameter of interest, and $\theta_0$ is its true value. The popularity of moment condition models is partially due to the fact that a parametric likelihood form may be too strong for many real applications or scientific theories. When the dimension of the parameter of interest equals the number of moment conditions, the parameter is said to be just-identified, and the classical method of moments can be applied for parameter estimation. In practice, a majority of the moment condition models investigated by applied researchers, such as models for asset pricing and dynamic panel data, are over-identified. The generalised method of moments (GMM) of Hansen (Citation1982) is one of the most popular techniques designed for the estimation of over-identified moment condition models (see, e.g., Hansen & West, Citation2002 and Hall, Citation2005).
Like many other classical statistical methods, GMM comes at the price of a set of regularity conditions that warrant its validity. Although in most applications those regularity conditions are not binding, some of them can be violated in interesting circumstances. This paper is concerned with efficient GMM estimation when one of the regularity conditions of standard GMM, namely that the covariance matrix of the moment vector evaluated at the true parameter be of full rank, is violated. A typical violation of this kind appears when the system of moment conditions is singular, i.e., some components of the moment functions are linear combinations of the others.
Singular systems of moment conditions exist in a wide variety of economic studies, such as the consumer expenditure function analysis (Barten, Citation1969, Citation1977), the market share analysis (Rao, Citation1972; Weiss, Citation1968), the production function estimation (Dhrymes, Citation1962), the translog utility function analysis (Berndt & Christensen, Citation1974), the linearised dynamic stochastic general equilibrium (DSGE) modelling (Bierens, Citation2007; Ireland, Citation2004), the errors-in-variables analysis with panel data (Biørn, Citation2000; Biørn & Klette, Citation1998; Wansbeek, Citation2001; Xiao, Shao, & Palta, Citation2010a, Citation2010b; Xiao, Shao, Xu, & Palta, Citation2007), the multivariate random-effects meta-analysis models (Chen, Hong, & Riley, Citation2014; Riley, Abrams, Lambert, Sutton, & Thompson, Citation2007) and the non-Gaussian ARMA models (Alessi, Barigozzi, & Capasso, Citation2011; Leeper, Walker, & Yang, Citation2013; Mountford & Uhlig, Citation2009; Velasco & Lobato, Citation2018).
In a linear regression model with known singular disturbance covariance matrix, Theil (Citation1971) showed that a generalised Aitken-like estimator using the Moore–Penrose generalised inverse is best linear unbiased. Following Theil (Citation1971), Kreijger and Neudecker (Citation1977) proposed two optimality criteria to obtain best linear unbiased estimators. Within the same context, Dhrymes and Schwarz (Citation1987) discussed the existence issue of the estimators using generalised inverses. Haupt and Oberhofer (Citation2006) proposed an estimator which does not use the generalised inverses and allows for additional exogenous restrictions, collinearities and generalised adding-up. Bierens and Swanson (Citation2000) and Bierens (Citation2007) suggested that one can obtain parameter estimate by maximising the information content of the singular system. Ireland (Citation2004) and Lai (Citation2008) proposed adding random noises to the singular system to implement maximum likelihood estimation.
In the GMM literature, White (Citation1986) showed that if the estimating function $g = (g_1', g_2')'$ is of the form such that: (i) the variance matrix of $g_1(X, \theta_0)$ is nonsingular and (ii) the components of $g_2$ are linear combinations of those of $g_1$, then the efficient GMM estimator is the minimiser of
(2) $Q_n(\theta) = \bar g_n(\theta)' \Omega^- \bar g_n(\theta),$
where $\bar g_n(\theta) = n^{-1}\sum_{i=1}^n g(X_i, \theta)$ and $\Omega^-$ is a reflexive generalised inverse of
(3) $\Omega = \mathrm{E}[g(X, \theta_0)\, g(X, \theta_0)'].$
However, in practice, the aforementioned representation of g is generally not readily obtainable (see, e.g., Schneeweiss, Citation2014; Velasco & Lobato, Citation2018; Xiao et al., Citation2010b).
The purpose of this article is to develop an efficient GMM estimator for a singular system of moment conditions with general form. An earlier effort appeared in Xiao (Citation2008), which proposed using reflexive generalised inverses to deal with the singularity. Schneeweiss (Citation2014) independently discussed similar ideas.
The rest of the paper is organised as follows. In Section 2, we briefly review the GMM methodology, the concepts of generalised inverses and some results of the reflexive generalised inverses. We present our main result in Section 3. Section 4 discusses further issues such as the estimation of optimal weighting matrix and the method of adding noises, and Section 5 concludes. Proofs of results are relegated to the Appendix.
2. GMM and generalised inverses
We first give a brief introduction to the standard GMM method. For a book-length detailed account, see Hall (Citation2005). For simplicity we assume that the data are i.i.d. Assume also that K>p, i.e., the model is over-identified. Since the number of restrictions on the parameter is greater than the dimension of the parameter, in general it is impossible to obtain an estimator of the parameter by the method of moments, i.e., by setting the sample moment $\bar g_n(\theta) = n^{-1}\sum_{i=1}^n g(X_i, \theta)$ equal to zero. The idea of GMM by Hansen (Citation1982) is to minimise a quadratic norm of $\bar g_n(\theta)$:
(4) $Q_n(\theta) = \bar g_n(\theta)' W_n \bar g_n(\theta),$
where $W_n$ is a positive semidefinite matrix. Under a set of regularity conditions, including that $\Omega$ is positive definite, and assuming $W_n$ converges in probability to a positive semi-definite matrix W, the minimiser $\hat\theta(W)$ of (Equation4) is a consistent estimator of $\theta_0$ and has limiting distribution
$\sqrt{n}\,(\hat\theta(W) - \theta_0) \to_d N(0, V(W)),$
where $V(W) = (G'WG)^{-1} G'W\Omega WG\, (G'WG)^{-1}$ with $G = \mathrm{E}[\partial g(X, \theta_0)/\partial\theta']$. The lower bound of $V(W)$ is achieved at $W = \Omega^{-1}$, i.e., $V(W) \ge V(\Omega^{-1}) = (G'\Omega^{-1}G)^{-1}$ in the sense that $V(W) - V(\Omega^{-1})$ is nonnegative definite, for any W. In practice, a consistent estimator of $\Omega^{-1}$ can be set as
(5) $\hat\Omega^{-1} = \Big[n^{-1}\sum_{i=1}^n g(X_i, \tilde\theta)\, g(X_i, \tilde\theta)'\Big]^{-1},$
where $\tilde\theta$ is a consistent estimator of $\theta_0$. A typical choice of $\tilde\theta$ is a GMM estimator with $W_n = I_K$, the identity matrix of order K. Note that $\hat\Omega^{-1}$ converges in probability to $\Omega^{-1}$ because $\hat\Omega$ converges in probability to $\Omega$ and, more importantly, $\Omega$ is positive definite.
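To make the two-step recipe above concrete, here is a minimal numerical sketch in Python; the data generating process, moment functions and sample size are hypothetical choices for illustration, not from the paper. A scalar parameter $\theta_0 = 2$ is estimated from the over-identified system $\mathrm{E}(X - \theta) = 0$, $\mathrm{E}(X^2 - \theta^2 - 1) = 0$ with $X \sim N(\theta_0, 1)$:

```python
import numpy as np
from scipy.optimize import minimize_scalar

rng = np.random.default_rng(0)
x = rng.normal(loc=2.0, scale=1.0, size=5000)  # X ~ N(theta_0, 1), theta_0 = 2

def g(theta):
    # K = 2 moment functions for the p = 1 parameter theta (over-identified)
    return np.column_stack([x - theta, x**2 - theta**2 - 1.0])

def Q(theta, W):
    gbar = g(theta).mean(axis=0)   # sample moment \bar g_n(theta)
    return gbar @ W @ gbar         # quadratic form (4)

# Step 1: preliminary estimator with W_n = I_K
theta1 = minimize_scalar(Q, args=(np.eye(2),), bounds=(0.0, 4.0), method="bounded").x

# Step 2: efficient estimator with W_n = Omega_hat^{-1} as in (5)
G1 = g(theta1)
Omega_hat = G1.T @ G1 / len(x)
theta2 = minimize_scalar(Q, args=(np.linalg.inv(Omega_hat),),
                         bounds=(0.0, 4.0), method="bounded").x
```

Both `theta1` and `theta2` should lie within sampling error of the true value 2.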
Next we review the concepts of generalised inverses of a matrix and some of their properties.
Definition 2.1
Let A be an $m \times n$ real matrix. An $n \times m$ real matrix $A^-$ may have one or all of the following properties:
(i) $A A^- A = A$;
(ii) $A^- A A^- = A^-$;
(iii) $(A A^-)' = A A^-$;
(iv) $(A^- A)' = A^- A$.
If $A^-$ satisfies (i), it is called a generalised inverse of A; if $A^-$ satisfies (i) and (ii), it is called a reflexive generalised inverse (or $\{1,2\}$-inverse) of A; if $A^-$ satisfies (i)–(iv), it is called the Moore–Penrose generalised inverse of A. The Moore–Penrose generalised inverse of a matrix A is unique and is denoted by $A^+$ hereafter.Footnote1
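The four Penrose conditions are easy to check numerically. The following sketch (with an arbitrary rank-deficient matrix chosen for illustration) verifies that `numpy.linalg.pinv` returns a matrix satisfying (i)–(iv):

```python
import numpy as np

rng = np.random.default_rng(1)
# An arbitrary rank-deficient 4 x 3 matrix: the third column is the sum of the first two
A = rng.normal(size=(4, 2))
A = np.hstack([A, A.sum(axis=1, keepdims=True)])

Ap = np.linalg.pinv(A)  # Moore-Penrose generalised inverse

assert np.allclose(A @ Ap @ A, A)        # (i)   A A^- A = A
assert np.allclose(Ap @ A @ Ap, Ap)      # (ii)  A^- A A^- = A^-
assert np.allclose((A @ Ap).T, A @ Ap)   # (iii) A A^- symmetric
assert np.allclose((Ap @ A).T, Ap @ A)   # (iv)  A^- A symmetric
```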
We list some of the important properties of the generalised inverses in the following two propositions, proofs of which can be obtained by direct verification and are therefore omitted.Footnote2 Proposition 2.1 states that when the matrix of interest has a natural factorisation with certain structure, some of its generalised inverses can be easily derived.
Proposition 2.1
(i) Let $\Omega = P D Q$, where $P$ and $Q$ are nonsingular square matrices, and let $D^-$ be a reflexive generalised inverse of $D$. Then $Q^{-1} D^- P^{-1}$ is a reflexive generalised inverse of $\Omega$.
(ii) Let $\Omega = B \Sigma B'$, where B is of full column rank and $\Sigma$ is nonsingular. Then any reflexive generalised inverse $\Omega^-$ of $\Omega$ satisfies $B' \Omega^- B = \Sigma^{-1}$. Moreover, we have $\Omega \Omega^- B = B$ and $B' \Omega^- \Omega = B'$.
Proposition 2.2 points out that the generalised inverses (including the reflexive generalised inverses) are not unique and can be obtained by using the singular value decomposition of the matrix.Footnote3
Proposition 2.2
Let $\Omega$ be an $m \times n$ real valued matrix with rank r>0. Suppose that the singular value decomposition of $\Omega$ is $\Omega = S \Lambda T'$, where S is $m \times m$ with $S'S = I_m$, T is $n \times n$ with $T'T = I_n$, and
$\Lambda = \begin{pmatrix} \Lambda_r & O \\ O & O \end{pmatrix},$
with $\Lambda_r$ the $r \times r$ diagonal matrix of the positive singular values of $\Omega$ and O the matrices of zeros. Then
(i)
(6) $\Omega^- = T \begin{pmatrix} \Lambda_r^{-1} & X \\ Y & Z \end{pmatrix} S'$
is a generalised inverse of $\Omega$, where X, Y and Z are arbitrary real valued matrices with appropriate dimensions.
(ii)
(7) $\Omega^- = T \begin{pmatrix} \Lambda_r^{-1} & X \\ Y & Y \Lambda_r X \end{pmatrix} S'$
is a reflexive generalised inverse of $\Omega$, where X, Y are arbitrary real valued matrices with appropriate dimensions.
(iii)
(8) $\Omega^+ = T \begin{pmatrix} \Lambda_r^{-1} & O \\ O & O \end{pmatrix} S'$
is the Moore–Penrose generalised inverse of $\Omega$.
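Proposition 2.2 can also be verified numerically. The sketch below builds a singular symmetric matrix $\Omega$ (an arbitrary rank-3 example), forms the reflexive generalised inverse (7) from its singular value decomposition with randomly chosen blocks X and Y, and checks the defining properties:

```python
import numpy as np

rng = np.random.default_rng(2)
# An arbitrary 5 x 5 symmetric matrix Omega with rank r = 3
H = rng.normal(size=(5, 3))
Omega = H @ H.T
r = 3

S, sv, Tt = np.linalg.svd(Omega)     # Omega = S diag(sv) T'
T = Tt.T

# Blocks of (7): Lambda_r^{-1}, free X and Y, and the (2,2) block Y Lambda_r X
Lr = np.diag(sv[:r])
X = rng.normal(size=(r, 2))
Y = rng.normal(size=(2, r))
M = np.block([[np.linalg.inv(Lr), X], [Y, Y @ Lr @ X]])
Ominus = T @ M @ S.T                 # reflexive generalised inverse of Omega

assert np.allclose(Omega @ Ominus @ Omega, Omega)    # generalised inverse
assert np.allclose(Ominus @ Omega @ Ominus, Ominus)  # reflexivity
```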
White's (Citation1986) result on GMM estimation with singular moment conditions can be stated as follows:
Theorem 2.1
(White, Citation1986)
Suppose there exists a matrix $\Delta$ such that $g(X, \theta) = \Delta g_1(X, \theta)$ and $\Omega = \Delta \Omega_1 \Delta'$, where $\Delta$ is of full column rank and $\Omega_1 = \mathrm{Var}(g_1(X, \theta_0))$ is $r \times r$ positive definite with $r < K$. Then
(i) For any reflexive generalised inverse $\Omega^-$ of $\Omega$, $G' \Omega^- G$ is independent of the choice of $\Omega^-$, and $V(\Omega^-) = (G' \Omega^- G)^{-1}$.
(ii) For any reflexive generalised inverse $\Omega^-$ of $\Omega$, and for any W, $V(W) \ge V(\Omega^-)$.
Hence $\Omega^-$ is an optimal weighting matrix. In practice, one may choose the Moore–Penrose generalised inverse $\Omega^+$ of $\Omega$, or a special reflexive generalised inverse such as $\mathrm{diag}(\Omega_1^{-1}, O)$ when $\Delta = (I_r, A')'$.
Remark
Note that $V(\Omega^-)$, the asymptotic covariance matrix of the optimal GMM estimator, does not depend on $\Delta$. The basic idea of Theorem 2.1 is as follows. Suppose we have two sets of instrumental variables, say $Z_1$ and $Z_2$, such that the components of $Z_1$ are linearly independent and $Z_2$ is a linear combination of $Z_1$, i.e., $Z_2 = \alpha' Z_1$ for some constant vector $\alpha$. Then one can ignore $Z_2$ and use $Z_1$ only as instruments; in doing so we achieve the same asymptotic efficiency as using $(Z_1', Z_2')'$. The limitation of this result is that to apply this method we have to sort all instrumental variables into two groups, such that the instrumental variables in one group are linear combinations of those in the other group. This can be very tedious in practice. For example, in panel data models there are often a very large number of instruments, and it is in general impossible to sort them out.
3. Main results
We now establish some basic results about random vectors with singular covariance matrices.
Lemma 3.1
Let Y be an $m \times 1$ random vector and r be the rank of the covariance matrix of Y. Suppose that r<m. Then
(i) There exists an r-dimensional subvector $Y_1$ of Y such that its covariance matrix $\mathrm{Var}(Y_1)$ is positive definite. The vector $Y_1$ is called an essential subvector of Y.
(ii) Let $Y_2$ be the $(m-r) \times 1$ vector consisting of the remaining components of Y. Then there exist an $(m-r) \times r$ constant matrix C and an $(m-r) \times 1$ constant vector d such that $Y_2 = C Y_1 + d$ w.p.1., where w.p.1. means 'with probability one'. Hence there exist an $m \times r$ constant matrix B of full column rank and an $m \times 1$ constant vector b such that $Y = B Y_1 + b$ w.p.1.
(iii) If $\mathrm{E}\,Y = 0$, then b in (ii) is the zero vector, i.e., $Y = B Y_1$ w.p.1.
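Lemma 3.1 can be illustrated numerically. In the sketch below (the linear dependence structure is a hypothetical example), a four-dimensional random vector has rank-2 covariance; its first two components form an essential subvector, and the matrix B and vector b of part (ii) are recovered by exact linear regression:

```python
import numpy as np

rng = np.random.default_rng(4)
n = 2000
# Hypothetical example: Y has m = 4 components but rank-2 covariance, since
# Y3 = Y1 + Y2 and Y4 = 2*Y1 - Y2 + 1 hold with probability one
y1 = rng.normal(size=n)
y2 = rng.normal(size=n)
Y = np.column_stack([y1, y2, y1 + y2, 2.0 * y1 - y2 + 1.0])

Gamma = np.cov(Y, rowvar=False)
r = np.linalg.matrix_rank(Gamma)
assert r == 2

# The first two components form an essential subvector: Var(Y1) is positive definite
Y1 = Y[:, :2]
assert np.all(np.linalg.eigvalsh(np.cov(Y1, rowvar=False)) > 1e-8)

# Recover B and b of part (ii) by exact linear regression of Y on (Y1, 1)
Xd = np.column_stack([Y1, np.ones(n)])
coef, *_ = np.linalg.lstsq(Xd, Y, rcond=None)
B, b = coef[:2].T, coef[2]
assert np.allclose(Y, Y1 @ B.T + b)    # Y = B Y1 + b with probability one
```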
Theorem 3.1 is the main result of this paper.
Theorem 3.1
Consider GMM estimation for model (Equation1) with $\Omega$ defined by (Equation3). Suppose it is known that the components of $g(X, \theta_0)$ are linearly dependent (with probability one). Then any reflexive generalised inverse of $\Omega$ is an optimal weighting matrix. In particular, we can use the Moore–Penrose generalised inverse $\Omega^+$.
Let $\Omega^-$ be an arbitrary reflexive generalised inverse of $\Omega$. Then the asymptotic variance matrix of the GMM estimator using $\Omega^-$ as the weighting matrix is $(G' \Omega^- G)^{-1}$, where $G = \mathrm{E}[\partial g(X, \theta_0)/\partial\theta']$. A natural question is whether $G' \Omega^- G$ is a constant matrix independent of the choice of $\Omega^-$. The answer is yes. To see this, suppose the essential subvector of $g(X, \theta_0)$ is $g_1(X, \theta_0)$, with $\Omega_1 = \mathrm{Var}(g_1(X, \theta_0))$. Let $G_1 = \mathrm{E}[\partial g_1(X, \theta_0)/\partial\theta']$ and let B be the full column rank matrix of Lemma 3.1 such that $g(X, \theta_0) = B g_1(X, \theta_0)$ w.p.1. Then we have $G = B G_1$ and $\Omega = B \Omega_1 B'$, hence
$G' \Omega^- G = G_1' B' \Omega^- B G_1 = G_1' \Omega_1^{-1} G_1,$
which is independent of the choice of $\Omega^-$, as the essential subvector and the corresponding matrices $G_1$ and $\Omega_1$ are unrelated to $\Omega^-$. We also see that $G' \Omega^- G$ remains the same if we use another essential subvector, as $G' \Omega^- G$ is unrelated to the choice of the essential subvectors. More details can be found in the proof of Theorem 3.1 in the Appendix.
Judging from the asymptotic distributions, we can see that GMM estimation using moment conditions (Equation1) and a reflexive generalised inverse as the weighting matrix is asymptotically equivalent to the efficient GMM using the moment conditions $\mathrm{E}\, g_1(X_i, \theta_0) = 0$. In some situations, one can figure out the essential subvector $g_1$, and then efficient GMM estimation can be based on $g_1$ directly. For instance, in the errors-in-variables analysis of panel data, Xiao et al. (Citation2010a) and Xiao et al. (Citation2010b) found that one can obtain $g_1$ by using the singular value decomposition. However, such a simple decomposition is not available in general, and it can be very inconvenient, if not impossible, to find the essential subvector $g_1$. Theorem 3.1 tells us that whatever this subvector is, the GMM estimator using any of the reflexive generalised inverses, and the Moore–Penrose generalised inverse in particular, as the weighting matrix will always have the same asymptotic variance as the efficient GMM based on $g_1$.
4. Further issues
4.1. Optimal weighting matrix estimation
Now we discuss consistent estimation of $\Omega^+$. Let $\tilde\theta$ be a consistent estimator of $\theta_0$ and $\hat\Omega = n^{-1}\sum_{i=1}^n g(X_i, \tilde\theta)\, g(X_i, \tilde\theta)'$. Then $\hat\Omega \to \Omega$ in probability under normal regularity conditions. A natural candidate estimator of $\Omega^+$ is $\hat\Omega^+$.
It is well known that if a sequence of nonsingular square matrices $\{A_n\}$ converges to a nonsingular square matrix A, then $A_n^{-1} \to A^{-1}$.Footnote4 However, if A is singular and $A_n \to A$, we may not necessarily have $A_n^+ \to A^+$.Footnote5 Assuming $A_n \to A$ and A is singular, a necessary and sufficient condition for $A_n^+ \to A^+$ is the following:
Theorem 4.1
(Stewart, Citation1969)
Let $\{A_n\}$ be a sequence of real $m \times n$ matrices converging to an $m \times n$ matrix A. Then $A_n^+ \to A^+$ if and only if $\mathrm{rank}(A_n) = \mathrm{rank}(A)$ for n large enough.
We now prove that $\hat\Omega^+$ is a consistent estimator of $\Omega^+$. By Theorem 4.1, we need only show that $\mathrm{rank}(\hat\Omega) = \mathrm{rank}(\Omega)$ when n is large enough. By Lemma 3.1, there exists a constant matrix B of full column rank such that, w.p.1., $g(X_i, \theta) = B g_1(X_i, \theta)$. Since the components of $g_1(X_i, \theta)$ are linearly independent for any $\theta$, $\mathrm{rank}(\hat\Omega) = \mathrm{rank}(\Omega)$ for any n. Therefore $\hat\Omega^+$ converges to $\Omega^+$ in probability.
Even though using generalised inverses is theoretically sound, it can be numerically unstable, i.e., a small perturbation of a singular matrix may result in a large deviation of its generalised inverses.Footnote6 Therefore, one must be cautious when using generalised inverses. We suggest that one should first try to find the essential subvector $g_1$. In case $g_1$ is not easily obtainable, the method introduced below can be used as an alternative to generalised inverses.
4.2. Imposing random noises
To avoid the potential bias caused by generalised inverses, we can add randomly generated noises to the system to make it nonsingular, as Bierens (Citation2007) and Lai (Citation2008) did in the maximum likelihood estimation of singular systems of equations. Specifically, let $\varepsilon_1, \dots, \varepsilon_n$ be i.i.d. $K \times 1$ random vectors generated from the multivariate normal distribution with mean zero and covariance matrix $\sigma^2 I_K$, and assume that $\varepsilon_1, \dots, \varepsilon_n$ are independent of $X_1, \dots, X_n$. Define $g^*(X_i, \varepsilon_i, \theta) = g(X_i, \theta) + \varepsilon_i$, for $i = 1, \dots, n$. Then $\theta_0$ is the solution of the set of moment conditions
(9) $\mathrm{E}\, g^*(X_i, \varepsilon_i, \theta) = 0.$
The set of moment conditions (Equation9) is nonsingular, since $\mathrm{Var}(g^*(X_i, \varepsilon_i, \theta_0)) = \Omega + \sigma^2 I_K$ is positive definite. Let $\hat\theta_\varepsilon$ be an efficient GMM estimator of $\theta_0$ based on (Equation9); then the asymptotic distribution of $\hat\theta_\varepsilon$ is $\sqrt{n}\,(\hat\theta_\varepsilon - \theta_0) \to_d N(0, (G'(\Omega + \sigma^2 I_K)^{-1} G)^{-1})$. Since $(G'(\Omega + \sigma^2 I_K)^{-1} G)^{-1} \ge (G' \Omega^+ G)^{-1}$ for any $\sigma^2 > 0$, $\hat\theta_\varepsilon$ is asymptotically less efficient than the efficient GMM estimator based on (Equation1). However, the loss of efficiency can be controlled, since $(G'(\Omega + \sigma^2 I_K)^{-1} G)^{-1} \to (G' \Omega^+ G)^{-1}$ as $\sigma^2 \to 0$. Similar to Lai (Citation2008), one can also generate m independent samples of $(\varepsilon_1, \dots, \varepsilon_n)$, obtain m GMM estimators $\hat\theta_\varepsilon^{(1)}, \dots, \hat\theta_\varepsilon^{(m)}$ and then construct a new estimator $\bar\theta = m^{-1}\sum_{j=1}^m \hat\theta_\varepsilon^{(j)}$.Footnote7 Since $\bar\theta$ combines the information in $\hat\theta_\varepsilon^{(1)}, \dots, \hat\theta_\varepsilon^{(m)}$, in theory it is asymptotically more efficient than any of the $\hat\theta_\varepsilon^{(j)}$. It is of interest to investigate the asymptotic distribution and finite sample performance of $\bar\theta$ in a future study.
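A minimal sketch of the noise-addition device (the model, noise level and moment functions are hypothetical choices): adding independent $N(0, \sigma^2 I_K)$ noise to a singular three-moment system makes the estimated variance matrix nonsingular, so the ordinary inverse can be used as the weighting matrix:

```python
import numpy as np
from scipy.optimize import minimize_scalar

rng = np.random.default_rng(7)
n = 5000
x = rng.normal(loc=1.5, scale=1.0, size=n)   # X ~ N(theta_0, 1), theta_0 = 1.5
sigma = 0.1                                  # hypothetical small noise level

# Independent N(0, sigma^2 I_K) noise added to a singular three-moment system
eps = sigma * rng.normal(size=(n, 3))

def g_star(theta):
    g1 = x - theta
    return np.column_stack([g1, x**2 - theta**2 - 1.0, 2.0 * g1]) + eps

def Q(theta, W):
    gbar = g_star(theta).mean(axis=0)
    return gbar @ W @ gbar

t1 = minimize_scalar(Q, args=(np.eye(3),), bounds=(0.0, 3.0), method="bounded").x
Gm = g_star(t1)
Omega_star = Gm.T @ Gm / n                   # nonsingular: estimates Omega + sigma^2 I
t2 = minimize_scalar(Q, args=(np.linalg.inv(Omega_star),),
                     bounds=(0.0, 3.0), method="bounded").x
```

With a small `sigma`, the estimate `t2` should remain close to the true value 1.5 at a modest efficiency cost.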
5. Concluding remarks
Since moment condition models do not require researchers to specify the likelihood function of the data generating process, they have been widely used by econometricians to model economic theories. Though it is desirable that the moment conditions constructed from economic theory be linearly independent, in practice this may not always be the case. Sometimes singularity is inherent in the model or is caused by singular transformations. In this paper, we extended efficient GMM estimation to linearly dependent moment condition models. The result can be viewed as a natural extension of the standard GMM theory, since the generalised inverse of a matrix is a natural extension of the matrix inverse. Though in theory using generalised inverses yields efficient GMM estimators, in practice one must be cautious in using them, in light of two concerns. First, using generalised inverses ignores the intrinsic structure of the moment conditions, which sometimes contains important information. Second, the generalised inverses of a singular matrix are numerically unstable, which can induce serious bias in the resulting GMM estimator. Therefore, when there is singularity in the system, a practical strategy is to obtain an essential moment vector and apply GMM to it. In case an essential moment vector is not available, we can add random noises to the moment conditions and obtain GMM estimators based on the new set of moment conditions, using generalised inverses only with discretion. The results in this paper might also shed light on other popular statistical methods (such as the empirical likelihood) for estimating equations with singularity.
Disclosure statement
No potential conflict of interest was reported by the author.
Additional information
Funding
Notes on contributors
Zhiguo Xiao
Dr. Zhiguo Xiao received his PhD in statistics from the University of Wisconsin-Madison. His research interest includes both theoretical and applied econometrics.
Notes
1 For the existence and uniqueness of the Moore–Penrose generalised inverse, see, e.g., Penrose (Citation1955) and Abadir and Magnus (Citation2005, pp. 284–285).
2 More results on reflexive generalised inverses can be found in Rao and Mitra (Citation1971), Rao (2001), Bapat (Citation2012), Fampa and Lee (Citation2018), and Xu, Fampa, and Lee (Citation2019).
3 Similar results for square matrices can be found in Bapat (Citation2012, pp. 47–48).
4 A sequence of real $m \times n$ matrices $\{A_n\}$ is said to converge to an $m \times n$ matrix A if $\|A_n - A\| \to 0$, where $\|\cdot\|$ is a matrix norm, such as the Euclidean (Frobenius) norm or the spectral norm.
5 For example, consider $A_n = \mathrm{diag}(1, 1/n)$ and $A = \mathrm{diag}(1, 0)$. Then $A_n \to A$. Since $A_n$ is invertible, $A_n^+ = A_n^{-1} = \mathrm{diag}(1, n)$. Hence $A_n^+ \not\to A^+ = \mathrm{diag}(1, 0)$.
6 For example, let A be a singular matrix with Moore–Penrose generalised inverse $A^+$. Adding a small number $\delta$ to the first two diagonal entries of A may change its rank, in which case the Moore–Penrose generalised inverse of the perturbed matrix contains entries of order $1/\delta$ and is far from $A^+$.
7 This idea is similar to bootstrap aggregating (bagging).
References
- Abadir, K. M., & Magnus, J. R. (2005). Matrix algebra. New York: Cambridge University Press.
- Alessi, L., Barigozzi, M., & Capasso, M. (2011). Non-fundamentalness in structural econometric models: A review. International Statistical Review, 79, 16–47. doi: 10.1111/j.1751-5823.2011.00131.x
- Bapat, R. B. (2012). Linear algebra and linear models (3rd ed.). London: Springer-Verlag.
- Barten, A. (1969). Maximum likelihood estimation of a complete system of demand equations. European Economic Review, 1, 7–73. doi: 10.1016/0014-2921(69)90017-8
- Barten, A. (1977). The system of consumer demand function approach: A review. Econometrica, 45, 23–51. doi: 10.2307/1913286
- Berndt, E. R., & Christensen, L. R. (1974). Testing for the existence of a consistent aggregate index of labor inputs. American Economics Review, 64, 391–403.
- Bierens, H. J. (2007). Econometric analysis of linearized singular dynamic stochastic general equilibrium models. Journal of Econometrics, 136, 595–627. doi: 10.1016/j.jeconom.2005.11.008
- Bierens, H. J., & Swanson, N. R. (2000). The econometric consequences of the ceteris paribus condition in economic theory. Journal of Econometrics, 95, 223–253. doi: 10.1016/S0304-4076(99)00038-X
- Biørn, E. (2000). Panel data with measurement errors: Instrumental variables and GMM procedures combining levels and differences. Econometric Reviews, 19(4), 391–424. doi: 10.1080/07474930008800480
- Biørn, E., & Klette, T. (1998). Panel data with errors-in-variables: Essential and redundant orthogonal conditions in GMM-estimation. Economics Letters, 59, 275–282. doi: 10.1016/S0165-1765(98)00053-6
- Chen, Y., Hong, C., & Riley, R. D. (2014). An alternative pseudolikelihood method for multivariate random-effects meta-analysis. Statistics in Medicine, 34(3), 361–380. doi: 10.1002/sim.6350
- Dhrymes, P. J. (1962). On devising unbiased estimators for the parameters of a Cobb–Douglas production function. Econometrica, 30, 297–304. doi: 10.2307/1910218
- Dhrymes, P. J., & Schwarz, S. (1987). On the existence of generalized inverse estimators in a singular system of equations. Journal of Forecasting, 6, 181–192. doi: 10.1002/for.3980060304
- Fampa, M., & Lee, J. (2018). On sparse reflexive generalized inverses. Operations Research Letters, 46(6), 605–610. doi: 10.1016/j.orl.2018.09.005
- Hall, A. (2005). Generalized method of moments. New York: Oxford University Press.
- Hansen, L. (1982). Large sample properties of generalized method of moments estimators. Econometrica, 50, 1029–1054. doi: 10.2307/1912775
- Hansen, B., & West, K. (2002). Generalized method of moments and macroeconomics. Journal of Business & Economic Statistics, 20, 460–469. doi: 10.1198/073500102288618603
- Haupt, H., & Oberhofer, W. (2006). Generalized adding-up in systems of regression equations. Economics Letters, 92, 263–269. doi: 10.1016/j.econlet.2006.03.001
- Ireland, P. N. (2004). A method for taking models to the data. Journal of Economic Dynamics and Control, 28, 1205–1226. doi: 10.1016/S0165-1889(03)00080-0
- Kreijger, R. G., & Neudecker, H. (1977). Exact linear restrictions on parameters in the general linear model with a singular covariance matrix. Journal of the American Statistical Association, 72, 430–432. doi: 10.1080/01621459.1977.10481014
- Lai, H. (2008). Maximum likelihood estimation of singular systems of equations. Economics Letters, 99, 51–54. doi: 10.1016/j.econlet.2007.05.027
- Leeper, E. M., Walker, T. B., & Yang, S.-C. S. (2013). Fiscal foresight and information flows. Econometrica, 81, 1115–1145. doi: 10.3982/ECTA8337
- Mountford, A., & Uhlig, H. (2009). What are the effects of fiscal policy shocks?. Journal of Applied Econometrics, 24, 960–992. doi: 10.1002/jae.1079
- Penrose, R. (1955). A generalized inverse for matrices. Mathematical Proceedings of the Cambridge Philosophical Society, 51, 406–413. doi: 10.1017/S0305004100030401
- Rao, C. R. (1972). Alternative econometric models of sales advertising relationships. Journal of Marketing Research, 9, 171–181. doi: 10.1177/002224377200900209
- Rao, C. R., & Mitra, S. K. (1971). Generalized inverse of matrices and its applications. New York: John Wiley & Sons.
- Riley, R., Abrams, K., Lambert, P., Sutton, A., & Thompson, J. (2007). An evaluation of bivariate random-effects meta-analysis for the joint synthesis of two correlated outcomes. Statistics in Medicine, 26(1), 78–97. doi: 10.1002/sim.2524
- Schneeweiss, H. (2014). The linear GMM model with singular covariance matrix due to the elimination of a nuisance parameter. Technical Report 165, Dept. Statistics, Univ. Munich.
- Stewart, G. W. (1969). On the continuity of the generalized inverse. SIAM Journal on Applied Mathematics, 17, 33–45. doi: 10.1137/0117004
- Theil, H. (1971). Principles of econometrics. New York: John Wiley & Sons.
- Velasco, C., & Lobato, I. N. (2018). Frequency domain minimum distance inference for possibly noninvertible and noncausal ARMA models. The Annals of Statistics, 46(2), 555–579. doi: 10.1214/17-AOS1560
- Wansbeek, T. J. (2001). GMM Estimation in panel data models with measurement error. Journal of Econometrics, 104, 259–268. doi: 10.1016/S0304-4076(01)00079-3
- Weiss, D. (1968). Determinants of market share. Journal of Marketing Research, 5, 290–295. doi: 10.1177/002224376800500307
- White, H. (1986). Instrumental variables analogs of generalized least squares estimators. Advances in Statistical Analysis and Statistical Computing, 1, 173–227.
- Xiao, Z. (2008). Generalized inverses in GMM estimation with redundant moment conditions. In Topics in Generalized Method of Moments Estimation with Application to Panel Data with Measurement Error, University of Wisconsin-Madison PhD thesis.
- Xiao, Z., Shao, J., & Palta, M. (2010a). GMM in linear regression for longitudinal data with multiple covariates measured with error. Journal of Applied Statistics, 37, 791–805. doi: 10.1080/02664760902890005
- Xiao, Z., Shao, J., & Palta, M. (2010b). Instrumental variable and GMM estimation of panel data with measurement error. Statistica Sinica, 20(4), 1725–1747.
- Xiao, Z., Shao, J., Xu, R., & Palta, M. (2007). Efficiency of GMM estimation in panel data models with measurement error. Sankhya: The Indian Journal of Statistics, 69, 101–118.
- Xu, L., Fampa, M., & Lee, J. (2019). Aspects of symmetry for sparse reflexive generalized inverses. arXiv:1903.05744v1[math.OC].
Appendix
Proofs of results
Proof of Lemma 3.1
Let $\Gamma = \mathrm{Var}(Y)$, and let $\Gamma = P \Lambda P'$ be the spectral decomposition of $\Gamma$, with $\Lambda = \mathrm{diag}(\lambda_1, \dots, \lambda_r, 0, \dots, 0)$ and $\lambda_1 \ge \dots \ge \lambda_r > 0$. Write $\Gamma = H H'$, where $H = P_1 \Lambda_r^{1/2}$, $P_1$ consists of the first r columns of P and $\Lambda_r = \mathrm{diag}(\lambda_1, \dots, \lambda_r)$; then H is $m \times r$ of full column rank. Choose r linearly independent rows of H, let $H_1$ be the $r \times r$ matrix formed by these rows, and let $Y_1$ be the corresponding subvector of Y. Then $\mathrm{Var}(Y_1) = H_1 H_1'$, which is nonsingular since $H_1$ is nonsingular. This shows that $\mathrm{Var}(Y_1)$ is positive definite and proves (i).
For (ii), let $Y_2$ be the $(m-r) \times 1$ vector of the remaining components of Y, and let $H_2$ consist of the remaining rows of H, so that $\mathrm{Var}(Y_2) = H_2 H_2'$ and $\mathrm{Cov}(Y_2, Y_1) = H_2 H_1'$. Let $C = H_2 H_1^{-1}$. Then
$\mathrm{Var}(Y_2 - C Y_1) = H_2 H_2' - C H_1 H_2' - H_2 H_1' C' + C H_1 H_1' C' = O,$
hence $Y_2 - C Y_1$ equals a constant vector d w.p.1., i.e., $Y_2 = C Y_1 + d$ w.p.1. Since $Y_1$ and $Y_2$ are subvectors of Y, there exists an $m \times m$ nonsingular permutation matrix A such that $Y = A (Y_1', Y_2')'$, i.e., $Y = A (I_r, C')' Y_1 + A (0', d')'$ w.p.1. Let $B = A (I_r, C')'$ and $b = A (0', d')'$. Since A is nonsingular and $(I_r, C')'$ is of full column rank, B is of full column rank.
For (iii), if $\mathrm{E}\,Y = 0$, then $\mathrm{E}\,Y_1 = 0$ since $Y_1$ is a subvector of Y; taking expectations in $Y = B Y_1 + b$ gives $b = 0$, i.e., $Y = B Y_1$ w.p.1.
Proof of Theorem 3.1
Let V(W) denote the asymptotic variance of the GMM estimator using weighting matrix W. Then $V(W) = (G'WG)^{-1} G'W\Omega WG\, (G'WG)^{-1}$. Let $\Omega^-$ be a reflexive generalised inverse of $\Omega$. Then we have $\Omega^- \Omega \Omega^- = \Omega^-$, hence
$V(\Omega^-) = (G' \Omega^- G)^{-1} G' \Omega^- \Omega \Omega^- G\, (G' \Omega^- G)^{-1} = (G' \Omega^- G)^{-1}.$
To establish $V(W) \ge V(\Omega^-)$, we just need to show that $G'W\Omega WG \ge (G'WG)(G' \Omega^- G)^{-1}(G'WG)$. By Lemma 3.1, there exist an essential subvector $g_1$ and a matrix B of full column rank such that $g(X, \theta_0) = B g_1(X, \theta_0)$ a.s., with $\Omega_1 = \mathrm{Var}(g_1(X, \theta_0))$ positive definite. Then $\Omega = B \Omega_1 B'$ and $G = B G_1$, with $G_1 = \mathrm{E}[\partial g_1(X, \theta_0)/\partial\theta']$. By Proposition 2.1, $B' \Omega^- B = \Omega_1^{-1}$, hence $G' \Omega^- G = G_1' \Omega_1^{-1} G_1$. Let $D = B'WB G_1$; then $G'WG = G_1' D$ and $G'W\Omega WG = D' \Omega_1 D$. So we just need to show that
$D' \Omega_1 D \ge D' G_1 (G_1' \Omega_1^{-1} G_1)^{-1} G_1' D,$
which follows from $G_1 (G_1' \Omega_1^{-1} G_1)^{-1} G_1' \le \Omega_1$, since
$\Omega_1^{-1/2} G_1 (G_1' \Omega_1^{-1} G_1)^{-1} G_1' \Omega_1^{-1/2}$
is idempotent and symmetric, hence bounded above by $I_r$.
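The inequality established in the proof can be spot-checked numerically; all matrices below are arbitrary illustrative draws, not from the paper:

```python
import numpy as np

rng = np.random.default_rng(9)
r, K, p = 2, 4, 2
B = rng.normal(size=(K, r))                  # full column rank with probability one
M = rng.normal(size=(r, r))
Omega1 = M @ M.T + r * np.eye(r)             # positive definite
Omega = B @ Omega1 @ B.T                     # singular K x K matrix
G1 = rng.normal(size=(r, p))
G = B @ G1

def V(W):
    # Sandwich form of the asymptotic variance V(W)
    GWG_inv = np.linalg.inv(G.T @ W @ G)
    return GWG_inv @ (G.T @ W @ Omega @ W @ G) @ GWG_inv

V_opt = np.linalg.inv(G1.T @ np.linalg.inv(Omega1) @ G1)  # (G' Omega^- G)^{-1}
assert np.allclose(V(np.linalg.pinv(Omega)), V_opt)

# For an arbitrary positive definite W, V(W) - V(Omega^-) is nonnegative definite
C = rng.normal(size=(K, K))
W = C @ C.T + np.eye(K)
diff = V(W) - V_opt
assert np.min(np.linalg.eigvalsh((diff + diff.T) / 2.0)) > -1e-8
```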