Abstract
In this article we consider the partitioned linear model $\mathscr{M}_{12} = \{y, X_1\beta_1 + X_2\beta_2, V\}$ and the corresponding small model $\mathscr{M}_1 = \{y, X_1\beta_1, V\}$. We focus on comparing the best linear unbiased estimators, BLUEs, of $X_1\beta_1$ under $\mathscr{M}_{12}$ and $\mathscr{M}_1$. In other words, we are interested in the effect of adding regressors on the BLUEs. Particular attention is paid to the consistency of the model, that is, whether the realized value of the response vector y belongs to the column space of $(X_1 : X_2 : V)$ or of $(X_1 : V)$.
1. Introduction
In this article we consider the partitioned linear model $\mathscr{M}_{12} = \{y, X\beta, V\} = \{y, X_1\beta_1 + X_2\beta_2, V\}$ and the so-called small model (submodel) $\mathscr{M}_1 = \{y, X_1\beta_1, V\}$, or shortly
$$y = X\beta + \varepsilon = X_1\beta_1 + X_2\beta_2 + \varepsilon \quad \text{and} \quad y = X_1\beta_1 + \varepsilon.$$
Here y is an n-dimensional observable response variable, and $\varepsilon$ is an unobservable random error with a known covariance matrix $\mathrm{cov}(\varepsilon) = V$ and expectation $\mathrm{E}(\varepsilon) = 0$. The matrix X is a known $n \times p$ matrix, that is, $X \in \mathbb{R}^{n \times p}$, partitioned columnwise as $X = (X_1 : X_2)$. Vector $\beta = (\beta_1', \beta_2')'$ is a vector of fixed (but unknown) parameters; the symbol $'$ stands for the transpose.
As for notation, $r(A)$, $A^-$, $A^+$, $\mathscr{C}(A)$, $\mathscr{N}(A)$, and $\mathscr{C}(A)^\perp$ denote, respectively, the rank, a generalized inverse, the (unique) Moore–Penrose inverse, the column space, the null space, and the orthogonal complement of the column space of the matrix A. By $A^\perp$ we denote any matrix satisfying $\mathscr{C}(A^\perp) = \mathscr{C}(A)^\perp$. Furthermore, we will write $P_A = AA^+ = A(A'A)^-A'$ to denote the orthogonal projector onto $\mathscr{C}(A)$. The orthogonal projector onto $\mathscr{C}(A)^\perp$ is denoted as $M_A = I_a - P_A$, where $I_a$ is the $a \times a$ identity matrix and a is the number of rows of A. We write shortly $M = I_n - P_X$, $M_1 = I_n - P_{X_1}$, and $M_2 = I_n - P_{X_2}$. One obvious choice for $X^\perp$ is M.
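The projector notation is easy to make concrete. The following minimal NumPy sketch (our illustration, not part of the paper; the test matrix X is hypothetical) computes $P_X$ and $M = I_n - P_X$ and checks the defining properties:

```python
import numpy as np

# Illustration: orthogonal projectors P_X and M = I_n - P_X.
rng = np.random.default_rng(0)
X = rng.standard_normal((5, 2))          # hypothetical 5 x 2 model matrix

P_X = X @ np.linalg.pinv(X)              # P_X = X X^+, projector onto C(X)
M = np.eye(5) - P_X                      # projector onto C(X)^perp

# M is symmetric, idempotent, and annihilates X: exactly the defining
# properties of the orthogonal projector onto C(X)^perp.
assert np.allclose(M, M.T) and np.allclose(M @ M, M)
assert np.allclose(M @ X, 0)             # so M is one obvious choice for X^perp
```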
When using generalized inverses it is essential to know whether the expressions are independent of the choice of the generalized inverses involved. The following lemma gives an important invariance condition; cf. Rao and Mitra (1971, Lemma 2.2.4).
Lemma 1.1.
For nonnull matrices A and C, the product $AB^-C$ is invariant with respect to the choice of the generalized inverse $B^-$ if and only if $\mathscr{C}(C) \subseteq \mathscr{C}(B)$ and $\mathscr{C}(A') \subseteq \mathscr{C}(B')$.
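The invariance in Lemma 1.1 can be verified numerically. In the sketch below (ours; the matrices are hypothetical and chosen so that the two column space conditions hold), random generalized inverses of B are generated from the well-known general form $B^- = B^+ + (I - B^+B)R + S(I - BB^+)$:

```python
import numpy as np

# Numerical sketch of Lemma 1.1: A B^- C is the same for every g-inverse B^-
# when C(C) is contained in C(B) and C(A') is contained in C(B').
rng = np.random.default_rng(1)
F = rng.standard_normal((5, 3))
B = F @ F.T                                  # singular 5 x 5, C(B) = C(F)
A = (F @ rng.standard_normal((3, 2))).T      # C(A') in C(B)  (B symmetric)
C = F @ rng.standard_normal((3, 2))          # C(C)  in C(B)

def random_ginv(B, rng):
    # General g-inverse: B^- = B^+ + (I - B^+ B) R + S (I - B B^+).
    Bp = np.linalg.pinv(B)
    m, n = B.shape
    R = rng.standard_normal((n, m))
    S = rng.standard_normal((n, m))
    return Bp + (np.eye(n) - Bp @ B) @ R + S @ (np.eye(m) - B @ Bp)

vals = [A @ random_ginv(B, rng) @ C for _ in range(5)]
assert all(np.allclose(vals[0], V) for V in vals)   # invariance holds
```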
For a given linear model $\mathscr{M} = \{y, X\beta, V\}$, let the set $\mathscr{W}$ of nonnegative definite matrices be defined as
$$\mathscr{W} = \{\, W \in \mathbb{R}^{n \times n} : W = V + XUU'X',\ \mathscr{C}(W) = \mathscr{C}(X : V) \,\}. \tag{1.1}$$
In (1.1), U can be any matrix comprising p rows as long as $\mathscr{C}(W) = \mathscr{C}(X : V)$ is satisfied. Lemma 1.2 collects together some important properties of the class $\mathscr{W}$; see, for example, Puntanen, Styan, and Isotalo (2011, Prop. 12.1 and 15.2).
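A member of the class (1.1) is easy to construct in practice. A minimal sketch (ours, with hypothetical data; the choice $U = I_p$ is one convenient option):

```python
import numpy as np

# Construct W = V + X U U' X' as in (1.1) and verify C(W) = C(X : V) via ranks.
rng = np.random.default_rng(2)
n, p = 6, 2
X = rng.standard_normal((n, p))
L = rng.standard_normal((n, 3))
V = L @ L.T                                  # singular nnd covariance matrix

U = np.eye(p)                                # U has p rows; here U U' = I_p
W = V + X @ U @ U.T @ X.T

rank = np.linalg.matrix_rank
# C(W) is always contained in C(X : V) by construction, so the column space
# equality amounts to the rank equality below.
assert rank(W) == rank(np.hstack([X, V]))
```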
Lemma 1.2.
Consider the model $\mathscr{M} = \{y, X\beta, V\}$ and let $W \in \mathscr{W}$. Then
$$X(X'W^-X)^-X'W^-X = X. \tag{1.2}$$
Moreover, the following statements are equivalent:
(a) $y \in \mathscr{C}(W)$;
(b) $y \in \mathscr{C}(X : V)$;
(c) $X'W^-y$ is invariant for any choice of $W^-$;
(d) $X(X'W^-X)^-X'W^-y$ is invariant for any choice of $W^-$;
(e) $X(X'W^-X)^-X'W^-y$ is invariant for any choices of $W^-$ and $(X'W^-X)^-$.
It is noteworthy that the matrix in (1.2) is invariant with respect to the choice of the generalized inverses denoted as “$-$”, and it is independent of the choice of $W \in \mathscr{W}$. Notice also that the invariance properties in (d) and (e) in Lemma 1.2 are valid for all choices of $W \in \mathscr{W}$. It is clear that $y \in \mathscr{C}(W)$ if and only if $y \in \mathscr{C}(X : V)$.
In Lemma 1.2, the matrix W is nonnegative definite, denoted as $W \in \mathrm{NND}_n$. A corresponding version of Lemma 1.2 can be presented for a matrix of the form $W = V + XUX'$, which may not be symmetric but satisfies $\mathscr{C}(W) = \mathscr{C}(X : V)$.
Corresponding to (1.1), we will say that $W_1 \in \mathscr{W}_1$ if there exists a matrix $U_1$ comprising $p_1$ rows such that
$$W_1 = V + X_1U_1U_1'X_1', \qquad \mathscr{C}(W_1) = \mathscr{C}(X_1 : V). \tag{1.3}$$
For the partitioned linear model $\mathscr{M}_{12}$ we will say that $W \in \mathscr{W}$ if $W = V + XUU'X'$ and $\mathscr{C}(W) = \mathscr{C}(X : V)$, where W and $W_1$ are defined as in (1.1) and (1.3). For our considerations the actual choice of W and $W_1$ does not matter as long as they satisfy (1.1) and (1.3).
By the consistency of the model $\mathscr{M}$ it is meant that y lies in $\mathscr{C}(X : V)$ with probability 1. Hence we assume that under the consistent model $\mathscr{M}$ the observed numerical value of y satisfies
$$y \in \mathscr{C}(X : V) = \mathscr{C}(X : VM) = \mathscr{C}(X) \oplus \mathscr{C}(VM),$$
where “$\oplus$” refers to the direct sum and “$\boxplus$” refers to the direct sum of orthogonal subspaces. For the equality $\mathscr{C}(X : V) = \mathscr{C}(X) \oplus \mathscr{C}(VM)$, see Rao (1974, Lemma 2.1).
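Since the sum above is direct, the ranks must add up, which gives a convenient numerical test. A minimal sketch (ours, with hypothetical data):

```python
import numpy as np

# Rank check of C(X : V) = C(X) (+) C(VM): for a direct sum the ranks add up.
rng = np.random.default_rng(3)
n = 6
X = rng.standard_normal((n, 2))
L = rng.standard_normal((n, 3))
V = L @ L.T                                  # singular nnd covariance matrix

M = np.eye(n) - X @ np.linalg.pinv(X)        # M = I_n - P_X
rank = np.linalg.matrix_rank

assert rank(np.hstack([X, V])) == rank(X) + rank(V @ M)
```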
For parts (a) and (b) of Lemma 1.3, see, for example, Puntanen, Styan, and Isotalo (2011, Th. 8), and for part (c), see the rank rule of the matrix product of Marsaglia and Styan (1974, Cor. 6.2). Claim (d) is straightforward to confirm.
Lemma 1.3.
Consider $\mathscr{M}_{12}$ and let $W \in \mathscr{W}$. Then:
(a) $\mathscr{C}(X : V) = \mathscr{C}(X_1 : X_2 : VM) = \mathscr{C}(W)$;
(b) $r(X : V) = r(X) + r(VM)$;
(c) $r(M_2X_1) = r(X_1) - \dim \mathscr{C}(X_1) \cap \mathscr{C}(X_2)$;
(d) $\mathscr{C}(M_2W_1M_2) = \mathscr{C}(M_2X_1U_1 : M_2VM_2)$, where $W_1 = V + X_1U_1U_1'X_1'$.
For Lemma 1.4, see, for example, Puntanen, Styan, and Isotalo (2011, p. 152).
Lemma 1.4.
For conformable matrices A and B the following three statements are equivalent:
(a) $\mathscr{C}(A) \subseteq \mathscr{C}(B)$;
(b) $P_BA = A$;
(c) $BB^-A = A$ for some (and hence for every) generalized inverse $B^-$.
If any of the above conditions holds, then $r(B : A) = r(B)$.
Let A and B be arbitrary $m \times n$ matrices. Then, in the consistent linear model $\mathscr{M}$ the estimators Ay and By are said to be equal (with probability 1) if
$$Ay = By \quad \text{for all } y \in \mathscr{C}(X : V) = \mathscr{C}(W), \tag{1.4}$$
where $W \in \mathscr{W}$. Thus, if A and B satisfy (1.4), then $A = B + C(I_n - P_W)$ for some matrix C. It is crucial to notice that in (1.4) we are dealing with the “statistical” equality of the estimators Ay and By. In (1.4), y refers to a vector in $\mathscr{C}(X : V)$. Thus we do not make any notational difference between a random vector and its observed value.
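The distinction matters in computation as well: two multipliers can agree on $\mathscr{C}(X : V)$ and still differ elsewhere. A minimal sketch (ours, with hypothetical data):

```python
import numpy as np

# Two multipliers A and B that agree on C(X : V) but differ on the rest of R^n,
# illustrating the "statistical" equality (1.4).
rng = np.random.default_rng(4)
n = 5
X = rng.standard_normal((n, 2))
L = rng.standard_normal((n, 2))
V = L @ L.T

T = np.hstack([X, V])                        # C(T) = C(X : V), rank 4 < n here
P_T = T @ np.linalg.pinv(T)
A = rng.standard_normal((n, n))
C = rng.standard_normal((n, n))
B = A + C @ (np.eye(n) - P_T)                # A and B differ only outside C(T)

y_in = T @ rng.standard_normal(T.shape[1])   # a response consistent with the model
assert np.allclose(A @ y_in, B @ y_in)       # equal with probability 1
y_out = rng.standard_normal(n)               # generic y not in C(X : V)
print(np.allclose(A @ y_out, B @ y_out))     # typically False
```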
According to the well-known fundamental BLUE equation, see Lemma 2.1 in Section 2, Gy is the $\mathrm{BLUE}$ of $X\beta$ if and only if $G(X : VM) = (X : 0)$. Obviously $[G + N(I_n - P_W)]\,y$ is another representation of $\mathrm{BLUE}(X\beta)$ for any $n \times n$ matrix N. However, the equality $Gy = [G + N(I_n - P_W)]\,y$ holds when the model is consistent in the sense that $y \in \mathscr{C}(X : V)$. The properties of the $\mathrm{BLUE}$ deserve particular attention when $\mathscr{C}(X : V) = \mathbb{R}^n$ does not hold: then there is an infinite number of multipliers B such that By is $\mathrm{BLUE}(X\beta)$, but for all such multipliers the vector By itself is unique once the response y has been observed. The case of two linear models, $\{y, X\beta, V_1\}$ and $\{y, X\beta, V_2\}$, which differ in their covariance matrices, is extensively studied by Mitra and Moore (1973). They ask, for example, when a specific linear representation of the $\mathrm{BLUE}$ of $X\beta$ under the first model is also a $\mathrm{BLUE}$ under the second model, and when the $\mathrm{BLUE}$ of $X\beta$ under the first model, irrespective of the linear representation used in its expression, is also a $\mathrm{BLUE}$ under the second model.
The purpose of this paper is to consider the models $\mathscr{M}_{12}$ and $\mathscr{M}_1$ in the spirit of Mitra and Moore (1973). We pick up particular fixed representations for the $\mathrm{BLUE}$s of $X_1\beta_1$ under these two models, say $G_{12}\,y$ and $G_1\,y$, and study the conditions under which they are equal for all values of $y \in \mathscr{C}(X_1 : X_2 : V)$ or $y \in \mathscr{C}(X_1 : V)$, that is,
$$G_{12}\,y = G_1\,y. \tag{1.5}$$
Moreover, we review the conditions under which (1.5) holds for all representations of the $\mathrm{BLUE}$s, not only for fixed $G_{12}$ and $G_1$. Some related considerations were made by Haslett, Markiewicz, and Puntanen (2020) when these models are supplemented with a new unobservable random vector coming from $y_* = X_*\beta + \varepsilon_*$, where the covariance matrix of $y_*$ is known, as well as the cross-covariance matrix between $y_*$ and y.
The well-known (or pretty well-known) results are given as Lemmas, while the new (or at least not so well-known) results are presented as Propositions.
2. The fundamental BLUE equations
A linear statistic By is said to be a linear unbiased estimator, LUE, for the parametric function $K\beta$ in $\mathscr{M}$ if its expectation is equal to $K\beta$ for all $\beta$, which happens if and only if $BX = K$; in this case $K\beta$ is said to be estimable. The LUE By is the best linear unbiased estimator, $\mathrm{BLUE}$, of estimable $K\beta$ if By has the smallest covariance matrix in the Löwner sense among all LUEs of $K\beta$. It is well known that $K\beta$ is estimable under $\mathscr{M}_{12}$ if and only if $\mathscr{C}(K') \subseteq \mathscr{C}(X')$.
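The estimability condition $\mathscr{C}(K') \subseteq \mathscr{C}(X')$ reduces to a rank comparison, which is easy to test numerically. A minimal sketch (ours; the matrices are hypothetical, with $K\beta = X_1\beta_1$ in a partitioned model):

```python
import numpy as np

# Estimability test: K*beta is estimable iff C(K') is contained in C(X').
rng = np.random.default_rng(9)
n = 7
X1 = rng.standard_normal((n, 2))
X2 = rng.standard_normal((n, 1))
X = np.hstack([X1, X2])
K = np.hstack([X1, np.zeros((n, 1))])        # K*beta = X1*beta1

rank = np.linalg.matrix_rank
# C(K') in C(X')  <=>  stacking K under X does not increase the rank.
estimable = rank(np.vstack([X, K])) == rank(X)
print("X1*beta1 estimable:", estimable)      # True for generic X1, X2
```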
For Lemma 2.1, characterizing the $\mathrm{BLUE}$, see, for example, Rao (1973, p. 282).
Lemma 2.1.
Consider the model $\mathscr{M} = \{y, X\beta, V\}$, where $K\beta$ is estimable. Then $Ay$ is the $\mathrm{BLUE}$ of $X\beta$, that is, $Ay = \mathrm{BLUE}(X\beta \mid \mathscr{M})$, if and only if $A(X : VM) = (X : 0)$; correspondingly, $By = \mathrm{BLUE}(K\beta \mid \mathscr{M})$ if and only if $B(X : VM) = (K : 0)$. In particular, if $X_1\beta_1$ is estimable under $\mathscr{M}_{12}$, then $Cy = \mathrm{BLUE}(X_1\beta_1 \mid \mathscr{M}_{12})$ if and only if $C(X : VM) = (X_1 : 0)$.
Of course, under the model $\mathscr{M}_{12}$ we have $\mathrm{E}(y) = X\beta = X_1\beta_1 + X_2\beta_2$. To indicate that $Gy = \mathrm{BLUE}(K\beta \mid \mathscr{M})$ we will also use the notations $Gy = \mathrm{BLUE}(K\beta)$ and $G \in \{\mathrm{BLUE}(K\beta \mid \mathscr{M})\}$.
Using Lemma 1.2 we can obtain, for example, the following well-known solution to A in Lemma 2.1:
$$A = X(X'W^-X)^-X'W^-, \quad \text{where } W \in \mathscr{W},$$
and we can freely choose the generalized inverses involved. The expression $X(X'W^-X)^-X'W^-$ is not necessarily unique with respect to the choice of the generalized inverses $W^-$ and $(X'W^-X)^-$, but by Lemma 1.2, the vector $X(X'W^-X)^-X'W^-y$, $y \in \mathscr{C}(W)$, is unique whatever choices of $W^-$ and $(X'W^-X)^-$ we have, and moreover, it does not depend on the choice of $W \in \mathscr{W}$. The general solution for A in Lemma 2.1 can be expressed, for example, as
$$A = X(X'W^-X)^-X'W^- + N(I_n - P_W),$$
where N is free to vary. Thus the solution for A (as well as for B and C) in Lemma 2.1 is unique if and only if $\mathscr{C}(X : V) = \mathbb{R}^n$.
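The representation above can be checked against the fundamental BLUE equation directly. A minimal NumPy sketch (ours, with hypothetical data; Moore–Penrose inverses are used as the generalized inverses):

```python
import numpy as np

# Check that G = X (X' W^- X)^- X' W^- satisfies G(X : VM) = (X : 0).
rng = np.random.default_rng(5)
n = 6
X = rng.standard_normal((n, 2))
L = rng.standard_normal((n, 4))
V = L @ L.T

W = V + X @ X.T                              # a member of the class W in (1.1)
pinv = np.linalg.pinv
G = X @ pinv(X.T @ pinv(W) @ X) @ X.T @ pinv(W)

M = np.eye(n) - X @ pinv(X)
assert np.allclose(G @ X, X)                 # unbiasedness part: G X = X
assert np.allclose(G @ V @ M, 0)             # G VM = 0
```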
Consider then the estimation of $X_1\beta_1$ under $\mathscr{M}_{12}$, assuming that $X_1\beta_1$ is estimable. Premultiplying the model $\mathscr{M}_{12}$ by $M_2$ yields the reduced model
$$\mathscr{M}_{12\cdot2} = \{M_2y,\ M_2X_1\beta_1,\ M_2VM_2\}.$$
Now the well-known Frisch–Waugh–Lovell theorem, see, for example, Groß and Puntanen (2000, Sec. 6), states that the $\mathrm{BLUE}$s of $X_1\beta_1$ under $\mathscr{M}_{12}$ and $\mathscr{M}_{12\cdot2}$ coincide. To obtain an explicit expression for the $\mathrm{BLUE}$ of $X_1\beta_1$ under $\mathscr{M}_{12\cdot2}$, we need a W-matrix in $\mathscr{M}_{12\cdot2}$. Now any matrix of the form
$$M_2W_1M_2 = M_2VM_2 + M_2X_1U_1U_1'X_1'M_2$$
satisfying
$$\mathscr{C}(M_2W_1M_2) = \mathscr{C}(M_2X_1 : M_2VM_2) \tag{2.1}$$
is a W-matrix in $\mathscr{M}_{12\cdot2}$. Choosing $U_1$ as in (1.3) we have $\mathscr{C}(W_1) = \mathscr{C}(X_1 : V)$, in which case (2.1) is satisfied. Thus the $\mathrm{BLUE}$ of $X_1\beta_1$ under $\mathscr{M}_{12\cdot2}$ can be expressed as $X_1(X_1'\dot M_2X_1)^-X_1'\dot M_2\,y$, where $\dot M_2 = M_2(M_2W_1M_2)^-M_2$. We observe that, for a general $U_1$, (2.1) holds if and only if $\mathscr{C}(M_2X_1) \subseteq \mathscr{C}(M_2X_1U_1 : M_2VM_2)$, that is, see part (d) of Lemma 1.3,
$$\mathscr{C}(M_2X_1U_1 : M_2VM_2) = \mathscr{C}(M_2X_1 : M_2VM_2). \tag{2.2}$$
Our conclusion: If (2.2) holds, then the $\mathrm{BLUE}$ of $X_1\beta_1$ under $\mathscr{M}_{12}$ can be expressed as
$$\mathrm{BLUE}(X_1\beta_1 \mid \mathscr{M}_{12}) = X_1(X_1'\dot M_2X_1)^-X_1'\dot M_2\,y, \tag{2.3}$$
where $\dot M_2 = M_2(M_2W_1M_2)^-M_2$. Actually, it can be shown that (2.2) is also a necessary condition for (2.3).
It is obvious that under the estimability of $X_1\beta_1$ we have
$$\mathrm{BLUE}(X_1\beta_1 \mid \mathscr{M}_{12}) = X_1(X_1'\dot M_2X_1)^-X_1'\dot M_2\,y =: G_{12}\,y, \tag{2.4a}$$
$$\mathrm{BLUE}(X_1\beta_1 \mid \mathscr{M}_{1}) = X_1(X_1'W_1^-X_1)^-X_1'W_1^-\,y =: G_1\,y, \tag{2.4b}$$
where $W_1 \in \mathscr{W}_1$.
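To make (2.4a) and (2.4b) concrete, here is a minimal NumPy sketch (ours; the data are hypothetical and V is taken positive definite so that the column space conditions hold automatically):

```python
import numpy as np

# The two multipliers in (2.4a)-(2.4b) on simulated data (Moore-Penrose
# inverses throughout; all variable names are ours).
rng = np.random.default_rng(6)
n = 7
X1 = rng.standard_normal((n, 2))
X2 = rng.standard_normal((n, 1))
L = rng.standard_normal((n, n))
V = L @ L.T                                   # positive definite V

pinv = np.linalg.pinv
W1 = V + X1 @ X1.T                            # a member of W_1 in (1.3)
M2 = np.eye(n) - X2 @ pinv(X2)
M2dot = M2 @ pinv(M2 @ W1 @ M2) @ M2          # the matrix "M_2 dot"

G12 = X1 @ pinv(X1.T @ M2dot @ X1) @ X1.T @ M2dot          # (2.4a)
G1 = X1 @ pinv(X1.T @ pinv(W1) @ X1) @ X1.T @ pinv(W1)     # (2.4b)

# Under M_12 the BLUE multiplier of X1*beta1 satisfies G12(X : VM) = (X1 : 0):
X = np.hstack([X1, X2])
M = np.eye(n) - X @ pinv(X)
assert np.allclose(G12 @ X1, X1) and np.allclose(G12 @ X2, 0)
assert np.allclose(G12 @ V @ M, 0)
print("G12 equals G1:", np.allclose(G12, G1))  # generally False
```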
An alternative expression for the $\mathrm{BLUE}$ of $X_1\beta_1$ can be obtained by premultiplying the fundamental BLUE equation $G_{12}(X : VM) = (X_1 : 0)$ by $M_2$, yielding
$$M_2G_{12}(X : VM) = (M_2X_1 : 0). \tag{2.5}$$
Because $r(M_2X_1) = r(X_1)$ under the estimability of $X_1\beta_1$, we can, by the rank cancellation rule of Marsaglia and Styan (1974), cancel $M_2$ in (2.5), and thus an alternative expression for (2.4a) is obtained by choosing $\dot M_2 = M_2(M_2WM_2)^-M_2$ with $W \in \mathscr{W}$.
Now we should pay attention to the numerous generalized inverses appearing in the representations of the $\mathrm{BLUE}$s. Namely, when the observable response y belongs to a “correct” subspace of $\mathbb{R}^n$, there is no problem with the generalized inverses. In the next section we will consider particular unique representations of the multipliers of y and study the equality of the relevant estimators, taking the space where y belongs into account.
3. Some useful matrix results
Let us denote
$$G_{12} = X_1(X_1'\dot M_2X_1)^+X_1'\dot M_2, \qquad G_1 = X_1(X_1'W_1^+X_1)^+X_1'W_1^+, \tag{3.1}$$
where $\dot M_2$ and $W_1$ are now unique (once $U_1$ in (1.3) is fixed) matrices defined as
$$\dot M_2 = M_2(M_2W_1M_2)^+M_2, \qquad W_1 = V + X_1U_1U_1'X_1'. \tag{3.2}$$
It is noteworthy that the following types of equalities hold: $\dot M_2 = M_2\dot M_2 = \dot M_2M_2$ and $W_1^+W_1X_1 = X_1$.
Now under the estimability of $X_1\beta_1$ we have $G_{12}\,y = \mathrm{BLUE}(X_1\beta_1 \mid \mathscr{M}_{12})$ and $G_1\,y = \mathrm{BLUE}(X_1\beta_1 \mid \mathscr{M}_1)$. Because $G_{12}$ and $G_1$ belong to the corresponding classes of BLUE-multipliers under $\mathscr{M}_{12}$ and $\mathscr{M}_1$, they satisfy the equation
$$G_{12}(X : VM) = (X_1 : 0), \qquad G_1(X_1 : VM_1) = (X_1 : 0). \tag{3.3}$$
Next we show that we also have
$$G_{12}P_W = G_{12}, \qquad G_1P_{W_1} = G_1. \tag{3.4}$$
We immediately observe that $G_1P_{W_1} = G_1$, because $\mathscr{C}(G_1') \subseteq \mathscr{C}(W_1^+X_1) \subseteq \mathscr{C}(W_1)$, and what remains is to show that $G_{12}P_W = G_{12}$. Now the equation $G_{12}P_W = G_{12}$ holds if and only if
$$\mathscr{C}(G_{12}') \subseteq \mathscr{C}(W). \tag{3.5}$$
Clearly (3.5) holds because
$$\mathscr{C}(G_{12}') \subseteq \mathscr{C}(\dot M_2X_1) \subseteq \mathscr{C}(M_2W_1M_2) \subseteq \mathscr{C}(W),$$
where the last inclusion follows from $\mathscr{C}(M_2W_1) \subseteq \mathscr{C}(X_2 : X_1 : V) = \mathscr{C}(W)$. Combining (3.3) and (3.4) gives the following result.
Proposition 3.1.
Assume that $X_1\beta_1$ is estimable under $\mathscr{M}_{12}$. Then
$$\mathrm{BLUE}(X_1\beta_1 \mid \mathscr{M}_{12}) = G_{12}\,y = X_1(X_1'\dot M_2X_1)^-X_1'\dot M_2\,y \quad \text{for all } y \in \mathscr{C}(W), \tag{3.6}$$
where $W \in \mathscr{W}$. Moreover, the expressions in (3.6) are invariant for any choices of the generalized inverses $(X_1'\dot M_2X_1)^-$ and $(M_2W_1M_2)^-$, as well as for the choice of $W_1 \in \mathscr{W}_1$. A corresponding equality holds between $G_1\,y$ and $\mathrm{BLUE}(X_1\beta_1 \mid \mathscr{M}_1)$ for $y \in \mathscr{C}(W_1)$. Moreover, $G_{12}P_W = G_{12}$ and $G_1P_{W_1} = G_1$.
We will also need the following proposition.
Proposition 3.2.
Denote
$$G_{12} = X_1(X_1'\dot M_2X_1)^+X_1'\dot M_2, \qquad G_1 = X_1(X_1'W_1^+X_1)^+X_1'W_1^+,$$
where $W_1 \in \mathscr{W}_1$ and $\dot M_2 = M_2(M_2W_1M_2)^+M_2$. Then:
(a) $G_1X_1 = X_1$;
(b) $G_1G_{12} = G_{12}$;
(c) $\mathscr{C}(X_1'\dot M_2X_1) = \mathscr{C}(X_1'\dot M_2)$.
In particular, when $X_1\beta_1$ is estimable under $\mathscr{M}_{12}$, we have $G_{12}X_1 = X_1$ and $G_{12}G_1 = G_1$.
Proof.
Property (b) comes from the following:
$$G_1G_{12} = X_1(X_1'W_1^+X_1)^+X_1'W_1^+X_1(X_1'\dot M_2X_1)^+X_1'\dot M_2 = X_1(X_1'\dot M_2X_1)^+X_1'\dot M_2 = G_{12}. \tag{3.7}$$
The second equality in (3.7) follows from the fact that $X_1(X_1'W_1^+X_1)^+X_1'W_1^+X_1 = X_1$, which holds by Lemma 1.2 because $\mathscr{C}(X_1) \subseteq \mathscr{C}(W_1)$. The other statements can be confirmed in the corresponding way. □
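A quick numerical check of Proposition 3.2 (our illustration; same hypothetical setup as the earlier sketches, with positive definite V):

```python
import numpy as np

# Verify G1 G12 = G12 and, under estimability, G12 G1 = G1.
rng = np.random.default_rng(8)
n = 7
X1 = rng.standard_normal((n, 2))
X2 = rng.standard_normal((n, 1))
L = rng.standard_normal((n, n))
V = L @ L.T

pinv = np.linalg.pinv
W1 = V + X1 @ X1.T
M2 = np.eye(n) - X2 @ pinv(X2)
M2dot = M2 @ pinv(M2 @ W1 @ M2) @ M2
G12 = X1 @ pinv(X1.T @ M2dot @ X1) @ X1.T @ M2dot
G1 = X1 @ pinv(X1.T @ pinv(W1) @ X1) @ X1.T @ pinv(W1)

assert np.allclose(G1 @ G12, G12)   # Proposition 3.2(b)
assert np.allclose(G12 @ G1, G1)    # holds here since X1*beta1 is estimable
```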
Proposition 3.3 appears to be useful for our BLUE considerations, and it also provides some interesting linear algebraic matrix results. By $A^{1/2}$ we refer to the nonnegative definite square root of a nonnegative definite matrix A, and $A^{+1/2} = (A^{1/2})^+$, so that $A^{1/2}A^{+1/2} = P_A$.
Proposition 3.3.
The following five statements hold:
(a) $\mathscr{C}(W_1^{+1/2}X_1) \boxplus \mathscr{C}(W_1^{1/2}M_1) = \mathscr{C}(W_1)$;
(b) $W_1^{1/2}M_2(M_2W_1M_2)^-M_2W_1^{1/2} = P_{W_1^{1/2}M_2}$;
(c) $W_1^{1/2}\dot M_2W_1^{1/2} = P_{W_1^{1/2}M_2}$;
(d) $W_1^{1/2}\dot M_2X_1 = P_{W_1^{1/2}M_2}W_1^{+1/2}X_1$;
(e) $\mathscr{C}(M_1V) = \mathscr{C}(M_1VM_1)$.
The following three statements are equivalent:
(f) $\mathscr{C}(X_2) \subseteq \mathscr{C}(X_1 : V)$;
(g) $\mathscr{C}(W_1) = \mathscr{C}(W)$;
(h) $P_{W_1}X_2 = X_2$.
If any of the conditions (f)–(h) holds, then (i) $P_W = P_{W_1}$ and (j) $\mathscr{C}(X_2) \subseteq \mathscr{C}(W_1)$. If (f) holds, then so does (k): $\mathscr{C}(M_1X_2) \subseteq \mathscr{C}(M_1V)$.
Proof.
The first five statements (a)–(e) appear in Markiewicz and Puntanen (2019, Sec. 4). The claim (h), that is, $P_{W_1}X_2 = X_2$, holds if and only if, see Lemma 1.4,
$$\mathscr{C}(X_2) \subseteq \mathscr{C}(W_1). \tag{3.8}$$
Now (3.8) holds if and only if $\mathscr{C}(X_2) \subseteq \mathscr{C}(X_1 : V)$, that is, $\mathscr{C}(X_1 : X_2 : V) = \mathscr{C}(X_1 : V)$, which further is equivalent to (f); the equivalence with (g) is then clear because $\mathscr{C}(W_1) = \mathscr{C}(X_1 : V)$ and $\mathscr{C}(W) = \mathscr{C}(X : V)$, and (i) and (j) follow at once. Clearly (f) holds, for example, when V is positive definite. Assuming that (f) holds we can write
$$X_2 = X_1A + VB \quad \text{for some matrices } A \text{ and } B. \tag{3.9}$$
From (3.9) it follows that $M_1X_2 = M_1VB$ and hence, supposing that (f) holds, we obtain (k): $\mathscr{C}(M_1X_2) \subseteq \mathscr{C}(M_1V)$. Thus the proof is completed. □
4. Difference of the BLUEs under the full and small model
Next we introduce a particular expression for the difference $G_{12}\,y - G_1\,y$ which is valid for all $y \in \mathbb{R}^n$.
Proposition 4.1.
Consider the models $\mathscr{M}_{12}$ and $\mathscr{M}_1$ and suppose that $X_1\beta_1$ is estimable under $\mathscr{M}_{12}$. Using the earlier notation, we have for all $y \in \mathbb{R}^n$:
$$G_{12}\,y - G_1\,y = G_{12}(I_n - G_1)\,y. \tag{4.1}$$
Proof.
It is clear that $G_{12}X_1 = X_1$. Premultiplying $G_1 = X_1(X_1'W_1^+X_1)^+X_1'W_1^+$ by $G_{12}$ we observe that $G_{12}G_1 = G_1$, as $G_{12}X_1 = X_1$. Thus we have
$$G_{12}G_1 = G_1. \tag{4.2}$$
The claim (4.1) follows from (4.2). □
Proposition 4.1 was proved by Haslett and Puntanen (2010, Lemma 3.1) in the situation when $\mathscr{C}(X_2) \subseteq \mathscr{C}(X_1 : V)$. Using a different formulation and proof, it appears also in Werner and Yapar (1996, Th. 2.3). See also Sengupta and Jammalamadaka (2003, Ch. 9) and Güler, Puntanen, and Özdemir (2014). In the full rank model, that is, when X has full column rank and V is positive definite, it appears, for example, in Haslett (1996).
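The identity (4.1) can be spot-checked numerically as a matrix identity. A minimal sketch (ours; same hypothetical setup as the sketch after (2.4a)-(2.4b)):

```python
import numpy as np

# Spot-check of (4.1): G12 - G1 = G12 (I - G1) as matrices.
rng = np.random.default_rng(7)
n = 7
X1 = rng.standard_normal((n, 2))
X2 = rng.standard_normal((n, 1))
L = rng.standard_normal((n, n))
V = L @ L.T

pinv = np.linalg.pinv
W1 = V + X1 @ X1.T
M2 = np.eye(n) - X2 @ pinv(X2)
M2dot = M2 @ pinv(M2 @ W1 @ M2) @ M2
G12 = X1 @ pinv(X1.T @ M2dot @ X1) @ X1.T @ M2dot
G1 = X1 @ pinv(X1.T @ pinv(W1) @ X1) @ X1.T @ pinv(W1)

assert np.allclose(G12 - G1, G12 @ (np.eye(n) - G1))
```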
Remark 4.1.
We might be tempted to express the equality $G_{12}\,y = G_1\,y$ as
$$\mathrm{BLUE}(X_1\beta_1 \mid \mathscr{M}_{12}) = \mathrm{BLUE}(X_1\beta_1 \mid \mathscr{M}_1). \tag{4.3}$$
However, the notation used in (4.3) can be problematic when the possible values of the response vector y are taken into account. It is clear that $G_1\,y$ is the $\mathrm{BLUE}$ of $X_1\beta_1$ under $\mathscr{M}_1$, and we may write shortly $G_1\,y = \mathrm{BLUE}(X_1\beta_1 \mid \mathscr{M}_1)$. Now, there might be another estimator By for which we can also write $By = \mathrm{BLUE}(X_1\beta_1 \mid \mathscr{M}_1)$, and yet $G_1\,y$ and By may have different numerical observed values. The numerical value of the $\mathrm{BLUE}$ under $\mathscr{M}_1$ is unique if and only if y lies in $\mathscr{C}(X_1 : V)$. □
Notice that in the above considerations all the matrices $G_{12}$, $G_1$, and so on, are fixed. Let us check whether (4.1) holds for arbitrary representations $A_{12}$, $A_1$, and so on.
Corollary 4.1.
Let us denote
$$A_{12} = X_1(X_1'\dot M_2X_1)^-X_1'\dot M_2, \qquad A_1 = X_1(X_1'W_1^-X_1)^-X_1'W_1^-, \tag{4.4}$$
where the matrices $(X_1'\dot M_2X_1)^-$, $(M_2W_1M_2)^-$, and $W_1^-$ are free to vary. Then:
(a) $A_{12}\,y = G_{12}\,y$ for all $y \in \mathscr{C}(W)$;
(b) $A_1\,y = G_1\,y$ for all $y \in \mathscr{C}(W_1)$.
Moreover, the following two statements are equivalent:
(c) $\mathscr{C}(X_1 : V) = \mathbb{R}^n$;
(d) $A_1\,y = G_1\,y$ for all $y \in \mathbb{R}^n$ and for all choices of the generalized inverses involved.
Proof.
In view of Proposition 3.1, the statement (a) holds. We observe that $G_1W_1 = X_1(X_1'W_1^+X_1)^+X_1'W_1^+W_1$. Thus the statement (b), that is, the equality $A_1W_1 = G_1W_1$, holds if and only if
$$X_1(X_1'W_1^-X_1)^-X_1'W_1^-W_1 = X_1(X_1'W_1^+X_1)^+X_1'W_1^+W_1. \tag{4.5}$$
Replacing W with $W_1$ in Lemma 1.2 we observe that (4.5) indeed holds. The equivalence of (c) and (d) is obvious. □
Proposition 4.2.
Consider the models $\mathscr{M}_{12}$ and $\mathscr{M}_1$ and suppose that $X_1\beta_1$ is estimable under $\mathscr{M}_{12}$. Then the following statements are equivalent:
(a) $G_{12}\,y = G_1\,y$ for all $y \in \mathbb{R}^n$, that is, $G_{12} = G_1$;
(b) $G_{12}\,y = G_1\,y$ for all $y \in \mathscr{C}(W)$, that is, $G_{12}W = G_1W$;
(c) $\mathscr{C}(\dot M_2X_1) = \mathscr{C}(W_1^+X_1)$;
(d) $G_{12}(I_n - G_1) = 0$, that is, $G_{12}G_1 = G_{12}$.
Proof.
Consider the statement (a), which by Proposition 4.1 is obviously equivalent to (d):
$$G_{12}(I_n - G_1) = 0. \tag{4.6}$$
Now $G_{12}P_W = G_{12}$ and $G_1P_W = G_1$, see (3.4), and hence (4.6) holds if and only if
$$G_{12}(I_n - G_1)W = 0, \tag{4.7}$$
that is, $G_{12}W = G_{12}G_1W = G_1W$, where we have used (4.2). The equivalence between (a) and (b) thus follows from the equivalence between (4.6) and (4.7).
To prove that (a) and (c) are equivalent we need to show that $\mathscr{C}(G_{12}') = \mathscr{C}(\dot M_2X_1)$ and $\mathscr{C}(G_1') = \mathscr{C}(W_1^+X_1)$. It is clear that $\mathscr{C}(G_{12}') \subseteq \mathscr{C}(\dot M_2X_1)$, and under the estimability of $X_1\beta_1$ we have $r(G_{12}) = r(X_1) = r(\dot M_2X_1)$, so that the equality holds. Similarly, $\mathscr{C}(G_1') = \mathscr{C}(W_1^+X_1)$. Because $G_1$ and $G_{12}$ are idempotent, (4.6) holds if and only if $\mathscr{N}(G_1) \subseteq \mathscr{N}(G_{12})$, that is, $\mathscr{C}(G_{12}') \subseteq \mathscr{C}(G_1')$, which, in view of $r(G_{12}) = r(G_1) = r(X_1)$, is equivalent to $\mathscr{C}(G_{12}') = \mathscr{C}(G_1')$. Thus (a) is equivalent to (c). □
Remark 4.2.
Clearly (a) in Proposition 4.2 is equivalent to $G_1(X : VM) = (X_1 : 0)$, that is, (i) $G_1X_2 = 0$ and (ii) $G_1VM = 0$. Here is a question: where does the condition (ii) vanish in Proposition 4.2? In view of Proposition 4.2, the condition (i) implies that $G_{12} = G_1$, and hence trivially (ii) holds, that is, $G_1VM = G_{12}VM = 0$. However, (ii) does not imply (i). Moreover, the condition (ii) in fact always holds, because $VM = VM_1M$ and $G_1VM_1 = 0$, which by Proposition 4.3 (see below) is in harmony with the observation that the essential condition concerns $VM_1$ rather than VM. Thus we can conclude that (a) in Proposition 4.2 is equivalent to (i) alone. □
In Propositions 4.3–4.5 we assume that $X_1\beta_1$ is estimable under $\mathscr{M}_{12}$.
Proposition 4.3.
The following statements are equivalent:
(a) $G_{12}\,y = G_1\,y$ for all $y \in \mathscr{C}(W_1)$, that is, $G_{12}W_1 = G_1W_1$;
(b) $G_{12}\,y = \mathrm{BLUE}(X_1\beta_1 \mid \mathscr{M}_1)$ for all $y \in \mathscr{C}(W_1)$;
(c) $G_{12}(X_1 : VM_1) = (X_1 : 0)$, that is,
(d) $G_{12}VM_1 = 0$;
(e) $X_1(X_1'\dot M_2X_1)^+X_1'\dot M_2VM_1 = 0$;
(f) $X_1'\dot M_2VM_1 = 0$;
(g) $\mathscr{C}(V\dot M_2X_1) \subseteq \mathscr{C}(X_1)$;
(h) $M_1V\dot M_2X_1 = 0$.
Moreover, we always have $G_{12}\,y = G_1\,y$ for all $y \in \mathscr{C}(X_1)$.
Proof.
It is clear that (b) is simply an alternative expression for (a) and similarly (d) for (c). The claim (a) holds if and only if $(G_{12} - G_1)W_1 = 0$, which, in view of $G_{12}X_1 = G_1X_1 = X_1$ and $G_1VM_1 = 0$, gives (e):
$$X_1(X_1'\dot M_2X_1)^+X_1'\dot M_2VM_1 = 0. \tag{4.8}$$
Premultiplying (4.8) by $X_1'\dot M_2$ yields $X_1'\dot M_2X_1(X_1'\dot M_2X_1)^+X_1'\dot M_2VM_1 = 0$, that is, $P_{X_1'\dot M_2X_1}X_1'\dot M_2VM_1 = 0$. In view of Proposition 3.2, we have $\mathscr{C}(X_1'\dot M_2X_1) = \mathscr{C}(X_1'\dot M_2)$, and hence $P_{X_1'\dot M_2X_1}X_1'\dot M_2VM_1 = 0$ becomes
$$X_1'\dot M_2VM_1 = 0. \tag{4.9}$$
Thus we have shown that (e) and (f) are equivalent. Transposing (4.9) yields
$$M_1V\dot M_2X_1 = 0, \tag{4.10}$$
that is, (f) implies (h), and transposing (4.10) shows that (h) implies (f) as well. In view of part (e) of Proposition 3.3 we have
$$\mathscr{C}(M_1V) = \mathscr{C}(M_1VM_1). \tag{4.11}$$
The condition (4.10) states that $\mathscr{C}(V\dot M_2X_1) \subseteq \mathscr{N}(M_1) = \mathscr{C}(X_1)$, and so (h) and (g) are equivalent.
The statement (c) holds if and only if $G_{12}X_1 = X_1$ and $G_{12}VM_1 = 0$; the first equality holds by the estimability of $X_1\beta_1$, and the second one is precisely (e). Thus (c) and (e) are equivalent. The final claim is obvious because $G_{12}X_1 = X_1 = G_1X_1$; see also Remark 4.1. □
Next we consider the condition under which an arbitrary representation of the $\mathrm{BLUE}$ of $X_1\beta_1$ under $\mathscr{M}_1$ provides the $\mathrm{BLUE}$ of $X_1\beta_1$ under $\mathscr{M}_{12}$.
Proposition 4.4.
The following statements are equivalent:
(a) every $A_1$ satisfying $A_1\,y = \mathrm{BLUE}(X_1\beta_1 \mid \mathscr{M}_1)$ satisfies also $A_1\,y = \mathrm{BLUE}(X_1\beta_1 \mid \mathscr{M}_{12})$, that is,
(b) $\{\mathrm{BLUE}(X_1\beta_1 \mid \mathscr{M}_1)\} \subseteq \{\mathrm{BLUE}(X_1\beta_1 \mid \mathscr{M}_{12})\}$;
(c) $G_1X_2 = 0$ and $\mathscr{C}(X_2) \subseteq \mathscr{C}(X_1 : V)$;
(d) $\mathscr{C}(X_2) \subseteq \mathscr{C}(VM_1)$;
(e) $\mathscr{C}(X_2) \subseteq \mathscr{C}(VX_1^\perp)$;
(f) $\{\mathrm{BLUE}(X_1\beta_1 \mid \mathscr{M}_1)\} = \{\mathrm{BLUE}(X_1\beta_1 \mid \mathscr{M}_{12})\}$.
Proof.
Notice first that (b) is simply an alternative way to express (a). The statement (a) holds if and only if every solution $A_1$ of $A_1(X_1 : VM_1) = (X_1 : 0)$ satisfies $A_1(X : VM) = (X_1 : 0)$, that is, $A_1X_2 = 0$ and $A_1VM = 0$ for every such $A_1$; the conditions involving VM are automatically satisfied because $\mathscr{C}(VM) \subseteq \mathscr{C}(VM_1)$, and so (a) holds if and only if $G_1X_2 = 0$ and $\mathscr{C}(X_2) \subseteq \mathscr{C}(X_1 : VM_1) = \mathscr{C}(X_1 : V)$, which is precisely (c). Moreover, (c) implies that
$$X_2 = X_1A + VM_1B \tag{4.12}$$
for some A and B, and
$$G_1X_2 = 0. \tag{4.13}$$
Now (4.13) implies that $X_1A = G_1X_2 = 0$, which further implies that $X_2 = VM_1B$, so that by (4.12) we get (d). The claim (d) obviously implies (c). The equivalence between (d) and (e) is obvious because $\mathscr{C}(VM_1) = \mathscr{C}(VX_1^\perp)$. It is clear that (f) implies (b). Thus to confirm the equivalence of (b) and (f) we have to show that
$$\{\mathrm{BLUE}(X_1\beta_1 \mid \mathscr{M}_1)\} \subseteq \{\mathrm{BLUE}(X_1\beta_1 \mid \mathscr{M}_{12})\} \implies \{\mathrm{BLUE}(X_1\beta_1 \mid \mathscr{M}_{12})\} \subseteq \{\mathrm{BLUE}(X_1\beta_1 \mid \mathscr{M}_1)\}. \tag{4.14}$$
This follows at once from Proposition 4.3 by noting that the right-hand side of (4.14) means that $G_{12}(X_1 : VM_1) = (X_1 : 0)$. □
Our next task is to find necessary and sufficient conditions for the equality of the $\mathrm{BLUE}$s of $X_1\beta_1$ with probability 1 when the inclusion $\mathscr{C}(X_2) \subseteq \mathscr{C}(X_1 : V)$ holds.
Proposition 4.5.
Consider the models $\mathscr{M}_{12}$ and $\mathscr{M}_1$ and suppose that
$$\mathscr{C}(X_2) \subseteq \mathscr{C}(X_1 : V). \tag{4.15}$$
Then the following statements are equivalent:
(a) $G_{12}\,y = G_1\,y$ for all $y \in \mathscr{C}(X_1 : V)$;
(b) $G_{12}W_1 = G_1W_1$ and $G_{12}W = G_1W$;
(c) $G_{12} = G_1$, that is,
(d) $G_{12}(I_n - G_1) = 0$;
(e) $G_{12}VM_1 = 0$;
(f) $\mathrm{BLUE}(X_1\beta_1 \mid \mathscr{M}_{12}) = \mathrm{BLUE}(X_1\beta_1 \mid \mathscr{M}_1)$ with probability 1;
(g) $X_1'\dot M_2VM_1 = 0$;
(h) $X_1'\dot M_2V\dot M_1 = 0$, where $\dot M_1$ is defined as
$$\dot M_1 = M_1(M_1W_1M_1)^+M_1. \tag{4.16}$$
Proof.
The equivalence between (a)–(g) is obvious in view of Propositions 4.2 and 4.3, because under (4.15) we have $\mathscr{C}(W) = \mathscr{C}(W_1) = \mathscr{C}(X_1 : V)$. Consider then part (h). Now we have
$$V\dot M_1 = VM_1(M_1W_1M_1)^+M_1. \tag{4.17}$$
Hence (a) holds, under (4.15), if and only if $(G_{12} - G_1)W_1 = 0$, that is, $G_{12}VM_1 = 0$, that is,
$$X_1'\dot M_2VM_1 = 0. \tag{4.18}$$
Using (4.17) the equality (4.18) becomes
$$X_1'\dot M_2V\dot M_1 = X_1'\dot M_2VM_1(M_1W_1M_1)^+M_1 = 0. \tag{4.19}$$
In light of $VM_1(M_1W_1M_1)^+(M_1W_1M_1) = VM_1$ we can cancel $(M_1W_1M_1)^+M_1$ in the last expression in (4.19) by postmultiplying it with $W_1M_1$, which recovers (4.18). This proves the equivalence between (a) and (h). □
5. Conclusions
In this article we consider the partitioned linear model $\mathscr{M}_{12} = \{y, X_1\beta_1 + X_2\beta_2, V\}$ and the corresponding small model $\mathscr{M}_1 = \{y, X_1\beta_1, V\}$. We focus on comparing the BLUEs of $X_1\beta_1$ under $\mathscr{M}_{12}$ and $\mathscr{M}_1$. The observed numerical value of the $\mathrm{BLUE}$ is unique under the model $\mathscr{M}_1$ if the model is consistent in the sense that $y \in \mathscr{C}(X_1 : V)$, and the same uniqueness concerns the full model in the respective way. But now there may be some problems if we write
$$\mathrm{BLUE}(X_1\beta_1 \mid \mathscr{M}_{12}) = \mathrm{BLUE}(X_1\beta_1 \mid \mathscr{M}_1). \tag{5.1}$$
What is the meaning of the above equality? It is not fully clear, because we know that under $\mathscr{M}_{12}$ the values of y vary over $\mathscr{C}(X_1 : X_2 : V)$, but under $\mathscr{M}_1$ the values of y vary over $\mathscr{C}(X_1 : V)$, and these column spaces may be different. However, if $\mathscr{C}(X_2) \subseteq \mathscr{C}(X_1 : V)$, there are no difficulties in interpreting the equality (5.1), which means that $G_{12}\,y = G_1\,y$ for all $y \in \mathscr{C}(X_1 : V)$, where $G_{12}\,y = \mathrm{BLUE}(X_1\beta_1 \mid \mathscr{M}_{12})$ and $G_1\,y = \mathrm{BLUE}(X_1\beta_1 \mid \mathscr{M}_1)$. We consider the resulting problems by picking up particular fixed expressions for the $\mathrm{BLUE}$s of $X_1\beta_1$ under these two models, and study the conditions under which they are equal for all values of $y \in \mathscr{C}(X_1 : X_2 : V)$ or $y \in \mathscr{C}(X_1 : V)$. Moreover, we review the conditions under which all representations of the $\mathrm{BLUE}$s in one model continue to be valid in the other model. Some related considerations, using a different approach, have been made by Lu et al. (2015), Tian (2013), and Tian and Zhang (2016).
Acknowledgements
Part of this research was done during the meeting of an International Research Group on Multivariate and Mixed Linear Models at the Mathematical Research and Conference Center, Bȩdlewo, Poland, in November 2019 and February 2020. Thanks go to the anonymous referee for constructive remarks.
References
- Groß, J., and S. Puntanen. 2000. Estimation under a general partitioned linear model. Linear Algebra and Its Applications 321:131–44. doi:10.1016/S0024-3795(00)00028-8.
- Güler, N., S. Puntanen, and H. Özdemir. 2014. On the BLUEs in two linear models via C. R. Rao's Pandora's Box. Communications in Statistics – Theory and Methods 43 (5):921–31. doi:10.1080/03610926.2013.826366.
- Haslett, S. J. 1996. Updating linear models with dependent errors to include additional data and/or parameters. Linear Algebra and Its Applications 237–238:329–49.
- Haslett, S. J., A. Markiewicz, and S. Puntanen. 2020. Properties of BLUEs and BLUPs in full vs. small linear models with new observations. In Recent developments in multivariate and random matrix analysis: Festschrift in honour of Dietrich von Rosen, eds. T. Holgersson and M. Singull, 123–46. Cham: Springer.
- Haslett, S. J., and S. Puntanen. 2010. Effect of adding regressors on the equality of the BLUEs under two linear models. Journal of Statistical Planning and Inference 140:104–10. doi:10.1016/j.jspi.2009.06.010.
- Lu, C., S. Gan, and Y. Tian. 2015. Some remarks on general linear model with new regressors. Statistics & Probability Letters 97:16–24. doi:10.1016/j.spl.2014.10.015.
- Markiewicz, A., and S. Puntanen. 2019. Further properties of the linear sufficiency in the partitioned linear model. In Matrices, statistics and big data, eds. S. E. Ahmed, F. Carvalho and S. Puntanen, 1–22. Cham: Springer.
- Marsaglia, G., and G. P. H. Styan. 1974. Equalities and inequalities for ranks of matrices. Linear and Multilinear Algebra 2:269–92.
- Mitra, S. K., and B. J. Moore. 1973. Gauss–Markov estimation with an incorrect dispersion matrix. Sankhyā Series A 35:139–52.
- Puntanen, S., G. P. H. Styan, and J. Isotalo. 2011. Matrix tricks for linear statistical models: our personal top twenty. Heidelberg: Springer.
- Rao, C. R. 1973. Representations of best linear estimators in the Gauss–Markoff model with a singular dispersion matrix. Journal of Multivariate Analysis 3:276–92. doi:10.1016/0047-259X(73)90042-0.
- Rao, C. R. 1974. Projectors, generalized inverses and the BLUE’s. Journal of the Royal Statistical Society: Series B 36:442–8.
- Rao, C. R., and S. K. Mitra. 1971. Generalized inverse of matrices and its applications. New York: Wiley.
- Sengupta, D., and S. R. Jammalamadaka. 2003. Linear models: An integrated approach. River Edge: World Scientific.
- Tian, Y. 2013. On properties of BLUEs under general linear regression models. Journal of Statistical Planning and Inference 143:771–82. doi:10.1016/j.jspi.2012.10.005.
- Tian, Y., and X. Zhang. 2016. On connections among OLSEs and BLUEs of whole and partial parameters under a general linear model. Statistics & Probability Letters 112:105–12. doi:10.1016/j.spl.2016.01.019.
- Werner, H. J., and C. Yapar. 1996. A BLUE decomposition in the general linear regression model. Linear Algebra and Its Applications 237–238:395–404. doi:10.1016/0024-3795(95)00542-0.