Abstract
In linear regression, a relationship between a response variable y and a set of predictor variables x_1, …, x_p is modeled as y = β_0 + β_1 x_1 + … + β_p x_p + ε, where ε represents random error and the β_i are constant coefficients to be estimated from n observations of y and the x_i. Given n observations of these p + 1 variables, the method of cyclic subspace regression (CSR) provides a finite number of estimators of β in the linear model y = Xβ + ε. Among the estimators produced by CSR are those given by least squares (LS), principal components regression (PCR), and partial least squares (PLS). In this article, after careful consideration of the invariant subspaces of XᵗX, a new method of regression is developed. This method, which the authors call invariant subspace regression (ISR), uses a selection of subspaces within the invariant subspaces of XᵗX to create estimators of β. These subspaces are identified and constructed from a finite list of user-supplied nonzero constants. Since these constants are unconstrained apart from being nonzero, ISR can produce infinitely many estimators of β. It is shown that, for an identified choice of constants, every estimator produced by CSR can be generated by ISR, so ISR is a generalization of CSR. Conversely, it is shown that every ISR estimator can be produced by applying CSR to an identified modification of the y data. Finally, examples are given showing that there exist ISR estimators that predict better than the best of all CSR estimators. It is also shown that there exist constants that create ISR estimators which predict no better than, or worse than, every CSR estimator; in that situation, however, the best ISR estimator in terms of prediction is the best PCR estimator.
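As a minimal illustrative sketch (not the authors' ISR or CSR algorithms, whose details appear in the body of the article), the following code sets up the linear model y = Xβ + ε and computes two of the standard estimators named above: the LS estimator and a PCR estimator. The PCR estimator is built from leading eigenvectors of XᵗX, the matrix whose invariant subspaces ISR exploits; the dimensions, noise level, and component count k are arbitrary choices for the example.

```python
import numpy as np

# Simulated data for the model y = X b + e (sizes chosen for illustration).
rng = np.random.default_rng(0)
n, p = 50, 4
X = rng.normal(size=(n, p))
beta_true = np.array([1.0, -2.0, 0.5, 3.0])
y = X @ beta_true + 0.1 * rng.normal(size=n)

# Least squares: solve the normal equations (X^t X) b = X^t y.
beta_ls = np.linalg.solve(X.T @ X, X.T @ y)

# PCR with k components: restrict the regression to the span of the
# k leading eigenvectors of X^t X (an invariant subspace of X^t X).
k = 2
evals, evecs = np.linalg.eigh(X.T @ X)   # eigenvalues in ascending order
V_k = evecs[:, -k:]                      # top-k eigenvectors
beta_pcr = V_k @ np.linalg.solve(V_k.T @ X.T @ X @ V_k, V_k.T @ X.T @ y)
```

Because PCR discards the trailing eigenvector directions, beta_pcr is generally a biased, lower-variance alternative to beta_ls; CSR and ISR enlarge the menu of such subspace-restricted estimators.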