ABSTRACT
In the stepwise selection of a fixed or random explanatory variable in a mixed quantitative linear model whose errors follow a Gaussian stationary autocorrelated process, we studied the efficiency of five estimators relative to Generalized Least Squares (GLS): Ordinary Least Squares (OLS), Maximum Likelihood (ML), Restricted Maximum Likelihood (REML), First Differences (FD), and First-Difference Ratios (FDR). We also studied the validity and power of seven derived testing procedures for assessing the significance of the slope of the candidate explanatory variable x₂ to enter a model that already contains one regressor, x₁. In addition to five testing procedures from the literature, we considered the FDR t-test with n − 3 df and the modified t-test with n̂ − 3 df for partial correlations, where n̂ is Dutilleul's effective sample size. Efficiency, validity, and power were analyzed by Monte Carlo simulations as functions of the nature of x₁ and x₂ (fixed vs. random, the latter being purely random or autocorrelated), the sample size, and the autocorrelation of the random terms in the regression model. We report extensive results for the first-order autoregressive [AR(1)] autocorrelation structure and discuss results obtained for other autocorrelation structures, such as the spherical semivariogram, first-order moving average [MA(1)], and ARMA(1,1), which could not be presented because of space constraints. Overall, we found that:
- the efficiency of slope estimators and the validity of testing procedures depend primarily on the nature of x₂, but not on that of x₁;
- FDR is the least efficient slope estimator, regardless of the nature of x₁ and x₂;
- REML is the most efficient of the slope estimators relative to GLS, provided the specified autocorrelation structure is correct and the sample size is large enough to ensure the convergence of its optimization algorithm;
- the FDR t-test, the modified t-test, and the REML t-test are the most valid of the testing procedures compared, despite the inefficiency of the FDR and OLS slope estimators on which the first two are based;
- the FDR t-test, however, suffers from a lack of power that varies with the nature of x₁ and x₂; and
- the modified t-test for partial correlations, which does not require the specification of an autocorrelation structure, can be recommended when x₁ is fixed or random and x₂ is random, whether purely random or autocorrelated.

Our results are illustrated by the environmental data that motivated our work.
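The kind of Monte Carlo efficiency comparison summarized above can be sketched in a few lines. The Python sketch below is only an illustration under stated assumptions and is not the authors' SAS code: it covers OLS, FD, and GLS only (ML, REML, and FDR are omitted), GLS uses the true AR(1) correlation matrix, x₁ and x₂ are drawn once and held fixed across replicates, and all function names, parameter values, and coefficients are hypothetical choices. Relative efficiency is computed as the ratio of the Monte Carlo variance of the GLS slope of x₂ to that of the other estimator.

```python
import numpy as np


def ar1_corr(n, rho):
    """Correlation matrix of a stationary AR(1) process: rho**|i - j|."""
    idx = np.arange(n)
    return rho ** np.abs(idx[:, None] - idx[None, :])


def slope_estimates(y, X, Sigma_inv):
    """Return the OLS, GLS and FD estimates of the slope of x2 (last column of X)."""
    ols = np.linalg.lstsq(X, y, rcond=None)[0][-1]
    XtSi = X.T @ Sigma_inv
    gls = np.linalg.solve(XtSi @ X, XtSi @ y)[-1]
    # Differencing removes the intercept, so the column of ones is dropped.
    dX, dy = np.diff(X[:, 1:], axis=0), np.diff(y)
    fd = np.linalg.lstsq(dX, dy, rcond=None)[0][-1]
    return ols, gls, fd


def relative_efficiency(n=50, rho=0.6, n_rep=2000, seed=0):
    """Monte Carlo efficiency of OLS and FD relative to GLS for the slope of x2."""
    rng = np.random.default_rng(seed)
    Sigma = ar1_corr(n, rho)
    L = np.linalg.cholesky(Sigma)       # used to generate AR(1)-correlated errors
    Sigma_inv = np.linalg.inv(Sigma)    # assumption: true structure known to GLS
    # Assumption: regressors drawn once, then treated as fixed across replicates.
    x1, x2 = rng.standard_normal(n), rng.standard_normal(n)
    X = np.column_stack([np.ones(n), x1, x2])
    beta = np.array([1.0, 0.5, 0.0])    # hypothetical coefficients; slope of x2 is 0
    est = np.empty((n_rep, 3))
    for r in range(n_rep):
        y = X @ beta + L @ rng.standard_normal(n)
        est[r] = slope_estimates(y, X, Sigma_inv)
    var = est.var(axis=0)               # columns: OLS, GLS, FD
    return {"OLS": var[1] / var[0], "FD": var[1] / var[2]}


if __name__ == "__main__":
    print(relative_efficiency())
```

Redrawing x₁ and x₂ inside the replicate loop, or generating them from an autocorrelated process, would be one way to mimic the random-regressor cases discussed in the abstract.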
Acknowledgments
The first author acknowledges a scholarship from Harran University (Turkey) during her former graduate studies. The second author's research work is funded by the Natural Sciences and Engineering Research Council of Canada and Le Fonds québécois de la recherche sur la nature et les technologies. The data of the environmental example were collected in a team research project conducted by Drs. G. Bell, M. Lechowicz, and M. Waterway. For the Monte Carlo study, we benefited from computing facilities made available to the second author thanks to a Canada Foundation for Innovation grant. SAS programs implementing the estimation and testing procedures are available from the authors upon request.