Abstract
In this paper we propose a test for the significance of categorical predictors in nonparametric regression models. The test is fully data-driven and employs cross-validated smoothing parameter selection, while the null distribution of the test is obtained via bootstrapping. The proposed approach allows applied researchers to test hypotheses concerning categorical variables in a fully nonparametric and robust framework, thereby deflecting potential criticism that a particular finding is driven by an arbitrary parametric specification. Simulations reveal that the test performs well, having significantly better power than a conventional frequency-based nonparametric test. The test is applied to determine whether OECD and non-OECD countries follow the same growth rate model. Our test suggests that OECD and non-OECD countries follow different growth rate models, while the tests based on a popular parametric specification and on the conventional frequency-based nonparametric estimation method fail to detect any significant difference.
Acknowledgments
The authors would like to thank but not implicate Essie Maasoumi, Mike Veall and three referees for their helpful comments. A preliminary draft of this paper was presented at the 2002 International Conference on Current Advances and Trends in Nonparametric Statistics held in Crete, and we would like to thank numerous conference participants for their valuable input. Hart's research was supported by NSF Grant DMS 99-71755. Li's research was supported by the Bush Program in the Economics of Public Policy, and the Private Enterprises Research Center, Texas A&M University. Racine's research was supported by the NSERC, SSHRC, and SHARCNET.
Notes
1Hart and Wehrly (1992) observe a similar phenomenon with a cross-validation-based test for linearity with a univariate continuous variable. In their case h tends to take a large positive value when the null of linearity is true. For a sample size of n = 100, they observe that 60 percent of the time the smoothing parameter assumes values larger than 1,000.
2Li and Wang show that for a test based on univariate (continuous-variable) nonparametric kernel estimation, the rate of convergence to its asymptotic distribution is of the order $O_p(h^{1/2})$, which is $O_p(n^{-1/10})$ if $h \sim n^{-1/5}$, where h is the smoothing parameter used in the kernel estimation.
3Letting $\hat{\alpha}$ denote the empirical rejection frequency associated with nominal level α, we tested the null $H_0$: $\hat{\alpha}$ = α for Bootstrap Method I for n = 50 and obtained P-values of 0.04, 0.00, and 0.00 for α = 0.01, 0.05, and 0.10, respectively, while for Bootstrap Method II, we obtained P-values of 0.14, 0.29, and 0.18 for α = 0.01, 0.05, and 0.10, respectively.
4Qualitatively similar results were obtained for n = 100, so we do not give those tables, for the sake of brevity.
5These simulation results are not reported here, to save space. The results are available from the authors upon request.
6We are grateful to Thanasis Stengos for providing data and for suggesting this parametric specification based upon his work in this area.
7R code and data needed for the replication of these parametric results are available from the authors upon request.
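Note 3 compares empirical rejection frequencies with their nominal levels. The source does not say which test was used or how many Monte Carlo replications were run, so the following is only a minimal sketch of one plausible check: an exact two-sided binomial test. The function names and any replication counts passed to them are hypothetical.

```python
from math import exp, lgamma, log


def binom_pmf(k, n, p):
    """Binomial probability of k successes in n trials (requires 0 < p < 1)."""
    log_coef = lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
    return exp(log_coef + k * log(p) + (n - k) * log(1 - p))


def size_test_pvalue(rejections, replications, alpha):
    """Exact two-sided p-value for H0: true rejection probability == alpha.

    Assumption: the p-value is the total probability of all outcomes that
    are no more likely under H0 than the observed number of rejections.
    """
    pmfs = [binom_pmf(k, replications, alpha) for k in range(replications + 1)]
    # Small relative tolerance so that ties are not lost to rounding error.
    cutoff = pmfs[rejections] * (1 + 1e-7)
    return min(1.0, sum(pm for pm in pmfs if pm <= cutoff))
```

For example, `size_test_pvalue(80, 1000, 0.05)` would test whether 80 rejections in a hypothetical 1,000 replications is compatible with a nominal 5 percent level; a small p-value indicates that the empirical size differs from the nominal one.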
Residual standard error: 0.026 on 599 degrees of freedom. Multiple R-squared: 0.2856; Adjusted R-squared: 0.2665. F-statistic: 14.97 on 16 and 599 DF; p-value: <2.2e-16
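The summary statistics reported above can be cross-checked against one another with the standard least-squares identities. The sketch below assumes the model includes an intercept, so that the sample size is n = 599 + 16 + 1 = 616; the variable names are illustrative only.

```python
# Values as reported in the regression summary.
r2 = 0.2856        # multiple R-squared
df_model = 16      # numerator degrees of freedom of the F-statistic
df_resid = 599     # residual degrees of freedom

# Assumption: an intercept is included, so n = residual df + regressors + 1.
n = df_resid + df_model + 1

# Adjusted R-squared penalizes R-squared for the number of regressors.
adj_r2 = 1 - (1 - r2) * (n - 1) / df_resid

# Overall F-statistic for the null that all slope coefficients are zero.
f_stat = (r2 / df_model) / ((1 - r2) / df_resid)

print(f"n = {n}, adjusted R-squared = {adj_r2:.4f}, F = {f_stat:.2f}")
```

Both recomputed quantities agree with the reported adjusted R-squared and F-statistic up to rounding of the reported R-squared.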