Abstract
In this paper, we investigate the finite sample performance of four kernel-based estimators that are currently available for additive non-parametric regression models—the classic backfitting estimator (CBE), the smooth backfitting estimator, the marginal integration estimator, and a two-stage estimator, of which we consider two versions: the first proposed by Kim, Linton and Hengartner (1999) and the second proposed in this paper. For each estimator, bandwidths are selected by minimizing the asymptotic approximation of its mean average squared error (MASE). In our simulations, we are particularly concerned with the performance of these estimators under this unified data-driven bandwidth selection method, since in this case both the asymptotic and the finite sample properties of all estimators are currently unavailable. The comparison is based on the estimators' average squared error. Our Monte Carlo results suggest that the CBE is the best performing kernel-based procedure.
Acknowledgements
We thank two anonymous referees for helpful comments. The authors retain responsibility for any remaining errors.
Notes
‡Alternative non-parametric smoothing methods, e.g. spline or wavelet methods, could potentially be used, but such methods have not received the attention given to kernel-based methods. See Citation12 and Citation13.
†See Citation17 and Citation23 for simulation studies that address model mis-specifications.
†See Citation24 and Citation10 for details.
†Note that the true optimal bandwidths are different across samples since MASE is evaluated at sample points.
‡Numerical solutions for the non-linear systems defined by Equations (29) and (30), (33) and (34), as well as (35) and (36) are obtained using a quasi-Newton method (step-by-step line search) with an analytical Jacobian. See Citation26.
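The procedure in the note above can be sketched as follows. This is a minimal illustration of a damped Newton iteration with an analytical Jacobian and a backtracking (step-halving) line search; the 2×2 system `F` and its Jacobian `J` below are hypothetical stand-ins, not the paper's Equations (29)–(36), and the implementation details (tolerances, step-halving rule) are assumptions.

```python
# Illustrative sketch: damped Newton with analytical Jacobian and
# step-by-step (backtracking) line search. The system solved here is
# a toy example, NOT the paper's equations (29)-(36).

def F(v):
    # Hypothetical non-linear system: circle of radius 2 and the line x = y.
    x, y = v
    return [x**2 + y**2 - 4.0, x - y]

def J(v):
    # Analytical Jacobian of F.
    x, y = v
    return [[2.0 * x, 2.0 * y], [1.0, -1.0]]

def solve2(A, b):
    # Solve a 2x2 linear system by Cramer's rule.
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    return [(b[0] * A[1][1] - b[1] * A[0][1]) / det,
            (A[0][0] * b[1] - A[1][0] * b[0]) / det]

def newton(v, tol=1e-10, max_iter=50):
    for _ in range(max_iter):
        f = F(v)
        if max(abs(c) for c in f) < tol:
            break
        # Full Newton step: J(v) * step = -F(v).
        step = solve2(J(v), [-c for c in f])
        # Line search: halve the step until the squared residual shrinks.
        norm0 = sum(c * c for c in f)
        t = 1.0
        while t > 1e-8:
            trial = [v[i] + t * step[i] for i in range(2)]
            if sum(c * c for c in F(trial)) < norm0:
                v = trial
                break
            t *= 0.5
    return v

root = newton([2.0, 1.0])  # converges to (sqrt(2), sqrt(2))
```

In practice one would use a library root-finder, but the structure above—Newton direction from the analytical Jacobian, damped by a line search—matches the type of scheme the note describes.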
†There is a small variation in computing time for different models, but none of the conclusions described in the text are changed.
†We note that the functions m_d used in the DGP do not satisfy E(m_d)=0 for d=1, 2, 3. Therefore, the estimators considered in the study are estimating m_d − E(m_d). As such, the definitions of ASE_r(m̂_d) and AB_r(m̂_d) incorporate the constants E(m_d).
†In the preliminary simulations, the MIE appears to be the most sensitive of the estimators to increases in c. This is consistent with the fact that its asymptotic variance increases significantly with c. Intuitively, this loss of accuracy arises because the MIE must estimate the function at many out-of-sample points. When the correlation is high, the values of the function at those points are very hard to capture due to their distance from the observed values of the function.
†Note that for any estimator considered, the variance for the r-th replication can be obtained as ASE_r − AB_r.