Original Articles

Cross-validating fit and predictive accuracy of nonlinear quantile regressions

Pages 2939-2954 | Received 15 Jul 2010, Accepted 08 Mar 2011, Published online: 03 May 2011
 

Abstract

The paper proposes a cross-validation method to address the question of specification search in a multiple nonlinear quantile regression framework. Linear parametric, spline-based partially linear and kernel-based fully nonparametric specifications are contrasted as competitors using cross-validated weighted L1-norm-based goodness-of-fit and prediction error criteria. The aim is to provide a fair comparison with respect to estimation accuracy and/or predictive ability for different semi- and nonparametric specification paradigms. This is challenging as the model dimension cannot be estimated for all competitors and the meta-parameters such as kernel bandwidths, spline knot numbers and polynomial degrees are difficult to compare. General issues of specification comparability and automated data-driven meta-parameter selection are discussed. The proposed method further allows us to assess the balance between fit and model complexity. An extensive Monte Carlo study and an application to a well-known data set provide empirical illustration of the method.
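To make the idea of a cross-validated, weighted L1-norm prediction error concrete, the following is a minimal sketch and not the authors' procedure: the abstract does not state the criteria in formulas, so the sketch assumes the standard quantile check function as the weighted L1 loss and a plain K-fold split. All function names, the toy data and the two competing specifications (intercept-only and a single-regressor line) are hypothetical, and the brute-force line fit merely stands in for a proper linear-programming solver.

```python
def check_loss(u, tau):
    """Check function rho_tau(u) = u * (tau - 1{u < 0}): an asymmetrically weighted |u|."""
    return u * (tau - (1.0 if u < 0 else 0.0))


def fit_constant(x, y, tau):
    # Intercept-only specification (assumption: used here only as a baseline).
    # A sample quantile minimises the summed check loss, so try every observed y.
    c = min(y, key=lambda cand: sum(check_loss(v - cand, tau) for v in y))
    return lambda x_new: c


def fit_line(x, y, tau):
    # Linear specification with one regressor. An optimal quantile regression
    # line can always be chosen to pass through two data points, so for a toy
    # sample we can simply search over all pairs with distinct x values.
    best, best_loss = None, float("inf")
    n = len(x)
    for i in range(n):
        for j in range(i + 1, n):
            if x[i] == x[j]:
                continue
            b = (y[j] - y[i]) / (x[j] - x[i])
            a = y[i] - b * x[i]
            loss = sum(check_loss(v - (a + b * u), tau) for u, v in zip(x, y))
            if loss < best_loss:
                best, best_loss = (a, b), loss
    a, b = best
    return lambda x_new: a + b * x_new


def cv_check_loss(fit, x, y, tau, k=5):
    # K-fold cross-validation: fit on k-1 folds, score the held-out fold with
    # the check loss, and average over all observations.
    n = len(y)
    total = 0.0
    for fold in range(k):
        test = [i for i in range(n) if i % k == fold]
        train = [i for i in range(n) if i % k != fold]
        predict = fit([x[i] for i in train], [y[i] for i in train], tau)
        total += sum(check_loss(y[i] - predict(x[i]), tau) for i in test)
    return total / n


# Deterministic toy data (hypothetical): linear trend plus bounded noise.
x = [float(i) for i in range(30)]
y = [3.0 * xi + ((7 * i) % 11 - 5) for i, xi in enumerate(x)]

tau = 0.5
cv_const = cv_check_loss(fit_constant, x, y, tau)
cv_line = cv_check_loss(fit_line, x, y, tau)
print("intercept-only:", cv_const, "linear:", cv_line)
```

Because both specifications are scored on held-out observations with the same loss, the two numbers are directly comparable even though the specifications differ in complexity; on this toy sample the linear specification should obtain the smaller cross-validated loss.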

Acknowledgements

We thank three anonymous referees for helpful comments. Of course, all remaining errors are ours.

Notes

We thank Jeff Racine for bringing the following reasoning to our attention: if m and h are set individually for every subsample, the curvature could vary for partially linear and nonparametric specifications, in contrast to parametric specifications. Since forgoing re-scaling would take away part of the specific flexibility of the latter, re-scaling seems to create fair conditions for comparing the cross-validation results at hand.

The statistic does not penalize the complexity of the specification, while, e.g., the generalized Schwarz information criterion of Machado [26] (cf. [14]) does so. The latter criterion can be calculated for the linear parametric and partially linear specification, but for the nonparametric specification it is unclear how to approximate the model dimension. Note that in the application in Section 4 we omit the results on the information criterion for the linear parametric and the partially linear specification, as they are qualitatively similar to those for the .

All estimations are carried out with the open-source software R (2.11.1) using a seed of 42 for random number generation and default settings. Partially linear specifications employ a modified version of the function bs from the package splines to allow for equidistant and exterior knots. Simultaneous product kernel bandwidth estimation based on 20 multistarts is achieved using the np package (0.40-3) of Hayfield and Racine [7]. For conditional quantile estimation we use the package quantreg (4.53) from Koenker [16], where the function rq has an option (method = "fnc") to include inequality constraints such as Equation (4) described in Section 2.1. All code snippets are available from the authors.

Note that this choice of discrete variables and skedastic function has been used by Li and Racine [24].
