Original Articles

Gains from joint cross validation bandwidth selection for derivatives of conditional multidimensional densities

Pages 807-819 | Received 17 Dec 2014, Accepted 01 Apr 2015, Published online: 24 Apr 2015
 

Abstract

This paper studies bandwidth selection for kernel estimation of derivatives of multidimensional conditional densities, a non-parametric setting unexplored in the literature. It extends Baird [Cross validation bandwidth selection for derivatives of multidimensional densities. RAND Working Paper series, WR-1060; 2014] by examining conditional multivariate densities: it derives and presents criteria for arbitrary kernel order and density dimension, shows consistency of the estimators, and investigates a minimization criterion which jointly estimates numerator and denominator bandwidths. I conduct a Monte Carlo simulation study for various orders of kernels in the Gaussian family and compare the new cross validation criterion with those implied by Baird [Cross validation bandwidth selection for derivatives of multidimensional densities. RAND Working Paper series, WR-1060; 2014]. The paper finds that higher order kernels become increasingly important as the dimension of the distribution increases. I find that the cross validation criterion developed in this paper, which jointly estimates the bandwidths for the derivative of the joint density (numerator) and the marginal density (denominator), performs orders of magnitude better than criteria that estimate the bandwidths separately. I further find that the infinite order Dirichlet kernel tends to give the best results.
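As a rough illustration of the numerator and denominator structure described in the abstract, the following Python sketch estimates the derivative of a conditional density as a ratio of two kernel estimates, each with its own bandwidth. It is a minimal sketch under stated assumptions, not the paper's estimator or its cross validation criterion: it assumes scalar y and x, a derivative taken with respect to y, a second-order Gaussian product kernel, and bandwidths supplied by hand. The names `cond_density_deriv`, `h_num`, and `h_den` are hypothetical.

```python
import numpy as np


def gaussian(u):
    """Standard normal density, used here as a second-order kernel."""
    return np.exp(-0.5 * u**2) / np.sqrt(2.0 * np.pi)


def gaussian_deriv(u):
    """First derivative of the standard normal density."""
    return -u * gaussian(u)


def cond_density_deriv(y0, x0, y, x, h_num, h_den):
    """Estimate d/dy f(y | x) at (y0, x0) for scalar y and x.

    h_num: bandwidth for the derivative of the joint density (numerator).
    h_den: bandwidth for the marginal density of x (denominator).
    Both bandwidths are illustrative inputs, not cross-validated values.
    """
    n = len(y)
    uy = (y0 - y) / h_num
    ux = (x0 - x) / h_num
    # Derivative in y of the product-kernel joint density estimate.
    # The usual 1/(n h^2) scaling picks up one more 1/h from the chain rule.
    numerator = np.sum(gaussian_deriv(uy) * gaussian(ux)) / (n * h_num**3)
    # Ordinary kernel estimate of the marginal density of x.
    denominator = np.sum(gaussian((x0 - x) / h_den)) / (n * h_den)
    return numerator / denominator


# Simulated example: y and x independent standard normals, so the true
# derivative of f(y | x) at y = 1 is -phi(1), roughly -0.242.
rng = np.random.default_rng(0)
y_sim = rng.standard_normal(20000)
x_sim = rng.standard_normal(20000)
print(cond_density_deriv(1.0, 0.0, y_sim, x_sim, h_num=0.4, h_den=0.4))
```

Keeping `h_num` and `h_den` as separate arguments mirrors the distinction the abstract draws between numerator and denominator bandwidths; a joint criterion would search over both at once, whereas separate criteria would fix each one independently.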

MSC2010 codes:

Acknowledgements

I would like to thank Rosa Matzkin for her advice and help on this project, anonymous referees, Richard Krutchkoff, Dan Ben-Moshe and Michelle Baird for valuable feedback, as well as seminar participants for helpful advice, including Jinyong Hahn and Conan Snider.

Disclosure statement

No potential conflict of interest was reported by the author.

Notes

1. Li and Racine [7] provide an excellent review.

2. They allow for a weighting function, which here is assumed to be one.

3. I tested such an assumption and found it overly restrictive in minimizing the MSE.

4. Here the kernel order is the order of the first non-zero moment of the kernel, and the same kernel family and order are used for the derivative of the joint (numerator) and the marginal (denominator) densities.

5. For convergence of the parameters to the optimal parameters, an extension of the proof contained in [8] for the joint estimation of a multivariate conditional density applies here for the derivative of a multivariate conditional density.

6. Marron [15] shows that higher order kernels perform well when the curvature of what is being estimated is roughly constant, and poorly when there are abrupt changes in curvature on neighbourhoods about the size of the bandwidth.

7. Note that the standard errors of the mean, obtained by dividing the standard deviations by the square root of the number of simulations (that is, by 10), demonstrate that in most cases the mean MSEs are statistically different across the cases.

8. The results suggest that these improvements come both from avoiding non-convergence and local minima as well as from superior performance in converging cases.
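
The definition of kernel order in note 4 can be checked numerically. The Python sketch below computes the moments of two Gaussian-family kernels and reports the order of the first non-zero moment. The fourth-order kernel used here, (3 − u²)/2 times the standard normal density, is the standard textbook construction and is an assumption for illustration; the excerpt does not show the exact kernels used in the paper. The helper names `moment` and `kernel_order` are hypothetical.

```python
import numpy as np


def phi(u):
    """Standard normal density."""
    return np.exp(-0.5 * u**2) / np.sqrt(2.0 * np.pi)


def k2(u):
    """Second-order Gaussian kernel."""
    return phi(u)


def k4(u):
    """Fourth-order Gaussian-family kernel (standard construction)."""
    return 0.5 * (3.0 - u**2) * phi(u)


def moment(kernel, j, lim=12.0, m=200001):
    """Approximate the j-th moment of a kernel with a Riemann sum."""
    u = np.linspace(-lim, lim, m)
    du = u[1] - u[0]
    return float(np.sum(u**j * kernel(u)) * du)


def kernel_order(kernel, max_order=8, tol=1e-6):
    """Return the order of the first non-zero moment, or None if not found."""
    for j in range(1, max_order + 1):
        if abs(moment(kernel, j)) > tol:
            return j
    return None


for name, kernel in [("k2", k2), ("k4", k4)]:
    print(name, "integrates to", round(moment(kernel, 0), 6),
          "and has order", kernel_order(kernel))
```

Both kernels integrate to one, but the second moment of `k4` vanishes, so its first non-zero moment is the fourth.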
