
Asymptotic properties of a nonparametric conditional density estimator in the local linear estimation for functional data via a functional single-index model

Pages 208-219 | Received 22 Nov 2020, Accepted 05 Jul 2021, Published online: 02 Sep 2021

Abstract

This paper deals with the estimation of the conditional density of a real response variable given a functional random variable (i.e., one taking values in an infinite-dimensional space). Specifically, we focus on the functional single-index model, an approach that represents a good compromise between nonparametric and parametric models. Under general conditions and when the variables are independent, we derive the quadratic error and the asymptotic normality of the estimator obtained by the local linear method, based on the single-index structure. Finally, we complement these theoretical advances with simulation studies showing the practical performance of the local linear method, the good finite-sample behaviour of the estimator, and a Monte Carlo method for constructing functional pseudo-confidence areas.

1. Introduction

The nonparametric estimation of the conditional density function plays a crucial role in statistical analysis, and the subject can be approached from multiple perspectives depending on the complexity of the problem. Many techniques have been studied in the literature to treat these various situations, but all of them deal only with real or multidimensional explanatory random variables.

Focusing on kernel-type estimators for functional data, the first results on the nonparametric estimation of this model were obtained by Ferraty and Vieu (2006), who studied the almost complete convergence of the conditional density estimator and of its derivatives. Laksaci (2007) studied the quadratic error of this estimator, and Ferraty et al. (2010) established its uniform almost complete convergence.

We now recall a few results on local linear smoothing for functional data, an approach that has been considered by many authors. Baíllo and Grané (2009) first proposed a local linear smoother of the regression operator when the explanatory variable takes values in a Hilbert space; subsequently, Barrientos-Marin et al. (2010) developed local linear estimation of the regression operator in a semi-metric space for independent and identically distributed variables. Demongeot et al. (2013, 2014) used this method to estimate the conditional distribution and conditional density functions. In the case of spatial data, Laksaci et al. (2013) established pointwise almost complete convergence rates.

Furthermore, the functional index model plays a major role in statistics. The interest of this approach comes from its use to reduce the dimensionality of the data by projecting them onto a functional direction. The literature on this topic is still limited: the first work on nonparametric estimation in the single-index model is Ferraty et al. (2003), which treated i.i.d. variables and obtained almost complete convergence under some conditions. Based on a cross-validation procedure, Ait Saidi et al. (2008) proposed an estimator of the functional single-index when it is unknown. Attaoui et al. (2011) considered the nonparametric estimation of the conditional density in the single functional index model and established its pointwise and uniform almost complete convergence (a.co.) rates. On the same topic, Attaoui and Ling (2016) proved asymptotic results for a nonparametric conditional cumulative distribution estimator for time series data. More recently, Tabti and Ait Saidi (2018) obtained the almost complete convergence and the uniform almost complete convergence of a kernel estimator of the hazard function under a quasi-association condition when the observations are linked through a functional single-index structure.

In this paper, we focus on local linear estimation with the single-index structure, and we compute, under some conditions, the quadratic error of the conditional density estimator. This study has great practical importance because it permits the construction of a prediction method based on the maximum of the estimated conditional density with a single functional index.

In Section 2, we introduce the estimator of our model in the single functional index setting. In Section 3, we state the assumptions and give the asymptotic properties. Simulations are presented in Section 4, and Section 5 concludes; the proofs of the results are gathered in the Appendix.

2. The model

Let $(X_i,Y_i)$, $1\le i\le n$, be $n$ independent random pairs, identically distributed as the pair $(X,Y)$ with values in $\mathcal{H}\times\mathbb{R}$, where $\mathcal{H}$ is a separable real Hilbert space with norm $\|\cdot\|$ generated by an inner product $\langle\cdot,\cdot\rangle$. We consider the semi-metric $d_\theta$ associated with the single index $\theta\in\mathcal{H}$, defined by $d_\theta(x_1,x_2):=|\langle x_1-x_2,\theta\rangle|$ for $x_1,x_2\in\mathcal{H}$. We assume that the explanation of $Y$ given $X$ is done through a fixed functional index $\theta$ in $\mathcal{H}$, in the sense that there exists a $\theta\in\mathcal{H}$ (unique up to a scale normalization factor) such that
\[
\mathbb{E}[Y\mid X]=\mathbb{E}\big[Y\mid\langle\theta,X\rangle\big].
\]
The conditional density of $Y$ given $X=x$, denoted $f_\theta(\cdot\mid x)$, is assumed to exist and is given by $f_\theta(y\mid x):=f(y\mid x,\theta)$, $y\in\mathbb{R}$. In the following, we denote by $f(\theta,\cdot,x)$ the conditional density of $Y$ given $\langle x,\theta\rangle$, and we define the local linear estimator $\hat f(\theta,\cdot,x)$ of $f(\theta,\cdot,x)$ under the single-index structure by
\[
\hat f(\theta,y,x)=\frac{\sum_{1\le i,j\le n}W_{ij}(\theta,x)\,H\big(h_H^{-1}(y-Y_j)\big)}{h_H\sum_{1\le i,j\le n}W_{ij}(\theta,x)}
=\frac{\sum_{1\le j\le n}\Omega_jK_jH_j}{h_H\sum_{1\le j\le n}\Omega_jK_j},
\]
with
\[
W_{ij}(\theta,x)=\beta_\theta(X_i,x)\big(\beta_\theta(X_i,x)-\beta_\theta(X_j,x)\big)\,K\big(h_K^{-1}d_\theta(x,X_i)\big)\,K\big(h_K^{-1}d_\theta(x,X_j)\big)
\]
and $\Omega_jK_j=\sum_{i=1}^{n}W_{ij}$, where $\beta_\theta(\cdot,\cdot)$ is a known bi-functional operator from $\mathcal{H}^2$ into $\mathbb{R}$, $K$ and $H$ are kernel functions, and $h_K:=h_{n,K}$ (resp. $h_H:=h_{n,H}$) is a sequence of positive numbers that decreases to zero as $n$ goes to infinity.
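For concreteness, the following minimal Python sketch evaluates this estimator on discretized curves. The choice $\beta_\theta(u,x)=\langle u-x,\theta\rangle$ (so that $d_\theta(u,x)=|\beta_\theta(u,x)|$), the quadratic kernel $K$ and the Gaussian kernel $H$ are our illustrative assumptions, not prescriptions of the paper, and all names are ours.

```python
import numpy as np

def llsim_cond_density(X, Y, theta, x0, y0, hK, hH, t_grid):
    """Local linear single-index conditional density estimate (sketch).

    X: (n, p) curves on the grid t_grid; Y: (n,) responses;
    theta: (p,) functional index; x0: (p,) curve; y0: evaluation point.
    """
    # beta_theta(X_i, x0) = <X_i - x0, theta>, inner product by Riemann sum
    beta = np.trapz((X - x0) * theta, t_grid, axis=1)
    d = np.abs(beta)                               # d_theta(x0, X_i)
    # assumed kernels: quadratic K on [0, 1), Gaussian H
    Ki = np.where(d < hK, 1.5 * (1.0 - (d / hK) ** 2), 0.0)
    Hj = np.exp(-0.5 * ((y0 - Y) / hH) ** 2) / np.sqrt(2.0 * np.pi)
    # sum_i W_ij = (sum_i beta_i^2 K_i - beta_j * sum_i beta_i K_i) * K_j
    A1, A2 = np.sum(beta * Ki), np.sum(beta ** 2 * Ki)
    w = (A2 - beta * A1) * Ki                      # Omega_j K_j
    den = hH * np.sum(w)
    return np.sum(w * Hj) / den if den != 0 else np.nan
```

Collapsing the double sum $\sum_{i,j}W_{ij}$ into the weights $\Omega_jK_j$, as above, reduces the cost from $O(n^2)$ to $O(n)$ kernel evaluations per point $(x,y)$.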

3. Assumptions and main results

Throughout the paper, we denote by $C$, $C'$ and $C_{\theta,x}$ strictly positive generic constants, and we set $K_i(\theta,x):=K\big(h_K^{-1}d_\theta(x,X_i)\big)$ for $x\in\mathcal{H}$, $i=1,\dots,n$; $H_j:=H\big(h_H^{-1}(y-Y_j)\big)$ for $y\in\mathbb{R}$, $j=1,\dots,n$; $\beta_{\theta,i}:=\beta_\theta(X_i,x)$; and $W_{\theta,ij}:=W_{ij}(\theta,x)$. We also use the notation $B_\theta(x,h_K):=\{x_1\in\mathcal{H}:\ 0<|\langle x-x_1,\theta\rangle|<h_K\}$ for the ball centred at $x$ with radius $h_K$. Moreover, we set $\psi_l(\cdot,y):=\frac{\partial^l f(\theta,y,\cdot)}{\partial y^l}$ for any $l\in\{0,2\}$, and

\[
\Phi_l(s)=\mathbb{E}\big[\psi_l(X,y)-\psi_l(x,y)\,\big|\,d_\theta(x,X)=s\big].
\]

In order to study our asymptotic results, we need the following assumptions:

(H1)

$\mathbb{P}\big(X\in B_\theta(x,h_K)\big)=:\phi_{\theta,x}(h_K)>0$, and we assume that there exists a function $\chi_{\theta,x}(\cdot)$ such that
\[
\lim_{h_K\to 0}\frac{\phi_{\theta,x}(sh_K,h_K)}{\phi_{\theta,x}(h_K)}=\chi_{\theta,x}(s),\qquad\forall s\in[-1,1].
\]

(H2)

For any $l\in\{0,2\}$, the quantities $\Phi_l'(0)$ and $\Phi_l''(0)$ exist, where $\Phi_l'$ (resp. $\Phi_l''$) denotes the first (resp. second) derivative of $\Phi_l$.

(H3)

The bi-functional $\beta_\theta(\cdot,\cdot)$ satisfies: for all $x'\in\mathcal{H}$,
\[
C_1\,d_\theta(x,x')\le\big|\beta_\theta(x,x')\big|\le C_2\,d_\theta(x,x'),\qquad C_1>0,\ C_2>0,
\]
\[
\sup_{u\in B_\theta(x,r)}\big|\beta_\theta(u,x)-d_\theta(x,u)\big|=o(r),
\]
\[
h_K\int_{B_\theta(x,h_K)}\beta_\theta(u,x)\,dP(u)=o\Big(\int_{B_\theta(x,h_K)}\beta_\theta^2(u,x)\,dP(u)\Big),
\]
where $B_\theta(x,r)=\{x'\in\mathcal{H}:\ |d_\theta(x,x')|\le r\}$ and $dP(x)$ is the probability distribution of $X$.

(H4)

The kernel $K$ is a positive differentiable function whose derivative $K'$ exists and satisfies: there exist two constants $C$ and $C'$ with $-\infty<C<K'(t)<C'<0$ for $t\in[-1,1]$, and $K(1)>0$.

(H5)

The kernel $H$ is a bounded differentiable function such that
\[
\int H(t)\,dt=1,\qquad\int t^2H(t)\,dt<\infty\qquad\text{and}\qquad\int H^2(t)\,dt<\infty.
\]

(H6)

The bandwidths $h_K$ and $h_H$ satisfy:

  1. $\lim_{n\to\infty}h_K=0$, $\lim_{n\to\infty}h_H=0$ and $\lim_{n\to\infty}nh_H\phi_{\theta,x}(h_K)=\infty$;

  2. $\lim_{n\to\infty}nh_H^5\phi_{\theta,x}(h_K)=0$ and $\lim_{n\to\infty}nh_Hh_K^2\phi_{\theta,x}(h_K)=0$.

Comments on the assumptions: (H1) and (H2) are simple adaptations of the conditions of Ferraty et al. (2007) on the regression operator, obtained by replacing the semi-metric with the bi-functional $d_\theta$. The second part of condition (H3) is unrestrictive and is satisfied, for instance, when $\beta_\theta(\cdot,\cdot)=d_\theta(\cdot,\cdot)$; moreover,
\[
\lim_{d_\theta(x,u)\to 0}\Big|\frac{\beta_\theta(u,x)}{d_\theta(x,u)}-1\Big|=0.
\]
Assumptions (H4)–(H6) are classical in this context of quadratic errors and asymptotic normality in functional statistics.

3.1. Mean square convergence

In this part, we establish the quadratic-mean convergence of the estimator.

Theorem 3.1

Under assumptions (H1)–(H6), we obtain
\[
\mathbb{E}\big[\hat f(\theta,y,x)-f(\theta,y,x)\big]^2=B_H^2(\theta,x,y)\,h_H^4+B_K^2(\theta,x,y)\,h_K^2+\frac{V_{HK}(\theta,x,y)}{nh_H\phi_{\theta,x}(h_K)}+o(h_H^4)+o(h_K^2)+o\Big(\frac{1}{nh_H\phi_{\theta,x}(h_K)}\Big),
\]
where
\[
B_H(\theta,x,y)=\frac{1}{2}\,\frac{\partial^2f(\theta,y,x)}{\partial y^2}\int t^2H(t)\,dt,
\qquad
B_K(\theta,x,y)=\Phi_0'(0)\,\frac{M_0}{M_1},
\]
and
\[
V_{HK}(\theta,x,y)=\frac{M_2\,f(\theta,y,x)}{M_1^2}\int H^2(t)\,dt,
\]
with
\[
M_0=K(1)-\int_0^1\big(sK(s)\big)'\chi_{\theta,x}(s)\,ds
\qquad\text{and}\qquad
M_j=K^j(1)-\int_0^1\big(K^j\big)'(s)\,\chi_{\theta,x}(s)\,ds\quad\text{for }j=1,2.
\]
We set
\[
\hat f(\theta,y,x)=\frac{\hat f_N(\theta,y,x)}{\hat f_D(\theta,x)},
\]
where
\[
\hat f_N(\theta,y,x)=\frac{1}{n(n-1)h_H\,\mathbb{E}[W_{12}(\theta,x)]}\sum_{1\le i\ne j\le n}W_{ij}(\theta,x)\,H\big(h_H^{-1}(y-Y_j)\big)
\]
and
\[
\hat f_D(\theta,x)=\frac{1}{n(n-1)\,\mathbb{E}[W_{12}(\theta,x)]}\sum_{1\le i\ne j\le n}W_{ij}(\theta,x).
\]
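As a heuristic aside (ours, not part of the theorem), balancing the $h_H^4$ bias term against the variance term indicates the order of the bandwidth $h_H$ that equilibrates the two contributions for a fixed $h_K$:
\[
B_H^2\,h_H^4\ \asymp\ \frac{V_{HK}(\theta,x,y)}{n\,h_H\,\phi_{\theta,x}(h_K)}
\quad\Longrightarrow\quad
h_H\ \asymp\ \big(n\,\phi_{\theta,x}(h_K)\big)^{-1/5}.
\]
Condition (H6)(2), $nh_H^5\phi_{\theta,x}(h_K)\to 0$, then amounts to slight undersmoothing relative to this balance, so that the bias becomes negligible in the asymptotic normality below.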

The following lemmas will be useful for the proof of Theorem 3.1.

Lemma 3.2

Under the assumptions of Theorem 3.1, we obtain
\[
\mathbb{E}\big[\hat f_N(\theta,y,x)\big]-f(\theta,y,x)=B_H(\theta,x,y)\,h_H^2+B_K(\theta,x,y)\,h_K+o(h_H^2)+o(h_K).
\]

Lemma 3.3

Under the assumptions of Theorem 3.1, we obtain
\[
\operatorname{Var}\big[\hat f_N(\theta,y,x)\big]=\frac{V_{HK}(\theta,x,y)}{nh_H\phi_{\theta,x}(h_K)}+o\Big(\frac{1}{nh_H\phi_{\theta,x}(h_K)}\Big).
\]

Lemma 3.4

Under the assumptions of Theorem 3.1, we get
\[
\operatorname{Cov}\big(\hat f_N(\theta,y,x),\hat f_D(\theta,x)\big)=O\Big(\frac{1}{n\,\phi_{\theta,x}(h_K)}\Big).
\]

Lemma 3.5

Under the assumptions of Theorem 3.1, we get
\[
\operatorname{Var}\big[\hat f_D(\theta,x)\big]=O\Big(\frac{1}{n\,\phi_{\theta,x}(h_K)}\Big).
\]

3.2. Asymptotic normality

This section contains results on the asymptotic normality of $\hat f(\theta,y,x)$. Before stating our main results, we introduce the quantity $N(a,b)$, which will appear in the dominant bias and variance terms:
\[
N(a,b)=K^a(1)-\int_{-1}^{1}\big(u^bK^a(u)\big)'\,\chi_{\theta,x}(u)\,du,\qquad\text{for all }a>0\text{ and }b=2,4.
\]
We then have the following theorem.

Theorem 3.6

Under assumptions (H1)–(H6), we obtain
\[
\sqrt{nh_H\phi_{\theta,x}(h_K)}\,\Big(\hat f(\theta,y,x)-f(\theta,y,x)-B_n(\theta,x,y)\Big)\ \xrightarrow{\ \mathcal{D}\ }\ \mathcal{N}\big(0,V_{HK}(\theta,x,y)\big),\tag{1}
\]
where
\[
V_{HK}(\theta,x,y)=\frac{M_2}{M_1^2}\,f(\theta,y,x)\int H^2(t)\,dt\tag{2}
\]
and
\[
B_n(\theta,x,y)=\frac{\mathbb{E}\big(\hat f_N(\theta,y,x)\big)}{\mathbb{E}\big(\hat f_D(\theta,x)\big)}-f(\theta,y,x),\tag{3}
\]
with $\xrightarrow{\mathcal{D}}$ denoting convergence in distribution.

Proof of Theorem 3.6

Inspired by the decomposition given in Masry (2005), we write
\[
\hat f(\theta,y,x)-f(\theta,y,x)-B_n(\theta,x,y)=\frac{\hat f_N(\theta,y,x)-f(\theta,y,x)\,\hat f_D(\theta,x)}{\hat f_D(\theta,x)}-B_n(\theta,x,y).
\]
If we denote
\[
Q_n(\theta,x,y)=\hat f_N(\theta,y,x)-f(\theta,y,x)\,\hat f_D(\theta,x)-\mathbb{E}\big(\hat f_N(\theta,y,x)-f(\theta,y,x)\,\hat f_D(\theta,x)\big)=\hat f_N(\theta,y,x)-f(\theta,y,x)\,\hat f_D(\theta,x)-B_n(\theta,x,y)\tag{4}
\]
(using $\mathbb{E}(\hat f_D(\theta,x))=1$), so that $\hat f_N(\theta,y,x)-f(\theta,y,x)\,\hat f_D(\theta,x)=Q_n(\theta,x,y)+B_n(\theta,x,y)$, then the proof of the theorem follows from the expression
\[
\hat f(\theta,y,x)-f(\theta,y,x)-B_n(\theta,x,y)=\frac{Q_n(\theta,x,y)-B_n(\theta,x,y)\big(\hat f_D(\theta,x)-\mathbb{E}(\hat f_D(\theta,x))\big)}{\hat f_D(\theta,x)}\tag{5}
\]
and the following auxiliary results, which play the main role and whose proofs are given in the Appendix.

Lemma 3.7

Under assumptions (H1)–(H5), we have
\[
\hat f_D(\theta,x)\ \xrightarrow{\ \mathbb{P}\ }\ \mathbb{E}\big(\hat f_D(\theta,x)\big)=1,
\]
where $\xrightarrow{\mathbb{P}}$ denotes convergence in probability.

Thus, Lemma 3.7 implies that $\hat f_D(\theta,x)\to 1$ in probability. Moreover, $B_n(\theta,x,y)=o(1)$ as $n\to\infty$ by the continuity of $f(\theta,y,x)$. Then we obtain
\[
\hat f(\theta,y,x)-f(\theta,y,x)-B_n(\theta,x,y)=\frac{Q_n(\theta,x,y)}{\hat f_D(\theta,x)}\big(1+o_p(1)\big).
\]

Lemma 3.8

Under assumptions (H1)–(H5), we have
\[
\sqrt{nh_H\phi_{\theta,x}(h_K)}\;Q_n(\theta,x,y)\ \xrightarrow{\ \mathcal{D}\ }\ \mathcal{N}\big(0,V_{HK}(\theta,x,y)\big),\tag{6}
\]

where VHK(θ,x,y) is defined by (2).

Remark 3.9

As mentioned in Demongeot et al. (2013), the function $\phi_{\theta,x}(t)$ can be estimated empirically by
\[
\hat\phi_{\theta,x}(t)=\frac{\#\{i:\ |d_\theta(X_i,x)|\le t\}}{n},
\]
where $\#(A)$ denotes the cardinality of the set $A$. So, if in addition to (H6) we assume that $\lim_{n\to\infty}\sqrt{nh_H\phi_{\theta,x}(h_K)}\,B_n(\theta,x,y)=0$, we can cancel the bias term and obtain the following corollary.
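A minimal Python sketch of this empirical estimate (names are ours; the projection semi-metric $d_\theta$ from Section 2 is assumed):

```python
import numpy as np

def phi_hat(X, x0, theta, t, t_grid):
    """Empirical small-ball probability phi_hat_{theta,x}(t) of Remark 3.9."""
    # d_theta(X_i, x0) = |<X_i - x0, theta>| on the discretization grid
    d = np.abs(np.trapz((X - x0) * theta, t_grid, axis=1))
    return np.mean(d <= t)   # #{i : d_theta(X_i, x0) <= t} / n
```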

Corollary 3.10

Under the assumptions of Theorem 3.6, we get
\[
\sqrt{\frac{nh_H\hat\phi_{\theta,x}(h_K)}{V_{HK}(\theta,x,y)}}\;\big(\hat f(\theta,y,x)-f(\theta,y,x)\big)\ \xrightarrow{\ \mathcal{D}\ }\ \mathcal{N}(0,1).
\]

4. Simulation study

We first describe the simulation of the explanatory functional variables. In the second part, we focus on the ability of nonparametric functional regression to predict response variables from functional predictors. Finally, we illustrate the Monte Carlo methodology used to test the efficiency of the asymptotic normality results in practice and to build functional pseudo-confidence areas.

For this purpose, we consider the following process for the explanatory functional variables, with $n=350$:
\[
X_i(t)=\sum_{j=1}^{3}V_{ij}\cos\big((3+j)t\big)+W_i\,(t-\pi)^2,\qquad t\in[0,100],
\]
where the $V_{ij}$ and $W_i$ are independent real random variables (r.r.v.) uniformly distributed over $[0.3,2]$ (resp. $[1,3]$), and the curves are observed on a discretization grid of 100 points in this interval. These functional variables are represented in Figure 1.

Figure 1. The curves $X_i$, $i=1,\dots,200$.
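A minimal Python sketch of this design (the equispaced grid and the seed are our assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 350, 100
t = np.linspace(0.0, 100.0, p)            # discretization grid of 100 points
V = rng.uniform(0.3, 2.0, size=(n, 3))    # V_ij ~ U[0.3, 2]
W = rng.uniform(1.0, 3.0, size=n)         # W_i  ~ U[1, 3]
# X_i(t) = sum_{j=1}^{3} V_ij cos((3+j) t) + W_i (t - pi)^2
X = sum(V[:, [j]] * np.cos((4 + j) * t) for j in range(3)) \
    + W[:, None] * (t - np.pi) ** 2
```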

For the response variables $Y_i$, we consider the following model, for all $i=1,\dots,n$:
\[
Y_i=r\big(\langle\theta_k,X_i\rangle\big)+\epsilon_i,
\qquad\text{with}\qquad
r(U)=\int_0^{100}\frac{dv}{1+U^2(v)},
\]
where $\epsilon$ is a centred normal variable assumed to be independent of $(X_i)_i$. We can then write the corresponding conditional density explicitly:
\[
f(\theta_k,y,x)=\frac{1}{\sqrt{2\pi}}\,e^{-\frac{1}{2}\big(y-r(\langle\theta_k,x\rangle)\big)^2}.
\]
Our goal in this illustration is to show the usefulness of the conditional density in a forecasting context; the choice of the optimal parameters of the conditional density is therefore made without theoretical validity.
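Continuing the sketch, the responses and the closed-form conditional density can be generated as follows. The reading of the garbled regression operator as $r(U)=\int_0^{100}(1+U^2(v))^{-1}\,dv$ and the unit noise variance are assumptions on our part:

```python
import numpy as np

def r_operator(u, t):
    # assumed reading of the integral: r(U) = int_0^100 dv / (1 + U^2(v))
    return np.trapz(1.0 / (1.0 + u ** 2), t)

def simulate_responses(X, t, rng):
    eps = rng.normal(size=X.shape[0])     # centred normal errors (variance 1 assumed)
    return np.array([r_operator(x, t) for x in X]) + eps

def true_cond_density(y, x, t):
    # f(theta_k, y, x) = (2*pi)^{-1/2} exp(-(y - r(x))^2 / 2)
    return np.exp(-0.5 * (y - r_operator(x, t)) ** 2) / np.sqrt(2.0 * np.pi)
```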

Now we specify the different parameters of our estimator. First of all, the shape of the curves leads us to use the semi-metric
\[
d(x_1,x_2)=\left(\int_0^1\big(x_1(t)-x_2(t)\big)^2\,dt\right)^{1/2},\qquad x_1,x_2\in\mathcal{H},
\]
where $\mathcal{H}$ is a semi-metric space. In particular, we choose the quadratic kernel defined by
\[
K(x)=\frac{3}{2}\,\big(1-x^2\big)\,\mathbb{1}_{[-1,1)}(x).
\]
In this illustration, we select the functional index $\theta_k$ among the eigenvectors of the empirical covariance operator
\[
\Gamma_n(X)=\frac{1}{200}\sum_{i=1}^{200}\big(X_i-\bar X\big)^t\big(X_i-\bar X\big).
\]
We recall that the ideas of Ait Saidi et al. (2008) can be adapted to derive a practical selection method for $\theta_k$. However, this adaptation to the case of the conditional density requires additional tools and preliminary results (see the discussion in Attaoui et al. (2011) and Attaoui (2014)).
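A sketch of this eigenvector computation on the discretized curves (a standard PCA step; $L^2$ normalization of the eigenfunctions is omitted for brevity, and all names are ours):

```python
import numpy as np

def leading_directions(X, n_dirs=4):
    """Leading eigenvectors of the empirical covariance operator (sketch)."""
    Xc = X - X.mean(axis=0)              # centre the curves
    G = (Xc.T @ Xc) / X.shape[0]         # discretized covariance operator Gamma_n
    _, eigvec = np.linalg.eigh(G)        # eigenvalues in ascending order
    return eigvec[:, ::-1][:, :n_dirs]   # theta_1, ..., theta_{n_dirs}
```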

For this purpose, we split our observations into a learning sample $(X_i,Y_i)_{i=1,\dots,200}$ and a test sample $(X_i,Y_i)_{i=201,\dots,250}$. For the choice of the smoothing parameters $h_K$ and $h_H$, we adopt the selection criterion used by Ferraty et al. (2006) for the kernel method, in which $h_K$ and $h_H$ are obtained by minimizing the criterion
\[
\frac{1}{n}\sum_{i=1}^{n}W_1(X_i)\int\Big(\hat f^{-i}_{(h_K,h_H)}(X_i,y)\Big)^2W_2(y)\,dy-\frac{2}{n}\sum_{i=1}^{n}\hat f^{-i}_{(h_K,h_H)}(X_i,Y_i)\,W_1(X_i)\,W_2(Y_i),\tag{7}
\]
where
\[
\hat f^{-k}_{(h_K,h_H)}(X_k,y)=\frac{\sum_{i,j=1,\,i,j\ne k}^{n}W_{ij}(X_k)\,H\big(h_H^{-1}(y-Y_j)\big)}{h_H\sum_{i,j=1,\,i,j\ne k}^{n}W_{ij}(X_k)}
\]
is the leave-one-out version of the estimator. A first way of assessing the quality of prediction is to compare the predicted responses ($\hat f(\theta,y,x)$ for $X$ in the testing sample) with the true conditional density operator ($f(\theta,y,x)$), as in Figure 2.

Figure 2. Predicted functional responses (solid lines); observed functional responses (dashed lines).
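A leave-one-out sketch of criterion (7), reusing `llsim_cond_density` from the earlier snippet and taking the weights $W_1\equiv W_2\equiv 1$ (an assumption made for simplicity):

```python
import numpy as np

def cv_criterion(hK, hH, X, Y, theta, t_grid, y_grid):
    """Cross-validation score (7) for a bandwidth pair (hK, hH) (sketch)."""
    n = len(Y)
    term1 = term2 = 0.0
    for i in range(n):
        keep = np.arange(n) != i         # leave observation i out
        f_i = np.array([llsim_cond_density(X[keep], Y[keep], theta,
                                           X[i], y, hK, hH, t_grid)
                        for y in y_grid])
        term1 += np.trapz(f_i ** 2, y_grid)
        term2 += llsim_cond_density(X[keep], Y[keep], theta,
                                    X[i], Y[i], hK, hH, t_grid)
    return term1 / n - 2.0 * term2 / n   # to be minimized over a grid of (hK, hH)
```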

In the simulation algorithm, we proceed as follows:

  • Simulate a sample of size n.

  • Calculate the smoothing parameters $h_K$ and $h_H$, varied over the interval $[0,1]$, that minimize criterion (7).

  • Compute, for $k=1,2,3,4$, the quantities
\[
\big(nh_H\phi_{\theta_k,x}\big)^{1/2}\big(\hat f(\theta_k,y,x)-f(\theta_k,y,x)\big),
\]
where $\hat f(\theta_k,y,x)$ is the functional estimator built from the sample $(X_i,Y_i)_{i=1,\dots,200}$.

  • Compute a standard density estimator by the local linear method.

  • Compare the estimated $\hat f(\theta_k,y,x)$ with the corresponding true density $f(\theta_k,y,x)$.

The obtained results are shown in Figure 3. It can be seen that both densities are very well approximated and behave well with respect to the standard normal distribution.

Figure 3. Representation of the estimated density for $k=1,2,3,4$.

An application of the results of Theorem 3.6 is the construction of functional pseudo-confidence areas. To this end, for each component $k$ $(k=1,2,\dots,K)$, set $\eta_k=\eta/K$ with $\eta\in[0,1]$ and choose confidence intervals $E_k^{\eta_k}$ such that
\[
\mathbb{P}\Big(\bigcap_{k=1}^{K}\big\{r_k(U)\in\hat E_k^{\eta_k}\big\}\Big)\ \ge\ 1-\eta,
\]
where $U=\langle X,\theta_k\rangle$ and $\theta_1,\dots,\theta_K$ is a data-driven orthonormal basis, namely the $K$ eigenfunctions associated with the $K$ largest eigenvalues of $\Gamma$.

The asymptotic normality of the conditional density estimator, expressed in Corollary 3.10, allows us to approximate a $(1-\eta)$ confidence interval for $f(\theta,y,x)$ by
\[
\hat f(\theta,y,x)\ \pm\ t_{\eta/2}\times\left(\frac{\hat V_{HK}(\theta,x,y)}{nh_H\hat\phi_{\theta,x}(h_K)}\right)^{1/2},
\]
where $t_{\eta/2}$ denotes the upper $\eta/2$ quantile of the standard normal distribution $\mathcal{N}(0,1)$.
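A sketch of the resulting pointwise interval; the plug-in quantities $\hat V_{HK}$ and $\hat\phi_{\theta,x}(h_K)$ are assumed to have been computed as above, and the function name is ours:

```python
import numpy as np
from scipy.stats import norm

def pseudo_confidence_interval(f_hat, V_hat, n, hH, phi_hat_val, eta=0.05):
    """(1 - eta) pointwise interval from Corollary 3.10 (sketch)."""
    t_half = norm.ppf(1.0 - eta / 2.0)   # upper eta/2 quantile of N(0, 1)
    half_width = t_half * np.sqrt(V_hat / (n * hH * phi_hat_val))
    return f_hat - half_width, f_hat + half_width
```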

Figure 4 represents functional pseudo-confidence zones for 9 different fixed curves with $\eta=0.05$ and $K=4$. We see that $r(\langle x,\theta_k\rangle)$ and its $K$-dimensional projection onto $\hat\theta_1,\dots,\hat\theta_K$ are very close, which shows the good performance of our asymptotic normality result. Indeed, when one replaces the data-driven basis with the eigenfunctions of $\Gamma$, one obtains very similar functional pseudo-confidence areas.

Figure 4. Functional pseudo-confidence areas.

5. Conclusion

In this paper, we have been mainly interested in the nonparametric estimation of the conditional density function by the local linear method, for a real response variable conditioned on a functional explanatory variable, via a functional single-index model. We showed that the estimator provides good predictions under this model. One of the main contributions of this work is the choice of the semi-metric: it is well known that, in nonparametric functional statistics, semi-metrics of projection type are very important for improving the concentration property. The functional index model is a special case of this family of semi-metrics because it is based on the projection onto a functional direction, which is important for the practical implementation of our method. We can therefore draw functional pseudo-confidence zones, which are a very interesting tool for assessing the quality of the prediction.

Acknowledgments

The authors are very grateful to the Editor and the two anonymous referees for their helpful comments and suggestions, which greatly improved the quality of this paper.

Disclosure statement

No potential conflict of interest was reported by the author(s).

References

  • Ait Saidi, A., Ferraty, F., Kassa, P., & Vieu, P. (2008). Cross-validated estimations in the single functional index model. Statistics, 42(6), 475–494. https://doi.org/10.1080/02331880801980377
  • Attaoui, S. (2014). Strong uniform consistency rates and asymptotic normality of conditional density estimator in the single functional index modeling for time series data. AStA Advances in Statistical Analysis, 98(3), 257–286. https://doi.org/10.1007/s10182-014-0227-3
  • Attaoui, S., Laksaci, A., & Ould Said, F. (2011). A note on the conditional density estimate in the single functional index model. Statistics & Probability Letters, 81(1), 45–53. https://doi.org/10.1016/j.spl.2010.09.017
  • Attaoui, S., & Ling, N. (2016). Asymptotic results of a nonparametric conditional cumulative distribution estimator in the single functional index modeling for time series data with applications. Metrika, 79(3), 485–511. https://doi.org/10.1007/s00184-015-0564-6
  • Baíllo, A., & Grané, A. (2009). Local linear regression for functional predictor and scalar response. Journal of Multivariate Analysis, 100(1), 102–111. https://doi.org/10.1016/j.jmva.2008.03.008
  • Barrientos-Marin, J., Ferraty, F., & Vieu, P. (2010). Locally modelled regression and functional data. Journal of Nonparametric Statistics, 22(5), 617–632. https://doi.org/10.1080/10485250903089930
  • Bosq, D., & Lecoutre, J. P. (1987). Théorie de l'estimation fonctionnelle. Ed. Economica.
  • Demongeot, J., Laksaci, A., Madani, F., & Rachdi, M. (2013). Functional data: local linear estimation of the conditional density and its application. Statistics: A Journal of Theoretical and Applied Statistics, 76(2), 328–355. https://doi.org/10.1080/02331888.2011.568117
  • Demongeot, J., Laksaci, A., Rachdi, M., & Rahmani, S. (2014). On the local linear modelization of the conditional distribution for functional data. Sankhya: The Indian Journal of Statistics, 76(2), 328–355. https://doi.org/10.1007/s13171-013-0050-z
  • Ferraty, F., Laksaci, A., Tadj, A., & Vieu, P. (2010). Rate of uniform consistency for nonparametric estimates with functional variables. Journal of Statistical Planning and Inference, 140(2), 335–352. https://doi.org/10.1016/j.jspi.2009.07.019
  • Ferraty, F., Laksaci, A., & Vieu, P. (2006). Estimating some characteristics of the conditional distribution in nonparametric functional model. Statistical Inference for Stochastic Processes, 9(1), 47–76. https://doi.org/10.1007/s11203-004-3561-3
  • Ferraty, F., Mas, A., & Vieu, P. (2007). Advances in nonparametric regression for functional variables. Australian & New Zealand Journal of Statistics, 49(1), 1–20. https://doi.org/10.1111/j.1467-842X.2006.00454.x
  • Ferraty, F., Peuch, A., & Vieu, P. (2003). Modèle à indice fonctionnel simple. Comptes Rendus Mathématique de l'Académie des Sciences Paris, 336(12), 1025–1028. https://doi.org/10.1016/S1631-073X(03)00239-5 (in French)
  • Ferraty, F., & Vieu, P. (2006). Nonparametric functional data analysis. Theory and Practice. Springer Series in Statistics.
  • Laksaci, A. (2007). Convergence en moyenne quadratique de l'estimateur a noyau de la densité conditionnelle avec variable explicative fonctionnelle. Publications de l'Institut de statistique de l'Université de Paris, 51(3), 69–80.
  • Laksaci, A., Rachdi, M., & Rahmani, S. (2013). Spatial modelization: local linear estimation of the conditional distribution for functional data. Spatial Statistics, 6(4), 1–23. https://doi.org/10.1016/j.spasta.2013.04.004
  • Masry, E. (2005). Nonparametric regression estimation for dependent functional data: asymptotic normality. Stochastic Processes and their Applications, 115(1), 155–177. https://doi.org/10.1016/j.spa.2004.07.006
  • Sarda, P., & Vieu, P. (2000). Kernel regression. Wiley Series in Probability and Statistics (pp. 43–70).
  • Tabti, H., & Ait Saidi, A. (2018). Estimation and simulation of conditional hazard function in the quasi-associated framework when the observations are linked via a functional single-index structure. Communications in Statistics -- Theory and Methods, 47(4), 816–838. https://doi.org/10.1080/03610926.2016.1213294

Appendix

Proof of Theorem 3.1

The theorem is a consequence of computing separately the two quantities (bias and variance) of $\hat f(\theta,y,x)$, since
\[
\mathbb{E}\big[\hat f(\theta,y,x)-f(\theta,y,x)\big]^2=\big[\mathbb{E}(\hat f(\theta,y,x))-f(\theta,y,x)\big]^2+\operatorname{Var}\big[\hat f(\theta,y,x)\big].
\]
By classical calculations, we obtain
\[
\hat f(\theta,y,x)-f(\theta,y,x)=\big(\hat f_N(\theta,y,x)-f(\theta,y,x)\big)-\big(\hat f_N(\theta,y,x)-\mathbb{E}[\hat f_N(\theta,y,x)]\big)\big(\hat f_D(\theta,x)-1\big)-\mathbb{E}[\hat f_N(\theta,y,x)]\big(\hat f_D(\theta,x)-1\big)+\big(\hat f_D(\theta,x)-1\big)^2\hat f(\theta,y,x),
\]
which implies that
\[
\mathbb{E}[\hat f(\theta,y,x)]-f(\theta,y,x)=\big(\mathbb{E}[\hat f_N(\theta,y,x)]-f(\theta,y,x)\big)-\operatorname{Cov}\big(\hat f_N(\theta,y,x),\hat f_D(\theta,x)\big)+\mathbb{E}\Big[\big(\hat f_D(\theta,x)-\mathbb{E}[\hat f_D(\theta,x)]\big)^2\hat f(\theta,y,x)\Big].
\]
Under assumption (H5), we can bound $\hat f(\theta,y,x)$ by $\hat f(\theta,y,x)\le C/h_H$ for some constant $C>0$. Hence
\[
\mathbb{E}[\hat f(\theta,y,x)]-f(\theta,y,x)=\big(\mathbb{E}[\hat f_N(\theta,y,x)]-f(\theta,y,x)\big)-\operatorname{Cov}\big(\hat f_N(\theta,y,x),\hat f_D(\theta,x)\big)+\operatorname{Var}\big[\hat f_D(\theta,x)\big]\,O\big(h_H^{-1}\big).
\]
Now, by techniques similar to those of Sarda and Vieu (2000) and of Bosq and Lecoutre (1987), the variance term is
\[
\operatorname{Var}[\hat f(\theta,y,x)]=\operatorname{Var}[\hat f_N(\theta,y,x)]-2\,\mathbb{E}[\hat f_N(\theta,y,x)]\operatorname{Cov}\big(\hat f_N(\theta,y,x),\hat f_D(\theta,x)\big)+\big(\mathbb{E}[\hat f_N(\theta,y,x)]\big)^2\operatorname{Var}\big(\hat f_D(\theta,x)\big)+o\Big(\frac{1}{nh_H\phi_{\theta,x}(h_K)}\Big).
\]

Proof of Lemma 3.2

We have
\[
\mathbb{E}[\hat f_N(\theta,y,x)]=\frac{1}{n(n-1)h_H\,\mathbb{E}[W_{12}(\theta,x)]}\,\mathbb{E}\Big[\sum_{1\le i\ne j\le n}W_{ij}(\theta,x)\,H\big(h_H^{-1}(y-Y_j)\big)\Big]=\frac{1}{h_H\,\mathbb{E}[W_{\theta,12}]}\,\mathbb{E}\big[W_{\theta,12}\,\mathbb{E}[H_2\mid X_2]\big].
\]
By a Taylor expansion and under assumption (H5), we have
\[
\mathbb{E}[H_2\mid X_2]=f(\theta,y,X_2)+\frac{h_H^2}{2}\Big(\int t^2H(t)\,dt\Big)\frac{\partial^2f(\theta,y,X_2)}{\partial y^2}+o(h_H^2),
\]
which we may rewrite as
\[
\mathbb{E}[H_2\mid X_2]=\psi_0(X_2,y)+\frac{h_H^2}{2}\Big(\int t^2H(t)\,dt\Big)\psi_2(X_2,y)+o(h_H^2).
\]
Thus, we obtain
\[
\mathbb{E}[\hat f_N(\theta,y,x)]=\frac{1}{\mathbb{E}[W_{\theta,12}]}\Big(\mathbb{E}\big[W_{\theta,12}\psi_0(X_2,y)\big]+\frac{h_H^2}{2}\Big(\int t^2H(t)\,dt\Big)\mathbb{E}\big[W_{\theta,12}\psi_2(X_2,y)\big]+o(h_H^2)\Big).
\]
According to Ferraty et al. (2007), for $l\in\{0,2\}$ we can show that
\[
\mathbb{E}\big[W_{\theta,12}\psi_l(X_2,y)\big]=\psi_l(x,y)\,\mathbb{E}[W_{\theta,12}]+\mathbb{E}\big[W_{\theta,12}\big(\psi_l(X_2,y)-\psi_l(x,y)\big)\big]
=\psi_l(x,y)\,\mathbb{E}[W_{\theta,12}]+\mathbb{E}\big[W_{\theta,12}\,\mathbb{E}\big(\psi_l(X_2,y)-\psi_l(x,y)\mid d_\theta(X_2,x)\big)\big]
=\psi_l(x,y)\,\mathbb{E}[W_{\theta,12}]+\mathbb{E}\big[W_{\theta,12}\,\Phi_l\big(d_\theta(X_2,x)\big)\big].
\]
Since $\Phi_l(0)=0$, we obtain
\[
\mathbb{E}\big[W_{\theta,12}\,\Phi_l(d_\theta(X_2,x))\big]=\Phi_l'(0)\,\mathbb{E}\big[d_\theta(X_2,x)\,W_{\theta,12}\big]+o\big(\mathbb{E}\big[d_\theta(X_2,x)\,W_{\theta,12}\big]\big).
\]
Then we get
\[
\mathbb{E}[\hat f_N(\theta,y,x)]=f(\theta,y,x)+\frac{h_H^2}{2}\frac{\partial^2f(\theta,y,x)}{\partial y^2}\int t^2H(t)\,dt+o(h_H^2)+\Phi_0'(0)\,\frac{\mathbb{E}[d_\theta(X_2,x)W_{\theta,12}]}{\mathbb{E}[W_{\theta,12}]}+o\Big(\frac{\mathbb{E}[d_\theta(X_2,x)W_{\theta,12}]}{\mathbb{E}[W_{\theta,12}]}\Big).
\]
Therefore, it remains to evaluate the quantities $\mathbb{E}[d_\theta(X_2,x)W_{\theta,12}]$ and $\mathbb{E}[W_{\theta,12}]$. By the definition of $W_{\theta,12}$, both rest on the asymptotic evaluation of $\mathbb{E}[K_1^a\beta_{\theta,1}^b]$. We first treat the case $b=1$: using assumptions (H3) and (H4), we get
\[
h_K\,\mathbb{E}\big[K_1^a\beta_{\theta,1}\big]=o\Big(\int_{B_\theta(x,h_K)}\beta_\theta^2(u,x)\,dP(u)\Big)=o\big(h_K^2\,\phi_{\theta,x}(h_K)\big),
\]
so that
\[
\mathbb{E}\big[K_1^a\beta_{\theta,1}\big]=o\big(h_K\,\phi_{\theta,x}(h_K)\big).\tag{A1}
\]
Moreover, for all $b>1$, after simplification of the expressions we may write
\[
\mathbb{E}\big[K_1^a\beta_{\theta,1}^b\big]=\mathbb{E}\big[K_1^a\,d_\theta^b(x,X)\big]+o\big(h_K^b\,\phi_{\theta,x}(h_K)\big).
\]
Concerning the first term, we write
\[
h_K^{-b}\,\mathbb{E}\big[K_1^a\,d_\theta^b\big]=\int v^bK^a(v)\,dP^{h_K^{-1}d_\theta(x,X)}(v)
=\int_{-1}^{1}\Big[K^a(1)-\int_v^{1}\big(s^bK^a(s)\big)'\,ds\Big]\,dP^{h_K^{-1}d_\theta(x,X)}(v)
\]
\[
=K^a(1)\,\phi_{\theta,x}(h_K)-\int_{-1}^{1}\big(s^bK^a(s)\big)'\,\phi_{\theta,x}(sh_K,h_K)\,ds
=\phi_{\theta,x}(h_K)\Big(K^a(1)-\int_{-1}^{1}\big(s^bK^a(s)\big)'\,\frac{\phi_{\theta,x}(sh_K,h_K)}{\phi_{\theta,x}(h_K)}\,ds\Big).
\]
Finally, under assumption (H1), we get
\[
\mathbb{E}\big[K_1^a\beta_{\theta,1}^b\big]=h_K^b\,\phi_{\theta,x}(h_K)\Big(K^a(1)-\int_{-1}^{1}\big(s^bK^a(s)\big)'\,\chi_{\theta,x}(s)\,ds\Big)+o\big(h_K^b\,\phi_{\theta,x}(h_K)\big).\tag{A2}
\]
So,
\[
\frac{\mathbb{E}[d_\theta(X_2,x)W_{\theta,12}]}{\mathbb{E}[W_{\theta,12}]}=h_K\,\frac{K(1)-\int_{-1}^{1}\big(sK(s)\big)'\chi_{\theta,x}(s)\,ds}{K(1)-\int_{-1}^{1}K'(s)\,\chi_{\theta,x}(s)\,ds}+o(h_K).
\]
Hence,
\[
\mathbb{E}[\hat f_N(\theta,y,x)]=f(\theta,y,x)+\frac{h_H^2}{2}\frac{\partial^2f(\theta,y,x)}{\partial y^2}\int t^2H(t)\,dt+o(h_H^2)+h_K\,\Phi_0'(0)\,\frac{K(1)-\int_{-1}^{1}\big(sK(s)\big)'\chi_{\theta,x}(s)\,ds}{K(1)-\int_{-1}^{1}K'(s)\,\chi_{\theta,x}(s)\,ds}+o(h_K).
\]

Proof of Lemma 3.3

We have
\[
\operatorname{Var}\big(\hat f_N(\theta,y,x)\big)=\frac{1}{\big(n(n-1)h_H\,\mathbb{E}[W_{\theta,12}]\big)^2}\operatorname{Var}\Big(\sum_{1\le i\ne j\le n}W_{\theta,ij}H_j\Big)
\]
\[
=\frac{1}{\big(n(n-1)h_H\,\mathbb{E}[W_{\theta,12}]\big)^2}\Big[n(n-1)\,\mathbb{E}\big[W_{\theta,12}^2H_2^2\big]+n(n-1)\,\mathbb{E}\big[W_{\theta,12}W_{\theta,21}H_2H_1\big]+n(n-1)(n-2)\,\mathbb{E}\big[W_{\theta,12}W_{\theta,13}H_2H_3\big]
\]
\[
\qquad+n(n-1)(n-2)\,\mathbb{E}\big[W_{\theta,12}W_{\theta,23}H_2H_3\big]+n(n-1)(n-2)\,\mathbb{E}\big[W_{\theta,12}W_{\theta,31}H_2H_1\big]+n(n-1)(n-2)\,\mathbb{E}\big[W_{\theta,12}W_{\theta,32}H_2^2\big]-n(n-1)(4n-6)\,\mathbb{E}\big[W_{\theta,12}H_2\big]^2\Big].\tag{A3}
\]
By direct calculations, we get
\[
\begin{cases}
\mathbb{E}\big[W_{\theta,12}^2H_2^2\big]=O\big(h_K^4h_H\phi_{\theta,x}^2(h_K)\big),\\[2pt]
\mathbb{E}\big[W_{\theta,12}W_{\theta,21}H_2H_1\big]=O\big(h_K^4h_H^2\phi_{\theta,x}^2(h_K)\big),\\[2pt]
\mathbb{E}\big[W_{\theta,12}W_{\theta,13}H_2H_3\big]=\mathbb{E}\big[W_{\theta,12}W_{\theta,31}H_2H_1\big]=\mathbb{E}\big[W_{\theta,12}W_{\theta,23}H_2H_3\big]=O\big(h_K^4h_H^2\phi_{\theta,x}^3(h_K)\big),\\[2pt]
\mathbb{E}\big[W_{\theta,12}W_{\theta,32}H_2^2\big]=\mathbb{E}^2\big[\beta_{\theta,1}^2K_1\big]\,\mathbb{E}\big[K_1^2H_1^2\big]+o\big(h_K^4h_H\phi_{\theta,x}^3(h_K)\big).
\end{cases}
\]
Clearly, the last term above is the leading one, and its contribution to (A3) can be evaluated through
\[
\frac{(n-2)}{n(n-1)\big(h_H\,\mathbb{E}[W_{\theta,12}]\big)^2}\,\mathbb{E}^2\big[\beta_{\theta,1}^2K_1\big]\,\mathbb{E}\big[K_1^2H_1^2\big].
\]
By the same arguments as in the proof of Lemma 3.2, we obtain
\[
\operatorname{Var}\big(\hat f_N(\theta,y,x)\big)=\frac{\mathbb{E}\big[K_1^2H_1^2\big]}{n\big(h_H\,\mathbb{E}[K_1]\big)^2}+o\Big(\frac{1}{nh_H\phi_{\theta,x}(h_K)}\Big).\tag{A4}
\]
Observe that
\[
\mathbb{E}\big[K_1^2H_1^2\big]=\mathbb{E}\big[K_1^2\,\mathbb{E}\big(H_1^2\mid\langle\theta,X_1\rangle\big)\big]=\mathbb{E}\Big[K_1^2\int H^2\big(h_H^{-1}(y-z)\big)f(\theta,z,X_1)\,dz\Big].
\]
Thus, by the change of variables $t=h_H^{-1}(y-z)$, we get
\[
\mathbb{E}\big[K_1^2H_1^2\big]=h_H\,\mathbb{E}\Big[K_1^2\int H^2(t)\,f(\theta,y-h_Ht,X_1)\,dt\Big].
\]
Using a first-order Taylor expansion of $f(\theta,\cdot,X_1)$, we get $f(\theta,y-h_Ht,X_1)=f(\theta,y,X_1)+O(h_H)=f(\theta,y,X_1)+o(1)$. Then
\[
\mathbb{E}\big[K_1^2H_1^2\big]=h_H\Big(\int H^2(t)\,dt\Big)\mathbb{E}\big[K_1^2f(\theta,y,X_1)\big]+o\big(h_H\,\mathbb{E}[K_1^2]\big).
\]
Also, by the same steps as in the proof of Lemma 3.2, we obtain
\[
\mathbb{E}\big[K_1^2f(\theta,y,X_1)\big]=f(\theta,y,x)\,\mathbb{E}[K_1^2]+o\big(\mathbb{E}[K_1^2]\big),
\]
which gives
\[
\mathbb{E}\big[K_1^2H_1^2\big]=h_H\,f(\theta,y,x)\,\mathbb{E}[K_1^2]\int H^2(t)\,dt+o\big(h_H\,\mathbb{E}[K_1^2]\big).\tag{A5}
\]
Finally, we obtain from (A2), (A4) and (A5) that
\[
\operatorname{Var}\big(\hat f_N(\theta,y,x)\big)=\frac{f(\theta,y,x)}{nh_H\phi_{\theta,x}(h_K)}\Big(\int H^2(t)\,dt\Big)\frac{K^2(1)-\int_{-1}^{1}\big(K^2(s)\big)'\chi_{\theta,x}(s)\,ds}{\Big(K(1)-\int_{-1}^{1}K'(s)\chi_{\theta,x}(s)\,ds\Big)^2}+o\Big(\frac{1}{nh_H\phi_{\theta,x}(h_K)}\Big)
=\frac{M_2\,f(\theta,y,x)}{M_1^2\,nh_H\phi_{\theta,x}(h_K)}\int H^2(t)\,dt+o\Big(\frac{1}{nh_H\phi_{\theta,x}(h_K)}\Big).
\]

Proof of Lemma 3.4

The proof of this lemma is similar to that of Lemma 3.3. We can write
\[
\operatorname{Cov}\big(\hat f_N(\theta,y,x),\hat f_D(\theta,x)\big)=\frac{1}{\big(n(n-1)h_H\,\mathbb{E}[W_{\theta,12}]\big)^2}\operatorname{Cov}\Big(\sum_{1\le i\ne j\le n}W_{\theta,ij}H_j,\ \sum_{1\le i\ne j\le n}W_{\theta,ij}\Big)
\]
\[
=\frac{1}{\big(n(n-1)h_H\,\mathbb{E}[W_{\theta,12}]\big)^2}\Big[n(n-1)\,\mathbb{E}\big[W_{\theta,12}^2H_2\big]+n(n-1)\,\mathbb{E}\big[W_{\theta,12}W_{\theta,21}H_2\big]+n(n-1)(n-2)\,\mathbb{E}\big[W_{\theta,12}W_{\theta,13}H_2\big]
\]
\[
\qquad+n(n-1)(n-2)\,\mathbb{E}\big[W_{\theta,12}W_{\theta,23}H_2\big]+n(n-1)(n-2)\,\mathbb{E}\big[W_{\theta,12}W_{\theta,31}H_2\big]+n(n-1)(n-2)\,\mathbb{E}\big[W_{\theta,12}W_{\theta,32}H_2\big]-n(n-1)(4n-6)\,\mathbb{E}\big[W_{\theta,12}H_2\big]\,\mathbb{E}\big[W_{\theta,12}\big]\Big].
\]
By direct calculations, we get
\[
\begin{cases}
\mathbb{E}\big[W_{\theta,12}^2H_2\big]=\mathbb{E}\big[W_{\theta,12}W_{\theta,21}H_2\big]=O\big(h_K^4h_H\phi_{\theta,x}^2(h_K)\big),\\[2pt]
\mathbb{E}\big[W_{\theta,12}W_{\theta,13}H_2\big]=\mathbb{E}\big[W_{\theta,12}W_{\theta,31}H_2\big]=O\big(h_K^4h_H\phi_{\theta,x}^3(h_K)\big),\\[2pt]
\mathbb{E}\big[W_{\theta,12}W_{\theta,23}H_2\big]=\mathbb{E}\big[W_{\theta,12}W_{\theta,32}H_2\big]=O\big(h_K^4h_H\phi_{\theta,x}^3(h_K)\big).
\end{cases}
\]
Since $\mathbb{E}[W_{\theta,12}]=O\big(h_K^2\phi_{\theta,x}^2(h_K)\big)$, we obtain
\[
\operatorname{Cov}\big(\hat f_N(\theta,y,x),\hat f_D(\theta,x)\big)=O\Big(\frac{1}{n\,\phi_{\theta,x}(h_K)}\Big).
\]

Proof of Lemma 3.5

We have
\[
\operatorname{Var}\big(\hat f_D(\theta,x)\big)=\frac{1}{\big(n(n-1)\,\mathbb{E}[W_{\theta,12}]\big)^2}\operatorname{Var}\Big(\sum_{1\le i\ne j\le n}W_{\theta,ij}\Big).
\]
Similarly to the proof of Lemma 3.3, we get
\[
\operatorname{Var}\big(\hat f_D(\theta,x)\big)=\frac{\mathbb{E}[K_1^2]}{n\big(\mathbb{E}[K_1]\big)^2}+o\Big(\frac{1}{n\,\phi_{\theta,x}(h_K)}\Big).
\]
Since, as $n\to\infty$, $\phi_{\theta,x}^{-1}(h_K)\,\mathbb{E}[K_1^j]\to M_j$ for $j=1,2$ (see Ferraty et al., 2007), we can finally write
\[
\operatorname{Var}\big(\hat f_D(\theta,x)\big)=\frac{M_2\,\phi_{\theta,x}(h_K)}{n\big(M_1\,\phi_{\theta,x}(h_K)\big)^2}+o\Big(\frac{1}{n\,\phi_{\theta,x}(h_K)}\Big)=O\Big(\frac{1}{n\,\phi_{\theta,x}(h_K)}\Big).
\]

Proof of Lemma 3.8

We have
\[
\sqrt{nh_H\phi_{\theta,x}(h_K)}\,Q_n(\theta,x,y)=\frac{\sqrt{nh_H\phi_{\theta,x}(h_K)}}{n\,\mathbb{E}(\Omega_1K_1)}\Big(\sum_{j=1}^{n}\Omega_jK_j\big(H_j-f(\theta,y,x)\big)-\mathbb{E}\Big(\sum_{j=1}^{n}\Omega_jK_j\big(H_j-f(\theta,y,x)\big)\Big)\Big).
\]
Writing $\Omega_jK_j=\big(\sum_{i=1}^{n}\beta_i^2K_i\big)K_j-\big(\sum_{i=1}^{n}\beta_iK_i\big)\beta_jK_j$ and combining with (4), we obtain the decomposition
\[
\sqrt{nh_H\phi_{\theta,x}(h_K)}\,Q_n(\theta,x,y)=S_1S_2-S_3S_4-\mathbb{E}\big(S_1S_2-S_3S_4\big)=\big(S_1S_2-\mathbb{E}(S_1S_2)\big)-\big(S_3S_4-\mathbb{E}(S_3S_4)\big),
\]
where
\[
S_1=\frac{1}{n\,\mathbb{E}(\beta_1^2K_1)}\sum_{i=1}^{n}\beta_i^2K_i,\qquad
S_2=\frac{\sqrt{nh_H\phi_{\theta,x}(h_K)}\,\mathbb{E}(\beta_1^2K_1)}{\mathbb{E}(\Omega_1K_1)}\sum_{j=1}^{n}K_j\big(H_j-f(\theta,y,x)\big),
\]
\[
S_3=\frac{1}{n\,\mathbb{E}(\beta_1K_1)}\sum_{i=1}^{n}\beta_iK_i,\qquad
S_4=\frac{\sqrt{nh_H\phi_{\theta,x}(h_K)}\,\mathbb{E}(\beta_1K_1)}{\mathbb{E}(\Omega_1K_1)}\sum_{j=1}^{n}\beta_jK_j\big(H_j-f(\theta,y,x)\big).
\]
Hence, by Slutsky's theorem, it suffices to prove the following two claims:
\[
S_1S_2-\mathbb{E}(S_1S_2)\ \xrightarrow{\ \mathcal{D}\ }\ \mathcal{N}\big(0,V_{HK}(\theta,x,y)\big),\tag{A6}
\]
\[
S_3S_4-\mathbb{E}(S_3S_4)\ \xrightarrow{\ \mathbb{P}\ }\ 0.\tag{A7}
\]
Proof of (A6). We can write
\[
S_1S_2-\mathbb{E}(S_1S_2)=S_2-\mathbb{E}(S_2)+(S_1-1)S_2-\mathbb{E}\big((S_1-1)S_2\big).
\]
By Slutsky's theorem, it is enough to establish the two intermediate results
\[
(S_1-1)S_2-\mathbb{E}\big((S_1-1)S_2\big)\ \xrightarrow{\ \mathbb{P}\ }\ 0\tag{A8}
\]
and
\[
S_2-\mathbb{E}(S_2)\ \xrightarrow{\ \mathcal{D}\ }\ \mathcal{N}\big(0,V_{HK}(\theta,x,y)\big).\tag{A9}
\]
Concerning (A8), the Bienaymé–Chebyshev inequality gives, for all $\epsilon>0$,
\[
\mathbb{P}\Big(\big|(S_1-1)S_2-\mathbb{E}\big((S_1-1)S_2\big)\big|>\epsilon\Big)\le\frac{\mathbb{E}\big(\big|(S_1-1)S_2-\mathbb{E}((S_1-1)S_2)\big|\big)}{\epsilon},
\]
and the Cauchy–Schwarz inequality implies
\[
\mathbb{E}\big(\big|(S_1-1)S_2-\mathbb{E}((S_1-1)S_2)\big|\big)\le 2\,\mathbb{E}\big(|(S_1-1)S_2|\big)\le 2\sqrt{\mathbb{E}\big((S_1-1)^2\big)\,\mathbb{E}\big(S_2^2\big)}.
\]
On the one hand, by using (A1) and (A2), we obtain
\[
\mathbb{E}\big((S_1-1)^2\big)=\operatorname{Var}(S_1)=\frac{n\operatorname{Var}(\beta_1^2K_1)}{n^2\,\mathbb{E}^2(\beta_1^2K_1)}\le\frac{\mathbb{E}\big(\beta_1^4K_1^2\big)}{n\,O\big(h_K^4\phi_{\theta,x}^2(h_K)\big)}=O\Big(\frac{1}{nh_H\phi_{\theta,x}(h_K)}\Big).
\]
On the other hand,
\[
\mathbb{E}\big(S_2^2\big)=\frac{nh_H\phi_{\theta,x}(h_K)\,\mathbb{E}^2(\beta_1^2K_1)}{\mathbb{E}^2(\Omega_1K_1)}\,\mathbb{E}\Big(\sum_{j=1}^{n}K_j\big(H_j-f(\theta,y,x)\big)\Big)^2=O(1)+o\big(n\,\phi_{\theta,x}^2(h_K)\big).
\]
Thus
\[
\mathbb{E}\big(\big|(S_1-1)S_2-\mathbb{E}((S_1-1)S_2)\big|\big)\le 2\sqrt{O\Big(\frac{1}{nh_H\phi_{\theta,x}(h_K)}\Big)\Big(O(1)+o\big(nh_H\phi_{\theta,x}(h_K)\big)\Big)}=o(1),
\]
which implies that $(S_1-1)S_2-\mathbb{E}\big((S_1-1)S_2\big)=o_p(1)$, and therefore, as $n\to\infty$,
\[
\mathbb{P}\Big(\big|(S_1-1)S_2-\mathbb{E}\big((S_1-1)S_2\big)\big|>\epsilon\Big)\ \longrightarrow\ 0.
\]
Concerning (A9), set
\[
P_n=S_2-\mathbb{E}(S_2)=\frac{\sqrt{nh_H\phi_{\theta,x}(h_K)}\,\mathbb{E}(\beta_1^2K_1)}{\mathbb{E}(\Omega_1K_1)}\sum_{j=1}^{n}\mu_{nj}(x,y),
\]
where $\mu_{nj}(x,y)=K_j\big(H_j-f(\theta,y,x)\big)-\mathbb{E}\big(K_j\big(H_j-f(\theta,y,x)\big)\big)$. Since the $\mu_{nj}(x,y)$ are i.i.d., it follows that
\[
\operatorname{Var}\big(P_n(x,y)\big)=\frac{n^2\phi_{\theta,x}(h_K)\,\mathbb{E}^2(\beta_1^2K_1)}{\mathbb{E}^2(\Omega_1K_1)}\operatorname{Var}\big(\mu_{n1}(x,y)\big)=\frac{n^2\phi_{\theta,x}(h_K)\,\mathbb{E}^2(\beta_1^2K_1)}{\mathbb{E}^2(\Omega_1K_1)}\,\mathbb{E}\big(\mu_{n1}^2(x,y)\big).
\]
Thus
\[
\operatorname{Var}\big(P_n(x,y)\big)=\frac{n^2\phi_{\theta,x}(h_K)\,\mathbb{E}^2(\beta_1^2K_1)}{\mathbb{E}^2(\Omega_1K_1)}\Big(\mathbb{E}\big(K_1^2(H_1-f(\theta,y,x))^2\big)-\big(\mathbb{E}\big(K_1(H_1-f(\theta,y,x))\big)\big)^2\Big).\tag{A10}
\]
Concerning the second term on the right-hand side of (A10), we have
\[
\big(\mathbb{E}\big(K_1(H_1-f(\theta,y,x))\big)\big)^2=\Big(\mathbb{E}\Big(K_1\big[\mathbb{E}(H_1\mid X_1)-f(\theta,y,x)\big]\Big)\Big)^2,
\quad\text{where}\quad
\mathbb{E}(H_1\mid X_1)-f(\theta,y,x)\to 0\ \text{ as }n\to\infty.\tag{A11}
\]
Now let us return to the first term on the right-hand side of (A10). We have
\[
\frac{n^2\phi_{\theta,x}(h_K)\,\mathbb{E}^2(\beta_1^2K_1)}{\mathbb{E}^2(\Omega_1K_1)}\,\mathbb{E}\big(K_1^2(H_1-f(\theta,y,x))^2\big)
=\frac{n^2\phi_{\theta,x}(h_K)\,\mathbb{E}^2(\beta_1^2K_1)}{\mathbb{E}^2(\Omega_1K_1)}\Big[\mathbb{E}\big(\operatorname{var}(H_1\mid X_1)\,K_1^2\big)+\mathbb{E}\Big(\big[\mathbb{E}(H_1\mid X_1)-f(\theta,y,x)\big]^2K_1^2\Big)\Big].
\]
By (A11), the second bracketed term, once normalized, tends to $0$ as $n\to\infty$, while, proceeding as in the proof of (A5), we obtain as $n\to\infty$
\[
\mathbb{E}\big(\operatorname{var}(H_1\mid X_1)\,K_1^2\big)\sim\mathbb{E}(K_1^2)\,f(\theta,y,x)\int H^2(t)\,dt=M_2\,f(\theta,y,x)\Big(\int H^2(t)\,dt\Big)\phi_{\theta,x}(h_K).
\]
Therefore, using (A2), Equation (A10) becomes
\[
\operatorname{Var}\big(P_n(x,y)\big)=\frac{n^2\phi_{\theta,x}(h_K)\,\big(N(1,2)\,h_K^2\,\phi_{\theta,x}(h_K)\big)^2}{\big((n-1)\,N(1,2)\,M_1\,h_K^2\,\phi_{\theta,x}^2(h_K)\big)^2}\,M_2\,f(\theta,y,x)\Big(\int H^2(t)\,dt\Big)\phi_{\theta,x}(h_K)
=\frac{n^2M_2}{(n-1)^2M_1^2}\,f(\theta,y,x)\int H^2(t)\,dt\ \longrightarrow\ \frac{M_2}{M_1^2}\,f(\theta,y,x)\int H^2(t)\,dt=V_{HK}(\theta,x,y)
\]
as $n\to\infty$. To end the proof of (A9), it remains to apply the central limit theorem, for which it suffices to verify Lindeberg's condition. In fact, Lindeberg's condition holds since, for any $\eta>0$,
\[
\sum_{j=1}^{n}\mathbb{E}\Big(\mu_{nj}^2\,\mathbb{1}_{\{|\mu_{nj}|>\eta\}}\Big)=n\,\mathbb{E}\Big(\mu_{n1}^2\,\mathbb{1}_{\{|\mu_{n1}|>\eta\}}\Big)=\mathbb{E}\Big(\big(\sqrt n\,\mu_{n1}\big)^2\,\mathbb{1}_{\{|\sqrt n\,\mu_{n1}|>\sqrt n\,\eta\}}\Big)\ \longrightarrow\ 0,
\]
since
\[
\mathbb{E}\big(\big(\sqrt n\,\mu_{n1}\big)^2\big)=n\,\mathbb{E}\big(\mu_{n1}^2\big)\ \longrightarrow\ \frac{M_2}{M_1^2}\,f(\theta,y,x)\int H^2(t)\,dt.
\]
Proof of (A7). Using the same arguments as those invoked to prove (A6), write
\[
S_3S_4-\mathbb{E}(S_3S_4)=S_4-\mathbb{E}(S_4)+(S_3-1)S_4-\mathbb{E}\big((S_3-1)S_4\big).
\]
Applying the Bienaymé–Chebyshev inequality, we obtain, for all $\epsilon>0$,
\[
\mathbb{P}\big(|S_3S_4-\mathbb{E}(S_3S_4)|>\epsilon\big)\le\frac{\mathbb{E}\big(|S_3S_4-\mathbb{E}(S_3S_4)|\big)}{\epsilon},
\]
and the Cauchy–Schwarz inequality implies
\[
\mathbb{E}\big(\big|(S_3-1)S_4-\mathbb{E}((S_3-1)S_4)\big|\big)\le 2\sqrt{\mathbb{E}\big((S_3-1)^2\big)\,\mathbb{E}\big(S_4^2\big)}.
\]
Taking into account assumptions (H5) and (H6), we get
\[
\mathbb{E}\big((S_3-1)^2\big)=\operatorname{Var}(S_3)=\frac{n\operatorname{var}(\beta_1K_1)}{n^2\,\mathbb{E}^2(\beta_1K_1)}\le\frac{\mathbb{E}\big(\beta_1^2K_1^2\big)}{n\,O\big(h_K^4\phi_{\theta,x}^2(h_K)\big)}=O\Big(\frac{1}{nh_H\phi_{\theta,x}(h_K)}\Big).
\]
On the other hand,
\[
\mathbb{E}\big(S_4^2\big)=\frac{nh_H\phi_{\theta,x}(h_K)\,\mathbb{E}^2(\beta_1K_1)}{\mathbb{E}^2(\Omega_1K_1)}\,\mathbb{E}\Big(\sum_{j=1}^{n}\beta_jK_j\big(H_j-f(\theta,y,x)\big)\Big)^2
\]
\[
=\frac{nh_H\phi_{\theta,x}(h_K)\,O\big(h_K^2\phi_{\theta,x}^2(h_K)\big)}{(n-1)^2\,O\big(h_K^4\phi_{\theta,x}^4(h_K)\big)}\Big(n\,\mathbb{E}\big(\beta_1^2K_1^2(H_1-f(\theta,y,x))^2\big)+n(n-1)\,\mathbb{E}^2\big(\beta_1K_1(H_1-f(\theta,y,x))\big)\Big)=o(1)+o\big(nh_H\phi_{\theta,x}(h_K)\big).
\]
It follows that
\[
\mathbb{E}\big(\big|(S_3-1)S_4-\mathbb{E}((S_3-1)S_4)\big|\big)\le 2\sqrt{\mathbb{E}\big((S_3-1)^2\big)\,\mathbb{E}\big(S_4^2\big)}=o(1),
\]
which implies $(S_3-1)S_4-\mathbb{E}\big((S_3-1)S_4\big)=o_p(1)$. Therefore,
\[
\mathbb{P}\big(|S_3S_4-\mathbb{E}(S_3S_4)|>\epsilon\big)\le\frac{\mathbb{E}\big(|S_3S_4-\mathbb{E}(S_3S_4)|\big)}{\epsilon}\ \longrightarrow\ 0\quad\text{as }n\to\infty.
\]
So, to complete the proof of (A7), it suffices to show that $S_4-\mathbb{E}(S_4)=o_p(1)$. We have
\[
\mathbb{E}\big(S_4-\mathbb{E}(S_4)\big)^2=\operatorname{Var}(S_4)=\frac{n^2\phi_{\theta,x}(h_K)\,\mathbb{E}^2(\beta_1K_1)}{\mathbb{E}^2(\Omega_1K_1)}\operatorname{Var}\big(\beta_1K_1(H_1-f(\theta,y,x))\big),
\]
and we arrive at
\[
\operatorname{Var}\big(\beta_1K_1(H_1-f(\theta,y,x))\big)=f(\theta,y,x)\Big(\int H^2(t)\,dt\Big)\mathbb{E}\big(\beta_1^2K_1^2\big).
\]
This last result, together with (A1) and (A2), leads directly to
\[
\mathbb{E}\big(S_4-\mathbb{E}(S_4)\big)^2=\frac{n^2\phi_{\theta,x}(h_K)\,\mathbb{E}^2(\beta_1K_1)}{\mathbb{E}^2(\Omega_1K_1)}\,f(\theta,y,x)\Big(\int H^2(t)\,dt\Big)\mathbb{E}\big(\beta_1^2K_1^2\big)=f(\theta,y,x)\Big(\int H^2(t)\,dt\Big)\,o(1),
\]
which finishes the proof.