Rate of complete second-order moment convergence and theoretical applications


ABSTRACT

This work presents a novel mode of convergence, complete second-order moment convergence with a rate, which implies almost complete convergence and yields a smaller rate of convergence. This mode is easier to establish and performs better than almost complete convergence for nonparametric kernel estimators of the density function, the distribution function and the quantile function. A major advantage of the proposed approach is that fewer conditions are imposed on the kernel function, thanks to the use of the mean squared error expression.


1. Introduction

The rate of convergence is an important concept that describes the speed with which a sequence converges to its limit. In this setting, a fundamental question is how fast the convergence is. Theoretical studies on the subject have proposed important new results, algorithms and applications to address this issue. The purpose of this work is to present a novel mode of convergence, namely complete second-order moment convergence with a rate. The novelty lies in the introduction of the rate of complete second-order moment convergence, and we give proofs of several properties of this kind of convergence. Almost complete convergence is induced by second-order convergence in different contexts (see [Citation1–3] and, more recently, Yu et al. [Citation29]). The concept of complete convergence was introduced by Hsu and Robbins [Citation4] and has since been used by several authors, such as Gut and Stadtmüller [Citation5], Gut [Citation6,Citation7], Li et al. [Citation8], Sung [Citation9] and Sung and Volodin [Citation10]. The interest of such a notion lies in the fact that almost complete convergence (a.c.) implies almost sure convergence (a.s.) by the Borel–Cantelli lemma. However, in some situations at least, it is much easier to obtain complete second-order moment convergence (c.s.m.) than almost complete convergence.

As a practical framework, rates of complete second-order moment convergence are established for the kernel estimators of the probability density, the distribution function and the quantile function. We also discuss the speed at which these estimators converge. First, in the context of estimating a probability density function, many studies using different methods have been proposed. The kernel method is one of the best of these methods; it is convenient and does not require choosing many parameters. Rosenblatt [Citation11] pioneered the class of kernel density estimators, which use two parameters, namely the kernel and the bandwidth. For this estimator, convergence in probability was established by Parzen [Citation12]. Habbema et al. [Citation13], Hall and Kang [Citation14], Hall and Wand [Citation15], Ghosh and Chaudhuri [Citation16] and Ghosh and Hall [Citation17] can be consulted for various works on the subject, in particular the estimation of densities by classical kernels. In the case of independent observations, optimal rates of convergence to zero for the mean squared error and the bias of kernel estimators have been addressed by several authors under varying conditions on the kernel (K) and the density (f). As a contribution, complete second-order moment convergence with a rate is introduced for the first time to improve the rates of convergence for the bias and the mean squared error (MSE) of kernel density estimators. Then the complete convergence of the kernel density estimator is achieved under weaker conditions on the density function than those proposed in the literature. As a consequence of the complete second-order moment convergence, almost complete convergence is obtained with a better rate.

Second, the proposed mode of convergence is applied to the kernel estimators of the distribution function and of the quantiles. For the distribution function, Nadaraya [Citation18] proposed a kernel estimator. Parzen [Citation19] revisited the setting of Nadaraya [Citation18] and constructed kernel quantile estimators, for which we establish the rate of almost complete convergence. To the best of our knowledge, this is a new result obtained from the proposed convergence mode. A major advantage of the proposed approach is that fewer conditions are imposed on the kernel function, thanks to the use of the mean squared error expression.

2. Complete second-order moment convergence with a rate

Throughout this paper, real-valued random variables are defined on a fixed probability space $(Ω, A, P)$.

Let $(U_{n})_{n \in N}$ and $(V_{n})_{n \in N}$ be two sequences of real numbers, and assume that $(V_{n})_{n \in N}$ does not vanish from a certain rank onwards. We say that $(U_{n})_{n \in N}$ is dominated by $(V_{n})_{n \in N}$ if there exist a real number M and an integer $n_{0}$ such that, for all $n \geq n_{0}$, we have $| U_{n} | \leq M | V_{n} |$, and we write $U_{n} = O (V_{n})$.

Definition 2.1

A sequence $(X_{n})_{n \in N}$ of random variables is said to be complete second-order moment convergent (c.s.m.) to the random variable X, with the convergence rate $\frac{1}{\sqrt{U_{n}}}$, if $\sum_{n \in N} U_{n} E (X_{n} - X)^{2} < \infty,$ where $(U_{n})$ is a sequence of positive numbers. We denote this mode by $X_{n} - X = O_{c . s . m} (\frac{1}{\sqrt{U_{n}}}) .$

The following theorem shows that if $(X_{n})_{n \in N}$ converges in complete second-order moment to X with a rate $\frac{1}{\sqrt{U_{n}}}$ , then it is almost completely convergent with a rate of $\frac{1}{\sqrt{U_{n}}} .$

Theorem 2.1

If a sequence $(X_{n})_{n \in N}$ of random variables satisfies $X_{n} - X = O_{c . s . m} (\frac{1}{\sqrt{U_{n}}}),$ then $X_{n} - X = O_{a . c} (\frac{1}{\sqrt{U_{n}}}) .$ Moreover, for $β > - 1$, $ϵ > 0$ and the sequence $(V_{n})_{n \in N} = (n^{β} U_{n})_{n \in N}$, we obtain $\sum_{n \in N} n^{β} P (| X_{n} - X | > ϵ) \leq \sum_{n \in N} n^{β} U_{n} E (X_{n} - X)^{2} < \infty .$

Proof.

Suppose that $\sum_{n \in N} U_{n} E (X_{n} - X)^{2} < \infty .$ Then, from the Markov inequality, we obtain $P (| X_{n} - X | \geq \frac{1}{\sqrt{U_{n}}}) \leq U_{n} E (X_{n} - X)^{2},$ so $\sum_{n \in N} P (| X_{n} - X | \geq \frac{1}{\sqrt{U_{n}}}) \leq \sum_{n \in N} U_{n} E (X_{n} - X)^{2} < \infty .$ Now, $\frac{1}{\sqrt{U_{n}}}$ is a rate of convergence, so $\frac{1}{\sqrt{U_{n}}} ⟶ 0$, which is equivalent to the following: for each $ϵ > 0$, there exists $n_{0} \in N$ such that, for all $n \geq n_{0}$, we have $\frac{1}{\sqrt{U_{n}}} < ϵ$. So $P (| X_{n} - X | > ϵ) \leq P (| X_{n} - X | \geq \frac{1}{\sqrt{U_{n}}})$, which implies $\sum_{n \in N} n^{β} P (| X_{n} - X | > ϵ) \leq \sum_{n \in N} n^{β} U_{n} E (X_{n} - X)^{2} < \infty .$
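The Markov step of this proof can be checked numerically. The sketch below is only an illustration under an assumption that is not part of the paper: the error $X_{n} - X$ is taken to be Gaussian with standard deviation `sigma`, so that the tail probability is available in closed form. The function names and the grid of values are hypothetical.

```python
import math

def gaussian_tail(sigma, t):
    """Exact P(|Z| >= t) when Z ~ N(0, sigma^2), via the complementary error function."""
    return math.erfc(t / (sigma * math.sqrt(2.0)))

def markov_step_holds(sigma, u_n):
    """Check P(|X_n - X| >= 1/sqrt(U_n)) <= U_n * E(X_n - X)^2 for a Gaussian error."""
    threshold = 1.0 / math.sqrt(u_n)   # the rate 1 / sqrt(U_n)
    second_moment = sigma ** 2         # E(X_n - X)^2 under the Gaussian assumption
    return gaussian_tail(sigma, threshold) <= u_n * second_moment

# Hypothetical grid of standard deviations and weights U_n.
checks = [markov_step_holds(sigma, u_n)
          for sigma in (0.01, 0.1, 1.0)
          for u_n in (0.5, 1.0, 10.0, 100.0)]
print(all(checks))
```

The inequality holds for every pair in the grid because it is the Markov inequality applied to the squared error; the Gaussian choice only makes the left-hand side computable.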

The following proposition gives some elementary calculus rules.

Proposition 2.1

Assume $lim_{n ⟶ \infty} U_{n} = + \infty, X_{n} - l_{X} = O_{c . s . m} (\frac{1}{\sqrt{U_{n}}})$ and $Y_{n} - l_{Y} = O_{c . s . m} (\frac{1}{\sqrt{U_{n}}}),$ where $l_{X}, l_{Y}$ are real numbers. We have

(1)	$(X_{n} + Y_{n}) - (l_{X} + l_{Y}) = O_{c . s . m} (\frac{1}{\sqrt{U_{n}}})$ ;
(2)	$X_{n} . Y_{n} - l_{X} . l_{Y} = O_{c . s . m} (\frac{1}{\sqrt{U_{n}}})$ ;
(3)	$\frac{1}{X_{n}} - \frac{1}{l_{X}} = O_{c . s . m} (\frac{1}{\sqrt{U_{n}}}),$ with $l_{X} \neq 0$ .

Proof.

Item (1) follows immediately from the inequality $\begin{aligned} & \sum_{n \in N} U_{n} E ((X_{n} + Y_{n}) - (l_{X} + l_{Y}))^{2} \\ & \quad \leq 2^{2} (\sum_{n \in N} U_{n} E (X_{n} - l_{X})^{2} + \sum_{n \in N} U_{n} E (Y_{n} - l_{Y})^{2}) . \end{aligned}$
For item (2), we have $\begin{aligned} & E (X_{n} . Y_{n} - l_{X} . l_{Y})^{2} \\ & \quad = \int_{0}^{+ \infty} P (| X_{n} . Y_{n} - l_{X} . l_{Y} | > \sqrt{t}) d t \\ & \quad = \int_{0}^{+ \infty} P (| (X_{n} - l_{X}) (Y_{n} - l_{Y}) + l_{Y} (X_{n} - l_{X}) + l_{X} (Y_{n} - l_{Y}) | > \sqrt{t}) d t \\ & \quad \leq E | X_{n} - l_{X} |^{2} + E | Y_{n} - l_{Y} |^{2} + l_{Y}^{2} E | X_{n} - l_{X} |^{2} + l_{X}^{2} E | Y_{n} - l_{Y} |^{2}, \end{aligned}$ and therefore $\begin{aligned} & \sum_{n \in N} U_{n} E (X_{n} . Y_{n} - l_{X} . l_{Y})^{2} \\ & \quad \leq \sum_{n \in N} U_{n} E | X_{n} - l_{X} |^{2} + \sum_{n \in N} U_{n} E | Y_{n} - l_{Y} |^{2} \\ & \qquad + \sum_{n \in N} U_{n} l_{Y}^{2} E | X_{n} - l_{X} |^{2} + \sum_{n \in N} U_{n} l_{X}^{2} E | Y_{n} - l_{Y} |^{2} < \infty . \end{aligned}$
For item (3), with $l_{X} \neq 0$, $\begin{aligned} E {(\frac{1}{X_{n}} - \frac{1}{l_{X}})}^{2} & = \int_{0}^{+ \infty} P (| \frac{1}{X_{n}} - \frac{1}{l_{X}} |^{2} > t) d t \\ & = \int_{0}^{+ \infty} P (| X_{n} - l_{X} |^{2} > t . (X_{n} . l_{X})^{2}) d t \\ & \leq \int_{0}^{+ \infty} P (| X_{n} - l_{X} |^{2} > t) d t \\ & \leq E | X_{n} - l_{X} |^{2}, \end{aligned}$ so $\sum_{n \in N} U_{n} E {(\frac{1}{X_{n}} - \frac{1}{l_{X}})}^{2} \leq \sum_{n \in N} U_{n} E | X_{n} - l_{X} |^{2} < \infty .$
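The inequality used for the sum rule rests on an elementary bound for the square of a sum. The worked step below spells it out; the shorthand $a = X_{n} - l_{X}$ and $b = Y_{n} - l_{Y}$ is introduced here for illustration only and does not appear in the original proof.

```latex
\[
(a + b)^{2} = a^{2} + 2ab + b^{2} \leq 2a^{2} + 2b^{2} \leq 2^{2}\,(a^{2} + b^{2}),
\qquad \text{since } 2ab \leq a^{2} + b^{2}.
\]
\[
\sum_{n \in N} U_{n}\, E\big((X_{n} + Y_{n}) - (l_{X} + l_{Y})\big)^{2}
\leq 2^{2} \Big( \sum_{n \in N} U_{n}\, E (X_{n} - l_{X})^{2}
+ \sum_{n \in N} U_{n}\, E (Y_{n} - l_{Y})^{2} \Big) < \infty .
\]
```

Taking expectations, multiplying by $U_{n} > 0$ and summing over n preserves the inequality, which gives the second line from the first.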

Now we have two properties which are consequences of the previous calculus rules.

Corollary 2.1

Assume $lim_{n ⟶ \infty} U_{n} = + \infty$, $X_{n} = O_{c . s . m} (\frac{1}{\sqrt{U_{n}}})$ and $Y_{n} = O_{c . s . m} (\frac{1}{\sqrt{U_{n}}}),$ where $l_{Y}$ is a real number. We have

(1)	$(X_{n} . Y_{n}) = O_{c . s . m} (\frac{1}{\sqrt{U_{n}}})$ ;
(2)	$\frac{X_{n}}{Y_{n}} = O_{c . s . m} (\frac{1}{\sqrt{U_{n}}}),$ with $l_{X} \neq 0.$

Proof.

Follows directly from Proposition 2.1.

3. Theoretical applications

In this section, rates of complete second-order moment convergence are established for the kernel estimators of the probability density, the distribution function and the quantile function. The following remark is essential for establishing these rates.

Remark 3.1

In all that follows, the assumption that $\sum_{n \geq 1} U_{n} h_{n}^{4}$ or $\sum_{n \geq 1} \frac{U_{n}}{h_{n}^{2}}$ converges is justified as follows: if $h_{n} < 1$, then $U_{n} h_{n}^{4} \leq \frac{U_{n}}{h_{n}^{2}}$, so the convergence of $\sum_{n \geq 1} \frac{U_{n}}{h_{n}^{2}}$ implies the convergence of $\sum_{n \geq 1} U_{n} h_{n}^{4}$. If $h_{n} > 1$ for some initial indices but $h_{n} ⟶ 0$ as $n ⟶ \infty$, then $h_{n} < 1$ from a certain rank onwards and the convergence is still obtained.

3.1. Kernel density estimator

Let $X_{1}$, $X_{2}$,…, $X_{n}$ be independent and identically distributed copies of a random variable X, which has an unknown continuous probability density function f. The kernel density estimator of the unknown density f, denoted $\hat{f_{n}}$ and introduced by Rosenblatt [Citation11] and Parzen [Citation12], is given by $\hat{f_{n}} (x) = \frac{1}{n h_{n}} \sum_{i = 1}^{n} K (\frac{x - X_{i}}{h_{n}}),$ where $(h_{n})_{n \geq 1}$ is a sequence of positive numbers, usually called the bandwidth or smoothing parameter, and K, called the kernel, is an integrable Borel measurable function satisfying $K \geq 0$ and $\int_{R} K (x) d x = 1$. Assume that the kernel K and the density f satisfy the following conditions:

(H1)	$\forall x \in R$ , $K (x) = K (- x)$ ;
(H2)	$sup_{x \in R} ∣ K (x) ∣\leq M < \infty$ ;
(H3)	$\int_{- \infty}^{+ \infty} x^{2} K (x) d x < \infty$ ;
(H4)	$f \in C^{2}$ and $f^{(2)}$ is bounded.
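As a minimal sketch of the estimator $\hat{f_{n}}$ defined above, the following code evaluates it at a point. The Epanechnikov kernel is an assumption made for illustration (it is symmetric, bounded and has a finite second moment, so it is compatible with (H1)–(H3)); the function names, the toy sample and the bandwidth are hypothetical.

```python
def epanechnikov(u):
    """Epanechnikov kernel: symmetric, bounded by 0.75, supported on [-1, 1]."""
    return 0.75 * (1.0 - u * u) if abs(u) <= 1.0 else 0.0

def kernel_density(x, sample, h, kernel=epanechnikov):
    """Evaluate (1 / (n h)) * sum_i K((x - X_i) / h) at the point x."""
    n = len(sample)
    return sum(kernel((x - x_i) / h) for x_i in sample) / (n * h)

# Hypothetical toy sample and bandwidth.
sample = [0.0, 0.5, 2.0]
print(kernel_density(0.25, sample, h=0.5))
```

Any other kernel satisfying the stated conditions can be passed through the `kernel` argument.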

Theorem 3.1

Under (H1)–(H4), and supposing that $\sum_{n \geq 1} U_{n} h_{n}^{4}$ or $\sum_{n \geq 1} \frac{U_{n}}{h_{n}^{2}}$ converges (see Remark 3.1), we have $\hat{f_{n}} (x) - f (x) = O_{c . s . m} (\frac{1}{\sqrt{U_{n}}}) .$ (1)

Proof.

To prove (1), notice first that the mean squared error (MSE) of $\hat{f_{n}}$, defined by $M S E (\hat{f_{n}} (x)) = E (\hat{f_{n}} (x) - f (x))^{2},$ can be written as $M S E (\hat{f_{n}} (x)) = B i a s (\hat{f_{n}} (x))^{2} + V a r (\hat{f_{n}} (x)) .$ (2) Hence, to prove (1), it suffices to show that $\sum_{n \geq 1} U_{n} V a r [\hat{f_{n}} (x)] < \infty$ (3) and $\sum_{n \geq 1} U_{n} B i a s (\hat{f_{n}} (x))^{2} < \infty .$ (4) For inequality (3), we have $\begin{aligned} V a r [\hat{f_{n}} (x)] & = \frac{1}{n^{2} h_{n}^{2}} V a r \sum_{i = 1}^{n} K (\frac{x - X_{i}}{h_{n}}) \\ & \leq \frac{1}{n h_{n}^{2}} E [K^{2} (\frac{x - X}{h_{n}})] . \end{aligned}$ Using condition (H2) on the kernel K, it follows that $V a r [\hat{f_{n}} (x)] \leq \frac{M^{2}}{n h_{n}^{2}} .$ Therefore $\sum_{n \geq 1} U_{n} V a r [\hat{f_{n}} (x)] \leq \sum_{n \geq 1} \frac{U_{n} M^{2}}{h_{n}^{2}} .$ Since $\sum_{n \geq 1} \frac{U_{n}}{h_{n}^{2}}$ converges, we obtain $\sum_{n \geq 1} U_{n} V a r [\hat{f_{n}} (x)] < \infty .$

On the other hand, for inequality (4), we have $\begin{aligned} E [\hat{f_{n}} (x)] & = E [\frac{1}{n h_{n}} \sum_{i = 1}^{n} K (\frac{x - X_{i}}{h_{n}})] \\ & = \int_{R} K (z) f (x - z h_{n}) d z . \end{aligned}$ Using a Taylor series expansion of the function f about the point x up to order 3, together with (H1), (H3) and (H4), one obtains $\begin{aligned} E [\hat{f_{n}} (x)] & = f (x) \int_{R} K (z) d z - h_{n} f^{'} (x) \int_{R} z K (z) d z \\ & \quad + \frac{h_{n}^{2}}{2} f^{(2)} (x) \int_{R} z^{2} K (z) d z - \frac{h_{n}^{3}}{6} f^{(3)} (θ) \int_{R} z^{3} K (z) d z, \end{aligned}$ where θ is a real number between x and $x - z h_{n}$. Hence $E [\hat{f_{n}} (x)] ⩽ f (x) + \frac{h_{n}^{2}}{2} f^{(2)} (x) \int_{R} z^{2} K (z) d z .$ Thus $\begin{aligned} \sum_{n \geq 1} U_{n} B i a s (\hat{f_{n}} (x))^{2} & \leq \sum_{n \geq 1} \frac{U_{n} h_{n}^{4}}{4} {[f^{(2)} (x) \int_{R} z^{2} K (z) d z]}^{2} \\ & = C \sum_{n \geq 1} U_{n} h_{n}^{4} < \infty . \end{aligned}$ The proof is completed.
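The decomposition (2) of the MSE into squared bias and variance can be illustrated by a small Monte Carlo experiment. This is a sketch under assumptions that are not taken from the paper: standard normal data, an Epanechnikov kernel, 200 replications and a fixed seed. All names are hypothetical.

```python
import math
import random

def epanechnikov(u):
    """Epanechnikov kernel on [-1, 1]."""
    return 0.75 * (1.0 - u * u) if abs(u) <= 1.0 else 0.0

def kernel_density(x, sample, h):
    """Kernel density estimate (1 / (n h)) * sum_i K((x - X_i) / h)."""
    return sum(epanechnikov((x - x_i) / h) for x_i in sample) / (len(sample) * h)

def empirical_mse_parts(x, n, h, replications=200, seed=0):
    """Return (mse, bias, variance) of the estimate at x over simulated N(0, 1) samples."""
    rng = random.Random(seed)
    true_density = math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)
    estimates = []
    for _ in range(replications):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        estimates.append(kernel_density(x, sample, h))
    mean_estimate = sum(estimates) / replications
    bias = mean_estimate - true_density
    variance = sum((e - mean_estimate) ** 2 for e in estimates) / replications
    mse = sum((e - true_density) ** 2 for e in estimates) / replications
    return mse, bias, variance

mse, bias, variance = empirical_mse_parts(x=0.0, n=200, h=200 ** (-0.2))
print(mse, bias ** 2 + variance)
```

The two printed values agree up to floating-point error, because the identity also holds exactly for empirical moments when the variance is normalised by the number of replications.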

Remark 3.2

If K is a symmetric, compactly supported kernel, we obtain the complete second-order moment convergence of $\hat{f_{n}}$ under (H4).

The choices of $U_{n}$ and $h_{n}$ are not arbitrary because their expressions must be selected so that the convergence of the obtained series is ensured.

Example 3.1

By choosing $h_{n} = n^{- s}$ and $U_{n} = n^{- α}$, where $s \in] 0, 1 [$ and $α > 1$, we obtain $\sum_{n \geq 1} U_{n} V a r [\hat{f_{n}} (x)] \leq \sum_{n \geq 1} \frac{M^{2}}{n^{1 + α - 2 s}} .$ This Riemann series converges if $1 + α - 2 s > 1$, that is, if $α > 2 s$. Moreover, $\sum_{n \geq 1} U_{n} (E [\hat{f_{n}} (x)] - f (x))^{2} \leq C \sum_{n \geq 1} \frac{1}{n^{4 s + α}},$ and the right-hand side converges if $α > 1 - 4 s$.

Consequently, combining the convergence conditions of the two right-hand side series, one obtains (1) for $α > 2.$ Indeed, it can be checked that

if $(1 - 4 s) \in] - 3, 0]$ , then $2 s \in] 0.5, 1]$ ;
if $(1 - 4 s) \in] 0, 1]$ , then $2 s \in] 0, 0.5]$ ;
if $2 s \in] 1, 2]$ , then $1 - 4 s \in] - 3, - 1]$ ;
if $2 s = 1 - 4 s$, then $s = \frac{1}{6}$ .

So both conditions are always satisfied when $α > 2$.
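The two convergence conditions of this example reduce to comparing exponents of Riemann series, which can be sketched in a few lines. The helper names below are hypothetical; the exponents are those of the two dominating series $\sum 1 / n^{1 + α - 2 s}$ and $\sum 1 / n^{4 s + α}$.

```python
def csm_series_exponents(s, alpha):
    """Exponents p of the dominating series sum 1/n^p for h_n = n^(-s), U_n = n^(-alpha)."""
    variance_exponent = 1.0 + alpha - 2.0 * s   # from M^2 / n^(1 + alpha - 2s)
    bias_exponent = 4.0 * s + alpha             # from C / n^(4s + alpha)
    return variance_exponent, bias_exponent

def both_series_converge(s, alpha):
    """A Riemann series sum 1/n^p converges exactly when p > 1."""
    variance_exponent, bias_exponent = csm_series_exponents(s, alpha)
    return variance_exponent > 1.0 and bias_exponent > 1.0

# Hypothetical choices of the bandwidth exponent s and the rate exponent alpha.
print(both_series_converge(0.2, 1.5))
print(both_series_converge(0.9, 1.5))
```

For a given s, the check succeeds as soon as α exceeds both 2s and 1 − 4s, so any α > 2 works for every s in ]0, 1[.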

Corollary 3.1

Under (H1)–(H4), we have $\hat{f_{n}} - f = O_{a . c} (\frac{n h_{n}}{n^{α} (\log n)^{2}}), α > 1.$

Proof.

For the optimal bandwidth $h_{n} = n^{- \frac{1}{5}}$ and the rate of convergence $U_{n} = \frac{n h_{n}}{n^{α} (\log n)^{2}}$, which satisfies the inequality $U_{n} < \frac{n h_{n}}{\log n}$, where $\frac{n h_{n}}{\log n}$ is the rate of almost complete convergence of the kernel density estimator, one obtains $\sum_{n \geq 1} U_{n} V a r [\hat{f_{n}} (x)] \leq \sum_{n \geq 1} \frac{M^{2}}{n^{α - \frac{1}{5}} (\log n)^{2}}$ and $\sum_{n \geq 1} U_{n} (B i a s [\hat{f_{n}} (x)])^{2} \leq C \sum_{n \geq 1} \frac{1}{n^{α} (\log n)^{2}} .$ Combining the convergence conditions of the two series on the right-hand side, we obtain (1) for $α > 1$.

3.2. Kernel distribution function estimator

Let $X_{1}$, $X_{2}$,…, $X_{n}$ be independent and identically distributed copies of a random variable X, which has an unknown continuous probability density function f and distribution function F. The kernel distribution function estimator $\hat{F_{n}}$, proposed by Nadaraya [Citation18], can be obtained by integrating the kernel density estimator $\hat{f_{n}}$, as follows: $\hat{F_{n}} (x) = \int_{- \infty}^{x} \hat{f_{n}} (t) d t = \frac{1}{n} \sum_{i = 1}^{n} H (\frac{x - X_{i}}{h_{n}}),$ where the function H is defined from the kernel K as $H (x) = \int_{- \infty}^{x} K (t) d t .$ The function H is a cumulative distribution function because K is a probability density function.
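The estimator $\hat{F_{n}}$ can be sketched as follows. The Epanechnikov kernel is again an assumption chosen for illustration, because its integral H has a simple closed form; the function names and the toy sample are hypothetical.

```python
def epanechnikov_cdf(u):
    """H(u): integral of the Epanechnikov kernel 0.75 * (1 - t^2) from -infinity to u."""
    if u <= -1.0:
        return 0.0
    if u >= 1.0:
        return 1.0
    return 0.5 + 0.75 * u - 0.25 * u ** 3

def kernel_distribution(x, sample, h, cdf=epanechnikov_cdf):
    """Evaluate (1 / n) * sum_i H((x - X_i) / h) at the point x."""
    n = len(sample)
    return sum(cdf((x - x_i) / h) for x_i in sample) / n

# Hypothetical toy sample and bandwidth.
sample = [0.0, 1.0, 2.0]
print(kernel_distribution(1.0, sample, h=0.5))
```

Because H is itself a distribution function, the resulting estimate is non-decreasing in x and takes values between 0 and 1.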

Assume that the following hypotheses are satisfied:

(H5)	$\int_{- \infty}^{+ \infty} K (x) H (x) d x < \infty$ ;
(H6)	$\int_{- \infty}^{+ \infty} x^{2} K (x) H (x) d x < \infty$ ;
(H7)	$\int_{- \infty}^{+ \infty} x K (x) H (x) d x \geq 0$ ;
(H8)	$F \in C^{2}$ and $f^{'}$ is bounded.

Then Theorem 3.2 states the complete second-order moment convergence of $\hat{F_{n}}$ to F.

Theorem 3.2

Under (H1), (H3), (H5)–(H8), and supposing that $\sum_{n \geq 1} U_{n} h_{n}^{4}$ or $\sum_{n \geq 1} \frac{U_{n}}{h_{n}^{2}}$ converges (see Remark 3.1), we have $\hat{F_{n}} (x) - F (x) = O_{c . s . m} (\frac{1}{\sqrt{U_{n}}}) .$ (5)

Proof.

To prove (5), we use the same argument as for (1). First, for the bias, we have $\begin{aligned} E [\hat{F_{n}} (x)] & = E [\frac{1}{n} \sum_{i = 1}^{n} H (\frac{x - X_{i}}{h_{n}})] \\ & = \int_{R} H (\frac{x - z}{h_{n}}) f (z) d z . \end{aligned}$ Using integration by parts, the substitution $\frac{x - z}{h_{n}} = y$, a Taylor series expansion of the function F about the point x up to order 2, and (H1), (H3) and (H8), we obtain $\sum_{n \geq 1} U_{n} B i a s (\hat{F_{n}} (x))^{2} \leq C \sum_{n \geq 1} U_{n} h_{n}^{4} < \infty .$ Now, for the variance, $\begin{aligned} V a r [\hat{F_{n}} (x)] & = \frac{1}{n^{2}} V a r \sum_{i = 1}^{n} H (\frac{x - X_{i}}{h_{n}}) \\ & \leq E [H^{2} (\frac{x - X}{h_{n}})] . \end{aligned}$ Using (H1), (H5)–(H8), integration by parts and the substitution $\frac{x - z}{h_{n}} = y,$ one has $\begin{aligned} V a r [\hat{F_{n}} (x)] & \leq \frac{2 h_{n}^{2} F (x) \int_{- \infty}^{+ \infty} K (y) H (y) d y + h_{n}^{4} f^{'} (x) \int_{- \infty}^{+ \infty} y^{2} K (y) H (y) d y}{h_{n}^{2}} \\ & \leq \frac{C_{1} h_{n}^{2} + C_{2} h_{n}^{4}}{h_{n}^{2}}, \end{aligned}$ where $C_{1}$ and $C_{2}$ are two constants. Since $(h_{n})$ converges to 0, for every $ϵ > 0$ there exists $n_{0}$ such that $h_{n} \leq ϵ$ for all $n \geq n_{0}$; then $\sum_{n \geq 1} U_{n} V a r [\hat{F_{n}} (x)] \leq M \sum_{n \geq 1} \frac{U_{n}}{h_{n}^{2}},$ and the desired result is obtained.

Remark 3.3

When K is symmetric and has compact support $[- 1, 1]$, the properties of H are given in Baszczynska [Citation20], and hypotheses (H5) and (H7) are satisfied. The MSE of the kernel distribution estimator $\hat{F_{n}}$ can be obtained under the assumptions used by Azzalini [Citation21]: with a kernel satisfying the above assumptions, the squared bias is $B i a s^{2} [\hat{F_{n}} (x)] = \frac{1}{4} h_{n}^{4} f^{' 2} (x) {(\int_{- 1}^{1} y^{2} K (y) d y)}^{2} + o (h_{n}^{4}),$ and the variance is $\begin{aligned} V a r [\hat{F_{n}} (x)] & = \frac{1}{n} F (x) (1 - F (x)) \\ & \quad - 2 \frac{1}{n} h_{n} f (x) \int_{- 1}^{1} y K (y) H (y) d y + o (\frac{h_{n}}{n}) . \end{aligned}$ Then the MSE is given by $\begin{aligned} M S E [\hat{F_{n}} (x)] & = \frac{1}{4} h_{n}^{4} f^{' 2} (x) {(\int_{- 1}^{1} y^{2} K (y) d y)}^{2} + \frac{1}{n} F (x) (1 - F (x)) \\ & \quad - 2 \frac{1}{n} h_{n} f (x) \int_{- 1}^{1} y K (y) H (y) d y + o (h_{n}^{4} + \frac{h_{n}}{n}) . \end{aligned}$ So, under (H8), we obtain $\sum_{n \geq 1} U_{n} M S E (\hat{F_{n}} (x)) < \infty .$

The next corollary gives a new rate of the almost complete convergence of $\hat{F_{n}}$ to F.

Corollary 3.2

Under (H1), (H3), (H5)–(H8), we have $\hat{F_{n}} - F = O_{a . c} (\sqrt{\frac{\log n}{n^{2 α} h_{n}^{8}}}), α > 1.$

Proof.

Noting that $\sqrt{\frac{\log n}{n^{2 α} h_{n}^{8}}} < \sqrt{\frac{\log n}{n}}$, where $\sqrt{\frac{\log n}{n}}$ is the rate of the almost complete convergence of $\hat{F_{n}}$ to F, we get $\sum_{n \geq 1} U_{n} E (\hat{F_{n}} (x) - F (x))^{2} \leq \sum_{n \geq 1} \sqrt{\frac{\log n}{n^{2 α} h_{n}^{8}}} h_{n}^{4} \leq \sum_{n \geq 1} \frac{\sqrt{\log n}}{n^{α}} .$ The last series converges for $α > 1$.

Example 3.2

For the optimal bandwidth $h_{n} = n^{- \frac{1}{3}}$ and $U_{n} = n^{- α}$, where $α > 1$, one obtains $\sum_{n \geq 1} U_{n} B i a s (\hat{F_{n}} (x))^{2} \leq C \sum_{n \geq 1} n^{- \frac{4}{3}} n^{- α}$ and $\sum_{n \geq 1} U_{n} V a r [\hat{F_{n}} (x)] \leq M \sum_{n \geq 1} \frac{n^{- α}}{n^{- \frac{2}{3}}} .$ The two right-hand side series converge simultaneously if $α > \frac{5}{3} .$
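The threshold $α > \frac{5}{3}$ can be recovered by writing both general terms as powers of n and applying the Riemann series criterion. The worked step below only rearranges the two bounds of the example.

```latex
\[
U_{n} h_{n}^{4} = n^{-\alpha}\, n^{-4/3} = n^{-(\alpha + 4/3)},
\qquad
\frac{U_{n}}{h_{n}^{2}} = \frac{n^{-\alpha}}{n^{-2/3}} = n^{-(\alpha - 2/3)} .
\]
\[
\sum_{n \geq 1} n^{-(\alpha + 4/3)} < \infty \iff \alpha + \tfrac{4}{3} > 1,
\qquad
\sum_{n \geq 1} n^{-(\alpha - 2/3)} < \infty \iff \alpha - \tfrac{2}{3} > 1 \iff \alpha > \tfrac{5}{3} .
\]
```

The first condition already holds for every $α > 1$, so the binding constraint comes from the variance term.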

3.3. Kernel quantile function estimator

Let $X_{1}, X_{2}, \dots, X_{n}$ be independent and identically distributed copies of a random variable with absolutely continuous distribution function F, and denote by $X_{(1)} \leq X_{(2)} \leq \dots \leq X_{(n)}$ the corresponding order statistics. The quantile function, denoted Q, is defined to be the left-continuous inverse of F, given by $Q (p) = \inf \{x : F (x) \geq p\}, 0 < p < 1.$ A kernel quantile estimator $\hat{Q_{n}}$, based on the Nadaraya [Citation18] kernel distribution function estimator $\hat{F_{n}}$, is defined as $\hat{Q_{n}} (p) = \inf \{x : \hat{F_{n}} (x) \geq p\}, 0 < p < 1,$ and given by ${\hat{Q}}_{n} (p) = \sum_{i = 1}^{n} X_{(i)} \int_{\frac{i - 1}{n}}^{\frac{i}{n}} \frac{1}{h_{n}} K (\frac{x - p}{h_{n}}) d x,$ where K is a density function and $h_{n} ⟶ 0$ as $n ⟶ \infty$.
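The weighted-order-statistics form of ${\hat{Q}}_{n} (p)$ can be sketched directly, since each integral equals a difference of values of H, the integral of the kernel. The Epanechnikov kernel is an assumption made for illustration (symmetric with compact support); the function names, the sample and the bandwidth are hypothetical.

```python
def epanechnikov_cdf(u):
    """H(u): integral of the Epanechnikov kernel 0.75 * (1 - t^2) from -infinity to u."""
    if u <= -1.0:
        return 0.0
    if u >= 1.0:
        return 1.0
    return 0.5 + 0.75 * u - 0.25 * u ** 3

def kernel_quantile(sample, p, h, cdf=epanechnikov_cdf):
    """Sum over i of X_(i) times the integral of (1/h) K((x - p)/h) over [(i-1)/n, i/n]."""
    ordered = sorted(sample)
    n = len(ordered)
    estimate = 0.0
    for i, x_i in enumerate(ordered, start=1):
        # The integral of the rescaled kernel over the i-th cell is a difference of H values.
        weight = cdf((i / n - p) / h) - cdf(((i - 1) / n - p) / h)
        estimate += x_i * weight
    return estimate

# Hypothetical toy sample; with a small bandwidth the weight concentrates on the median.
print(kernel_quantile([5.0, 1.0, 3.0, 2.0, 4.0], p=0.5, h=0.05))
```

With a larger bandwidth the weights spread over neighbouring order statistics, which is the smoothing effect of the kernel.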

Our result of convergence is based on the expression of the MSE of the kernel quantile estimator, given by Sheather and Marron [Citation22] (Theorem 1, p. 5), when p is in the interior of $(0, 1)$ , under conditions that the kernel K is symmetric about 0 with compact support, and $Q^{(2)}$ is continuous in a neighbourhood of p.

Theorem 3.3

Supposing that $\sum_{n \geq 1} U_{n} h_{n}^{4}$ or $\sum_{n \geq 1} \frac{U_{n}}{h_{n}^{2}}$ converges (see Remark 3.1), we have $\hat{Q_{n}} (x) - Q (x) = O_{c . s . m} (\frac{1}{\sqrt{U_{n}}}) .$ (6)

Proof.

Building on Falk [Citation23] and David [Citation24], Sheather and Marron [Citation22] give the expressions of the bias and variance of $\hat{Q_{n}}$ as $\begin{aligned} B i a s [\hat{Q_{n}} (x)] & = \frac{1}{2} h_{n}^{2} Q^{(2)} (p) \int_{- \infty}^{+ \infty} x^{2} K (x) d x + o (h_{n}^{2}) + o (n^{- 1}), \end{aligned}$ and $\begin{aligned} V a r [\hat{Q_{n}} (x)] & = \frac{1}{n} p (1 - p) (Q^{'} (p))^{2} \\ & \quad - \frac{1}{n} h_{n} (Q^{'} (p))^{2} \int_{- \infty}^{+ \infty} x K (x) H (x) d x + o (\frac{h_{n}}{n}) . \end{aligned}$ So $\sum_{n \geq 1} U_{n} B i a s (\hat{Q_{n}} (x))^{2} \leq C \sum_{n \geq 1} U_{n} h_{n}^{4} < \infty$ and $\begin{aligned} \sum_{n \geq 1} U_{n} V a r [\hat{Q_{n}} (x)] & \leq M \sum_{n \geq 1} \frac{U_{n}}{n} \\ & \leq M \sum_{n \geq 1} \frac{U_{n}}{h_{n}^{2}} < \infty . \end{aligned}$ Finally, we have obtained the almost complete convergence and the complete second-order moment convergence of the kernel quantile estimator $\hat{Q_{n}}$.

Example 3.3

In the same way as in Example 3.2, and using the same $h_{n}$ and $U_{n},$ we obtain $\hat{Q_{n}} (x) - Q (x) = O_{c . s . m} (\frac{1}{\sqrt{U_{n}}}) .$

4. Simulation study

In this section, we carry out a simulation study to assess the performance of the new rate of convergence for finite samples. We give a visual impression of the quality of convergence by computing the corresponding MSE together with the value of the rate of complete second-order moment convergence and the value of the rate of almost complete convergence, based on samples obtained from two theoretical models: Gamma kernel density estimates and the recent Laplace kernel density estimates developed by Khan and Akbar [Citation25], inspired by Chen's idea [Citation26]. The Laplace kernel is defined by $K_{\mathrm{Laplace} (x, h^{1 / 2})} (u) = \frac{1}{2 \sqrt{h}} \exp (- \frac{| u - x |}{\sqrt{h}}) .$ In the second part, Normal and Epanechnikov kernel distribution estimators are used with the optimal bandwidth and different sizes of normal and exponential samples to illustrate the performance of the new rate of convergence. Finally, with the Normal model, quantile estimates (25%, 50%, 75%) are computed for the same sample sizes.
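The Laplace kernel formula quoted above translates directly into code. The sketch below only evaluates that formula; the function name and the argument order are hypothetical, and it does not reproduce the estimator or the simulations of the cited working paper.

```python
import math

def laplace_kernel(u, x, h):
    """K_{Laplace(x, h^{1/2})}(u) = exp(-|u - x| / sqrt(h)) / (2 * sqrt(h))."""
    scale = math.sqrt(h)   # the scale parameter is the square root of the bandwidth h
    return math.exp(-abs(u - x) / scale) / (2.0 * scale)

# Hypothetical evaluation at the centre of the kernel with h = 4.
print(laplace_kernel(1.0, 1.0, 4.0))
```

The kernel is a Laplace density centred at x, so it is symmetric about x and integrates to one in u.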

4.1. Kernel density MSE

We consider two kernel estimation schemes, the Laplace and Gamma kernel density estimates, with the optimal bandwidth $h = n^{- 1 / 5}.$ We simulate samples from the Exponential and Gamma densities with sample sizes n = 100, 150, 200, 250, 300, 500, 800 and 1000. The numerical results are summarized in Table 1.

Table 1. Mean simulated values and rate of convergences for density.


The CSM convergence rate is more efficient in terms of speed of convergence towards zero. Even though the AC convergence rate is nearly as efficient, we observe the good behaviour of the new rate for both kernel models.

According to the results in Table 1, the values of the CSMC rate of the kernel density estimator for the Laplace kernel (resp. Gamma kernel) are closer to the Exp MSE (resp. Gamma MSE) values than those of the ACC rate.

4.2. Kernel distribution MSE

In this case, the Normal and Epanechnikov kernel distribution estimators with the optimal bandwidth $h = n^{- 1 / 3}$ are compared on a Normal sample. The sample size varies between 100 and 1000. The numerical results are summarized in Table 2.

Table 2. Mean simulated values and rate of convergences for distribution.


Here too, we notice the fast convergence speed of the CSM rate. We also notice its similarity to the values of the MSE for both distributions. The CSM rate gives good results for the Normal kernel.

From the results given in Table 2, we notice that, for both the Normal and Epanechnikov kernel distribution estimators, the CSMC rate values are closer to the normal MSE and Exp MSE values, respectively, than those of the ACC rate.

4.3. Kernel quantile MSE

Now we calculate the MSE of the normal kernel quantile estimator. The sample size varies between 100 and 1000. The numerical results are summarized in Table 3.

Table 3. Mean simulated values and rate of convergence for quantiles.


We also remark that the CSMC rate values are closer to the MSE values of the kernel quantile estimators than the ACC rate values are.

Finally, we conclude that, in all cases, the CSMC rate of the kernel density, distribution and quantile estimators gives better results than the ACC rate of the same estimators.

5. Real data analysis

Female infertility and BMI: This study aims to investigate the body mass index (BMI) of infertile women of childbearing age. We use data from 200 participants from the Ben Badis University Hospital Centre of Constantine, Algeria, in 2018.

The first step is to test the conformity of the sample to a normal distribution. One obtains D = 0.079226 with a p-value of $0.1623$, which supports the use of the density and the distribution of the normal law. The second step is to estimate the density and the distribution functions and to represent them in a graph (Figure 1). The MSE between the real density and the normal kernel density estimate is 0.003128583, the rate of CSM convergence is 0.0008729775 and the rate of AC convergence is $0.06541179.$ For the distribution, we obtain MSE = 0.003128583, the rate of CSM convergence is $0.0047592$ and the rate of AC convergence is $0.1627624.$ For the same rate of convergence, the MSE values of the quantiles are (25%, 0.0260281), (50%, 0.04186125) and (75%, 0.058767206). Note that the results for the real data support those of the simulation.

Figure 1. Density and distribution of the real data, with the kernel density and distribution estimates.

6. Conclusion

The present work proposed a new method to obtain the convergence rate of the MSE, which is much more efficient in the CSMC case. Indeed, the CSM convergence rate gives better results than that of AC convergence. The previous results indicate that this type of convergence, used with kernel estimation, is a good alternative to almost complete convergence. This type of convergence can be applied to any estimation problem that requires the study of the MSE, for example neural networks or the least squares method, and, following a suggestion, it can also be applied in neutrosophic statistics as developed in Smarandache [Citation27] and Afzal et al. [Citation28]. Moreover, this type of convergence might be used to extend the law of large numbers.

Disclosure statement

No potential conflict of interest was reported by the authors.

Correction Statement

This article has been republished with minor changes. These changes do not impact the academic content of the article.

References

Chow D. On the rate of moment convergence of sample function of a random variables with bounded support. Bull Inst Math Acad Sin. 1988;16:177–201.
Liang H, Li D, Rosalsky A. Complete moment and integral convergence for sums of negatively associated random variables. Acta Math Sin (Engl Ser). 2010;26(3):419–432.
Qui D, Chen P. Complete moment convergence for i.i.d random variables. Statist Probab Lett. 2014;91:76–82.
Hsu P, Robbins H. Complete convergence and the law of large numbers. Proc Natl Acad Sci USA. 1947;33(2):25–31.
Gut A, Stadtmüller U. An intermediate Baum–Katz theorem. Statist Probab Lett. 2011;81(10):1486–1492.
Gut A. Marcinkiewicz laws and convergence rates in the law of large numbers for random variables with multidimensional indices. Ann Probab. 1978;6:469–482.
Gut A. Convergence rates for probabilities of moderate deviations for sums of random variables with multidimensional indices. Ann Probab. 1980;8(2):298–313.
Li D, Rao MB, Jiang T, et al. Complete convergence and almost sure convergence of weighted sums of random variables. J Theoret Probab. 1995;8(1):49–76.
Sung SH. Complete convergence for weighted sums of random variables. Statist Probab Lett. 2007;77(3):303–311.
Sung SH, Volodin A. On the rate of complete convergence for weighted sums of arrays of random elements. J Korean Math Soc. 2006;43(4):815–828.
Rosenblatt M. Remarks on some non parametric estimates of a density function. Ann Math Statist. 1956;27(3):832–837.
Parzen E. On estimation of a probability density function and mode. Ann Math Stat. 1962;33(3):1065–1076.
Habbema JDF, Hermans J, Vanden Broee K. A stepwise discriminant analysis program using density estimation. In: Bruckmnann G, editor. Comp stat 1974, Proceedings in Computational Statistics. Vienna: Physica Verlag; 1974. p. 101–110.
Hall P, Kang K-H. Bandwidth choice for non parametric classification. Ann Stat. 2005;33:284–306.
Hall P, Wand MP. On non parametric discrimination using density differences. Biometrika. 1988;75(3):541–547.
Ghosh AK, Chaudhuri P. Optimal smoothing in kernel analysis discriminant. Stat Sin. 2004;14:457–483.
Ghosh AK, Hall P. On error-rate estimation in nonparametric classification. Stat Sin. 2008;18:1081–1100.
Nadaraya EA. Some new estimates for distribution function. Theory Probab Appl. 1964;9:497–500.
Parzen E. Non parametric statistical data modelling. J Amer Stat Assoc. 1979;74(365):105–131.
Baszczynska A. Kernel estimation of cumulative distribution function of a random variable with bounded support. Statist Trans. 2016;17:541–556.
Azzalini A. A note on the estimation of a distribution function and quantiles by a kernel method. Biometrika. 1981;68(1):326–328.
Sheather SJ, Marron JS. Kernel quantile estimators. J Amer Statist Assoc. 1990;85(410):410–416.
Falk M. Relative deficiency of Kernel type estimators of quantiles. Ann Statist. 1984;12:261–268.
David HA. Order statistics. 2nd ed. New York: John Wiley; 1981.
Khan JA, Akbar A. Density estimation by Laplace kernel. Working paper. Department of Statistics, Bahauddin Zakariya, Multan, Pakistan. 2021.
Chen SX. Probability density function estimation using gamma kernels. Ann Inst Statist Math. 2000;52(3):471–480.
Smarandache F. Neutrosophic statistics vs. classical statistics, section in Nidus Idearum/superluminal physics. Vol. 7, 3rd ed. 2019. p. 117.
Afzal U, Alrweili H, Ahamd N, et al. Neutrosophic statistical analysis of resistance depending on the temperature variance of conducting material. Sci Rep. 2021;11(1):Article ID 23939.
Yu Y, et al. On the complete convergence for uncertain random variables. Soft Comput. 2022;26(3):1025–1031.