
Hoeffding-Blum-Kiefer-Rosenblatt independence test statistic on partly not identically distributed data

Pages 4006-4028 | Received 29 Sep 2019, Accepted 28 Jul 2020, Published online: 14 Aug 2020
 

Abstract

The established Hoeffding-Blum-Kiefer-Rosenblatt independence test statistic is investigated for partly not identically distributed data. Surprisingly, it turns out that the statistic has the well-known distribution-free limiting null distribution of the classical criterion under standard regularity conditions. An application is testing goodness-of-fit for the regression function in a nonparametric random effects meta-regression model, where consistency is obtained as well. Simulations investigate size and power of the approach for small and moderate sample sizes. A real data example based on clinical trials illustrates how the test can be used in applications.

2000 MSC:

Acknowledgments

The author wishes to thank the two referees for very helpful comments and suggestions.

Appendix A: proofs

Proof of Proposition 3.1.

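First, consider the stochastic process $V_n=(V_n(x,y),\ (x,y)\in\bar{\mathbb{R}}^2)$ defined by $V_n(x,y)=\sqrt{n}\,\bigl(\hat F_{(X,Y),n}(x,y)-F_{(X,Y),n}(x,y)\bigr)$, $(x,y)\in\bar{\mathbb{R}}^2$, where $F_{(X,Y),n}(x,y)=\sum_{i=1}^{k}\frac{n_i}{n}F_{(X,Y_i)}(x,y)$, $(x,y)\in\bar{\mathbb{R}}^2$.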

The crucial point is that under the null hypothesis of independence $H$ we have $F_{(X,Y_i)}(x,y)=F_X(x)F_{Y_i}(y)$, $(x,y)\in\bar{\mathbb{R}}^2$, $i=1,\dots,k$, and so $F_{(X,Y),n}(x,y)=F_X(x)F_{Y,n}(y)$, $(x,y)\in\bar{\mathbb{R}}^2$, where $F_{Y,n}(y)=\sum_{i=1}^{k}\frac{n_i}{n}F_{Y_i}(y)$, $y\in\bar{\mathbb{R}}$.

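Under the null hypothesis $H$, it follows that the stochastic process $U_n$ can be rewritten as $U_n=U_{1,n}+U_{2,n}$, where the stochastic process $U_{1,n}=(U_{1,n}(x,y),\ (x,y)\in\bar{\mathbb{R}}^2)$ is given by $U_{1,n}(x,y)=V_n(x,y)-F_X(x)V_n(\infty,y)-F_{Y,n}(y)V_n(x,\infty)$, $(x,y)\in\bar{\mathbb{R}}^2$, and the stochastic process $U_{2,n}=(U_{2,n}(x,y),\ (x,y)\in\bar{\mathbb{R}}^2)$ is defined by $U_{2,n}(x,y)=-\frac{1}{\sqrt{n}}V_n(x,\infty)V_n(\infty,y)$, $(x,y)\in\bar{\mathbb{R}}^2$.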
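
The decomposition can be checked by direct substitution. The following is a sketch under two assumptions that are not stated in this appendix: that $U_n=\sqrt{n}\,(\hat F_{(X,Y),n}-\hat F_{X,n}\hat F_{Y,n})$, which is consistent with the expression for $\frac{1}{n}\mathrm{HBKR}_n$ used in the proof of Theorem 6.1, and that the marginal empirical distribution functions are $\hat F_{X,n}(x)=\hat F_{(X,Y),n}(x,\infty)$ and $\hat F_{Y,n}(y)=\hat F_{(X,Y),n}(\infty,y)$.

```latex
\begin{align*}
\hat F_{(X,Y),n}(x,y) &= F_X(x)F_{Y,n}(y) + n^{-1/2}V_n(x,y), \\
\hat F_{X,n}(x) &= F_X(x) + n^{-1/2}V_n(x,\infty), \\
\hat F_{Y,n}(y) &= F_{Y,n}(y) + n^{-1/2}V_n(\infty,y), \\
U_n(x,y) &= \sqrt{n}\,\bigl(\hat F_{(X,Y),n}(x,y) - \hat F_{X,n}(x)\hat F_{Y,n}(y)\bigr) \\
         &= \underbrace{V_n(x,y) - F_X(x)V_n(\infty,y) - F_{Y,n}(y)V_n(x,\infty)}_{U_{1,n}(x,y)}
            \;\underbrace{-\, n^{-1/2}\,V_n(x,\infty)V_n(\infty,y)}_{U_{2,n}(x,y)} .
\end{align*}
```

The first three lines use only the definition of $V_n$ and the product form of $F_{(X,Y),n}$ under $H$; the terms $F_X(x)F_{Y,n}(y)$ cancel in the last step.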

The process $V_n$ is an empirical process based on not necessarily identically distributed random vectors. Limit results for such processes are given by Ziegler (1997). In our situation, the conditions in 4.2 in Ziegler (1997) can be easily verified and the convergence in distribution $V_n\rightsquigarrow V$ follows, where $V=(V(x,y),\ (x,y)\in\bar{\mathbb{R}}^2)$ is a centered Gaussian process with a.s. uniformly $d_2$-continuous sample paths and covariance function $v(x,y,r,s)=F_{(X,Y)}(x\wedge r,y\wedge s)-F_{(X,Y,X',Y')}(x,y,r,s)$, $(x,y),(r,s)\in\bar{\mathbb{R}}^2$, and $(X,Y,X',Y')$ denotes a random vector with distribution function $F_{(X,Y,X',Y')}(x,y,r,s)=\sum_i\rho_iF_{(X,Y_i)}(x,y)F_{(X,Y_i)}(r,s)$, $(x,y,r,s)\in\bar{\mathbb{R}}^4$.

Note that in general $F_{(X,Y,X',Y')}(x,y,r,s)=F_{(X,Y)}(x,y)F_{(X,Y)}(r,s)$, $(x,y,r,s)\in\bar{\mathbb{R}}^4$, does not hold, and the same applies to $v(x,y,r,s)=F_{(X,Y)}(x\wedge r,y\wedge s)-F_{(X,Y)}(x,y)F_{(X,Y)}(r,s)$, $(x,y,r,s)\in\bar{\mathbb{R}}^4$. For that reason, in contrast to the classical situation of identically distributed data, the process $V$ does not in general have the structure of a bivariate Brownian bridge.
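
A short calculation under $H$ illustrates why the product form fails. This is a sketch assuming that the weights $\rho_i$ are nonnegative and sum to one, and that $F_{(X,Y)}=F_XF_Y$ under $H$ with $F_Y=\sum_i\rho_iF_{Y_i}$; neither is stated explicitly in this appendix.

```latex
\begin{align*}
\sum_i \rho_i F_{(X,Y_i)}(x,y)F_{(X,Y_i)}(r,s) - F_{(X,Y)}(x,y)F_{(X,Y)}(r,s)
  &= F_X(x)F_X(r)\Bigl(\sum_i \rho_i F_{Y_i}(y)F_{Y_i}(s) - F_Y(y)F_Y(s)\Bigr), \\
\text{and for } y=s:\qquad
\sum_i \rho_i F_{Y_i}(y)^2 - F_Y(y)^2
  &= \sum_i \rho_i \bigl(F_{Y_i}(y)-F_Y(y)\bigr)^2 \;\ge\; 0 .
\end{align*}
```

The last expression is strictly positive whenever the values $F_{Y_i}(y)$ differ across groups with $\rho_i>0$, so the difference vanishes identically only in the identically distributed case.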

It is easily seen that $F_{Y,n}$ converges uniformly to $F_Y$. Applying Slutsky’s theorem, that is, Example 1.4.7 in van der Vaart and Wellner (1996), yields $(V_n,F_{Y,n})\rightsquigarrow(V,F_Y)$, and from the continuous mapping theorem, that is, Theorem 1.3.6 in van der Vaart and Wellner (1996), it follows that $U_{1,n}\rightsquigarrow U$.

Moreover, it follows from Lemma 1.10.2 (iii) in van der Vaart and Wellner (1996) that $\frac{1}{\sqrt{n}}V_n\to 0$ in probability, uniformly. The latter combined with Slutsky’s theorem yields $(V_n,\frac{1}{\sqrt{n}}V_n)\rightsquigarrow(V,0)$, and it follows from the continuous mapping theorem that $U_{2,n}\to 0$ in probability, where the convergence holds uniformly. From a further application of Slutsky’s theorem, we deduce $(U_{1,n},U_{2,n})\rightsquigarrow(U,0)$, and finally $U_n\rightsquigarrow U$ from the continuous mapping theorem again. It is clear that $U$ is a centered Gaussian process with a.s. uniformly $d_2$-continuous sample paths. The covariance function $u$ is obtained by a simple calculation.

Note that it follows from the covariance function $u$ that the stochastic process $U$ has the structure of the well-known Brownian pillow, as in the classical situation of identically distributed data. This finding is surprising, given that, in contrast to the classical situation, $V$ does not in general have the structure of a multivariate Brownian bridge. It is a special feature of the process $U$. □

Proof of Lemma 3.1.

The function $\hat F_{(X,Y),n}$ is an empirical distribution function based on not necessarily identically distributed random vectors. Limit results for such functions are given by Gänßler and Ziegler (1994). Corollary 4.1 (i) in Gänßler and Ziegler (1994) implies $\hat F_{(X,Y),n}-F_{(X,Y),n}\to 0$ in probability, where the convergence holds uniformly. It is easily seen that $F_{(X,Y),n}$ converges uniformly to $F_{(X,Y)}$. In all, we obtain $\sup_{(x,y)\in\bar{\mathbb{R}}^2}|\hat F_{(X,Y),n}(x,y)-F_{(X,Y)}(x,y)|\le\sup_{(x,y)\in\bar{\mathbb{R}}^2}|\hat F_{(X,Y),n}(x,y)-F_{(X,Y),n}(x,y)|+\sup_{(x,y)\in\bar{\mathbb{R}}^2}|F_{(X,Y),n}(x,y)-F_{(X,Y)}(x,y)|\to 0$ in probability.

This implies the statement. □

Proof of Theorem 3.1.

Because the null hypothesis $H$ is true, the convergence in distribution of $\mathrm{HBKR}_n$ to $\mathrm{HBKR}=\iint U(x,y)^2\,dF_X(x)\,dF_Y(y)$ follows immediately from Proposition 3.1 and Lemma 3.1 combined with the continuous mapping theorem. First, we consider the special case that $k=1$ is fixed. In this situation, $\mathrm{HBKR}_n$ is based on independent and identically distributed bivariate random vectors with underlying distribution function $F_{(X,Y)}$, where the only restriction on $F_{(X,Y)}$ is that the related marginal distributions are continuous. Under the stated null hypothesis of independence, $\mathrm{HBKR}_n$ has the distribution-free distribution of the classical Hoeffding-Blum-Kiefer-Rosenblatt test statistic based on independent and identically distributed bivariate random vectors with continuous marginal distributions. On the one hand, it follows from the results in Blum, Kiefer, and Rosenblatt (1961) that $\mathrm{HBKR}_n$ converges in distribution to a real-valued random variable $\widetilde{\mathrm{HBKR}}$, say, where $\widetilde{\mathrm{HBKR}}$ is distribution-free and has a continuous and strictly increasing distribution function and the characteristic function $\varphi_{\widetilde{\mathrm{HBKR}}}(t)=\prod_{j,\ell=1}^{\infty}\bigl(1-\frac{2it}{\pi^4j^2\ell^2}\bigr)^{-\frac{1}{2}}$, $t\in\mathbb{R}$.

On the other hand, it has already been shown that $\mathrm{HBKR}_n$ converges in distribution to $\mathrm{HBKR}$. In all, $\widetilde{\mathrm{HBKR}}$ and $\mathrm{HBKR}$ have the same distribution. Because $F_{(X,Y)}$ is an arbitrary uniformly continuous distribution function with continuous marginal distributions, the latter finding is also valid for general $k$. □
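
The characteristic function $\prod_{j,\ell\ge 1}(1-2it/(\pi^4j^2\ell^2))^{-1/2}$ is that of the weighted series $\sum_{j,\ell\ge 1}Z_{j\ell}^2/(\pi^4j^2\ell^2)$ with independent standard normal $Z_{j\ell}$, which suggests a simple way to approximate critical values by simulation. The following is a minimal sketch, not the procedure used in the article; the function names, the truncation level `J`, the number of draws, and the seed are all illustrative assumptions.

```python
import numpy as np


def truncated_limit_mean(J):
    """Exact mean of the truncated series: sum_{j,l<=J} 1 / (pi^4 j^2 l^2)."""
    s = np.sum(1.0 / np.arange(1, J + 1) ** 2)
    return s * s / np.pi ** 4


def sample_limit(num_draws, J=40, seed=0):
    """Draw from sum_{j,l<=J} Z_{jl}^2 / (pi^4 j^2 l^2) with Z_{jl} iid N(0, 1).

    J truncates the infinite double series (an approximation).
    """
    rng = np.random.default_rng(seed)
    j = np.arange(1, J + 1, dtype=float)
    weights = 1.0 / (np.pi ** 4 * np.outer(j ** 2, j ** 2))  # shape (J, J)
    z2 = rng.standard_normal((num_draws, J, J)) ** 2
    # Contract the two series indices against the weight matrix.
    return np.tensordot(z2, weights, axes=([1, 2], [0, 1]))


draws = sample_limit(num_draws=2000)
# Approximate upper 5% critical value of the limiting null distribution.
crit_95 = np.quantile(draws, 0.95)
```

The untruncated series has mean $(\pi^2/6)^2/\pi^4=1/36$, which gives a quick sanity check on the simulated draws; the truncation shifts the mean slightly downward.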

Proof of Lemma 6.1.

Under the null hypothesis, $f=g$. This implies $Y=\eta+\varepsilon$. Because $X$, $\eta$, and $\varepsilon$ are independent, it follows that $X$ and $Y$ are independent. Now, we consider alternatives. Let $f\neq g$ be true, and suppose, for the sake of contradiction, that $X$ and $Y$ are independent. It is well known that the distribution of a random vector $W$ with values in $\mathbb{R}^d$, $d\in\mathbb{N}$, is uniquely determined by its characteristic function $\varphi_W(t)=\mathrm{E}(e^{it^{\top}W})$, $t\in\mathbb{R}^d$. The independence assumption implies that for all $t=(t_1,t_2)\in\mathbb{R}^2$, $\varphi_{(X,Y)}(t)=\varphi_X(t_1)\varphi_Y(t_2)$.

Because $X$, $\eta$, and $\varepsilon$ are independent, the latter is equivalent to $\varphi_{(X,f(X)-g(X))}(t)=\varphi_X(t_1)\varphi_{f(X)-g(X)}(t_2)$ for all $t=(t_1,t_2)\in\mathbb{R}^2$. We obtain the independence of $X$ and $f(X)-g(X)$. Because $I$ is the support of the distribution of $X$, it follows that the map $x\mapsto f(x)-g(x)$, $x\in I$, is constant. From $f(x_0)=0=g(x_0)$, we deduce $f=g$, which contradicts the assumption $f\neq g$ and completes the proof. □

Proof of Theorem 6.1.

We define the stochastic process $W_n=(W_n(x,y),\ (x,y)\in\bar{\mathbb{R}}^2)$ by $W_n(x,y)=\hat F_{(X,Y),n}(x,y)-\hat F_{X,n}(x)\hat F_{Y,n}(y)$, $(x,y)\in\bar{\mathbb{R}}^2$, and the map $W(x,y)=F_{(X,Y)}(x,y)-F_X(x)F_Y(y)$, $(x,y)\in\bar{\mathbb{R}}^2$.

It follows from Lemma 3.1 that $\sup_{(x,y)\in\bar{\mathbb{R}}^2}|W_n(x,y)-W(x,y)|\le 3\sup_{(x,y)\in\bar{\mathbb{R}}^2}|\hat F_{(X,Y),n}(x,y)-F_{(X,Y)}(x,y)|\to 0$ in probability. The latter combined with Lemma 3.1 again yields $(W_n,\hat F_{(X,Y),n})\to(W,F_{(X,Y)})$ in probability, uniformly. Regarding the expressions $\frac{1}{n}\mathrm{HBKR}_n=\int W_n(x,y)^2\,d\hat F_{(X,Y),n}(x,y)$ and $\Delta=\int W(x,y)^2\,dF_{(X,Y)}(x,y)$, the continuous mapping theorem implies the convergence of $\mathrm{HBKR}_n/n$ to $\Delta$ in probability. It remains to show that $\Delta>0$. Denote by $\varepsilon$ a real-valued random variable with distribution function $F_\varepsilon(e)=\sum_i\rho_iF_{\varepsilon_i}(e)$, $e\in\bar{\mathbb{R}}$, such that $X$, $\eta$, and $\varepsilon$ are independent. Setting $Y=f(X)-g(X)+\eta+\varepsilon$, we have $F_{(X,Y)}(x,y)=P(X\le x,Y\le y)$, $(x,y)\in\bar{\mathbb{R}}^2$, as well as $F_Y(y)=P(Y\le y)$, $y\in\bar{\mathbb{R}}$, and under the alternative it follows from Lemma 6.1 that $X$ and $Y$ are not independent. In addition, it is easy to see that $F_{(X,Y)}$ is absolutely continuous. In all, $\Delta>0$ follows from Yanagimoto (1970). □
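
Because $\hat F_{(X,Y),n}$ puts mass $1/n$ on each observed pair, the integral $\int W_n^2\,d\hat F_{(X,Y),n}$ is an average of $W_n^2$ over the sample points, so $\mathrm{HBKR}_n$ reduces to a finite sum. The following is a minimal sketch of that computation, assuming the pooled sample is passed as two arrays of equal length; the function name `hbkr_statistic` is hypothetical, and the $O(n^2)$ indicator matrices are chosen for clarity rather than efficiency.

```python
import numpy as np


def hbkr_statistic(x, y):
    """Return n * integral of W_n^2 d(F_hat) = sum_j W_n(x_j, y_j)^2.

    W_n = F_hat_joint - F_hat_X * F_hat_Y, evaluated at the sample points.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # le_x[j, i] is True when x_i <= x_j (same for y).
    le_x = x[None, :] <= x[:, None]
    le_y = y[None, :] <= y[:, None]
    f_joint = (le_x & le_y).mean(axis=1)  # joint empirical d.f. at (x_j, y_j)
    f_x = le_x.mean(axis=1)               # marginal empirical d.f. of X at x_j
    f_y = le_y.mean(axis=1)               # marginal empirical d.f. of Y at y_j
    w = f_joint - f_x * f_y
    return float(np.sum(w ** 2))


# Perfect dependence: y equals x, so the statistic grows linearly in n.
t = np.arange(1, 101)
stat_dependent = hbkr_statistic(t, t)

# Full product grid: the empirical joint d.f. factorizes exactly.
grid_x = np.repeat(np.arange(5), 5)
grid_y = np.tile(np.arange(5), 5)
stat_grid = hbkr_statistic(grid_x, grid_y)
```

In the perfectly dependent example, $W_n(x_j,y_j)=u_j-u_j^2$ with $u_j=j/n$, so $\mathrm{HBKR}_n/n$ is close to $\int_0^1(u-u^2)^2\,du=1/30>0$, in line with the consistency argument; on the product grid the statistic vanishes up to rounding.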
