
Testing for constancy in varying coefficient models

M. Ahkim & A. Verhasselt
Pages 890-911 | Received 11 Apr 2016, Accepted 18 Feb 2017, Published online: 02 Jan 2018
 

ABSTRACT

We consider varying coefficient models, which extend the classical linear regression model in the sense that the regression coefficients are replaced by functions of certain variables (for example, time); the covariates are also allowed to depend on other variables. Varying coefficient models are popular in longitudinal and panel data studies, and have been applied in fields such as finance and the health sciences. We consider longitudinal data and estimate the coefficient functions with the flexible B-spline technique. An important question in a varying coefficient model is whether an estimated coefficient function is statistically different from a constant (or from zero). We develop testing procedures based on the estimated B-spline coefficients, making use of convenient properties of the B-spline basis. Our method allows longitudinal data in which repeated measurements on an individual may be correlated. We obtain the asymptotic null distribution of the test statistic. The power of the proposed testing procedures is illustrated on simulated data, where we highlight the importance of including the correlation structure of the response variable, and on real data.
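To make the setting concrete, here is a minimal sketch, not the authors' code, of fitting a varying coefficient model Y(t) = β_0(t) + β_1(t)X(t) + error by B-spline least squares; the data-generating process, knot placement, and spline degree are all illustrative assumptions.

# Minimal illustration (not the authors' implementation) of estimating
# Y(t) = beta_0(t) + beta_1(t) * X(t) + error with B-spline expansions of
# the coefficient functions; knots, degree, and data are illustrative.
import numpy as np
from scipy.interpolate import BSpline

rng = np.random.default_rng(0)
n_obs = 500
t = np.sort(rng.uniform(0.0, 1.0, n_obs))     # pooled observation times
x = rng.normal(size=n_obs)                    # scalar covariate X(t)
y = 1.0 + np.sin(2 * np.pi * t) * x + 0.2 * rng.normal(size=n_obs)

def bspline_basis(times, degree=3, n_inner_knots=6):
    """B-spline basis matrix with clamped knots on [0, 1]."""
    inner = np.linspace(0.0, 1.0, n_inner_knots + 2)[1:-1]
    knots = np.r_[[0.0] * (degree + 1), inner, [1.0] * (degree + 1)]
    m = len(knots) - degree - 1               # number of basis functions
    # Evaluating a spline whose coefficient array is the identity matrix
    # returns all basis functions at once: shape (len(times), m).
    return BSpline(knots, np.eye(m), degree)(times)

B = bspline_basis(t)                          # same basis for both coefficients
design = np.hstack([B, B * x[:, None]])       # [intercept block | slope block]
coef, *_ = np.linalg.lstsq(design, y, rcond=None)
m = B.shape[1]
beta1_hat = B @ coef[m:]                      # estimated beta_1 on the grid t
print(np.max(np.abs(beta1_hat - np.sin(2 * np.pi * t))))  # small if fit is good

The constancy test studied in the paper then asks whether a fitted coefficient block is compatible with a constant function; by the partition-of-unity property of B-splines, a constant β_p corresponds to equal B-spline coefficients.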


Acknowledgments

We would like to thank the Editor and the referees for their detailed reading and very valuable comments on the manuscript.

Funding

M. Ahkim's research was supported by the Special Research Fund (BOF) of Universiteit Antwerpen [grant number 42FA070300FFB5994]. A. Verhasselt gratefully acknowledges support from the IAP Research Network P7/06 of the Belgian State (Belgian Science Policy) and the FWO [grant number 1.5.137.13N]. The infrastructure of the VSC—Flemish Supercomputer Center, funded by the Hercules Foundation and the Flemish Government—department EWI, was used for the simulations.

Appendix A. Notation

1.

For a real-valued function h on the time domain 𝒯, ‖h‖_∞ denotes its supremum norm, while for a real vector-valued function h = (h_1, …, h_m)′, we let its supremum norm be ‖h‖_∞ = max_{1 ⩽ i ⩽ m} ‖h_i‖_∞.

2.

Let ρ_n = max_{0 ⩽ p ⩽ d} inf_{g ∈ 𝒮(q_p, K_p)} ‖β_p − g‖_∞, where 𝒮(q_p, K_p) denotes the space of splines of degree q_p with knot sequence K_p. Define the function g*(t) = (g*_0(t), …, g*_d(t))′ such that ‖β_p − g*_p‖_∞ = inf_{g ∈ 𝒮(q_p, K_p)} ‖β_p − g‖_∞ for p = 0, …, d. Let α* denote the corresponding coefficient vector, i.e., g*_p(t) = ∑_l α*_{pl} B_{pl}(t; q_p). Throughout we assume that lim_{n → ∞} ρ_n = 0, i.e., the unknown functions can be uniformly approximated by spline functions of a certain degree.
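For background (not an assumption restated from the paper), standard spline approximation theory (e.g., de Boor's A Practical Guide to Splines) quantifies when this holds: in the notation above, with h_p the largest knot spacing in K_p and assuming β_p has q_p + 1 continuous derivatives,

\[
\inf_{g \in \mathcal{S}(q_p, K_p)} \|\beta_p - g\|_{\infty}
\;\le\; C\, h_p^{\,q_p+1}\, \bigl\|\beta_p^{(q_p+1)}\bigr\|_{\infty},
\]

so ρ_n → 0 whenever the knot mesh sizes tend to zero and the β_p are sufficiently smooth.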

Appendix B. Assumptions

Assumption 1.

1.

The observation times t_{ij}, j = 1, …, N_i, i = 1, …, n, are chosen independently according to a distribution function F_T(t) on 𝒯. Moreover, they are independent of the response and covariate process {(Y_i(t), X_i(t))}, i = 1, …, n. The distribution function F_T(t) has a Lebesgue density f_T(t) that is bounded away from zero and infinity, uniformly over 𝒯; that is, there exist positive constants M_1 and M_2 such that M_1 ⩽ f_T(t) ⩽ M_2 for t ∈ 𝒯.

2.

The eigenvalues η_0(t), …, η_d(t) of E[X(t)X(t)′] are bounded away from zero and infinity, uniformly over 𝒯; that is, there exist positive constants M_3 and M_4 such that M_3 ⩽ η_0(t) ⩽ ⋅⋅⋅ ⩽ η_d(t) ⩽ M_4 for t ∈ 𝒯.

3.

There exists a positive constant M_5 such that |X_p(t)| ⩽ M_5 for t ∈ 𝒯 and p = 0, …, d.

4.

There exists a positive constant M_6 such that E[ε²(t)] ⩽ M_6 for t ∈ 𝒯, where ε(t) denotes the error process.

5.

.

These conditions are commonly used (e.g., Huang, Wu and Zhou Citation2004) and are satisfied in many practical examples. As for Assumption 1.1, when dealing with deterministic time points we can replace this assumption by the requirement that the empirical distribution F_N of the time points converges to some distribution function F_T having a Lebesgue density f_T which is bounded away from zero and infinity, uniformly over 𝒯, where F_N(t) = N^{−1} ∑_{i=1}^n ∑_{j=1}^{N_i} 1{t_{ij} ⩽ t} with N = ∑_{i=1}^n N_i, and 1{·} is the indicator function (Huang, Wu and Zhou Citation2004). Note that we do not assume zero modelling bias, since we allow the number of knots to grow to infinity.

Appendix C. Theorem of Tan (Citation1977)

In the proofs of Theorems 3 and 4 we need the following lemma, based on Theorem 3.1 of Tan (Citation1977).

Lemma 1.

Let Z ∼ N(μ, V) with V invertible and Q = Z′AZ, where A is a real symmetric matrix. Then Q = ∑_{i=1}^k λ_i χ²(r_i, θ_i²), where the χ²(r_i, θ_i²) are independent non-central chi-square variables, λ_1, …, λ_k are the non-zero distinct eigenvalues of VA with algebraic multiplicities r_1, …, r_k, respectively, and VA has the spectral decomposition VA = ∑_{j=1}^k λ_j E_j. Moreover, we have that θ_i² = μ′V^{−1}E_iμ, so that ∑_{i=1}^k λ_i θ_i² = μ′Aμ.
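A quick Monte Carlo sanity check of this representation (illustrative code, not from the paper; the dimension and matrices are arbitrary):

# Monte Carlo illustration of Lemma 1: Q = Z'AZ with Z ~ N(mu, V) matches a
# weighted sum of independent non-central chi-squares built from eig(VA).
import numpy as np

rng = np.random.default_rng(1)
p, n_draws = 4, 200_000
M = rng.normal(size=(p, p))
V = M @ M.T + p * np.eye(p)                    # positive definite covariance
A = rng.normal(size=(p, p)); A = (A + A.T) / 2 # real symmetric matrix
mu = rng.normal(size=p)

Z = rng.multivariate_normal(mu, V, size=n_draws)
Q_direct = np.einsum("ij,jk,ik->i", Z, A, Z)   # Z' A Z, row by row

# V^{1/2} A V^{1/2} is symmetric and shares its non-zero eigenvalues with VA.
w, U = np.linalg.eigh(V)
V_half = (U * np.sqrt(w)) @ U.T
lam, P = np.linalg.eigh(V_half @ A @ V_half)
delta = P.T @ np.linalg.solve(V_half, mu)      # non-centrality means
W = rng.normal(size=(n_draws, p)) + delta      # W_i^2 ~ chi2(1, delta_i^2)
Q_mixture = (W**2) @ lam                       # weighted chi-square mixture

print(Q_direct.mean(), Q_mixture.mean())       # the two should nearly agree
print(Q_direct.std(), Q_mixture.std())

Matching means and standard deviations (and, if desired, full histograms) corroborate the mixture representation.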

Appendix D. Proof of Theorem 1

Proof.

Under hypothesis H_1 we have that β_p(t) = ∑_l α_{pl} B_{pl}(t; q_p) with α_{pl} = c_p for l = 1, …, m_p; p = 0, …, d. Since the B-spline basis forms a partition of unity (see the identity displayed below), this means β_p(t) = c_p for all t. Hence, we obtain that the mean contribution to the quadratic form vanishes, so that ∑_i λ_i θ_i² = 0.
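The B-spline property invoked here is the partition of unity, which for a clamped knot sequence holds on all of 𝒯:

\[
\sum_{l=1}^{m_p} B_{pl}(t; q_p) = 1 \quad \text{for all } t \in \mathcal{T},
\qquad \text{hence} \qquad
\beta_p(t) = \sum_{l=1}^{m_p} c_p \, B_{pl}(t; q_p) = c_p .
\]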

The specified distribution Q_1 ∼ ∑_{i=1}^k λ_i χ²(r_i, θ_i²) follows from Lemma 1 in Appendix C, with 0 = ∑_i λ_i θ_i². We now show that ∑_{i=1}^k r_i = N − dim and that all θ_i = 0. Note that an idempotent matrix has eigenvalues 0 and 1 only, so the space decomposes into the eigenspaces corresponding to λ = 1 and λ = 0. Hence, in order to find the eigenvectors corresponding to a non-zero eigenvalue we can restrict to the eigenspace of λ = 1; this also means that the λ_i are eigenvalues of the restricted matrix. Since that matrix is positive definite and 0 = ∑_i λ_i θ_i², we obtain that all θ_i = 0. The eigenspace of λ = 1 has dimension N − dim, and therefore we have ∑_{i=1}^k r_i = N − dim.

It remains to show that Q_1 and Q_2 are independent. By Theorem 3.2 of Tan (Citation1977), Q_1 and Q_2 are independent if and only if (A1) holds; verifying this equation takes little effort.
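Condition (A1) is a Craig–Sakamoto-type orthogonality requirement. As an illustration (the code and matrices are ours, not the paper's): for Z ∼ N(μ, V), the quadratic forms Q_1 = Z′AZ and Q_2 = Z′BZ are independent when AVB = 0.

# Numerical illustration of the Craig-type independence criterion AVB = 0.
import numpy as np

rng = np.random.default_rng(2)
p = 5
M = rng.normal(size=(p, p))
V = M @ M.T + p * np.eye(p)          # positive definite covariance

# Build symmetric A, B with A V B = 0 by choosing a' V b = 0:
a = rng.normal(size=p)
b = rng.normal(size=p)
b -= (a @ V @ b) / (a @ V @ a) * a   # V-orthogonalize b against a
A = np.outer(a, a)
B = np.outer(b, b)
assert np.allclose(A @ V @ B, 0.0)

Z = rng.multivariate_normal(rng.normal(size=p), V, size=200_000)
Q1 = np.einsum("ij,jk,ik->i", Z, A, Z)
Q2 = np.einsum("ij,jk,ik->i", Z, B, Z)
print(np.corrcoef(Q1, Q2)[0, 1])     # close to 0: Q1 and Q2 are independent

Here Q_1 = (a′Z)² and Q_2 = (b′Z)², and a′Z, b′Z are jointly normal with covariance a′Vb = 0, so the independence is exact; the vanishing sample correlation is a quick empirical check.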

Appendix E. Proof of Theorem 2

Proof.

The proof of this theorem is along the same lines as the proof of Theorem 3 in Li, Xu and Liu (Citation2011); some of the details are, however, different due to our longitudinal setting. Recall the definition of α* (see Appendix A). Rewriting the test statistic under hypothesis H_0 and applying Lemma 1, we obtain a weighted chi-square representation, where γ² and the θ_i² are specified in Lemma 1. To prove Theorem 2, we need to show that (A2)

Some mathematical preparation is needed to prove (A2). The Takagi factorization yields a matrix G ∈ ℝ^{(N − dim) × N}. Throughout, ‖A‖ (resp. ‖c‖) denotes the Frobenius (resp. Euclidean) norm of a matrix A (vector c), and ⟨a, b⟩ denotes the standard inner product of the vectors a and b. Note that if the two covariance structures coincide, there is nothing to prove, since in that case ξ_0 = ξ_1 and η_0 = η_1; so we proceed with the case in which they differ. Define an orthogonal transformation T ∈ ℝ^{(N − dim) × (N − dim)} whose first row is the prescribed unit vector. From the resulting expressions we obtain (A3), since for a mean-zero normal variable Z we have the property E|Z| = σ√(2/π). Moreover, TGG′T′ = I_{N − dim}. We next bound the relevant variance term. Let b = (b_1, b_2, …, b_N) denote the first row of the orthogonal matrix TG, so that ‖b‖ = 1, and denote by c_1, …, c_N the columns of the covariance matrix. Using an inequality obtained from the Cauchy–Schwarz inequality, together with symmetry, we can continue from equation (A3) to obtain (A4). Analogously to (A4) we obtain (A5), since for any orthogonal transformation the variance of the first component of the transformed vector is given by the entry with index (1, 1) of the transformed covariance matrix. Note that the two normal vectors involved are independent multivariate normal random vectors, because, on the one hand, of the structure of the transformation and, on the other hand, by the same argument as in (A1). Hence
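One concrete ingredient here, constructing an orthogonal T with a prescribed first row, can be done with a Householder reflection; a small self-contained sketch (illustrative, not the paper's construction):

# Build an orthogonal matrix whose first row is a given unit vector u, via a
# Householder reflection H = I - 2 vv'/(v'v) with v = u - e1. H is symmetric
# and orthogonal and maps e1 to u, so its first row (= first column) is u.
import numpy as np

def orthogonal_with_first_row(u):
    u = np.asarray(u, dtype=float)
    u = u / np.linalg.norm(u)                 # ensure a unit vector
    e1 = np.zeros_like(u); e1[0] = 1.0
    v = u - e1
    if np.allclose(v, 0.0):                   # u already equals e1
        return np.eye(len(u))
    return np.eye(len(u)) - 2.0 * np.outer(v, v) / (v @ v)

T = orthogonal_with_first_row(np.array([3.0, 4.0, 12.0]) / 13.0)
print(np.allclose(T @ T.T, np.eye(3)), T[0])  # orthogonal, first row = u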

Fix a t > 0; then (A6) holds. For the last inequality, since η_1 and ξ_1 are independent, and η_1 and ξ_0 are independent, we have the stated bound, where f denotes the density function of the corresponding multivariate normal distribution.

Continuing from equation (A6), with k a positive real number, we obtain (A7), where the bound involves the maximum of the density function of ξ_0 (Markov's inequality, P(ξ_0 > k) ⩽ E[ξ_0]/k, is applied in (A7)). Substituting the appropriate choice of k in (A7) and invoking (A6), we obtain that for all t ⩾ 0

On the other hand, we obtain in a similar fashion (A8), where the bound now involves the maximum of the density function of the random variable η_0. Substituting the appropriate choice of k in (A8) finally establishes (A9).

Note that H′H and GG′ are idempotent matrices, so 0 and 1 are their only eigenvalues. Then, by (A3), (A5), and (A9), the result follows.

Appendix F. Rate of convergence

In Theorem 2 we assume (9). We shed more light on this rate by assuming that N_max/N_min is bounded (N_max = max_{i = 1, …, n} N_i and N_min = min_{i = 1, …, n} N_i) and that N_max/n → 0. Suppose that subjects with an equal number of repeated measurements have the same time points; we do not need this assumption if the correlation structure does not depend on time, as is the case for any time-independent correlation structure.

For the first part we use a fact from Li, Xu and Liu (Citation2011); the first part is then bounded by

Bounding the maximum of the density

For the second part, we note that there is no closed-form expression for the density function of a linear combination of chi-square variables (see Bausch (Citation2013), among others). However, we can obtain a reasonable bound on the maximum of the density of ∑_{i=1}^k λ_i χ²(r_i).

First, it cannot hold that r_i = 1 for all i. To prove this, suppose otherwise, i.e., r_i = 1 for all i. Then, by Theorem 1, we have k = ∑_{i=1}^k r_i = N − dim. Next, we obtain a bound on k. We argue, as in the proof of Theorem 1, that to find a bound on k we may restrict to the eigenspace of eigenvalue 1. Restricting to this space, we only need to count the number of positive eigenvalues of W^{1/2}VW^{1/2}, which is a block diagonal matrix. By the restriction on the time points (see above), W^{1/2}VW^{1/2} contains at most N_max − N_min + 1 different block matrices, with dimensions not exceeding N_max. Hence, the number of distinct positive eigenvalues does not exceed N_max(N_max − N_min + 1), i.e., k ⩽ N_max(N_max − N_min + 1). By assumption all r_i = 1, and thus (A10) should hold. Divide (A10) by N. Since N_max/N_min is bounded by C > 0 and N_max/n → 0, and using the fact that N ⩾ nN_min, we obtain from the previous inequality that the left-hand side is 1 + o(1), while the right-hand side is o(1). This is a contradiction. Hence, there is a 1 ⩽ j ⩽ k such that r_j > 1.
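Spelling out the division by N, under the reading N − dim ⩽ N_max(N_max − N_min + 1) of (A10) suggested by the surrounding argument:

\[
\frac{N-\dim}{N}
\;\le\; \frac{N_{\max}\,(N_{\max}-N_{\min}+1)}{N}
\;\le\; \frac{N_{\max}\,(N_{\max}-N_{\min}+1)}{n\,N_{\min}}
\;\le\; \frac{C\,(N_{\max}-N_{\min}+1)}{n} \;\longrightarrow\; 0,
\]

while the left-hand side is 1 + o(1), giving the contradiction.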

Also, we can write ∑_{i=1}^k λ_i χ²(r_i) as the sum of a scaled chi-square variable and a remaining part, where λ_max ≔ max_i λ_i is assumed to be an eigenvalue with an eigenvector in the eigenspace considered above. The density of this sum is a convolution, which (after a small calculation) admits an explicit bound. Moreover, by Theorem 2.1 of Wolkowicz and Styan (Citation1980) we can bound λ_max in terms of traces of V, since V contains only ones on its diagonal. Hence, we have derived the desired bound on the density maximum.
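For reference, the trace bound of Wolkowicz and Styan (1980, Theorem 2.1) for a symmetric N × N matrix V reads as follows; with ones on the diagonal, m = tr(V)/N = 1:

\[
\lambda_{\max}(V) \;\le\; m + s\sqrt{N-1}
\;=\; 1 + \sqrt{(N-1)\left(\frac{\operatorname{tr}(V^2)}{N} - 1\right)},
\qquad
s^2 = \frac{\operatorname{tr}(V^2)}{N} - m^2 .
\]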

Bound on (9)

By the discussion above, we have the following bound on (9)
