Variable selection for longitudinal varying coefficient errors-in-variables models

Pages 3713-3738 | Received 17 Dec 2019, Accepted 19 Jul 2020, Published online: 08 Aug 2020
 

Abstract

In this paper, we investigate variable selection for varying coefficient errors-in-variables (EV) models with longitudinal data in which some covariates are measured with additive errors. We propose a variable selection method based on a bias-corrected penalized quadratic inference function (pQIF), combining basis-function approximation of the coefficient functions with a bias-corrected quadratic inference function (QIF) and shrinkage estimation. The proposed method accommodates both the measurement errors in the covariates and the within-subject correlation, and it simultaneously estimates and selects the non-zero nonparametric coefficient functions. With an appropriate choice of the tuning parameters, we establish the consistency of the variable selection procedure and the sparsity of the regularized estimators. The finite-sample performance of the proposed method is assessed through simulation studies, and its utility is further demonstrated in a real data analysis.

Acknowledgments

We thank the editor and reviewers for their helpful comments that significantly improved the manuscript.

Appendix 1. Derivation of Equation (9)

According to the above, we can see that $u_{ij}=(u_{ij1},u_{ij2},\ldots,u_{ijq})^T$,
$$B_{ij}=I_q\otimes B(t_{ij})=\begin{pmatrix}B(t_{ij})&0&\cdots&0\\0&B(t_{ij})&\cdots&0\\\vdots&\vdots&\ddots&\vdots\\0&0&\cdots&B(t_{ij})\end{pmatrix}_{Lq\times q},\qquad \tilde u_{ij}=B_{ij}u_{ij}=\begin{pmatrix}B(t_{ij})u_{ij1}\\B(t_{ij})u_{ij2}\\\vdots\\B(t_{ij})u_{ijq}\end{pmatrix}_{Lq\times 1},$$
$$\tilde u_i=(\tilde u_{i1},\tilde u_{i2},\ldots,\tilde u_{in_i})^T=\begin{pmatrix}\tilde u_{i1}^T\\\tilde u_{i2}^T\\\vdots\\\tilde u_{in_i}^T\end{pmatrix}=\begin{pmatrix}u_{i11}B^T(t_{i1})&u_{i12}B^T(t_{i1})&\cdots&u_{i1q}B^T(t_{i1})\\u_{i21}B^T(t_{i2})&u_{i22}B^T(t_{i2})&\cdots&u_{i2q}B^T(t_{i2})\\\vdots&\vdots&&\vdots\\u_{in_i1}B^T(t_{in_i})&u_{in_i2}B^T(t_{in_i})&\cdots&u_{in_iq}B^T(t_{in_i})\end{pmatrix}_{n_i\times Lq}.$$

For simplicity, denote $\Gamma=A_i^{-1/2}M_kA_i^{-1/2}=(\gamma_{j_1j_2})_{n_i\times n_i}$ and $B_i=(B(t_{i1}),B(t_{i2}),\ldots,B(t_{in_i}))$, and let $\sigma_l^2=\operatorname{var}(u_{ijl})$, $l=1,2,\ldots,q$. Since $\operatorname{cov}(u_{ij_1},u_{ij_2})=0$ for $j_1\neq j_2$, all cross-time terms vanish in expectation, so we can get that $D_i^{(k)}=E(\tilde u_i^T\Gamma\tilde u_i)$ is block diagonal:
$$D_i^{(k)}=E(\tilde u_i^T\Gamma\tilde u_i)=\begin{pmatrix}(D_i^{(k)})_{11}&0&\cdots&0\\0&(D_i^{(k)})_{22}&\cdots&0\\\vdots&\vdots&\ddots&\vdots\\0&0&\cdots&(D_i^{(k)})_{qq}\end{pmatrix},$$
with $l$th diagonal block
$$(D_i^{(k)})_{ll}=\sum_{j=1}^{n_i}E(u_{ijl}^2)\,\gamma_{jj}B(t_{ij})B^T(t_{ij})=\sigma_l^2\sum_{j=1}^{n_i}\gamma_{jj}B(t_{ij})B^T(t_{ij})=\sigma_l^2\,B_i\operatorname{diag}(\Gamma)B_i^T.$$
Therefore
$$D_i^{(k)}=\begin{pmatrix}\sigma_1^2B_i\operatorname{diag}(\Gamma)B_i^T&0&\cdots&0\\0&\sigma_2^2B_i\operatorname{diag}(\Gamma)B_i^T&\cdots&0\\\vdots&\vdots&\ddots&\vdots\\0&0&\cdots&\sigma_q^2B_i\operatorname{diag}(\Gamma)B_i^T\end{pmatrix}=\Sigma_u\otimes B_i\operatorname{diag}(\Gamma)B_i^T.$$

This completes the derivation of Equation (9).
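The closed form just derived can be sanity-checked numerically. The sketch below (numpy only; the basis functions, weight matrix $\Gamma$, and dimensions are hypothetical stand-ins, not the paper's data) compares a Monte Carlo estimate of $E(\tilde u_i^T\Gamma\tilde u_i)$ with $\Sigma_u\otimes B_i\operatorname{diag}(\Gamma)B_i^T$:

```python
import numpy as np

rng = np.random.default_rng(0)
ni, q, L = 5, 2, 3                       # n_i observations, q error-prone covariates, L basis functions
t = np.linspace(0.1, 0.9, ni)
B = np.vstack([t**k for k in range(L)])  # L x n_i; columns play the role of B(t_ij) (polynomial stand-in)
Sigma_u = np.array([[1.0, 0.0],
                    [0.0, 0.5]])         # measurement-error covariance (diagonal, as in the derivation)
G = rng.normal(size=(ni, ni))
Gamma = G @ G.T                          # symmetric stand-in for A_i^{-1/2} M_k A_i^{-1/2}

# Build u~_i: row j is (u_ij1 B(t_ij)^T, ..., u_ijq B(t_ij)^T), so u~_i is n_i x Lq
reps = 200_000
U = rng.multivariate_normal(np.zeros(q), Sigma_u, size=(reps, ni))        # reps x n_i x q
ut = np.concatenate([U[:, :, [l]] * B.T[None, :, :] for l in range(q)], axis=2)

# Monte Carlo average of u~^T Gamma u~ versus the closed form
mc = (ut.transpose(0, 2, 1) @ Gamma @ ut).mean(axis=0)
closed = np.kron(Sigma_u, B @ np.diag(np.diag(Gamma)) @ B.T)              # Sigma_u (x) B_i diag(Gamma) B_i^T

rel_err = np.linalg.norm(mc - closed) / np.linalg.norm(closed)
print(rel_err)   # small relative error
```

The block ordering of the columns of `ut` ($l$-major, basis index inner) matches the Kronecker convention `np.kron(Sigma_u, ...)`.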

Appendix 2. Proof of theorems

Lemma 1.

Suppose Conditions C2 and C8 hold and $K=O(N^{1/(2r+1)})$. Then there exists a constant $c_0$ such that
$$\sup_{t\in[0,1]}\bigl|\theta_l(t)-B^T(t)\beta_{l0}\bigr|\le c_0K^{-r},\qquad l=1,2,\ldots,q.\tag{22}$$

Lemma 1 is Corollary 6.21 in Schumaker (2007); the proof is therefore omitted here.
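Lemma 1's approximation rate can be illustrated with the simplest spline: for a piecewise-linear interpolant (order $r=2$) of a smooth $\theta$, the sup-norm error decays like $K^{-2}$ in the number of knot intervals $K$. A minimal numerical sketch (with `np.interp` as a stand-in for the B-spline approximation and a hypothetical smooth coefficient function):

```python
import numpy as np

theta = lambda t: np.sin(2 * np.pi * t)   # a smooth coefficient function on [0, 1]
tt = np.linspace(0, 1, 10_001)            # fine grid used to evaluate the sup norm

def sup_err(K):
    """Sup-norm error of the piecewise-linear interpolant on K uniform knot intervals."""
    knots = np.linspace(0, 1, K + 1)
    return np.max(np.abs(theta(tt) - np.interp(tt, knots, theta(knots))))

e10, e40 = sup_err(10), sup_err(40)
print(e10, e40, e40 / e10)   # quadrupling K shrinks the error by roughly (1/4)^2 = 1/16
```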

Lemma 2.

Assume Conditions C1–C11 hold and $K=O(N^{1/(2r+1)})$. Then:

  1. $\dot{\hat{\bar g}}_n(\beta)\xrightarrow{\,p\,}-J_0$, where "$\xrightarrow{\,p\,}$" denotes convergence in probability.

  2. $\sqrt n\,\hat{\bar g}_n(\beta_0)\xrightarrow{\,d\,}N(0,\Omega_0)$ and $\hat{\bar g}_n(\beta_0)=O_p(n^{-1/2})$, where "$\xrightarrow{\,d\,}$" denotes convergence in distribution.

Proof.

We first prove part (i). According to Equation (12), the first derivative of $\hat{\bar g}_n(\beta)$ with respect to $\beta$ is
$$\dot{\hat{\bar g}}_n(\beta)=\frac1n\sum_{i=1}^n\begin{pmatrix}-\tilde W_i^TA_i^{-1/2}M_1A_i^{-1/2}\tilde W_i+\hat D_i^{(1)}\\\vdots\\-\tilde W_i^TA_i^{-1/2}M_sA_i^{-1/2}\tilde W_i+\hat D_i^{(s)}\end{pmatrix}.\tag{23}$$

Consider the $k$th block of $\dot{\hat{\bar g}}_n(\beta)$, denoted $\dot{\hat{\bar g}}_{nk}(\beta)$, $k=1,2,\ldots,s$. Since $\tilde W_i=\tilde X_i+\tilde u_i$,
$$\begin{aligned}\dot{\hat{\bar g}}_{nk}(\beta)&=-\frac1n\sum_{i=1}^n\Bigl(\tilde X_i^TA_i^{-1/2}M_kA_i^{-1/2}\tilde X_i+\tilde X_i^TA_i^{-1/2}M_kA_i^{-1/2}\tilde u_i\\&\qquad\quad+\tilde u_i^TA_i^{-1/2}M_kA_i^{-1/2}\tilde X_i+\tilde u_i^TA_i^{-1/2}M_kA_i^{-1/2}\tilde u_i-\hat D_i^{(k)}\Bigr)\\&=-\Bigl(\Delta_1+\Delta_2+\Delta_3+\Delta_4-\frac1n\sum_{i=1}^n\hat D_i^{(k)}\Bigr).\end{aligned}$$

Now we prove $\Delta_4-\frac1n\sum_{i=1}^n\hat D_i^{(k)}\xrightarrow{\,p\,}0$ as $n\to\infty$. Write
$$\Delta_4-\frac1n\sum_{i=1}^n\hat D_i^{(k)}=\frac1n\sum_{i=1}^n\bigl(\tilde u_i^TA_i^{-1/2}M_kA_i^{-1/2}\tilde u_i-D_i^{(k)}\bigr)+\frac1n\sum_{i=1}^n\bigl(D_i^{(k)}-\hat D_i^{(k)}\bigr),$$
where $D_i^{(k)}=E(\tilde u_i^TA_i^{-1/2}M_kA_i^{-1/2}\tilde u_i)$. By the law of large numbers, $\frac1n\sum_{i=1}^n(\tilde u_i^TA_i^{-1/2}M_kA_i^{-1/2}\tilde u_i-D_i^{(k)})\xrightarrow{\,p\,}0$ as $n\to\infty$. For the remaining term, Equation (9) gives
$$\frac1n\sum_{i=1}^n\bigl(D_i^{(k)}-\hat D_i^{(k)}\bigr)=\frac1n\sum_{i=1}^n\bigl(\Sigma_u-\hat\Sigma_u\bigr)\otimes B_i\operatorname{diag}(\Gamma)B_i^T.$$

According to Equation (10), $\frac1{m_i-1}\sum_{r=1}^{m_i}(W_{ij}^{(r)}-\bar W_{ij})(W_{ij}^{(r)}-\bar W_{ij})^T$ is a sample covariance matrix for $\Sigma_u$, which implies that $\hat\Sigma_u$ is an average of sample covariance matrices and $\hat\Sigma_u\xrightarrow{\,p\,}\Sigma_u$ as $n\to\infty$. By the plug-in principle, $\frac1n\sum_{i=1}^n(D_i^{(k)}-\hat D_i^{(k)})\xrightarrow{\,p\,}0$, and therefore $\Delta_4-\frac1n\sum_{i=1}^n\hat D_i^{(k)}\xrightarrow{\,p\,}0$.
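The consistency of $\hat\Sigma_u$ asserted here rests on each within-replicate sample covariance being unbiased for $\Sigma_u$. A small simulation sketch (hypothetical sizes and covariance, not the paper's data; numpy only):

```python
import numpy as np

rng = np.random.default_rng(1)
q, m, n_obs = 2, 3, 20_000                  # q covariates, m replicates per (i, j), n_obs pairs (i, j)
Sigma_u = np.array([[1.0, 0.4],
                    [0.4, 2.0]])

X = rng.normal(size=(n_obs, 1, q))          # true covariates (distribution irrelevant here)
u = rng.multivariate_normal(np.zeros(q), Sigma_u, size=(n_obs, m))
W = X + u                                   # replicated surrogates W_ij^(r) = X_ij + u_ij^(r)

# Pooled estimator: average of the per-(i, j) sample covariances over replicates
Wbar = W.mean(axis=1, keepdims=True)
dev = W - Wbar                              # deviations W_ij^(r) - Wbar_ij
Sigma_hat = np.einsum('nra,nrb->ab', dev, dev) / (n_obs * (m - 1))

print(Sigma_hat)                            # close to Sigma_u
```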

Under Condition C9, we can get $\Delta_1\xrightarrow{\,p\,}J_0^{(k)}$. To prove $\Delta_2=\Delta_3^T\xrightarrow{\,p\,}0$, write $\Delta_2=\frac1n\sum_{i=1}^n\xi_{ik}$, where $\xi_{ik}=\tilde X_i^TA_i^{-1/2}M_kA_i^{-1/2}\tilde u_i$; then $E(\xi_{ik})=0$ and
$$\operatorname{cov}(\xi_{ik})=\tilde X_i^TA_i^{-1/2}M_kA_i^{-1/2}E(\tilde u_i\tilde u_i^T)A_i^{-1/2}M_kA_i^{-1/2}\tilde X_i.$$

From Conditions C4–C7, $E(\tilde u_i\tilde u_i^T)$ is bounded. By the law of large numbers, $\Delta_2=\Delta_3^T\xrightarrow{\,p\,}0$. So we get $\dot{\hat{\bar g}}_{nk}(\beta)\xrightarrow{\,p\,}-J_0^{(k)}$ and $\dot{\hat{\bar g}}_n(\beta)\xrightarrow{\,p\,}-J_0$, where $J_0=(J_0^{(1)},J_0^{(2)},\ldots,J_0^{(s)})^T$. The proof of part (i) is completed.

Next, we prove part (ii). We first show $\hat{\bar g}_n(\beta_0)\xrightarrow{\,p\,}0$. Consider the $k$th block of $\hat{\bar g}_n(\beta_0)$, denoted $\hat{\bar g}_{nk}(\beta_0)$, $k=1,2,\ldots,s$:
$$\begin{aligned}\hat{\bar g}_{nk}(\beta_0)&=\frac1n\sum_{i=1}^n\Bigl(\tilde X_i^TA_i^{-1/2}M_kA_i^{-1/2}\varepsilon_i-\tilde X_i^TA_i^{-1/2}M_kA_i^{-1/2}\tilde u_i\beta_0+\tilde u_i^TA_i^{-1/2}M_kA_i^{-1/2}\varepsilon_i\\&\qquad\quad-\tilde u_i^TA_i^{-1/2}M_kA_i^{-1/2}\tilde u_i\beta_0-(\tilde X_i+\tilde u_i)^TA_i^{-1/2}M_kA_i^{-1/2}\tilde X_iR(t_i)+\hat D_i^{(k)}\beta_0\Bigr)\\&=\frac1n\sum_{i=1}^n\Bigl(J_{1i}^{(k)}-J_{2i}^{(k)}+J_{3i}^{(k)}-J_{4i}^{(k)}-J_{5i}^{(k)}+\hat D_i^{(k)}\beta_0\Bigr)\\&=J_1^{(k)}-J_2^{(k)}+J_3^{(k)}-J_4^{(k)}-J_5^{(k)}+\frac1n\sum_{i=1}^n\hat D_i^{(k)}\beta_0,\end{aligned}$$
where $R(t)=(R_1(t),R_2(t),\ldots,R_q(t))^T$ with $R_l(t)=\theta_l(t)-B^T(t)\beta_{l0}$, $l=1,2,\ldots,q$, and $J_m=(J_m^{(1)},J_m^{(2)},\ldots,J_m^{(s)})$, $m=1,2,3,4,5$.

Obviously, we have $E(J_{1i}^{(k)})=0$ and
$$\operatorname{cov}(J_{1i}^{(k)})=\tilde X_i^TA_i^{-1/2}M_kA_i^{-1/2}V_iA_i^{-1/2}M_kA_i^{-1/2}\tilde X_i.$$

From Conditions C4–C7, $\operatorname{cov}(J_{1i}^{(k)})$ is bounded, so $J_1^{(k)}\xrightarrow{\,p\,}0$ by the law of large numbers. Similarly, $\operatorname{cov}(J_{2i}^{(k)})<\infty$ and $J_2^{(k)}\xrightarrow{\,p\,}0$.

In addition, $E(J_{3i}^{(k)})=0$. According to the Cauchy–Schwarz inequality,
$$\bigl(\operatorname{cov}(J_{3i}^{(k)})\bigr)^2\le E\bigl(\tilde u_i^TA_i^{-1/2}M_kA_i^{-1/2}\tilde u_i\bigr)\,E\bigl(\varepsilon_i^TA_i^{-1/2}M_kA_i^{-1/2}\varepsilon_i\bigr)<\infty.$$

Therefore, $J_3^{(k)}\xrightarrow{\,p\,}0$.

Under Condition C8 and Lemma 1, $J_5^{(k)}=O_p(n^{-1/2}K^{-r})=o_p(n^{-1/2})$. From the definition of $\hat D_i^{(k)}$ and the law of large numbers, $J_4^{(k)}-\frac1n\sum_{i=1}^n\hat D_i^{(k)}\beta_0\xrightarrow{\,p\,}0$; by the definition of $\hat D_i^{(k)}$ and the central limit theorem, $J_4^{(k)}-\frac1n\sum_{i=1}^n\hat D_i^{(k)}\beta_0=O_p(n^{-1/2})$. So we have $\hat{\bar g}_{nk}(\beta_0)\xrightarrow{\,p\,}0$ and $\hat{\bar g}_n(\beta_0)\xrightarrow{\,p\,}0$.

Next, we prove $\hat{\bar g}_n(\beta_0)=O_p(n^{-1/2})$. According to the above conclusions,
$$\hat{\bar g}_{nk}(\beta_0)=\frac1n\sum_{i=1}^n\Bigl(J_{1i}^{(k)}-J_{2i}^{(k)}+J_{3i}^{(k)}-\bigl(J_{4i}^{(k)}-\hat D_i^{(k)}\beta_0\bigr)\Bigr)+o_p(n^{-1/2})=\frac1n\sum_{i=1}^n\eta_i^{(k)}+o_p(n^{-1/2}),$$
where $\eta_i=(\eta_i^{(1)},\eta_i^{(2)},\ldots,\eta_i^{(s)})^T$ and $\eta_i^{(k)}=J_{1i}^{(k)}-J_{2i}^{(k)}+J_{3i}^{(k)}-(J_{4i}^{(k)}-\hat D_i^{(k)}\beta_0)$. Furthermore,
$$\hat{\bar g}_n(\beta_0)=\frac1n\sum_{i=1}^n\eta_i+o_p(n^{-1/2}),\qquad\Omega_n(\beta_0)=\frac1n\sum_{i=1}^n\eta_i\eta_i^T+o_p(1).$$

Obviously, according to the above conclusions, $E(\eta_i^{(k)})=0$ and $\operatorname{cov}(\eta_i^{(k)})<\infty$.

Under Conditions C4–C7, following Tian, Xue, and Liu (2014), for any $a\in\mathbb R^{sqL}$ with $a^Ta=1$ we have $E(a^TJ_{1i})=0$ and
$$\sup_iE\|a^TJ_{1i}\|^{2+\delta}\le\|a\|^{2+\delta}\sup_iE\|J_{1i}\|^{2+\delta}\le C\sup_iE\|\varepsilon_i\|^{2+\delta}<\infty.$$
Similarly, for any such $a$, $E(a^TJ_{2i})=0$ and
$$\sup_iE\|a^TJ_{2i}\|^{2+\delta}\le\|a\|^{2+\delta}\sup_iE\|J_{2i}\|^{2+\delta}\le C\sup_iE\|u_i\|^{2+\delta}<\infty.$$
Using the Cauchy–Schwarz inequality, for any such $a$, $E(a^TJ_{3i})=0$ and
$$\sup_iE\|a^TJ_{3i}\|^{2+\delta}\le C\sup_iE\|\varepsilon_i\|^{2+\delta}\cdot\sup_iE\|u_i\|^{2+\delta}<\infty.$$
Hence $a^TJ_{1i}$, $a^TJ_{2i}$, and $a^TJ_{3i}$ satisfy the Lyapunov condition for the central limit theorem. In addition, $J_4^{(k)}-\frac1n\sum_{i=1}^n\hat D_i^{(k)}\beta_0=O_p(n^{-1/2})$ under Condition C5, so for any $a\in\mathbb R^{sqL}$ with $a^Ta=1$ we have $E(a^T\eta_i)=0$ and $\sup_iE\|a^T\eta_i\|^{2+\delta}<\infty$, which implies that $a^T\eta_i$ also satisfies the Lyapunov condition for the central limit theorem. Thus
$$\Bigl(a^T\sum_{i=1}^n\operatorname{cov}(\eta_i)\,a\Bigr)^{-1/2}\Bigl(\sum_{i=1}^na^T\eta_i\Bigr)\xrightarrow{\,d\,}N(0,1).$$

According to Condition C4, $\frac1n\sum_{i=1}^n\operatorname{cov}(\eta_i)\xrightarrow{\,p\,}\Omega_0$. So
$$\sqrt n\,\hat{\bar g}_n(\beta_0)=\sqrt n\cdot\frac1n\sum_{i=1}^n\eta_i+\sqrt n\,o_p(n^{-1/2})\xrightarrow{\,d\,}N(0,\Omega_0),\qquad\hat{\bar g}_n(\beta_0)=O_p(n^{-1/2}).$$

The proof of Lemma 2 is completed.
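The normal limit in part (ii) is the classical multivariate CLT for the iid summands $\eta_i$. As a quick empirical check (with a hypothetical $\Omega_0$, Gaussian summands, and dimensions chosen only for illustration), the covariance of $\sqrt n$ times the sample mean of $n$ draws matches $\Omega_0$:

```python
import numpy as np

rng = np.random.default_rng(3)
s_dim, n, reps = 4, 200, 5_000
A_ = rng.normal(size=(s_dim, s_dim))
Omega0 = A_ @ A_.T + np.eye(s_dim)       # target covariance of eta_i (positive definite)

# reps independent copies of sqrt(n) * mean of n iid eta_i with covariance Omega0
eta = rng.multivariate_normal(np.zeros(s_dim), Omega0, size=(reps, n))
Z = np.sqrt(n) * eta.mean(axis=1)        # reps x s_dim; each row ~ N(0, Omega0) asymptotically

cov_hat = Z.T @ Z / reps
print(np.linalg.norm(cov_hat - Omega0) / np.linalg.norm(Omega0))  # small
```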

Lemma 3.

Suppose that the regularity Conditions C1–C11 hold and $K=O(n^{1/(2r+1)})$. Then
$$\bigl\|n^{-1}\dot Q_n(\beta_0)-2\dot{\hat{\bar g}}_n^T(\beta_0)\Omega_n^{-1}\hat{\bar g}_n(\beta_0)\bigr\|=O_p(n^{-1}),\qquad\bigl\|n^{-1}\ddot Q_n(\beta_0)-2\dot{\hat{\bar g}}_n^T(\beta_0)\Omega_n^{-1}\dot{\hat{\bar g}}_n(\beta_0)\bigr\|=o_p(1).$$

Proof.

Following Tian, Xue, and Liu (2014), differentiating $Q_n(\beta)$ at $\beta_0$ gives
$$n^{-1}\dot Q_n(\beta_0)=2\dot{\hat{\bar g}}_n^T(\beta_0)\Omega_n^{-1}\hat{\bar g}_n(\beta_0)-\hat{\bar g}_n^T(\beta_0)\Omega_n^{-1}\dot\Omega_n\Omega_n^{-1}\hat{\bar g}_n(\beta_0),$$
where $\dot\Omega_n$ is the three-dimensional array $(\partial\Omega_n/\partial\beta_1,\partial\Omega_n/\partial\beta_2,\ldots,\partial\Omega_n/\partial\beta_q)$. By Lemma 2, $\hat{\bar g}_n(\beta_0)=O_p(n^{-1/2})$; under Condition C4, $\hat{\bar g}_n^T(\beta_0)\Omega_n^{-1}\dot\Omega_n\Omega_n^{-1}\hat{\bar g}_n(\beta_0)=O_p(n^{-1})$, so
$$n^{-1}\dot Q_n(\beta_0)=2\dot{\hat{\bar g}}_n^T(\beta_0)\Omega_n^{-1}\hat{\bar g}_n(\beta_0)+O_p(n^{-1}).$$

So we have $\|n^{-1}\dot Q_n(\beta_0)-2\dot{\hat{\bar g}}_n^T(\beta_0)\Omega_n^{-1}\hat{\bar g}_n(\beta_0)\|=O_p(n^{-1})$.

Similarly, we get
$$n^{-1}\ddot Q_n(\beta_0)=2\dot{\hat{\bar g}}_n^T(\beta_0)\Omega_n^{-1}\dot{\hat{\bar g}}_n(\beta_0)+R_n,$$
where
$$R_n=2\ddot{\hat{\bar g}}_n^T\Omega_n^{-1}\hat{\bar g}_n-4\dot{\hat{\bar g}}_n^T\Omega_n^{-1}\dot\Omega_n\Omega_n^{-1}\hat{\bar g}_n+2\hat{\bar g}_n^T\Omega_n^{-1}\dot\Omega_n\Omega_n^{-1}\dot\Omega_n\Omega_n^{-1}\hat{\bar g}_n-\hat{\bar g}_n^T\Omega_n^{-1}\ddot\Omega_n\Omega_n^{-1}\hat{\bar g}_n,$$
and $\ddot\Omega_n$ is the four-dimensional array $\{\partial^2\Omega_n/\partial\beta_i\partial\beta_j:i,j=1,2,\ldots,q\}$.

By the definition of $\hat{\bar g}_n(\beta)$, which is linear in $\beta$, we have $\ddot{\hat{\bar g}}_n(\beta)=0$, so $\ddot{\hat{\bar g}}_n^T\Omega_n^{-1}\hat{\bar g}_n=0$. Using Lemma 2,
$$\dot{\hat{\bar g}}_n^T\Omega_n^{-1}\dot\Omega_n\Omega_n^{-1}\hat{\bar g}_n=O_p(n^{-1/2})=o_p(1),\qquad\hat{\bar g}_n^T\Omega_n^{-1}\dot\Omega_n\Omega_n^{-1}\dot\Omega_n\Omega_n^{-1}\hat{\bar g}_n=O_p(n^{-1})=o_p(1),$$
$$\hat{\bar g}_n^T\Omega_n^{-1}\ddot\Omega_n\Omega_n^{-1}\hat{\bar g}_n=O_p(n^{-1})=o_p(1).$$

Hence $R_n=o_p(1)$, and therefore $\|n^{-1}\ddot Q_n(\beta_0)-2\dot{\hat{\bar g}}_n^T(\beta_0)\Omega_n^{-1}\dot{\hat{\bar g}}_n(\beta_0)\|=o_p(1)$.

The proof of Lemma 3 is completed.
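The leading term isolated by Lemma 3 is exact when the weight matrix is held fixed: for $Q(\beta)=n\,\bar g^T\Omega^{-1}\bar g$ with constant $\Omega$, the chain rule gives $\dot Q(\beta)=2n\,\dot{\bar g}^T\Omega^{-1}\bar g$. A toy finite-difference check with an affine moment function (all matrices and dimensions hypothetical):

```python
import numpy as np

rng = np.random.default_rng(2)
p, s = 3, 5
J = rng.normal(size=(s, p))        # plays the role of g-dot (constant, since gbar is affine)
c = rng.normal(size=s)
A_ = rng.normal(size=(s, s))
Omega = A_ @ A_.T + s * np.eye(s)  # fixed positive-definite weight matrix
Oinv = np.linalg.inv(Omega)
n = 100

gbar = lambda b: J @ b + c                     # affine moment function
Q = lambda b: n * gbar(b) @ Oinv @ gbar(b)     # quadratic inference objective with fixed weight

b0 = rng.normal(size=p)
analytic = 2 * n * J.T @ Oinv @ gbar(b0)       # 2 n g-dot^T Omega^{-1} gbar

# central finite differences (exact up to rounding, since Q is quadratic in b)
h = 1e-6
num = np.array([(Q(b0 + h * e) - Q(b0 - h * e)) / (2 * h) for e in np.eye(p)])
print(np.max(np.abs(num - analytic)))          # tiny
```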

Proof of Theorem 1.

From Lemma 1, we have $\|\theta_l(t)-B^T(t)\beta_{l0}\|=O(K^{-r})$. Let $\delta=n^{-r/(2r+1)}$ and $\beta=\beta_0+\delta D$. To prove Theorem 1, it suffices to show that for any $\varepsilon>0$ there exists a large constant $C$ such that
$$P\Bigl\{\inf_{\|D\|=C}Q_p(\beta)\ge Q_p(\beta_0)\Bigr\}\ge1-\varepsilon.\tag{24}$$

When $\varepsilon\ge1$, (24) holds trivially, so we may assume $\varepsilon\in(0,1)$. Without loss of generality, assume $\theta_l(\cdot)\equiv0$ for $l=q_1+1,\ldots,q$ and $p_\lambda(0)=0$. Then
$$Q_p(\beta)-Q_p(\beta_0)\ge Q_n(\beta)-Q_n(\beta_0)+\sum_{l=1}^{q_1}n\bigl[p_{\lambda_l}(\|\beta_l\|_H)-p_{\lambda_l}(\|\beta_{l0}\|_H)\bigr].$$

Applying a Taylor expansion to $Q_n(\beta)$ at $\beta_0$,
$$Q_n(\beta)=Q_n(\beta_0+\delta D)=Q_n(\beta_0)+\delta D^T\dot Q_n(\beta_0)+\tfrac12\delta^2D^T\ddot Q_n(\tilde\beta)D,$$
where $\tilde\beta$ lies between $\beta$ and $\beta_0$. According to Lemmas 2 and 3,
$$\delta D^T\dot Q_n(\beta_0)=\delta D^T\bigl\{2n\dot{\hat{\bar g}}_n^T(\beta_0)\Omega_n^{-1}\hat{\bar g}_n(\beta_0)+nO_p(n^{-1})\bigr\}=\|D\|O_p(\sqrt n\,\delta)+\|D\|O_p(\delta)$$
and
$$\tfrac12\delta^2D^T\ddot Q_n(\tilde\beta)D=\tfrac12\delta^2D^T\bigl\{2n\dot{\hat{\bar g}}_n^T(\tilde\beta)\Omega_n^{-1}\dot{\hat{\bar g}}_n(\tilde\beta)+no_p(1)\bigr\}D=n\delta^2D^T\dot{\hat{\bar g}}_n^T(\tilde\beta)\Omega_n^{-1}\dot{\hat{\bar g}}_n(\tilde\beta)D+n\delta^2\|D\|^2o_p(1).$$

Therefore,
$$Q_n(\beta)-Q_n(\beta_0)=n\delta^2D^TJ_0^T\Omega_0^{-1}J_0D+\|D\|O_p(\sqrt n\,\delta)+\|D\|O_p(\delta)+n\delta^2\|D\|^2o_p(1).$$

Obviously, $n\delta^2D^TJ_0^T\Omega_0^{-1}J_0D\ge0$, and it is of exact order $n\delta^2\|D\|^2$. When $C$ is large enough, this term dominates both $\|D\|O_p(\sqrt n\,\delta)$ and $n\delta^2\|D\|^2o_p(1)$.

So when $C$ is large enough, $Q_n(\beta)>Q_n(\beta_0)$.

Assume $\lambda_l\to0$ and $K=O(n^{1/(2r+1)})$. When $n$ is large enough, $\|\beta_l\|_H\ge a\lambda_l$ and $\|\beta_{l0}\|_H\ge a\lambda_l$. Following the definition of the penalty function,
$$p_{\lambda_l}(\|\beta_l\|_H)=p_{\lambda_l}(\|\beta_{l0}\|_H)=\frac{(1+a)\lambda_l^2}{2},\qquad\sum_{l=1}^{q_1}n\bigl[p_{\lambda_l}(\|\beta_l\|_H)-p_{\lambda_l}(\|\beta_{l0}\|_H)\bigr]=0.$$
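The flat tail $p_\lambda(t)=(1+a)\lambda^2/2$ for $t\ge a\lambda$ invoked here is the defining feature of the SCAD penalty of Fan and Li (2001); assuming that is the penalty in use, a direct computation confirms the constant tail ($a=3.7$ is the conventional choice):

```python
import numpy as np

def scad(t, lam, a=3.7):
    """SCAD penalty p_lambda(t) for t >= 0 (Fan and Li, 2001)."""
    t = np.asarray(t, dtype=float)
    p1 = lam * t                                           # linear part: t <= lam
    p2 = -(t**2 - 2 * a * lam * t + lam**2) / (2 * (a - 1))  # quadratic part: lam < t <= a*lam
    p3 = (a + 1) * lam**2 / 2 * np.ones_like(t)            # constant tail: t > a*lam
    return np.where(t <= lam, p1, np.where(t <= a * lam, p2, p3))

lam, a = 0.5, 3.7
ts = np.array([a * lam, a * lam + 1.0, a * lam + 10.0])
print(scad(ts, lam, a))   # constant (1 + a) * lam^2 / 2 = 0.5875 beyond t = a*lam
```

The quadratic branch is continuous with the tail at $t=a\lambda$, which is exactly why the penalty difference above vanishes once both norms exceed $a\lambda_l$.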

So, for any given $\varepsilon>0$, there exists a large enough $C$ satisfying Equation (24), which further implies that there exists a minimizer $\hat\beta$ with $\|\hat\beta-\beta_0\|=O_p(\delta)=O_p(n^{-r/(2r+1)})$. Note that
$$\begin{aligned}\|\hat\theta_l(t)-\theta_l(t)\|^2&=\int_0^1\bigl\{B^T(t)\hat\beta_l-B^T(t)\beta_{l0}-\bigl(\theta_l(t)-B^T(t)\beta_{l0}\bigr)\bigr\}^2dt\\&\le2\int_0^1\bigl\{B^T(t)\hat\beta_l-B^T(t)\beta_{l0}\bigr\}^2dt+2\int_0^1\bigl\{\theta_l(t)-B^T(t)\beta_{l0}\bigr\}^2dt\\&=2(\hat\beta_l-\beta_{l0})^T\Bigl(\int_0^1B(t)B^T(t)dt\Bigr)(\hat\beta_l-\beta_{l0})+2\int_0^1\bigl\{\theta_l(t)-B^T(t)\beta_{l0}\bigr\}^2dt\\&=2(\hat\beta_l-\beta_{l0})^TH(\hat\beta_l-\beta_{l0})+2\int_0^1R_l(t)^2dt.\end{aligned}$$

With the same arguments as above, $\|\hat\beta-\beta_0\|=O_p(n^{-r/(2r+1)})$. Therefore, invoking $H=O(1)$, we have $(\hat\beta_l-\beta_{l0})^TH(\hat\beta_l-\beta_{l0})=O_P(n^{-2r/(2r+1)})$. With Lemma 1, $\int_0^1R_l(t)^2dt=O(n^{-2r/(2r+1)})$. Thus, the proof of Theorem 1 is complete.

Proof of Theorem 2.

Assume $\theta_l(\cdot)\equiv0$ for $l=q_1+1,\ldots,q$, and that $\theta_l(\cdot)$, $l=1,2,\ldots,q_1$, are the non-zero coefficient functions. The corresponding regression parameter space $\Theta$ is
$$\Theta=\bigl\{\beta:\beta=(\beta_1^T,\beta_2^T,\ldots,\beta_q^T)^T,\ \beta_l^T=0,\ l=q_1+1,q_1+2,\ldots,q\bigr\}.$$

For $l=q_1+1,\ldots,q$, denote $\Theta_l=\{\beta:\beta=(0^T,0^T,\ldots,0^T,\beta_l^T,0^T,\ldots,0^T)^T\}$, where $0$ is an $L\times1$ vector of zeros. From Lemma 1 and Xue, Qu, and Zhou (2010), we have $\|B^T(\cdot)\beta_l\|=O(n^{-r/(2r+1)})$ and $\|B^T(\cdot)\beta_l\|/\lambda_l=O(n^{-r/(2r+1)}/\lambda_l)$. To prove Theorem 2, it is sufficient to show that, for any $\beta\in\Theta$ and $\beta_l^*\in\Theta_l$, $Q_p(\beta+\beta_l^*)\ge Q_p(\beta)$ holds with probability tending to 1. Indeed,
$$\begin{aligned}Q_p(\beta+\beta_l^*)-Q_p(\beta)&=Q_n(\beta+\beta_l^*)-Q_n(\beta)+np_{\lambda_l}(\|\beta_l^*\|_H)\\&=\beta_l^{*T}\dot Q_n(\beta)+\tfrac12\beta_l^{*T}\ddot Q_n(\hat\beta_l^*)\beta_l^*\bigl(1+o_p(1)\bigr)+np_{\lambda_l}(\|\beta_l^*\|_H)\\&=n\lambda_l\|B^T(\cdot)\beta_l^*\|\Bigl\{\frac{R_l^*}{\lambda_l}+\frac{p'_{\lambda_l}(t^*)}{\lambda_l}\Bigr\}\bigl(1+o_p(1)\bigr),\end{aligned}$$
where $\hat\beta_l^*$ lies between $\beta+\beta_l^*$ and $\beta$, and $t^*$ lies between 0 and $\|\beta_l^*\|_H$. Furthermore,
$$R_l^*=\frac{\beta_l^{*T}n^{-1}\dot Q_n(\hat\beta_l^*)+\tfrac12\beta_l^{*T}n^{-1}\ddot Q_n(\hat\beta_l^*)\beta_l^*}{\|B^T(\cdot)\beta_l^*\|}.$$

According to Lemmas 2 and 3,
$$\beta_l^{*T}n^{-1}\dot Q_n(\hat\beta_l^*)=O_p(n^{-1/2})=o_p(1),\qquad\tfrac12\beta_l^{*T}n^{-1}\ddot Q_n(\hat\beta_l^*)\beta_l^*=\beta_l^{*T}J_0^T\Omega_0^{-1}J_0\beta_l^*+o_p(1),$$
so that
$$\frac{R_l^*}{\lambda_l}=\frac{\beta_l^{*T}J_0^T\Omega_0^{-1}J_0\beta_l^*}{\|B^T(\cdot)\beta_l^*\|\lambda_l}+o_p(1)\ge0.$$

From Conditions C10 and C11,
$$\liminf_{n\to\infty}\ \liminf_{t\to0^+}\frac{p'_{\lambda_l}(t)}{\lambda_l}>0,\qquad l=q_1+1,\ldots,q.$$

So for any $\beta\in\Theta$ and $\beta_l^*\in\Theta_l$, $Q_p(\beta+\beta_l^*)\ge Q_p(\beta)$ holds with probability tending to 1. This completes the proof of Theorem 2.

Proof of Theorem 3.

Following Wang, Li, and Tsai (2007) and Tian, Xue, and Liu (2014), we partition the tuning-parameter values into three mutually exclusive sets:
$$R_-=\{\lambda:S_\lambda\not\supseteq S_T\},\qquad R_0=\{\lambda:S_\lambda=S_T\},\qquad R_+=\{\lambda:S_\lambda\supset S_T,\ S_\lambda\neq S_T\},$$
where $R_-$, $R_0$, and $R_+$ correspond to underfitted, correctly fitted, and overfitted models $S_\lambda$, respectively. The theorem can then be proved by comparing $\mathrm{BIC}(S_\lambda)$ with $\mathrm{BIC}(S_T)$. Here we consider two separate cases.

Case I: When $\lambda\in R_-$, we have $E(\hat{\bar g}_n(\beta_\lambda))\neq0$ while $E(\hat{\bar g}_n(\beta_{\lambda_T}))=o(1)$.

By the law of large numbers and the continuous mapping theorem,
$$\begin{aligned}\frac1n\bigl\{\mathrm{BIC}(S_\lambda)-\mathrm{BIC}(S_T)\bigr\}&=\hat{\bar g}_n^T(\beta_\lambda)\Omega_n^{-1}(\beta_\lambda)\hat{\bar g}_n(\beta_\lambda)+\frac{df_\lambda\log(n)}{n}-\frac1n\mathrm{BIC}(S_T)\\&\ge\hat{\bar g}_n^T(\beta_\lambda)\Omega_n^{-1}(\beta_\lambda)\hat{\bar g}_n(\beta_\lambda)-\frac1n\mathrm{BIC}(S_T)\\&\ge\inf_{\lambda\in R_-}\hat{\bar g}_n^T(\beta_\lambda)\Omega_n^{-1}(\beta_\lambda)\hat{\bar g}_n(\beta_\lambda)-\hat{\bar g}_n^T(\beta_{\lambda_T})\Omega_n^{-1}(\beta_{\lambda_T})\hat{\bar g}_n(\beta_{\lambda_T})-\frac{df_{\lambda_T}\log(n)}{n}\\&\to\min_{S_\lambda\not\supseteq S_T}\bigl\{E(\hat{\bar g}_n(\beta_\lambda))^T\Omega_n^{-1}(\beta_\lambda)E(\hat{\bar g}_n(\beta_\lambda))\bigr\}-\bigl\{E(\hat{\bar g}_n(\beta_{\lambda_T}))^T\Omega_n^{-1}(\beta_{\lambda_T})E(\hat{\bar g}_n(\beta_{\lambda_T}))\bigr\}>0.\end{aligned}$$

Case II: When $\lambda\in R_+$, we have $E(\hat{\bar g}_n(\beta_\lambda))=o(1)$ and $E(\hat{\bar g}_n(\beta_{\lambda_T}))=o(1)$, so the two quadratic-form terms are of the same order and the degrees-of-freedom penalty, with $df_\lambda>df_{\lambda_T}$, dominates:
$$\inf_{\lambda\in R_+}\frac1n\bigl\{\mathrm{BIC}(S_\lambda)-\mathrm{BIC}(S_T)\bigr\}=\inf_{\lambda\in R_+}\Bigl\{\hat{\bar g}_n^T(\beta_\lambda)\Omega_n^{-1}(\beta_\lambda)\hat{\bar g}_n(\beta_\lambda)-\hat{\bar g}_n^T(\beta_{\lambda_T})\Omega_n^{-1}(\beta_{\lambda_T})\hat{\bar g}_n(\beta_{\lambda_T})+\frac{(df_\lambda-df_{\lambda_T})\log(n)}{n}\Bigr\}>0.$$

Both cases hold in probability by the law of large numbers and the continuous mapping theorem. This completes the proof of Theorem 3.

Additional information

Funding

This work was partly supported by a grant from the National Social Science Foundation of China (15CTJ008 to MZ) and a grant from the National Institute of Health (R21HG010073 to YC).
