721
Views
0
CrossRef citations to date
0
Altmetric
Articles

Personalized treatment selection via the covariate-specific treatment effect curve for longitudinal data

, , &
Pages 253-264 | Received 13 Nov 2019, Accepted 25 Apr 2020, Published online: 14 May 2020

Abstract

Treatment selection based on patient characteristics has been widely recognised in modern medicine. In this paper, we propose a generalised partially linear single-index mixed-effects modelling strategy for treatment selection and heterogeneous treatment effect estimation in longitudinal clinical and observational studies. We model the treatment effect as an unknown functional curve of a weighted linear combination of time-dependent covariates. This method enables us to investigate covariate-specific treatment effects and make personalised treatment selection in a flexible fashion. We develop a method that combines local linear regression and penalised quasi-likelihood to estimate the weight for each covariate, the unknown treatment effect curve and the parameters for mixed-effects. Based on pointwise confidence intervals for the treatment effect curve, we can make individualised treatment decisions from the information of patient characteristics. A simulation study is conducted to evaluate finite sample performance of the proposed method. We also illustrate the method via analysis of a real data example.

1. Introduction

Personalized or precision medicine, which aims to provide treatment strategies according to the characteristics of individuals or subgroups of the population, has gained much attention from biomedical researchers. The main goal of personalised medicine is to investigate covariate-specific (heterogeneous) effects of treatment, based on which individualised clinical decision can be better made to patients. In recent years, there has been an increasing amount of literature on heterogeneous treatment effect (HTE) estimation. A typical way of exploring heterogeneous treatment effects is to examine patient outcomes in mutually exclusive subgroups defined by observable patient characteristics, see Berger et al. (Citation2014), Ciampi et al. (Citation1995), Negassa et al. (Citation2005), Foster et al. (Citation2011), Su et al. (Citation2008) and Wang et al. (Citation2007). Bonetti and Gelber (Citation2004) introduced a subpopulation treatment effect pattern plot (STEPP), which characterises treatment effects across potentially overlapping intervals of a continuous covariate. Bonetti et al. (Citation2009) indicated that this method is effective only for large sample sizes, and proposed a permutation-based method for inference and achieved better performance for smaller sample sizes. The major limitation of such subgroup approach is that dichotomisation of continuous covariates can be artificial, and thus it may lose important information from the data.

In the literature, some other works apply nonparametric and semiparametric modelling methods to study HTE. These methods impose no or very relaxed assumptions on the model structure, and thus can explore HTE in a flexible fashion. For continuous response, Foster et al. (Citation2015) proposed a two-stage procedure. They obtain nonparametric estimates of treatment effects for each subject in stage 1, and these estimates are used to identify optimal subset for a treatment, thus determine a treatment regime in stage 2. However, this method only applies to situations that the optimal subset is contiguous. For time-to-event data, Ma and Zhou (Citation2014) defined a covariate-specific treatment effect (CSTE) curve, which is used to represent clinical utility of a continuous biomarker. They derived estimate of CSTE curve, and constructed pointwise confidence interval to select the optimal treatment for individual patient, as well as simultaneous confidence band to identify subpopulation who respond well to a treatment. Han et al. (Citation2017) extended the method of Ma and Zhou (Citation2014) to the case of binary response. Both of them only considered a single biomarker. In order to incorporate multivariate or even high-dimensional covariates, Guo et al. (Citation2018) proposed a sparse logistic single-index coefficient model for optimal treatment selection using the CSTE curve. This method offers a flexible way for studying the CSTE curve without a restrictive assumption on the structure of the curve while achieving great dimension reduction of the high-dimensional covariates.

All the above methods were proposed for cross-sectional data with responses measured at one time point. In practice, the patient outcomes are often collected at multiple follow-up times in order to better evaluate the effectiveness of treatments. In this paper, we extend the model considered in Guo et al. (Citation2018) to the longitudinal clinical and observational studies, and consider a generalised partially linear single-index coefficient mixed-effects model (GPLSIMM) for our longitudinal setting. Similar as Guo et al. (Citation2018), we treat the treatment effect as an unknown functional curve of a weighted linear combination of time-varying covariates. The weights for covariates account for their different contributions to the treatment effect, and they are estimated from the data. In longitudinal studies, the repeated measures are correlated within subjects, and thus the estimating method considered in Guo et al. (Citation2018) is not applicable to our proposed GPLSIMM. In our model, we need to estimate the unknown functional curve, the weight of each covariate and the parameters for the mixed-effects. Estimation of generalised semiparametric mixed-effects models has been considered in some existing works. Liang (Citation2009) proposed a local linear regression and penalised quasi-likelihood method for estimation of a generalised partially linear mixed-effects model. Pang and Xue (Citation2012) considered a local linear regression with GEE method for a single-index mixed effects model. Xu and Zhu (Citation2012) and Chen et al. (Citation2014) developed a kernel and a P-spline estimation method, respectively, together with quasi-lilikelhood for longitudinal generalised single-index models.

Based on the methods considered in these works, we see that penalised quasi-likelihood is a commonly used method for estimation of generalised mixed-effects models. It circumvents the calculation of high-dimensional integral in likelihood function (Breslow & Clayton, Citation1993). In our proposed GPLSIMM, we approximate the unknown treatment effect curve by the local linear method and estimate the parameters for the parametric and nonparametric parts through alternatively optimising the local and global penalised quasi-likelihood functions. We then select optimal treatment for a future patient based on pointwise confidence intervals for the CSTE curve.

The rest of the paper is organised as follows. In Section 2, we introduce the proposed model, the CSTE curve and the estimation method. Section 3 gives asymptotic properties of parametric and nonparametric estimates. Section 4 provides the algorithm for model estimation. In Section 5 we evaluate the finite sample properties of the proposed method via simulation studies, while Section 6 illustrates the application of the proposed method in a real data set. All technical proofs are relegated to Appendix.

2. Model and estimation

2.1. Model

Suppose our data are obtained from n independent subjects, and observations of the ith subject is {(Yij,Xij,Zij,Aij,Dij), j=1,,ni}. Yij is the response, Xij, Zij and Aij are covariates of dimension p, q and s, and Dij{0,1} is an indicator of exposure to treatment. We assume the relationship of response and covariates is specified by the following GPLSIMM: (1) E(Yij|Xij,Zij,Aij,Dij,γγi)=g1{ααTXij+f(ββTZij)Dij+AijTγγi},var(Yij|Xij,Zij,Aij,Dij,γγi)=wij1φV(μij),i=1,,n,j=1,,ni,(1) where μij=E(Yij|Xij,Zij,Aij,Dij,γγi), g() and V() are known functions, φ is a scale parameter and wij is a pre-determined weight for the jth observation of the ith subject. f() is an unknown function, and αα and ββ are unknown parameter vectors. For model identifiability, we assume that ββ=1 and β1>0. We incorporate random effect AijTγγi to account for the within-subject correlation, where γγi, i=1,,n are s-dimensional random effect vectors. We assume that γγiN(00,GG).

In this model, we characterise the treatment effect with a single-index term f(ββTZij), and the relationship of response and covariates for control group with a generalised linear mixed effect model. When response Y is binary, we can easily see that (2) f(ββTZ)=logit{E(Y|X,Z,A,γγ,D=1)}logit{E(Y|X,Z,A,γγ,D=0)}.(2) When Y is a count variable and follows Poisson distribution, (3) f(ββTZ)=log{E(Y|X,Z,A,γγ,D=1)}log{E(Y|X,Z,A,γγ,D=0)}.(3) Assume that D = 0 and D = 1 represents standard care and treatment respectively, X and Z are patient characteristics, Y indicates outcome of the study, and the better outcome corresponds to larger value of Y. From (Equation2) and (Equation3), f(ββTZ) could be regarded as a measure of treatment effect. We define f() as covariate-specific treatment effect (CSTE) curve under this model.

Given the estimate and confidence interval of f(), we could suggest the optimal treatment for a future patient based on his or her personal characteristics. Taking binary response as an example, we assume that Y = 1 represent a disease being cured, while Y = 0 indicate uncured. Let u0 be the estimated value of θθTZ calculated from personal characteristics of a patient. If the lower bound of confidence interval for f(u0) is greater than 0, the treatment is more effective than standard care for the patient. On the other hand, if the upper bound of confidence interval is smaller than 0, standard care is more effective. In this case, we could not recommend the treatment to the patient. If neither of the above cases happen, that is, 0 is contained in the confidence interval, we would draw the conclusion that there's no significant difference between treatment and standard care.

2.2. Estimation

Denote ΓΓ=(γ1T,,γnT)T. Let ββ(1)=(β2,,βq)T be the vector after deleting the first element from ββ. Then ββ=(1ββ(1)2,ββ(1)T)T, and we define the Jacobian matrix JJ=JJββ(1)=ββββ(1)T=ββ(1)T1ββ(1)2IIq1. Our primary interest is to estimate f(), αα, ββ and GG.

Suppose that GG is known. Given αα and ββ, we combine penalised quasi-likelihood with local linear technique to obtain estimates of f() and f(). If u is in a neighbourhood of ββTZij, g(μij) can be approximated by g(μ~ij)=ααTXij+{f(u)+f(u)(ββTZiju)}Dij+AijTγγi. Let d(y,μ)=2yμ((yu)/V(u))du, Kh()=h1K(/h), where h is bandwidth, K() is a zero-mean symmetric density function. We maximise the following local penalised quasi-likelihood (4) 12i=1nj=1niKh(ββTZiju)wijφ1d(Yij,g1[ααTXijj=1ni+{a+b(ββTZiju)}Dij+AijTγγi])+γγiTGG1γγi(4) with respect to a, b and ΓΓ, and obtain estimates fˆ(u)=aˆ and fˆ(u)=bˆ. Taking derivative on (Equation4) with respect to (a,b) and ΓΓ yields the following estimating equations: (5) i=1nj=1niKh(ββTZiju){g(μ~ij)}1wijφ1V1(μ~ij)×Yijg1[ααTXij+{a+b(ββTZiju)}Dij+AijTγγi]Dij1ββTZiju=00,(5) and for each i, (6) j=1niKh(ββTZiju)Aij{g(μ~ij)}1wijφ1V1(μ~ij)×Yijg1[ααTXij+{a+b(ββTZiju)}Dij+AijTγγi]GG1γγi=00.(6)

When f() is known, we obtain estimate of αα, ββ and ΓΓ by maximising the global penalised quasi-likelihood (7) 12i=1nj=1niwijφ1d[Yij,g1{ααTXij+f(ββTZij)Dijj=1ni+AijTγγi}]+γγiTGG1γγi,(7) with respect to αα, ββ(1) and ΓΓ. The corresponding estimating equations are as follows: (8) SS01(αα,ββ(1),ΓΓ)=i=1nj=1niXij{g(μij)}1wijφ1V1(μij)×Yijg1{ααTXij+f(ββTZij)Dij+AijTγγi}=00,(8) (9) SS02(αα,ββ(1),ΓΓ)=i=1nj=1niJJββ(1)TZijDijf(ββTZij){g(μij)}1wijφ1+V1(μij)Yijg1{ααTXijf(ββTZij)Dij+AijTγγi}=00,(9) and for each i=1,,n, (10) SSi(αα,ββ(1),γγi)=j=1niAij{g(μij)}1wijφ1V1(μij)×Yijg1{ααTXij+f(ββTZij)Dij+AijTγγi}GG1γγi=00.(10)

We obtain the parametric and nonparametric estimates by iteratively maximising quasi-likelihoods (Equation4) and (Equation7). The corresponding algorithm is summarised in Section 4 for practical implementation.

3. Asymptotic properties

Denote qk(t,y)=12(kd{y,g1(t)}/tk), and ρk(t)={dg1(t)/dt}kV1{g1(t)}, k = 1, 2, 3. Let κj=ujK(u)du,μj=ujK2(u)du for j = 1, 2, N=i=1nni and N1=i=1nni(ni1). We assume n tends to infinity and ni's are bounded. Denote ηij=ααTXij+f(ββTZij)Dij+AijTγγi, and η=ααTX+f(ββTZ)D+ATγγ.

In order to obtain asymptotic behaviours of the proposed parametric and nonparametric estimates, we assume the following regularity conditions:

  1. The marginal density of Z is positive and uniformly continuous with a compact support ZRq. fU(), density function of U=ββTZ, is twice continuously differentiable for UU, where U=[b1,b2]={ββTZ,ZZ} is a compact interval.

  2. For all x, ρ2(x)>0.

  3. The second derivative of g(),V() and f() are bounded and continuous.

  4. The kernel K() is a bounded and symmetric probability density function with bounded support [c0,c0], and satisfies u2K(u)du0 and u2K(u)du<. K() satisfies the Lipschitz condition of order 1.

  5. E{ρ2(η)D2|ββTZ=u},E{ρ2(η)DX|ββTZ=u} and E{ρ2(η)D2Z|ββTZ=u} are twice differentiable on u. E{q1(η11,Y11)q1(η12,Y12)D112D122|ββTZ11=u1,ββTZ12=u2} is continuous on u1 and u2.

  6. As n, h satisfies nh40 and nh2/log(1/h).

Denote X~=XE{ρ2(η)D2|ββTZ}1E{ρ2(η)DX|ββTZ},Z~=f(ββTZ)JJTZE{ρ2(η)D2|ββTZ}1×E{ρ2(η)D2Z|ββTZ}. Let ααˆ,ββˆ be the final estimator of αα and ββ, respectively. Theorem 3.1 gives the asymptotic distributions of ααˆ and ββˆ.

Theorem 3.1

Under conditions (C1)–(C6), (11) Nααˆαα0ββˆββ0N(0,ΣΣ),(11) where ΣΣ=DA1+N1NA1CA1DT, with A=Eρ2(η)X~Z~D2,C=covX~11Z~11D11q1(η11,Y11),X~12Z~12D12q1(η12,Y12), and D=IIp00p×(q1)00q×pJJββ0(1).

Theorem 3.1 indicates that in order to obtain nconsistent estimates of αα and ββ, we need to undersmooth the nonparametric function f(). For any zZ, we let fˆ(ββˆTz;ααˆ,ββˆ) be the estimate of f(ββTz) given ααˆ,ββˆ and h1 be the corresponding bandwidth. Theorem 3.2 presents the asymptotic distribution of fˆ(ββˆTz;ααˆ,ββˆ).

Theorem 3.2

Under conditions (C1)–(C5), as n, h10 and Nh1, (12) Nh{fˆ(ββˆTz;ααˆ,ββˆ)f(ββ0Tz)12h2κ2f(ββ0Tz)}N(0,σf2),(12) where σf2=μ0fU1(ββ0Tz)[E{ρ2(η)D2|ββ0TZ=ββ0Tz}]1.

4. Computation

Given αα and ββ, we solve estimating Equations (Equation5) and (Equation6) by Fisher's scoring algorithm (see Wu & Zhang, Citation2006, Section 10.4), and obtain estimates fˆ() and fˆ().

When f() is given, we obtain estimates of αα, ββ(1) and ΓΓ by solving Equations (Equation8), (Equation9) and (Equation10). Here we apply the quasi-Fisher scoring algorithm to this problem.

Denote θθ=(ααT,ββT)T,ΘΘ=(θθT,ΓΓT)T,θθ(1)=(ααT,ββ(1)T)T,ΘΘ(1)=(θθ(1)T,ΓΓT)T,SS(ΘΘ(1))={SS01T(αα,ββ(1),ΓΓ), SS02T(αα,ββ(1),ΓΓ),×SS1T(αα,ββ(1),γγ1),,SSnT(αα,ββ(1),γγn)}T. Using the quasi-Fisher scoring algorithm, estimate of ΘΘ(1) is updated by (13) ΘΘnew(1)=ΘΘold(1)+SS˙1(ΘΘold(1))SS(ΘΘold(1)),(13) where SS˙1(ΘΘ1(1))=QQ=QQ11QQ12QQ13QQ12TQQ22QQ23QQ13TQQ23TQQ33 with QQ11=i=1nj=1niXij{g(μij)}2wijφ1V1(μij)XijT,QQ12=i=1nj=1niXij{g(μij)}2wijφ1V1(μij)×f(ββTZij)ZijTJJββ(1),QQ22=i=1nj=1niJJββ(1)TZijDij2{f(ββTZij)}2{g(μij)}2×wijφ1V1(μij)ZijTJJββ(1),QQ13=QQ13[1],,QQ13[n],QQ23=QQ23[1],,QQ23[n],QQ33=diagQQ33[1],,QQ33[n], and for i=1,,n, QQ13[i]=j=1niXij{g(μij)}2wijφ1V1(μij)AijT,QQ23[i]=j=1niJJββ(1)TZijDijf(ββTZij){g(μij)}2×wijφ1V1(μij)AijT,QQ33[i]=j=1niAij{g(μij)}2wijφ1V1(μij)AijT+G1. Let PPi={g(μi1)}1Xi1T{(μi1)}1f(ββTZi1)Di1Zi1TJJββ(1){g(μini)}1XiniT{g(μini)}1f(ββTZini)DiniZiniTJJββ(1),AAi={g(μi1)}1Ai1T{g(μini)}1AiniT,Ri=diagwi11φV(μi1),,wini1φV(μini),εεi=Yi1μi1,,YiniμiniT. We further denote PP=PP1T,,PPnTT,AA=diag(AA1,,AAn),εε=εε1T,,εεnTT,R=diag(R1,,Rn),GG~=diag(GG,,GG).Equation (Equation13) can be rewritten as ΘΘnew(1)ΘΘold(1)=PPoldTRold1PPoldPPoldTRold1AAoldAAoldTRold1PPoldAAoldTRold1AAold+GG~11PPoldTRold1εεoldAAoldTRold1εεoldGG~1ΓΓold. Note that PPold,AAold,Rold and εεold on the right hand of above equation are calculated from ΘΘold. For simplicity we suppress their subscripts hereinafter. Denote HHi=AAiGGAAiT+Ri,HH=diag(HH1,,HHn). By matrix algebra, we can obtain (14) θθnew(1)=θθold(1)+(PPTHH1PP)1PPTHH1(εε+AAΓΓold),γγi,new=GGAAiTHHi1εεi+AAiγγi,oldPPi(θθnew(1)θθold(1)),i=1,,n.(14) Based on ββnew=1 and ββold=1, we can show that θθnewθθold=II0000JJββold(1)(θθnew(1)θθold(1)){1+op(1)}. Thus, we update θθold by θθnew=(ααnewT,ββnewT/ββnew)T, with θθnew=(ααnewT,ββnewT)T, and θθnew=θθold+II0000JJββold(1)(PPTHH1PP)1×PPTHH1(εε+AAΓΓold). Our algorithm can be summarised as following:

  1. Obtain initial estimate θθˆinit by solving generalised linear mixed effect modelE(Yij|Xij,Zij,Dij)=g1(ααTXij+ββTZijDij+AijTγγi)var(Yij|Xij,Zij,Dij)=wij1φV(μij).

  2. Given parametric estimate θθˆold, derive estimates fˆ(ββˆoldTZij) and fˆ(ββˆoldTZij) by solving estimating Equations (Equation5) and (Equation6).

  3. Obtain θθˆnew and ΓΓˆnew using Equation (Equation14).

  4. Iterate between Steps 2 and 3 until convergence, and obtain the final estimate θθˆ, ΓΓˆ and fˆ(). We further calculate the final estimate fˆ() by Step 2.

Note that matrix GG is still unknown. To obtain estimate of GG, we apply the maximum likelihood method under the normality assumption, and implement the method by EM algorithm (see Laird & Ware, Citation1982). To be specific, we set the initial value of GG to be the identity matrix. After obtaining the estimates of θθ,ΓΓ and f() by Steps 1–4, we have an updated estimate of GG. We repeat this procedure until convergence, and obtain the final estimate GGˆ. Given the estimate GGˆ, we derive the estimates of θθ, ΓΓ and f() by substitution of GGˆ for GG in Steps 2 and 3.

During the process of implementation, bandwidth h in Step 2 also needs to be selected. Theorem 3.1 indicates that undersmoothing the nonparametric part is necessary to guarantee n consistency of parametric estimates. We adopt the ad hoc method in Carroll et al. (Citation1997) to select appropriate bandwidth for Step 2.

5. Simulation

In this section, we assess the finite sample performance of the proposed method via two Monte Carlo simulations with binary and count responses respectively.

In these simulations, ηij=ααTXij+f(ββTZij)Dij+AijTγγi,i=1,,n,j=1,,ni, where αα=(0.5,1,0.5)T,ββ=(1,1)T/2 and f(u)=3sin{π(uc1)/(c2c1)}, with c1=221.64512, and c2=22+1.64512. We assume the random effect be random intercept, that is, A1 and γγ is one-dimensional. We let γγ follow N(0,1). For the covariates, X is distributed as X=(X1,X2,X3)T, where X1,X2 and X3 follows N(0,1), Z=(Z1,Z2)T, where Z1,Z2 are uniform U(0,1) variables, and the exposure indicator D follows Bernoulli(0.5). We assume ni, the number of observations for each subject, is 5, and set the number of subjects n to be 100, 300 and 500. For the response variable, we consider the following two cases:

Case 1 (Binary data): YijBernoulli(μij),wherelogit(μij)=ηij;

Case 2 (Count data): YijPoisson(μij),wherelog(μij)=ηij.

All simulations are repeated for 500 times. For comparison, we ignore the random effects and apply generalised partially linear single-index model (GPLSIM) to the simulated data. The corresponding estimates are obtained using the method in Xu and Zhu (Citation2012). The parametric estimates for binary and count data are summarised in Tables  and  respectively. We assess the performance of nonparametric estimator fˆ() by mean integrated squared error (MISE) defined as MISE(fˆ)=E{fˆ(u)f(u)}2du.

Table 1. Simulation results for parametric estimates with binary data.

Table 2. Simulation results for parametric estimates with count data.

Tables  and  shows mean and standard deviation of MISE for nonparametric function.

Table 3. MISE for nonparametric estimate with binary data.

Table 4. MISE for nonparametric estimate with count data.

From Tables  and , MSE of the parametric estimates under our model are generally smaller than that of ignoring the random effects. As the sample size increases, the performance of parametric estimates improves. Similar results hold for nonparametric estimates. The MISE of nonparametric estimates using our method is smaller than those derived under GPLSIM. MISE decreases as the sample size increases.

6. Real data analysis

We apply the proposed method to the US National Alzhemer's Coordinating Center Uniform Data Set (https://www.alz.washington.edu). Our goal is to investigate the effect of heredity on development of Alzheimer's disease (AD) among women. We take the diagnosis of AD in each observation of patients (yes/no) as response. The covariates that may influence the occurrence of AD include age, visit year, years of education (EDUC), indicator of first-degree family member with cognitive impairment (yes/no, FAM), depression (yes/no, DEP), diabetes (yes/no, DIABETES) and mini-mental state exam (MMSE) score. To avoid large computational burden including all the observations, we randomly select 500 subjects with at least 2 follow-up visits from the original data set. Our final sample includes 2491 observations and the median follow-up is 3. The visit year is between 2005 and 2019. The proportion of occurence of AD is 28.3% among all observations, and the proportion of subjects whose family member has cognitive impairment is 64.2%. Note that we repeated the sample selection for several times, and achieved consistent results.

Taking FAM as indicator of treatment, we apply the proposed model and method to the final data set. We include logarithm of age and MMSE score into Z, and intercept, age, visit year, EDUC, DEP, DIABETES and MMSE score into X. The estimate of treatment effect reflects the influence of family heredity on development of AD. Note that in this data set, FAM is not a real treatment indicator, thus the corresponding CSTE curve does not indicate a real treatment effect. This data set is only used to for illustrative purpose.

Table  shows the parametric estimates. The standard deviations of parameters are calculated from 500 bootstrap samples. Figure  presents estimate of nonparametric function and the corresponding 95% pointwise confidence interval. We construct the confidence interval based on result of Theorem 3.2, and apply the method of Zhang and Peng (Citation2010) to estimate the bias and variance. Figure  displays the curve of treatment effect versus MMSE score with age fixed on its mean value, as well as the curve of treatment effect versus age with MMSE score fixed on the mean value.

Figure 1. Estimate of nonparametric function f() (real line) and its 95% pointwise confidence interval (dashed line).

Figure 1. Estimate of nonparametric function f(⋅) (real line) and its 95% pointwise confidence interval (dashed line).

Figure 2. Relationship of treatment effect with MMSE score and Age: the estimated curve represented by real line and the corresponding 95% pointwise confidence interval by dashed line.

Figure 2. Relationship of treatment effect with MMSE score and Age: the estimated curve represented by real line and the corresponding 95% pointwise confidence interval by dashed line.

Table 5. Estimated αα and ββ.

Chen and Zhou (Citation2011) used a generalised linear model to investigate risk factors that influence the occurrence of AD. From their results, age, DEP and MMSE score are significant under four estimation methods. They found that age and DEP have positive associations with the occurence of AD, and MMSE score is negatively correlated with the development of AD. From Table , our result of parametric estimates is consistent with these findings. From Figure , we can clearly see that if the estimated index of a patient is between 0.03 and 0.57, family inheritance of congnitive impairment increases the risk of getting AD. However, if the estimated index is between −0.39 and −0.11, family heredity decreases the risk of AD. Figure  shows that the effect of family heredity on occurence of AD has a bell shape association with MMSE score, that is, heredity has lower or even negative influence on risk of AD when the value of MMSE score is small or large. Also, the effect has an increasing trend with age. The effect of family heredity on occurrence of AD is stronger for elderly people.

7. Discussion

This paper focuses on a generalised partially linear single-index mixed effects model for personalised treatment effect estimation and treatment selection in longitudinal studies. In our model, the treatment effect is described as a function of a linear combination of covariates. We develop a method combining local linear regression and penalised quasi-likelihood to estimate the coefficients for each covariate, the treatment effect curve and the parameters for mixed effects. Based on the pointwise confidence intervals for treatment effect curve, we can make individualised treatment decisions from the information of patient characteristics. Our simulation study and real data analysis illustrate effectiveness of the proposed method.

Nonparametric and semiparametric methods provide a flexible way to explore HTE. The previous research in this area mostly focus on cross-sectional data. Our work fills in the gap of semiparametric modelling of HTE with longitudinal data. On the other hand, we develop a new estimation method combining local linear technique and penalised quasi-likelihood, for generalised partially linear single-index model. Pointwise confidence interval can be directly constructed for the estimated treatment effect curve based on its asymptotic normality. The theory of simultaneous confidence band for treatment effect curve can also be established accordingly, and we leave this for future work.

There are still some limitations in our work. We directly apply the method of Zhang and Peng (Citation2010) to estimate bias and variance for the treatment effect curve. It would be more rigorous that the performance of this method in our context be validated via simulation studies. We will include this in our future research. Another limitation of our work is that, based on the pointwise confidence interval of treatment effect curve, we could only make treatment decision for a future patient. To identify subgroup of patients that benefit from each treatment, it is necessary to construct simultaneous confidence band for treatment effect curve. Some possible extensions of our work could also be considered in future research. Our model could be extended to high-dimensional covariates to cope with longitudinal studies in which large number of patient characteristics are recorded. It is also of interest to consider robust regression to limit the impact of outlying observations. An even further extension is a survival model for time-to-event response.

Acknowledgments

We are grateful to the Editor-in-Chief, Prof. Jun Shao, an Associate Editor and anonymous reviewers for their thorough reading of our manuscript and insightful comments that have led to significant improvement of this work. We also thank the US National Alzheimer's Coordinating Center for providing the data.

Disclosure statement

No potential conflict of interest was reported by the author(s).

Additional information

Funding

This work was supported by Natural Science Foundation of Shanghai (17ZR1409000). The NACC database is funded by NIA/NIH Grant U01 AG016976. NACC data are contributed by the NIA-funded ADCs: P30 AG019610 (PI Eric Reiman, MD), P30 AG013846 (PI Neil Kowall, MD), P30 AG062428-01 (PI James Leverenz, MD) P50 AG008702 (PI Scott Small, MD), P50 AG025688 (PI Allan Levey, MD, PhD), P50 AG047266 (PI Todd Golde, MD, PhD), P30 AG010133 (PI Andrew Saykin, PsyD), P50 AG005146 (PI Marilyn Albert, PhD), P30 AG062421-01 (PI Bradley Hyman, MD, PhD), P30 AG062422-01 (PI Ronald Petersen, MD, PhD), P50 AG005138 (PI Mary Sano, PhD), P30 AG008051 (PI Thomas Wisniewski, MD), P30 AG013854 (PI Robert Vassar, PhD), P30 AG008017 (PI Jeffrey Kaye, MD), P30 AG010161 (PI David Bennett, MD), P50 AG047366 (PI Victor Henderson, MD, MS), P30 AG010129 (PI Charles DeCarli, MD), P50 AG016573 (PI Frank LaFerla, PhD), P30 AG062429-01 (PI James Brewer, MD, PhD), P50 AG023501 (PI Bruce Miller, MD), P30 AG035982 (PI Russell Swerdlow, MD), P30 AG028383 (PI Linda Van Eldik, PhD), P30 AG053760 (PI Henry Paulson, MD, PhD), P30 AG010124 (PI John Trojanowski, MD, PhD), P50 AG005133 (PI Oscar Lopez, MD), P50 AG005142 (PI Helena Chui, MD), P30 AG012300 (PI Roger Rosenberg, MD), P30 AG049638 (PI Suzanne Craft, PhD), P50 AG005136 (PI Thomas Grabowski, MD), P30 AG062715-01 (PI Sanjay Asthana, MD, FRCP), P50 AG005681 (PI John Morris, MD), P50 AG047270 (PI Stephen Strittmatter, MD, PhD).

Notes on contributors

Yanghui Liu

Yanghui Liu is a Ph.D. candidate in Statistics at East China Normal University.

Riquan Zhang

Riquan Zhang is a Professor in School of Statistics at East China Normal University.

Shujie Ma

Shujie Ma is an Associate Professor in Department of Statistics at University of California, Riverside.

Xiuzhen Zhang

Xiuzhen Zhang is a Ph.D. candidate in Statistics at East China Normal University.

References

Appendix

Proof of Theorem 3.1.

Proof of Theorem 3.1

We use a similar proof strategy with Xu and Zhu (Citation2012). Our proof is divided to three steps.

Step 1: Given (αα0,ββ0), we derive the asymptotic expansion of fˆ(u)f(u).

Let cn=(Nh)1/2,δn=cn2log1/2(1/h), Λ=cn1{af(u)}cn1h{bf(u)},FFij=Dijh1(ββTZiju)Dij,Λˆ=cn1{fˆ(u)f(u)}cn1h{fˆ(u)f(u)}. Denote η¯ij=ααTXij+{f(u)+f(u)(ββTZiju)}Dij+AijTγγˆi. By (Equation4), Λˆ minimises L(Λ)=hi=1nj=1niKh(ββTZiju)×d{Yij,g1(η¯ij+cnFFijTΛ)}d{Yij,g1(η¯ij)} with respect to Λ. Using Taylor expansion, 12L(Λ)=cnhi=1nj=1niKh(ββTZiju)q1(η¯ij,Yij)FFijTΛ+cn2hΛTi=1nj=1niKh(ββTZiju)q2(η¯ij,Yij)FFijFFijT×Λ{1+op(1)}TTnTΛ+ΛTWWnΛ{1+op(1)}. Let η~ij=ααTXij+{f(u)+f(u)(ββTZiju)}Dij+AijTγγi. Then η¯ijη~ij=AijT(γγˆiγγi). Using similar arguments as Section A.1 in Liang (Citation2009), WWn=hcn2i=1nj=1niKh(ββTZiju)q2(η~ij,Yij)FFijFFijT+o(h),TTn=hcni=1nj=1niKh(ββTZiju)q1(η~ij,Yij)FFij+o(cn1h2).

It is easy to see that q2(t,y)={yg1(t)}ρ1(t)ρ2(t). Denote the first element of TTn by Tn[1]. Using Taylor expansion, Tn[1]=hcni=1nj=1niKh(ββTZiju)q1(ηij)Dijhcni=1nj=1niKh(ββTZiju)q2(ηij)×(ηijη~ij)Dij{1+op(1)}=hcni=1nj=1niKh(ββTZiju)q1(ηij)Dij12hcnf(u)i=1nj=1niKh(ββTZiju)q2(ηij)×(ββTZiju)2Dij2{1+op(1)}=hcni=1nj=1niKh(ββTZiju)q1(ηij,Yij)Dij+12cn1×f(u)h2κ2fU(u)E{ρ2(η)D2|ββTZ=u}+Op(δn). Using similar arguments, we also have WWn=fU(u)E{ρ2(η)Ξ|ββTZ=u}+Op(h2+δn), where Ξ=D2diag(1,κ2). By the concavity of function L(Λ), we obtain that Λˆ=WWn1TTn+op(1). Therefore, fˆ(u)f(u)=12f(u)h2κ2+fU(u)E{ρ2(η)D2|βTZ=u}1×N1i=1nj=1niKh(ββTZiju)q1(ηij,Yij)Dij+Op(δn).

Step 2: Derive expression of fˆ(ββTz;αα,ββ)αα(αα0,ββ0(1))andfˆ(ββTz;αα,ββ)ββ(1)(αα0,ββ0(1)).

From Step 1, fˆ(ββTz), fˆ(ββTz) satisfy the following equation: (A1) N1i=1nj=1niKh(ββTZijββTz)q1×ααTXij+{aˆ+bˆ(ββTZijββTz)}Dij+AijTγγi,Yij×Dij{1+op(1)}=0,(A1) where aˆ=fˆ(ββTz) and bˆ=fˆ(ββTz). Taking derivative with respect to αα yields: N1i=1nj=1niKh(ββTZijββTz)q2×ααTXij+{aˆ+bˆ(ββTZijββTz)}Dij+AijTγγi,YijDij×Xij+aˆαα+bˆαα(ββTZijββTz)Dij=0. From this equation we obtain aˆαα=An1(B1n+B2n), with An=N1i=1nj=1niKh(ββTZijββTz)q2×ααTXij+{aˆ+bˆ(ββTZijββTz)}Dij+AijTγγi,YijDij2,B1n=N1i=1nj=1niKh(ββTZijββTz)q2×ααTXij+{aˆ+bˆ(ββTZijββTz)}Dij+AijTγγi,Yij×DijXij,B2n=N1i=1nj=1niKh(ββTZijββTz)q2×ααTXij+{aˆ+bˆ(ββTZijββTz)}Dij+AijTγγi,YijDij2×(ββTZijββTz)bˆαα. Taking derivative on (EquationA1) with respect to ββ(1), we get N1i=1nj=1nih1Kh(ββTZijββTz)JJT(Zijz)q1×ααTXij+{aˆ+bˆ(ββTZijββTz)}Dij+AijTγγi,YijDij+i=1nj=1niKh(ββTZijββTz)q2×ααTXij+{aˆ+bˆ(ββTZijββTz)}+AijTγγi,YijDij2×aˆββ(1)+bˆββ(1)(ββTZijββTz)+bˆJJT(Zijz)=0. It follows that aˆββ(1)=An1(C1n+C2n+C3n), where C1n=N1i=1nj=1niKh(ββTZijββTz)q2×ααTXij+{aˆ+bˆ(ββTZijββTz)}Dij+AijTγγi,Yij×Dij2bˆJJT(Zijz),C2n=N1i=1nj=1niKh(ββTZijββTz)q2×ααTXij+{aˆ+bˆ(ββTZijββTz)}Dij+AijTγγi,Yij×Dij2,(ββTZijββTz)bˆββ(1),C3n=N1i=1nj=1nih1Kh(ββTZijββTz)JJT(Zijz)q1×ααTZij+{aˆ+bˆ(ββTZijββTz)}Dij+AijTγγi,Yij×Dij.

By similar calculations with Cui et al. (Citation2011), An=fU(ββTz)Eρ2(η)D2|ββTZ=ββTz+Op(h2+n/2h1/2),B1n=fU(ββTz)Eρ2(η)DX|ββTZ=ββTz+Op(h2+n1/2h1/2),B2n=Op(h2+n1/2h1/2),C1n=fU(ββTz)f(ββTz)JJTE{ρ2(η)D2Z|ββTZ=ββTz}E{ρ2(η)D2|ββTZ=ββTz}z+Op(h2+n1/2h1/2),C2n=Op(h2+n1/2h1/2),C3n=Op(h2+n1/2h3/2). Therefore, aˆαα=E{ρ2(η)D2|ββTZ=ββTz}1×E{ρ2(η)DX|ββTZ=ββTz}+O(h2+n1/2h1/2),aˆββ(1)=f(ββTz)JJTzE{ρ2(η)D2|ββTZ=ββTz}1×E{ρ2(η)D2Z|ββTZ=ββTz}+O(h2+n1/2h1/2).

We further denote X~=XE{ρ2(η)D2|ββTZ}1E{ρ2(η)DX|ββTZ},Z~=f(ββTZ)JJTZE{ρ2(η)D2|ββTZ}1×E{ρ2(η)D2Z|ββTZ}E{ρ2(η)D2|ββTZ}1. It is easy to see that (A2) E{ρ2(η)DX~|ββTZ}=0,E{ρ2(η)D2Z~|ββTZ}=0.(A2)

Step 3: The asymptotic normality of αα and ββ.

From estimating Equations (Equation8) and (Equation9), Nααˆαα0ββˆ(1)ββ0(1)=An1NBn+op(1), where An=1Ni=1nj=1niρ2(ηij)Xijf(ββ0TZij)JJTZijDijX~ijZ~ijDijT,Bn=1Ni=1nj=1niXijf(ββ0TZij)JJTZijDij{g(μij)}1V1(μij)×Yijg1{αα0TXij+fˆ(ββ0TZij)Dij+AijTγγi.

Based on (EquationA2), we have An=Eρ2(η)X~Z~D2+op(1)A+op(1). By Taylor expansion, Bn=B1n+B2n{1+op(1)}, where B1n=1Ni=1nj=1niXijf(ββ0TZij)JJTZijDijq1(ηij,Yij),B2n=1Ni=1nj=1niXijf(ββ0TZij)JJTZijDijρ2(ηij){fˆ(ββ0TZij)f(ββ0TZij)}Dij. From the result of Step 1, NB2n=1Ni=1nj=1niXijf(ββ0TZij)JJTZijDijρ2(ηij)Dij×fU(ββ0TZij)E{ρ2(η)D2|ββ0TZ=ββ0TZij}1×1Nk=1nl=1niKh(ββ0TZklββ0TZij)q1(ηkl,Ykl)Dkl12h2κ21Ni=1nj=1niXijf(ββ0TZij)JJTZijDij×ρ2(ηij)Dijf(ββTZij)=1Nk=1nl=1niq1(ηkl,Ykl)Dkl×i=1nj=1niKh(ββ0TZijββ0TZkl)×Xijf(ββ0TZij)JJTZijDijρ2(ηij)Dij×fU(ββ0TZij)E{ρ2(η)D2|ββ0TZ=ββ0TZij}1+Op(n1/2h2)=1Ni=1nj=1niE{ρ2(η)D2|ββ0TZ=ββ0TZij}1×Eρ2(η)DX|ββ0TZ=ββ0TZijf(ββ0TZij)JJTE{ρ2(η)D2Z|ββ0TZ=ββ0TZij}×Dijq1(ηij,Yij)+op(1). It is easy to see that NBn=1Ni=1nj=1niX~ijZ~ijDijq1(ηij,Yij). By CLT, NBn has asymptotic normal distribution with zero-mean. Denote sij=X~ijZ~ijDij, we have var(NBn)=1Ni=1nj=1nivar{sijq1(ηij,Yij)}+1Ni=1nj=1nik=1kjnicov{sijq1(ηij,Yij),sikq1(ηik,Yik)}. Using simple calculation, var{sijq1(ηij,Yij)}=E{ρ2(η)ssT}=A. Denote C=cov{s11q1(η11,Y11),s12q1(η12,Y12)}, var(NBn)=A+N1NC.Therefore, when n, Nααˆα0ββˆ(1)ββ0(1)N(0,A1+N1NA1CA1).

Denote D=IIp00p×(q1)00q×pJJββ0(1). An application of multivariate delta-method yields Nααˆα0ββˆββ0N(0,ΣΣ), where ΣΣ=D(A1+N1NA1CA1)DT.

Proof of Theorem 3.2

Proof of Theorem 3.2

From Theorem 3.1, Nααˆαα0ββˆ(1)ββ0(1)=Op(1). We write fˆ(ββˆTz;ααˆ,ββˆ)f(ββ0Tz)=fˆ(ββˆTz;ααˆ,ββˆ)fˆ(ββ0Tz)+fˆ(ββ0Tz)f(ββ0Tz)=fˆ(ββTz;αα,ββ)(ααT,ββ(1)T)|(αα0,ββ0(1))ααˆαα0ββˆ(1)ββ0(1){1+op(1)}+fˆ(ββ0Tz)f(ββ0Tz)=fˆ(ββ0Tz)f(ββ0Tz)+Op(n1/2). Hence the asymptotic distribution of fˆ(ββˆTz;ααˆ,ββˆ) is the same as that of fˆ(ββ0Tz). Based on the result of Step 1, we now derive the asymptotic variance of fˆ(ββ0Tz). Denote rij=Kh(ββ0TZijββ0Tz)q1(ηij,Yij)Dij, we have vari=1nj=1nirij=i=1nj=1nivar(rij)+i=1nj=1nik=1kjnicov(rij,rik), where var(rij)=EKh2(ββ0TZijββ0Tz)q12(ηij,Yij)Dij2=h1μ0fU(ββ0Tz)E{ρ2(η)D2|ββ0TZ=ββ0Tz},cov(rij,rik)=fU2(ββ0Tz)Eq1(η11,Y11)q1(η12,Y12)D112D122|ββ0TZ11=ββ0Tz,ββ0TZ12=ββ0Tz. Therefore, we have Nh{fˆ(ββˆTz;ααˆ,ββˆ)f(ββ0Tz)12h2κ2f(ββ0Tz)}N(0,σf2), where σf2=μ0fU1(ββ0Tz)[E{ρ2(η)D2|ββ0TZ=ββ0Tz}]1.

Reprints and Corporate Permissions

Please note: Selecting permissions does not provide access to the full text of the article, please see our help page How do I view content?

To request a reprint or corporate permissions for this article, please click on the relevant link below:

Academic Permissions

Please note: Selecting permissions does not provide access to the full text of the article, please see our help page How do I view content?

Obtain permissions instantly via Rightslink by clicking on the button below:

If you are unable to obtain permissions via Rightslink, please complete and submit this Permissions form. For more information, please visit our Permissions help page.