
Statistical Theory and Related Fields Volume 5, 2021 - Issue 3


Personalized treatment selection via the covariate-specific treatment effect curve for longitudinal data

Yanghui Liu (a), Riquan Zhang (a, corresponding author), Shujie Ma (b), Xiuzhen Zhang (a)

(a) School of Statistics, East China Normal University, Shanghai, People's Republic of China
(b) Department of Statistics, University of California, Riverside, CA, USA

Pages 253-264 | Received 13 Nov 2019, Accepted 25 Apr 2020, Published online: 14 May 2020

https://doi.org/10.1080/24754269.2020.1762059


Abstract

Treatment selection based on patient characteristics has been widely recognised in modern medicine. In this paper, we propose a generalised partially linear single-index mixed-effects modelling strategy for treatment selection and heterogeneous treatment effect estimation in longitudinal clinical and observational studies. We model the treatment effect as an unknown functional curve of a weighted linear combination of time-dependent covariates. This method enables us to investigate covariate-specific treatment effects and make personalised treatment selection in a flexible fashion. We develop a method that combines local linear regression and penalised quasi-likelihood to estimate the weight for each covariate, the unknown treatment effect curve and the parameters for mixed-effects. Based on pointwise confidence intervals for the treatment effect curve, we can make individualised treatment decisions from the information of patient characteristics. A simulation study is conducted to evaluate finite sample performance of the proposed method. We also illustrate the method via analysis of a real data example.

Keywords: personalized medicine; treatment selection; semiparametric model; longitudinal data

1. Introduction

Personalized or precision medicine, which aims to provide treatment strategies according to the characteristics of individuals or subgroups of the population, has gained much attention from biomedical researchers. The main goal of personalised medicine is to investigate covariate-specific (heterogeneous) effects of treatment, on the basis of which better individualised clinical decisions can be made for patients. In recent years, there has been an increasing amount of literature on heterogeneous treatment effect (HTE) estimation. A typical way of exploring heterogeneous treatment effects is to examine patient outcomes in mutually exclusive subgroups defined by observable patient characteristics; see Berger et al. (2014), Ciampi et al. (1995), Negassa et al. (2005), Foster et al. (2011), Su et al. (2008) and Wang et al. (2007). Bonetti and Gelber (2004) introduced a subpopulation treatment effect pattern plot (STEPP), which characterises treatment effects across potentially overlapping intervals of a continuous covariate. Bonetti et al. (2009) indicated that this method is effective only for large sample sizes, and proposed a permutation-based method for inference that achieved better performance for smaller sample sizes. The major limitation of such subgroup approaches is that dichotomisation of continuous covariates can be artificial, and thus may lose important information from the data.

Other works in the literature apply nonparametric and semiparametric modelling methods to study HTE. These methods impose no or very relaxed assumptions on the model structure, and thus can explore HTE in a flexible fashion. For continuous responses, Foster et al. (2015) proposed a two-stage procedure. In stage 1 they obtain nonparametric estimates of the treatment effect for each subject; in stage 2 these estimates are used to identify the optimal subset for a treatment and thereby determine a treatment regime. However, this method only applies to situations in which the optimal subset is contiguous. For time-to-event data, Ma and Zhou (2014) defined a covariate-specific treatment effect (CSTE) curve, which is used to represent the clinical utility of a continuous biomarker. They derived an estimate of the CSTE curve, and constructed a pointwise confidence interval to select the optimal treatment for an individual patient, as well as a simultaneous confidence band to identify the subpopulation that responds well to a treatment. Han et al. (2017) extended the method of Ma and Zhou (2014) to the case of a binary response. Both works considered only a single biomarker. In order to incorporate multivariate or even high-dimensional covariates, Guo et al. (2018) proposed a sparse logistic single-index coefficient model for optimal treatment selection using the CSTE curve. This method offers a flexible way of studying the CSTE curve without a restrictive assumption on the structure of the curve, while achieving substantial dimension reduction of the high-dimensional covariates.

All the above methods were proposed for cross-sectional data with responses measured at one time point. In practice, patient outcomes are often collected at multiple follow-up times in order to better evaluate the effectiveness of treatments. In this paper, we extend the model considered in Guo et al. (2018) to longitudinal clinical and observational studies, and consider a generalised partially linear single-index coefficient mixed-effects model (GPLSIMM) for our longitudinal setting. Similar to Guo et al. (2018), we treat the treatment effect as an unknown functional curve of a weighted linear combination of time-varying covariates. The weights for the covariates account for their different contributions to the treatment effect, and they are estimated from the data. In longitudinal studies, the repeated measures are correlated within subjects, and thus the estimation method considered in Guo et al. (2018) is not applicable to our proposed GPLSIMM. In our model, we need to estimate the unknown functional curve, the weight of each covariate and the parameters for the mixed effects. Estimation of generalised semiparametric mixed-effects models has been considered in some existing works. Liang (2009) proposed a local linear regression and penalised quasi-likelihood method for estimation of a generalised partially linear mixed-effects model. Pang and Xue (2012) considered local linear regression with GEE for a single-index mixed-effects model. Xu and Zhu (2012) and Chen et al. (2014) developed a kernel and a P-spline estimation method, respectively, together with quasi-likelihood for longitudinal generalised single-index models.

Based on the methods considered in these works, we see that penalised quasi-likelihood is a commonly used method for estimation of generalised mixed-effects models. It circumvents the calculation of the high-dimensional integral in the likelihood function (Breslow & Clayton, 1993). In our proposed GPLSIMM, we approximate the unknown treatment effect curve by the local linear method and estimate the parameters for the parametric and nonparametric parts by alternately optimising the local and global penalised quasi-likelihood functions. We then select the optimal treatment for a future patient based on pointwise confidence intervals for the CSTE curve.

The rest of the paper is organised as follows. In Section 2, we introduce the proposed model, the CSTE curve and the estimation method. Section 3 gives asymptotic properties of the parametric and nonparametric estimates. Section 4 provides the algorithm for model estimation. In Section 5 we evaluate the finite sample properties of the proposed method via simulation studies, while Section 6 illustrates the application of the proposed method to a real data set. All technical proofs are relegated to the Appendix.

2. Model and estimation

2.1. Model

Suppose our data are obtained from n independent subjects, and the observations of the ith subject are $\{(Y_{ij}, X_{ij}, Z_{ij}, A_{ij}, D_{ij}), j = 1, \dots, n_{i}\}$. Here $Y_{ij}$ is the response, $X_{ij}$, $Z_{ij}$ and $A_{ij}$ are covariates of dimension p, q and s, respectively, and $D_{ij} \in \{0, 1\}$ is an indicator of exposure to treatment. We assume the relationship between response and covariates is specified by the following GPLSIMM: $\begin{aligned} E(Y_{ij} \mid X_{ij}, Z_{ij}, A_{ij}, D_{ij}, \boldsymbol{\gamma}_{i}) &= g^{-1}\{\boldsymbol{\alpha}^{T} X_{ij} + f(\boldsymbol{\beta}^{T} Z_{ij}) D_{ij} + A_{ij}^{T} \boldsymbol{\gamma}_{i}\}, \\ \mathrm{var}(Y_{ij} \mid X_{ij}, Z_{ij}, A_{ij}, D_{ij}, \boldsymbol{\gamma}_{i}) &= w_{ij}^{-1} \varphi V(\mu_{ij}), \quad i = 1, \dots, n, \ j = 1, \dots, n_{i}, \end{aligned}$ (1) where $\mu_{ij} = E(Y_{ij} \mid X_{ij}, Z_{ij}, A_{ij}, D_{ij}, \boldsymbol{\gamma}_{i})$, $g(\cdot)$ and $V(\cdot)$ are known functions, $\varphi$ is a scale parameter and $w_{ij}$ is a pre-determined weight for the jth observation of the ith subject. $f(\cdot)$ is an unknown function, and $\boldsymbol{\alpha}$ and $\boldsymbol{\beta}$ are unknown parameter vectors. For model identifiability, we assume that $\|\boldsymbol{\beta}\| = 1$ and $\beta_{1} > 0$. We incorporate the random effect $A_{ij}^{T} \boldsymbol{\gamma}_{i}$ to account for the within-subject correlation, where $\boldsymbol{\gamma}_{i}, i = 1, \dots, n$, are $s$-dimensional random effect vectors. We assume that $\boldsymbol{\gamma}_{i} \sim N(\boldsymbol{0}, \boldsymbol{G})$.

In this model, we characterise the treatment effect with a single-index term $f(\boldsymbol{\beta}^{T} Z_{ij})$, and the relationship between response and covariates for the control group with a generalised linear mixed-effects model. When the response Y is binary, we can easily see that $f(\boldsymbol{\beta}^{T} Z) = \mathrm{logit}\{E(Y \mid X, Z, A, \boldsymbol{\gamma}, D = 1)\} - \mathrm{logit}\{E(Y \mid X, Z, A, \boldsymbol{\gamma}, D = 0)\}.$ (2) When Y is a count variable following a Poisson distribution, $f(\boldsymbol{\beta}^{T} Z) = \log\{E(Y \mid X, Z, A, \boldsymbol{\gamma}, D = 1)\} - \log\{E(Y \mid X, Z, A, \boldsymbol{\gamma}, D = 0)\}.$ (3) Assume that D = 0 and D = 1 represent standard care and treatment, respectively, X and Z are patient characteristics, Y indicates the outcome of the study, and a better outcome corresponds to a larger value of Y. From Equations (2) and (3), $f(\boldsymbol{\beta}^{T} Z)$ can be regarded as a measure of treatment effect. We define $f(\cdot)$ as the covariate-specific treatment effect (CSTE) curve under this model.

Given the estimate and confidence interval of $f(\cdot)$, we can suggest the optimal treatment for a future patient based on his or her personal characteristics. Taking a binary response as an example, we assume that Y = 1 represents a disease being cured, while Y = 0 indicates that it is not cured. Let $u_{0}$ be the estimated value of $\boldsymbol{\beta}^{T} Z$ calculated from the personal characteristics of a patient. If the lower bound of the confidence interval for $f(u_{0})$ is greater than 0, the treatment is more effective than standard care for the patient. On the other hand, if the upper bound of the confidence interval is smaller than 0, standard care is more effective, and in this case we would not recommend the treatment to the patient. If neither of the above cases occurs, that is, 0 is contained in the confidence interval, we conclude that there is no significant difference between treatment and standard care.
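The decision rule just described can be sketched in a few lines. This is only an illustration: the function name `recommend`, its arguments, and the use of a normal-approximation (Wald-type) interval built from a point estimate and a standard error are assumptions made here, not part of the paper.

```python
from statistics import NormalDist


def recommend(f_hat, se, level=0.95):
    """Hypothetical helper: classify a patient from a pointwise CI for f(u0).

    f_hat : estimate of the CSTE curve at the patient's index value u0
    se    : standard error of that estimate (assumed to be supplied)
    level : nominal coverage of the normal-approximation interval
    """
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    lower, upper = f_hat - z * se, f_hat + z * se
    if lower > 0:
        return "treatment"
    if upper < 0:
        return "standard care"
    return "no significant difference"


print(recommend(1.0, 0.2))   # interval lies above 0
print(recommend(-1.0, 0.2))  # interval lies below 0
print(recommend(0.1, 0.2))   # interval contains 0
```

The three calls cover the three cases in the text: an interval entirely above zero, entirely below zero, and one that contains zero.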

2.2. Estimation

Denote $\boldsymbol{\Gamma} = (\boldsymbol{\gamma}_{1}^{T}, \dots, \boldsymbol{\gamma}_{n}^{T})^{T}$. Let $\boldsymbol{\beta}^{(1)} = (\beta_{2}, \dots, \beta_{q})^{T}$ be the vector obtained by deleting the first element from $\boldsymbol{\beta}$. Then $\boldsymbol{\beta} = (\sqrt{1 - \|\boldsymbol{\beta}^{(1)}\|^{2}}, \boldsymbol{\beta}^{(1)T})^{T}$, and we define the Jacobian matrix $\boldsymbol{J} = \boldsymbol{J}_{\boldsymbol{\beta}^{(1)}} = \frac{\partial \boldsymbol{\beta}}{\partial \boldsymbol{\beta}^{(1)T}} = \begin{pmatrix} - \boldsymbol{\beta}^{(1)T} / \sqrt{1 - \|\boldsymbol{\beta}^{(1)}\|^{2}} \\ \boldsymbol{I}_{q-1} \end{pmatrix}.$ Our primary interest is to estimate $f(\cdot)$, $\boldsymbol{\alpha}$, $\boldsymbol{\beta}$ and $\boldsymbol{G}$.
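As a quick numerical illustration of this reparameterisation, the sketch below rebuilds the unit-norm index vector from its free components and compares the closed-form Jacobian with a central finite-difference approximation. The helper names `beta_from_free` and `jacobian` and the example values are hypothetical.

```python
import numpy as np


def beta_from_free(beta1):
    """Rebuild the unit-norm beta from its free part beta^(1)."""
    beta1 = np.asarray(beta1, dtype=float)
    first = np.sqrt(1.0 - beta1 @ beta1)
    return np.concatenate(([first], beta1))


def jacobian(beta1):
    """Closed-form q x (q-1) Jacobian d beta / d beta^(1)T."""
    beta1 = np.asarray(beta1, dtype=float)
    first = np.sqrt(1.0 - beta1 @ beta1)
    return np.vstack([-beta1 / first, np.eye(beta1.size)])


beta1 = np.array([0.3, -0.4])          # example free components (q = 3)
beta = beta_from_free(beta1)
J = jacobian(beta1)

# Central finite differences, one column per free component.
eps = 1e-6
J_num = np.column_stack([
    (beta_from_free(beta1 + eps * e) - beta_from_free(beta1 - eps * e)) / (2 * eps)
    for e in np.eye(beta1.size)
])

print(np.linalg.norm(beta))            # unit norm by construction
print(np.max(np.abs(J - J_num)))       # closed form vs. finite differences
```

Because $\|\boldsymbol{\beta}\| = 1$ holds for every value of the free part, the columns of the Jacobian are orthogonal to $\boldsymbol{\beta}$, which the code can also confirm via `J.T @ beta`.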

Suppose that $\boldsymbol{G}$ is known. Given $\boldsymbol{\alpha}$ and $\boldsymbol{\beta}$, we combine penalised quasi-likelihood with the local linear technique to obtain estimates of $f(\cdot)$ and $f'(\cdot)$. If u is in a neighbourhood of $\boldsymbol{\beta}^{T} Z_{ij}$, $g(\mu_{ij})$ can be approximated by $g(\tilde{\mu}_{ij}) = \boldsymbol{\alpha}^{T} X_{ij} + \{f(u) + f'(u)(\boldsymbol{\beta}^{T} Z_{ij} - u)\} D_{ij} + A_{ij}^{T} \boldsymbol{\gamma}_{i}.$ Let $d(y, \mu) = -2 \int_{y}^{\mu} \{(y - u) / V(u)\} \, du$ and $K_{h}(\cdot) = h^{-1} K(\cdot / h)$, where h is the bandwidth and $K(\cdot)$ is a symmetric density function with mean zero. We maximise the following local penalised quasi-likelihood $- \frac{1}{2} \sum_{i=1}^{n} \Big\{ \sum_{j=1}^{n_{i}} K_{h}(\boldsymbol{\beta}^{T} Z_{ij} - u) w_{ij} \varphi^{-1} d\big(Y_{ij}, g^{-1}[\boldsymbol{\alpha}^{T} X_{ij} + \{a + b(\boldsymbol{\beta}^{T} Z_{ij} - u)\} D_{ij} + A_{ij}^{T} \boldsymbol{\gamma}_{i}]\big) + \boldsymbol{\gamma}_{i}^{T} \boldsymbol{G}^{-1} \boldsymbol{\gamma}_{i} \Big\}$ (4) with respect to a, b and $\boldsymbol{\Gamma}$, and obtain the estimates $\hat{f}(u) = \hat{a}$ and $\hat{f}'(u) = \hat{b}$.
Differentiating (4) with respect to $(a, b)$ and $\boldsymbol{\Gamma}$ yields the following estimating equations: $\begin{aligned} \sum_{i=1}^{n} \sum_{j=1}^{n_{i}} & K_{h}(\boldsymbol{\beta}^{T} Z_{ij} - u) \{g'(\tilde{\mu}_{ij})\}^{-1} w_{ij} \varphi^{-1} V^{-1}(\tilde{\mu}_{ij}) \\ & \times \big(Y_{ij} - g^{-1}[\boldsymbol{\alpha}^{T} X_{ij} + \{a + b(\boldsymbol{\beta}^{T} Z_{ij} - u)\} D_{ij} + A_{ij}^{T} \boldsymbol{\gamma}_{i}]\big) D_{ij} \begin{pmatrix} 1 \\ \boldsymbol{\beta}^{T} Z_{ij} - u \end{pmatrix} = \boldsymbol{0}, \end{aligned}$ (5) and for each i, $\begin{aligned} \sum_{j=1}^{n_{i}} & K_{h}(\boldsymbol{\beta}^{T} Z_{ij} - u) A_{ij} \{g'(\tilde{\mu}_{ij})\}^{-1} w_{ij} \varphi^{-1} V^{-1}(\tilde{\mu}_{ij}) \\ & \times \big(Y_{ij} - g^{-1}[\boldsymbol{\alpha}^{T} X_{ij} + \{a + b(\boldsymbol{\beta}^{T} Z_{ij} - u)\} D_{ij} + A_{ij}^{T} \boldsymbol{\gamma}_{i}]\big) - \boldsymbol{G}^{-1} \boldsymbol{\gamma}_{i} = \boldsymbol{0}. \end{aligned}$ (6)
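To make the quasi-deviance $d(y, \mu)$ concrete, the following sketch evaluates its defining integral numerically and compares it with the familiar closed form for the Poisson variance function $V(u) = u$, namely $2\{y \log(y/\mu) - (y - \mu)\}$. The function names, the trapezoidal rule and the example values are illustrative choices made here; the check assumes $y > 0$.

```python
import numpy as np


def quasi_deviance(y, mu, V, n_grid=20001):
    """d(y, mu) = -2 * integral from y to mu of (y - u) / V(u) du (trapezoid rule)."""
    u = np.linspace(y, mu, n_grid)
    integrand = (y - u) / V(u)
    integral = np.sum(0.5 * (integrand[1:] + integrand[:-1]) * np.diff(u))
    return -2.0 * integral


def poisson_deviance(y, mu):
    """Closed form of d(y, mu) when V(u) = u and y > 0."""
    return 2.0 * (y * np.log(y / mu) - (y - mu))


y, mu = 3.0, 2.0
numeric = quasi_deviance(y, mu, V=lambda u: u)
closed = poisson_deviance(y, mu)
print(numeric, closed)  # the two values should agree closely
```

The deviance is non-negative and equals zero when $\mu = y$, so maximising $-\frac{1}{2} d$ pulls the fitted mean towards the observation.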

When $f(\cdot)$ is known, we obtain estimates of $\boldsymbol{\alpha}$, $\boldsymbol{\beta}$ and $\boldsymbol{\Gamma}$ by maximising the global penalised quasi-likelihood $- \frac{1}{2} \sum_{i=1}^{n} \Big( \sum_{j=1}^{n_{i}} w_{ij} \varphi^{-1} d\big[Y_{ij}, g^{-1}\{\boldsymbol{\alpha}^{T} X_{ij} + f(\boldsymbol{\beta}^{T} Z_{ij}) D_{ij} + A_{ij}^{T} \boldsymbol{\gamma}_{i}\}\big] + \boldsymbol{\gamma}_{i}^{T} \boldsymbol{G}^{-1} \boldsymbol{\gamma}_{i} \Big),$ (7) with respect to $\boldsymbol{\alpha}$, $\boldsymbol{\beta}^{(1)}$ and $\boldsymbol{\Gamma}$. The corresponding estimating equations are as follows: $\begin{aligned} \boldsymbol{S}_{01}(\boldsymbol{\alpha}, \boldsymbol{\beta}^{(1)}, \boldsymbol{\Gamma}) = {} & \sum_{i=1}^{n} \sum_{j=1}^{n_{i}} X_{ij} \{g'(\mu_{ij})\}^{-1} w_{ij} \varphi^{-1} V^{-1}(\mu_{ij}) \\ & \times \big[Y_{ij} - g^{-1}\{\boldsymbol{\alpha}^{T} X_{ij} + f(\boldsymbol{\beta}^{T} Z_{ij}) D_{ij} + A_{ij}^{T} \boldsymbol{\gamma}_{i}\}\big] = \boldsymbol{0}, \end{aligned}$ (8) $\begin{aligned} \boldsymbol{S}_{02}(\boldsymbol{\alpha}, \boldsymbol{\beta}^{(1)}, \boldsymbol{\Gamma}) = {} & \sum_{i=1}^{n} \sum_{j=1}^{n_{i}} \boldsymbol{J}_{\boldsymbol{\beta}^{(1)}}^{T} Z_{ij} D_{ij} f'(\boldsymbol{\beta}^{T} Z_{ij}) \{g'(\mu_{ij})\}^{-1} w_{ij} \varphi^{-1} V^{-1}(\mu_{ij}) \\ & \times \big[Y_{ij} - g^{-1}\{\boldsymbol{\alpha}^{T} X_{ij} + f(\boldsymbol{\beta}^{T} Z_{ij}) D_{ij} + A_{ij}^{T} \boldsymbol{\gamma}_{i}\}\big] = \boldsymbol{0}, \end{aligned}$ (9) and for each $i = 1, \dots, n$, $\begin{aligned} \boldsymbol{S}_{i}(\boldsymbol{\alpha}, \boldsymbol{\beta}^{(1)}, \boldsymbol{\gamma}_{i}) = {} & \sum_{j=1}^{n_{i}} A_{ij} \{g'(\mu_{ij})\}^{-1} w_{ij} \varphi^{-1} V^{-1}(\mu_{ij}) \\ & \times \big[Y_{ij} - g^{-1}\{\boldsymbol{\alpha}^{T} X_{ij} + f(\boldsymbol{\beta}^{T} Z_{ij}) D_{ij} + A_{ij}^{T} \boldsymbol{\gamma}_{i}\}\big] - \boldsymbol{G}^{-1} \boldsymbol{\gamma}_{i} = \boldsymbol{0}. \end{aligned}$ (10)

We obtain the parametric and nonparametric estimates by iteratively maximising the quasi-likelihoods (4) and (7). The corresponding algorithm is summarised in Section 4 for practical implementation.

3. Asymptotic properties

Denote $q_{k}(t, y) = - \frac{1}{2} \partial^{k} d\{y, g^{-1}(t)\} / \partial t^{k}$ and $\rho_{k}(t) = \{d g^{-1}(t) / dt\}^{k} V^{-1}\{g^{-1}(t)\}$, k = 1, 2, 3. Let $\kappa_{j} = \int u^{j} K(u) \, du$ and $\mu_{j} = \int u^{j} K^{2}(u) \, du$ for j = 0, 1, 2, $N = \sum_{i=1}^{n} n_{i}$ and $N_{1} = \sum_{i=1}^{n} n_{i}(n_{i} - 1)$. We assume that n tends to infinity and the $n_{i}$'s are bounded. Denote $\eta_{ij} = \boldsymbol{\alpha}^{T} X_{ij} + f(\boldsymbol{\beta}^{T} Z_{ij}) D_{ij} + A_{ij}^{T} \boldsymbol{\gamma}_{i}$ and $\eta = \boldsymbol{\alpha}^{T} X + f(\boldsymbol{\beta}^{T} Z) D + A^{T} \boldsymbol{\gamma}$.

In order to obtain asymptotic behaviours of the proposed parametric and nonparametric estimates, we assume the following regularity conditions:

(C1) The marginal density of Z is positive and uniformly continuous with a compact support $\mathcal{Z} \subseteq R^{q}$. The density function $f_{U}(\cdot)$ of $U = \boldsymbol{\beta}^{T} Z$ is twice continuously differentiable for $U \in \mathcal{U}$, where $\mathcal{U} = [b_{1}, b_{2}] = \{\boldsymbol{\beta}^{T} Z, Z \in \mathcal{Z}\}$ is a compact interval.
(C2) For all x, $\rho_{2}(x) > 0$.
(C3) The second derivatives of $g(\cdot)$, $V(\cdot)$ and $f(\cdot)$ are bounded and continuous.
(C4) The kernel $K(\cdot)$ is a bounded and symmetric probability density function with bounded support $[-c_{0}, c_{0}]$, and satisfies $\int u^{2} K(u) \, du \neq 0$ and $\int u^{2} K(u) \, du < \infty$. $K(\cdot)$ satisfies the Lipschitz condition of order 1.
(C5) $E\{\rho_{2}(\eta) D^{2} \mid \boldsymbol{\beta}^{T} Z = u\}$, $E\{\rho_{2}(\eta) D X \mid \boldsymbol{\beta}^{T} Z = u\}$ and $E\{\rho_{2}(\eta) D^{2} Z \mid \boldsymbol{\beta}^{T} Z = u\}$ are twice differentiable in u. $E\{q_{1}(\eta_{11}, Y_{11}) q_{1}(\eta_{12}, Y_{12}) D_{11}^{2} D_{12}^{2} \mid \boldsymbol{\beta}^{T} Z_{11} = u_{1}, \boldsymbol{\beta}^{T} Z_{12} = u_{2}\}$ is continuous in $u_{1}$ and $u_{2}$.
(C6) As $n \to \infty$, h satisfies $n h^{4} \to 0$ and $n h^{2} / \log(1/h) \to \infty$.
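As an illustration of the kernel condition above, the sketch below checks numerically that the Epanechnikov kernel, a common choice that the paper does not necessarily use, is a symmetric density on $[-1, 1]$ and computes the constants $\kappa_{2} = \int u^{2} K(u) \, du$ and $\mu_{0} = \int K^{2}(u) \, du$ that appear in the asymptotic results. All function names are hypothetical.

```python
import numpy as np


def epanechnikov(u):
    """Epanechnikov kernel: bounded, symmetric, supported on [-1, 1]."""
    u = np.asarray(u, dtype=float)
    return np.where(np.abs(u) <= 1.0, 0.75 * (1.0 - u**2), 0.0)


def scaled_kernel(u, h):
    """K_h(u) = K(u / h) / h for bandwidth h."""
    return epanechnikov(np.asarray(u, dtype=float) / h) / h


def trapezoid(values, grid):
    """Plain trapezoidal rule on a given grid."""
    return float(np.sum(0.5 * (values[1:] + values[:-1]) * np.diff(grid)))


grid = np.linspace(-1.0, 1.0, 200001)
K = epanechnikov(grid)

mass = trapezoid(K, grid)               # total mass, should be 1
kappa_1 = trapezoid(grid * K, grid)     # first moment, 0 by symmetry
kappa_2 = trapezoid(grid**2 * K, grid)  # second moment, 1/5 analytically
mu_0 = trapezoid(K**2, grid)            # integral of K^2, 3/5 analytically

print(mass, kappa_1, kappa_2, mu_0)
```

For this kernel the second moment is finite and nonzero, and the function is continuous with bounded slope on its support, so it is Lipschitz of order 1.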

Denote $\begin{aligned} \tilde{X} & = X - [E\{\rho_{2}(\eta) D^{2} \mid \boldsymbol{\beta}^{T} Z\}]^{-1} [E\{\rho_{2}(\eta) D X \mid \boldsymbol{\beta}^{T} Z\}], \\ \tilde{Z} & = f'(\boldsymbol{\beta}^{T} Z) \boldsymbol{J}^{T} \big(Z - [E\{\rho_{2}(\eta) D^{2} \mid \boldsymbol{\beta}^{T} Z\}]^{-1} [E\{\rho_{2}(\eta) D^{2} Z \mid \boldsymbol{\beta}^{T} Z\}]\big). \end{aligned}$ Let $\hat{\boldsymbol{\alpha}}$ and $\hat{\boldsymbol{\beta}}$ be the final estimators of $\boldsymbol{\alpha}$ and $\boldsymbol{\beta}$, respectively. Theorem 3.1 gives the asymptotic distributions of $\hat{\boldsymbol{\alpha}}$ and $\hat{\boldsymbol{\beta}}$.

Theorem 3.1

Under conditions (C1)–(C6), $\sqrt{N} \begin{pmatrix} \hat{\boldsymbol{\alpha}} - \boldsymbol{\alpha}_{0} \\ \hat{\boldsymbol{\beta}} - \boldsymbol{\beta}_{0} \end{pmatrix} \to N(0, \boldsymbol{\Sigma}),$ (11) where $\boldsymbol{\Sigma} = D \big(A^{-1} + \frac{N_{1}}{N} A^{-1} C A^{-1}\big) D^{T},$ with $\begin{aligned} A & = E\Big\{\rho_{2}(\eta) \begin{pmatrix} \tilde{X} \\ \tilde{Z} D \end{pmatrix}^{\otimes 2}\Big\}, \\ C & = \mathrm{cov}\Big\{\begin{pmatrix} \tilde{X}_{11} \\ \tilde{Z}_{11} D_{11} \end{pmatrix} q_{1}(\eta_{11}, Y_{11}), \begin{pmatrix} \tilde{X}_{12} \\ \tilde{Z}_{12} D_{12} \end{pmatrix} q_{1}(\eta_{12}, Y_{12})\Big\}, \end{aligned}$ and $D = \begin{pmatrix} \boldsymbol{I}_{p} & \boldsymbol{0}_{p \times (q-1)} \\ \boldsymbol{0}_{q \times p} & \boldsymbol{J}_{\boldsymbol{\beta}_{0}^{(1)}} \end{pmatrix}.$

Theorem 3.1 indicates that in order to obtain $\sqrt{n}$-consistent estimates of $\boldsymbol{\alpha}$ and $\boldsymbol{\beta}$, we need to undersmooth the nonparametric function $f(\cdot)$. For any $z \in \mathcal{Z}$, we let $\hat{f}(\hat{\boldsymbol{\beta}}^{T} z; \hat{\boldsymbol{\alpha}}, \hat{\boldsymbol{\beta}})$ be the estimate of $f(\boldsymbol{\beta}^{T} z)$ given $\hat{\boldsymbol{\alpha}}$ and $\hat{\boldsymbol{\beta}}$, and $h_{1}$ be the corresponding bandwidth. Theorem 3.2 presents the asymptotic distribution of $\hat{f}(\hat{\boldsymbol{\beta}}^{T} z; \hat{\boldsymbol{\alpha}}, \hat{\boldsymbol{\beta}})$.

Theorem 3.2

Under conditions (C1)–(C5), as $n \to \infty$, $h_{1} \to 0$ and $N h_{1} \to \infty$, $\sqrt{N h_{1}} \big\{\hat{f}(\hat{\boldsymbol{\beta}}^{T} z; \hat{\boldsymbol{\alpha}}, \hat{\boldsymbol{\beta}}) - f(\boldsymbol{\beta}_{0}^{T} z) - \frac{1}{2} h_{1}^{2} \kappa_{2} f''(\boldsymbol{\beta}_{0}^{T} z)\big\} \to N(0, \sigma_{f}^{2}),$ (12) where $\sigma_{f}^{2} = \mu_{0} f_{U}^{-1}(\boldsymbol{\beta}_{0}^{T} z) [E\{\rho_{2}(\eta) D^{2} \mid \boldsymbol{\beta}_{0}^{T} Z = \boldsymbol{\beta}_{0}^{T} z\}]^{-1}$.

4. Computation

Given $\boldsymbol{\alpha}$ and $\boldsymbol{\beta}$, we solve the estimating Equations (5) and (6) by Fisher's scoring algorithm (see Wu & Zhang, 2006, Section 10.4), and obtain the estimates $\hat{f}(\cdot)$ and $\hat{f}'(\cdot)$.

When $f(\cdot)$ is given, we obtain estimates of $\boldsymbol{\alpha}$, $\boldsymbol{\beta}^{(1)}$ and $\boldsymbol{\Gamma}$ by solving Equations (8), (9) and (10). Here we apply the quasi-Fisher scoring algorithm to this problem.

Denote $\begin{aligned} \boldsymbol{\theta} & = (\boldsymbol{\alpha}^{T}, \boldsymbol{\beta}^{T})^{T}, \quad \boldsymbol{\Theta} = (\boldsymbol{\theta}^{T}, \boldsymbol{\Gamma}^{T})^{T}, \\ \boldsymbol{\theta}^{(1)} & = (\boldsymbol{\alpha}^{T}, \boldsymbol{\beta}^{(1)T})^{T}, \quad \boldsymbol{\Theta}^{(1)} = (\boldsymbol{\theta}^{(1)T}, \boldsymbol{\Gamma}^{T})^{T}, \\ \boldsymbol{S}(\boldsymbol{\Theta}^{(1)}) & = \{\boldsymbol{S}_{01}^{T}(\boldsymbol{\alpha}, \boldsymbol{\beta}^{(1)}, \boldsymbol{\Gamma}), \boldsymbol{S}_{02}^{T}(\boldsymbol{\alpha}, \boldsymbol{\beta}^{(1)}, \boldsymbol{\Gamma}), \boldsymbol{S}_{1}^{T}(\boldsymbol{\alpha}, \boldsymbol{\beta}^{(1)}, \boldsymbol{\gamma}_{1}), \dots, \boldsymbol{S}_{n}^{T}(\boldsymbol{\alpha}, \boldsymbol{\beta}^{(1)}, \boldsymbol{\gamma}_{n})\}^{T}. \end{aligned}$ Using the quasi-Fisher scoring algorithm, the estimate of $\boldsymbol{\Theta}^{(1)}$ is updated by $\boldsymbol{\Theta}_{new}^{(1)} = \boldsymbol{\Theta}_{old}^{(1)} + \dot{\boldsymbol{S}}^{-1}(\boldsymbol{\Theta}_{old}^{(1)}) \boldsymbol{S}(\boldsymbol{\Theta}_{old}^{(1)}),$ (13) where $\dot{\boldsymbol{S}}(\boldsymbol{\Theta}_{old}^{(1)}) = \boldsymbol{Q} = \begin{pmatrix} \boldsymbol{Q}_{11} & \boldsymbol{Q}_{12} & \boldsymbol{Q}_{13} \\ \boldsymbol{Q}_{12}^{T} & \boldsymbol{Q}_{22} & \boldsymbol{Q}_{23} \\ \boldsymbol{Q}_{13}^{T} & \boldsymbol{Q}_{23}^{T} & \boldsymbol{Q}_{33} \end{pmatrix}$ with $\begin{aligned} \boldsymbol{Q}_{11} & = \sum_{i=1}^{n} \sum_{j=1}^{n_{i}} X_{ij} \{g'(\mu_{ij})\}^{-2} w_{ij} \varphi^{-1} V^{-1}(\mu_{ij}) X_{ij}^{T}, \\ \boldsymbol{Q}_{12} & = \sum_{i=1}^{n} \sum_{j=1}^{n_{i}} X_{ij} \{g'(\mu_{ij})\}^{-2} w_{ij} \varphi^{-1} V^{-1}(\mu_{ij}) D_{ij} f'(\boldsymbol{\beta}^{T} Z_{ij}) Z_{ij}^{T} \boldsymbol{J}_{\boldsymbol{\beta}^{(1)}}, \\ \boldsymbol{Q}_{22} & = \sum_{i=1}^{n} \sum_{j=1}^{n_{i}} \boldsymbol{J}_{\boldsymbol{\beta}^{(1)}}^{T} Z_{ij} D_{ij}^{2} \{f'(\boldsymbol{\beta}^{T} Z_{ij})\}^{2} \{g'(\mu_{ij})\}^{-2} w_{ij} \varphi^{-1} V^{-1}(\mu_{ij}) Z_{ij}^{T} \boldsymbol{J}_{\boldsymbol{\beta}^{(1)}}, \\ \boldsymbol{Q}_{13} & = (\boldsymbol{Q}_{13}^{[1]}, \dots, \boldsymbol{Q}_{13}^{[n]}), \quad \boldsymbol{Q}_{23} = (\boldsymbol{Q}_{23}^{[1]}, \dots, \boldsymbol{Q}_{23}^{[n]}), \\ \boldsymbol{Q}_{33} & = \mathrm{diag}(\boldsymbol{Q}_{33}^{[1]}, \dots, \boldsymbol{Q}_{33}^{[n]}), \end{aligned}$ and for $i = 1, \dots, n$, $\begin{aligned} \boldsymbol{Q}_{13}^{[i]} & = \sum_{j=1}^{n_{i}} X_{ij} \{g'(\mu_{ij})\}^{-2} w_{ij} \varphi^{-1} V^{-1}(\mu_{ij}) A_{ij}^{T}, \\ \boldsymbol{Q}_{23}^{[i]} & = \sum_{j=1}^{n_{i}} \boldsymbol{J}_{\boldsymbol{\beta}^{(1)}}^{T} Z_{ij} D_{ij} f'(\boldsymbol{\beta}^{T} Z_{ij}) \{g'(\mu_{ij})\}^{-2} w_{ij} \varphi^{-1} V^{-1}(\mu_{ij}) A_{ij}^{T}, \\ \boldsymbol{Q}_{33}^{[i]} & = \sum_{j=1}^{n_{i}} A_{ij} \{g'(\mu_{ij})\}^{-2} w_{ij} \varphi^{-1} V^{-1}(\mu_{ij}) A_{ij}^{T} + \boldsymbol{G}^{-1}. \end{aligned}$ Let $\begin{aligned} \boldsymbol{P}_{i} & = \begin{pmatrix} \{g'(\mu_{i1})\}^{-1} X_{i1}^{T} & \{g'(\mu_{i1})\}^{-1} f'(\boldsymbol{\beta}^{T} Z_{i1}) D_{i1} Z_{i1}^{T} \boldsymbol{J}_{\boldsymbol{\beta}^{(1)}} \\ \vdots & \vdots \\ \{g'(\mu_{in_{i}})\}^{-1} X_{in_{i}}^{T} & \{g'(\mu_{in_{i}})\}^{-1} f'(\boldsymbol{\beta}^{T} Z_{in_{i}}) D_{in_{i}} Z_{in_{i}}^{T} \boldsymbol{J}_{\boldsymbol{\beta}^{(1)}} \end{pmatrix}, \\ \boldsymbol{A}_{i} & = \begin{pmatrix} \{g'(\mu_{i1})\}^{-1} A_{i1}^{T} \\ \vdots \\ \{g'(\mu_{in_{i}})\}^{-1} A_{in_{i}}^{T} \end{pmatrix}, \\ R_{i} & = \mathrm{diag}(w_{i1}^{-1} \varphi V(\mu_{i1}), \dots, w_{in_{i}}^{-1} \varphi V(\mu_{in_{i}})), \\ \boldsymbol{\varepsilon}_{i} & = (Y_{i1} - \mu_{i1}, \dots, Y_{in_{i}} - \mu_{in_{i}})^{T}. \end{aligned}$ We further denote $\begin{aligned} \boldsymbol{P} & = (\boldsymbol{P}_{1}^{T}, \dots, \boldsymbol{P}_{n}^{T})^{T}, \quad \boldsymbol{A} = \mathrm{diag}(\boldsymbol{A}_{1}, \dots, \boldsymbol{A}_{n}), \\ \boldsymbol{\varepsilon} & = (\boldsymbol{\varepsilon}_{1}^{T}, \dots, \boldsymbol{\varepsilon}_{n}^{T})^{T}, \\ R & = \mathrm{diag}(R_{1}, \dots, R_{n}), \quad \tilde{\boldsymbol{G}} = \mathrm{diag}(\boldsymbol{G}, \dots, \boldsymbol{G}). \end{aligned}$ Equation (13) can be rewritten as $\boldsymbol{\Theta}_{new}^{(1)} - \boldsymbol{\Theta}_{old}^{(1)} = \begin{pmatrix} \boldsymbol{P}_{old}^{T} R_{old}^{-1} \boldsymbol{P}_{old} & \boldsymbol{P}_{old}^{T} R_{old}^{-1} \boldsymbol{A}_{old} \\ \boldsymbol{A}_{old}^{T} R_{old}^{-1} \boldsymbol{P}_{old} & \boldsymbol{A}_{old}^{T} R_{old}^{-1} \boldsymbol{A}_{old} + \tilde{\boldsymbol{G}}^{-1} \end{pmatrix}^{-1} \begin{pmatrix} \boldsymbol{P}_{old}^{T} R_{old}^{-1} \boldsymbol{\varepsilon}_{old} \\ \boldsymbol{A}_{old}^{T} R_{old}^{-1} \boldsymbol{\varepsilon}_{old} - \tilde{\boldsymbol{G}}^{-1} \boldsymbol{\Gamma}_{old} \end{pmatrix}.$ Note that $\boldsymbol{P}_{old}, \boldsymbol{A}_{old}, R_{old}$ and $\boldsymbol{\varepsilon}_{old}$ on the right-hand side of the above equation are calculated from $\boldsymbol{\Theta}_{old}$. For simplicity we suppress their subscripts hereinafter.
Denote $\boldsymbol{H}_{i} = \boldsymbol{A}_{i} \boldsymbol{G} \boldsymbol{A}_{i}^{T} + R_{i}$ and $\boldsymbol{H} = \mathrm{diag}(\boldsymbol{H}_{1}, \dots, \boldsymbol{H}_{n})$. By matrix algebra, we obtain $\begin{aligned} \boldsymbol{\theta}_{new}^{(1)} & = \boldsymbol{\theta}_{old}^{(1)} + (\boldsymbol{P}^{T} \boldsymbol{H}^{-1} \boldsymbol{P})^{-1} \boldsymbol{P}^{T} \boldsymbol{H}^{-1} (\boldsymbol{\varepsilon} + \boldsymbol{A} \boldsymbol{\Gamma}_{old}), \\ \boldsymbol{\gamma}_{i, new} & = \boldsymbol{G} \boldsymbol{A}_{i}^{T} \boldsymbol{H}_{i}^{-1} \{\boldsymbol{\varepsilon}_{i} + \boldsymbol{A}_{i} \boldsymbol{\gamma}_{i, old} - \boldsymbol{P}_{i} (\boldsymbol{\theta}_{new}^{(1)} - \boldsymbol{\theta}_{old}^{(1)})\}, \quad i = 1, \dots, n. \end{aligned}$ (14) Based on $\|\boldsymbol{\beta}_{new}\| = 1$ and $\|\boldsymbol{\beta}_{old}\| = 1$, we can show that $\boldsymbol{\theta}_{new} - \boldsymbol{\theta}_{old} = \begin{pmatrix} \boldsymbol{I} & \boldsymbol{0} \\ \boldsymbol{0} & \boldsymbol{J}_{\boldsymbol{\beta}_{old}^{(1)}} \end{pmatrix} (\boldsymbol{\theta}_{new}^{(1)} - \boldsymbol{\theta}_{old}^{(1)}) \{1 + o_{p}(1)\}.$ Thus, we update $\boldsymbol{\theta}_{old}$ by $\boldsymbol{\theta}_{new} = (\boldsymbol{\alpha}_{new}^{*T}, \boldsymbol{\beta}_{new}^{*T} / \|\boldsymbol{\beta}_{new}^{*}\|)^{T}$, with $\boldsymbol{\theta}_{new}^{*} = (\boldsymbol{\alpha}_{new}^{*T}, \boldsymbol{\beta}_{new}^{*T})^{T}$ and $\boldsymbol{\theta}_{new}^{*} = \boldsymbol{\theta}_{old} + \begin{pmatrix} \boldsymbol{I} & \boldsymbol{0} \\ \boldsymbol{0} & \boldsymbol{J}_{\boldsymbol{\beta}_{old}^{(1)}} \end{pmatrix} (\boldsymbol{P}^{T} \boldsymbol{H}^{-1} \boldsymbol{P})^{-1} \boldsymbol{P}^{T} \boldsymbol{H}^{-1} (\boldsymbol{\varepsilon} + \boldsymbol{A} \boldsymbol{\Gamma}_{old}).$ Our algorithm can be summarised as follows:

Step 1. Obtain the initial estimate $\hat{\boldsymbol{\theta}}_{init}$ by fitting the generalised linear mixed-effects model $\begin{aligned} E(Y_{ij} \mid X_{ij}, Z_{ij}, D_{ij}) & = g^{-1}(\boldsymbol{\alpha}^{T} X_{ij} + \boldsymbol{\beta}^{T} Z_{ij} D_{ij} + A_{ij}^{T} \boldsymbol{\gamma}_{i}), \\ \mathrm{var}(Y_{ij} \mid X_{ij}, Z_{ij}, D_{ij}) & = w_{ij}^{-1} \varphi V(\mu_{ij}). \end{aligned}$
Step 2. Given the parametric estimate $\hat{\boldsymbol{\theta}}_{old}$, derive the estimates $\hat{f}(\hat{\boldsymbol{\beta}}_{old}^{T} Z_{ij})$ and $\hat{f}'(\hat{\boldsymbol{\beta}}_{old}^{T} Z_{ij})$ by solving the estimating Equations (5) and (6).
Step 3. Obtain $\hat{\boldsymbol{\theta}}_{new}$ and $\hat{\boldsymbol{\Gamma}}_{new}$ using Equation (14).
Step 4. Iterate between Steps 2 and 3 until convergence, and obtain the final estimates $\hat{\boldsymbol{\theta}}$ and $\hat{\boldsymbol{\Gamma}}$. We then calculate the final estimate $\hat{f}(\cdot)$ by Step 2.
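The parametric update in Equation (14) can be checked numerically against the block system that Equation (13) reduces to. The sketch below does this on small random matrices that merely stand in for the subject-level quantities defined above; the dimensions, the seed, and the helper names `block_diag` and `update_eq14` are hypothetical, and no link function or real data are involved.

```python
import numpy as np


def block_diag(blocks):
    """Assemble a block-diagonal matrix from a list of 2-D arrays."""
    rows = sum(b.shape[0] for b in blocks)
    cols = sum(b.shape[1] for b in blocks)
    out = np.zeros((rows, cols))
    r = c = 0
    for b in blocks:
        out[r:r + b.shape[0], c:c + b.shape[1]] = b
        r, c = r + b.shape[0], c + b.shape[1]
    return out


def update_eq14(P_list, A_list, R_list, eps_list, gamma_old, G):
    """One update following Equation (14), subject by subject."""
    H_inv = [np.linalg.inv(A @ G @ A.T + R) for A, R in zip(A_list, R_list)]
    work = [e + A @ g for e, A, g in zip(eps_list, A_list, gamma_old)]
    PtHP = sum(P.T @ Hi @ P for P, Hi in zip(P_list, H_inv))
    PtHw = sum(P.T @ Hi @ w for P, Hi, w in zip(P_list, H_inv, work))
    delta = np.linalg.solve(PtHP, PtHw)  # theta_new^(1) - theta_old^(1)
    gamma_new = [G @ A.T @ Hi @ (w - P @ delta)
                 for A, Hi, w, P in zip(A_list, H_inv, work, P_list)]
    return delta, gamma_new


rng = np.random.default_rng(0)
n, n_i, d, s = 4, 3, 3, 2  # subjects, visits, dim(theta^(1)), dim(gamma_i)
P_list = [rng.normal(size=(n_i, d)) for _ in range(n)]
A_list = [rng.normal(size=(n_i, s)) for _ in range(n)]
R_list = [np.diag(rng.uniform(0.5, 2.0, size=n_i)) for _ in range(n)]
eps_list = [rng.normal(size=n_i) for _ in range(n)]
gamma_old = [rng.normal(size=s) for _ in range(n)]
G = np.array([[1.0, 0.3], [0.3, 0.5]])

delta, gamma_new = update_eq14(P_list, A_list, R_list, eps_list, gamma_old, G)

# Direct solve of the stacked block system.
P, A = np.vstack(P_list), block_diag(A_list)
R_inv = np.linalg.inv(block_diag(R_list))
G_t_inv = block_diag([np.linalg.inv(G)] * n)
eps, Gamma_old = np.concatenate(eps_list), np.concatenate(gamma_old)
lhs = np.block([[P.T @ R_inv @ P, P.T @ R_inv @ A],
                [A.T @ R_inv @ P, A.T @ R_inv @ A + G_t_inv]])
rhs = np.concatenate([P.T @ R_inv @ eps,
                      A.T @ R_inv @ eps - G_t_inv @ Gamma_old])
step = np.linalg.solve(lhs, rhs)
delta_direct, Gamma_new_direct = step[:d], Gamma_old + step[d:]

print(np.max(np.abs(delta - delta_direct)))
print(np.max(np.abs(np.concatenate(gamma_new) - Gamma_new_direct)))
```

Working subject by subject only requires inverting the small $n_{i} \times n_{i}$ matrices $\boldsymbol{H}_{i}$, whereas the stacked system grows with the number of subjects, which is one practical reason to prefer the form in Equation (14).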

Note that the matrix $\boldsymbol{G}$ is still unknown. To obtain an estimate of $\boldsymbol{G}$, we apply the maximum likelihood method under the normality assumption, and implement the method by the EM algorithm (see Laird & Ware, 1982). To be specific, we set the initial value of $\boldsymbol{G}$ to be the identity matrix. After obtaining the estimates of $\boldsymbol{\theta}$, $\boldsymbol{\Gamma}$ and $f(\cdot)$ by Steps 1–4, we have an updated estimate of $\boldsymbol{G}$. We repeat this procedure until convergence, and obtain the final estimate $\hat{\boldsymbol{G}}$. Given the estimate $\hat{\boldsymbol{G}}$, we derive the estimates of $\boldsymbol{\theta}$, $\boldsymbol{\Gamma}$ and $f(\cdot)$ by substituting $\hat{\boldsymbol{G}}$ for $\boldsymbol{G}$ in Steps 2 and 3.

During implementation, the bandwidth h in Step 2 also needs to be selected. Theorem 3.1 indicates that undersmoothing the nonparametric part is necessary to guarantee $\sqrt{n}$-consistency of the parametric estimates. We adopt the ad hoc method in Carroll et al. (1997) to select an appropriate bandwidth for Step 2.

5. Simulation

In this section, we assess the finite sample performance of the proposed method via two Monte Carlo simulations with binary and count responses respectively.

In these simulations, $\eta_{ij} = \boldsymbol{\alpha}^{T} X_{ij} + f(\boldsymbol{\beta}^{T} Z_{ij}) D_{ij} + A_{ij}^{T} \boldsymbol{\gamma}_{i}, \quad i = 1, \dots, n, \ j = 1, \dots, n_{i},$ where $\boldsymbol{\alpha} = (0.5, 1, -0.5)^{T}$, $\boldsymbol{\beta} = (1, 1)^{T} / \sqrt{2}$ and $f(u) = 3 \sin\{\pi (u - c_{1}) / (c_{2} - c_{1})\}$, with $c_{1} = \frac{\sqrt{2}}{2} - \frac{1.645}{\sqrt{12}}$ and $c_{2} = \frac{\sqrt{2}}{2} + \frac{1.645}{\sqrt{12}}$. We take the random effect to be a random intercept, that is, $A \equiv 1$ and $\boldsymbol{\gamma}$ is one-dimensional, and we let $\boldsymbol{\gamma}$ follow $N(0, 1)$. For the covariates, $X = (X_{1}, X_{2}, X_{3})^{T}$, where $X_{1}, X_{2}$ and $X_{3}$ follow $N(0, 1)$; $Z = (Z_{1}, Z_{2})^{T}$, where $Z_{1}$ and $Z_{2}$ are uniform $U(0, 1)$ variables; and the exposure indicator D follows $\mathrm{Bernoulli}(0.5)$. We set $n_{i}$, the number of observations for each subject, to 5, and the number of subjects n to 100, 300 and 500. For the response variable, we consider the following two cases:

Case 1 (Binary data): $Y_{ij} \sim \mathrm{Bernoulli}(\mu_{ij})$, where $\mathrm{logit}(\mu_{ij}) = \eta_{ij}$;

Case 2 (Count data): $Y_{ij} \sim \mathrm{Poisson}(\mu_{ij})$, where $\log(\mu_{ij}) = \eta_{ij}$.
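The data-generating design above can be sketched in a few lines of NumPy. This is an illustrative sketch, not the authors' simulation code. It assumes that the components of $X$ and $Z$ are mutually independent, that covariates and the exposure indicator are redrawn at every visit, and that the random intercept is drawn once per subject. The function and key names (`simulate`, `f_true`, `"subject"`) are hypothetical.

```python
import numpy as np

ALPHA = np.array([0.5, 1.0, -0.5])
BETA = np.array([1.0, 1.0]) / np.sqrt(2.0)
C1 = np.sqrt(2.0) / 2.0 - 1.645 / np.sqrt(12.0)
C2 = np.sqrt(2.0) / 2.0 + 1.645 / np.sqrt(12.0)


def f_true(u):
    """Treatment effect curve f(u) = 3 sin{pi (u - c1) / (c2 - c1)}."""
    return 3.0 * np.sin(np.pi * (u - C1) / (C2 - C1))


def simulate(n, n_i=5, case="binary", seed=0):
    """Generate one longitudinal data set with n subjects and n_i visits each."""
    rng = np.random.default_rng(seed)
    total = n * n_i
    subject = np.repeat(np.arange(n), n_i)       # subject label per row
    X = rng.standard_normal((total, 3))          # X1, X2, X3 ~ N(0, 1)
    Z = rng.uniform(0.0, 1.0, size=(total, 2))   # Z1, Z2 ~ U(0, 1)
    D = rng.binomial(1, 0.5, size=total)         # exposure indicator
    gamma = rng.standard_normal(n)               # random intercept ~ N(0, 1)
    eta = X @ ALPHA + f_true(Z @ BETA) * D + gamma[subject]
    if case == "binary":
        mu = 1.0 / (1.0 + np.exp(-eta))          # inverse logit link
        Y = rng.binomial(1, mu)
    elif case == "count":
        mu = np.exp(eta)                         # inverse log link
        Y = rng.poisson(mu)
    else:
        raise ValueError("case must be 'binary' or 'count'")
    return {"subject": subject, "X": X, "Z": Z, "D": D, "Y": Y, "eta": eta}
```

Calling `simulate(300, case="count", seed=7)` would, under these assumptions, produce one replicate of Case 2 with 1500 observations.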

Each simulation is repeated 500 times. For comparison, we also ignore the random effects and apply the generalised partially linear single-index model (GPLSIM) to the simulated data. The corresponding estimates are obtained using the method of Xu and Zhu (2012). The parametric estimates for binary and count data are summarised in Tables 1 and 2, respectively. We assess the performance of the nonparametric estimator $\hat{f}(\cdot)$ by the mean integrated squared error (MISE), defined as $\mathrm{MISE}(\hat{f}) = E \int \{\hat{f}(u) - f(u)\}^{2} \, du .$
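In practice the MISE is approximated by averaging the integrated squared error (ISE) over the Monte Carlo replicates, with the integral evaluated numerically. The sketch below uses the trapezoidal rule on a grid of index values. The grid, the function names and the choice of quadrature rule are assumptions made for illustration only.

```python
import numpy as np


def integrated_squared_error(f_hat_vals, f_true_vals, grid):
    """Trapezoidal approximation of the integral of (f_hat - f)^2 over the grid."""
    f_hat_vals = np.asarray(f_hat_vals, dtype=float)
    f_true_vals = np.asarray(f_true_vals, dtype=float)
    grid = np.asarray(grid, dtype=float)
    sq_err = (f_hat_vals - f_true_vals) ** 2
    widths = np.diff(grid)
    return float(np.sum(0.5 * (sq_err[1:] + sq_err[:-1]) * widths))


def mise_summary(f_hat_reps, f_true_vals, grid):
    """Mean and standard deviation of the ISE across replicates.

    f_hat_reps has one row per Monte Carlo replicate and one column
    per grid point.
    """
    ise = np.array(
        [integrated_squared_error(row, f_true_vals, grid) for row in f_hat_reps]
    )
    return float(ise.mean()), float(ise.std(ddof=1))
```

The mean returned by `mise_summary` estimates the MISE, and the standard deviation describes the spread of the ISE across replicates.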

Table 1. Simulation results for parametric estimates with binary data.


Table 2. Simulation results for parametric estimates with count data.


Tables 3 and 4 show the mean and standard deviation of the MISE for the nonparametric function estimate.

Table 3. MISE for nonparametric estimate with binary data.


Table 4. MISE for nonparametric estimate with count data.


From Tables 1 and 2, the MSEs of the parametric estimates under our model are generally smaller than those obtained when the random effects are ignored. As the sample size increases, the performance of the parametric estimates improves. Similar results hold for the nonparametric estimates: the MISE under our method is smaller than that under GPLSIM, and it decreases as the sample size increases.

6. Real data analysis

We apply the proposed method to the US National Alzheimer's Coordinating Center Uniform Data Set (https://www.alz.washington.edu). Our goal is to investigate the effect of heredity on the development of Alzheimer's disease (AD) among women. We take the diagnosis of AD at each observation of a patient (yes/no) as the response. The covariates that may influence the occurrence of AD include age, visit year, years of education (EDUC), an indicator of a first-degree family member with cognitive impairment (yes/no, FAM), depression (yes/no, DEP), diabetes (yes/no, DIABETES) and mini-mental state exam (MMSE) score. To avoid the heavy computational burden of including all observations, we randomly select 500 subjects with at least 2 follow-up visits from the original data set. Our final sample includes 2491 observations, and the median follow-up is 3. The visit years range from 2005 to 2019. The proportion of observations with an AD diagnosis is 28.3%, and the proportion of subjects whose family member has cognitive impairment is 64.2%. Note that we repeated the sample selection several times and obtained consistent results.

Taking FAM as the treatment indicator, we apply the proposed model and method to the final data set. We include the logarithm of age and the MMSE score in $Z$, and an intercept, age, visit year, EDUC, DEP, DIABETES and MMSE score in $X$. The estimated treatment effect reflects the influence of family heredity on the development of AD. Note that in this data set FAM is not a real treatment indicator, so the corresponding CSTE curve does not indicate a real treatment effect. This data set is used for illustrative purposes only.

Table 5 shows the parametric estimates. The standard deviations of the parameter estimates are calculated from 500 bootstrap samples. Figure 1 presents the estimate of the nonparametric function and the corresponding 95% pointwise confidence interval. We construct the confidence interval based on the result of Theorem 3.2, and apply the method of Zhang and Peng (2010) to estimate the bias and variance. Figure 2 displays the curve of the treatment effect versus MMSE score with age fixed at its mean value, as well as the curve of the treatment effect versus age with MMSE score fixed at its mean value.
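The bootstrap standard deviations can be computed along the following lines. The paper does not state the resampling unit; the sketch assumes that whole subjects are resampled with replacement so that within-subject correlation is preserved. The function name and the `estimator` callback, which maps an array of row indices to a parameter vector, are hypothetical.

```python
import numpy as np


def cluster_bootstrap_sd(subject_ids, estimator, n_boot=500, seed=0):
    """Bootstrap standard deviations by resampling subjects with replacement.

    subject_ids : 1-D array giving the subject label of every observation.
    estimator   : callable taking an array of row indices and returning
                  a 1-D array of parameter estimates (assumed interface).
    """
    rng = np.random.default_rng(seed)
    subject_ids = np.asarray(subject_ids)
    unique_ids = np.unique(subject_ids)
    # Precompute the rows belonging to each subject.
    rows_by_subject = {s: np.flatnonzero(subject_ids == s) for s in unique_ids}
    estimates = []
    for _ in range(n_boot):
        drawn = rng.choice(unique_ids, size=unique_ids.size, replace=True)
        rows = np.concatenate([rows_by_subject[s] for s in drawn])
        estimates.append(np.atleast_1d(np.asarray(estimator(rows), dtype=float)))
    return np.std(np.vstack(estimates), axis=0, ddof=1)
```

Here `estimator` would refit the full model on the resampled rows and return the stacked estimates of $\boldsymbol{\alpha}$ and $\boldsymbol{\beta}$; the refitting step itself is outside the scope of this sketch.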

Figure 1. Estimate of the nonparametric function $f(\cdot)$ (solid line) and its 95% pointwise confidence interval (dashed line).


Figure 2. Relationship of the treatment effect with MMSE score and age: the estimated curve is shown as a solid line and the corresponding 95% pointwise confidence interval as a dashed line.

Table 5. Estimated $\boldsymbol{\alpha}$ and $\boldsymbol{\beta}$.


Chen and Zhou (2011) used a generalised linear model to investigate risk factors that influence the occurrence of AD. In their results, age, DEP and MMSE score are significant under four estimation methods. They found that age and DEP are positively associated with the occurrence of AD, and that MMSE score is negatively correlated with the development of AD. From Table 5, our parametric estimates are consistent with these findings. From Figure 1, we can clearly see that if the estimated index of a patient is between 0.03 and 0.57, family inheritance of cognitive impairment increases the risk of getting AD. However, if the estimated index is between −0.39 and −0.11, family heredity decreases the risk of AD. Figure 2 shows that the effect of family heredity on the occurrence of AD has a bell-shaped association with MMSE score; that is, heredity has a lower or even negative influence on the risk of AD when the MMSE score is small or large. The effect also increases with age: the effect of family heredity on the occurrence of AD is stronger for elderly people.
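Index ranges such as those quoted above can be read off a pointwise confidence band mechanically: the effect is taken as positive where the lower limit exceeds zero and negative where the upper limit falls below zero. The following minimal sketch illustrates that rule on a grid of index values. The function names are hypothetical, and any inputs would have to come from a fitted curve and its band.

```python
import numpy as np


def classify_effect(lower, upper):
    """Return +1 where the interval lies above zero, -1 where it lies below, else 0."""
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    return np.where(lower > 0, 1, np.where(upper < 0, -1, 0))


def significant_intervals(grid, labels, target):
    """List (start, end) grid ranges over which labels equal target contiguously."""
    grid = np.asarray(grid, dtype=float)
    mask = np.asarray(labels) == target
    intervals = []
    start = None
    prev = None
    for u, flag in zip(grid, mask):
        if flag and start is None:
            start = u                          # a new run begins here
        if not flag and start is not None:
            intervals.append((float(start), float(prev)))
            start = None                       # the run ended at the previous point
        prev = u
    if start is not None:
        intervals.append((float(start), float(prev)))
    return intervals
```

For a future patient with estimated index $u$, the label at the nearest grid point indicates whether the pointwise interval supports a positive effect, a negative effect, or neither.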

7. Discussion

This paper focuses on a generalised partially linear single-index mixed-effects model for personalised treatment effect estimation and treatment selection in longitudinal studies. In our model, the treatment effect is described as a function of a linear combination of covariates. We develop a method combining local linear regression and penalised quasi-likelihood to estimate the coefficients for each covariate, the treatment effect curve and the parameters for mixed effects. Based on the pointwise confidence intervals for the treatment effect curve, we can make individualised treatment decisions from patient characteristics. Our simulation study and real data analysis illustrate the effectiveness of the proposed method.

Nonparametric and semiparametric methods provide a flexible way to explore HTE. Previous research in this area mostly focuses on cross-sectional data. Our work fills the gap in semiparametric modelling of HTE with longitudinal data. In addition, we develop a new estimation method, combining the local linear technique and penalised quasi-likelihood, for the generalised partially linear single-index model. A pointwise confidence interval can be constructed directly for the estimated treatment effect curve based on its asymptotic normality. The theory of a simultaneous confidence band for the treatment effect curve can also be established accordingly, and we leave this for future work.

There are still some limitations in our work. We directly apply the method of Zhang and Peng (2010) to estimate the bias and variance of the treatment effect curve. It would be more rigorous to validate the performance of this method in our context via simulation studies, and we will include this in our future research. Another limitation is that, based on the pointwise confidence interval of the treatment effect curve, we can only make a treatment decision for a single future patient. To identify the subgroup of patients that benefits from each treatment, it is necessary to construct a simultaneous confidence band for the treatment effect curve. Some possible extensions of our work could also be considered in future research. Our model could be extended to high-dimensional covariates to cope with longitudinal studies in which a large number of patient characteristics are recorded. It is also of interest to consider robust regression to limit the impact of outlying observations. A further extension is a survival model for time-to-event responses.

Acknowledgments

We are grateful to the Editor-in-Chief, Prof. Jun Shao, an Associate Editor and anonymous reviewers for their thorough reading of our manuscript and insightful comments that have led to significant improvement of this work. We also thank the US National Alzheimer's Coordinating Center for providing the data.

Disclosure statement

No potential conflict of interest was reported by the author(s).

Additional information

Funding

This work was supported by Natural Science Foundation of Shanghai (17ZR1409000). The NACC database is funded by NIA/NIH Grant U01 AG016976. NACC data are contributed by the NIA-funded ADCs: P30 AG019610 (PI Eric Reiman, MD), P30 AG013846 (PI Neil Kowall, MD), P30 AG062428-01 (PI James Leverenz, MD) P50 AG008702 (PI Scott Small, MD), P50 AG025688 (PI Allan Levey, MD, PhD), P50 AG047266 (PI Todd Golde, MD, PhD), P30 AG010133 (PI Andrew Saykin, PsyD), P50 AG005146 (PI Marilyn Albert, PhD), P30 AG062421-01 (PI Bradley Hyman, MD, PhD), P30 AG062422-01 (PI Ronald Petersen, MD, PhD), P50 AG005138 (PI Mary Sano, PhD), P30 AG008051 (PI Thomas Wisniewski, MD), P30 AG013854 (PI Robert Vassar, PhD), P30 AG008017 (PI Jeffrey Kaye, MD), P30 AG010161 (PI David Bennett, MD), P50 AG047366 (PI Victor Henderson, MD, MS), P30 AG010129 (PI Charles DeCarli, MD), P50 AG016573 (PI Frank LaFerla, PhD), P30 AG062429-01 (PI James Brewer, MD, PhD), P50 AG023501 (PI Bruce Miller, MD), P30 AG035982 (PI Russell Swerdlow, MD), P30 AG028383 (PI Linda Van Eldik, PhD), P30 AG053760 (PI Henry Paulson, MD, PhD), P30 AG010124 (PI John Trojanowski, MD, PhD), P50 AG005133 (PI Oscar Lopez, MD), P50 AG005142 (PI Helena Chui, MD), P30 AG012300 (PI Roger Rosenberg, MD), P30 AG049638 (PI Suzanne Craft, PhD), P50 AG005136 (PI Thomas Grabowski, MD), P30 AG062715-01 (PI Sanjay Asthana, MD, FRCP), P50 AG005681 (PI John Morris, MD), P50 AG047270 (PI Stephen Strittmatter, MD, PhD).

Notes on contributors

Yanghui Liu

Yanghui Liu is a Ph.D. candidate in Statistics at East China Normal University.

Riquan Zhang

Riquan Zhang is a Professor in School of Statistics at East China Normal University.

Shujie Ma

Shujie Ma is an Associate Professor in Department of Statistics at University of California, Riverside.

Xiuzhen Zhang

Xiuzhen Zhang is a Ph.D. candidate in Statistics at East China Normal University.

References

Berger, J. O., Wang, X., & Shen, L. (2014). A Bayesian approach to subgroup identification. Journal of Biopharmaceutical Statistics, 24(1), 110–129. https://doi.org/10.1080/10543406.2013.856026
Bonetti, M., & Gelber, R. (2004). Patterns of treatment effects in subsets of patients in clinical trials. Biostatistics, 5(3), 465–481. https://doi.org/10.1093/biostatistics/kxh002
Bonetti, M., Zahrieh, D., Cole, B. F., & Gelber, R. D. (2009). A small sample study of the STEPP approach to assessing treatment-covariate interactions in survival data. Statistics in Medicine, 28(8), 1255–1268. https://doi.org/10.1002/sim.3524
Breslow, N. E., & Clayton, D. G. (1993). Approximate inference in generalized linear mixed models. Journal of the American Statistical Association, 88(421), 9–25. https://doi.org/10.1080/01621459.1993.10594284
Carroll, R. J., Fan, J., Gijbels, I., & Wand, M. P. (1997). Generalized partially linear single-index models. Journal of the American Statistical Association, 92(438), 477–489. https://doi.org/10.1080/01621459.1997.10474001
Chen, J., Kim, I., Terrell, G. R., & Liu, L. (2014). Generalized partially linear single-index mixed model for repeated measures data. Journal of Nonparametric Statistics, 26(2), 291–303. https://doi.org/10.1080/10485252.2014.891029
Chen, B., & Zhou, X. (2011). Doubly robust estimates for binary longitudinal data analysis with missing response and missing covariates. Biometrics, 67(3), 830–842. https://doi.org/10.1111/j.1541-0420.2010.01541.x
Ciampi, A., Negassa, A., & Lou, Z. (1995). Tree-structured prediction for censored survival data and the Cox model. Journal of Clinical Epidemiology, 48(5), 675–689. https://doi.org/10.1016/0895-4356(94)00164-L
Cui, X., Härdle, W. K., & Zhu, L. (2011). The EFM approach for single-index models. The Annals of Statistics, 39(3), 1658–1688. https://doi.org/10.1214/10-AOS871
Foster, J. C., Taylor, J. M. G., Kaciroti, N., & Nan, B. (2015). Simple subgroup approximations to optimal treatment regimes from randomized clinical trial data. Biostatistics, 16(2), 368–382. https://doi.org/10.1093/biostatistics/kxu049
Foster, J. C., Taylor, J. M. G., & Ruberg, S. J. (2011). Subgroup identification from randomized clinical trial data. Statistics in Medicine, 30(24), 2867–2880. https://doi.org/10.1002/sim.4322
Guo, W., Zhou, X., & Ma, S. (2018). Optimal treatment selection using the covariate-specific treatment effect curve with high-dimensional covariates. arXiv:1812.10018.
Han, K., Zhou, X., & Liu, B. (2017). CSTE curve for selection the optimal treatment when outcome is binary. Scientia Sinica (Mathematica), 47(4), 497–514. https://doi.org/10.1360/SCM-2015-0595
Laird, N. M., & Ware, J. H. (1982). Random effects models for longitudinal data. Biometrics, 38(4), 963–974. https://doi.org/10.2307/2529876
Liang, H. (2009). Generalized partially linear mixed-effects models incorporating mismeasured covariates. Annals of the Institute of Statistical Mathematics, 61(1), 27–46. https://doi.org/10.1007/s10463-007-0146-0
Ma, Y., & Zhou, X. (2014). Treatment selection in a randomized clinical trial via covariate-specific treatment effect curves. Statistical Methods in Medical Research, 26(1), 124–141. https://doi.org/10.1177/0962280214541724
Negassa, A., Ciampi, A., Abrahamowicz, M., Shapiro, S., & Boivin, J. (2005). Tree-structured subgroup analysis for censored survival data: Validation of computationally inexpensive model selection criteria. Statistics and Computing, 15(3), 231–239. https://doi.org/10.1007/s11222-005-1311-z
Pang, Z., & Xue, L. (2012). Estimation for the single-index models with random effects. Computational Statistics & Data Analysis, 56(6), 1837–1853. https://doi.org/10.1016/j.csda.2011.11.007
Su, X., Tsai, C., Wang, H., Nickerson, D., & Li, B. (2008). Subgroup analysis via recursive partitioning. Journal of Machine Learning Research, 10, 141–158. https://dl.acm.org/doi/10.5555/1577069.1577074
Wang, R., Lagakos, S. W., Ware, J. H., Hunter, D. J., & Drazen, J. M. (2007). Statistics in medicine – Reporting of subgroup analyses in clinical trials. The New England Journal of Medicine, 357(21), 2189–2194. https://doi.org/10.1056/NEJMsr077003
Wu, H., & Zhang, J. T. (2006). Nonparametric regression methods for longitudinal data analysis. John Wiley & Sons.
Xu, P., & Zhu, L. X. (2012). Estimation for a marginal generalized single-index longitudinal model. Journal of Multivariate Analysis, 105(1), 285–299. https://doi.org/10.1016/j.jmva.2011.10.004
Zhang, W., & Peng, H. (2010). Simultaneous confidence band and hypothesis test in generalised varying-coefficient models. Journal of Multivariate Analysis, 101(7), 1656–1680. https://doi.org/10.1016/j.jmva.2010.03.003

Appendix

Proof of Theorem 3.1.


We use a proof strategy similar to that of Xu and Zhu (2012). Our proof is divided into three steps.

Step 1: Given $(\boldsymbol{\alpha}_{0}, \boldsymbol{\beta}_{0})$, we derive the asymptotic expansion of $\hat{f}(u) - f(u)$.

Let $c_{n} = (Nh)^{-1/2}$, $\delta_{n} = c_{n}^{2} \log^{1/2}(1/h)$, $\begin{aligned} \Lambda &= \begin{pmatrix} c_{n}^{-1} \{a - f(u)\} \\ c_{n}^{-1} h \{b - f'(u)\} \end{pmatrix}, \quad \mathbf{F}_{ij} = \begin{pmatrix} D_{ij} \\ h^{-1} (\boldsymbol{\beta}^{T} Z_{ij} - u) D_{ij} \end{pmatrix}, \\ \hat{\Lambda} &= \begin{pmatrix} c_{n}^{-1} \{\hat{f}(u) - f(u)\} \\ c_{n}^{-1} h \{\hat{f}'(u) - f'(u)\} \end{pmatrix}. \end{aligned}$ Denote $\bar{\eta}_{ij} = \boldsymbol{\alpha}^{T} X_{ij} + \{f(u) + f'(u) (\boldsymbol{\beta}^{T} Z_{ij} - u)\} D_{ij} + A_{ij}^{T} \hat{\boldsymbol{\gamma}}_{i}$. By (4), $\hat{\Lambda}$ minimises $\begin{aligned} L(\Lambda) &= h \sum_{i=1}^{n} \sum_{j=1}^{n_{i}} K_{h} (\boldsymbol{\beta}^{T} Z_{ij} - u) \\ &\quad \times [d\{Y_{ij}, g^{-1} (\bar{\eta}_{ij} + c_{n} \mathbf{F}_{ij}^{T} \Lambda)\} - d\{Y_{ij}, g^{-1} (\bar{\eta}_{ij})\}] \end{aligned}$ with respect to $\Lambda$. Using Taylor expansion, $\begin{aligned} -\frac{1}{2} L(\Lambda) &= c_{n} h \sum_{i=1}^{n} \sum_{j=1}^{n_{i}} K_{h} (\boldsymbol{\beta}^{T} Z_{ij} - u) q_{1} (\bar{\eta}_{ij}, Y_{ij}) \mathbf{F}_{ij}^{T} \Lambda \\ &\quad + c_{n}^{2} h \Lambda^{T} \Big\{\sum_{i=1}^{n} \sum_{j=1}^{n_{i}} K_{h} (\boldsymbol{\beta}^{T} Z_{ij} - u) q_{2} (\bar{\eta}_{ij}, Y_{ij}) \mathbf{F}_{ij} \mathbf{F}_{ij}^{T}\Big\} \Lambda \{1 + o_{p}(1)\} \\ &\triangleq \mathbf{T}_{n}^{T} \Lambda + \Lambda^{T} \mathbf{W}_{n} \Lambda \{1 + o_{p}(1)\}. \end{aligned}$ Let $\tilde{\eta}_{ij} = \boldsymbol{\alpha}^{T} X_{ij} + \{f(u) + f'(u) (\boldsymbol{\beta}^{T} Z_{ij} - u)\} D_{ij} + A_{ij}^{T} \boldsymbol{\gamma}_{i}$. Then $\bar{\eta}_{ij} - \tilde{\eta}_{ij} = A_{ij}^{T} (\hat{\boldsymbol{\gamma}}_{i} - \boldsymbol{\gamma}_{i})$.
Using arguments similar to those in Section A.1 of Liang (2009), $\begin{aligned} \mathbf{W}_{n} &= h c_{n}^{2} \sum_{i=1}^{n} \sum_{j=1}^{n_{i}} K_{h} (\boldsymbol{\beta}^{T} Z_{ij} - u) q_{2} (\tilde{\eta}_{ij}, Y_{ij}) \mathbf{F}_{ij} \mathbf{F}_{ij}^{T} + o(h), \\ \mathbf{T}_{n} &= h c_{n} \sum_{i=1}^{n} \sum_{j=1}^{n_{i}} K_{h} (\boldsymbol{\beta}^{T} Z_{ij} - u) q_{1} (\tilde{\eta}_{ij}, Y_{ij}) \mathbf{F}_{ij} + o(c_{n}^{-1} h^{2}). \end{aligned}$

It is easy to see that $q_{2}(t, y) = \{y - g^{-1}(t)\} \rho_{1}'(t) - \rho_{2}(t)$. Denote the first element of $\mathbf{T}_{n}$ by $T_{n}^{[1]}$. Using Taylor expansion, $\begin{aligned} T_{n}^{[1]} &= h c_{n} \sum_{i=1}^{n} \sum_{j=1}^{n_{i}} K_{h} (\boldsymbol{\beta}^{T} Z_{ij} - u) q_{1} (\eta_{ij}, Y_{ij}) D_{ij} \\ &\quad - h c_{n} \sum_{i=1}^{n} \sum_{j=1}^{n_{i}} K_{h} (\boldsymbol{\beta}^{T} Z_{ij} - u) q_{2} (\eta_{ij}, Y_{ij}) (\eta_{ij} - \tilde{\eta}_{ij}) D_{ij} \{1 + o_{p}(1)\} \\ &= h c_{n} \sum_{i=1}^{n} \sum_{j=1}^{n_{i}} K_{h} (\boldsymbol{\beta}^{T} Z_{ij} - u) q_{1} (\eta_{ij}, Y_{ij}) D_{ij} \\ &\quad - \frac{1}{2} h c_{n} f''(u) \sum_{i=1}^{n} \sum_{j=1}^{n_{i}} K_{h} (\boldsymbol{\beta}^{T} Z_{ij} - u) q_{2} (\eta_{ij}, Y_{ij}) (\boldsymbol{\beta}^{T} Z_{ij} - u)^{2} D_{ij}^{2} \{1 + o_{p}(1)\} \\ &= h c_{n} \sum_{i=1}^{n} \sum_{j=1}^{n_{i}} K_{h} (\boldsymbol{\beta}^{T} Z_{ij} - u) q_{1} (\eta_{ij}, Y_{ij}) D_{ij} \\ &\quad + \frac{1}{2} c_{n}^{-1} f''(u) h^{2} \kappa_{2} f_{U}(u) E\{\rho_{2}(\eta) D^{2} \mid \boldsymbol{\beta}^{T} Z = u\} + O_{p}(\delta_{n}). \end{aligned}$ Using similar arguments, we also have $\mathbf{W}_{n} = -f_{U}(u) E\{\rho_{2}(\eta) \Xi \mid \boldsymbol{\beta}^{T} Z = u\} + O_{p}(h^{2} + \delta_{n}),$ where $\Xi = D^{2} \, \mathrm{diag}(1, \kappa_{2})$. By the concavity of the function $L(\Lambda)$, we obtain $\hat{\Lambda} = -\mathbf{W}_{n}^{-1} \mathbf{T}_{n} + o_{p}(1)$. Therefore, $\begin{aligned} \hat{f}(u) - f(u) &= \frac{1}{2} f''(u) h^{2} \kappa_{2} + [f_{U}(u) E\{\rho_{2}(\eta) D^{2} \mid \boldsymbol{\beta}^{T} Z = u\}]^{-1} \\ &\quad \times N^{-1} \sum_{i=1}^{n} \sum_{j=1}^{n_{i}} K_{h} (\boldsymbol{\beta}^{T} Z_{ij} - u) q_{1} (\eta_{ij}, Y_{ij}) D_{ij} + O_{p}(\delta_{n}). \end{aligned}$

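Step 2: Derive the expressions of $\frac{\partial \hat{f}(\boldsymbol{\beta}^{T} z; \boldsymbol{\alpha}, \boldsymbol{\beta})}{\partial \boldsymbol{\alpha}} \Big|_{(\boldsymbol{\alpha}_{0}, \boldsymbol{\beta}_{0}^{(1)})}$ and $\frac{\partial \hat{f}(\boldsymbol{\beta}^{T} z; \boldsymbol{\alpha}, \boldsymbol{\beta})}{\partial \boldsymbol{\beta}^{(1)}} \Big|_{(\boldsymbol{\alpha}_{0}, \boldsymbol{\beta}_{0}^{(1)})}$.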

From Step 1, $\hat{f}(\boldsymbol{\beta}^{T} z)$ and $\hat{f}'(\boldsymbol{\beta}^{T} z)$ satisfy the following equation: $\begin{aligned} & N^{-1} \sum_{i=1}^{n} \sum_{j=1}^{n_{i}} K_{h} (\boldsymbol{\beta}^{T} Z_{ij} - \boldsymbol{\beta}^{T} z) \\ &\quad \times q_{1} [\boldsymbol{\alpha}^{T} X_{ij} + \{\hat{a} + \hat{b} (\boldsymbol{\beta}^{T} Z_{ij} - \boldsymbol{\beta}^{T} z)\} D_{ij} + A_{ij}^{T} \boldsymbol{\gamma}_{i}, Y_{ij}] D_{ij} \{1 + o_{p}(1)\} = 0, \end{aligned}$ (A1) where $\hat{a} = \hat{f}(\boldsymbol{\beta}^{T} z)$ and $\hat{b} = \hat{f}'(\boldsymbol{\beta}^{T} z)$. Taking the derivative with respect to $\boldsymbol{\alpha}$ yields $\begin{aligned} & N^{-1} \sum_{i=1}^{n} \sum_{j=1}^{n_{i}} K_{h} (\boldsymbol{\beta}^{T} Z_{ij} - \boldsymbol{\beta}^{T} z) q_{2} [\boldsymbol{\alpha}^{T} X_{ij} + \{\hat{a} + \hat{b} (\boldsymbol{\beta}^{T} Z_{ij} - \boldsymbol{\beta}^{T} z)\} D_{ij} + A_{ij}^{T} \boldsymbol{\gamma}_{i}, Y_{ij}] D_{ij} \\ &\quad \times \Big[X_{ij} + \Big\{\frac{\partial \hat{a}}{\partial \boldsymbol{\alpha}} + \frac{\partial \hat{b}}{\partial \boldsymbol{\alpha}} (\boldsymbol{\beta}^{T} Z_{ij} - \boldsymbol{\beta}^{T} z)\Big\} D_{ij}\Big] = 0. \end{aligned}$ From this equation we obtain $\frac{\partial \hat{a}}{\partial \boldsymbol{\alpha}} = -A_{n}^{-1} (B_{1n} + B_{2n}),$ with $\begin{aligned} A_{n} &= N^{-1} \sum_{i=1}^{n} \sum_{j=1}^{n_{i}} K_{h} (\boldsymbol{\beta}^{T} Z_{ij} - \boldsymbol{\beta}^{T} z) q_{2} [\boldsymbol{\alpha}^{T} X_{ij} + \{\hat{a} + \hat{b} (\boldsymbol{\beta}^{T} Z_{ij} - \boldsymbol{\beta}^{T} z)\} D_{ij} + A_{ij}^{T} \boldsymbol{\gamma}_{i}, Y_{ij}] D_{ij}^{2}, \\ B_{1n} &= N^{-1} \sum_{i=1}^{n} \sum_{j=1}^{n_{i}} K_{h} (\boldsymbol{\beta}^{T} Z_{ij} - \boldsymbol{\beta}^{T} z) q_{2} [\boldsymbol{\alpha}^{T} X_{ij} + \{\hat{a} + \hat{b} (\boldsymbol{\beta}^{T} Z_{ij} - \boldsymbol{\beta}^{T} z)\} D_{ij} + A_{ij}^{T} \boldsymbol{\gamma}_{i}, Y_{ij}] D_{ij} X_{ij}, \\ B_{2n} &= N^{-1} \sum_{i=1}^{n} \sum_{j=1}^{n_{i}} K_{h} (\boldsymbol{\beta}^{T} Z_{ij} - \boldsymbol{\beta}^{T} z) q_{2} [\boldsymbol{\alpha}^{T} X_{ij} + \{\hat{a} + \hat{b} (\boldsymbol{\beta}^{T} Z_{ij} - \boldsymbol{\beta}^{T} z)\} D_{ij} + A_{ij}^{T} \boldsymbol{\gamma}_{i}, Y_{ij}] D_{ij}^{2} \\ &\quad \times (\boldsymbol{\beta}^{T} Z_{ij} - \boldsymbol{\beta}^{T} z) \frac{\partial \hat{b}}{\partial \boldsymbol{\alpha}}. \end{aligned}$ Taking the derivative of (A1) with respect to $\boldsymbol{\beta}^{(1)}$, we get $\begin{aligned} & N^{-1} \sum_{i=1}^{n} \sum_{j=1}^{n_{i}} h^{-1} K_{h}' (\boldsymbol{\beta}^{T} Z_{ij} - \boldsymbol{\beta}^{T} z) \mathbf{J}^{T} (Z_{ij} - z) q_{1} [\boldsymbol{\alpha}^{T} X_{ij} + \{\hat{a} + \hat{b} (\boldsymbol{\beta}^{T} Z_{ij} - \boldsymbol{\beta}^{T} z)\} D_{ij} + A_{ij}^{T} \boldsymbol{\gamma}_{i}, Y_{ij}] D_{ij} \\ &\quad + N^{-1} \sum_{i=1}^{n} \sum_{j=1}^{n_{i}} K_{h} (\boldsymbol{\beta}^{T} Z_{ij} - \boldsymbol{\beta}^{T} z) q_{2} [\boldsymbol{\alpha}^{T} X_{ij} + \{\hat{a} + \hat{b} (\boldsymbol{\beta}^{T} Z_{ij} - \boldsymbol{\beta}^{T} z)\} D_{ij} + A_{ij}^{T} \boldsymbol{\gamma}_{i}, Y_{ij}] D_{ij}^{2} \\ &\quad \times \Big\{\frac{\partial \hat{a}}{\partial \boldsymbol{\beta}^{(1)}} + \frac{\partial \hat{b}}{\partial \boldsymbol{\beta}^{(1)}} (\boldsymbol{\beta}^{T} Z_{ij} - \boldsymbol{\beta}^{T} z) + \hat{b} \mathbf{J}^{T} (Z_{ij} - z)\Big\} = 0. \end{aligned}$ It follows that $\frac{\partial \hat{a}}{\partial \boldsymbol{\beta}^{(1)}} = -A_{n}^{-1} (C_{1n} + C_{2n} + C_{3n}),$ where $\begin{aligned} C_{1n} &= N^{-1} \sum_{i=1}^{n} \sum_{j=1}^{n_{i}} K_{h} (\boldsymbol{\beta}^{T} Z_{ij} - \boldsymbol{\beta}^{T} z) q_{2} [\boldsymbol{\alpha}^{T} X_{ij} + \{\hat{a} + \hat{b} (\boldsymbol{\beta}^{T} Z_{ij} - \boldsymbol{\beta}^{T} z)\} D_{ij} + A_{ij}^{T} \boldsymbol{\gamma}_{i}, Y_{ij}] D_{ij}^{2} \hat{b} \mathbf{J}^{T} (Z_{ij} - z), \\ C_{2n} &= N^{-1} \sum_{i=1}^{n} \sum_{j=1}^{n_{i}} K_{h} (\boldsymbol{\beta}^{T} Z_{ij} - \boldsymbol{\beta}^{T} z) q_{2} [\boldsymbol{\alpha}^{T} X_{ij} + \{\hat{a} + \hat{b} (\boldsymbol{\beta}^{T} Z_{ij} - \boldsymbol{\beta}^{T} z)\} D_{ij} + A_{ij}^{T} \boldsymbol{\gamma}_{i}, Y_{ij}] D_{ij}^{2} \\ &\quad \times (\boldsymbol{\beta}^{T} Z_{ij} - \boldsymbol{\beta}^{T} z) \frac{\partial \hat{b}}{\partial \boldsymbol{\beta}^{(1)}}, \\ C_{3n} &= N^{-1} \sum_{i=1}^{n} \sum_{j=1}^{n_{i}} h^{-1} K_{h}' (\boldsymbol{\beta}^{T} Z_{ij} - \boldsymbol{\beta}^{T} z) \mathbf{J}^{T} (Z_{ij} - z) q_{1} [\boldsymbol{\alpha}^{T} X_{ij} + \{\hat{a} + \hat{b} (\boldsymbol{\beta}^{T} Z_{ij} - \boldsymbol{\beta}^{T} z)\} D_{ij} + A_{ij}^{T} \boldsymbol{\gamma}_{i}, Y_{ij}] D_{ij}. \end{aligned}$

By calculations similar to those in Cui et al. (2011), $\begin{aligned} A_{n} &= -f_{U}(\boldsymbol{\beta}^{T} z) E\{\rho_{2}(\eta) D^{2} \mid \boldsymbol{\beta}^{T} Z = \boldsymbol{\beta}^{T} z\} + O_{p}(h^{2} + n^{-1/2} h^{-1/2}), \\ B_{1n} &= -f_{U}(\boldsymbol{\beta}^{T} z) E\{\rho_{2}(\eta) D X \mid \boldsymbol{\beta}^{T} Z = \boldsymbol{\beta}^{T} z\} + O_{p}(h^{2} + n^{-1/2} h^{-1/2}), \\ B_{2n} &= O_{p}(h^{2} + n^{-1/2} h^{1/2}), \\ C_{1n} &= -f_{U}(\boldsymbol{\beta}^{T} z) f'(\boldsymbol{\beta}^{T} z) \mathbf{J}^{T} [E\{\rho_{2}(\eta) D^{2} Z \mid \boldsymbol{\beta}^{T} Z = \boldsymbol{\beta}^{T} z\} - E\{\rho_{2}(\eta) D^{2} \mid \boldsymbol{\beta}^{T} Z = \boldsymbol{\beta}^{T} z\} z] \\ &\quad + O_{p}(h^{2} + n^{-1/2} h^{-1/2}), \\ C_{2n} &= O_{p}(h^{2} + n^{-1/2} h^{1/2}), \\ C_{3n} &= O_{p}(h^{2} + n^{-1/2} h^{-3/2}). \end{aligned}$ Therefore, $\begin{aligned} \frac{\partial \hat{a}}{\partial \boldsymbol{\alpha}} &= -[E\{\rho_{2}(\eta) D^{2} \mid \boldsymbol{\beta}^{T} Z = \boldsymbol{\beta}^{T} z\}]^{-1} [E\{\rho_{2}(\eta) D X \mid \boldsymbol{\beta}^{T} Z = \boldsymbol{\beta}^{T} z\}] + O(h^{2} + n^{-1/2} h^{1/2}), \\ \frac{\partial \hat{a}}{\partial \boldsymbol{\beta}^{(1)}} &= f'(\boldsymbol{\beta}^{T} z) \mathbf{J}^{T} \big(z - [E\{\rho_{2}(\eta) D^{2} \mid \boldsymbol{\beta}^{T} Z = \boldsymbol{\beta}^{T} z\}]^{-1} [E\{\rho_{2}(\eta) D^{2} Z \mid \boldsymbol{\beta}^{T} Z = \boldsymbol{\beta}^{T} z\}]\big) \\ &\quad + O(h^{2} + n^{-1/2} h^{1/2}). \end{aligned}$

We further denote $\begin{aligned} \tilde{X} &= X - [E\{\rho_{2}(\eta) D^{2} \mid \boldsymbol{\beta}^{T} Z\}]^{-1} [E\{\rho_{2}(\eta) D X \mid \boldsymbol{\beta}^{T} Z\}], \\ \tilde{Z} &= f'(\boldsymbol{\beta}^{T} Z) \mathbf{J}^{T} \big(Z - [E\{\rho_{2}(\eta) D^{2} \mid \boldsymbol{\beta}^{T} Z\}]^{-1} [E\{\rho_{2}(\eta) D^{2} Z \mid \boldsymbol{\beta}^{T} Z\}]\big). \end{aligned}$ It is easy to see that $E\{\rho_{2}(\eta) D \tilde{X} \mid \boldsymbol{\beta}^{T} Z\} = 0, \quad E\{\rho_{2}(\eta) D^{2} \tilde{Z} \mid \boldsymbol{\beta}^{T} Z\} = 0.$ (A2)

Step 3: The asymptotic normality of $\hat{\boldsymbol{\alpha}}$ and $\hat{\boldsymbol{\beta}}$.

From the estimating equations (8) and (9), $\sqrt{N} \begin{pmatrix} \hat{\boldsymbol{\alpha}} - \boldsymbol{\alpha}_{0} \\ \hat{\boldsymbol{\beta}}^{(1)} - \boldsymbol{\beta}_{0}^{(1)} \end{pmatrix} = A_{n}^{-1} \sqrt{N} B_{n} + o_{p}(1),$ where $\begin{aligned} A_{n} &= \frac{1}{N} \sum_{i=1}^{n} \sum_{j=1}^{n_{i}} \rho_{2}(\eta_{ij}) \begin{pmatrix} X_{ij} \\ f'(\boldsymbol{\beta}_{0}^{T} Z_{ij}) \mathbf{J}^{T} Z_{ij} D_{ij} \end{pmatrix} \begin{pmatrix} \tilde{X}_{ij} \\ \tilde{Z}_{ij} D_{ij} \end{pmatrix}^{T}, \\ B_{n} &= \frac{1}{N} \sum_{i=1}^{n} \sum_{j=1}^{n_{i}} \begin{pmatrix} X_{ij} \\ f'(\boldsymbol{\beta}_{0}^{T} Z_{ij}) \mathbf{J}^{T} Z_{ij} D_{ij} \end{pmatrix} \{g'(\mu_{ij})\}^{-1} V^{-1}(\mu_{ij}) \\ &\quad \times [Y_{ij} - g^{-1}\{\boldsymbol{\alpha}_{0}^{T} X_{ij} + \hat{f}(\boldsymbol{\beta}_{0}^{T} Z_{ij}) D_{ij} + A_{ij}^{T} \boldsymbol{\gamma}_{i}\}]. \end{aligned}$

Based on (A2), we have $A_{n} = E\Big(\rho_{2}(\eta) \begin{pmatrix} \tilde{X} \\ \tilde{Z} D \end{pmatrix}^{\otimes 2}\Big) + o_{p}(1) \triangleq A + o_{p}(1).$ By Taylor expansion, $B_{n} = B_{1n} + B_{2n} \{1 + o_{p}(1)\},$ where $\begin{aligned} B_{1n} &= \frac{1}{N} \sum_{i=1}^{n} \sum_{j=1}^{n_{i}} \begin{pmatrix} X_{ij} \\ f'(\boldsymbol{\beta}_{0}^{T} Z_{ij}) \mathbf{J}^{T} Z_{ij} D_{ij} \end{pmatrix} q_{1}(\eta_{ij}, Y_{ij}), \\ B_{2n} &= -\frac{1}{N} \sum_{i=1}^{n} \sum_{j=1}^{n_{i}} \begin{pmatrix} X_{ij} \\ f'(\boldsymbol{\beta}_{0}^{T} Z_{ij}) \mathbf{J}^{T} Z_{ij} D_{ij} \end{pmatrix} \rho_{2}(\eta_{ij}) \{\hat{f}(\boldsymbol{\beta}_{0}^{T} Z_{ij}) - f(\boldsymbol{\beta}_{0}^{T} Z_{ij})\} D_{ij}. \end{aligned}$ From the result of Step 1, $\begin{aligned} \sqrt{N} B_{2n} &= -\frac{1}{\sqrt{N}} \sum_{i=1}^{n} \sum_{j=1}^{n_{i}} \begin{pmatrix} X_{ij} \\ f'(\boldsymbol{\beta}_{0}^{T} Z_{ij}) \mathbf{J}^{T} Z_{ij} D_{ij} \end{pmatrix} \rho_{2}(\eta_{ij}) D_{ij} [f_{U}(\boldsymbol{\beta}_{0}^{T} Z_{ij}) E\{\rho_{2}(\eta) D^{2} \mid \boldsymbol{\beta}_{0}^{T} Z = \boldsymbol{\beta}_{0}^{T} Z_{ij}\}]^{-1} \\ &\quad \times \frac{1}{N} \sum_{k=1}^{n} \sum_{l=1}^{n_{k}} K_{h}(\boldsymbol{\beta}_{0}^{T} Z_{kl} - \boldsymbol{\beta}_{0}^{T} Z_{ij}) q_{1}(\eta_{kl}, Y_{kl}) D_{kl} \\ &\quad - \frac{1}{2} h^{2} \kappa_{2} \frac{1}{\sqrt{N}} \sum_{i=1}^{n} \sum_{j=1}^{n_{i}} \begin{pmatrix} X_{ij} \\ f'(\boldsymbol{\beta}_{0}^{T} Z_{ij}) \mathbf{J}^{T} Z_{ij} D_{ij} \end{pmatrix} \rho_{2}(\eta_{ij}) D_{ij} f''(\boldsymbol{\beta}_{0}^{T} Z_{ij}) \\ &= -\frac{1}{\sqrt{N}} \sum_{k=1}^{n} \sum_{l=1}^{n_{k}} q_{1}(\eta_{kl}, Y_{kl}) D_{kl} \times \frac{1}{N} \sum_{i=1}^{n} \sum_{j=1}^{n_{i}} K_{h}(\boldsymbol{\beta}_{0}^{T} Z_{ij} - \boldsymbol{\beta}_{0}^{T} Z_{kl}) \\ &\quad \times \begin{pmatrix} X_{ij} \\ f'(\boldsymbol{\beta}_{0}^{T} Z_{ij}) \mathbf{J}^{T} Z_{ij} D_{ij} \end{pmatrix} \rho_{2}(\eta_{ij}) D_{ij} [f_{U}(\boldsymbol{\beta}_{0}^{T} Z_{ij}) E\{\rho_{2}(\eta) D^{2} \mid \boldsymbol{\beta}_{0}^{T} Z = \boldsymbol{\beta}_{0}^{T} Z_{ij}\}]^{-1} + O_{p}(n^{1/2} h^{2}) \\ &= -\frac{1}{\sqrt{N}} \sum_{i=1}^{n} \sum_{j=1}^{n_{i}} [E\{\rho_{2}(\eta) D^{2} \mid \boldsymbol{\beta}_{0}^{T} Z = \boldsymbol{\beta}_{0}^{T} Z_{ij}\}]^{-1} \begin{bmatrix} E\{\rho_{2}(\eta) D X \mid \boldsymbol{\beta}_{0}^{T} Z = \boldsymbol{\beta}_{0}^{T} Z_{ij}\} \\ f'(\boldsymbol{\beta}_{0}^{T} Z_{ij}) \mathbf{J}^{T} E\{\rho_{2}(\eta) D^{2} Z \mid \boldsymbol{\beta}_{0}^{T} Z = \boldsymbol{\beta}_{0}^{T} Z_{ij}\} \end{bmatrix} \\ &\quad \times D_{ij} q_{1}(\eta_{ij}, Y_{ij}) + o_{p}(1). \end{aligned}$ It is easy to see that $\sqrt{N} B_{n} = \frac{1}{\sqrt{N}} \sum_{i=1}^{n} \sum_{j=1}^{n_{i}} \begin{pmatrix} \tilde{X}_{ij} \\ \tilde{Z}_{ij} D_{ij} \end{pmatrix} q_{1}(\eta_{ij}, Y_{ij}) + o_{p}(1).$ By the central limit theorem, $\sqrt{N} B_{n}$ is asymptotically normal with mean zero. Denoting $s_{ij} = \begin{pmatrix} \tilde{X}_{ij} \\ \tilde{Z}_{ij} D_{ij} \end{pmatrix}$, we have $\begin{aligned} \mathrm{var}(\sqrt{N} B_{n}) &= \frac{1}{N} \sum_{i=1}^{n} \sum_{j=1}^{n_{i}} \mathrm{var}\{s_{ij} q_{1}(\eta_{ij}, Y_{ij})\} \\ &\quad + \frac{1}{N} \sum_{i=1}^{n} \sum_{j=1}^{n_{i}} \sum_{\substack{k=1 \\ k \neq j}}^{n_{i}} \mathrm{cov}\{s_{ij} q_{1}(\eta_{ij}, Y_{ij}), s_{ik} q_{1}(\eta_{ik}, Y_{ik})\}. \end{aligned}$ By simple calculation, $\mathrm{var}\{s_{ij} q_{1}(\eta_{ij}, Y_{ij})\} = E\{\rho_{2}(\eta) s s^{T}\} = A.$ Denoting $C = \mathrm{cov}\{s_{11} q_{1}(\eta_{11}, Y_{11}), s_{12} q_{1}(\eta_{12}, Y_{12})\}$, we have $\mathrm{var}(\sqrt{N} B_{n}) = A + \frac{N_{1}}{N} C.$ Therefore, as $n \to \infty$, $\sqrt{N} \begin{pmatrix} \hat{\boldsymbol{\alpha}} - \boldsymbol{\alpha}_{0} \\ \hat{\boldsymbol{\beta}}^{(1)} - \boldsymbol{\beta}_{0}^{(1)} \end{pmatrix} \to N\Big(0, A^{-1} + \frac{N_{1}}{N} A^{-1} C A^{-1}\Big).$

Denote $D = \begin{pmatrix} \mathbf{I}_{p} & \mathbf{0}_{p \times (q-1)} \\ \mathbf{0}_{q \times p} & \mathbf{J}_{\boldsymbol{\beta}_{0}^{(1)}} \end{pmatrix}$. An application of the multivariate delta method yields $\sqrt{N} \begin{pmatrix} \hat{\boldsymbol{\alpha}} - \boldsymbol{\alpha}_{0} \\ \hat{\boldsymbol{\beta}} - \boldsymbol{\beta}_{0} \end{pmatrix} \to N(0, \boldsymbol{\Sigma}),$ where $\boldsymbol{\Sigma} = D \big(A^{-1} + \frac{N_{1}}{N} A^{-1} C A^{-1}\big) D^{T}$.

Proof of Theorem 3.2

From Theorem 3.1, $\sqrt{N} \begin{pmatrix} \hat{\boldsymbol{\alpha}} - \boldsymbol{\alpha}_{0} \\ \hat{\boldsymbol{\beta}}^{(1)} - \boldsymbol{\beta}_{0}^{(1)} \end{pmatrix} = O_{p}(1)$. We write
$\begin{aligned} & \hat{f}(\hat{\boldsymbol{\beta}}^{T} z; \hat{\boldsymbol{\alpha}}, \hat{\boldsymbol{\beta}}) - f(\boldsymbol{\beta}_{0}^{T} z) \\ &= \hat{f}(\hat{\boldsymbol{\beta}}^{T} z; \hat{\boldsymbol{\alpha}}, \hat{\boldsymbol{\beta}}) - \hat{f}(\boldsymbol{\beta}_{0}^{T} z) + \hat{f}(\boldsymbol{\beta}_{0}^{T} z) - f(\boldsymbol{\beta}_{0}^{T} z) \\ &= \left. \frac{\partial \hat{f}(\boldsymbol{\beta}^{T} z; \boldsymbol{\alpha}, \boldsymbol{\beta})}{\partial (\boldsymbol{\alpha}^{T}, \boldsymbol{\beta}^{(1) T})} \right|_{(\boldsymbol{\alpha}_{0}, \boldsymbol{\beta}_{0}^{(1)})} \begin{pmatrix} \hat{\boldsymbol{\alpha}} - \boldsymbol{\alpha}_{0} \\ \hat{\boldsymbol{\beta}}^{(1)} - \boldsymbol{\beta}_{0}^{(1)} \end{pmatrix} \{1 + o_{p}(1)\} + \{\hat{f}(\boldsymbol{\beta}_{0}^{T} z) - f(\boldsymbol{\beta}_{0}^{T} z)\} \\ &= \hat{f}(\boldsymbol{\beta}_{0}^{T} z) - f(\boldsymbol{\beta}_{0}^{T} z) + O_{p}(n^{-1/2}). \end{aligned}$
Hence the asymptotic distribution of $\hat{f}(\hat{\boldsymbol{\beta}}^{T} z; \hat{\boldsymbol{\alpha}}, \hat{\boldsymbol{\beta}})$ is the same as that of $\hat{f}(\boldsymbol{\beta}_{0}^{T} z)$. Based on the result of Step 1, we now derive the asymptotic variance of $\hat{f}(\boldsymbol{\beta}_{0}^{T} z)$. Denote $r_{ij} = K_{h}(\boldsymbol{\beta}_{0}^{T} Z_{ij} - \boldsymbol{\beta}_{0}^{T} z)\, q_{1}(\eta_{ij}, Y_{ij})\, D_{ij}$. Then
$\begin{aligned} \operatorname{var}\Big\{\sum_{i=1}^{n} \sum_{j=1}^{n_{i}} r_{ij}\Big\} &= \sum_{i=1}^{n} \sum_{j=1}^{n_{i}} \operatorname{var}(r_{ij}) + \sum_{i=1}^{n} \sum_{j=1}^{n_{i}} \sum_{\substack{k=1 \\ k \neq j}}^{n_{i}} \operatorname{cov}(r_{ij}, r_{ik}), \end{aligned}$
where
$\begin{aligned} \operatorname{var}(r_{ij}) &= E\, K_{h}^{2}(\boldsymbol{\beta}_{0}^{T} Z_{ij} - \boldsymbol{\beta}_{0}^{T} z)\, q_{1}^{2}(\eta_{ij}, Y_{ij})\, D_{ij}^{2} \\ &= h^{-1} \mu_{0} f_{U}(\boldsymbol{\beta}_{0}^{T} z)\, E\{\rho_{2}(\eta) D^{2} \mid \boldsymbol{\beta}_{0}^{T} Z = \boldsymbol{\beta}_{0}^{T} z\}, \\ \operatorname{cov}(r_{ij}, r_{ik}) &= f_{U}^{2}(\boldsymbol{\beta}_{0}^{T} z)\, E\{q_{1}(\eta_{11}, Y_{11})\, q_{1}(\eta_{12}, Y_{12})\, D_{11}^{2} D_{12}^{2} \mid \boldsymbol{\beta}_{0}^{T} Z_{11} = \boldsymbol{\beta}_{0}^{T} z,\ \boldsymbol{\beta}_{0}^{T} Z_{12} = \boldsymbol{\beta}_{0}^{T} z\}. \end{aligned}$
Therefore, we have $\sqrt{N h}\, \{\hat{f}(\hat{\boldsymbol{\beta}}^{T} z; \hat{\boldsymbol{\alpha}}, \hat{\boldsymbol{\beta}}) - f(\boldsymbol{\beta}_{0}^{T} z) - \frac{1}{2} h^{2} \kappa_{2} f''(\boldsymbol{\beta}_{0}^{T} z)\} \to N(0, \sigma_{f}^{2}),$ where $\sigma_{f}^{2} = \mu_{0} f_{U}^{-1}(\boldsymbol{\beta}_{0}^{T} z)\, [E\{\rho_{2}(\eta) D^{2} \mid \boldsymbol{\beta}_{0}^{T} Z = \boldsymbol{\beta}_{0}^{T} z\}]^{-1}$.
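
The limiting normal distribution above is what justifies Wald-type pointwise confidence intervals for the treatment effect curve. The following Python sketch shows one way such an interval could be computed at a single index value $\boldsymbol{\beta}_{0}^{T} z$ from plug-in estimates. The function and argument names (`pointwise_ci`, `select_treatment`, `f_hat`, `sigma_f_hat`, `bias_hat`) are hypothetical and do not come from the article. The decision rule, which compares the interval with zero and treats a positive effect as favouring treatment, is likewise an assumption made for illustration.

```python
from math import sqrt
from statistics import NormalDist


def pointwise_ci(f_hat, sigma_f_hat, n_obs, h, bias_hat=0.0, level=0.95):
    """Wald-type pointwise interval for f at one index value.

    Uses sqrt(N h) * (f_hat - f - bias) -> N(0, sigma_f^2), so the
    interval is (f_hat - bias_hat) +/- z * sigma_f_hat / sqrt(N h).
    All inputs are plug-in estimates supplied by the caller (assumption).
    """
    if n_obs <= 0 or h <= 0:
        raise ValueError("n_obs and h must be positive")
    # Two-sided standard normal quantile, e.g. about 1.96 for level=0.95.
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    half_width = z * sigma_f_hat / sqrt(n_obs * h)
    centre = f_hat - bias_hat  # bias_hat estimates 0.5 * h^2 * kappa_2 * f''
    return centre - half_width, centre + half_width


def select_treatment(ci):
    """Hypothetical rule: decide only when the interval excludes zero."""
    lower, upper = ci
    if lower > 0:
        return "treatment"
    if upper < 0:
        return "control"
    return "inconclusive"


# Illustrative numbers only; they are not taken from the article.
ci = pointwise_ci(f_hat=0.8, sigma_f_hat=1.5, n_obs=400, h=0.25)
print(ci, select_treatment(ci))
```

When the bandwidth is undersmoothed so that the bias term is negligible, `bias_hat` can be left at zero. Otherwise it would need an estimate of $f''$, which this sketch does not provide.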
