Statistical Theory and Related Fields Volume 4, 2020 - Issue 1

Semiparametric estimation for accelerated failure time mixture cure model allowing non-curable competing risk

Yijun Wang, Key Laboratory of Advanced Theory and Application in Statistics and Data Science - MOE, School of Statistics, East China Normal University, Shanghai, 200062, People’s Republic of China

Jiajia Zhang, Department of Epidemiology and Biostatistics, University of South Carolina, Columbia, SC, USA

Yincai Tang (corresponding author), Key Laboratory of Advanced Theory and Application in Statistics and Data Science - MOE, School of Statistics, East China Normal University, Shanghai, 200062, People’s Republic of China

Pages 97-108 | Received 16 Jan 2019, Accepted 24 Mar 2019, Published online: 11 Apr 2019

https://doi.org/10.1080/24754269.2019.1600123


Abstract

The mixture cure model is the most popular model used to analyse the major event with a potential cure fraction. But in the real world there may exist a potential risk from other non-curable competing events. In this paper, we study the accelerated failure time model with mixture cure model via kernel-based nonparametric maximum likelihood estimation allowing non-curable competing risk. An EM algorithm is developed to calculate the estimates for both the regression parameters and the unknown error densities, in which a kernel-smoothed conditional profile likelihood is maximised in the M-step, and the resulting estimates are consistent. Its performance is demonstrated through comprehensive simulation studies. Finally, the proposed method is applied to the colorectal clinical trial data.

Keywords:

AFT mixture cure model
competing risk
EM algorithm

1. Introduction

In some medical studies, such as those of prostate cancer or breast cancer, a substantial proportion of patients never experience the event of interest; these patients are treated as cured or non-susceptible subjects. A number of mixture cure models have been proposed in the literature for analysing such survival data. The two-component mixture cure model can capture cured and uncured patients simultaneously. The univariate mixture cure model was first used to model survival data by Boag (1949), who assumed that a fraction of the patients are cured of cancer and will never experience the event. Since then, parametric and semiparametric mixture cure models have been proposed (Li & Taylor, 2002; Peng & Dear, 2000; Peng, Dear, & Denham, 1998; Sy & Taylor, 2000). The proportional hazards mixture cure (PHMC) model was further discussed by Sy and Taylor (2000) and Peng (2003). In addition, Li and Taylor (2002), Zhang and Peng (2007), Xu and Zhang (2009) and Lu (2010) developed the accelerated failure time mixture cure (AFTMC) model.

Whereas the PHMC model specifies that the covariates act multiplicatively on the hazard function, the AFTMC model regresses the logarithm of the failure time on the covariates, assuming a direct relationship between the failure time and the covariates. Li and Taylor (2002) and Zhang and Peng (2007) estimated the parameters of the AFTMC model with semiparametric methods through the EM algorithm, and the large-sample properties were discussed in Xu and Zhang (2009). Lu (2010) developed an AFTMC model in which the error density is estimated by the kernel method.

The competing risk model is the main tool for handling more than one type of event in survival analysis (Crowder, 2001; David & Moeschberge, 1978; Kalbfleisch & Prentice, 2002; Kleinbaum & Klein, 2006). There are two common approaches to competing risk data: one is based on the cumulative incidence function (CIF) (Fine & Gray, 1999; Kalbfleisch & Prentice, 2002; Tai, Machin, White, & Gebski, 2001), the other on the cause-specific hazard or subhazard (Fusaro, Bacchetti, & Jewell, 1996; Gaynor et al., 1993; Klein, 2006; Ohneberg & Schumacher, 2017; Pintilie, 2007). Usually, the marginal hazards equal the cause-specific hazards when independence among the events of interest is assumed (Gamel, Weller, Wesley, & Feuer, 2000; Kuk, 1992; Larson & Dinse, 1985; Ng & McLachlan, 2003). In the mixture model, the marginal hazards are the hazards of a single risk (Lambert, Thompson, Weston, & Dickman, 2006; Yu & Tiwari, 2007; Yu, Tiwari, Cronin, & Feuer, 2004).

Owing to the scarcity of efficient and reliable computational methods, there is little literature dealing with the AFTMC model for competing risk data. In this paper, we assume that all patients may experience either event 1 or event 2, where event 1 is the primary event with a potential cure rate and event 2 comprises all other possible events of interest, which are non-curable. Unlike the standard competing risk model, a cured fraction exists for event 1. Patients who are cured of event 1 will only experience event 2, while patients who are uncured will experience either event 1 or event 2. The rest of the paper is organised as follows. In Section 2, the accelerated failure time mixture cure model allowing non-curable competing risk is developed. In Section 3, an EM algorithm is developed to implement the estimation; in the M-step we maximise a kernel-smoothed conditional profile likelihood, motivated by the efficient estimation of Zeng and Lin (2007) for the accelerated failure time model without a cure fraction. We show in the appendix that the estimates are consistent and efficient. The proposed model and method are evaluated by a comprehensive simulation study in Section 4 and illustrated by a real data analysis in Section 5. Conclusions are given in Section 6.

2. Accelerated failure time mixture cure model allowing non-curable competing risk

Consider a study with n subjects. For subject i, let $T_{1, i}$ denote the time to the event of primary interest and $T_{2, i}$ the time to the event from other risks. In addition, let $X_{i}$ denote the covariates and $C_{i}$ the censoring time for subject i. The event indicator is $ε_{i}$ ( $ε_{i} = 1$ for the event of interest; $ε_{i} = 2$ for other risks), and we define $T_{i} = T_{j, i}$ when $ε_{i} = j, j = 1, 2$ , ${\tilde{T}}_{i} = \min (T_{i}, C_{i})$ and $δ_{i} = I (T_{i} \leq C_{i})$ . The observed data then consist of $O = \{({\tilde{t}}_{i}, δ_{i}, ε_{i}, X_{i}), i = 1, \dots, n\}$ , where ${\tilde{t}}_{i}$ is the observed value of ${\tilde{T}}_{i}$ . Let $Y_{i}$ be the indicator that patient i is uncured, i.e. subject to the event of primary interest. That is, $Y_{i} = 1$ means that subject i is subject to the event of primary interest, which occurs with uncured probability $P (Y_{i} = 1)$ . Note that $Y_{i}$ is not directly observable. As usual, we assume that $C_{i}$ is independent of $(T_{1, i}, T_{2, i}, Y_{i})$ given $X_{i}$ .

As commonly done in the mixture cure models, we assume a logistic regression for $Y_{i}$ , i.e. $π_{α} (X_{i}) \equiv P (Y_{i} = 1 | X_{i}) = \frac{\exp (α^{'} {\tilde{X}}_{i})}{1 + \exp (α^{'} {\tilde{X}}_{i})},$ where ${\tilde{X}}_{i} = (1, X_{i}^{'})^{'}$ .

To model the competing risk data, we adopt a mixture regression approach similar to that studied in Lu and Peng (2008). Note that when $Y_{i} = 0$ , $ε_{i} \equiv 2$ . When $Y_{i} = 1$ , we assume a logistic regression model $π_{θ} (X_{i}) \equiv P (ε_{i} = 1 | X_{i}, Y_{i} = 1) = \frac{\exp (θ^{'} {\tilde{X}}_{i})}{1 + \exp (θ^{'} {\tilde{X}}_{i})} .$ Then, we have $P (ε_{i} = 1 | X_{i}) = π_{α} (X_{i}) π_{θ} (X_{i})$ and $P (ε_{i} = 2 | X_{i}) = 1 - π_{α} (X_{i}) + π_{α} (X_{i}) \{1 - π_{θ} (X_{i})\}$ . Here, we use a logistic link for both $π_{α}$ and $π_{θ}$ , but other link functions, such as the log-log and probit links, can also be used.
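The two logistic models combine into the marginal event-type probabilities in a few lines. The following is a minimal sketch rather than the authors' code: the function names are hypothetical, the coefficient values are borrowed from the simulation settings in Section 4, and the covariate value is made up for illustration.

```python
import math

def expit(z):
    """Numerically stable logistic function 1 / (1 + exp(-z))."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def event_probabilities(alpha, theta, x):
    """Return (P(eps = 1 | x), P(eps = 2 | x)).

    alpha, theta: coefficient lists with the intercept first.
    x: covariates without the intercept (hypothetical input).
    """
    x_tilde = [1.0] + list(x)
    pi_alpha = expit(sum(a * v for a, v in zip(alpha, x_tilde)))  # P(Y = 1 | x)
    pi_theta = expit(sum(t * v for t, v in zip(theta, x_tilde)))  # P(eps = 1 | x, Y = 1)
    p1 = pi_alpha * pi_theta
    p2 = (1.0 - pi_alpha) + pi_alpha * (1.0 - pi_theta)
    return p1, p2

# alpha and theta as in Section 4; x = (X1, X2) is an illustrative value.
p1, p2 = event_probabilities([2.0, 1.0, 1.0], [0.5, -0.5, 0.5], [0.0, 1.0])
```

By construction the two probabilities sum to one for any covariate value, which is a convenient sanity check when other link functions are substituted.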

In addition, we consider an accelerated failure time regression model for $T_{i}$ given $X_{i}$ and $ε_{i} = j$ , i.e.

$\log T_{j, i} = β_{j}^{T} X_{i} + ϵ_{j, i}, \quad j = 1, 2,$ (1)

$f_{j} (t | X_{i}) = \exp (- β_{j}^{T} X_{i}) f_{ϵ_{j}} (e^{\log (t) - β_{j}^{T} X_{i}}) \quad \text{or} \quad S_{j} (t | X_{i}) = S_{ϵ_{j}} (e^{\log (t) - β_{j}^{T} X_{i}}), \quad j = 1, 2,$ (2)

where $β_{j}$ is a vector of unknown parameters, $ϵ_{j, i}$ denotes the error term for event j with an unknown survival function, and $f_{ϵ_{j}}$ and $S_{ϵ_{j}}$ are, respectively, the density and survival function of $e^{ϵ_{j, i}}$ .

Define the cumulative incidence functions (CIFs) as

$F_{j} (t, X_{i}) = P (T_{i} \leq t, ε_{i} = j | X_{i}), \quad j = 1, 2.$ (3)

Then, we have $F_{1} (t, X_{i}) = π_{α} (X_{i}) π_{θ} (X_{i}) \{1 - S_{1} (t | X_{i})\}$ and $F_{2} (t, X_{i}) = [1 - π_{α} (X_{i}) π_{θ} (X_{i})] \{1 - S_{2} (t | X_{i})\} .$ Moreover,

$\begin{aligned} S (t | X_{i}) & \equiv P (T_{i} > t | X_{i}) \\ & = \sum_{j = 1}^{2} P (T_{i} > t | ε_{i} = j, X_{i}) P (ε_{i} = j | X_{i}) \\ & = \{1 - π_{α} (X_{i})\} S_{2} (t | X_{i}) + π_{α} (X_{i}) [π_{θ} (X_{i}) S_{1} (t | X_{i}) + \{1 - π_{θ} (X_{i})\} S_{2} (t | X_{i})] \\ & = π_{α} (X_{i}) π_{θ} (X_{i}) S_{1} (t | X_{i}) + [1 - π_{α} (X_{i}) π_{θ} (X_{i})] S_{2} (t | X_{i}) . \end{aligned}$ (4)

We see that it is just a mixture model with weights $π_{α} (X_{i}) π_{θ} (X_{i})$ and $1 - π_{α} (X_{i}) π_{θ} (X_{i})$ on the models of event 1 (primary) and event 2 (others). When only one risk is considered, i.e. $π_{θ} (X_{i}) = 1$ and $S_{2} (t | X_{i}) = 1$ , it reduces to the simple mixture cure model $S (t | X_{i}) = 1 - π_{α} (X_{i}) + π_{α} (X_{i}) S_{1} (t | X_{i})$ .
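As a rough numerical illustration of (3) and (4), the sketch below evaluates the two CIFs and the overall survival function at one time point. It assumes, purely for illustration, that $e^{ϵ_{j}}$ follows a Weibull distribution with unit scale (the paper leaves the error distribution unspecified); all function names and input values are hypothetical.

```python
import math

def aft_survival(t, x, beta, shape):
    """S_j(t | x) = S_eps(exp(log t - beta'x)), assuming exp(eps_j) ~ Weibull(shape, scale 1)."""
    u = math.exp(math.log(t) - sum(b * v for b, v in zip(beta, x)))
    return math.exp(-(u ** shape))

def cif_and_survival(t, x, pi_alpha, pi_theta, beta1, beta2, shape1, shape2):
    """Return (F1, F2, S) at time t, following equations (3) and (4)."""
    w1 = pi_alpha * pi_theta                 # mixture weight on event 1
    s1 = aft_survival(t, x, beta1, shape1)
    s2 = aft_survival(t, x, beta2, shape2)
    f1 = w1 * (1.0 - s1)
    f2 = (1.0 - w1) * (1.0 - s2)
    s = w1 * s1 + (1.0 - w1) * s2
    return f1, f2, s

# Illustrative inputs only; the probabilities 0.9 and 0.6 are made up.
f1, f2, s = cif_and_survival(2.0, [0.3, 1.0], 0.9, 0.6,
                             [1.0, 1.0], [2.0, 2.0], 0.5, 0.1)
```

The identity $F_{1} + F_{2} + S = 1$ holds at every time point, which follows directly from the mixture form in (4).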

Define $Θ = (α, θ, β_{1}, β_{2}, f_{ϵ_{1}} (\cdot), f_{ϵ_{2}} (\cdot))$ . With known $ε_{i}$ and $δ_{i}$ , $i = 1, 2, \dots, n$ , the likelihood function can be written as

$\begin{aligned} L_{o} = \prod_{i = 1}^{n} & \{π_{α} (X_{i}) π_{θ} (X_{i}) \exp (- β_{1}^{T} X_{i}) f_{ϵ_{1}} (e^{R_{i} (β_{1})})\}^{I (ε_{i} = 1) δ_{i}} \\ & \times ([1 - π_{α} (X_{i}) + π_{α} (X_{i}) \{1 - π_{θ} (X_{i})\}] \exp (- β_{2}^{T} X_{i}) f_{ϵ_{2}} (e^{R_{i} (β_{2})}))^{I (ε_{i} = 2) δ_{i}} \\ & \times \{S ({\tilde{t}}_{i} | X_{i})\}^{1 - δ_{i}}, \end{aligned}$ (5)

where $R_{i} (β_{j}) = \log ({\tilde{t}}_{i}) - β_{j}^{T} X_{i}, j = 1, 2$ . Direct maximisation of the above observed likelihood function is very challenging due to its complex structure. In the next section, we derive an EM algorithm that works with the complete likelihood function instead.

3. EM algorithm

As in the standard mixture cure model, if $Y_{i}$ could be observed, the complete likelihood function would be written as

$\begin{aligned} L_{c} = & \prod_{i = 1}^{n} \{π_{α} (X_{i}) π_{θ} (X_{i}) \exp (- β_{1}^{T} X_{i}) f_{ϵ_{1}} (e^{R_{i} (β_{1})})\}^{I (ε_{i} = 1) δ_{i}} \\ & \times ([\{1 - π_{α} (X_{i})\} + π_{α} (X_{i}) \{1 - π_{θ} (X_{i})\}] \exp (- β_{2}^{T} X_{i}) f_{ϵ_{2}} (e^{R_{i} (β_{2})}))^{I (ε_{i} = 2) δ_{i}} \\ & \times \{π_{α} (X_{i}) π_{θ} (X_{i}) S_{ϵ_{1}} (e^{R_{i} (β_{1})})\}^{I (ε_{i} = 1) (1 - δ_{i})} \\ & \times ([\{1 - π_{α} (X_{i})\} + π_{α} (X_{i}) \{1 - π_{θ} (X_{i})\}] S_{ϵ_{2}} (e^{R_{i} (β_{2})}))^{I (ε_{i} = 2) (1 - δ_{i})} \\ = & \prod_{i = 1}^{n} [π_{α} (X_{i}) π_{θ} (X_{i}) \exp (- β_{1}^{T} X_{i}) f_{ϵ_{1}} (e^{R_{i} (β_{1})})]^{I (ε_{i} = 1) δ_{i}} \\ & \times (\{1 - π_{α} (X_{i})\}^{1 - y_{i}} [π_{α} (X_{i}) \{1 - π_{θ} (X_{i})\}]^{y_{i}} \exp (- β_{2}^{T} X_{i}) f_{ϵ_{2}} (e^{R_{i} (β_{2})}))^{I (ε_{i} = 2) δ_{i}} \\ & \times \{π_{α} (X_{i}) π_{θ} (X_{i}) S_{ϵ_{1}} (e^{R_{i} (β_{1})})\}^{I (ε_{i} = 1) (1 - δ_{i})} \\ & \times (\{1 - π_{α} (X_{i})\}^{1 - y_{i}} [π_{α} (X_{i}) \{1 - π_{θ} (X_{i})\}]^{y_{i}} S_{ϵ_{2}} (e^{R_{i} (β_{2})}))^{I (ε_{i} = 2) (1 - δ_{i})} . \end{aligned}$ (6)

The logarithm of the complete likelihood function separates into four components according to their unknown parameters:

$l_{c} (α, θ, β_{1}, β_{2} | O, y) = l_{c_{1}} (α | O, y) + l_{c_{2}} (θ | O, y) + l_{c_{3}} (β_{1} | O, y) + l_{c_{4}} (β_{2} | O, y),$

where

$l_{c_{1}} (α | O, y) = \sum_{i = 1}^{n} [I (ε_{i} = 1) + I (ε_{i} = 2) y_{i}] \log π_{α} (X_{i}) + \sum_{i = 1}^{n} I (ε_{i} = 2) (1 - y_{i}) \log \{1 - π_{α} (X_{i})\},$ (7)

$l_{c_{2}} (θ | O, y) = \sum_{i = 1}^{n} I (ε_{i} = 1) \log π_{θ} (X_{i}) + \sum_{i = 1}^{n} I (ε_{i} = 2) y_{i} \log \{1 - π_{θ} (X_{i})\},$ (8)

$l_{c_{3}} (β_{1} | O, y) = \sum_{i = 1}^{n} [I (ε_{i} = 1) δ_{i} \{\log h_{ϵ_{1}} (e^{R_{i} (β_{1})}) - β_{1}^{T} X_{i}\} - I (ε_{i} = 1) H_{ϵ_{1}} (e^{R_{i} (β_{1})})],$ (9)

$l_{c_{4}} (β_{2} | O, y) = \sum_{i = 1}^{n} [I (ε_{i} = 2) δ_{i} \{\log h_{ϵ_{2}} (e^{R_{i} (β_{2})}) - β_{2}^{T} X_{i}\} - I (ε_{i} = 2) H_{ϵ_{2}} (e^{R_{i} (β_{2})})],$ (10)

and $f_{ϵ_{j}} = h_{ϵ_{j}} \cdot S_{ϵ_{j}}$ . Here $h_{ϵ_{j}}$ and $H_{ϵ_{j}}$ are, respectively, the hazard function and cumulative hazard function of $e^{ϵ_{j}}$ , $j = 1, 2$ .

We treat the $Y_{i}$ as latent variables and use the EM algorithm to find the maximum likelihood estimates of the unknown parameters. The E-step computes the conditional expectation of the complete log-likelihood function, which requires three unobserved probabilities: $P (Y_{i} = 1, ε_{i} = 1 | O, Θ^{(m)})$ , the probability that an uncured patient dies from the event of primary interest; $P (Y_{i} = 1, ε_{i} = 2 | O, Θ^{(m)})$ , the probability that an uncured patient dies from other risks; and $P (Y_{i} = 0, ε_{i} = 2 | O, Θ^{(m)})$ , the probability that a cured patient dies from other risks. Here m indicates the m-th step of the EM algorithm. These three probabilities sum to 1 and are given, respectively, by

$\begin{aligned} P (Y_{i} = 1, ε_{i} = 1 | O, Θ^{(m)}) & = \left. δ_{i} I (ε_{i} = 1) P (Y_{i} = 1 | ε_{i} = 1, δ_{i} = 1) \right|_{Θ^{(m)}} + \left. (1 - δ_{i}) P (Y_{i} = 1, ε_{i} = 1 | δ_{i} = 0) \right|_{Θ^{(m)}} \\ & = δ_{i} I (ε_{i} = 1) + \left. (1 - δ_{i}) \frac{π_{α} (X_{i}) π_{θ} (X_{i}) S_{1} ({\tilde{t}}_{i} | X_{i})}{S ({\tilde{t}}_{i} | X_{i})} \right|_{Θ^{(m)}}, \end{aligned}$ (11)

$\begin{aligned} P (Y_{i} = 1, ε_{i} = 2 | O, Θ^{(m)}) & = \left. δ_{i} I (ε_{i} = 2) P (Y_{i} = 1 | ε_{i} = 2, δ_{i} = 1) \right|_{Θ^{(m)}} + \left. (1 - δ_{i}) P (Y_{i} = 1, ε_{i} = 2 | δ_{i} = 0) \right|_{Θ^{(m)}} \\ & = \left. δ_{i} I (ε_{i} = 2) \frac{π_{α} (X_{i}) [1 - π_{θ} (X_{i})]}{1 - π_{α} (X_{i}) π_{θ} (X_{i})} \right|_{Θ^{(m)}} + \left. (1 - δ_{i}) \frac{π_{α} (X_{i}) \{1 - π_{θ} (X_{i})\} S_{2} ({\tilde{t}}_{i} | X_{i})}{S ({\tilde{t}}_{i} | X_{i})} \right|_{Θ^{(m)}}, \end{aligned}$ (12)

$\begin{aligned} P (Y_{i} = 0, ε_{i} = 2 | O, Θ^{(m)}) & = \left. δ_{i} I (ε_{i} = 2) P (Y_{i} = 0 | ε_{i} = 2, δ_{i} = 1) \right|_{Θ^{(m)}} + \left. (1 - δ_{i}) P (Y_{i} = 0, ε_{i} = 2 | δ_{i} = 0) \right|_{Θ^{(m)}} \\ & = \left. δ_{i} I (ε_{i} = 2) \frac{1 - π_{α} (X_{i})}{1 - π_{α} (X_{i}) π_{θ} (X_{i})} \right|_{Θ^{(m)}} + \left. (1 - δ_{i}) \frac{\{1 - π_{α} (X_{i})\} S_{2} ({\tilde{t}}_{i} | X_{i})}{S ({\tilde{t}}_{i} | X_{i})} \right|_{Θ^{(m)}} . \end{aligned}$ (13)
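The three E-step probabilities can be computed per subject as in the following minimal sketch. It takes the current values of $π_{α}$ , $π_{θ}$ , $S_{1}$ and $S_{2}$ for one subject as plain numbers; the function name and the numerical inputs are hypothetical and only meant to make the case distinction (observed event 1, observed event 2, censored) explicit.

```python
def e_step_weights(delta, eps, pi_alpha, pi_theta, s1, s2):
    """Return (p11, p12, p02) for one subject, in the form of (11)-(13).

    delta: 1 if the event time is observed, 0 if censored.
    eps:   observed event type (1 or 2); ignored when delta == 0.
    s1, s2: S_1 and S_2 evaluated at the observed time under the current parameters.
    """
    w1 = pi_alpha * pi_theta
    if delta == 1:
        if eps == 1:
            # An observed primary event implies an uncured subject.
            return 1.0, 0.0, 0.0
        denom = 1.0 - w1
        return 0.0, pi_alpha * (1.0 - pi_theta) / denom, (1.0 - pi_alpha) / denom
    # Censored: weight each latent class by its survival contribution.
    s = w1 * s1 + (1.0 - w1) * s2
    return (w1 * s1 / s,
            pi_alpha * (1.0 - pi_theta) * s2 / s,
            (1.0 - pi_alpha) * s2 / s)

censored = e_step_weights(0, None, 0.8, 0.5, 0.5, 0.5)
event2 = e_step_weights(1, 2, 0.8, 0.5, 0.5, 0.5)
```

In each of the three cases the returned probabilities sum to 1, in line with the statement preceding (11).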

Let ${\hat{p}}_{11, i}^{(m)} = P (Y_{i} = 1, ε_{i} = 1 | O, Θ^{(m)})$ and ${\hat{p}}_{ε, 1 i}^{(m)} = P (ε_{i} = 1 | O, Θ^{(m)})$ ; then ${\hat{p}}_{ε, 1 i}^{(m)} = {\hat{p}}_{11, i}^{(m)}$ . Let ${\hat{p}}_{12, i}^{(m)} = P (Y_{i} = 1, ε_{i} = 2 | O, Θ^{(m)})$ , ${\hat{p}}_{02, i}^{(m)} = P (Y_{i} = 0, ε_{i} = 2 | O, Θ^{(m)})$ and ${\hat{p}}_{ε, 2 i}^{(m)} = P (ε_{i} = 2 | O, Θ^{(m)})$ ; then ${\hat{p}}_{ε, 2 i}^{(m)} = {\hat{p}}_{12, i}^{(m)} + {\hat{p}}_{02, i}^{(m)}$ . The expectations of (7), (8), (9) and (10) can be written as

$E (l_{c_{1}}) = \sum_{i = 1}^{n} ([{\hat{p}}_{ε, 1 i}^{(m)} + {\hat{p}}_{12, i}^{(m)}] \log [π_{α} (X_{i})] + {\hat{p}}_{02, i}^{(m)} \log [1 - π_{α} (X_{i})]),$ (14)

$E (l_{c_{2}}) = \sum_{i = 1}^{n} ({\hat{p}}_{ε, 1 i}^{(m)} \log [π_{θ} (X_{i})] + {\hat{p}}_{12, i}^{(m)} \log [1 - π_{θ} (X_{i})]),$ (15)

$E (l_{c_{3}}) = \sum_{i = 1}^{n} \{I (ε_{i} = 1) δ_{i} [\log h_{ϵ_{1}} (e^{R_{i} (β_{1})}) - β_{1}^{T} X_{i}] - {\hat{p}}_{ε, 1 i}^{(m)} H_{ϵ_{1}} (e^{R_{i} (β_{1})})\},$ (16)

$E (l_{c_{4}}) = \sum_{i = 1}^{n} \{I (ε_{i} = 2) δ_{i} [\log h_{ϵ_{2}} (e^{R_{i} (β_{2})}) - β_{2}^{T} X_{i}] - {\hat{p}}_{ε, 2 i}^{(m)} H_{ϵ_{2}} (e^{R_{i} (β_{2})})\} .$ (17)

The M-step in the EM algorithm is to maximise (14), (15), (16) and (17) with respect to the unknown parameters α, θ, $β_{1}$ , $β_{2}$ , $h_{ϵ_{1}}$ and $h_{ϵ_{2}}$ . Maximising (14) and (15) with respect to α and θ can be easily carried out using the Newton–Raphson algorithm, and maximising (16) and (17) with respect to $β_{1}$ , $β_{2}$ , $h_{ϵ_{1}}$ and $h_{ϵ_{2}}$ can be carried out using the approach proposed by Zeng and Lin (2007).
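The update of α amounts to a logistic regression with fractional responses. The sketch below is one possible Newton–Raphson implementation, not the authors' code, restricted to an intercept and a single covariate so that the 2×2 system can be solved by hand. Here `w1` stands for ${\hat{p}}_{ε, 1 i}^{(m)} + {\hat{p}}_{12, i}^{(m)}$ and `w0` for ${\hat{p}}_{02, i}^{(m)}$ ; all names and data are hypothetical.

```python
import math

def expit(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def fit_alpha(x, w1, w0, max_iter=50, tol=1e-12):
    """Maximise sum_i [w1_i log pi_i + w0_i log(1 - pi_i)], pi_i = expit(a0 + a1 x_i)."""
    a0, a1 = 0.0, 0.0
    for _ in range(max_iter):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, u, v in zip(x, w1, w0):
            p = expit(a0 + a1 * xi)
            r = u - (u + v) * p            # score contribution
            k = (u + v) * p * (1.0 - p)    # information weight
            g0 += r
            g1 += r * xi
            h00 += k
            h01 += k * xi
            h11 += k * xi * xi
        det = h00 * h11 - h01 * h01
        d0 = (h11 * g0 - h01 * g1) / det   # Newton step: information^{-1} * score
        d1 = (h00 * g1 - h01 * g0) / det
        a0 += d0
        a1 += d1
        if d0 * d0 + d1 * d1 < tol:
            break
    return a0, a1

# With constant weights 0.75 / 0.25 the maximiser is a0 = log(3), a1 = 0.
a0, a1 = fit_alpha([-1.0, 0.0, 1.0, 2.0], [0.75] * 4, [0.25] * 4)
```

The same routine applies to θ in (15) with `w1` replaced by ${\hat{p}}_{ε, 1 i}^{(m)}$ and `w0` by ${\hat{p}}_{12, i}^{(m)}$ ; in that case the two weights need not sum to 1, which the score and information terms above already allow for.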

For simplicity, we describe the smooth estimators of $H_{ϵ_{1}}$ and $h_{ϵ_{1}}$ only; those of $H_{ϵ_{2}}$ and $h_{ϵ_{2}}$ are obtained in the same way. We start with the simplest case of a piecewise constant $h_{ϵ_{1}}$ . To be specific, we partition an interval containing all $e^{\log {\tilde{t}}_{i} - β_{1}^{T} x_{i}}$ into $J_{n}$ equally spaced intervals, $0 \equiv t_{0} < t_{1} < \dots < t_{J_{n}} \equiv M$ , where M denotes an upper bound for the $e^{\log {\tilde{t}}_{i} - β_{1}^{T} x_{i}}$ over all possible $β_{1}$ in a bounded set. A piecewise constant $h_{ϵ_{1}}$ takes the form

$h_{ϵ_{1}} (t) = \sum_{k = 1}^{J_{n}} c_{k} I (t \in [t_{k - 1}, t_{k})) .$ (18)

Then, for any t,

$H_{ϵ_{1}} (t) = \sum_{k = 1}^{J_{n}} c_{k} (t - t_{k - 1}) I (t_{k - 1} \leq t < t_{k}) + \frac{M}{J_{n}} \sum_{k = 1}^{J_{n}} c_{k} I (t \geq t_{k}) .$ (19)

Since $\log \{\sum_{k = 1}^{J_{n}} c_{k} I (e^{R_{i} (β_{1})} \in [t_{k - 1}, t_{k}))\} = \sum_{k = 1}^{J_{n}} \log c_{k} I (e^{R_{i} (β_{1})} \in [t_{k - 1}, t_{k}))$ , (9), with $I (ε_{i} = 1)$ in the cumulative hazard term replaced by its conditional expectation ${\hat{p}}_{ε, 1 i}^{(m)}$ as in (16), can be rewritten as

$\begin{aligned} & \sum_{i = 1}^{n} - δ_{i} I (ε_{i} = 1) β_{1}^{T} X_{i} + \sum_{k = 1}^{J_{n}} \log c_{k} \left\{\sum_{i = 1}^{n} δ_{i} I (ε_{i} = 1) I (e^{R_{i} (β_{1})} \in [t_{k - 1}, t_{k}))\right\} \\ & - \sum_{k = 1}^{J_{n}} c_{k} \left\{\sum_{i = 1}^{n} {\hat{p}}_{ε, 1 i}^{(m)} (e^{R_{i} (β_{1})} - t_{k - 1}) I (t_{k - 1} \leq e^{R_{i} (β_{1})} < t_{k}) + \frac{M}{J_{n}} \sum_{i = 1}^{n} {\hat{p}}_{ε, 1 i}^{(m)} I (e^{R_{i} (β_{1})} \geq t_{k})\right\} . \end{aligned}$ (20)

By differentiating with respect to $c_{k}$ , we see that the solution to the score equation of $c_{k}$ is

$c_{k} = \frac{\sum_{i = 1}^{n} δ_{i} I (ε_{i} = 1) I (e^{R_{i} (β_{1})} \in [t_{k - 1}, t_{k}))}{\sum_{i = 1}^{n} {\hat{p}}_{ε, 1 i}^{(m)} (e^{R_{i} (β_{1})} - t_{k - 1}) I (t_{k - 1} \leq e^{R_{i} (β_{1})} < t_{k}) + \frac{M}{J_{n}} \sum_{i = 1}^{n} {\hat{p}}_{ε, 1 i}^{(m)} I (e^{R_{i} (β_{1})} \geq t_{k})} .$ (21)

After plugging the expressions for the $c_{k}$ into (20) and discarding the irrelevant component, we obtain the following sieve profile function:

$\begin{aligned} l_{c_{3}}^{p} (β_{1}) = & \sum_{i = 1}^{n} - δ_{i} I (ε_{i} = 1) β_{1}^{T} X_{i} \\ & + \sum_{k = 1}^{J_{n}} \log \left\{\frac{\sum_{i = 1}^{n} δ_{i} I (ε_{i} = 1) I (e^{R_{i} (β_{1})} \in [t_{k - 1}, t_{k}))}{\sum_{i = 1}^{n} {\hat{p}}_{ε, 1 i}^{(m)} (e^{R_{i} (β_{1})} - t_{k - 1}) I (t_{k - 1} \leq e^{R_{i} (β_{1})} < t_{k}) + \frac{M}{J_{n}} \sum_{i = 1}^{n} {\hat{p}}_{ε, 1 i}^{(m)} I (e^{R_{i} (β_{1})} \geq t_{k})}\right\} \\ & \quad \times \left\{\sum_{i = 1}^{n} δ_{i} I (ε_{i} = 1) I (e^{R_{i} (β_{1})} \in [t_{k - 1}, t_{k}))\right\} \\ & - \sum_{k = 1}^{J_{n}} \frac{\sum_{i = 1}^{n} δ_{i} I (ε_{i} = 1) I (e^{R_{i} (β_{1})} \in [t_{k - 1}, t_{k}))}{\sum_{i = 1}^{n} {\hat{p}}_{ε, 1 i}^{(m)} (e^{R_{i} (β_{1})} - t_{k - 1}) I (t_{k - 1} \leq e^{R_{i} (β_{1})} < t_{k}) + \frac{M}{J_{n}} \sum_{i = 1}^{n} {\hat{p}}_{ε, 1 i}^{(m)} I (e^{R_{i} (β_{1})} \geq t_{k})} \\ & \quad \times \left\{\sum_{i = 1}^{n} {\hat{p}}_{ε, 1 i}^{(m)} (e^{R_{i} (β_{1})} - t_{k - 1}) I (t_{k - 1} \leq e^{R_{i} (β_{1})} < t_{k}) + \frac{M}{J_{n}} \sum_{i = 1}^{n} {\hat{p}}_{ε, 1 i}^{(m)} I (e^{R_{i} (β_{1})} \geq t_{k})\right\} \\ = & \sum_{i = 1}^{n} - δ_{i} I (ε_{i} = 1) β_{1}^{T} X_{i} \\ & + \sum_{k = 1}^{J_{n}} \left\{\sum_{i = 1}^{n} δ_{i} I (ε_{i} = 1) I (e^{R_{i} (β_{1})} \in [t_{k - 1}, t_{k}))\right\} \log \left\{\frac{J_{n}}{n M} \sum_{i = 1}^{n} δ_{i} I (ε_{i} = 1) I (e^{R_{i} (β_{1})} \in [t_{k - 1}, t_{k}))\right\} \\ & - \sum_{k = 1}^{J_{n}} \left\{\sum_{i = 1}^{n} δ_{i} I (ε_{i} = 1) I (e^{R_{i} (β_{1})} \in [t_{k - 1}, t_{k}))\right\} \\ & \quad \times \log \left\{\frac{J_{n}}{n M} \sum_{i = 1}^{n} {\hat{p}}_{ε, 1 i}^{(m)} (e^{R_{i} (β_{1})} - t_{k - 1}) I (t_{k - 1} \leq e^{R_{i} (β_{1})} < t_{k}) + \frac{1}{n} \sum_{i = 1}^{n} {\hat{p}}_{ε, 1 i}^{(m)} I (e^{R_{i} (β_{1})} \geq t_{k})\right\} . \end{aligned}$ (22)

Note that $l_{c_{3}}^{p} (β_{1})$ is not smooth and may have multiple local maxima. Following Zeng and Lin (2007), we further seek a smooth approximation of $l_{c_{3}}^{p} (β_{1})$ by the empirical measure. The kernel-smoothed approximation of $l_{c_{3}}^{p} (β_{1})$ is

$\begin{aligned} l_{c_{3}}^{s} (β_{1}) = & \sum_{i = 1}^{n} δ_{i} I (ε_{i} = 1) \log \left\{\frac{1}{n a_{n}} \sum_{j = 1}^{n} δ_{j} I (ε_{j} = 1) K \left(\frac{R_{j} (β_{1}) - R_{i} (β_{1})}{a_{n}}\right)\right\} \\ & - \sum_{i = 1}^{n} δ_{i} I (ε_{i} = 1) \log \left\{\frac{1}{n} \sum_{j = 1}^{n} {\hat{p}}_{ε, 1 j}^{(m)} \int_{- \infty}^{\{R_{j} (β_{1}) - R_{i} (β_{1})\} / a_{n}} K (s) ds\right\}, \end{aligned}$ (23)

where $K (\cdot)$ is the kernel function and $a_{n}$ is the bandwidth. The details of this derivation can be found in the appendix. The selection of the kernel function and bandwidth is discussed in Zeng and Lin (2007). We propose to maximise $l_{c_{3}}^{s} (β_{1})$ over $β_{1}$ and denote the resulting estimator by $\hat{β_{1}}$ . Because $K (\cdot)$ is a smooth kernel function, we can use the Newton–Raphson algorithm or other gradient-based search algorithms to calculate $\hat{β_{1}}$ .

Given $\hat{β_{1}}$ , we estimate $h_{ϵ_{1}} (t)$ by the following kernel-smoothed estimator:

${\hat{h}}_{ϵ_{1}} (t) = \frac{\frac{1}{n a_{n} t} \sum_{i = 1}^{n} δ_{i} I (ε_{i} = 1) K \left(\frac{R_{i} (\hat{β_{1}}) - \log t}{a_{n}}\right)}{\frac{1}{n} \sum_{i = 1}^{n} {\hat{p}}_{ε, 1 i}^{(m)} \int_{- \infty}^{\{R_{i} (\hat{β_{1}}) - \log t\} / a_{n}} K (μ) d μ} .$ (24)

The corresponding estimator of $H_{ϵ_{1}} (t)$ is

${\hat{H}}_{ϵ_{1}} (t) = \int_{- \infty}^{\log t} \frac{\frac{1}{n a_{n}} \sum_{i = 1}^{n} δ_{i} I (ε_{i} = 1) K \left(\frac{R_{i} (\hat{β_{1}}) - s}{a_{n}}\right)}{\frac{1}{n} \sum_{i = 1}^{n} {\hat{p}}_{ε, 1 i}^{(m)} \int_{- \infty}^{\{R_{i} (\hat{β_{1}}) - s\} / a_{n}} K (μ) d μ} d s .$ (25)

Similarly, the estimators of $h_{ϵ_{2}}$ and $H_{ϵ_{2}}$ are

${\hat{h}}_{ϵ_{2}} (t) = \frac{\frac{1}{n a_{n} t} \sum_{i = 1}^{n} δ_{i} I (ε_{i} = 2) K \left(\frac{R_{i} (\hat{β_{2}}) - \log t}{a_{n}}\right)}{\frac{1}{n} \sum_{i = 1}^{n} {\hat{p}}_{ε, 2 i}^{(m)} \int_{- \infty}^{\{R_{i} (\hat{β_{2}}) - \log t\} / a_{n}} K (μ) d μ},$ (26)

and

${\hat{H}}_{ϵ_{2}} (t) = \int_{- \infty}^{\log t} \frac{\frac{1}{n a_{n}} \sum_{i = 1}^{n} δ_{i} I (ε_{i} = 2) K \left(\frac{R_{i} (\hat{β_{2}}) - s}{a_{n}}\right)}{\frac{1}{n} \sum_{i = 1}^{n} {\hat{p}}_{ε, 2 i}^{(m)} \int_{- \infty}^{\{R_{i} (\hat{β_{2}}) - s\} / a_{n}} K (μ) d μ} d s .$ (27)

In summary, the M-step maximises (14), (15), (16) and (17) with respect to the unknown parameters $Θ = (α, θ, β_{1}, β_{2}, H_{ϵ_{1}} (\cdot), H_{ϵ_{2}} (\cdot))$ , which can be carried out numerically with a Newton-type method, for instance via the ‘optim’ function in R.
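To make the kernel-smoothed hazard estimator concrete, here is a small sketch that evaluates an estimator of the form (24) with a standard normal kernel, so that the integral in the denominator becomes the normal distribution function. The function and argument names are hypothetical; `event[i]` stands for $δ_{i} I (ε_{i} = j)$ and `weight[i]` for the E-step weight of subject i for event j.

```python
import math

def norm_pdf(z):
    """Standard normal density, used as the kernel K."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def norm_cdf(z):
    """Standard normal distribution function, the integral of K up to z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def kernel_hazard(t, resid, event, weight, a_n):
    """Kernel-smoothed hazard of exp(eps_j) at t > 0.

    resid[i]:  R_i(beta_hat) = log(t_i) - beta_hat' x_i
    event[i]:  delta_i * I(eps_i = j)
    weight[i]: E-step weight of subject i for event j
    a_n:       bandwidth
    """
    n = len(resid)
    log_t = math.log(t)
    num = sum(e * norm_pdf((r - log_t) / a_n)
              for r, e in zip(resid, event)) / (n * a_n * t)
    den = sum(w * norm_cdf((r - log_t) / a_n)
              for r, w in zip(resid, weight)) / n
    return num / den

# Single observation with residual 0: the estimate at t = 1 is K(0) / 0.5.
h_at_one = kernel_hazard(1.0, [0.0], [1], [1.0], 1.0)
```

The cumulative hazard in (25) and (27) has no closed form under this kernel and would typically be obtained by numerical integration of the same ratio on the log-time scale.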

The EM algorithm is described as follows:

Step 0: Set initial values $α^{(0)}$ , $θ^{(0)}$ , $β_{1}^{(0)}$ , $β_{2}^{(0)}$ and ${\hat{p}}_{ε, j i}^{(0)} = 1$ . Estimate $h_{ϵ_{1}}^{(0)} ({\tilde{t}}_{i} | X_{i})$ and $h_{ϵ_{2}}^{(0)} ({\tilde{t}}_{i} | X_{i})$ by (24) and (26) based on the values of $β_{1}^{(0)}$ and $β_{2}^{(0)}$ .

Step 1: Update $α^{(m + 1)}$ and $θ^{(m + 1)}$ by maximising (14) and (15). Update $β_{1}^{(m + 1)}$ and $β_{2}^{(m + 1)}$ by maximising (23) and its analogue for $β_{2}$ .

Step 2: Update $h_{ϵ_{1}}^{(m + 1)} (\cdot)$ and $h_{ϵ_{2}}^{(m + 1)} (\cdot)$ via (24) and (26).

Step 3: In the $(m + 1)$ th iteration, calculate ${\hat{p}}_{ε, 1 i}^{(m)}$ , ${\hat{p}}_{12, i}^{(m)}$ and ${\hat{p}}_{02, i}^{(m)}$ based on $α^{(m)}$ , $θ^{(m)}$ , $β_{1}^{(m)}$ , $β_{2}^{(m)}$ , $h_{ϵ_{1}}^{(m)} ({\tilde{t}}_{i} | X_{i})$ and $h_{ϵ_{2}}^{(m)} ({\tilde{t}}_{i} | X_{i})$ from (11), (12) and (13).

Step 4: Repeat Steps 1 to 3 until convergence is attained. For convergence, we use the criterion $\max \{(α^{(m)} - α^{(m - 1)})^{2}, (θ^{(m)} - θ^{(m - 1)})^{2}, (β_{1}^{(m)} - β_{1}^{(m - 1)})^{2}, (β_{2}^{(m)} - β_{2}^{(m - 1)})^{2}\} < 0.0001$.

The algorithm always converged in our simulations.
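As an illustration only, the posterior weights of Equations (11)–(13) and the stopping rule of Step 4 can be sketched in Python for a single subject. The function names `e_step_weights` and `converged` are hypothetical, and the overall survival function is taken to be $S = π_{α} π_{θ} S_{1} + (1 - π_{α} π_{θ}) S_{2}$, an assumption inferred from requiring the three censored-case weights to sum to one. This is not the authors' implementation.

```python
def e_step_weights(delta, eps, pi_alpha, pi_theta, s1, s2):
    """Return (p11, p12, p02) for one subject, following Equations (11)-(13).

    delta    : 1 if an event was observed, 0 if censored
    eps      : observed cause (1 or 2); ignored when delta == 0
    pi_alpha : current value of pi_alpha(X_i)
    pi_theta : current value of pi_theta(X_i)
    s1, s2   : current values of S_1(t_i | X_i) and S_2(t_i | X_i)
    """
    if delta == 1:
        if eps == 1:
            # An observed event of interest implies Y = 1 and eps = 1.
            return 1.0, 0.0, 0.0
        # Observed competing event: split between uncured and cured subjects.
        denom = 1.0 - pi_alpha * pi_theta
        return 0.0, pi_alpha * (1.0 - pi_theta) / denom, (1.0 - pi_alpha) / denom
    # Censored subject. ASSUMPTION: S is the mixture implied by the weights
    # summing to one.
    s = pi_alpha * pi_theta * s1 + (1.0 - pi_alpha * pi_theta) * s2
    p11 = pi_alpha * pi_theta * s1 / s
    p12 = pi_alpha * (1.0 - pi_theta) * s2 / s
    p02 = (1.0 - pi_alpha) * s2 / s
    return p11, p12, p02


def converged(new_params, old_params, tol=1e-4):
    """Step 4 criterion: largest squared change across all parameters < tol."""
    return max((a - b) ** 2 for a, b in zip(new_params, old_params)) < tol
```

Here `new_params` and `old_params` are assumed to be flat sequences holding all components of $α$, $θ$, $β_{1}$ and $β_{2}$ at consecutive iterations.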

4. Simulation

We conducted numerous simulation studies to examine the proposed inference procedures. We generated failure times from the following model: $\log T_{j} = β_{j, 1} X_{1} + β_{j, 2} X_{2} + ϵ_{j, i}, \quad j = 1, 2,$ (28) where $X_{1}$ follows the standard normal distribution and $X_{2}$ is Bernoulli with success probability 0.5. We considered three settings for the distribution of $e^{ϵ_{j, i}}$: Weibull distributions with different shape parameters, denoted by Weibull $(0.5, 1)$ and Weibull $(0.1, 1)$; lognormal distributions with different parameters, denoted by lognormal $(0, 1)$ and lognormal $(1, 1)$; and a Weibull distribution for the event of interest with a lognormal distribution for the other event, namely Weibull $(0.5, 1)$ and lognormal $(1, 1)$. We set $β_{1} = (1, 1)$ for the event of interest among uncured patients and $β_{2} = (2, 2)$ for the other risk event. We assume $α = (2, 1, 1)$ in the uncure component $π_{α}$ and $θ = (0.5, - 0.5, 0.5)$ in the competing risk component $π_{θ}$. We generated censoring times from the uniform $[0, τ]$ distribution, where $τ$ was chosen to produce a $25\%$ censoring rate. We set $n$ to 100, 500 and 1000.
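The data-generating mechanism described above could be sketched as follows for the Weibull setting. This is a hedged illustration rather than the authors' code: the function names are hypothetical, $π_{α}$ and $π_{θ}$ are assumed to use a logistic link with an intercept, Weibull $(a, b)$ is assumed to mean shape $a$ and scale $b$, the first Weibull distribution is assumed to belong to the event of interest, and `tau` is left as a user-supplied value to be tuned until the censoring rate is near $25\%$.

```python
import math
import random


def expit(z):
    """Logistic function (ASSUMED link for pi_alpha and pi_theta)."""
    return 1.0 / (1.0 + math.exp(-z))


def simulate_subject(rng, tau, alpha=(2.0, 1.0, 1.0), theta=(0.5, -0.5, 0.5),
                     beta1=(1.0, 1.0), beta2=(2.0, 2.0), shape1=0.5, shape2=0.1):
    """Draw one subject; returns observed time, event indicator and cause."""
    x1 = rng.gauss(0.0, 1.0)              # X1 ~ N(0, 1)
    x2 = 1 if rng.random() < 0.5 else 0   # X2 ~ Bernoulli(0.5)
    pi_alpha = expit(alpha[0] + alpha[1] * x1 + alpha[2] * x2)
    pi_theta = expit(theta[0] + theta[1] * x1 + theta[2] * x2)
    uncured = rng.random() < pi_alpha
    # Only uncured subjects can fail from the event of interest (cause 1);
    # everyone else fails from the non-curable competing risk (cause 2).
    cause = 1 if (uncured and rng.random() < pi_theta) else 2
    if cause == 1:
        err = rng.weibullvariate(1.0, shape1)   # arguments are (scale, shape)
        t = math.exp(beta1[0] * x1 + beta1[1] * x2) * err
    else:
        err = rng.weibullvariate(1.0, shape2)
        t = math.exp(beta2[0] * x1 + beta2[1] * x2) * err
    c = rng.uniform(0.0, tau)                   # censoring time
    delta = 1 if t <= c else 0
    return {"time": min(t, c), "delta": delta,
            "eps": cause if delta else 0,       # cause is unknown if censored
            "x1": x1, "x2": x2}


def simulate(n, tau, seed=0):
    rng = random.Random(seed)
    return [simulate_subject(rng, tau) for _ in range(n)]


def censoring_rate(data):
    return 1.0 - sum(d["delta"] for d in data) / len(data)
```

Calling `censoring_rate(simulate(n, tau))` over a grid of `tau` values is one simple way to find a value that gives roughly the target censoring rate.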

Following Zeng and Lin (2007), we chose the kernel function $K (\cdot)$ to be the standard normal density for convenience and tractability. We used the optimal bandwidth $4^{1 / 3} σ n^{- 1 / 3}$, where $σ$ is the sample standard deviation of $\log T$.
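To make the bandwidth rule and the kernel-smoothed hazard of Equation (24) concrete, a minimal Python sketch with the standard normal kernel is given here. The function names are hypothetical, and `resid` is assumed to hold the residuals $R_{i} (β) = \log {\tilde{t}}_{i} - β^{T} X_{i}$, which is the usual AFT convention and an assumption of this sketch.

```python
import math
import statistics


def bandwidth(log_times):
    """Bandwidth 4^(1/3) * sigma * n^(-1/3), sigma = sample SD of log T."""
    n = len(log_times)
    sigma = statistics.stdev(log_times)
    return 4 ** (1 / 3) * sigma * n ** (-1 / 3)


def norm_pdf(u):
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def norm_cdf(u):
    return 0.5 * (1.0 + math.erf(u / math.sqrt(2.0)))


def kernel_hazard(t, resid, delta, eps, weights, a_n, cause=1):
    """Kernel-smoothed hazard estimate at t > 0, in the form of Equation (24).

    resid   : residuals R_i(beta) (ASSUMED to be log t_i - beta' X_i)
    delta   : event indicators
    eps     : observed causes
    weights : E-step weights p_{eps, cause i}^{(m)}
    """
    n = len(resid)
    log_t = math.log(t)
    # Numerator: kernel density of the residuals of observed cause-specific events.
    num = sum(d * (e == cause) * norm_pdf((r - log_t) / a_n)
              for r, d, e in zip(resid, delta, eps)) / (n * a_n * t)
    # Denominator: the integral of K from -infinity to (R_i - log t) / a_n is
    # the normal CDF, a smoothed version of the at-risk indicator.
    den = sum(w * norm_cdf((r - log_t) / a_n)
              for r, w in zip(resid, weights)) / n
    return num / den
```

With the normal kernel the integral in the denominator has a closed form, so no numerical integration is needed in this sketch.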

Table 1 reports the biases, mean squared errors (MSE), empirical standard errors (SE), average estimated standard deviations (ESD) with a bootstrap sample size of 500, and $95\%$ coverage probabilities (CP). From Table 1, we can see that all biases are relatively small, the SEs and ESDs are close to each other, and the $95\%$ CPs are close to their nominal level. As the sample size increases, the biases, MSEs and SEs become smaller.
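The summary measures named above can be computed from replicated estimates of a single parameter along the following lines. This is a sketch with a hypothetical function name, and it assumes that coverage is assessed with Wald-type intervals (estimate $\pm 1.96$ times the bootstrap standard deviation); the construction of the intervals is not specified in the text.

```python
import statistics


def summarise(estimates, est_sds, truth, z=1.96):
    """Bias, MSE, SE, ESD and CP over simulation replicates of one parameter.

    estimates : point estimates, one per replicate
    est_sds   : estimated (e.g. bootstrap) standard deviations, one per replicate
    truth     : true parameter value
    """
    bias = statistics.fmean(estimates) - truth
    mse = statistics.fmean((e - truth) ** 2 for e in estimates)
    se = statistics.stdev(estimates)      # empirical standard error
    esd = statistics.fmean(est_sds)       # average estimated standard deviation
    # ASSUMPTION: Wald-type interval, estimate +/- z * estimated SD.
    cp = statistics.fmean(1.0 if abs(e - truth) <= z * s else 0.0
                          for e, s in zip(estimates, est_sds))
    return {"bias": bias, "mse": mse, "se": se, "esd": esd, "cp": cp}
```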

Table 1. Estimates of parameters for three baseline survivals.

Figures 1–3 show the estimated CIFs from the proposed model along with the true CIFs. We compare the CIFs of the event of primary interest ( $F_{1}$ ) and other risks ( $F_{2}$ ) with their $95\%$ pointwise confidence intervals (CI). The estimated CIFs are close to the true values, and they become closer as the sample size increases. All this evidence shows that the proposed method performs well, even for small and medium sample sizes.

Figure 1. Estimated CIFs with Weibull distributions (solid lines), their 95% pointwise confidence intervals (dashed and dotted lines), and the true CIFs (dotdash lines). (a) $n = 100, F_{1}$ . (b) $n = 100, F_{2}$ . (c) $n = 500, F_{1}$ . (d) $n = 500, F_{2}$ . (e) $n = 1000, F_{1}$ . (f) $n = 1000, F_{2}$ .

Figure 2. Estimated CIFs with lognormal distributions (solid lines), their 95% pointwise confidence intervals (dashed and dotted lines), and the true CIFs (dotdash lines). (a) $n = 100, F_{1}$ . (b) $n = 100, F_{2}$ . (c) $n = 500, F_{1}$ . (d) $n = 500, F_{2}$ . (e) $n = 1000, F_{1}$ . (f) $n = 1000, F_{2}$ .

Figure 3. Estimated CIFs with Weibull and lognormal distributions (solid lines), their 95% pointwise confidence intervals (dashed and dotted lines), and the true CIFs (dotdash lines). (a) $n = 100, F_{1}$ . (b) $n = 100, F_{2}$ . (c) $n = 500, F_{1}$ . (d) $n = 500, F_{2}$ . (e) $n = 1000, F_{1}$ . (f) $n = 1000, F_{2}$ .

5. Colorectal cancer clinical trial data

To illustrate the proposed estimation method for the AFT mixture cure model with competing risks data, we consider the colorectal cancer clinical trial data from González et al. (2005), which contain rehospitalisation and death data of patients diagnosed with colorectal cancer between January 1996 and December 1998. The data include the calendar time (in days) of the successive hospitalisations after the surgical procedure. Gray's test with P = 0.0006 provides evidence for considering a competing risk model for this data set, and González et al. (2005) considered death to be a competing risk to rehospitalisation. However, some patients may never be rehospitalised after discharge. Thus, we consider time to rehospitalisation as the event of primary interest and death as the other possible event, which is non-curable, and analyse the data via the proposed model.

This study included 523 patients. The first hospital readmission time was defined as the number of days between the date of surgery and the first rehospitalisation after discharge related to colorectal cancer. In total, $23\%$ of the patients experienced neither event by the end of the study and were treated as right-censored. We considered drug treatment group (chem=1 for thiotepa drug and 0 for control group) and readmissions for comorbidity (Charlson's index: 0, 1–2, 3) as possible risk factors for the two events of interest. Thiotepa is a member of the class of alkylating agents, which were among the first anticancer drugs used. Alkylating agents are highly reactive and bind to certain chemical groups found in nucleic acids. These compounds inhibit proper synthesis of DNA and RNA, which leads to apoptosis or cell death. However, since alkylating agents cannot discriminate between cancerous and normal cells, both types of cells will be affected by this therapy. For example, normal cells can become cancerous due to alkylating agents. Thus, thiotepa is a highly cytotoxic compound and can potentially have adverse effects. Consequently, the effects of thiotepa on cancer recurrence and death are not obvious.

For the proposed model, all variables were included for the competing risk events, both with and without the cured component. The estimated coefficients and their standard deviations for the proposed model are listed in Table 2. Based on the uncure part and the competing risk part, both Charlson's index (0) and Charlson's index (1–2) have a significant impact. Similarly, both treatment and Charlson's index (1–2) show a significant impact on whether patients will experience rehospitalisation or death.

Table 2. Results for colorectal cancer clinical trial data.

Using Equation (3), $F_{j} (t, X_{i}) = P (T_{i} \leq t, ε_{i} = j | X_{i}), j = 1, 2$, we calculated and plotted the cumulative incidence curves for rehospitalisation and death in colorectal cancer in Figure 4. Figure 4 clearly shows the competing relationship between rehospitalisation and death, and rehospitalisation has a higher rate than death.
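For reference, the cumulative incidence functions in Equation (3) can be written out under the mixture structure implied by Equations (11)–(13). The expressions below are a sketch inferred here, not quoted from the paper, and assume that $S_{j} (t | X)$ is the conditional survival function of $T_{j}$ and that $S = π_{α} π_{θ} S_{1} + (1 - π_{α} π_{θ}) S_{2}$:

$\begin{aligned} F_{1} (t | X) & = π_{α} (X) π_{θ} (X) \{1 - S_{1} (t | X)\}, \\ F_{2} (t | X) & = \{1 - π_{α} (X) π_{θ} (X)\} \{1 - S_{2} (t | X)\}, \\ F_{1} (t | X) + F_{2} (t | X) & = 1 - S (t | X) . \end{aligned}$

Under these assumptions $F_{1} (t | X)$ levels off at $π_{α} (X) π_{θ} (X) < 1$ as $t \to \infty$, while the two curves together account for all failures.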

Figure 4. Proposed model.

6. Discussions and conclusions

In this paper, we developed a new accelerated failure time mixture cure model allowing non-curable competing risk. Compared with the traditional mixture cure model, this model can better capture the cure rate of a disease. The semiparametric estimation based on the EM algorithm can be easily obtained with the help of existing popular R packages. Although the variance estimation is based on the bootstrap due to the complex structure of the model, the comprehensive simulation shows that the performance is reasonable even when the resampling size is small.

Acknowledgements

The research is supported by the Natural Science Foundation of China (Nos. 11271136, 81530086) and the 111 Project of China (No. B14019).

Disclosure statement

No potential conflict of interest was reported by the authors.

Additional information

Notes on contributors

Yijun Wang

Yijun Wang is a PhD candidate in the College of Statistics, East China Normal University, Shanghai, China. Her research interests include Bayesian statistics, reliability statistics and biostatistics.

Jiajia Zhang

Jiajia Zhang is a professor of biostatistics in the Department of Epidemiology and Biostatistics, University of South Carolina, USC. She received her PhD degree from Memorial University. Her professional publications and research interests have focused on survival analysis, semiparametric estimation methods and spatial survival analysis.

Yincai Tang

Yincai Tang is a professor of statistics in the College of Statistics, East China Normal University, Shanghai, China. He received his PhD degree from East China Normal University. His professional publications and research interests have focused on lifetime data analysis, degradation data analysis, big data analysis and Bayesian inference.

References

Boag, J. W. (1949). Maximum likelihood estimates of the proportion of patients cured by cancer therapy. Journal of the Royal Statistical Society. Series B (Methodological), 11(1), 15–53. doi: 10.1111/j.2517-6161.1949.tb00020.x
Crowder, M. (2001). Classical competing risks. London: Chapman and Hall/CRC.
David, H., & Moeschberger, M. (1978). The theory of competing risks. London: Griffin.
Fine, J., & Gray, R. (1999). A proportional hazards model for the subdistribution of a competing risk. Journal of the American Statistical Association, 94(446), 496–509. doi: 10.1080/01621459.1999.10474144
Fusaro, R., Bacchetti, P., & Jewell, N. (1996). A competing risks analysis of presenting AIDS diagnoses trends. Biometrics, 52, 211–225. doi: 10.2307/2533157
Gamel, J., Weller, E., Wesley, M., & Feuer, E. (2000). Parametric cure models of relative and cause-specific survival for grouped survival times. Computer Methods and Programs in Biomedicine, 61(2), 99–110. doi: 10.1016/S0169-2607(99)00022-X
Gaynor, J., Feuer, E., Tan, C., Wu, D., Little, C., Straus, D., … Brennan, M. (1993). On the use of cause-specific failure and conditional failure probabilities: Examples from clinical oncology data. Journal of the American Statistical Association, 88(422), 400–409. doi: 10.1080/01621459.1993.10476289
González, J. R., Fernandez, E., Moreno, V., Ribes, J., Peris, M., Navarro, M., … Borrás, J. M. (2005). Sex differences in hospital readmission among colorectal cancer patients. Journal of Epidemiology & Community Health, 59(6), 506–511. doi: 10.1136/jech.2004.028902
Kalbfleisch, J. D., & Prentice, R. L. (2002). The statistical analysis of failure time data. New York: John Wiley & Sons.
Klein, J. (2006). Modelling competing risks in cancer studies. Statistics in Medicine, 25(6), 1015–1034. doi: 10.1002/sim.2246
Kleinbaum, D., & Klein, M. (2006). Survival analysis: A self-learning text. New York: Springer Science & Business Media.
Kuk, A. (1992). A semiparametric mixture model for the analysis of competing risks data. Australian Journal of Statistics, 34(2), 169–180.
Lambert, P., Thompson, J., Weston, C., & Dickman, P. (2006). Estimating and modeling the cure fraction in population-based cancer survival analysis. Biostatistics (Oxford, England), 8(3), 576–594. doi: 10.1093/biostatistics/kxl030
Larson, M., & Dinse, G. (1985). A mixture model for the regression analysis of competing risks data. Applied Statistics, 34(3), 201–211. doi: 10.2307/2347464
Li, C., & Taylor, J. (2002). A semi-parametric accelerated failure time cure model. Statistics in Medicine, 21(21), 3235–3247. doi: 10.1002/sim.1260
Lu, W. (2010). Efficient estimation for an accelerated failure time model with a cure fraction. Statistica Sinica, 20, 661–674.
Lu, W., & Peng, L. (2008). Semiparametric analysis of mixture regression models with competing risks data. Lifetime Data Analysis, 14(3), 231–252. doi: 10.1007/s10985-007-9077-6
Ng, S., & McLachlan, G. (2003). An EM-based semi-parametric mixture model approach to the regression analysis of competing-risks data. Statistics in Medicine, 22(7), 1097–1111. doi: 10.1002/sim.1371
Ohneberg, K., Schumacher, M., & Beyersmann, J. (2017). Modelling two cause-specific hazards of competing risks in one cumulative proportional odds model? Statistics in Medicine, 36(27), 4353–4363. doi: 10.1002/sim.7437
Peng, Y. (2003). Fitting semiparametric cure models. Computational Statistics & Data Analysis, 41(3), 481–490. doi: 10.1016/S0167-9473(02)00184-6
Peng, Y., & Dear, K. B. (2000). A nonparametric mixture model for cure rate estimation. Biometrics, 56(1), 237–243. doi: 10.1111/j.0006-341X.2000.00237.x
Peng, Y., Dear, K. B., & Denham, J. (1998). A generalized F mixture model for cure rate estimation. Statistics in Medicine, 17(8), 813–830. doi: 10.1002/(SICI)1097-0258(19980430)17:8<813::AID-SIM775>3.0.CO;2-#
Pintilie, M. (2007). Analysing and interpreting competing risk data. Statistics in Medicine, 26(6), 1360–1367. doi: 10.1002/sim.2655
Sy, J., & Taylor, J. (2000). Estimation in a Cox proportional hazards cure model. Biometrics, 56(1), 227–236. doi: 10.1111/j.0006-341X.2000.00227.x
Tai, B., Machin, D., White, I., & Gebski, V. (2001). Competing risks analysis of patients with osteosarcoma: A comparison of four different approaches. Statistics in Medicine, 20(5), 661–684. doi: 10.1002/sim.711
Van der Vaart, A., & Wellner, J. (1996). Weak convergence and empirical processes. New York, NY: Springer.
Xu, L., & Zhang, J. (2009). An alternative estimation method for the semiparametric accelerated failure time mixture cure model. Communications in Statistics-Simulation and Computation, 38(9), 1980–1990. doi: 10.1080/03610910903180657
Yu, B., & Tiwari, R. (2007). Application of EM algorithm to mixture cure model for grouped relative survival data. Journal of Data Science, 5, 41–51.
Yu, B., Tiwari, R., Cronin, K., & Feuer, E. (2004). Cure fraction estimation from the mixture cure models for grouped survival data. Statistics in Medicine, 23, 1733–1747. doi: 10.1002/sim.1774
Zeng, D., & Lin, D. (2007). Efficient estimation for the accelerated failure time model. Journal of the American Statistical Association, 102(480), 1387–1396. doi: 10.1198/016214507000001085
Zhang, J., & Peng, Y. (2007). A new estimation method for the semiparametric accelerated failure time mixture cure model. Statistics in Medicine, 26(16), 3157–3171. doi: 10.1002/sim.2748

Appendix

Brief description of the approximation for $l_{c_{3}}^{*} (β_{1})$. When $n \to \infty$, $J_{n} \to \infty$, and $J_{n} / n \to 0$, according to the Donsker theorem, we obtain

$\begin{aligned} & \frac{1}{n} \sum_{i = 1}^{n} δ_{i} I (ε_{i} = 1) β_{1}^{T} X_{i} \to E \{δ I (ε = 1) β_{1}^{T} X\}, \\ & \frac{1}{n} \sum_{i = 1}^{n} δ_{i} I (ε_{i} = 1) I (e^{R_{i} (β_{1})} \in [t_{k - 1}, t_{k})) \to P (δ = 1, ε = 1, e^{R (β_{1})} \in [t_{k - 1}, t_{k})), \\ & \frac{J_{n}}{n M} \sum_{i = 1}^{n} δ_{i} I (ε_{i} = 1) I (t_{k - 1} \leq e^{R_{i} (β_{1})} < t_{k}) \to \left. \frac{d P (δ = 1, ε = 1, e^{R (β_{1})} \leq s)}{d s} \right|_{s = t_{k - 1}} . \end{aligned}$

Using the multiplier central limit theorem of Van der Vaart and Wellner (1996), we have

$\begin{aligned} & \max \Big| \frac{1}{n} \sum_{i = 1}^{n} p_{ε, 1 i}^{(m)} (e^{R_{i} (β_{1})} - t_{k}) I (t_{k - 1} \leq e^{R_{i} (β_{1})} < t_{k}) - E [p_{ε, 1 i}^{(m)} (e^{R (β_{1})} - t_{k}) I (t_{k - 1} \leq e^{R (β_{1})} < t_{k})] \Big| = O_{p} \left(\frac{1}{\sqrt{n}}\right), \\ & \max \Big| \frac{1}{n} \sum_{i = 1}^{n} p_{ε, 1 i}^{(m)} I (e^{R_{i} (β_{1})} \geq t_{k}) - E [p_{ε, 1 i}^{(m)} I (e^{R (β_{1})} \geq t_{k})] \Big| = O_{p} \left(\frac{1}{\sqrt{n}}\right) . \end{aligned}$

Since $J_{n} / n \to 0$ as $n \to \infty$, we can obtain

$\begin{aligned} \max \Big| & \frac{J_{n}}{n M} \sum_{i = 1}^{n} p_{ε, 1 i}^{(m)} (e^{R_{i} (β_{1})} - t_{k}) I (t_{k - 1} \leq e^{R_{i} (β_{1})} < t_{k}) + \frac{1}{n} \sum_{i = 1}^{n} p_{ε, 1 i}^{(m)} I (e^{R_{i} (β_{1})} \geq t_{k}) \\ & - E \Big[\frac{J_{n}}{M} p_{ε, 1 i}^{(m)} (e^{R (β_{1})} - t_{k}) I (t_{k - 1} \leq e^{R (β_{1})} < t_{k}) + p_{ε, 1 i}^{(m)} I (e^{R (β_{1})} \geq t_{k})\Big] \Big| \to 0 \end{aligned}$

uniformly in $β$ and $t_{k}$. Note that, for the last term, we have

$\begin{aligned} & E \Big[\frac{J_{n}}{M} p_{ε, 1 i}^{(m)} (e^{R (β_{1})} - t_{k}) I (t_{k - 1} \leq e^{R (β_{1})} < t_{k}) + p_{ε, 1 i}^{(m)} I (e^{R (β_{1})} \geq t_{k})\Big] \\ & \quad = O (J_{n}^{- 1}) + E [p_{ε, 1 i}^{(m)} I (e^{R (β_{1})} \geq t_{k})] = o_{p} (1) + E [p_{ε, 1 i}^{(m)} I (e^{R (β_{1})} \geq t_{k})] . \end{aligned}$

Then,

$\begin{aligned} \sup \Big| & - l_{c_{3}}^{p} (β_{1}) / n - E (δ I (ε = 1) β_{1}^{T} X) \\ & + \sum_{k = 1}^{J_{n}} P (δ = 1, ε = 1, e^{R (β_{1})} \in [t_{k - 1}, t_{k})) \times \log \left. \frac{d P (δ = 1, ε = 1, e^{R (β_{1})} \leq s)}{d s} \right|_{s = t_{k - 1}} \\ & - \sum_{k = 1}^{J_{n}} P (δ = 1, ε = 1, e^{R (β_{1})} \in [t_{k - 1}, t_{k})) \times \log E [p_{ε, 1 i}^{(m)} I (e^{R (β_{1})} \geq t_{k})] \Big| \to 0. \end{aligned}$

Similar to Zeng and Lin (2007), we choose a kernel function $K (\cdot)$ with bandwidth $a_{n}$. The theory of kernel estimation indicates that, under suitable regularity conditions,

$\begin{aligned} \frac{1}{n a_{n}} \sum_{i = 1}^{n} δ_{i} I (ε_{i} = 1) K \left(\frac{R_{i} (β_{1}) - \log t}{a_{n}}\right) & \to \left. \frac{d P (δ = 1, ε = 1, R (β_{1}) \leq s)}{d s} \right|_{s = \log t} \\ & = \frac{d P (δ = 1, ε = 1, e^{R (β_{1})} \leq t)}{d t} t \end{aligned}$

and

$\frac{1}{n a_{n}} \sum_{i = 1}^{n} p_{ε, 1 i}^{(m)} \int_{\log t}^{\infty} K \left(\frac{R_{i} (β_{1}) - s}{a_{n}}\right) d s \to E [p_{ε, 1 i}^{(m)} I (e^{R (β_{1})} \geq t)] .$

Thus we approximate

$\frac{d P (δ = 1, ε = 1, e^{R (β_{1})} \leq t)}{d t} \Big/ E [p_{ε, 1 i}^{(m)} I (e^{R (β_{1})} \geq t)]$

by

$\frac{1}{t} \frac{\frac{1}{n a_{n}} \sum_{i = 1}^{n} δ_{i} I (ε_{i} = 1) K \left(\frac{R_{i} (β_{1}) - \log t}{a_{n}}\right)}{\frac{1}{n a_{n}} \sum_{i = 1}^{n} p_{ε, 1 i}^{(m)} \int_{\log t}^{\infty} K \left(\frac{R_{i} (β_{1}) - s}{a_{n}}\right) d s} .$

The kernel-smoothed approximation of the likelihood function is

$\begin{aligned} l_{c_{3}}^{s} (β_{1}) = {} & n \Big\{- E [δ I (ε = 1) β_{1}^{T} X] + \int_{0}^{\infty} \log \frac{d P (δ = 1, ε = 1, e^{R (β_{1})} \leq t) / d t}{E (p_{ε, 1 i}^{(m)} I (e^{R (β_{1})} \geq t))} \, d P (δ = 1, ε = 1, e^{R (β_{1})} \leq t)\Big\} \\ = {} & - \sum_{i = 1}^{n} δ_{i} I (ε_{i} = 1) β_{1}^{T} X_{i} - \sum_{i = 1}^{n} δ_{i} I (ε_{i} = 1) R_{i} (β_{1}) \\ & + \sum_{i = 1}^{n} δ_{i} I (ε_{i} = 1) \log \Big\{\frac{1}{n a_{n}} \sum_{j = 1}^{n} δ_{j} I (ε_{j} = 1) K \left(\frac{R_{j} (β_{1}) - R_{i} (β_{1})}{a_{n}}\right)\Big\} \\ & - \sum_{i = 1}^{n} δ_{i} I (ε_{i} = 1) \log \Big\{\frac{1}{n} \sum_{j = 1}^{n} p_{ε, 1 j}^{(m)} \int_{- \infty}^{(R_{j} (β_{1}) - R_{i} (β_{1})) / a_{n}} K (s) d s\Big\} . \end{aligned}$

Discarding the constant term, we obtain the following log-likelihood function,

$\begin{aligned} l_{c_{3}}^{s} (β_{1}) = {} & \sum_{i = 1}^{n} δ_{i} I (ε_{i} = 1) \log \Big\{\frac{1}{n a_{n}} \sum_{j = 1}^{n} δ_{j} I (ε_{j} = 1) K \left(\frac{R_{j} (β_{1}) - R_{i} (β_{1})}{a_{n}}\right)\Big\} \\ & - \sum_{i = 1}^{n} δ_{i} I (ε_{i} = 1) \log \Big\{\frac{1}{n} \sum_{j = 1}^{n} p_{ε, 1 j}^{(m)} \int_{- \infty}^{(R_{j} (β_{1}) - R_{i} (β_{1})) / a_{n}} K (s) d s\Big\} . \end{aligned}$
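The final kernel-smoothed log-likelihood can be evaluated directly once the residuals are available. The Python sketch below is illustrative only: the function name is hypothetical, the kernel is taken to be the standard normal density (so the inner integral is the normal CDF), the inner sums are read as running over subject $j$ (that is, $δ_{j}$ and $p_{ε, 1 j}^{(m)}$), and `resid` is assumed to hold $R_{i} (β_{1}) = \log {\tilde{t}}_{i} - β_{1}^{T} X_{i}$.

```python
import math


def norm_pdf(u):
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def norm_cdf(u):
    return 0.5 * (1.0 + math.erf(u / math.sqrt(2.0)))


def smoothed_loglik(resid, delta, eps, weights, a_n, cause=1):
    """Kernel-smoothed profile log-likelihood for one cause.

    resid   : residuals R_i(beta) (ASSUMED to be log t_i - beta' X_i)
    delta   : event indicators
    eps     : observed causes
    weights : E-step weights p_{eps, cause j}^{(m)}
    """
    n = len(resid)
    total = 0.0
    for i in range(n):
        if delta[i] != 1 or eps[i] != cause:
            continue  # only observed events of this cause contribute
        # Kernel density term; the j = i summand keeps it strictly positive.
        dens = sum(delta[j] * (eps[j] == cause)
                   * norm_pdf((resid[j] - resid[i]) / a_n)
                   for j in range(n)) / (n * a_n)
        # Smoothed at-risk term: integral of K up to (R_j - R_i) / a_n.
        risk = sum(weights[j] * norm_cdf((resid[j] - resid[i]) / a_n)
                   for j in range(n)) / n
        total += math.log(dens) - math.log(risk)
    return total
```

In an M-step this function would be maximised over $β_{1}$ by recomputing `resid` for each candidate value; any general-purpose optimiser could be used for that, which is a design choice left open here.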
