Dissecting the restricted mean time in favor of treatment


ABSTRACT

The restricted mean time in favor (RMT-IF) summarizes the treatment effect on a hierarchical composite endpoint with mortality at the top. Its crude decomposition into “stage-wise effects,” i.e., the net average time gained by the treatment prior to each component event, does not reveal the patient state in which the extra time is spent. To obtain this information, we break each stage-wise effect into subcomponents according to the specific state to which the reference condition is improved. After re-expressing the subcomponents as functionals of the marginal survival functions of outcome events, we estimate them conveniently by plugging in the Kaplan–Meier estimators. Their robust variance matrices allow us to construct joint tests on the decomposed units, which are particularly powerful against component-wise differential treatment effects. By reanalyzing a cancer trial and a cardiovascular trial, we acquire new insights into the quality and composition of the extra survival times, as well as the extra time with fewer hospitalizations, gained by the treatment in question. The proposed methods are implemented in the rmt package freely available on the Comprehensive R Archive Network (CRAN).

1. Introduction

Patients in a phase-III clinical trial often experience nonfatal events like hospitalization or relapse of disease before they die. Assessing the composite outcomes based solely on time to the first event, whichever type it is, raises concerns over the inefficient use of data as well as the failure to distinguish between morbidity and mortality (Anker and McMurray 2012; Armstrong and Westerhout 2017; Freemantle et al. 2003; Mao and Kim 2021). In response, investigators increasingly turn to methods that compare patients in pairs across arms, as this allows them to capture the entirety of patient data and to prioritize death over lesser events (see, e.g., Abdalla et al. 2016; Buyse 2010; Cui et al. 2022; Dong et al. 2018, 2022, 2023; Finkelstein and Schoenfeld 1999; Kandzari et al. 2021; Mao et al. 2022; Maurer et al. 2018; Pocock et al. 2012; Redfors et al. 2020; Seifu et al. 2022).

The restricted mean time in favor (RMT-IF) of treatment is one such method (Mao 2023). Defined as the net average time a treated patient fares in a more “favorable” state than an untreated one within a fixed time window, the RMT-IF has all the advantages of a pairwise comparison scheme. In addition, by pre-setting the time frame of comparison, it produces a well-defined estimand that is transferable across studies with different censoring patterns (Akacha et al. 2017; Dong et al. 2020; Oakes 2016). Furthermore, the estimand can be additively decomposed into a number of “stage-wise” effects according to the event type against which favorability is measured. Against relapse, for example, a patient gains favorable time by staying in remission; against death, by staying alive (in this case, the stage-wise effect coincides with the difference in restricted mean survival time, or RMST (McCaw et al. 2019; Royston and Parmar 2011; Tian et al. 2018; Uno et al. 2014)). Such decomposition reveals the contributions of different events to the overall effect.

Yet it may still hide important details. The stage-wise effect for survival (i.e., net RMST), for example, provides the average (treatment-conferred) extra lifetime without telling whether it is lived healthily or with illness. Similar ambiguity arises in any nonfatal event ranked above a less severe one, say, metastasis over non-metastatic relapse of cancer (Crowther and Lambert 2017). The stage-wise effect for metastasis then concerns only time spent metastasis-free, whether that means in complete remission or after relapse. Inquiry into such details requires us to further break down the stage-wise effects. While Mao (2023) alluded to the possibility of doing so, a full solution has not yet been worked out.

Another problem mentioned in the original paper but still left untreated is joint testing of the decomposed units. Although it is natural to use the estimator of the overall RMT-IF for a global test, this may not always be optimal because it hides possible variations among the components. A joint test, on the other hand, is expected to be more sensitive to component-specific deviations from the null.

In this paper, we propose methods to further decompose the stage-wise effects to answer the kind of substantive questions raised earlier. We also develop joint tests on the stage-wise components as well as their subcomponents to provide more options for testing. We begin Section 2 by reviewing the RMT-IF and its main components with the outcomes formulated as a multistate process with hierarchically ranked states. We then introduce the subcomponents and develop estimators as well as inference procedures, with technical details relegated to the Appendix. The robust variance matrices for the main and subcomponents are then used to construct chi-square tests with multiple degrees of freedom. For both the further decomposition and joint testing, a separate strategy is designed for the special case of recurrent events and death. We also describe the usage of the R programs that implement the new analyses. Extensive simulations are conducted in Section 3 to assess the finite-sample performance of the estimation and testing procedures. The colon cancer and heart failure trials considered in Mao (2023) are reanalyzed in Section 4 for deeper understanding of the treatment effects. We conclude the paper in Section 5 with a summary and some practical considerations.

2. Methods

2.1. Review of RMT-IF

As in Mao (2023), we use a multistate process $Y^{(a)}(t)$ to denote the composite outcomes on a generic subject from group $a$, where $a = 1$ and 0 indicate the treatment and control groups, respectively. Suppose $Y^{(a)}(t) \in \{0, 1, \dots, K, \infty\}$, with a larger number representing a more adverse state. In particular, states 0 and $\infty$ will always represent the initial event-free status and death, respectively. Those in-between depend on the application, e.g., 1 for cancer relapse and 2 for metastasis, as shown in Figure 1(a); or $1, \dots, K$ for the cumulative number of a recurrent event like hospitalization, as shown in Figure 1(b), where $K$ is the (data-dependent) maximum number of events per patient.

Figure 1. Composite endpoints formulated as multistate processes: (a) Relapse, metastasis, and death in cancer studies (Crowther and Lambert 2017); (b) Repeated hospitalizations and death in, e.g., cardiovascular trials (Vardeny et al. 2021).

With restricting time $τ > 0$ , the RMT-IF estimand can be expressed as

$$μ(τ) = E\left[\int_{0}^{τ} I\{Y^{(1)}(t) < Y^{(0)}(t)\}\,dt\right] - E\left[\int_{0}^{τ} I\{Y^{(0)}(t) < Y^{(1)}(t)\}\,dt\right], \tag{1}$$

where $Y^{(1)}(t)$ and $Y^{(0)}(t)$ are two generic outcomes independently drawn from the treatment and control groups, respectively, and $I(\cdot)$ is the indicator function. Since $\int_{0}^{τ} I\{Y^{(a)}(t) < Y^{(1-a)}(t)\}\,dt$ is the length of time $Y^{(a)}(\cdot)$ occupies a less severe state than $Y^{(1-a)}(\cdot)$ does in $[0, τ]$, the right-hand side of (1) can be interpreted as the net average time gained by the treatment in a more favorable state as compared to the control in the first, say, $τ$ years. As with the RMST, the interpretation of the RMT-IF is tied to the choice of the restricting time.

In the comparison, we can split $μ(τ)$ into $(K + 1)$ main components or stage-wise effects, according to the “losing” state. For example, in Figure 1(a) where $K = 2$, a favorable comparison can result from being (i) event-free (state 0) vs relapsed but non-metastatic and alive (state 1); (ii) non-metastatic and alive (states 0 or 1) vs metastatic and alive (state 2); and (iii) alive (states 0, 1, or 2) vs dead (state $\infty$). More generally, write $I\{Y^{(a)}(t) < Y^{(1-a)}(t)\} = \sum_{k=1}^{K, \infty} I\{Y^{(a)}(t) < Y^{(1-a)}(t) = k\}$ (the indicators in the sum are non-overlapping). Then, by (1), it is easy to find that

$$μ(τ) = \sum_{k=1}^{K, \infty} μ_{k}(τ),$$

where

$$μ_{k}(τ) = E\left[\int_{0}^{τ} I\{Y^{(1)}(t) < Y^{(0)}(t) = k\}\,dt\right] - E\left[\int_{0}^{τ} I\{Y^{(0)}(t) < Y^{(1)}(t) = k\}\,dt\right]. \tag{2}$$

The $k$th component $μ_{k}(τ)$ measures the net average time favorable with reference to state $k$. Hence, in Figure 1(a), $μ_{1}(τ)$ is the net average relapse-free time (vs relapsed but non-metastatic and alive); $μ_{2}(τ)$ is the net average metastasis-free time (vs metastatic but alive); and $μ_{\infty}(τ)$ is the net average lifetime (alive vs dead), i.e., net RMST. With recurrent events and death (Figure 1(b)), Mao (2023) suggested using instead the aggregate measure $μ_{R}(τ) = \sum_{k=1}^{K} μ_{k}(τ)$ to summarize treatment effects on the nonfatal events as a whole.
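To make the stage-wise split concrete, the following base-R sketch computes the net time in favor for a single pair of fully observed patients and divides it by the “losing” state. It is an illustration only: the helper names `state_at` and `net_by_stage` and the event times are made up, each path is encoded by the times at which the patient first reaches states $1, \dots, K$ and death (the transition times formalized in Section 2.2), and the estimand itself is the expectation of this quantity over random pairs.

```r
# Hypothetical helper: state occupied at time t, given the ordered times at
# which a patient first reaches states 1, ..., K and death (last entry).
state_at <- function(t, entry) {
  K <- length(entry) - 1
  k <- sum(entry <= t)          # number of thresholds crossed by time t
  if (k == K + 1) Inf else k    # crossing the last one means death
}

# Net time in favor of patient 1 over patient 0 on [0, tau], split by the
# "losing" state k = 1, ..., K, Inf.
net_by_stage <- function(entry1, entry0, tau) {
  K <- length(entry1) - 1
  brk <- sort(unique(c(0, entry1[entry1 < tau], entry0[entry0 < tau], tau)))
  mid <- (head(brk, -1) + tail(brk, -1)) / 2   # both paths are constant between breakpoints
  len <- diff(brk)
  y1 <- sapply(mid, state_at, entry = entry1)
  y0 <- sapply(mid, state_at, entry = entry0)
  stages <- c(seq_len(K), Inf)
  out <- sapply(stages, function(k) {
    sum(len[y1 < y0 & y0 == k]) - sum(len[y0 < y1 & y1 == k])
  })
  names(out) <- paste0("mu_", stages)
  out
}

# Made-up pair: the treated patient relapses at 2, metastasizes at 3.5, dies at 5;
# the control patient relapses at 1, metastasizes at 2, dies at 3.
res <- net_by_stage(c(2, 3.5, 5), c(1, 2, 3), tau = 4)
print(res)        # one unit of time gained against each of relapse, metastasis, death
print(sum(res))   # overall net time in favor for this pair
```

For this pair the treated patient is ahead on $[1, 2)$ against relapse, on $[2, 3)$ against metastasis, and on $[3, 4)$ against death, so each stage contributes one time unit.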

2.2. Further decomposition and estimation

Except for $μ_{1}(τ)$, all other $μ_{k}(τ)$ can be further divided. Indeed, we can do so by differentiating on the “winning” state in each $μ_{k}(τ)$ via $I\{Y^{(1)}(t) < Y^{(0)}(t) = k\} = \sum_{j<k} I\{Y^{(1)}(t) = j, Y^{(0)}(t) = k\}$. This leads to

$$μ_{k}(τ) = \sum_{j<k} μ_{jk}(τ),$$

where

$$μ_{jk}(τ) = E\left[\int_{0}^{τ} I\{Y^{(1)}(t) = j, Y^{(0)}(t) = k\}\,dt\right] - E\left[\int_{0}^{τ} I\{Y^{(0)}(t) = j, Y^{(1)}(t) = k\}\,dt\right].$$

The subcomponent $μ_{jk}(τ)$ measures the net average time improved from state $k$ to state $j$ specifically $(j < k)$. Again using Figure 1(a) as an example, $μ_{02}(τ)$ and $μ_{12}(τ)$ are the average pre-metastasis time gained in remission and post-relapse, respectively. Likewise, $μ_{0, \infty}(τ)$, $μ_{1, \infty}(τ)$, and $μ_{2, \infty}(τ)$ are the average lifetime gained in remission, post-relapse (but pre-metastasis), and post-metastasis, respectively. Clearly, the farther apart $j$ and $k$ are, the more valuable $μ_{jk}(τ)$ is per unit. In that sense, $μ_{0k}(τ), μ_{1k}(τ), \dots$, and $μ_{k-1, k}(τ)$ are ordered by importance, which justifies their separate analyses. Figure 2 shows this two-level decomposition of $μ(τ)$ diagrammatically.

Figure 2. A graphical dissection of $μ (τ) = \sum_{k = 1}^{K, \infty} μ_{k} (τ) = \sum_{k = 1}^{K, \infty} \sum_{j < k} μ_{j k} (τ)$ .


The same strategy used to estimate $μ_{k}(τ)$ with censored data applies to the $μ_{jk}(τ)$, only with additional derivations. As in Mao (2023), suppose that $Y^{(a)}(t)$ is a progressive process in the sense that $Y^{(a)}(t) \leq Y^{(a)}(s)$ for all $0 \leq t \leq s$ (true for both examples in Figure 1). Let $T_{k}^{(a)} = \inf\{t : Y^{(a)}(t) \geq k\}$ $(k = 1, \dots, K, \infty)$. Because $Y^{(a)}(t)$ is nondecreasing, $T_{k}^{(a)}$ is just the first time it goes up to state $k$ or higher. In Figure 1(a), for example, $T_{1}^{(a)}$ is the time to the earliest of relapse, metastasis, and death; $T_{2}^{(a)}$ is the time to the earlier of metastasis and death; and $T_{\infty}^{(a)}$ is the time to death. Obviously, $T_{1}^{(a)} \leq \dots \leq T_{K}^{(a)} \leq T_{\infty}^{(a)}$ (with equalities attainable in cases of “state skipping,” e.g., death without any nonfatal events). Because of progressivity, $Y^{(a)}(t)$ is completely determined by the $(K + 1)$ transition times. In fact, $Y^{(a)}(t) = k$ is equivalent to $T_{k}^{(a)} \leq t < T_{k+1}^{(a)}$ for $k = 0, 1, \dots, K$ with $T_{0}^{(a)} \equiv 0$ and $T_{K+1}^{(a)} \equiv T_{\infty}^{(a)}$, and $Y^{(a)}(t) = \infty$ is equivalent to $T_{\infty}^{(a)} \leq t$. This means that

$$\mathrm{pr}\{Y^{(a)}(t) = k\} = S_{k+1}^{(a)}(t) - S_{k}^{(a)}(t) \quad (k = 0, 1, \dots, K) \quad \text{and} \quad \mathrm{pr}\{Y^{(a)}(t) = \infty\} = 1 - S_{K+1}^{(a)}(t), \tag{3}$$

where $S_{k}^{(a)}(t) = \mathrm{pr}(T_{k}^{(a)} > t)$ $(k = 0, 1, \dots, K, K + 1)$. Using $μ_{j, K+1}(τ)$ as an alias for $μ_{j, \infty}(τ)$, we find that

$$\begin{aligned} μ_{jk}(τ) &= \int_{0}^{τ} \mathrm{pr}\{Y^{(1)}(t) = j, Y^{(0)}(t) = k\}\,dt - \int_{0}^{τ} \mathrm{pr}\{Y^{(0)}(t) = j, Y^{(1)}(t) = k\}\,dt \\ &= \int_{0}^{τ} \mathrm{pr}\{Y^{(1)}(t) = j\}\,\mathrm{pr}\{Y^{(0)}(t) = k\}\,dt - \int_{0}^{τ} \mathrm{pr}\{Y^{(0)}(t) = j\}\,\mathrm{pr}\{Y^{(1)}(t) = k\}\,dt \\ &= \int_{0}^{τ} \{S_{j+1}^{(1)}(t) - S_{j}^{(1)}(t)\}\{S_{k+1}^{(0)}(t) - S_{k}^{(0)}(t)\}\,dt \\ &\quad - \int_{0}^{τ} \{S_{j+1}^{(0)}(t) - S_{j}^{(0)}(t)\}\{S_{k+1}^{(1)}(t) - S_{k}^{(1)}(t)\}\,dt \end{aligned} \tag{4}$$

for $0 \leq j < k = 1, \dots, K, K + 1$, where $S_{K+2}^{(a)}(t) \equiv 1$. The first equality in (4) follows by interchanging the expectation and integration in (2), the second by the independence of $Y^{(1)}(\cdot)$ and $Y^{(0)}(\cdot)$, and the third by (3).
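As a numerical illustration of (3) and (4), the base-R sketch below evaluates the subcomponents for death when the marginal survival functions are known. Everything specific here is an assumption made for the example: exponential marginals with made-up rates, $K = 2$, and a treated arm whose rates are scaled by a hypothetical factor `hr`. It also checks that the death subcomponents add up to the difference in RMST, as the decomposition requires.

```r
# Hypothetical setup (not the paper's simulation model): exponential marginals
# for T_1, T_2, T_inf with K = 2; the treated arm's rates are scaled by hr.
rates <- c(0.9, 0.45, 0.2)   # decreasing, so that S_1 <= S_2 <= S_3
hr <- 0.8
tau <- 3
K <- length(rates) - 1

# S_k^{(a)}(t) for k = 0, ..., K + 2, with S_0 = 0 and S_{K+2} = 1
surv <- function(k, a, t) {
  if (k == 0) return(rep(0, length(t)))
  if (k == K + 2) return(rep(1, length(t)))
  exp(-hr^a * rates[k] * t)
}

# pr{Y^{(a)}(t) = k} by (3); k = K + 1 stands for death
p_state <- function(k, a, t) surv(k + 1, a, t) - surv(k, a, t)

# mu_jk(tau) by the last expression in (4)
mu_jk <- function(j, k) {
  f <- function(t) {
    p_state(j, 1, t) * p_state(k, 0, t) - p_state(j, 0, t) * p_state(k, 1, t)
  }
  integrate(f, 0, tau)$value
}

# Subcomponents of the survival stage: mu_{0,inf}, mu_{1,inf}, mu_{2,inf}
mu_death_sub <- sapply(0:K, mu_jk, k = K + 1)
rmst_diff <- integrate(function(t) surv(K + 1, 1, t) - surv(K + 1, 0, t), 0, tau)$value
print(mu_death_sub)
print(all.equal(sum(mu_death_sub), rmst_diff, tolerance = 1e-6))
```

The identity holds because summing $\mathrm{pr}\{Y^{(a)}(t) = j\}$ over $j = 0, \dots, K$ gives $S_{K+1}^{(a)}(t)$, so the integrand collapses to $S_{K+1}^{(1)}(t) - S_{K+1}^{(0)}(t)$.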

In practice, the $Y^{(a)}(\cdot)$ are censored. With $C^{(a)}$ denoting the independent censoring time, we observe $O^{(a)} \equiv \{Y^{(a)}(t) : 0 \leq t \leq T_{\infty}^{(a)} \land C^{(a)}\}$, where $b \land c = \min(b, c)$. In parallel with the latent $Y^{(a)}(\cdot)$, we can equivalently express $O^{(a)}$ using a sequence of censored transition times, namely, $(X_{k}^{(a)}, δ_{k}^{(a)})$ $(k = 1, \dots, K, K + 1)$, where $X_{k}^{(a)} = T_{k}^{(a)} \land C^{(a)}$ and $δ_{k}^{(a)} = I(T_{k}^{(a)} \leq C^{(a)})$. Let $\{O_{1}^{(a)}, \dots, O_{n_{a}}^{(a)}\}$ denote a random $n_{a}$-sample of $O^{(a)}$ and write $n = n_{1} + n_{0}$. In the absence of competing risks other than death (see, e.g., Mao 2023), we can estimate the unknown $S_{k}^{(a)}(t)$ in (4) by the Kaplan–Meier estimator based on the $n_{a}$-sample of $(X_{k}^{(a)}, δ_{k}^{(a)})$.
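The passage from latent to censored transition times can be sketched in a few lines of base R. The numbers are made up for one hypothetical patient with $K = 2$; the point is only that all $K + 1$ transition times share the same censoring time.

```r
# Made-up latent transition times for one patient (K = 2) and a censoring time
T_latent <- c(1.0, 2.5, 4.0)          # T_1, T_2, T_inf
C <- 3.0

X <- pmin(T_latent, C)                # X_k = min(T_k, C)
delta <- as.integer(T_latent <= C)    # delta_k = I(T_k <= C)

# Relapse and metastasis are observed; death is censored at C
print(data.frame(k = c("1", "2", "inf"), X = X, delta = delta))
```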

Proposition 1

Let ${\hat{S}}_{k}^{(a)} (t)$ denote the Kaplan–Meier estimator for $S_{k}^{(a)} (t)$ $(k = 1, \dots, K, K + 1)$ with ${\hat{S}}_{0}^{(a)} (t) \equiv 0$ and ${\hat{S}}_{K + 2}^{(a)} (t) \equiv 1$ . Then, for $0 \leq j < k = 1, \dots, K, K + 1$ , the subcomponent $μ_{j k} (τ)$ can be consistently estimated by

$$\hat{μ}_{jk}(τ) = \int_{0}^{τ} \{\hat{S}_{j+1}^{(1)}(t) - \hat{S}_{j}^{(1)}(t)\}\{\hat{S}_{k+1}^{(0)}(t) - \hat{S}_{k}^{(0)}(t)\}\,dt - \int_{0}^{τ} \{\hat{S}_{j+1}^{(0)}(t) - \hat{S}_{j}^{(0)}(t)\}\{\hat{S}_{k+1}^{(1)}(t) - \hat{S}_{k}^{(1)}(t)\}\,dt,$$

which is asymptotically normal with variance that can be robustly estimated by (12) in the Appendix.

It can be easily shown that $\sum_{j<k} \hat{μ}_{jk}(τ) = \hat{μ}_{k}(τ)$, where $\hat{μ}_{k}(τ)$ is the estimator of Mao (2023) for $μ_{k}(τ)$. To derive the asymptotic normality and variance of $\hat{μ}_{jk}(τ)$, we can expand it asymptotically into a linear form (Tsiatis 2006), i.e., a sum of i.i.d. terms, using the functional delta method on the $\hat{S}_{k}^{(a)}(t)$, whose asymptotic linear forms are known (see, e.g., Corollary 3.2.1 of Fleming and Harrington 1991). Appendix A.1 lays out the details. Using these results, we can easily make inferences and construct confidence intervals for each $μ_{jk}(τ)$.
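The plug-in estimator of Proposition 1 can be sketched in base R without any package. This is a rough illustration under stated assumptions, not the rmt implementation: the data generator `sim_arm`, the hand-written Kaplan–Meier helper `km_fun`, and all parameter values are hypothetical, the latent times are independent exponentials, and no variance is computed. Because every estimated curve is a step function, the integrals reduce to finite sums over the observed event times.

```r
set.seed(2023)

# Toy data generator (an assumption for illustration; independent exponential
# latent times rather than the copula model used in the paper's simulations).
sim_arm <- function(n, hr) {
  relapse <- rexp(n, hr * 0.8)
  metast  <- rexp(n, hr * 0.4)
  death   <- rexp(n, hr * 0.2)
  Tmat <- cbind(pmin(relapse, metast, death), pmin(metast, death), death)
  Cmat <- matrix(runif(n, 1, 4), n, 3)   # same censoring time in every column
  list(X = pmin(Tmat, Cmat), delta = (Tmat <= Cmat) * 1)
}

# Kaplan-Meier estimate returned as a right-continuous step function
km_fun <- function(x, d) {
  tj <- sort(unique(x[d == 1]))
  if (length(tj) == 0) return(function(t) rep(1, length(t)))
  s <- cumprod(sapply(tj, function(u) 1 - sum(x == u & d == 1) / sum(x >= u)))
  stepfun(tj, c(1, s))
}

# List holding S_0 = 0, the K + 1 Kaplan-Meier curves, and S_{K+2} = 1;
# element i of the list is S_{i-1}
surv_list <- function(arm) {
  c(list(function(t) rep(0, length(t))),
    lapply(seq_len(ncol(arm$X)), function(k) km_fun(arm$X[, k], arm$delta[, k])),
    list(function(t) rep(1, length(t))))
}

arm1 <- sim_arm(100, hr = 0.8)   # treatment
arm0 <- sim_arm(100, hr = 1)     # control
S1 <- surv_list(arm1)
S0 <- surv_list(arm0)
tau <- 3

# All estimated curves are constant between consecutive observed event times
ev <- c(arm1$X[arm1$delta == 1], arm0$X[arm0$delta == 1])
brk <- sort(unique(c(0, ev[ev < tau], tau)))
left <- head(brk, -1)
len <- diff(brk)

p_hat <- function(S, k) S[[k + 2]](left) - S[[k + 1]](left)   # estimated pr{Y(t) = k}
mu_hat <- function(j, k) {
  sum(len * (p_hat(S1, j) * p_hat(S0, k) - p_hat(S0, j) * p_hat(S1, k)))
}

# Subcomponents of the survival stage and their sum versus the KM-based RMST difference
mu_death_sub <- sapply(0:2, mu_hat, k = 3)
rmst_diff <- sum(len * (S1[[4]](left) - S0[[4]](left)))
print(mu_death_sub)
print(all.equal(sum(mu_death_sub), rmst_diff))
```

The final comparison is the estimated counterpart of the additivity stated above: the three death subcomponents sum to the difference in Kaplan–Meier RMST.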

2.3. Joint tests on the components

There are several ways to test the overall treatment effect on the composite endpoint. The simplest one is to test $H_{0} : μ (τ) = 0$ using the estimator $\hat{μ} (τ) = \sum_{k = 1}^{K, \infty} {\hat{μ}}_{k} (τ)$ along with its standard error. Alternatively, one can test on the $(K + 1)$ stage-wise effects jointly, i.e.,

$$H_{0,\mathrm{main}} : μ_{1}(τ) = \dots = μ_{\infty}(τ) = 0,$$

or even on the $(K + 1)(K + 2)/2$ subcomponents, i.e.,

$$H_{0,\mathrm{sub}} : μ_{01}(τ) = μ_{02}(τ) = μ_{12}(τ) = \dots = μ_{K, \infty}(τ) = 0.$$

These two tests can be advantageous when treatment effect varies substantially across components.

Proposition 2 Write

$$\hat{μ}_{\mathrm{main}}(τ) = \{\hat{μ}_{1}(τ), \dots, \hat{μ}_{\infty}(τ)\}^{T} \quad \text{and} \quad \hat{μ}_{\mathrm{sub}}(τ) = \{\hat{μ}_{01}(τ), \hat{μ}_{02}(τ), \hat{μ}_{12}(τ), \dots, \hat{μ}_{K, \infty}(τ)\}^{T}.$$

Let $\hat{Σ}_{\mathrm{main}}(τ)$ and $\hat{Σ}_{\mathrm{sub}}(τ)$ denote the robust variance matrix estimators for $\hat{μ}_{\mathrm{main}}(τ)$ and $\hat{μ}_{\mathrm{sub}}(τ)$, respectively, given in Appendix A.2. Then,

$$\begin{matrix} \hat{μ}_{\mathrm{main}}(τ)^{T}\, \hat{Σ}_{\mathrm{main}}(τ)^{-1}\, \hat{μ}_{\mathrm{main}}(τ) \overset{H_{0,\mathrm{main}}}{\sim} χ_{K+1}^{2} \\ \text{and} \quad \hat{μ}_{\mathrm{sub}}(τ)^{T}\, \hat{Σ}_{\mathrm{sub}}(τ)^{-1}\, \hat{μ}_{\mathrm{sub}}(τ) \overset{H_{0,\mathrm{sub}}}{\sim} χ_{(K+1)(K+2)/2}^{2}. \end{matrix} \tag{5}$$

Based on the null distributions of the quadratic forms, we can easily construct chi-square tests with $(K + 1)$ and $(K + 1)(K + 2)/2$ degrees of freedom (d.f.) to test $H_{0,\mathrm{main}}$ and $H_{0,\mathrm{sub}}$, respectively.
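Given a vector of estimates and its variance matrix, the quadratic-form tests in (5) take only a few lines of base R. The sketch below is generic: `joint_test` is a hypothetical helper, and the estimates and variance matrix are made up rather than taken from any analysis in this paper.

```r
# Wald-type chi-square test of H0: all components equal zero
joint_test <- function(mu_hat, Sigma_hat) {
  stat <- drop(crossprod(mu_hat, solve(Sigma_hat, mu_hat)))   # mu' Sigma^{-1} mu
  df <- length(mu_hat)
  c(statistic = stat, df = df, p.value = pchisq(stat, df, lower.tail = FALSE))
}

# Made-up estimates and variance matrix for two components
mu_hat <- c(0.2, 0.3)
Sigma_hat <- matrix(c(0.010, 0.004,
                      0.004, 0.040), nrow = 2)
print(joint_test(mu_hat, Sigma_hat))
```

Using `solve(Sigma_hat, mu_hat)` avoids forming the explicit inverse; the degrees of freedom equal the number of components tested, so the same helper would cover both the main-component and the subcomponent test.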

2.4. Special case with recurrent events and death

The procedures in Sections 2.2 and 2.3 technically apply when $Y^{(a)}(t)$ represents recurrent events and death such as in Figure 1(b). However, comparison of individual states is substantively less meaningful when those pertain to the number of occurrences of the same event. It is rarely of interest, for example, to separate out time spent having been hospitalized twice as opposed to, say, three, four, five, or more times. Coalescing the $μ_{jk}(τ)$ into a smaller set would make interpretation easier.

One way of doing so is to dichotomize between event-free (state 0) versus living with one or more events (states $1, \dots, K$ ). This splits $μ_{\infty} (τ)$ (net RMST) into $μ_{0, \infty} (τ)$ , the extra lifetime gained event-free, and $μ_{1 +, \infty} (τ) = \sum_{j = 1}^{K} μ_{j, \infty} (τ)$ , the extra lifetime gained having experienced at least one event. Likewise, $μ_{R} (τ)$ (see the end of Section 2.1) is split into $μ_{0, 1 +} (τ) = \sum_{k = 1}^{K} μ_{0 k} (τ)$ , the extra time gained event-free when alive, and $μ_{1 +, R} (τ) = \sum_{k = 2}^{K} \sum_{j = 1}^{k - 1} μ_{j k} (τ)$ , the extra time gained with fewer, but nonzero, nonfatal events when alive. In sum, we have that

$$μ(τ) = \underbrace{μ_{0, 1+}(τ) + μ_{1+, R}(τ)}_{μ_{R}(τ)} + \underbrace{μ_{0, \infty}(τ) + μ_{1+, \infty}(τ)}_{μ_{\infty}(τ)}. \tag{6}$$

Hence no matter how large $K$ is, we will always have two main components and four subcomponents. These can be estimated by aggregating the lower-level ${\hat{μ}}_{j k} (τ)$ introduced in Proposition 1. A computationally more efficient approach is outlined in the supplementary materials. Corresponding joint tests with 2 and 4 d.f.’s can be constructed along the lines of Proposition 2.
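The aggregation in (6) amounts to summing blocks of the $μ_{jk}(τ)$ array. The base-R sketch below assumes the subcomponents are stored in a matrix with rows $j = 0, \dots, K$ and columns $k = 1, \dots, K, \infty$; this layout, the helper name `collapse_recurrent`, and the values are all hypothetical, and the simple summation shown here is not the computationally more efficient approach of the supplementary materials.

```r
# M[j + 1, k] holds mu_jk for j = 0..K (rows) and k = 1..K, Inf (columns);
# entries with j >= k do not exist and are stored as 0.
collapse_recurrent <- function(M) {
  K <- nrow(M) - 1
  nonfatal <- M[, seq_len(K), drop = FALSE]   # reference states 1, ..., K
  death <- unname(M[, K + 1])                 # reference state: death
  c(mu_0_1plus   = sum(nonfatal[1, ]),                   # event-free vs 1+ events
    mu_1plus_R   = sum(nonfatal[-1, , drop = FALSE]),    # fewer but nonzero events
    mu_0_inf     = death[1],                             # extra lifetime event-free
    mu_1plus_inf = sum(death[-1]))                       # extra lifetime with 1+ events
}

# Made-up subcomponents for K = 3
K <- 3
M <- matrix(0, K + 1, K + 1)
M[upper.tri(M, diag = TRUE)] <- seq(0.01, 0.10, by = 0.01)
four <- collapse_recurrent(M)
print(four)
print(all.equal(sum(four), sum(M)))   # the four pieces add back to mu(tau)
```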

2.5. Software

The R-programs that implement the new procedures are integrated with the original methodology in the rmt package. Recall that the main function to fit the RMT-IF is rmtfit(), with the basic syntax

obj <- rmtfit(id, time, status, trt, type=c("multistate","recurrent"))

It accepts input data in the long format, with an id variable holding the unique patient identifiers. The time and status variables contain the event times and labels of event types, respectively. With type="multistate" (default) for standard multistate data, the value of status corresponds to the label $k$ of the state triggered by the event, except that status = 0 for censoring and status = K + 1 for death. In Figure 1(a), for example, status = 1, 2, and 3 indicate relapse, metastasis, and death, respectively. With type="recurrent" for recurrent-event data, status = 1 for all nonfatal events (ordered chronologically) and status = 2 for death. In addition, the trt variable contains binary indicators for the treatment against control. At this point, we do not need to specify the restricting time $τ$. Instead, we do so when using the summary() function on the rmtfit object to extract results on the overall and stage-wise effects for a particular $τ$, output in a format similar to that used in Mao (2023).
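As a small illustration of the long format just described, the base-R snippet below builds a toy multistate dataset with $K = 2$ for three made-up patients. The data frame name `dat` and all values are hypothetical; the rmtfit() call is left as a comment, following the syntax quoted above, so that the snippet runs without the rmt package.

```r
# One row per event or censoring time; status: 1 = relapse, 2 = metastasis,
# 3 = death (K + 1), 0 = censoring
dat <- data.frame(
  id     = c(1, 1, 1, 2, 2, 3),
  time   = c(1.2, 2.0, 3.1, 0.8, 2.5, 1.5),
  status = c(1, 2, 3, 1, 0, 3),
  trt    = c(1, 1, 1, 0, 0, 0)
)
print(dat)

# Patient 1 (treated) relapses, metastasizes, and dies; patient 2 (control)
# relapses and is censored; patient 3 (control) dies without nonfatal events.
# With the rmt package installed, the fit would follow the syntax quoted above:
# obj <- with(dat, rmtfit(id, time, status, trt, type = "multistate"))
```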

Table 1. Simulation results for the estimation and inference of the $μ_{j k} (τ)$ .


Table 2. Simulation results for the empirical type I error of different tests.


Table 3. Analysis of the colon cancer trial using the RMT-IF (months) of combined treatment.


Now, to carry out the further decomposition, apply the new function dissect() similarly on the rmtfit object with a user-specified $τ$, e.g., dissect(obj, tau = 3.0). To illustrate, we pick a random dataset with $n = 200$ and $K = 2$ in the first simulations in Section 3 and run

> obj <- rmtfit(id, time, status, trt)
> obj_sub <- dissect(obj, tau = 3.0)
> obj_sub
Call:
rmtfit.default(id = id, time = time, status = status, trt = trt)

Restricted mean time in favor of group "1" by time tau = 3:
                  Estimate   Std.Err  Z value   Pr(>|z|)
Overall           0.598838  0.189976   3.1522  0.0016206 **
Death             0.174255  0.140569   1.2396  0.2151084
  vs State 0      0.135488  0.064662   2.0953  0.0361410 *
  vs State 1      0.063326  0.050268   1.2598  0.2077573
  vs State 2     -0.024559  0.054056  -0.4543  0.6495950
State 2           0.295115  0.088653   3.3289  0.0008720 ***
  vs State 0      0.215703  0.059633   3.6172  0.0002978 ***
  vs State 1      0.079412  0.040621   1.9549  0.0505891 .
State 1 (vs 0)    0.129468  0.055798   2.3203  0.0203251 *

Overall chi-square test:
X-squared = 9.55291, df = 1, p-value = 0.002;
Joint chi-square test on main components:
X-squared = 13.53824, df = 3, p-value = 0.0036;
Joint chi-square test on subcomponents:
X-squared = 20.78061, df = 6, p-value = 0.002.

The output is largely self-explanatory. In the table below the function call, the unindented lines show results for $μ (τ)$ and the $μ_{k} (τ)$ , whereas the indented ones concern the subcomponents $μ_{j k} (τ)$ (check the numerical additivity of Estimate!). The entire table is available as a numeric matrix in obj_sub$tab. We also see the results of the $χ_{1}^{2}$ (same as in the Overall line of the previous table), $χ_{K + 1}^{2}$ , and $χ_{(K + 1) (K + 2) / 2}^{2}$ tests, whose $p$ -values can be extracted from the trivariate vector obj_sub$pval. When we have recurrent events instead of standard multistate data, the decomposition scheme will be different according to Section 2.4, but the output will be similarly structured.
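The additivity of the Estimate column can be checked by hand or with a few lines of base R. The numbers below are copied from the printed output above; the variable names are ours, and the tolerance only allows for rounding in the printout.

```r
# Estimates copied from the printed table (tau = 3)
overall    <- 0.598838
death      <- 0.174255
state2     <- 0.295115
state1     <- 0.129468
death_sub  <- c(vs0 = 0.135488, vs1 = 0.063326, vs2 = -0.024559)
state2_sub <- c(vs0 = 0.215703, vs1 = 0.079412)

# Subcomponents add up to their main component ...
print(all.equal(sum(death_sub), death, tolerance = 1e-5))
print(all.equal(sum(state2_sub), state2, tolerance = 1e-5))
# ... and the main components add up to the overall effect
print(all.equal(death + state2 + state1, overall, tolerance = 1e-5))
```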

Finally, we introduce a graphic tool called “favorability plot.” Because all components of RMT-IF (and itself) are net measures of favorable and unfavorable times, a natural way to visualize them is to put the two opposing metrics side by side, as commonly seen in opinion polls of public figures or policies. To do so, use the ggrmtif() function (powered by ggplot2) directly on the dissect object, e.g.,

ggrmtif(obj_sub, unit = "months")

This will generate a graphic that looks like Figure 5 or 6 ahead. It differs from the “bouquet plot” (Mao 2023) in that it maps out sub- as well as main components at a fixed $τ$, rather than just the main components over a spectrum of $τ$. We can add a state.label option to name states $0, 1, \dots, K$ in the graphic. For example, use state.label=c("Remission","Relapse") to produce the labels appearing on the left of Figure 5. For detailed usage of the rmt package, see documentation and vignettes at https://cran.r-project.org/package=rmt.

3. Simulation Studies

In this section, we consider a standard multistate process with $K = 2$, as in Figure 1(a). Simulations for recurrent events and death are described in the supplementary materials. For a generic patient in group $a$ $(a = 1, 0)$, use $\tilde{T}_{1}^{(a)}$, $\tilde{T}_{2}^{(a)}$, and $D^{(a)}$ to denote the latent relapse, metastasis, and death times, respectively. When $\tilde{T}_{2}^{(a)} < \tilde{T}_{1}^{(a)}$, we consider the patient to have metastasized without experiencing (non-metastatic) relapse. This leads to a progressive process with transition times $T_{1}^{(a)} = \tilde{T}_{1}^{(a)} \land \tilde{T}_{2}^{(a)} \land D^{(a)}$, $T_{2}^{(a)} = \tilde{T}_{2}^{(a)} \land D^{(a)}$, and $T_{\infty}^{(a)} = D^{(a)}$. We generated the latent event times through a trivariate Gumbel–Hougaard copula model (Oakes 1989)

$$\mathrm{pr}\{\tilde{T}_{1}^{(a)} > t_{1}, \tilde{T}_{2}^{(a)} > t_{2}, D^{(a)} > s\} = \exp\left[-\{(θ^{a} λ_{1} t_{1})^{κ} + (θ^{a} λ_{2} t_{2})^{κ} + (θ^{a} λ_{D} s)^{κ}\}^{1/κ}\right], \tag{7}$$

where $λ_{1} = 0.8$, $λ_{2} = 0.4$, $λ_{D} = 0.2$, $κ = 2$ (producing Kendall’s concordance coefficient $1 - κ^{-1} = 50\%$ between components; see Oakes (1989)), and $θ > 0$ is a common hazard ratio (HR) for all events. Under (7), we can use the relationship between the transition and latent event times to show that the former follow exponential distributions: $T_{1}^{(a)} \sim \mathrm{Expn}(θ^{a} λ_{1}^{*})$, $T_{2}^{(a)} \sim \mathrm{Expn}(θ^{a} λ_{2}^{*})$, and $T_{3}^{(a)} \sim \mathrm{Expn}(θ^{a} λ_{D})$, where $λ_{1}^{*} = (λ_{1}^{κ} + λ_{2}^{κ} + λ_{D}^{κ})^{1/κ}$ and $λ_{2}^{*} = (λ_{2}^{κ} + λ_{D}^{κ})^{1/κ}$. These marginal distributions allow us to derive the $μ_{jk}(τ)$ as functions of $τ$ in closed form using (4) (see supplementary materials for details). For censoring, let $C^{(a)} \sim \mathrm{Unif}[1, 4] \land \mathrm{Expn}(0.1)$. Under this setup, the observed relapse, metastasis, and death rates are about 60%, 35%, and 20%, respectively.
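The exponential marginals follow in one step from (7) by evaluating the joint survival function on the diagonal. A short worked version, using only the definitions of $T_{1}^{(a)}$ and $T_{2}^{(a)}$ given above (the numerical values at the end are our own arithmetic from the stated parameters):

```latex
\begin{aligned}
\mathrm{pr}(T_{1}^{(a)} > t)
  &= \mathrm{pr}\{\tilde{T}_{1}^{(a)} > t,\ \tilde{T}_{2}^{(a)} > t,\ D^{(a)} > t\} \\
  &= \exp\left[-\{(\theta^{a}\lambda_{1}t)^{\kappa} + (\theta^{a}\lambda_{2}t)^{\kappa}
       + (\theta^{a}\lambda_{D}t)^{\kappa}\}^{1/\kappa}\right] \\
  &= \exp\{-\theta^{a}(\lambda_{1}^{\kappa} + \lambda_{2}^{\kappa}
       + \lambda_{D}^{\kappa})^{1/\kappa}\, t\}
   = \exp(-\theta^{a}\lambda_{1}^{*}t), \\
\mathrm{pr}(T_{2}^{(a)} > t)
  &= \mathrm{pr}\{\tilde{T}_{1}^{(a)} > 0,\ \tilde{T}_{2}^{(a)} > t,\ D^{(a)} > t\}
   = \exp\{-\theta^{a}(\lambda_{2}^{\kappa} + \lambda_{D}^{\kappa})^{1/\kappa}\, t\}
   = \exp(-\theta^{a}\lambda_{2}^{*}t).
\end{aligned}
% With \lambda_1 = 0.8, \lambda_2 = 0.4, \lambda_D = 0.2 and \kappa = 2:
% \lambda_1^* = \sqrt{0.84} \approx 0.917, \qquad \lambda_2^* = \sqrt{0.20} \approx 0.447.
```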

We first focused on the estimation and inference of $μ_{jk}(τ)$ described in Proposition 1. With $θ = 1$ and 0.8, we generated samples of size $n = 500$ with equal allocations to the treatment and control, and estimated the $μ_{jk}(τ)$ and $μ_{k}(τ)$ $(0 \leq j < k = 1, 2, \infty)$ for $τ = 1.5$ and 3.0. The results are summarized in Table 1. All estimators show minimal bias, with robust standard errors closely reflecting their empirical variations. The corresponding 95% confidence intervals cover the true values at about the nominal rate. The same simulations were repeated with sample sizes $n = 200, 1000$, and 2000. Similar results are shown in Tables S1–S3 in the supplementary materials.

Next, we checked the accuracy of the $\hat{μ}_{jk}(τ)$ over a spectrum of $τ$. Under $θ = 0.8$, we plotted the average estimates across 10,000 samples generated in the previous simulations and overlaid them with the true values computed from the analytic formulas given in the supplementary materials. As seen from Figure 3, the average estimates are virtually indistinguishable from the true curves. Similar accuracy is observed for samples of size $n = 200$ and $1000$ (see Figures S1 and S2 in the supplementary materials).

Figure 3. Estimation of $μ_{j k} (τ)$ as a function of $τ$ . Solid line, true values; dashed line, average estimates based on 10,000 replicates of size $n = 500$ .

Finally, we turned to the joint tests proposed in Section 2.3. Three types of tests were considered: a $χ_{1}^{2}$ test based on $\hat{μ} (τ)$ , a $χ_{3}^{2}$ test based on the ${\hat{μ}}_{k} (τ)$ , and a $χ_{6}^{2}$ test based on the ${\hat{μ}}_{j k} (τ)$ , with the latter two described in (5) of Proposition 2. Because their relative performance likely depends on the pattern of component-wise effects, we relaxed model (7) to

$$\mathrm{pr}\{\tilde{T}_{1}^{(a)} > t_{1}, \tilde{T}_{2}^{(a)} > t_{2}, D^{(a)} > s\} = \exp\left[-\{(θ_{1}^{a} λ_{1} t_{1})^{κ} + (θ_{2}^{a} λ_{2} t_{2})^{κ} + (θ_{D}^{a} λ_{D} s)^{κ}\}^{1/κ}\right],$$

where $θ_{1}$, $θ_{2}$, and $θ_{D}$ are the component-specific HRs for relapse, metastasis, and death, respectively. We first checked the type I error rates of these tests with $θ_{1} = θ_{2} = θ_{D} = 1$, where the two groups are equivalent. All other parameters remain the same as in previous simulations. With $C^{(a)} \sim \mathrm{Unif}[1, 6] \land \mathrm{Expn}(0.1)$, we performed level-0.05 tests at $τ = 3.0$ and $4.0$ for $n = 200, 500, 1000$, and 2000. The empirical rejection rates are summarized across 10,000 samples in Table 2. All three tests show roughly correct type I error rate (with a slight deflation for $χ_{6}^{2}$), confirming their validity.

We then compared the power of these tests under alternative hypotheses. We set up three scenarios: (1) identical component-wise HR: $θ_{1} = θ_{2} = θ_{D} = θ$; (2) identical HR on relapse and metastasis and no effect on death: $θ_{1} = θ_{2} = θ$ and $θ_{D} = 1$; (3) possible effect on death but no effect on relapse or metastasis: $θ_{1} = θ_{2} = 1$ and $θ_{D} = θ$. In each scenario, we ran the three types of tests at $τ = 4.0$ on 10,000 replicate samples of size $n = 200$ as $θ$ decreases from 1.0 to 0.4. The resulting empirical rejection rates are plotted as a function of $θ$ in Figure 4. When component-wise HRs are the same, $χ_{1}^{2}$ is the most powerful of the three. However, it is easily outperformed by the joint tests in the latter two scenarios with heterogeneous component-wise effects.

Figure 4. Empirical power as a function of HR $θ$ at restricting time $τ = 4.0$ based on 10,000 replicate samples of size $n = 200$ . Scenario 1: $θ_{1} = θ_{2} = θ_{D} = θ$ ; scenario 2: $θ_{1} = θ_{2} = θ$ and $θ_{D} = 1$ ; scenario 3: $θ_{1} = θ_{2} = 1$ and $θ_{D} = θ$ . Dashed line, the $0.05$ significance level.

4. Real examples

With the new tools, we delve deeper into the two trials analyzed in Mao (2023) by the RMT-IF.

4.1. A colon cancer study

Moertel et al. (1990) reported a landmark colon cancer trial that established the efficacy of levamisole and fluorouracil in reducing the mortality and relapse in patients with stage C disease. The original trial involved 929 patients randomized into three arms: control $(n = 315)$, levamisole alone $(n = 310)$, and levamisole combined with fluorouracil $(n = 304)$. Mao (2023) analyzed the data by comparing the combined treatment to the control in terms of RMT-IF, with death prioritized over relapse $(K = 1)$. Over a median follow-up of 5.5 years, 119 (39%) patients in the combined treatment relapsed, 18 (5.9%) died before relapse, and 105 (34.5%) died after; 177 (56%) patients in the control relapsed, 15 (4.8%) died before relapse, and 153 (48.6%) died after. It was shown that, in the first $τ = 7.5$ years after resection of tumor (the point of randomization), the treatment on average gains the patient $μ(τ) = 11.6$ months in a more favorable state, including an extra $μ_{\infty}(τ) = 7.4$ months survival time and $μ_{1}(τ) = 4.2$ months in remission as opposed to relapse.

Following this analysis, we further examine the composition of the survival component. We consider restricting times $τ = 2.5, 5.0$, and 7.5 years, and use Proposition 1 to estimate and make inferences on the subcomponents. It turns out from Table 3 that the survival benefits are fully explained by net gains in remission, which means that the prolonged life is of high quality. (The negative values of the other subcomponents are statistically insignificant and may only reflect a general reduction in relapse.) In particular, in the first $τ = 7.5$ years, treated patients on average survive 8.1 extra months in remission and lose 0.7 month post-relapse, accounting for a total of 7.4 months of net survival time. This pattern is shown in the favorability plot in Figure 5, where the between-arm imbalance in “Death vs Life” is visibly driven by “Death vs Remission”.

Figure 5. Favorability plot for the colon cancer trial at $τ = 7.5$ years.

For the composite endpoint of death and relapse, we perform joint tests with 2 and 3 d.f. following Section 2.3. All $p$-values are smaller than those of the corresponding single-d.f. tests in the bottom line of Table 3. This is not surprising given the consistently more significant effect on relapse than on survival, compounded by the even greater lopsidedness between the two subcomponents within survival.

4.2. A heart failure study

The Heart Failure: A Controlled Trial Investigating Outcomes of Exercise Training (HF-ACTION) study (O’Connor et al. 2009) evaluated the effect of adding exercise training to the usual care of over 2,000 heart failure patients. Mao (2023) analyzed the data on a high-risk subgroup consisting of 426 nonischemic patients with poor performance in cardiopulmonary exercise test at baseline. In the cohort, 205 patients were randomized to receive exercise training along with usual care, and the remaining 221 received usual care alone as control. They were followed over a median length of 2.5 years. In the training group, there were 145 (71%) first hospitalizations and 306 (1.5 per patient) recurring hospitalizations; 6 (3%) and 30 (15%) patients died before and after the first hospitalization, respectively. In the control group, there were 170 (77%) first hospitalizations and 401 (1.8 per patient) recurring hospitalizations; 5 (2%) and 52 (24%) patients died before and after the first hospitalization, respectively. These crude statistics point to potential benefits of exercise training on both death and hospitalization. Indeed, it was shown that the treatment on average gains the patient $μ(τ) = 5.1$ months in a more favorable state in the first $τ = 4$ years post-randomization, including extra $μ_{\infty}(τ) = 2.9$ months survival time and $μ_{R}(τ) = 2.2$ months living with fewer hospitalizations.

We look further into the two main components through the decompositions of (6). We find that the extra lifetime consists of 1.1 months hospitalization-free (standard error 0.52 and $p$-value 0.032) and 1.8 months having been hospitalized at least once (standard error 0.99 and $p$-value 0.076), a much more balanced composition than that in the colon cancer trial of Section 4.1. Likewise, the extra time spent living with fewer hospitalizations consists of 1.3 months hospitalization-free (standard error 1.2 and $p$-value 0.314) and 0.9 months having been hospitalized at least once (standard error 0.8 and $p$-value 0.215). The favorability plot in Figure 6 shows the structure of the effect sizes. The $χ_{2}^{2}$ and $χ_{4}^{2}$ joint tests yield $p$-values of 0.039 and 0.173, respectively, both less significant than the $χ_{1}^{2}$ overall test ($p$-value 0.018; see Mao (2023)) due to the largely homogeneous effects across components.
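As a rough check on how the quoted standard errors and $p$-values relate, the sketch below computes two-sided Wald $p$-values from the rounded estimates for the extra lifetime. It assumes the subcomponent tests are Wald tests based on asymptotic normality, which this section does not state explicitly; the function name `wald_p_value` is illustrative only. Because the inputs are rounded, the output can only approximate the reported values.

```python
import math


def wald_p_value(estimate, std_error):
    """Two-sided p-value for H0: effect = 0, assuming asymptotic normality."""
    z = estimate / std_error
    # erfc(|z| / sqrt(2)) equals 2 * (1 - Phi(|z|)), with Phi the standard normal CDF.
    return math.erfc(abs(z) / math.sqrt(2.0))


# Rounded (estimate, standard error) pairs quoted in the text, in months.
subcomponents = {
    "extra lifetime, hospitalization-free": (1.1, 0.52),
    "extra lifetime, hospitalized at least once": (1.8, 0.99),
}

p_values = {
    name: wald_p_value(est, se) for name, (est, se) in subcomponents.items()
}

for name, p in p_values.items():
    print(f"{name}: p = {p:.3f}")
```

The small gaps between these values and the reported 0.032 and 0.076 are within what rounding of the estimates and standard errors can produce.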

Figure 6. Favorability plot for the HF-ACTION trial at $τ = 4$ years.

5. Concluding remarks

Our dissection of the RMT-IF helps further reveal the makeup of the overall effect size. The resulting subcomponents, a product of state-to-state comparisons, provide detailed information about the changes in the average time spent in one state over another. Their estimation and inference are facilitated by the correspondence between the state probabilities and the survival functions of transition events, which allows the use of Kaplan–Meier curves to handle censored observations. These procedures will find use in the secondary analysis of composite endpoints, with the aim of understanding how the treatment affects different aspects of patient experience.

As a byproduct of component-wise inference, the robust variance matrices have allowed us to construct joint tests, which empirically outperform the $χ_{1}^{2}$ test on the overall RMT-IF when component-wise effects differ widely. To choose an optimal test in practice, the investigator should consider historical evidence on the heterogeneity of treatment effect as well as the current trial’s sample size (relative to which the test d.f. should be small). In any case, the choice must be made before looking at the data in order to maintain the correct type I error rate.

The restricting time also needs to be pre-specified. Ideally, the time window should be wide enough to be of clinical interest and to at least allow the treatment effect to come through. With $τ = 2.5$ years in the colon cancer trial of Section 4.1, for example, we could hardly see any improvement in patient survival (first row of the results table), probably because the baseline mortality rate is still too low over such a short term. On the other hand, a restricting time beyond the last event in the data may cause numerical issues. Recently, Tian et al. (2020) explored data-dependent choices of the time window for the RMST. A similar study could be done for the RMT-IF.

We have given a separate treatment to recurrent events and death, as their special features deserve. Since the transient (i.e., nonterminal) states, potentially many, are triggered by the same type of event, a meticulous state-to-state comparison feels unnecessary and cumbersome. The merging of intermediate states proposed in Section 2.4 reduces the number of subcomponents to four, yet still allows us to distinguish whether the patient has had any nonfatal events. Compared with the standard partition based on the specific number of events (Mao 2023), this new approach seems to strike a better balance between the level of detail and ease of interpretation.

Disclosure statement

No potential conflict of interest was reported by the authors.

Supplemental data

Supplemental data for this article can be accessed online at https://doi.org/10.1080/10543406.2023.2210658

Additional information

Funding

This research was supported by the National Institutes of Health grant R01HL149875.

References

Abdalla, S., M. E. Montez-Rath, P. S. Parfrey, and G. M. Chertow. 2016. The win ratio approach to analyzing composite outcomes: An application to the EVOLVE trial. Contemporary Clinical Trials 48:119–124. doi:10.1016/j.cct.2016.04.001.
Akacha, M., F. Bretz, D. Ohlssen, G. Rosenkranz, and H. Schmidli. 2017. Estimands and their role in clinical trials. Statistics in Biopharmaceutical Research 9 (3):268–271. doi:10.1080/19466315.2017.1302358.
Anker, S. D., and J. V. McMurray. 2012. Time to move on from “time-to-first”: Should all events be included in the analysis of clinical trials. European Heart Journal 33 (22):2764–2765. doi:10.1093/eurheartj/ehs277.
Armstrong, P. W., and C. M. Westerhout. 2017. Composite end points in clinical research: A time for reappraisal. Circulation 135 (23):2299–2307. doi:10.1161/CIRCULATIONAHA.117.026229.
Buyse, M. 2010. Generalized pairwise comparisons of prioritized outcomes in the two-sample problem. Statistics in Medicine 29 (30):3245–3257. doi:10.1002/sim.3923.
Crowther, M. J., and P. C. Lambert. 2017. Parametric multistate survival models: Flexible modelling allowing transition-specific distributions with application to estimating clinically useful measures of effect differences. Statistics in Medicine 36 (29):4719–4742. doi:10.1002/sim.7448.
Cui, Y., G. Dong, P. F. Kuan, and B. Huang. 2022. Evidence synthesis analysis with prioritized benefit outcomes in oncology clinical trials. Journal of Biopharmaceutical Statistics 33 (3):272–288. doi:10.1080/10543406.2022.2141769.
Dong, G., D. C. Hoaglin, B. Huang, Y. Cui, D. Wang, Y. Cheng, and M. Gamalo-Siebers. 2023. The stratified win statistics (win ratio, win odds, and net benefit). Pharmaceutical Statistics. doi:10.1002/pst.2293.
Dong, G., B. Huang, Y. -W. Chang, Y. Seifu, J. Song, and D. C. Hoaglin. 2020. The win ratio: Impact of censoring and follow-up time and use with nonproportional hazards. Pharmaceutical Statistics 19 (3):168–177. doi:10.1002/pst.1977.
Dong, G., B. Huang, J. Verbeeck, Y. Cui, J. Song, M. Gamalo-Siebers, D. Wang, D. C. Hoaglin, Y. Seifu, T. Mütze, et al. 2022. Win statistics (win ratio, win odds, and net benefit) can complement one another to show the strength of the treatment effect on time-to-event outcomes. Pharmaceutical Statistics. doi:10.1002/pst.2251.
Dong, G., J. Qiu, D. Wang, and M. Vandemeulebroecke. 2018. The stratified win ratio. Journal of Biopharmaceutical Statistics 28 (4):778–796. doi:10.1080/10543406.2017.1397007.
Finkelstein, D. M., and D. A. Schoenfeld. 1999. Combining mortality and longitudinal measures in clinical trials. Statistics in Medicine 18 (11):1341–1354. doi:10.1002/(SICI)1097-0258(19990615)18:11<1341:AID-SIM129>3.0.CO;2-7.
Fleming, T. R., and D. P. Harrington. 1991. Counting Processes and Survival Analysis. Hoboken, NJ: John Wiley & Sons.
Freemantle, N., M. Calvert, J. Wood, J. Eastaugh, and C. Griffin. 2003. Composite outcomes in randomized trials: Greater precision but with greater uncertainty. Journal of the American Medical Association 289 (19):2554–2559. doi:10.1001/jama.289.19.2554.
Kandzari, D. E., G. L. Hickey, S. J. Pocock, M. A. Weber, M. Boehm, S. A. Cohen, M. Fahy, G. Lamberti, and F. Mahfoud. 2021. Prioritised endpoints for device-based hypertension trials: The win ratio methodology. EuroIntervention: Journal of EuroPcr in Collaboration with the Working Group on Interventional Cardiology of the European Society of Cardiology 16 (18):e1496–1502. doi:10.4244/EIJ-D-20-01090.
Mao, L. 2023. On restricted mean time in favor of treatment. Biometrics 79 (1):61–72. doi:10.1111/biom.13570.
Mao, L., and K. Kim. 2021. Statistical models for composite endpoints of death and non-fatal events: A review. Statistics in Biopharmaceutical Research 13 (3):260–269. doi:10.1080/19466315.2021.1927824.
Mao, L., K. Kim, and Y. Li. 2022. On recurrent-event win ratio. Statistical Methods in Medical Research 31 (6):1120–1134. doi:10.1177/09622802221084134.
Maurer, M. S., J. H. Schwartz, B. Gundapaneni, P. M. Elliott, G. Merlini, M. Waddington-Cruz, A. V. Kristen, M. Grogan, R. Witteles, T. Damy, et al. 2018. Tafamidis treatment for patients with transthyretin amyloid cardiomyopathy. The New England Journal of Medicine 379 (11):1007–1016. doi:10.1056/NEJMoa1805689.
McCaw, Z. R., G. Yin, and L. -J. Wei. 2019. Using the restricted mean survival time difference as an alternative to the hazard ratio for analyzing clinical cardiovascular studies. Circulation 140 (17):1366–1368. doi:10.1161/CIRCULATIONAHA.119.040680.
Moertel, C. G., T. R. Fleming, J. S. Macdonald, D. G. Haller, J. A. Laurie, P. J. Goodman, J. S. Ungerleider, W. A. Emerson, D. C. Tormey, J. H. Glick, et al. 1990. Levamisole and fluorouracil for adjuvant therapy of resected colon carcinoma. The New England Journal of Medicine 322 (6):352–358. doi:10.1056/NEJM199002083220602.
Oakes, D. 1989. Bivariate survival models induced by frailties. Journal of the American Statistical Association 84 (406):487–493. doi:10.1080/01621459.1989.10478795.
Oakes, D. 2016. On the win-ratio statistic in clinical trials with multiple types of event. Biometrika 103 (3):742–745. doi:10.1093/biomet/asw026.
O’Connor, C. M., D. J. Whellan, K. L. Lee, S. J. Keteyian, L. S. Cooper, S. J. Ellis, E. S. Leifer, W. E. Kraus, D. W. Kitzman, J. A. Blumenthal, et al. 2009. Efficacy and safety of exercise training in patients with chronic heart failure: HF-ACTION randomized controlled trial. Journal of the American Medical Association 301 (14):1439–1450. doi:10.1001/jama.2009.454.
Pocock, S., C. Ariti, T. Collier, and D. Wang. 2012. The win ratio: A new approach to the analysis of composite endpoints in clinical trials based on clinical priorities. European Heart Journal 33 (2):176–182. doi:10.1093/eurheartj/ehr352.
Redfors, B., J. Gregson, A. Crowley, T. McAndrew, O. Ben-Yehuda, G. W. Stone, and S. J. Pocock. 2020. The win ratio approach for composite endpoints: Practical guidance based on previous experience. European Heart Journal 41 (46):4391–4399. doi:10.1093/eurheartj/ehaa665.
Royston, P., and M. K. Parmar. 2011. The use of restricted mean survival time to estimate the treatment effect in randomized clinical trials when the proportional hazards assumption is in doubt. Statistics in Medicine 30 (19):2409–2421. doi:10.1002/sim.4274.
Seifu, Y., S. Mt-Isa, K. Duke, M. Gamalo-Siebers, W. Wang, G. Dong, and J. Kolassa. 2022. Design of paediatric trials with benefit-risk endpoints using a composite score of adverse events of interest (AEI) and win-statistics. Journal of Biopharmaceutical Statistics 1–12. doi:10.1080/10543406.2022.2153202.
Tian, L., H. Fu, S. J. Ruberg, H. Uno, and L. -J. Wei. 2018. Efficiency of two sample tests via the restricted mean survival time for analyzing event time observations. Biometrics 74 (2):694–702. doi:10.1111/biom.12770.
Tian, L., H. Jin, H. Uno, Y. Lu, B. Huang, K. M. Anderson, and L. Wei. 2020. On the empirical choice of the time window for restricted mean survival time. Biometrics 76 (4):1157–1166. doi:10.1111/biom.13237.
Tsiatis, A. 2006. Semiparametric Theory and Missing Data. New York: Springer.
Uno, H., B. Claggett, L. Tian, E. Inoue, P. Gallo, T. Miyata, D. Schrag, M. Takeuchi, Y. Uyama, L. Zhao, et al. 2014. Moving beyond the hazard ratio in quantifying the between-group difference in survival analysis. Journal of Clinical Oncology 32 (22):2380. doi:10.1200/JCO.2014.55.2208.
Vardeny, O., K. Kim, J. A. Udell, J. Joseph, A. S. Desai, M. E. Farkouh, S. M. Hegde, A. F. Hernandez, A. McGeer, H. K. Talbot, et al. 2021. Effect of high-dose trivalent vs standard-dose quadrivalent influenza vaccine on mortality or cardiopulmonary hospitalization in patients with high-risk cardiovascular disease: A randomized clinical trial. JAMA 325 (1):39–49. doi:10.1001/jama.2020.23649.

Appendix

Rearranging the terms on the far right hand side of (4), we obtain that

$$μ_{jk}(τ) = θ_{j+1,k+1}(τ) - θ_{j+1,k}(τ) - θ_{j,k+1}(τ) + θ_{jk}(τ), \tag{8}$$

where

$$θ_{jk}(τ) = \int_{0}^{τ} \{S_{j}^{(1)}(t) S_{k}^{(0)}(t) - S_{j}^{(0)}(t) S_{k}^{(1)}(t)\} \, dt. \tag{9}$$
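The plug-in evaluation of (9) can be sketched as follows: each survival function is replaced by a Kaplan–Meier step function, and the integrand, being piecewise constant, is integrated exactly over the union of jump times. This is a minimal illustration on made-up, uncensored toy data, not the implementation in the rmt package; the function names `kaplan_meier`, `step_value`, and `theta_hat` are hypothetical.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct event time; S is right-continuous."""
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    surv, curve, i = 1.0, [], 0
    while i < len(data):
        t = data[i][0]
        deaths = ties = 0
        # Collect all observations tied at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            ties += 1
            i += 1
        if deaths > 0:
            surv *= 1.0 - deaths / n_at_risk
            curve.append((t, surv))
        n_at_risk -= ties
    return curve


def step_value(curve, t):
    """Evaluate the step function at time t (equal to 1 before the first jump)."""
    value = 1.0
    for jump_time, surv in curve:
        if jump_time > t:
            break
        value = surv
    return value


def theta_hat(s_j1, s_k0, s_j0, s_k1, tau):
    """Integrate S_j^(1) S_k^(0) - S_j^(0) S_k^(1) over [0, tau]."""
    knots = {0.0, tau}
    for curve in (s_j1, s_k0, s_j0, s_k1):
        knots.update(t for t, _ in curve if t < tau)
    grid = sorted(knots)
    total = 0.0
    for left, right in zip(grid, grid[1:]):
        # All four curves are constant on [left, right), so the left value is exact.
        integrand = (step_value(s_j1, left) * step_value(s_k0, left)
                     - step_value(s_j0, left) * step_value(s_k1, left))
        total += integrand * (right - left)
    return total


# Hypothetical toy data (no censoring): event j precedes event k in each arm.
s_j1 = kaplan_meier([2.0, 4.0], [1, 1])  # treatment arm, event j
s_k1 = kaplan_meier([3.0, 5.0], [1, 1])  # treatment arm, event k
s_j0 = kaplan_meier([1.0, 2.0], [1, 1])  # control arm, event j
s_k0 = kaplan_meier([1.0, 3.0], [1, 1])  # control arm, event k

theta = theta_hat(s_j1, s_k0, s_j0, s_k1, tau=4.0)
theta_swapped = theta_hat(s_j0, s_k1, s_j1, s_k0, tau=4.0)  # arms exchanged
print(theta, theta_swapped)
```

Exchanging the two arms flips the sign of the integrand in (9), which gives a quick sanity check on any implementation.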

Let ${\hat{θ}}_{jk}(τ)$ denote the estimator of $θ_{jk}(τ)$ obtained by substituting the Kaplan–Meier estimator ${\hat{S}}_{l}^{(a)}(t)$ for $S_{l}^{(a)}(t)$ $(l = j, k)$ in (9). If we can expand ${\hat{θ}}_{jk}(τ)$ asymptotically in the linear form

$$\begin{aligned} n^{1/2} \{{\hat{θ}}_{jk}(τ) - θ_{jk}(τ)\} ={} & q^{-1/2} n_{1}^{-1/2} \sum_{i=1}^{n_{1}} \int_{0}^{τ} Ψ_{jk}^{(1)}(O_{i}^{(1)})(t) \, dt \\ & - (1 - q)^{-1/2} n_{0}^{-1/2} \sum_{i=1}^{n_{0}} \int_{0}^{τ} Ψ_{jk}^{(0)}(O_{i}^{(0)})(t) \, dt + o_{p}(1), \end{aligned} \tag{10}$$

where $q = \lim_{n \to \infty} n_{1}/n$ and the $Ψ_{jk}^{(a)}(O^{(a)})(t)$ are some mean-zero influence functions (Tsiatis 2006), then by (8) we will have that

$$\begin{aligned} n^{1/2} \{{\hat{μ}}_{jk}(τ) - μ_{jk}(τ)\} ={} & q^{-1/2} n_{1}^{-1/2} \sum_{i=1}^{n_{1}} \int_{0}^{τ} γ_{jk}^{(1)}(O_{i}^{(1)})(t) \, dt \\ & - (1 - q)^{-1/2} n_{0}^{-1/2} \sum_{i=1}^{n_{0}} \int_{0}^{τ} γ_{jk}^{(0)}(O_{i}^{(0)})(t) \, dt + o_{p}(1), \end{aligned} \tag{11}$$

where $γ_{j k}^{(a)} (O^{(a)}) (t) = Ψ_{j + 1, k + 1}^{(a)} (O^{(a)}) (t) - Ψ_{j + 1, k}^{(a)} (O^{(a)}) (t) - Ψ_{j, k + 1}^{(a)} (O^{(a)}) (t) + Ψ_{j k}^{(a)} (O^{(a)}) (t)$ . Let ${\hat{γ}}_{j k}^{(a)} (\cdot) (t)$ denote a nonparametric estimator of $γ_{j k}^{(a)} (\cdot) (t)$ (based on estimators for the $Ψ_{j k}^{(a)} (O^{(a)}) (t)$ below). Then, the asymptotic variance of ${\hat{μ}}_{j k} (τ)$ can be estimated by the empirical second moment

$$\widehat{\mathrm{var}} \{{\hat{μ}}_{jk}(τ)\} = n_{1}^{-2} \sum_{i=1}^{n_{1}} \left\{\int_{0}^{τ} {\hat{γ}}_{jk}^{(1)}(O_{i}^{(1)})(t) \, dt\right\}^{2} + n_{0}^{-2} \sum_{i=1}^{n_{0}} \left\{\int_{0}^{τ} {\hat{γ}}_{jk}^{(0)}(O_{i}^{(0)})(t) \, dt\right\}^{2}. \tag{12}$$

It now remains to derive and estimate $Ψ_{jk}^{(a)}(O^{(a)})(t)$ in (10). By Corollary 3.2.1 of Fleming and Harrington (1991), the Kaplan–Meier estimator can be expanded as

$$n_{a}^{1/2} \{{\hat{S}}_{k}^{(a)}(t) - S_{k}^{(a)}(t)\} = - n_{a}^{-1/2} S_{k}^{(a)}(t) \sum_{i=1}^{n_{a}} ψ_{k}^{(a)}(O_{i}^{(a)})(t) + o_{p}(1),$$

where

$$ψ_{k}^{(a)}(O^{(a)})(t) = \int_{0}^{t} π_{k}^{(a)}(s)^{-1} M_{k}^{(a)}(ds; O^{(a)}),$$

$$π_{k}^{(a)}(s) = \mathrm{pr}(X_{k}^{(a)} \geq s),$$

$$M_{k}^{(a)}(s; O^{(a)}) = I\{X_{k}^{(a)} \leq s, δ_{k}^{(a)} = 1\} - Λ_{k}^{(a)}(X_{k}^{(a)} \land s),$$

and $Λ_{k}^{(a)}(\cdot)$ is the cumulative hazard function for $T_{k}^{(a)}$. We can estimate $ψ_{k}^{(a)}(O^{(a)})(t)$ by replacing $π_{k}^{(a)}(s)$ with its empirical analog and $Λ_{k}^{(a)}(\cdot)$ with the standard Nelson–Aalen estimator. Denote the resulting estimator by ${\hat{ψ}}_{k}^{(a)}(O^{(a)})(t)$. Then, using the delta method on ${\hat{θ}}_{jk}(τ)$ as a functional of the ${\hat{S}}_{j}^{(a)}(t)$ and ${\hat{S}}_{k}^{(a)}(t)$ in (9), we find that

$$Ψ_{jk}^{(a)}(O^{(a)})(t) = - S_{k}^{(1-a)}(t) S_{j}^{(a)}(t) ψ_{j}^{(a)}(O^{(a)})(t) + S_{k}^{(a)}(t) S_{j}^{(1-a)}(t) ψ_{k}^{(a)}(O^{(a)})(t),$$

which can be estimated by substituting ${\hat{S}}_{l}^{(a)}(t)$ for $S_{l}^{(a)}(t)$ and ${\hat{ψ}}_{l}^{(a)}(O^{(a)})(t)$ for $ψ_{l}^{(a)}(O^{(a)})(t)$ $(l = j, k)$.
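The estimated Kaplan–Meier influence function ${\hat{ψ}}_{k}^{(a)}$ can be sketched for a single arm and a single event type as below. The counting-process jump is weighted by the inverse empirical at-risk proportion, and the compensator is accumulated from Nelson–Aalen increments. The function name `km_influence` and the toy data are hypothetical, and this is only one straightforward way to code the formula above.

```python
def km_influence(times, events, t):
    """Return psi_hat(O_i)(t) for every subject i in one arm."""
    n = len(times)
    # Distinct observed event times up to t.
    event_times = sorted({x for x, d in zip(times, events) if d == 1 and x <= t})
    # Number at risk, empirical at-risk proportion, and Nelson-Aalen increment.
    at_risk = {s: sum(1 for x in times if x >= s) for s in event_times}
    pi_hat = {s: at_risk[s] / n for s in event_times}
    d_lambda = {
        s: sum(1 for x, d in zip(times, events) if x == s and d == 1) / at_risk[s]
        for s in event_times
    }
    psi = []
    for x, d in zip(times, events):
        # Jump of size 1 / pi_hat(X_i) if subject i has an observed event by t.
        jump = 1.0 / pi_hat[x] if (d == 1 and x <= t) else 0.0
        # Integral of dLambda_hat(s) / pi_hat(s) over s <= min(X_i, t).
        compensator = sum(d_lambda[s] / pi_hat[s] for s in event_times if s <= x)
        psi.append(jump - compensator)
    return psi


# Hypothetical toy sample: follow-up times and event indicators (1 = event, 0 = censored).
toy_times = [1.0, 2.0, 2.0, 3.0, 5.0]
toy_events = [1, 0, 1, 1, 0]
psi_values = km_influence(toy_times, toy_events, t=4.0)
print(psi_values, sum(psi_values))
```

With these empirical plug-ins, the estimated influence values sum to zero within the sample (up to floating-point error), mirroring the mean-zero property of the true influence function.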

A.2 Construction of the joint tests

The robust variance matrix ${\hat{Σ}}_{\mathrm{sub}}(τ)$ can be constructed using the coordinate-wise influence functions in (11). Specifically, write

$${\hat{γ}}^{(a)}(O^{(a)})(t) = \{{\hat{γ}}_{01}^{(a)}(O^{(a)})(t), {\hat{γ}}_{02}^{(a)}(O^{(a)})(t), {\hat{γ}}_{12}^{(a)}(O^{(a)})(t), \ldots, {\hat{γ}}_{K,K+1}^{(a)}(O^{(a)})(t)\}^{T}.$$

Then by a similar construction to (12), we find that

$${\hat{Σ}}_{\mathrm{sub}}(τ) = n_{1}^{-2} \sum_{i=1}^{n_{1}} \left\{\int_{0}^{τ} {\hat{γ}}^{(1)}(O_{i}^{(1)})(t) \, dt\right\}^{\otimes 2} + n_{0}^{-2} \sum_{i=1}^{n_{0}} \left\{\int_{0}^{τ} {\hat{γ}}^{(0)}(O_{i}^{(0)})(t) \, dt\right\}^{\otimes 2},$$

where $v^{\otimes 2} = v v^{T}$ for any vector $v$. The matrix ${\hat{Σ}}_{\mathrm{main}}(τ)$ can be derived similarly using the coordinate-wise influence functions of ${\hat{μ}}_{\mathrm{main}}(τ)$ given in Proposition 1 of Mao (2023).
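To make the construction concrete, the sketch below assembles a robust variance matrix from per-subject integrated influence vectors by summing outer products within each arm, and then forms a Wald-type quadratic statistic for a two-dimensional effect vector. It assumes that the joint test refers the quadratic form to a chi-square distribution with d.f. equal to the number of components, as the $χ_{2}^{2}$ and $χ_{4}^{2}$ tests in Section 4 suggest; the function names and the toy inputs are hypothetical.

```python
import math


def outer_sum(rows):
    """Sum of v v^T over the vectors v in `rows`."""
    dim = len(rows[0])
    return [[sum(v[a] * v[b] for v in rows) for b in range(dim)] for a in range(dim)]


def robust_covariance(infl_1, infl_0):
    """n1^-2 * sum of v v^T (arm 1) plus n0^-2 * sum of v v^T (arm 0)."""
    n1, n0 = len(infl_1), len(infl_0)
    s1, s0 = outer_sum(infl_1), outer_sum(infl_0)
    dim = len(s1)
    return [[s1[a][b] / n1**2 + s0[a][b] / n0**2 for b in range(dim)]
            for a in range(dim)]


def wald_chi2_2df(estimate, cov):
    """Quadratic form est^T cov^{-1} est and its chi-square (2 d.f.) p-value."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    x, y = estimate
    # Uses the closed-form inverse of a 2 x 2 matrix.
    stat = (d * x * x - (b + c) * x * y + a * y * y) / det
    # The chi-square survival function with 2 d.f. is exp(-stat / 2).
    return stat, math.exp(-stat / 2.0)


# Hypothetical integrated influence vectors, one per subject, two components each.
infl_arm1 = [(1.0, 0.0), (-1.0, 0.0)]
infl_arm0 = [(0.0, 2.0), (0.0, -2.0)]
mu_hat = (1.0, 2.0)  # hypothetical subcomponent estimates

sigma_hat = robust_covariance(infl_arm1, infl_arm0)
stat, p_value = wald_chi2_2df(mu_hat, sigma_hat)
print(sigma_hat, stat, p_value)
```

For more than two components the same quadratic form applies, with a general matrix inverse and the chi-square distribution with the matching d.f.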
