826
Views
1
CrossRef citations to date
0
Altmetric
Research Article

Estimating and testing sequential causal effects based on alternative G-formula: an observational study of the influence of early diagnosis on survival of cardia cancer

ORCID Icon & ORCID Icon
Pages 1917-1931 | Received 13 May 2021, Accepted 27 Mar 2022, Published online: 12 Apr 2022

Abstract

Cancer diagnosis is part of a complex stochastic process, in which patients' personal and social characteristics influence the choice of diagnosing methods, diagnosing methods in turn influence the initial assessment of cancer stage, cancer stage in turn influences the choice of treating methods, and treating methods in turn influence cancer outcomes such as cancer survival. To evaluate the performance of diagnoses, one needs to estimate and test the sequential causal effect (SCE) under a specified regime of diagnoses and treatments in such a complex observational study, where the data-generating mechanism is unknown and modeling is needed for statistical inference. In this article, we introduce a method of statistical modeling to estimate and test SCEs under regimes of treatments (diagnoses and treatments in cancer diagnosis) in complex observational studies. By applying the alternative G-formula, we express the SCE in terms of the point effects of treatments in the sequence, so that the modeling can be conducted via the point effects in the framework of single-point causal inference. We illustrate our method by a medical example of cancer diagnosis with data from a Swedish prognosis study of cardia cancer.

1. Introduction

Cancer diagnosis is part of a complex stochastic process, in which patients' personal and social characteristics influence the choice of diagnosing methods, diagnosing methods in turn influence the initial assessment of cancer stage, cancer stage in turn influences the choice of treating methods, and treating methods in turn influence cancer outcomes such as cancer survival. To evaluate the performance of cancer diagnoses, one needs to estimate and test the sequential causal effect (SCE), which is the causal effect under a specified regime of diagnoses and treatments on a certain outcome of interest. An important example of SCEs is the one under optimal regime.

In sequential causal inference, Robins (Citation1986, Citation1997, Citation2009) derived the well-known G-formula, which identifies the SCE under a specified regime of treatments (diagnoses and treatments in cancer diagnosis) via standard parameters. Based on the G-formula, a parametric method has been developed for estimating the SCE via standard parameters (Hernan and Robins Citation2020; Robins Citation1997, Citation2009; Taubman et al. Citation2009). However, this method may suffer from the curse of dimensionality and the null paradox. See also the improved versions of the parametric method. Still, these versions have the difficulty of specifying the influence of the time-dependent covariates due to the curse of dimensionality and the null paradox (Almirall, Have, and Murphy Citation2010; Henderson, Ansell, and Alshibani Citation2010; Chaffe and van der Laan Citation2012).

To avoid these problems, the literature focuses on semi-parametric methods. One class of methods is the marginal structural model based on the inverse probability of treatment weighting (Hernan and Robins Citation2020; Hernan, Brumback, and Robins Citation2000; Robins, Brumback, and Hernan Citation2000; Robins Citation2009) or the augmented the inverse probability of treatment weighting (Zhang et al. Citation2013). Another class of methods is the G-estimation based on the structural nested mean model (SNMM) or optimal SNMM (Hernan and Robins Citation2020; Robins Citation1997, Citation2004, Citation2009); the G-estimation incorporates the A-learning (for instance, Murphy Citation2003) and Q-learning (for instance, Sutton and Barto Citation1998). See also a summary of the development of these methods in the area of optimal regimes (Kosorok et al. Citation2021). These methods achieve consistent estimates of SCEs. In a recent review, Kosorok and Laber (Citation2019) highlighted the need for methods of evaluating the uncertainty in estimating SCEs in the context of optimal regimes (Sec. 6, statistical inference).

Recently, Wang and Yin (Citation2015, Citation2020) derived an alternative G-formula, which expresses the SCE in terms of the point effects of treatments in the sequence. The point effect is simply the effect of single-point treatment in single-point causal inference (Rosenbaum and Rubin Citation1983). In designed experiments, where data-generating mechanisms are known, they estimated the SCE by estimating point effects of treatments without the need for modeling. In observational studies such as the influence of early cancer diagnosis, where data-generating mechanism is unknown, one needs to conduct modeling to estimate and test the SCE.

In this article, we will introduce a method of statistical modeling to estimate and test the SCE in observational studies and illustrate our method by a medical example using data from a Swedish prognosis study of cardia cancer. In Sec. 2, we describe our motivating example. In Sec. 3, we introduce the general framework of our method. We apply the alternative G-formula to express the SCE in terms of point effects of treatments in the sequence and describe how to conduct modeling to estimate and test the point effects and then the SCE in the framework of single-point causal inference. In Sec. 4, we apply our method to estimate and test the SCE under any regime of diagnosing and treating hospitals on one-year survival in our medical example. In Sec. 5, we compare our method with the three available methods mentioned earlier by applying all these four methods to the data. Finally, in Sec. 6, we conclude the article with discussion.

2. Data and the medical background

In Sweden, patients usually seek medical help at hospitals near their residential areas, namely, catchment areas. When cancer is diagnosed, they may stay at the diagnosing hospital or transfer to another hospital for treatment. The hospital diagnosing cancer is called the diagnosing hospital, while the one treating cancer is called the treating hospital. To evaluate the performance of diagnosing and treating hospitals, we may study cancer outcomes such as survival under various regimes of diagnosing and treating hospitals among cancer patients after adjusting for patients' differences.

The data used in this study is from a prognosis study conducted during a period between 1988 and 1995 in hospitals in central and northern Sweden (Hansson et al. Citation2000). It contained information of 157 patients of cardia cancer. Cardia cancer is highly malignant with bad prognosis, and one-year survival is a good measure of the performance of both diagnosing and treating hospitals. A question of relevance to the public health policy is which types of the diagnosing and treating hospitals, large versus small, perform better on cancer outcomes, where the large type refers to the regional or county hospitals and the small type to local hospitals.

In Sec. 4 below, we will estimate and test the SCE on one-year survival under any regime of diagnosing and treating hospitals and then use the SCEs to evaluate the performance of diagnosing and treating hospitals.

3. General framework of estimating and testing SCEs based on the alternative G-formula

In this section, we will describe the alternative G-formula, which identifies the SCE by the point effects of treatments, and the general framework of our method for estimating and testing SCEs in observational studies.

3.1. Potential variables and the causal effect from a sequence of treatments

Let Dt at time t=1, , T be the treatment variable, which potentially and deterministically assigned treatments zt to each unit of the population. A regime is a sequence of such treatment variables, D1T=(D1,, DT). Prior to D1, there exists a set of stationary covariates X1. Under D1T, each unit could have a set of potential time-dependent covariates Xt(D1t1) between Dt1 and Dt (t=2, , T) and a potential outcome Y(D1T) of interest after the last treatment DT. Noticeably, Dt can be dynamic, namely, the assignment of the treatment depends on the earlier treatments and covariates of each unit.

Let X1T(D1T1)={X1,X2(D1),, XT(D1T1)} be the set of all stationary covariates and potential covariates, and {X1T(D1T1),D1T,Y(D1T)}={X1,D1,X2(D1),D2,,XT(D1T1),DT,Y(D1T)} be the set of all stationary and potential covariates, treatment variables, and the potential outcome. The realization of these variables is {x1T,z1T,y}={x1,z1,x2,z2,,xT,zT,y}. In stratum (x1t,z1t1), we document the regime as DtT=(Dt,, DT), its potential covariates as Xt+1T(DtT1) and potential outcome as Y(DtT).

Let DtT=0tT=(0, 0, , 0)  be the control regime given (x1t, z1t1). The blip effect ϕ(x1t, z1t1; zt) of treatment zt in stratum (x1t, z1t1) is an increase of the mean of potential outcome Y(DtT) under regimes DtT=(zt,0t+1T) versus DtT=0tT in the stratum,  that is ϕ(x1t, z1t1; zt)=E{Y(zt,0t+1T) | x1t, z1t1}E{Y(0tT) | x1t, z1t1}.The blip effect is called the net effect (Wang and Yin Citation2015). Clearly, ϕ(x1t, z1t1; zt=0)=0. The SCE under regime DtT in stratum (x1t, z1t1), SCE(x1t, z1t1; DtT),  is an increase of the mean of potential outcome Y(DtT) under DtT versus Y(0tT) under 0tT in the stratum, namely, SCE(x1t, z1t1; DtT)=E{Y(DtT) | x1t, z1t1}E{Y(0tT) | x1t, z1t1}.Here we see that if DtT=(zt,0t+1T), then SCE(x1t, z1t1; DtT)=ϕ(x1t, z1t1; zt).  Therefore the blip effect is a special type of SCEs. As it will become clear in Sec. 3.4, the blip effects determine the SCE under any regime of treatments.

The SCE under regime D1T in the population, SCE(D1T), is an increase of the mean of potential outcome Y(D1T) under D1T versus Y(01T) under 01T in the population, namely, SCE(D1T)=E{Y(D1T)}E{Y(01T)}

3.2. Observable variables and the point effects of treatments in the sequence

Now we consider observable variables. Let Zt be the treatment variable at t=1, , T; X1  be the set of stationary covariates, Xt be the set of time-dependent covariates between Zt1 and Zt; and Y be the outcome of interest. Let Z1T={Z1,, ZT} be the sequence of treatment variables; X1T={X1,, XT} be the set of all stationary and time-dependent covariates; {X1T,Z1T, Y}={X1,Z1,X2,Z2,, XT,ZT,Y} be the set of all treatments, covariates and outcome.

Suppose that the observable variables {X1T,Z1T, Y} have the same support as the potential ones {X1T(D1T1),D1T,Y(D1T)}. The observed values of these variables are {x1T,z1T,y}={x1,z1,x2,z2,, xT,zT,y}. In the following, we will use P(.) to denote the probability distribution of a discrete variable or the density distribution of a continuous variable. The joint distribution of {X1T,Z1T, Y} is given by (1) P(x1T,z1T,y)=P(x1)Pz1 | x1)PxT | x1T1,z1T1)P(zT | x1T,z1T1P(y  x1T,z1T) (1) The standard parameters for these conditional distributions are simply their conditional means, for instance, the standard parameter for Py | x1T,z1T) is the conditional mean μ(x1T,z1T)=E(Y | x1T,z1T).

Now we describe the point effects of treatments in the sequence, which are also parameters for the outcome (Wang and Yin Citation2015). Let μ(x1t, z1t1, zt)=E(Y | x1t, z1t1, zt) be the conditional mean of observable outcome Y given (x1t, z1t1, zt). Without loss of generality, we take zt=0 as the control treatment. Then the point effect of treatment zt in the stratum (x1t, z1t1)  is the following difference between the means (2) ϑ(x1t, z1t1; zt)=μ(x1t, z1t1, zt)μ(x1t, z1t1, 0)(2) Clearly, ϑ(x1t, z1t1; zt=0)=0. The point effect is simply the point effect of treatment in single-point causal inference and its estimation and hypothesis testing is well studied (for instance, Rosenbaum and Rubin Citation1983).

To identify the SCEs by the observable variables {X1T,Z1T, Y}, Robins (Citation1986, Citation1997, Citation2009) introduced the identifying condition, which consists of the consistency assumption, the positivity assumption for treatments and the assumption of no unmeasured confounders. The identifying condition is satisfied in sequential randomized experiments where treatment zt is randomly assigned according to a history (x1t, z1t1) of the earlier covariates and treatments. It is approximately satisfied in observational studies with a sufficient number of covariates. Throughout, we assume the identifying condition.

Under the identifying condition, Robins (Citation1986, Citation1997, Citation2009) derived the well-known G-formula, which expresses the SCE in terms of the standard parameters μ(x1T,z1T). If xt (t>1) has influence from earlier treatments Z1t1 and influence on subsequent treatments ZtT, then μ(x1T,z1T) are essentially all different for different (x1T,z1T1), even if each Zt has null effect on Y. So any model imposing equalities between μ(x1T,z1T) for different (x1T,z1T1) is misspecified (null paradox). Obviously, the number of μ(z1T,x1T) is huge in case of large T or even uncountable (curse of dimensionality).

Under the same identifying condition, Wang and Yin (Citation2020) derived an alternative G-formula, which expresses the blip effect and SCE in terms of point effects instead of the standard parameters. In the rest of this section, we will first describe the G-formula, then describe our method of conducting modeling to estimate and test blip effects and SCEs based on the alternative G-formula.

3.3. Alternative G-formula to estimate and test the blip effect in observational studies

Intuitively, the point effect ϑ(x1t, z1t1; zt) is the result of treatment zt and the subsequent treatments Zt+1T, and it may decompose into a sum of the blip effects of these treatments. Therefore, we should have the following decomposing formula (3) ϑ(x1t, z1t1; zt)=ϕ(x1t, z1t1; zt)+s=t+1TE{ϕ(x1s, z1s1; zs) | x1t, z1t1,zt}s=t+1TE{ϕ(x1s, z1s1; zs) | x1t, z1t1,zt=0}(3) where the expectations are with respect to the conditional distributions of observable variables P(xt+1s, zt+1s1, zs | x1t, z1t1,zt) and P(xt+1s, zt+1s1, zs | x1t, z1t1,zt=0). Wang and Yin (Citation2020) theoretically proved (3) under the identifying condition. Formula (3) is the alternative G-formula and expresses the point effects in terms of blip effects.

Now we describe our method of conducting modeling to estimate and test blip effects. First, the point effect is simply the point effect of treatment in single-point causal inference, so we can estimate it by modeling the mean μ(x1t, z1t)=EY | x1t,z1t) in the framework of single-point causal inference. As an illustration, if (x1t1, z1t1) influences the outcome Y via xt, then we have μ(x1t, z1t)=EY | xt,zt), denoted by μ(xt,zt), i.e., μ(x1t, z1t)=μ(xt,zt). Thus, we may rewrite (2) as ϑ(x1t, z1t1; zt)=ϑ(xt; zt)=μ(xt,zt)μ(xt,zt=0).Second, Robins (Citation1997, Citation2009) and Hernan and Robins (Citation2020) pointed out that the blip effects follow a certain pattern described by structural nested mean model (SNMM), as an illustration, ϕ(x1t, z1t1; zt)=f(xt,zt; γ), t=1,,Twhere f(.) is a deterministic function of (xt,zt) and is indexed by a parameter vector γ of small dimension. With ϑ(xt;zt) and SNMM, the alternative G-formula becomes the following decomposing formula ϑ(xt; zt)=f(xt,zt; γ)+s=t+1TE{f(xs,zs; γ) | xt,zt}s=t+1TE{f(xs,zs; γ) | xt,zt=0}where the expectations are with respect to the conditional distributions of observable variables P(xs,zs | xt,zt) and P(xs,zs | xt,zt=0).

To estimate and test γ, we treat the above decomposing formula as a regression model, where the response variables are the estimates ϑ̂(xt; zt) and the explanatory variables are the observed proportions P̂(xs,zs | xt,zt), which are obtained from the data without modeling. In the regression, we need the conditional variance of the estimated point effect given all covariates and treatments (x1T, z1T), which is obtained by specifying a constant dispersion parameter for the distribution P(y | x1T,z1T). Noticeably, the conditional covariance between the estimated point effects at different times is negligible, for instance, it is equal to zero for the normal outcome (Wang and Yin Citation2015, Citation2020). The bootstrap method is used to obtain the covariance matrix cov(γ̂) incorporating the variability of (x1T, z1T).

If the estimates for the point effects are consistent and asymptotically normal as typically required in single-point causal inference and if γ is of finite dimension, then the estimate γ̂  is consistent and asymptotically normal. In this case, with γ̂ and cov(γ̂), we conduct the Wald test on γ.

3.4. Alternative G-formula to estimate and test SCEs in observational studies

For the sake of explication, we first describe the alternative G-formula for SCEs in the population. Intuitively, it may be decomposed into a sum of the blip effects of treatments in the regime, as follows. Prior to D1, the covariate distribution in the population is P(x1), so when D1 potentially and deterministically assigned z1 to the population, the contribution of D1 to SCE would be the expectation E{ϕ(x1; z1)} with respect to P(x1), as well-known in single-point causal inference (for instance, Rosenbaum and Rubin Citation1983). Between D1 and D2, the covariate distribution in the population under D1 would be P(x1)Px2 | x1,z1), so the contribution of D2 to SCE would be E{ϕ(x1,z1,x2; z2)} with respect to P(x1)Px2 | x1,z1), and so forth. Therefore, we have the following decomposing formula, (4) SCE(D1T)=s=1TE{ϕ(x1s, z1s1; zs)}, (4) where the expectation is with respect to the distribution of observable covariates x1s: k=1sP(xk | x1k1,z1k1). Similarly, for the SCE in stratum (x1t, z1t1), we have the decomposing formula (5) SCE(x1t, z1t1; DtT)=ϕ(x1t, z1t1; zt)+s=t+1TE {ϕ(x1s, z1s1; zs) | x1t, z1t1,zt}, (5) where the expectation is with respect to the distribution of observable covariates xt+1s: k=t+1sP(xk | x1k1,z1k1). Theoretically, Wang and Yin (Citation2020) proved (4) and (5). The formula (4) or (5) is the alternative G-formula for the SCE and expresses the SCE in terms of the blip effects.

Now we describe our method of conducting modeling to estimate and test SCEs in observational studies. The blip effects ϕ(x1s, z1s1; zs) (s=t, t+1, , T) are already estimated from the previous subsection. Then we only need to model the probabilities P(xk | x1k1,z1k1) by regression in the framework of single-point causal inference. As an illustration of our method, suppose that we obtain SNMM ϕ(x1t, z1t1; zt)=f(xt,zt; γ) and P(xk | x1k1,z1k1)=P(xk | xk1,zk1). By decomposing the SCE into the blip effects or applying (4) and (5), we obtain SCE(D1T)=s=1TE{f(xt,zt; γ)},with respect to k=1sP(xk | xk1,zk1) and SCE(xt; DtT)=f(xt,zt; γ)+s=t+1TE{f(xs,zs; γ) | xt,zt},with respect to k=t+1sP(xk | xk1,zk1). 

With the estimates P̂(xk | xk1,zk1)  and γ̂, we obtain the estimate SCÊ(xt; DtT) under regime DtT in stratum xt and SCÊ(D1T) under regime D1T in the population. The covariance matrix between the estimated SCEs under different regimes is estimated by the bootstrap method. If P̂(xk | xk1,zk1), in addition to γ̂, are consistent and asymptotically normal as typically required in single-point causal inference, so are SCÊ(xt; DtT) and SCÊ(D1T). With the estimates and the covariance matrix, we conduct the Wald test on the SCE under any regime. We may also compare different regimes by testing the differences between their SCEs.

Several advantages of our method of modeling to estimate and test blip effects and SCEs are summarized as follows.

  • When estimating the blip effect and SNMM, there is little risk for the model misspecification due to the following reasons. First, the regression model for the blip effect is the alternative G-formula (the decomposing formula) for blip effects, which is subject to no model misspecification. Second, the point effects as the response variables are estimated by regression in the framework of single-point causal inference. Third, the probability of treatments and covariates as the explanatory variables are estimated by the observed proportion in the data without modeling. Fourth, there is no restriction on the functional form of SNMM, and SNMM is testable in the framework of regression. SNMM improves the efficiency of estimating and testing blip effects. Notably, the modeling does not suffer from the curse of dimensionality and the null paradox as compared to the modeling via standard parameters (Hernan and Robins Citation2020; Robins Citation1997, Citation2009; Taubman et al. Citation2009).

  • The covariate probabilities in the alternative G-formula (the decomposing formula) for SCEs are estimated by regression in the framework of single-point causal inference. With the estimated blip effects/SNMM and the estimated covariates probabilities, we use the alternative G-formula to calculate the estimate of SCE under any regime of treatments. We do not need to specify models for the baselines such as E{Y(0tT)|x1t, z1t1}, which are subject to high risks for model misspecification (Hernan and Robins Citation2020; Robins Citation1997, Citation2004, Citation2009).

4. Estimating and testing SCEs of diagnosing and treating hospitals on cardia cancer survival

4.1. Setting

In the medical example of Sec. 2, the diagnosing hospital is the treatment variable Z1, which takes the value z1=0 for small type and z1=1 for large type. The treating hospital is the treatment variable Z2, which takes z2=0 for small type and z2=1 for large type. The following stationary covariates before Z1 are measured: gender X11, geographic area X12 and age X13. Gender is x11=0 for female and x11=1 for male. Geographic area is categorized into x12=0 for rural and x12=1 for urban. Age takes continuous values x13. The time-dependent covariate between Z1 and Z2 is cancer stage X2 taking the values x2=1, 2, 3, 4. The outcome of interest is Y, which takes y=0 for death and y=1 for survival within one year after diagnosis. In Supporting Material, the data is available together with the code for analyses in this article. The descriptive statistics are given in .

Table 1. Frequencies or means (standard deviations) of covariates and outcome across the diagnosing and treating hospitals for 157 cardia cancer patients.

Due to the long-term social welfare system and relatively uniform culture in Sweden, most of the stationary covariates, such as education and socioeconomic status, have similar distributions across different hospitals and thus do not confound the causal effect. As a common practice in many epidemiologic researches in Sweden, we assume no unmeasured confounders for diagnosing hospitals Z1 at least after conditional on gender x11, residential area x12, and age x13. Similarly, we assume no unmeasured confounders for treating hospitals Z2 conditional on z1 and cancer stage x2 besides x11, x12, and x13. Together with the positivity and consistency assumptions, the assumption of no hidden confounders is the identifying condition (Robins Citation1986, Citation1997, Citation2009). Under the identifying condition, we identify the SCE under any regime of diagnosing and treating hospitals including the optimal regime by our data.

In the rest of this section, we will apply our method described in Sec. 3 to estimate and test the SCE under any regime of diagnosing and treating hospitals and evaluate the diagnosing hospitals.

4.2. Estimating and testing point effects of diagnosing and treating hospitals on one-year survival

In the area of cancer diagnosis, one usually studies one-year, three-year and five-year survivals. One-year survival is often the focus of highly malignant cancers such as cardia cancer.

To estimate the point effect of diagnosing hospital z1, we model the mean μ(x11,x12,x13,z1), which is the probability of one-year survival P(Y=1 | x11,x12,x13,z1) in stratum (x11,x12,x13,z1), where z1 is the variable of interest while all others are possible confounders of z1. Notably, the model does not contain any posttreatment variable after z1, so the null paradox does not necessarily occur, and the model can be unsaturated without biasing the point effect. At the significance level of 0.10, we exclude the geographic area x12 and age x13 and obtain (6) Logit{μ(x11,z1)}=log{μ(x11,z1)1μ(x11,z1)}=β0+β1x11+ϑ1z1. (6)

From this model, we obtain the estimate for μ(x11,z1)=exp(β0+β1x11+ϑ1z1)1+exp(β0+β1x11+ϑ1z1)

and then the point effect of large diagnosing hospital z1=1 ϑ(x11; z1=1)=μ(x11, z1=1)μ(x11,z1=0)ϑ1.Here, the point effects for both x11=0 and x11=1 are approximately equal, i.e., ϑ(x11; z1=1) ϑ1 with the p-value = 0.6.

The estimate ϑ̂1 is presented in . Please note that the point effect is a mixed effect of both z1=1 and z2=1 and thus of little medical interest.

Table 2. Point effects, blip effects and optimal SCEs of diagnosing and treating hospitals on one-year survival of cardia cancer: estimate, p-value and 95 % CI.

To estimate the point effect of treating hospital z2, we model the mean μ(x11,x12,x13,z1,x2,z2), which is the probability of one-year survival P(Y=1 | x11,x12,x13,z1,x2,z2) in stratum (x11,x12,x13,z1,x2,z2), where z2 is the variable of interest while all others are possible confounders of z2. At the significance level of 0.10, we exclude x11,x12 and x13 and obtain (7) Logit {μ(z1,x2,z2)}=log{μ(z1,x2,z2)1μ(z1,x2,z2)}=α0+α1z1+α2x2+ϑ2z2,(7)

From this model, we obtain the estimate for the point effect of large treating hospital z2=1: ϑ(z1,x2;z2=1)=μ(z1,x2,z2=1)μ(z1,x2,z2=0)ϑ2Here, the point effects in all strata (z1,x2) are approximately equal, i.e., ϑ(z1,x2;z2=1)ϑ2 with the p-value 0.2. The estimate ϑ̂2 is presented in together with ϑ̂1.

As it will be clear in the next subsection, we will need the conditional variances of the estimated point effects given all the covariates, the diagnosing and treating hospitals. The conditional variances are obtained by specifying a constant dispersion parameter for the binomial distribution and applying the obtained models (6) and (7). Noticeably, the conditional covariance between the estimated point effects of diagnosing and treating hospitals is approximately zero (Wang and Yin Citation2015, Citation2020).

4.3. Estimating and testing blip effect of diagnosing and treating hospitals on one-year survival

Because neither (6) nor (7) contains geographic area x12 and age x13, we do not include them in the blip effects and SCEs below. Because the blip effects are a special type of SCEs and determine all other SCEs (see the next subsection), we will estimate the blip effects here and SCEs in the next subsection. The blip effect ϕ(x11; z1) of diagnosing hospital z1 in the stratum (x11) is an increase of the mean of potential survival Y(D1,D2) under regimes (D1,D2)=(z1,0) versus to (D1,D2)=(0, 0) in the stratum, namely ϕ(x11; z1)=E{Y(z1, 0) | x11}E{Y(0, 0) | x11}.Clearly, we have ϕ(x11; z1=0)=0. The blip effect ϕ(x11,z1,x2; z2) of treating hospital z2 in the stratum (x11,z1,x2) is an increase of the mean of potential survival Y(D2) under regimes D2=z2 versus D2=0  in the stratum, namely, ϕ(x11,z1,x2; z2)=E{Y(z2) | x11,z1,x2}E{Y(0) | x11,z1,x2}.clearly, ϕ(x11,z1,x2; z2=0)=0. Because the treating hospital Z2 is the last treatment variable, the blip effect ϕ(x11,z1,x2; z2=1)  is equal to the point effect ϑ(z1,x2; z2=1)=ϑ2 obtained in the previous subsection. Furthermore, a question of medical interest is whether the blip effect of z1 depends on gender x11. Motivating by all these together, we suppose that the blip effects satisfy SNMM of the form (8) ϕ(x11=0; z1)=z1γ10,ϕ(x11=1; z1)=z1γ11,ϕ(x11,z1,x2; z2)=z2γ2.(8)

Let γ=(γ10,γ11,γ2) be the vector of all parameters in (9). The parameter γ10 is the blip effect of large diagnosing hospital among women (x11=0)  and γ11 is blip effect of large diagnosing hospital among men (x11=1) . They describe modification of the blip effect of diagnosing hospital by gender. The average of the blip effect ϕ(x11;z1=1) with respect to P(x11) is blip effect of large diagnosing hospital in the population and equal to γ1=γ10P(x11=0) +γ11P(x11=1).

Intuitively, the point effect ϑ(x11; z1=1) results from the blip effects of z1=1 and z2=1. For z1=1, the contribution is γ10 if x11=0 and γ11 if x11=1. For z2=1, the contribution comes from the two strata (x11,z1=1,x2) and (x11, z1=0,x2), and it is equal to γ2C(x11), where C(x11)=x2[P{(x2, z2=1) | x11, z1=1}P{(x2,z2=1) | x11, z1=0}.Thus, we have

ϑ(x11=0;z1=1)=γ10+γ2C(x11=0)  and ϑ(x11=1;z1=1)=γ11+γ2C(x11=1). The C(x11) is estimated by the observed proportion in the data without modeling. For the last treatment variable Z2, the point effect of z2=1 is equal to the blip effect of z2=1, namely, γ2=ϕ(x11,z1,x2;z2=1)=ϑ(z1,x2;z2=1)=ϑ2.Taking these together, we have the following decomposing formula under SNMM (8) (9) ϑ(x11=0; z1=1)=γ10+γ2C(x11=0), ϑ(x11=1; z1=1)=γ11+γ2C(x11=1), ϑ2=γ2.(9)

The formula (9) is the alternative G-formula for the blip effect under SNMM (8) and it can also be derived by applying the formula (3) and SNMM (8) as follows. By applying (3) to t=1, T=2  we obtain ϑ(x11; z1=1)=ϕ(x11; z1=1)+E{ϕ(x11,z1,x2; z2) | x11, z1=1}E{ϕ(x11,z1,x2; z2) |x11, z1=0}Applying (8), we obtain for x11=0 ϑ(x11=0; z1=1)=γ10+γ2 x2[P{(x2, z2=1) | x11=0, z1=1}P{(x2,z2=1)| x11=0, z1=0}]=γ10+γ2C(x11=0)Thus, we have ϑ(x11=0; z1=1)=γ10+γ2C(x11=0). Similarly we have ϑ(x11=1;z1=1)=γ11+γ2C(x11=1). By applying (3) to t= T=2  we obtain ϑ2=ϑ(z1,x2; z2=1)=ϕ(x11,z1,x2; z2=1)=γ2. That is ϑ2=γ2.

Because ϑ(x11; z1=1)=ϑ1 for both x11=0 and 1 as seen from model (6), we have ϑ̂(x11; z1)=1)=ϑ̂1, but the variance var{ϑ̂(x11; z1)=1)} is obtained by adjusting var(ϑ̂1) to the size of stratum x11. All probabilities in C(x11) are estimated by the corresponding proportions in the data without modeling. The ϑ̂2 is estimated from model (7).

Now, conditional on all covariates and diagnosing and treating hospitals, we use (9) as a regression model to estimate γ=(γ10, γ11, γ2), where the response variables are ϑ̂(x11=0; z1=1), ϑ̂(x11=1; z1=1), and ϑ̂2 while the explanatory variables are the estimates Ĉ(x11=0), Ĉ(x11=1) and the ones. The bootstrap method is used to obtain the covariance matrix cov(γ̂) incorporating the variability of all covariates and diagnosing and treating hospitals. With γ̂ and cov(γ̂), we conduct the Wald test on γ. The result is presented in .

4.4. Estimate and testing SCE under any regime of diagnosing and treating hospitals on one-year survival

First, we consider SCE(x11,z1,x2; D2). It is an increase of the mean of potential survival under regimes D2 versus 0 in stratum (x11,z1,x2), namely SCE(x11,z1,x2; D2)=E{Y(D2) | x11,z1,x2}E{Y(0) | x11,z1,x2}where D2 potentially and deterministically assigned treating hospital z2 to each patient. Noticeably, D2 can be dynamic, for instance, D2=1 for patients younger than 55 years and D2=0 otherwise. By applying (5) to t= T=2,  we see that SCE(x11,z1,x2; D2)= ϕ(x11,z1,x2; z2). Thus, by applying ϕ(x11,z1,x2; z2) in SNMM (8), we obtain following decomposing formula for the SCE (10) SCE(x11,z1,x2; D2)=γ2z2(10)

Second, we consider SCE(x11; D1,D2), which is an increase of the mean of potential survival under regime (D1,D2) versus (0, 0) in stratum (x11), namely SCE(x11; D1,D2)=E{Y(D1, D2) | x11}E{Y(0, 0) | x11}where (D1,D2) potentially and deterministically assigned (z1, z2) of diagnosing and treating hospitals and it can be dynamic. If regime (D1,D2)=(z1, 0), then SCE(x11; D1,D2)=ϕ(x11; z1), which implies that the blip effect is a special type of SCEs. Intuitively the SCE should be a result of the blip effects of z1=1 and z2=1 in the regime. The contribution of z1 is γ10z1 if x11=0 and γ11z1 if x11=1. The contribution of z2 is γ2z2. Therefore, we have the following decomposing formula (11) SCE(x11=0; D1,D2)=γ10z1+γ2z2,  SCE(x11=1; D1,D2)=γ11z1+γ2z2. (11)

The formulas (10) and (11) implies that all SCEs are determined by blip effects. They are the alternative G-formula for the SCE under SNMM (8) and can also be derived by applying the formula (5) and the SNMM (8) as follows. By applying (5) to t=1, T=2,  we obtain SCE(x11; D1,D2)=ϕ(x11; z1)+ E{ϕ(x11,z1,x2; z2)x11,z1}Now by applying (8), we obtain(11).

Taking the average of (11) with respect to P(x11), we obtain the alternative G-formula for SCE(D1,D2), which is an increase of the mean of the potential survival under regimes (D1,D2) versus (0, 0) in the population.

In (10) replacing γ=(γ10, γ11, γ2) by estimates γ̂ obtained from (9), we obtain the estimate for SCE(x11,z1,x2; D2) under any treating hospital D2. In (11) replacing γ=(γ10, γ11, γ2) by estimates γ̂ obtained from (9), we obtain the estimate for SCE(x11; D1,D2) under any regime (D1,D2). Averaging SCÊ(x11; D1,D2) with respect to proportion P̂(x11), we obtain the estimate for SCE(D1,D2). The variance of the estimated SCE is obtained by the bootstrap method. With the estimate and its variance, we obtain the confidence interval of the SCE. The result is presented in .

4.5. Estimating and testing optimal regime of diagnosing and treating hospitals on one-year survival

By the dynamic programming procedure (Bellman, Citation1957), we estimate and test optimal regime (O1,O2) of diagnosing and treating hospitals by estimating and testing SCEs, as follows. We first estimate the optimal treating hospital O2 by Ô2=arg maxD2{0,1}SCÊ(x11,z1,x2; D2),which is the treating hospital such that SCÊ(x11,z1,x2; D2)=γ̂2z2 achieves the maximum value among all possible values (0 or 1) that D2 assigns for given (x11,z1,x2). Because γ̂2=0.194>0 (see ), we have Ô2=1 (the large treating hospital) and SCÊ(x11,z1,x2;Ô2)=γ̂2 for all given (x11,z1,x2). We test Ô2 by testing γ2 and obtain the 95% CI and the p-value. Now we estimate the optimal diagnosing hospital O1 by Ô1=arg maxD1{0,1}SCÊ(x11; D1,Ô2).

Because γ̂10=0.158 and γ̂11=0.109 we have SCÊ(x11=0; D1,Ô2)=γ̂10z1+γ̂2=0.158z1+0.194, SCÊ(x11=1; D1,Ô2)=γ̂11z1+γ̂2=0.109z1+0.194.

We see that SCÊ(x11; D1,Ô2) take maximum value when z1=0 for both x11=0 and x11=1. Therefore Ô1=0 (small diagnosing hospital) and SCÊ(x11; Ô1,Ô2)=0.194  for any given gender x11. Thus, the estimates for Γ10=SCE(x11=0; Ô1,Ô2), Γ11=SCE(x11=1; Ô1,Ô2), and Γ1=SCE(Ô1,Ô2)are all equal to 0.194. We test (Ô1,Ô2) by testing SCE(x11; Ô1,Ô2) and obtain the 95% CI and the p-value for SCE(x11; Ô1,Ô2). The result is presented in .

4.6. Causal analysis of diagnosing and treating hospitals based on

For the treating hospital, the point effect ϑ2 or the blip effect γ2 of large treating hospital is estimated at 0.194 with the p-value equal to 0.009, implying that the large treating hospital is superior to one-year survival. There is no effect modification by the cancer stage, gender and age. This observation is consistent with the medical knowledge. Cardia cancer is highly malicious regardless of age and gender. It progressed quickly regardless of cancer stage. Patients typically have comorbidities and large treating hospitals are probably better in dealing with comorbidities.

For the diagnosing hospital, the point effect ϑ1 is estimated at 0.020 with the p-value 0.799, suggesting large and small diagnosing hospitals perform equally well. But it is misleading, because the point effect is a mixed effect of diagnosing and treating hospitals. The blip effect γ1 is estimated at 0.121 with the p-value 0.142, implying that the small diagnosing hospital performs better for one-year survival. This observation indicates that the early diagnosis of such a malicious cancer as cardia cancer depends on the short waiting queue and attention typically at small diagnosing hospitals. The blip effect γ10 for women is estimated at 0.158 with the p-value equal to 0.090, and the blip effect γ11 for men at 0.109 with the p-value equal to 0.179, suggesting some modification of the blip effect of diagnosing hospital by gender.

The optimal regime is estimated as the small diagnosing hospital and large treating hospital. The optimal SCE is estimated at 0.194 with the p-value 0.009. This reveals the potential improvement for one-year survival by the optimal regime.

Here, all effects are measured by the difference in mean survival, but they can also be measured by the difference in a function of mean survival. We may apply the alternative G-formula (namely, the decomposing formula) to estimate the causal effect measured by odds ratio, rate ratio and hazard ratio. Though the estimates are consistent, the problem of non-collapsibility of a non-linear measure becomes far worse in the context of sequential treatments, leading to a biased estimate for finite sample. Therefore, it is recommended that one uses the linear measure for the SCE.

5. Comparison of our method with available methods

Method (i) is our method described in Sec. 4. Method (ii) is based on the well-known G-formula (Hernan and Robins Citation2020; Robins Citation1997, Citation2009; Taubman et al. Citation2009). Method (iii) is the marginal structural model based on inverse probability of treatment weighting (Hernan and Robins Citation2020; Hernan, Brumback, and Robins Citation2000; Robins, Brumback, and Hernan Citation2000; Robins Citation2009). Method (iv) is the G-estimation based on SNMM or optimal SNMM (Hernan and Robins Citation2020; Robins Citation1997, Citation2004, Citation2009). Method (i) and (ii) are parametric whereas methods (iii) and (iv) are semi-parametric. Methods (ii)-(iv) are also described in the context of this article in Supplemental Material.

Because methods (i)-(iv) may lead to the same inference of causal effect of treating hospital, we only present the causal effect of diagnosing hospital. In Sec. 4, we have applied method (i) to estimate the blip effect γ10 of large diagnosing hospital among women; γ11 of large diagnosing hospital among men; γ1  of large diagnosing hospital in the population; optimal SCE (Γ1) under optimal regime of diagnosing and treating hospitals versus small ones in the population. Here, we additionally estimate the optimal blip effect of large diagnosing hospital: Υ1=SCE(D1=1,Ô2)SCE(D1=0,Ô2), because method (iv) does not estimate Γ1. The result from these methods is presented in . From this table, we have the following observations.

Table 3. Comparison of our method with available methods in Sec. 5: estimate, p-value and 95 % confidence interval (95% CI) for causal effects of diagnosing and treating hospitals on one-year survival of cardia cancer.

Method (i) yields the estimates for all five causal effects: γ̂10=0.158,γ̂11=0.109,γ̂1=0.121,Υ̂1=0.121, and Γ̂1=0.194.Here we only need to estimate a total of two parameters: ϑ1 for the point effect of z1 and ϑ2 for the point effect of z2, to estimate the blip effect and the SCE under any regime, as described in Sec. 4. The estimates are in consistency with the medical knowledge. Because SCEs are all estimated from the same estimated SNMM, we can compare different regimes by testing their SCEs, for instance, we compare (Ô1,Ô2)  with (0, 0) by testing Γ1.

Method (ii) only yields the estimates for three causal effects: γ̂1=0.066,Υ̂1=0.101, and Γ̂1=0.200,because of imbalanced and sparse data between exposure sequence (z1,z2) and gender x11. Here one needs to estimate a total of 2×2×4×2=32 standard parameters μ(x11,z1,x2,z2) to estimates these causal effects (curse of dimensionality), which leads to wide confidence intervals. The estimate Γ̂1=0.200 is biased, because the optimal SCE should be larger than or equal to zero. The bias may be due to the null paradox (Hernan and Robins Citation2020; Robins Citation1997, Citation2009; Taubman et al. Citation2009). For a long sequence of cancer diagnoses and treatments, the problems with the null paradox and the curse of dimensionality in method (ii) become far worse. In this case, method (i) may provide a useful tool for estimating and testing the causal effect of cancer diagnosis.

Method (iii) also yields the estimates for only three causal effects: γ̂1=0.463,Υ̂1=0.167, and Γ̂1=0.249.

The estimate γ̂1=0.463 is not medically sensible. Also notice the wide confidence intervals. The reason is due to imbalanced and sparse data between exposure sequence (z1,z2) and gender x11 (Hernan and Robins Citation2020; Hernan, Brumback, and Robins Citation2000; Robins, Brumback, and Hernan Citation2000; Robins Citation2009).

Method (iv) yields estimates for four causal estimates: γ̂10=0.241,γ̂11=0.092,γ̂1=0.127, and Υ̂1=0.127.The estimates γ̂1=0.127 and Υ̂1=0.127  are rather close to those obtained from method (i), albeit with lower significance. However, it does not estimate Γ1. Generally, method (iv) does not facilitate comparison between different regimes.

6. Conclusion

In many practices, a single-point treatment can hardly achieve the desirable result. More often, a sequence of treatments are assigned or implemented, where a new situation often arises from the early treatments and also influences the assignment of subsequent treatments. A typical example is the stochastic process of cancer diagnosis. In such circumstances, it is highly difficult to design and conduct sequential randomized trials in comparison to randomized trials for single-point treatments. Consequently, data arising from a sequence of treatments are often observational, and this requires statistical modeling when estimating and testing SCEs under specified regimes of treatments.

In this article, we introduce a parametric method of estimating and testing SCEs in the observational study settings by modeling based on the alternative G-formula. Our method is implemented in three steps: first to estimate and test the point effect, second to use the estimated point effect to estimate and test the blip effect, and finally to use the estimated blip effects to estimate and test SCEs. Each of these steps is intuitive and can be carried out using regressions familiar to epidemiologists in single-point causal inference.

With its simple setting, we can readily extend the example in this article to other observational studies. As to cancer diagnosis, we can similarly study various cancer types, different cancer outcomes such as cancer progression, quality of life and others, different diagnosing techniques such as the biomarkers or a sequence of screening steps, and different modification factors such as social-economic status, comorbidity, and family history.

Ethics approval

The proposed research is covered by the ethical committee approval (DNR880113/13, §121) from the ethical review board of Uppsala University.

This work was supported by Vetenskapsrådet, Sweden (2019-02913).

Supplemental material

Supplemental Material

Download MS Word (38.5 KB)

Supplemental material

  1. Data and code available in [Zenodo], at https://doi.org/10.5281/zenodo.6367502

  2. A description of available methods (ii), (iii) and (iv) in the context of the medical example.

Additional information

Funding

This work is partially supported by the Swedish Research Council with the grant number 2019-02913.

References

  • Almirall, D., T. T. Have, and S. A. Murphy. 2010. Structural nested mean models for assessing time-varying effect moderation. Biometrics 66 (1):131–9. doi:10.1111/j.1541-0420.2009.01238.x.
  • Bellman, R. 1957. Dynamic programming. Princeton, NJ: Princeton University Press.
  • Chaffe, P. H, and M. J. van der Laan. 2012. Targeted maximum likelihood estimation for dynamic treatment regimes in sequentially randomized controlled trials. International Journal of Biostatistics 8:14.
  • Hansson, L. E., A. M. Ekström, R. Bergström, and O. Nyrén. 2000. Surgery for stomach cancer in a defined Swedish population: Current practices and operative results. The European Journal of Surgery = Acta Chirurgica 166 (10):787–975. doi:10.1080/110241500447425.
  • Henderson, R., P. Ansell, and D. Alshibani. 2010. Regret-regression for optimal dynamic treatment regimes. Biometrics 66 (4):1192–201. doi:10.1111/j.1541-0420.2009.01368.x.
  • Hernan, M. A., B. Brumback, and J. M. Robins. 2000. Marginal structural models to estimate the causal effect of zidovudine on the survival of HIV-positive men. Epidemiology 11:561–70.
  • Hernan, M. A, and J. M. Robins. 2020. Causal inference: What if. Boca Raton, FL: Chapman & Hall/CRC.
  • Kosorok, M. R, and E. B. Laber. 2019. Precision medicine. Annual Review of Statistics and Its Application 6:263–86. doi:10.1146/annurev-statistics-030718-105251.
  • Kosorok, M. R., Laber, E. B. Small, D. S. Donglin, and Zeng, D. 2021. Introduction to the theory and methods special issue on precision medicine and individualized policy discovery. Journal of the American Statistical Association 116 (533):159–61. doi:10.1080/01621459.2020.1863224.
  • Murphy, S. 2003. Optimal dynamic treatment regimes. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 65 (2):331–66. doi:10.1111/1467-9868.00389.
  • Robins, J. M. 1986. A new approach to causal inference in mortality studies with sustained exposure periods – Application to control of the healthy worker survival effect. Mathematical Modelling 7 (9–12):1393–512. doi:10.1016/0270-0255(86)90088-6.
  • Robins, J. M. 1997. Causal inference from complex longitudinal data. In Latent variable modeling and applications to causality, Lecture Notes in Statistics 120, ed. M. Berkane, 69–117. New York: Springer-Verlag.
  • Robins, J. M. 2004. Optimal structural nested models for optimal sequential decisions. In Proceedings of the second seattle symposium in biostatistics, analysis of correlated data, eds, D. Y. Lin, and P. J. Heagerty, 189–326. New York: Springer-Verlag.
  • Robins, J. M. 2009. Longitudinal data analysis. In Handbooks of modern statistical methods, ed. G. Fitzmaurice, 553–99. Boca Raton, FL: Chapman & Hall/CRC.
  • Robins, J. M., B. Brumback, and M. A. Hernan. 2000. Marginal structural models and causal inference in epidemiology. Epidemiology (Cambridge, Mass.) 11 (5):550–60. doi:10.1097/00001648-200009000-00011.
  • Rosenbaum, P. R, and D. B. Rubin. 1983. The central role of the propensity score in observational studies for causal effects. Biometrika 70 (1):41–55. doi:10.1093/biomet/70.1.41.
  • Sutton, R, and A. Barto. 1998. Reinforcement learning: An introduction. MIT Press, Cambridge.
  • Taubman, S. L., J. M. Robins, M. A. Mittleman, and M. A. Hernán. 2009. Intervening on risk factors for coronary heart disease: An application of the parametric G-formula. International Journal of Epidemiology 38 (6):1599–611. doi:10.1093/ije/dyp192.
  • Wang, X, and L. Yin. 2015. Identifying and estimating net effects of treatments in sequential causal inference. Electronic Journal of Statistics 9 (1):1608–43. doi:10.1214/15-EJS1046.
  • Wang, X, and L. Yin. 2020. New G-formula for the sequential causal effect and blip effect of treatment in sequential causal inference. Annals of Statistics 48:138–60.
  • Zhang, B., A. A. Tsiatis, E. B. Laber, and M. Davidian. 2013. Robust estimation of optimal dynamic treatment regimes for sequential treatment decisions. Biometrika 100 (3):681–94. doi:10.1093/biomet/ast014.