Robust variance estimation for covariate-adjusted unconditional treatment effect in randomized clinical trials with binary outcomes


Abstract

To improve the precision of estimation and the power of hypothesis tests for an unconditional treatment effect in randomized clinical trials with binary outcomes, researchers and regulatory agencies recommend using g-computation as a reliable method of covariate adjustment. However, the practical application of g-computation is hindered by the lack of an explicit robust variance formula that can be used for different unconditional treatment effects of interest. To fill this gap, we provide explicit and robust variance estimators for g-computation estimators and demonstrate through simulations that the variance estimators can be reliably applied in practice.

1. Introduction

In randomized clinical trials, adjusting for baseline covariates has been advocated as a way to improve the precision of estimation and the power of tests for treatment effects (Freedman, 2008; Lin, 2013; Tsiatis et al., 2008; Yang & Tsiatis, 2001; Ye et al., 2022, 2023). We focus on binary outcomes in this article. When a logistic model is used as a working model for baseline covariate adjustment, g-computation (Freedman, 2008; Moore & van der Laan, 2009) provides asymptotically normal estimators of unconditional treatment effects such as the risk difference, relative risk and odds ratio, regardless of whether the logistic model is correct. In May 2021, the US Food and Drug Administration released a draft guidance (FDA, 2021) on the use of covariates in the analysis of randomized clinical trials and recommended g-computation as a ‘statistically reliable method of covariate adjustment for an unconditional treatment effect with binary outcomes’.

However, to the best of our knowledge, no explicit robust variance estimation formula for g-computation is currently available that can be used for inference on different unconditional treatment effects of interest. Moreover, some existing variance estimation formulas in the literature, such as the formula in Ge et al. (2011) for the risk difference with two treatment arms, are model-based and do not fit the model-robust inference paradigm. Additionally, the formula in Ge et al. (2011) does not take into account a source of variability due to the covariates and the nonlinearity of the logistic model, which can lead to confidence intervals with insufficient coverage probabilities.

The purpose of this article is to fill this gap by providing explicit and robust variance estimators for g-computation estimators. Our simulations demonstrate that the provided variance estimators can be reliably applied in practice.

2. Robust variance estimation

Consider a k-arm trial with n subjects. For each subject i, let $A_{i}$ be the k-dimensional treatment indicator vector that equals $a_{t}$ if subject i receives treatment t for $t = 1, \dots, k$, where $a_{t}$ denotes the k-dimensional vector whose tth component is 1 and whose other components are 0, let $Y_{i}^{(t)}$ be the binary potential outcome under treatment t, and let $X_{i}$ be the baseline covariate vector used for adjustment. The observed outcome is $Y_{i} = Y_{i}^{(t)}$ if and only if $A_{i} = a_{t}$. We consider simple randomization, where $A_{i}$ is completely random with known $π_{t} = P (A_{i} = a_{t})$, $π_{t} > 0$ and $\sum_{t = 1}^{k} π_{t} = 1$. We assume that $(Y_{i}^{(1)}, \dots, Y_{i}^{(k)}, A_{i}, X_{i}), i = 1, \dots, n$, are independent and identically distributed with finite second-order moments. To simplify the notation, we drop the subscript i when referring to a generic subject from the population. Write the unconditional response means as $θ_{t} = E (Y^{(t)})$ and $θ = (θ_{1}, \dots, θ_{k})^{⊤}$, where the superscript ⊤ denotes the transpose of a vector throughout. The target parameter is a given contrast of the unconditional response mean vector $θ$, denoted by $f (θ)$, such as the risk difference $θ_{t} - θ_{s}$, risk ratio $θ_{t} / θ_{s}$, and odds ratio $\frac{θ_{t} / (1 - θ_{t})}{θ_{s} / (1 - θ_{s})}$ between two treatment arms t and s.
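As a small illustration of the target parameters, the following Python sketch evaluates the three contrasts for a pair of unconditional response means. The function names and the numerical values of the response means are hypothetical and chosen only so that the arithmetic is easy to follow.

```python
def risk_difference(theta_t, theta_s):
    """Risk difference theta_t - theta_s."""
    return theta_t - theta_s


def risk_ratio(theta_t, theta_s):
    """Risk ratio theta_t / theta_s."""
    return theta_t / theta_s


def odds_ratio(theta_t, theta_s):
    """Odds ratio {theta_t / (1 - theta_t)} / {theta_s / (1 - theta_s)}."""
    return (theta_t / (1.0 - theta_t)) / (theta_s / (1.0 - theta_s))


# Hypothetical response means for arms s = 1 and t = 2 (not from the article).
theta_1, theta_2 = 0.30, 0.60

rd = risk_difference(theta_2, theta_1)
rr = risk_ratio(theta_2, theta_1)
or_ = odds_ratio(theta_2, theta_1)
print(rd, rr, or_)
```

All three quantities are functions of the same pair of marginal means, which is why a single joint distribution for the estimated means is enough to carry out inference on any of them.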

Throughout this article, we consider the g-computation procedure that fits a working logistic model $E (Y ∣ A, X) = \operatorname{expit} (β_{A}^{⊤} A + β_{X}^{⊤} X)$, where $\operatorname{expit} (x) = \exp (x) / \{1 + \exp (x)\}$, and $β_{A}$ and $β_{X}$ are unknown parameter vectors (FDA, 2021). The logistic model does not need to be correct and is only used as an intermediate step to obtain the g-computation estimators. Let ${\hat{β}}_{A}$ and ${\hat{β}}_{X}$ be the maximum likelihood estimators of $β_{A}$ and $β_{X}$, respectively, under the working logistic model. Then, ${\hat{μ}}_{t} (X_{i}) = \operatorname{expit} ({\hat{β}}_{A}^{⊤} a_{t} + {\hat{β}}_{X}^{⊤} X_{i})$ is the predicted probability of response under treatment t for subject i. The g-computation estimator of $θ$ is $\hat{θ} = ({\hat{θ}}_{1}, \dots, {\hat{θ}}_{k})^{⊤}$ with ${\hat{θ}}_{t} = n^{- 1} \sum_{i = 1}^{n} {\hat{μ}}_{t} (X_{i})$, and that of a given contrast $f (θ)$ is $f (\hat{θ})$. Hence, g-computation takes a summary-then-contrast approach (ICH E9 (R1), 2019).
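The g-computation steps described above (fit the working logistic model, predict every subject's response probability under each arm, average the predictions) can be sketched in plain Python as follows. This is only an illustrative sketch under simplifying assumptions that are not in the article: a single covariate, arms coded 0, ..., k-1, a hand-written Newton-Raphson fit, and a small simulated dataset whose coefficients are hypothetical. It is not the RobinCar implementation.

```python
import math
import random


def expit(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def solve(M, b):
    """Solve M z = b by Gaussian elimination with partial pivoting."""
    p = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(M)]
    for c in range(p):
        piv = max(range(c, p), key=lambda r: abs(aug[r][c]))
        aug[c], aug[piv] = aug[piv], aug[c]
        for r in range(c + 1, p):
            f = aug[r][c] / aug[c][c]
            for j in range(c, p + 1):
                aug[r][j] -= f * aug[c][j]
    z = [0.0] * p
    for r in reversed(range(p)):
        s = sum(aug[r][j] * z[j] for j in range(r + 1, p))
        z[r] = (aug[r][p] - s) / aug[r][r]
    return z


def fit_logistic(rows, y, iters=25):
    """Maximum likelihood fit of a logistic model by Newton-Raphson."""
    p = len(rows[0])
    beta = [0.0] * p
    for _ in range(iters):
        grad = [0.0] * p
        hess = [[0.0] * p for _ in range(p)]
        for d, yi in zip(rows, y):
            m = expit(sum(b * v for b, v in zip(beta, d)))
            w = m * (1.0 - m)
            for j in range(p):
                grad[j] += d[j] * (yi - m)
                for l in range(p):
                    hess[j][l] += w * d[j] * d[l]
        beta = [b + s for b, s in zip(beta, solve(hess, grad))]
    return beta


def g_computation(arm, x, y, k):
    """Return (theta_hat, mu_hat); mu_hat[i][t] is subject i's prediction under arm t."""
    # Design row: k arm indicators (no separate intercept) followed by the covariate.
    rows = [[1.0 if a == t else 0.0 for t in range(k)] + [xi] for a, xi in zip(arm, x)]
    beta = fit_logistic(rows, y)
    mu_hat = [[expit(beta[t] + beta[k] * xi) for t in range(k)] for xi in x]
    theta_hat = [sum(m[t] for m in mu_hat) / len(y) for t in range(k)]
    return theta_hat, mu_hat


# Hypothetical two-arm data, used only to exercise the functions.
random.seed(1)
n, k = 400, 2
arm = [random.randrange(k) for _ in range(n)]
x = [random.gauss(0.0, 1.0) for _ in range(n)]
y = [1 if random.random() < expit(-0.5 + 1.0 * a + 0.8 * xi) else 0 for a, xi in zip(arm, x)]
theta_hat, mu_hat = g_computation(arm, x, y, k)
print(theta_hat)
```

Note that the predictions under arm t are averaged over all n subjects, not only over those randomized to arm t, which is what makes the resulting estimator target the unconditional response mean.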

Next, we derive the asymptotic distribution of the g-computation estimator $\hat{θ}$ and apply the delta method to obtain the asymptotic distribution of the g-computation estimator $f (\hat{θ})$. As logistic regression uses a canonical link, the first-order conditions of maximum likelihood estimation ensure that, for $t = 1, \dots, k$, $\sum_{i = 1}^{n} I (A_{i} = a_{t}) \{Y_{i}^{(t)} - {\hat{μ}}_{t} (X_{i})\} = 0,$ where $I (A_{i} = a_{t})$ is the indicator of $A_{i} = a_{t}$. Hence, the g-computation estimator is equal to ${\hat{θ}}_{t} = \frac{1}{n} \sum_{i = 1}^{n} {\hat{μ}}_{t} (X_{i}) = \frac{1}{n} \sum_{i = 1}^{n} \left[ \frac{I (A_{i} = a_{t})}{{\hat{π}}_{t}} \{Y_{i}^{(t)} - {\hat{μ}}_{t} (X_{i})\} + {\hat{μ}}_{t} (X_{i}) \right],$ where ${\hat{π}}_{t} = n_{t} / n$ and $n_{t}$ is the number of subjects assigned to treatment t. Since the $A_{i}$'s are assigned completely at random, ${\hat{π}}_{t}$ and ${\hat{μ}}_{t} (x)$ converge to $π_{t}$ and $μ_{t} (x)$ at the $n^{- 1 / 2}$ rate, respectively, where $x$ is a fixed point and $μ_{t} (x)$ is a function that is not necessarily equal to $E (Y^{(t)} ∣ X = x)$ under model misspecification but satisfies $E \{Y_{i}^{(t)} - μ_{t} (X_{i})\} = 0$ because of the above first-order conditions, $t = 1, \dots, k$. Then, by Kennedy (2016) and Chernozhukov et al. (2017), ${\hat{θ}}_{t} = \frac{1}{n} \sum_{i = 1}^{n} \left[ \frac{I (A_{i} = a_{t})}{π_{t}} \{Y_{i}^{(t)} - μ_{t} (X_{i})\} + μ_{t} (X_{i}) \right] + o_{p} (n^{- 1 / 2}),$ where $o_{p} (n^{- 1 / 2})$ denotes a remainder term that, when multiplied by $n^{1 / 2}$, converges to 0 in probability.
Therefore, an application of the central limit theorem shows that, regardless of whether the working model is correct, $\sqrt{n} (\hat{θ} - θ) \xrightarrow{d} N (0, V), \quad V = \begin{pmatrix} v_{11} & v_{12} & \dots & v_{1 k} \\ \vdots & \vdots & \ddots & \vdots \\ v_{1 k} & v_{2 k} & \dots & v_{k k} \end{pmatrix},$ where $\xrightarrow{d}$ denotes convergence in distribution, $0$ is the k-dimensional vector of zeros, and $\begin{aligned} v_{t t} & = π_{t}^{- 1} \operatorname{var} \{Y^{(t)} - μ_{t} (X)\} + 2 \operatorname{cov} \{Y^{(t)}, μ_{t} (X)\} - \operatorname{var} \{μ_{t} (X)\}, \quad t = 1, \dots, k, \\ v_{t s} & = \operatorname{cov} \{Y^{(t)}, μ_{s} (X)\} + \operatorname{cov} \{Y^{(s)}, μ_{t} (X)\} - \operatorname{cov} \{μ_{t} (X), μ_{s} (X)\}, \quad 1 \leq t < s \leq k . \end{aligned}$ By the delta method, when $f (θ)$ is differentiable at $θ$ with partial derivative vector $\nabla f (θ)$, we have $\sqrt{n} \{f (\hat{θ}) - f (θ)\} \xrightarrow{d} N (0, \{\nabla f (θ)\}^{⊤} V \{\nabla f (θ)\}) .$ Some examples are: $\{\nabla f (θ)\}^{⊤} V \{\nabla f (θ)\} = \begin{cases} v_{t t} - 2 v_{t s} + v_{s s}, & f (θ) = θ_{t} - θ_{s}, \\ \frac{v_{t t}}{θ_{t}^{2}} - \frac{2 v_{t s}}{θ_{t} θ_{s}} + \frac{v_{s s}}{θ_{s}^{2}}, & f (θ) = \log \frac{θ_{t}}{θ_{s}}, \\ \frac{v_{t t}}{θ_{t}^{2} (1 - θ_{t})^{2}} - \frac{2 v_{t s}}{θ_{t} (1 - θ_{t}) θ_{s} (1 - θ_{s})} + \frac{v_{s s}}{θ_{s}^{2} (1 - θ_{s})^{2}}, & f (θ) = \log \frac{θ_{t} / (1 - θ_{t})}{θ_{s} / (1 - θ_{s})} . \end{cases}$ Note that we apply the normal approximation to the log-transformed risk ratio and odds ratio because the log transformation typically improves the performance of the normal approximation (Haldane, 1956; Woolf, 1955).
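The three delta-method expressions listed above follow from the gradients (1, -1) for the risk difference, (1/θ_t, -1/θ_s) for the log risk ratio, and (1/{θ_t(1-θ_t)}, -1/{θ_s(1-θ_s)}) for the log odds ratio. A minimal Python sketch of that calculation is given here; the function name, the 0-based arm indices and the numerical inputs are hypothetical.

```python
def delta_variance(theta, V, t, s, contrast):
    """Quadratic form grad' V grad for a contrast between arms t and s (0-based)."""
    if contrast == "risk_difference":
        g_t, g_s = 1.0, -1.0
    elif contrast == "log_risk_ratio":
        g_t, g_s = 1.0 / theta[t], -1.0 / theta[s]
    elif contrast == "log_odds_ratio":
        g_t = 1.0 / (theta[t] * (1.0 - theta[t]))
        g_s = -1.0 / (theta[s] * (1.0 - theta[s]))
    else:
        raise ValueError("unknown contrast: " + contrast)
    # Only the entries of V for arms t and s enter the quadratic form.
    return g_t * g_t * V[t][t] + 2.0 * g_t * g_s * V[t][s] + g_s * g_s * V[s][s]


# Hypothetical inputs chosen for easy arithmetic (not from the article).
theta = [0.25, 0.50]
V = [[0.20, 0.05],
     [0.05, 0.30]]

var_rd = delta_variance(theta, V, t=1, s=0, contrast="risk_difference")
var_log_rr = delta_variance(theta, V, t=1, s=0, contrast="log_risk_ratio")
var_log_or = delta_variance(theta, V, t=1, s=0, contrast="log_odds_ratio")
print(var_rd, var_log_rr, var_log_or)
```

Dividing any of these values by n and taking the square root gives the asymptotic standard error of the corresponding estimated contrast; for the two ratio contrasts, a confidence interval would be formed on the log scale and then exponentiated.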

For robust inference, we propose the following variance estimator for $f (\hat{θ})$, which is consistent regardless of model misspecification: $n^{- 1} \{\nabla f (\hat{θ})\}^{⊤} \hat{V} \{\nabla f (\hat{θ})\}, \quad \hat{V} = \begin{pmatrix} {\hat{v}}_{11} & {\hat{v}}_{12} & \dots & {\hat{v}}_{1 k} \\ \vdots & \vdots & \ddots & \vdots \\ {\hat{v}}_{1 k} & {\hat{v}}_{2 k} & \dots & {\hat{v}}_{k k} \end{pmatrix}, \qquad (1)$ where $\begin{aligned} {\hat{v}}_{t t} & = π_{t}^{- 1} S_{r t}^{2} + 2 Q_{y t t} - S_{μ t}^{2}, \quad t = 1, \dots, k, \\ {\hat{v}}_{t s} & = Q_{y t s} + Q_{y s t} - Q_{μ t s}, \quad 1 \leq t < s \leq k, \end{aligned}$ $S_{r t}^{2}$ is the sample variance of $Y_{i} - {\hat{μ}}_{t} (X_{i})$ for subjects with $A_{i} = a_{t}$, $Q_{y t t}$ is the sample covariance of $Y_{i}$ and ${\hat{μ}}_{t} (X_{i})$ for subjects with $A_{i} = a_{t}$, $S_{μ t}^{2}$ is the sample variance of ${\hat{μ}}_{t} (X_{i})$ for all subjects, $Q_{y t s}$ is the sample covariance of $Y_{i}$ and ${\hat{μ}}_{s} (X_{i})$ for subjects with $A_{i} = a_{t}$, and $Q_{μ t s}$ is the sample covariance of ${\hat{μ}}_{t} (X_{i})$ and ${\hat{μ}}_{s} (X_{i})$ for all subjects. These robust variance estimators can be directly calculated using our R package RobinCar, which is publicly available at https://github.com/tye27/RobinCar.
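To make the ingredients of (1) concrete, the following Python sketch assembles $\hat{V}$ from the observed outcomes, the arm labels and a matrix of predicted probabilities. It is a sketch under stated assumptions, not the RobinCar implementation: arms are coded 0, ..., k-1, the predictions `mu_hat[i][t]` are taken as given, the sample variances and covariances use the divisor m - 1 (the text does not specify the divisor), and the tiny dataset is hypothetical.

```python
def sample_cov(u, v):
    """Sample covariance with divisor len(u) - 1; sample_cov(u, u) is the sample variance."""
    m = len(u)
    mean_u, mean_v = sum(u) / m, sum(v) / m
    return sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v)) / (m - 1)


def robust_V(arm, y, mu_hat, pi):
    """Assemble the k x k matrix V_hat of formula (1)."""
    k = len(pi)
    idx = [[i for i, a in enumerate(arm) if a == t] for t in range(k)]  # subjects per arm
    col = [[row[t] for row in mu_hat] for t in range(k)]                # mu_hat_t, all subjects
    V = [[0.0] * k for _ in range(k)]
    for t in range(k):
        y_t = [y[i] for i in idx[t]]
        mu_tt = [mu_hat[i][t] for i in idx[t]]
        resid = [a - b for a, b in zip(y_t, mu_tt)]
        S2_r = sample_cov(resid, resid)      # S_rt^2: residual variance within arm t
        Q_ytt = sample_cov(y_t, mu_tt)       # Q_ytt: within arm t
        S2_mu = sample_cov(col[t], col[t])   # S_mut^2: over all subjects
        V[t][t] = S2_r / pi[t] + 2.0 * Q_ytt - S2_mu
        for s in range(t + 1, k):
            y_s = [y[i] for i in idx[s]]
            Q_yts = sample_cov(y_t, [mu_hat[i][s] for i in idx[t]])
            Q_yst = sample_cov(y_s, [mu_hat[i][t] for i in idx[s]])
            Q_mu = sample_cov(col[t], col[s])
            V[t][s] = V[s][t] = Q_yts + Q_yst - Q_mu
    return V


# Tiny hypothetical dataset: 4 subjects, 2 arms, predictions supplied directly.
arm = [0, 0, 1, 1]
y = [0, 1, 0, 1]
mu_hat = [[0.2, 0.4], [0.6, 0.8], [0.2, 0.4], [0.6, 0.8]]
pi = [0.5, 0.5]

V_hat = robust_V(arm, y, mu_hat, pi)
# Variance estimate for the risk difference theta_hat_2 - theta_hat_1 (arms 1 and 0 here).
var_rd = (V_hat[1][1] - 2.0 * V_hat[0][1] + V_hat[0][0]) / len(y)
print(V_hat, var_rd)
```

The design point worth noticing is which subjects enter each term: quantities involving $Y_{i}$ use only the subjects in the relevant arm, whereas the variances and covariances of the predictions use all subjects.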

To end this section, we describe the variance estimator in Ge et al. (2011) for the g-computation estimator of the risk difference ${\hat{θ}}_{2} - {\hat{θ}}_{1}$ in a two-arm trial and discuss why it can be inconsistent and underestimate the true variance. In our notation, Ge et al. (2011) wrote the g-computation estimator ${\hat{θ}}_{2} - {\hat{θ}}_{1}$ as $g_{n} (\hat{β})$, where $g_{n} (\hat{β}) = \frac{1}{n} \sum_{i = 1}^{n} \operatorname{expit} ({\hat{β}}_{A}^{⊤} a_{2} + {\hat{β}}_{X}^{⊤} X_{i}) - \frac{1}{n} \sum_{i = 1}^{n} \operatorname{expit} ({\hat{β}}_{A}^{⊤} a_{1} + {\hat{β}}_{X}^{⊤} X_{i})$ and $\hat{β} = ({\hat{β}}_{A}^{⊤}, {\hat{β}}_{X}^{⊤})^{⊤}$. They then applied the Taylor expansion $g_{n} (\hat{β}) - g_{n} (β) = \{\nabla g_{n} (β)\}^{⊤} (\hat{β} - β) + o_{p} (n^{- 1 / 2}),$ where $β$ is the probability limit of $\hat{β}$, and proposed $n^{- 1} \{\nabla g_{n} (\hat{β})\}^{⊤} {\hat{V}}_{M} \{\nabla g_{n} (\hat{β})\}$ as a variance estimator for ${\hat{θ}}_{2} - {\hat{θ}}_{1}$, where ${\hat{V}}_{M}$ is the model-based variance estimator for $\sqrt{n} (\hat{β} - β)$ from the standard maximum likelihood approach. This approach has two problems. First, it uses the model-based variance estimator ${\hat{V}}_{M}$, which may be inconsistent for the true variance of $\hat{β}$ under model misspecification. Second, from $({\hat{θ}}_{2} - {\hat{θ}}_{1}) - (θ_{2} - θ_{1}) = \{g_{n} (\hat{β}) - g_{n} (β)\} + \{g_{n} (β) - (θ_{2} - θ_{1})\},$ the variance estimator proposed by Ge et al. (2011) accounts only for the variance of the first term $g_{n} (\hat{β}) - g_{n} (β)$ and misses the variability from $g_{n} (β) - (θ_{2} - θ_{1})$, which is not 0 because the function $\operatorname{expit} (\cdot)$ is nonlinear. This second problem can lead to confidence intervals whose coverage probabilities are too low, as can be seen from the simulation results in the following section.
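The size of the neglected second term can be made explicit with a short calculation. The display below is an added worked step, not a formula from Ge et al. (2011); it uses only the definition $μ_{t} (x) = \operatorname{expit} (β_{A}^{⊤} a_{t} + β_{X}^{⊤} x)$ and the property $E \{μ_{t} (X)\} = θ_{t}$ stated earlier in this section.

```latex
\begin{aligned}
g_{n}(\beta) - (\theta_{2} - \theta_{1})
  &= \frac{1}{n} \sum_{i=1}^{n}
     \bigl[ \{\mu_{2}(X_{i}) - \mu_{1}(X_{i})\} - (\theta_{2} - \theta_{1}) \bigr], \\
\operatorname{var}\{ g_{n}(\beta) \}
  &= \frac{1}{n} \operatorname{var}\{ \mu_{2}(X) - \mu_{1}(X) \}.
\end{aligned}
```

This term is therefore of the same order $n^{-1}$ as the variance of the first term, and it vanishes only when $μ_{2} (X) - μ_{1} (X)$ does not vary with $X$. Because $\operatorname{expit} (\cdot)$ is nonlinear, this difference generally does vary with $X$ whenever both the treatment and the covariates have nonzero coefficients.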

3. Simulations

We conduct simulations to evaluate the finite-sample performance of our robust variance estimator in (1). We consider two arms or three arms, simple randomization for treatment assignments with equal allocation (i.e., $π_{1} = π_{2} = 1 / 2$ for two arms and $π_{1} = π_{2} = π_{3} = 1 / 3$ for three arms), a one-dimensional covariate $X \sim N (0, 3^{2})$, and n = 200 or 500.

We consider the following three outcome data generating processes.

Case I: $P (Y = 1 ∣ A, X) = \operatorname{expit} \{- 2 + 5 I (A = a_{2}) + X\}$.
Case II: $P (Y = 1 ∣ A = a_{1}, X) = \operatorname{expit} (- 2 + X)$ and $P (Y = 1 ∣ A = a_{2}, X) = \operatorname{expit} (3 + 1.5 X - 0.01 X^{2})$.
Case III: $P (Y = 1 ∣ A, X) = \operatorname{expit} \{- 2 + 2 I (A = a_{2}) + 4 I (A = a_{3}) + X\}$.

To determine the true values of the unconditional response means, we simulate a large dataset of sample size $10^{7}$ for each case and obtain $(θ_{1}, θ_{2}) = (0.2830, 0.8057)$ for Case I, $(θ_{1}, θ_{2}) = (0.2830, 0.7297)$ for Case II, and $(θ_{1}, θ_{2}, θ_{3}) = (0.2827, 0.5004, 0.7172)$ for Case III. In each case, the g-computation estimator is based on fitting a working logistic model $P (Y = 1 ∣ A, X) = \operatorname{expit} (β_{A}^{⊤} A + β_{X} X)$, which is correctly specified under Case I and Case III but is misspecified under Case II.
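The true response means can be approximated by Monte Carlo along the following lines. This Python sketch is a variant of the procedure described above rather than a reproduction of it: it averages the conditional response probabilities over simulated covariates instead of averaging simulated binary outcomes (the two have the same expectation), and it uses far fewer draws than $10^{7}$. The function names, seed and number of draws are hypothetical.

```python
import math
import random


def expit(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def prob_case_I(arm, x):
    return expit(-2.0 + 5.0 * (arm == 2) + x)


def prob_case_II(arm, x):
    if arm == 1:
        return expit(-2.0 + x)
    return expit(3.0 + 1.5 * x - 0.01 * x * x)


def prob_case_III(arm, x):
    return expit(-2.0 + 2.0 * (arm == 2) + 4.0 * (arm == 3) + x)


def monte_carlo_theta(prob, arms, n_draws=100_000, seed=2023):
    """Approximate theta_t = E{P(Y = 1 | A = a_t, X)} with X ~ N(0, 3^2)."""
    rng = random.Random(seed)
    xs = [rng.gauss(0.0, 3.0) for _ in range(n_draws)]
    return [sum(prob(a, x) for x in xs) / n_draws for a in arms]


theta_I = monte_carlo_theta(prob_case_I, arms=[1, 2])
theta_II = monte_carlo_theta(prob_case_II, arms=[1, 2])
theta_III = monte_carlo_theta(prob_case_III, arms=[1, 2, 3])
print(theta_I, theta_II, theta_III)
```

With this many draws the approximations should agree with the reported values to roughly two decimal places; matching all four reported decimals would require a much larger number of draws.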

For Cases I–II, which have two arms, we focus on estimating $θ_{2} - θ_{1}$ and also include the variance estimator in Ge et al. (2011). For Case III, which has three arms, we evaluate our robust variance estimators for three common unconditional treatment effects for binary outcomes. The results for Cases I–II are in Table 1 and those for Case III are in Table 2, which include (i) the true parameter value, (ii) the Monte Carlo mean and standard deviation (SD) of the g-computation point estimators, (iii) the average standard error (SE), and (iv) the coverage probability (CP) of 95% confidence intervals. We use sample size n = 200 or 500 and 10,000 simulation runs.
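The four reported summaries can be computed from the per-run point estimates and standard errors as in the following Python sketch. The function name and the three-run toy input are hypothetical, and the interval is assumed to be the usual normal-approximation interval, estimate ± z × SE.

```python
import statistics


def summarize(estimates, std_errors, truth, level=0.95):
    """Monte Carlo mean, SD, average SE and coverage probability of normal-theory CIs."""
    z = statistics.NormalDist().inv_cdf(0.5 + level / 2.0)  # about 1.96 for level = 0.95
    covered = [abs(e - truth) <= z * se for e, se in zip(estimates, std_errors)]
    return {
        "mean": statistics.fmean(estimates),
        "sd": statistics.stdev(estimates),
        "avg_se": statistics.fmean(std_errors),
        "cp": sum(covered) / len(covered),
    }


# Toy input with three simulation runs (hypothetical numbers).
result = summarize(
    estimates=[0.40, 0.50, 0.60],
    std_errors=[0.05, 0.05, 0.02],
    truth=0.50,
)
print(result)
```

A well-calibrated standard error shows up as an average SE close to the Monte Carlo SD and a coverage probability close to the nominal level.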

Table 1. Simulation mean and standard deviation (SD) of ${\hat{θ}}_{2} - {\hat{θ}}_{1}$ , average standard error (SE), and coverage probability (CP) of 95% asymptotic confidence interval for $θ_{2} - θ_{1}$ under Cases I–II and simple randomization.


Table 2. Simulation mean and standard deviation (SD) of g-computation estimators, average standard error (SE), and coverage probability (CP) of 95% asymptotic confidence interval based on the robust SE from (1) under Case III and simple randomization.


From Tables 1–2, we see that the g-computation estimators have negligible biases compared to the standard deviations. Our robust standard error, which is the square root of the variance estimator in (1), is always very close to the actual standard deviation, and the related confidence interval has nominal coverage across all settings. In contrast, the standard error in Ge et al. (2011) underestimates the actual standard deviation under Case I, where there is no model misspecification, as well as under Case II, where there is model misspecification, and the related confidence intervals have coverage probabilities that are too low in both cases.

4. Summary and discussion

In this article, we provide an explicit robust variance estimator formula for g-computation estimators, which can be used for different unconditional treatment effects of interest and clinical trials with two or more arms. Our simulations demonstrate that the variance estimator can be reliably used in practice.

For concreteness, this article focuses on the logistic model that regresses the outcome on the treatment indicators and covariates, which is arguably the most widely used model for binary outcomes. However, our robust variance estimation formula in (1) is not limited to this model and can be used with different specifications of the working model (e.g., fitting a separate logistic model for each treatment arm) or with other generalized linear models using a canonical link for non-binary outcomes (e.g., Poisson regression for count outcomes). Additionally, although our article considers simple randomization, our robust variance formula in (1) can also be used for a complete randomization scheme where the sample size in every group t is fixed to be $n π_{t}$, because this randomization scheme leads to the same asymptotic distribution as simple randomization (Ye et al., 2023). Simulation results under this randomization scheme are similar to those under simple randomization; see Tables 3–4 in the Appendix.
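The difference between the two randomization schemes mentioned above can be illustrated with a short Python sketch. The function names are hypothetical, arms are labelled 1, ..., k, and the complete-randomization function simply rejects allocations where $n π_{t}$ is not an integer, since the text does not say how such cases are handled.

```python
import random


def simple_randomization(n, pi, rng):
    """Assign each subject independently to arm t with probability pi[t - 1]."""
    arms = list(range(1, len(pi) + 1))
    return rng.choices(arms, weights=pi, k=n)


def complete_randomization(n, pi, rng):
    """Fix the arm sizes at n * pi_t and randomly permute the labels."""
    labels = []
    for t, p in enumerate(pi, start=1):
        labels += [t] * round(n * p)
    if len(labels) != n:
        raise ValueError("n * pi_t must be whole numbers that sum to n")
    rng.shuffle(labels)
    return labels


simple = simple_randomization(200, [0.5, 0.5], random.Random(0))
complete = complete_randomization(200, [0.5, 0.5], random.Random(0))
print(simple.count(1), simple.count(2))      # arm sizes are random
print(complete.count(1), complete.count(2))  # arm sizes are fixed at 100 each
```

Under simple randomization the arm sizes fluctuate from trial to trial, whereas under complete randomization they are fixed by design; the text above states that both schemes lead to the same asymptotic distribution.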

We implement an R package called RobinCar to conveniently compute the g-computation estimator and our robust variance estimators, which is publicly available at https://github.com/tye27/RobinCar.

Appendix

In Tables 3–4, we include simulation results under a complete randomization scheme where the sample size in every group t is fixed to be $n π_{t}$.

Table 3. Simulation mean and standard deviation (SD) of ${\hat{θ}}_{2} - {\hat{θ}}_{1}$ , average standard error (SE) and coverage probability (CP) of 95% asymptotic confidence interval for $θ_{2} - θ_{1}$ under Cases I–II and complete randomization that fixes $n_{t} = n π_{t}$ .


Table 4. Simulation mean and standard deviation (SD) of g-computation estimators, average standard error (SE), and coverage probability (CP) of 95% asymptotic confidence interval based on the robust SE from (1) under Case III and complete randomization that fixes $n_{t} = n π_{t}$.


Disclosure statement

No potential conflict of interest was reported by the author(s).

Additional information

Funding

This work was supported by National Institute of Allergy and Infectious Diseases [NIAID 5 UM1 AI068617].

References

FDA (2021). Adjusting for covariates in randomized clinical trials for drugs and biological products. Draft Guidance for Industry. Center for Drug Evaluation and Research and Center for Biologics Evaluation and Research, Food and Drug Administration (FDA), U.S. Department of Health and Human Services, May 2021.
ICH E9 (R1) (2019). Addendum on estimands and sensitivity analysis in clinical trials to the guideline on statistical principles for clinical trials. International Council for Harmonisation (ICH).
Chernozhukov, V., Chetverikov, D., Demirer, M., Duflo, E., Hansen, C., & Newey, W. (2017). Double/debiased/Neyman machine learning of treatment effects. American Economic Review, 107(5), 261–265. https://doi.org/10.1257/aer.p20171038
Freedman, D. A. (2008). Randomization does not justify logistic regression. Statistical Science, 23(2), 237–249. https://doi.org/10.1214/08-STS262
Ge, M., Durham, L. K., Meyer, R. D., Xie, W., & Thomas, N. (2011). Covariate-adjusted difference in proportions from clinical trials using logistic regression and weighted risk differences. Drug Information Journal: DIJ/Drug Information Association, 45(4), 481–493. https://doi.org/10.1177/009286151104500409
Haldane, J. B. S. (1956). The estimation and significance of the logarithm of a ratio of frequencies. Annals of Human Genetics, 20(4), 309–311. https://doi.org/10.1111/ahg.1956.20.issue-4
Kennedy, E. H. (2016). Semiparametric theory and empirical processes in causal inference. In Statistical causal inferences and their applications in public health research (pp. 141–167).
Lin, W. (2013). Agnostic notes on regression adjustments to experimental data: Reexamining Freedman's critique. Annals of Applied Statistics, 7(1), 295–318. https://doi.org/10.1214/12-AOAS583
Moore, K. L., & van der Laan, M. J. (2009). Covariate adjustment in randomized trials with binary outcomes: Targeted maximum likelihood estimation. Statistics in Medicine, 28(1), 39–64. https://doi.org/10.1002/sim.v28:1
Tsiatis, A. A., Davidian, M., Zhang, M., & Lu, X. (2008). Covariate adjustment for two-sample treatment comparisons in randomized clinical trials: A principled yet flexible approach. Statistics in Medicine, 27(23), 4658–4677. https://doi.org/10.1002/sim.v27:23
Woolf, B. (1955). On estimating the relation between blood group and disease. Annals of Human Genetics, 19(4), 251–253. https://doi.org/10.1111/ahg.1955.19.issue-4
Yang, L., & Tsiatis, A. A. (2001). Efficiency study of estimators for a treatment effect in a pretest–posttest trial. The American Statistician, 55(4), 314–321. https://doi.org/10.1198/000313001753272466
Ye, T., Shao, J., Yi, Y., & Zhao, Q. (2023). Toward better practice of covariate adjustment in analyzing randomized clinical trials. Journal of the American Statistical Association, 117, in press. https://doi.org/10.1080/01621459.2022.2049278
Ye, T., Yi, Y., & Shao, J. (2022). Inference on average treatment effect under minimization and other covariate-adaptive randomization methods. Biometrika, 109(1), 33–47. https://doi.org/10.1093/biomet/asab015
