
Estimating correlations between vaccine clinical trial outcomes

Pages 3392-3413 | Received 01 Dec 2020, Accepted 24 Jun 2021, Published online: 12 Jul 2021
 

Abstract

We demonstrate how a linear factor model with latent variables can be used to estimate correlations between the outcomes of clinical trials. These correlations are needed for many policy questions of drug/vaccine development (such as calculating the optimal size of financial incentives) and the literature so far has relied on expert opinions. We apply our methodology to the case of vaccines and show that the estimated correlations are highly significant. We also illustrate how the estimated correlations can be used to find the probability of obtaining a successful vaccine out of a certain number of candidates and to determine optimal investment in vaccine development.


Acknowledgments

We are thankful for valuable comments and feedback from Artem Goryaev, Andrey Zubarev, Kristina Nesterova, Revold Entov, Sergey Sinelnikov-Murylyov, and other participants of seminars at RANEPA.

Disclosure statement

No potential conflict of interest was reported by the author(s).

Notes

1 For example, in an educational setting they can be used to estimate students' ‘ability’ based on test outcomes. See Section 4.13 in Bartholomew et al. [2] for more details.

2 Typically, we have several correlated discrete observations on a cross-section of random subjects, and we want to estimate how these observations are affected by some explanatory variables, taking into account the correlations between observations within a subject (see Song and Lee [12] or Chib and Greenberg [3]). Multivariate logit regressions can also be used for such estimation (for example, Li and Wong [7]).

3 We can do this by matching the quantiles of the actual distribution of Yik with the quantiles of the standard normal distribution.
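The quantile matching mentioned in this note can be illustrated on a finite sample. The following is a minimal sketch, not the authors' code: the function name `normal_scores` and the plotting position (rank + 0.5)/n are assumptions made for illustration, and ties are broken by input order.

```python
from statistics import NormalDist


def normal_scores(values):
    """Replace each value by the standard normal quantile of its empirical rank.

    Illustrative helper (hypothetical name). Ties keep their input order.
    """
    n = len(values)
    # Indices of the observations, sorted from smallest to largest value.
    order = sorted(range(n), key=lambda i: values[i])
    std_normal = NormalDist()  # mean 0, standard deviation 1
    scores = [0.0] * n
    for rank, i in enumerate(order):
        # (rank + 0.5) / n keeps the probability strictly inside (0, 1).
        scores[i] = std_normal.inv_cdf((rank + 0.5) / n)
    return scores


# Made-up sample, for illustration only.
z = normal_scores([3.0, 1.0, 2.0])
```

The transformation is monotone, so it preserves the ordering of the observations while giving them standard normal quantiles.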

4 The correlation structure should be defined for the continuous ‘quality’ variables Yik rather than for the outcomes Xik because, for discrete random variables, the correlation matrix does not fully describe the distribution, whereas for continuous variables (under the assumption of a multivariate normal distribution) it is sufficient.

5 Using the Cholesky decomposition of the covariance matrix, we can represent each element of {Yik} as a weighted sum of independent standard normal random variables. We can call these variables factors.
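The representation in this note can be sketched in a few lines. The correlation matrix below is made up for illustration, and the helper names `cholesky` and `draw_correlated` are hypothetical. The sketch uses only the standard library; with NumPy available, `numpy.linalg.cholesky` would do the same job.

```python
import math
import random


def cholesky(cov):
    """Return lower-triangular L with L * L^T = cov (cov must be positive definite)."""
    n = len(cov)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(cov[i][i] - s)
            else:
                L[i][j] = (cov[i][j] - s) / L[j][j]
    return L


def draw_correlated(L, rng):
    """One draw of Y = L Z, where Z holds independent standard normals (the 'factors')."""
    z = [rng.gauss(0.0, 1.0) for _ in L]
    # Row i of L gives the weights of the factors in Y_i.
    return [sum(L[i][k] * z[k] for k in range(i + 1)) for i in range(len(L))]


# Hypothetical correlation matrix, for illustration only.
corr = [[1.0, 0.5, 0.3],
        [0.5, 1.0, 0.4],
        [0.3, 0.4, 1.0]]
L = cholesky(corr)
y = draw_correlated(L, random.Random(0))
```

Each row of `L` lists the weights with which the factors enter the corresponding variable, so the first variable loads on one factor, the second on two, and so on.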

6 We use a lower case letter to distinguish a specific realization sk from the random variable Sk.

7 We illustrate how our methodology can be extended to incorporate additional information in a simulation study (#2) in the Appendix. We should note, though, that adding more factors makes the estimation computationally challenging and might require significant time or computing resources.

8 The code for the estimation is available at https://github.com/sergeynzhuk/clinical-trials-correlations. The numerical integrals were calculated with the ‘scipy.integrate’ Python library; for the optimization, we used ‘scipy.optimize’ with the default BFGS algorithm. The estimation running times are for a machine with an Intel Core i5-6500 CPU @ 3.20GHz, 8 GB RAM, Windows 10, Python 3.7.9.

9 Here we make an important assumption that the distributional parameters of the candidates do not change as we add more candidates. In reality, more promising projects are financed first. Thus, as we add more candidates, their success probabilities are likely to decrease. In addition, the candidates are likely to be similar to already developed ones. Thus, the correlations could increase as well.
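Under the assumption stated in this note (identical parameters for every added candidate), the probability of obtaining at least one successful vaccine out of n candidates can be sketched as follows. This is a simplified illustration, not the paper's estimated model: it assumes a single common factor, a common success probability `p`, a common pairwise correlation `rho` of the latent qualities with 0 <= rho < 1, and a plain midpoint rule for the integral. All names and numbers are hypothetical.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()


def prob_at_least_one_success(n, p, rho, lo=-8.0, hi=8.0, steps=2000):
    """P(at least one of n candidates succeeds) in an assumed one-factor model.

    Latent quality: Y_i = sqrt(rho) * F + sqrt(1 - rho) * e_i, with F and e_i
    independent standard normals; candidate i succeeds when Y_i exceeds a
    threshold chosen so that the marginal success probability equals p.
    """
    threshold = STD_NORMAL.inv_cdf(1.0 - p)
    h = (hi - lo) / steps
    prob_all_fail = 0.0
    for k in range(steps):
        f = lo + (k + 0.5) * h  # midpoint of the k-th integration cell
        # Given F = f the candidates are independent, so failures multiply.
        fail_given_f = STD_NORMAL.cdf(
            (threshold - rho ** 0.5 * f) / (1.0 - rho) ** 0.5
        )
        prob_all_fail += fail_given_f ** n * STD_NORMAL.pdf(f) * h
    return 1.0 - prob_all_fail


# Made-up inputs: 5 candidates, each with a 20% chance of success.
independent = prob_at_least_one_success(5, 0.2, 0.0)
correlated = prob_at_least_one_success(5, 0.2, 0.5)
```

With `rho = 0` the result reduces to 1 - (1 - p)^n. A positive `rho` lowers it, because correlated candidates tend to fail together.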

10 For simplicity, we assume that we need only one successful vaccine. If additional successes bring additional benefits, the example can be straightforwardly modified to account for them as well.

11 The duration T is the combined duration of the three phases of clinical trials (from Equation (39)). The average development cost I roughly corresponds to the expected present value of the investments required for each phase (see Equation (38)).

12 For an arbitrary dependence structure among the variables, one might be inclined to use, e.g., copulas. In our case, though, there are too few observations to test the implications of relaxing the factor model assumption.

13 The code for the estimation is available at https://github.com/sergeynzhuk/clinical-trials-correlations. The numerical integrals were calculated with the ‘scipy.integrate’ Python library; for the optimization, we used ‘scipy.optimize’ with the default BFGS algorithm. When calculating the likelihood for model (d), we first constructed a function that calculates the likelihood for each disease for a given realization of the common factor Fk and then calculated the average value of this function over Fk with numerical integration. The estimation running times (in the format hh:mm:ss) are for a machine with an Intel Core i5-6500 CPU @ 3.20GHz, 8 GB RAM, Windows 10, Python 3.7.9.
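The two-step likelihood calculation described in this note (conditional likelihood given the common factor, then an average over the factor) can be sketched as follows. This is not the code from the repository. It assumes a simplified one-factor probit form with a common success probability `p` and correlation `rho` (0 <= rho < 1), and it replaces ‘scipy.integrate’ with a plain midpoint rule so that the example needs only the standard library. All function and parameter names are hypothetical.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()


def conditional_success_prob(f, p, rho):
    """P(success | common factor = f) under the assumed one-factor probit form."""
    threshold = STD_NORMAL.inv_cdf(1.0 - p)
    return 1.0 - STD_NORMAL.cdf(
        (threshold - rho ** 0.5 * f) / (1.0 - rho) ** 0.5
    )


def conditional_likelihood(outcomes, f, p, rho):
    """Step 1: likelihood of the 0/1 outcomes for a given factor realization f."""
    q = conditional_success_prob(f, p, rho)
    value = 1.0
    for x in outcomes:
        value *= q if x else 1.0 - q
    return value


def group_likelihood(outcomes, p, rho, lo=-8.0, hi=8.0, steps=2000):
    """Step 2: average the conditional likelihood over the standard normal factor."""
    h = (hi - lo) / steps
    total = 0.0
    for k in range(steps):
        f = lo + (k + 0.5) * h  # midpoint of the k-th integration cell
        total += conditional_likelihood(outcomes, f, p, rho) * STD_NORMAL.pdf(f) * h
    return total


# Made-up outcomes for one disease: one success and one failure.
lik = group_likelihood([1, 0], p=0.3, rho=0.5)
```

In an estimation, the logarithms of such group likelihoods would be summed across diseases and passed to an optimizer. That step is omitted here.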

14 Again, we make an assumption that the distributional parameters of the candidates do not change as we add more candidates.

15 The costs of the phases are from DiMasi et al. [4]. Their estimates are based on a confidential survey of manufacturers for 68 randomly selected compounds, including one vaccine. Waye et al. [15] point out that the development costs of vaccines can differ from those of drugs, and Gouglas et al. [5] provide estimates of the phase I and phase II costs for vaccines specifically. However, we keep the numbers from DiMasi et al. [4] since Gouglas et al. [5] do not report phase III data. The durations of the phases are from Pronker et al. [10].
