Estimating correlations between vaccine clinical trial outcomes. Journal of Applied Statistics, Vol. 49, No. 13


DOI: 10.1080/02664763.2021.1949439

Abstract

We demonstrate how a linear factor model with latent variables can be used to estimate correlations between the outcomes of clinical trials. These correlations are needed for many policy questions of drug/vaccine development (such as calculating the optimal size of financial incentives) and the literature so far has relied on expert opinions. We apply our methodology to the case of vaccines and show that the estimated correlations are highly significant. We also illustrate how the estimated correlations can be used to find the probability of obtaining a successful vaccine out of a certain number of candidates and to determine optimal investment in vaccine development.

Keywords:

Applications of statistics
clinical trials
correlations
factor models
latent variables
maximum likelihood

2010 MS Classifications:

62Pxx
62P10
62P20

Acknowledgments

We are thankful for valuable comments and feedback from Artem Goryaev, Andrey Zubarev, Kristina Nesterova, Revold Entov, Sergey Sinelnikov-Murylyov, and other participants of seminars at RANEPA.

Disclosure statement

No potential conflict of interest was reported by the author(s).

Notes

1 For example, in an educational setting they can be used to estimate students' ‘ability’ based on test outcomes. See Section 4.13 in Bartholomew et al. [2] for more details.

2 Typically we have several correlated discrete observations on a cross-section of random subjects, and we want to estimate how these observations are affected by some explanatory variables, taking into account the correlations between observations within a subject (see Song and Lee [12] or Chib and Greenberg [3]). For such estimation, multivariate logit regressions can also be used (for example, Li and Wong [7]).

3 We can do this by matching the quantiles of the actual distribution of $Y_{i}^{k}$ with the quantiles of the standard normal distribution.
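
The quantile matching described in note 3 can be sketched as a rank-based transformation. This is a minimal illustration, not the authors' code: the function name `to_standard_normal`, the plotting position `(rank - 0.5) / n`, and the simulated exponential sample are assumptions made here for the example.

```python
import numpy as np
from scipy.stats import norm, rankdata

def to_standard_normal(y):
    """Map a sample to standard normal scores by matching empirical quantiles."""
    y = np.asarray(y, dtype=float)
    ranks = rankdata(y)             # ranks 1..n (ties receive average ranks)
    u = (ranks - 0.5) / len(y)      # empirical quantile levels, strictly inside (0, 1)
    return norm.ppf(u)              # standard normal quantiles at those levels

# Hypothetical skewed "quality" sample, used only for illustration.
rng = np.random.default_rng(0)
y = rng.exponential(size=1000)
z = to_standard_normal(y)

print(round(z.mean(), 3), round(z.std(), 3))
```

The transformation is monotone, so the ordering of the observations is preserved while the marginal distribution becomes approximately standard normal.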

4 The reason why one should define the correlation structure for the continuous ‘quality’ variables $Y_{i}^{k}$ instead of using the outcomes $X_{i}^{k}$ is that for discrete random variables the correlation matrix does not fully describe the distribution, while for continuous variables (under the assumption of a multivariate normal distribution) it is sufficient.

5 Using the Cholesky decomposition of the covariance matrix, we can represent each element $Y_{i}^{k}$ as a weighted sum of standard normal random variables. We can call these variables factors.
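
As a small illustration of note 5, the sketch below builds correlated normal variables from independent standard normal factors via a Cholesky factor. The 3×3 correlation matrix is a hypothetical example, not an estimate from the paper.

```python
import numpy as np

# Hypothetical correlation matrix for three candidates (illustration only).
corr = np.array([[1.0, 0.5, 0.3],
                 [0.5, 1.0, 0.4],
                 [0.3, 0.4, 1.0]])

# Lower-triangular Cholesky factor: corr = L @ L.T
L = np.linalg.cholesky(corr)

# Independent standard normal "factors", one row per simulated draw.
rng = np.random.default_rng(0)
factors = rng.standard_normal((100_000, 3))

# Each Y_i is a weighted sum of the factors, with weights from row i of L.
Y = factors @ L.T

sample_corr = np.corrcoef(Y, rowvar=False)
print(np.round(sample_corr, 2))
```

Because `L` is lower triangular, the first variable loads on one factor only, the second on two, and so on; the sample correlation of `Y` should be close to `corr`.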

6 We use a lower case letter to distinguish a specific realization $s^{k}$ from the random variable $S^{k}$.

7 We illustrate how our methodology can be extended to incorporate additional information in a simulation study (#2) in the Appendix. We should note, though, that adding more factors makes the estimation computationally challenging and might require significant time or computing resources.

8 The code for the estimation is available at https://github.com/sergeynzhuk/clinical-trials-correlations. The numerical integrals were calculated with the ‘scipy.integrate’ Python library; for the optimization we used ‘scipy.optimize’ with the default BFGS algorithm. The estimation running times are for a machine with an Intel Core i5-6500 CPU @ 3.20GHz, 8 GB RAM, Windows 10, and Python 3.7.9.

9 Here we make an important assumption that the distributional parameters of the candidates do not change as we add more candidates. In reality, more promising projects are financed first. Thus, as we add more candidates, their success probabilities are likely to decrease. In addition, the candidates are likely to be similar to already developed ones. Thus, the correlations could increase as well.

10 For simplicity, we assume that we need only one successful vaccine. If additional successes bring additional benefits, the example can be straightforwardly modified to account for them as well.
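
To make the setting of notes 9 and 10 concrete, the sketch below computes the probability of at least one success among n candidates. It assumes a deliberately simplified model that is not necessarily the paper's specification: every candidate has the same marginal success probability `p`, and every pair of latent ‘quality’ variables has the same correlation `rho`, generated by a single common factor. The function name and parameter values are hypothetical.

```python
import numpy as np
from scipy.integrate import quad
from scipy.stats import norm

def prob_at_least_one(n, p, rho):
    """P(at least one success among n candidates), equicorrelated one-factor model.

    Assumed model: Y_i = sqrt(rho) * F + sqrt(1 - rho) * e_i with F, e_i ~ N(0, 1),
    and candidate i succeeds when Y_i exceeds the threshold c. Requires 0 <= rho < 1.
    """
    c = norm.ppf(1.0 - p)  # threshold giving marginal success probability p

    def integrand(f):
        # Probability that a single candidate fails, given the common factor f.
        p_fail = norm.cdf((c - np.sqrt(rho) * f) / np.sqrt(1.0 - rho))
        # Given f, failures are independent, so all n fail with probability p_fail**n.
        return p_fail ** n * norm.pdf(f)

    p_none, _ = quad(integrand, -8.0, 8.0)  # average over the common factor
    return 1.0 - p_none

for n in (1, 5, 10):
    print(n, prob_at_least_one(n, p=0.2, rho=0.0), prob_at_least_one(n, p=0.2, rho=0.5))
```

With `rho = 0` the result reduces to `1 - (1 - p)**n`. A positive correlation lowers the probability of at least one success, because the candidates tend to fail together.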

11 The duration T is the combined duration of the three phases of clinical trials (from Equation (39)). The average development cost I roughly corresponds to the expected present value of the investments required for each phase (see Equation (38)).

$I = I_{1} + \frac{I_{2}}{(1 + r)^{T_{1}}} \cdot p_{T, 1} + \frac{I_{3}}{(1 + r)^{T_{1} + T_{2}}} \cdot p_{T, 1} \cdot p_{T, 2}$ (38)

$\begin{array}{ll} I_{1} = 25.3 \text{ mln}, & T_{1} = 2.48 \text{ years} \\ I_{2} = 58.6 \text{ mln}, & T_{2} = 2.86 \text{ years} \\ I_{3} = 255.4 \text{ mln}, & T_{3} = 2.64 \text{ years} \\ r = 10\% & \\ B = \text{either } 4 \text{ bln or } 20 \text{ bln} & \end{array}$ (39)
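
Equation (38) can be evaluated directly from the inputs in Equation (39). The sketch below does so under two assumptions made here: $p_{T,1}$ and $p_{T,2}$ are read as the probabilities of passing phases I and II, and the values 0.7 and 0.5 used for them are hypothetical placeholders, since the note does not state them.

```python
def expected_cost(costs, durations, pass_probs, r):
    """Expected present value of development cost, following Equation (38).

    costs:      (I1, I2, I3), phase costs in mln
    durations:  (T1, T2), durations of phases I and II in years
    pass_probs: (p1, p2), assumed probabilities of passing phases I and II
    r:          annual discount rate
    """
    i1, i2, i3 = costs
    t1, t2 = durations
    p1, p2 = pass_probs
    phase2 = i2 / (1 + r) ** t1 * p1                # incurred only if phase I is passed
    phase3 = i3 / (1 + r) ** (t1 + t2) * p1 * p2    # incurred only if phases I and II are passed
    return i1 + phase2 + phase3

# Costs, durations, and discount rate from Equation (39); pass probabilities are hypothetical.
cost = expected_cost(costs=(25.3, 58.6, 255.4),
                     durations=(2.48, 2.86),
                     pass_probs=(0.7, 0.5),
                     r=0.10)
print(f"Expected development cost: {cost:.1f} mln")
```

Later phases enter with smaller weights for two reasons: they are discounted over a longer horizon, and they are paid for only when the earlier phases succeed.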

12 For an arbitrary structure of dependence among the variables, one might be inclined to use, e.g., copulas. In our case, though, there are too few observations to test the implications of relaxing the factor model assumption.

13 The code for the estimation is available at https://github.com/sergeynzhuk/clinical-trials-correlations. The numerical integrals were calculated with the ‘scipy.integrate’ Python library; for the optimization we used ‘scipy.optimize’ with the default BFGS algorithm. When calculating the likelihood for model (d), we first constructed a function that calculates the likelihood for each disease for a given realization of the common factor $F^{k}$ and then calculated the average value of this function over $F^{k}$ with numerical integration. The estimation running times (in the format hh:mm:ss) are for a machine with an Intel Core i5-6500 CPU @ 3.20GHz, 8 GB RAM, Windows 10, and Python 3.7.9.
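
The two-step likelihood calculation in note 13 (condition on the common factor, then average over it numerically) can be sketched as follows. This is not the code from the repository and not model (d) itself: it assumes a simplified one-factor model with a single threshold `c` and a single correlation `rho` shared by all diseases, and the `(successes, trials)` data are hypothetical.

```python
import numpy as np
from scipy.integrate import quad
from scipy.optimize import minimize
from scipy.special import ndtr  # standard normal CDF

def disease_likelihood(successes, trials, c, rho):
    """Likelihood of one disease's outcomes, averaged over the common factor F."""
    def given_factor(f):
        # Success probability of a single candidate, conditional on F = f.
        p = ndtr((np.sqrt(rho) * f - c) / np.sqrt(1.0 - rho))
        density = np.exp(-0.5 * f * f) / np.sqrt(2.0 * np.pi)
        return p ** successes * (1.0 - p) ** (trials - successes) * density
    value, _ = quad(given_factor, -8.0, 8.0)
    return value

def neg_log_likelihood(theta, data):
    c = theta[0]
    # Logistic map keeps rho in (0, 1); clipping avoids division by zero at the edges.
    rho = np.clip(1.0 / (1.0 + np.exp(-theta[1])), 1e-6, 1.0 - 1e-6)
    return -sum(np.log(disease_likelihood(s, n, c, rho)) for s, n in data)

# Hypothetical (successes, trials) per disease, for illustration only.
data = [(0, 5), (3, 4), (1, 6), (0, 3), (4, 5), (0, 4)]

start = np.array([0.0, 0.0])
result = minimize(neg_log_likelihood, x0=start, args=(data,), method="BFGS")

c_hat = result.x[0]
rho_hat = float(np.clip(1.0 / (1.0 + np.exp(-result.x[1])), 1e-6, 1.0 - 1e-6))
print("threshold:", c_hat, "correlation:", rho_hat)
```

The correlation is optimized on an unconstrained scale so that BFGS, which does not handle bounds, can be used as is. Each additional factor adds another dimension to the integral, which is one way to see why richer models are much slower to estimate.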

14 Again, we make an assumption that the distributional parameters of the candidates do not change as we add more candidates.

15 The costs of the phases are from DiMasi et al. [4]. Their estimates are based on a confidential survey of manufacturers for 68 randomly selected compounds, including one vaccine. Waye et al. [15] point out that the development costs of vaccines can differ from those of drugs, and Gouglas et al. [5] provide estimates of the phase I and phase II costs for vaccines specifically. However, we stick to the numbers from DiMasi et al. [4], since Gouglas et al. [5] do not report phase III data. The durations of the phases are from Pronker et al. [10].

References

[2] D.J. Bartholomew, M. Knott and I. Moustaki, Latent Variable Models and Factor Analysis: A Unified Approach, Vol. 904, John Wiley & Sons, Chichester, 2011.

[3] S. Chib and E. Greenberg, Analysis of multivariate probit models, Biometrika 85 (1998), pp. 347–361.

[4] J.A. DiMasi, H.G. Grabowski and R.W. Hansen, Innovation in the pharmaceutical industry: new estimates of R&D costs, J. Health. Econ. 47 (2016), pp. 20–33. Available at https://doi.org/10.1016/j.jhealeco.2016.01.012.

[5] D. Gouglas, T.T. Le, K. Henderson, A. Kaloudis, T. Danielsen, N.C. Hammersland, J.M. Robinson, P.M. Heaton and J.-A. Røttingen, Estimating the cost of vaccine development against epidemic infectious diseases: a cost minimisation study, Lancet Glob. Health 6 (2018), p. e1386. Available at https://doi.org/10.1016/S2214-109X(18)30346-2.

[7] J. Li and W.K. Wong, Two-dimensional toxic dose and multivariate logistic regression, with application to decompression sickness, Biostatistics 12 (2011), pp. 143–155.

[10] E.S. Pronker, T.C. Weenen, H. Commandeur, E.H. Claassen and A.D. Osterhaus, Risk in vaccine research and development quantified, PLoS One 8 (2013), p. e57755. Available at https://doi.org/10.1371/journal.pone.0057755.

[12] X.-Y. Song and S.-Y. Lee, A multivariate probit latent variable model for analyzing dichotomous responses, Stat. Sin. 15 (2005), pp. 645–664.

[15] A. Waye, P. Jacobs and A.B. Schryvers, Vaccine development costs: a review, Expert. Rev. Vaccines 12 (2013), pp. 1495–1501. Available at https://doi.org/10.1586/14760584.2013.850035.
