Abstract
We propose an empirical Bayes method for evaluating overall and study-specific treatment effects in multivariate meta-analysis with binary outcome. Instead of modeling transformed proportions or risks via commonly used multivariate general or generalized linear models, we directly model the risks without any transformation. The exact posterior distribution of the study-specific relative risk is derived. The hyperparameters in the posterior distribution can be inferred through an empirical Bayes procedure. As our method does not rely on the choice of transformation, it provides a flexible alternative to the existing methods and in addition, the correlation parameter can be intuitively interpreted as the correlation coefficient between risks.
Appendix
Section A: Some derivations
Derivation of Eq. (Equation3(3) ) For simplicity of notation, we suppress the index “i”. With some algebra, we can show
Denote μj = aj/(aj + bj) and σ2j = μj(1 − μj)/(aj + bj + 1) (j = 1, 2). We have
Derivation of Eq. (Equation5(5) ) With beta marginals and mixing functions φi = (pi − μi)/δi, the Sarmanov prior distribution of p1 and p2 can be written as linear combination of products of independent beta distributions as follows:
where beta( · ; aj, bj) is the beta distribution, vk (k = 1, …, 4) are weights, defined by, v1 = 1 + ρd, v2 = v3 = −ρd, v4 = ρd, d = (μ1μ2)/(δ1δ2). After some algebra, the posterior distribution of p1 and p2 given data is also a linear combination of products of independent beta distributions,
where the weights ωk (k = 1, …, 4) are defined by
and C, the normalizing constant, is calculated as
The proof is completed following the derivation of Eq. (Equation4
(4) ).
Some results on the moments of relative risk Under independent beta priors, the kth posterior moment of relative risk exists for k < min (α1, β2) and is given by
(6) Specifically, the mean and variance are given by E[θ; α1, β1, α2, β2] = β1α2/{(α1 − 1)(β2 − 1)} and
Under correlated beta priors, the kth posterior moment of relative risk exists for k < min (α1, β2) and is given by
(7) where E[θk; α1, β1, α2, β2] is the kth posterior moment of relative risk under independent beta priors, defined in Eq. (Equation6
(6) ).
Some results on correlation coefficients Denote yji = ∑njik = 1Zjim where Zjim is the indicator of whether the kth subject in the jth group of ith study experienced an event. Denote the dispersion parameter φj = 1/(aj + bj + 1). Note that Yji follows a beta-binomial distribution with parameters (aj, bj). Therefore, , i.e., the correlation between the outcomes for two subjects from the same study and the same jth group is φj. Using double expectation technique, we can show that
, i.e., the correlation between the outcomes for two subjects from the same study but different groups is
.
Section B: SPLUS/R program to fit model (Equation1
(1) ) and a working example
# function to compute the log-likelihood in equation (5)
myLik <- function(mypar, mydat) {
par <- par.cal(mypar); a1 <- par[1]; b1 <- par[2]; a2 <- par[3]; b2 <- par[4]
temp1 <- (lgamma(a1+mydat$y1) + lgamma(b1+mydat$n1-mydat$y1)+ lgamma(a2+mydat$y2) + lgamma(b2+mydat$n2-mydat$y2) + lgamma(a1+b1) + lgamma(a2+b2))
temp2 <- (lgamma(a1) + lgamma(b1) + lgamma(a2) + lgamma(b2)+ lgamma(a1+b1+mydat$n1) + lgamma(a2+b2 +mydat$n2))
if (flag == 0) myLogLik <- sum(temp1 - temp2) # if independent beta-binomial model
if (flag == 1) { # if Sarmanov beta-binomial model
rho <- par[5]
mu1 <- a1/(a1+b1); mu2 <- a2/(a2+b2)
delta1 <- sqrt(mu1*(1-mu1)/(a1+b1+1)); delta2 <- sqrt(mu2*(1-mu2)/(a2+b2+1))
temp3 <- (log(1+rho/delta1/delta2*(mydat$y1-mydat$n1 *mu1)*(mydat$y2-mydat$n2*mu2)/(a1+b1+mydat$n1)/(a2+b2 +mydat$n2)))
myLogLik <- sum(temp1 - temp2 + temp3)}
return(myLogLik)
}
# Back-transform the parameters (a1,b1,a2,b2,rho) to original scale
par.cal <- function(mypar) {
a1 <- exp(mypar[1]); b1 <- exp(mypar[2]); a2 <- exp(mypar[3]); b2 <- exp(mypar[4])
if (flag == 0) return(c(a1,b1,a2,b2))
if (flag == 1) {
eta <- mypar[5]; cc <- sqrt(a1*a2*b1*b2)/sqrt((a1+b1+1)*(a2+b2+1))
upper.bound <- cc/max(a1*b2, a2*b1); lower.bound <- -cc/max(a1*a2, b1*b2)
rho <- (upper.bound-lower.bound)*exp(eta)/(1+exp(eta)) + lower.bound
return(c(a1,b1,a2,b2,rho))}
}
# note: we use Delta method to get the variance of log(RR)and use Wald interval on log(RR)
RR.comp.log <- function(par, hessian) {
a1 <- par[1]; b1 <- par[2]; a2 <- par[3]; b2 <- par[4]
myRR.overall <- log(a2/(a2+b2)/(a1/(a1+b1)))
myVar <- solve(-hessian)
if (flag == 0) myD <- matrix(c(-b1/(a1+b1), b1/(a1+b1),b2/(a2+b2), -b2/(a2+b2)), nrow=1)
if (flag == 1) myD <- matrix(c(-b1/(a1+b1), b1/(a1+b1),b2/(a2+b2), -b2/(a2+b2), 0), nrow=1)
myRR.overall.Var <- as.numeric(myD %*% myVar %*% t(myD))
myRR.overall.sd <- sqrt(myRR.overall.Var)
myRR.left.bound <- myRR.overall-1.96*sqrt(myRR.overall.Var)
myRR.right.bound <- myRR.overall+1.96*sqrt(myRR.overall.Var)
return(list(RR=exp(myRR.overall), RR.left=exp(myRR.left.bound), RR.right=exp(myRR.right.bound)))
}
# Dataset from Bellamy (2009) Lancet
y1<-c(6628,22,0,150,1,16,7,8,0,0,0,1,0,1,7,0,0,3,18,0)
n1<-c(637341,868,39,2242,111,783,108,489,11,435,70,61,52,39,431,35,57,47,328,41)
y2<-c(2874,71,21,43,53,405,6,13,7,23,44,21,10,15,105,10,33,14,224,5)
n2<-c(21823,620,68,166,295,5470,70,35,23,435,696,229,28,45,801,15,241,47,615,145)
# remove the first study with extremely large sample size
#y1 <- y1[-1]; n1 <- n1[-1]; y2 <- y2[-1]; n2 <- n2[-1]
init.val <- rep(0, 5)
# maximization of the likelhood of independent beta-binomialmodel
flag <- 0 # flag = 0: independent beta-binomial model
results.indep <- optim(init.val[1:4], myLik, method = ''L-BFGS-B'', lower=rep(-20,4), upper=rep(20,4),control = list(fnscale=-1,maxit=1000), hessian = T, mydat=list(y1=y1,n1=n1,y2=y2,n2=n2))
RR.comp.log(par.cal(results.indep$par), results.indep$hessian)
# maximization of the likelhood of Sarmanov beta-binomial model
flag <- 1 # flag = 1: Sarmanov beta-binomial model
results <- optim(init.val, myLik, method = ''L-BFGS-B'', lower=rep(-20,5), upper=rep(20,5), control = list(fnscale=-1,maxit=1000), hessian=T, mydat=list(y1=y1,n1=n1,y2=y2,n2=n2))
RR.comp.log(par.cal(results$par), results$hessian)