A trivariate Bernoulli regression model


Abstract

A trivariate Bernoulli regression model is proposed in this paper. There is extensive need for analysing repeated binary outcomes where correlated binary outcomes are obtained from repeated measures or longitudinal data. The proposed model is based on marginal and conditional probabilities as functions of covariates. The estimation and test procedures are shown. The tests include testing of hypotheses for first- and second-order associations among outcome variables. It is noteworthy that the test procedures for dependence in outcome variables can be demonstrated in terms of vectors of regression parameters for models on outcome variables. The proposed model can be extended for more than three correlated outcome variables conveniently.


PUBLIC INTEREST STATEMENT

This research work provides an important development in modelling correlated outcomes data and shows the test for different types of associations. This model is developed for three binary outcomes. In various fields, the presence of correlation in outcome variables poses formidable difficulty to model the relationship between potential explanatory variables and outcome variables. However, if the dependence in outcome variables is not considered in modelling, the relationships between explanatory and outcome variables may be affected to a large extent resulting in misleading results. This paper provides a trivariate Bernoulli regression model and tests for the potential relationships of first, second and third orders are proposed. The applications of these procedures will make the analysis of correlated binary outcomes possible with appropriate interpretation and underlying mechanism of relationships among outcome variables and between outcome and explanatory variables.

1. Introduction

The Bernoulli regression model is well established. The logistic regression model, which is based on the univariate Bernoulli distribution, has been one of the most extensively used techniques in various applications. Since the development of generalized linear models, it has been presented more explicitly using the logit link function from the exponential family of distributions for binary data. There have been attempts to develop bivariate and multivariate Bernoulli regression models. Some noteworthy works were presented by McCullagh and Nelder (1989), Glonek and McCullagh (1995), Yee and Dirnbock (2009) and Dai, Ding, and Wahba (2013). McCullagh and Nelder (1989) considered the proportional odds model as a starting point for constructing a proportional odds model for three ordinal categories. Glonek and McCullagh (1995) generalized the model for several categorical responses. Alternative conditional models based on Markovian assumptions were proposed by Islam and Chowdhury (2006, 2008, 2017), Islam, Chowdhury, and Huda (2009) and Islam et al. (2012a). On the other hand, Dai et al. (2013) showed a multivariate Bernoulli model to estimate the structure of graphs with binary nodes. Islam, Alzaid, Chowdhury, and Sultan (2013) proposed an alternative procedure using a marginal-conditional approach to construct a bivariate Bernoulli model and provided tests for dependence in outcome variables. In this paper, a trivariate Bernoulli regression model is shown using the marginal-conditional approach. Tests for dependence in trivariate models are also shown. The proposed model can be used extensively in longitudinal studies with trivariate binary outcomes in various fields.

2. Trivariate Bernoulli distribution

Marshall and Olkin (1985) showed the bivariate Bernoulli form of $Y_{1}$ and $Y_{2}$ with Bernoulli marginals. Let us consider three binary variables $Y_{1}$, $Y_{2}$ and $Y_{3}$, which, in longitudinal studies, can be considered as the status of outcome variables at time points $T_{1}$, $T_{2}$ and $T_{3}$, respectively. The probability distribution is displayed below:

\begin{array}{cc|cc|c}
Y_{1} & Y_{2} & Y_{3} = 0 & Y_{3} = 1 & \text{Total} \\ \hline
0 & 0 & p_{000} & p_{001} & p_{00 \cdot} \\
0 & 1 & p_{010} & p_{011} & p_{01 \cdot} \\
1 & 0 & p_{100} & p_{101} & p_{10 \cdot} \\
1 & 1 & p_{110} & p_{111} & p_{11 \cdot} \\ \hline
\text{Total} & & p_{\cdot \cdot 0} & p_{\cdot \cdot 1} & 1
\end{array}

and the trivariate Bernoulli probability can be represented as follows

(1)

\begin{aligned} P (Y_{1} = y_{1}, Y_{2} = y_{2}, Y_{3} = y_{3}) = {} & p_{000}^{(1 - y_{1}) (1 - y_{2}) (1 - y_{3})} p_{001}^{(1 - y_{1}) (1 - y_{2}) y_{3}} p_{010}^{(1 - y_{1}) y_{2} (1 - y_{3})} p_{100}^{y_{1} (1 - y_{2}) (1 - y_{3})} \\ & \times p_{011}^{(1 - y_{1}) y_{2} y_{3}} p_{101}^{y_{1} (1 - y_{2}) y_{3}} p_{110}^{y_{1} y_{2} (1 - y_{3})} p_{111}^{y_{1} y_{2} y_{3}} . \end{aligned}

The above trivariate Bernoulli distribution can easily be written in exponential family form. Taking logarithms, the log-likelihood for n = 1 is

(2)

\begin{aligned} l = {} & y_{1} \ln \left(\frac{p_{100}}{p_{000}}\right) + y_{2} \ln \left(\frac{p_{010}}{p_{000}}\right) + y_{3} \ln \left(\frac{p_{001}}{p_{000}}\right) + y_{1} y_{2} \ln \left(\frac{p_{110} p_{000}}{p_{100} p_{010}}\right) \\ & + y_{1} y_{3} \ln \left(\frac{p_{101} p_{000}}{p_{100} p_{001}}\right) + y_{2} y_{3} \ln \left(\frac{p_{011} p_{000}}{p_{010} p_{001}}\right) + y_{1} y_{2} y_{3} \ln \left(\frac{p_{100} p_{010} p_{001} p_{111}}{p_{110} p_{011} p_{101} p_{000}}\right) + \ln p_{000} . \end{aligned}

The natural link functions are:

(3)

\begin{aligned} & θ_{1} = \ln \left(\frac{p_{100}}{p_{000}}\right), \quad θ_{2} = \ln \left(\frac{p_{010}}{p_{000}}\right), \quad θ_{3} = \ln \left(\frac{p_{001}}{p_{000}}\right), \\ & θ_{12} = \ln \left(\frac{p_{110} p_{000}}{p_{100} p_{010}}\right), \quad θ_{13} = \ln \left(\frac{p_{101} p_{000}}{p_{100} p_{001}}\right), \quad θ_{23} = \ln \left(\frac{p_{011} p_{000}}{p_{010} p_{001}}\right), \\ & θ_{123} = \ln \left(\frac{p_{100} p_{010} p_{001} p_{111}}{p_{110} p_{011} p_{101} p_{000}}\right), \quad θ_{0} = \ln p_{000} . \end{aligned}
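The correspondence between (1), (2) and (3) can be checked numerically. The following is a minimal sketch, not part of the original derivation: the cell probabilities are arbitrary illustrative values, and the function names `natural_parameters` and `log_likelihood` are hypothetical. It computes the natural parameters in (3) and confirms that the log-likelihood (2) reproduces the logarithm of the cell probability in (1) for each of the eight outcomes.

```python
import math
from itertools import product


def natural_parameters(p):
    """Natural parameters of Equation (3) from joint probabilities p[(y1, y2, y3)]."""
    ln = math.log
    return {
        "0": ln(p[0, 0, 0]),
        "1": ln(p[1, 0, 0] / p[0, 0, 0]),
        "2": ln(p[0, 1, 0] / p[0, 0, 0]),
        "3": ln(p[0, 0, 1] / p[0, 0, 0]),
        "12": ln(p[1, 1, 0] * p[0, 0, 0] / (p[1, 0, 0] * p[0, 1, 0])),
        "13": ln(p[1, 0, 1] * p[0, 0, 0] / (p[1, 0, 0] * p[0, 0, 1])),
        "23": ln(p[0, 1, 1] * p[0, 0, 0] / (p[0, 1, 0] * p[0, 0, 1])),
        "123": ln(p[1, 0, 0] * p[0, 1, 0] * p[0, 0, 1] * p[1, 1, 1]
                  / (p[1, 1, 0] * p[0, 1, 1] * p[1, 0, 1] * p[0, 0, 0])),
    }


def log_likelihood(y, theta):
    """Log-likelihood of Equation (2) for a single observation y = (y1, y2, y3)."""
    y1, y2, y3 = y
    return (theta["0"] + y1 * theta["1"] + y2 * theta["2"] + y3 * theta["3"]
            + y1 * y2 * theta["12"] + y1 * y3 * theta["13"] + y2 * y3 * theta["23"]
            + y1 * y2 * y3 * theta["123"])


# Illustrative cell probabilities (assumed values, not data from the paper).
cells = list(product((0, 1), repeat=3))
p = {cell: w / 36 for cell, w in zip(cells, range(1, 9))}

theta = natural_parameters(p)
# Largest discrepancy between Equation (2) and ln p_{y1 y2 y3} over all cells.
max_gap = max(abs(log_likelihood(y, theta) - math.log(p[y])) for y in cells)
```

In this sketch `max_gap` is zero up to floating-point rounding, which is what the exponential family representation implies.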

Islam, Alzaid, Chowdhury, and Sultan (2013) showed for the bivariate Bernoulli regression model that underlying relationships can be explored more conveniently if a marginal-conditional approach is employed. We can use the underlying conditional and marginal models for expressing the association parameters in the proposed model. The bivariate Bernoulli model is comprised of 3 ($2^{2} - 1$) models, two conditional and one marginal. Extension to the trivariate Bernoulli shows that there are 7 ($2^{3} - 1$) models. It appears from the above link functions that there are underlying relationships among the three first-order, three second-order and one third-order link functions that emerge from the natural link functions. The first-order link functions are simply the log odds of the respective cell probability for an outcome variable against the baseline probability of non-occurrence of any of the outcomes at the three time points.

3. The marginal-conditional models for trivariate Bernoulli

We can express the joint probability of three outcome variables Y₁, Y₂ and Y₃ for given X as follows:

(4)

P (Y_{1} = y_{1}, Y_{2} = y_{2}, Y_{3} = y_{3} | X) = P (Y_{1} = y_{1} | X) P (Y_{2} = y_{2} | y_{1}, X) P (Y_{3} = y_{3} | y_{1}, y_{2}, X) .

Here $P (Y_{1} = y_{1} | X) = π_{y_{1}} (X)$ is the marginal probability for Y₁, $P (Y_{2} = y_{2} | y_{1}, X) = π_{y_{1} y_{2}} (X)$ is the conditional probability for Y₂ given Y₁, $P (Y_{3} = y_{3} | y_{1}, y_{2}, X) = π_{y_{1} y_{2} y_{3}} (X)$ is the conditional probability for Y₃ given Y₁ and Y₂, and $X = (1, X_{1}, \dots, X_{p})$. Using the relationship shown in Equation (4), the joint probabilities are

(5)

\begin{aligned} p_{000} (X) & = [1 - π_{1} (X)] [1 - π_{01} (X)] [1 - π_{001} (X)] \\ p_{100} (X) & = π_{1} (X) [1 - π_{11} (X)] [1 - π_{101} (X)] \\ p_{010} (X) & = [1 - π_{1} (X)] π_{01} (X) [1 - π_{011} (X)] \\ p_{001} (X) & = [1 - π_{1} (X)] [1 - π_{01} (X)] π_{001} (X) \\ p_{110} (X) & = π_{1} (X) π_{11} (X) [1 - π_{111} (X)] \\ p_{011} (X) & = [1 - π_{1} (X)] π_{01} (X) π_{011} (X) \\ p_{101} (X) & = π_{1} (X) [1 - π_{11} (X)] π_{101} (X) \\ p_{111} (X) & = π_{1} (X) π_{11} (X) π_{111} (X) \end{aligned}
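The relationships in (5) translate directly into code. The sketch below is illustrative only: the function name `joint_probabilities` and the numerical values of the marginal and conditional probabilities are assumptions, chosen to show that the eight joint probabilities obtained from one marginal and six conditional probabilities sum to one.

```python
def joint_probabilities(pi1, pi01, pi11, pi001, pi011, pi101, pi111):
    """Joint cell probabilities of (Y1, Y2, Y3) following Equation (5).

    pi1 is P(Y1 = 1), pi01 and pi11 are P(Y2 = 1 | Y1 = 0 or 1), and
    pi001, pi011, pi101, pi111 are P(Y3 = 1 | Y1, Y2).
    """
    return {
        (0, 0, 0): (1 - pi1) * (1 - pi01) * (1 - pi001),
        (1, 0, 0): pi1 * (1 - pi11) * (1 - pi101),
        (0, 1, 0): (1 - pi1) * pi01 * (1 - pi011),
        (0, 0, 1): (1 - pi1) * (1 - pi01) * pi001,
        (1, 1, 0): pi1 * pi11 * (1 - pi111),
        (0, 1, 1): (1 - pi1) * pi01 * pi011,
        (1, 0, 1): pi1 * (1 - pi11) * pi101,
        (1, 1, 1): pi1 * pi11 * pi111,
    }


# Assumed illustrative probabilities at a fixed covariate value.
p = joint_probabilities(0.4, 0.3, 0.6, 0.2, 0.5, 0.35, 0.7)
total = sum(p.values())  # equals 1 up to floating-point rounding
```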

The marginal model $π_{1} (X)$ is

(6)

π_{1} (X) = \frac{e^{X β_{1}}}{1 + e^{X β_{1}}}

where $β_{1} = (β_{10}, β_{11}, . . ., β_{1 p})$ and $π_{0} (X) = \frac{1}{1 + e^{X β_{1}}}$ as $π_{0} (X) + π_{1} (X) = 1$ .

The conditional probabilities can be obtained from the first- and second-order Markov models with covariate dependence (Islam et al., 2012b, 2012a; Islam et al., 2013; Islam & Chowdhury, 2017). The conditional models for first-order Markov chain transition probabilities in the relationships (5) are displayed in Table 1 below for outcome variables Y₁ and Y₂:

Table 1. Transition probabilities for outcome variables Y₁ and Y₂

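\begin{array}{c|cc}
Y_{1} & Y_{2} = 0 & Y_{2} = 1 \\ \hline
0 & π_{00} (X) & π_{01} (X) \\
1 & π_{10} (X) & π_{11} (X)
\end{array}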

The transition probabilities are functions of covariates as displayed below

(7)

π_{01} (X) = \frac{e^{X β_{01}}}{1 + e^{X β_{01}}},

(8)

π_{11} (X) = \frac{e^{X β_{11}}}{1 + e^{X β_{11}}}

where $β_{01} = (β_{010}, β_{011}, \dots, β_{01 p})$, $β_{11} = (β_{110}, β_{111}, \dots, β_{11 p})$,

π_{00} (X) = \frac{1}{1 + e^{X β_{01}}}, \quad π_{10} (X) = \frac{1}{1 + e^{X β_{11}}},

so that $π_{00} (X) + π_{01} (X) = 1$ and $π_{10} (X) + π_{11} (X) = 1$.

Four second-order conditional models, $π_{001} (X)$, $π_{011} (X)$, $π_{101} (X)$ and $π_{111} (X)$, are needed for the trivariate Bernoulli model. The conditional probabilities for second-order Markov models satisfy $π_{000} (X) + π_{001} (X) = 1$, $π_{010} (X) + π_{011} (X) = 1$, $π_{100} (X) + π_{101} (X) = 1$ and $π_{110} (X) + π_{111} (X) = 1$.

The second-order transition probabilities are shown in Table 2:

Table 2. Transition probabilities for outcome variables Y₁, Y₂ and Y₃

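\begin{array}{cc|cc}
Y_{1} & Y_{2} & Y_{3} = 0 & Y_{3} = 1 \\ \hline
0 & 0 & π_{000} (X) & π_{001} (X) \\
0 & 1 & π_{010} (X) & π_{011} (X) \\
1 & 0 & π_{100} (X) & π_{101} (X) \\
1 & 1 & π_{110} (X) & π_{111} (X)
\end{array}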

The covariate-dependent second-order models are

(9)

π_{001} (X) = \frac{e^{X β_{001}}}{1 + e^{X β_{001}}}, \quad π_{000} (X) = \frac{1}{1 + e^{X β_{001}}},

(10)

π_{011} (X) = \frac{e^{X β_{011}}}{1 + e^{X β_{011}}}, \quad π_{010} (X) = \frac{1}{1 + e^{X β_{011}}},

(11)

π_{101} (X) = \frac{e^{X β_{101}}}{1 + e^{X β_{101}}}, \quad π_{100} (X) = \frac{1}{1 + e^{X β_{101}}}

and

(12)

π_{111} (X) = \frac{e^{X β_{111}}}{1 + e^{X β_{111}}}, \quad π_{110} (X) = \frac{1}{1 + e^{X β_{111}}} .

Here, $β_{001} = (β_{0010}, β_{0011}, \dots, β_{001 p})$, $β_{011} = (β_{0110}, β_{0111}, \dots, β_{011 p})$, $β_{101} = (β_{1010}, β_{1011}, \dots, β_{101 p})$ and $β_{111} = (β_{1110}, β_{1111}, \dots, β_{111 p})$.
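Because the joint distribution is built from one marginal and six conditional logistic models, an outcome vector can be generated sequentially: first Y₁ from (6), then Y₂ from (7) or (8) according to the realized Y₁, then Y₃ from one of (9)–(12). The sketch below illustrates this under stated assumptions: a single covariate (p = 1) drawn from a standard normal distribution, hypothetical coefficient values, and hypothetical helper names `expit` and `simulate_outcome`. None of these come from the paper.

```python
import math
import random


def expit(z):
    """Logistic function appearing in Equations (6)-(12)."""
    return 1.0 / (1.0 + math.exp(-z))


def simulate_outcome(x, beta, rng):
    """Draw (y1, y2, y3) sequentially; x = (1, x_1, ..., x_p) includes the constant."""
    def prob(key):
        return expit(sum(b * v for b, v in zip(beta[key], x)))

    y1 = int(rng.random() < prob("1"))            # marginal model (6)
    y2 = int(rng.random() < prob(f"{y1}1"))       # model (7) if y1 = 0, (8) if y1 = 1
    y3 = int(rng.random() < prob(f"{y1}{y2}1"))   # one of models (9)-(12)
    return y1, y2, y3


# Hypothetical (intercept, slope) pairs for p = 1; illustrative values only.
beta = {
    "1": (-0.2, 0.5),
    "01": (-0.8, 0.3),
    "11": (0.6, 0.3),
    "001": (-1.0, 0.4),
    "011": (0.1, 0.4),
    "101": (-0.3, 0.4),
    "111": (0.9, 0.4),
}

rng = random.Random(2024)
sample = [simulate_outcome((1.0, rng.gauss(0.0, 1.0)), beta, rng) for _ in range(2000)]
share_y1 = sum(y[0] for y in sample) / len(sample)  # empirical P(Y1 = 1)
```

Such simulated data can serve as a check on an estimation routine, since the fitted coefficients should be close to the values used for generation when the sample is large.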

4. Link functions and estimating equations for trivariate Bernoulli model

The link functions for trivariate Bernoulli regression models are displayed in (3). In Section 3, we have shown the relationship between joint probabilities and marginal-conditional probabilities. The conditional probabilities are obtained from the first- and second-order transition probabilities. We have introduced seven logistic regression models: one marginal model in Equation (6) and six conditional models in Equations (7)–(12). Using the relationships between joint and marginal-conditional probabilities, we can redefine the link functions, which are summarized below:

(13)

θ_{0} = \ln p_{000} = - \ln (1 + e^{X β_{1}}) - \ln (1 + e^{X β_{01}}) - \ln (1 + e^{X β_{001}}),

(14)

θ_{1} = \ln \left(\frac{p_{100}}{p_{000}}\right) = X β_{1} - \{\ln (1 + e^{X β_{11}}) - \ln (1 + e^{X β_{01}})\} - \{\ln (1 + e^{X β_{101}}) - \ln (1 + e^{X β_{001}})\},

(15)

θ_{2} = \ln \left(\frac{p_{010}}{p_{000}}\right) = X β_{01} - \{\ln (1 + e^{X β_{011}}) - \ln (1 + e^{X β_{001}})\},

(16)

θ_{3} = \ln \left(\frac{p_{001}}{p_{000}}\right) = X β_{001},

(17)

\begin{aligned} θ_{12} = \ln \left(\frac{p_{110} p_{000}}{p_{100} p_{010}}\right) = {} & X β_{11} - X β_{01} - \ln (1 + e^{X β_{111}}) + \ln (1 + e^{X β_{011}}) \\ & + \ln (1 + e^{X β_{101}}) - \ln (1 + e^{X β_{001}}), \end{aligned}

(18)

θ_{13} = \ln \left(\frac{p_{101} p_{000}}{p_{100} p_{001}}\right) = X β_{101} - X β_{001},

(19)

θ_{23} = \ln \left(\frac{p_{011} p_{000}}{p_{010} p_{001}}\right) = X β_{011} - X β_{001},

(20)

θ_{123} = \ln \left(\frac{p_{111} p_{100} p_{010} p_{001}}{p_{110} p_{011} p_{101} p_{000}}\right) = X β_{111} + X β_{001} - X β_{101} - X β_{011} .
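The identities (13)–(20) can be verified numerically by computing each link function twice: once from the joint probabilities in (5) and once from the linear predictors. The following is a sketch under assumptions: the values of the linear predictors $X β_{s}$ are arbitrary, and the names `eta`, `theta_direct` and `theta_models` are introduced here for illustration only.

```python
import math


def expit(z):
    return 1.0 / (1.0 + math.exp(-z))


def log1pexp(z):
    """ln(1 + e^z), the term that recurs in Equations (13)-(17)."""
    return math.log(1.0 + math.exp(z))


# Assumed values of the linear predictors X beta_s at one covariate value.
eta = {"1": 0.3, "01": -0.5, "11": 0.8, "001": -1.0, "011": 0.2, "101": 0.4, "111": 1.1}
pi = {s: expit(v) for s, v in eta.items()}

# Joint probabilities from Equation (5).
p = {
    (0, 0, 0): (1 - pi["1"]) * (1 - pi["01"]) * (1 - pi["001"]),
    (1, 0, 0): pi["1"] * (1 - pi["11"]) * (1 - pi["101"]),
    (0, 1, 0): (1 - pi["1"]) * pi["01"] * (1 - pi["011"]),
    (0, 0, 1): (1 - pi["1"]) * (1 - pi["01"]) * pi["001"],
    (1, 1, 0): pi["1"] * pi["11"] * (1 - pi["111"]),
    (0, 1, 1): (1 - pi["1"]) * pi["01"] * pi["011"],
    (1, 0, 1): pi["1"] * (1 - pi["11"]) * pi["101"],
    (1, 1, 1): pi["1"] * pi["11"] * pi["111"],
}

ln = math.log
# Link functions computed directly from the joint probabilities, Equation (3).
theta_direct = {
    "0": ln(p[0, 0, 0]),
    "1": ln(p[1, 0, 0] / p[0, 0, 0]),
    "2": ln(p[0, 1, 0] / p[0, 0, 0]),
    "3": ln(p[0, 0, 1] / p[0, 0, 0]),
    "12": ln(p[1, 1, 0] * p[0, 0, 0] / (p[1, 0, 0] * p[0, 1, 0])),
    "13": ln(p[1, 0, 1] * p[0, 0, 0] / (p[1, 0, 0] * p[0, 0, 1])),
    "23": ln(p[0, 1, 1] * p[0, 0, 0] / (p[0, 1, 0] * p[0, 0, 1])),
    "123": ln(p[1, 1, 1] * p[1, 0, 0] * p[0, 1, 0] * p[0, 0, 1]
              / (p[1, 1, 0] * p[0, 1, 1] * p[1, 0, 1] * p[0, 0, 0])),
}

L = {s: log1pexp(v) for s, v in eta.items()}
# The same link functions expressed through the regression models, (13)-(20).
theta_models = {
    "0": -L["1"] - L["01"] - L["001"],
    "1": eta["1"] - (L["11"] - L["01"]) - (L["101"] - L["001"]),
    "2": eta["01"] - (L["011"] - L["001"]),
    "3": eta["001"],
    "12": eta["11"] - eta["01"] - L["111"] + L["011"] + L["101"] - L["001"],
    "13": eta["101"] - eta["001"],
    "23": eta["011"] - eta["001"],
    "123": eta["111"] + eta["001"] - eta["101"] - eta["011"],
}

max_gap = max(abs(theta_direct[k] - theta_models[k]) for k in theta_direct)
```

In this sketch the two computations agree up to rounding error for all eight link functions.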

The estimating equations are obtained by differentiating the log-likelihood function (2) with respect to the regression parameters of the seven marginal and conditional models. It may be noted here that, for simplicity, the summation over i = 1,…,n is omitted and the equations are shown for n = 1. The estimating equations are:

(21)

\left[\frac{\partial l}{\partial β_{s k}}\right]^{'} = \begin{bmatrix} \frac{\partial l}{\partial β_{1 k}} \\ \frac{\partial l}{\partial β_{01 k}} \\ \frac{\partial l}{\partial β_{11 k}} \\ \frac{\partial l}{\partial β_{001 k}} \\ \frac{\partial l}{\partial β_{011 k}} \\ \frac{\partial l}{\partial β_{101 k}} \\ \frac{\partial l}{\partial β_{111 k}} \end{bmatrix} = \begin{bmatrix} X_{k} (y_{1} - π_{1} (X)) \\ X_{k} [(1 - y_{1}) (y_{2} - π_{01} (X))] \\ y_{1} X_{k} (y_{2} - π_{11} (X)) \\ X_{k} [(1 - y_{1} - y_{2} + y_{1} y_{2}) (y_{3} - π_{001} (X))] \\ y_{2} X_{k} [(1 - y_{1}) (y_{3} - π_{011} (X))] \\ y_{1} X_{k} [(1 - y_{2}) (y_{3} - π_{101} (X))] \\ y_{1} y_{2} X_{k} (y_{3} - π_{111} (X)) \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \\ 0 \\ 0 \\ 0 \\ 0 \end{bmatrix}, \quad k = 0, 1, \dots, p .

where $X_{k}$, $k = 1, \dots, p$, is the kth explanatory variable corresponding to the coefficient $β_{s k}$, $X_{0} = 1$, and s denotes the marginal and conditional models represented by 1, 01, 11, 001, 011, 101 and 111. The information matrix is comprised of information matrices for the seven sets of parameters for the marginal and conditional models. The $(k, k^{'})$th element of the information matrix for model s is $- \frac{\partial^{2} l}{\partial β_{s k} \partial β_{s k^{'}}}$. Let us denote by I the $7 (p + 1) \times 7 (p + 1)$ block-diagonal matrix containing seven $(p + 1) \times (p + 1)$ diagonal blocks as shown below:

(22)

I = \begin{bmatrix} I_{1} & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & I_{2} & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & I_{3} & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & I_{4} & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & I_{5} & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & I_{6} & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & I_{7} \end{bmatrix}

where

I_{1} = {[X_{k} X_{k^{'}} π_{1} (X) (1 - π_{1} (X)), k, k^{'} = 0, 1, \dots, p]}_{(p + 1) \times (p + 1)},

I_{2} = {[X_{k} X_{k^{'}} (1 - y_{1}) π_{01} (X) (1 - π_{01} (X)), k, k^{'} = 0, 1, \dots, p]}_{(p + 1) \times (p + 1)},

I_{3} = {[X_{k} X_{k^{'}} y_{1} π_{11} (X) (1 - π_{11} (X)), k, k^{'} = 0, 1, \dots, p]}_{(p + 1) \times (p + 1)},

I_{4} = {[X_{k} X_{k^{'}} (1 - y_{1} - y_{2} + y_{1} y_{2}) π_{001} (X) (1 - π_{001} (X)), k, k^{'} = 0, 1, \dots, p]}_{(p + 1) \times (p + 1)},

I_{5} = {[X_{k} X_{k^{'}} (1 - y_{1}) y_{2} π_{011} (X) (1 - π_{011} (X)), k, k^{'} = 0, 1, \dots, p]}_{(p + 1) \times (p + 1)},

I_{6} = {[X_{k} X_{k^{'}} y_{1} (1 - y_{2}) π_{101} (X) (1 - π_{101} (X)), k, k^{'} = 0, 1, \dots, p]}_{(p + 1) \times (p + 1)},

I_{7} = {[X_{k} X_{k^{'}} y_{1} y_{2} π_{111} (X) (1 - π_{111} (X)), k, k^{'} = 0, 1, \dots, p]}_{(p + 1) \times (p + 1)} .
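Since the information matrix is block diagonal, each of the seven sets of parameters can be estimated separately. The sketch below is one possible implementation, not the author's code: it applies Newton-Raphson iterations to a single sub-model, using the corresponding row of (21) as the score and the corresponding block of (22) as the information, both summed over observations. The helper names `solve` and `fit_submodel` and the small data set are hypothetical. The 0/1 weight plays the role of the indicator in (21), for example $(1 - y_{1})$ for s = 01.

```python
import math


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_submodel(rows, weights, responses, max_iter=50, tol=1e-10):
    """Newton-Raphson for one logistic sub-model s.

    rows: covariate vectors (1, x_1, ..., x_p); weights: 0/1 indicators that an
    observation belongs to the conditioning set of s; responses: modelled outcome.
    Returns the estimate and the information block from the final iteration.
    """
    k = len(rows[0])
    beta = [0.0] * k
    for _ in range(max_iter):
        score = [0.0] * k
        info = [[0.0] * k for _ in range(k)]
        for x, w, y in zip(rows, weights, responses):
            if w == 0:
                continue
            eta = sum(coef * v for coef, v in zip(beta, x))
            pi = 1.0 / (1.0 + math.exp(-eta))
            for i in range(k):
                score[i] += w * x[i] * (y - pi)
                for j in range(k):
                    info[i][j] += w * x[i] * x[j] * pi * (1.0 - pi)
        step = solve(info, score)
        beta = [coef + s for coef, s in zip(beta, step)]
        if max(abs(s) for s in step) < tol:
            break
    return beta, info


# Hypothetical records (x1, y1, y2) used to fit the model for s = 01.
records = []
for x1, successes, total in [(0.0, 3, 10), (1.0, 6, 10)]:
    records += [(x1, 0, 1)] * successes + [(x1, 0, 0)] * (total - successes)
records += [(0.0, 1, 1)] * 4 + [(1.0, 1, 0)] * 5  # y1 = 1, excluded by the weight

rows = [(1.0, x1) for x1, _, _ in records]
weights = [1 - y1 for _, y1, _ in records]
responses = [y2 for _, _, y2 in records]
beta_01, info_01 = fit_submodel(rows, weights, responses)
```

With a single binary covariate the sub-model is saturated, so the fitted probabilities reproduce the observed proportions 0.3 and 0.6 among observations with y₁ = 0.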

5. Tests for models and dependence of outcome variables

For trivariate Bernoulli regression models, we have seven sets of parameters: one set for the marginal model for Y₁ (Model 1 for s = 1), two sets for conditional models from first-order Markov chains for the transition from Y₁ to Y₂ (Models 1 and 2 for s = 01, 11, respectively) and four sets from the second-order Markov chain for outcomes Y₁, Y₂ and Y₃ (s = 001, 011, 101, 111, respectively). The likelihood ratio test for the overall joint model is

- 2 [\ln L (β_{0}) - \ln L (β)] \sim χ^{2}_{7 p}

where $β_{0} = (β_{s 0}, s = 1, 01, 11, 001, 011, 101, 111)$ and $β = (β_{s}, s = 1, 01, 11, 001, 011, 101, 111)$. Let $β^{*} = (β_{s}^{*}, s = 1, 01, 11, 001, 011, 101, 111)$ where $β_{s}^{*} = (β_{s 1}, \dots, β_{s p})$. The null hypothesis of the above test is $H_{0} : β^{*} = 0$.

Tests for the marginal and conditional models can be performed separately as well. In that case, the null hypothesis for each model is $H_{0} : β_{s}^{*} = 0$, s = 1, 01, 11, 001, 011, 101, 111. The likelihood ratio test for each model can be shown as

- 2 [\ln L (β_{s 0}) - \ln L (β_{s})] \sim χ^{2}_{p} .

The Wald test will be applied to test for significance of a parameter for each model.
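The likelihood ratio statistics above are simple functions of the maximized log-likelihoods. The sketch below is illustrative: the log-likelihood values are invented, and the helper names `chi2_sf` and `lr_test` are hypothetical. The chi-square tail probability is computed with the standard library only, using the closed forms for 1 and 2 degrees of freedom and the recurrence of the regularized upper incomplete gamma function for higher degrees of freedom.

```python
import math


def chi2_sf(x, df):
    """Upper tail probability of a chi-square variable with integer df >= 1."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        k, tail = 2, math.exp(-half)             # exact for df = 2
    else:
        k, tail = 1, math.erfc(math.sqrt(half))  # exact for df = 1
    # Q(a + 1, z) = Q(a, z) + z**a * exp(-z) / Gamma(a + 1), with a = k / 2, z = x / 2
    while k < df:
        a = k / 2.0
        tail += math.exp(a * math.log(half) - half - math.lgamma(a + 1.0))
        k += 2
    return tail


def lr_test(loglik_null, loglik_full, df):
    """Likelihood ratio statistic -2[ln L(null) - ln L(full)] and its p-value."""
    stat = -2.0 * (loglik_null - loglik_full)
    return stat, chi2_sf(stat, df)


# Hypothetical maximized log-likelihoods with p = 2 covariates (7p = 14 df overall).
p_covariates = 2
stat, p_value = lr_test(loglik_null=-120.5, loglik_full=-112.3, df=7 * p_covariates)
```

The same function applies to a single marginal or conditional model by passing that model's log-likelihoods and df = p.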

The tests for association parameters, $θ_{12}, θ_{13}, θ_{23}, θ_{123}$, can be performed based on the parameters of regression models (17)–(20). The null hypotheses and test statistics are displayed here:

(i) Test for independence of Y₂ and Y₃

The null hypothesis is

H_{01} : β_{001} = β_{011}

The test statistic for equality of parameters

A_{1} = {({\hat{β}}_{001} - {\hat{β}}_{011})}^{'} {[V ({\hat{β}}_{001} - {\hat{β}}_{011})]}^{- 1} ({\hat{β}}_{001} - {\hat{β}}_{011})

which is asymptotically chi-square with p degrees of freedom. It may be noted here that ${[V ({\hat{β}}_{001} - {\hat{β}}_{011})]}^{- 1} ≃ I_{4} + I_{5}$ .

(ii) Test for independence of Y₁ and Y₃

The null hypothesis is

H_{01} : β_{001} = β_{101}

The test statistic for equality of parameters

A_{2} = {({\hat{β}}_{001} - {\hat{β}}_{101})}^{'} {[V ({\hat{β}}_{001} - {\hat{β}}_{101})]}^{- 1} ({\hat{β}}_{001} - {\hat{β}}_{101})

which is asymptotically chi-square with p degrees of freedom. It may be noted here that ${[V ({\hat{β}}_{001} - {\hat{β}}_{101})]}^{- 1} ≃ I_{4} + I_{6}$ .

(iii) Test for independence of Y₁ and Y₂.

It appears from Equation (17) that independence of Y₁ and Y₂ depends not only on equality of regression parameters from Models 7 and 8 but also on equality of parameters of Models 9 and 11 and also Models 10 and 12.

The null hypotheses are

\begin{matrix} H_{01} : β_{01} = β_{11} \\ H_{02} : β_{001} = β_{101} \\ H_{03} : β_{011} = β_{111} . \end{matrix}

The test for equality of regression parameters from Models 7 and 8 ( $H_{01}$ ) can be performed using the following statistic

A_{3} = {({\hat{β}}_{01} - {\hat{β}}_{11})}^{'} {[V ({\hat{β}}_{01} - {\hat{β}}_{11})]}^{- 1} ({\hat{β}}_{01} - {\hat{β}}_{11})

The inverse of the variance is obtained approximately from

{[V ({\hat{β}}_{01} - {\hat{β}}_{11})]}^{- 1} ≃ I_{2} + I_{3}

Test for $H_{02}$ is shown in (ii). Similarly, the test for $H_{03}$ is based on Models 10 and 12 and the test statistic is

A_{4} = {({\hat{β}}_{011} - {\hat{β}}_{111})}^{'} {[V ({\hat{β}}_{011} - {\hat{β}}_{111})]}^{- 1} ({\hat{β}}_{011} - {\hat{β}}_{111})

where ${[V ({\hat{β}}_{011} - {\hat{β}}_{111})]}^{- 1} ≃ I_{5} + I_{7}$ .

(iv) Test for independence of Y₁, Y₂ and Y₃

It appears from Equation (20) that independence of Y₁, Y₂ and Y₃ depends not only on equality of regression parameters from Models 9 and 11 but also on equality of parameters of Models 10 and 12.

The null hypotheses are

\begin{matrix} H_{01} : β_{001} = β_{101} \\ H_{02} : β_{011} = β_{111} \end{matrix}

The first null hypothesis is shown in (ii), which is the test for independence of Y₁ and Y₃ (A₂), and the second null hypothesis is discussed in (iii) as the partial test for independence of Y₁ and Y₂ (A₄). In other words, independence of Y₁, Y₂ and Y₃ depends on both (a) conditional independence of Y₁ and Y₂ (A₄) and (b) independence of Y₁ and Y₃ (A₂). Independence of the three outcome variables thus depends on the pairwise conditional or unconditional independence of outcome variables examined in the previous tests.

An alternative null hypothesis follows directly from the model shown in Equation (20). Let us define $β_{\dots} = β_{111} + β_{001} - β_{101} - β_{011}$; then the independence of Y₁, Y₂ and Y₃ can be tested alternatively for the null hypothesis $H_{01} : β_{\dots} = 0$. We can test the null hypothesis using

A_{5} = {\hat{β}}_{\dots}^{'} {[V ({\hat{β}}_{\dots})]}^{- 1} {\hat{β}}_{\dots}

where ${[V ({\hat{β}}_{. . .})]}^{- 1} ≃ I_{4} + I_{5} + I_{6} + I_{7}$ .
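The statistics A₁ to A₄ share the same quadratic form in the difference of two estimated coefficient vectors. The sketch below computes such a statistic from two estimates and their covariance matrices. It rests on an assumption that should be stated clearly: the variance of the difference is taken to be the sum of the two covariance matrices, which holds when the two estimators are independent, and this is not the same computation as the approximation quoted in the text. The function names and all numerical inputs are hypothetical.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def equality_statistic(beta_a, beta_b, cov_a, cov_b):
    """Quadratic form d' V^{-1} d with d = beta_a - beta_b and V = cov_a + cov_b.

    Assumes the two estimators are independent, so that the covariance of the
    difference is the sum of the individual covariance matrices.
    """
    d = [u - v for u, v in zip(beta_a, beta_b)]
    size = len(d)
    v = [[cov_a[i][j] + cov_b[i][j] for j in range(size)] for i in range(size)]
    weighted = solve(v, d)  # V^{-1} d without forming the inverse explicitly
    return sum(di * wi for di, wi in zip(d, weighted))


# Hypothetical estimates and covariance matrices for two conditional models.
beta_001_hat = [0.2, 0.5]
beta_011_hat = [0.1, 0.9]
cov_001 = [[0.04, 0.0], [0.0, 0.09]]
cov_011 = [[0.05, 0.0], [0.0, 0.07]]

a1 = equality_statistic(beta_001_hat, beta_011_hat, cov_001, cov_011)
```

The resulting value would then be referred to a chi-square distribution with the degrees of freedom stated for the corresponding test.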

6. Concluding remarks

We need to analyse binary repeated measures data in many instances where the outcome variables are correlated. The modelling of correlated outcome variables has been of interest in many fields owing to the growing need to analyse repeated measures data in the presence of correlation among outcome variables, in addition to models for identifying explanatory variables associated with outcome variables. Several attempts have been made in the past to model such data, but because of the inherent complexity of modelling multivariate data with correlated outcomes it remained a challenge for a long time. This study shows an alternative approach based on a marginal-conditional formulation to describe a joint model and provides a set of marginal and conditional models that yield joint probabilities for the trivariate binary case. This procedure can be extended to more than three correlated outcomes using the same approach, which is not shown in this paper to keep the exposition simple. Several tests are displayed in this paper for testing the overall model for the trivariate Bernoulli as well as for examining dependence in outcome variables.

Acknowledgements

This work was supported by the HEQEP sub-project CP 3293 sponsored by UGC, Bangladesh and World Bank. I would like to express my gratitude to the anonymous reviewers for their helpful suggestions.

Additional information

Funding

This work was supported by the HEQEP sub-project [Grant Number CP 3293] sponsored by UGC, Bangladesh and World Bank.

Notes on contributors

M. Ataharul Islam

M. Ataharul Islam is currently the QM Husain Professor, ISRT, University of Dhaka. He was a former professor of statistics at the University Sains Malaysia, King Saud University, University of Dhaka and the East West university. He was a visiting faculty at the University of Hawaii and University of Pennsylvania. He is recipient of the Pauline Stitt Award, the WNAR Biometric Society Award for content and writing, University Grants Commission Award for book and research, Ibrahim Gold Medal for research, etc. He published more than 100 research papers in international journals on various topics, extensively on longitudinal and repeated measures data, including multistate and multistage hazards models, statistical models for repeated measures data, Markov models with covariate dependence, generalized linear models, conditional and joint models for correlated outcomes, etc. He authored several books either published or being published.

References

Dai, B., Ding, S., & Wahba, G. (2013). Multivariate Bernoulli distribution. Bernoulli, 19(4), 1465–1483. doi:10.3150/12-BEJSP10
Glonek, G. F. V., & McCullagh, P. (1995). Multivariate logistic models. Journal of the Royal Statistical Society, Series B (Methodological), 57, 533–546.
Islam, M. A., Alzaid, A. A., Chowdhury, R. I., & Sultan, K. S. (2013). A generalized bivariate Bernoulli model with covariate dependence. Journal of Applied Statistics, 40(5), 1064–1075. doi:10.1080/02664763.2013.780156
Islam, M. A., & Chowdhury, R. I. (2006). A higher order Markov model for analyzing covariate dependence. Applied Mathematical Modeling, 30, 477–488. doi:10.1016/j.apm.2005.05.006
Islam, M. A., & Chowdhury, R. I. (2008). Chapter 4: First and higher order transition models with covariate dependence. In F. Columbus (Ed.). Progress in applied mathematical modeling (pp. 153–196). Hauppauge, NY: Nova Science Publishers.
Islam, M. A., & Chowdhury, R. I. (2017). Analysis of repeated measures data. Singapore: Springer.
Islam, M. A., Chowdhury, R. I., & Alzaid, A. A. (2012b). Tests for dependence in binary repeated measures data. Journal of Statistical Research, 46, 203–217.
Islam, M. A., Chowdhury, R. I., & Briollais, L. (2012a). A bivariate binary model for testing dependence in outcomes. Bulletin of Malaysian Mathematical Sciences Society, 35, 845–858.
Islam, M. A., Chowdhury, R. I., & Huda, S. (2009). Markov models with covariate dependence for repeated measures. New York: Nova Science Publishers.
Marshall, A. W., & Olkin, I. (1985). A family of bivariate distributions generated by the bivariate Bernoulli distribution. Journal of the American Statistical Association, 80, 332–338. doi:10.1080/01621459.1985.10478116
McCullagh, P., & Nelder, J. A. (1989). Generalized linear models (2nd ed.). London: Chapman and Hall.
Yee, T. W., & Dirnbock, T. (2009). Models for analyzing species’ presence/absence data at two time points. Journal of Theoretical Biology, 259, 684–694. doi:10.1016/j.jtbi.2009.05.004
