Theory and Methods

Pair Copula Constructions for Multivariate Discrete Data

Anastasios Panagiotelis, Claudia Czado & Harry Joe
Pages 1063-1072 | Received 01 May 2011, Published online: 08 Oct 2012
 

Abstract

Multivariate discrete response data can be found in diverse fields, including econometrics, finance, biometrics, and psychometrics. Our contribution, through this study, is to introduce a new class of models for multivariate discrete data based on pair copula constructions (PCCs) that has two major advantages. First, by deriving the conditions under which any multivariate discrete distribution can be decomposed as a PCC, we show that discrete PCCs attain highly flexible dependence structures. Second, the computational burden of evaluating the likelihood for an m-dimensional discrete PCC only grows quadratically with m. This compares favorably to existing models for which computing the likelihood either requires the evaluation of 2^m terms or slow numerical integration methods. We demonstrate the high quality of inference function for margins and maximum likelihood estimates, both under a simulated setting and for an application to a longitudinal discrete dataset on headache severity. This article has online supplementary material.
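The computational contrast drawn in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It evaluates one joint probability of a discrete vector as a signed sum of a copula over the 2^m corners of a box, which is the exponential-cost route the abstract refers to. The function names are hypothetical, the independence copula is used only because its answer is easy to verify by hand, and the PCC count assumes m(m-1)/2 pair copulas that are each evaluated at four points.

```python
from itertools import product


def independence_copula(u):
    """C(u_1, ..., u_m) = u_1 * ... * u_m (stand-in copula, chosen for easy checking)."""
    out = 1.0
    for x in u:
        out *= x
    return out


def pmf_by_finite_differences(copula, upper, lower):
    """Pr(Y = y) as a signed sum of the copula over the 2^m corners of a box.

    upper[j] = F_j(y_j) and lower[j] = F_j(y_j - 1) are the marginal CDF
    values just at and just below the observed value of component j.
    Returns the probability and the number of copula evaluations used.
    """
    m = len(upper)
    total = 0.0
    n_terms = 0
    for picks in product((0, 1), repeat=m):
        # picks[j] == 1 selects the lower CDF value for component j
        corner = [lower[j] if picks[j] else upper[j] for j in range(m)]
        sign = -1 if sum(picks) % 2 else 1
        total += sign * copula(corner)
        n_terms += 1
    return total, n_terms


def pcc_evaluation_count(m):
    """Bivariate copula evaluations for an m-dimensional discrete PCC.

    Assumption: m(m-1)/2 pair copulas, each evaluated at 4 points.
    """
    return 2 * m * (m - 1)


# Bernoulli margins with Pr(Y_j = 0) = 0.3 and the realization y = 01010
upper = [0.3, 1.0, 0.3, 1.0, 0.3]
lower = [0.0, 0.3, 0.0, 0.3, 0.0]
prob, n_terms = pmf_by_finite_differences(independence_copula, upper, lower)
print(prob, n_terms, pcc_evaluation_count(5))
```

Under these assumptions the two counts are close for m = 5 (32 against 40), but for m = 20 the box sum needs 1,048,576 copula evaluations while the quadratic count is 760.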

SUPPLEMENTARY MATERIALS

The supplementary materials include an algorithm for generating from a discrete D-vine and additional materials on the application to headache severity data, including an out-of-sample validation.

Anastasios Panagiotelis acknowledges the support of the Alexander von Humboldt Foundation, Claudia Czado is partially supported by the German Research Foundation grant (CZ86_1_3), and Harry Joe is supported by an NSERC (Natural Sciences and Engineering Research Council of Canada) Discovery grant. The numerical computations were performed on a Linux cluster supported by a DFG (Deutsche Forschungsgemeinschaft: German Research Foundation) grant INST 95/919-1 FUGG. The authors also acknowledge the Associate Editor and an anonymous referee for helpful comments.

Notes

Although the MTCJ copula should be attributed to Mardia, Takahashi, Cook, and Johnson (see Joe, Li, and Nikoloulopoulos 2010 for a detailed discussion), the bivariate version is more commonly referred to as the Clayton copula. We use the name MTCJ/Clayton throughout the article as a compromise.

Table 1 Summary of parameter values for the five-dimensional D-vine with Bernoulli margins discussed in Section 3.5 and used for the simulation study in Section 4.2

We thank a referee for pointing this out to us.

NOTE: The leftmost column describes the realization of Y; for example, “01010” denotes Y1 = 0, Y2 = 1, Y3 = 0, Y4 = 1, and Y5 = 0. Here, “low Pr(Yj = 0)” refers to the case where Pr(Yj = 0) = 0.3 for all j, “high Pr(Yj = 0)” refers to the case where Pr(Yj = 0) = 0.7 for all j, “low dependence” refers to the case where Kendall's τ = 0.3, 0.2, 0.1, 0.05 for pair copulas corresponding to the first, second, third, and fourth tree, respectively, and “high dependence” refers to the case where Kendall's τ = 0.7, 0.4, 0.3, 0.2 for pair copulas corresponding to the first, second, third, and fourth tree, respectively. In the rightmost column, the joint probabilities for independent margins are given for comparison.

NOTE: Results are averaged over 100 replications of data, each having sample size 300. Coverage refers to the proportion of simulations where a 95% bootstrapped confidence interval contains the true parameter value.

NOTE: The pair copulas on the first two trees are Gaussian with Kendall's τ = 0.3, while all other pair copulas are the independence copula. The sample size is 1000, and similar results were obtained for 30 replications of data generated from this model.
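The notes above specify dependence through Kendall's τ rather than through copula parameters. As a minimal sketch, the standard closed-form conversions for the two families named on this page (Gaussian and MTCJ/Clayton) can be written as follows. The function names are hypothetical, and the page does not say how the authors performed this conversion.

```python
import math


def gaussian_rho_from_tau(tau):
    """Correlation of a bivariate Gaussian copula with Kendall's tau: rho = sin(pi * tau / 2)."""
    return math.sin(math.pi * tau / 2.0)


def clayton_theta_from_tau(tau):
    """Parameter of a bivariate MTCJ/Clayton copula with Kendall's tau: theta = 2 * tau / (1 - tau)."""
    if not 0.0 < tau < 1.0:
        raise ValueError("this conversion assumes 0 < tau < 1")
    return 2.0 * tau / (1.0 - tau)


# Kendall's tau = 0.3, the value quoted for the Gaussian pair copulas above
print(gaussian_rho_from_tau(0.3))
print(clayton_theta_from_tau(0.3))
```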

NOTE: Values in parentheses are the estimated Kendall's τ of the copula and the corresponding lower/upper limits of the confidence intervals (CIs). Here, “M” denotes morning, “A” afternoon, “E” evening, and “N” night, so, for example, the pair copula for M and N given A and E describes the dependence between headache severity in the morning and night, conditional on headache severity in the afternoon and evening.

NOTE: The covariates that correspond to each of the coefficients (β's) are described in the online supplement. The figures in bold have 95% bootstrapped confidence intervals that do not contain 0.
