Abstract
Our goal is to model the joint distribution of a series of contingency tables for which some of the data are partially collapsed (i.e., aggregated in as few as two dimensions). More specifically, we estimate the joint distribution of four clinical characteristics in breast cancer patients: estrogen receptor status (positive/negative), nodal involvement (positive/negative), HER2-neu expression (positive/negative), and stage of disease (I, II, III, IV). The joint distribution of the first three characteristics is estimated conditional on stage of disease, and we propose a dynamic model for the conditional probabilities that lets them evolve as the stage of disease progresses. The dynamic model is based on a series of Dirichlet distributions whose parameters are related by a Markov prior structure (called a dynamic Dirichlet prior). This model makes use of information across disease stages (known as “borrowing strength”) and provides a way of estimating the distribution of patients with particular tumor characteristics. In addition, since some of the data sources are aggregated, a data augmentation technique is proposed to carry out a meta-analysis of the different datasets.
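To make the data augmentation idea concrete, the following is a minimal sketch for a single disease stage, not the model developed in this paper. It assumes a fully observed 2x2x2 table, a second table collapsed over the third factor, and a plain Dirichlet prior on the eight cell probabilities; the Markov link between stages is deliberately left out. The function names `augment_collapsed_table` and `gibbs_step` and all inputs are hypothetical and chosen only for illustration.

```python
import numpy as np


def augment_collapsed_table(collapsed, cell_probs, rng):
    """Impute a full 2x2x2 table from a 2x2 table collapsed over the third axis.

    collapsed  : (2, 2) integer counts, e.g. ER status x nodal involvement
    cell_probs : (2, 2, 2) current cell probabilities (assumed to sum to 1)
    """
    full = np.zeros((2, 2, 2), dtype=int)
    for i in range(2):
        for j in range(2):
            # Conditional distribution of the unobserved third factor given (i, j).
            cond = cell_probs[i, j] / cell_probs[i, j].sum()
            # Split the collapsed count across the two latent cells.
            full[i, j] = rng.multinomial(collapsed[i, j], cond)
    return full


def gibbs_step(complete, collapsed, cell_probs, prior_alpha, rng):
    """One data augmentation sweep: impute latent counts, then redraw probabilities."""
    imputed = augment_collapsed_table(collapsed, cell_probs, rng)
    # Conjugate update: Dirichlet prior plus observed and imputed counts.
    post_alpha = prior_alpha + complete + imputed
    new_probs = rng.dirichlet(post_alpha.ravel()).reshape(2, 2, 2)
    return new_probs, imputed
```

Iterating `gibbs_step` alternates between imputing the unobserved cell counts and redrawing the cell probabilities, so that the aggregated source contributes to the estimate without its missing dimension being observed directly.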
Acknowledgments
Dr. Bekele's work was supported in part by NIH grant 3U01 CA088278. Dr. Nieto-Barajas's work was partially supported by a grant from The Fulbright-García Robles Program. Mr. Munsell's work was supported by NIH contract 1HHSN2612007005. We also thank Shana Palla and Graciela Nogueras-Gonzalez for assistance in data management of the SEER and M.D. Anderson data sets and Dr. Richard Theriault for giving us permission to use the M.D. Anderson data.