Theory and Methods

A Common Atoms Model for the Bayesian Nonparametric Analysis of Nested Data

Pages 405-416 | Received 13 Aug 2020, Accepted 19 May 2021, Published online: 14 Jul 2021
 

Abstract

The use of large datasets for targeted therapeutic interventions requires new ways to characterize the heterogeneity observed across subgroups of a specific population. In particular, models for partially exchangeable data are needed for inference on nested datasets, where the observations are assumed to be organized in different units and some sharing of information is required to learn distinctive features of the units. In this manuscript, we propose a nested common atoms model (CAM) that is particularly suited for the analysis of nested datasets where the distributions of the units are expected to differ only over a small fraction of the observations sampled from each unit. The proposed CAM allows a two-layered clustering at the distributional and observational level and is amenable to scalable posterior inference through the use of a computationally efficient nested slice sampler algorithm. We further discuss how to extend the proposed modeling framework to handle discrete measurements, and we conduct posterior inference on a real microbiome dataset from a diet swap study to investigate how the alterations in intestinal microbiota composition are associated with different eating habits. We further investigate the performance of our model in capturing true distributional structures in the population by means of a simulation study.
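To make the two-layered clustering concrete, the following is a minimal generative sketch, not the authors' implementation. It assumes a truncated stick-breaking construction in which every distributional cluster places its own weights on one shared set of atoms. The Gaussian kernel, the truncation levels `K` and `L`, the concentration parameters `alpha` and `beta`, and the function names are all illustrative assumptions that do not come from the abstract.

```python
import random


def stick_breaking(concentration, truncation, rng):
    """Truncated stick-breaking weights; the last stick takes all remaining mass."""
    weights, remaining = [], 1.0
    for i in range(truncation):
        # Beta(1, concentration) stick proportions; force the final one to 1
        # so that the truncated weights sum to one.
        v = 1.0 if i == truncation - 1 else rng.betavariate(1.0, concentration)
        weights.append(remaining * v)
        remaining *= 1.0 - v
    return weights


def simulate_cam(n_units, n_obs, alpha=1.0, beta=1.0, K=10, L=15, seed=0):
    """Simulate nested data from a truncated common-atoms-style prior (illustrative)."""
    rng = random.Random(seed)
    # Common atoms: a single set of locations shared by all distributions
    # (assumed Gaussian base measure, chosen only for illustration).
    atoms = [rng.gauss(0.0, 5.0) for _ in range(L)]
    # Weights over the K candidate distributions (distributional layer).
    pi = stick_breaking(alpha, K, rng)
    # Each candidate distribution has its own weights over the shared atoms.
    omega = [stick_breaking(beta, L, rng) for _ in range(K)]

    data = []
    for _ in range(n_units):
        # Distributional cluster label of the unit.
        s = rng.choices(range(K), weights=pi)[0]
        # Observational cluster labels of the observations within the unit.
        m = rng.choices(range(L), weights=omega[s], k=n_obs)
        # Observations drawn from a unit-variance Gaussian kernel (assumption).
        y = [rng.gauss(atoms[l], 1.0) for l in m]
        data.append({"dist_cluster": s, "obs_cluster": m, "y": y})
    return data


if __name__ == "__main__":
    for j, unit in enumerate(simulate_cam(n_units=5, n_obs=8, seed=1)):
        print(j, unit["dist_cluster"], sorted(set(unit["obs_cluster"])))
```

In this sketch, units that receive the same `dist_cluster` label share an entire distribution. Because the atoms are common, observations from units in different distributional clusters can still share an `obs_cluster` label, which gives clustering at both the distributional and the observational level.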

Supplementary material

Section A provides a glossary of the terminology used throughout the main paper, together with a diagram that clarifies the clustering structure induced by the common atoms model (CAM). Section B presents the proofs of the theoretical results in the main paper. Section C gives further details on the nested slice sampler algorithm. Section D contains additional plots related to the simulation studies and the microbiome application of the main paper. Section E illustrates how CAM performs density estimation for every unit in Scenario 1, Case A of the main article. Section F presents a sensitivity study showing how different prior specifications affect the recovered partitions and estimated densities. Section G compares CAM with some competitor models in terms of distributional clustering performance. Section H compares and discusses the models and implementations in terms of efficiency, measured by simulation time. Section I reports the truncated Gibbs sampler that can be used in place of the nested slice sampler. Section J provides a theoretical evaluation of the errors arising when the truncated algorithm is adopted.

Additional information

Funding

F. Denti was partially funded as a postdoctoral scholar by the NIH grant R01MH115697. Federico Camerlenghi received funding from the European Research Council (ERC) under the European Union's Horizon 2020 research and innovation programme under grant agreement No 817257. Federico Camerlenghi also gratefully acknowledges the financial support from the Italian Ministry of Education, University and Research (MIUR), “Dipartimenti di Eccellenza” grant 2018-2022. Michele Guindani was partially funded by the US National Science Foundation Award SES-1659921. Antonietta Mira was partially funded by the Swiss National Science Foundation grant 163196.
