ABSTRACT
Replicated network data are increasingly available in many research fields. For example, in connectomic applications, interconnections among brain regions are collected for each patient under study, motivating statistical models which can flexibly characterize the probabilistic generative mechanism underlying these network-valued data. Available models for a single network are not designed specifically for inference on the entire probability mass function of a network-valued random variable and therefore lack flexibility in characterizing the distribution of relevant topological structures. We propose a flexible Bayesian nonparametric approach for modeling the population distribution of network-valued data. The joint distribution of the edges is defined via a mixture model that reduces dimensionality and efficiently incorporates network information within each mixture component by leveraging latent space representations. The formulation leads to an efficient Gibbs sampler and provides simple and coherent strategies for inference and goodness-of-fit assessments. We provide theoretical results on the flexibility of our model and illustrate improved performance, compared to state-of-the-art models, in simulations and in an application to human brain networks. Supplementary materials for this article are available online.
Supplementary Materials
The supplementary materials contain proofs of Lemma 2.1, Proposition 2.2, Theorem 3.1, Lemma 3.2, and Lemma 3.3.
Acknowledgments
The authors thank the editor, the associate editor, and the referee for the valuable comments on a first version of this article.
Funding
This work is supported by grant CPDA154381/15 of the University of Padova, Italy; by grant N00014-14-1-0245 of the United States Office of Naval Research (ONR); and by the Defense Advanced Research Projects Agency (DARPA) SIMPLEX program through SPAWAR contract N66001-15-C-4041 and DARPA GRAPHS N66001-14-1-4028.