933
Views
8
CrossRef citations to date
0
Altmetric
Applications and Case Studies

A Multiscale Community Blockmodel for Network Exploration

, &
Pages 916-934 | Received 01 Nov 2010, Published online: 08 Oct 2012
 

Abstract

Real-world networks exhibit a complex set of phenomena such as underlying hierarchical organization, multiscale interaction, and varying topologies of communities. Most existing methods do not adequately capture the intrinsic interplay among such phenomena. We propose a nonparametric multiscale community blockmodel (MSCB) to model the generation of hierarchies in social communities, selective membership of actors to subsets of these communities, and the resultant networks due to within- and cross-community interactions. By using the nested Chinese restaurant process, our model automatically infers the hierarchy structure from the data. We develop a collapsed Gibbs sampling algorithm for posterior inference, conduct extensive validation using synthetic networks, and demonstrate the utility of our model in real-world datasets, such as predator–prey networks and citation networks.

Notes

Due to the short window of time, this 1000-article subnetwork has a lower citation density than the original network. We acknowledge that this subnetwork is not ideal for hierarchy learning, since articles that share only older citations will have no network paths between them. Nevertheless, this subnetwork retains enough structure for our algorithm to recover a two-level hierarchy, which we report in our experiments.

Note that our use of the stick-breaking process is unrelated to the stick-breaking construction for the DP. We use the stick-breaking process to produce a mixture over the mixture components induced by the nCRP, not to define the mixture components themselves.

A word on notation is used as shorthand to (1) first select the community compatibility matrix B associated with the parent of communities h, h′, and then (2) select the element of that B corresponding to h interacting with h′. Because our inference procedure integrates out the B matrices, a precise, long-form notation for them is unnecessary.

Algorithm details: for the IRM/MSCB MCMC algorithm, we took 10,000 samples as burn-in, and then estimated each hyperparameter, γ, m, π, λ1, λ2, by its average over the next 1000 samples. This was repeated for hierarchy depths K = 1 (i.e., IRM), 2, and 3. For the MMSB variational EM algorithm, we ran 100 random restarts until convergence, and then took hyperparameter estimates from the restart with the highest variational lower bound (with respect to the true likelihood). Because MMSB requires the number of latent roles R to be specified, we repeated this experiment for each 2 ⩽ R ⩽ 20.

One could address this by a Metropolis–Hastings scheme that splits and merges hierarchy branches, but the development of such a scheme is out of the scope of this work.

This issue is not unique to MSCB, but occurs when naively applying MCMC methods to models with permutation nonidentifiability, such as stochastic blockmodels or models meant for clustering. In many such models, permuting the communities/clusters has no effect on the model likelihood, so the posterior mean must be an average over all permutations—which is usually uninterpretable.

A more thorough analysis would include consensus hierarchies over multiple values of τ ∈ [0, 1], so as to present a fuller complete picture of hierarchy posterior variation. Alternatively, one could analyze the “closeness” of pairs of network actors by their number of shared path samples.

We note that this output is reminiscent of text models from natural language processing, particularly the latent Dirichlet allocation (Blei, Ng, and Jordan Citation2003). However, we stress that MSCB is not a text model; the title words are determined post hoc, after hierarchy inference.

Reprints and Corporate Permissions

Please note: Selecting permissions does not provide access to the full text of the article, please see our help page How do I view content?

To request a reprint or corporate permissions for this article, please click on the relevant link below:

Academic Permissions

Please note: Selecting permissions does not provide access to the full text of the article, please see our help page How do I view content?

Obtain permissions instantly via Rightslink by clicking on the button below:

If you are unable to obtain permissions via Rightslink, please complete and submit this Permissions form. For more information, please visit our Permissions help page.