Theory and Methods

Hierarchical Community Detection by Recursive Partitioning

Pages 951-968 | Received 01 Sep 2019, Accepted 04 Oct 2020, Published online: 24 Nov 2020
 

Abstract

The problem of community detection in networks is usually formulated as finding a single partition of the network into some “correct” number of communities. We argue that it is more interpretable and in some regimes more accurate to construct a hierarchical tree of communities instead. This can be done with a simple top-down recursive partitioning algorithm, starting with a single community and separating the nodes into two communities by spectral clustering repeatedly, until a stopping rule suggests there are no further communities. This class of algorithms is model-free, computationally efficient, and requires no tuning other than selecting a stopping rule. We show that there are regimes where this approach outperforms K-way spectral clustering, and propose a natural framework for analyzing the algorithm’s theoretical performance, the binary tree stochastic block model. Under this model, we prove that the algorithm correctly recovers the entire community tree under relatively mild assumptions. We apply the algorithm to a gene network based on gene co-occurrence in 1580 research papers on anemia, and identify six clusters of genes in a meaningful hierarchy. We also illustrate the algorithm on a dataset of statistics papers. Supplementary materials for this article are available online.
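The top-down procedure described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names `split_in_two` and `hcd`, the parameters `min_size` and `gap`, and in particular the stopping rule (comparing the second-largest eigenvalue magnitude to a multiple of the square root of the mean degree) are assumptions chosen for this example. The abstract does not specify which stopping rule the paper uses.

```python
import numpy as np

def split_in_two(A):
    """Bi-partition nodes by the sign of the eigenvector belonging to the
    eigenvalue of second-largest magnitude of the adjacency matrix A."""
    vals, vecs = np.linalg.eigh(A)          # A is symmetric
    order = np.argsort(-np.abs(vals))       # sort by |eigenvalue|, descending
    v2 = vecs[:, order[1]]
    return v2 >= 0, np.abs(vals[order[1]])

def hcd(A, nodes=None, min_size=4, gap=1.5):
    """Recursive partitioning sketch. Returns a nested tuple (left, right)
    for each split and a list of node indices for each leaf community."""
    if nodes is None:
        nodes = np.arange(A.shape[0])
    if len(nodes) < 2 * min_size:           # too small to split further
        return nodes.tolist()
    sub = A[np.ix_(nodes, nodes)]           # subgraph induced by these nodes
    mask, lam2 = split_in_two(sub)
    mean_deg = sub.sum() / len(nodes)
    # Hypothetical stopping rule (an assumption, not taken from the paper):
    # stop when the second eigenvalue is not clearly above the noise level.
    weak_signal = lam2 <= gap * np.sqrt(max(mean_deg, 1e-12))
    if weak_signal or mask.all() or (~mask).all():
        return nodes.tolist()
    return (hcd(A, nodes[mask], min_size, gap),
            hcd(A, nodes[~mask], min_size, gap))

# Toy graph: two 8-node cliques joined by a perfect matching.
block = np.ones((8, 8)) - np.eye(8)
A = np.block([[block, np.eye(8)], [np.eye(8), block]])
tree = hcd(A)
```

On this toy graph the first split separates the two cliques, and the recursion then stops inside each clique because no further structure is detected. Any stopping rule with the same interface (decide whether a subgraph contains more than one community) could be substituted for the placeholder used here.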

Supplementary Materials

The online supplementary materials contain the proofs of the theoretical results, additional details of the anemia example, and one additional example that uses hierarchical community detection (HCD) to analyze a citation network.

Acknowledgments

We thank the associate editor and the referees for their helpful and constructive comments.

Funding

T. Li was supported in part by an NSF grant (DMS-2015298) and the Quantitative Collaborative Award from the College of Arts and Sciences at the University of Virginia. K. Van den Berge is a postdoctoral fellow of the Belgian American Educational Foundation (BAEF) and is supported by the Research Foundation Flanders (FWO), grant 1246220N. P. Sarkar was supported in part by an NSF grant (DMS-1713082). P. Bickel is supported in part by an NSF grant (DMS-1713083). E. Levina is supported in part by NSF grants (DMS-1521551 and DMS-1916222) and an ONR grant (N000141612910).

