179
Views
45
CrossRef citations to date
0
Altmetric
Original Articles

Efficient Triangle Counting in Large Graphs via Degree-Based Vertex Partitioning

, , &
Pages 161-185 | Received 02 Apr 2011, Accepted 15 Jul 2011, Published online: 06 Mar 2012
 

Abstract

The number of triangles is a computationally expensive graph statistic frequently used in complex network analysis (e.g., transitivity ratio), in various random graph models (e.g., exponential random graph model), and in important real-world applications such as spam detection, uncovering the hidden thematic structures in the Web, and link recommendation. Counting triangles in graphs with millions and billions of edges requires algorithms that run fast, use little space, provide accurate estimates of the number of triangles, and preferably are parallelizable. In this paper we present an efficient triangle-counting approximation algorithm that can be adapted to the semistreaming model [CitationFeigenbaum et al. 05]. Its key idea is to combine the sampling algorithm of [CitationTsourakakis et al. 09, CitationTsourakakis et al. 11] and the partitioning of the set of vertices into high- and low-degree subsets as in [CitationAlon et al. 97], treating each set appropriately. From a mathematical perspective, we present a simplified proof of [CitationTsourakakis et al. 11] that uses the powerful Kim–Vu concentration inequality [CitationKim and Vu 00] based on the Hajnal–Szemerédi theorem [CitationHajnal and Szemerédi 70]. Furthermore, we improve bounds of existing triple-sampling techniques based on a theorem of [CitationAhlswede and Katona 78]. We obtain a running time and an (1±ε) approximation, where n is the number of vertices, m is the number of edges, and Δ is the maximum number of triangles in which any single edge is contained. Furthermore, we show how this algorithm can be adapted to the semistreaming model with space usage and a constant number of passes (three) over the graph stream. We apply our methods to various networks with several millions of edges and we obtain excellent results, outperforming existing triangle-counting methods. Finally, we propose a random-projection-based method for triangle counting and provide a sufficient condition to obtain an estimate with low variance.

Acknowledgments.

The authors gratefully acknowledge support from the University of Crete, Grant No. 2569, and the National Science Foundation, Grants Nos. IIS-0705359, CCF-0635257, and CCF-1013110. This is the extended version of our proceedings paper [CitationKolountakis et al. 10].

Notes

1We use the tilde notation to hide polylogarithmic factors .

Log in via your institution

Log in to Taylor & Francis Online

PDF download + Online access
  • 48 hours access to article PDF & online version
  • Article PDF can be downloaded
  • Article PDF can be printed
USD 61.00 Add to cart
* Local tax will be added as applicable

Related Research

People also read lists articles that other readers of this article have read.

Recommended articles lists articles that we recommend and is powered by our AI driven recommendation engine.

Cited by lists all citing articles based on Crossref citations.
Articles with the Crossref icon will open in a new tab.