Theory and Methods

Estimation of Subgraph Densities in Noisy Networks

Pages 361-374 | Received 01 Feb 2019, Accepted 02 Jun 2020, Published online: 20 Jul 2020
 

Abstract

While it is common practice in applied network analysis to report various standard network summary statistics, these numbers are rarely accompanied by uncertainty quantification. Yet any error inherent in the measurements underlying the construction of the network, or in the network construction procedure itself, must necessarily propagate to any summary statistics reported. Here we study the problem of estimating the density of an arbitrary subgraph, given a noisy version of some underlying network as data. Under a simple model of network error, we show that consistent estimation of such densities is impossible when the rates of error are unknown and only a single network is observed. Accordingly, we develop method-of-moments estimators of network subgraph densities and error rates for the case where a minimal number of network replicates is available. These estimators are shown to be asymptotically normal as the number of vertices increases to infinity. We also provide confidence intervals, based on this asymptotic normality, for quantifying the uncertainty in these estimates. To construct the confidence intervals, we propose a new and nonstandard bootstrap method for computing the asymptotic variances, which would otherwise be infeasible to compute. We illustrate the proposed methods in the context of gene coexpression networks. Supplementary materials for this article are available online.
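The non-identifiability claim can be made concrete for the simplest subgraph, a single edge. The sketch below is illustrative only and is not the article's estimator. It assumes an independent edge-flip noise model in which each true non-edge is observed as an edge with probability `alpha` and each true edge is missed with probability `beta`. These names, the i.i.d. generation of true edges in the simulation, and all function names are assumptions introduced here. Under that model the expected observed edge density is (1 - beta) * delta + alpha * (1 - delta), so a single observed network yields one moment equation in three unknowns.

```python
import random


def expected_observed_density(delta, alpha, beta):
    """Expected edge density of the noisy graph under the assumed flip model."""
    return (1 - beta) * delta + alpha * (1 - delta)


def corrected_density(observed, alpha, beta):
    """Invert the moment equation; usable only when alpha and beta are known."""
    return (observed - alpha) / (1 - alpha - beta)


def simulate_observed_density(n, delta, alpha, beta, seed=0):
    """Simulate one noisy network on n vertices and return its edge density.

    True edges are drawn i.i.d. with probability delta (a simplification
    made for this sketch), then perturbed by the assumed noise model.
    """
    rng = random.Random(seed)
    pairs = n * (n - 1) // 2
    observed_edges = 0
    for _ in range(pairs):
        true_edge = rng.random() < delta
        if true_edge:
            observed = rng.random() >= beta  # true edge kept with prob. 1 - beta
        else:
            observed = rng.random() < alpha  # false edge added with prob. alpha
        observed_edges += observed
    return observed_edges / pairs


# Two different parameter settings that imply the same expected observed density,
# so one noisy network cannot tell them apart.
noisy_setting = expected_observed_density(0.10, 0.05, 0.20)
clean_setting = expected_observed_density(0.125, 0.0, 0.0)

# With known error rates, the naive density can be de-biased.
naive = simulate_observed_density(200, 0.10, 0.05, 0.20)
debiased = corrected_density(naive, 0.05, 0.20)
```

Here `noisy_setting` and `clean_setting` coincide even though the true densities differ (0.10 versus 0.125), which is the intuition behind requiring network replicates when the error rates are unknown.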

Supplementary Materials

The online supplementary material contains the technical proofs of all theoretical results in this paper.

Acknowledgments

The authors are grateful to the editor, an associate editor, and two referees for their helpful suggestions.

Additional information

Funding

Chang was supported in part by the Fundamental Research Funds for the Central Universities of China, the National Natural Science Foundation of China (grant nos. 11871401 and 71991472), the funds of Fok Ying-Tong Education Foundation for Young Teachers in the Higher Education Institutions of China, and the Center of Statistical Research and the Joint Lab of Data Science and Business Intelligence at SWUFE. Kolaczyk was supported in part by the US Air Force Office of Scientific Research.
