Spatial Rank-Based Augmentation for Nonparametric Online Monitoring and Adaptive Sampling of Big Data Streams

Pages 243-256 | Received 18 Mar 2021, Accepted 29 Oct 2022, Published online: 01 Dec 2022
 

Abstract

The age of the Internet of Things (IoT) has witnessed the rapid development of modern data acquisition devices and communicating-actuating networks, which enable the generation of big data streams that are shared across platforms for remote and efficient decision making in many critical systems. Monitoring big data streams remains challenging in many practical applications, mainly because of their complex interrelationships, large volume, and high velocity, which place prohibitive demands on monitoring methodologies and resources. To tackle the challenges of monitoring unexchangeable and correlated big data streams when only partial observations are available under resource constraints, we propose a method that combines spatial rank-based statistics with effective data augmentation techniques for the data streams that are unobservable online, so that monitoring and sampling decisions can be informed analytically from the partially observed data streams alone. By exploiting historical data, the proposed method preserves strong descriptive power for general big data streams under partial observations and explicitly uses the correlation among data streams. It thus allows effective monitoring and equitable sampling over general heterogeneous and correlated big data streams, without the simplifying assumptions (e.g., exchangeability) that existing methods rely on. Theoretical investigations evaluate the effectiveness of the augmentation statistics and of the sampling strategy, and guarantee the superiority of the sampling performance over existing methods. Simulations under various scenarios and two real case studies are also conducted to evaluate and validate the performance of the proposed method.
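
For readers unfamiliar with the term, the following is a minimal sketch of the textbook spatial rank of a point relative to a reference sample: the average of the unit vectors pointing from each reference point to that point. It is not the authors' SRAS algorithm, and it omits the augmentation and sampling steps entirely. The function names (`spatial_sign`, `spatial_rank`, `rank_norm`) and the toy reference sample are hypothetical and chosen only for illustration.

```python
import math

def spatial_sign(v):
    """Unit vector in the direction of v; the zero vector maps to itself."""
    norm = math.sqrt(sum(c * c for c in v))
    if norm == 0.0:
        return [0.0] * len(v)
    return [c / norm for c in v]

def spatial_rank(x, reference):
    """Average spatial sign of (x - r) over all reference points r."""
    total = [0.0] * len(x)
    for r in reference:
        s = spatial_sign([xi - ri for xi, ri in zip(x, r)])
        total = [t + si for t, si in zip(total, s)]
    return [t / len(reference) for t in total]

def rank_norm(x, reference):
    """Euclidean length of the spatial rank; it always lies in [0, 1]."""
    return math.sqrt(sum(c * c for c in spatial_rank(x, reference)))

# Hypothetical reference sample (standing in for historical in-control data).
reference = [[1.0, 0.0], [-1.0, 0.0], [0.0, 1.0], [0.0, -1.0]]

central = rank_norm([0.0, 0.0], reference)     # point at the center of the sample
outlying = rank_norm([10.0, 10.0], reference)  # point far outside the sample
print(central, outlying)
```

A point near the center of the reference sample has a rank norm close to 0, while a point far outside it has a rank norm close to 1. This is why rank-based statistics of this kind can flag unusual observations without assuming a parametric distribution.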

Supplementary Materials

The file “Supplementary sections.pdf” contains: (i) properties 1.1–2 as well as proofs of all properties in Section 3.2; (ii) the parameter settings of the proposed method; (iii) an additional simulation study to justify the estimation performance in Section 4; and (iv) another case study, on COVID-19 pandemic surveillance, for further performance evaluation. The file “codes&data.zip” contains: (i) the code for the proposed SRAS algorithm; and (ii) the aggregated data for the two real case studies.

Acknowledgments

The authors gratefully acknowledge the support provided by the funding agencies. The authors would also like to thank the editor, the associate editor, and three reviewers for their helpful comments.

Disclosure Statement

The authors report that there are no competing interests to declare.

Additional information

Funding

This work was supported in part by the National Science Foundation under grant 2032734, the National Science Foundation of China under grant 72101148, the Shanghai Sailing Program under grant 21YF1420100, and the National Science Foundation of Shanghai under grant 22ZR1433000.
