
A Scalable Partitioned Approach to Model Massive Nonstationary Non-Gaussian Spatial Datasets

Pages 105-116 | Received 25 Nov 2020, Accepted 20 Jul 2022, Published online: 07 Oct 2022
 

Abstract

Nonstationary non-Gaussian spatial data are common in many disciplines, including climate science, ecology, epidemiology, and social sciences. Examples include count data on disease incidence and binary satellite data on cloud mask (cloud/no-cloud). Modeling such datasets as stationary spatial processes can be unrealistic since they are collected over large heterogeneous domains (i.e., spatial behavior differs across subregions). Although several approaches have been developed for nonstationary spatial models, these have focused primarily on Gaussian responses. In addition, fitting nonstationary models for large non-Gaussian datasets is computationally prohibitive. To address these challenges, we propose a scalable algorithm for modeling such data by leveraging parallel computing in modern high-performance computing systems. We partition the spatial domain into disjoint subregions and fit locally nonstationary models using a carefully curated set of spatial basis functions. Then, we combine the local processes using a novel neighbor-based weighting scheme. Our approach scales well to massive datasets (e.g., 2.7 million samples) and can be implemented in nimble, a popular software environment for Bayesian hierarchical modeling. We demonstrate our method on simulated examples and two massive real-world datasets acquired through remote sensing.
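
The partition, local-fit, and combine steps described above can be illustrated with a minimal sketch. It is not the authors' implementation, and every name and setting in it is an assumption made for illustration: it uses a regular grid in place of the spatial clustering algorithm, ridge-regularized least squares on Gaussian radial basis functions in place of a Bayesian non-Gaussian model fitted in nimble, and inverse-distance weights over the nearest partition centers as a stand-in for the neighbor-based weighting scheme.

```python
import numpy as np

def partition_grid(coords, n_side):
    """Assign each location in the unit square to one of n_side**2 disjoint grid cells."""
    idx = np.minimum((coords * n_side).astype(int), n_side - 1)
    return idx[:, 0] * n_side + idx[:, 1]

def rbf_basis(coords, knots, bandwidth):
    """Intercept column plus Gaussian radial basis functions centered at the knots."""
    d2 = ((coords[:, None, :] - knots[None, :, :]) ** 2).sum(axis=2)
    return np.hstack([np.ones((coords.shape[0], 1)),
                      np.exp(-d2 / (2 * bandwidth ** 2))])

def fit_local_models(coords, y, labels, n_knots=9, bandwidth=0.15, ridge=1e-3, seed=0):
    """Fit one independent basis-function regression per partition (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    models = {}
    for k in np.unique(labels):  # each iteration is independent, so it could run in parallel
        mask = labels == k
        pts = coords[mask]
        # Knots are a random subset of the partition's own locations (an assumption).
        knots = pts[rng.choice(len(pts), size=min(n_knots, len(pts)), replace=False)]
        B = rbf_basis(pts, knots, bandwidth)
        coef = np.linalg.solve(B.T @ B + ridge * np.eye(B.shape[1]), B.T @ y[mask])
        models[k] = {"center": pts.mean(axis=0), "knots": knots, "coef": coef}
    return models

def predict(models, new_coords, bandwidth=0.15, n_neighbors=3):
    """Blend local predictions using inverse-distance weights over the nearest partitions."""
    keys = list(models)
    centers = np.array([models[k]["center"] for k in keys])
    # Column j holds the prediction of local model j at every new location.
    local = np.column_stack([
        rbf_basis(new_coords, models[k]["knots"], bandwidth) @ models[k]["coef"]
        for k in keys
    ])
    dist = np.linalg.norm(new_coords[:, None, :] - centers[None, :, :], axis=2)
    nearest = np.argsort(dist, axis=1)[:, :n_neighbors]
    w = 1.0 / (np.take_along_axis(dist, nearest, axis=1) + 1e-8)
    w /= w.sum(axis=1, keepdims=True)  # weights sum to one at each location
    return (np.take_along_axis(local, nearest, axis=1) * w).sum(axis=1)

# Usage on synthetic data (values are illustrative only).
rng = np.random.default_rng(1)
coords = rng.random((400, 2))
y = np.sin(4 * coords[:, 0]) + coords[:, 1]
labels = partition_grid(coords, n_side=2)
models = fit_local_models(coords, y, labels)
pred = predict(models, coords[:10])
```

Because each local model depends only on the data in its own partition, the loop in `fit_local_models` is the natural place to distribute work across cores or nodes; only the final weighting step needs information from neighboring partitions.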

Supplementary Materials

The supplement contains source code, details of the spatial clustering algorithm, data generation for the simulated examples, parallelized schemes, and additional simulations for count data. It also includes partition-varying β estimates, details of the comparative analysis, a variant of SMB-SGLMM with geometric median weights, and a water vapor data example.

Acknowledgments

The authors are grateful to Matthew Heaton, Murali Haran, John Hughes, and Whitney Huang for providing useful sample code and advice. The authors also thank the anonymous reviewers for their careful review and valuable comments.

Disclosure Statement

The authors report there are no competing interests to declare.

Funding

Jaewoo Park was supported by the Yonsei University Research Fund 2020-22-0501 and the National Research Foundation of Korea (NRF-2020R1C1C1A0100386811).
