ABSTRACT
Regionalization is the task of partitioning a set of contiguous areas into spatial clusters or regions. The theoretical and empirical literature focusing on regionalization is extensive, yet few quantitative comparisons have been conducted. We present a simulation study and explore the quality of frequently used and state-of-the-art regionalization algorithms, namely AZP, AZP-SA, AZP-Tabu, ARISEL, REDCAP, and SKATER, where the number of regions is an exogenous variable. The simulated benchmark data set consists of model realizations that represent various complexities in spatial data. Model families are defined with respect to regions’ shapes, value-mixing between regions, and the number of underlying spatial clusters. We evaluate the performance of different regionalization methods across realization families using internal and external measures of regionalization quality. A large number of regionalization quality metrics expose a detailed profile of the analyzed methods’ strengths and weaknesses. We investigate the computational efficiency of every method as a function of the number of spatial units studied. We summarize results for different region families and discuss circumstances that make a certain method more desirable. We illustrate different regionalization algorithms’ implications for defining ecological regions for the conterminous US and compare them against expert-defined ecoregions.
Acknowledgments
This article is supported by the Environmental Systems Research Institute (ESRI)’s Spatial Statistics and Virtual Science teams. Any opinions, findings, conclusions, or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of ESRI.
Disclosure statement
No potential conflict of interest was reported by the author(s).
Data and codes availability statement
The data, the code to generate the data, and the tabular analysis results that support this study’s findings are available at the private link with the identifier https://doi.org/10.6084/m9.figshare.14067239.
Additional information
Notes on contributors
Orhun Aydin
Orhun Aydin is a lecturer at the Spatial Sciences Institute of the University of Southern California. He is also a senior research scientist at the Spatial Statistics team of Environmental Systems Research Institute.
Mark V. Janikas
Mark V. Janikas is the lead developer of the Spatial Statistics team of Environmental Systems Research Institute.
Renato Martins Assunção
Renato Martins Assunção is a professor in the Department of Computer Science, Universidade Federal de Minas Gerais (UFMG).
Ting-Hwan Lee
Ting-Hwan Lee is a senior product engineer at the Spatial Statistics team of Environmental Systems Research Institute.