ESCIP: An Expansion-Based Spatial Clustering Method for Inhomogeneous Point Processes

Ting Li, Yizhao Gao & Shaowen Wang
Pages 259-276 | Received 07 Jan 2018, Accepted 09 Mar 2019, Published online: 31 Jul 2019
 

Abstract

Detecting irregularly shaped spatial clusters within heterogeneous point processes is challenging because the number of potential clusters with different sizes and shapes can be enormous. This research develops a novel method, expansion-based spatial clustering for inhomogeneous point processes (ESCIP), for detecting spatial clusters of any shape within a heterogeneous point process in the context of analyzing spatial big data. Statistical testing is used to find core points (points whose neighboring areas contain significantly more cases than expected), and an expansion approach is developed to find irregularly shaped clusters by connecting nearby core points. Instead of employing a brute-force search over all potential clusters, as spatial scan statistics do, this approach only requires testing a small neighboring area for each potential core point. Moreover, spatial indexing is leveraged to speed up the search for nearby points and the expansion of clusters. The proposed method is implemented with Poisson and Bernoulli models and evaluated on large spatial data sets. Experimental results show that ESCIP can detect irregularly shaped spatial clusters among millions of points with high efficiency. It is also demonstrated that the method outperforms spatial scan statistics in both flexibility of cluster shape and computational performance. Furthermore, ESCIP ensures that every subset of a detected cluster is statistically significant and contiguous.

Key Words: cyberGIS, spatial algorithm, spatial analysis, spatial clustering.
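The core-point test and expansion step summarized in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the uniform grid index, the fixed neighborhood `radius`, the significance level `alpha`, the per-point `expected` counts, and all function names are assumptions made for illustration, and the sketch uses only the Poisson model and omits any multiple-testing adjustment a full method might apply.

```python
import math
from collections import defaultdict, deque


def poisson_upper_tail(k, mu):
    """Return P(X >= k) for X ~ Poisson(mu); adequate for small mu only."""
    if k <= 0:
        return 1.0
    term = math.exp(-mu)  # P(X = 0)
    cdf = term
    for i in range(1, k):
        term *= mu / i    # P(X = i) from P(X = i - 1)
        cdf += term
    return max(0.0, 1.0 - cdf)


def build_grid(points, cell):
    """Hypothetical spatial index: bucket point indices by square grid cell."""
    grid = defaultdict(list)
    for idx, (x, y) in enumerate(points):
        grid[(math.floor(x / cell), math.floor(y / cell))].append(idx)
    return grid


def neighbors(i, points, grid, radius):
    """Indices within `radius` of point i (including i), scanning 3x3 cells."""
    x, y = points[i]
    cx, cy = math.floor(x / radius), math.floor(y / radius)
    found = []
    for gx in (cx - 1, cx, cx + 1):
        for gy in (cy - 1, cy, cy + 1):
            for j in grid.get((gx, gy), ()):
                if math.dist(points[i], points[j]) <= radius:
                    found.append(j)
    return found


def escip_sketch(points, cases, expected, radius, alpha=0.05):
    """Find core points with a Poisson test, then join nearby core points."""
    grid = build_grid(points, radius)  # cell size equals the search radius
    nbrs = [neighbors(i, points, grid, radius) for i in range(len(points))]

    # Core point: its neighborhood holds significantly more cases than expected.
    core = set()
    for i, nb in enumerate(nbrs):
        observed = sum(cases[j] for j in nb)
        mu = sum(expected[j] for j in nb)
        if poisson_upper_tail(observed, mu) < alpha:
            core.add(i)

    # Expansion: breadth-first search linking core points within `radius`.
    clusters, seen = [], set()
    for start in sorted(core):
        if start in seen:
            continue
        seen.add(start)
        queue, members = deque([start]), []
        while queue:
            i = queue.popleft()
            members.append(i)
            for j in nbrs[i]:
                if j in core and j not in seen:
                    seen.add(j)
                    queue.append(j)
        clusters.append(sorted(members))
    return clusters


# Toy data: five high-count points in a line, three isolated background points.
pts = [(0, 0), (1, 0), (2, 0), (3, 0), (4, 0), (20, 20), (30, 30), (40, 40)]
case_counts = [5] * 5 + [1] * 3
expected_counts = [1.0] * 8
print(escip_sketch(pts, case_counts, expected_counts, radius=1.5))
# [[0, 1, 2, 3, 4]]
```

Each point is tested only against its own small neighborhood, so the cost grows with the number of points and their local density rather than with the number of candidate cluster shapes, which is the contrast with brute-force scanning that the abstract draws.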

Acknowledgments

The authors are grateful for help and support from Dr. Sara McLafferty and Dr. Charles R. Ehlschlaeger. The authors also acknowledge insightful comments on earlier drafts received from Editor Ling Bian and anonymous reviewers. The research outcome benefits from helpful critique and feedback from Rebecca Vandewalle and other members of the CyberInfrastructure and Geospatial Information Laboratory at the University of Illinois at Urbana–Champaign, which are greatly appreciated.

Additional information

Funding

This research is based in part on work supported by the U.S. National Science Foundation under grant numbers 1443080, 1743184, and 1833225. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the National Science Foundation.

Notes on contributors

Ting Li

TING LI received her MS in geography in 2018 from the University of Illinois at Urbana–Champaign, Urbana, IL 61801. E-mail: [email protected]. Her research interests include cyberGIS and spatial clustering analysis.

Yizhao Gao

YIZHAO GAO received his PhD in geography in 2018 from the University of Illinois at Urbana–Champaign, Urbana, IL 61801. E-mail: [email protected]. He is a software engineer at Google. His research interests include cyberGIS, high-performance computing, spatial analysis, and spatial data science.

Shaowen Wang

SHAOWEN WANG is Professor and Head of the Department of Geography and Geographic Information Science at the University of Illinois at Urbana–Champaign, Urbana, IL 61801. E-mail: [email protected]. His research interests focus on geographic information science and systems (GIS), advanced cyberinfrastructure and cyberGIS, complex environmental and geospatial problems, computational and data sciences, high-performance and distributed computing, and spatial analysis and modeling.
