Abstract
Multi-scale geographically weighted regression (MGWR) is among the most popular methods for analyzing non-stationary spatial relationships. However, the current model calibration algorithm is computationally intensive: its runtime grows cubically with the sample size, while its memory use grows quadratically. We propose calibrating MGWR with gradient-based optimization. We do so by analytically deriving the gradient vector and the Hessian matrix of the corrected Akaike information criterion (AICc) and passing them to a trust-region optimization algorithm. We evaluate the model quality empirically. Our method converges to the same coefficients and produces the same inference as the current method, but it offers a substantial computational gain when the sample size is large: runtime growth is reduced to quadratic and memory use to linear in the sample size. Our new algorithm outperforms the existing alternatives and makes MGWR feasible for large spatial datasets.
Author contributions
The idea for this research was proposed mainly by Mark Janikas and Renato Assunção. The latter worked on the mathematical derivation, while Xiaodan Zhou crafted the algorithm and proofread the mathematical derivation. Hu Shao implemented the final code. Cheng-Chia Huang and Xiaodan Zhou designed and ran the simulations and empirical examples. The first version of the manuscript was written by Renato Assunção and Xiaodan Zhou, and it was revised by Hanna Asefaw. All authors commented on the final manuscript and improved it.
Disclosure statement
No potential conflict of interest was reported by the author(s).
Data and codes availability statement
The data and code that support the findings of this study are available at https://github.com/shaohu/MGWR (DOI: 10.5281/zenodo.8200787).
Additional information
Notes on contributors
Xiaodan Zhou
Xiaodan Zhou is currently a Ph.D. student majoring in Statistics at North Carolina State University. Her expertise lies in spatial statistics, causal inference, and their diverse applications in a variety of fields. Twitter @xiaodanzhou11. E-mail: [email protected]
Renato Assunção
Renato Assunção received his Ph.D. in Statistics from the University of Washington in 1994. He was an academic until 2021, when he joined Esri as a researcher. He develops algorithms and probabilistic methods for the statistical analysis of spatial data, especially areal and point process data. He has developed Bayesian spatially varying parameter models, a regional partitioning method based on minimum spanning trees, methods to detect arbitrarily shaped clusters, and surveillance methods for the detection of emergent space-time clusters. Twitter: @assuncaoest. E-mail: [email protected]
Hu Shao
Hu Shao has been a software developer at Esri since 2018. He received his Ph.D. in Geography from Arizona State University in 2018. He focuses on developing spatial statistics algorithms and tools. Email: [email protected]
Cheng-Chia Huang
Cheng-Chia Huang is currently a Sr. Product Engineer on the Spatial Statistics Team at Esri. With a background in GIS and geography, she enjoys solving geographical problems using spatial data science techniques. Twitter: @karie_huang. E-mail: [email protected]
Mark Janikas
Mark Janikas is a Lead Product Developer focusing on spatial statistics and has been working at Esri since earning his Ph.D. in Quantitative Geography from UC Santa Barbara in 2006. Email: [email protected]
Hanna Asefaw
Hanna Asefaw is a Ph.D. candidate at Scripps Institution of Oceanography at the University of California, San Diego and a Product Engineer at Esri. E-mail: [email protected]