Predicting Model Improvement by Accounting for Spatial Autocorrelation: A Socioeconomic Perspective

Pages 131-149 | Received 26 Feb 2020, Accepted 02 Jul 2020, Published online: 07 Oct 2020
 

Abstract

In geographical literature, numerous studies have demonstrated the differences that arise if spatial autocorrelation (SAC) is incorporated into a conventional nonspatial modeling procedure, but little is known about when these differences might be magnified. This study addressed this query by conducting two sets of regression modeling for 561 variables representing housing prices, metropolitan industry, health, crime, education, and (un)employment across various parts of the United States: (1) nonspatial ordinary least squares (OLS) using a set of selected independent variables and (2) spatial regression incorporating spatial filters into the nonspatial OLS as additional independent variables. This incorporation generally improved the model outcomes through decreases in residual autocorrelation and Akaike’s information criterion (AIC). The degree of improvement correlated positively with the level of SAC inherent in the dependent variables. That is, strongly autocorrelated socioeconomic variables underwent greater decreases in residual autocorrelation and AIC than those variables with weaker SAC. The results imply that spatial modeling outcomes are sensitive to and potentially predictable by the level of SAC possessed by dependent variables. Therefore, the degree of SAC present in a socioeconomic variable can serve as a direct indicator of how much improvement a nonspatial OLS will experience if that SAC is properly accounted for.
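The comparison described in the abstract can be illustrated with a small synthetic sketch. This is not the authors' code (their reproducible R code is in the repository cited in the Notes). It rests on several labeled assumptions: an 8 x 8 lattice with rook adjacency, invented data, a Gaussian AIC computed up to an additive constant, and spatial filters taken simply as the eigenvectors of the doubly centered connectivity matrix with the largest eigenvalues, rather than chosen by a formal selection procedure. The function names (`rook_weights`, `morans_i`, `fit_ols`, `compare`) are hypothetical.

```python
import numpy as np

def rook_weights(rows, cols):
    """Binary rook-adjacency matrix for a rows x cols lattice (assumed layout)."""
    n = rows * cols
    C = np.zeros((n, n))
    for r in range(rows):
        for c in range(cols):
            i = r * cols + c
            if c + 1 < cols:                      # neighbor to the east
                C[i, i + 1] = C[i + 1, i] = 1.0
            if r + 1 < rows:                      # neighbor to the south
                C[i, i + cols] = C[i + cols, i] = 1.0
    return C

def morans_i(v, C):
    """Moran's I of vector v under connectivity matrix C."""
    z = v - v.mean()
    return (len(v) / C.sum()) * (z @ C @ z) / (z @ z)

def fit_ols(X, y, C):
    """OLS fit returning AIC (up to a constant) and residual Moran's I."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    n, k = X.shape
    aic = n * np.log(resid @ resid / n) + 2 * (k + 1)   # +1 for error variance
    return {"aic": aic, "resid_moran": morans_i(resid, C)}

def compare(strength, n_filters=5, rows=8, cols=8, seed=0):
    """Contrast nonspatial OLS with OLS plus eigenvector spatial filters."""
    rng = np.random.default_rng(seed)
    C = rook_weights(rows, cols)
    n = rows * cols
    M = np.eye(n) - np.ones((n, n)) / n           # centering projector
    _, eigvecs = np.linalg.eigh(M @ C @ M)        # eigenvalues in ascending order
    filters = eigvecs[:, -n_filters:]             # strongest positive-SAC patterns

    # Synthetic dependent variable: one covariate plus a spatial pattern
    # whose weight ("strength") controls how autocorrelated y is.
    x = rng.normal(size=n)
    spatial = strength * (filters[:, -1] + filters[:, -2])
    y = 1.0 + x + spatial + 0.3 * rng.normal(size=n)

    X0 = np.column_stack([np.ones(n), x])         # nonspatial OLS design
    X1 = np.column_stack([X0, filters])           # OLS plus spatial filters
    base, filt = fit_ols(X0, y, C), fit_ols(X1, y, C)
    return {
        "y_moran": morans_i(y, C),
        "aic_drop": base["aic"] - filt["aic"],
        "moran_drop": base["resid_moran"] - filt["resid_moran"],
    }

if __name__ == "__main__":
    for s in (0.0, 2.0, 8.0):
        out = compare(s)
        print(f"strength={s:4.1f}  I(y)={out['y_moran']:.3f}  "
              f"AIC drop={out['aic_drop']:.1f}  "
              f"residual I drop={out['moran_drop']:.3f}")
```

In this toy setting the decreases in AIC and in residual Moran's I grow as the spatial component of the dependent variable is strengthened, which mirrors the direction of the relationship reported in the abstract. It says nothing about the magnitude found in the study's 561 variables.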

Acknowledgments

We thank three anonymous reviewers for providing great comments on the earlier versions of this article. Professor Yongwan Chun provided statistical advice on the spatial filtering method.

Funding 

This research was supported by the National Science Foundation (No. 1560907).

Notes

1 For a full list of dependent and independent variables, visit the authors’ GitHub repository https://github.com/biogeokim/sac_prediction.

2 A copy of the reproducible R code is presented at https://github.com/biogeokim/sac_prediction.

3 For all specific results from the twenty-eight data sets, see https://github.com/biogeokim/sac_prediction.

Additional information

Notes on contributors

Daehyun Kim

DAEHYUN KIM is a Professor in the Department of Geography at Seoul National University, Seoul, South Korea 08826. His research interests include the implications of spatial autocorrelation for spatial modeling of natural resources and socioeconomic phenomena. He is also a physical geographer investigating biogeomorphology and vegetation dynamics.

Insang Song

INSANG SONG is a Doctoral Student in the Department of Geography at the University of Oregon, Eugene, OR 97403. His research interests include health geography, spatiotemporal modeling, explainable machine learning, and causal inference.
