SOCIAL SCIENCE

Sociospatial inequality: combining multilevel and spatial analysis

Pages 50-54 | Received 26 Mar 2012, Accepted 03 Feb 2013, Published online: 05 Mar 2013

Abstract

The idea that statistical relationships can change over time and space has been around for many centuries. Social scientists long ago began investigating how statistical relationships shift over time, using techniques such as time series analysis, and have only recently begun to explore shifts over geographical space. A map of the USA mainland, built from a geographically weighted regression fitted to recent population data, is presented to show the reader an instance in which a statistical relationship varies as a function of geographical space.

1. Introduction

Several decades ago, Robert Park (1926) eloquently argued that it is because geography influences the place and group of associates with whom each one of us is bound to live that it plays a role in determining the distribution of population. He went on to underscore the importance of his argument by explaining that it ‘is because social relations are so frequently and so inevitably correlated with spatial relations … that statistics have any significance’ for social scientists (Park, 1926, p. 18). He made it clear that researchers interested in how humans relate to one another must explore both the physical and social environments their subjects inhabit. He solidified his argument by stating that ‘it is only as social and physical facts can be reduced to, or correlated with, spatial facts that they can be measured at all’ (Park, 1926, p. 18). Since then, social scientists have sought ways to analyze human behavior while accounting for the spatial relations in which it occurs. An eloquent and more complete discussion of this topic has been given elsewhere (Porter & Howell, 2011).

Social scientists have long made use of techniques, such as time series analysis, to investigate how statistical relationships shift in time and have only recently begun to explore whether and how statistical relationships vary over space. In this paper, I explain how modern theoretical, statistical, and technological advances can be merged to investigate human behavior while accounting for the spatial relations in which it occurs. I make use of one map to demonstrate how statistical relationships can shift over geographical space. The Main Map uses outputs from a Geographically Weighted Regression (GWR) to depict an instance in which a statistical relationship varies over geographical space.

2. Method

The term ‘spatially non-stationary’ is a key concept in understanding this article. By way of analogy, we could say that in general water is wet anywhere on planet Earth – the ‘wetness’ of water would thus be ‘spatially stationary.’ However, the amount of salt in water (i.e. its salinity) differs by geographical location – that is, the salinity of water could be said to be spatially non-stationary. Thus, spatial non-stationarity simply refers to the idea that something changes over geographical space. A more complete theoretical discussion of the term is given elsewhere (Siordia, Saenz, & Tom, 2012). The specific aim of this project is to show the reader the spatially non-stationary statistical relationship between percent in poverty and percent Latino in the USA mainland. In the USA, the term Latino refers to a person, of any race, who is identified as being of Mexican, Puerto Rican, Cuban, Central or South American origin (Ennis, Rios-Vargas, & Albert, 2011).

The percentage of people living in poverty is the outcome variable of interest. Poverty in the American Community Survey (ACS) is calculated using standards outlined by the Office of Management and Budget based on monetary income before taxes. Thresholds vary by family size and composition and are updated annually for inflation using the Consumer Price Index. In general, if a family's total income is less than the federal poverty threshold for its type of family, then all family members are considered to be in poverty. For example, in 2012, Federal Poverty Guidelines indicated that a family of four with a total household income below $23,050 would be considered to be in poverty. Further details on measuring poverty in the USA can be found elsewhere (U.S. Census Bureau, 2010a). Please note that other covariates are present in the model, namely the percentage of non-Latino blacks and the percentage of individuals with a bachelor's degree or beyond. The former is included to more fully account for the presence of minorities and the latter is added as a proxy measure of the area's economic wellbeing.
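The poverty rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `percent_in_poverty`, the two areas, and the families are hypothetical, and only the family-of-four threshold (23,050) comes from the text; the other thresholds are made-up placeholders rather than official figures.

```python
from collections import defaultdict

# Illustrative thresholds by family size (annual pre-tax income, USD).
# Only the family-of-four value (23,050) is taken from the text; the other
# values are made-up placeholders, not official figures.
THRESHOLDS = {1: 11_000, 2: 15_000, 3: 19_000, 4: 23_050}


def percent_in_poverty(families):
    """Return {area: percent of persons in poverty}.

    Each family is a tuple (area, family_size, total_income). Every member
    of a family whose income is below its threshold counts as in poverty.
    """
    persons = defaultdict(int)
    poor = defaultdict(int)
    for area, size, income in families:
        persons[area] += size
        if income < THRESHOLDS[size]:
            poor[area] += size
    return {area: 100.0 * poor[area] / persons[area] for area in persons}


# Hypothetical families in two hypothetical areas.
families = [
    ("A", 4, 20_000),  # below 23,050: all four members counted in poverty
    ("A", 4, 60_000),  # above the threshold
    ("A", 2, 40_000),  # above the threshold
    ("B", 1, 9_000),   # below the placeholder one-person threshold
    ("B", 3, 50_000),  # above the threshold
]
result = percent_in_poverty(families)
print(result)
```

The point of the sketch is that poverty status is assigned to the family and then counted per person, so one large family below its threshold can move an area's percentage noticeably.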

The map was drawn using ArcGIS 10 (ESRI, 2011), a mapping software program for designing and managing spatial data. The data used come from the nationally representative ACS Public Use Microdata Sample (PUMS) 2005–2007 three-year files (U.S. Census Bureau, 2008). The microdata (i.e. information on individuals) can only be geographically located using Public Use Microdata Areas (PUMAs) (U.S. Census Bureau, 2007). PUMAs can be mapped by using the publicly released Topologically Integrated Geographic Encoding and Referencing (TIGER) ArcView shapefiles (U.S. Census Bureau, 2010b). The TIGER shapefile database is a description of the geographical structure of the USA. The Census TIGER database was initially created using the US Geological Survey (USGS) 1:100,000-scale Digital Line Graph (DLG), the USGS 1:24,000-scale quadrangles, the Census Bureau's 1980 geographic base files (GBF/DIME-Files), and a variety of sources for the areas outside the contiguous 48 states. TIGER shapefiles have coordinates to six decimal places, but the Bureau does not report their level of precision. Please note that the continuous surface of parameters estimated in the GWR equation makes use of a 60-neighbor bandwidth.

The main goal of this map is to display how the statistical relationship between percent in poverty and percent Latino changes over geographical space. The original map was projected in miles using the North American Albers Equal Area Conic (‘polygon’ geometry type with a ‘degree’ angular unit) system on a North American 1983 Geographic Coordinate System based on the 1983 North American Datum (NAD83). An automatic data frame extent was used with an average data frame extent of 1:17,542,365. The projection was selected after it was judged to be the most appropriate for handling the ‘bandwidth’ computation, with an adaptive kernel, necessary to estimate the statistical model outlined below. A bandwidth refers to the radius (in miles) used to determine the number of neighboring PUMA polygons used in the estimation of each regression coefficient.
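One way to picture an adaptive bandwidth is as the distance from each location to its k-th nearest neighbor, so that the radius grows where polygons are sparse. The sketch below assumes planar (projected) point coordinates, such as polygon centroids in miles; the function name `adaptive_bandwidths` and the four points are hypothetical, and k = 2 is used only to keep the example small (the analysis described here used 60 neighbors).

```python
import numpy as np


def adaptive_bandwidths(coords, k):
    """Distance from each point to its k-th nearest neighbor (self excluded)."""
    coords = np.asarray(coords, dtype=float)
    # Pairwise Euclidean distances between all points.
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    # After sorting each row, column 0 is the zero distance to the point
    # itself, so column k is the k-th nearest neighbor.
    return np.sort(dist, axis=1)[:, k]


# Hypothetical centroids: three clustered points and one isolated point.
points = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (10.0, 0.0)]
bandwidths = adaptive_bandwidths(points, k=2)
print(bandwidths)
```

The isolated point receives a much larger radius than the clustered ones, which is the behavior that lets every local regression draw on the same number of neighbors.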

The projected map outlined above was subsequently used to perform a GWR (Fotheringham, Brunsdon, & Charlton, 2002). In the simplest of terms, GWR generates a spatially calibrated regression model – the product of generating separate regression equations for every PUMA polygon to address spatial variation. Such models are used to explore whether and how statistical relationships vary over geographical space. By using GWR, this sociospatial investigation of inequality (as measured by poverty) estimated parameters using a weighting function based on geographical distance. In technical terms, with GWR, a continuous surface of parameter values is estimated under the assumption that locations nearer to location i (e.g. within the scope of the bandwidth) will have more influence on the estimation of the parameter β1-hat for that location. In short, GWR assumes parameters are functions of the geographical locations for which the observations are obtained.
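As a minimal sketch of the local estimation described above, the following NumPy code fits one weighted least-squares regression per location, with weights that decay with distance. Several things here are assumptions made for illustration: the bi-square kernel (the text does not say which kernel was used), the function name `gwr_coefficients`, and the synthetic data. It is not the ArcGIS implementation.

```python
import numpy as np


def gwr_coefficients(coords, X, y, k):
    """Fit one weighted least-squares regression per location.

    coords: (n, 2) projected coordinates; X: (n,) or (n, p) predictors
    without an intercept; y: (n,) outcome; k: neighbors for the adaptive
    kernel. Returns an (n, p + 1) array of local coefficients, intercept
    first.
    """
    coords = np.asarray(coords, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(y)
    design = np.column_stack([np.ones(n), np.asarray(X, dtype=float)])
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    betas = np.empty((n, design.shape[1]))
    for i in range(n):
        # Adaptive bandwidth: distance to the k-th nearest neighbor of i.
        h = np.sort(dist[i])[k]
        # Bi-square kernel (an assumption): weight falls to zero at h.
        w = np.where(dist[i] < h, (1.0 - (dist[i] / h) ** 2) ** 2, 0.0)
        sw = np.sqrt(w)
        # Weighted least squares via scaling rows by sqrt(weight).
        betas[i] = np.linalg.lstsq(design * sw[:, None], y * sw, rcond=None)[0]
    return betas


# Synthetic, noise-free data: the slope is +1 in the west and -1 in the east.
rng = np.random.default_rng(0)
n = 40
coords = np.column_stack([np.arange(n, dtype=float), np.zeros(n)])
pct_x = rng.uniform(0.0, 10.0, size=n)
local_slope = np.where(coords[:, 0] < 20, 1.0, -1.0)
y = local_slope * pct_x
betas = gwr_coefficients(coords, pct_x, y, k=8)
west_slope, east_slope = betas[0, 1], betas[-1, 1]
print(west_slope, east_slope)
```

On this synthetic data the local slope at the western edge is approximately +1 and at the eastern edge approximately -1, whereas a single global regression would average the two and hide the reversal.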

The GWR procedure in ArcGIS 10 produces data that contain coefficient values for each polygon (Fotheringham et al., 2002). The procedure simultaneously rasterizes the PUMA polygons (i.e. creates a pixel file from the vector graphics format used in the GWR procedure). Chris Brunsdon provided ESRI with the code for implementation. The GWR coefficient surfaces, generated as Arc Data Files (ADF: i.e. rasterized outputs), were then used to create the Main Map. In terms of symbolization in the map, a red-to-green color ramp is used to symbolize the spatially non-stationary relationship between percent in poverty and percent Latino – where a change in color symbolizes how the relationship shifts as a function of geographical location.

This is the first time that the spatially non-stationary statistical relationship between percent in poverty and percent Latino by PUMA has been displayed in such an innovative way. To be clear, the color ramp represents a series of numbers that range from −1.21 to 0.83. The numbers are the computed coefficients from the GWR equation executed in ArcGIS 10. The coefficients represent the spatial variation in the relationship between percent in poverty and percent Latino. Positive numbers (in green) signal a positive statistical relationship and negative numbers (in red) a negative statistical relationship. A positive relationship, in the context of the GWR equation and variables of interest, means that as the percentage of Latinos increases, the percentage of people in poverty increases. A negative relationship signals that as the percentage of Latinos increases, the poverty concentration decreases in the PUMA. The complete GWR equation produces a total of 1660 PUMAs with a positive coefficient and 397 with a negative coefficient (model details: residual squares = 1.99; effective number = 385.59; signal = 0.035; adjusted R² = 0.72).
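The counts reported above can be turned into shares with simple arithmetic. The snippet below is only a worked check using the two counts from the text; the variable names are illustrative.

```python
# Counts of PUMAs by sign of the local coefficient, as reported in the text.
positive = 1660
negative = 397

total = positive + negative
share_positive = 100 * positive / total
share_negative = 100 * negative / total

print(total, round(share_positive), round(share_negative))
```

Roughly four in five PUMAs therefore carry a positive local coefficient and about one in five a negative one.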

Understanding the socio-historical background fueling these statistical relationships is important. Although a full discussion is beyond the scope of this report, a brief example may help give the findings some theoretical context. Out of the 2057 PUMAs, 81% fit the commonly found micro-level positive association between being Latino and the likelihood of being in poverty (Siordia & Farias, 2013). In contrast, the 19% of PUMAs with a ‘negative-GWR-Latino-coefficient’ represent the reverse at a macro-level. For example, the negative statistical relationship between percent in poverty and percent Latino in the Lower Mississippi Delta region may be due to three factors: the high level and long-term presence of poverty in the area; the low level of Latino concentrations (see Siordia, Panas, & Delgado, 2012); and post-Katrina economics in the region. The Lower Mississippi Delta region has experienced deep poverty for many decades (Poston, Singelmann, Siordia et al., 2010). It is possible that Latinos would not reside in this area unless there is a healthy local labor market. Thus, an increase in Latinos would be a signal that poverty in the PUMA is low relative to surrounding areas – where Latino concentrations would also be low.

The main point, as can be seen from the map, is that at the macro-level, the increased presence of Latinos is not always associated with increased levels of poverty. The map makes it clear: the relationship between percent in poverty and percent Latino is spatially non-stationary at the PUMA level.

3. Conclusions

This paper presents an overview of the spatially non-stationary statistical relationship between percent in poverty and percent Latino by PUMA. The Main Map visually depicts the important, yet typically ignored, fact that social behaviors can aggregate to create macro-level spatial non-stationarity. This project expands our understanding of what spatial non-stationarity is and gives an instance of its presence in the USA mainland. Future work should explore the same topic while accounting for polygon fragmentation in PUMAs (see Siordia & Fox, 2013). Researchers were admonished long ago to account for the spatial relationships that govern human behaviors. As advances in research continue to move toward spatially aware modeling of human behavior, it is important that scientists continue to explore whether spatial non-stationarity plays a role in their research.

Software

Raw microdata, linked to PUMAs in ArcGIS 10, were managed using SAS 9.3. TIGER shapefiles were used in ArcGIS 10 to draw the Main Map. In particular, the GWR equation was executed in the ArcToolbox component of the software.

Map Design

My map design decisions were driven solely by display considerations. The map was used in my dissertation defense. I decided to use the color ramp because I believe it is more esthetically pleasing and draws in the would-be reader by triggering their interest.

References

  • Ennis, S. R., Rios-Vargas, M., & Albert, N. G. (2011). The Hispanic population: 2010. U.S. Department of Commerce, Economics and Statistics Administration, 2010 Census Brief issued May 2011 (C2010BR-04).
  • ESRI. (2011). ArcGIS desktop: Release 10. Redlands, CA: Environmental Systems Research Institute.
  • Fotheringham, A. S., Brunsdon, C., & Charlton, M. E. (2002). Geographically weighted regression: The analysis of spatially varying relationships. West Sussex, UK: John Wiley.
  • Park, R. E. (1926). The urban community as a spatial pattern and a moral order. In E. W. Burgess (Ed.), The urban community (pp. 3–21). Chicago: University of Chicago Press.
  • Porter, J. R., & Howell, F. M. (2011). Geographical sociology: Theoretical foundations and methodological application in the sociology of location. GeoJournal Library 105, Springer Science+Business Media BV.
  • Poston, D. L., Singelmann, J., Siordia, C., et al. (2010). Spatial context and poverty: Area-level effects and micro-level effects on household poverty in the Texas borderland and lower Mississippi delta: United States, 2006. Applied Spatial Analysis and Policy, 3, 139–162.
  • Siordia, C., & Farias, R. A. (2013). A multilevel analysis on Latino's economic inequality: A test of the minority group threat theory. Pages 65–79 in the Economic Status volume of the Hispanic Population edited by Richard Verdugo, in press.
  • Siordia, C., & Fox, A. (2013). Public use microdata area fragmentation: Research and policy implications of polygon discontiguity. Spatial Demography, 1, 42–56, forthcoming.
  • Siordia, C., Panas, L. J., & Delgado, D. J. (2012). Geographing Latinoization in the U.S. mainland: Mexican origin Latino population growth between 2000 and 2010 by county. Report in the Hispanic Economic Outlook, Spring: 9–14.
  • Siordia, C., Saenz, J., & Tom, S. E. (2012). An introduction to macro-level spatial nonstationarity: A geographically weighted regression analysis of diabetes and poverty. Human Geographies, 6(2), 5–13. doi:10.5719/hgeo.2012.62.5
  • U.S. Census Bureau. (2007). 2007 TIGER/Line Shapefiles. Technical documentation prepared by the U.S. Census Bureau, Washington, DC.
  • U.S. Census Bureau. (2008). A compass for understanding and using American Community Survey data: What general data users need to know. Washington, DC: U.S. Government Printing Office.
  • U.S. Census Bureau. (2010a). Percent in poverty, 2009. Small Area Income and Poverty Estimates (SAIPE) Program (December 2010), U.S. Department of Commerce, Economics and Statistics Administration: SAIPE09-1.1.
  • U.S. Census Bureau. (2010b). TIGER/Line shapefiles technical documentation. Prepared by the U.S. Census Bureau, 2011.
