Research Article

Spatial Models of Travel Behavior and Land Use Restriction

Article: 2174661 | Received 07 Apr 2022, Accepted 14 Dec 2022, Published online: 15 Feb 2023

Abstract

This paper develops a spatial econometric model of transportation mode choice and tests the association between zoning and other built environment variables and the choice of auto and non-auto transportation. We provide an extensive review of spatial econometrics and demonstrate the importance of using models that treat space formally when investigating urban transportation behavior. Using a unique combination of travel, employment, and built environment datasets from Denver, Colorado, we confirm previous results that built environment variables have a small association with choice of transportation mode and show the benefits of formal spatial modeling over the traditional probit model.

Introduction

Location of economic activity has always been an important aspect of regional economics and real estate. Spatial proximity plays a key role in many decisions made by individuals when it comes to weighing the benefits and costs of purchase decisions, allocation of resources, and other economic behavior. Transportation choices are particularly affected by location, and distances between origins and destinations undoubtedly influence travel decisions among individuals.

Formal econometric techniques that address location are a more recent development in the field. Of particular importance to spatial econometrics is the treatment of spatial dependence (spatial autocorrelation) and spatial heterogeneity (spatial structure). Both are important in applied economic models because their presence may invalidate or bias results from mainstream approaches. In addition, these issues have been largely ignored in the mainstream literature (Anselin, 1988).

This paper focuses on consumer transportation mode choice in a spatial context to demonstrate the importance of formal spatial modeling techniques in transportation behavior research. Previous studies of transportation choice largely implement OLS, probit, logit, or multinomial logit models to estimate impacts on mode choice (Ewing & Cervero, 2010). To investigate the importance of formal spatial modeling, we use the dataset from a previous study on the impacts of zoning on mode choice (Mueller & Trujillo, 2019) and formally model spatial dependence and spatial heterogeneity to test whether the results are significantly different from the standard econometric approach where space is dealt with informally. It is particularly important to investigate the presence of spatial dependence and heterogeneity in regard to land use restrictions because each individual faces a unique set of transportation choices based on their residential location and the proximity of this residential location to available goods, services, recreation, transportation, and employment opportunities. While built environment variables have been studied for their impact on driving behavior (Vance & Hedel, 2008), their role in land use segmentation (Levkovich et al., 2018), development and density (Butsic et al., 2011; Newburn & Ferris, 2016), and sprawl (Vyn, 2012), to the author’s knowledge this is the first study that employs formal spatial modeling to investigate the association of zoning restrictions with travel mode choice. From a policy perspective, models that inform the impact that a change in zoning laws might have on transportation choice can further inform alternative transportation agendas. Non-auto forms of transportation and built environments that support their use can provide traffic congestion relief, and so are the primary focus of this research, although other agendas, such as the transition to electric vehicles, may have other benefits.

Previous work has used a choice-specific multinomial logit model to estimate the effect of built environment, socioeconomic, and land restriction measures on the propensity of survey respondents to use four mode choice alternatives: auto, transit, bike, and walk (Ewing & Cervero, 2010; Mueller & Trujillo, 2019). Because walk (7.42%), bike (1.5%), and transit (7.71%) each represent a small subset of the dataset, we condense the dependent variable choice set into a binary choice between auto and non-auto transportation (16.6%). This paper adds to the previous work by formally implementing econometric techniques that explicitly deal with the bias and inefficiency introduced into the estimation of effects if spatial autocorrelation and/or spatial heterogeneity are present in the underlying data generating process. The dataset does not allow for testing of causality in the relationship between the built environment and transportation choice. We acknowledge that there may be simultaneity and self-selection in the dynamic process that determines the relationship between mode preference and zoning variables: respondents may choose to locate in areas that align with their transportation preferences, and may also influence the political process that determines zoning in their neighborhoods. Testing of such relationships would require repeat observations of households in a longitudinal survey, and such a survey is not currently available.

Using transportation survey data from the City and County of Denver in 2010 and corresponding built environment measures during that time, we test the association of zoning and other built environment variables with survey respondents’ choice of auto and non-auto transportation while controlling for demographic and preference variables collected in the survey. The unit of observation is a “tour” taken by individuals where a respondent leaves the home, makes stops for various activities, and returns home. The key independent variables of interest in this study are the land use restriction variables, measured as percentages of different zoning types (low, medium, and high density residential, low and high density business, and industrial) within a 1/4, 1/4–1/2, 1/2–3/4, and 3/4–1-mile radius surrounding each survey respondent’s home. For a more thorough explanation of the dataset and its construction, see Mueller and Trujillo (2019) and the descriptive statistics in Table 1.

Table 1. Descriptive statistics.

In this work, we formally test for spatial autocorrelation and spatial heterogeneity in the data and apply spatial econometric methods to correct for spatial components found in the data generating process. The paper reviews the relevant spatial econometric theory and applies the Moran’s I and Geary’s C tests for spatial processes at work in the data generating process. We then explore several models that address spatial autocorrelation, spatial heterogeneity, and both jointly, and compare them with the results of the canonical probit approach. Each model is estimated with the two most commonly used spatial weights matrices in the literature, the row standardized binary weights matrix and the row standardized inverse distance matrix. The results from each model show that there is a high likelihood of the presence of spatial processes in the data generating process and that these models are preferable to canonical approaches to estimating travel mode choice behavior in this travel survey sample. We provide evidence that the spatial Durbin model (SDM) with a binary spatial weights matrix is the superior model that accounts for both spatial dependence and heterogeneity, and accounts for spatial processes that are not modeled in the traditional probit approach. In particular, if spatially lagged forms of the dependent and independent variables show signs of spatial dependence or heterogeneity, only the spatial Durbin model addresses both spatial phenomena simultaneously (LeSage & Pace, 2009).

Reviewing the Spatial Econometric Approach

Spatial econometrics differentiates itself from mainstream econometric approaches by applying formal spatial modeling best summarized in Luc Anselin’s pioneering work on the topic: “I will consider the field of spatial econometrics to consist of those methods and techniques that, based on formal representation of the structure of spatial dependence and spatial heterogeneity, provide the means to carry out the proper specification, estimation, hypothesis testing, and prediction for models in regional science” (Anselin, 1988, p. 10).

Spatial Effects

Real estate economics and regional science inherently deal with issues related to human behavior across space, cities, and regions. The term spatial econometrics and its designation as a distinct branch of econometrics dates back to the seminal work of Paelinck and Klaassen (1979) that collected a growing body of literature in the regional sciences that attempted to formally deal with the problems inherent in modeling spatial data in the context of regional econometric models. The primary characteristics that delineate the field according to Paelinck and Klaassen (1979) and summarized by Anselin (1988, p. 7) are: (1) the role of spatial interdependence in spatial models, (2) the asymmetry in spatial relations, (3) the importance of explanatory factors located in other spaces, (4) differentiation between ex post and ex ante interaction, and (5) explicit modeling of space. While it is possible to measure and model spatial data using standard econometric techniques by including variables in the model that have a spatial nature to their measurement, the distinction to be made here is that spatial econometrics formally deals with specific spatial aspects of the data that preclude the use of traditional econometric techniques, and more particularly, addresses spatial dependence and spatial heterogeneity formally (Anselin, 1988; LeSage & Pace, 2009).

Spatial dependence addresses the lack of mutual independence across observations in cross-sectional data sets and is often referred to as spatial autocorrelation following the path-breaking work of Cliff and Ord (1969, 1973). In essence, addressing spatial dependence is the development of formal statistical specifications of economic models that address Tobler’s first law of geography, that “everything is related to everything else, but near things are more related than distant things” (Tobler, 1970, p. 236). Spatial dependence is estimated by the relative location of one observation in the dataset to another, with an emphasis on the effect of distance between observations. Spatial dependence is caused by a variety of measurement errors, by spatial spill-over effects, or spatial externalities (Anselin, 1988), by spatially autocorrelated variables (Fingleton & Lopez-Bazo, 2006), or any situation in which the covariance of observations across geographical space is not equal to zero (Anselin, 2001). For example, spatial autocorrelation is often found in hedonic pricing models of residential real estate, where the sale price of one residential property is influenced by housing prices in surrounding neighborhoods (von Graevenitz & Panduro, 2015).

Spatial heterogeneity is the “lack of stability over space of the behavioral or other relationship under study. More precisely, this implies that functional forms and parameters vary with location and are not homogeneous throughout the data set” (Anselin, 1988, p. 9). This type of econometric model addresses these issues by formally modeling the variation in parameters across space to address the heterogeneous effect an independent variable may have in different locations. More importantly, when spatial dependence and spatial heterogeneity are present in the data generating process and not explicitly modeled, the results of mainstream econometric techniques may be biased, inefficient, or both (Anselin, 1988; LeSage & Pace, 2009; Schnier & Felthoven, 2011). Spatial econometric techniques address spatial processes within the data generating process and are generally preferred when differences due to spatial location are present in data. An example of spatial heterogeneity is the variation in the effect income may have on travel mode preferences across the urban landscape. Income may have the opposite effect on preferences to drive in suburban locations than it does in central business districts because higher incomes allow suburban dwellers greater access to automobiles, while in urban locations higher income may allow individuals to live in areas with better access to goods and services, thus increasing reliance on alternative forms of transportation.

Formally Modeling Spatial Interaction

At the center of spatial econometrics is defining spatial association amongst observations (Anselin, 1988, 2010; Arbia, 2006). To formally address the spatial connectedness of observations across space, an approach has been developed which uses a decision rule that determines whether two observations are spatial neighbors and thus close enough to exert influence on each other. The typical convention is to formally define spatial connectedness through the use of a symmetric matrix $W$ of dimensions equal to the number of observations $n$, whose strictly non-negative elements $w_{ij}$ indicate the spatial connectedness between units $i$ and $j \neq i$. With the spatial neighbor matrix constructed, spatial modeling proceeds by re-weighting each row to develop a spatial weights matrix, then pre-multiplying either the dependent or independent variables by the spatial weights matrix and estimating a vector of coefficients that includes a spatial dependence parameter. This modeling approach formally connects variables of neighboring observations through the spatial weights matrix $W$ and produces an estimate of spatial association in the data generating process through the spatial dependence parameter(s). To demonstrate the use of the spatial weights matrix, the spatial autoregressive model equation is illuminating. In its simplest form with no independent regressors, the spatial autoregressive model equation is:

$$y_i = \rho \sum_{j} w_{ij} y_j + \varepsilon_i \tag{1}$$

The term $\sum_{j} w_{ij} y_j$ gives a weighted sum of each neighboring observation $j$’s dependent variable $y_j$, $j \neq i$. The estimated spatial dependence parameter $\rho$ gives a measure of the influence those neighboring observations have on each $y_i$ observation. High values of $\rho$ indicate strong spatial autocorrelation between observations, while a value of 0 indicates no spatial autocorrelation. In addition to measuring the direct influence of neighbors $j$ on observation $i$, the parameter $\rho$ is sometimes referred to as the spatial decay parameter, because it also indicates how fast the effect of neighboring observations declines with higher order neighbors, i.e. neighbors of neighbors (Anselin, 1988; LeSage & Pace, 2009). For example, second order neighbors of $y_i$ are first order neighbors of $y_i$’s first order neighbors and have an influence on $y_i$ equal to $\rho^2$, their influence on $y_i$ being exerted indirectly through $y_i$’s direct neighbors. Influence dissipates as observations become further removed from $y_i$, and $k$th order neighbors have an influence equal to $\rho^k$. Thus, values of $\rho$ closer to 1 indicate a slowly dissipating influence, while values close to 0 indicate an effect that quickly dissipates with higher order neighbors.
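The decay of influence with neighbor order can be illustrated numerically. The following is a minimal sketch, not the estimation code used in this study: it assumes a hypothetical four-unit chain of neighbors, a row standardized $W$, and the reduced form $y = (I - \rho W)^{-1}\varepsilon$, whose inverse expands as $I + \rho W + \rho^2 W^2 + \dots$ when $|\rho| < 1$. The function names and the value of `rho` are illustrative assumptions.

```python
import numpy as np

def row_standardize(w):
    """Divide each row by its sum so that every row sums to 1."""
    return w / w.sum(axis=1, keepdims=True)

def neumann_series(w, rho, order):
    """Approximate (I - rho*W)^-1 by I + rho*W + ... + rho^order * W^order."""
    n = w.shape[0]
    total = np.eye(n)
    term = np.eye(n)
    for _ in range(order):
        term = (rho * term) @ w  # next term: rho^k * W^k
        total += term
    return total

# Hypothetical example: four units on a line, each linked to adjacent units.
neighbors = np.array([
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
], dtype=float)
W = row_standardize(neighbors)
rho = 0.5  # assumed value, for illustration only

exact = np.linalg.inv(np.eye(4) - rho * W)
approx = neumann_series(W, rho, order=50)
max_gap = np.abs(exact - approx).max()

# Scalar weight attached to k-th order neighbor terms: rho, rho^2, rho^3, ...
decay = [rho ** k for k in range(1, 5)]
```

With `rho = 0.5` the weight on each successive order of neighbors halves, so the truncated series converges quickly to the exact inverse; a value of `rho` nearer 1 would need many more terms.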

The literature has yet to determine a formal approach to developing the spatial weights matrix, although several approaches have widespread adoption. The pioneering work of Moran (1950) and Geary (1954) developed the notion of a binary weights matrix $W$, where each element $w_{ij}$ was assigned a value of 1 if two observational units were neighbors and assumed to exhibit influence on each other, and 0 otherwise. The spatial weights matrix was originally developed in the context of areal units and neighbors were defined as two observational units that shared a common border (Cliff & Ord, 1973). When observational units are points in space rather than areal units (as the data in this study are), neighbors are identified based on distance. Two spatial point observations $i$ and $j$ are considered neighbors if $0 \le d_{ij} \le D$, where $d_{ij}$ is the distance between points $i$ and $j$ and $D$ is the bandwidth after which interaction between observations is considered non-existent and $w_{ij}$ is assigned a 0 weight (Anselin, 1988). Assignment of a zero weight does not preclude spatial effects occurring between more distant neighbors, however. Instead, influence is modeled as a higher order recursive effect through the estimated spatial dependence parameter as discussed above. Thus, observations that are not direct neighbors can influence each other indirectly through intermediary neighbors that connect them. Once a binary spatial weights matrix is constructed which determines which observations are neighbors of each other, the spatial weight matrix is often row standardized so that each row sums to 1. Row standardization normalizes spatial effects across a dataset, preventing observations that have many spatial neighbors from dominating coefficient estimates (Anselin, 1988).

Additional weighting schemes have also been applied to the binary weights matrix, and it is currently convention to row standardize the spatial weight matrices after applying alternative weighting schemes. If an alternative weighting scheme is applied, the construction of the spatial weights matrix becomes a two-step process, first constructing a binary spatial neighbor matrix as above, then multiplying this matrix by another measure of spatial association. Cliff and Ord (1973, 1981) pioneered this concept by multiplying the binary spatial neighbor matrix by the inverse of the distance between observations. This approach places higher weights on neighboring observations that are closer, while still placing zero weights on neighbors that are further apart than distance $D$. In this study, we test the sensitivity of results of all models to the choice of spatial weights matrix by estimating models with both a binary row standardized spatial weights matrix and an inverse distance row standardized weights matrix.

Whether to standardize the spatial weights matrix is not settled in the literature, and decisions on how to form the spatial weights matrix are generally determined by a priori assumptions made by the researcher in the context of each study. Anselin (1988) argues that in certain cases, such as inverse distance, the standardization of the spatial weights matrix may eliminate the economic interpretation of the results. However, the consensus is that the standardization of the spatial weights matrix is the preferable approach to avoid magnitude complications amongst variables and avoid certain spatially weighted variables dominating the results of spatial models (LeSage & Pace, 2009).

Formally, each element of a binary spatial weights matrix (spatial neighbor matrix) is calculated based on a decision rule. For contiguity neighbors, each element $w_{ij} = 1$ if the two areal units represented as polygons share a common boundary, and 0 otherwise. For distance based neighbors, $w_{ij} = 1$ if $d \le D$ and 0 otherwise, where $d$ is the distance between observations $i$ and $j$, and $D$ is a pre-determined distance threshold above which observations are said to exhibit no direct influence on each other. The choice of the distance threshold $D$ is not well developed in the literature and is typically based on domain knowledge or correspondence to other distance measurements in the dataset. Each element of a row standardized spatial weights matrix $W^s$ is calculated as:

$$w_{ij}^{s} = \frac{w_{ij}}{\sum_{j} w_{ij}} \tag{2}$$

where each unstandardized element $w_{ij}$ equals 0 or 1 in a binary specification, and $1/d$ in an inverse distance specification if the two observations are neighbors. Matrix $W^s$ is used to link neighboring observations in spatial regression models, which produces estimates of coefficients on the resulting spatially weighted variables.
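The two weighting schemes described above can be sketched in a few lines of NumPy. This is an illustrative construction under stated assumptions, not the code used to build the matrices in this study: the coordinates, the threshold of 2.5 distance units, and all function names are hypothetical, and rows with no neighbors are left as zeros, which is one of several possible conventions.

```python
import numpy as np

def pairwise_distances(coords):
    """Euclidean distance between every pair of points (n x n matrix)."""
    diff = coords[:, None, :] - coords[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=2))

def binary_neighbors(dist, threshold):
    """w_ij = 1 if d_ij <= threshold and i != j, else 0."""
    w = (dist <= threshold).astype(float)
    np.fill_diagonal(w, 0.0)  # an observation is not its own neighbor
    return w

def inverse_distance_weights(dist, threshold):
    """Binary neighbor matrix multiplied element-wise by 1 / d_ij."""
    # Zero distances are mapped to infinity so that 1/d becomes 0
    # (coincident points therefore receive zero weight: an assumption).
    safe = np.where(dist > 0, dist, np.inf)
    return binary_neighbors(dist, threshold) / safe

def row_standardize(w):
    """Divide each row by its sum; rows without neighbors stay all zero."""
    row_sums = w.sum(axis=1, keepdims=True)
    return np.divide(w, row_sums, out=np.zeros_like(w), where=row_sums > 0)

# Hypothetical point locations; the last point is isolated.
coords = np.array([[0.0, 0.0], [1.0, 0.0], [3.0, 0.0], [10.0, 0.0]])
dist = pairwise_distances(coords)
W_bin = row_standardize(binary_neighbors(dist, threshold=2.5))
W_inv = row_standardize(inverse_distance_weights(dist, threshold=2.5))
```

In this toy example the second point has two neighbors at distances 1 and 2. The binary scheme weights them equally, while the inverse distance scheme gives the closer neighbor twice the weight of the farther one.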

Measuring Spatial Dependence

Constructing a spatial weights matrix allows for formal testing of spatial dependence in the data generating process. The canonical measure of spatial dependence was developed by Moran (1950) and is widely used across many fields of study. Moran’s I is a global test of spatial dependence. Shortly after Moran, Geary (1954) developed a formal test of localized autocorrelation, known as Geary’s C. Moran’s I indicates the level of global spatial autocorrelation, while Geary’s C indicates localized spatial autocorrelation and therefore the possibility that spatial heterogeneity is also present in the data generating process.

Moran’s I ranges between −1 and 1, with values near 1 (−1) indicating positive (negative) spatial autocorrelation, and values near 0 indicating weak spatial dependence of the observed variables (Moran, 1950). Moran’s I can be interpreted as a spatial version of a typical correlation calculation. Geary’s C ranges between 0 and 2, with values <1 demonstrating increasing positive spatial autocorrelation and values >1 indicating increasing negative spatial autocorrelation (Geary, 1954). Formally, Moran’s I is calculated as:

$$I = \frac{n}{\sum_{i=1}^{n}\sum_{j=1}^{n} w_{ij}} \cdot \frac{\sum_{i=1}^{n}\sum_{j=1}^{n} w_{ij}(x_i - \bar{x})(x_j - \bar{x})}{\sum_{i=1}^{n}(x_i - \bar{x})^2} \tag{3}$$

and Geary’s C is calculated as:

$$C = \frac{n-1}{2\sum_{i=1}^{n}\sum_{j=1}^{n} w_{ij}} \cdot \frac{\sum_{i=1}^{n}\sum_{j=1}^{n} w_{ij}(x_i - x_j)^2}{\sum_{i=1}^{n}(x_i - \bar{x})^2} \tag{4}$$
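Both statistics are straightforward to compute directly from their definitions. The sketch below is a hypothetical illustration rather than the implementation used for the results in this paper: it assumes a four-unit chain with an unstandardized binary neighbor matrix and two made-up variables, one varying smoothly across space and one alternating between neighbors.

```python
import numpy as np

def morans_i(x, w):
    """Moran's I: weighted cross-products of deviations from the mean."""
    n = x.size
    z = x - x.mean()
    return (n / w.sum()) * (z @ w @ z) / (z @ z)

def gearys_c(x, w):
    """Geary's C: weighted squared differences between paired observations."""
    n = x.size
    z = x - x.mean()
    sq_diff = (x[:, None] - x[None, :]) ** 2
    return ((n - 1) / (2.0 * w.sum())) * (w * sq_diff).sum() / (z @ z)

# Hypothetical neighbor structure: four units on a line.
W = np.array([
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
], dtype=float)

x_smooth = np.array([1.0, 2.0, 3.0, 4.0])         # neighbors are similar
x_alternating = np.array([1.0, -1.0, 1.0, -1.0])  # neighbors are opposite

i_smooth, c_smooth = morans_i(x_smooth, W), gearys_c(x_smooth, W)
i_alt, c_alt = morans_i(x_alternating, W), gearys_c(x_alternating, W)
```

The smooth variable yields a positive Moran’s I and a Geary’s C below 1, while the alternating variable yields a negative Moran’s I and a Geary’s C above 1, matching the interpretation given above.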

Using the equations above, it is possible to test Moran’s I and Geary’s C statistics against their theoretical values under different distributional assumptions. We test these two statistics against their theoretical values under a normal distribution and the results are shown as significance stars in Table 2. As can be seen from the equations above, both statistics I and C are measurements of the covariance of deviation from the mean of observations of a single variable $x$ across a dataset, linked through the spatial weights matrix $W$. Thus, one can think of the measures as clustering of deviations from the mean. If neighboring observations defined through $W$ deviate from the mean in the same direction, high spatial clustering (autocorrelation) is present.

Table 2. Moran’s I and Geary’s C statistics.

Table 2 shows that Moran’s I and Geary’s C for most variables in the current dataset are statistically significant at the 1% level, providing confidence in their estimated values. Comparing Moran’s I using the binary and inverse distance weight matrices, one can see that the inverse distance matrix picks up higher values of positive spatial correlation for demographic variables (household size, vehicles, bikes, age, income, and college education). This is an indication that the closest neighbors are more spatially correlated. Urban form variables have Moran’s I values close to 1 using either spatial weights matrix, indicative of neighboring observations sharing a common urban form. Geary’s C statistics also show local positive autocorrelation under both spatial weights matrices, with the association also being stronger across most variables using the inverse distance spatial weights matrix. This result is an indication that spatial dependence may be present both globally (spatial autocorrelation) and locally (spatial heterogeneity) in the underlying data generating processes. The dataset may include clustering that results from socioeconomic traits, political zoning boundary determination, and transit network design among other spatial phenomena, which is not surprising, considering that spatial segregation of land use is one of the objectives of zoning laws (Levkovich et al., 2018), and socioeconomic segregation is a widely accepted phenomenon.

The test results from Table 2 justify using spatial econometric modeling techniques to address the spatial dependence and heterogeneity that are present in the data. We develop three specifications of spatial models to correct for these spatial processes: the spatial autoregressive model (SAR), which addresses spatial dependence; the spatial error model (SEM), which addresses spatial heterogeneity; and the spatial Durbin model (SDM), which simultaneously addresses spatial dependence and spatial heterogeneity. As suggested by LeSage and Pace (2009), we estimate each of the models using the two most common row standardized spatial weights matrices and use a Lagrange multiplier test to determine which spatial weight matrix best fits the data. The two spatial weighting schemes employed in the spatial weights matrix before row standardization are the two most common in the literature, binary and inverse distance. Both spatial weight matrices are then row standardized before estimating each model.

The Spatial Autoregressive Model

The spatial autoregressive model (SAR) formally estimates the presence of spatial dependence by incorporating a spatially lagged dependent variable on the right-hand side of the regression equation (Cliff & Ord, 1973). Thus, observations of the dependent variable are influenced by other observations of the dependent variable nearby. In the context of the present study, the SAR model is a way of controlling for the influence of neighboring survey respondents’ transportation mode choices on the observational unit under study which represents a spatial clustering effect. In the binomial context, the choice variable observed (transportation mode = auto or non-auto) depends on the underlying utility of the choice indicator observed. The underlying latent variable $y_i^* = U_{1i} - U_{0i}$ is assumed to follow a normal distribution in the probit model estimation. The general spatial autoregressive model in a binomial context can be formally stated in the system of equations as:

$$y^* = \rho W y + X\beta + \varepsilon, \qquad \varepsilon \sim N(0, \sigma^2 I_n) \tag{5}$$

$$y_i = 1 \text{ if } y_i^* \ge 0, \qquad y_i = 0 \text{ if } y_i^* < 0$$

where $y_i^*$ is the unobserved latent utility of mode choice, $y_i = 1$ if the binomial choice is observed, and 0 otherwise, $W$ is the spatial weights matrix, $\rho$ is an estimated spatial dependence parameter of spatial autocorrelation between observations, $X$ is a matrix of independent variables, and $\beta$ is a vector of estimated coefficients. The latent utility construct implies that $\Pr(y_i = 1) = \Pr(U_{1i} \ge U_{0i}) = \Pr(y_i^* \ge 0)$ (LeSage & Pace, 2009).
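A simulated data generating process can make the latent utility construct concrete. The sketch below is an illustration under explicit assumptions, not the estimation procedure used in this study: it assumes the spatial lag operates on the latent utility, so that the reduced form is $y^* = (I - \rho W)^{-1}(X\beta + \varepsilon)$, and it uses a hypothetical ring of 200 units with made-up parameter values. All function names are illustrative.

```python
import numpy as np

def ring_weights(n):
    """Row standardized weights for n >= 3 units on a ring (two neighbors each)."""
    w = np.zeros((n, n))
    idx = np.arange(n)
    w[idx, (idx - 1) % n] = 0.5
    w[idx, (idx + 1) % n] = 0.5
    return w

def simulate_sar_probit(w, x, beta, rho, rng, sigma=1.0):
    """Draw latent utilities from an assumed SAR reduced form and threshold at 0."""
    n = w.shape[0]
    eps = rng.normal(0.0, sigma, size=n)
    # Solve (I - rho*W) y_star = X beta + eps instead of forming the inverse.
    y_star = np.linalg.solve(np.eye(n) - rho * w, x @ beta + eps)
    y = (y_star >= 0).astype(int)  # observed binary choice
    return y_star, y

rng = np.random.default_rng(0)
n = 200
W = ring_weights(n)
X = np.column_stack([np.ones(n), rng.normal(size=n)])  # intercept + one regressor
beta = np.array([-0.5, 1.0])                           # assumed coefficients
y_star, y = simulate_sar_probit(W, X, beta, rho=0.6, rng=rng)
```

Only `y` would be observed in survey data; `y_star` is returned here purely so the thresholding step can be inspected.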

Typically, the SAR model is used to adjust for dependent variables that have a direct effect on the realization of the dependent variable in close proximity. The classic example is SAR hedonic pricing models of residential home values (e.g., Pace & Barry, 2004), where the value of a house sold has a direct impact on other residential home prices in the area and has been shown to be a valuable addition to traditional home price models (Anselin & Lozano-Gracia, 2007). Conceptually, the SAR would be the correct model for the underlying data generating process if a survey respondent’s choice to use auto or non-auto transportation depended upon neighboring survey respondents’ transportation mode choices, i.e. a clustering effect of mode choice.

While theoretically, the model has justifiable merit in controlling for spatial dependence, it is important to note that this model does not distinguish the direction of causality, only the association of built environment characteristics with transportation mode choice. It is quite possible that people who enjoy non-auto forms of transportation tend to live in the same locations because these locations provide employment, leisure, and shopping in close enough proximity to make non-auto trips more convenient. However, this model does identify if there is spatially clustered transportation behavior, and how fast this clustering effect deteriorates with distance. If the spatial dependence parameter ρ is significant, explicitly modeling spatial dependence is justified and therefore relevant to the study of spatial associations between zoning laws and transportation choices. While this model cannot determine the underlying cause of clustering in mode choice, it does control for the spatial phenomenon, leading to unbiased estimates of the association between urban form and transportation behavior.

It is also important to note that in the SAR model, the spatial dependence parameter $\rho$ incorporates a feedback loop in the effect of neighboring observations on the dependent variable. There is a direct effect of independent variables on transportation choice, and this transportation choice then indirectly affects transportation mode choices of neighboring observations, which in turn affect the observation under study, creating a spatial feedback loop effect. Thus, direct, indirect, and total effects of independent variables on the dependent variable are estimated.

The Spatial Error Model

In contrast to the spatial autoregressive model, the spatial error model (SEM) allows for heterogeneous effects of independent regressors across space. This adaptation of the traditional OLS or probit model allows for both global coefficients ($\beta$) and local variation across the space of coefficients to be modeled in the error structure. In the binomial choice context, the latent variable approach of unobserved utility of the resulting choice indicator is used for the probit estimator similar to the process described for the SAR model. The SEM binomial choice model can be formally stated as:

$$y^* = X\beta + u, \qquad u = \lambda W u + e, \qquad e \sim N(0, \sigma^2 I_n) \tag{6}$$

$$y_i = 1 \text{ if } y_i^* \ge 0, \qquad y_i = 0 \text{ if } y_i^* < 0$$

where $W$ is the spatial weights matrix. The SEM model allows for spatial variance of the error term and the estimation of its spatial lag parameter $\lambda$. Unlike the SAR model, separate direct and indirect effects cannot be estimated, because without an autocorrelation parameter on the dependent variable there is no feedback loop from changes in the regressors of neighboring observations to the dependent variable. The parameter $\lambda$ represents the extent to which heterogeneous independent coefficient estimates vary across space. This is the correct model to use if neighboring respondents’ transportation mode choices do not affect an individual’s mode choice, but independent variables have varying effects across space, such as the variation in the effect of income described earlier.
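The error structure $u = \lambda W u + e$ implies that the errors are correlated across observations even though $e$ is independent. Solving for $u$ gives $u = (I - \lambda W)^{-1} e$, so the covariance matrix of $u$ is $\sigma^2 [(I - \lambda W)^{\top}(I - \lambda W)]^{-1}$. The following minimal sketch computes that matrix for a hypothetical four-unit ring; the neighbor structure, the value of `lam`, and the function names are assumptions made for illustration.

```python
import numpy as np

def ring_weights(n):
    """Row standardized weights for n >= 3 units on a ring (two neighbors each)."""
    w = np.zeros((n, n))
    idx = np.arange(n)
    w[idx, (idx - 1) % n] = 0.5
    w[idx, (idx + 1) % n] = 0.5
    return w

def sem_error_covariance(w, lam, sigma2=1.0):
    """Covariance of u when u = lam * W u + e and e ~ N(0, sigma2 * I)."""
    n = w.shape[0]
    a = np.eye(n) - lam * w
    return sigma2 * np.linalg.inv(a.T @ a)

W = ring_weights(4)
cov_independent = sem_error_covariance(W, lam=0.0)  # reduces to the identity
cov_spatial = sem_error_covariance(W, lam=0.5)      # off-diagonals become nonzero
```

With `lam = 0` the covariance is diagonal, as in the standard probit. With a positive `lam`, errors of direct neighbors and of neighbors of neighbors are both positively correlated, which is the dependence that a non-spatial estimator ignores.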

Spatial Durbin Model

The spatial Durbin model (SDM) allows for the estimation of both spatial autocorrelation and spatial heterogeneity simultaneously by including a spatially lagged dependent variable as well as spatially lagged independent variables in a single model. The advantage of this model is the simultaneous control of both spatial dependence and spatial heterogeneity, but in practice it can suffer from the curse of dimensionality. One advantage of the Bayesian approach to model estimation employed in this study and described below is the ability to estimate such models without running into non-convergence problems. These problems can be a significant challenge with the maximization procedures employed in maximum likelihood and generalized method of moments estimation, and often lead to severe computational challenges. The binomial probit SDM model can be formally stated as:

$$y^* = \rho W y + X\beta + W X\theta + \varepsilon, \qquad \varepsilon \sim N(0, \sigma^2 I_n) \tag{7}$$

$$y_i = 1 \text{ if } y_i^* \ge 0, \qquad y_i = 0 \text{ if } y_i^* < 0$$

where $\rho$ is the estimated parameter of spatial autocorrelation of the dependent variable, $\beta$ is the estimated vector of parameters on the independent variables, and $\theta$ is the vector of estimated parameters on the spatially lagged independent variables. The estimation of the SDM is similar to that of the SAR model with the independent variables multiplied by the spatial weights matrix added as additional independent variables, $WX$. The resulting model then produces a vector of global effects of the independent variables $\beta$ and a vector of local effects of the independent variables $\theta$.
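The additional regressors $WX$ are simply neighborhood averages of $X$ when $W$ is row standardized. The sketch below shows how the augmented design matrix can be assembled; it is an illustration with a hypothetical three-unit weights matrix and made-up coefficients, and the helper for the latent mean assumes the spatial lag operates on the latent utility. One practical caveat: with a row standardized $W$, the spatial lag of a constant column is the same constant, so the intercept is kept out of the lagged block here to avoid perfect collinearity.

```python
import numpy as np

def sdm_design_matrix(x, w):
    """Stack X and its spatial lag WX column-wise: [X, WX].

    x should exclude the intercept column, since W @ 1 = 1 under
    row standardization and would duplicate the constant.
    """
    return np.column_stack([x, w @ x])

def sdm_latent_mean(x, w, beta, theta, rho):
    """Mean of the latent utility in an assumed reduced form:
    (I - rho*W)^-1 (X beta + W X theta)."""
    n = w.shape[0]
    return np.linalg.solve(np.eye(n) - rho * w, x @ beta + (w @ x) @ theta)

# Hypothetical example: three mutually neighboring units, one regressor.
W = np.array([
    [0.0, 0.5, 0.5],
    [0.5, 0.0, 0.5],
    [0.5, 0.5, 0.0],
])
X = np.array([[1.0], [2.0], [3.0]])

Z = sdm_design_matrix(X, W)  # columns: own value, neighbors' average
mean_no_lag = sdm_latent_mean(
    X, W, beta=np.array([1.0]), theta=np.array([0.5]), rho=0.0
)
```

The second column of `Z` holds each unit’s average neighbor value, so the coefficient on it plays the role of $\theta$ in equation (7).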

LeSage and Pace (2009) detail the advantages of each spatial modeling approach, and determine that when the correct model is unknown and not dictated by theory, only the SDM gives unbiased results even if the true model is SAR or SEM. More particularly, when the true data generating process is the SEM model, SAR and SDM will produce unbiased but inefficient estimates. When the true data generating process is the SAR model, the SEM model produces biased estimates, while the SDM does not. If the true data generating process is the SDM model, the other models will have omitted variable bias. The SAR, SEM, and SDM versions of the travel behavior and built environment models are estimated using both a binary and an inverse distance weighted row standardized spatial weights matrix.

Table 3. Model coefficient comparison.

Estimation Techniques

McMillen (1992) was the first to propose techniques for estimating the SAR and SEM probit models. Due to the complicated error structure of the SAR and SEM probit models, direct maximum-likelihood estimation is not possible; however, in McMillen’s procedure, the discrete variable is replaced by the expected value of the underlying latent variable, and the expectation is calculated iteratively until convergence. McMillen (1992), among others, deems this procedure impractical for large datasets. LeSage (2000) outlines several other drawbacks to the procedure. First, the estimation procedure requires the estimation of the likelihood function, which prohibits the use of the information matrix for calculating the precision of the parameter estimates. Attempts to circumvent this problem produce biased estimates of the covariance matrix. Second, McMillen’s approach requires the researcher to specify a functional form of the heteroskedastic spatial variance and leads to varying inferences across alternative specifications. By contrast, Bayesian estimation techniques do not require these assumptions about the functional form of the error process. We therefore implement a spatial Bayesian technique to estimate the spatial probit models in this study.

Following the work of Albert and Chib (1993) and Chib (1992), which detail the estimation of probit and logit models for discrete choices using Markov Chain Monte Carlo estimation in a Bayesian context, LeSage (2000) proposes a Bayesian estimation technique based on the Gibbs sampling approach (Albert & Chib, 1993). The estimation technique specifies a complete set of prior distributions for all parameters in the model and then samples sequentially from the resulting conditional posterior distributions until a large number of parameter draws are obtained that converge to the true joint posterior distribution of the parameters. This approach overcomes the drawbacks of the approach proposed by McMillen (1992) because the posterior distributions are available to calculate valid inference measures of the parameter estimates, thus escaping the bias inherent in McMillen’s algorithm and the necessity to specify the functional form of model variance over space a priori. The likelihood function for the SAR, SDM, and SEM models is:

$$
L(y, W \mid \rho, \beta, \sigma) = \frac{1}{(2\pi\sigma^{2})^{n/2}} \, |I_n - \rho W| \, \exp\left\{ -\frac{1}{2\sigma^{2}} \varepsilon'\varepsilon \right\} \tag{8}
$$

where

$$
\begin{aligned}
\varepsilon &= (I_n - \rho W)\,y - X\beta && \text{for the SAR model,} \\
\varepsilon &= (I_n - \rho W)\,y - X\beta - W X\theta && \text{for the SDM model,} \\
\varepsilon &= (I_n - \lambda W)(y - X\beta) && \text{for the SEM model}
\end{aligned}
$$

(LeSage, 2000, p. 23).
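
To make Equation (8) concrete, the following sketch evaluates its logarithm for the SAR residual, treating y as a continuous (latent) outcome. It is a toy calculation under stated assumptions, not the estimation routine used in the study: the two-observation data, the parameter values, and the helper names `det` and `sar_log_likelihood` are all hypothetical.

```python
import math

def det(A):
    """Determinant of a small square matrix by Gaussian elimination with pivoting."""
    A = [row[:] for row in A]
    n, d = len(A), 1.0
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(A[r][c]))
        if abs(A[p][c]) < 1e-12:
            return 0.0
        if p != c:
            A[c], A[p] = A[p], A[c]
            d = -d  # a row swap flips the sign of the determinant
        d *= A[c][c]
        for r in range(c + 1, n):
            f = A[r][c] / A[c][c]
            for k in range(c, n):
                A[r][k] -= f * A[c][k]
    return d

def sar_log_likelihood(y, X, W, rho, beta, sigma2):
    """Log of Equation (8) with eps = (I - rho W) y - X beta."""
    n = len(y)
    A = [[(1.0 if i == j else 0.0) - rho * W[i][j] for j in range(n)]
         for i in range(n)]
    Ay = [sum(A[i][j] * y[j] for j in range(n)) for i in range(n)]
    eps = [Ay[i] - sum(x * b for x, b in zip(X[i], beta)) for i in range(n)]
    sse = sum(v * v for v in eps)
    return (-0.5 * n * math.log(2.0 * math.pi * sigma2)
            + math.log(abs(det(A)))
            - sse / (2.0 * sigma2))

# Hypothetical two-observation example in which each is the other's only neighbor.
W = [[0.0, 1.0], [1.0, 0.0]]
y = [1.0, 2.0]          # stands in for continuous latent values
X = [[1.0], [1.0]]      # intercept only
ll_no_spatial = sar_log_likelihood(y, X, W, rho=0.0, beta=[1.5], sigma2=1.0)
ll_spatial = sar_log_likelihood(y, X, W, rho=0.5, beta=[1.5], sigma2=1.0)
```

With ρ = 0 the determinant term vanishes and the expression reduces to the ordinary Gaussian log-likelihood, which is a convenient sanity check on any implementation.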

It is important to note that the Bayesian approach to modeling is fundamentally different from the frequentist approach employed in OLS, probit, and other canonical statistical models. Bayesian estimation produces a full distribution for each parameter, and the convention is to report the mean of each parameter distribution. Significance is then assessed from the share of the parameter distribution that lies on either side of zero, calculated directly from the parameter distribution. The frequentist approach, by contrast, calculates the probability of the parameter estimate being zero from the standard errors of each estimate and the underlying distributional assumption (often Gaussian) of the errors (Albert, 2007; Albert & Chib, 1993; LeSage, 2000).
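
A minimal sketch of this kind of summary is given below. The ten draws are hypothetical, and the exact significance rule used by the software in this study may differ; the function simply reports the posterior mean and the share of draws falling on the opposite side of zero from that mean.

```python
def summarize_draws(draws):
    """Return the posterior mean and the share of draws on the other side of zero."""
    mean = sum(draws) / len(draws)
    if mean >= 0:
        tail = sum(1 for d in draws if d <= 0) / len(draws)
    else:
        tail = sum(1 for d in draws if d >= 0) / len(draws)
    return mean, tail

# Hypothetical retained draws for one coefficient.
draws = [0.31, 0.12, 0.25, -0.02, 0.18, 0.40, 0.22, 0.09, 0.27, 0.15]
mean, tail = summarize_draws(draws)
```

Here one of the ten draws is negative, so the tail share is 0.1; with thousands of retained draws, the same calculation gives a direct, simulation-based analogue of a one-sided p-value.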

Econometric Models and Results

Econometric Model

Three econometric models are specified following the theoretical specifications for the SAR, SEM, and SDM above. The binary choice indicator variable y is set to 1 if the survey respondent used non-auto transportation for an observed tour, and 0 otherwise. The spatial probit model for the SAR, SEM, and SDM comprises the travel choice indicator variable and the independent regressors, which are the same as in Mueller and Trujillo (2019). The formal equation to be estimated for the SDM is then:

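$$
\begin{aligned}
y^{*} &= \rho W y + X\beta + W X\theta + \varepsilon, \qquad \varepsilon \sim N(0, \sigma^{2} I_n) \\
y_i &= 1 \ \text{if } y_i^{*} \ge 0, \qquad y_i = 0 \ \text{if } y_i^{*} < 0
\end{aligned}
\tag{9}
$$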

where X = [I S BE], I is an n × 1 vector of ones, S is a matrix of sociodemographic characteristics, and BE is a matrix of built environment characteristics including zoning variables. The same definition of X applies to the SAR and SEM. In the SEM model, both ρ and θ are set to 0 and λ enters the distribution of ε as described in Equation (6). In the SAR model, θ is set to 0.

Determination of the Spatial Weights Matrix

The spatial weight matrix, W, in the equations above is developed by a two-step process. In the first step, observations are determined to be spatial neighbors if they are within a distance D from one another. The bandwidth used to create the neighbor matrix was the minimum straight line distance necessary so that each observation included at least one neighbor, D = 1.076 miles. This distance corresponds closely with the distance bands used to calculate the zoning percentages surrounding survey respondents’ residences and therefore is an ideal choice for D. While it is possible to estimate spatial models with some observations having no neighbors, in practice this creates problems that outweigh the benefits of a more restrictive definition of spatial neighbors, as outlined by Bivand and Portnov (2004). Using this distance-based neighbor rule, the binary neighbor matrix is constructed, with observations within distance D of each other assigned a 1, and observations further apart than D assigned a 0.

In the second step, the neighbor matrix is transformed into a spatial weight matrix W by either row standardizing the binary neighbor matrix so that all rows sum to 1, or applying a function based on distance and then row standardizing the matrix. While there are no generally accepted procedures for determining the correct weighting structure to use for W, we apply the two most commonly used weighting schemes: the binary neighbor matrix, and a weight that declines with distance, where the weight of each neighboring observation is set to the inverse of distance, w_ij = 1/d_ij, where d_ij is the distance between observations i and j in miles. We estimate the SAR, SEM, and SDM models using each spatial weight matrix and compare the results below.
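
The two-step construction can be sketched as follows. This is a simplified illustration that assumes planar coordinates already expressed in miles; the four coordinates and both function names are hypothetical, and the study itself built W with the R package spdep (see Note 1).

```python
import math

def min_bandwidth(coords):
    """Smallest D such that every observation has at least one neighbor."""
    return max(min(math.dist(p, q) for j, q in enumerate(coords) if j != i)
               for i, p in enumerate(coords))

def distance_band_weights(coords, D, inverse_distance=False):
    """Step 1: flag neighbors within D. Step 2: weight them and row standardize."""
    n = len(coords)
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            d = math.dist(coords[i], coords[j])
            if 0 < d <= D:
                W[i][j] = 1.0 / d if inverse_distance else 1.0
    for i, row in enumerate(W):
        total = sum(row)
        if total > 0:  # leave rows without neighbors as zeros
            W[i] = [w / total for w in row]
    return W

coords = [(0.0, 0.0), (0.5, 0.0), (1.0, 0.0), (2.0, 0.0)]  # hypothetical, in miles
D = min_bandwidth(coords)
W_binary = distance_band_weights(coords, D)
W_inverse = distance_band_weights(coords, D, inverse_distance=True)
```

In this toy layout the bandwidth is set by the most isolated point, mirroring the rule used to arrive at D = 1.076 miles. The third observation has neighbors at 1.0, 0.5, and 1.0 miles: the binary scheme weights them equally at one third each, while the inverse distance scheme gives the closest neighbor half of the total weight.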

Estimation

The SAR and SEM have been estimated in the past using maximum likelihood techniques, as well as more recently with Bayesian techniques. Estimating the model with Bayesian techniques has some advantages over maximum likelihood, the most important being the recovery of the posterior coefficient distributions, which can be used for statistical inference tests (LeSage, 2000). The SAR, SEM, and SDM models are estimated with a Bayesian model that takes 1,000 draws with a burn-in of 100 draws. Model results are listed in Table 3 (see Note 1).
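
The role of draws and burn-in can be illustrated with a deliberately simplified Gibbs sampler. The sketch below is the non-spatial Albert and Chib data augmentation scheme for a probit with a single regressor and no intercept; it omits the spatial parameters entirely and is not the sampler implemented in the spatialprobit package. The simulated data, the normal prior variance, and the assumption that the 100 burn-in draws are run in addition to the 1,000 retained draws are all choices made for this example.

```python
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()

def draw_latent(mean, y_i, rng):
    """Draw z ~ N(mean, 1) truncated to z >= 0 if y_i == 1, else to z < 0."""
    p_neg = STD_NORMAL.cdf(-mean)                # P(z < 0)
    lo, hi = (p_neg, 1.0) if y_i == 1 else (0.0, p_neg)
    u = lo + (hi - lo) * rng.random()
    u = min(max(u, 1e-12), 1.0 - 1e-12)          # keep inv_cdf inside (0, 1)
    return mean + STD_NORMAL.inv_cdf(u)

def gibbs_probit(x, y, n_draws=1000, burn_in=100, prior_var=100.0, seed=1):
    """Alternate between latent utilities z and the coefficient beta."""
    rng = random.Random(seed)
    sxx = sum(v * v for v in x)
    post_var = 1.0 / (sxx + 1.0 / prior_var)     # beta prior: N(0, prior_var)
    beta, kept = 0.0, []
    for it in range(burn_in + n_draws):
        z = [draw_latent(x_i * beta, y_i, rng) for x_i, y_i in zip(x, y)]
        post_mean = post_var * sum(x_i * z_i for x_i, z_i in zip(x, z))
        beta = rng.gauss(post_mean, post_var ** 0.5)
        if it >= burn_in:                        # discard the burn-in draws
            kept.append(beta)
    return kept

# Simulated data with a true coefficient of 1.0 (hypothetical).
data_rng = random.Random(0)
x = [data_rng.gauss(0.0, 1.0) for _ in range(300)]
y = [1 if 1.0 * x_i + data_rng.gauss(0.0, 1.0) >= 0 else 0 for x_i in x]
draws = gibbs_probit(x, y)
posterior_mean = sum(draws) / len(draws)
```

The retained draws approximate the posterior distribution of the coefficient, so its mean and tail shares can be read off directly; the spatial models add further conditional draws for ρ or λ but follow the same alternating logic.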

Spatial Dependence Parameters

The results for the SAR and SEM models without zoning variables show that spatial dependence is present under both the binary neighbor and the inverse distance row standardized spatial weights matrices. Using the inverse distance weights matrix, ρ is statistically significant at the 5% level in the SAR model and λ at the 10% level in the SEM model; using the row standardized binary weights matrix, ρ and λ are likewise significant at the 5% and 10% levels, respectively. The values of ρ in the SAR model using the binary W and the inverse distance W are 0.235 and 0.159, respectively. The lower estimate of ρ under the inverse distance spatial weights matrix shows the impact of the choice of W, as the inverse distance specification already weights closer neighbors more heavily.

In the model using binary W, ρ is the estimated parameter on the n × 1 vector Wy, where y is a vector of 1s and 0s indicating non-auto transportation, and thus Wy can be understood as the percentage of non-auto trips of all neighboring observations. Therefore, a value of ρ equal to 0.235 tells us that an increase in the percentage of non-auto trips of neighbors by 0.01 (1%) increases the probability of a non-auto trip by 0.01 × 0.235 = 0.00235 (0.235%) on average. This value of ρ also indicates that the dissipation of the effect is quite rapid, as second order neighboring observations exhibit an effect of 0.235² = 0.055225, so the effect of a 1% increase is equal to 0.01 × 0.055225 = 0.00055225 (0.055225%). This finding is further evidence that the choice between auto and non-auto transportation is somewhat localized to a one-mile radius surrounding place of residence. Using the inverse distance W, which places higher weights on closer neighbors, ρ is 0.159. The fact that ρ is lower in this weighting scheme gives further evidence that the effect of neighboring observations is weak at distances closer than 1 mile. In this model, the coefficient has a slightly different interpretation, as the vector Wy is not a simple percentage, but rather a spatially weighted percentage of non-auto trips based on distance.
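
The dissipation arithmetic above can be reproduced with a few lines. The function name is hypothetical, and the calculation follows the paragraph's simple reading of ρ raised to the neighbor order rather than a full marginal effects computation.

```python
def neighbor_effects(rho, delta, max_order):
    """Effect of a change `delta` in neighbors' non-auto share, by neighbor order."""
    return [delta * rho ** order for order in range(1, max_order + 1)]

# rho = 0.235 is the SAR estimate reported for the binary W; delta = 0.01 is a 1% change.
effects_by_order = neighbor_effects(rho=0.235, delta=0.01, max_order=3)
for order, effect in enumerate(effects_by_order, start=1):
    print(f"order {order}: effect on probability = {effect:.8f}")
```

The first two values correspond to the 0.00235 and 0.00055225 figures in the text; by the third order the effect is roughly a quarter of the second-order effect again, which is the sense in which the spillover is localized.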

The values of λ in the SEM model using the binary and inverse distance W are 0.333 and 0.069, respectively. The coefficient λ is estimated using Equation (6), where u = λWu + e, u are the errors from the normal probit equation, W is the spatial weights matrix, and e are the residuals after spatial correction. The SEM model only addresses the spatial correlation of errors across space, and therefore only corrects for spatial heteroskedasticity. The positive and significant values of λ indicate correlation between the error terms of neighboring observations. However, since the SAR model also demonstrates spatial autocorrelation, part of the spatial correlation in the error term may be due to omitted variable bias, since the spatially lagged dependent variable is absent from this model. Taken together, the significant values of ρ and λ indicate that a model that jointly addresses both spatial dependence and spatial heteroskedasticity (the SDM) may be the correct model.

The SDM yields estimates of ρ of −0.876 and −0.373 for the binary and inverse distance weights matrices, respectively. These values indicate negative spatial autocorrelation when spatially weighted independent regressors are added to the model. This is unexpected, as positive spatial autocorrelation was estimated in the SAR model, and the results from the SEM indicated that controlling for spatial heteroskedasticity was warranted. A negative spatial dependence parameter indicates that the choice of auto by a neighbor increases the probability of non-auto use. This peculiar result seems to point to the complex nature of transportation decisions, as spatially clustered use of auto or non-auto would be expected due to preferences for locating near transportation networks of choice. However, it may be the case that this negative spatial autocorrelation of the dependent variable is picking up a preference for non-auto travel when neighbors’ auto choices create congestion.

Zoning Parameters

The most notable result of the SAR models is that the coefficients on all three residential zoning density levels are negative and statistically significant in the 0–1/4 mile distance band. This indicates that higher levels of residential zoning surrounding respondents’ residences are associated with a decreased likelihood of observing non-auto transportation. The other result that points to a potential association between zoning and travel behavior is the significant positive coefficients on industrial zoning in the 1/4–1/2 and 3/4–1 mile bands and on high density business zoning in the 3/4–1 mile band, which suggest that business and industrial zoning moderately close to home is associated with increased non-auto transportation. Taken together, these results suggest that residents of locations surrounded by a band of residential zoning up to one half mile wide may prefer to drive to shopping, employment, and recreation, and that zoning that precludes closer businesses may be associated with more non-auto travel behavior. The coefficients in the SEM model are all statistically insignificant, so no interpretation can be made for this model.

While the sign of the coefficients on the explanatory variables indicates the direction of association with the conditional probability of non-auto transportation behavior, their magnitude cannot be interpreted in the same way as in OLS or probit models. Due to the non-linearity of the model and the presence of the spatially lagged dependent variable in the estimated equation, a change in one explanatory variable has a spatial feedback loop effect on the dependent variable. Therefore, it is necessary to estimate the marginal effects of the change in each explanatory variable in the model to determine the direct, indirect, and total effects. We list the marginal effects of the SDM in Table 4.

Table 4. Marginal effects: SDM model, binary W dependent variable: non-auto transportation mode = 1.

While the SAR and SEM models both show significance in some of the zoning variables in determining mode choice, the SDM shows significance in the low, medium, and high residential zoning types for the binary spatial weights matrix, significant negative impacts for all spatially weighted zoning variables in the 1/2–3/4 mile zoning band, and positive associations of spatially weighted high density business and industrial zoning in the 3/4–1 mile zoning band, providing some evidence of an association between zoning and mode choice.

The SDM with binary spatial weights matrix was the best model using the log likelihood test. The SAR with binary spatial weights matrix is indicated as the best model using the Akaike Information Criterion (AIC) (Akaike, 1974), and the standard probit model is indicated as the best model using the Bayesian Information Criterion (BIC) (Schwarz, 1978). A summary of the log likelihood, AIC, and BIC of each of the models tested is shown in Table 5. All models show relatively similar results from the log likelihood, AIC, and BIC criteria. However, the aggregate results of the Moran’s I, Geary’s C, and the significance of both SAR and SEM models indicate that there may be both spatial autocorrelation and spatial heterogeneity in the data generating process of the underlying dataset. Therefore, the only model that produces unbiased results is the SDM (LeSage & Pace, 2009). Comparing the use of the two spatial weights matrices in each model, the binary row standardized spatial weights matrix leads to a better posterior distribution fit to the data, indicating that the binary matrix is preferred to the inverse distance matrix. This indicates that the spatial effects may be strong within the distance used to specify spatial neighbors, just over one mile.

Table 5. Log likelihood tests.
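
The three criteria can rank models differently because they penalize the number of parameters to different degrees. The sketch below uses the standard definitions AIC = 2k − 2 ln L and BIC = k ln n − 2 ln L; the log likelihoods, parameter counts, and sample size are hypothetical values chosen only to reproduce the qualitative pattern described above, not the figures in Table 5.

```python
import math

def aic(log_lik, k):
    """Akaike Information Criterion: each parameter costs 2."""
    return 2 * k - 2 * log_lik

def bic(log_lik, k, n):
    """Bayesian Information Criterion: each parameter costs ln(n)."""
    return k * math.log(n) - 2 * log_lik

n_obs = 5000  # hypothetical sample size
# Hypothetical (log likelihood, number of parameters) for each model.
models = {"probit": (-1000.0, 40), "SAR": (-997.0, 41), "SDM": (-975.0, 80)}

best_loglik = max(models, key=lambda m: models[m][0])
best_aic = min(models, key=lambda m: aic(*models[m]))
best_bic = min(models, key=lambda m: bic(*models[m], n_obs))
```

With these illustrative numbers the SDM has the highest log likelihood, the SAR the lowest AIC, and the plain probit the lowest BIC. The SDM roughly doubles the parameter count by adding WX, so its fit advantage is outweighed once parameters are penalized, and more heavily so under BIC.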

Marginal Effects and Elasticities

The Bayesian Markov Chain Monte Carlo (MCMC) estimation technique used to estimate the models above produces samples of the posterior distribution of the model parameters. These sample distributions of coefficients can be used to compute average marginal effects across observations of a change in an independent variable of the model on the probability of the dependent variable, non-auto travel mode choice (LeSage & Pace, 2009). While the SEM model coefficients can be interpreted as marginal effects as in ordinary least squares because the spatial variation is only present in the error term, for the SAR and SDM models, which include spatially lagged dependent or independent variables, a change in an explanatory variable can have an impact on the dependent variables of all neighboring observations, creating a feedback loop that propagates through several orders of neighbors. Thus, these spatial models exhibit direct, indirect, and total impacts. LeSage and Pace (2009) propose summary measures of the marginal effects of a change in an explanatory variable x_r by using the average change in the expected value of the dependent variable y_i and changing the multiplier matrix S_r(W) based on the spatial model. The expected value is given in Equation (10), where X is an n × p matrix of n observations and p explanatory variables, x_r is its r-th column, and ι_n is an n × 1 vector of ones.

$$
E(y) = \sum_{r=1}^{p} S_r(W)\, x_r + \alpha\, \iota_n \tag{10}
$$

Sr(W) for the SAR and SDM models is given in Equations (11) and (12), respectively.

The average of the diagonal elements of the Sr(W) matrix (its trace divided by n), multiplied by the change in independent variable xr, gives the direct impacts (Equation 13), while the average row sum of the entire Sr(W) matrix, multiplied by the change in xr, gives the total impacts (Equation 14). Indirect impacts are the difference between total and direct impacts (Equation 15). Marginal direct effects for individual observations are contained in the diagonal elements of Sr(W) (Equation 16), and indirect marginal effects are contained in the off-diagonal elements of Sr(W) (Equation 17) (LeSage & Pace, 2009). Here ι_n denotes an n × 1 vector of ones.

$$
\begin{aligned}
S_r(W) &= (I_n - \rho W)^{-1} \beta_r && (11) \\
S_r(W) &= (I_n - \rho W)^{-1} (I_n \beta_r + W \theta_r) && (12) \\
\bar{M}(r)_{\text{direct}} &= n^{-1} \operatorname{tr}\big(S_r(W)\big) && (13) \\
\bar{M}(r)_{\text{total}} &= n^{-1} \iota_n' S_r(W)\, \iota_n && (14) \\
\bar{M}(r)_{\text{indirect}} &= \bar{M}(r)_{\text{total}} - \bar{M}(r)_{\text{direct}} && (15) \\
\frac{\partial y_i}{\partial x_{ir}} &= S_r(W)_{ii} && (16) \\
\frac{\partial y_i}{\partial x_{jr}} &= S_r(W)_{ij} && (17)
\end{aligned}
$$
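
Equations (11) through (15) can be traced with a small numerical sketch. The four-observation ring, the coefficient values, and the helper names are hypothetical, and the calculation covers only the linear multiplier matrix: for a probit outcome, effects on probabilities also involve the normal density evaluated at the fitted index, which this sketch omits.

```python
def inverse(A):
    """Invert a small square matrix by Gauss-Jordan elimination with pivoting."""
    n = len(A)
    M = [list(map(float, row)) + [1.0 if i == j else 0.0 for j in range(n)]
         for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        piv = M[c][c]
        M[c] = [v / piv for v in M[c]]
        for r in range(n):
            if r != c:
                f = M[r][c]
                M[r] = [a - f * b for a, b in zip(M[r], M[c])]
    return [row[n:] for row in M]

def sdm_effects(W, rho, beta_r, theta_r=0.0):
    """Average direct, indirect, and total effects; theta_r = 0 gives the SAR case."""
    n = len(W)
    A = [[(1.0 if i == j else 0.0) - rho * W[i][j] for j in range(n)]
         for i in range(n)]
    A_inv = inverse(A)
    B = [[(beta_r if i == j else 0.0) + theta_r * W[i][j] for j in range(n)]
         for i in range(n)]
    S = [[sum(A_inv[i][k] * B[k][j] for k in range(n)) for j in range(n)]
         for i in range(n)]
    direct = sum(S[i][i] for i in range(n)) / n       # Equation (13)
    total = sum(sum(row) for row in S) / n            # Equation (14)
    return direct, total - direct, total              # Equation (15) in the middle

W = [[0.0, 0.5, 0.0, 0.5],
     [0.5, 0.0, 0.5, 0.0],
     [0.0, 0.5, 0.0, 0.5],
     [0.5, 0.0, 0.5, 0.0]]
sar_direct, sar_indirect, sar_total = sdm_effects(W, rho=0.235, beta_r=1.0)
sdm_direct, sdm_indirect, sdm_total = sdm_effects(W, rho=0.235, beta_r=1.0, theta_r=0.5)
```

A useful check is that, for a row-standardized W, the total effect collapses to βr / (1 − ρ) in the SAR case and to (βr + θr) / (1 − ρ) in the SDM case, so the indirect effect is whatever remains after the diagonal (own-observation) response is removed.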

To calculate the elasticities, the change of each variable is taken at the mean of the posterior distribution and the mean of the expected probability of the binary dependent variable, which is 16.63%. Marginal effects are reported for the direct, indirect, and total marginal effects of a change in each independent variable. Direct effects are the change in the probability of observing non-auto mode choice attributed to the change in the independent variable. Indirect effects represent the spatially lagged effect on the autocorrelated dependent variable of a change in one of the independent variables, after that change has fed back through the spatially lagged dependent variable of neighboring observations. The sum of direct and indirect effects equals the total effect of a change in the independent variables after the feedback loop of the change has run its course. Dummy variable elasticities are not reported. Results for the SDM model are reported for the binary spatial weights matrix in Table 6.

Table 6. Elasticities: SDM model, binary W dependent variable: non-auto transportation mode = 1.

Several of the statistically significant variables in the SAR model have small marginal effects with the expected sign. The largest of these is the number of household vehicles, with a marginal effect of −9.123%. This is not surprising considering this variable indicates preference for owning an asset that encourages automobile transportation. Miles of bike lanes has an unexpected negative marginal effect, but it is very small. One possible reason for the unexpected sign on this variable may be that areas that are more dense, such as the CBD, may have an overall lower mileage of bike lanes, while areas that lack access to goods and services within a non-auto distance have a high mileage of bike lanes that are intended for recreational use.

Mileage of bus routes and number of bus stops both have the expected sign, but the effects are also small. The estimated coefficient for number of rail stops is unexpectedly negative, while the coefficient for miles of rail lines has the expected positive sign but is small. The study area has a more mature bus system than rail system, and the rail stops are spread more evenly between dense urban locations near downtown and suburban locations. Perhaps the unexpected sign on the rail stops coefficient captures the propensity of most suburban residents to use auto transportation even when they live in close proximity to rail stops. This phenomenon may be due to the rail lines not going to locations that meet suburban household needs, since many of the rail lines were built to service commuting to downtown from the suburbs, but not to perform everyday shopping or recreational tasks close to home.

The marginal effects of shopping and social stops along a tour are both negative and relatively large compared with many of the other variables in the regression. These are the expected signs and indicate a propensity to drive when shopping for goods that may need to be carried home or when traveling to social gatherings that are located in recreational or residential areas. Residential zoning within a quarter mile from place of residence has the expected negative sign, although the effects are small. Given the value people place on their own time, it is not surprising that higher residential density, and therefore lower business density, may encourage people to drive to locations that are not in the immediate vicinity of their residence. Residential zoning in the quarter to half mile range has the opposite sign with similarly small marginal effects. It is uncertain what explains the positive marginal effect on non-auto transportation for higher residential zoning levels within this band. Finally, high density business within the three quarter to one-mile band has a positive marginal effect on non-auto transportation. This may indicate that if a high concentration of businesses is located within this band, survey respondents are willing to travel by non-auto modes to reach these destinations even though they are slightly farther than other statistically significant variables would suggest for encouraging non-auto transportation.

Elasticities calculated from the marginal effects at the means of the coefficient distributions indicate that non-auto transportation mode preferences are highly inelastic. This result in part reflects the sample distribution, in which people use auto as their mode of transportation at a much higher frequency than all other modes combined. Elasticities for number of household vehicles and age are above one, signaling that these variables are good indicators of transportation mode preference. The most interesting result is that low density residential zoning is more elastic than all other zoning types. This is the expected result for the zoning band within a quarter mile of place of residence, indicating that altering this zoning type may have the most potential of all possible zoning changes in promoting non-auto transportation.

Marginal effects in the SDM model with binary spatial weights are similar to those in the SAR model, with many of the same variables being statistically significant and thus leading to many of the same conclusions. The inclusion of the spatially weighted variables in the regression makes some of the variables that were statistically significant in the SAR regression insignificant.

Elasticities in the SDM model are also similar to those in the SAR model. One interesting observation is that household vehicles have a negative direct marginal effect but a positive indirect impact, indicating that having many cars may encourage auto usage by the household but discourage neighboring respondents from using auto transportation. The effects are still small, however, with the indirect elasticity being less than half of the direct elasticity.

Elasticities of residential zoning variables are negative but very small, indicating that the response to residential zoning is highly inelastic. Although the marginal effects are non-linear, in general most of the estimates follow a normal distribution. Thus, when considering that the marginal effects capture a one-percent increase in a specific zoning type, it may be more appropriate to consider that, for example, a 10% increase in a zoning type would have roughly ten times the impact on the probability of non-auto transportation modes being chosen. For example, if residential low-density zoning were to increase by 10% within a quarter mile of a survey respondent’s residence, using this rough measure we would expect to see a 0.136% decrease in non-auto transportation mode choices. While still a small impact, this change is not insignificant when considering the magnitude of trips away from home taken by city residents across the United States each year. Even small percentage reductions in auto trips could add up to large overall reductions in vehicle miles driven.

Conclusion

In comparing the current results to those of the naïve non-spatial empirical approach by Mueller and Trujillo (2019), this paper has shown that explicit spatial modeling is clearly an important factor in a fuller understanding of travel choice behavior. This work has shown that both spatial autocorrelation and heterogeneity are present in the data and formal modeling techniques give insight into the spatial aspects of associations between built environment variables and mode choice. The spatial results here reconfirm the overall inelasticity of travel mode choice, yet also offer explicit clues as to zoning’s subtle influences on mode choice.

The evidence provided by the log likelihood test indicates that the Bayesian models of the association between zoning and travel mode preference favor the Spatial Durbin Model with a binary spatial weights matrix, as indicated in Table 5. Residential low, medium, and high density within one quarter mile of residences are all statistically significant and associated with a lower propensity for non-auto travel. Using AIC to assess the models considered in this paper, the SAR model with a binary spatial weights matrix was the best model overall. In this model, residential zoning variables of low, medium, and high density in the 0–1/4 mile zoning band all have statistically significant and negative impacts on survey respondents’ probability of choosing non-auto transportation. Using BIC, the standard probit model had the best fit to the data. One potential conclusion from these results is that zoning variables may in fact have a significant influence on travel preference, but that zoning variables may manifest themselves in other built environment variables in the model and therefore warrant further study, as zoning laws and the resulting manifestation of the built environment are determined and implemented by a political process. While the associations represented by the marginal effects of zoning variables are small when considering minute increases in zoning types surrounding residential locations, more drastic changes to zoning mixes may have a more profound impact.

Automobile usage was the dominant mode of transportation across respondents in this study and corresponds to patterns of heavy auto usage in the United States in general (Glaeser & Kahn, 2004). Part of the dominance of the automobile in the transportation of citizens in the United States may be the result of long-term path dependence that followed from an early preference for auto transportation in the development of transportation infrastructure. The widespread building of roads may have led to a long chain of city planning decisions that have shaped the built environment to accommodate automobile transportation to the detriment of alternative modes of transportation that use less energy and decrease congestion. Further research will be needed to determine if drastic changes in built environment design that focus on alternatives to automobile transportation can change society’s preference for the automobile towards transportation usage that is more environmentally and culturally sustainable for the long-term future. In addition to urban form, there are many other factors that determine travel behavior, such as weather, physical health, and availability of transportation options. The current dataset precludes the testing of such relationships, but in cities where sweeping changes to more flexible zoning policies have been implemented, event studies could shed light on the causal relationships between zoning mix and transportation behavior. This study has provided some direction on the utility of using spatial models in transportation studies, but more research is needed to incorporate holistic solutions to urban transportation problems that include vehicle electrification, driverless cars, bike and scooter sharing programs, and alternative transportation infrastructure.

Notes

1 All estimations are implemented in the software system R (R Core Team, 2022). The spatial weights matrix was constructed and standardized using the R add-on package spdep (Bivand et al., 2013; Bivand & Piras, 2015). The spatial probit SAR, SEM, and SDM models are estimated using the Bayesian approach implemented in the R package spatialprobit (Wilhelm & de Matos, 2013).

References

  • Akaike, H. (1974). A new look at the statistical model identification. IEEE Transactions on Automatic Control, 19(6), 716–723.
  • Albert, J. (2007). Bayesian computation with R. Springer.
  • Albert, J., & Chib, S. (1993). Bayesian analysis of binary and polychotomous response data. Journal of the American Statistical Association, 88(422), 669–679.
  • Anselin, L. (1988). Spatial econometrics: Methods and models. NATO Asi Series. Series E, Applied Sciences. Springer.
  • Anselin, L. (2001). Spatial econometrics. In B. H. Baltagi (Ed.), A companion to theoretical econometrics. Blackwell.
  • Anselin, L. (2010). Thirty years of spatial econometrics. Papers in Regional Science, 89(1), 3–25.
  • Anselin, L., & Lozano-Gracia, N. (2007). Errors in variables and spatial effects in hedonic house price models of ambient air quality. Empirical Economics, 34(1), 5–34.
  • Arbia, G. (2006). Spatial econometrics: Statistical foundations and applications to regional convergence. Springer.
  • Bivand, R. S., Hauke, J., & Kossowski, T. (2013). Computing the Jacobian in Gaussian spatial autoregressive models: An illustrated comparison of available methods. Geographical Analysis, 45(2), 150–179.
  • Bivand, R. S., & Piras, G. (2015). Comparing implementations of estimation methods for spatial econometrics. Journal of Statistical Software, 63(18), 1–36.
  • Bivand, R. S., & Portnov, B. A. (2004). Exploring spatial data analysis techniques using R: The case of observations with no neighbors. In Advances in spatial econometrics: Methodology, tools and applications (pp. 121–142). Springer.
  • Butsic, V., Lewis, D. J., & Ludwig, L. (2011). An econometric analysis of land development with endogenous zoning. Land Economics, 87(3), 412–432.
  • Chib, S. (1992). Bayes inference in the Tobit censored regression model. Journal of Econometrics, 51(1–2), 79–99.
  • Cliff, A., & Ord, J. (1969). The problem of spatial autocorrelation. In A. J. Scott (Ed.), Studies in regional science London papers in regional science (pp. 25–55). Pion.
  • Cliff, A. D., & Ord, J. K. (1973). Spatial autocorrelation. Pion.
  • Cliff, A. D., & Ord, J. K. (1981). Spatial processes, models and applications. Pion.
  • Ewing, R., & Cervero, R. (2010). Travel and the built environment. Journal of the American Planning Association, 76(3), 265–294.
  • Fingleton, B., & Lopez-Bazo, E. (2006). Empirical growth models with spatial effects. Papers in Regional Science, 85(2), 177–198.
  • Geary, R. C. (1954). The contiguity ratio and statistical mapping. The Incorporated Statistician, 5(3), 115–127.
  • Glaeser, E. L., & Kahn, M. E. (2004). Sprawl and urban growth. In Handbook of regional and urban economics (Vol. 4, pp. 2481–2527). Elsevier.
  • LeSage, J. P. (2000). Bayesian estimation of limited dependent variable spatial autoregressive models. Geographical Analysis, 32(1), 19–35.
  • LeSage, J. P., & Pace, R. K. (2009). Introduction to spatial econometrics (1st ed.). Chapman & Hall/CRC.
  • Levkovich, O., Rouwendal, J., & Brugman, L. (2018). Spatial planning and segmentation of the land market: The case of the Netherlands. Land Economics, 94(1), 137–154.
  • McMillen, D. (1992). Probit with spatial autocorrelation. Journal of Regional Science, 32(3), 335–348.
  • Moran, P. A. P. (1950). Notes on continuous stochastic phenomena. Biometrika, 37(1), 17–23.
  • Mueller, A. G., & Trujillo, D. (2019). The impact of zoning and built environment characteristics on transit, biking, and walking. Journal of Sustainable Real Estate, 11, 108–129.
  • Newburn, D. A., & Ferris, J. S. (2016). The effect of downzoning for managing residential development and density. Land Economics, 92(2), 220–236.
  • Pace, R., & Barry, R. (2004). Simultaneous spatial and functional form transformations. In Advances in spatial science (Vol. 2041, pp. 197–224). Springer.
  • Paelinck, J. H. P., & Klaassen, L. H. (1979). Spatial econometrics. Saxon House.
  • R Core Team. (2022). R: A language and environment for statistical computing. R Foundation for Statistical Computing. https://www.R-project.org/
  • Schnier, K. E., & Felthoven, R. G. (2011). Accounting for spatial heterogeneity and autocorrelation in spatial discrete choice models: Implications for behavioral predictions. Land Economics, 87(3), 382–402.
  • Schwarz, G. (1978). Estimating the dimension of a model. The Annals of Statistics, 6(2), 461–464.
  • Tobler, W. (1970). A computer movie simulating urban growth in the Detroit region. Economic Geography, 46(sup1), 234–240.
  • Vance, C., & Hedel, R. (2008). On the link between urban form and automobile use: Evidence from German survey data. Land Economics, 84(1), 51–65.
  • von Graevenitz, K., & Panduro, T. E. (2015). An alternative to the standard spatial econometric approaches in hedonic house price models. Land Economics, 91(2), 386–409.
  • Vyn, R. J. (2012). Examining for evidence of the leapfrog effect in the context of strict agricultural zoning. Land Economics, 88(3), 457–477.
  • Wilhelm, S., & de Matos, M. G. (2013). Estimating spatial probit models in R. The R Journal, 5(1), 130–143.