Abstract
This paper proposes a new generalized method of moments (GMM) estimator for spatial panel models with spatial moving average errors combined with a spatially autoregressive dependent variable. Monte Carlo results are given suggesting that the GMM estimator is consistent. The estimator is applied to English real estate price data.
1 Introduction
There is a growing literature on the analysis of panel data with spatial dependence, and a variety of approaches have been suggested. Probably the most useful starting point in the spatial econometrics literature is Anselin (1988). More recent contributions include Conley (1999), Chen & Conley (2001), Baltagi et al. (2003), Druska & Horrace (2003), Elhorst (2003), Baltagi (2005), Baltagi & Li (2006) and Pinkse et al. (2006). We highlight the work of Kapoor et al. (2007), which generalizes the generalized moments estimators of Kelejian & Prucha (1999) to a panel data model with spatially and temporally correlated error components, and which provides a feasible generalized least squares procedure for the regression parameters together with formal large sample results for their estimators.
This paper draws on their contribution, which provides the necessary theoretical, computational and mathematical background. Given this context, the specific innovative aspects of the current paper are:
- the extension of the generalized method of moments (GMM) estimation procedure to allow a spatial moving average (MA) error process rather than the spatial autoregressive process that has been the focus of attention thus far in the literature;
- the extension of the methodology to incorporate an endogenous spatial lag, so that spatial dependence is not solely via the error process;
- application of the method to real panel data involving real estate prices in England.
2 The Model
Consider the N location cross-sectional time t regression specification
The most widely used approach to modelling spatial error dependence involving N locations is to assume that in each period u(t)=ρWu(t)+ξ(t), in which u(t) is an N×1 vector of errors at time t, ρ is a parameter, W is an N×N matrix of non-stochastic weights which defines the error interaction across areas and ξ(t) is an N×1 vector of time t innovations. All the diagonal elements of W are zero, (I–ρW) is non-singular and W is uniformly bounded in absolute value. This is referred to as a spatial autoregressive (AR) process and implies complex interdependence between locations, so that a shock at location j is transmitted to all other locations, as indicated by the expansion (I–ρW)⁻¹=I+ρW+ρ²W²+…
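In contrast, the MA error process (note 1), which is the subject of this paper, is u(t)=(I–ρW)ξ(t), so that a shock at location j affects only j and the locations directly linked to it through W.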
For ν, it is assumed that E(ν)=0 and E(νν′)=σ_ν²I, and we also make the standard assumption that the errors have finite fourth moments to ensure a finite domain for estimation. Likewise for µ, E(µ)=0 and E(µµ′)=σ_µ²I. Also, we assume that the error components are independent, hence E(µν′)=0, and each of the two error components µ and ν is subject to the same spatial moving average process, since u(t)=(I–ρW)ξ(t) with ξ(t)=µ+ν(t).
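The contrast between the two error processes can be illustrated numerically. The following is a minimal sketch, assuming the MA form u=(I–ρW)ξ, a row-normalized Rook's case matrix on a hypothetical 5×5 lattice and an illustrative ρ=−0.25; the helper name `rook_weights` is introduced here for illustration only.

```python
import numpy as np

def rook_weights(rows, cols):
    """Row-normalized Rook's case contiguity matrix for a rows x cols lattice."""
    n = rows * cols
    W = np.zeros((n, n))
    for r in range(rows):
        for c in range(cols):
            i = r * cols + c
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < rows and 0 <= cc < cols:
                    W[i, rr * cols + cc] = 1.0
    return W / W.sum(axis=1, keepdims=True)

rho = -0.25                      # illustrative value (assumption)
W = rook_weights(5, 5)           # hypothetical 5 x 5 lattice
I = np.eye(W.shape[0])

xi = np.zeros(W.shape[0])
xi[12] = 1.0                     # unit shock at the centre cell

u_ar = np.linalg.solve(I - rho * W, xi)  # AR: u = (I - rho W)^{-1} xi
u_ma = (I - rho * W) @ xi                # MA: u = (I - rho W) xi

ar_affected = int(np.sum(np.abs(u_ar) > 1e-12))
ma_affected = int(np.sum(np.abs(u_ma) > 1e-12))
print(ar_affected, ma_affected)
```

On this lattice the AR response is non-zero at every cell, whereas the MA response is confined to the shocked cell and its four Rook neighbours.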
3 The Moments Equations
Consider the TN×1 vector of residuals u
4 Estimation
The estimation procedure comprises three stages. At stage 1, because of the presence of the spatial lag, we obtain (note 4) IV estimates of b and hence residuals. In stage 2 we use these IV residuals to obtain the estimates g and of γ and , and denoting Γ and by G and we have the sample counterpart of equation (42), which is
In general, the variances associated with F1 and F2 differ, and Kapoor et al. (2007) suggest weighting to allow for this. However, in the Monte Carlo simulations that follow, for simplicity we have not introduced differential weighting. In the analogous situation examined by Kapoor et al. (2007), they note that giving equal weight to all six moments equations does give consistent estimates. While the small sample behaviour in the AR case is the worst of the alternative weighting schemes they examine, it seems appropriate, when commencing with MA errors, to explore first the behaviour of the simplest approach before turning to more elaborate methods, which could be the subject of further research.
In the third stage, because the error variance is not constant, the appropriate method is generalized least squares (GLS), estimated by IV to also allow for the presence of the endogenous spatial lag. The estimated error covariance matrix is obtained using the variance estimates from stage 2, but first the stage 2 estimate of ρ is used to perform a Cochrane–Orcutt (C-O)-type transformation to account for the spatial dependence in the residuals.
Normally with C-O the assumption is an autoregressive error process, hence u=ρWu+ξ, in which case one pre-multiplies through by (I–ρW) to obtain the innovations ξ. However, the MA error process requires pre-multiplication (note 6) by the inverse (note 7) (I–ρW)⁻¹ to obtain ξ, thus ξ(t)=(I–ρW)⁻¹u(t).
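The C-O-type transformation for MA errors described above can be sketched as follows. This is an illustrative sketch rather than the author's code: it assumes the stacked vectors are ordered period by period, uses a Moore–Penrose inverse as mentioned in the notes, and the function name `ma_transform` and the 4-location weights matrix are hypothetical.

```python
import numpy as np

def ma_transform(v, W, rho, T):
    """Pre-multiply each period's block of a stacked TN vector (or TN x k matrix)
    by the generalized inverse of (I - rho W)."""
    N = W.shape[0]
    A_inv = np.linalg.pinv(np.eye(N) - rho * W)   # Moore-Penrose inverse
    v = np.asarray(v, dtype=float)
    blocks = v.reshape(T, N, -1)                  # assumed period-major stacking
    out = np.einsum("ij,tjk->tik", A_inv, blocks)
    return out.reshape(v.shape)

# Hypothetical 4-location line with row-normalized neighbours.
W = np.array([[0.0, 1.0, 0.0, 0.0],
              [0.5, 0.0, 0.5, 0.0],
              [0.0, 0.5, 0.0, 0.5],
              [0.0, 0.0, 1.0, 0.0]])
rho, T = -0.5, 2
rng = np.random.default_rng(1)
xi = rng.standard_normal(T * 4)

# Build MA errors period by period, then recover the innovations.
u = np.concatenate([(np.eye(4) - rho * W) @ xi[t * 4:(t + 1) * 4] for t in range(T)])
xi_back = ma_transform(u, W, rho, T)
```

Applying the same function to Y and to the columns of X would give transformed variables of the kind used at stage 3; with ρ=0 the data are returned unchanged, as at stage 1.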
5 Example 1: the Data-generating Process
In this first example the data are purely artificial, and correspond to model (6), which is repeated here for convenience:
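A form of the model consistent with the parameters used in this section (λ for the endogenous spatial lag, γ for the exogenous variables in H, ρ for the MA error process) can be sketched as below. The stacked Kronecker notation, with observations ordered period by period, is an assumption made for this sketch.

```latex
\begin{aligned}
Y   &= \lambda\,(I_T \otimes W)\,Y + H\gamma + u, \\
u   &= \bigl(I_T \otimes (I_N - \rho W)\bigr)\,\xi, \\
\xi &= (\iota_T \otimes I_N)\,\mu + \nu .
\end{aligned}
```

Under this form the reduced-form expression for the dependent variable in each period is $Y(t) = (I_N - \lambda W)^{-1}\bigl(H(t)\gamma + u(t)\bigr)$.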
Matrix H had columns equal to the TN×1 vectors ιTN, H1, H2 and H3, in which ιTN is a TN×1 vector of 1s. To obtain each H, first generate the time t=0, N×1 vectors H1(0), H2(0), H3(0) by sampling at random from a uniform (rectangular) distribution with minimum equal to 0 and maximum equal to 1. Then for t equal to 1 … T, H1(t) is generated using π1∼N(0,1), and likewise for H2(t), H3(t) using π2∼N(0,1) and π3∼N(0,1). Then stacking these N×1 vectors we obtain H1, H2 and H3. In this way the exogenous variables in H have some time dependency, as seems reasonable for panel data. Once generated, the variables H1, H2 and H3 remain fixed. In the examples below, T=2 and T=4.
Given the exogenous variables, we next obtain the NT×1 innovations vector ξ. The innovations vector depends on the N×1 vector µ obtained by sampling from an N(0,σ_µ²) distribution and on the NT×1 vector ν obtained by sampling from an N(0,σ_ν²) distribution, so that ξ=(ιT⊗IN)µ+ν. This is repeated for each iteration k=1 … K, to obtain
Given Y, W and H, K estimates of the known parameters are obtained using the three-stage method outlined above. This is achieved by using instruments Z for the endogenous spatial lag comprising the exogenous variables, H, together with the TN×1 vector comprising T stacked identical time 'zero' N×1 spatial lag vectors WY(0), which is assumed to be exogenous with respect to the endogenous lag.
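The data-generating steps described in this section can be sketched as follows. Several details are assumptions made for illustration: the update rule for the exogenous variables is taken to be a random walk H(t)=H(t−1)+π, the error standard deviations default to 1, Y is obtained from the reduced form Y(t)=(I–λW)⁻¹(H(t)γ+u(t)), and the names `rook_weights` and `simulate_panel` are hypothetical.

```python
import numpy as np

def rook_weights(rows, cols):
    """Row-normalized Rook's case contiguity matrix for a rows x cols lattice."""
    n = rows * cols
    W = np.zeros((n, n))
    for r in range(rows):
        for c in range(cols):
            i = r * cols + c
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < rows and 0 <= cc < cols:
                    W[i, rr * cols + cc] = 1.0
    return W / W.sum(axis=1, keepdims=True)

def simulate_panel(W, T, lam, rho, gamma, sigma_mu=1.0, sigma_nu=1.0, seed=0):
    """Return stacked (Y, H) for one replication, ordered period by period."""
    rng = np.random.default_rng(seed)
    N = W.shape[0]
    I = np.eye(N)
    # Time-zero exogenous variables from U(0, 1); ASSUMED random-walk update after that.
    h = rng.uniform(0.0, 1.0, size=(N, 3))
    H_blocks = []
    for _ in range(T):
        h = h + rng.standard_normal((N, 3))
        H_blocks.append(np.column_stack([np.ones(N), h]))
    mu = rng.normal(0.0, sigma_mu, size=N)           # time-invariant component
    Y_blocks = []
    for t in range(T):
        nu_t = rng.normal(0.0, sigma_nu, size=N)     # period-specific component
        u_t = (I - rho * W) @ (mu + nu_t)            # spatial MA errors
        Y_blocks.append(np.linalg.solve(I - lam * W, H_blocks[t] @ gamma + u_t))
    return np.concatenate(Y_blocks), np.vstack(H_blocks)

W = rook_weights(15, 15)
Y, H = simulate_panel(W, T=2, lam=0.75, rho=-0.25,
                      gamma=np.array([1.0, 10.0, 10.0, 10.0]))
```

In a Monte Carlo run the exogenous variables would be generated once and held fixed while only µ and ν are redrawn; the sketch regenerates everything from the seed purely to stay short.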
5.1 Monte Carlo Results
Monte Carlo results are given both here and in more detail in Appendix B. Those given here are illustrative, while those in Appendix B provide more substantive empirical evidence of the consistency of the estimator. In this first example, the values ρ=−0.25, λ=0.75, γ0=1, γ1=10, γ2=10 and γ3=10, together with the error variances, are used to generate Y, and the three-stage estimation method employed K=100 times (note 9) gives a set of K estimates (k=1 … K) of these parameters. Table 1 summarizes the parameter estimate distributions. It is evident that the parameter estimate means are close to the true values, although it is shown later (see Appendix B) that there is evidence of small sample bias in the estimator of ρ, which nevertheless appears to be consistent. The distributions are relatively symmetrical, and on the whole have a degree of kurtosis consistent with the normal distribution. To formally test the null of normality, the K estimates are divided into groups with upper and lower bounds defined so that each group has approximately the same observed frequency (Oi). These observed frequencies are then compared to expected frequencies (Ei) calculated using the data to obtain maximum-likelihood estimates of the normal mean and variance (note 10). The test statistic is then referred to the χ² distribution. It is apparent that none of the distributions differs significantly from normal, using the upper 5% point (14.07) of the χ² distribution with 7 degrees of freedom.
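The goodness-of-fit comparison of observed and expected frequencies can be sketched as below. This is not the GENSTAT procedure itself: the Pearson form of the statistic is an assumption, the choice of ten groups is inferred from the critical value 14.07 (the upper 5% point of χ² with 10−1−2=7 degrees of freedom), and the function name `equal_frequency_chi2` is hypothetical.

```python
import numpy as np
from statistics import NormalDist

def equal_frequency_chi2(estimates, n_groups=10):
    """Pearson chi-square statistic against a fitted normal, using groups
    whose bounds are sample quantiles (roughly equal observed counts)."""
    x = np.asarray(estimates, dtype=float)
    K = x.size
    # Maximum-likelihood normal fit: sample mean and ddof=0 standard deviation.
    fitted = NormalDist(mu=float(x.mean()), sigma=float(x.std()))
    probs = np.linspace(0.0, 1.0, n_groups + 1)[1:-1]
    bounds = np.quantile(x, probs)                      # interior group bounds
    groups = np.searchsorted(bounds, x, side="right")   # group index 0..n_groups-1
    observed = np.bincount(groups, minlength=n_groups)
    cdf = np.array([0.0] + [fitted.cdf(float(b)) for b in bounds] + [1.0])
    expected = K * np.diff(cdf)
    stat = float(np.sum((observed - expected) ** 2 / expected))
    dof = n_groups - 1 - 2                              # two fitted parameters
    return stat, dof

rng = np.random.default_rng(0)
stat, dof = equal_frequency_chi2(rng.normal(-0.25, 0.1, size=100))
reject = stat > 14.07   # upper 5% point quoted in the text
```

The sketch assumes continuous estimates without ties, so that every group has a positive expected frequency.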
Table 1. Estimated parameter distributions
Table 2 summarizes the estimate distributions obtained with T=4 time periods and assuming a different set of parameter values. In this case the true values are ρ=−0.5, λ=0.25, γ0=1, γ1=2, γ2=4, γ3=6, and . Once again the distribution means are all quite close to expectation, in most cases with low levels of skewness and kurtosis and acceptable approximations to normality, although the estimates are clearly furthest from normality.
Table 2. Estimated parameter distributions
The more detailed Monte Carlo results given in Appendix B (Tables A2 to A8) use both Rook's (edge touching) and Queen's (edge and corner touching) definitions of contiguity on a lattice, and also a torus, with opposite sides of the lattice treated as being contiguous so as to eliminate edges. Measures of bias and of a single indicator combining both precision (variance) and accuracy (bias), as given by a variant of the RMSE statistic (see Appendix B), are calculated from 1,000 Monte Carlo replications of equation (53). Summarizing the outcomes obtained under the various alternative assumptions detailed in Tables A2 to A8, it is evident that there is a small sample bias in the estimator of ρ. Attention is focused on positive dependence (negative ρ), which gives positive bias, and these outcomes are mirrored in the case of negative dependence (note 11), which gives negative bias, so that in both cases the estimated parameter is closer to zero than the true value. The bias is increasing in , but the most significant result is the clear evidence that as the sample size (N) increases, the bias in the estimator of ρ diminishes and the RMSE falls, suggesting consistency.
These indications of consistency are precisely what one might anticipate on the basis of the theoretical results given by Kapoor et al. (2007) (see also Kelejian & Prucha, 1998, 1999). An essential difference between their analysis and what is done here is, of course, that here we are assuming a spatial moving average process rather than spatially autoregressive errors. In addition, in this paper we also introduce an endogenous spatial lag in the panel context, a feature absent from their analysis. Consistency of the generalized moments estimators of ρ and σ² is maintained by utilizing IV estimates of b, leading to consistently estimated disturbances. Thus, although the formal proofs given by Kapoor et al. (2007) are in the context of exogenous regressors (no spatial lag) and autoregressive rather than moving average errors, the Monte Carlo evidence suggests that their results carry through to the present set-up. Finally, it seems that although there is a small sample positive bias in the estimator of ρ, in many applied situations it will be effectively unbiased. One advantage of GMM estimation is its comparative simplicity and computational efficiency (note 12) in applications in which the number of locations is far in excess of those subject to Monte Carlo exploration in this paper. The results presented here suggest that as the number of locations rises into the thousands, as for example with the 3,000 plus counties in the USA, small sample bias in the estimator should be minimal.
6 Example 2: Real Estate Prices
In this example the GMM estimator is applied to a panel of average house prices in N=353 small areas (note 13) of England in the T=2 years 2000 and 2001, denoted by the NT×1 vector p. If the price at j is comparatively high, then demand may be displaced to nearby location k. On the other hand, supply may be displaced from k to j as investors in property seek higher returns. We therefore assume that price in area j interacts contemporaneously with price in area k, and model this interaction by the presence of an endogenous spatial lag Wp. In this case we again use the row normalized contiguity matrix W for both the endogenous spatial lag and the MA error process. The other explanatory variables (note 14) are income from local jobs (wE), equal to the local wage rate (w) times the local employment level (E), and income from wages and employment within commuting distance (w_cE_c). In order to be able to treat these as exogenous variables, 1 year lags are introduced, so that year 2001 prices are a function of income in 2000, and year 2000 prices are a function of income in 1999.
There are many other variables that one might wish to introduce were panel data available, such as air quality, the quality of local schooling, the size of the existing stock of properties, demand coming from non-wage earners such as the retired and students, and the effects of criminality, social quality of the neighbourhood, amenity, local taxes, the nature of the housing stock, planning and building regulations, vulnerability to flooding and therefore the additional insurance premiums for areas on flood plains, and various other social, demographic, labour market, environmental and cultural differences. These omitted variables are likely to be spatially autocorrelated, the net effect of which is to induce an organized residual pattern (Dubin, 1988). We model these omitted variables by the spatial MA error process. While displaced demand or supply may cause price interactions that cascade outwards in an autoregressive process, we assume no such chain reaction for these variables, so that a shock, on its own, has a limited spatial extent, which is a property of the spatial MA process.
In order to obtain the estimates given in Table 3, the exogenous variables wE and w_cE_c and their first spatial lags, obtained by pre-multiplying these vectors by W, were used as instruments for the endogenous spatial lag in the first stage of the three-stage estimation process. This then provided estimated IV residuals which facilitated the second stage, enabling estimates of ρ and the error variances to be obtained. The third-stage estimates are given in Table 3, showing that there is a significant endogenous spatial lag effect, so that prices are directly positively related to contemporaneous prices in contiguous areas, and there are significant effects due to income from local jobs and jobs within commuting distance.
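The first-stage IV step with spatially lagged exogenous variables as instruments can be sketched as follows. The data here are synthetic and noise-free, so the estimator should reproduce the chosen coefficients up to rounding; with errors present it would not be exact. The ring-shaped weights matrix, the coefficient values and the names `spatial_lag` and `two_stage_least_squares` are all hypothetical, and the two random columns merely stand in for the two income variables.

```python
import numpy as np

def spatial_lag(W, v, T):
    """Apply W period by period to a stacked TN vector or TN x k matrix."""
    N = W.shape[0]
    v = np.asarray(v, dtype=float)
    return np.einsum("ij,tjk->tik", W, v.reshape(T, N, -1)).reshape(v.shape)

def two_stage_least_squares(y, X, Z):
    """IV estimate b = (X_hat' X)^{-1} X_hat' y, with X_hat the projection of X on Z."""
    X_hat = Z @ np.linalg.lstsq(Z, X, rcond=None)[0]
    b = np.linalg.solve(X_hat.T @ X, X_hat.T @ y)
    return b, y - X @ b

# Hypothetical ring of N locations, each with two neighbours (row-normalized).
N, T = 30, 2
W = np.zeros((N, N))
for i in range(N):
    W[i, (i - 1) % N] = W[i, (i + 1) % N] = 0.5

rng = np.random.default_rng(0)
exog = rng.standard_normal((N * T, 2))            # stand-ins for the two income variables
H = np.column_stack([np.ones(N * T), exog])
lam_true, gamma_true = 0.4, np.array([1.0, 2.0, -1.0])

# Noise-free prices from the reduced form, period by period.
signal = H @ gamma_true
p = np.concatenate([
    np.linalg.solve(np.eye(N) - lam_true * W, signal[t * N:(t + 1) * N])
    for t in range(T)
])

X = np.column_stack([spatial_lag(W, p, T), H])    # endogenous lag plus exogenous variables
Z = np.column_stack([H, spatial_lag(W, exog, T)]) # exogenous variables and their spatial lags
b_hat, resid = two_stage_least_squares(p, X, Z)
```

The constant is not spatially lagged in Z because, with row-normalized W, its lag is again a column of 1s and would be collinear with the constant.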
Table 3. GMM estimates for the real estate price panel data with spatial moving average errors
7 Conclusion
This paper considers panel data in which spatial interaction comes from the effects of an endogenous lag and also from the MA error process. Monte Carlo results are given suggesting that the GMM estimator is consistent. It appears that this is the first paper to consider panel analysis with spatial MA errors, and also to jointly consider an endogenous lag together with spatial and temporal correlation in the error components, although much of this has been presaged in the earlier spatial econometrics literature (Anselin, 1988), and also in the time series context (Harvey, 1990). Indeed, in the conclusion to their paper, Kapoor et al. (2007) suggest that they would like to extend their results to models containing spatially lagged dependent variables. The present paper raises many issues which should be the subject of further study, such as the choice of appropriate instruments, the most efficient optimization method, and the small sample properties of the estimator. Nevertheless, the evidence presented here does suggest that there is scope for the practical implementation via GMM of panel data models with an endogenous spatial lag and spatial error processes.
Acknowledgements
The author is grateful to Michael Pfaffermayr and the other participants at the International Workshop on Spatial Econometrics and Statistics, Rome, 25–27 May 2006, and participants at the 13th International Conference on Panel Data, University of Cambridge, 7–9 July 2006, for their contributions to the discussion of this paper.
Notes
1. An early detailed account of the MA spatial process is given by Haining (1978).
2. Pre-multiplication of a TN×1 vector θ by Q0 creates a TN×1 vector of deviations from the mean, where the mean is obtained by averaging θ over time. Pre-multiplication of a TN×1 vector θ by Q1 creates a TN×1 vector, comprising N across time area-specific means stacked for each T.
3. Note that Tr(W′WW)=0 for the Rook's case contiguity matrix.
4. So that we can use equation (49) in both stage 1 and stage 3, it is assumed that and (so that Ωξ is a diagonal matrix of 1s) and that ρ=0. The result is that at stage 1 we simply obtain IV estimates.
5. Using unconstrained non-linear least squares estimation. The method is a modified Newton–Raphson method which is suitable for minimizing any non-linear function, and which depends on numerical differences rather than derivatives.
6. In contrast, at stage 1, ρ is assumed to equal 0, so that in that case Y * =Y, X * =X.
7. Moore–Penrose generalized inverses are used to avoid singularities.
8. In the main body of the text W is a Rook's case contiguity matrix based on a 15×15 lattice. In the Monte Carlo simulations described in Appendix B, the lattice size is varied and an irregular spatial partitioning is also considered. Also, alternative contiguity definitions, namely the Queen's case and torus, are also implemented.
9. Appendix B gives the results obtained using K=1,000 replications.
10. These procedures were carried out using the DISTRIBUTION directive of the programming language GENSTAT. The DISTRIBUTION directive is used to fit an observed sample of data to a theoretical distribution function, in order to obtain maximum-likelihood estimates of the parameters of the distribution and test the goodness of fit.
11. To save space these results have not been reported here.
12. With the MA error process the C-O transform involves the inverse. See Smirnov & Anselin (2001) for a discussion of the use of the power expansion to approximate the matrix inverse with large matrices.
13. Unitary authority and local authority districts, or UALADs.
14. Appendix A gives details of the sources and construction of these variables.
15. Small administrative areas, with median area equal to 250.77 km².
16. Available on the NOMIS website (the ONS online labour market statistics database).
17. 1991 Census of Population—Special Workplace Statistics, available from NOMIS.
18. Total employees and self-employed with a workplace coded, tabulated by residents in each zone (10% sample).
19. Minimum of the sum of the squared deviations of the observed proportions in each distance band up to 40 km and the proportions of the sum of the function exp(−δ i d ij ) calculated using the upper limit of each distance band.
References
- Anselin, L. 1988. Spatial Econometrics: Methods and Models, Dordrecht: Kluwer.
- Baltagi, B. H. 2005. Econometric Analysis of Panel Data, 3rd edn, Chichester: Wiley.
- Baltagi, B. H. and Li, D. 2006. Prediction in the panel data model with spatial correlation: the case of liquor. Spatial Economic Analysis, 1: 175–185.
- Baltagi, B. H., Song, S. H. and Koh, W. 2003. Testing panel data regression models with spatial error correlation. Journal of Econometrics, 117: 123–150.
- Bowden, R. J. and Turkington, D. A. 1984. Instrumental Variables, Cambridge: Cambridge University Press.
- Brueckner, J. K. 2003. Strategic interaction among governments: an overview of empirical studies. International Regional Science Review, 26: 175–188.
- Chen, X. and Conley, T. G. 2001. A new semiparametric spatial model for panel time series. Journal of Econometrics, 105: 59–83.
- Conley, T. G. 1999. GMM estimation with cross sectional dependence. Journal of Econometrics, 92: 1–45.
- Druska, V. and Horrace, W. C. 2003. Generalized Moments Estimation for Spatial Panel Data, Technical Working Paper 291, Cambridge, MA: National Bureau of Economic Research.
- Dubin, R. A. 1988. Estimation of regression coefficients in the presence of spatially autocorrelated error terms. Review of Economics and Statistics, 70: 466–474.
- Elhorst, J. P. 2003. Specification and estimation of spatial panel data models. International Regional Science Review, 26: 244–268.
- Greene, W. H. 2003. Econometric Analysis, 5th edn, Upper Saddle River, NJ: Prentice Hall.
- Haining, R. P. 1978. The moving average model for spatial interaction. Transactions of the Institute of British Geographers, 3: 202–225.
- Harvey, A. C. 1990. The Economic Analysis of Time Series, 2nd edn, Cambridge, MA: MIT Press.
- Kapoor, M., Kelejian, H. H. and Prucha, I. 2007. Panel data models with spatially correlated error components. Journal of Econometrics, 140: 97–130.
- Kelejian, H. H. and Prucha, I. R. 1998. A generalized spatial two-stage least squares procedure for estimating a spatial autoregressive model with autoregressive disturbances. Journal of Real Estate Finance and Economics, 17: 99–121.
- Kelejian, H. H. and Prucha, I. R. 1999. A generalized moments estimator for the autoregressive parameter in a spatial model. International Economic Review, 40: 509–533.
- Pinkse, J., Slade, M. and Shen, L. 2006. Dynamic spatial discrete choice using one-step GMM: an application to mine operating decisions. Spatial Economic Analysis, 1: 53–99.
- Smirnov, O. and Anselin, L. 2001. Fast maximum likelihood estimation of very large spatial autoregressive models: a characteristic polynomial approach. Computational Statistics and Data Analysis, 35: 301–319.
Appendix A
The dependent variable p is the mean transaction price (all types of residential property) by area for the periods July–September 2000 and July–September 2001 for the N=353 English Unitary Authority and Local Authority Districts (UALADs; note 15). The data were provided by the Land Registry. The wage rate (w) is the gross weekly pay for all occupations and both males and females, taken from the Office for National Statistics (ONS; note 16) New Earnings Survey. The employment level for the years 1999 and 2000 is based on the annual business enquiry employee analysis, also carried out by the ONS and available on the NOMIS database.
Total earnings in an area, denoted by wE, is the product of the average wage rate (w) in 1999 and 2000 and the total level of employment (E) in 1999 and 2000. These are assumed to be predetermined with respect to year 2000 and 2001 price levels. The vector w_cE_c denotes total earnings within commuting distance of a UALAD. This is equal to the matrix product of the n×n matrix C and the n×1 vector wE. Matrix C is defined as follows:
Table A1. Commuting distances to work in England and Wales
Appendix B: Monte Carlo Investigation
Bias=median – true parameter value
In all cases W is normalized to row totals equal to 1 and the bias is based on 1,000 Monte Carlo replications.
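The summary measures can be sketched as below. The bias follows the definition given above (median minus true value). The RMSE variant is an assumption: the sketch uses the robust form familiar from the generalized moments literature, in which the squared bias is combined with the interquartile range divided by 1.35, and the function name `bias_and_rmse` is hypothetical.

```python
import numpy as np

def bias_and_rmse(estimates, true_value):
    """Median-based bias and an ASSUMED robust RMSE variant:
    sqrt(bias^2 + (IQ / 1.35)^2), with IQ the interquartile range."""
    x = np.asarray(estimates, dtype=float)
    bias = float(np.median(x) - true_value)
    q1, q3 = np.percentile(x, [25, 75])
    rmse = float(np.sqrt(bias ** 2 + ((q3 - q1) / 1.35) ** 2))
    return bias, rmse

# Toy set of five estimates of a parameter whose true value is 2.5.
bias, rmse = bias_and_rmse([1.0, 2.0, 3.0, 4.0, 5.0], true_value=2.5)
```

For normally distributed estimates the interquartile range divided by 1.35 is close to the standard deviation, so this measure approximates the usual RMSE while being less sensitive to outlying replications.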