Original Articles

Approximation of Gaussian spatial autoregressive models for massive regular square tessellation data

Pages 2143-2173 | Received 05 May 2015, Accepted 28 Jun 2015, Published online: 03 Aug 2015
 

Abstract

Most of the literature to date proposes approximations to the determinant of a positive definite n × n spatial covariance matrix (the Jacobian term) for Gaussian spatial autoregressive models that fail to support the analysis of massive georeferenced data sets. This paper briefly surveys this literature, recalls and refines much simpler Jacobian approximations, presents selected eigenvalue estimation techniques, summarizes validation results (for estimated eigenvalues, Jacobian approximations, and estimation of a spatial autocorrelation parameter), and illustrates the estimation of the spatial autocorrelation parameter in a spatial autoregressive model specification for cases as large as n = 37,214,101. The principal contribution of this paper is to the implementation of spatial autoregressive model specifications for any size of georeferenced data set. Its specific additions to the literature include (1) new, more efficient estimation algorithms; (2) an approximation of the Jacobian term for remotely sensed data forming incomplete rectangular regions; (3) issues of inference; and (4) timing results.

Acknowledgement

This research was supported, in part, by an endowed Ashbel Smith chaired professorship.

Notes

1. Often matrix W is row-standardized; this adjustment results in a value of some random variable at location i being specified as a function of its average neighboring values, sets the maximum eigenvalue to 1, and allows positive estimates of the spatial autocorrelation parameter to have a more natural interpretation by falling into the interval [0, 1).
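
The effect of row-standardization described in Note 1 can be checked numerically. The following is a minimal sketch, not the article's code: it assumes a binary rook-adjacency definition of neighbors on a small 4 × 5 grid, and the helper names rook_adjacency and row_standardize are hypothetical.

```python
import numpy as np


def rook_adjacency(p: int, q: int) -> np.ndarray:
    """Binary rook adjacency matrix for a p-by-q regular square tessellation."""
    n = p * q
    c = np.zeros((n, n))
    for r in range(p):
        for s in range(q):
            i = r * q + s
            if s + 1 < q:  # neighbor to the east
                c[i, i + 1] = c[i + 1, i] = 1.0
            if r + 1 < p:  # neighbor to the south
                c[i, i + q] = c[i + q, i] = 1.0
    return c


def row_standardize(c: np.ndarray) -> np.ndarray:
    """Divide each row by its sum so that every row of the result sums to 1."""
    return c / c.sum(axis=1, keepdims=True)


c = rook_adjacency(4, 5)
w = row_standardize(c)
y = np.arange(20, dtype=float)  # illustrative values, one per grid cell

# Each element of W @ y is the average of y over that cell's neighbors.
neighbor_means = w @ y

# W is similar to a symmetric matrix, so its eigenvalues are real.
max_eig = np.linalg.eigvals(w).real.max()
print(round(float(max_eig), 10))
```

Because every row of the standardized matrix sums to 1, a vector of ones is an eigenvector with eigenvalue 1, and no eigenvalue can exceed it in magnitude.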

2. Walde et al. (2008) argue that the GMM approach is preferable, although they employ an unrealistically sparse matrix with only two neighbors per location and furnish no timing results. GMM (Kelejian and Prucha 1999) requires the calculation of WW and $W^{T}W$, as well as other quantities, all time-consuming operations.

3. Calibration refers to either estimation based upon a purposeful systematic sample covering a feasible parameter space, or calculations based upon the population moments.

4. Jacobian terms were calculated for a range of spatial autocorrelation values and each of a number of selected P × Q surfaces ranging in size from 3 × 3 to 75 × 75; Equation (11) q2, q4, and q20 coefficients were calibrated, and then these trend line descriptions were estimated from those results.
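
For a complete P × Q grid, exact Jacobian terms of the kind described in Note 4 can be computed cheaply. The sketch below is an illustration under stated assumptions rather than the article's procedure: it takes the Jacobian term to be log det(I − ρC) for a binary rook adjacency matrix C, uses the standard closed-form eigenvalues of that matrix, 2cos(pπ/(P+1)) + 2cos(qπ/(Q+1)), and checks the result against a dense determinant. The function names and the grid of ρ values are hypothetical; Equation (11) and its coefficients are not reproduced here.

```python
import numpy as np


def rook_eigenvalues(p: int, q: int) -> np.ndarray:
    """Closed-form eigenvalues of the binary rook adjacency matrix on a p-by-q grid."""
    a = 2.0 * np.cos(np.pi * np.arange(1, p + 1) / (p + 1))
    b = 2.0 * np.cos(np.pi * np.arange(1, q + 1) / (q + 1))
    return (a[:, None] + b[None, :]).ravel()


def log_jacobian(rho: float, eigenvalues: np.ndarray) -> float:
    """log det(I - rho * C), computed as a sum over the eigenvalues of C."""
    return float(np.sum(np.log1p(-rho * eigenvalues)))


def rook_adjacency(p: int, q: int) -> np.ndarray:
    """Dense rook adjacency matrix, built as a Kronecker sum of two path graphs."""
    def path(m: int) -> np.ndarray:
        return np.eye(m, k=1) + np.eye(m, k=-1)
    return np.kron(path(p), np.eye(q)) + np.kron(np.eye(p), path(q))


# Assumed systematic sample of rho values; |rho| < 1/4 keeps I - rho * C
# positive definite because every eigenvalue of C lies strictly inside (-4, 4).
rhos = np.linspace(-0.24, 0.24, 9)

p, q = 7, 5
eig = rook_eigenvalues(p, q)
fast = np.array([log_jacobian(r, eig) for r in rhos])

# Dense check, feasible only for small grids.
c = rook_adjacency(p, q)
dense = np.array([np.linalg.slogdet(np.eye(p * q) - r * c)[1] for r in rhos])

print(np.max(np.abs(fast - dense)))
```

The eigenvalue route costs O(PQ) per value of ρ, whereas the dense determinant costs O((PQ)^3), which is why the dense check is restricted to a small grid here.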

5. Regressing Y on X produces $(X^{T}X)^{-1}X^{T}Y$, the first standard OLS result. Regressing WY on X produces $(X^{T}X)^{-1}X^{T}WY$, the second standard OLS result.
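
The two OLS results in Note 5 can be sketched as follows. This is a hedged illustration only: the data are simulated, the 5 × 5 grid and the row-standardized rook matrix W are assumptions, and the variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
p, q = 5, 5
n = p * q


def path(m: int) -> np.ndarray:
    """Adjacency matrix of a path graph on m nodes."""
    return np.eye(m, k=1) + np.eye(m, k=-1)


# Assumed spatial weights: rook adjacency on a 5-by-5 grid, row-standardized.
C = np.kron(path(p), np.eye(q)) + np.kron(np.eye(p), path(q))
W = C / C.sum(axis=1, keepdims=True)

# Simulated design matrix (intercept plus one covariate) and response.
X = np.column_stack([np.ones(n), rng.normal(size=n)])
Y = rng.normal(size=n)

# Both regressions share the same cross-product matrix X'X.
XtX = X.T @ X
b_y = np.linalg.solve(XtX, X.T @ Y)          # (X'X)^{-1} X'Y
b_wy = np.linalg.solve(XtX, X.T @ (W @ Y))   # (X'X)^{-1} X'WY

print(b_y, b_wy)
```

Solving the normal equations with np.linalg.solve avoids forming the explicit inverse; for an ill-conditioned X, np.linalg.lstsq would be the safer choice.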

6. As the sample size increases, the mean of the sampling distribution concentrates at the population parameter value, the variance of the sampling distribution decreases, and the sampling distribution converges on, for example, the normal distribution. Not all statistics are governed by a CLT; the arithmetic mean of Cauchy random variables, for example, is not.
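
The contrast drawn in Note 6 can be seen in a small simulation. The sketch below is an illustration, not part of the article: the seed, sample sizes, and number of replications are arbitrary assumptions. It compares the interquartile range of sample means for normal and Cauchy draws; the former shrinks as n grows, while the latter does not, because the mean of n standard Cauchy variables is itself standard Cauchy.

```python
import numpy as np

rng = np.random.default_rng(42)
reps = 500          # number of replicated samples per sample size (assumed)
sizes = (20, 2000)  # sample sizes to compare (assumed)


def iqr(values: np.ndarray) -> float:
    """Interquartile range, a spread measure that exists even for the Cauchy."""
    q1, q3 = np.percentile(values, [25, 75])
    return float(q3 - q1)


results = {}
for n in sizes:
    normal_means = rng.standard_normal((reps, n)).mean(axis=1)
    cauchy_means = rng.standard_cauchy((reps, n)).mean(axis=1)
    results[n] = {"normal": iqr(normal_means), "cauchy": iqr(cauchy_means)}

for n, spread in results.items():
    print(n, spread)
```

The interquartile range is used instead of the variance because the Cauchy distribution has no finite variance to estimate.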

7. A wide range of researchers are beginning to embrace this notion. For example, Lovell (2013) argues this point, but in terms of biological importance. Dette and Wied (2015) argue this point, but for change detection in time series.

8. Statistical significance also may be based upon a degrees of freedom measure other than n.

9. MLEs are widely accepted as being asymptotically optimal and asymptotically normally distributed, with their variances given by the inverse of the Fisher information matrix. This outcome results from the CLT guaranteeing that the logarithmic differentiation of a probability density function is asymptotically normally distributed as n increases to infinity (Hayashi 2009, p. 99).

10. A 2 × 1 lattice yields 9/16 as the ratio for the two quantities, with the limit of this ratio for an infinite lattice being 3/4. The specimen incomplete rectangular regions have a ratio between 0.76 and 0.80.

11. Centering reduces multicollinearity, sometimes dramatically (see Kestens et al. 2006).

12. The current calibration involves a systematic sample of size 500 from across the feasible parameter space. This amount of time can be reduced by decreasing this systematic sample size to, say, 50.

13. The appropriate test statistics are z = , , z = , and z = , where the subscript 0 denotes the comparison value.
