Theory and Methods

Random Forests for Spatially Dependent Data

Pages 665-683 | Received 02 Dec 2020, Accepted 23 Jun 2021, Published online: 13 Aug 2021

References

  • Ahijevych, D., Pinto, J. O., Williams, J. K., and Steiner, M. (2016), “Probabilistic Forecasts of Mesoscale Convective System Initiation Using the Random Forest Data Mining Technique,” Weather and Forecasting, 31, 581–599.
  • Banerjee, S., Carlin, B. P., and Gelfand, A. E. (2014), Hierarchical Modeling and Analysis for Spatial Data, Boca Raton, FL: CRC Press.
  • Bradley, R. C. (2005), “Basic Properties of Strong Mixing Conditions. A Survey and Some Open Questions,” arXiv: math/0511078.
  • Breiman, L. (1996), “Bagging Predictors,” Machine Learning, 24, 123–140. DOI: 10.1007/BF00058655.
  • Breiman, L. (2001), “Random Forests,” Machine Learning, 45, 5–32.
  • Breiman, L., Friedman, J., Stone, C. J., and Olshen, R. A. (1984), Classification and Regression Trees, Boca Raton, FL: CRC Press.
  • Carrasco, M. and Chen, X. (2002), “Mixing and Moment Properties of Various GARCH and Stochastic Volatility Models,” Econometric Theory, 17–39. DOI: 10.1017/S0266466602181023.
  • Chen, Y. M., Chen, X. S., and Li, W. (2016), “On Perturbation Bounds for Orthogonal Projections,” Numerical Algorithms, 73, 433–444. DOI: 10.1007/s11075-016-0102-2.
  • Datta, A., Banerjee, S., Finley, A. O., and Gelfand, A. E. (2016a), “Hierarchical Nearest-Neighbor Gaussian Process Models for Large Geostatistical Datasets,” Journal of the American Statistical Association, 111, 800–812. DOI: 10.1080/01621459.2015.1044091.
  • Datta, A., Banerjee, S., Finley, A. O., and Gelfand, A. E. (2016b), “On Nearest-Neighbor Gaussian Process Models for Massive Spatial Data,” Wiley Interdisciplinary Reviews: Computational Statistics, 8, 162–171.
  • Datta, A., Banerjee, S., Finley, A. O., Hamm, N. A. S., and Schaap, M. (2016c), “Nonseparable Dynamic Nearest Neighbor Gaussian Process Models for Large Spatio-Temporal Data With an Application to Particulate Matter Analysis,” Annals of Applied Statistics, 10, 1286–1316.
  • Dehling, H., and Philipp, W. (2002), “Empirical Process Techniques for Dependent Data,” in Empirical Process Techniques for Dependent Data, Boston, MA: Springer, pp. 3–113.
  • Di, Q., Amini, H., Shi, L., Kloog, I., Silvern, R., Kelly, J., Sabath, M. B., Choirat, C., Koutrakis, P., Lyapustin, A., et al. (2019), “An Ensemble-Based Model of PM2.5 Concentration Across the Contiguous United States With High Spatiotemporal Resolution,” Environment International, 130, 104909. DOI: 10.1016/j.envint.2019.104909.
  • Diggle, P. J., and Hutchinson, M. F. (1989), “On Spline Smoothing With Autocorrelated Errors,” Australian Journal of Statistics, 31, 166–182. DOI: 10.1111/j.1467-842X.1989.tb00510.x.
  • Doukhan, P. (2012), Mixing: Properties and Examples, Vol. 85, New York: Springer Science & Business Media.
  • Du, J., Zhang, H., and Mandrekar, V. (2009), “Fixed-Domain Asymptotic Properties of Tapered Maximum Likelihood Estimators,” The Annals of Statistics, 37, 3330–3361. DOI: 10.1214/08-AOS676.
  • Fayad, I., Baghdadi, N., Bailly, J.-S., Barbier, N., Gond, V., Hérault, B., El Hajj, M., Fabre, F., and Perrin, J. (2016), “Regional Scale Rain-Forest Height Mapping Using Regression-Kriging of Spaceborne and Airborne LiDAR Data: Application on French Guiana,” Remote Sensing, 8, 240. DOI: 10.3390/rs8030240.
  • Finley, A. O., Datta, A., and Banerjee, S. (2021), “spNNGP: R package for Nearest Neighbor Gaussian Process Models,” Journal of Statistical Software, (accepted conditionally).
  • Finley, A. O., Datta, A., Cook, B. D., Morton, D. C., Andersen, H. E., and Banerjee, S. (2019), “Efficient Algorithms for Bayesian Nearest Neighbor Gaussian Processes,” Journal of Computational and Graphical Statistics, 28, 401–414. DOI: 10.1080/10618600.2018.1537924.
  • Fox, E. W., Ver Hoef, J. M., and Olsen, A. R. (2020), “Comparing Spatial Regression to Random Forests for Large Environmental Data Sets,” PloS One, 15, e0229509. DOI: 10.1371/journal.pone.0229509.
  • Friedman, J., Hastie, T., and Tibshirani, R. (2001), The Elements of Statistical Learning, Vol. 1, New York: Springer Series in Statistics.
  • Friedman, J. H. (1991), “Multivariate Adaptive Regression Splines,” The Annals of Statistics, 1–67. DOI: 10.1214/aos/1176347963.
  • Friedman, J. H. (2006), “Recent Advances in Predictive (Machine) Learning,” Journal of Classification, 23, 175–197.
  • Friedman, J. H., and Popescu, B. E. (2008), “Predictive Learning via Rule Ensembles,” The Annals of Applied Statistics, 2, 916–954. DOI: 10.1214/07-AOAS148.
  • Genton, M. G. and Kleiber, W. (2015), “Cross-Covariance Functions for Multivariate Geostatistics,” Statistical Science, 147–163. DOI: 10.1214/14-STS487.
  • Georganos, S., Grippa, T., Niang Gadiaga, A., Linard, C., Lennert, M., Vanhuysse, S., Mboga, N., Wolff, E., and Kalogirou, S. (2019), “Geographical Random Forests: A Spatial Extension of the Random Forest Algorithm to Address Spatial Heterogeneity in Remote Sensing and Population Modelling,” Geocarto International, 1–16. DOI: 10.1080/10106049.2019.1595177.
  • Györfi, L., Kohler, M., Krzyżak, A., and Walk, H. (2002), A Distribution-Free Theory of Nonparametric Regression (Vol. 1), New York: Springer.
  • Hartikainen, J., and Särkkä, S. (2010), “Kalman Filtering and Smoothing Solutions to Temporal Gaussian Process Regression Models,” in 2010 IEEE International Workshop on Machine Learning for Signal Processing, Kittilä, Finland, IEEE, pp. 379–384.
  • Heaton, M. J., Datta, A., Finley, A. O., Furrer, R., Guinness, J., Guhaniyogi, R., Gerber, F., Gramacy, R. B., Hammerling, D., Katzfuss, M., Lindgren, F., Nychka, D., Sun, F., and Zammit-Mangion, A. (2019), “A Case Study Competition Among Methods for Analyzing Large Spatial Data,” Journal of Agricultural, Biological and Environmental Statistics, 24, 398–425. DOI: 10.1007/s13253-018-00348-w.
  • Hengl, T., Nussbaum, M., Wright, M. N., Heuvelink, G. B., and Gräler, B. (2018), “Random Forest as a Generic Framework for Predictive Modeling of Spatial and Spatio-Temporal Variables,” PeerJ, 6, e5518. DOI: 10.7717/peerj.5518.
  • Ihara, S. (1993), Information Theory for Continuous Systems, Vol. 2, World Scientific.
  • Kimeldorf, G., and Wahba, G. (1971), “Some Results on Tchebycheffian Spline Functions,” Journal of Mathematical Analysis and Applications, 33, 82–95. DOI: 10.1016/0022-247X(71)90184-3.
  • Lim, C. C., Kim, H., Vilcassim, M. R., Thurston, G. D., Gordon, T., Chen, L.-C., Lee, K., Heimbinder, M., and Kim, S.-Y. (2019), “Mapping Urban Air Quality Using Mobile Sampling With Low-Cost Sensors and Machine Learning in Seoul, South Korea,” Environment International, 131, 105022.
  • Lin, Y., and Jeon, Y. (2006), “Random Forests and Adaptive Nearest Neighbors,” Journal of the American Statistical Association, 101, 578–590. DOI: 10.1198/016214505000001230.
  • Mentch, L., and Hooker, G. (2016), “Quantifying Uncertainty in Random Forests Via Confidence Intervals and Hypothesis Tests,” The Journal of Machine Learning Research, 17, 841–881.
  • Mentch, L., and Zhou, S. (2020), “Getting Better from Worse: Augmented Bagging and a Cautionary Tale of Variable Importance,” arXiv: 2003.03629.
  • Mokkadem, A. (1988), “Mixing Properties of ARMA Processes,” Stochastic Processes and Their Applications, 29, 309–315. DOI: 10.1016/0304-4149(88)90045-2.
  • Nobel, A., and Dembo, A. (1993), “A Note on Uniform Laws of Averages for Dependent Processes,” Statistics & Probability Letters, 17, 169–172.
  • Nobel, A. (1996), “Histogram Regression Estimation Using Data-Dependent Partitions,” The Annals of Statistics, 24, 1084–1105. DOI: 10.1214/aos/1032526958.
  • Pardo-Igúzquiza, E. and Olea, R. A. (2012), “VARBOOT: A Spatial Bootstrap Program for Semivariogram Uncertainty Assessment,” Computers & Geosciences, 41, 188–198.
  • Peligrad, M. (2001), “A Note on the Uniform Laws for Dependent Processes Via Coupling,” Journal of Theoretical Probability, 14, 979–988.
  • Saha, A. and Datta, A. (2018a), “BRISC: Bootstrap for Rapid Inference on Spatial Covariances,” Statistics, 7, e184. DOI: 10.1002/sta4.184.
  • Saha, A. and Datta, A. (2018b), BRISC: Fast Inference for Large Spatial Datasets Using BRISC, R package version 0.1.0.
  • Scornet, E. (2016), “Random Forests and Kernel Methods,” IEEE Transactions on Information Theory, 62, 1485–1500. DOI: 10.1109/TIT.2016.2514489.
  • Scornet, E., Biau, G., and Vert, J.-P. (2015), “Consistency of Random Forests,” The Annals of Statistics, 43, 1716–1741. DOI: 10.1214/15-AOS1321.
  • Stein, M. L. (2012), Interpolation of Spatial Data: Some Theory for Kriging, New York: Springer Science & Business Media.
  • Stein, M. L. (2002), “The Screening Effect in Kriging,” The Annals of Statistics, 30, 298–323. DOI: 10.1214/aos/1015362194.
  • Taylor, J., and Einbeck, J. (2013), “Challenging the Curse of Dimensionality in Multivariate Local Linear Regression,” Computational Statistics, 28, 955–976. DOI: 10.1007/s00180-012-0342-0.
  • Vecchia, A. V. (1988), “Estimation and Model Identification for Continuous Spatial Processes,” Journal of the Royal Statistical Society, Series B, 50, 297–312. DOI: 10.1111/j.2517-6161.1988.tb01729.x.
  • Viscarra Rossel, R. A., Webster, R., and Kidd, D. (2014), “Mapping Gamma Radiation and Its Uncertainty From Weathering Products in a Tasmanian Landscape With a Proximal Sensor and Random Forest Kriging,” Earth Surface Processes and Landforms, 39, 735–748. DOI: 10.1002/esp.3476.
  • Wager, S., and Athey, S. (2018), “Estimation and Inference of Heterogeneous Treatment Effects Using Random Forests,” Journal of the American Statistical Association, 113, 1228–1242. DOI: 10.1080/01621459.2017.1319839.
  • Zhang, H. (2004), “Inconsistent Estimation and Asymptotically Equal Interpolations in Model-Based Geostatistics,” Journal of the American Statistical Association, 99, 250–261. DOI: 10.1198/016214504000000241.
