Applied Artificial Intelligence

An International Journal

Volume 29, 2015 - Issue 3

Original Articles

An ArcGIS Tool for Modeling the Climate Envelope with Feed-Forward ANN

Ákos Bede-Fazekas (corresponding author), Department of Garden and Open Space Design, Corvinus University of Budapest, Budapest, Hungary

Levente Horváth, Department of Mathematics and Informatics, Corvinus University of Budapest, Budapest, Hungary

Attila J. Trájer, Department of Limnology, University of Pannonia, Veszprém, Hungary; MTA-PE Limnoecology Research Group, Veszprém, Hungary

Tibor Gregorics, Department of Software Technology and Methodology, Eötvös Loránd University, Budapest, Hungary

Pages 233-242 | Published online: 01 Apr 2015

https://doi.org/10.1080/08839514.2015.1004612

Abstract

This article is about the development and application of an ESRI ArcGIS tool that implements a multilayer, feed-forward artificial neural network (ANN) to study the climate envelopes of species. The supervised learning is achieved by a backpropagation algorithm. Based on the distribution and the grids of the climate (and edaphic data) of the reference and future periods, the tool predicts the future potential distribution of the studied species. The trained network can be saved and loaded. A modeling result based on the distribution of European larch (Larix decidua Mill.) is presented as a case study.

INTRODUCTION

The impact of climate change on the distribution of species can be modeled with climate envelope modeling (CEM), also known as niche-based modeling or correlative modeling (Box 1981; Hijmans and Graham 2006). The method predicts the responses of species to climate change by drawing an envelope around the domain of climatic variables where the given species has been recently found and then identifying areas predicted to fall within that domain under future scenarios (Ibáñez et al. 2006). It hypothesizes that both present and future distributions depend mostly on the climatic variables (Czúcz 2010). In contrast to mechanistic models, CEM tries to find statistical correlations between climate and the distribution of species (Elith and Leathwick 2009; Guisan and Zimmermann 2000), and it models the future temporal correspondence based on the present spatial correspondence between the variables (Pickett 1989). A key advantage of CEM is that it requires no detailed physiological data of species (Pearson et al. 2002).
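The envelope idea described above can be illustrated with its simplest form, a rectilinear envelope made of per-variable minimum and maximum bounds. This is only a sketch for orientation and is not the method of the tool presented in this article, which uses an ANN instead; the function names and all climate values are hypothetical.

```python
def fit_envelope(presence_rows):
    """Return per-variable (min, max) bounds over the presence records."""
    columns = list(zip(*presence_rows))
    return [(min(col), max(col)) for col in columns]


def inside_envelope(envelope, row):
    """True if every climatic variable of `row` lies within its bounds."""
    return all(lo <= value <= hi for (lo, hi), value in zip(envelope, row))


# Hypothetical (temperature in deg C, precipitation in mm) at presence points.
presence = [(12.0, 80.0), (14.5, 95.0), (13.2, 110.0)]
envelope = fit_envelope(presence)

# Hypothetical future climate at two grid points.
future = [(13.0, 100.0), (17.0, 60.0)]
suitable = [inside_envelope(envelope, row) for row in future]

print(envelope)  # [(12.0, 14.5), (80.0, 110.0)]
print(suitable)  # [True, False]
```

The first future point falls inside the envelope fitted to the present climate and would be mapped as potentially suitable; the second is too warm and too dry.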

Various methods can be used to determine the climate envelope, including simple regression, distance-based methods, genetic algorithms (GAs), and artificial neural networks (ANNs; Ibáñez et al. 2006). The last belong to the artificial intelligence (AI) methods, which are used less frequently in ecology than statistical approaches because AI models are considered less interpretable and are often called “black boxes” (Elith et al. 2008). A review of the various modeling methods is provided by Guisan and Zimmermann (2000). In CEM, ANN is a relatively new but widely applied method (Carpenter et al. 1999; Hilbert and Van Den Muyzenberg 1999; Özesmi and Özesmi 1999; Hilbert and Ostendorf 2001; Pearson et al. 2002; Özesmi et al. 2006; Harrison et al. 2010; Ogawa-Onishi et al. 2010). ANN-based models are more powerful than multiple regression models when modeling nonlinear relationships (Lek et al. 1996). ANNs have proven to be advantageous in many fields of science wherein complex datasets need to be analyzed (Van Leeuwen et al. 2012).

The concept of ANN is inspired by the structure and operation of the nervous system. An ANN is a machine learning system built from computational units, called neurons, which are simplifications of biological neurons. In general, neurons are organized in layers and are densely connected to each other. An ANN is able to learn and recognize patterns, such as the climatic patterns that can be found within the distribution of a species. A detailed discussion of the method is provided by Picton (2000) and Van Leeuwen (2012).

PROGRAM DESCRIPTION

The program provides an AI method for CEM in an ESRI ArcGIS¹ 10.0 environment. It was implemented in Python and is freely accessible through the ESRI tool center (ArcGIS 2013). The inputs are the distribution of the studied species (which serves as the base of the presence/absence calculations) and the grids of the climatic, edaphic, topographic, and other data of the reference and future periods (which provide the predictors, or explanatory variables, for the learning and projection phases). From these, the program learns the climatic patterns found within the distribution of the species and then predicts its future potential distribution (makes a projection). The program implements a multilayer, feed-forward ANN to learn the climate envelope of species. A sigmoid (hyperbolic tangent) activation function is used. The multilayer topology includes (1) one input layer with as many neurons as there are input predictor variables; (2) several hidden layers (the number of hidden layers and the number of neurons per hidden layer can be set); and (3) one output layer with one neuron that estimates the presence/absence at a certain geographic point (a point of the grid). The supervised learning is achieved by a backpropagation algorithm with adjustable learning rate and momentum factor. Multiple predictions can be made in one procedure. The trained network can be saved to and loaded from a file; therefore, training and prediction can be separated.
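The topology and learning rule described above can be sketched in a few dozen lines of plain Python. This is not the source code of the tool: the class name FeedForwardNet, the weight initialization range, the per-neuron bias, the squared-error loss, and the logistic output neuron (chosen here so that outputs stay within the (0;1) interval) are all assumptions made for illustration. Only the tanh hidden layers, the single output neuron, the learning rate, and the momentum factor come from the description.

```python
import math
import random


class FeedForwardNet:
    """Fully connected network: tanh hidden layers, one logistic output (assumed)."""

    def __init__(self, n_inputs, hidden_sizes, seed=0):
        rng = random.Random(seed)
        sizes = [n_inputs] + list(hidden_sizes) + [1]
        # weights[l][j] holds neuron j's incoming weights; the last entry is the bias.
        self.weights = [
            [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)] for _ in range(n_out)]
            for n_in, n_out in zip(sizes, sizes[1:])
        ]
        self.velocity = [[[0.0] * len(w) for w in layer] for layer in self.weights]

    def forward(self, x):
        """Return the activations of every layer, input layer first."""
        activations = [list(x)]
        for index, layer in enumerate(self.weights):
            inputs = activations[-1] + [1.0]  # the constant 1.0 feeds the bias
            sums = [sum(w * v for w, v in zip(neuron, inputs)) for neuron in layer]
            if index == len(self.weights) - 1:
                outputs = [1.0 / (1.0 + math.exp(-s)) for s in sums]
            else:
                outputs = [math.tanh(s) for s in sums]
            activations.append(outputs)
        return activations

    def predict(self, x):
        return self.forward(x)[-1][0]

    def train_pattern(self, x, target, learning_rate, momentum):
        """One backpropagation step on a single pattern; returns the squared error."""
        activations = self.forward(x)
        out = activations[-1][0]
        # Delta of the logistic output neuron under squared-error loss.
        deltas = [(out - target) * out * (1.0 - out)]
        for l in range(len(self.weights) - 1, -1, -1):
            inputs = activations[l] + [1.0]
            if l > 0:
                # Deltas of the previous (tanh) layer, computed before weights change.
                prev_deltas = [
                    (1.0 - activations[l][i] ** 2)
                    * sum(d * self.weights[l][j][i] for j, d in enumerate(deltas))
                    for i in range(len(activations[l]))
                ]
            for j, d in enumerate(deltas):
                for i, v in enumerate(inputs):
                    step = momentum * self.velocity[l][j][i] - learning_rate * d * v
                    self.velocity[l][j][i] = step
                    self.weights[l][j][i] += step
            if l > 0:
                deltas = prev_deltas
        return (out - target) ** 2


# Hypothetical one-predictor training pattern: presence (1) for positive values.
patterns = [([-1.0], 0.0), ([-0.5], 0.0), ([0.5], 1.0), ([1.0], 1.0)]
net = FeedForwardNet(n_inputs=1, hidden_sizes=[3])
before = sum((net.predict(x) - t) ** 2 for x, t in patterns)
for _ in range(2000):
    for x, t in patterns:
        net.train_pattern(x, t, learning_rate=0.1, momentum=0.01)
after = sum((net.predict(x) - t) ** 2 for x, t in patterns)
print(before, after)
```

The momentum term reuses a fraction of the previous weight change, which smooths successive updates; with a momentum factor of 0 the rule reduces to plain gradient descent.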

The program runs linearly through five distinct phases: verifying, data preprocessing, training, projecting, and processing (see Figure 1). The verifying phase checks the input data and the parameters both formally and in terms of content. In case of any problem, the program shows an informative error message and terminates. The data preprocessing and training phases are run if the program was started with the parameter “Should training be done?” checked. During data preprocessing, the climatic data are studentized (standardized) for faster training, presence/absence is calculated for every geographic point, and the training pattern is created. Either the entire grid of the reference period can be used for training, or a part of it can be selected randomly. In the training phase, the core of the program (the neural network) learns until one of the three previously set termination conditions is satisfied (see the next section).

FIGURE 1 The logic of the program: the subprogram’s connections to each other and to the user.
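The preprocessing step can be sketched as follows, assuming that standardization means rescaling every predictor column to zero mean and unit sample standard deviation and that the random selection is a simple sample without replacement. The function names and the example values are hypothetical, and the sketch does not handle a constant column, whose standard deviation of zero would cause a division by zero.

```python
import random
import statistics


def standardize_columns(rows):
    """Rescale each column to zero mean and unit sample standard deviation."""
    columns = list(zip(*rows))
    means = [statistics.mean(col) for col in columns]
    stdevs = [statistics.stdev(col) for col in columns]
    scaled = [
        [(value - m) / s for value, m, s in zip(row, means, stdevs)]
        for row in rows
    ]
    return scaled, means, stdevs


def pick_training_rows(rows, n_points=None, seed=0):
    """Return all rows, or a random subset of n_points rows if a number is given."""
    if n_points is None:
        return list(rows)
    return random.Random(seed).sample(list(rows), n_points)


# Hypothetical grid with two predictors (temperature, precipitation) per point.
grid = [[10.0, 100.0], [12.0, 140.0], [14.0, 180.0]]
scaled, means, stdevs = standardize_columns(grid)
subset = pick_training_rows(scaled, n_points=2)

print(means, stdevs)  # [12.0, 140.0] [2.0, 40.0]
print(scaled)
```

Standardization puts predictors measured in different units on a comparable scale, which is the usual reason it speeds up gradient-based training.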

The projecting and processing phases are run if the program was started with the parameter “Should projection be done?” checked. During the projection phase, the program iterates through the points of the projection grid(s), and the trained neural network makes a projection for each point. The projection values, typically within the (0;1) interval, are discretized to binary presence/absence data by a manually specified threshold. They can be preserved in a new column of the projection grid. The processing phase is responsible for drawing the potential distribution(s) based on the projection(s) of presence/absence. This is achieved by creating and aggregating Thiessen polygons (Voronoi cells). The detailed structure of the program can be seen in Figure 2.

FIGURE 2 The connection of the program’s functions (userOutput.AddMessage function is excluded, because almost all the other functions call it). The communication toward the user is displayed with dashed lines.
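The discretization step of the projection phase amounts to a single comparison per grid point. The sketch below assumes that a value equal to the threshold counts as presence and uses a default threshold of 0.5; neither detail is stated in the text, and the projection values are hypothetical. The creation of Thiessen polygons is an ArcGIS operation and is not covered here.

```python
def discretize(projection_values, threshold=0.5):
    """Map continuous network outputs to 1 (presence) or 0 (absence)."""
    return [1 if value >= threshold else 0 for value in projection_values]


# Hypothetical network outputs for four grid points.
values = [0.03, 0.48, 0.50, 0.91]
presence_absence = discretize(values)
strict = discretize(values, threshold=0.9)

print(presence_absence)  # [0, 0, 1, 1]
print(strict)            # [0, 0, 0, 1]
```

Raising the threshold shrinks the predicted distribution, so the manually chosen value directly affects the mapped area.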

APPLICATION

The program can be run (1) as a tool of the ArcToolbox, either manually or by Model Builder; or (2) as a script from the Python Window or from other scripts. The program needs several inputs and starting parameters. All the inputs and parameters can be set in the starting window of the tool (Figure 3) or as parameters of the function. After the program has been started, the user cannot affect its running. In the tool window, the user specifies whether both training and projection should be done or only one or the other. In the training-only or the training-and-prediction mode, the trained network can be saved to a given file. In the prediction-only mode, a previously saved network can be loaded from the given file.

FIGURE 3 The parameterization of the tool at launching.

In the case of training, the user should set the parameters of the ANN, that is, the number of hidden layers, the number of neurons per hidden layer, the learning rate, and the momentum factor. A point-type ESRI shapefile (grid) containing the climatic parameters in columns should be loaded as the input of the climatic data of the reference period (reference grid). The grid should contain only the climatic parameters and the FID/OID/Shape fields. The user should select the appropriate columns beforehand to avoid high collinearity (detailed information about the phenomenon is given by Dormann et al. 2013). Another input is the distribution of the species, formatted as an ESRI polygon-type shapefile (reference distribution). The program binds the reference distribution to the reference grid. The number of training points and the termination conditions of the training can also be set. If no training point number is given, the program uses the entire reference grid as a training pattern. The optional termination conditions are (1) the number of iterations; (2) the error value to be reached; and (3) the training duration in milliseconds.
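The three optional termination conditions listed above can be combined in a single training loop that stops at whichever configured condition is met first. The sketch below is an assumption about how such a loop might look, not the tool's code; train_until, its parameter names, and the callback train_epoch (which performs one iteration and returns the current error) are hypothetical.

```python
import time


def train_until(train_epoch, max_iterations=None, target_error=None, max_millis=None):
    """Call train_epoch() until the first configured condition is met.

    Returns (iterations_done, last_error, reason).
    """
    if max_iterations is None and target_error is None and max_millis is None:
        raise ValueError("at least one termination condition is required")
    start = time.monotonic()
    iterations = 0
    error = float("inf")
    while True:
        if max_iterations is not None and iterations >= max_iterations:
            return iterations, error, "iterations"
        if target_error is not None and error <= target_error:
            return iterations, error, "error"
        elapsed_ms = (time.monotonic() - start) * 1000.0
        if max_millis is not None and elapsed_ms >= max_millis:
            return iterations, error, "duration"
        error = train_epoch()
        iterations += 1


# Hypothetical error sequence standing in for real training iterations.
errors = iter([0.9, 0.5, 0.2, 0.1, 0.05])
by_error = train_until(lambda: next(errors), max_iterations=10, target_error=0.2)
by_count = train_until(lambda: 1.0, max_iterations=10)

print(by_error)  # (3, 0.2, 'error')
print(by_count)  # (10, 1.0, 'iterations')
```

The second call mirrors a configuration in which only the iteration limit is set, so training stops after exactly ten iterations regardless of the error.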

In the case of projecting, the user should open one or more projection grid(s) with a structure similar to that of the reference grid. The column order should match the order within the reference grid. Bias correction of the projection grids should be done beforehand if necessary. A checkbox controls whether the calculated presence/absence data, stored as 1/0 values in a new temporary column, are preserved in the projection grids. The user should select as many projection distributions as there are projection grids. The program binds the first grid to the first distribution, and so on. Nonexistent projection distributions are created, while existing ones are overwritten. The output of the program is the list of the projection distributions, which can be handled by Model Builder or by other scripts.
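The two consistency requirements above (identical column order, and one projection distribution per projection grid) are easy to check before a run. The following sketch shows one way a calling script might do so; the function names, column names, and file names are hypothetical, and the tool's own verifying phase may work differently.

```python
def check_column_order(reference_columns, projection_columns):
    """Raise ValueError unless both grids list the same predictors in the same order."""
    if list(reference_columns) != list(projection_columns):
        raise ValueError(
            "column mismatch: %r vs %r"
            % (list(reference_columns), list(projection_columns))
        )


def pair_grids_with_distributions(projection_grids, projection_distributions):
    """Pair the first grid with the first distribution, the second with the second, etc."""
    if len(projection_grids) != len(projection_distributions):
        raise ValueError("need exactly one projection distribution per projection grid")
    return list(zip(projection_grids, projection_distributions))


# Hypothetical column names and shapefile names.
reference_columns = ["T_JAN", "T_JUN", "TMIN_SEP", "P_JAN", "P_JUN"]
check_column_order(reference_columns, ["T_JAN", "T_JUN", "TMIN_SEP", "P_JAN", "P_JUN"])
pairs = pair_grids_with_distributions(
    ["grid_2011_2040.shp", "grid_2041_2070.shp"],
    ["larch_2011_2040.shp", "larch_2041_2070.shp"],
)
print(pairs)
```

A swapped column order would not raise an error inside a neural network, because the inputs are plain numbers; it would silently produce a wrong projection, which is why an explicit check is worthwhile.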

CASE STUDY OF LARIX DECIDUA

Aim

A modeling process based on the distribution of European larch (Larix decidua Mill.), including the input data types, the selected parameters, and the modeling result, is presented as a case study. Although a more sophisticated CEM and more adequate predictor variables (e.g., soil type, exposure, potential evapotranspiration) could better reflect the demands of the species, the only aim of the case study was to show how easy the application of the tool is.

Data Sources

The current continuous distribution map of European larch (Larix decidua Mill.), last updated in 2008, was derived from the EUFORGEN digital area database (Euforgen 2009), whereas the discrete (fragmented) observations were ignored. The distribution from 2009 was bound to the reference period of the climate data, because the studied species has a long life cycle and adapts slowly to the changing climate (Nadezda et al. 2006). Larix decidua is one of the most climate-sensitive tree species of the Alps (Carrer and Urbinati 2006).

The climatic data were obtained from the REMO regional climate model (Hewitt and Griggs 2004); the grid had a 25-km horizontal resolution. REMO is based on the ECHAM5 global climate model and uses the A1B scenario of the Intergovernmental Panel on Climate Change Special Report on Emissions Scenarios (IPCC SRES). The reference period was 1961–1990; the two prediction periods were 2011–2040 and 2041–2070. The entire European continent is within the domain of REMO; we used, however, only a part of the grid (25,724 of the 32,300 points). Five climatic predictors were selected and averaged over each of the three periods. June temperature and precipitation were found to be the best predictors of larch growth in the Southern Alps (Carrer and Urbinati 2006). Additionally, the mean temperature of January, the minimum temperature of September, and the precipitation sum of January were used as explanatory variables.

Input Parameters

The selected input parameters were the following. The neural network had 5 hidden layers with 15 neurons per layer. The learning rate and the momentum factor were set to be 0.1 and 0.01, respectively. The entire reference grid was given to the network to be used for training. Only one termination condition was set: the supervised training should be terminated after the tenth iteration.
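To give a sense of the size of the network configured above, the number of adjustable weights can be counted from the layer sizes: 5 input neurons (one per predictor), 5 hidden layers of 15 neurons, and 1 output neuron. Whether the tool uses a bias weight per neuron is not stated, so the sketch reports both variants; the function name is hypothetical.

```python
def count_weights(n_inputs, hidden_layers, neurons_per_layer, n_outputs=1, bias=True):
    """Number of connection weights (plus one bias per neuron if bias is True)."""
    sizes = [n_inputs] + [neurons_per_layer] * hidden_layers + [n_outputs]
    extra = 1 if bias else 0
    return sum((n_in + extra) * n_out for n_in, n_out in zip(sizes, sizes[1:]))


with_bias = count_weights(5, 5, 15)
without_bias = count_weights(5, 5, 15, bias=False)

print(with_bias)     # 1066 = 6*15 + 4*(16*15) + 16*1
print(without_bias)  # 990  = 5*15 + 4*(15*15) + 15*1
```

Either way, the network has roughly a thousand free parameters, about twice the number of observed presence points in the grid.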

Result and Discussion

An extract of the modeling results can be seen in Figure 4. The modeled potential distributions include parts of Norway and Sweden, which are not displayed. The modeled potential distribution for the reference period shows great similarity to the observed distribution. Although more similarity could be reached with a longer training phase, that could result in an overfitted model. The Cohen’s kappa (Cohen 1960) value of the model result for the reference period was 0.4905.

FIGURE 4 Current distribution (dotted), modeled potential distribution in the reference period (grey), and predicted potential distribution in the periods of 2011–2040 (SW–NE hatch) and 2041–2070 (NW–SE hatch) of European larch (Larix decidua Mill.), zoomed to Central Europe.
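Cohen's kappa compares the observed agreement between two binary maps with the agreement expected by chance. The sketch below shows the calculation on a small hypothetical example; it cannot reproduce the value of 0.4905 reported above, because the underlying confusion matrix is not given in the article.

```python
def cohens_kappa(observed, predicted):
    """Cohen's kappa for two equally long sequences of 0/1 labels."""
    n = len(observed)
    tp = sum(1 for o, p in zip(observed, predicted) if o == 1 and p == 1)
    tn = sum(1 for o, p in zip(observed, predicted) if o == 0 and p == 0)
    fp = sum(1 for o, p in zip(observed, predicted) if o == 0 and p == 1)
    fn = sum(1 for o, p in zip(observed, predicted) if o == 1 and p == 0)
    p_observed = (tp + tn) / n
    # Chance agreement from the marginal frequencies of both maps.
    p_chance = ((tp + fn) * (tp + fp) + (tn + fp) * (tn + fn)) / (n * n)
    if p_chance == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (p_observed - p_chance) / (1.0 - p_chance)


# Hypothetical presence/absence at ten grid points.
observed = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
predicted = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]
kappa = cohens_kappa(observed, predicted)

print(round(kappa, 4))  # 0.5238
```

In this example 8 of 10 points agree (0.80), while chance alone would give 0.58, so kappa is (0.80 - 0.58) / (1 - 0.58). Because presences are rare in the case-study grid, plain accuracy would be dominated by absences, which is one reason a chance-corrected measure is reported.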

The ratio of the presence data in the entire grid was originally 1.89% (486 of 25,724 points). The modeled ratios in 1961–1990, 2011–2040, and 2041–2070 were 2.97%, 2.60%, and 2.40%, respectively. A retraction of the distribution in the Northern Alps is predicted. The model of our previous research (Bede-Fazekas 2013) resulted in a much larger potential distribution for the reference period and predicted a more significant retraction in the Alps.
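The ratios above can be checked and put into perspective with a few lines of arithmetic. The sketch assumes that the three modeled ratios are percentages of the same 25,724-point grid; the point counts it derives are therefore approximate back-calculations, not figures reported in the article.

```python
total_points = 25724
observed_presences = 486

observed_percent = 100.0 * observed_presences / total_points

# Modeled presence ratios (percent of grid points) per period, from the text.
modeled_percent = {"1961-1990": 2.97, "2011-2040": 2.60, "2041-2070": 2.40}

# Approximate number of presence points implied by each ratio.
approx_points = {
    period: round(total_points * percent / 100.0)
    for period, percent in modeled_percent.items()
}

# Relative change of the modeled ratio from the reference to the last period.
relative_change = (
    (modeled_percent["2041-2070"] - modeled_percent["1961-1990"])
    / modeled_percent["1961-1990"]
)

print(round(observed_percent, 2))         # 1.89
print(approx_points)                      # about 764, 669, and 617 points
print(round(100.0 * relative_change, 1))  # -19.2
```

Under this assumption, the modeled potential distribution for the reference period covers roughly 1.6 times as many points as the observed distribution, and it shrinks by about a fifth by 2041–2070.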

SUMMARY

The application, applied methods, and example model results of the newly developed ANN Distribution ArcGIS tool are reported to introduce this tool to the community of ecologists. The application of the program is simple because no data transformation, presence/absence calculation, or data migration to statistical software is needed. The program was optimized for the typical data formats of CEM. As far as the authors know, the presented program is the first simple ANN-based CEM tool written for ArcGIS.

Although we stressed the benefits of the tool, we should not forget to mention the challenges. ANN is a black-box method, which is not able to help ecologists understand the underlying processes and factors that drive the distribution of species; the method can be applied specifically for modeling. This version of the tool lacks automatic parameter setting and a regularization scheme, which could prevent the model from becoming overfitted (no statistical measures are calculated during the training phase and, therefore, no automatic calibration can be achieved).

The concept and aim of the program are complex issues and leave room for many potential developments. A future version of this program would (1) handle probabilities rather than (or in addition to) binary presence/absence; (2) continuously model the reference period to calculate ROC/AUC or Cohen’s kappa values and apply them for early-stopping regularization (calibration); (3) dynamically change the discretization boundary; and (4) optimize the projecting and processing phases for multicore processors.
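As an illustration of the ROC/AUC measure mentioned among the planned developments, the area under the ROC curve can be computed directly from continuous projection values as the probability that a randomly chosen presence point scores higher than a randomly chosen absence point. This is a sketch of the measure only, not of any planned implementation; the function name and the example data are hypothetical.

```python
def roc_auc(labels, scores):
    """AUC as the share of (presence, absence) pairs ranked correctly; ties count half."""
    presences = [s for label, s in zip(labels, scores) if label == 1]
    absences = [s for label, s in zip(labels, scores) if label == 0]
    if not presences or not absences:
        raise ValueError("both presence and absence points are required")
    wins = sum(
        1.0 if p > a else 0.5 if p == a else 0.0
        for p in presences
        for a in absences
    )
    return wins / (len(presences) * len(absences))


# Hypothetical observed presence/absence and continuous network outputs.
labels = [1, 1, 0, 0, 0]
scores = [0.9, 0.4, 0.5, 0.3, 0.1]
auc = roc_auc(labels, scores)

print(round(auc, 4))  # 0.8333
```

Unlike Cohen's kappa, this measure needs no discretization threshold, which fits the plan to handle probabilities. The pairwise loop is quadratic in the number of points, so a rank-based formulation would be preferable for large grids.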

FUNDING

The research was supported by the projects TÁMOP-4.2.1/B-09/1/KMR-2010-0005 and TÁMOP 4.2.2.A-1/1/KONV-2012-0064. The ENSEMBLES data used in this work were funded by the EU FP6 Integrated Project ENSEMBLES (Contract number 505539), whose support is gratefully acknowledged.

Notes

¹ ArcGIS is a trademark product of the Environmental Systems Research Institute (ESRI) Inc.

REFERENCES

ArcGIS. 2013. ANNDistribution: A tool for modeling the climate envelope with feed-forward artificial neural network. Available at www.arcgis.com/home/item.html?id=2c6a49d147b94503b28ff6342e84b4be (accessed October 6, 2013).
Bede-Fazekas, Á. 2013. Negative impact of climate change on the distribution of some conifers. Hadtudomány 23(Suppl.):234–243.
Box, E. O. 1981. Macroclimate and plant forms: An introduction to predictive modelling in phytogeography. The Hague: Dr. W. Junk.
Carpenter, G. A., S. Gopal, S. Macomber, S. Martens, C. E. Woodcock, and J. Franklin. 1999. A neural network method for efficient vegetation mapping. Remote Sensing of Environment 70(3):326–338.
Carrer, M., and C. Urbinati. 2006. Long-term change in the sensitivity of tree-ring growth to climate forcing in Larix decidua. New Phytologist 170(4):861–872.
Cohen, J. 1960. A coefficient of agreement for nominal scales. Educational and Psychological Measurement 20(1):37–46.
Czúcz, B. 2010. Modelling the impact of climate change on natural habitats in Hungary (PhD Thesis, Corvinus University of Budapest, Budapest, Hungary).
Dormann, C. F., J. Elith, S. Bacher, C. Buchmann, G. Carl, G. Carré, J. R. García Marquéz, B. Gruber, B. Lafourcade, P. J. Leitão, T. Münkemüller, C. McClean, P. E. Osborne, B. Reineking, B. Schröder, A. K. Skidmore, D. Zurell, and S. Lautenbach. 2013. Collinearity: A review of methods to deal with it and a simulation study evaluating their performance. Ecography 36(1):27–46.
Elith, J., and J. R. Leathwick. 2009. Species distribution models: Ecological explanation and prediction across space and time. Annual Review of Ecology, Evolution, and Systematics 40(1):677–697.
Elith, J., J. R. Leathwick, and T. Hastie. 2008. A working guide to boosted regression trees. Journal of Animal Ecology 77(4):802–813.
Euforgen. 2009. Distribution map of European larch (Larix decidua). Bioversity International, Rome, Italy. www.euforgen.org/distribution_maps.html (accessed April 1, 2013).
Guisan, A., and N. E. Zimmermann. 2000. Predictive habitat distribution models in ecology. Ecological Modelling 135(2–3):147–186.
Harrison, S., E. I. Damschen, and J. B. Grace. 2010. Ecological contingency in the effects of climatic warming on forest herb communities. Proceedings of the National Academy of Sciences USA 107(45):19362–19367.
Hewitt, C. D., and D. J. Griggs. 2004. Ensembles-based predictions of climate changes and their impacts. Eos 85(52):566.
Hijmans, R. J., and C. H. Graham. 2006. The ability of climate envelope models to predict the effect of climate change on species distributions. Global Change Biology 12(12):2272–2281.
Hilbert, D. W., and B. Ostendorf. 2001. The utility of artificial neural networks for modelling the distribution of vegetation in past, present and future climates. Ecological Modelling 146(1–3):311–327.
Hilbert, D. W., and J. Van Den Muyzenberg. 1999. Using an artificial neural network to characterize the relative suitability of environments for forest types in a complex tropical vegetation mosaic. Diversity and Distributions 5(6):263–274.
Ibáñez, I., J. S. Clark, M. C. Dietze, K. Feeley, M. Hersh, S. Ladeau, A. Mcbride, N. E. Welch, and M. S. Wolosin. 2006. Predicting biodiversity change: Outside the climate envelope, beyond the species-area curve. Ecology 87(8):1896–1906.
Lek, S., M. Delacoste, P. Baran, I. Dimopoulos, J. Lauga, and S. Aulagnier. 1996. Application of neural networks to modelling non linear relationships in ecology. Ecological Modelling 90(1):39–52.
Nadezda, M. T., E. R. Gerald, and I. P. Elena. 2006. Impacts of climate change on the distribution of Larix spp. and Pinus sylvestris and their climatypes in Siberia. Mitigation and Adaptation Strategies for Global Change 11(4):861–882.
Ogawa-Onishi, Y., P. M. Berry, and N. Tanaka. 2010. Assessing the potential impacts of climate change and their conservation implications in Japan: A case study of conifers. Biological Conservation 143(7):1728–1736.
Özesmi, S. L., and U. Özesmi. 1999. An artificial neural network approach to spatial habitat modelling with interspecific interaction. Ecological Modelling 116(1):15–31.
Özesmi, S. L., C. O. Tan, and U. Özesmi. 2006. Methodological issues in building, training, and testing artificial neural networks in ecological applications. Ecological Modelling 195(1–2):83–93.
Pearson, R. G., T. P. Dawson, P. M. Berry, and P. A. Harrison. 2002. SPECIES: A spatial evaluation of climate impact on the envelope of species. Ecological Modelling 154(3):289–300.
Pickett, S. T. A. 1989. Space-for-time substitution as an alternative to long-term studies. In Long-term studies in ecology: Approaches and alternatives, ed. G. E. Likens, 110–135. New York, NY, USA: Springer.
Picton, P. D. 2000. Neural networks. Basingstoke, UK: Palgrave Macmillan.
Van Leeuwen, B. 2012. Artificial neural networks and geographic information systems for inland excess water classification (PhD Thesis, University of Szeged, Szeged, Hungary).
Van Leeuwen, B., G. Mezősi, Z. Tobak, J. Szatmári, and K. Barta. 2012. Identification of inland excess water floodings using an artificial neural network. Carpathian Journal of Earth and Environmental Sciences 7(4):173–180.
