Geospatial data mining for digital raster mapping

Bruce K. Wylie, U.S. Geological Survey, Earth Resources Observation and Science (EROS) Center, Science Division, 47914 252 St., Sioux Falls, SD 57198, USA. Corresponding author.

http://orcid.org/0000-0002-7374-1083

Neal J. Pastick, Stinger Ghaffarian Technologies, Inc., Contractor to U.S. Geological Survey Earth Resources Observation and Science (EROS) Center, 47914 252 St., Sioux Falls, SD 57198, USA; Department of Forest Resources, University of Minnesota, St. Paul, MN 55108, USA

http://orcid.org/0000-0002-4021-4623

Joshua J. Picotte, ASRC Federal InuTeq, LLC, Contractor to U.S. Geological Survey Earth Resources Observation and Science (EROS) Center, 47914 252 St., Sioux Falls, SD 57198, USA

http://orcid.org/0000-0002-4021-4623

Carol A. Deering, Innovate!, Inc., Contractor to U.S. Geological Survey Earth Resources Observation and Science (EROS) Center, 47914 252 St., Sioux Falls, SD 57198, USA

http://orcid.org/0000-0003-3565-6264

Figures & data

Table 1. Yearly publications of various data mining algorithms in the literature as indexed by Scopus and Web of Science. The title/abstract/keyword search was conducted on 18 October 2017, with the query ((“classification tree” or “regression tree” or “classification and regression tree” or CRT or “decision tree” or “random forest” or “neural net” or “neural network” or “support vector” or “k-means”) AND (“remote sensing” or “remotely sensed” or GIS or “geographic information science” or “satellite data” or “satellite imagery” or “satellite image”)). The query was filtered in each index to capture articles or articles in press published in the years 2013–2017. Duplicate records were removed. Some publications met the criteria of more than one data mining method.


Figure 1. Hypothetical classification or regression tree predicting parameter Z from the continuous variables W, Q, G, and X. The four ellipses are hierarchical splits, each chosen by a cost function that stratifies the data. Example cost functions include minimizing the squared error, maximizing split purity with the Gini index (Brownlee 2016), and combining high simplicity with a low absolute difference error, which gives low sensitivity to outliers (Gu et al. 2016). The rectangles are the terminal node predictions. If Z is categorical, the majority class is predicted (classification tree). If Z is continuous, the prediction can be the node mean or the output of a simple or multiple regression equation (regression tree).
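The Gini-based split selection mentioned in the Figure 1 caption can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`gini`, `best_split`, `majority_class`), the sample values for W, and the class labels for Z are all hypothetical, and only a single split on one continuous variable is searched.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 means perfectly pure)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_split(values, labels):
    """Return (threshold, weighted_gini) for the best binary split on one variable.

    Candidate thresholds are midpoints between consecutive distinct values.
    The threshold is None if no split improves on the unsplit impurity.
    """
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    best = (None, gini(labels))
    for i in range(1, n):
        lo, hi = pairs[i - 1][0], pairs[i][0]
        if lo == hi:
            continue  # no threshold can separate identical values
        threshold = (lo + hi) / 2.0
        left = [lab for v, lab in pairs if v <= threshold]
        right = [lab for v, lab in pairs if v > threshold]
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        if score < best[1]:
            best = (threshold, score)
    return best


def majority_class(labels):
    """Terminal-node prediction for a classification tree."""
    return Counter(labels).most_common(1)[0][0]


# Hypothetical training data: continuous predictor W and categorical target Z.
w = [0.1, 0.2, 0.3, 0.7, 0.8, 0.9]
z = ["grass", "grass", "grass", "forest", "forest", "forest"]

threshold, impurity = best_split(w, z)
left_prediction = majority_class([lab for v, lab in zip(w, z) if v <= threshold])
print(threshold, impurity, left_prediction)
```

A full tree would apply `best_split` recursively across all predictors (W, Q, G, and X in the figure) and stop at terminal nodes; for a continuous Z the impurity would be replaced by an error measure such as squared error.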

Figure 2. Optimization of the number of rules in a regression tree model to minimize overfitting tendencies and test error magnitudes. Score is (test MAE – training MAE) + test MAE, where MAE is the mean absolute error, and is a relative measure of model overfitting.
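The score defined in the Figure 2 caption can be worked through with a short sketch. The MAE values below are invented for illustration only, and choosing the rule count with the lowest score is an assumption about how the optimization is read off the figure, not a procedure stated in the caption.

```python
def mean_absolute_error(observed, predicted):
    """Mean absolute error between two equal-length sequences."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


def overfit_score(train_mae, test_mae):
    """Score from the caption: (test MAE - training MAE) + test MAE."""
    return (test_mae - train_mae) + test_mae


# Hypothetical (training MAE, test MAE) pairs keyed by number of rules.
candidates = {
    5: (4.0, 4.5),
    10: (3.0, 3.6),
    20: (2.0, 3.9),
    40: (1.0, 4.4),
}

scores = {rules: overfit_score(tr, te) for rules, (tr, te) in candidates.items()}
best_rules = min(scores, key=scores.get)  # assumed selection rule: lowest score
print(scores, best_rules)
```

In this made-up example the training MAE keeps falling as rules are added while the test MAE starts to rise, so the score penalizes the larger models twice: once for the widening gap and once for the test error itself.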

Figure 3. a) Normalized difference vegetation index (NDVI * 100) of Landsat 8 image acquired on 25 May 2016; b) NDVI synthetic Landsat image for 25 May 2016, generated with a regression tree model driven by Landsat OLI and Sentinel (MSI) time series data; c) Difference between real and synthetic images; and d) histogram of NDVI differences between real and synthetic images. Areas in white (no data) represent cloud, shadow, snow/ice, or water as identified from a decision tree masking model.
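The quantities shown in Figure 3 can be sketched on a handful of pixels. This is a plain-Python illustration under stated assumptions: NDVI is computed with the standard definition (NIR − red) / (NIR + red), the difference is taken as real minus synthetic (the caption does not state the sign convention), and the reflectance values, synthetic NDVI values, mask, and bin width are all hypothetical. A real workflow would use an array or raster library rather than lists.

```python
import math


def ndvi_x100(red, nir):
    """NDVI scaled by 100; returns None when the denominator is zero."""
    total = nir + red
    if total == 0:
        return None
    return 100.0 * (nir - red) / total


def difference(real, synthetic, mask):
    """Per-pixel real minus synthetic; masked or missing pixels stay None."""
    return [
        None if m or r is None or s is None else r - s
        for r, s, m in zip(real, synthetic, mask)
    ]


def histogram(values, bin_width):
    """Count non-missing values per bin, keyed by the bin's left edge."""
    counts = {}
    for v in values:
        if v is None:
            continue
        left_edge = math.floor(v / bin_width) * bin_width
        counts[left_edge] = counts.get(left_edge, 0) + 1
    return dict(sorted(counts.items()))


# Hypothetical scaled reflectances for three pixels.
red = [1000, 2000, 500]
nir = [3000, 2000, 4500]
real = [ndvi_x100(r, n) for r, n in zip(red, nir)]

# Hypothetical synthetic NDVI * 100 and a no-data mask (True = cloud, water, etc.).
synthetic = [48.0, 1.0, 83.0]
mask = [False, False, True]

diff = difference(real, synthetic, mask)
hist = histogram(diff, bin_width=2)
print(real, diff, hist)
```

Masked pixels are carried through as `None` so that they are excluded from the histogram, mirroring the white no-data areas described in the caption.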
