Original Articles

Impact of training and validation sample selection on classification accuracy and accuracy assessment when using reference polygons in object-based classification

Pages 6914-6930 | Received 14 Nov 2012, Accepted 24 Feb 2013, Published online: 25 Jun 2013
 

Abstract

Reference polygons are homogeneous areas that aim to provide the best available assessment of ground condition that the user can identify. Delineation of such polygons provides a convenient and efficient approach for researchers to identify training and validation data for supervised classification. However, the spatial dependence of training and validation data should be taken into account when the two data sets are obtained from a common set of reference polygons. We investigate the effect on classification accuracy and the accuracy estimates derived from the validation data when training and validation data are obtained from four selection schemes. The four schemes are composed of two sampling designs (simple random and systematic) and two methods for splitting sample points between training and validation (validation points in separate polygons from training points and validation points and training points split within the same polygons). A supervised object-based classification of the study region was repeated 30 times for each selection scheme. The selection scheme did not impact classification accuracy, but estimates of overall (OA), user's (UA), and producer's (PA) accuracies produced from the validation data overestimated accuracy for the study region by about 10%. The degree of overestimation was slightly greater when the validation sample points were allowed to be in the same polygons as the training data points. These results suggest that accuracy estimates derived from splitting training and validation within a limited set of reference polygons should be regarded with suspicion. To be fully confident in the validity of the accuracy estimates, additional validation sample points selected from the region outside the reference polygons may be needed to augment the validation sample selected from the reference polygons.
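
The two splitting methods and the three accuracy measures named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names, the `(point_id, polygon_id)` tuple layout, the 50% validation fraction, and the convention that error-matrix rows are map classes and columns are reference classes are all assumptions made for illustration. The accuracies are computed from raw counts, whereas the acknowledgements indicate that the article used error matrix weights in post-stratified estimation, which this sketch does not reproduce.

```python
import random
from collections import defaultdict


def split_separate_polygons(points, val_fraction=0.5, seed=0):
    """Assign whole polygons to either training or validation.

    points: list of (point_id, polygon_id) tuples (hypothetical layout).
    No polygon contributes points to both sets.
    """
    rng = random.Random(seed)
    polygons = sorted({poly for _, poly in points})
    rng.shuffle(polygons)
    n_val = round(len(polygons) * val_fraction)
    val_polys = set(polygons[:n_val])
    train = [p for p in points if p[1] not in val_polys]
    val = [p for p in points if p[1] in val_polys]
    return train, val


def split_within_polygons(points, val_fraction=0.5, seed=0):
    """Split the points of each polygon between training and validation.

    The same polygon can supply both training and validation points.
    """
    rng = random.Random(seed)
    by_poly = defaultdict(list)
    for p in points:
        by_poly[p[1]].append(p)
    train, val = [], []
    for poly in sorted(by_poly):
        members = list(by_poly[poly])
        rng.shuffle(members)
        n_val = round(len(members) * val_fraction)
        val.extend(members[:n_val])
        train.extend(members[n_val:])
    return train, val


def accuracies(matrix):
    """Unweighted OA, UA, and PA from a square error matrix of counts.

    matrix[i][j] is the number of validation points mapped as class i
    whose reference class is j (assumed convention).
    """
    k = len(matrix)
    total = sum(sum(row) for row in matrix)
    diag = [matrix[i][i] for i in range(k)]
    row_sums = [sum(matrix[i]) for i in range(k)]
    col_sums = [sum(matrix[r][j] for r in range(k)) for j in range(k)]
    oa = sum(diag) / total
    # A class with no points yields NaN rather than a division error.
    ua = [d / s if s else float("nan") for d, s in zip(diag, row_sums)]
    pa = [d / s if s else float("nan") for d, s in zip(diag, col_sums)]
    return oa, ua, pa
```

With the first function, the training and validation sets share no polygon; with the second, every polygon that has at least two points contributes to both sets, which is the situation the abstract associates with slightly greater overestimation of accuracy.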

Acknowledgements

The lidar data and the reference land-cover data set used for generating error matrix weights in post-stratified estimation were obtained from the NYView website (http://nyview.esf.edu). The reference land-cover data set was created by the Spatial Analysis Laboratory of the University of Vermont (SALUVM) using an object-based image analysis technique, and its classes matched the seven classes in this study. We would like to thank the two anonymous reviewers who provided valuable comments for improving this article.
