Research Article

Mahalanobis distance based accuracy prediction models for Sentinel-2 Image Scene Classification

Pages 6001-6026 | Received 13 Aug 2021, Accepted 25 Nov 2021, Published online: 09 Jan 2022
ABSTRACT

Over the years, owing to the growing availability of labelled datasets, supervised machine learning has become a central component of problem-solving. Examples include classifiers for applications such as image/speech recognition, traffic prediction, product recommendation, virtual personal assistants (VPAs), online fraud detection, and many more. The performance of these classifiers depends heavily on the training dataset, and consequently, without human intervention or true labels, their performance on unseen observations remains unknown. Researchers have used statistical distances to assess a model's goodness-of-fit and to compare multiple independent models. Nonetheless, given a train-test split and different classifiers built over the training set, the question 'is it possible to estimate the prediction error using the relation between the training and test sets?' remains open. In this article, we propose a generalized statistical distance-based method that measures the prediction uncertainty at a new query point. Specifically, we propose a Mahalanobis distance-based Evidence Function Model to measure the misclassification produced by K-Nearest Neighbours (KNN), Extra Trees (ET), and Convolutional Neural Network (CNN) models when classifying Sentinel-2 images into six scene classes (Water, Shadow, Cirrus, Cloud, Snow, Other). The performance of the proposed method was assessed over two different datasets: (i) the test set, with an overall mean prediction uncertainty detection of 62.99%, 29.80%, and 31.51%, leading to a mean micro-F1 performance of 67.89%, 39.30%, and 38.29% for KNN, ET, and CNN, respectively; and (ii) a water-body set, with prediction uncertainty detection of 22.27%, 42.08%, and 27.67%, leading to a micro-F1 performance of 34.70%, 58.96%, and 43.32%, respectively.
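The abstract does not detail the Evidence Function Model itself, but the core quantity it builds on, the Mahalanobis distance from a query point to a class's training distribution, can be sketched as follows. This is an illustrative toy example, not the paper's implementation: the class names, dimensionality, and synthetic statistics are assumptions for demonstration only.

```python
import numpy as np

def mahalanobis_distance(x, mean, cov_inv):
    """Mahalanobis distance between a query point x and a class distribution,
    given the class mean and the inverse of its covariance matrix."""
    diff = x - mean
    return float(np.sqrt(diff @ cov_inv @ diff))

def class_distances(x, class_stats):
    """Distance from x to each class's training distribution."""
    return {label: mahalanobis_distance(x, mu, cov_inv)
            for label, (mu, cov_inv) in class_stats.items()}

# Toy example: two synthetic "classes" with well-separated centres.
rng = np.random.default_rng(0)
stats = {}
for label, centre in [("Water", np.zeros(3)), ("Cloud", np.ones(3) * 5.0)]:
    samples = rng.normal(loc=centre, scale=1.0, size=(500, 3))
    mu = samples.mean(axis=0)
    cov_inv = np.linalg.inv(np.cov(samples, rowvar=False))
    stats[label] = (mu, cov_inv)

query = np.array([0.1, -0.2, 0.3])   # close to the "Water" centre
d = class_distances(query, stats)
nearest = min(d, key=d.get)          # smallest distance = most plausible class
```

Under such a scheme, a query whose distance to its predicted class is large relative to the training distribution can be flagged as an uncertain prediction; the paper's evidence function presumably formalizes this thresholding.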

Author contributions

Investigation, Methodology, and Writing – K.R.; Conceptualization, Supervision, and Validation – T.G. and L.R.; and Validation – M.B. All authors have read and agreed to the published version of the manuscript.

Disclosure statement

No potential conflict of interest was reported by the author(s).

Abbreviations

The following abbreviations are used in this manuscript:

Notes

1. These two images are unlabelled, so we were not able to report the result as an exact percentage.

Additional information

Funding

This work was funded by NIIAA (Núcleo de Investigação em Inteligência Artificial em Agricultura), a project financed by the Alentejo 2020 programme [reference ALT20-03-0247-FEDER-036981].
