MODIS probabilistic cloud masking over the Amazonian evergreen tropical forests: a comparison of machine learning-based methods

Pages 185-210 | Received 08 Mar 2019, Accepted 18 Jun 2019, Published online: 09 Jul 2019
 

ABSTRACT

Amazonian tropical forests play a significant role in the global water, carbon and energy cycles. Satellite remote sensing is a feasible means of monitoring these forests, and the Moderate Resolution Imaging Spectroradiometer (MODIS) is among the major tools for studying this region. Nevertheless, operational MODIS surface variable retrievals have been reported to be affected by cloud contamination. Proper cloud masking is therefore a major consideration for ensuring accuracy when analysing the current and future status of Amazonian tropical forests. The present study evaluates the potential of supervised machine learning algorithms to overcome this issue. Compared with the global operational MODIS cloud masking algorithms (MYD35 and the Multi-Angle Implementation of Atmospheric Correction Algorithm (MAIAC)), these algorithms have the advantage that they can be optimized to represent the local cloud conditions of the region. The models considered were Gaussian Naïve Bayes (GNB), Linear Discriminant Analysis (LDA), Quadratic Discriminant Analysis (QDA), Random Forests (RF), Support Vector Machine (SVM) and Multilayer Perceptron (MLP). These algorithms provide a continuous measure of cloud masking uncertainty (i.e. an estimate of the probability that each pixel belongs to the clear or the cloudy class) and can therefore be used for probabilistic cloud masking. The required truth reference dataset (a priori knowledge) was obtained by collocating Cloud Profiling Radar (CPR) and Cloud-Aerosol Lidar with Orthogonal Polarization (CALIOP) observations with MODIS observations. Model performance was tested using three independent datasets: 1) collocated CPR/CALIOP and MODIS data, 2) manually classified MODIS images and 3) in-situ ground data. For the satellite image and in-situ tests, results were additionally compared with the current operational MYD35 (version 6.1) and MAIAC cloud masking algorithms.
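To make the idea of probabilistic cloud masking concrete, the following is a minimal sketch of a Gaussian Naïve Bayes classifier that returns a cloud probability per pixel. It is not the authors' implementation: the two features (a reflectance and a brightness temperature in kelvin), the training values, the function names `fit_gnb` and `cloud_probability`, and the label convention (0 = clear, 1 = cloudy) are all assumptions made for illustration.

```python
import math

def fit_gnb(samples, labels):
    """Estimate the prior, per-feature means and variances of each class."""
    model = {}
    n_total = len(samples)
    for cls in set(labels):
        rows = [x for x, y in zip(samples, labels) if y == cls]
        n = len(rows)
        n_feat = len(rows[0])
        means = [sum(r[j] for r in rows) / n for j in range(n_feat)]
        # A small floor keeps every variance strictly positive.
        variances = [
            max(sum((r[j] - means[j]) ** 2 for r in rows) / n, 1e-9)
            for j in range(n_feat)
        ]
        model[cls] = (n / n_total, means, variances)
    return model

def cloud_probability(model, pixel, cloudy_label=1):
    """Posterior probability that a pixel belongs to the cloudy class."""
    log_post = {}
    for cls, (prior, means, variances) in model.items():
        log_p = math.log(prior)
        for x, mu, var in zip(pixel, means, variances):
            # Log of the univariate Gaussian density for this feature.
            log_p += -0.5 * math.log(2 * math.pi * var) - (x - mu) ** 2 / (2 * var)
        log_post[cls] = log_p
    # Normalise relative to the largest log-posterior for numerical stability.
    top = max(log_post.values())
    weights = {c: math.exp(v - top) for c, v in log_post.items()}
    return weights[cloudy_label] / sum(weights.values())

# Hypothetical training pixels: (reflectance, brightness temperature in K).
train_x = [(0.05, 295.0), (0.08, 293.0), (0.06, 297.0),
           (0.55, 250.0), (0.62, 245.0), (0.48, 258.0)]
train_y = [0, 0, 0, 1, 1, 1]  # assumed convention: 0 = clear, 1 = cloudy

model = fit_gnb(train_x, train_y)
p_dark_warm = cloud_probability(model, (0.07, 294.0))
p_bright_cold = cloud_probability(model, (0.58, 248.0))
```

A binary mask can be derived by thresholding the probability (for instance at 0.5, an arbitrary choice here), while the probability itself is retained as the per-pixel uncertainty measure.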

Satellite image and in-situ testing results show that machine learning algorithms can improve on the performance of operational MODIS cloud masking over the region. MYD35 and MAIAC tend to underestimate and overestimate the cloud cover over the study region, respectively. Among the models considered, the probabilistic algorithms (LDA, GNB and, to a lesser extent, QDA) performed better than the RF, SVM and MLP machine learning algorithms, as they coped better with the limited range of viewing conditions that resulted from collocating MODIS and CPR/CALIOP observations. In particular, the best performance was obtained for LDA, with a difference in Kappa coefficient (model minus operational MODIS algorithm) of 0.293/0.155 (MYD35/MAIAC, respectively) in the satellite image testing. The worst performance was obtained for MLP, with a difference in Kappa coefficient of 0.175/0.037. For in-situ testing, the models' overall accuracy (OA) and Kappa coefficient values are higher than the respective MYD35/MAIAC values. The models are computationally efficient (swath image calculation time between 0.37 and 9.49 s) and can thus be implemented in remote-sensing vegetation retrieval processing chains over the Amazonian tropical forests. LDA stands out as the best candidate because it combines the highest accuracy with the lowest computational cost.
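The overall accuracy (OA) and Kappa coefficient reported above can be computed from a reference mask and a predicted mask as in the sketch below. The masks are invented toy values rather than data from the study, and the names `overall_accuracy`, `cohen_kappa`, `model_mask` and `operational_mask` are hypothetical. Kappa is computed with the standard definition, (p_o - p_e) / (1 - p_e), where p_o is the observed agreement and p_e the agreement expected by chance.

```python
def overall_accuracy(reference, predicted):
    """Fraction of pixels where the predicted label matches the reference."""
    matches = sum(1 for r, p in zip(reference, predicted) if r == p)
    return matches / len(reference)

def cohen_kappa(reference, predicted):
    """Cohen's Kappa: agreement corrected for chance agreement."""
    n = len(reference)
    p_o = overall_accuracy(reference, predicted)
    classes = set(reference) | set(predicted)
    # Chance agreement from the marginal class frequencies of both masks.
    p_e = sum(
        (reference.count(c) / n) * (predicted.count(c) / n) for c in classes
    )
    if p_e == 1.0:
        return 1.0  # both masks contain a single identical class
    return (p_o - p_e) / (1.0 - p_e)

# Toy masks (1 = cloudy, 0 = clear); illustrative values only.
reference        = [1, 1, 1, 1, 0, 0, 0, 0, 1, 0]
model_mask       = [1, 1, 1, 0, 0, 0, 0, 1, 1, 0]
operational_mask = [1, 0, 0, 0, 0, 0, 0, 0, 1, 0]  # underestimates cloud

kappa_model = cohen_kappa(reference, model_mask)
kappa_operational = cohen_kappa(reference, operational_mask)
# Difference in Kappa, taken as model minus operational algorithm.
kappa_difference = kappa_model - kappa_operational
```

A positive `kappa_difference` indicates that the model mask agrees with the reference better than the operational mask does, which is the sense in which the differences in the abstract are quoted.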

Acknowledgements

This study was supported by Ministerio de Educación, Cultura y Deporte (grant FPU14/06502), and Ministerio de Economía y Competitividad (PCIN 2015-232, ESP2015-71894-R). The authors thank the MODIS and MAIAC data teams for making data publicly available, and the U.S. Department of Energy Atmospheric Radiation Measurement (ARM) Climate Research Facility for providing the in situ Total Sky Imager data used in this study (https://www.arm.gov/). Anonymous reviewers are also acknowledged for their constructive comments and suggestions, which helped to improve the quality of the paper.

Disclosure statement

No potential conflict of interest was reported by the authors.

Supplementary material

Supplemental data for this article can be accessed here.

Additional information

Funding

This work was supported by the Ministerio de Economía y Competitividad [PCIN 2015-232, ESP2015-71894-R]; Ministerio de Educación y Formación Profesional (Spanish Government) [FPU14/06502].
