Research Article

A LightGBM-based landslide susceptibility model considering the uncertainty of non-landslide samples

Article: 2213807 | Received 22 Mar 2023, Accepted 10 May 2023, Published online: 24 May 2023
 

Abstract

The quality of samples is crucial in constructing a data-driven landslide susceptibility model. This article aims to construct a data-driven landslide susceptibility model that takes into account the selection of non-landslide samples. First, 21 conditioning factors are selected from four categories: topography and landform, geological conditions, environmental conditions, and human activities. Grid units with 30 m resolution are established by combining 942 historical landslide events in the study area. Second, non-landslide samples are selected using both the traditional method and the information quantity method. Two landslide susceptibility models are established using the Bayesian optimization-LightGBM model. The accuracy of the models is evaluated by a significance test and the area under the curve (AUC). Finally, the SHAP algorithm is used to analyse the internal mechanism of the model's decision-making. With non-landslide samples selected by the information quantity method, the very high and high susceptibility areas identified by the LightGBM model contain 77.92% of all landslides. Additionally, the AUC values of the test set and the training set are 23.2% and 17.1% higher, respectively, than those of the traditional model. The selection of different sample data, whether landslide or non-landslide, affects the factor ranking, the model accuracy, and the internal decision-making mechanism of the model. This finding provides a valuable reference for the selection of sample data in binary classification models.
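The abstract does not spell out how the information quantity method selects non-landslide samples. The following is a minimal sketch, assuming the commonly used form I = ln((N_i/N)/(S_i/S)), where N_i and S_i are the landslide count and cell count of factor class i, and N and S are the totals. The function names, the toy data, the `floor` value for classes without landslides, and the rule of taking the lowest-scoring landslide-free cells are all illustrative assumptions, not the authors' implementation.

```python
import math
from collections import Counter


def information_values(cell_classes, landslide_flags, floor=-3.0):
    """Information value of each class of one factor: ln((N_i/N) / (S_i/S)).

    cell_classes    -- class label of the factor for every grid cell
    landslide_flags -- 1 if the cell holds a historical landslide, else 0
    floor           -- assumed value for classes with no landslides (avoids ln(0))
    """
    total_cells = len(cell_classes)              # S
    total_slides = sum(landslide_flags)          # N
    cells_per_class = Counter(cell_classes)      # S_i
    slides_per_class = Counter(
        cls for cls, flag in zip(cell_classes, landslide_flags) if flag
    )                                            # N_i
    values = {}
    for cls, n_cells in cells_per_class.items():
        n_slides = slides_per_class.get(cls, 0)
        if n_slides == 0:
            values[cls] = floor
        else:
            density_ratio = (n_slides / total_slides) / (n_cells / total_cells)
            values[cls] = math.log(density_ratio)
    return values


def total_information(factors, landslide_flags):
    """Sum the per-factor information values for every cell."""
    totals = [0.0] * len(landslide_flags)
    for cell_classes in factors:
        iv = information_values(cell_classes, landslide_flags)
        for idx, cls in enumerate(cell_classes):
            totals[idx] += iv[cls]
    return totals


def pick_non_landslide_cells(totals, landslide_flags, n_samples):
    """Indices of the n_samples landslide-free cells with the lowest totals."""
    candidates = [i for i, flag in enumerate(landslide_flags) if not flag]
    candidates.sort(key=lambda i: totals[i])     # stable sort keeps index order on ties
    return candidates[:n_samples]


# Toy grid of eight cells with two hypothetical, already classified factors.
slope = ["steep", "steep", "steep", "gentle", "gentle", "gentle", "gentle", "gentle"]
lithology = ["soft", "soft", "hard", "soft", "hard", "hard", "hard", "hard"]
landslide = [1, 1, 0, 1, 0, 0, 0, 0]

totals = total_information([slope, lithology], landslide)
picks = pick_non_landslide_cells(totals, landslide, n_samples=2)
print(picks)
```

In this toy grid the steep, landslide-free cell is ranked last among the candidates, which illustrates the idea behind the method: cells that resemble known landslide sites are less likely to be labelled as non-landslide samples than they would be under purely random selection.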

Authors’ contributions

Conceptualization, Haijia Wen and Deliang Sun; methodology, Deliang Sun and Xiaoqing Wu; software, Xiaoqing Wu; validation, Xiaoqing Wu and Qingyu Gu; formal analysis, Qingyu Gu; investigation, Xiaoqing Wu; resources, Deliang Sun; data curation, Deliang Sun and Xiaoqing Wu; writing—original draft preparation, Deliang Sun and Xiaoqing Wu; writing—review and editing, Haijia Wen; visualization, Xiaoqing Wu; supervision, Haijia Wen; project administration, Deliang Sun; funding acquisition, Deliang Sun. All authors have read and agreed to the published version of the manuscript.

Disclosure statement

No potential competing interest was reported by the authors.

Data availability statement

The datasets used or analysed during this study are available from the corresponding author on reasonable request.

Additional information

Funding

This work was supported by the Natural Science Foundation of Chongqing (Grant No. CSTB2022NSCQ-MSX0594) and the National Social Science Funds of China (Grant No. 22BJY140).