Abstract
In recent years, ensemble learning methods have become popular in landslide susceptibility mapping (LSM), with varying degrees of success. Within the classifier ensemble concept, decision tree-based ensemble learners such as random forest (RF) (i.e. decision forest) and rotation forest (RotFor) have attracted great interest because they are more robust than conventional statistical methods. This study proposes the canonical correlation forest (CCF), a new member of the ensemble learning family, for the prediction of landslide susceptibility in the Yenice district of Karabuk, Turkey. To test the robustness and suitability of the CCF method, its prediction performance was compared with that of two well-known ensemble learning algorithms, RF and RotFor, and a commonly used statistical method, logistic regression (LR). Furthermore, the effect of varying the ratio of training to testing data on the performance of the RF, CCF, RotFor and LR models was assessed using the root mean square error (RMSE). The quality of the resulting landslide susceptibility maps was evaluated using overall accuracy (OA), the Kappa coefficient (KC), success rate curves and receiver operating characteristic (ROC) curves. Wilcoxon's signed-rank test was also applied to determine whether the differences in accuracy among the susceptibility maps were statistically significant. The estimated area under the curve (AUC) values for the RF, CCF, RotFor and LR models were 0.982, 0.970, 0.966 and 0.826, respectively. The ensemble learning algorithms clearly outperformed the LR method. The results also showed that the choice of sampling ratio had a significant effect on the performance of the RF, CCF, RotFor and LR models, and that the lowest RMSE values were obtained with a 70:30 ratio of training to test data.
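As an illustration of the evaluation measures named above, the following minimal sketch computes OA, KC, RMSE and AUC for a binary landslide/non-landslide problem. It is not the implementation used in the study: the function names, the 0.5 classification threshold and the toy labels and scores are all assumptions made for the example, and only the Python standard library is used.

```python
import math

def overall_accuracy(y_true, y_pred):
    """Fraction of samples whose predicted class matches the reference class."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)

def kappa_coefficient(y_true, y_pred):
    """Cohen's kappa: agreement corrected for chance agreement.
    Undefined (division by zero) when chance agreement equals 1."""
    n = len(y_true)
    p_o = overall_accuracy(y_true, y_pred)
    labels = set(y_true) | set(y_pred)
    # Expected agreement from the marginal class frequencies
    p_e = sum((y_true.count(c) / n) * (y_pred.count(c) / n) for c in labels)
    return (p_o - p_e) / (1 - p_e)

def rmse(y_true, y_score):
    """Root mean square error between 0/1 labels and predicted scores."""
    squared = sum((t - s) ** 2 for t, s in zip(y_true, y_score))
    return math.sqrt(squared / len(y_true))

def auc(y_true, y_score):
    """Area under the ROC curve via pairwise ranking of positive and
    negative samples; ties count as one half."""
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0
               for a in pos for b in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical toy data (1 = landslide, 0 = non-landslide); not from the study
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_score = [0.9, 0.8, 0.6, 0.4, 0.7, 0.3, 0.2, 0.1]
# Assumed threshold of 0.5 to turn susceptibility scores into classes
y_pred = [1 if s >= 0.5 else 0 for s in y_score]

print("OA  :", overall_accuracy(y_true, y_pred))   # 0.75
print("KC  :", kappa_coefficient(y_true, y_pred))  # 0.5
print("RMSE:", round(rmse(y_true, y_score), 4))    # 0.3873
print("AUC :", auc(y_true, y_score))               # 0.875
```

OA and KC depend on the chosen threshold, whereas AUC and RMSE are computed directly from the continuous scores, which is one reason threshold-free measures are often reported alongside classification accuracy.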