Landslide detection with Mask R-CNN using complex background enhancement based on multi-scale samples


Abstract

Deep learning has been widely used in landslide detection. In practice, however, sample quality often falls short of what model training requires. Smaller landslides are easily omitted when one sample contains multiple landslide objects. Furthermore, some objects, such as bare land, roads, water surfaces and artificial buildings, are similar to landslides in shape, texture and colour (complex backgrounds). Traditional landslide detection methods tend to confuse landslides with complex backgrounds, which leads to false and omissive detections. To solve these two problems, a complex background enhancement method with multi-scale samples (MSSCBE) is proposed to improve sample quality. Using the background enhanced samples, the deep learning model can learn not only the differences between landslides and complex backgrounds but also the multi-scale features of landslides. The proposed method was applied to detect landslides that occurred in Jiuzhaigou County, Sichuan Province. Comparative experiments were conducted using the Mask R-CNN model, and the model trained with both MSSCBE background enhanced samples and original samples performed best. Compared with the model trained with only original samples, Precision, Recall, F1 Score and mIoU improved by 29.76, 5.59, 17.82 and 25.80 percentage points, respectively.


1. Introduction

Landslide detection is of great significance to geological hazard management and risk assessment (Xu et al. 2015; Drazba et al. 2018; Intrieri et al. 2019; Miao et al. 2019; Zhou et al. 2021; Wang et al. 2022). However, traditional landslide detection methods are mostly based on field surveys (Brardinoni et al. 2003; Aksoy and Ercanoglu 2012; Chen W et al. 2014). Field surveys are limited in scope, time-consuming, labour-intensive and inefficient, so they can hardly meet the needs of rescue departments. With its rapid development, remote sensing technology has been widely used in disaster emergency response and rescue because it is fast and covers large areas (Behling et al. 2014; Chen G et al. 2014; Luo et al. 2016; Manfre et al. 2016; Keyport et al. 2018; Lu et al. 2019). Therefore, it is of great significance to map the distribution of disasters automatically or semi-automatically using remote sensing technology.

In recent years, deep learning has been widely used in computer vision tasks such as instance segmentation, object detection and image classification, and it also provides an effective direction for landslide detection and susceptibility mapping (Jin et al. 2019; Wang J et al. 2021; Lv et al. 2022). Compared with traditional methods, deep learning uses hierarchical feature extraction instead of manual feature recognition, which gives higher accuracy and efficiency in landslide detection.

The application of deep learning to remote sensing images for landslide detection has gradually become a research hotspot (Ghorbanzadeh et al. 2019; Ji et al. 2019; Karantanellis et al. 2020; Wang H et al. 2021). Zhang (2019) applied three deep learning algorithms to detect landslides and achieved good results on the boundary, but the shape of landslides could not be obtained. Ju et al. (2022) used RetinaNet, YOLO v3 and Mask R-CNN to identify old loess landslides with Google Earth images in three landslide areas of Gansu Province, China. Lu et al. (2020) proposed a landslide detection method for high-resolution remote sensing images that combines a TL model with object-oriented image analysis. This method can not only extract the edges of large landslides but also accurately detect small and medium-sized landslides.

In recent years, much research has been done to improve the efficiency and accuracy of landslide detection. Current deep learning research on landslide detection follows two main directions.

1.1. Optimization of the deep learning model

Many studies improve the accuracy of landslide detection by optimizing deep learning models. Li et al. (2021) constructed a deep learning framework based on the stacked autoencoder (SAE) and considered the source area features of EQILs to model their distribution; the results show that this method is significantly superior to the traditional method. Liu P et al. (2021) proposed an end-to-end improved Mask R-CNN model that uses ResNeXt instead of ResNet, improves the feature pyramid network and adds an edge loss function to improve the extraction of landslide boundaries, and realized the automatic identification of landslides in Jiuzhaigou. Ullo et al. (2021) presented a novel landslide detection method that uses a pre-trained Mask R-CNN model with ResNet-50 and ResNet-101 backbones, based on a transfer learning approach. Gao et al. (2021) proposed a new landslide recognition and mapping framework based on the fully convolutional DenseNet, improving the network structure to address insufficient feature extraction, excessive parameters and slow model testing. Fu et al. (2022) developed a model for identifying post-earthquake landslides that uses Swin Transformer as the backbone network of Mask R-CNN and takes unmanned aerial vehicle remote sensing images acquired after the 2008 earthquake in Wenchuan County, Sichuan Province, China, as the data source; it performs better than the original Mask R-CNN, YOLOv5 and Faster R-CNN. Liu et al. (2023) proposed a feature-fusion based semantic segmentation network (FFS-Net), which extracts texture and shape features from remote sensing images and terrain features from DEM data before fusing these two distinct types of features in a higher feature layer. Experimental results validated that FFS-Net can greatly improve the segmentation accuracy of visually blurred old landslides.

1.2. Improvement of training samples quality

Samples are the basis of deep learning and have a great influence on the accuracy of landslide detection. By comparing the potential of machine learning and deep learning convolutional neural networks for landslide detection, Ghorbanzadeh et al. (2019) found that deep learning can improve landslide mapping if there are enough training samples and the number of existing samples is artificially increased with suitable augmentation strategies. Low-quality training samples lead to insufficient feature learning and low detection accuracy (Jiang et al. 2023; Yang et al. 2022). To improve sample quality, researchers have enriched the feature information of samples through data enhancement or by using multi-source geoscience data to supplement the images. Jiang et al. (2023) used the Mask R-CNN model with hard sample enhancement to segment landslides; experiments in Bijie City and Tianshui City, China, showed that detection accuracy improved while detection speed was maintained. Long et al. (2022) combined satellite remote sensing images with various landslide inducing factors and then established two landslide detection models, based on deep belief network and convolutional neural network algorithms, to detect high-rise landslides along the Jinsha River. Liu P et al. (2020) added three new bands (DSM, slope and aspect) to the RGB bands as input to a U-Net model. By increasing the number of characteristic parameters of the training samples, landslide information after an earthquake was extracted automatically through hierarchical feature extraction. Liu T et al. (2021) constructed landslide detection models based on convolutional neural networks, residual neural networks and dense convolutional neural networks (DenseNets) that consider ‘ZY-3’ high spatial resolution (HSR) data and conditioning factors (CFs). Experimental results for the Three Gorges Reservoir show that DenseNet with HSR data and CFs outperforms the other five models. Cai et al. (2021) superimposed 12 environmental factors on the original remote sensing image to create landslide samples and found that combining environmental factors with DenseNet can improve the accuracy of the detection model. Yang et al. (2022) proposed a background-enhancement method to enrich the complexity of landslide samples, with landslide inducing factors such as DEM, slope and distance to river used as supplements to provide additional information for landslide detection. Experimental results for Ludian County, Yunnan Province, show that the accuracy of landslide detection improved markedly.

Although much work has been done to improve the accuracy of landslide detection, two problems remain in practical application. First, landslide samples are usually of a single scale, so smaller landslides are easily omitted if there are multiple landslides in one sample. Second, landslide samples are relatively scarce compared with vegetation, water and rural areas (Cai et al. 2021), and landslides vary greatly in shape, texture, colour and size, while some ground objects, such as bare land, roads, water surfaces and some artificial buildings (referred to as ‘complex backgrounds’ in this paper), are similar to landslides in shape, texture and colour. Insufficient feature learning of complex backgrounds easily causes false detections and omissive detections.

Both optimizing deep learning model structures and improving training sample quality can improve the accuracy of landslide detection to a certain extent. Improving sample quality through data enhancement or supplementary image data is an optimization at the data level, which is theoretically applicable to deep learning models with different structures (Yang et al. 2022). Therefore, providing high-quality training samples so that deep learning models can adequately learn landslide features is a worthwhile research direction. To solve the above problems, a complex background enhancement method with multi-scale samples (MSSCBE), which combines CutMix and Mosaic, is proposed to improve the quality of landslide samples. With the background enhanced samples, the differences between landslides and complex backgrounds can be learned more effectively and the feature learning of multi-scale samples can be enhanced. As a result, false detections and omissive detections of landslides should be reduced and the accuracy of landslide detection improved.

Landslides are usually irregular in shape. Mask R-CNN is an instance segmentation model with a complete structure and a strong ability to extract object features (He et al. 2017; Carvalho et al. 2020; Catani 2021). Mask R-CNN is suitable for landslide detection because it can delineate the boundaries of irregular objects while locating them, so the location and shape of landslides can be extracted simultaneously. Therefore, comparative experiments based on the Mask R-CNN model are carried out to evaluate the effectiveness and applicability of the proposed method. Two geographically close areas with concentrated landslides in Jiuzhaigou County, Sichuan Province, are selected as the study area.

2. Study area and data sets

2.1. Study area

The study area is located in Jiuzhaigou County in the northeast of Sichuan Province, China, as shown in Figure 1. The county contains multiple nature reserves, national forest parks and national 5A scenic spots, and it is also known as ‘China Tourism County’. On 8 August 2017, an earthquake with a magnitude of Ms 7.0 and a focal depth of 20 km occurred in the area (Liang et al. 2021). Due to the steep terrain, massive secondary disasters such as landslides were triggered, causing road blockage and ecological damage. Detecting the spatial location of these landslides is crucial for disaster prevention, mitigation and the reconstruction of scenic spots. In this study, remote sensing images from two areas where landslides are densely distributed were selected as experimental data. Study Area 1 on the left is used as the training area, while Study Area 2 on the right is used as the test area. The training area is 23.03 km², and the test area is 8.94 km².

Figure 1. Information of study area.

2.2. Data sets

In this study, Google Earth remote sensing images are used to construct landslide samples for deep learning. Image parameters are shown in Table 1. The images are preprocessed through several steps, including projection transformation, radiometric correction and image registration. The size of landslides varies greatly in the study area, and many small landslides cannot be detected if landslide samples are of a single scale. To solve this problem, landslide samples with multi-scale features are constructed. The original images are clipped with a regular grid clipping method to obtain datasets of 256 × 256 pixels and 512 × 512 pixels (Figure 2). The 256 × 256 pixels dataset is used as the original samples, and the 512 × 512 pixels dataset is used by the MSSCBE background enhancement method to make samples with multi-scale landslide features. The web tool ‘VGG Image Annotator’ is used to label the landslide samples according to the landslide locations published by the Resource Environmental Science and Data Center (https://www.resdc.cn/) and the features of landslides.

Figure 2. Image splitting process of the training area.
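The regular grid clipping described in Section 2.2 can be sketched as follows. This is a minimal illustration under assumptions that the text does not state: tiles do not overlap, incomplete tiles at the image edge are dropped, and the function name `grid_tiles` and the synthetic scene size are hypothetical.

```python
import numpy as np


def grid_tiles(image, tile):
    """Split an H x W x C image into non-overlapping tile x tile patches.

    Assumption: remainders at the right and bottom edges that do not fill
    a whole tile are dropped.
    """
    height, width = image.shape[:2]
    tiles = []
    for top in range(0, height - tile + 1, tile):
        for left in range(0, width - tile + 1, tile):
            tiles.append(image[top:top + tile, left:left + tile].copy())
    return tiles


# Synthetic stand-in for a preprocessed scene (not real data).
scene = np.zeros((1024, 1536, 3), dtype=np.uint8)
tiles_256 = grid_tiles(scene, 256)  # used as original samples
tiles_512 = grid_tiles(scene, 512)  # used for the multi-scale samples
```

Clipping the same scene at two tile sizes yields the two datasets that the later enhancement steps draw from.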

Table 1. Information parameter table of remote sensing images.


2.3. Definition of complex backgrounds

Landslides vary in size, shape, texture and colour, which makes them difficult to identify (Li et al. 2016). In particular, backgrounds with features similar to landslides are easily misidentified as landslides. To solve this problem, information on complex backgrounds is used in this paper to improve the quality of landslide samples through background enhancement. The following three types of background features are selected and defined as complex backgrounds. The first type is artificial buildings, as shown in Figure 3(a). Some artificial buildings have high reflectance and appear bright white, similar in colour to the exposed bed of a rock landslide. The second type is terraced fields and water surfaces, as shown in Figure 3(b). Most landslides are U-shaped and have an obvious boundary with the surrounding vegetation, and some terraced fields and water surfaces have a similar shape. The third type is bare land and roads, as shown in Figure 3(c). In a landslide area, surface vegetation is destroyed and the soil is exposed, so the surface is broken, the texture is rough and the colour is heterogeneous. The texture of some bare land is similar to that of landslides, which makes the two difficult to distinguish.

Figure 3. Three types of complex background. (a) Artificial buildings; (b) terraced fields and water surfaces; (c) bare land and roads.

3. Methodology

3.1. Background enhancement method

A labelled training dataset of sufficient quantity and quality is needed so that the deep learning model can properly learn the features of ground objects. Data enhancement can expand landslide samples by increasing the number and complexity of the samples. Two main types of data enhancement are commonly used. The first type is single-sample data enhancement, such as rotation, inversion, random clipping, translation and random erasure. However, excessive rotation or inversion of a single sample may lead to overfitting of the model. The second type is multi-sample data enhancement, such as Cutout, Mixup, CutMix (Yun et al. 2019) and Mosaic (Bochkovskiy et al. 2020). Multi-sample data enhancement can provide more useful information than single-sample data enhancement.

In this study, both the location and the extent of landslides must be determined. Therefore, data enhancement methods such as Mixup and Cutout, which can only identify the target type, are not suitable. The CutMix method (Yun et al. 2019) creates a new sample by replacing a random area in one image with pixels from another image, making the model focus on the whole area rather than on a few easily distinguishable parts. The Mosaic data enhancement method randomly scales, crops and arranges the inputs and then stitches them together; it was used to improve the detection of small targets in YOLO v5 and achieved excellent performance (Zhang et al. 2022). To reduce omissive and false detections, the model needs to adequately learn the texture, shape, colour and other features of landslides and be able to distinguish landslides from complex backgrounds. In this paper, a complex background enhancement method with multi-scale samples (MSSCBE), which combines CutMix and Mosaic, is proposed to improve the quality of landslide samples and thereby the accuracy of landslide detection. The proposed method consists of three main steps:

The first step is to make CutMix background enhanced samples by two specific methods. In the first method, shown in Figure 4, a non-landslide sample with complex background (Figure 4(a)) and a landslide sample (Figure 4(b)) are randomly selected from the 256 × 256 pixels dataset. Region A1, which resembles a landslide, is clipped from the non-landslide sample, and region A2 in the landslide sample is replaced with A1 to obtain a new sample (Figure 4(c)). In the second method, shown in Figure 5, a landslide sample (Figure 5(a)) and a non-landslide sample containing complex background (Figure 5(b)) are randomly selected from the 256 × 256 pixels dataset. Landslide region B1 is clipped from the landslide sample, and region B2 in the non-landslide sample is replaced with B1 to obtain a new sample (Figure 5(c)). In both methods, the landslides in the new sample are then labelled (Figures 4(d) and 5(d)).

Figure 4. Background enhancement method 1 based on CutMix. (a) Non-landslide sample; (b) landslide sample; (c) new sample obtained by CutMix; (d) mask of new landslide sample.

Figure 5. Background enhancement method 2 based on CutMix. (a) Landslide sample; (b) non-landslide sample; (c) new sample obtained by CutMix; (d) mask of new landslide sample.
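The two CutMix variants of the first step can be sketched with one helper that moves a region between samples. This is a minimal illustration, not the authors' implementation: the regions are assumed to be axis-aligned rectangles, the function name `paste_region`, the box coordinates and the synthetic images are hypothetical, and the mask is copied along with the pixels so that the label stays consistent.

```python
import numpy as np


def paste_region(src_img, src_mask, dst_img, dst_mask, src_box, dst_top_left):
    """Copy a rectangle from the source sample into the destination sample.

    src_box is (top, left, height, width); dst_top_left is (top, left).
    Pixels and the binary landslide mask are copied together. The inputs
    are left unchanged; new arrays are returned.
    """
    top, left, h, w = src_box
    dtop, dleft = dst_top_left
    new_img, new_mask = dst_img.copy(), dst_mask.copy()
    new_img[dtop:dtop + h, dleft:dleft + w] = src_img[top:top + h, left:left + w]
    new_mask[dtop:dtop + h, dleft:dleft + w] = src_mask[top:top + h, left:left + w]
    return new_img, new_mask


# Synthetic stand-ins: a landslide tile (value 200) and a background tile (value 30).
landslide_img = np.full((256, 256, 3), 200, dtype=np.uint8)
landslide_mask = np.zeros((256, 256), dtype=np.uint8)
landslide_mask[50:100, 50:100] = 1
background_img = np.full((256, 256, 3), 30, dtype=np.uint8)
background_mask = np.zeros((256, 256), dtype=np.uint8)

# Method 1: a complex-background region (A1) replaces a region (A2) of the landslide sample.
img1, mask1 = paste_region(background_img, background_mask,
                           landslide_img, landslide_mask,
                           (0, 0, 64, 64), (150, 150))

# Method 2: a landslide region (B1) replaces a region (B2) of the non-landslide sample.
img2, mask2 = paste_region(landslide_img, landslide_mask,
                           background_img, background_mask,
                           (50, 50, 50, 50), (10, 10))
```

In method 1 the pasted patch carries an empty mask, so if it overlapped the labelled landslide it would also erase that part of the label, which is why the mask is updated together with the image.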

The second step is to make Mosaic background enhanced samples by two specific methods. In the first method, shown in Figure 6, three non-landslide samples (Figure 6(a)) and one landslide sample (Figure 6(b)) are spliced together to make a new sample (Figure 6(c)). In the second method, shown in Figure 7, one non-landslide sample (Figure 7(a)) and three landslide samples (Figure 7(b)) are spliced together to make a new sample (Figure 7(c)). The landslide and non-landslide samples to be spliced are randomly selected from the 256 × 256 pixels dataset, which reduces the subjective influence of manual selection. The new samples provide more background features during model training and improve the ability of the model to detect landslides against complex backgrounds. Finally, the landslides in the new sample are labelled (Figures 6(d) and 7(d)).

Figure 6. Background enhancement method 1 based on Mosaic. (a) Three non-landslide samples; (b) one landslide sample; (c) new sample obtained by Mosaic; (d) mask of new landslide sample.

Figure 7. Background enhancement method 2 based on Mosaic. (a) One non-landslide sample; (b) three landslide samples; (c) new sample obtained by Mosaic; (d) mask of new landslide sample.
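The splicing in the second step can be sketched as a 2 × 2 stitch of four tiles and their masks. This is a minimal illustration under assumptions: the four tiles are placed in a random order, the stitched sample is kept at 512 × 512 pixels (the text does not say whether it is resized afterwards), and the function name `mosaic_2x2` and the synthetic tiles are hypothetical.

```python
import numpy as np


def mosaic_2x2(images, masks):
    """Stitch four equally sized samples into one 2 x 2 mosaic (row-major order)."""
    if len(images) != 4 or len(masks) != 4:
        raise ValueError("expected exactly four images and four masks")
    img_top = np.concatenate(images[:2], axis=1)
    img_bottom = np.concatenate(images[2:], axis=1)
    mask_top = np.concatenate(masks[:2], axis=1)
    mask_bottom = np.concatenate(masks[2:], axis=1)
    return (np.concatenate([img_top, img_bottom], axis=0),
            np.concatenate([mask_top, mask_bottom], axis=0))


# Method 1 of Figure 6: three non-landslide tiles and one landslide tile (synthetic).
images = [np.full((256, 256, 3), v, dtype=np.uint8) for v in (10, 20, 30)]
masks = [np.zeros((256, 256), dtype=np.uint8) for _ in range(3)]
landslide_img = np.full((256, 256, 3), 200, dtype=np.uint8)
landslide_mask = np.zeros((256, 256), dtype=np.uint8)
landslide_mask[50:100, 50:100] = 1
images.append(landslide_img)
masks.append(landslide_mask)

rng = np.random.default_rng(0)
order = rng.permutation(4)  # random placement of the four tiles
mosaic_img, mosaic_mask = mosaic_2x2([images[i] for i in order],
                                     [masks[i] for i in order])
```

Applying the same permutation to images and masks keeps each label aligned with its tile wherever the tile lands.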

The third step is to make MSSCBE background enhanced samples based on CutMix and Mosaic, as shown in Figure 8. First, a non-landslide sample (Figure 8(a)) and a landslide sample (Figure 8(b)) are randomly selected from the 256 × 256 pixels dataset, a CutMix background enhanced sample (Figure 8(c)) is selected from the samples made in the first step, and a landslide sample (Figure 8(d)) is randomly selected from the 512 × 512 pixels dataset. Then, following the method of the second step, the four selected samples are spliced together into a new sample (Figure 8(e)) based on Mosaic, and the landslides in the new sample are labelled (Figure 8(f)). In this way, the new sample contains both landslide features at two scales and complex background features. The deep learning model can therefore learn the differences between landslides and complex backgrounds more effectively, while also learning the multi-scale features of landslides.

Figure 8. Background enhancement method based on MSSCBE. (a) Non-landslide sample from 256 × 256 pixels dataset; (b) landslide sample from 256 × 256 pixels dataset; (c) CutMix background enhanced sample; (d) landslide sample from 512 × 512 pixels dataset; (e) new sample obtained by MSSCBE; (f) mask of new landslide sample.
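The combination in the third step can be sketched as follows. The text does not say how the 512 × 512 tile is brought together with the three 256 × 256 tiles, so this sketch makes a loud assumption: the 512 × 512 tile is downsampled by a factor of two (nearest neighbour) before a 2 × 2 stitch, so that its landslides appear at half scale. The names `stitch_2x2`, `msscbe_sample` and `make_pair` and all synthetic inputs are hypothetical.

```python
import numpy as np


def stitch_2x2(parts):
    """Stitch four equally sized arrays into a 2 x 2 grid (row-major order)."""
    top = np.concatenate(parts[:2], axis=1)
    bottom = np.concatenate(parts[2:], axis=1)
    return np.concatenate([top, bottom], axis=0)


def msscbe_sample(pairs_256, pair_512, rng):
    """Combine three 256-pixel (image, mask) pairs with one 512-pixel pair.

    Assumption: the 512-pixel tile is reduced to 256 pixels by keeping
    every second row and column before stitching.
    """
    img_512, mask_512 = pair_512
    coarse = (img_512[::2, ::2], mask_512[::2, ::2])
    pairs = list(pairs_256) + [coarse]
    order = rng.permutation(4)  # random placement of the four tiles
    image = stitch_2x2([pairs[i][0] for i in order])
    mask = stitch_2x2([pairs[i][1] for i in order])
    return image, mask


def make_pair(size, box=None):
    """Synthetic (image, mask) pair with an optional rectangular 'landslide'."""
    img = np.zeros((size, size, 3), dtype=np.uint8)
    mask = np.zeros((size, size), dtype=np.uint8)
    if box is not None:
        top, left, h, w = box
        img[top:top + h, left:left + w] = 200
        mask[top:top + h, left:left + w] = 1
    return img, mask


pairs_256 = [
    make_pair(256),                    # non-landslide sample
    make_pair(256, (50, 50, 50, 50)),  # landslide sample
    make_pair(256, (0, 0, 40, 40)),    # stand-in for a CutMix enhanced sample
]
pair_512 = make_pair(512, (100, 100, 200, 200))  # landslide sample at the larger tile size

image, mask = msscbe_sample(pairs_256, pair_512, np.random.default_rng(0))
```

Under this assumption, the same ground footprint covers fewer pixels in the downsampled quadrant, which is one way a single sample can carry landslide features at two scales.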

3.2. Technical flowchart

As shown in Figure 9, the process of this study consists of three main parts: preparation of experimental data, detection of landslide information and analysis of landslide detection results. The goal is to construct and evaluate a landslide detection model using complex background enhanced samples. To explore the influence of different background enhancement methods on landslide detection results, the Mask R-CNN model is trained with four different training datasets in the first group of comparative experiments.

Figure 9. Flowchart for constructing and evaluating the landslide detection model.

During the preparation of experimental data, the 256 × 256 pixels dataset, containing 1441 samples, is used as the original samples. Three types of background enhanced samples are created from the original samples with the background enhancement methods described in Section 3.1. Experiment I uses only the original samples as the input dataset. On the basis of Experiment I, Experiment II adds 700 CutMix background enhanced samples, Experiment III adds 700 Mosaic background enhanced samples, and Experiment IV adds 700 MSSCBE background enhanced samples. The datasets are listed in Table 2. The model performance and landslide detection results for the different training datasets are analysed through this group of comparative experiments. Precision, Recall, F1 Score and mIoU are used to quantitatively evaluate the accuracy of the landslide detection results, and some representative detection results are compared in detail with the landslide ground truth to further analyse the effectiveness of the proposed MSSCBE method.

Table 2. Datasets for comparative experiments using Mask R-CNN.


3.3. Experimental environment and parameter setting

The hardware environment used for training the model is as follows: the processor is an AMD Ryzen Threadripper 2970WX 24-Core, and the graphics card is an NVIDIA GeForce RTX 3070 Ti. The software environment is configured as follows: the version of PyTorch is 1.10, the version of CUDA is 11.3, and the backbone network, run on the graphics processing unit, is ResNet101. The training parameters are set as follows: 100 epochs, 200 iterations per epoch, a batch size of 2, a learning rate of 0.001, a learning momentum of 0.9 and a weight decay of 0.0001.
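The training parameters listed above can be collected in a small configuration object, which also makes the implied training budget easy to compute. This is a sketch only: the class name `TrainConfig` and its field names are hypothetical and do not correspond to any particular framework's configuration API; the values are those stated in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainConfig:
    """Training settings as stated in Section 3.3 (field names are illustrative)."""
    backbone: str = "resnet101"
    epochs: int = 100
    steps_per_epoch: int = 200
    batch_size: int = 2
    learning_rate: float = 0.001
    momentum: float = 0.9
    weight_decay: float = 0.0001


cfg = TrainConfig()
total_steps = cfg.epochs * cfg.steps_per_epoch             # optimizer updates over the run
samples_per_epoch = cfg.steps_per_epoch * cfg.batch_size   # samples drawn in one epoch
samples_seen = total_steps * cfg.batch_size                # samples drawn over the run
```

With these settings, one epoch draws 400 samples and the full run performs 20 000 parameter updates.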

To reduce training costs and improve model performance and the overall accuracy of landslide detection, the code used in this paper is Matterport’s implementation of Mask R-CNN (https://github.com/matterport/Mask_RCNN). Mask R-CNN weights pre-trained on the COCO2014 dataset are used to initialise the parameters, and new weights are obtained and saved by training the model. To reduce overfitting, the validation dataset and the new weights are used to validate the accuracy of the model. The model with the highest accuracy is taken as the test model and applied to landslide detection and result analysis on the test dataset.

3.4. Accuracy evaluation

The ratio of intersection to union between the prediction box and the real box is usually used to evaluate the accuracy of object detection (Zhang et al. 2018; Liu et al. 2019). This ratio is measured by IoU (Intersection over Union), which is expressed as Equation (1):

$I_{U} = \frac{I}{U}$ (1)

where $I_{U}$ is the value of IoU, $I$ is the intersection area of the predicted box and the real box, and $U$ is the union area of the predicted box and the real box.
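Equation (1) can be illustrated for two axis-aligned boxes. This is a minimal sketch; the `(x1, y1, x2, y2)` box convention and the function name `box_iou` are assumptions made for the example.

```python
def box_iou(a, b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    # Intersection rectangle; width and height are clamped at zero when disjoint.
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    intersection = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - intersection
    return intersection / union if union > 0 else 0.0


# Two 10 x 10 boxes overlapping by half: I = 50, U = 150.
overlap_half = box_iou((0, 0, 10, 10), (5, 0, 15, 10))
```

The same ratio applies to masks by counting pixels in the intersection and union instead of box areas.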

Before the accuracy of object detection can be evaluated, it is necessary to judge whether each prediction is correct. This is done by setting a confidence threshold and an IoU threshold. In this paper, a prediction with ‘IoU > 0.5 and confidence > 0.9’ is regarded as correct (Liu Y et al. 2020; Zhou et al. 2022).
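The decision rule above can be written as a small predicate. This is a sketch under assumptions: both comparisons are taken as strict inequalities, as written in the text, and the function name `is_correct` and the example detections are hypothetical.

```python
def is_correct(iou, confidence, iou_threshold=0.5, confidence_threshold=0.9):
    """Return True when a prediction passes both thresholds (strict inequalities)."""
    return iou > iou_threshold and confidence > confidence_threshold


# Hypothetical detections as (IoU with the matched ground truth, confidence).
detections = [(0.72, 0.97), (0.50, 0.95), (0.81, 0.90), (0.35, 0.99)]
correct = [d for d in detections if is_correct(*d)]
```

Only the first detection passes here: the second sits exactly on the IoU threshold, the third exactly on the confidence threshold, and the fourth overlaps too little.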

Precision, Recall, F1 Score and mIoU are common measures for evaluating object detection accuracy. They are used here to evaluate the accuracy of landslide detection and thereby validate the effectiveness of the proposed MSSCBE method.

Precision is the ratio of the number of correctly identified landslides to the total number of identified landslides. Recall is the ratio of the number of correctly identified landslides to the number of landslides in the test dataset. F1 Score is used to evaluate the overall performance of the model and is defined as the harmonic mean of Precision and Recall. The larger the F1 Score, the better the performance of the model. mIoU is the mean ratio of intersection to union between predicted and real landslides. The larger the mIoU, the better the performance of the model. The four measures are defined in Equations (2)–(5):

$Precision = \frac{TP}{TP + FP}$ (2)

$Recall = \frac{TP}{TP + FN}$ (3)

$F_{1}\ Score = (1 + \beta^{2}) \frac{Precision \times Recall}{Precision + Recall}, \quad \beta = 1$ (4)

$mIoU = \frac{TP}{FP + TP + FN}$ (5)

where TP, FP and FN are as shown in Table 3. TP is the area correctly identified as landslides, FP is the background area incorrectly identified as landslides, and FN is the landslide area incorrectly identified as background.

Table 3. Confusion matrix between true value and predicted value.
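Equations (2)–(5) can be evaluated directly from the three confusion-matrix counts. This is a minimal sketch; the function name `detection_metrics` and the example counts are hypothetical, and the guards against empty denominators are an addition not discussed in the text.

```python
def detection_metrics(tp, fp, fn):
    """Precision, Recall, F1 Score and mIoU from TP, FP and FN (Equations 2 to 5)."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # Equation (4) with beta = 1 reduces to the harmonic mean of Precision and Recall.
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    miou = tp / (tp + fp + fn) if tp + fp + fn else 0.0
    return {"precision": precision, "recall": recall, "f1": f1, "miou": miou}


# Hypothetical counts, for illustration only.
metrics = detection_metrics(tp=80, fp=20, fn=20)
```

Because the denominator of Equation (5) contains both FP and FN, mIoU is never larger than Precision or Recall.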


4. Experimental results and discussion

4.1. Analysis of results in experimental area

The optimal models trained in the four comparative experiments are each used to detect landslides in the test area. The distribution map of the landslide ground truth in the test area is shown in Figure 10(a), and the results of the four experiments are shown in Figure 10(b)–(e).

Figure 10. Landslides truth and landslide detection results using Mask R-CNN. (a) Landslides truth; (b) results of Experiment I; (c) results of Experiment II; (d) results of Experiment III; (e) results of Experiment IV.

Figure 10(b) shows the result of Experiment I, which used only original samples to train the model. There are many false detections (red areas) on rivers, roads and some artificial buildings, as well as some omissive detections of landslides (blue areas). Figure 10(c) shows the result of Experiment II, in which the model is trained by adding CutMix background enhanced samples to the original samples. Figure 10(d) shows the result of Experiment III, which used original samples and Mosaic background enhanced samples as the training dataset. In Experiments II and III, the false detections on roads and some artificial buildings have decreased and the false detections caused by rivers have disappeared, but the omissive detections are not improved. Figure 10(e) shows the result of Experiment IV, which used original samples and MSSCBE background enhanced samples. Compared with the previous three experimental results, the false and omissive detections have decreased significantly, and the shapes of the landslides are more complete.

4.2. Comparative analysis of quantitative metrics for detection accuracy

Quantitative metrics are also used to evaluate the improvement in landslide detection accuracy brought about by the proposed MSSCBE method. The Precision, Recall, F1 Score and mIoU of the four comparative experiments are shown in Table 4.

Table 4. Comparative statistics of landslide detection results using Mask R-CNN.


In Experiment I, original samples were used to train the Mask R-CNN model. After adjusting various settings, Precision, Recall, F1 Score and mIoU reached 67.50%, 79.90%, 73.18% and 57.70%, respectively. Subsequent experiments were conducted with the settings of Experiment I. In Experiment II, after CutMix background enhanced samples were added to the training dataset, Precision improved by 19.52% but Recall dropped by 3.54%. In Experiment III, when Mosaic background enhanced samples were added to the training dataset, Precision improved by 22.22% but Recall dropped by 3.11%. Finally, in Experiment IV, the model trained with MSSCBE background enhanced samples and original samples achieved the best performance, with a Precision of 97.26%, Recall of 85.48%, F1 Score of 90.99% and mIoU of 83.50%. These values are 29.76, 5.59, 17.82 and 25.80 percentage points higher, respectively, than those of the traditional approach that uses only original samples as input.
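The figures reported for Experiments I and IV can be cross-checked against Equations (4) and (5). The sketch below uses only the values quoted in the text; the helper names are hypothetical. It relies on the identity 1/IoU = 1/Precision + 1/Recall − 1, which follows from Equations (2), (3) and (5) when all three are computed from the same TP, FP and FN counts, an assumption the text does not confirm.

```python
def f1_from(precision, recall):
    """Equation (4) with beta = 1."""
    return 2 * precision * recall / (precision + recall)


def iou_from(precision, recall):
    """IoU implied by Precision and Recall when both share the same TP, FP and FN."""
    return 1.0 / (1.0 / precision + 1.0 / recall - 1.0)


# Reported values in percent: (Precision, Recall, F1 Score, mIoU).
reported = {
    "I": (67.50, 79.90, 73.18, 57.70),
    "IV": (97.26, 85.48, 90.99, 83.50),
}

recomputed = {}
for name, (p, r, f1, miou) in reported.items():
    recomputed[name] = (100 * f1_from(p / 100, r / 100),
                        100 * iou_from(p / 100, r / 100))

# Absolute gains of Experiment IV over Experiment I, in percentage points.
gains = [iv - i for i, iv in zip(reported["I"], reported["IV"])]
```

The recomputed F1 Scores should land very close to the reported ones, and the gains reproduce the quoted improvements as absolute differences in percentage points; small deviations in mIoU would be expected if the published value was averaged differently from the pooled counts assumed here.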

4.3. Samples analysis of landslide detection

To demonstrate the advantages of the MSSCBE method for landslide detection in more detail, six groups (a–f) of representative samples were selected from the test dataset, as shown in Figure 11. The original samples and the landslide ground truth are shown in Figure 11(a1–f1), which provide a reference for the landslide detection results. The detection results of Experiments I–IV are presented as green masks in Figure 11(a2–f5). To highlight differences between the ground truth and the detection results, boxes in different colours are drawn: red boxes represent false detections, and white boxes represent omissive detections. Analysing the changes in the detected landslide boundaries reveals the influence of the different training datasets on the detection results.

Figure 11. Comparisons of landslide detection results and ground truth of six samples. (a1–f1) Original remote sensing images and ground truth; (a2–f2) results of Experiment I; (a3–f3) results of Experiment II; (a4–f4) results of Experiment III; (a5–f5) results of Experiment IV. The detection results are presented as green masks. Red box represents false detections and white box represents omissive detections.

The detection results of groups (a–c) show that, in Experiment I (Figure 11(a2–c2)), the detected landslide boundary is rough and there is a large area of false detections. When CutMix or Mosaic background enhanced samples are added to the training dataset, false detections are reduced but omissive detections increase; in particular, smaller landslides are omitted when there are multiple landslides in one sample, as shown in Figure 11(a3–c3) and Figure 11(a4–c4). Figure 11(a5–c5) shows the detection results after adding MSSCBE background enhanced samples, where the false detections largely disappear. Compared with the previous two background enhancement methods, the omissive detections are significantly reduced, and some smaller, easily missed landslides are also detected.

The detection results of groups (d–f) show that the proposed method can effectively distinguish complex backgrounds from landslides. In Experiment I (Figure 11(d2–f2)), the complex backgrounds are all incorrectly identified as landslides, including bare land and roads (group (d)), an artificial building (group (e)) and water (group (f)). In Experiment III, the bare land (Figure 11(d4)) and water (Figure 11(f4)) are incorrectly identified as landslides, but there is no false detection on roads (Figure 11(d4)) or the artificial building (Figure 11(e4)). In Experiment II (Figure 11(d3–f3)) and Experiment IV (Figure 11(d5–f5)), there is no false detection of complex backgrounds.

Overall, the MSSCBE method achieved the best performance. On the one hand, landslides are distinguished from complex backgrounds such as bare land, roads, artificial buildings and water, so false detections are reduced significantly. On the other hand, some smaller landslides are detected correctly when one sample contains multiple landslides, so omissive detections are also greatly reduced. Meanwhile, the boundaries of landslides are extracted more accurately. Therefore, landslide detection based on the MSSCBE background enhancement method is more accurate.
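
The false and omissive detections discussed above can also be counted at the pixel level. The following is a minimal sketch, assuming binary masks for the prediction and the ground truth; the function names `detection_errors` and `pixel_metrics` are hypothetical and not taken from the paper, and the IoU here covers the landslide class only, which may differ from how the mIoU in the experiments was averaged.

```python
import numpy as np


def detection_errors(pred, truth):
    """Split a predicted binary mask into correct, false and omitted pixels."""
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    correct = pred & truth       # landslide pixels detected as landslide
    false_det = pred & ~truth    # background detected as landslide (false detection)
    omitted = ~pred & truth      # landslide pixels that were missed (omissive detection)
    return correct, false_det, omitted


def pixel_metrics(pred, truth):
    """Pixel-level Precision, Recall, F1 and landslide-class IoU."""
    correct, false_det, omitted = detection_errors(pred, truth)
    tp, fp, fn = int(correct.sum()), int(false_det.sum()), int(omitted.sum())
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    iou = tp / (tp + fp + fn) if tp + fp + fn else 0.0
    return {"precision": precision, "recall": recall, "f1": f1, "iou": iou}


# Toy example: a 2x2 landslide and a prediction shifted one column to the right.
truth = np.zeros((4, 4), dtype=bool)
truth[0:2, 0:2] = True
pred = np.zeros((4, 4), dtype=bool)
pred[0:2, 1:3] = True

metrics = pixel_metrics(pred, truth)
print(metrics)
```

In the toy example, half of the predicted pixels are false detections and half of the true landslide pixels are omitted, which is the kind of trade-off the red and white boxes in Figure 11 make visible per object.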

4.4. Application in different deep learning models

Because the proposed method improves landslide detection by changing the training dataset, it can in theory be applied to different deep learning models. To further test the general applicability and effectiveness of the MSSCBE method, landslide detection experiments were conducted with U-Net, PSPNet and Deeplab v3+ using different training datasets. The landslide detection performance of the different deep learning models is compared in Table 5.

Table 5. Comparative statistics of landslide detection results using different deep learning models.

The U-Net, PSPNet and Deeplab v3+ models all achieved better performance after MSSCBE background enhanced samples were added to the training dataset. Apart from a slight drop in Recall, all indicators improved greatly. Precision, F1 score and mIoU of the U-Net model improved by 17.88%, 9.58% and 13.36%, respectively; those of the PSPNet model improved by 18.58%, 9.26% and 13.08%; and those of the Deeplab v3+ model improved by 22.27%, 10.07% and 13.79%.

Across the ten experiments above, the Mask R-CNN model trained on the dataset with added MSSCBE background enhanced samples has the best performance.

4.5. Application of MSSCBE method in Bijie City

To further verify the applicability of the MSSCBE method in other regions, the remote sensing landslide dataset of Bijie City, Guizhou Province (the Bijie landslide dataset), created by the Group of Photogrammetry and Computer Vision (GPCV) at Wuhan University (Ji 2021), was selected for experimental verification. The 770 landslide samples and 2003 negative samples of the Bijie landslide dataset were taken as a new test dataset, and the Mask R-CNN models trained in Experiments I, II, III and IV were each used to detect landslides in Bijie City. The detection results are shown in Table 6.

Table 6. Comparative statistics of landslide detection results in Bijie City.

The model achieves the best performance after MSSCBE background enhanced samples are added to the training dataset. Its Precision, Recall, F1 score and mIoU are 31.46%, 3.95%, 19.34% and 22.94% higher, respectively, than those of the traditional approach trained with only original samples. However, because the training area and the test area are geographically far apart, every metric in Bijie City is lower than in Jiuzhaigou County. Therefore, when the study area changes, applying the MSSCBE method to the new study area, rather than directly transferring the trained model, should achieve higher accuracy.

5. Conclusions and prospect

In automatic landslide detection based on deep learning, confusion between landslides and complex backgrounds (e.g. bare land, roads, artificial buildings and water surfaces) easily leads to false detections. Meanwhile, smaller landslides are easily omitted when one sample contains multiple landslides of different sizes. To address these shortcomings, a complex background enhancement method with multi-scale samples (MSSCBE) is proposed based on CutMix and Mosaic. The feasibility of this method was verified using landslide data from the Jiuzhaigou scenic spot and the Mask R-CNN model. Moreover, U-Net, PSPNet and Deeplab v3+ were used to test the general applicability and effectiveness of the proposed method on different deep learning models.
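
To make the two building blocks named above concrete, the following is a minimal sketch of generic CutMix-style patch pasting and Mosaic-style tiling for segmentation samples. It is not the MSSCBE procedure itself: the function names `cutmix_background` and `mosaic_2x2`, the fixed paste box and the plain 2x2 tiling without rescaling or random cropping are all simplifying assumptions made for illustration.

```python
import numpy as np


def cutmix_background(image, mask, background, box):
    """Paste a rectangular patch of a background image into a landslide sample.

    box is (top, left, height, width). The pasted region is labelled as
    background (0) in the returned mask.
    """
    top, left, height, width = box
    out_image = image.copy()
    out_mask = mask.copy()
    rows = slice(top, top + height)
    cols = slice(left, left + width)
    out_image[rows, cols] = background[rows, cols]
    out_mask[rows, cols] = 0
    return out_image, out_mask


def mosaic_2x2(images, masks):
    """Tile four equally sized samples (and their masks) into one 2x2 mosaic."""
    top_image = np.concatenate([images[0], images[1]], axis=1)
    bottom_image = np.concatenate([images[2], images[3]], axis=1)
    top_mask = np.concatenate([masks[0], masks[1]], axis=1)
    bottom_mask = np.concatenate([masks[2], masks[3]], axis=1)
    mosaic_image = np.concatenate([top_image, bottom_image], axis=0)
    mosaic_mask = np.concatenate([top_mask, bottom_mask], axis=0)
    return mosaic_image, mosaic_mask


# Toy data: an 8x8 "landslide" sample and an 8x8 pure-background sample.
landslide = np.full((8, 8, 3), 200, dtype=np.uint8)
landslide_mask = np.ones((8, 8), dtype=np.uint8)
background = np.zeros((8, 8, 3), dtype=np.uint8)
background_mask = np.zeros((8, 8), dtype=np.uint8)

mixed_image, mixed_mask = cutmix_background(
    landslide, landslide_mask, background, box=(2, 2, 4, 4)
)
mosaic_image, mosaic_mask = mosaic_2x2(
    [landslide, background, background, landslide],
    [landslide_mask, background_mask, background_mask, landslide_mask],
)
print(mixed_mask.sum(), mosaic_image.shape, mosaic_mask.sum())
```

In both operations the label mask is transformed together with the image, so background pixels placed next to landslides stay labelled as background during training.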

By comparing the landslide detection results of Mask R-CNN models trained with different input data, we conclude that the MSSCBE method effectively reduces false detections of complex backgrounds and omissive detections of smaller landslides, and improves the accuracy of the detected landslide boundaries. The Mask R-CNN model trained on the dataset with added MSSCBE background enhanced samples achieved the best overall performance, with a Precision of 97.26%, Recall of 85.48%, F1 Score of 90.99% and mIoU of 83.50%. These metrics are 29.76%, 5.59%, 17.82% and 25.80% higher, respectively, than those of the traditional model trained with only original samples. Moreover, experiments with U-Net, PSPNet and Deeplab v3+ and different training samples verified that the MSSCBE method can be applied to different deep learning models. The experimental results for Bijie City show that the MSSCBE method can also improve the accuracy of landslide detection in other regions.
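
As a quick consistency check on the figures above, F1 can be recomputed as the harmonic mean of Precision and Recall. The sketch below does this for the reported best model and for a baseline reconstructed by subtraction; that reconstruction assumes the reported improvements are absolute percentage points, which the text does not state explicitly.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (both given in percent)."""
    return 2 * precision * recall / (precision + recall)


# Values reported for the Mask R-CNN model trained with MSSCBE samples.
best = {"precision": 97.26, "recall": 85.48, "f1": 90.99}
# Reported improvements over the model trained with only original samples.
gain = {"precision": 29.76, "recall": 5.59, "f1": 17.82}

# Assumption: the gains are absolute percentage points, so baseline = best - gain.
baseline = {name: best[name] - gain[name] for name in best}

best_f1 = f1_score(best["precision"], best["recall"])
baseline_f1 = f1_score(baseline["precision"], baseline["recall"])

print(f"recomputed F1 (MSSCBE):   {best_f1:.2f} vs reported {best['f1']:.2f}")
print(f"recomputed F1 (baseline): {baseline_f1:.2f} vs inferred {baseline['f1']:.2f}")
```

Under that assumption, both recomputed F1 values agree with the reported and inferred ones to within rounding, which suggests the improvements are quoted as percentage-point differences rather than relative changes.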

The training area and the test area of Jiuzhaigou County are relatively close geographically, and the images used in the experiments have similar features. Therefore, the proposed method achieved good results in Jiuzhaigou County. However, the application of the MSSCBE method in Bijie City shows that when the images of the test area and the training area are dissimilar, some adjustments are required to obtain better detection accuracy. To reduce the workload of transfer learning, future work will focus on improving the general applicability of the landslide detection model. In addition, to further reduce omissive detections when one sample contains multiple objects, images of more scales could be fused.

Author contributions

Xiaohui Liu and Ling Xu conceived and designed the experiments; Ling Xu and Jinyu Zhang performed the experiments; all authors analysed the data and wrote the paper.

Acknowledgements

The authors would like to express their appreciation to the editor and the referees for their comments and suggestions that greatly improved the manuscript.

Disclosure statement

No potential conflict of interest was reported by the author(s).

Data availability statement

The data presented in this study are available on request from the corresponding author.

Additional information

Funding

This research was supported by a Project of Shandong Province Higher Educational Science and Technology Program, China (Grant No. J16LH05), and Shandong Provincial Natural Science Foundation, China (Grant No. ZR2016DQ06).

References

Aksoy B, Ercanoglu M. 2012. Landslide identification and classification by object-based image analysis and fuzzy logic; an example from the Azdavay region (Kastamonu, Turkey). Comput Geosci. 38(1):87–98. doi: 10.1016/j.cageo.2011.05.010.
Behling R, Roessner S, Kaufmann H, Kleinschmit B. 2014. Automated spatiotemporal landslide mapping over large areas using rapideye time series data. Remote Sens. 6(9):8026–8055. doi: 10.3390/rs6098026.
Bochkovskiy A, Wang C, Liao HM. 2020. YOLOv4: optimal speed and accuracy of object detection. In: IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR); Jun 14–19; Virtual. Piscataway, NJ: IEEE.
Brardinoni F, Slaymaker O, Hassan MA. 2003. Landslide inventory in a rugged forested watershed: a comparison between air-photo and field survey data. Geomorphology. 54(3-4):179–196. doi: 10.1016/S0169-555X(02)00355-0.
Cai H, Chen T, Niu R, Plaza A. 2021. Landslide detection using densely connected convolutional networks and environmental conditions. IEEE J Sel Top Appl Earth Obs Remote Sens. 14:5235–5247. doi: 10.1109/JSTARS.2021.3079196.
Carvalho OLFD, de Carvalho Júnior OA, Albuquerque AOD, Bem PPD, Silva CR, Ferreira PHG, Moura RDSD, Gomes RAT, Guimarães RF, Borges DL. 2020. Instance segmentation for large, multi-channel remote sensing imagery using Mask-RCNN and a mosaicking approach. Remote Sens. 13(1):39. doi: 10.3390/rs13010039.
Catani F. 2021. Landslide detection by deep learning of non-nadiral and crowdsourced optical images. Landslides. 18(3):1025–1044. doi: 10.1007/s10346-020-01513-4.
Chen G, Xianju L, Weitao C, Xinwen C, Zhang Y, Shengwei L. 2014. Extraction and application analysis of landslide influential factors based on LiDAR DEM; a case study in the Three Gorges Area, China. Nat Hazards. 74(2):509–526. doi: 10.1007/s11069-014-1192-6.
Chen W, Li X, Wang Y, Chen G, Liu S. 2014. Forested landslide detection using LiDAR data and the random forest algorithm: a case study of the Three Gorges, China. Remote Sens Environ. 152:291–301. doi: 10.1016/j.rse.2014.07.004.
Drazba MC, Yan-Richards A, Wilkinson S. 2018. Landslide hazards in Fiji, managing the risk and not the disaster, a literature review. Proc Eng. 212:1334–1338. doi: 10.1016/j.proeng.2018.01.172.
Fu R, He J, Liu G, Li W, Mao J, He M, Lin Y. 2022. Fast seismic landslide detection based on improved Mask R-CNN. Remote Sens. 14(16):3928. doi: 10.3390/rs14163928.
Gao X, Chen T, Niu R, Plaza A. 2021. Recognition and mapping of landslide using a fully convolutional densenet and influencing factors. IEEE J Sel Top Appl Earth Obs Remote Sens. 14:7881–7894. doi: 10.1109/JSTARS.2021.3101203.
Ghorbanzadeh O, Blaschke T, Gholamnia K, Meena SR, Tiede D, Aryal J. 2019. Evaluation of different machine learning methods and deep-learning convolutional neural networks for landslide detection. Remote Sens. 11(2):196. doi: 10.3390/rs11020196.
He KM, Gkioxari G, Dollár P, Girshick R. 2017. Mask R-CNN. In: IEEE International Conference on Computer Vision (ICCV); Oct 22–29; Venice, Italy. Piscataway, NJ: IEEE. p. 2980–2988.
Intrieri E, Carlà T, Gigli G. 2019. Forecasting the time of failure of landslides at slope-scale; a literature review. Earth-Sci Rev. 193:333–349. doi: 10.1016/j.earscirev.2019.03.019.
Ji S. 2019. Dataset collection by Group of Photogrammetry and Computer Vision (GPCV) at Wuhan University-Dataset 4: Bijie Landslide Dataset. Wuhan: Wuhan University.
Ji S, Yu D, Shen C, Li W, Xu Q. 2020. Landslide detection from an open satellite imagery and digital elevation model dataset using attention boosted convolutional neural networks. Landslides. 17(6):1337–1352. doi: 10.1007/s10346-020-01353-2.
Jiang W, Xi J, Li Z, Ding M, Yang L, Xie D. 2023. Landslide detection and segmentation using Mask R-CNN with simulated hard samples. Geomat Inf Sci WH Uni. 48(12):1931–1942. doi: 10.13203/j.whugis20200692.
Jin KP, Yao LK, Cheng QG, Xing AG. 2019. Seismic landslides hazard zoning based on the modified Newmark model: a case study from the Lushan Earthquake, China. Nat Hazards. 99(1):493–509. doi: 10.1007/s11069-019-03754-6.
Ju Y, Xu Q, Jin S, Li W, Su Y, Dong X, Guo Q. 2022. Loess landslide detection using object detection algorithms in Northwest China. Remote Sens. 14(5):1182. doi: 10.3390/rs14051182.
Karantanellis E, Marinos V, Vassilakis E, Christaras B. 2020. Object-based analysis using unmanned aerial vehicles (UAVs) for site-specific landslide assessment. Remote Sens. 12(11):1711. doi: 10.3390/rs12111711.
Keyport RN, Oommen T, Martha TR, Sajinkumar KS, Gierke JS. 2018. A comparative analysis of pixel- and object-based detection of landslides from very high-resolution images. Int J Appl Earth Obs Geoinf. 64:1–11. doi: 10.1016/j.jag.2017.08.015.
Li Y, Cui P, Ye C, Junior JM, Zhang Z, Guo J, Li J. 2021. Accurate prediction of earthquake-induced landslides based on deep learning considering landslide source area. Remote Sens. 13(17):3436. doi: 10.3390/rs13173436.
Li Z, Shi W, Lu P, Yan L, Wang Q, Miao Z. 2016. Landslide mapping from aerial photographs using change detection-based Markov random field. Remote Sens Environ. 187:76–90. doi: 10.1016/j.rse.2016.10.008.
Liang R, Dai K, Shi X, Guo B, Dong X, Liang F, Tomas R, Wen N, Fan X. 2021. Automated Mapping of Ms 7.0 Jiuzhaigou Earthquake (China) post-disaster landslides based on high-resolution UAV Imagery. Remote Sens. 13(7):1330. doi: 10.3390/rs13071330.
Liu P, Wei Y, Wang Q, Chen Y, Xie J. 2020. Research on post-earthquake landslide extraction algorithm based on improved U-Net model. Remote Sens. 12(5):894. doi: 10.3390/rs12050894.
Liu P, Wei Y, Wang Q, Xie J, Chen Y, Li Z, Zhou H. 2021. A research on landslides automatic extraction model based on the improved Mask R-CNN. ISPRS Int J Geo-Inf. 10(3):168. doi: 10.3390/ijgi10030168.
Liu T, Chen T, Niu R, Plaza A. 2021. Landslide detection mapping employing CNN, ResNet, and DenseNet in the three gorges reservoir, China. IEEE J Sel Top Appl Earth Obs Remote Sens. 14:11417–11428. doi: 10.1109/JSTARS.2021.3117975.
Liu X, Peng Y, Lu Z, Li W, Yu J, Ge D, Xiang W. 2023. Feature-fusion segmentation network for landslide detection using high-resolution remote sensing images and digital elevation model data. IEEE Trans Geosci Remote Sens. 61:1–14. doi: 10.1109/TGRS.2022.3233637.
Liu Y, Gross L, Li Z, Li X, Fan X, Qi W. 2019. Automatic building extraction on high-resolution remote sensing imagery using deep convolutional encoder-decoder with spatial pyramid pooling. IEEE Access. 7:128774–128786. doi: 10.1109/ACCESS.2019.2940527.
Liu Y, Zhou J, Qi W, Li X, Gross L, Shao Q, Zhao Z, Ni L, Fan X, Li Z. 2020. ARC-Net: an efficient network for building extraction from high-resolution aerial images. IEEE Access. 8:154997–155010. doi: 10.1109/ACCESS.2020.3015701.
Long L, He F, Liu H. 2022. Correction to: the use of remote sensing satellite using deep learning in emergency monitoring of high-level landslides disaster in Jinsha River. J Supercomput. 78(9):11974–11974. doi: 10.1007/s11227-022-04353-2.
Lu H, Ma L, Fu X, Liu C, Wang Z, Tang M, Li N. 2020. Landslides information extraction using object-oriented image analysis paradigm based on deep learning and transfer learning. Remote Sens. 12(5):752. doi: 10.3390/rs12050752.
Lu P, Qin Y, Li Z, Mondini AC, Casagli N. 2019. Landslide mapping from multi-sensor data through improved change detection-based Markov random field. Remote Sens Environ. 231:111235. doi: 10.1016/j.rse.2019.111235.
Luo S, Tong L, Chen Y, Tan L. 2016. Landslides identification based on polarimetric decomposition techniques using Radarsat-2 polarimetric images. Int J Remote Sens. 37(12):2831–2843. doi: 10.1080/01431161.2015.1041620.
Lv L, Chen T, Dou J, Plaza A. 2022. A hybrid ensemble-based deep-learning framework for landslide susceptibility mapping. Int J Appl Earth Obs Geoinf. 108:102713. doi: 10.1016/j.jag.2022.102713.
Manfre LA, de Albuquerque Nobrega RA, Quintanilha JA. 2016. Evaluation of multiple classifier systems for landslide identification in LANDSAT thematic mapper (TM) images. ISPRS Int J Geo-Inf. 5(9):164. doi: 10.3390/ijgi5090164.
Miao FS, Wu YP, Li LW, Liao K, Zhang LF. 2019. Risk assessment of snowmelt-induced landslides based on GIS and an effective snowmelt model. Nat Hazards. 97(3):1151–1173. doi: 10.1007/s11069-019-03693-2.
Ullo SL, Mohan A, Sebastianelli A, Ahamed SE, Kumar B, Dwivedi R, Sinha G. 2021. A new Mask R-CNN-based method for improved landslide detection. IEEE J Sel Top Appl Earth Obs Remote Sens. 14:3799–3810. doi: 10.1109/JSTARS.2021.3064981.
Wang H, Zhang L, Yin K, Luo H, Li J. 2021. Landslide identification using machine learning. Geosci Front. 12(1):351–364. doi: 10.1016/j.gsf.2020.02.012.
Wang J, Nie G, Gao S, Wu S, Li H, Ren X. 2021. Landslide deformation prediction based on a GNSS time series analysis and recurrent neural network model. Remote Sens. 13(6):1055. doi: 10.3390/rs13061055.
Wang X, Zhang X, Bi J, Zhang X, Deng S, Liu Z, Wang L, Guo H. 2022. Landslide susceptibility evaluation based on potential disaster identification and ensemble learning. Int J Environ Res Public Health. 19(21):14241. doi: 10.3390/ijerph192114241.
Xu J, Zhang M, Fan W. 2015. An overview of geological disaster risk assessment. J Catastrophol. 30:130–134.
Yang R, Zhang F, Xia J, Wu C. 2022. Landslide extraction using Mask R-CNN with background-enhancement method. Remote Sens. 14(9):2206. doi: 10.3390/rs14092206.
Yun S, Han D, Oh SJ, Chun S, Choe J, Yoo Y. 2019. CutMix: regularization strategy to train strong classifiers with localizable features. In: IEEE/CVF International Conference on Computer Vision (ICCV); Oct 27–Nov 2; Seoul, South Korea. Piscataway, NJ: IEEE. p. 6022–6031.
Zhang Q. 2019. Convolutional neural network target detection algorithms and application in the field of landslide. Yichang: China Three Gorges University.
Zhang W, Witharana C, Liljedahl AK, Kanevskiy M. 2018. Deep convolutional neural networks for automated characterization of Arctic ice-wedge polygons in very high spatial resolution aerial imagery. Remote Sens. 10(9):1487. doi: 10.3390/rs10091487.
Zhang Y, Guo Z, Wu J, Tian Y, Tang H, Guo X. 2022. Real-time vehicle detection based on improved YOLO v5. Sustainability. 14(19):12274. doi: 10.3390/su141912274.
Zhou J, Liu Y, Nie G, Cheng H, Yang X, Chen X, Gross L. 2022. Building extraction and floor area estimation at the village level in Rural China via a comprehensive method integrating UAV photogrammetry and the novel EDSANet. Remote Sens. 14(20):5175. doi: 10.3390/rs14205175.
Zhou X, Wen H, Zhang Y, Xu J, Zhang W. 2021. Landslide susceptibility mapping using hybrid random forest with GeoDetector and RFE for factor optimization. Geosci Front. 12(5):101211. doi: 10.1016/j.gsf.2021.101211.
