Research Article

Semantic segmentation of urban land classes using a multi-scale dataset

Pages 653-675 | Received 28 Jul 2023, Accepted 27 Dec 2023, Published online: 30 Jan 2024

ABSTRACT

The use of remote sensing imagery for land cover and land use classification has advanced significantly in recent years. However, it remains challenging to strengthen the semantic representation of high-resolution networks while handling imbalanced land categories and fusing multi-scale data without degrading segmentation accuracy. To tackle this challenge, this paper presents a novel method for classifying high-resolution remote sensing images based on a deep neural network that performs semantic segmentation of urban construction lands into five categories: vegetation, water, buildings, roads, and bare soil. The network combines a U-shaped high-resolution neural network with the advanced high-resolution network (HRNet) framework. Feature maps of different resolutions are maintained in parallel, enabling the exchange of information between them. A data pre-processing module addresses the class imbalance in the semantic segmentation of urban construction lands, increasing the Intersection over Union (IoU) for the different land types by 3.75%-12.01%. Additionally, a target context representation module is introduced to enhance the feature representation of pixels by modelling the relationship between pixels and multiple target regions. Moreover, a polarization attention mechanism is proposed to extract the characteristics of geographical objects in all directions and achieve a stronger semantic representation. This method provides a novel approach to accurately and effectively extracting information on construction lands and advances the development of monitoring algorithms for urban construction lands. To validate the proposed U-HRNet-OCR+PSA network, a comparative analysis was conducted against six classical networks (DeepLabv3+, PSPNet, U-Net, U-Net++, HRNet, and HRNet-OCR) as well as the more recent ViT-Adapter-L, OneFormer, and InternImage-H.
The experiments demonstrate that the U-HRNet-OCR+PSA network achieves higher accuracy than the aforementioned networks. Specifically, the IoU values for buildings, roads, vegetation, bare soil, and water on the multi-scale dataset are 89.79%, 90.05%, 94.89%, 85.91%, and 88.36%, respectively.
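The abstract reports accuracy as per-class Intersection over Union (IoU). As background for readers unfamiliar with the metric, the following is a minimal, illustrative sketch of how per-class IoU is computed from predicted and ground-truth label maps; it is not the authors' code, and the class indices and toy labels are assumptions for the example.

```python
import numpy as np

def per_class_iou(pred, gt, num_classes):
    """Compute IoU for each class from integer label maps of equal shape.

    IoU(c) = |pred==c AND gt==c| / |pred==c OR gt==c|
    Returns NaN for classes absent from both maps.
    """
    ious = []
    for c in range(num_classes):
        pred_c = (pred == c)
        gt_c = (gt == c)
        intersection = np.logical_and(pred_c, gt_c).sum()
        union = np.logical_or(pred_c, gt_c).sum()
        ious.append(intersection / union if union > 0 else float("nan"))
    return ious

# Toy 2x3 label maps with two hypothetical classes (0: vegetation, 1: buildings)
pred = np.array([[0, 0, 1],
                 [1, 1, 1]])
gt   = np.array([[0, 0, 0],
                 [1, 1, 1]])
print(per_class_iou(pred, gt, 2))  # class 0: 2/3, class 1: 3/4
```

In the paper's setting the same computation would run over five classes across the full multi-scale test set, accumulating intersections and unions before dividing.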

Acknowledgements

We would like to thank the Advanced Analysis and Testing Centre of Nanjing Forestry University, China for their assistance with data collection and technical support.

Disclosure statement

No potential conflict of interest was reported by the author(s).

Additional information

Funding

This work was supported by the National Key Research and Development Program of China under Grant 2022YFD2201005-03 and by the Institute of Resource Information, Chinese Academy of Forestry.
