Research article

Scale-wised feature enhancement network for change captioning of remote sensing images

Pages 5845-5869 | Received 05 Feb 2024, Accepted 27 Jun 2024, Published online: 31 Jul 2024
 

ABSTRACT

Remote Sensing Image Change Captioning (RSICC) has recently emerged in the field of remote sensing image interpretation; it aims to automatically generate natural language captions describing significant semantic changes in bi-temporal remote sensing images. Recent RSICC studies have substantially improved the accuracy of change captions. Nevertheless, challenges remain in the multi-scale perception of ground objects and in the feature enhancement of bi-temporal remote sensing images. To address these challenges and further improve the accuracy of RSICC, this paper proposes a novel deep learning–based end-to-end scale-wised feature enhancement network (SFEN). SFEN integrates four efficient blocks: 1) the siamese backbone network (SBN), which extracts initial features of bi-temporal remote sensing images; 2) the siamese receptive field fusion (SRFF) block, which explicitly captures multi-scale semantic information of ground objects in bi-temporal feature maps; 3) the siamese global feature enhancement (SGFE) block, which adaptively enhances key information and filters redundant features of bi-temporal feature maps in both the channel and spatial dimensions; and 4) the change caption decoder (CCD), which maps bi-temporal feature maps into natural language. SFEN aims to precisely capture significant semantic information of ground objects in bi-temporal remote sensing images and predict accurate change captions. Experimental results on the LEVIR-CC dataset demonstrate that SFEN outperforms the recent state-of-the-art (SOTA) approach in RSICC by 5.2% on CIDEr-D and achieves a new SOTA.
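
The abstract names the four blocks but gives no layer-level detail, so the following is only a toy sketch of the data flow it describes, not the authors' implementation. Every operation is a simplified stand-in chosen for illustration: a shared 1x1 projection for the SBN, averaging over several window sizes for the SRFF block, and sigmoid gates over channels and positions for the SGFE block. The function names, the window sizes, and the final concatenation with a difference map are all assumptions, and the CCD itself is not modelled; `change_features` merely returns the kind of bi-temporal tensor a caption decoder might consume.

```python
import numpy as np

def backbone(img, weights):
    """SBN stand-in (assumed): shared 1x1 projection followed by ReLU."""
    # img: (C_in, H, W), weights: (C_out, C_in)
    return np.maximum(np.einsum("oc,chw->ohw", weights, img), 0.0)

def avg_pool_same(x, k):
    """Mean filter with odd window k and edge padding; output keeps the input size."""
    pad = k // 2
    xp = np.pad(x, ((0, 0), (pad, pad), (pad, pad)), mode="edge")
    out = np.zeros_like(x)
    for dy in range(k):
        for dx in range(k):
            out += xp[:, dy:dy + x.shape[1], dx:dx + x.shape[2]]
    return out / (k * k)

def receptive_field_fusion(x, sizes=(1, 3, 5)):
    """SRFF stand-in (assumed): average the responses of several window sizes."""
    return sum(avg_pool_same(x, k) for k in sizes) / len(sizes)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def global_enhancement(x):
    """SGFE stand-in (assumed): gate channels first, then spatial positions."""
    channel_gate = sigmoid(x.mean(axis=(1, 2), keepdims=True))  # shape (C, 1, 1)
    x = x * channel_gate
    spatial_gate = sigmoid(x.mean(axis=0, keepdims=True))       # shape (1, H, W)
    return x * spatial_gate

def encode(img, weights):
    """One siamese branch: backbone, then multi-scale fusion, then enhancement."""
    return global_enhancement(receptive_field_fusion(backbone(img, weights)))

def change_features(before, after, weights):
    """Encode both dates with the same weights and stack them with their difference.

    The concatenation is a hypothetical decoder input, not taken from the paper.
    """
    f1 = encode(before, weights)
    f2 = encode(after, weights)
    return np.concatenate([f1, f2, f2 - f1], axis=0)

# Usage: two synthetic 3-band 16x16 "images" that differ in one patch.
rng = np.random.default_rng(0)
weights = rng.normal(size=(8, 3))
before = rng.random((3, 16, 16))
after = before.copy()
after[:, 4:8, 4:8] += 1.0
feats = change_features(before, after, weights)
print(feats.shape)  # (24, 16, 16)
```

Because both dates pass through the same `encode` function with the same weights, identical inputs produce an all-zero difference map, which is the property a siamese design relies on to isolate real changes.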

Disclosure statement

No potential conflict of interest was reported by the author(s).
