International Journal of Remote Sensing Volume 44, 2023 - Issue 24

Research article

Multi-scale attention fusion network for semantic segmentation of remote sensing images

Zhiqiang Wen, School of Computer and Information Engineering, Central South University of Forestry and Technology, Changsha, China
https://orcid.org/0009-0004-1234-4289

Hongxu Huang, School of Computer and Information Engineering, Central South University of Forestry and Technology, Changsha, China

Shuai Liu, School of Computer and Information Engineering, Central South University of Forestry and Technology, Changsha, China. Correspondence: [email protected]

Pages 7909-7926 | Received 05 Sep 2023, Accepted 17 Nov 2023, Published online: 12 Dec 2023

https://doi.org/10.1080/01431161.2023.2290999

References

Badrinarayanan, V., A. Kendall, and R. Cipolla. 2017. “SegNet: A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation.” IEEE Transactions on Pattern Analysis and Machine Intelligence 39 (12): 2481–2495. https://doi.org/10.1109/TPAMI.2016.2644615.
Breiman, L. 2001. “Random Forests.” Machine Learning 45 (1): 5–32. https://doi.org/10.1023/A:1010933404324.
Chen, L.-C., G. Papandreou, I. Kokkinos, K. Murphy, and A. L. Yuille. 2014. “Semantic Image Segmentation with Deep Convolutional Nets and Fully Connected CRFs.” arXiv preprint arXiv:1412.7062. https://doi.org/10.1109/CVPR.2014.264.
Child, R., S. Gray, A. Radford, and I. Sutskever. 2019. “Generating Long Sequences with Sparse Transformers.” arXiv preprint arXiv:1904.10509. https://doi.org/10.48550/arXiv.1904.10509.
Comaniciu, D., V. Ramesh, and P. Meer. 2000. “Real-Time Tracking of Non-Rigid Objects Using Mean Shift.” In Proceedings IEEE Conference on Computer Vision and Pattern Recognition, Hilton Head, South Carolina, 142–149. IEEE.
Cortes, C., and V. Vapnik. 1995. “Support-Vector Networks.” Machine Learning 20 (3): 273–297. https://doi.org/10.1007/BF00994018.
Ding, X., Y. Guo, G. Ding, and J. Han. 2019. “ACNet: Strengthening the Kernel Skeletons for Powerful CNN via Asymmetric Convolution Blocks.” In Proceedings of the IEEE/CVF International Conference on Computer Vision, Seoul, Korea (South), 1911–1920.
Dong, S., and Z. Chen. 2021. “Block Multi-Dimensional Attention for Road Segmentation in Remote Sensing Imagery.” IEEE Geoscience and Remote Sensing Letters 19:1–5. https://doi.org/10.1109/LGRS.2021.3137551.
Fu, Y., C. Zhao, J. Wang, X. Jia, G. Yang, X. Song, and H. Feng. 2017. “An Improved Combination of Spectral and Spatial Features for Vegetation Classification in Hyperspectral Images.” Remote Sensing 9 (3): 261. https://doi.org/10.3390/rs9030261.
Guo, J., K. Han, H. Wu, Y. Tang, X. Chen, Y. Wang, and C. Xu. 2022. “CMT: Convolutional Neural Networks Meet Vision Transformers.” In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA, 12175–12185.
He, K., X. Zhang, S. Ren, and J. Sun. 2015. “Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition.” IEEE Transactions on Pattern Analysis and Machine Intelligence 37 (9): 1904–1916. https://doi.org/10.1109/TPAMI.2015.2389824.
He, X., Y. Zhou, J. Zhao, M. Zhang, R. Yao, B. Liu, and L. Haichao. 2021. “Semantic Segmentation of Remote-Sensing Images Based on Multiscale Feature Fusion and Attention Refinement.” IEEE Geoscience and Remote Sensing Letters 19:1–5. https://doi.org/10.1109/LGRS.2021.3052557.
Hou, Q., D. Zhou, and J. Feng. 2021. “Coordinate Attention for Efficient Mobile Network Design.” In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Nashville, TN, USA, 13713–13722.
Hu, J., L. Shen, and G. Sun. 2018. “Squeeze-and-Excitation Networks.” In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 7132–7141.
Kitaev, N., Ł. Kaiser, and A. Levskaya. 2020. “Reformer: The Efficient Transformer.” In International Conference on Learning Representations.
Li, W., Q. Fang, M. Tang, and Y. Zhengtao. 2020. “Bidirectional LSTM with Self-Attention Mechanism and Multi-Channel Features for Sentiment Classification.” Neurocomputing 387:63–77. https://doi.org/10.1016/j.neucom.2020.01.006.
Long, J., E. Shelhamer, and T. Darrell. 2015. “Fully Convolutional Networks for Semantic Segmentation.” In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA, 3431–3440.
Modava, M., G. Akbarizadeh, and M. Soroosh. 2018. “Integration of Spectral Histogram and Level Set for Coastline Detection in SAR Images.” IEEE Transactions on Aerospace and Electronic Systems 55 (2): 810–819. https://doi.org/10.1109/TAES.2018.2865120.
Modava, M., G. Akbarizadeh, and M. Soroosh. 2019. “Hierarchical Coastline Detection in SAR Images Based on Spectral-Textural Features and Global–Local Information.” IET Radar, Sonar & Navigation 13 (12): 2183–2195. https://doi.org/10.1049/iet-rsn.2019.0063.
Qin, Z., W. Sun, H. Deng, D. Li, Y. Wei, B. Lv, J. Yan, L. Kong, and Y. Zhong. 2022. “cosFormer: Rethinking Softmax in Attention.” In International Conference on Learning Representations.
Radford, A., J. Wu, R. Child, D. Luan, D. Amodei, and I. Sutskever. 2019. “Language Models Are Unsupervised Multitask Learners.” OpenAI Blog 1 (8): 9.
Radford, A., K. Narasimhan, T. Salimans, and I. Sutskever. 2018. “Improving Language Understanding by Generative Pre-Training.”
Ronneberger, O., P. Fischer, and T. Brox. 2015. “U-Net: Convolutional Networks for Biomedical Image Segmentation.” In Medical Image Computing and Computer-Assisted Intervention, Munich, Germany, 234–241. Springer.
Sandler, M., A. Howard, M. Zhu, A. Zhmoginov, and L.-C. Chen. 2018. “MobileNetV2: Inverted Residuals and Linear Bottlenecks.” In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 4510–4520.
Sarker, I. H. 2022. “AI-Based Modeling: Techniques, Applications and Research Issues Towards Automation, Intelligent and Smart Systems.” SN Computer Science 3 (2): 158. https://doi.org/10.1007/s42979-022-01043-x.
Shen, Z., M. Zhang, H. Zhao, S. Yi, and H. Li. 2021. “Efficient Attention: Attention with Linear Complexities.” In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 3531–3539.
Vedaldi, A., and S. Soatto. 2008. “Quick Shift and Kernel Methods for Mode Seeking.” In European Conference on Computer Vision, Marseille, France, 705–718. Springer.
Wang, S., B. Z. Li, M. Khabsa, H. Fang, and H. Ma. 2020. “Linformer: Self-Attention with Linear Complexity.” arXiv preprint arXiv:2006.04768. https://doi.org/10.48550/arXiv.2006.04768.
Wang, J., W. Li, Y. Wang, R. Tao, and Q. Du. 2023. “Representation-Enhanced Status Replay Network for Multisource Remote-Sensing Image Classification.” IEEE Transactions on Geoscience and Remote Sensing 61:1–12. https://doi.org/10.1109/TGRS.2023.3295797.
Wang, J., W. Li, M. Zhang, R. Tao, and J. Chanussot. 2023. “Remote Sensing Scene Classification via Multi-Stage Self-Guided Separation Network.” IEEE Transactions on Geoscience and Remote Sensing.
Wang, Z., M. Xia, M. Lu, L. Pan, and J. Liu. 2021. “Parameter Identification in Power Transmission Systems Based on Graph Convolution Network.” IEEE Transactions on Power Delivery 37 (4): 3155–3163. https://doi.org/10.1109/TPWRD.2021.3124528.
Wang, W., E. Xie, X. Li, D.-P. Fan, K. Song, D. Liang, T. Lu, P. Luo, and L. Shao. 2022. “PVT v2: Improved Baselines with Pyramid Vision Transformer.” Computational Visual Media 8 (3): 415–424. https://doi.org/10.1007/s41095-022-0274-8.
Werbos, P. 1974. “Beyond Regression: New Tools for Prediction and Analysis in the Behavioral Sciences.” PhD thesis, Committee on Applied Mathematics, Harvard University, Cambridge, MA.
Woo, S., J. Park, J.-Y. Lee, and I. S. Kweon. 2018. “CBAM: Convolutional Block Attention Module.” In Proceedings of the European Conference on Computer Vision, Munich, Germany, 3–19.
Yang, Y., J. Dong, Y. Wang, Y. Bibo, and Z. Yang. 2023. “DMAU-Net: An Attention-Based Multiscale Max-Pooling Dense Network for the Semantic Segmentation in VHR Remote-Sensing Images.” Remote Sensing 15 (5): 1328. https://doi.org/10.3390/rs15051328.
Yan, M. A., and G. Kezierbieke. 2023. “The Research Review of Image Semantic Segmentation Method in High-Resolution Remote Sensing Image Interpretation.” Journal of Frontiers of Computer Science and Technology 17 (7): 1526–1548. https://doi.org/10.3778/j.issn.1673-9418.2211015.
Yao, X., J. Han, G. Cheng, X. Qian, and L. Guo. 2016. “Semantic Annotation of High-Resolution Satellite Images via Weakly Supervised Learning.” IEEE Transactions on Geoscience and Remote Sensing 54 (6): 3660–3671. https://doi.org/10.1109/TGRS.2016.2523563.
Yu, F., and V. Koltun. 2016. “Multi-Scale Context Aggregation by Dilated Convolutions.” In Proceedings of the International Conference on Learning Representations, San Juan, Puerto Rico.
Zhang, H., K. Dana, J. Shi, Z. Zhang, X. Wang, A. Tyagi, and A. Agrawal. 2018. “Context Encoding for Semantic Segmentation.” In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 7151–7160.
Zhang, Y., X. Sun, J. Dong, C. Chen, and Q. Lv. 2021. “GPNet: Gated Pyramid Network for Semantic Segmentation.” Pattern Recognition 115:107940. https://doi.org/10.1016/j.patcog.2021.107940.
Zhang, M., W. Li, Y. Zhang, R. Tao, and Q. Du. 2022. “Hyperspectral and LiDAR Data Classification Based on Structural Optimization Transmission.” IEEE Transactions on Cybernetics 53 (5): 3153–3164. https://doi.org/10.1109/TCYB.2022.3169773.
Zhang, Q., and Y.-B. Yang. 2021. “ResT: An Efficient Transformer for Visual Recognition.” Advances in Neural Information Processing Systems, 15475–15485.
