Research article

Multi-scale attention fusion network for semantic segmentation of remote sensing images

Pages 7909-7926 | Received 05 Sep 2023, Accepted 17 Nov 2023, Published online: 12 Dec 2023

