Research Article

SwinHCST: a deep learning network architecture for scene classification of remote sensing images based on improved CNN and Transformer

Pages 7439-7463 | Received 06 Jul 2023, Accepted 08 Nov 2023, Published online: 05 Dec 2023

References

  • Aleissaee, A. Amer, A. Kumar, R. Muhammad Anwer, S. Khan, H. Cholakkal, G. Xia, and F. Shahbaz Khan. 2023. “Transformers in Remote Sensing: A Survey.” Remote Sensing 15 (7): 1860. https://doi.org/10.3390/rs15071860.
  • Bazi, Y., L. Bashmal, M. M. Rahhal, R. Al Dayil, and N. Al Ajlan. 2021. “Vision Transformers for Remote Sensing Image Classification.” Remote Sensing 13 (3): 516. https://doi.org/10.3390/rs13030516.
  • Bi, H. X., F. Xu, Z. Q. Wei, Y. Xue, and Z. B. Xu. 2019. “An Active Deep Learning Approach for Minimally Supervised PolSar Image Classification.” IEEE Transactions on Geoscience & Remote Sensing 57 (11): 9378–9395. https://doi.org/10.1109/Tgrs.2019.2926434.
  • Cheng, G., J. W. Han, and X. Q. Lu. 2017. “Remote Sensing Image Scene Classification: Benchmark and State of the Art.” Proceedings of the IEEE 105 (10): 1865–1883. https://doi.org/10.1109/Jproc.2017.2675998.
  • Cheng, G., X. X. Xie, J. W. Han, L. Guo, and G. S. Xia. 2020. “Remote Sensing Image Scene Classification Meets Deep Learning: Challenges, Methods, Benchmarks, and Opportunities.” IEEE Journal of Selected Topics in Applied Earth Observations & Remote Sensing 13:3735–3756. https://doi.org/10.1109/Jstars.2020.3005403.
  • Dalal, N., and B. Triggs. 2005. “Histograms of Oriented Gradients for Human Detection.” 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR’05) 1 (1): 886–893.
  • Dosovitskiy, A., L. Beyer, A. Kolesnikov, D. Weissenborn, X. Zhai, T. Unterthiner, and M. Dehghani. 2020. “An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale.” ArXiv abs/2010.11929. http://doi.org/10.48550/arXiv.2010.11929.
  • Gamba, P. 2013. “Human Settlements: A Global Challenge for EO Data Processing and Interpretation.” Proceedings of the IEEE 101 (3): 570–581. https://doi.org/10.1109/Jproc.2012.2189089.
  • Ghimire, B., J. Rogan, V. R. Galiano, P. Panday, and N. Neeti. 2012. “An Evaluation of Bagging, Boosting, and Random Forests for Land-Cover Classification in Cape Cod, Massachusetts, USA.” GIScience & Remote Sensing 49 (5): 623–643. https://doi.org/10.2747/1548-1603.49.5.623.
  • Hao, S. Y., N. Li, and Y. X. Ye. 2023. “Inductive Biased Swin-Transformer with Cyclic Regressor for Remote Sensing Scene Classification.” IEEE Journal of Selected Topics in Applied Earth Observations & Remote Sensing 16:6265–6278. https://doi.org/10.1109/Jstars.2023.3290676.
  • He, N. J., L. Y. Fang, S. T. Li, J. Plaza, and A. Plaza. 2020. “Skip-Connected Covariance Network for Remote Sensing Scene Classification.” IEEE Transactions on Neural Networks and Learning Systems 31 (5): 1461–1474. https://doi.org/10.1109/Tnnls.2019.2920374.
  • He, K., X. Zhang, S. Ren, and J. Sun. 2015. “Deep Residual Learning for Image Recognition.” 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 770–778.
  • He, X., Y. Zhou, J. Zhao, D. Zhang, R. Yao, and Y. Xue. 2022. “Swin Transformer Embedding UNet for Remote Sensing Image Semantic Segmentation.” IEEE Trans. Geosci. Remote Sensing 60:1–15. https://doi.org/10.1109/TGRS.2022.3144165.
  • Huang, L., C. Chen, W. Li, and Q. Du. 2016. “Remote Sensing Image Scene Classification Using Multi-Scale Completed Local Binary Patterns and Fisher Vectors.” Remote Sensing 8 (6): 483. https://doi.org/10.3390/rs8060483.
  • Huang, G., Z. Liu, and K. Q. Weinberger. 2017. “Densely Connected Convolutional Networks.” 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA, 2261–2269.
  • Hu, J., L. Shen, S. Albanie, G. Sun, and E. Wu. 2017. “Squeeze-And-Excitation Networks.” IEEE Transactions on Pattern Analysis & Machine Intelligence 42 (8): 2011–2023.
  • Kingma, D. P., and J. Ba. 2014. “Adam: A Method for Stochastic Optimization.” CoRR abs/1412.6980. https://api.semanticscholar.org/CorpusID:6628106.
  • Krizhevsky, A., I. Sutskever, and G. E. Hinton. 2012. “ImageNet Classification with Deep Convolutional Neural Networks.” Communications of the ACM 60 (6): 84–90.
  • Li, H. T., H. Y. Gu, Y. S. Han, and J. H. Yang. 2010. “Object-Oriented Classification of High-Resolution Remote Sensing Imagery Based on an Improved Colour Structure Code and a Support Vector Machine.” International Journal of Remote Sensing 31 (6): 1453–1470. https://doi.org/10.1080/01431160903475266.
  • Liu, S., D. Huang, and Y. Wang. 2018. “Receptive Field Block Net for Accurate and Fast Object Detection.” Paper presented at the European Conference on Computer Vision, Munich, Germany.
  • Liu, Z., Y. Lin, Y. Cao, H. Han, Y. Wei, Z. Zhang, S. Lin, and B. Guo. 2021. “Swin Transformer: Hierarchical Vision Transformer Using Shifted Windows.” 2021 IEEE/CVF International Conference on Computer Vision (ICCV), Montreal, Canada, 9992–10002.
  • Li, D., M. Wang, Z. Dong, X. Shen, and L. Shi. 2017. “Earth Observation Brain (EOB): An Intelligent Earth Observation System.” Geo-Spatial Information Science 20 (2): 134–140.
  • Lowe, D. G. 2004. “Distinctive Image Features from Scale-Invariant Keypoints.” International Journal of Computer Vision 60 (2): 91–110. https://doi.org/10.1023/B:Visi.0000029664.99615.94.
  • Lu, T., L. Wan, S. Qi, and M. Gao. 2023. “Land Cover Classification of UAV Remote Sensing Based on Transformer–CNN Hybrid Architecture.” Sensors 23 (11): 5288. https://doi.org/10.3390/s23115288.
  • Neumann, M., A. Susano Pinto, X. Zhai, and N. Houlsby. 2019. “In-Domain Representation Learning for Remote Sensing.” ArXiv abs/1911.06721. https://api.semanticscholar.org/CorpusID:208076837.
  • Oliva, A., and A. Torralba. 2001. “Modeling the Shape of the Scene: A Holistic Representation of the Spatial Envelope.” International Journal of Computer Vision 42 (3): 145–175. https://doi.org/10.1023/A:1011139631724.
  • Qiao, S., H. Wang, C. Liu, W. Shen, and A. Loddon Yuille. 2019. “Weight Standardization.” ArXiv abs/1903.10520. https://api.semanticscholar.org/CorpusID:85517954.
  • Robbins, H., and S. Monro. 1951. “A Stochastic Approximation Method.” Ann. Math. Statist. 22 (3): 400–407. https://doi.org/10.1214/aoms/1177729586.
  • Sarker, I. H. 2021. “Deep Learning: A Comprehensive Overview on Techniques, Taxonomy, Applications and Research Directions.” SN Computer Science 2 (6): 420.
  • Selvaraju, R. R., M. Cogswell, A. Das, R. Vedantam, D. Parikh, and D. Batra. 2020. “Grad-CAM: Visual Explanations from Deep Networks via Gradient-Based Localization.” International Journal of Computer Vision 128 (2): 336–359. https://doi.org/10.1007/s11263-019-01228-7.
  • Simonyan, K., and A. Zisserman. 2014. “Very Deep Convolutional Networks for Large-Scale Image Recognition.” CoRR abs/1409.1556.
  • Szegedy, C., W. Liu, Y. Jia, P. Sermanet, S. E. Reed, D. Anguelov, D. Erhan, V. Vanhoucke, and A. Rabinovich. 2015. “Going Deeper with Convolutions.” 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Boston, MA, USA, 1–9.
  • Tarabalka, Y., M. Fauvel, J. Chanussot, and J. A. Benediktsson. 2010. “SVM- and MRF-Based Method for Accurate Classification of Hyperspectral Images.” IEEE Geoscience & Remote Sensing Letters 7 (4): 736–740. https://doi.org/10.1109/Lgrs.2010.2047711.
  • Vaswani, A., N. M. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, L. Kaiser, and I. Polosukhin. 2017. “Attention Is All You Need.” Paper presented at the NIPS.
  • Wang, G., H. Chen, L. Chen, Y. Zhuang, S. Zhang, T. Zhang, H. Dong, and P. Gao. 2023. “P2FEViT: Plug-And-Play CNN Feature Embedded Hybrid Vision Transformer for Remote Sensing Image Classification.” Remote Sensing 15 (7): 1773.
  • Wang, F. L., J. Ji, and Y. Wang. 2023. “DSViT: Dynamically Scalable Vision Transformer for Remote Sensing Image Segmentation and Classification.” IEEE Journal of Selected Topics in Applied Earth Observations & Remote Sensing 16:5441–5452. https://doi.org/10.1109/Jstars.2023.3285259.
  • Wang, G. Q., N. Zhang, W. C. Liu, H. Chen, and Y. Z. Xie. 2022. “MFST: A Multi-Level Fusion Network for Remote Sensing Scene Classification.” IEEE Geoscience & Remote Sensing Letters 19:1–5. https://doi.org/10.1109/LGRS.2022.3205417.
  • Xia, G. S., J. W. Hu, F. Hu, B. G. Shi, X. Bai, Y. F. Zhong, L. P. Zhang, and X. Q. Lu. 2017. “AID: A Benchmark Data Set for Performance Evaluation of Aerial Scene Classification.” IEEE Transactions on Geoscience & Remote Sensing 55 (7): 3965–3981. https://doi.org/10.1109/Tgrs.2017.2685945.
  • Xie, X., P. Zhou, L. Huan, Z. Lin, and S. Yan. 2022. “Adan: Adaptive Nesterov Momentum Algorithm for Faster Optimizing Deep Models.” ArXiv abs/2208.06677.
  • Xu, Z., W. Zhang, T. Zhang, Z. Yang, and J. Li. 2021. “Efficient Transformer for Remote Sensing Image Segmentation.” Remote Sensing 13 (18): 3585. http://doi.org/10.3390/rs13183585.
  • Yang, Y., and S. Newsam. 2010. “Bag-Of-Visual-Words and Spatial Extensions for Land-Use Classification.” Paper presented at the ACM SIGSPATIAL International Workshop on Advances in Geographic Information Systems, New York, NY, USA.
  • Yan, H. P., E. R. Zhang, J. Wang, C. C. Leng, A. Basu, and J. Y. Peng. 2023. “Hybrid Conv-ViT Network for Hyperspectral Image Classification.” IEEE Geoscience & Remote Sensing Letters 20:1–5. https://doi.org/10.1109/Lgrs.2023.3287277.
  • Zhang, T., Z. Wang, P. Cheng, G. Xu, and X. Sun. 2023. “DCNNet: A Distributed Convolutional Neural Network for Remote Sensing Image Classification.” IEEE Trans. Geosci. Remote Sensing 61:1–18. https://doi.org/10.1109/TGRS.2023.3243238.
