Research Article

CMPF-UNet: a ConvNeXt multi-scale pyramid fusion U-shaped network for multi-category segmentation of remote sensing images

Article: 2311217 | Received 11 Oct 2023, Accepted 23 Jan 2024, Published online: 14 Feb 2024
