Research Article

Urban object detection algorithm based on feature enhancement and progressive dynamic aggregation strategy

Article: 2322061 | Received 30 Oct 2023, Accepted 16 Feb 2024, Published online: 15 Mar 2024
