Research Articles

A multiscale feature fusion method for cursive text detection in natural scene images

Pages 302-318 | Received 07 May 2022, Accepted 08 Dec 2022, Published online: 06 Jan 2023

References

  • Liu X, Meng G, Pan C. Scene text detection and recognition with advances in deep learning: a survey. Int J Doc Anal Recognit (IJDAR). 2019;22(2):143–162. DOI:10.1007/s10032-019-00320-5.
  • Long S, He X, Yao C. Scene text detection and recognition: the deep learning era. Int J Comput Vis. 2021;129(1):161–184. DOI:10.1007/s11263-020-01369-0.
  • He W, Zhang XY, Yin F, et al. Realtime multi-scale scene text detection with scale-based region proposal network. Pattern Recognit. 2020;98:107026. DOI:10.1016/j.patcog.2019.107026.
  • Lin H, Yang P, Zhang F. Review of scene text detection and recognition. Arch Comput Methods Eng. 2020;27(2):433–454. DOI:10.1007/s11831-019-09315-1.
  • Liu Z, Zhou W, Li H. Scene text detection with fully convolutional neural networks. Multimed Tools Appl. 2019;78(13):18205–18227. DOI:10.1007/s11042-019-7177-4.
  • Ma J, Shao W, Ye H, et al. Arbitrary-oriented scene text detection via rotation proposals. IEEE Trans Multimedia. 2018;20(11):3111–3122. DOI:10.1109/TMM.2018.2818020.
  • Yang P, Zhang F, Yang G. A fast scene text detector using knowledge distillation. IEEE Access. 2019;7:22588–22598. DOI:10.1109/ACCESS.2019.2895330.
  • Deng J, Dong W, Socher R, et al. ImageNet: a large-scale hierarchical image database. In: 2009 IEEE Conference on Computer Vision and Pattern Recognition; Miami, FL, USA. IEEE; 2009. p. 248–255. DOI:10.1109/CVPR.2009.5206848
  • Liu F, Chen C, Gu D, et al. FTPN: scene text detection with feature pyramid based text proposal network. IEEE Access. 2019;7:44219–44228. DOI:10.1109/ACCESS.2019.2908933.
  • Xu Y, Wang Y, Zhou W, et al. Textfield: learning a deep direction field for irregular scene text detection. IEEE Trans Image Process. 2019;28(11):5566–5579. DOI:10.1109/TIP.2019.2900589.
  • Shi B, Bai X, Belongie S. Detecting oriented text in natural images by linking segments. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition; Honolulu, Hawaii, USA; 2017. p. 2550–2558. DOI:10.1109/CVPR.2017.371.
  • Nayef N, Yin F, Bizid I, et al. ICDAR2017 robust reading challenge on multi-lingual scene text detection and script identification-RRC-MLT. In: 2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR). Vol. 1; Kyoto, Japan. IEEE; 2017. p. 1454–1459. DOI:10.1109/ICDAR.2017.237
  • Nayef N, Patel Y, Busta M, et al. ICDAR2019 robust reading challenge on multi-lingual scene text detection and recognition-RRC-MLT-2019. In: 2019 International Conference on Document Analysis and Recognition (ICDAR); University of Technology, Sydney, Australia. IEEE; 2019. p. 1582–1587. DOI:10.1109/ICDAR.2019.00254
  • Chandio AA, Pickering M. Convolutional feature fusion for multi-language text detection in natural scene images. In: 2019 2nd International Conference on Computing, Mathematics and Engineering Technologies (iCoMET); IBA, Sukkur, Pakistan. IEEE; 2019. p. 1–6. DOI:10.1109/ICOMET.2019.8673517
  • Gaddour H, Kanoun S, Vincent N. A new method for Arabic text detection in natural scene image based on the color homogeneity. In: International Conference on Image and Signal Processing; University of Québec at Trois-Rivières, Canada. Springer; 2016. p. 127–136. DOI:10.1007/978-3-319-33618-3_14
  • Dong X, Yan Y, Tan M, et al. Late fusion via subspace search with consistency preservation. IEEE Trans Image Process. 2018;28(1):518–528. DOI:10.1109/TIP.2018.2867747.
  • Xu Z, Cao Y, Kang Y. Deep spatiotemporal residual early-late fusion network for city region vehicle emission pollution prediction. Neurocomputing. 2019;355:183–199. DOI:10.1016/j.neucom.2019.04.040.
  • Zheng X, Yuan Y, Lu X. A deep scene representation for aerial scene classification. IEEE Trans Geosci Remote Sens. 2019;57(7):4799–4809. DOI:10.1109/TGRS.2019.2893115.
  • Yang X, Yang J, Yan J, et al. SCRDet: towards more robust detection for small, cluttered and rotated objects. In: Proceedings of the IEEE International Conference on Computer Vision; Seoul, South Korea; 2019. p. 8232–8241. DOI:10.1109/ICCV.2019.00832.
  • Hu J, Chen Z, Yang M, et al. A multiscale fusion convolutional neural network for plant leaf recognition. IEEE Signal Process Lett. 2018;25(6):853–857. DOI:10.1109/LSP.2018.2809688.
  • Zhang Y, Liu Y, Sun P, et al. IFCNN: a general image fusion framework based on convolutional neural network. Inf Fusion. 2020;54:99–118. DOI:10.1016/j.inffus.2019.07.011.
  • Li Z, Huang L, He J. A multiscale deep middle-level feature fusion network for hyperspectral classification. Remote Sens. 2019;11(6):695. DOI:10.3390/rs11060695.
  • Xu K, Huang H, Li Y, et al. Multilayer feature fusion network for scene classification in remote sensing. IEEE Geosci Remote Sens Lett. 2020;17(11):1894–1898. DOI:10.1109/LGRS.2019.2960026.
  • Simonyan K, Zisserman A. Very deep convolutional networks for large-scale image recognition. 2014. Available from: arXiv:1409.1556.
  • Fang Y, Gao S, Li J, et al. Multi-level feature fusion based locality-constrained spatial transformer network for video crowd counting. Neurocomputing. 2020;392:98–107. DOI:10.1016/j.neucom.2020.01.087.
  • Qin J, Huang Y, Wen W. Multi-scale feature fusion residual network for single image super-resolution. Neurocomputing. 2020;379:334–342. DOI:10.1016/j.neucom.2019.10.076.
  • Song Y, Cui Y, Han H, et al. Scene text detection via deep semantic feature fusion and attention-based refinement. In: 2018 24th International Conference on Pattern Recognition (ICPR); Beijing, China. IEEE; 2018. p. 3747–3752. DOI:10.1109/ICPR.2018.8546050
  • Li R, Wang L, Zhang C, et al. A2-FPN for semantic segmentation of fine-resolution remotely sensed images. Int J Remote Sens. 2022;43(3):1131–1155. DOI:10.1080/01431161.2022.2030071.
  • He K, Zhang X, Ren S, et al. Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition; Las Vegas, NV, USA; 2016. p. 770–778. DOI:10.1109/CVPR.2016.90.
  • Girshick R. Fast R-CNN. In: Proceedings of the IEEE International Conference on Computer Vision; Santiago, Chile; 2015. p. 1440–1448. DOI:10.1109/ICCV.2015.169.
  • Gupta A, Vedaldi A, Zisserman A. Synthetic data for text localisation in natural images. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition; Las Vegas, Nevada, USA; 2016. p. 2315–2324. DOI:10.1109/CVPR.2016.254.
  • Feng W, He W, Yin F, et al. TextDragon: an end-to-end framework for arbitrary shaped text spotting. In: Proceedings of the IEEE International Conference on Computer Vision; Seoul, South Korea; 2019. p. 9076–9085. DOI:10.1109/ICCV.2019.00917.
  • Liu S, Tian G, Xu Y. A novel scene classification model combining resnet based transfer learning and data augmentation with a filter. Neurocomputing. 2019;338:191–206. DOI:10.1016/j.neucom.2019.01.090.
  • Bazazian D, Gómez R, Nicolaou A, et al. FAST: facilitated and accurate scene text proposals through FCN guided pruning. Pattern Recognit Lett. 2019;119:112–120. DOI:10.1016/j.patrec.2017.08.030.
  • Karatzas D, Gomez-Bigorda L, Nicolaou A, et al. ICDAR 2015 competition on robust reading. In: 2015 13th International Conference on Document Analysis and Recognition (ICDAR); Tunis, Tunisia. IEEE; 2015. p. 1156–1160. DOI:10.1109/ICDAR.2015.7333942
  • Veit A, Matera T, Neumann L, et al. COCO-text: dataset and benchmark for text detection and recognition in natural images. 2016. Available from: arXiv:1601.07140.
  • Zhou X, Yao C, Wen H, et al. EAST: an efficient and accurate scene text detector. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition; Honolulu, HI, USA; 2017. p. 5551–5560. DOI:10.1109/CVPR.2017.283.
  • Kim KH, Hong S, Roh B, et al. PVANET: deep but lightweight neural networks for real-time object detection. 2016. Available from: arXiv:1608.08021.
  • Liu Y, Jin L, Zhang S, et al. Curved scene text detection via transverse and longitudinal sequence connection. Pattern Recognit. 2019;90:337–345. DOI:10.1016/j.patcog.2019.02.002.
  • Deng L, Gong Y, Lin Y, et al. Detecting multi-oriented text with corner-based region proposals. Neurocomputing. 2019;334:134–142. DOI:10.1016/j.neucom.2019.01.013.
  • Liu Y, Jin L, Fang C. Arbitrarily shaped scene text detection with a mask tightness text detector. IEEE Trans Image Process. 2019;29:2918–2930. DOI:10.1109/TIP.2019.2954218.
  • Zhang W, Yu J, Hu H, et al. Multimodal feature fusion by relational reasoning and attention for visual question answering. Inf Fusion. 2020;55:116–126. DOI:10.1016/j.inffus.2019.08.009.
  • Ren S, He K, Girshick R, et al. Faster R-CNN: towards real-time object detection with region proposal networks. Adv Neural Inf Process Syst. 2015;28:91–99.
  • Cheng MM, Liu Y, Lin WY, et al. BING: binarized normed gradients for objectness estimation at 300fps. Comput Vis Media. 2019;5(1):3–20. DOI:10.1007/s41095-018-0120-1.
  • Tian Z, Huang W, He T, et al. Detecting text in natural image with connectionist text proposal network. In: European Conference on Computer Vision. Springer; 2016. p. 56–72. DOI:10.1007/978-3-319-46484-8_4.
  • Cho K, Van Merriënboer B, Gulcehre C, et al. Learning phrase representations using RNN encoder-decoder for statistical machine translation. 2014. Available from: https://arxiv.org/abs/1406.1078.
  • He K, Zhang X, Ren S, et al. Delving deep into rectifiers: surpassing human-level performance on ImageNet classification. In: Proceedings of the IEEE International Conference on Computer Vision; Santiago, Chile; 2015. p. 1026–1034. DOI:10.1109/ICCV.2015.123.
  • Wolf C, Jolion JM. Object count/area graphs for the evaluation of object detection and segmentation algorithms. Int J Doc Anal Recognit (IJDAR). 2006;8(4):280–296. DOI:10.1007/s10032-006-0014-0.
  • Li Y, Yu Y, Li Z, et al. Pixel-anchor: a fast oriented scene text detector with combined networks. 2018. Available from: arXiv:1811.07432.
  • Hassan E. Scene text detection using attention with depthwise separable convolutions. Appl Sci. 2022;12:6425. DOI:10.3390/app12136425.
  • Huang Y, Sun Z, Jin L, et al. EPAN: effective parts attention network for scene text recognition. Neurocomputing. 2020;376:202–213. DOI:10.1016/j.neucom.2019.10.010.
  • Li Y, Silamu W, Wang Z, et al. Attention-based scene text detection on dual feature fusion. Sensors. 2022;22:9072. DOI:10.3390/s22239072.
  • Liu Y, Zhang S, Jin L, et al. Omnidirectional scene text detection with sequential-free box discretization. 2019. Available from: arXiv:1906.02371.
  • Liu X, Liang D, Yan S, et al. FOTS: fast oriented text spotting with a unified network. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition; Salt Lake City, UT, USA; 2018. p. 5676–5685. DOI:10.1109/CVPR.2018.00595.
  • Baek Y, Lee B, Han D, et al. Character region awareness for text detection. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition; Long Beach, CA, USA; 2019. p. 9365–9374. DOI:10.1109/CVPR.2019.00959.
  • Deng L, Gong Y, Lu X, et al. Stela: a real-time scene text detector with learned anchor. IEEE Access. 2019;7:153400–153407. DOI:10.1109/ACCESS.2019.2948405.
  • Liu Y, Jin L. Deep matching prior network: toward tighter multi-oriented text detection. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition; Honolulu, HI, USA; 2017. p. 1962–1969. DOI:10.1109/CVPR.2017.368.
