Separation of Machine-Printed and Handwritten Texts in Noisy Documents using Wavelet Transform
