Word Level Multi-Script Identification Using Curvelet Transform in Log-Polar Domain

Parul Sahare, Department of Electronics and Communication Engineering, Centre for VLSI & Nanotechnology, Visvesvaraya National Institute of Technology, Nagpur, India. Correspondence: [email protected]

ORCID: http://orcid.org/0000-0002-9342-1159

Ravindra E. Chaudhari, Department of Electronics and Communication Engineering, Centre for VLSI & Nanotechnology, Visvesvaraya National Institute of Technology, Nagpur, India

Sanjay B. Dhok, Department of Electronics and Communication Engineering, Centre for VLSI & Nanotechnology, Visvesvaraya National Institute of Technology, Nagpur, India

ABSTRACT

Many scripts are in use for writing today. Script identification has applications such as sorting documents and preparing online document databases. Identifying scripts, especially at different orientations and scales, is an important and challenging problem in document image analysis. This paper proposes a new scheme for script identification from word images using skew- and scale-robust log-polar curvelet features. The word images are first extracted from documents as text patches using Gaussian filtering. Texture features are then computed using the curvelet transform in the log-polar domain. The log-polar representation is robust to rotation and scale variations, since both appear as shifts along its axes, whereas the curvelet transform is directional and anisotropic; together they help extract significant features. For the experiments, a k-nearest neighbor classifier is used to identify the scripts, as it requires no training phase and is simple to implement. In addition, a statistical significance test is performed using two more classifiers, namely random forest and support vector machine. Comprehensive experiments are carried out on the ALPH-REGIM, Pati and Ramakrishnan, PHDIndic_11, and proprietary databases, which contain printed as well as handwritten text. Bi-script, tri-script, and multi-script identification results are reported. Benchmarking analysis illustrates the effectiveness of the proposed method, which achieves a maximum recall rate of 98.76%.
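As an illustration of why a log-polar representation suits skewed and rescaled word images, the following minimal sketch (not the authors' implementation) maps a point to log-polar coordinates and shows that scaling and rotation about the centre become plain shifts in log-radius and angle. It also includes a toy k-nearest neighbor vote. The curvelet transform itself is not implemented here; the function names, the feature vectors, and the labels `script_A` and `script_B` are hypothetical and chosen only for the example.

```python
import math
from collections import Counter


def to_log_polar(x, y, cx=0.0, cy=0.0):
    """Map a Cartesian point to (log-radius, angle) about the centre (cx, cy).

    The mapping is undefined at the centre itself, where the radius is zero.
    """
    dx, dy = x - cx, y - cy
    rho = math.log(math.hypot(dx, dy))
    theta = math.atan2(dy, dx)
    return rho, theta


def scale_and_rotate(x, y, scale, angle):
    """Scale and rotate a point about the origin (angle in radians)."""
    c, s = math.cos(angle), math.sin(angle)
    return scale * (c * x - s * y), scale * (s * x + c * y)


def knn_predict(train, query, k=3):
    """Majority vote among the k training vectors closest to the query.

    `train` is a list of (feature_vector, label) pairs. There is no training
    step: the labelled vectors are simply stored and searched at query time.
    """
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# A point before and after doubling its distance and rotating by 0.5 rad.
rho0, theta0 = to_log_polar(3.0, 4.0)
x1, y1 = scale_and_rotate(3.0, 4.0, scale=2.0, angle=0.5)
rho1, theta1 = to_log_polar(x1, y1)

# Scaling shows up as a shift of log(2) in rho, rotation as a shift in theta.
rho_shift = rho1 - rho0
theta_shift = theta1 - theta0

# Toy 2-D feature vectors standing in for real texture features (hypothetical).
train = [
    ((0.0, 0.0), "script_A"),
    ((0.1, 0.2), "script_A"),
    ((0.2, 0.1), "script_A"),
    ((1.0, 1.0), "script_B"),
    ((0.9, 1.1), "script_B"),
]
label_near_a = knn_predict(train, (0.15, 0.15), k=3)
label_near_b = knn_predict(train, (0.95, 1.0), k=3)
```

Because a scale or rotation change only translates the log-polar representation, features that tolerate shifts inherit robustness to both. For images rather than single points, a library resampler such as `skimage.transform.warp_polar` with `scaling="log"` could play the role of `to_log_polar`, assuming scikit-image is available.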

DISCLOSURE STATEMENT

No potential conflict of interest was reported by the authors.

Additional information

Notes on contributors

Parul Sahare

Parul Sahare received his postgraduate degree (MTech in VLSI Design) from Visvesvaraya National Institute of Technology, Nagpur, Maharashtra, India, in 2010. He is currently pursuing his PhD degree at Visvesvaraya National Institute of Technology, Nagpur, Maharashtra, India. He has a total of three years of academic experience. His areas of interest include signal processing, image processing, and pattern recognition.

Ravindra E. Chaudhari

Ravindra E. Chaudhari received his BE degree in electronics and telecommunication engineering from Government College of Engineering, Aurangabad, India, in 1999, and his MTech degree in electronics engineering from Visvesvaraya Regional College of Engineering, Nagpur, in 2002. He is pursuing his PhD degree in electronics engineering at Visvesvaraya National Institute of Technology, Nagpur, and works as an assistant professor at St. Francis Institute of Technology, Mumbai, India. His research interests are in image/video and signal processing.

E-mail: [email protected]

Sanjay B. Dhok

Sanjay B. Dhok is an Associate Professor in the Centre for VLSI & Nanotechnology at Visvesvaraya National Institute of Technology, Nagpur, India. He received his PhD degree in electronics engineering from Visvesvaraya National Institute of Technology, Nagpur, India. He is a member of the IEEE. He has published many research papers in national and international journals and conferences. His areas of interest include signal processing, image processing, data compression, wireless sensor networks, and VLSI design.

E-mail: [email protected]
