Original Articles

Compact and fast algorithms for safe regular expression search

Pages 383-401 | Received 06 Nov 2003, Accepted 01 Dec 2003, Published online: 08 Jun 2010
 

Abstract

This article describes an improvement of the brute force determinization algorithm in the case of homogeneous nondeterministic finite automata (NFAs), as well as its application to pattern matching. Brute force determinization with limited memory may yield only a partially determinized automaton, but its bounded complexity makes it a safe procedure, unlike the classical subset construction. Our algorithm is inspired both by recent results of Champarnaud concerning the subset automaton of a homogeneous NFA and by the algorithm recently designed by Navarro and Raffinot to implement the brute force determinization of the Glushkov NFA of a regular pattern. It significantly improves on the Navarro–Raffinot algorithm: on average, it requires exponentially less memory for a given level of determinization, which, for a bounded memory, implies a quadratically smaller parsing time. The algorithm has been implemented in the CCP software (http://www.univ-rouen.fr/LIFAR/aia/ccp.html). Tests have been carried out in the fields of text processing and biology, and experimental results are reported.
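
To make the memory/time trade-off concrete, here is a minimal Python sketch of table-based determinization for a homogeneous NFA, in the spirit of the Navarro and Raffinot scheme mentioned above. It is not the improved algorithm of this article, and every name in it (`build_tables`, `search`, `labels`, `follow`, `finals`, `chunk_bits`) is a hypothetical choice made for illustration. It relies on one property of homogeneous NFAs: all transitions entering a state carry the same symbol, so the successor of a state set D on symbol c can be computed as T[D] & B[c], where T ignores symbols and B[c] marks the states labelled c. Splitting T into chunks of `chunk_bits` state bits bounds the table memory at the cost of one lookup per chunk for each input character.

```python
def build_tables(labels, follow, chunk_bits):
    """Build the symbol masks B and the chunked successor tables T.

    Assumed input format (hypothetical): labels[i] is the symbol carried by
    every transition entering state i (None for the initial state 0), and
    follow[i] is the set of states reachable from state i in one step.
    """
    n = len(labels)
    # B[c]: bit mask of the states whose incoming transitions are labelled c.
    B = {}
    for i, c in enumerate(labels):
        if c is not None:
            B[c] = B.get(c, 0) | (1 << i)
    # succ[i]: bit mask of the successors of state i, regardless of symbol.
    succ = [sum(1 << j for j in follow[i]) for i in range(n)]
    # One table per chunk of state bits; each has 2**width entries.
    tables = []
    for lo in range(0, n, chunk_bits):
        width = min(chunk_bits, n - lo)
        table = [0] * (1 << width)
        for d in range(1, 1 << width):
            low = d & -d  # lowest set bit of d
            table[d] = table[d ^ low] | succ[lo + low.bit_length() - 1]
        tables.append((lo, width, table))
    return B, tables


def search(text, labels, follow, finals, chunk_bits=8):
    """Return the text positions at which an occurrence of the pattern ends."""
    B, tables = build_tables(labels, follow, chunk_bits)
    final_mask = sum(1 << i for i in finals)
    active = 1  # only the initial state 0 is active
    ends = []
    for pos, c in enumerate(text):
        reach = 0
        for lo, width, table in tables:
            reach |= table[(active >> lo) & ((1 << width) - 1)]
        # Keep state 0 active so that a match may start at any position.
        active = (reach & B.get(c, 0)) | 1
        if active & final_mask:
            ends.append(pos)
    return ends


# Position automaton of the pattern ab*c: states 1, 2, 3 carry a, b, c.
labels = [None, "a", "b", "c"]
follow = [{1}, {2, 3}, {2, 3}, set()]
print(search("xxabbcxac", labels, follow, finals={3}))  # [5, 8]
```

With a single chunk the table holds one entry per subset of states, which corresponds to full determinization; smaller chunks shrink the total table size to the sum of the per-chunk sizes while adding lookups per character, which is the kind of bounded-memory compromise the abstract refers to.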

Notes

1. E is epsilon-free if the symbol ϵ does not appear in E.

2. http://www.ncbi.nlm.nih.gov/Genbank/index.html

3. http://corpus.canterbury.ac.nz/
