Intelligent information retrieval system using automatic thesaurus construction

Pages 395-415 | Received 16 Feb 2009, Accepted 12 Apr 2009, Published online: 10 Mar 2011
 

Abstract

This paper presents an intelligent information retrieval (IR) system based on automatic thesaurus construction and applies it to document clustering and classification, two of the most influential and widely studied tasks in the IR research community. We apply two biologically inspired algorithms, a genetic algorithm (GA) and a neural network (NN), to these tasks. We propose a fuzzy logic controller GA and an adaptive back-propagation NN, which overcome problems of their archetypes such as slow evolution and a tendency to become trapped in local optima. Furthermore, a well-constructed thesaurus has been recognised as a valuable tool for effective clustering and classification. It addresses a weakness of the bag-of-words document representation, which ignores important relationships between words such as synonymy and polysemy. To investigate how our IR system can be used effectively, we conduct experiments on four data sets drawn from the benchmark Reuters-21578 document collection and the 20 Newsgroups corpus. The results show that our IR system improves performance in comparison with k-means, a common GA, and a conventional back-propagation NN.
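
The bag-of-words limitation mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' method: the thesaurus here is a tiny hand-written dictionary (the paper constructs its thesaurus automatically), and the names `THESAURUS`, `bag_of_words`, `thesaurus_bag`, and `cosine` are hypothetical, chosen only for this illustration. The sketch shows how two documents that share no surface words can still be recognised as similar once synonyms are mapped to a common concept.

```python
import math
from collections import Counter

# Hypothetical toy thesaurus (an assumption for illustration only):
# each term maps to a canonical concept label.
THESAURUS = {
    "car": "vehicle",
    "automobile": "vehicle",
    "purchase": "buy",
    "buy": "buy",
}


def bag_of_words(text):
    """Plain bag of words: count each lowercased whitespace token."""
    return Counter(text.lower().split())


def thesaurus_bag(text, thesaurus):
    """Bag of concepts: replace each token by its thesaurus concept if known."""
    return Counter(thesaurus.get(tok, tok) for tok in text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


doc1 = "buy a car"
doc2 = "purchase an automobile"

# No shared surface tokens, so the raw similarity is zero.
raw_similarity = cosine(bag_of_words(doc1), bag_of_words(doc2))

# After synonym mapping, both documents contain "buy" and "vehicle".
mapped_similarity = cosine(
    thesaurus_bag(doc1, THESAURUS), thesaurus_bag(doc2, THESAURUS)
)

print(raw_similarity, mapped_similarity)
```

A clustering or classification algorithm that relies on such similarity scores would treat the two documents as unrelated under the plain representation and as closely related under the thesaurus-mapped one. Handling polysemy would additionally require context, which this sketch does not attempt.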

Acknowledgements

The authors thank the Editor-in-Chief and the reviewers for providing very helpful comments and suggestions. Their insight and comments led to a better presentation of the ideas expressed in this paper. This work was supported by Brain Korea 21, the Youth Foundation and the Scientific Research Starting Foundation of Jiangnan University, the Fundamental Research Funds for the Central Universities (JUSRP11130), and the Specialized Research Fund for the Doctoral Programme of Higher Education (20100093120004).
