Original Articles

Natural language processing and query expansion in legal information retrieval: Challenges and a response

Pages 63-72 | Published online: 02 Mar 2010
 

Abstract

As methods in legal information retrieval (IR) evolve to meet the demands of rapidly increasing stores of electronic information, there is the intuitive appeal of capturing detail in legal queries with natural language processing (NLP). One difficulty with this approach is that incorporation of word dependencies in IR has not been shown to consistently and reliably improve results over a unigram bag-of-words approach. We consider challenges faced when incorporating NLP in IR and briefly review three proposals in this vein, highlighting how these might have responded better to requirements in legal search. We then present our novel response based on split query expansion that accounts for the way lawyers seek to apply search results whilst meeting the challenges identified in a unique and flexible manner.

