Original Articles

Similarity matching for integrating spatial information extracted from place descriptions

Pages 56-80 | Received 15 Nov 2015, Accepted 08 May 2016, Published online: 29 May 2016
 

ABSTRACT

Place descriptions are a common way to convey spatial information in everyday communication. Because these descriptions are written in natural language, processing the information they contain poses significant challenges. Corpora of place descriptions nevertheless hold a wealth of human spatial knowledge beyond what is stored in geographic information systems, even though the descriptions may refer to the same places in various ways. This article focuses on resolving ambiguous or synonymous place names in place descriptions by exploring the given relationships with other spatial features. It matches place names across multiple descriptions through a novel labelled graph matching process that relies solely on comparing the string, linguistic and spatial similarities between identified places. The process takes unstructured place descriptions as input and produces a composite place graph with the qualitative spatial relations from the descriptions. The performance of this process exceeds that of current toponym resolution by coping with non-gazetteered places.

Disclosure statement

No potential conflict of interest was reported by the authors.

Notes

1. We refer to the general definition and notion: ‘Places are the conceptual entities that enable cognitive structuring of the spatial aspects of reality’ (Bennett and Agarwal Citation2007), and ‘Places are typically determined by entities in the geographic environment or by relations between entities in the environment rather than by externally imposed coordinates and geometric properties’ (Winter and Freksa Citation2012).

2. There are exceptions, for example, georeferenced Wikipedia entries.

3. The national topographic database provided by Ordnance Survey, Britain’s national mapping agency.

4. The sensitivity (or recall = true positives / (true positives + false negatives)) is the proportion of actual positives, that is, correct pairs, that are correctly identified.

5. The specificity (= true negatives / (true negatives + false positives)) is the proportion of actual negatives that are correctly identified.

6. The precision represents the number of correctly matched pairs (true positives) divided by the total number of pairs matched as belonging to the correct pairs (the sum of true positives and false positives).
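
The three measures defined in notes 4 to 6 can be illustrated with a minimal Python sketch. The class and function names and the example counts below are hypothetical and serve only to show the arithmetic; they are not taken from the article's evaluation.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class MatchCounts:
    """Counts of candidate place-name pairs (hypothetical container)."""
    tp: int  # correct pairs that were matched
    fp: int  # incorrect pairs that were matched
    tn: int  # incorrect pairs that were rejected
    fn: int  # correct pairs that were missed


def _ratio(numerator: int, denominator: int) -> float:
    # Return NaN rather than raising when a measure is undefined.
    return numerator / denominator if denominator else math.nan


def sensitivity(c: MatchCounts) -> float:
    """Recall: true positives / (true positives + false negatives)."""
    return _ratio(c.tp, c.tp + c.fn)


def specificity(c: MatchCounts) -> float:
    """True negatives / (true negatives + false positives)."""
    return _ratio(c.tn, c.tn + c.fp)


def precision(c: MatchCounts) -> float:
    """True positives / (true positives + false positives)."""
    return _ratio(c.tp, c.tp + c.fp)


# Illustrative counts only, not results reported in the article.
example = MatchCounts(tp=8, fp=2, tn=85, fn=5)
scores = {
    "sensitivity": sensitivity(example),
    "specificity": specificity(example),
    "precision": precision(example),
}
```

With these illustrative counts, precision is 8 / (8 + 2) = 0.8, while sensitivity and specificity are 8/13 and 85/87 respectively. A measure whose denominator is zero is reported as NaN in this sketch, which is one possible design choice among several.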

Additional information

Funding

This work was supported by the Australian Research Council [ARC LP100200199].
