Research Articles

A deep learning architecture for semantic address matching

Yue Lin, Mengjun Kang, Yuyang Wu, Qingyun Du & Tao Liu
Pages 559-576 | Received 24 May 2019, Accepted 13 Oct 2019, Published online: 24 Oct 2019
 

ABSTRACT

Address matching is a crucial step in geocoding, which plays an important role in urban planning and management. To date, the unprecedented development of location-based services has generated a large amount of unstructured address data. Traditional address matching methods mainly focus on the literal similarity of address records and are therefore not applicable to the unstructured address data. In this study, we introduce an address matching method based on deep learning to identify the semantic similarity between address records. First, we train the word2vec model to transform the address records into their corresponding vector representations. Next, we apply the enhanced sequential inference model (ESIM), a deep text-matching model, to make local and global inferences to determine if two addresses match. To evaluate the accuracy of the proposed method, we fine-tune the model with real-world address data from the Shenzhen Address Database and compare the outputs with those of several popular address matching methods. The results indicate that the proposed method achieves a higher matching accuracy for unstructured address records, with its precision, recall, and F1 score (i.e., the harmonic mean of precision and recall) reaching 0.97 on the test set.
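The local inference step mentioned in the abstract can be illustrated with a minimal sketch. It assumes the standard ESIM formulation (dot-product soft alignment, then concatenation of each token vector with its aligned counterpart, their difference, and their element-wise product) and is not the authors' released code. The toy 3-dimensional vectors stand in for word2vec embeddings, the BiLSTM encoding that ESIM applies before this step is omitted, and the names `soft_align` and `enhance` are hypothetical.

```python
import math


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(u, v))


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def soft_align(a, b):
    """For each token vector in a, return the attention-weighted sum of b's vectors."""
    dim = len(b[0])
    aligned = []
    for a_i in a:
        # Attention weights come from dot-product similarity with every token in b.
        weights = softmax([dot(a_i, b_j) for b_j in b])
        aligned.append(
            [sum(w * b_j[k] for w, b_j in zip(weights, b)) for k in range(dim)]
        )
    return aligned


def enhance(a, a_aligned):
    """Concatenate [a; aligned; a - aligned; a * aligned] for each token."""
    out = []
    for a_i, t_i in zip(a, a_aligned):
        diff = [x - y for x, y in zip(a_i, t_i)]
        prod = [x * y for x, y in zip(a_i, t_i)]
        out.append(a_i + t_i + diff + prod)
    return out


# Toy token vectors (hypothetical values, not trained word2vec embeddings).
addr_a = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
addr_b = [[0.0, 1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]

aligned_a = soft_align(addr_a, addr_b)
features_a = enhance(addr_a, aligned_a)
```

In a full model, the enhanced features for both addresses would be passed to a second recurrent layer, pooled, and classified as match or non-match; this sketch covers only the alignment step.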

Acknowledgments

We acknowledge Qin Tian for his valuable suggestions on the methodology of this study. We also appreciate the insightful comments from the associate editor, Christophe Claramunt, and all the anonymous reviewers.

Data and code availability statement

The first 498,294 records of the corpus derived from the Shenzhen Address Database, the labelled address dataset for semantic address matching, and the code that supports the findings of this study are available on Zenodo under the identifiers doi: 10.5281/zenodo.3477633 (part of the corpus for word2vec training), doi: 10.5281/zenodo.3477007 (labelled address dataset for semantic address matching) and doi: 10.5281/zenodo.3476673 (code). The complete corpus from the Shenzhen Address Database cannot be made publicly available, in order to protect personal information and to comply with national policy on data security.

Disclosure statement

No potential conflict of interest was reported by the authors.

Additional information

Funding

This work was supported by the National Key Research and Development Program of China [2017YFB0503500].

Notes on contributors

Yue Lin

Yue Lin is a first-year PhD student in the Department of Geography at the Ohio State University. She received her bachelor’s degree in Geographical Information Science from the School of Resource and Environmental Sciences at Wuhan University. Her research interests include GeoAI, spatial data analysis, and urban geography.

Mengjun Kang

Mengjun Kang is an associate professor in the School of Resource and Environmental Sciences at Wuhan University. His research interests include geocoding, urban addresses, and digital mapping.

Yuyang Wu

Yuyang Wu is an undergraduate student in the School of Geography and Information Engineering at China University of Geosciences.

Qingyun Du

Qingyun Du is a professor and dean in the School of Resource and Environmental Sciences at Wuhan University. His research interests include GIScience and natural language representations of spatial information.

Tao Liu

Tao Liu is a professor in the Faculty of Geomatics at Lanzhou Jiaotong University.
