Identifying home locations in human mobility data: an open-source R package for comparison and reproducibility

Qingqing Chen & Ate Poorthuis
Pages 1425-1448 | Received 03 Aug 2020, Accepted 03 Feb 2021, Published online: 10 Mar 2021
 

ABSTRACT

Identifying meaningful locations, such as home or work, from human mobility data has become an increasingly common prerequisite for geographic research. Although location-based services (LBS) and other mobile technology have rapidly grown in recent years, it can be challenging to infer meaningful places from such data, which – compared to conventional datasets – can be devoid of context. Existing approaches are often developed ad-hoc and can lack transparency and reproducibility. To address this, we introduce an R package for inferring home locations from LBS data. The package implements pre-existing algorithms and provides building blocks to make writing algorithmic ‘recipes’ more convenient. We evaluate this approach by analyzing a de-identified LBS dataset from Singapore that aims to balance ethics and privacy with the research goal of identifying meaningful locations. We show that ensemble approaches, combining multiple algorithms, can be especially valuable in this regard as the resulting patterns of inferred home locations closely correlate with the distribution of residential population. We hope this package, and others like it, will contribute to an increase in use and sharing of comparable algorithms, research code and data. This will increase transparency and reproducibility in mobility analyses and further the ongoing discourse around ethical big data research.

Acknowledgments

This research, led together with the Housing and Development Board, is supported by the Singapore Ministry of National Development and the National Research Foundation, Prime Minister’s Office under the Land and Liveability National Innovation Challenge (L2 NIC) Research Programme (L2 NIC Award No. L2NICTDF1-2017-4). Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not reflect the views of the Housing and Development Board, Singapore Ministry of National Development and National Research Foundation, Prime Minister’s Office, Singapore.

Data and code availability statement

Along with this paper, we publish three specific data & code artifacts:

  1. The homelocator R package itself, with code hosted on: https://github.com/spatialnetworkslab/homelocator. The package contains a small test sample dataset so readers can quickly try out the package. A documentation site, including a tutorial and installation instructions, is available at: https://homelocator-website.netlify.app/.

  2. The de-identified LBS dataset, which can be downloaded from Figshare: https://doi.org/10.6084/m9.figshare.13394102.

  3. All code & data to reproduce the analysis and figures within this paper. This is available as a series of RMarkdown files hosted on: https://github.com/spatialnetworkslab/identifying-meaningful-locations. It includes a Binder link so readers can explore the analysis directly from their browser.

Disclosure statement

No potential conflict of interest was reported by the author(s).

Notes

1. The Chicago Area Transportation Study, which is one of the first examples of a large-scale travel survey, cost $650,000 in 1956 for just the survey fieldwork (Black 1990).

2. Open datasets, such as WorldPop, do exist on a finer spatial scale than planning zones, but the estimation process underlying these datasets is not reliable in the case of Singapore, especially in industrial areas where such datasets (erroneously) show a high number of estimated residents.

Additional information

Funding

This work was supported by the Singapore Ministry of National Development and the National Research Foundation, Prime Minister's Office under the Land and Liveability National Innovation Challenge (L2 NIC) Research Programme (L2 NIC Award No. L2NICTDF1-2017-4).

Notes on contributors

Qingqing Chen

Qingqing Chen is a Ph.D. student in the Department of Geography at the University at Buffalo - SUNY, USA. Her research focuses on critically understanding urban space by leveraging big data, combined with data science and machine learning techniques. She is interested in urban data science, geocomputation, non-visual sensory measuring and monitoring, social media and big data.

Ate Poorthuis

Ate Poorthuis is an Assistant Professor of Big Data and Human-Environment Systems in the Department of Earth and Environmental Sciences at KU Leuven, Belgium. His research explores the possibilities and limitations of big data, through quantitative analysis and visualization, to better understand how our cities work. He is particularly interested in applying these academic insights within urban planning and policy.
