Predictive modelling of movements of refugees and internally displaced people: towards a computational framework

Pages 408-444 | Published online: 16 Aug 2022
 

ABSTRACT

Predicting forced displacement is an important undertaking of many humanitarian aid agencies, which must anticipate flows in advance in order to provide vulnerable refugees and Internally Displaced Persons (IDPs) with shelter, food, and medical care. While there is a growing interest in using machine learning to better anticipate future arrivals, there is little standardised knowledge on how to predict refugee and IDP flows in practice. Researchers and humanitarian officers are confronted with the need to make decisions about how to structure their datasets and how to fit their problem to predictive analytics approaches, and they must choose from a variety of modelling options. Most of the time, these decisions are made without an understanding of the full range of options that could be considered, and using methodologies that have primarily been applied in different contexts – and with different goals – as opportunistic references. In this work, we attempt to facilitate a more comprehensive understanding of this emerging field of research by providing a systematic model-agnostic framework, adapted to the use of big data sources, for structuring the prediction problem. As we do so, we highlight existing work on predicting refugee and IDP flows. We also draw on our own experience building models to predict forced displacement in Somalia, in order to illustrate the choices facing modellers and point to open research questions that may be used to guide future work.

Acknowledgments

We thank the editors of the special issue, two anonymous reviewers, and Joseph Aylett-Bullock for helpful comments which substantially improved this article. KHP thanks the United Nations Global Pulse Data Fellows Program and the UNHCR Innovation Service for incubating and supporting the case study research on Somalia, and in particular Rebeca Moreno Jiménez and Sofia Kyriazi, who led the development of Project Jetson and designed the initial experiments. We thank Patricia Angkiriwang for the design of the modelling cards. KHP also thanks the BIGSSS CSS Summer School on Migration, held in Sardinia, Italy in 2019 with support from the Volkswagen Foundation, from which this special issue emerged.

Data availability statement

A limited sample of Somali displacement data is available from UNHCR's PRMN (UNHCR 2019b). Due to the sensitivity of the data, the full historical dataset on Somali displacement flows should be requested from UNHCR. Additional data used in this article was obtained from ACLED (ACLED 2019), FSNAU (FSNAU 2019a, 2019b), the Open Source Routing Machine (Project OSRM 2022), and the Humanitarian Data Exchange (OCHA 2020). The code used to produce the analysis in this article, and to download limited historical series of displacement data as well as the other datasets used in the paper, can be obtained from the Project Jetson GitHub repository: https://github.com/unhcr/Jetson/blob/master/experiment_2/.

Disclosure statement

No potential conflict of interest was reported by the author(s).

Notes

1 Formally, a person of concern to UNHCR is ‘a person whose protection and assistance needs are of interest to UNHCR. This includes refugees, asylum-seekers, stateless people, internally displaced people and returnees’ (UNHCR 2021a). Additional definitions are given in Appendix B.2 of the supplementary material.

2 For example, see Danish Refugee Council (2022) and Team Elva (2021). The UN Office for the Coordination of Humanitarian Affairs (OCHA) maintains an ongoing list of predictive analytics projects at: https://centre.humdata.org/catalogue-for-predictive-models-in-the-humanitarian-sector/. A helpful review of different predictive analytics initiatives in the humanitarian sector can be found in Hernandez and Roberts (2020).

3 For further reading on this topic see Breiman (2001b), Shmueli (2010), and Mullainathan and Spiess (2017). In recent years there has been a dramatic growth in research on machine learning approaches for causal inference, so this distinction is not as clear as it once was.

4 Iraq (https://dtm.iom.int/iraq) and Mali (https://dtm.iom.int/mali) are two other promising contexts in which IOM has been tracking displacement since 2014 and 2013, respectively.

5 Since the LSTM used a 12-month window of historical arrivals to produce predictions, it did not produce forecasts for 2011 and therefore no observations from 2011 were included in the calculation of the RMSE. Furthermore, observations were dropped when a lagged value of the target variable was missing for one of the prior 23 months; as a consequence, model scoring focusses on periods and regions with relatively stable data collection.

6 The most basic benchmarks – 1- and 12-month lags – do in fact perform poorly on both the train and test datasets.
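
Notes 5 and 6 refer to lag benchmarks scored by RMSE, with observations dropped where a lagged value of the target is missing. A minimal sketch of that kind of scoring might look as follows. It is not the authors' code (which is in the Project Jetson repository): the function name `lag_benchmark_rmse`, the dictionary layout, and the toy values are all assumptions made for illustration, and the series are assumed to be monthly with no gaps in the calendar.

```python
import math


def lag_benchmark_rmse(series_by_region, lag):
    """RMSE of a naive benchmark that predicts each month's value
    with the value observed `lag` months earlier.

    `series_by_region` maps a region name to a list of monthly values
    in chronological order; None marks a missing observation.
    (Hypothetical layout, chosen only for this illustration.)
    """
    squared_errors = []
    for values in series_by_region.values():
        for t in range(lag, len(values)):
            actual, predicted = values[t], values[t - lag]
            # Skip months where the target or its lagged value is missing,
            # so only observations with a usable lag are scored.
            if actual is None or predicted is None:
                continue
            squared_errors.append((actual - predicted) ** 2)
    if not squared_errors:
        raise ValueError("no scoreable observations for this lag")
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Toy, invented numbers (not displacement data from the article).
toy = {
    "A": [10.0, 12.0, 11.0, 15.0],
    "B": [3.0, None, 7.0, 5.0],
}
rmse_lag1 = lag_benchmark_rmse(toy, lag=1)
```

In practice the same function would be called with `lag=1` and `lag=12` on the train and test periods separately, so that a model's RMSE can be compared against both benchmarks on exactly the observations that remain after missing lags are dropped.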

Additional information

Funding

United Nations Global Pulse is supported by the Governments of Sweden and Canada.
