Abstract
Social media messages posted by people during natural disasters often contain important location descriptions, such as the locations of victims. Recent research has shown that many of these location descriptions go beyond simple place names, such as city names and street names, and are difficult to extract using typical named entity recognition (NER) tools. While advanced machine learning models could be trained, they require large labeled training datasets that can be time-consuming and labor-intensive to create. In this work, we propose a method that fuses geo-knowledge of location descriptions with a Generative Pre-trained Transformer (GPT) model, such as ChatGPT or GPT-4. The result is a geo-knowledge-guided GPT model that can accurately extract location descriptions from disaster-related social media messages. Our method uses only 22 training examples that encode geo-knowledge. We conduct experiments to compare this method with nine alternative approaches on a dataset of tweets from Hurricane Harvey. Our method achieves an improvement of over 40% compared with typically used NER approaches. The experimental results also show that geo-knowledge is indispensable for guiding the behavior of GPT models. The extracted location descriptions can help disaster responders reach victims more quickly and may even save lives.
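The idea of guiding a GPT model with a small set of geo-knowledge examples can be illustrated with a minimal sketch of few-shot prompt construction. This is not the authors' implementation: the instruction text, the category labels, the example messages, and the names `Example` and `build_prompt` are all hypothetical and chosen only for illustration. The sketch only assembles the prompt text; sending it to a GPT model is left out.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Example:
    """One labeled example pairing a message with a location description.

    The category names used below are illustrative assumptions, not the
    categories defined in the paper.
    """
    message: str
    category: str
    description: str


# Hypothetical instruction text; the wording is an assumption.
INSTRUCTION = (
    "Extract the location description from each disaster-related message "
    "and label its category."
)


def build_prompt(examples: List[Example], message: str) -> str:
    """Assemble a few-shot prompt: instruction, labeled examples, new message."""
    parts = [INSTRUCTION, ""]
    for ex in examples:
        parts.append(f"Message: {ex.message}")
        parts.append(f"Location: [{ex.category}] {ex.description}")
        parts.append("")
    # The final entry is left open so the model completes the location.
    parts.append(f"Message: {message}")
    parts.append("Location:")
    return "\n".join(parts)


# Two made-up examples stand in for the 22 used in the paper.
EXAMPLES = [
    Example("Family trapped at 123 Main St, water rising",
            "door number address", "123 Main St"),
    Example("Need rescue near Oak Ave and 5th St",
            "road intersection", "Oak Ave and 5th St"),
]

prompt = build_prompt(EXAMPLES, "Elderly couple stuck on roof at 45 River Rd")
```

In a sketch like this, the geo-knowledge lives in the labeled examples themselves, so changing what the model extracts is a matter of editing the example list rather than retraining a model.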
Acknowledgement
We thank the anonymous reviewers for their constructive comments and suggestions.
Disclosure statement
No potential conflict of interest was reported by the author(s).
Data and code availability statement
The data and code that support the findings of this study are available on figshare at https://doi.org/10.6084/m9.figshare.22659337.
Notes on contributors
Yingjie Hu
Yingjie Hu is an Associate Professor in the Department of Geography at the University at Buffalo. His research interests include GIScience, geospatial artificial intelligence (GeoAI), and disaster resilience. His contributions to this paper include conceptualization, data collection and curation, methodology, data analysis, result interpretation and discussion, visualization, writing – original draft, and writing – editing and revision.
Gengchen Mai
Gengchen Mai is currently a Tenure-Track Assistant Professor in the Department of Geography at the University of Georgia. His research interests include spatially explicit artificial intelligence, geographic knowledge graphs, geographic question answering, and geospatial foundation models. His contributions to this paper include conceptualization, methodology, data analysis, result interpretation and discussion, visualization, writing – original draft, and writing – editing and revision.
Chris Cundy
Chris Cundy is a Ph.D. student at Stanford University. His research interests include variational inference, generative models, large language models, and AI safety. His contributions to this paper include methodology, result interpretation and discussion, and writing – editing and revision.
Kristy Choi
Kristy Choi is a Ph.D. candidate in Computer Science at Stanford University advised by Dr. Stefano Ermon. Her research is centered around machine learning with limited labeled supervision, and is focused on developing techniques for better adaptation and controllability in deep generative models. Her contributions to this paper include methodology, result interpretation and discussion, and writing – editing and revision.
Ni Lao
Ni Lao is currently a research scientist at Google. He holds a Ph.D. in Computer Science from the Language Technologies Institute, School of Computer Science at Carnegie Mellon University, and is an expert in machine learning, knowledge graphs, and natural language understanding. His contributions to this paper include methodology, result interpretation and discussion, and writing – editing and revision.
Wei Liu
Wei Liu recently graduated with an M.S. in GIS from the University at Buffalo. Her interests include GeoAI, urban analytics, and social equality. Her contributions to this paper include data collection and curation and data analysis.
Gaurish Lakhanpal
Gaurish Lakhanpal recently graduated from Stevenson High School and is currently studying computer science at Purdue University. He is deeply interested in machine learning. His contributions to this paper include data collection and curation, data analysis, and visualization.
Ryan Zhenqi Zhou
Ryan Zhenqi Zhou (Zhenqi Zhou) is a Ph.D. candidate in the Department of Geography at the University at Buffalo and a research assistant in the GeoAI Lab. His research interests include GeoAI, disaster resilience, public health, and human mobility. His contributions to this paper include data collection and curation and data analysis.
Kenneth Joseph
Kenneth Joseph is an Assistant Professor in the Computer Science and Engineering Department at the University at Buffalo. He is a computational social scientist who focuses on the measurement and modeling of complex social systems. His contributions to this paper include result interpretation and discussion, and writing – editing and revision.