ABSTRACT
The rapid development of information and communications technology has turned individuals into sensors, fostering the growth of human-generated geospatial big data. In disaster management, geospatial big data, mainly social media data, have opened new avenues for observing human responses to disasters in near real-time. Previous research has relied on geographical information in geotags, content, and user profiles to locate social media messages. However, fewer than 1% of users geotag their messages, making it increasingly crucial to geolocate users through their profiles or the addresses mentioned in message content. This paper evaluates and visualizes the margin of error incurred when using user profiles or message-mentioned addresses to geolocate social media data for disaster research. Using Twitter data collected during Hurricane Harvey in 2017 as an example, this research assesses the inconsistencies in predicting users’ locations in various administrative units during each disaster phase using three geolocating strategies. The results reveal that the similarity between geotag and user profile locations decreases from 94.07% at the country level to 64.56%, 43.9%, 31.82%, 27.05%, and 26.7% at the state, county, block group, 1-kilometer, and 30-meter levels, respectively. These similarities are generally higher than the agreement between locations derived from geotags and tweet content. The geolocation consistencies among the three methods remain stable across disaster phases. The study further examines how uncertainties in geolocating Twitter data affect disaster management applications. The findings offer valuable insights into the trade-off between spatial scale and geolocation accuracy and inform the selection of appropriate scales when applying different geolocating strategies in future social media-based investigations.
Acknowledgments
We extend our sincere appreciation to the anonymous reviewers whose constructive comments have greatly improved the quality of this manuscript.
Disclosure statement
No potential conflict of interest was reported by the author(s).
Data availability statement
The data will be made available upon request and will be hosted in a private GitHub repository: https://github.com/rohan-debayan/DeidentifiedHarveyTweets.git