
Integration of Internet search data to predict tourism trends using spatial-temporal XGBoost composite model

Pages 236-252 | Received 20 Feb 2020, Accepted 21 May 2021, Published online: 05 Jul 2021
 

ABSTRACT

Tourism trend prediction facilitates estimation of tourism investment and revenue. Studies on tourism prediction have primarily relied on linear models and historical visitor numbers; however, relationships between tourism trends and their influencing factors may be nonlinear. This study constructed factors from internet search data and predicted tourism trends using a spatiotemporal framework based on the extreme gradient boosting (XGBoost) method. The study first sorted Baidu index data, which are computed by weighting search frequency. Spatial cluster analysis was then conducted to incorporate spatial characteristics, and principal component analysis was performed to identify factors. Next, variables were derived using the weighted moving average method to reduce the lag effect between tourism-related internet searches and actual behavior. We applied the proposed spatiotemporal XGBoost composite model to predict Beijing’s tourism trends. The R² scores of the simple XGBoost model, the autoregressive integrated moving average model, the spatial XGBoost model, and the spatiotemporal XGBoost composite model were 0.517, 0.625, 0.791, and 0.940, respectively. Among the models compared, the spatiotemporal XGBoost composite model had the best predictive ability. The findings also suggest that machine learning methods may not perform well without considering spatial properties, such as spatial autocorrelation and spatial heterogeneity.
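
As an illustration of two ingredients named in the abstract, the following is a minimal sketch of a trailing weighted moving average applied to a search-index series, together with the standard R² score used to compare models. The function names, the window length, and the weights are hypothetical choices made for this example; the abstract does not state the weights or window the authors used, and this is not their implementation.

```python
from typing import List, Sequence


def weighted_moving_average(series: Sequence[float],
                            weights: Sequence[float]) -> List[float]:
    """Trailing weighted moving average.

    weights[-1] is applied to the most recent observation in each window.
    The output has len(series) - len(weights) + 1 values.
    """
    k = len(weights)
    total = sum(weights)
    if k == 0 or total == 0:
        raise ValueError("weights must be non-empty with a non-zero sum")
    smoothed = []
    for t in range(k - 1, len(series)):
        window = series[t - k + 1:t + 1]
        smoothed.append(sum(w * x for w, x in zip(weights, window)) / total)
    return smoothed


def r2_score(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((a - b) ** 2 for a, b in zip(y_true, y_pred))
    ss_tot = sum((a - mean) ** 2 for a in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical daily search-index values and illustrative weights
# (assumed here; heavier weight on the most recent day).
search_index = [10.0, 20.0, 30.0, 40.0]
smoothed_index = weighted_moving_average(search_index, [1.0, 2.0, 3.0])
```

In a pipeline of the kind the abstract outlines, a smoothed series like `smoothed_index` would serve as one input feature to the regression model, and `r2_score` would be evaluated on held-out observations.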

Acknowledgments

We thank Dr. Shenjun Yao for her valuable comments on the revision of the manuscript. We are also grateful to Prof. May Yuan, Dr. Michela Bertolotto, and the three anonymous referees for their valuable comments and suggestions.

Data and codes availability statement

The data and core code that support the findings of this study are available at https://doi.org/10.6084/m9.figshare.14511657.

Disclosure statement

No potential conflict of interest was reported by the author(s).

Additional information

Funding

This research was supported by the National Natural Science Foundation of China [No. 41301423, No. 41701462], the National Key Research and Development Program of China [No. 2016YFC0502705, No. 2016YFC0803105], the China Postdoctoral Science Foundation [No. 2018M641926], the Major Program of Social Science Foundation of China [No. 14ZDB140], and the Jiangxi Provincial Department of Education Science and Technology Research Projects [Grant No. GJJ150661].

Notes on contributors

Junfeng Kang

Junfeng Kang is an associate professor of Geographic Information Science at Jiangxi University of Science and Technology, Ganzhou, China. He is interested in high-performance GIS algorithms and applications. Email: [email protected]

Xingyu Guo

Xingyu Guo is a researcher in information science at Nanchang Hangkong University, Nanchang, China. His research interests include location-based services (LBS), spatial cognition, urban informatics, and big data. ID: https://orcid.org/0000-0003-4710-3659; Email: [email protected]

Lei Fang

Lei Fang is a postdoctoral researcher working on applications of Geographical Information Science in the Department of Environmental Science and Engineering, Fudan University, Shanghai, China. His research focuses on spatio-temporal analysis and big data. ID: https://orcid.org/0000-0001-8902-1817; Email: [email protected]

Xiangrong Wang

Xiangrong Wang is a Professor in the Department of Environmental Science and Engineering, Fudan University, China. His research interests include ecological evaluation, virtual reality, and informatization of urban ecology. Email: [email protected]

Zhengqiu Fan

Zhengqiu Fan is an associate professor in the Department of Environmental Science and Engineering, Fudan University, China. His research interests include urban planning and spatio-temporal analysis of soil pollution. ID: https://orcid.org/0000-0002-2908-811X; Email: [email protected]
