ABSTRACT
The sampling fraction is crucial to sampling-related studies and applications, especially in the big data era, when most datasets are neither purpose-designed nor controllable during data collection. A common concern among researchers is ‘what is the modelling accuracy when using a sample?’. Taking intra-city human mobility as the study subject, this study uses a simple and direct method to analyse how various sampling fractions influence modelling accuracy. Five common intra-city human mobility indicators (travel distance, travel time, travel frequency, radius of gyration and movement entropy) are evaluated in terms of their mean value, median and probability distribution. Experimental results demonstrate that the representativeness of each indicator converges to 1 at its own rate and with its own variance. The minimum sampling fraction required to reach a given accuracy differs across indicators and evaluation measures. To further investigate how related factors influence the modelling accuracy achieved at a given sampling fraction, additional experiments are conducted with multiple sampling methods, study scopes and data sources. Several general findings emerge from these experiments. This study provides a reference for other sampling-based applications.
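As an illustration only, the kind of experiment summarized above can be sketched as follows. The abstract does not define representativeness, so this sketch assumes a hypothetical measure (one minus the relative error of the sample mean), uses simple random sampling, and substitutes synthetic log-normal travel distances for real mobility data. The names `representativeness`, `evaluate_fraction` and `travel_distance_km` are illustrative and do not come from the article.

```python
import random
import statistics


def representativeness(sample_value, population_value):
    """Assumed measure: 1 minus the relative error (not the article's definition)."""
    return 1.0 - abs(sample_value - population_value) / abs(population_value)


def evaluate_fraction(values, fraction, trials=200, seed=0):
    """Draw repeated simple random samples at one sampling fraction.

    Returns the mean and standard deviation of the representativeness
    of the sample mean across all trials.
    """
    rng = random.Random(seed)
    n = max(1, round(len(values) * fraction))
    population_mean = statistics.fmean(values)
    scores = [
        representativeness(statistics.fmean(rng.sample(values, n)), population_mean)
        for _ in range(trials)
    ]
    return statistics.fmean(scores), statistics.pstdev(scores)


# Synthetic stand-in for one indicator (travel distance per user, in km).
data_rng = random.Random(42)
travel_distance_km = [data_rng.lognormvariate(1.0, 0.8) for _ in range(10_000)]

results = {
    fraction: evaluate_fraction(travel_distance_km, fraction)
    for fraction in (0.001, 0.01, 0.1, 1.0)
}
for fraction, (mean_score, spread) in results.items():
    print(f"fraction={fraction:<6} representativeness={mean_score:.4f} std={spread:.4f}")
```

Under these assumptions, the mean representativeness rises towards 1 and its spread shrinks as the sampling fraction grows. The same loop could be repeated with the median or a distribution-distance measure in place of the mean.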
Acknowledgments
We would like to thank all three anonymous reviewers for their valuable comments on this article.
Disclosure statement
No potential conflict of interest was reported by the authors.
Supplementary material
Supplemental data for this article can be accessed online.
Additional information
Funding
Notes on contributors
Jincheng Jiang
Jincheng Jiang is a postdoctoral researcher at Shenzhen University. His research interests include human mobility, spatial-temporal data analysis and emergency response. Email: [email protected]
Qingquan Li
Qingquan Li is a professor and the president of Shenzhen University, and the director of the Shenzhen Key Laboratory of Spatial Smart Sensing and Service. His research interests include spatial-temporal data analysis, multi-sensor integration, and industry and engineering surveying. Email: [email protected]
Wei Tu
Wei Tu is a senior associate research fellow in the Department of Urban Informatics, School of Architecture and Urban Planning, Shenzhen University. His research interests are big data-driven human activity and urban studies. Email: [email protected]
Shih-Lung Shaw
Shih-Lung Shaw is a Professor of Geography at the University of Tennessee, Knoxville. His research interests include transportation, human dynamics, geographic information science, and space-time analytics. Email: [email protected]
Yang Yue
Yang Yue is a professor in the Department of Urban Informatics, School of Architecture and Urban Planning, Shenzhen University. Her research interests are urban informatics and trajectory-based human behavior analysis. Email: [email protected]