ABSTRACT
Artificial surfaces represent one of the key land cover types, and validation is an indispensable component of land cover mapping that ensures data quality. Traditionally, validation has been carried out by confronting the produced land cover map with reference data, which is collected through field surveys or image interpretation. However, this approach has limitations, including high costs in terms of money and time. Recently, geo-tagged photos from social media have been used as reference data. This procedure has lower costs, but the process of interpreting geo-tagged photos is still time-consuming. In fact, social media point of interest (POI) data, including geo-tagged photos, may contain useful textual information for land cover validation. However, this kind of special textual data has seldom been analysed or used to support land cover validation. This paper examines the potential of textual information from social media POIs as a new reference source to assist in artificial surface validation without photo recognition and proposes a validation framework using modified decision trees. First, POI datasets are classified semantically to divide POIs into the standard taxonomy of land cover maps. Then, a decision tree model is built and trained to classify POIs automatically. To eliminate the effects of spatial heterogeneity on POI classification, the shortest distances between each POI and both roads and villages serve as two factors in the modified decision tree model. Finally, a data transformation based on a majority vote algorithm is then performed to convert the classified points into raster form for the purposes of applying confusion matrix methods to the land cover map. Using Beijing as a study area, social media POIs from Sina Weibo were collected to validate artificial surfaces in GlobeLand30 in 2010. A classification accuracy of 80.68% was achieved through our modified decision tree method. Compared with a classification method without spatial heterogeneity, the accuracy is 10% greater. This result indicates that our modified decision tree method displays considerable skill in classifying POIs with high spatial heterogeneity. In addition, a high validation accuracy of 92.76% was achieved, which is relatively close to the official result of 86.7%. These preliminary results indicate that social media POI datasets are valuable ancillary data for land cover validation, and our proposed validation framework provides opportunities for land cover validation with low costs in terms of money and time.
Acknowledgments
This work was supported by the National Natural Science Foundation of China under Grant number 41501420, and China Postdoctoral Science Foundation Funded Project under Grant number 2017M612330 and 2017M612329. We are grateful for the valuable comments of Prof. Jun Chen, Yu Liu, Songnian Li and the anonymous reviewers.
Disclosure statement
No potential conflict of interest was reported by the authors.