Classifying natural-language spatial relation terms with random forest algorithm

Shihong Du, Institute of Remote Sensing and GIS, Peking University, Beijing, China. Correspondence: [email protected]

Xiaonan Wang, Institute of Remote Sensing and GIS, Peking University, Beijing, China

Chen-Chieh Feng, Department of Geography, National University of Singapore, Singapore, Singapore

Xiuyuan Zhang, Institute of Remote Sensing and GIS, Peking University, Beijing, China

ABSTRACT

The exponential growth of natural language text data in social media has provided a rich source of geographic information. However, incorporating such a data source into GIS analysis faces tremendous challenges, as existing GIS data tend to be geometry based while natural language text data tend to rely on natural language spatial relation (NLSR) terms. To alleviate this problem, one critical step is to translate geometric configurations into NLSR terms, but existing methods (e.g. the mean value or decision tree algorithm) are insufficient to obtain a precise translation. This study addresses this issue by adopting the random forest (RF) algorithm to automatically learn a robust mapping model from a large number of samples and to evaluate the importance of each variable for each NLSR term. Because the semantic similarity of the collected terms reduces the classification accuracy, different grouping schemes of NLSR terms are used, and their influence on the classification results is evaluated. The experimental results demonstrate that the learned model can accurately transform geometric configurations into NLSR terms, and that recognizing different groups of terms requires different sets of variables. More importantly, the results of the variable importance evaluation indicate that the topology types determined by the 9-intersection model are less important than metric variables in defining NLSR terms, which contrasts with the assertion of ‘topology matters, metric refines’ in existing studies.
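
As a rough illustration of the workflow summarized above (train a random forest on geometric variables, then inspect variable importance), the following is a minimal sketch using scikit-learn. It is not the authors' implementation: the variable names (`topology_type`, `distance`, `overlap`), the three term labels, and the data are all hypothetical and synthetic. The labels are generated from the metric variables only, so the low importance of `topology_type` here holds by construction and does not reproduce the study's findings.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 600

# Hypothetical variables describing a geometric configuration (all synthetic).
topology = rng.integers(0, 3, size=n)      # integer-coded topology type
distance = rng.uniform(0.0, 1.0, size=n)   # normalized distance (metric)
overlap = rng.uniform(0.0, 1.0, size=n)    # overlap ratio (metric)
feature_names = ["topology_type", "distance", "overlap"]
X = np.column_stack([topology, distance, overlap])

# Synthetic NLSR-like labels, driven by the metric variables only (assumption).
labels = np.where(overlap > 0.6, "crosses",
                  np.where(distance < 0.4, "near", "far"))

X_train, X_test, y_train, y_test = train_test_split(
    X, labels, test_size=0.25, random_state=0
)

# Fit the forest and score it on the held-out samples.
model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X_train, y_train)
accuracy = model.score(X_test, y_test)

# Impurity-based importance of each variable; the values sum to 1.
importances = dict(zip(feature_names, model.feature_importances_))

print(f"held-out accuracy: {accuracy:.3f}")
for name, value in sorted(importances.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {value:.3f}")
```

The sketch reports a single importance ranking for all terms. A per-term evaluation, as the abstract describes, could be approximated by fitting one binary (one-vs-rest) forest per term, which is again an assumption rather than the method documented in the source.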

Acknowledgements

The work of the first author is supported by the National Natural Science Foundation of China (No. 41171297). The work of the second author is supported by the National University of Singapore Academic Research Fund (R-109-000-112-112). Comments from the editor and three anonymous reviewers are greatly appreciated.

Disclosure statement

No potential conflict of interest was reported by the authors.

Additional information

Funding

This work was supported by the National Natural Science Foundation of China [41171297] and the National University of Singapore Academic Research Fund [R-109-000-112-112].
