A machine learning approach to extracting spatial information from geological texts in Chinese

Pages 2169-2193 | Received 19 Jul 2021, Accepted 05 Jun 2022, Published online: 15 Jun 2022

