
Source Domain Verification Using Corpus-based Tools

Pages 43-55 | Published online: 03 Feb 2020
 

ABSTRACT

Source domain verification has not received as much attention as criteria for metaphor identification in the study of conceptual metaphor. In this paper, we provide a replicable approach to source domain verification which we hope will provide a foundation for new approaches to this important question. We adopt an empirical method extended from previous research that used corpus-based linguistic tools such as SUMO (Suggested Upper Merged Ontology), WordNet, collocational patterns and an online dictionary. We present a new, step-by-step procedure to verify which keywords may be categorized in the source domain of building, using data from the Corpus of Hong Kong Political Speeches, which contains parsed Chinese-language speeches by Chief Executives of the Hong Kong Special Administrative Region (1997–2014). Following the verification of a number of keywords in the building source domain, we discuss how this method may be adapted for other source domains and languages, its application to various areas of study within metaphor research, and the current limitations of this approach.

Acknowledgments

We would like to thank Winnie Hui-heng Zeng, Jessie Shijie Zhang, Joanna Zhuoan Chen, Ivy Wing-Shan Chan, and Leslie Chen Tong for their assistance in building the corpus and in data extraction and analysis. Responsibility for any errors remains with the authors.

Disclosure statement

No potential conflict of interest was reported by the authors.

Notes

1 Shutova and Teufel (2010) take a related approach in English using a subset of categories from the Master Metaphor List (http://araw.mede.uic.edu/~alansz/metaphor/METAPHORLIST.pdf).

2 One advantage of ascertaining source domains before identifying metaphors is that it is then possible to contrast which concepts are mapped to a target domain and which are not in a given corpus. This may vary for corpora from different genres, such as medicine or politics. An advantage of ascertaining metaphors first is that there will then be fewer examples to analyze, since literal instances will already be ruled out.

3 If a previous researcher has postulated keywords to be in a particular source domain and these appear in the corpus, these may also be included in the verification process.

4 The three resources above were originally linked in two pairs: the English synsets in WordNet were mapped to Chinese lexical equivalents by ECTED, and WordNet 1.6 was mapped to SUMO by Niles and Pease. Thus, WordNet synsets were a mediating link for the integration work (Huang et al., 2004).

5 Chung and Chung and Huang (2010) have also used collocational information when seeking to determine which source domains are related to a given target domain.

6 Sketch Engine used a version of MI-Score modified to give greater weight to the frequency of the collocation. A very high score for a collocate means that there is little competition from other collocates because the node (i.e., the search word, the keyword) does not often combine with other collocates (Kilgarriff et al., 2014).

7 SUMO defines a “stationary artifact” as “an Artifact that has a fixed spatial location.” A “Class” in an upper ontology is defined as an abstract group, set, or collection of objects. Most instances of this Class are architectural works, e.g., the Eiffel Tower, the Great Pyramids, office towers, single-family houses, etc. The terms “fixed spatial location” and “Architectural works” allow us to confirm the suggested source domain: “building.”

8 For a full taxonomy of the class of Building in SUMO, please see: http://sigma.ontologyportal.org:8080/sigma/Browse.jsp?lang=EnglishLanguage&flang=SUO-KIF&kb=SUMO&term=Building. To view the entire taxonomy of SUMO, please see: http://www.adampease.org/OP/images/SUMOclasses.gif.

9 These criteria will need to be developed independently for each source domain analyzed, with the understanding that identifying and then using these criteria gives others a framework to verify these decisions, instead of relying solely on intuition.

10 We included criteria b) and c) as they are similar to what can be found under the conceptual domain of “stationary artifact” in SUMO.

11 We included criteria b) and c) as they are similar to what can be found under the conceptual domain of “Constructing” in SUMO.

12 Chinese Sketch Engine is based on the Chinese Gigaword Corpus, which can be found at http://wordsketch.ling.sinica.edu.tw/. Registration is required to access Chinese Sketch Engine.

13 The formula for the mean of means is: $\dfrac{\mathrm{Mean}_{1}(\mathrm{Saliency}_{1}, \mathrm{Saliency}_{2}) + \dots + \mathrm{Mean}_{n-1}(\mathrm{Saliency}_{1}, \mathrm{Saliency}_{2}, \dots, \mathrm{Saliency}_{n})}{n-1}$.
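
The mean-of-means calculation in note 13 can be illustrated with a minimal Python sketch. It reads the formula as follows: Mean_k is the mean of the first k + 1 saliency scores, and the n − 1 running means are then averaged. The function name `mean_of_means` and the sample scores are hypothetical, and the scores are taken in the order supplied, since the note does not state how they are ordered.

```python
from statistics import mean


def mean_of_means(saliencies):
    """Average the running means Mean_1 .. Mean_(n-1) of n saliency scores.

    Assumption: Mean_k is the mean of the first k + 1 scores, in the
    order given; the note does not specify an ordering.
    """
    n = len(saliencies)
    if n < 2:
        raise ValueError("need at least two saliency scores")
    # Mean_1 covers scores 1..2, Mean_2 covers 1..3, ..., Mean_(n-1) covers 1..n.
    running_means = [mean(saliencies[: k + 1]) for k in range(1, n)]
    return sum(running_means) / (n - 1)


# Hypothetical saliency scores, for illustration only.
example_scores = [2.0, 4.0, 6.0]
result = mean_of_means(example_scores)
```

With the hypothetical scores above, Mean_1 is 3.0 and Mean_2 is 4.0, so the mean of means is 3.5.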

14 Note that utilizing a specific corpus allows the researcher to read through a portion of the text and identify potential keywords specific to that corpus that may be missed if solely relying on intuition or previous work.

15 In Chinese, this report is known as 施政报告.

16 In addition to the bilingual tools used in this study, language generation templates for SUMO in Hindi, Italian, German and Czech can be found here: http://www.adampease.org/OP/. Information on Wordnets in other languages can be found here: http://globalwordnet.org/resources/wordnets-in-the-world/ and information on how to use Sketch Engine in over 90 languages can be found here: https://www.sketchengine.eu/corpora-and-languages/.

Additional information

Funding

This work was supported by the University Grants Committee of Hong Kong [General Research Fund #12400014].
