
Semantic similarity and mutual information predicting sentence comprehension: the case of dangling topic construction in Chinese

Pages 142-165 | Received 29 Apr 2022, Accepted 29 Nov 2022, Published online: 15 Dec 2022

References

  • Andrews, G., Ogden, J. E., & Halford, G. S. (2017). Resolving conflicts between syntax and plausibility in sentence comprehension. Advances in Cognitive Psychology, 13(1), 11–27. https://doi.org/10.5709/acp-0203-8
  • Broderick, M. P., Anderson, A. J., Di Liberto, G. M., Crosse, M. J., & Lalor, E. C. (2018). Electrophysiological correlates of semantic dissimilarity reflect the comprehension of natural, narrative speech. Current Biology, 28(5), 803–809.e3. https://doi.org/10.1016/j.cub.2018.01.080
  • Bross, F. (2019). Acceptability ratings in linguistics: A practical guide to grammaticality judgments, data collection, and statistical analysis. Version 1.02. Mimeo.
  • Bürkner, P. C., & Vuorre, M. (2019). Ordinal regression models in psychology: A tutorial. Advances in Methods and Practices in Psychological Science, 2(1), 77–101. https://doi.org/10.1177/2515245918823199
  • Chafe, W. L. (1976). Givenness, contrastiveness, definiteness, subjects, topics, and point of view. In C. N. Li (Ed.), Subject and topic (pp. 27–55). Academic Press.
  • Chandrasekaran, D., & Mago, V. (2022). Evolution of semantic similarity—A survey. ACM Computing Surveys, 54(2), 1–37. https://doi.org/10.1145/3440755
  • Chao, Y. R. (1968). A grammar of spoken Chinese. University of California Press.
  • Chen, P. (1996). Pragmatic interpretations of structural topics and relativization in Chinese. Journal of Pragmatics, 26(3), 389–406. https://doi.org/10.1016/0378-2166(95)00042-9
  • Church, K., & Hanks, P. (1990). Word association norms, mutual information, and lexicography. Computational Linguistics, 16(1), 22–29.
  • Cui, Y., Che, W., Liu, T., Qin, B., & Yang, Z. (2021). Pre-training with whole word masking for Chinese BERT. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 29, 3504–3514. https://doi.org/10.1109/TASLP.2021.3124365
  • DeLong, K. A., Quante, L., & Kutas, M. (2014). Predictability, plausibility, and two late ERP positivities during written sentence comprehension. Neuropsychologia, 61, 150–162. https://doi.org/10.1016/j.neuropsychologia.2014.06.016
  • de Paiva Alves, E. (1996). The selection of the most probable dependency structure in Japanese using mutual information. In A. Joshi & M. Palmer (Eds.), Proceedings of the 34th annual meeting of the association for computational linguistics (pp. 372–374). The Association for Computational Linguistics.
  • Dong, Z., Rhodes, R., & Hestvik, A. (2021). Active gap filling and island constraint in processing the Mandarin “gap-type” topic structure. Frontiers in Communication, 6, 650659. https://doi.org/10.3389/fcomm.2021.650659
  • Ehrlich, S. F., & Rayner, K. (1981). Contextual effects on word perception and eye movements during reading. Journal of Verbal Learning and Verbal Behavior, 20(6), 641–655. https://doi.org/10.1016/S0022-5371(81)90220-6
  • Elman, J. L. (1990). Finding structure in time. Cognitive Science, 14(2), 179–211. https://doi.org/10.1207/s15516709cog1402_1
  • Fano, R. M. (1961). Transmission of information: A statistical theory of communication. MIT Press.
  • Ferreira, F. (2003). The misinterpretation of noncanonical sentences. Cognitive Psychology, 47(2), 164–203. https://doi.org/10.1016/S0010-0285(03)00005-7
  • Firth, J. R. (1957). A synopsis of linguistic theory. In Studies in linguistic analysis. Blackwell.
  • Frank, S. L., & Willems, R. M. (2017). Word predictability and semantic similarity show distinct patterns of brain activity during language comprehension. Language, Cognition and Neuroscience, 32(9), 1192–1203. https://doi.org/10.1080/23273798.2017.1323109
  • Futrell, R. (2019). Information-theoretic locality properties of natural language. In X. Chen & R. Ferrer-i-Cancho (Eds.), Proceedings of the first workshop on quantitative syntax (pp. 2–15). The Association for Computational Linguistics.
  • Gibson, E. (1998). Linguistic complexity: Locality of syntactic dependencies. Cognition, 68(1), 1–76. https://doi.org/10.1016/S0010-0277(98)00034-1
  • Gibson, E., Bergen, L., & Piantadosi, S. T. (2013). Rational integration of noisy evidence and prior semantic expectations in sentence interpretation. Proceedings of the National Academy of Sciences, 110(20), 8051–8056. https://doi.org/10.1073/pnas.1216438110
  • Gries, S. T. (2010). Useful statistics for corpus linguistics. In A. Sánchez & M. Almela (Eds.), A mosaic of corpus linguistics: Selected approaches (pp. 269–291). Peter Lang.
  • Griffiths, T. L., Steyvers, M., & Tenenbaum, J. B. (2007). Topics in semantic representation. Psychological Review, 114(2), 211–244. https://doi.org/10.1037/0033-295X.114.2.211
  • Günther, F., Dudschig, C., & Kaup, B. (2016). Latent semantic analysis cosines as a cognitive similarity measure: Evidence from priming studies. Quarterly Journal of Experimental Psychology, 69(4), 626–653. https://doi.org/10.1080/17470218.2015.1038280
  • Hale, J. T. (2001). A probabilistic Earley parser as a psycholinguistic model. In L. Levin & K. Knight (Eds.), Proceedings of the Second Meeting of the North American Chapter of the Association for Computational Linguistics and Language Technologies (pp. 1–8). The Association for Computational Linguistics.
  • Harispe, S., Ranwez, S., Janaqi, S., & Montmain, J. (2015). Semantic similarity from natural language and ontology analysis. Morgan & Claypool.
  • Hendrix, P., & Sun, C. (2020). The role of information theory for compound words in Mandarin Chinese and English. Cognition, 205, 104389. https://doi.org/10.1016/j.cognition.2020.104389
  • Hu, J., & Pan, H. (2009). Decomposing the aboutness condition for Chinese topic constructions. Linguistic Review, 26, 371–384. https://doi.org/10.1515/tlir.2009.014
  • Huang, C. T., Li, Y. H., & Li, Y. F. (2009). The syntax of Chinese. Cambridge University Press.
  • Huang, R. H., & Ting, J. (2006). Are there dangling topics in Mandarin Chinese? Concentric: Studies in Linguistics, 32, 119–146.
  • Huang, Y. C., & Kaiser, E. (2008). Investigating filler-gap dependencies in Chinese topicalization. In M. Chan & H. Kang (Eds.), Proceedings of the 20th North American conference on Chinese linguistics (Vol. 2, pp. 927–941). The Ohio State University.
  • Jay, T. B. (2003). The psychology of language. Prentice Hall.
  • Jones, M. N., Kintsch, W., & Mewhort, D. J. (2006). High-dimensional semantic space accounts of priming. Journal of Memory and Language, 55(4), 534–552. https://doi.org/10.1016/j.jml.2006.07.003
  • Kennedy, A., Pynte, J., Murray, W. S., & Paul, S. A. (2013). Frequency and predictability effects in the Dundee Corpus: An eye movement analysis. Quarterly Journal of Experimental Psychology, 66(3), 601–618.
  • King, J., & Just, M. A. (1991). Individual differences in syntactic processing: The role of working memory. Journal of Memory and Language, 30(5), 580–602. https://doi.org/10.1016/0749-596X(91)90027-H
  • Kuperberg, G. R. (2007). Neural mechanisms of language comprehension: Challenges to syntax. Brain Research, 1146, 23–49. https://doi.org/10.1016/j.brainres.2006.12.063
  • Landauer, T. K., & Dumais, S. T. (1997). A solution to Plato's problem: The latent semantic analysis theory of acquisition, induction, and representation of knowledge. Psychological Review, 104(2), 211–240. https://doi.org/10.1037/0033-295X.104.2.211
  • Landauer, T. K., Foltz, P. W., & Laham, D. (1998). An introduction to latent semantic analysis. Discourse Processes, 25(2-3), 259–284. https://doi.org/10.1080/01638539809545028
  • Levy, O., Goldberg, Y., & Dagan, I. (2015). Improving distributional similarity with lessons learned from word embeddings. Transactions of the Association for Computational Linguistics, 3, 211–225. https://doi.org/10.1162/tacl_a_00134
  • Levy, R. (2008). Expectation-based syntactic comprehension. Cognition, 106(3), 1126–1177. https://doi.org/10.1016/j.cognition.2007.05.006
  • Li, C. N., & Thompson, S. A. (1981). Mandarin Chinese: A functional reference grammar. University of California Press.
  • Li, C. N., & Thompson, S. A. (1981). Subject and topic: A new typology of language. In C. N. Li (Ed.), Subject and topic (pp. 457–461). Academic Press. (Original work published 1976)
  • Liu, H. (2008). Dependency distance as a metric of language comprehension difficulty. Journal of Cognitive Science, 9(2), 159–191. https://doi.org/10.17791/jcs.2008.9.2.159
  • Luke, S. G., & Christianson, K. (2016). Limits on lexical prediction during reading. Cognitive Psychology, 88, 22–60. https://doi.org/10.1016/j.cogpsych.2016.06.002
  • Lund, K., & Burgess, C. (1996). Producing high-dimensional semantic spaces from lexical co-occurrence. Behavior Research Methods, Instruments, and Computers, 28(2), 203–208.
  • Mandler, J. M. (1984). Scripts, stories and scenes: Aspects of schema theory. Erlbaum.
  • Mikolov, T., Sutskever, I., Chen, K., Corrado, G. S., & Dean, J. (2013). Distributed representations of words and phrases and their compositionality. In C. Burges, L. Bottou, M. Welling, Z. Ghahramani, & K. Weinberger (Eds.), Advances in neural information processing systems (pp. 3111–3119). Curran Associates Inc.
  • Miller, G. (1999). On knowing a word. Annual Review of Psychology, 50, 1–19.
  • Minsky, M. (1974). A framework for representing knowledge. MIT Press.
  • Narayanan, S., & Jurafsky, D. (1998). Bayesian models of human sentence processing. In M. Gernsbacher & S. Derry (Eds.), Proceedings of the 12th Annual Meeting of the Cognitive Science Society (pp. 752–757). Psychology Press.
  • Norman, D. A., & Rumelhart, D. E. (1981). The LNR approach to human information processing. Cognition, 10(1), 235–240. https://doi.org/10.1016/0010-0277(81)90051-2
  • Osgood, C. E., Suci, G. J., & Tannenbaum, P. H. (1957). The measurement of meaning. University of Illinois Press.
  • Padó, U., Crocker, M. W., & Keller, F. (2009). A probabilistic model of semantic plausibility in sentence processing. Cognitive Science, 33(5), 794–838. https://doi.org/10.1111/j.1551-6709.2009.01033.x
  • Pan, H., & Hu, J. (2008). A semantic–pragmatic interface account of (dangling) topics in Mandarin Chinese. Journal of Pragmatics, 40(11), 1966–1981. https://doi.org/10.1016/j.pragma.2008.03.005
  • Peters, M., Neumann, M., Iyyer, M., Gardner, M., Clark, C., Lee, K., & Zettlemoyer, L. (2018). Deep contextualized word representations. In M. Walker, H. Ji, & A. Stent (Eds.), Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics (pp. 2227–2237). The Association for Computational Linguistics.
  • Pilehvar, M. T., & Camacho-Collados, J. (2020). Embeddings in natural language processing: Theory and advances in vector representations of meaning. Morgan & Claypool.
  • Pynte, J., New, B., & Kennedy, A. (2008). On-line contextual influences during reading normal text: A multiple-regression analysis. Vision Research, 48(21), 2172–2183. https://doi.org/10.1016/j.visres.2008.02.004
  • Reimers, N., & Gurevych, I. (2019). Sentence-BERT: Sentence embeddings using Siamese BERT-networks. In S. Padó & R. Huang (Eds.), Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing (pp. 671–688). Association for Computational Linguistics.
  • Resnik, P. (1996). Selectional constraints: An information-theoretic model and its computational realization. Cognition, 61(1-2), 127–159. https://doi.org/10.1016/S0010-0277(96)00722-6
  • Roland, D., Yun, H., Koenig, J.-P., & Mauner, G. (2012). Semantic similarity, predictability, and models of sentence processing. Cognition, 122(3), 267–279. https://doi.org/10.1016/j.cognition.2011.11.011
  • Sánchez-Casas, R., Ferré, P., García-Albea, J., & Guasch, M. (2006). The nature of semantic priming: Effects of the degree of semantic similarity between primes and targets in Spanish. European Journal of Cognitive Psychology, 18(2), 161–184. https://doi.org/10.1080/09541440500183830
  • Schank, R., & Abelson, R. P. (1977). Scripts, plans, goals and understanding: An inquiry into human knowledge structures. Erlbaum.
  • Schütze, C. T. (2016). The empirical base of linguistics: Grammaticality judgments and linguistic methodology. Language Science Press.
  • Shannon, C. E. (1948). A mathematical theory of communication. Bell System Technical Journal, 27(3), 379–423. https://doi.org/10.1002/j.1538-7305.1948.tb01338.x
  • Shi, D. X. (2000). Topic and topic-comment constructions in Mandarin Chinese. Language, 76(2), 383–408. https://doi.org/10.1353/lan.2000.0070
  • Smith, N., & Levy, R. (2013). The effect of word predictability on reading time is logarithmic. Cognition, 128(3), 302–319. https://doi.org/10.1016/j.cognition.2013.02.013
  • Song, Y., Shi, S., Li, J., & Zhang, H. (2018, June). Directional skip-gram: Explicitly distinguishing left and right context for word embeddings. In M. Walker, H. Ji, & A. Stent (Eds.), Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics (pp. 175–180).
  • Sun, K. (2018). Approaching the double-nominal construction in Mandarin Chinese through the semantic-cognitive interaction. Studia Linguistica, 72(3), 687–724. https://doi.org/10.1111/stul.12085
  • Sun, K. (2022). A novel (contextual) semantic similarity predicting eye movements in reading: Compared with cosine and Euclidean algorithms. Manuscript.
  • Traxler, M. J. (2014). Trends in syntactic parsing: Anticipation, Bayesian estimation, and good-enough parsing. Trends in Cognitive Sciences, 18(11), 605–611.
  • Traxler, M. J., & Pickering, M. J. (1996). Plausibility and the processing of unbounded dependencies: An eye-tracking study. Journal of Memory and Language, 35(3), 454–475.
  • Traxler, M. J., Williams, R. S., Blozis, S. A., & Morris, R. K. (2005). Working memory, animacy, and verb class in the processing of relative clauses. Journal of Memory and Language, 53(2), 204–224. https://doi.org/10.1016/j.jml.2005.02.010
  • Tsao, F. F. (1990). A functional study of topic in Chinese: The first step towards discourse analysis. Student Book Co. (Original work published 1979)
  • van Rij, J., Wieling, M., Baayen, R. H., & van Rijn, H. (2020). itsadug: Interpreting Time Series and Autocorrelated Data Using GAMMs. R package version 2.4.1. https://CRAN.R-project.org/package=itsadug
  • Cowart, W. (1997). Experimental syntax: Applying objective methods to sentence judgements. Sage.
  • Wittgenstein, L. (2010). Philosophical investigations. John Wiley and Sons.
  • Wood, S. N. (2017). Generalized additive models: An introduction with R. CRC Press.
  • Wu, T. (2016). Chinese-style topics as indexicality. International Journal of Chinese Linguistics, 3(2), 201–244. https://doi.org/10.1075/ijchl.3.2.02wu
  • Xu, L. J. (2015). Topic prominence. In W. S. Wang & C. F. Sun (Eds.), The Oxford handbook of Chinese linguistics (pp. 393–403). Oxford University Press.
  • Xu, L. J., & Langendoen, D. T. (1985). Topic structures in Chinese. Language, 61(1), 1–27. https://doi.org/10.2307/413419
  • Xu, L. J., & Liu, D. Q. (2007). Huati de jiegou yu gongneng [Structure and function of Chinese topic construction]. Shanghai Education Press.
  • Yang, Y., & Tao, L. (2014). Neural mechanisms of trace in Chinese topicalized constructions. Social Sciences in China, 35(1), 86–111. https://doi.org/10.1080/02529203.2013.875661
