Digital begriffsgeschichte: Tracing semantic change using word embeddings

References

  • Allan, K., and J. A. Robinson, eds. 2012. Current methods in historical semantics. Berlin: De Gruyter Mouton.
  • Andersen, N. Å. 2003. Discursive analytical strategies: Understanding Foucault, Koselleck, Laclau, Luhmann. 1st ed. Bristol: Bristol University Press. doi: 10.2307/j.ctt1t898nd.
  • Antoniak, M., and D. Mimno. 2018. Evaluating the stability of embedding-based word similarities. Transactions of the Association for Computational Linguistics 6:107–19. doi: 10.1162/tacl_a_00008.
  • Azarbonyad, H., M. Dehghani, K. Beelen, A. Arkut, M. Marx, and J. Kamps. 2017. Words are malleable: Computing semantic shifts in political and media discourse. Proceedings of the 2017 ACM on Conference on Information and Knowledge Management, ACM, 1509–1518. doi: 10.1145/3132847.3132878.
  • Baroni, M., G. Dinu, and G. Kruszewski. 2014. Don’t count, predict! A systematic comparison of context-counting vs. context-predicting semantic vectors. ACL 1 (1):238–247.
  • Barranco, R. C., R. F. Dos Santos, M. S. Hossain, and M. Akbar. 2018. Tracking the Evolution of Words with Time-reflective Text Representations. 2018 IEEE International Conference on Big Data (Big Data), IEEE, 2088–2097. doi: 10.1109/BigData.2018.8621902.
  • Batchkarov, M., T. Kober, J. Reffin, J. Weeds, and D. Weir. 2016. A critique of word similarity as a method for evaluating distributional semantic models. Proceedings of the 1st Workshop on Evaluating Vector-Space Representations for NLP, Association for Computational Linguistics, Berlin, Germany, 7–12.
  • Bingham, A. 2010. The digitization of newspaper archives: Opportunities and challenges for historians. Twentieth Century British History 21 (2):225–31. doi: 10.1093/tcbh/hwq007.
  • Bollmann, M., and A. Søgaard. 2016. Improving historical spelling normalization with bi-directional LSTMs and multi-task learning. Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers, presented at the COLING 2016, The COLING 2016 Organizing Committee, Osaka, Japan, 131–139.
  • Bolukbasi, T., K.-W. Chang, J. Y. Zou, V. Saligrama, and A. T. Kalai. 2016. Man is to computer programmer as woman is to homemaker? Debiasing word embeddings. In Advances in neural information processing systems 29, ed. D.D. Lee, M. Sugiyama, U.V. Luxburg, I. Guyon and R. Garnett, 4349–57. Curran Associates, Inc.
  • Brunner, O., W. Conze, and R. Koselleck. 2004. Geschichtliche Grundbegriffe: Historisches Lexikon zur politisch-sozialen Sprache in Deutschland. 8 Bände in 9. Stuttgart, Germany: Klett-Cotta.
  • Carpineto, C., and G. Romano. 2012. A survey of automatic query expansion in information retrieval. ACM Computing Surveys 44 (1):1–50. doi: 10.1145/2071389.2071390.
  • De Bolla, P. 2013. The architecture of concepts: The historical formation of human rights. New York: Fordham University Press.
  • Deerwester, S., S. T. Dumais, G. W. Furnas, T. K. Landauer, and R. Harshman. 1990. Indexing by latent semantic analysis. Journal of the American Society for Information Science 41 (6):391–407. doi: 10.1002/(SICI)1097-4571(199009)41:6<391::AID-ASI1>3.0.CO;2-9.
  • Devlin, J., M.-W. Chang, K. Lee, and K. Toutanova. 2019. BERT: Pre-training of deep bidirectional transformers for language understanding. Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), presented at the NAACL-HLT 2019, Association for Computational Linguistics, Minneapolis, Minnesota, 4171–4186.
  • Diaz, F., B. Mitra, and N. Craswell. 2016. Query expansion with locally-trained word embeddings. Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 1, 367–377. doi: 10.18653/v1/P16-1035.
  • Dubossarsky, H., D. Weinshall, and E. Grossman. 2017. Outta control: Laws of semantic change and inherent biases in word representation models. Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, Association for Computational Linguistics, Copenhagen, Denmark, 1136–1145.
  • Efthimiadis, E. N. 1996. Query expansion. Annual Review of Information Science and Technology (ARIST) 31:121–87.
  • Egense, T. 2017. Automated improvement of search in low quality OCR using Word2Vec. https://sbdevel.wordpress.com/2017/02/02/automated-improvement-of-search-in-low-quality-ocr-using-word2vec/.
  • Egozi, O., S. Markovitch, and E. Gabrilovich. 2011. Concept-based information retrieval using explicit semantic analysis. ACM Transactions on Information Systems 29 (2):1–34. doi: 10.1145/1961209.1961211.
  • Foucault, M. 2012. The archaeology of knowledge [1974]. London: Random House.
  • Garg, N., L. Schiebinger, D. Jurafsky, and J. Zou. 2018. Word embeddings quantify 100 years of gender and ethnic stereotypes. Proceedings of the National Academy of Sciences 115 (16):E3635–E3644. doi: 10.1073/pnas.1720347115.
  • Gavin, M. 2015. The arithmetic of concepts: A response to Peter de Bolla. Modeling Literary History, 18 September. Accessed February 25, 2019. http://modelingliteraryhistory.org/2015/09/18/the-arithmetic-of-concepts-a-response-to-peter-de-bolla/.
  • Giuliano, V. E., and P. E. Jones. 1962. Linear associative information retrieval. Computer Science. doi: 10.21236/ad0290313.
  • Hamilton, W. L., J. Leskovec, and D. Jurafsky. 2016a. Cultural shift or linguistic drift? Comparing two computational measures of semantic change. Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, 2116.
  • Hamilton, W. L., J. Leskovec, and D. Jurafsky. 2016b. Diachronic word embeddings reveal statistical laws of semantic change. Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), Association for Computational Linguistics, Berlin, Germany, 1489–1501. doi: 10.18653/v1/P16-1141.
  • Harris, Z. S. 1954. Distributional structure. Word 10 (2-3):146–62. doi: 10.1080/00437956.1954.11659520.
  • Hellrich, J., S. Buechel, and U. Hahn. 2018. JeSemE: Interleaving Semantics and Emotions in a Web Service for the Exploration of Language Change Phenomena. Proceedings of the 27th International Conference on Computational Linguistics: System Demonstrations, Association for Computational Linguistics, Santa Fe, New Mexico, 10–14.
  • Hellrich, J., and U. Hahn. 2016. Bad Company-Neighborhoods in Neural Embedding Spaces Considered Harmful. COLING 2785–96.
  • Huistra, H., and B. Mellink. 2016. Phrasing history: Selecting sources in digital repositories. Historical Methods: A Journal of Quantitative and Interdisciplinary History 49 (4):220–9. doi: 10.1080/01615440.2016.1205964.
  • Ifversen, J. 2003. Text, discourse, concept: Approaches to textual analysis. Kontur 7:60–9.
  • James, P., and M. B. Steger. 2014. A Genealogy of ‘Globalization’: The Career of a Concept. Globalizations 11 (4):417–34. doi: 10.1080/14747731.2014.951186.
  • Junge, K., and K. Postoutenko. 2014. Asymmetrical concepts after Reinhart Koselleck: Historical semantics and beyond. Bielefeld: Transcript Verlag.
  • Jurafsky, D., and J. Martin. 2009. Speech and language processing: An introduction to natural language processing, computational linguistics, and speech recognition. Upper Saddle River: Pearson.
  • Kenter, T., M. Wevers, P. Huijnen, and M. de Rijke. 2015. Ad Hoc monitoring of vocabulary shifts over time. Proceedings of the 24th ACM International on Conference on Information and Knowledge Management, ACM, New York, 1191–1200. doi: 10.1145/2806416.2806474.
  • Kim, Y., Y.-I. Chiu, K. Hanaki, D. Hegde, and S. Petrov. 2014. Temporal analysis of language through neural language models. Proceedings of the ACL 2014 Workshop on Language Technologies and Computational Social Science, Association for Computational Linguistics, Baltimore, MD, USA, 61–65.
  • Koolen, M., F. Adriaans, J. Kamps, and M. De Rijke. 2006. A cross-language approach to historic document retrieval. European Conference on Information Retrieval, 407–19.
  • Koselleck, R. 1989. Social History and Conceptual History. International Journal of Politics, Culture and Society 2 (3):308–25. doi: 10.1007/BF01384827.
  • Koselleck, R. 2002. The practice of conceptual history: Timing history, spacing concepts. 1st ed. Stanford: Stanford University Press.
  • van Lange, M., and R. Futselaar. 2018. Debating evil: Using word embeddings to analyze parliamentary debates on war criminals in The Netherlands. Proceedings of the Conference on Language Technologies & Digital Humanities, 147–153.
  • Levy, O., and Y. Goldberg. 2014. Linguistic regularities in sparse and explicit word representations. Proceedings of the Eighteenth Conference on Computational Natural Language Learning, Association for Computational Linguistics, Ann Arbor, Michigan, 171–180. doi: 10.3115/v1/W14-1618.
  • Levy, O., Y. Goldberg, and I. Dagan. 2015. Improving distributional similarity with lessons learned from word embeddings. Transactions of the Association for Computational Linguistics 3:211–25. doi: 10.1162/tacl_a_00134.
  • Martinez-Ortiz, C., T. Kenter, M. Wevers, P. Huijnen, J. Verheul, and J. Van Eijnatten. 2016. Design and implementation of ShiCo: Visualising shifting concepts over time. HistoInformatics 2016 1632:11–9.
  • Matsuoka, J., and Y. Lepage. 2014. Measuring similarity from word pair matrices with syntagmatic and paradigmatic associations. Proceedings of the 4th Workshop on Cognitive Aspects of the Lexicon (CogALex), Association for Computational Linguistics and Dublin City University, Dublin, Ireland, 77–86. doi: 10.3115/v1/W14-4712.
  • Mikolov, T., I. Sutskever, K. Chen, G. S. Corrado, and J. Dean. 2013. Distributed Representations of Words and Phrases and their Compositionality. In Advances in neural information processing systems 26, ed. C.J.C. Burges, L. Bottou, M. Welling, Z. Ghahramani and K.Q. Weinberger, 3111–9. Curran Associates, Inc.
  • Mikolov, T., W. Yih, and G. Zweig. 2013. Linguistic regularities in continuous space word representations. Proceedings of the 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 746–51.
  • Miller, G. A., and W. G. Charles. 1991. Contextual correlates of semantic similarity. Language and Cognitive Processes 6 (1):1–28. doi: 10.1080/01690969108406936.
  • Nicholson, B. 2013. The digital turn: Exploring the methodological possibilities of digital newspaper archives. Media History 19 (1):59–73. doi: 10.1080/13688804.2012.752963.
  • Orlikowski, M., M. Hartung, and P. Cimiano. 2018. Learning diachronic analogies to analyze concept change. Proceedings of the Second Joint SIGHUM Workshop on Computational Linguistics for Cultural Heritage, Social Sciences, Humanities and Literature, 1–11.
  • Palti, E. J. 2011. Reinhart Koselleck: His concept of the concept and Neo-Kantianism. Contributions to the History of Concepts 6 (2):1–20. doi: 10.3167/choc.2011.060201.
  • Peters, M., M. Neumann, M. Iyyer, M. Gardner, C. Clark, K. Lee, and L. Zettlemoyer. 2018. Deep contextualized word representations. Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), presented at the NAACL-HLT 2018, Association for Computational Linguistics, New Orleans, Louisiana, 2227–2237.
  • Piotrowski, M. 2012. Natural language processing for historical texts. Synthesis Lectures on Human Language Technologies 5 (2):1–157. doi: 10.2200/S00436ED1V01Y201207HLT017.
  • Qiu, Y., and H.-P. Frei. 1993. Concept based query expansion. Proceedings of the 16th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, 160–169. doi: 10.1145/160688.160713.
  • Rapp, R. 2003. Syntagmatic and paradigmatic associations in information retrieval. In Between data science and applied data analysis, ed. M. Schader, W. Gaul and M. Vichi, 473–82. Berlin: Springer Berlin Heidelberg.
  • Recchia, G., E. Jones, P. Nulty, J. Regan, and P. de Bolla. 2016. Tracing shifting conceptual vocabularies through time. European knowledge acquisition workshop, 19–28. Springer.
  • Richter, C., M. Wickes, D. Beser, and M. Marcus. 2018. Low-resource post processing of noisy OCR output for historical corpus digitisation. Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018), presented at the LREC 2018, European Language Resources Association (ELRA), Miyazaki, Japan. Accessed April 13, 2020. https://www.aclweb.org/anthology/L18-1369.
  • Richter, M. 1987. Begriffsgeschichte and the History of Ideas. Journal of the History of Ideas 48 (2):247–63. doi: 10.2307/2709557.
  • Roy, D., D. Paul, M. Mitra, and U. Garain. 2016. Using word embeddings for automatic query expansion. arXiv preprint arXiv:1606.07608.
  • Rudolph, M., and D. Blei. 2018. Dynamic embeddings for language evolution. Proceedings of the 2018 World Wide Web Conference, International World Wide Web Conferences Steering Committee, Republic and Canton of Geneva, CHE, 1003–1011. doi: 10.1145/3178876.3185999.
  • Sahlgren, M., and A. Lenci. 2016. The Effects of Data Size and Frequency Range on Distributional Semantic Models. Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, presented at the EMNLP 2016, Association for Computational Linguistics, Austin, Texas, 975–980.
  • Salton, G., A. Wong, and C.-S. Yang. 1975. A vector space model for automatic indexing. Communications of the ACM 18 (11):613–20. doi: 10.1145/361219.361220.
  • Schmieder, F. 2019. “Begriffsgeschichte’s methodological neighbors and the scientification of concepts”, 2 October. Accessed October 28, 2019. https://jhiblog.org/2019/10/02/begriffsgeschichtes-methodological-neighbors-and-the-scientification-of-concepts/.
  • Schütze, H., and J. Pedersen. 1993. A vector model for syntagmatic and paradigmatic relatedness. Proceedings of the 9th Annual Conference of the UW Centre for the New OED and Text Research, 104–13.
  • Skinner, Q. 2002. Visions of politics. Cambridge: Cambridge University Press.
  • Sommerauer, P., and A. Fokkens. 2019. Conceptual change and distributional semantic models: An exploratory study on pitfalls and possibilities. Proceedings of the 1st International Workshop on Computational Approaches to Historical Language Change, Association for Computational Linguistics, Florence, Italy, 223–233. doi: 10.18653/v1/W19-4728.
  • Sommerauer, P. J. M., and A. S. Fokkens. 2018. Firearms and tigers are dangerous, kitchen knives and zebras are not: Testing whether word embeddings can tell. The 2018 EMNLP Workshop BlackboxNLP: Analyzing and Interpreting Neural Networks for NLP: Proceedings of the First Workshop, Association for Computational Linguistics (ACL), 276–286.
  • Szymanski, T. 2017. Temporal word analogies: Identifying lexical replacement with diachronic word embeddings. Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), 448–453.
  • Tulkens, S., C. Emmery, W. Daelemans, et al. 2016. Evaluating unsupervised Dutch word embeddings as a linguistic resource. In Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC 2016), ed. N. Calzolari, K. Choukri, T. Declerck, M. Grobelnik, B. Maegaard, J. Mariani, A. Moreno. European Language Resources Association (ELRA).
  • Wang, B., A. Wang, F. Chen, Y. Wang, and C.-C. J. Kuo. 2019. Evaluating word embedding models: Methods and experimental results. In APSIPA transactions on signal and information processing, vol. 8. Cambridge: Cambridge University Press. doi: 10.1017/ATSIP.2019.12.
  • Wevers, M. 2019. Using word embeddings to examine gender bias in Dutch newspapers, 1950-1990. In Proceedings of the 1st International Workshop on Computational Approaches to Historical Language Change, presented at the ACL, ed. N. Tahmasebi, L. Borin, A. Jatowt and Y. Xu. Florence, Italy. doi: 10.18653/v1/W19-4712.
  • Wittgenstein, L. 2010. Philosophical investigations. Hoboken, New Jersey: John Wiley & Sons.
  • Yao, Z., Y. Sun, W. Ding, N. Rao, and H. Xiong. 2018. Dynamic Word Embeddings for Evolving Semantic Discovery. Proceedings of the Eleventh ACM International Conference on Web Search and Data Mining - WSDM ‘18, 673–681. doi: 10.1145/3159652.3159703.