Historical Methods: A Journal of Quantitative and Interdisciplinary History Volume 53, 2020 - Issue 4

Open access

Digital begriffsgeschichte: Tracing semantic change using word embeddings

Melvin Wevers, DHLab, KNAW Humanities Cluster, Amsterdam, The Netherlands; Department of History, University of Amsterdam, Amsterdam, The Netherlands. Correspondence: [email protected]

Marijn Koolen, Digital Infrastructure, KNAW Humanities Cluster, Amsterdam, The Netherlands

Pages 226-243 | Published online: 13 May 2020

https://doi.org/10.1080/01615440.2020.1760157

References

Allan, K., and J. A. Robinson, eds. 2012. Current methods in historical semantics. Berlin: De Gruyter Mouton.
Andersen, N. Å. 2003. Discursive analytical strategies: Understanding Foucault, Koselleck, Laclau, Luhmann. 1st ed. Bristol: Bristol University Press. doi: 10.2307/j.ctt1t898nd.
Antoniak, M., and D. Mimno. 2018. Evaluating the stability of embedding-based word similarities. Transactions of the Association for Computational Linguistics 6:107–19. doi: 10.1162/tacl_a_00008.
Azarbonyad, H., M. Dehghani, K. Beelen, A. Arkut, M. Marx, and J. Kamps. 2017. Words are malleable: Computing semantic shifts in political and media discourse. Proceedings of the 2017 ACM on Conference on Information and Knowledge Management, ACM, 1509–1518. doi: 10.1145/3132847.3132878.
Baroni, M., G. Dinu, and G. Kruszewski. 2014. Don’t count, predict! A systematic comparison of context-counting vs. context-predicting semantic vectors. ACL 1 (1):238–247.
Barranco, R. C., R. F. Dos Santos, M. S. Hossain, and M. Akbar. 2018. Tracking the Evolution of Words with Time-reflective Text Representations. 2018 IEEE International Conference on Big Data (Big Data), IEEE, 2088–2097. doi: 10.1109/BigData.2018.8621902.
Batchkarov, M., T. Kober, J. Reffin, J. Weeds, and D. Weir. 2016. A critique of word similarity as a method for evaluating distributional semantic models. Proceedings of the 1st Workshop on Evaluating Vector-Space Representations for NLP, Association for Computational Linguistics, Berlin, Germany, 7–12.
Bingham, A. 2010. The digitization of newspaper archives: Opportunities and challenges for historians. Twentieth Century British History 21 (2):225–31. doi: 10.1093/tcbh/hwq007.
Bollmann, M., and A. Søgaard. 2016. Improving historical spelling normalization with bi-directional LSTMs and multi-task learning. Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers, presented at the COLING 2016, The COLING 2016 Organizing Committee, Osaka, Japan, 131–139.
Bolukbasi, T., K.-W. Chang, J. Y. Zou, V. Saligrama, and A. T. Kalai. 2016. Man is to computer programmer as woman is to homemaker? Debiasing word embeddings. In Advances in neural information processing systems 29, ed. D.D. Lee, M. Sugiyama, U.V. Luxburg, I. Guyon and R. Garnett, 4349–57. Curran Associates, Inc.
Brunner, O., W. Conze, and R. Koselleck. 2004. Geschichtliche Grundbegriffe: Historisches Lexikon zur politisch-sozialen Sprache in Deutschland. 8 Bände in 9. Stuttgart, Germany: Klett-Cotta.
Carpineto, C., and G. Romano. 2012. A survey of automatic query expansion in information retrieval. ACM Computing Surveys 44 (1):1–50. doi: 10.1145/2071389.2071390.
De Bolla, P. 2013. The architecture of concepts: The historical formation of human rights. New York: Fordham University Press.
Deerwester, S., S. T. Dumais, G. W. Furnas, T. K. Landauer, and R. Harshman. 1990. Indexing by latent semantic analysis. Journal of the American Society for Information Science 41 (6):391–407. doi: 10.1002/(SICI)1097-4571(199009)41:6<391::AID-ASI1>3.0.CO;2-9.
Devlin, J., M.-W. Chang, K. Lee, and K. Toutanova. 2019. BERT: Pre-training of deep bidirectional transformers for language understanding. Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), presented at the NAACL-HLT 2019, Association for Computational Linguistics, Minneapolis, Minnesota, 4171–4186.
Diaz, F., B. Mitra, and N. Craswell. 2016. Query expansion with locally-trained word embeddings. Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 1, 367–377. doi: 10.18653/v1/P16-1035.
Dubossarsky, H., D. Weinshall, and E. Grossman. 2017. Outta control: Laws of semantic change and inherent biases in word representation models. Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, Association for Computational Linguistics, Copenhagen, Denmark, 1136–1145.
Efthimiadis, E. N. 1996. Query expansion. Annual Review of Information Science and Technology (ARIST) 31:121–87.
Egense, T. 2017. Automated improvement of search in low quality OCR using Word2Vec. https://sbdevel.wordpress.com/2017/02/02/automated-improvement-of-search-in-low-quality-ocr-using-word2vec/.
Egozi, O., S. Markovitch, and E. Gabrilovich. 2011. Concept-based information retrieval using explicit semantic analysis. ACM Transactions on Information Systems 29 (2):1–34. doi: 10.1145/1961209.1961211.
Foucault, M. 2012. The archaeology of knowledge [1974]. London: Random House.
Garg, N., L. Schiebinger, D. Jurafsky, and J. Zou. 2018. Word embeddings quantify 100 years of gender and ethnic stereotypes. Proceedings of the National Academy of Sciences 115 (16):E3635–E3644. doi: 10.1073/pnas.1720347115.
Gavin, M. 2015. “The arithmetic of concepts: A response to Peter de Bolla”. Modeling Literary History, 18 September. Accessed February 25, 2019. http://modelingliteraryhistory.org/2015/09/18/the-arithmetic-of-concepts-a-response-to-peter-de-bolla/.
Giuliano, V. E., and P. E. Jones. 1962. “Linear associative information retrieval”. Computer Science. doi: 10.21236/ad0290313.
Hamilton, W. L., J. Leskovec, and D. Jurafsky. 2016a. Cultural shift or linguistic drift? comparing two computational measures of semantic change. Proceedings of the Conference on Empirical Methods in Natural Language Processing. Conference on Empirical Methods in Natural Language Processing, Vol. 2016, NIH Public Access, 2116.
Hamilton, W. L., J. Leskovec, and D. Jurafsky. 2016b. Diachronic word embeddings reveal statistical laws of semantic change. Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), presented at the Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), Association for Computational Linguistics, Berlin, Germany, 1489–1501. doi: 10.18653/v1/P16-1141.
Harris, Z. S. 1954. Distributional structure. Word 10 (2-3):146–62. doi: 10.1080/00437956.1954.11659520.
Hellrich, J., S. Buechel, and U. Hahn. 2018. JeSemE: Interleaving Semantics and Emotions in a Web Service for the Exploration of Language Change Phenomena. Proceedings of the 27th International Conference on Computational Linguistics: System Demonstrations, Association for Computational Linguistics, Santa Fe, New Mexico, 10–14.
Hellrich, J., and U. Hahn. 2016. Bad Company-Neighborhoods in Neural Embedding Spaces Considered Harmful. COLING 2785–96.
Huistra, H., and B. Mellink. 2016. Phrasing history: Selecting sources in digital repositories. Historical Methods: A Journal of Quantitative and Interdisciplinary History 49 (4):220–9. doi: 10.1080/01615440.2016.1205964.
Ifversen, J. 2003. Text, discourse, concept: Approaches to textual analysis. Kontur 7:60–9.
James, P., and M. B. Steger. 2014. A Genealogy of ‘Globalization’: The Career of a Concept. Globalizations 11 (4):417–34. doi: 10.1080/14747731.2014.951186.
Junge, K., and K. Postoutenko. 2014. Asymmetrical concepts after Reinhart Koselleck: Historical semantics and beyond. Bielefeld: Transcript Verlag.
Jurafsky, D., and J. Martin. 2009. Speech and language processing: An introduction to natural language processing, computational linguistics, and speech recognition. Upper Saddle River: Pearson.
Kenter, T., M. Wevers, P. Huijnen, and M. de Rijke. 2015. Ad Hoc monitoring of vocabulary shifts over time. Proceedings of the 24th ACM International on Conference on Information and Knowledge Management, ACM, New York, 1191–1200. doi: 10.1145/2806416.2806474.
Kim, Y., Y.-I. Chiu, K. Hanaki, D. Hegde, and S. Petrov. 2014. Temporal analysis of language through neural language models. Proceedings of the ACL 2014 Workshop on Language Technologies and Computational Social Science, Association for Computational Linguistics, Baltimore, MD, USA, 61–65.
Koolen, M., F. Adriaans, J. Kamps, and M. De Rijke. 2006. A cross-language approach to historic document retrieval. European Conference on Information Retrieval, 407–19.
Koselleck, R. 1989. Social History and Conceptual History. International Journal of Politics, Culture and Society 2 (3):308–25. doi: 10.1007/BF01384827.
Koselleck, R. 2002. The practice of conceptual history: Timing history, spacing concepts. 1st ed. Stanford: Stanford University Press.
van Lange, M., and R. Futselaar. 2018. Debating evil: Using word embeddings to analyze parliamentary debates on war criminals in The Netherlands. Proceedings of the Conference on Language Technologies & Digital Humanities, 147–153.
Levy, O., and Y. Goldberg. 2014. Linguistic Regularities in Sparse and Explicit Word Representations. Proceedings of the Eighteenth Conference on Computational Natural Language Learning, presented at the Proceedings of the Eighteenth Conference on Computational Natural Language Learning, Association for Computational Linguistics, Ann Arbor, Michigan, 171–180. doi: 10.3115/v1/W14-1618.
Levy, O., Y. Goldberg, and I. Dagan. 2015. Improving distributional similarity with lessons learned from word embeddings. Transactions of the Association for Computational Linguistics 3:211–25. doi: 10.1162/tacl_a_00134.
Martinez-Ortiz, C., T. Kenter, M. Wevers, P. Huijnen, J. Verheul, and J. Van Eijnatten. 2016. Design and implementation of ShiCo: Visualising shifting concepts over time. HistoInformatics 2016 1632:11–9.
Matsuoka, J., and Y. Lepage. 2014. Measuring similarity from word pair matrices with syntagmatic and paradigmatic associations. Proceedings of the 4th Workshop on Cognitive Aspects of the Lexicon (CogALex), Association for Computational Linguistics and Dublin City University, Dublin, Ireland, 77–86. doi: 10.3115/v1/W14-4712.
Mikolov, T., I. Sutskever, K. Chen, G. S. Corrado, and J. Dean. 2013. Distributed Representations of Words and Phrases and their Compositionality. In Advances in neural information processing systems 26, ed. C.J.C. Burges, L. Bottou, M. Welling, Z. Ghahramani and K.Q. Weinberger, 3111–9. Curran Associates, Inc.
Mikolov, T., W. Yih, and G. Zweig. 2013. Linguistic regularities in continuous space word representations. Proceedings of the 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 746–51.
Miller, G. A., and W. G. Charles. 1991. Contextual correlates of semantic similarity. Language and Cognitive Processes 6 (1):1–28. doi: 10.1080/01690969108406936.
Nicholson, B. 2013. The digital turn: Exploring the methodological possibilities of digital newspaper archives. Media History 19 (1):59–73. doi: 10.1080/13688804.2012.752963.
Orlikowski, M., M. Hartung, and P. Cimiano. 2018. Learning diachronic analogies to analyze concept change. Proceedings of the Second Joint SIGHUM Workshop on Computational Linguistics for Cultural Heritage, Social Sciences, Humanities and Literature, 1–11.
Palti, E. J. 2011. Reinhart Koselleck: His concept of the concept and Neo-Kantianism. Contributions to the History of Concepts 6 (2):1–20. doi: 10.3167/choc.2011.060201.
Peters, M., M. Neumann, M. Iyyer, M. Gardner, C. Clark, K. Lee, and L. Zettlemoyer. 2018. Deep contextualized word representations. Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), presented at the NAACL-HLT 2018, Association for Computational Linguistics, New Orleans, Louisiana, 2227–2237.
Piotrowski, M. 2012. Natural language processing for historical texts. Synthesis Lectures on Human Language Technologies 5 (2):1–157. doi: 10.2200/S00436ED1V01Y201207HLT017.
Qiu, Y., and H.-P. Frei. 1993. Concept based query expansion. Proceedings of the 16th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, 160–169. doi: 10.1145/160688.160713.
Rapp, R. 2003. Syntagmatic and paradigmatic associations in information retrieval. In Between data science and applied data analysis, ed. M. Schader, W. Gaul and M. Vichi, 473–82. Berlin: Springer Berlin Heidelberg.
Recchia, G., E. Jones, P. Nulty, J. Regan, and P. de Bolla. 2016. Tracing shifting conceptual vocabularies through time. European knowledge acquisition workshop, 19–28. Springer.
Richter, C., M. Wickes, D. Beser, and M. Marcus. 2018. Low-resource post processing of noisy OCR output for historical corpus digitisation. Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018), presented at the LREC 2018, European Language Resources Association (ELRA), Miyazaki, Japan. Accessed April 13, 2020. https://www.aclweb.org/anthology/L18-1369.
Richter, M. 1987. Begriffsgeschichte and the History of Ideas. Journal of the History of Ideas 48 (2):247–63. doi: 10.2307/2709557.
Roy, D., D. Paul, M. Mitra, and U. Garain. 2016. Using word embeddings for automatic query expansion. arXiv preprint arXiv:1606.07608.
Rudolph, M., and D. Blei. 2018. Dynamic embeddings for language evolution. Proceedings of the 2018 World Wide Web Conference, International World Wide Web Conferences Steering Committee, Republic and Canton of Geneva, CHE, 1003–1011. doi: 10.1145/3178876.3185999.
Sahlgren, M., and A. Lenci. 2016. The Effects of Data Size and Frequency Range on Distributional Semantic Models. Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, presented at the EMNLP 2016, Association for Computational Linguistics, Austin, Texas, 975–980.
Salton, G., A. Wong, and C.-S. Yang. 1975. A vector space model for automatic indexing. Communications of the ACM 18 (11):613–20. doi: 10.1145/361219.361220.
Schmieder, F. 2019. “Begriffsgeschichte’s methodological neighbors and the scientification of concepts”, 2 October. Accessed October 28, 2019. https://jhiblog.org/2019/10/02/begriffsgeschichtes-methodological-neighbors-and-the-scientification-of-concepts/.
Schütze, H., and J. Pedersen. 1993. A vector model for syntagmatic and paradigmatic relatedness. Proceedings of the 9th Annual Conference of the UW Centre for the New OED and Text Research, 104–13.
Skinner, Q. 2002. Visions of politics. Cambridge: Cambridge University Press.
Sommerauer, P., and A. Fokkens. 2019. Conceptual change and distributional semantic models: An exploratory study on pitfalls and possibilities. Proceedings of the 1st International Workshop on Computational Approaches to Historical Language Change, presented at the Proceedings of the 1st International Workshop on Computational Approaches to Historical Language Change, Association for Computational Linguistics, Florence, Italy, 223–233. doi: 10.18653/v1/W19-4728.
Sommerauer, P. J. M., and A. S. Fokkens. 2018. Firearms and tigers are dangerous, kitchen knives and zebras are not: Testing whether word embeddings can tell. The 2018 EMNLP Workshop BlackboxNLP: Analyzing and Interpreting Neural Networks for NLP: Proceedings of the First Workshop, Association for Computational Linguistics (ACL), 276–286.
Szymanski, T. 2017. Temporal word analogies: Identifying lexical replacement with diachronic word embeddings. Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), 448–453.
Tulkens, S., C. Emmery, W. Daelemans, et al. 2016. Evaluating unsupervised Dutch word embeddings as a linguistic resource. In Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC 2016), ed. N. Calzolari, K. Choukri, T. Declerck, M. Grobelnik, B. Maegaard, J. Mariani, and A. Moreno. European Language Resources Association (ELRA).
Wang, B., A. Wang, F. Chen, Y. Wang, and C.-C. J. Kuo. 2019. Evaluating word embedding models: Methods and experimental results. In APSIPA transactions on signal and information processing, vol. 8. Cambridge: Cambridge University Press. doi: 10.1017/ATSIP.2019.12.
Wevers, M. 2019. Using word embeddings to examine gender bias in Dutch newspapers, 1950-1990. In Proceedings of the 1st International Workshop on Computational Approaches to Historical Language Change, presented at the ACL, ed. N. Tahmasebi, L. Borin, A. Jatowt and Y. Xu. Florence, Italy. doi: 10.18653/v1/W19-4712.
Wittgenstein, L. 2010. Philosophical investigations. Hoboken, New Jersey: John Wiley & Sons.
Yao, Z., Y. Sun, W. Ding, N. Rao, and H. Xiong. 2018. Dynamic Word Embeddings for Evolving Semantic Discovery. Proceedings of the Eleventh ACM International Conference on Web Search and Data Mining - WSDM ‘18, 673–681. doi: 10.1145/3159652.3159703.