Research Article

Linking asset prices to news without direct asset mentions

Pages 2907-2912 | Published online: 25 Aug 2022
 

ABSTRACT

Advances in Natural Language Processing (NLP), computing power and data availability are driving an explosion in research about the impact of news on asset prices. However, when relating news to individual assets, this research is based on mentions of specific assets or related terms in the news stories. Such an approach has two shortcomings. First, it requires a substantial time investment in a specific NLP technology. Second, and more importantly, it ignores news articles that do not directly mention a given asset or a pre-defined asset-related term, even if these articles are logically related to the asset in question. Our approach relies instead on a novel NLP technology called ‘semantic fingerprinting’, which projects any text onto a binary vector representing its meaning. The greater the overlap between the semantic fingerprint of a news article and a given asset description, the more relevant we expect the article to be, whether or not the given asset is mentioned in the news directly. We show that this approach successfully picks up the positive impact of news on prices of commonly traded commodities using a dataset of general news published by The Guardian. We include the needed data and instructions for implementing this approach.

Acknowledgement

We are grateful to Jasper Ginn, Søren Tjagvad Madsen, Francisco Webber, Fang Xu and Maxim Zagonov for many insightful conversations. Special thanks to Jasper Ginn for extensive assistance with the data. All errors are ours. We gratefully acknowledge research funding from Europlace Institute of Finance, from the Romanian Ministry of Education (CNCS - UEFISCDI, project number PN-II-ID-PCE-2012-4-0631) and from the Israeli Science Foundation (project number 1957/19).

Disclosure statement

No potential conflict of interest was reported by the author(s).

Notes

1 For more detailed illustrations of the semantic fingerprinting process, see also Pungulescu (2022).

2 The semantic space of the Retina engine has K = 128 × 128 = 16,384 positions. The term fingerprint corresponding to a commodity identifies a subset of 328 positions, whereas the text fingerprint corresponding to a news article highlights a subset of 984 related “contexts” from the semantic space.
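
As a rough illustration of the sizes quoted in note 2, the sketch below represents each fingerprint as a set of active positions in the 16,384-position space and counts how many positions two fingerprints share. This is a minimal sketch under stated assumptions, not the authors' implementation: the fingerprints are random placeholders rather than Retina engine output, the helper names `overlap` and `overlap_share` are hypothetical, and normalising by the size of the asset fingerprint is an assumed choice, since the exact relevance measure is not specified here.

```python
import random

# Size of the semantic space quoted in note 2: 128 x 128 = 16,384 positions.
K = 128 * 128


def overlap(fp_a, fp_b):
    """Count the positions that are active in both fingerprints.

    Each fingerprint is a set of 0-based active positions, which is
    equivalent to the dot product of the two binary vectors.
    """
    return len(fp_a & fp_b)


def overlap_share(asset_fp, article_fp):
    """Share of the asset fingerprint that is also active in the article.

    Hypothetical normalisation, chosen only for illustration.
    """
    if not asset_fp:
        return 0.0
    return overlap(asset_fp, article_fp) / len(asset_fp)


# Placeholder fingerprints with the sizes from note 2 (NOT real Retina output).
rng = random.Random(0)
asset_fp = set(rng.sample(range(K), 328))    # term fingerprint of a commodity
article_fp = set(rng.sample(range(K), 984))  # text fingerprint of a news article

shared = overlap(asset_fp, article_fp)
share = overlap_share(asset_fp, article_fp)
print(f"shared positions: {shared}, share of asset fingerprint: {share:.3f}")
```

With real fingerprints, a larger overlap would mark an article as more relevant to the asset, whether or not the asset is mentioned by name.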

3 Our news dataset is available at https://osf.io/edky5.

4 The Retina engine can be accessed at http://languages.cortical.io. Note that, consistent with a convention used by many programming languages, Cortical.io numbers the positions starting from 0 rather than 1.

5 Sample Python code for semantic fingerprinting is available from https://tinyurl.com/fpcodepython.

Additional information

Funding

The work was supported by the Israeli Science Foundation [1957/19]; Romanian Ministry of Education [PN-II-ID-PCE-2012-4-0631]; Europlace Institute of Finance.
