Predicting stock return correlations with brief company descriptions

Pages 88-102 | Published online: 24 Jul 2018
 

ABSTRACT

A series of influential papers by Hoberg and Phillips measure the similarity of pairs of companies based on a textual analysis of their business descriptions and show these measures to be useful in a variety of research contexts in finance. Hoberg and Phillips derive the similarity measures from a comparison of word lists extracted from extensive business descriptions contained in US companies’ electronic 10-K filings. Unfortunately, this method is of little use in non-US settings, where lengthy English-language company self-descriptions are not available on a consistent basis. Instead, we use semantic fingerprinting to extract such similarity measures from much shorter but globally available third-party company descriptions. We show that our approach significantly predicts stock return correlations even after controlling for past correlations and for membership in the same industry. Remarkably, company similarity measures based on brief third-party company descriptions predict stock return correlations significantly better than those based on much longer company self-descriptions.
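The regression design summarized above can be illustrated with a minimal sketch. It is not the authors' specification: the variable names, the synthetic firm-pair values, and the small least-squares solver are all assumptions made for illustration. The sketch regresses each firm pair's subsequent return correlation on a text-based similarity score, the pair's past return correlation, and a same-industry dummy.

```python
def ols(rows, y):
    """Ordinary least squares via the normal equations (X'X) b = X'y.

    rows: one list of regressors per observation (an intercept is added here).
    Returns the coefficients as [intercept, b1, b2, ...].
    """
    X = [[1.0] + list(r) for r in rows]
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(x[i] * x[j] for x in X) for j in range(k)]
         + [sum(x[i] * yi for x, yi in zip(X, y))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                f = A[r][col] / A[col][col]
                A[r] = [a - f * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


# Hypothetical firm pairs: (text similarity, past return correlation, same-industry dummy).
pairs = [
    (0.10, 0.20, 0), (0.35, 0.25, 0), (0.60, 0.50, 1), (0.20, 0.40, 0),
    (0.80, 0.45, 1), (0.50, 0.30, 1), (0.15, 0.10, 0), (0.70, 0.60, 0),
]
# Synthetic outcome built from known coefficients, so the fit can be checked.
future_corr = [0.05 + 0.30 * s + 0.50 * p + 0.10 * d for s, p, d in pairs]

intercept, b_similarity, b_past, b_same_industry = ols(pairs, future_corr)
print(round(b_similarity, 4), round(b_past, 4), round(b_same_industry, 4))
```

In this setup, a similarity coefficient that stays positive with the two controls included corresponds to the abstract's claim that text-based similarity carries information beyond past correlations and industry membership. A real test would also need standard errors, which this sketch omits.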

Acknowledgement

We are grateful to Sandrine Foldvari, David Le Bris, Xiaojuan Liu and participants at the ICMA Econometrics and Financial Data Science workshop for comments, and to Thomson Reuters for providing us with a history of company descriptions.

Disclosure statement

No potential conflict of interest was reported by the authors.

Notes

1 A zipped file with all the company descriptions we use can be downloaded from https://goo.gl/Bh8bgY.

2 http://www.cortical.io/keyword-extraction.html shows how text is converted into semantic fingerprints, and http://www.cortical.io/similarity-explorer.html shows how textual similarity is compared. See IKSS and Webber (2015) for more details on semantic fingerprinting.
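As a rough illustration of how two fingerprints might be compared, the sketch below treats each fingerprint as a sparse binary vector stored as the set of its active positions and computes their cosine similarity. The representation, the function name, and the example positions are assumptions for illustration; they are not taken from the provider's API.

```python
import math


def cosine_similarity(fp_a, fp_b):
    """Cosine similarity of two binary vectors given as sets of active positions.

    For binary vectors the dot product equals the overlap size and each norm
    equals the square root of the number of active positions.
    """
    if not fp_a or not fp_b:
        return 0.0
    overlap = len(fp_a & fp_b)
    return overlap / math.sqrt(len(fp_a) * len(fp_b))


# Hypothetical fingerprints (active bit positions) for three company descriptions.
fp_miner = {3, 17, 42, 101, 250, 777}
fp_smelter = {3, 17, 42, 99, 250, 801}
fp_bank = {5, 64, 128, 512, 900, 1024}

sim_related = cosine_similarity(fp_miner, fp_smelter)   # 4 shared positions out of 6 each
sim_unrelated = cosine_similarity(fp_miner, fp_bank)    # no shared positions
print(sim_related, sim_unrelated)
```

Storing only the active positions keeps the comparison cheap when fingerprints are sparse, since the cost depends on the number of active bits and not on the full vector length.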

4 For Alcoa, the missing value in Column 10 arises because the only company in the same sector is Du Pont, for which HP similarities are missing; the missing values in Columns 12 and 13 arise because 2013 TR company descriptions are unavailable for Alcoa. Similarly, for Du Pont, the missing value in Column 10 arises because HP similarities are missing for this company, and the missing values in Columns 12 and 13 arise because Du Pont’s only GICS sector peer is Alcoa, for which, as noted above, 2013 TR company descriptions are unavailable.

5 We have also implemented the regression specifications separately for same-sector firm pairs and for different-sector firm pairs. The same-sector sample consists of only 40 firm pairs, making our tests less powerful. Nonetheless, our similarity measures based on both long and short TR descriptions are significant at the 10% level. For the different-sector sample, the corresponding significance easily clears the 1% threshold.

6 We do not have access to other company descriptions on a historical basis. However, to demonstrate that different providers’ descriptions are broadly similar, we showed that the correlation between cosine similarities derived from then-current (as of 3/2017) short TR descriptions and from Yahoo descriptions is high, at 0.676. To extend this point to further sources of company descriptions, we have collected current (as of 4/2018) company descriptions from another key provider of market data, Factset, and also updated our short TR descriptions to that date. We find that the TR/Factset correlation is 0.645. These high correlations suggest that a variety of third-party company descriptions are a potentially valuable alternative to extracting self-descriptions from 10-K filings.
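The cross-provider comparison in this note amounts to correlating two vectors of pairwise similarities, one per description source, over the same firm pairs. A minimal sketch follows; the similarity values are made up for illustration and the function name is hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical cosine similarities for the same five firm pairs,
# computed from two different providers' descriptions.
sims_provider_a = [0.10, 0.40, 0.35, 0.80, 0.55]
sims_provider_b = [0.15, 0.30, 0.45, 0.70, 0.60]

cross_provider_corr = pearson(sims_provider_a, sims_provider_b)
print(round(cross_provider_corr, 3))
```

The firm pairs must be aligned in the same order in both vectors, otherwise the correlation is meaningless.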
