Research Article

Tomorrow's fish and chip paper? Slowly incorporated news and the cross-section of stock returns

Pages 774-795 | Received 26 Sep 2019, Accepted 30 Oct 2020, Published online: 16 Nov 2020
 

ABSTRACT

The link between news and investor decision making is widely discussed in the literature. Utilising unique U.S. firm-level news data between 1979 and 2016, we document a cross-sectional difference in the speed of the diffusion of information contained in news. We distinguish news articles as being either slowly or quickly incorporated into contemporaneous stock prices. The return spread between stocks classified according to these two types of news yields a statistically significant profit of 139 basis points per month. This abnormal return cannot be explained by other well-known risk factors and is robust when allowing for trading costs. Overall, our research refines the role of news regarding information dissemination in the financial markets.


Acknowledgments

We are grateful to Chris Adcock (the Editor), an Associate Editor and two anonymous referees for constructive comments and suggestions. We thank Tian Han for advice about textual data mining and the Google Cloud Platform (GCP) for providing research credits for our sentiment analysis. We also thank C.S. Agnes Cheng, Tony Moore, Simone Varotto, Steven Young, and participants at the 2019 BAFA doctoral conference and the FMA Europe Doctoral Student Consortium for useful comments.

Disclosure statement

No potential conflict of interest was reported by the authors.

Notes

Note: Panel A of Table  presents the summary statistics. CVG is the number of news articles. RET is the DGTW-adjusted return computed following Daniel et al. (1997). NSS is the aggregated monthly news sentiment score. SIZE is the natural logarithm of the stock's market capitalisation at the end of each June. BTM is the natural logarithm of the stock's market value divided by the firm's book value, adjusted at the end of each June. MOM is the stock's most recent 12-month cumulative return. ILLIQ is a proxy for stock liquidity based on Amihud (2002). BETA is calculated following Scholes and Williams (1977) and Dimson (1979). IVOL is the idiosyncratic risk, computed as the standard deviation of residuals from the Fama–French–Carhart four-factor model over the month using daily returns. Panel B reports the summary statistics across all four types of news portfolios. A monthly sequential double sort is used: in each month, firms are sorted into three groups based on their past abnormal returns, and within each group stocks are further ranked into three groups based on their news sentiment scores. SIGoodNews denotes stocks with low current returns but positive news; SIBadNews denotes stocks with high current returns but negative news; QIGoodNews denotes stocks with high current returns and positive news; QIBadNews denotes stocks with low current returns and negative news. Avg#ofStocks is the average number of stocks in each portfolio in each month.
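The sequential double sort described in the note above can be sketched as follows. This is a minimal illustration, not the authors' code: the field names `id`, `ret` and `nss`, the helper names, and the equal-count tercile breakpoints are all assumptions introduced here.

```python
def terciles(items, key):
    """Split items into three near-equal groups (low, middle, high) by key.

    Equal-count breakpoints are an assumption; the note does not state
    how the three groups are delimited.
    """
    ranked = sorted(items, key=key)
    n = len(ranked)
    cut1, cut2 = n // 3, (2 * n) // 3
    return ranked[:cut1], ranked[cut1:cut2], ranked[cut2:]


def classify_month(stocks):
    """Sequential double sort for a single month.

    stocks: list of dicts with hypothetical keys
        'id'  - stock identifier
        'ret' - abnormal return used for the first sort
        'nss' - monthly news sentiment score used for the second sort
    Returns a dict mapping each portfolio label to a list of stock ids.
    """
    # First sort: three groups by return; only the extremes are used.
    low_ret, _, high_ret = terciles(stocks, key=lambda s: s["ret"])

    # Second sort: within each return group, three groups by sentiment.
    low_bad, _, low_good = terciles(low_ret, key=lambda s: s["nss"])
    high_bad, _, high_good = terciles(high_ret, key=lambda s: s["nss"])

    return {
        "SIGoodNews": [s["id"] for s in low_good],   # low return, good news
        "SIBadNews": [s["id"] for s in high_bad],    # high return, bad news
        "QIGoodNews": [s["id"] for s in high_good],  # high return, good news
        "QIBadNews": [s["id"] for s in low_bad],     # low return, bad news
    }
```

Because the second sort is run inside each return group, the sentiment breakpoints differ between the low-return and high-return groups, which is what distinguishes a sequential sort from an independent one.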

Note: The stock return predictability of slowly incorporated news and quickly incorporated news based on various regression specifications. SINws and QINws are dummy variables indicating whether news is slowly incorporated or quickly incorporated, respectively. NSSSINws and NSSQINws are the interaction terms, where NSS is a news sentiment score. CONTROLS is a battery of control variables including LRET, SIZE, BTM, BETA, IVOL, MOM and ILLIQ. LRET is the lagged 1-month stock return. SIZE is the natural logarithm of the stock's market value in each previous month. BTM is the natural logarithm of the stock's market value divided by the firm's book value, adjusted at the end of each June. BETA is calculated following Scholes and Williams (1977) and Dimson (1979). MOM is the stock's most recent 12-month cumulative return. IVOL is the idiosyncratic risk, computed as the standard deviation of residuals from the Fama–French–Carhart four-factor model over the month using daily returns. ILLIQ is a proxy for stock liquidity based on Amihud (2002). The sample period ranges from 1979 to 2016. t-statistics are reported in parentheses and **, *** refer to the 5% and 1% significance levels respectively.

Note: The performance of slowly incorporated news and quickly incorporated news portfolios in different information environments. We independently sort all stocks into two portfolios based on their most recent market capitalisation (Size), current monthly change of media coverage, turnover ratio and analyst coverage. We report the ‘Small–Large’ SIZE, ‘Low–High’ Δ MEDIA, ‘Low–High’ TURN, ‘Low–High’ AstCvg spread profitability in the post-formation period. t-statistics are reported in parentheses and **, *** refer to the 5% and 1% significance levels respectively.

Note: The performance of slowly incorporated news and quickly incorporated news for different news characteristics. In Panel A, we independently sort all stocks into two portfolios based on textual complexity. In Panel B, we again sort all stocks into two portfolios based on news informativeness. We report the ‘Ambiguous-Accurate’ Tone, ‘Complex-Concise’ Readability, ‘Short-Long’ Length and ‘EarningsEx-EarningsIn’ Topic spread profitability in the post-formation period. t-statistics are reported in parentheses and **, *** refer to the 5% and 1% significance levels respectively.

1 In the UK, fish and chips (a takeaway treat) was traditionally wrapped in newspaper in order to absorb grease. This demonstrated that a newspaper was only valuable for the news it carried on the day of publication.

2 For example, Chan (2003) measures long-run stock return performance and concludes that investors underreact to public information and overreact to private information; Tetlock (2007) documents an investor overreaction pattern using a vector autoregression (VAR) model. Other related studies include Tetlock, Saar-Tsechansky, and Macskassy (2008); Garcia (2013); Ahmad et al. (2016); Jiang, Li, and Wang (2017); Kräussl and Mirgorodskaya (2017).

3 News tone refers to the sentiment score of news content assessed using computational tools. A news article with positive news tone (or a high news sentiment score) tends to contain good news. The terms ‘news tone’ and ‘news sentiment score’ are used interchangeably hereafter.

4 Under this method, each word in a news item is labelled as belonging to a pre-specified category, examples of which include ‘positive’, ‘negative’, ‘modal-weak’ and ‘litigious’. This approach provides a means of quantifying business news tone.
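A minimal sketch of this kind of dictionary-based labelling is given below. The word lists are tiny illustrative stand-ins rather than the actual dictionary, and the net-tone formula (positive count minus negative count, scaled by total words) is an assumption made for illustration; it is not stated in the text.

```python
import re

# Illustrative word lists only; a real dictionary is far larger and has
# more categories (e.g. 'modal-weak', 'litigious').
CATEGORIES = {
    "positive": {"gain", "improve", "strong"},
    "negative": {"loss", "losses", "impairment", "adverse", "litigation"},
}


def label_words(text):
    """Count how many words of the text fall into each category.

    Returns (counts_by_category, total_word_count).
    """
    tokens = re.findall(r"[a-z]+", text.lower())
    counts = {name: 0 for name in CATEGORIES}
    for token in tokens:
        for name, words in CATEGORIES.items():
            if token in words:
                counts[name] += 1
    return counts, len(tokens)


def tone(text):
    """Net tone: (positive - negative) / total words (assumed formula)."""
    counts, total = label_words(text)
    if total == 0:
        return 0.0
    return (counts["positive"] - counts["negative"]) / total
```

For example, `tone("Impairment loss widens")` is negative because two of the three words appear in the illustrative negative list and none in the positive list.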

5 A 10-K is a comprehensive corporate filing by a U.S. publicly-listed company about its annual financial performance and is required by the Securities and Exchange Commission (SEC).

6 The Dow Jones Newswire is a global real-time news product and is used in many research papers, e.g. Tetlock (2010), Tetlock (2011), Engelberg, Reed, and Ringgenberg (2012), and Engelberg, McLean, and Pontiff (2018).

7 Although Henry (2008) is the first work that contributes to sentiment analysis, we do not employ the word list developed therein, for two reasons. First, Henry's list contains only a very limited number of sentiment words (e.g. 85 negative words), whereas that of Loughran and McDonald (2011) includes 2329 such words. Second, and more importantly, the most frequent LM negative words in 10-K annual reports, such as loss, losses, claims, impairment, against, adverse, restated, adversely, restructuring and litigation, do not appear in Henry's list. This suggests that Henry's dictionary may not sufficiently capture all potential negative tone in the text.

8 The Google Natural Language API is a newly built natural language processing tool by Google Cloud, details of which can be found at https://cloud.google.com/natural-language/.

9 In an unreported test, we confirm its even distribution by performing the Jarque-Bera test suggested by the Editor.

10 The Fama–French risk factors can be downloaded from Kenneth French's Data Library at https://mba.tuck.dartmouth.edu/pages/faculty/ken.french/data_library.html. The liquidity factor we utilise is that of Pástor and Stambaugh (2003), which can be accessed from https://faculty.chicagobooth.edu/lubos-pastor/data.

11 Part-of-speech tagging is a computational linguistic method to label the category of words as noun, adverb, adjective, etc.

12 Bloomberg AIA data is missing for the periods 12/6/2010 to 1/7/2011 and 8/17/2011 to 11/2/2011.
