Research Papers

Are missing values important for earnings forecasts? A machine learning perspective

Pages 1113-1132 | Received 11 Aug 2020, Accepted 14 Jul 2021, Published online: 05 Jan 2022
 

Abstract

Analysts' forecasts are among the most common and important estimators of firms' future earnings. However, they are difficult to use fully because of missing values. This study applies machine learning techniques to estimate missing values in individual analysts' forecasts and then to predict firms' future earnings from both the estimated and the observed forecasts. After estimating missing values, forecast error is reduced by 41% compared to the mean forecast, suggesting that the estimated missing values are indeed useful for earnings forecasts. We analyze multiple estimation methods and show that the outperformance of matrix factorization (MF) is consistent across evaluation measures and across firms. Finally, we propose a coupled matrix factorization (CMF) based on stochastic gradient descent to augment the estimation quality of missing values with multiple datasets. CMF further reduces the error of earnings forecasts by 19% compared to MF with a single dataset.
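The general idea of estimating missing forecasts with matrix factorization trained by stochastic gradient descent can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `factorize` and `impute`, the hyperparameters (`rank`, `lr`, `reg`, `epochs`), and the layout of the matrix (rows as analysts, columns as quarters, `NaN` for a missing forecast) are all assumptions made for illustration.

```python
import numpy as np

def factorize(X, rank=2, lr=0.01, reg=0.02, epochs=200, seed=0):
    """Fit X ~ P @ Q.T using SGD over the observed (non-NaN) entries only."""
    rng = np.random.default_rng(seed)
    n_rows, n_cols = X.shape
    # Small random initial factors (assumed initialization scheme).
    P = 0.1 * rng.standard_normal((n_rows, rank))
    Q = 0.1 * rng.standard_normal((n_cols, rank))
    observed = np.argwhere(~np.isnan(X))  # (row, col) pairs with a forecast
    for _ in range(epochs):
        rng.shuffle(observed)  # visit observed entries in random order
        for i, j in observed:
            err = X[i, j] - P[i] @ Q[j]
            p_old = P[i].copy()
            # Gradient step on squared error with an L2 penalty on the factors.
            P[i] += lr * (err * Q[j] - reg * P[i])
            Q[j] += lr * (err * p_old - reg * Q[j])
    return P, Q

def impute(X, **kwargs):
    """Return a copy of X whose missing entries are replaced by P @ Q.T."""
    P, Q = factorize(X, **kwargs)
    filled = X.copy()
    mask = np.isnan(X)
    filled[mask] = (P @ Q.T)[mask]
    return filled
```

Observed forecasts are left untouched and only the missing cells are filled, so a downstream earnings predictor could use both the estimated and the observed values, as the abstract describes.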

JEL Classification:

Open Scholarship

This article has earned the Center for Open Science badge for Open Materials. The materials are openly accessible at https://github.com/Ajim63/MF_CMF_for_Earning_Forecast.git

Disclosure statement

No potential conflict of interest was reported by the author(s).

Notes

† Following industry practice, the consensus forecast is defined as the average of all analysts’ forecasts for a firm in a given quarter; we refer to it as the mean forecast.
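As a small illustration of this definition, the sketch below computes the mean forecast per quarter while skipping missing forecasts. The function name `consensus_forecast` and the matrix layout (rows as analysts, columns as quarters, `NaN` for a missing forecast) are assumptions, not taken from the paper.

```python
import numpy as np

def consensus_forecast(X):
    """Mean of the available analyst forecasts in each quarter (column).

    Missing forecasts (NaN) are ignored. A quarter with no forecasts at all
    yields NaN (NumPy also emits a RuntimeWarning in that case).
    """
    return np.nanmean(X, axis=0)
```

For example, with two analysts where the first has no forecast in the second quarter, `consensus_forecast(np.array([[1.0, np.nan], [3.0, 2.0]]))` averages 1.0 and 3.0 for the first quarter and uses the single value 2.0 for the second.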

† Following standard practice, throughout the paper we use upper-case letters for matrices (e.g. X) and ranges (e.g. T), and lower-case letters for indices (e.g. t).

† The ℓ1 norm, the sparsity penalty, uses the absolute value of the magnitude to penalize the loss function.

‡ The ℓ2 norm uses the square of the magnitude to penalize a large deviation from the ground truth.
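For reference, the two penalties in these notes correspond to the standard definitions below, written for a generic vector $v$ with entries $v_k$. The symbol $v$ is an assumption for illustration; the notes do not say which quantities the paper applies each norm to.

\[
\lVert v \rVert_1 = \sum_{k} \lvert v_k \rvert,
\qquad
\lVert v \rVert_2^2 = \sum_{k} v_k^2 .
\]

Because the second expression squares each entry, a single large entry contributes far more to it than to the first, which is why the squared form penalizes large deviations heavily while the absolute-value form tends to encourage sparsity.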

† In recommender systems, a cold start occurs when the model cannot make any inferences for users (analysts) or items (quarters) because of insufficient information. In other words, an analyst or quarter lacks a reference point that would associate it with other analysts or quarters (Bobadilla et al. 2012, Lika et al. 2014).

Additional information

Funding

This work was supported by the US National Institutes of Health [UL1TR003017].
