Abstract
Analysts' forecasts are among the most common and important estimators of firms' future earnings. However, they are difficult to fully utilize because of missing values. This study applies machine learning techniques to estimate missing values in individual analysts' forecasts and then predicts firms' future earnings from both estimated and observed forecasts. After the missing values are estimated, forecast error is reduced by 41% relative to the mean forecast, suggesting that the estimated values are indeed useful for earnings forecasts. We analyze multiple estimation methods and show that the outperformance of matrix factorization (MF) is consistent across evaluation measures and across firms. Finally, we propose a stochastic gradient descent based coupled matrix factorization (CMF) that uses multiple datasets to improve the estimation quality of missing values. CMF further reduces the error of earnings forecasts by 19% compared to MF with a single dataset.
Open Scholarship
This article has earned the Center for Open Science badge for Open Materials. The materials are openly accessible at https://github.com/Ajim63/MF_CMF_for_Earning_Forecast.git
Disclosure statement
No potential conflict of interest was reported by the author(s).
Notes
† Following industry practice, the consensus forecast is defined as the average of all analysts’ forecasts for a firm in a given quarter; we refer to it as the mean forecast.
† Following standard practice, throughout the paper we use upper-case letters for matrices, e.g. X, and ranges, e.g. T, and lower-case letters for indices, e.g. t.
† The L1 norm, the sparsity penalty, uses the absolute value of the magnitude to penalize the loss function.
‡ The L2 norm uses the square of the magnitude to penalize large deviations from the ground truth.
† In recommender systems, a cold start occurs when the model cannot make any inferences for users (analysts) or items (quarters) because of insufficient information. In other words, an analyst or quarter lacks a reference point that would associate it with other analysts or quarters (Bobadilla et al. 2012, Lika et al. 2014).
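As an illustration of the idea summarized in the abstract and of the squared-magnitude penalty described in the notes, the following is a minimal sketch of matrix factorization trained by stochastic gradient descent on observed entries only. It is not the authors' implementation (their materials are in the repository linked above). The function names `mf_sgd` and `impute`, the hyperparameters `rank`, `lr`, `reg`, and `epochs`, and the toy analyst-by-quarter matrix are all assumptions made for this example.

```python
import numpy as np

def mf_sgd(X, rank=2, lr=0.01, reg=0.01, epochs=500, seed=0):
    """Approximate X by A @ Q.T using only the observed (non-NaN) entries.

    Hypothetical helper for illustration: plain SGD with an L2 penalty
    (strength `reg`) on both factor matrices.
    """
    rng = np.random.default_rng(seed)
    n_rows, n_cols = X.shape
    A = 0.1 * rng.standard_normal((n_rows, rank))  # row (analyst) factors
    Q = 0.1 * rng.standard_normal((n_cols, rank))  # column (quarter) factors
    observed = np.argwhere(~np.isnan(X))           # (i, j) pairs with data
    for _ in range(epochs):
        rng.shuffle(observed)                      # visit entries in random order
        for i, j in observed:
            err = X[i, j] - A[i] @ Q[j]
            a_i = A[i].copy()                      # keep old value for Q's update
            A[i] += lr * (err * Q[j] - reg * A[i])
            Q[j] += lr * (err * a_i - reg * Q[j])
    return A, Q

def impute(X, A, Q):
    """Keep observed entries; fill missing ones with the low-rank estimate."""
    return np.where(np.isnan(X), A @ Q.T, X)

# Toy data (made up): rows are analysts, columns are quarters, NaN is missing.
X = np.array([
    [1.0, 1.2, np.nan, 1.5],
    [0.9, np.nan, 1.3, 1.4],
    [np.nan, 1.1, 1.2, 1.6],
])
A, Q = mf_sgd(X)
X_filled = impute(X, A, Q)
```

Under these assumptions, a per-quarter forecast could then be formed from `X_filled` (for example, by averaging each column) instead of from the observed entries alone. Replacing the L2 term with an L1 penalty would change the `reg * A[i]` and `reg * Q[j]` terms to `reg * np.sign(...)`, which encourages sparser factors.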