ABSTRACT
Cryptocurrencies have recently attracted considerable attention, primarily due to their disruptive potential and reports of unprecedented returns. Furthermore, social media has garnered attention for its predictive capabilities in various fields, including financial markets and the economy. In this study, we exploit the predictive power of sentiment from Twitter and Reddit, alongside Google Trends indexes, to forecast log returns for 10 cryptocurrencies, namely Bitcoin, Ethereum, Tether, Binance Coin, Litecoin, Enjin Coin, Horizen, Namecoin, Peercoin and Feathercoin. We evaluate the performance of LASSO Vector Autoregression using daily data from January 2018 to January 2022. In a 30-day recursive forecast, we achieve a mean directional accuracy (MDA) rate of over 50%. Moreover, we observe a significant increase in forecast accuracy in terms of MDA when using sentiment and attention variables as predictors, but only for less capitalized cryptocurrencies. This improvement is not reflected in the RMSE. We also conduct a Granger causality test using post-double LASSO selection for high-dimensional VAR models. Our results suggest that social media sentiment does not Granger-cause cryptocurrency returns.
Disclosure statement
No potential conflict of interest was reported by the author(s).
Notes
2 See https://twitter.com/.
3 See https://www.reddit.com/.
5 It is essential to note that the primary objective of this work is not the forecasting of specific cryptocurrencies, especially those in the lower tier of market capitalization, the selection of which may not be of particular interest to the reader. Their significance lies in their representation of a group, alongside other cryptocurrencies with similar market capitalization.
6 Keywords used as Google Trends searches are in line with those used in Merediz-Solà and Bariviera (2019) and Aslanidis et al. (2022).
8 Where […] denotes the stacked vector containing all the observations of the variables in […]. Similarly […], […] and […]. […] and […] stand respectively for the chosen possible Granger-causing variables and the remaining variables.
9 For simplicity, we omitted some combinations when testing Granger causality from Google Trends. However, the p-values reported were computed considering search engine data as ‘other variables’.