Machine Learning for Stock Selection

Pages 70-88 | Published online: 13 May 2019
 

Abstract

Machine learning is an increasingly important and controversial topic in quantitative finance. A lively debate persists as to whether machine learning techniques can be practical investment tools. Although machine learning algorithms can uncover subtle, contextual, and nonlinear relationships, overfitting poses a major challenge when one is trying to extract signals from noisy historical data. We describe some of the basic concepts of machine learning and provide a simple example of how investors can use machine learning techniques to forecast the cross-section of stock returns while limiting the risk of overfitting.

Disclosure: The authors report no conflicts of interest.

Editor’s Note

Submitted 19 July 2018

Accepted 30 January 2019 by Stephen J. Brown

Notes

1 “Bagging” is an abbreviation for “bootstrap aggregating,” or averaging forecasts from different training sets. “Boosting” is the process of reweighting observations to put more weight on misclassifications from prior forecasting rounds.
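As a rough illustration of the two ideas in note 1, the following sketch uses only the Python standard library. It is not the authors' implementation: the base learner (`fit_slope`), the function names, and the fixed reweighting `factor` are all assumptions made for the example. Boosting algorithms such as AdaBoost derive the reweighting from the error rate rather than from a constant.

```python
import random
from statistics import mean


def fit_slope(xs, ys):
    """Least-squares slope through the origin: a deliberately simple base learner."""
    denom = sum(x * x for x in xs)
    return sum(x * y for x, y in zip(xs, ys)) / denom if denom else 0.0


def bagged_forecast(xs, ys, x_new, n_models=50, seed=0):
    """Bagging: average the forecasts of learners fit on bootstrap resamples."""
    rng = random.Random(seed)
    n = len(xs)
    forecasts = []
    for _ in range(n_models):
        # Draw a training set of the same size, sampling with replacement.
        idx = [rng.randrange(n) for _ in range(n)]
        slope = fit_slope([xs[i] for i in idx], [ys[i] for i in idx])
        forecasts.append(slope * x_new)
    return mean(forecasts)


def boost_reweight(weights, misclassified, factor=2.0):
    """Boosting step: upweight misclassified observations, then renormalize.

    The constant `factor` is a simplifying assumption for illustration.
    """
    raw = [w * factor if m else w for w, m in zip(weights, misclassified)]
    total = sum(raw)
    return [w / total for w in raw]
```

In this sketch, `bagged_forecast` reduces the variance of a single noisy fit by averaging, while `boost_reweight` shows only the reweighting step that the next forecasting round would train on.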

2 An alternative approach is to use a robust objective function, such as the pairwise rank correlation between returns and forecasts in a regression setting.
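One possible reading of the rank-based objective in note 2 is a Spearman-style rank correlation between forecasts and realized returns, sketched below with the standard library only. The function names are hypothetical, and the note does not say which rank statistic is meant; Kendall's tau would be another candidate.

```python
def average_ranks(values):
    """Return 1-based ranks, assigning tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole group of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def rank_correlation(forecasts, returns):
    """Pearson correlation computed on ranks (Spearman's rho)."""
    rf, rr = average_ranks(forecasts), average_ranks(returns)
    n = len(rf)
    mf, mr = sum(rf) / n, sum(rr) / n
    cov = sum((a - mf) * (b - mr) for a, b in zip(rf, rr))
    var_f = sum((a - mf) ** 2 for a in rf)
    var_r = sum((b - mr) ** 2 for b in rr)
    if var_f == 0 or var_r == 0:
        return 0.0  # undefined when either series is constant; 0.0 is a convention here
    return cov / (var_f * var_r) ** 0.5
```

Because only the ordering matters, a single extreme return changes the statistic no more than any other observation of the same rank, which is the sense in which such an objective is robust to outliers.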

4 We refer interested readers to chapter 7 in López de Prado (2018) for an in-depth treatment of cross-validation for financial data.

5 Machine learning algorithms (MLAs) are well known for their ability to tease signals from big data, for instance by detecting sentiment in text or predicting future sales from social media posts. Although these applications are certainly promising, they are not the focus of this article. Our goal is to show how MLAs can be more effective than traditional quantitative techniques even when using widely known quant signals to forecast security returns.

6 Practitioners could also use a machine learning model to aggregate the information of the individual signals.

7 Results for individual regions are available on request.

8 Gu et al. (2018) found that price trend, volatility, and liquidity are by far the most important features. Our analysis suggests that these categories are important, but we also found that the percentage of shares sold short, the difference between put and call implied volatilities, and characteristics derived from financial statement information are among the 10 most important features.
