Abstract
Using a large-scale Deep Learning approach applied to a high-frequency database containing billions of market quotes and transactions for US equities, we uncover nonparametric evidence for the existence of a universal and stationary relation between order flow history and the direction of price moves. The universal price formation model exhibits a remarkably stable out-of-sample accuracy across a wide range of stocks and time periods. Interestingly, these results also hold for stocks which are not part of the training sample, showing that the relations captured by the model are universal and not asset-specific.
The universal model—trained on data from all stocks—outperforms asset-specific models trained on time series of any given stock. This weighs in favor of pooling together financial data from various stocks, rather than designing asset- or sector-specific models, as is currently commonly done. Standard data normalizations based on volatility, price level or average spread, or partitioning the training data into sectors or categories such as large/small tick stocks, do not improve training results. On the other hand, inclusion of price and order flow history over many past observations improves forecast accuracy, indicating that there is path-dependence in price dynamics.
Acknowledgements
The authors thank seminar participants at the London Quant Summit 2018, NYC Quant Summit 2018, the London Quantitative Finance Seminar, University of Colorado Boulder, JP Morgan, Freiburg Institute for Advanced Study, ETH Zurich, the SwissQuote Conference on Machine Learning in Finance 2018 and Princeton University for their comments. Computations for this paper were performed using a Blue Waters supercomputer grant ‘Distributed Learning with Neural Networks’.
Disclosure statement
No potential conflict of interest was reported by the authors.
Notes
1 Historical order book data was reconstructed from NASDAQ Level III data using the LOBSTER data engine (Huang and Polak 2011).