Special Issue Papers

Learning multi-market microstructure from order book data

Pages 1517-1529 | Received 28 Jun 2018, Accepted 06 May 2019, Published online: 10 Jul 2019
 

Abstract

In this paper, we investigate high-frequency market behavior using neural networks trained on order book data. We run extensive experiments on 110 asset pairs covering 97% of the spot-futures pairs on the Korea Exchange. We propose an efficient training scheme that improves performance and training stability. Using this scheme, we measure the lead–lag relationship between the spot and futures markets by comparing the performance gain that each market's data provides for predicting the other. In addition, we analyze the gradients of the trained model to understand which market features the neural networks learn through training, revealing characteristics of the market microstructure. Our results show that highly complex neural network models can successfully learn market features such as order imbalance, spread-volatility correlation, and mean reversion.
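The lead–lag comparison described in the abstract can be illustrated with a minimal sketch. It assumes that each model is summarized by a single predictive score (for example, accuracy) and that the gain is the difference between a model that also sees the other market's data and one that sees only its own. The function names, the tolerance, and the scores below are hypothetical and are not taken from the paper.

```python
def lead_lag_gain(own_only_score: float, with_other_score: float) -> float:
    """Gain in predictive score from adding the other market's order book data."""
    return with_other_score - own_only_score


def lead_lag_direction(gain_futures_to_spot: float,
                       gain_spot_to_futures: float,
                       tol: float = 1e-9) -> str:
    """Compare the two gains (assumed decision rule, not the paper's).

    gain_futures_to_spot: gain in predicting spot when futures data is added.
    gain_spot_to_futures: gain in predicting futures when spot data is added.
    """
    diff = gain_futures_to_spot - gain_spot_to_futures
    if diff > tol:
        return "futures lead spot"
    if diff < -tol:
        return "spot leads futures"
    return "no clear lead"


# Made-up scores for illustration only; they are not results from the paper.
spot_gain = lead_lag_gain(own_only_score=0.60, with_other_score=0.66)
futures_gain = lead_lag_gain(own_only_score=0.62, with_other_score=0.63)
print(lead_lag_direction(spot_gain, futures_gain))
```

Under this assumed rule, the market whose data yields the larger gain for predicting the other is read as the leading market.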

Acknowledgments

The authors thank their project counterpart for providing valuable datasets. Constructive comments from Prof. Jinwoo Shin are greatly appreciated. Author names are in alphabetical order.

Disclosure statement

No potential conflict of interest was reported by the authors.

Notes

1 Because the amount of data varies by asset, assets with less market activity require more training epochs. We find that 160 epochs are enough for training to converge on all assets.

2 From the cross-validation results, we found that using training data dated after the test set gives no performance gain in predicting the micro-movements. This is mainly due to the highly localized nature of short-term price dynamics.

3 We tried longer time delays of up to 60 s and found that seven labels were enough to improve training stability. Since labels with longer time delays are less correlated with short-term price movements, using more labels with longer time delays results in underfitting.
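The data-preparation choices in Notes 2 and 3 can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it assumes each label is the sign of the mid-price change over a given delay, and that training samples are simply those dated before the test set. The function names, the sample layout, and the delay values in the usage lines are hypothetical; the notes state only that seven labels with delays of up to 60 s were used.

```python
from bisect import bisect_right
from datetime import date


def chronological_split(samples, test_start):
    """Split (date, payload) samples so training data precedes the test set."""
    train = [s for s in samples if s[0] < test_start]
    test = [s for s in samples if s[0] >= test_start]
    return train, test


def multi_delay_labels(times, mid_prices, delays):
    """Build one direction label per delay for every tick.

    times: ascending timestamps in seconds; mid_prices: mid-price at each tick.
    Label is +1, -1, or 0 for the sign of the mid-price change between tick i
    and the last tick at or before times[i] + delay (assumed label definition).
    None marks a horizon that runs past the end of the data.
    """
    labels = []
    for i, t in enumerate(times):
        row = []
        for d in delays:
            target = t + d
            if target > times[-1]:
                row.append(None)
                continue
            j = bisect_right(times, target) - 1  # last tick at or before target
            diff = mid_prices[j] - mid_prices[i]
            row.append((diff > 0) - (diff < 0))
        labels.append(row)
    return labels


# Hypothetical toy data; delays here are illustrative, not the paper's values.
times = [0, 1, 2, 3, 4, 5]
mids = [100, 101, 100, 102, 102, 99]
labels = multi_delay_labels(times, mids, delays=[1, 3])

days = [(date(2018, 1, d), f"day-{d}") for d in (2, 3, 4, 5)]
train, test = chronological_split(days, test_start=date(2018, 1, 4))
```

Training against several such labels at once is one way to realize the multi-label setup that Note 3 refers to; restricting the delay list to short horizons matches the note's observation that longer delays add little.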

Additional information

Funding

This work was supported by the National Research Foundation of Korea (NRF-2019R1A2C1003144).
