Abstract
We propose a CNN-LSTM deep learning model trained to classify spread sequences of cointegrated stocks as profitable or unprofitable in a large-scale market backtest ranging from January 1991 to December 2017. We show that the proposed model achieves high levels of accuracy and successfully derives features from the market data. We formalize and implement a trading strategy based on the model output that generates significant risk-adjusted excess returns which are orthogonal to market risks. The out-of-sample Sharpe ratio and alpha coefficient significantly outperform those of the reference model, which is based on a standard-deviation rule, even after accounting for transaction costs.
Acknowledgements
We thank the editor and two anonymous referees for carefully reading the manuscript and for several constructive and detailed comments that helped to improve our paper.
Disclosure statement
No potential conflict of interest was reported by the author(s).
Notes
1 Harlacher (Citation2016) finds that differences between the ADF test and other unit root tests such as the Phillips–Perron or the Phillips–Ouliaris test are not significant.
2 CNN-LSTM architectures have been found to achieve state-of-the-art performance in time series forecasting tasks related, e.g., to heart rate signals (Swapna et al. Citation2018), rainfall intensity (Shi et al. Citation2017), particulate matter (Huang and Kuo Citation2018), waterworks operations (Cao et al. Citation2018), or the gold price (Livieris et al. Citation2020).
3 Hyperparameters are inspired by the choices in Livieris et al. (Citation2020), except for the number of filters, where we found better optimization results with fewer filters than the 32 and 64 filters used in Livieris et al. (Citation2020).
4 We tested the following hyperparameters on 10 randomly selected pairs: number of hidden LSTM layers ∈ {1, 2} and LSTM cells per layer ∈ {2, 5, 10, 15, 20}. We found that a single layer with 10 cells returned the most accurate results.
5 The total number of trainable parameters of the model is 1,891.
6 For example, each element $x_i$ of the input vector that is passed to the outermost layer is standardized according to
$$\tilde{x}_i = \frac{x_i - \mu}{\sigma},$$
where $\mu$ and $\sigma$ denote the mean and standard deviation of the input vector.
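The standardization described in this note can be sketched as follows; this is a minimal illustrative implementation, not the authors' code, and the function name is our own.

```python
import numpy as np

def standardize(x):
    """Z-score standardization of an input vector:
    subtract the vector's mean, divide by its standard deviation."""
    x = np.asarray(x, dtype=float)
    return (x - x.mean()) / x.std()

z = standardize([1.0, 2.0, 3.0, 4.0])
# The standardized vector has mean 0 and standard deviation 1.
```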
7 According to the classification in Krauss (Citation2017), this model represents a stochastic control approach.
9 We did not observe any problems related to vanishing or exploding gradients during the training of the models.
10 Precision is defined as $\mathrm{Precision} = \mathrm{TP}/(\mathrm{TP} + \mathrm{FP})$ and the F1 score is defined as
$$F_1 = \frac{2\,\mathrm{TP}}{2\,\mathrm{TP} + \mathrm{FP} + \mathrm{FN}},$$
where TP, FP, and FN refer to true positives, false positives, and false negatives, respectively.
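The two classification metrics defined in this note can be computed directly from the confusion-matrix counts; the sketch below uses the standard definitions (function names are ours).

```python
def precision(tp, fp):
    """Precision = TP / (TP + FP): share of positive predictions that are correct."""
    return tp / (tp + fp)

def f1_score(tp, fp, fn):
    """F1 = 2*TP / (2*TP + FP + FN): harmonic mean of precision and recall."""
    return 2 * tp / (2 * tp + fp + fn)

p = precision(8, 2)        # 8 / 10 = 0.8
f1 = f1_score(8, 2, 2)     # 16 / 20 = 0.8
```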
11 We compared different values of the extrapolation parameter k and obtained the most promising results for k = 5. We found that the superior performance of the k = 5 variant can be attributed to its better out-of-sample classification accuracy compared to the alternatives k = 10 and k = 20 days. Final average out-of-sample accuracies are 68.5% for k = 5, 66.5% for k = 10, and 67.1% for k = 20.
12 Alpha and beta coefficients relate to the one-factor model regression. We discuss further dependencies on risk factors in Section 4.3.3.
13 We use the statsmodels library (Seabold and Perktold Citation2010) in Python with default parameters for the linear regression.
14 The authors thank Kenneth French for allowing all data to be sourced from his website: https://mba.tuck.dartmouth.edu/pages/faculty/ken.french/data_library.html
15 Note that Chen and Bassett (Citation2014) show that due to the self-financing nature of these factor portfolios and the market capitalization structure, this interpretation is not necessarily true.
16 It is important to note that deep learning techniques such as LSTM or CNN models were only introduced in the late 1990s. As such, the high risk-adjusted returns in the 1990s need to be seen against the backdrop that neither the theory nor the necessary technology for this strategy was available to the majority of market participants.
17 Note that for the backtest with m>5 we need to re-optimize the extended CNN-LSTM models each trading period based on the enlarged data set, i.e. on m = 20. The results for m>5 are therefore based on newly trained models.
18 We refer to Petersen (Citation2020) for a detailed mathematical study on neural networks.
19 We will refer to the LSTM model as established by Gers et al. (Citation2000), who modified the original LSTM of Hochreiter and Schmidhuber (Citation1997) and proposed a total of three gates named according to their functions: input, output and forget gate.
20 Subscripts express the to-from relationships, i.e. $W_{fh}$ denotes the recurrent weight connection from the previous time step's hidden state $h_{t-1}$ to the current time step's forget gate $f_t$.
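The forget-gate computation in the Gers et al. (2000) LSTM referenced in these notes can be sketched as follows; the dimensions and weight initializations are illustrative assumptions, not the paper's configuration.

```python
import numpy as np

def sigmoid(z):
    """Logistic activation, maps any real value into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

# Illustrative dimensions: 3 input features, 10 LSTM cells.
n_in, n_hidden = 3, 10
rng = np.random.default_rng(0)
W_fx = rng.normal(size=(n_hidden, n_in))       # input-to-forget-gate weights
W_fh = rng.normal(size=(n_hidden, n_hidden))   # recurrent hidden-to-forget-gate weights
b_f = np.zeros(n_hidden)                       # forget-gate bias

x_t = rng.normal(size=n_in)                    # current input vector
h_prev = np.zeros(n_hidden)                    # previous hidden state h_{t-1}
# Forget gate: f_t = sigmoid(W_fx x_t + W_fh h_{t-1} + b_f), entries in (0, 1).
f_t = sigmoid(W_fx @ x_t + W_fh @ h_prev + b_f)
```

Each entry of `f_t` scales how much of the corresponding cell state is retained at this time step.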