State-dependent asset allocation using neural networks: The European Journal of Finance: Vol 28, No 11

Abstract

Changes in market conditions present challenges for investors as they cause performance to deviate from the ranges predicted by long-term averages of means and covariances. The aim of conditional asset allocation strategies is to overcome this issue by adjusting portfolio allocations to hedge changes in the investment opportunity set. This paper proposes a new approach to conditional asset allocation that is based on machine learning; it analyzes historical market states and asset returns and identifies the optimal portfolio choice in a new period when new observations become available. In this approach, we directly relate state variables to portfolio weights, rather than first modeling the return distribution and subsequently estimating the portfolio choice. The method captures nonlinearity among the state (predicting) variables and portfolio weights without assuming any particular distribution of returns and other data, without fitting a model with a fixed number of predicting variables to data and without estimating any parameters. The empirical results for a portfolio of stock and bond indices show that the proposed approach generates a more efficient outcome than traditional methods and is robust to the use of different objective functions across different sample periods.

Acknowledgements

We thank Talis Putnins, Gabor Rudolf (who is no longer with us), Adrian Lee, Nigar Hashimzade, Eduardo Roca, Robert Elliott, Ihsan Badshah, Chandra Krishnamurti, and participants at the 2019 UniSA Fintech Conference for their comments and discussions.

Disclosure statement

No potential conflict of interest was reported by the author(s).

Notes

1 See, e.g. Merton (1980), Fama and French (1988), Barberis (2000), Diebold, Lee, and Weinbach (1994), Chen (2009), Kanas (2008), Costa and Kwon (2019), and Ang and Bekaert (2004).

2 Performance ratios serve as explicit asset allocation formulas in which the optimal allocation is proportional to the ratio of reward to risk. They are used for managing mutual funds and are popular in optimal asset allocation (see, e.g. Farinelli et al. (2008, 2009) and Biglova et al. (2004)).

3 The Sharpe ratio is valid for the portfolio choice of an investor if returns are normally distributed because mean and variance can explain their distribution; however, it is well documented that returns do not follow normal distributions (Leland 1999; Kat and Brooks 2002; Agarwal and Naik 2004; Malkiel and Saha 2005). When returns exhibit heavy tails, excess kurtosis or skewness, using Sharpe as the objective function leads to an inefficient estimation of risk and to incorrect investment decisions. To overcome this well-known limitation of the Sharpe ratio, several alternative performance ratios have been proposed in the literature. For example, in Gini (Shalit and Yitzhaki 1984), Mean Absolute Deviation (MAD) (Konno and Yamazaki 1991) and MiniMax (Young 1998), the risk measures are redefined. Recently, asymmetrical parameter-dependent ratios have become popular. Value at Risk (VaR) and Conditional VaR (CVaR) (Martin, Rachev, and Siboulet 2003) capture downside risk, and Rachev (Biglova et al. 2004) uses truncated moments conditioned on tail events as the risk measure.

4 The ability to incorporate a large number of input variables is a feature of ANN as a machine learning tool. Since the proposed method is based on ANN, it lets the data speak and disregards irrelevant state variables in the information set over time. However, in the empirical section of this paper, we use four popular state variables to facilitate comparing the performance of this approach with traditional methods.

5 The sigmoid function generates output values between 0 and 1 and is the most popular activation function in multilayer perceptron networks (Alpaydin 2004).

6 The aim of the stochastic gradient descent method (Hannah, Powell, and Blei 2010) is to optimize a function iteratively by generating a sequence of solutions based on estimated gradients calculated from a randomly selected subset of the data. The algorithm stops when there is no improvement in the objective function or when some other criterion is met. The learning rate decays at each iteration of the algorithm and is computed as $γ_{i} = γ_{0} / (1 + c_{i})$, where $γ_{i}$ is the rate at the ith iteration, $c_{i}$ is the iteration number, and $γ_{0}$ is a constant. We use cross-validation to choose the best parameters on the training set.
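The decay rule in this note can be illustrated with a minimal Python sketch. It reads the rule as $γ_{0}$ divided by $(1 + c_{i})$, which is an assumption about the flattened formula. The toy objective (estimating a mean by least squares), the function names `decayed_rate` and `sgd_mean`, and all numeric settings are hypothetical and are not taken from the paper.

```python
import numpy as np

def decayed_rate(gamma0: float, iteration: int) -> float:
    """Learning rate gamma_i = gamma0 / (1 + c_i), with c_i the iteration count (assumed form)."""
    return gamma0 / (1.0 + iteration)

def sgd_mean(data, gamma0=0.5, batch_size=8, n_iter=500, seed=0):
    """Toy SGD: minimise the mean of (theta - x)^2 using random mini-batches."""
    rng = np.random.default_rng(seed)
    theta = 0.0
    for i in range(n_iter):
        # Estimate the gradient from a randomly selected subset of the data.
        batch = rng.choice(data, size=batch_size, replace=False)
        grad = 2.0 * (theta - batch.mean())
        # Step against the gradient with the decayed learning rate.
        theta -= decayed_rate(gamma0, i) * grad
    return theta

data = np.linspace(0.0, 1.0, 101)  # hypothetical observations
estimate = sgd_mean(data)          # approaches the sample mean of the data
```

A fixed iteration budget stands in here for the stopping criteria mentioned in the note; a practical run would also monitor the objective for lack of improvement.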

7 We control the number of nodes and other associated parameters by implementing cross-validation and measuring the objective function value over the training process to ensure that we avoid overfitting. Technical details are available upon request.

8 To maximize function (6), its gradients have to be set to zero. Here, we do this numerically by following the positive gradient direction to find a local maximum point. Since $\frac{\partial \bar{L}}{\partial μ} = x_{t}^{\prime} e - 1$ is one of the gradients and is zero at the optimum, the algorithm does not terminate as long as $x_{t}^{\prime} e - 1 \neq 0$.

9 Training refers to the process of modeling the observation data by machine learning tools like ANN to infer a function. The observation data used in the process and the final estimated model are known as ‘training data’ and a ‘trained model’, respectively.

10 In other words, we use ANN as an optimization tool, an approach that has been discussed in the computer science literature and used in a few fields such as inventory, logistics and smart grids (e.g. Oroojlooyjadid, Snyder, and Takac 2020; Villarrubia et al. 2018).

11 Variables associated with mean predictability include term spread (Campbell 1987; Fama and French 1988, 1989; Ferson and Harvey 1991), default spread (Fama and French 1988, 1989; Keim and Stambaugh 1986) and Treasury bill yield (Fama and Schwert 1977; Ferson and Harvey 1991). The variables for predicting variances include lagged squared return and/or lagged variance (Bollerslev 1986; Engle 1982; French, Schwert, and Stambaugh 1987; Harvey 2001; Schwert 1989; Whitelaw 1994), default spread (Harvey 2001; Whitelaw 1994), dividend yield (Harvey 2001) and debt-to-equity ratio (Schwert 1989). Finally, the predictability of covariances is attributed to variables such as lagged covariances, lagged cross-products of returns (Bollerslev, Engle, and Wooldridge 1988), term spread (Campbell 1987; Harvey 2001), and default spread and dividend yield (Harvey 2001).

12 We use the terms ‘performance measure’ and ‘performance ratio’ interchangeably.

13 We consider CVaR with parameter $α = 0.5$, and for Rachev we use the parameters $α = 0.5$ and $β = 0.99$. However, our proposed procedure is not sensitive to these settings.

14 In empirical applications of machine learning tools, it is common to divide the whole sample into two parts, in-sample and out-of-sample, which usually contain 70% and 30% of the observations, respectively.

15 Out-of-sample periods are 1999–2003, 2004–2008, 2009–2013 and 2014–2018. For each out-of-sample period, ANN is trained over the previous 156 months of observations.
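The training windows implied by this note can be worked out with a short Python sketch. It assumes monthly data and that each out-of-sample period starts in January of its first year; the helper name `training_window` is hypothetical.

```python
def training_window(oos_start_year: int, n_months: int = 156):
    """Return ((year, month) of first, (year, month) of last) training observation.

    Assumes the window ends in December of the year before the
    out-of-sample period begins and spans n_months monthly observations.
    """
    last = (oos_start_year - 1, 12)
    # Zero-based month index of the last training month, moved back n_months - 1 months.
    first_index = (oos_start_year - 1) * 12 + 11 - (n_months - 1)
    first = (first_index // 12, first_index % 12 + 1)
    return first, last

# Out-of-sample periods listed in the note: 1999-2003, 2004-2008, 2009-2013, 2014-2018.
windows = {year: training_window(year) for year in (1999, 2004, 2009, 2014)}
```

Under these assumptions, the 1999–2003 period would be trained on January 1986 through December 1998, that is, 13 full years of monthly observations.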

16 The information for state variables is available at the end of the previous month.

17 For example, see Chen, Tsai, and Lin (2011), Campbell, Huisman, and Koedijk (2001), Kroll, Levy, and Markowitz (1984) and Bekaert and Hodrick (1992), among others, who use the same approach.

18 See Chan, Karceski, and Lakonishok (1999), among others, for this approach.

19 Our results are not sensitive to the choice of lags.

20 The returns of assets are assumed to have a multivariate normal distribution. We follow the simulation method suggested by Kotz et al. (2000) to simulate daily returns.
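As a rough illustration of simulating multivariate normal daily returns, the sketch below draws samples with NumPy's built-in sampler. This is a stand-in rather than the simulation method of Kotz et al. cited in the note, and the mean vector, covariance matrix and function name `simulate_returns` are hypothetical values chosen only for the example.

```python
import numpy as np

def simulate_returns(mean, cov, n_days, seed=0):
    """Draw n_days of jointly normal daily returns, one column per asset."""
    rng = np.random.default_rng(seed)
    return rng.multivariate_normal(mean, cov, size=n_days)

# Hypothetical daily moments for a stock index and a bond index.
mean = np.array([0.0004, 0.0002])
cov = np.array([[1.0e-4, 1.0e-5],
                [1.0e-5, 2.5e-5]])

simulated = simulate_returns(mean, cov, n_days=50_000)
sample_mean = simulated.mean(axis=0)        # close to `mean` for large n_days
sample_cov = np.cov(simulated, rowvar=False)  # close to `cov` for large n_days
```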

21 In order to maximize Sharpe, Gini and Rachev, we follow optimization approaches suggested by Cornuejols and Tütüncü (2006). We use Konno and Yamazaki’s (1991) method to maximize MAD, and finally we follow Rockafellar and Uryasev’s (2002) approach to optimize CVaR and MiniMax. In general, in all these approaches, the idea is to iteratively explore the portfolios at different return levels on the efficient frontier and locate the one with the maximum ratio.

22 We also examine the risk parameter $γ = 100$ in benchmark (3) for investors who are extremely sensitive to losses (Brandt, Santa-Clara, and Valkanov 2009). The performance ratios with this risk parameter are significantly lower than those with the risk parameter $γ = 5$.

23 In addition to these three traditional benchmarks, we also consider ‘static’ portfolios with static weights for bond and stock portfolios which are common among practitioners. We construct three static portfolios as Benchmark-static_x-y, where x and y denote the weights (in percent) of the S&P 500 index and the bond index in the portfolios, respectively.

We use the static weights and asset returns to compute portfolio returns and performance ratios during the out-of-sample periods. The mean values of different performance ratios for these static portfolios are as follows.

Benchmark-static_20–80: Sharpe = 0.05, MAD = 0.06, MiniMax = 0.04, Gini = 0.09, CVaR = 0.18, Rachev = 3.31;

Benchmark-static_60–40: Sharpe = 0.07, MAD = 0.09, MiniMax = 0.05, Gini = 0.12, CVaR = 0.25, Rachev = 3.56; and

Benchmark-static_80–20: Sharpe = 0.07, MAD = 0.09, MiniMax = 0.06, Gini = 0.12, CVaR = 0.17, Rachev = 3.53.

Comparing these results with those in Table indicates that the means of all ratios are greater when we use ANN to find the optimal weights compared to the static approach. We thank the associate editor for this valuable suggestion.
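The static-portfolio calculation described in this note can be sketched in Python as follows. The function names are hypothetical, the return series is made-up illustration data, and the Sharpe ratio is computed in a common textbook form (mean excess return over its sample standard deviation), which is an assumption since the note does not spell out the exact definition used.

```python
import numpy as np

def static_portfolio_returns(stock_returns, bond_returns, stock_weight_pct):
    """Returns of a Benchmark-static_x-y style portfolio, x given in percent."""
    w = stock_weight_pct / 100.0
    return w * np.asarray(stock_returns) + (1.0 - w) * np.asarray(bond_returns)

def sharpe_ratio(returns, risk_free=0.0):
    """Mean excess return divided by its sample standard deviation (assumed definition)."""
    excess = np.asarray(returns) - risk_free
    return excess.mean() / excess.std(ddof=1)

# Hypothetical monthly returns, for illustration only.
stock = [0.02, -0.01, 0.03, 0.01]
bond = [0.01, 0.01, 0.00, 0.005]

portfolio_60_40 = static_portfolio_returns(stock, bond, 60)
ratio_60_40 = sharpe_ratio(portfolio_60_40)
```

The other performance ratios in the note would replace `sharpe_ratio` with their own reward and risk measures while reusing the same portfolio return series.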

24 We thank an anonymous referee for this suggestion.

25 The results for other ratios are qualitatively similar and available upon request. Also, note that in benchmark (3), the objective function is a utility function.

26 Results for other performance ratios are available upon request.

27 The alternatives are Partial derivatives (PaD) (Dimopoulos, Bourret, and Lek 1995, 1999), Garson’s algorithm (Garson 1991), the Perturb method (Yao et al. 1998; Scardi and Harding 1999) and the Profile method (Lek et al. 1996a, 1996b). Olden, Joy, and Death (2004) show that Connection Weights performs better than the other methods.
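For a network with a single hidden layer and one output node, the Connection Weights idea can be sketched as the sum, over hidden nodes, of the product of each input-to-hidden weight and the corresponding hidden-to-output weight. The Python snippet below illustrates this under that single-hidden-layer assumption; the function name and the weight values are hypothetical and do not come from the paper's trained model.

```python
import numpy as np

def connection_weights(w_input_hidden, w_hidden_output):
    """Importance of each input: sum over hidden nodes of (input->hidden) * (hidden->output).

    w_input_hidden has shape (n_inputs, n_hidden); w_hidden_output has shape (n_hidden,).
    """
    return np.asarray(w_input_hidden) @ np.asarray(w_hidden_output)

# Hypothetical weights for two inputs and two hidden nodes.
w_ih = np.array([[1.0, 2.0],
                 [3.0, -1.0]])
w_ho = np.array([0.5, -1.0])

importance = connection_weights(w_ih, w_ho)
# The sign gives the direction of each input's overall effect; the magnitude gives its strength.
```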

28 We thank an anonymous referee for this suggestion.

29 Since we have only two asset classes in our empirical section, our network has one output node which is the portfolio weight (x) for one asset class (the portfolio weight for the other asset class is 1 − x).
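A minimal Python sketch of such a network's forward pass is given below. It assumes one hidden layer with sigmoid activations and a sigmoid on the single output node so that the weight x lies between 0 and 1; the layer sizes, parameter values and function names are hypothetical, and the actual architecture is not specified in this note.

```python
import numpy as np

def sigmoid(z):
    """Logistic function mapping any real input into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def portfolio_weights(state, w_hidden, b_hidden, w_out, b_out):
    """Map state variables to (x, 1 - x) through one hidden layer and one output node."""
    hidden = sigmoid(state @ w_hidden + b_hidden)  # hidden-layer activations
    x = float(sigmoid(hidden @ w_out + b_out))     # weight of the first asset class
    return x, 1.0 - x                              # weights sum to one by construction

# Hypothetical example: four state variables and three hidden nodes.
rng = np.random.default_rng(1)
state = rng.normal(size=4)
x, one_minus_x = portfolio_weights(
    state,
    w_hidden=rng.normal(size=(4, 3)),
    b_hidden=rng.normal(size=3),
    w_out=rng.normal(size=3),
    b_out=0.1,
)
```

With a single output node the budget constraint holds automatically, since the second weight is defined as the remainder.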

Additional information

Notes on contributors

Reza Bradrania

Reza Bradrania is a Senior Lecturer of Finance, Program Director of Finance, Economics and Property programs, and a member of the Centre for Markets, Values and Inclusion at the University of South Australia Business School. Reza's research is in the area of empirical asset pricing and investment. His current interests include utilizing machine learning in investment, and implications of gambling in the stock market. His articles have appeared in premier international peer-reviewed journals and conferences and have been featured in the media. He is the recipient of several research grants from the Chartered Institute of Management Accountants (CIMA) in the UK, the Accounting and Finance Association of Australia and New Zealand (AFAANZ) and the Centre for International Finance and Regulations (CIFR). Reza holds a PhD in Finance from the University of Sydney.

Davood Pirayesh Neghab

Davood Pirayesh Neghab has a PhD in Industrial Engineering and Operations Research from Koc University, Turkey. He holds a master's degree in Financial Engineering from the University of Tehran and a bachelor's degree in Industrial Engineering from Iran University of Science and Technology. He has been involved in research projects with a focus on machine learning applications in finance. His research interests include portfolio optimization, risk management, data science, and inventory control systems.
