Research

Enhanced Portfolio Optimization


Abstract

Portfolio optimization should provide large benefits for investors, but standard mean–variance optimization (MVO) works so poorly in practice that optimization is often abandoned. Many of the approaches developed to address this issue are surrounded by mystique regarding how, why, and whether they really work. So, we sought to simplify, unify, and demystify optimization. We identified the portfolios that cause problems in standard MVO, and we present here a simple “enhanced portfolio optimization” method. Applying this method to industry momentum and time-series momentum across equities and global asset classes, we found significant alpha beyond the market, the 1/N portfolio, and standard asset pricing factors.

Disclosure: The authors report no conflicts of interest. AQR Capital Management is a global investment management firm that may or may not apply similar investment techniques or methods of analysis as described here. The views expressed here are those of the authors and not necessarily those of AQR. Lasse Heje Pedersen gratefully acknowledges support from Center for Financial Frictions (Grant No. DNRF102).

Editor’s Note:

Submitted 13 August 2020

Accepted 17 November 2020 by Stephen J. Brown

Acknowledgments

We thank Michele Aghassi, Stephen Brown (the editor), Ben Davis, Victor DeMiguel, Antti Ilmanen, Roni Israelov, Bryan Kelly, Lorenzo Garlappi, Ernst Schaumburg, and Raman Uppal for helpful comments and Matthew Silverman and Jusvin Dhillon for excellent research assistance.

Notes

1 A large literature has addressed estimation noise—for example, Ledoit and Wolf (2003, 2004) on noise in variance–covariance matrices and Black and Litterman (1992) on noise in expected returns.

2 Note that this result is not simply the same as saying that averaging portfolios improves performance (as shown by Tu and Zhou 2011). We found that EPO can work even better. For example, if we first compute the standard MVO portfolio without shrinkage, $x_{w=0}$, and the solution with full shrinkage, $x_{w=1}$, and then take the average of these, $a\,x_{w=0} + (1-a)\,x_{w=1}$, the result does not work as well as our EPO method for any $a$, especially if the MVO is particularly ill behaved. The EPO method first shrinks and then optimizes, not the other way around, which is useful because shrinking the correlations stabilizes the optimization process.
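
The difference between shrinking before optimizing and averaging after optimizing can be illustrated numerically. The following is a minimal sketch, assuming the simple EPO portfolio takes the form $(1/\gamma)\Sigma_w^{-1}s$ with $\Sigma_w = (1-w)\Sigma + wV$, where $V$ holds only the variances; the function name `simple_epo`, the three-asset inputs, and $\gamma = 1$ are hypothetical choices made for illustration only.

```python
import numpy as np

def simple_epo(cov, signal, w, gamma=1.0):
    """Shrink correlations toward zero by weight w, then optimize."""
    var_diag = np.diag(np.diag(cov))              # V: variances only
    cov_shrunk = (1.0 - w) * cov + w * var_diag   # assumed form of Sigma_w
    return np.linalg.solve(cov_shrunk, signal) / gamma

# Hypothetical inputs: two highly correlated assets plus a third asset
vols = np.array([0.10, 0.10, 0.15])
corr = np.array([[1.00, 0.98, 0.20],
                 [0.98, 1.00, 0.20],
                 [0.20, 0.20, 1.00]])
cov = np.outer(vols, vols) * corr
signal = np.array([0.05, 0.04, 0.03])

x_mvo = simple_epo(cov, signal, w=0.0)      # standard MVO, no shrinkage
x_anchor = simple_epo(cov, signal, w=1.0)   # full shrinkage
x_epo = simple_epo(cov, signal, w=0.5)      # shrink first, then optimize
x_avg = 0.5 * x_mvo + 0.5 * x_anchor        # optimize first, then average

print("MVO:    ", x_mvo)
print("EPO:    ", x_epo)
print("Average:", x_avg)
```

With these made-up inputs, the averaged portfolio inherits the large offsetting positions of the unshrunk MVO solution in the two correlated assets, whereas the portfolio that shrinks first does not. The sketch says nothing about out-of-sample performance.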

3 We unify several leading approaches to optimization, but EPO obviously does not nest all methods. Roncalli (2013) and Bruder, Gaussel, Richard, and Roncalli (2013) reviewed various methods of regularizing MVO, including a discussion of the eigendecomposition of the variance–covariance matrix similar to our problem portfolios, showing that the risk of these portfolios is low. We additionally show that the expected return of problem portfolios is too high (see Panel B of ) and that large EPO shrinkage can help address both these problems. DeMiguel et al. (2009), considering 14 methods of optimization, found that none consistently outperformed the simple 1/N portfolio. Some methods do show promise in outperforming the 1/N portfolio, however, such as methods that constrain the portfolio norm (Jagannathan and Ma 2003; DeMiguel, Garlappi, Nogales, and Uppal 2009), methods based on ambiguity aversion (Garlappi, Uppal, and Wang 2007), methods that average several approaches (Tu and Zhou 2011), and methods that apply careful MVO with good inputs (Allen, Lizieri, and Satchell 2019).

4 Although a version of EPO can be shown to be equivalent to Black and Litterman (1992), there are several differences. Black and Litterman always shrank toward the market portfolio, whereas we consider a general anchor (or no anchor). They considered long–short “view portfolios,” whereas we simply consider signals about expected returns, such as industry momentum or time-series momentum. We also allow “double shrinkage” of both the estimated expected returns and the variance–covariance matrix. Most importantly, our contribution is to unify this approach with other optimization methods by showing the link to correlation shrinkage (which is not clear from the equations in Black and Litterman, p. 42), by presenting a simple, new, and powerful way to operationalize the method, and by documenting empirically how it works.

5 Appendix A describes a method to stabilize the risk model that is more sophisticated than shrinking correlations called “random matrix theory” (RMT). We have found empirically, however, that EPO works as well with simple correlation shrinkage as with RMT.

6 The variance of $\eta$ is proportional to $\Sigma$ in order to capture the idea that true fluctuations in expected returns are correlated across correlated assets (similar to the assumption made in Point 7 of the appendix of Black and Litterman 1992). Expressed in a different way, the PC portfolios have expected returns $P'\sigma^{-1}\mu = \gamma P'\sigma^{-1}\Sigma a + P'\sigma^{-1}\eta$, where the random fluctuation term, $P'\sigma^{-1}\eta$, has variance $\tau P'\sigma^{-1}\Sigma\sigma^{-1}P = \tau D$, implying that the expected returns of the least important principal components vary the least.

7 To understand the anchor at a deeper level, consider again the case of $\eta = 0$. In this case, the expected excess return on any asset (say, asset number 1) is $E(r^1) = \gamma\,(1, 0, \ldots, 0)\,\Sigma a = \gamma\,\mathrm{cov}(r^1, r^a \mid s)$. Using this relationship for anchor portfolio $a$ and solving for $\gamma = E(r^a)/\mathrm{var}(r^a \mid s)$, we get $E(r^1) = \frac{\mathrm{cov}(r^1, r^a \mid s)}{\mathrm{var}(r^a \mid s)}\,E(r^a) =: \beta^{1,a}E(r^a)$. If $a$ is the market portfolio, this relationship is simply the conditional capital asset pricing model (CAPM). Hence, Equation 12 defining $\mu$ means that the CAPM holds, on average, but $\eta$ pushes the expected returns around in such a way that the CAPM does not always hold exactly, resulting in trading opportunities. More generally, Equation 12 says that the anchor is the tangency portfolio when there are no shocks ($\eta = 0$).

8 To our knowledge, the specification of Equation 15 and its solution is new, but Fabozzi et al. (2010) considered a version of Equation 15 that is simpler in two ways: First, whereas we consider a general Λ, Fabozzi et al. assumed that Λ equals Σ, which means that there is no shrinkage of the variance–covariance matrix, and second, Fabozzi et al. did not have an anchor portfolio.

9 The assumption of independence of errors in the expected returns across securities, $\Lambda = \lambda V$, implies that the error in the measurement of the expected return of the principal components has a variance given by $P'\sigma^{-1}(\lambda V)\sigma^{-1}P = \lambda I$, where $I$ is the identity matrix. That is, errors of all the principal components are independent and of equal magnitude.
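
This identity can be checked numerically. The sketch below assumes that $\sigma$ is the diagonal matrix of volatilities, $V = \sigma^2$, and $P$ holds the eigenvectors of the correlation matrix; the three-asset inputs and the value of $\lambda$ are hypothetical.

```python
import numpy as np

# Hypothetical volatilities and correlation matrix
vols = np.array([0.10, 0.20, 0.15])
corr = np.array([[1.0, 0.6, 0.3],
                 [0.6, 1.0, 0.4],
                 [0.3, 0.4, 1.0]])

sigma = np.diag(vols)            # diagonal matrix of volatilities
sigma_inv = np.diag(1.0 / vols)
V = sigma @ sigma                # diagonal matrix of variances
cov = sigma @ corr @ sigma       # full variance-covariance matrix

# Eigendecomposition of the correlation matrix: corr = P D P'
eigvals, P = np.linalg.eigh(corr)

lam = 0.25  # hypothetical error-variance scale
# Error variance of the principal components when Lambda = lam * V
pc_error_var = P.T @ sigma_inv @ (lam * V) @ sigma_inv @ P

# For comparison, the same rotation applied to the covariance matrix gives D
pc_var = P.T @ sigma_inv @ cov @ sigma_inv @ P

print(np.round(pc_error_var, 10))   # lam times the identity
print(np.round(pc_var, 10))         # diagonal matrix of eigenvalues
```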

10 Alternatively, we can think of the anchor being $a = 0$, which gives the same result as Equation 20 up to a constant that can be absorbed in the risk aversion coefficient. However, we think of the anchor as also being the EPO portfolio with full shrinkage, $w = 1$, implying that $a = (1/\gamma)V^{-1}s$ is the more natural interpretation of Equation 20.

11 Investors can also avoid specifying γ altogether by solving an equivalent optimization that maximizes expected returns subject to a maximum volatility constraint, thus specifying a volatility target in lieu of γ.

12 Choosing $\gamma$ may be done in several other, related ways, some of which work better than others. For example, although $\gamma$ in Equation 22 equalizes the variance of the anchor with that of $(1/\gamma)\Sigma_w^{-1}s$, one could also replace the latter with the variance of the standard MVO solution, $(1/\gamma)\tilde{\Sigma}^{-1}s$, but this is a poor choice if the standard MVO is ill behaved. Ao et al. (2019) and Raponi et al. (2020) also considered methods where $\gamma$ is based on variance.

13 Specifically, the general EPO is the solution to a Lavrentiev regularization (Lavrentiev 1967), and the simple EPO is the solution to a Tikhonov regularization. The simple EPO can also be seen as a ridge regression of a vector of 1s on the matrix of realized returns when risk and expected returns are estimated by their sample counterparts.

14 If we had included non-USD currency pairs, then the variance–covariance matrix would not be of full rank because, for example, EUR–USD, EUR–JPY, and USD–JPY are linked through a triangular arbitrage.

16 The annualized variance of instrument $i$ was estimated as $(\sigma_t^i)^2 = 261\sum_{k=0}^{\infty}(1-\delta)\delta^k\,(r_{t-1-k}^i - \bar{r}_t^i)^2$, where $\bar{r}_t^i$ is the exponentially weighted average return computed similarly, 261 annualizes the daily returns, and $\delta$ was chosen to achieve a center of mass of $\sum_{k=0}^{\infty}(1-\delta)\delta^k k = \delta/(1-\delta) = 60$ days. The correlations were estimated by first computing covariances and volatilities in the corresponding way (using 3-day returns with a 150-day center of mass) and then computing the correlations as ratios of the covariances to the product of the volatilities. We required at least 300 days of data to be available for an asset before it entered the covariance matrix.
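
The variance estimator in this note can be sketched in a few lines. This is an illustration under stated assumptions rather than the authors' implementation: the function name `ewma_annualized_variance` is hypothetical, and because any real sample is finite, the weights are renormalized to sum to 1 (the formula above sums over an infinite history).

```python
import numpy as np

def ewma_annualized_variance(daily_returns, center_of_mass=60.0, days_per_year=261):
    """Exponentially weighted variance of daily returns, annualized.

    daily_returns is ordered oldest to newest and should end at day t-1.
    """
    # Solve delta / (1 - delta) = center_of_mass for delta
    delta = center_of_mass / (1.0 + center_of_mass)
    r = np.asarray(daily_returns, dtype=float)[::-1]   # k = 0 is the latest return
    weights = (1.0 - delta) * delta ** np.arange(r.size)
    weights /= weights.sum()   # assumption: renormalize for a finite sample
    mean = np.sum(weights * r)                         # exponentially weighted mean
    return days_per_year * np.sum(weights * (r - mean) ** 2)

# Usage with simulated returns (hypothetical 1% daily volatility)
rng = np.random.default_rng(0)
sim_returns = rng.normal(0.0, 0.01, size=1000)
annual_vol = np.sqrt(ewma_annualized_variance(sim_returns))
print(annual_vol)
```

With a 60-day center of mass, $\delta = 60/61$, so the most recent return receives a weight of about 1.6% and weights decay geometrically from there.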

17 In other words, the covariance of assets $i$ and $j$ is estimated as $\frac{1}{K-1}\sum_{k=1}^{K}(r_{t-k}^i - \bar{r}_t^i)(r_{t-k}^j - \bar{r}_t^j)$.

18 Some studies have considered longer time horizons—for example, past five-year returns. Past long-term returns, however, predict returns negatively, if at all, perhaps because securities that have risen in price over a long time have become expensive (De Bondt and Thaler 1985). Alas, comparing optimization methods using a faulty signal of expected returns is not informative.

19 Babu et al. (2020) reported a median time-series momentum Sharpe ratio per asset of 0.34 per year (i.e., 0.10 per month) for traditional assets.

20 Indeed, this coefficient implies that the EPO portfolio with full shrinkage, $\mathrm{EPO}^{s}_{w=100\%} = (1/\gamma_t)V_t^{-1}s_t$, has a notional exposure to asset $i$ that matches that of Moskowitz et al. (2012) given in Appendix A. That is, $\mathrm{EPO}^{s,i}_{w=100\%} = (1/\gamma_t)\,s_t^i/(\sigma_t^i)^2 = (1/n_t)\,(40\%/\sigma_t^i)\,\mathrm{sign}(r_{t-12,t}^i)$.
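
The right-hand side of this expression is straightforward to compute. The sketch below is an illustration only: the function name `tsmom_positions` and the two-asset inputs are hypothetical, and the volatilities are assumed to be annualized.

```python
import numpy as np

def tsmom_positions(past_12m_returns, annual_vols, target_vol=0.40):
    """Notional exposure per asset: (1/n) * (target_vol / vol) * sign(past return)."""
    r = np.asarray(past_12m_returns, dtype=float)
    vol = np.asarray(annual_vols, dtype=float)
    n = r.size   # number of assets available at time t
    return np.sign(r) * target_vol / vol / n

# Hypothetical example: one asset with a positive and one with a negative past return
positions = tsmom_positions([0.10, -0.05], [0.20, 0.10])
print(positions)   # [ 1. -2.]
```

Each asset is scaled to the same 40% volatility target, so the lower-volatility asset receives the larger notional position.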

21 Because the number of assets in our sample varied over time, we scaled the realized and ex ante average returns and volatilities to preserve the trace of the correlation matrix—that is, ensuring that the sum of variances would equal the largest number of assets in our sample, 55.

22 From 1963 to 2018, the five Fama–French factors realized Sharpe ratios between 0.27 and 0.49 and the equal-weighted portfolio of all five factors realized a Sharpe ratio of 0.93.

23 Estimates of the eigenvectors are kept equal to the sample eigenvectors to make the estimate of the correlation matrix rotational invariant, meaning that rotating the data by some orthogonal matrix rotates the estimator in the same way (see Ledoit and Wolf 2012; Bun et al. 2017).

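24 The Woodbury matrix identity shows a way to rewrite the inverse of a sum of matrices and, using the Woodbury formula, we see that $I - (\tau\Sigma + \Lambda)^{-1}\tau\Sigma = (I + \Lambda^{-1}\tau\Sigma)^{-1} = (\Lambda^{-1}(\Lambda + \tau\Sigma))^{-1} = (\tau\Sigma + \Lambda)^{-1}\Lambda$.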
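
The chain of equalities can be verified numerically for any positive-definite $\Sigma$ and $\Lambda$. The following check uses randomly generated matrices; the dimension, seed, and value of $\tau$ are arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4
A = rng.normal(size=(n, n))
Sigma = A @ A.T + n * np.eye(n)                  # random positive-definite matrix
Lam = np.diag(rng.uniform(0.5, 2.0, size=n))     # diagonal positive-definite matrix
tau = 0.3
I = np.eye(n)
inv = np.linalg.inv

first = I - inv(tau * Sigma + Lam) @ (tau * Sigma)
second = inv(I + inv(Lam) @ (tau * Sigma))
third = inv(inv(Lam) @ (Lam + tau * Sigma))
fourth = inv(tau * Sigma + Lam) @ Lam

print(np.allclose(first, second), np.allclose(second, third), np.allclose(third, fourth))
```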

25 We can also write the regression in a simpler way, $\frac{1}{\gamma}s = \Sigma x + \varepsilon$, as we do when we consider the Lavrentiev regularization. When we use the standard ridge regression on this simpler equation, we get $x = \frac{1}{\gamma}(\Sigma^2 + \lambda I)^{-1}\Sigma s$, so we have written the regression differently to avoid the $\Sigma$-squared.
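
The closed-form ridge solution quoted in this note can be confirmed against a generic least-squares solver. The sketch below relies on the standard fact that ridge regression is ordinary least squares on a system augmented with $\sqrt{\lambda}\,I$ rows; the matrix, signal, $\gamma$, and $\lambda$ are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5
A = rng.normal(size=(n, n))
Sigma = A @ A.T + n * np.eye(n)   # symmetric positive-definite "risk model"
s = rng.normal(size=n)            # hypothetical signal vector
gamma, lam = 2.0, 0.1

y = s / gamma                     # left-hand side of the regression

# Closed form: x = (1/gamma) (Sigma^2 + lam I)^{-1} Sigma s
x_closed = np.linalg.solve(Sigma @ Sigma + lam * np.eye(n), Sigma @ y)

# Same ridge problem solved as least squares on an augmented system
X_aug = np.vstack([Sigma, np.sqrt(lam) * np.eye(n)])
y_aug = np.concatenate([y, np.zeros(n)])
x_lstsq, *_ = np.linalg.lstsq(X_aug, y_aug, rcond=None)

print(np.allclose(x_closed, x_lstsq))
```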