Forecasting semi-stationary processes and statistical arbitrage

Pages 179-189 | Received 15 May 2019, Accepted 30 Sep 2019, Published online: 14 Oct 2019

ABSTRACT

If a financial derivative can be traded consecutively and its terminal payoffs can be adjusted to be the sum of a bounded process and a stationary process, then we can use the moving average of the historical payoffs as a forecast, and the corresponding errors form a generalised mean reversion process. Thus we can price the financial derivative by its moving average. One may even obtain statistical arbitrage from certain derivative prices. We discuss in particular the example of European call options and show that it is possible to obtain statistical arbitrage from the Black–Scholes option price.

1. Introduction

In economics, arbitrage is the act of exploiting price differences between two or more markets: building a portfolio of deals upon the imbalance and expecting to earn the spread as profit. In academia, arbitrage refers to the possibility of obtaining a risk-free profit at no cost once the transaction is completed. For example, when there is an arbitrage opportunity, one can buy low and immediately sell high.

The practice of ‘statistical arbitrage’ actually came together with gambling. Suppose that a die is rolled repeatedly and a gambler loses 1 dollar each time the result is ‘1’ and wins 1 dollar otherwise. Denote by $S_n$ the gambler's gain after $n$ tries; then $S_n/n$ is approximately $2/3$ for large $n$, since the expected gain per roll is $5/6-1/6=2/3$. That is the law of large numbers in probability theory, on which the theory of mathematical statistics is based. Furthermore, the strong law of large numbers tells us that when $n\to\infty$, $S_n/n$ approaches $2/3$ with probability 1. However, that is not an arbitrage opportunity for this gambler, as he still has a chance to lose no matter how large $n$ is. Morgan Stanley started in the early 1980s (see Gregory van Kipnis' foreword to Pole, 2007) to apply ‘statistical arbitrage’ to profit from the stock market. Nevertheless, the returns in the stock market are not independent like rolls of a die, so many mathematical models were created, most of which will possibly never be published because of their profitability. In Pole (2007), Andrew Pole showed many important historical examples, ‘rules’, and structural models of statistical arbitrage. In academic use, an arbitrage is risk-free; in common use, as in statistical arbitrage, it may refer to expected profit, though losses may occur, and in practice there are always risks in arbitrage, some minor (such as fluctuation of prices decreasing profit margins), some major (such as devaluation of a currency or derivative). In academic use, an arbitrage involves taking advantage of differences in price of a single asset or identical cash flows; in common use, it is also used to refer to differences between similar assets (relative value or convergence trades), as in merger arbitrage.

There have been a few more general academic definitions of statistical arbitrage since then. Hogan, Jarrow, and Warachka (2002) defined ‘Statistical Arbitrage’ by four conditions that the discounted cumulative value $v(t)$ should satisfy: (1) $v(0)=0$; (2) $\lim_{t\to\infty}E[v(t)]>0$; (3) $\lim_{t\to\infty}\mathrm{Var}(v(t))/t=0$; (4) $\lim_{t\to\infty}P[v(t)<0]=0$. Their Condition (3) excludes the cases where $v(t)$ is the sum of independent identically distributed outcomes, and thus excludes our dice game and most casino games. According to Lo (2010), ‘Statistical Arbitrage’ refers to highly technical short-term mean-reversion strategies involving large numbers of securities (hundreds to thousands, depending on the amount of risk capital), very short holding periods (measured in days to seconds), and substantial computational, trading and information technology (IT) infrastructure. Wang and Zheng (2014) simplified the definition to repeatedly trading a basket of assets according to the same algorithm and obtaining an accumulated profit with a statistically stable positive rate. Moreover, the ergodic theorem for stationary processes was applied to the practice of statistical arbitrage in Wang and Zheng (2014). We extend this discussion to derivative pricing in this paper. We discuss time series with a bounded trend component, whose increments have a nice mean reversion property.

The ‘mean reversion’ phenomenon is a special case of stationarity, which suggests that prices and returns eventually move back towards the mean or average. This mean or average can be the historical average of the price or return or another relevant average such as the growth in the economy or the average return of an industry. A typical example of the mean reversion phenomenon is the increments of the Ornstein–Uhlenbeck process (see, e.g., Karatzas & Shreve, 1987, p. 358). However, the Ornstein–Uhlenbeck process comes with a Gaussian law, which is not supported by some real market data. Thus we define a generalised discrete time mean reversion process as the increments of a semi-stationary process, that is, the sum of a bounded process and a stationary process. The time average of a generalised mean reversion process vanishes faster than that of an ordinary stationary process with null mean. Therefore a generalised mean reversion process is more stable than an ordinary stationary one. We show in the next section that if we use the moving average to estimate a time series with bounded trend component, then the time average of the error vanishes in inverse proportion to the length of time. As an application, we give an example of statistical arbitrage from the Black–Scholes option price based on estimates of payoffs with generalised mean reversion errors.

We introduce the concept of a generalised mean reversion process as the increment process of a semi-stationary process in Section 2. We use Section 3 to explain some basic facts of option pricing. We use Section 4 to study the options of the ETFs which track the main stock indices in the US market. Finally, we show in the last section that one may get statistical arbitrage from the Black–Scholes model.

2. Forecasting with MA

Given a discrete time stochastic process $\{X(t)\}_{t=0,1,2,\ldots}$, it is very important to give an estimate $\tilde X(t+1)$ for $X(t+1)$ based on the observed information $\{X(0),X(1),\ldots,X(t)\}$. Certainly, the selection of $\tilde X(t+1)$ depends on the choice of the norm of the error $X(t+1)-\tilde X(t+1)$. When we need to estimate successively for $t=m+1,m+2,\ldots,N$, the most popular choices are the sum of the $l^p$-norms ($p\ge 1$) of the errors, $\sum_{t=m+1}^{N}\{E[|X(t)-\tilde X(t)|^p]\}^{1/p}$, or similar ones. However, in many financial applications, an investor is more interested in the $l^p$-norm of the accumulated sum:

$$E\Big[\Big|\sum_{t=m+1}^{N}\big(X(t)-\tilde X(t)\big)\Big|^p\Big]^{1/p}=E\Big[\Big|\sum_{t=m+1}^{N}X(t)-\sum_{t=m+1}^{N}\tilde X(t)\Big|^p\Big]^{1/p}. \tag{1}$$

The economic reason is very simple. If $X(t+1)$ is the value of a certain financial derivative at time $t+1$ and $X(t+1)-\tilde X(t+1)$ is the loss caused by the estimate $\tilde X(t+1)$, then an investor is more interested in the accumulated loss $\sum_{t=m+1}^{N}(X(t)-\tilde X(t))$ than in the sum $\sum_{t=m+1}^{N}|X(t)-\tilde X(t)|$. For the error norm (1), the simplest estimate is to use the moving average (abbreviated MA)

$$M^{(m)}(t)=\frac{X(t)+X(t-1)+\cdots+X(t-m+1)}{m} \tag{2}$$

to forecast $X(t+1)$, which is called the MA model in time series analysis. We will impose a ‘semi-stationary’ condition on $\{X(t)\}$ under which (1) is bounded in $N$ for some $p>0$.

We say that $\{X(t)\}$ is a ‘strongly stationary’ process if, for each positive $a$, the processes $\{X(t+a)\}$ and $\{X(t)\}$ obey the same probability law. Thus a sequence of independent identically distributed random variables is a strongly stationary process. In the previous example of repeatedly rolling a die, if we set $X(t)=-1$ when the $t$th outcome is ‘1’, and $X(t)=1$ otherwise, then $\{X(t)\}$ is a strongly stationary process. The weaker form is the weakly stationary process: $\{X(t)\}_t$ is ‘weakly stationary’ if (i) $E[X(t)]$ is a constant; (ii) $\mathrm{Cov}(X(t),X(t+a))=\mathrm{Cov}(X(0),X(a))$ for each $t$. Certainly, if a strongly stationary process $\{X(t)\}$ has finite second moments, then $\{X(t)\}$ is weakly stationary. The most important property of a strongly stationary process is Birkhoff's theorem (Loeve, 1977, p. 76): the time average $(X(1)+X(2)+\cdots+X(N))/N$ of a strongly stationary process $\{X(t)\}$ converges with probability 1. Furthermore, when $\{X(t)\}$ has the so-called ‘ergodic’ property, the limit is just its mean.
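As a small illustration of the convergence of the time average, the dice game of the Introduction can be simulated. This is only a sketch: the function name `dice_gain_average` and the use of Python's pseudo-random generator are choices made here, not part of the paper.

```python
import random

def dice_gain_average(n_rolls, seed=0):
    """Time average S_n / n of the gambler's gain after n_rolls rolls of a fair die."""
    rng = random.Random(seed)
    total = 0
    for _ in range(n_rolls):
        # X(t) = -1 when the die shows '1', and X(t) = +1 otherwise.
        total += -1 if rng.randint(1, 6) == 1 else 1
    return total / n_rolls

if __name__ == "__main__":
    # The averages should drift towards the mean 2/3 as n grows.
    for n in (100, 10_000, 1_000_000):
        print(n, dice_gain_average(n))
```

The printed averages depend on the seed, but for large `n_rolls` they should stay close to $2/3$.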

Definition 2.1

We call $\{X(t)\}$ a strongly (or weakly) semi-stationary process if there is a bounded process $\{A(t)\}$ such that $\{X(t)-A(t)\}$ is a strongly (or weakly, respectively) stationary process.

It is easy to see that for a semi-stationary process $\{X(t)\}$, $X(T)/T\to 0$ at the rate $1/T$, and the time average $\frac{\Delta X(1)+\Delta X(2)+\cdots+\Delta X(N)}{N}$ of the increments $\Delta X(t)=X(t+1)-X(t)$ vanishes at the rate $1/N$, which is faster than that of an ordinary sequence of independent identically distributed random variables, which is $1/\sqrt{N}$.
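The two rates can be compared on simulated data. In the sketch below the bounded trend $A(t)=\sin(t/50)$ and the standard Gaussian noise are illustrative assumptions, as are the function names.

```python
import math
import random

def time_average_of_increments(x):
    """Average of dX(t) = X(t+1) - X(t); the sum telescopes to x[-1] - x[0]."""
    n = len(x) - 1
    return sum(x[t + 1] - x[t] for t in range(n)) / n

def simulate(n, seed=0):
    """Return (|time average of increments|, |time average of i.i.d. noise|)."""
    rng = random.Random(seed)
    # Hypothetical semi-stationary path: bounded trend plus i.i.d. Gaussian noise.
    x = [math.sin(t / 50.0) + rng.gauss(0.0, 1.0) for t in range(n + 1)]
    # Reference sequence: plain i.i.d. noise with mean zero.
    iid = [rng.gauss(0.0, 1.0) for _ in range(n)]
    return abs(time_average_of_increments(x)), abs(sum(iid) / n)

if __name__ == "__main__":
    for n in (100, 10_000, 1_000_000):
        print(n, simulate(n))
```

Because the sum of increments telescopes, the first quantity is bounded by a random constant divided by $n$, whereas the second typically shrinks only like $1/\sqrt{n}$.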

A time series $\{x(t)\}$ is often considered as a sequence of data which can be written as $x(t)=m(t)+S(t)+\epsilon(t)$, where $m(t)$ is the deterministic trend component, $S(t)$ is the seasonal component and $\epsilon(t)$ is a weakly stationary process (see Wang & Zheng, 2014, p. 101, for example). Since $S(t)$ is periodic, it is bounded. If $m(t)$ is also bounded, then $\{x(t)\}$ is a weakly semi-stationary process. We are going to use MA to forecast a weakly semi-stationary time series, and the time average of the cumulated error vanishes at the rate $1/T$.

The increments of a semi-stationary process have properties which are very similar to those of the so-called mean reversion process. The mean reversion phenomenon in stock prices has been studied for more than three decades. This theory suggests that prices and returns eventually move back towards the mean or average. This mean or average can be the historical average of the price or return or another relevant average such as the growth in the economy or the average return of an industry (see, e.g., Ansley, Spivey, & Wrobleski, 1977; Fama & French, 1988; Mukherji, 2011; Poterba & Summers, 1988). Its typical mathematical model is the Ornstein–Uhlenbeck process, of which the increments can be written as

$$dX(t)=\theta(\mu-X(t))\,dt+\sigma\,dW(t), \tag{3}$$

where $\mu$, $\theta\,(>0)$ and $\sigma\,(>0)$ are constants, and $dX(t)$ and $dW(t)$ are the increments of $X(t)$ and of the Brownian motion $W(t)$ respectively. Since the above equation is not taught in ordinary textbooks of probability theory, we briefly explain its meaning here. One considers $dW(t)$ as the noise and $\sigma$ as its magnitude. $(X(t)-\mu)$ is the distance from the mean $\mu$, so the drift $\theta(\mu-X(t))$ pushes the path back to the mean. If we denote $X(t)-\mu=Y(t)$, then $dY(t)=-\theta Y(t)\,dt+\sigma\,dW(t)$, which has the integral form (see Karatzas & Shreve, 1987, p. 358)

$$Y(t)=Y(0)\exp\{-\theta t\}+\sigma\int_0^t\exp\{\theta(s-t)\}\,dW(s).$$

Thus

$$X(t)=(X(0)-\mu)\exp\{-\theta t\}+\sigma\int_0^t\exp\{\theta(s-t)\}\,dW(s)+\mu,$$

which is strongly stationary if $X(0)$ has Gaussian distribution with mean $\mu$ and variance $\sigma^2/2\theta$. However, it is known that this process does not precisely match a lot of data experimentally. Therefore we need a more general definition. If we fix $\{Y(t)\}$ and set $X(t)=Y(t)+\mu$, then $X(t)$ satisfies (3) and they have the same increments: $X(t+1)-X(t)=Y(t+1)-Y(t)$ $(t=0,1,2,\ldots)$.

Hence, from the uniqueness in law of the Ornstein–Uhlenbeck process (for fixed initial stationary distribution, μ, θ and σ), we get the following.

Lemma 2.2

For fixed $\theta$ and $\sigma$, the probability law of the increments $\{X(t+1)-X(t)\}_{t=0,1,2,\ldots}$ of the stationary Ornstein–Uhlenbeck process does not depend on the mean $\mu$.
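The statement of Lemma 2.2 can be checked on a simulated path. The sketch below samples the stationary Ornstein–Uhlenbeck process at integer times using the Gaussian transition implied by the integral form above, namely $Y(t+1)=e^{-\theta}Y(t)+\text{noise}$ with noise variance $\sigma^2(1-e^{-2\theta})/2\theta$. The function names and parameter values are illustrative assumptions.

```python
import math
import random

def stationary_ou_path(mu, theta, sigma, n_steps, seed=0):
    """Sample X(0), X(1), ..., X(n_steps) of a stationary OU process with mean mu."""
    rng = random.Random(seed)
    decay = math.exp(-theta)
    stat_sd = sigma / math.sqrt(2.0 * theta)             # sd of the stationary law
    step_sd = stat_sd * math.sqrt(1.0 - decay * decay)   # sd of the one-step noise
    y = rng.gauss(0.0, stat_sd)                          # Y(0) = X(0) - mu
    path = [mu + y]
    for _ in range(n_steps):
        y = decay * y + rng.gauss(0.0, step_sd)
        path.append(mu + y)
    return path

def increments(path):
    """X(t+1) - X(t) for consecutive samples."""
    return [b - a for a, b in zip(path, path[1:])]

if __name__ == "__main__":
    inc_a = increments(stationary_ou_path(0.0, 0.5, 1.0, 5, seed=42))
    inc_b = increments(stationary_ou_path(10.0, 0.5, 1.0, 5, seed=42))
    print(inc_a)
    print(inc_b)  # same increments up to floating-point rounding
```

With the same seed the two paths are driven by the same noise, so their increments coincide path by path up to rounding, which is a stronger statement than equality in law.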

The above Lemma inspired us to introduce the following.

Definition 2.3

Let $\{U(t)\}_{t=1,2,\ldots}$ be a stochastic process. If there is a strongly (or weakly) semi-stationary process $\{X(t)\}_{t=0,1,2,\ldots}$ such that $U(t)=\Delta X(t)$ $(t=1,2,\ldots)$, then $\{U(t)\}$ is called a generalised strongly (or weakly, respectively) Mean Reversion process and $\{X(t)\}$ is its integrated process. Furthermore, if the integrated process is strongly (or weakly) stationary, then $\{U(t)\}$ is called a strongly (or weakly, respectively) Mean Reversion process.

Thus the time average $\frac{U(1)+\cdots+U(N)}{N}=\frac{X(N+1)-X(1)}{N}\to 0$ at the rate $1/N$.

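We easily have that

$$\begin{aligned}
\sum_{t=1}^{m}\frac{t}{m}X(t)+\sum_{t=m+1}^{N}\big[X(t)-M^{(m)}(t-1)\big]
&=\sum_{t=1}^{m}\frac{t}{m}X(t)+\sum_{t=m+1}^{N}\Big(X(t)-\frac{1}{m}\sum_{i=1}^{m}X(t-i)\Big)\\
&=\sum_{t=1}^{m}\frac{t}{m}X(t)+\sum_{t=m+1}^{N}X(t)-\frac{1}{m}\sum_{i=1}^{m}\sum_{t=m+1}^{N}X(t-i)\\
&=\sum_{i=1}^{m}\frac{m+1-i}{m}X(N+1-i)\\
&=\sum_{t=1}^{m}\frac{t}{m}X(N-m+t).
\end{aligned}$$

Thus we have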

Lemma 2.4

$$\sum_{t=m+1}^{N}\big[X(t)-M^{(m)}(t-1)\big]=\sum_{i=1}^{m}\frac{i}{m}X(N-m+i)-\sum_{i=1}^{m}\frac{i}{m}X(i).$$
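The identity of Lemma 2.4 is purely algebraic, so it can be checked numerically on arbitrary data. The following sketch does this; the helper names are hypothetical and the list entry `x[t]` stands for $X(t)$ (the entry `x[0]` is not used).

```python
import random

def moving_average(x, m, t):
    """M^(m)(t) = (X(t) + X(t-1) + ... + X(t-m+1)) / m."""
    return sum(x[t - j] for j in range(m)) / m

def cumulated_error(x, m, N):
    """Left-hand side: sum over t = m+1..N of X(t) - M^(m)(t-1)."""
    return sum(x[t] - moving_average(x, m, t - 1) for t in range(m + 1, N + 1))

def boundary_form(x, m, N):
    """Right-hand side: sum over i = 1..m of (i/m) * (X(N-m+i) - X(i))."""
    return sum(i / m * (x[N - m + i] - x[i]) for i in range(1, m + 1))

if __name__ == "__main__":
    rng = random.Random(0)
    m, N = 6, 500
    x = [rng.uniform(-1.0, 1.0) for _ in range(N + 1)]
    print(cumulated_error(x, m, N), boundary_form(x, m, N))
```

Both numbers agree up to rounding. The right-hand side involves only $2m$ values of the process however large $N$ is, which is why the cumulated error stays bounded in law for a semi-stationary input.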

Since $\sum_{i=1}^{m}(i/m)X(t-m+i)$ is a finite linear combination of the values at different times of a strongly (or weakly) stationary process, which is also strongly (weakly, respectively) stationary, we get

Theorem 2.5

When $\{X(t)\}$ is a strongly (or weakly) semi-stationary process, the error process $\{X(t+1)-M^{(m)}(t)\}$ is a generalised strongly (or weakly, respectively) mean reversion process and the accumulated error is strongly (or weakly, respectively) semi-stationary, such that $\frac{1}{N}\sum_{t=m+1}^{N}\big(X(t)-M^{(m)}(t-1)\big)\to 0$ at the rate $1/N$.

Therefore, if $\{X(t)\}$ is a semi-stationary sequence of values of financial derivatives in the market, then we can use $\{M^{(m)}(t-1)\}$ to forecast its values, with errors forming a generalised mean reversion process. Hence the errors have mean nearly 0 in the long run. In particular, we will consider option prices in the next section.

The ergodic theorem for stationary processes has its weak point: it cannot be applied to an arbitrary subsequence of a stationary process. For example, when $X$ is a standard Gaussian random variable, $\{X,-X,X,-X,\ldots\}$ is a strongly stationary process. However, one can easily choose a subsequence which has no convergent time average. Indeed, if we choose $\{X,-X,-X,X,X,X,X,X,X,-X,-X,\ldots\}$, with alternating blocks of $X$ and $-X$ each twice as long as all the preceding terms together, then its time average will come close to $X/2$ and $-(X/2)$ repeatedly. Therefore, we need the following.

Lemma 2.6

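Suppose that $\{X_j\}$ is a weakly stationary sequence. If $\{X_{j_i}\}$ is a subsequence and $\min_i\{|j_i-j_{i+1}|\}\ge J$, then for any $\epsilon>0$,

$$P\Big(\Big|\frac{1}{N}\sum_{i\le N}X_{j_i}-E[X_1]\Big|\ge\epsilon\Big)\le\frac{1}{\epsilon^2}\Big(\frac{\mathrm{Var}(X_1)}{N}+\sup_{i\ge J}\mathrm{Cov}(X_1,X_{1+i})\Big).$$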

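Proof.

$$E\Big[\Big|\frac{1}{N}\sum_{i\le N}X_{j_i}-E[X_1]\Big|^2\Big]\le\frac{1}{N^2}\sum_{i=1}^{N}\mathrm{Var}(X_{j_i})+\frac{2}{N^2}\sum_{i<k\le N}\mathrm{Cov}(X_{j_i},X_{j_k}).$$

Thus we get the result by simplification and the classical Chebyshev's inequality in probability theory.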
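The simplification step can be spelled out as follows. This is a sketch of one way to complete the argument: it assumes that the subsequence indices are increasing, so that any two of them differ by at least $J$, and that the supremum of the covariances is non-negative (which holds, for example, when the covariances tend to 0).

```latex
\begin{align*}
\frac{1}{N^2}\sum_{i=1}^{N}\mathrm{Var}(X_{j_i})
  &= \frac{\mathrm{Var}(X_1)}{N}
  && \text{(weak stationarity)},\\
\frac{2}{N^2}\sum_{i<k\le N}\mathrm{Cov}(X_{j_i},X_{j_k})
  &\le \frac{N(N-1)}{N^2}\,\sup_{i\ge J}\mathrm{Cov}(X_1,X_{1+i})
   \le \sup_{i\ge J}\mathrm{Cov}(X_1,X_{1+i}),\\
P\Big(\Big|\frac{1}{N}\sum_{i\le N}X_{j_i}-E[X_1]\Big|\ge\epsilon\Big)
  &\le \frac{1}{\epsilon^2}\,
     E\Big[\Big|\frac{1}{N}\sum_{i\le N}X_{j_i}-E[X_1]\Big|^2\Big]
  && \text{(Chebyshev)}.
\end{align*}
```

The second line uses that there are $N(N-1)/2$ pairs $i<k$ and that each covariance equals $\mathrm{Cov}(X_1,X_{1+(j_k-j_i)})$ with $j_k-j_i\ge J$.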

3. Price of option

The basic idea of Black–Scholes–Merton's theory (Black & Scholes, 1973; Merton, 1973) is that the option price of an asset depends only on the current price $S(t)$ of the asset, the maturity time, the strike price, the volatility and the risk-free interest rate. It is also known that the real traded prices of various options in the market are quite different from the theoretical ones, so the concepts of ‘implied volatility’ and ‘stochastic volatility’ were introduced (Canina & Figlewski, 1993; Christensen & Prabhala, 1998; Dumas, Fleming, & Whaley, 1998; Heston, 1993; Poon & Granger, 2003) as attempts to fill the gap between the theoretical prices and the real market ones. In the last two decades, many mathematicians and statisticians have introduced more sophisticated stochastic models in order to describe more accurately the movement of stock prices and option prices. Among them, the study of volatility has always attracted the most interest.

In the basic Black–Scholes model, the stock share price is assumed to be a geometric Brownian motion. That is,

$$S(t)=S(0)\exp\Big\{\sigma W(t)-\Big(\frac{\sigma^2}{2}-\mu\Big)t\Big\}.$$

The main mathematical tool in Black–Scholes's theory is the Cameron–Martin–Girsanov theorem (Karatzas & Shreve, 1987, p. 190), which states that up to a bounded time $T$, the induced probability measures corresponding to different $\mu$ are equivalent. Therefore one can choose the probability measure which makes the geometric Brownian motion a martingale ($\mu=0$) to get the call option price $c$ by taking the mathematical expectation of

$$(S(T)-K)^{+}=c+\int_0^T H(t)\,dS(t),$$

where $H(t)$ is the hedge, $K$ is the strike price and $T$ is the maturity time. Therefore the call option price is just the mathematical expectation of $(S(T)-K)^{+}$ with respect to the risk-neutral ($\mu=0$) probability measure. However, some limit properties cannot be kept when $T\to\infty$. For example, $\log S(T)/T\to-(\sigma^2/2-\mu)$, which depends on $\mu$. Therefore in the long run, some results deduced from the real market price might be different from those deduced from the risk-neutral one.

Let us fix a positive integer $T$. The payoff of $a_i$ shares of a European call option with maturity time $(i+1)T$ and strike price $K_i$ $(i=0,1,2,3,\ldots)$ is $a_iV_{i+1}=a_i(S((i+1)T)-K_i)^{+}$. Denote by $Q_i$ the option price per share paid at time $iT$; then a buyer's profit at time $(i+1)T$ will be $a_iC_{i+1}=a_iV_{i+1}-a_iQ_i=a_i(S((i+1)T)-K_i)^{+}-a_iQ_i$. If we can find a sequence $\{(K_i,a_i)\}_i$ such that $\{a_iV_{i+1}\}_i$ forms a semi-stationary sequence, then we may apply Theorem 2.5: we use its moving averages, computed from market data, to estimate the payoffs $\{a_iV_{i+1}\}_i$ and take these estimates as the prices $\{a_iQ_i\}$, so that the error terms $\{a_iC_i\}$ form a generalised mean reversion process.

In order to get such a stationary sequence, we may choose (at time $iT$) $K_i=kS(iT)$ and $a_i=1/S(iT)$, where $k$ is a prefixed positive percentage constant. Denote by $q_i$ the cost (at time $iT$) of purchasing $a_i$ shares of the option; then the terminal payoff and the buyer's profit will be

$$v_{i+1}=\Big(\frac{S((i+1)T)}{S(iT)}-k\Big)^{+}\quad\text{and}\quad c_{i+1}=\Big(\frac{S((i+1)T)}{S(iT)}-k\Big)^{+}-q_i,$$

which can also be regarded as the corresponding values expressed as percentages of the original asset price. In practice, in order to get the profit percentage continuously, we may fix a large constant $C$ and just trade $C/S(jT)$ shares of options with strike price $kS(jT)$ (both rounded to the appropriate number of digits). In the Black–Scholes model, $\{S((i+1)T)/S(iT)\}_i$ are just the exponential functions of the increments of Brownian motion, which are independent and identically distributed Gaussian random variables. So the price $q_i$ is the mean of $(S((i+1)T)/S(iT)-k)^{+}$ under the risk-neutral measure, which is a constant. Therefore $\{v_i\}_i$ and $\{c_i\}_i$ are both strongly stationary processes in Black–Scholes' theory. Furthermore, the error terms $\{c_i\}_i$ are independent identically distributed in Black–Scholes' formula, so their time average converges more slowly than that of a generalised mean reversion process.
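The constant percentage price $q_i$ in the Black–Scholes model can be sketched as follows. Under the risk-neutral measure with $\mu=0$ used in this paper (that is, no interest rate term), the ratio $S((i+1)T)/S(iT)$ is $\exp\{s Z-s^2/2\}$ with $s=\sigma\sqrt{T}$ and $Z$ standard Gaussian, and the mean of $(\cdot-k)^{+}$ has the familiar closed form $\Phi(d_1)-k\Phi(d_2)$. The function names, and the convention that `sigma` is the volatility per unit of the time scale in which `horizon` is measured, are assumptions of this sketch.

```python
import math
import random

def norm_cdf(x):
    """Standard Gaussian distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def percentage_call_price(k, sigma, horizon):
    """E[(R - k)^+] for R = exp(s*Z - s^2/2), s = sigma*sqrt(horizon), Z ~ N(0, 1)."""
    if k <= 0.0:
        return 1.0 - k  # the option is always exercised and E[R] = 1
    s = sigma * math.sqrt(horizon)
    d1 = (-math.log(k) + 0.5 * s * s) / s
    d2 = d1 - s
    return norm_cdf(d1) - k * norm_cdf(d2)

def monte_carlo_price(k, sigma, horizon, n_paths=200_000, seed=0):
    """Monte Carlo estimate of the same expectation, as a cross-check."""
    rng = random.Random(seed)
    s = sigma * math.sqrt(horizon)
    total = 0.0
    for _ in range(n_paths):
        ratio = math.exp(s * rng.gauss(0.0, 1.0) - 0.5 * s * s)
        total += max(ratio - k, 0.0)
    return total / n_paths

if __name__ == "__main__":
    for k in (0.98, 1.0, 1.02):
        print(k, percentage_call_price(k, 0.01, 7.0), monte_carlo_price(k, 0.01, 7.0))
```

The price depends only on $k$, $\sigma$ and $T$, not on $i$, which is the sense in which $q_i$ is a constant.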

By Lemma 2.6, if $\mathrm{Cov}(v_1,v_j)\to 0$ and $\{v_{j_k}\}_k$ is a subsequence such that $\min_k\{j_{k+1}-j_k\}$ is sufficiently large, then $\frac{1}{N}\sum_{i\le N}v_{j_i}$ will be close to $E[v_1]$. A special case is the logarithmic return of the Heston model (Heston, 1993), which is strongly stationary and ergodic, with $\mathrm{Cov}(c_1,c_N)\to 0$ (when $N\to\infty$).

In Black–Scholes–Merton's theory, the only undetermined factor is the volatility $\sigma$. In practice, investors quite often use the near-term sample variance to estimate the volatility at time $t$. That is,

$$\hat\sigma^2(t)=\frac{1}{n-1}\sum_{j=t-n}^{t-1}\Big(\big(\log S(j+1)-\log S(j)\big)-\frac{1}{n}\sum_{i=t-n}^{t-1}\big(\log S(i+1)-\log S(i)\big)\Big)^2 \tag{4}$$

for large $n$. However, if we use the above $\hat\sigma$ with large $n$ as a unified volatility in the Black–Scholes formula, the result is not good. So very often investors use a smaller $n$ to estimate the volatility of $S(t)$ near $t$, which is just Heston's price with $\rho=0$ (Heston, 1993; Poon & Granger, 2003).
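The estimator (4) is the usual sample variance of the last $n$ logarithmic returns. A minimal sketch follows, assuming that `prices[j]` holds $S(j)$ on a unit time grid; the function name is hypothetical.

```python
import math

def moving_volatility_squared(prices, t, n):
    """Sample variance of the n log returns log S(j+1) - log S(j), j = t-n..t-1."""
    if n < 2 or t - n < 0 or t >= len(prices):
        raise ValueError("need n >= 2, t - n >= 0 and t < len(prices)")
    returns = [math.log(prices[j + 1] / prices[j]) for j in range(t - n, t)]
    mean = sum(returns) / n
    return sum((r - mean) ** 2 for r in returns) / (n - 1)

if __name__ == "__main__":
    prices = [100.0, 101.0, 100.5, 102.0, 101.2, 103.0]
    print(moving_volatility_squared(prices, t=5, n=5))
```

A small `n` makes the estimate follow recent market conditions (the 'moving volatility'), while a large `n` gives the 'unified volatility' discussed above.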

From the above discussion, we have four different candidates for option prices: (1) the percentage prices $\{q_i\}$ are the moving averages $\{M(iT)\}$; (2) $\{q_i\}$ are the historical call option prices in the real market; (3) $\{q_i\}$ are Black–Scholes' prices with unified volatility $\hat\sigma$; (4) $\{q_i\}$ are Black–Scholes' prices with moving volatility $\hat\sigma(t)$. We will use market data to show that the cumulated error is stationary in Case (1). We will also illustrate the cumulated errors in the other three cases for comparison.

4. Statistics with real market data

The strong stationarity of the logarithmic return is implied by the most popular mathematical models for asset prices. If we choose the trading volume inversely proportional to the current price, then the consecutive payoffs will be strongly stationary, as discussed in the previous section. Thus we can use the moving averages $\{M(iT)\}$ to estimate this sequence, and the errors will be mean reverting according to Theorem 2.5. In this section, we use the data of the ETFs DIA, QQQ and SPY, which track the main stock indices in the US market, to illustrate the cumulated buyer's profit percentage, which is also the cumulated percentage error, and compare it with that of Black–Scholes.

We consider the buyer's successive profits $c_{i+1}=\big(\frac{S((i+1)T)}{S(iT)}-k\big)^{+}-q_i$ for the prices $\{q_i\}$ defined in the four cases listed at the end of the previous section, which are also the errors of the corresponding option pricing.
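For Case (1), the profit sequence can be computed directly from a price series sampled at times $0, T, 2T, \ldots$. The sketch below assumes that $M(iT)$ is the average of the $m$ most recent realised percentage payoffs, in line with the moving average (2); the function names and the sample prices are hypothetical.

```python
def percentage_payoffs(prices, k):
    """v_{i+1} = (S((i+1)T)/S(iT) - k)^+ for prices sampled at 0, T, 2T, ..."""
    return [max(nxt / cur - k, 0.0) for cur, nxt in zip(prices, prices[1:])]

def moving_average_profits(prices, k, m):
    """c_{i+1} = v_{i+1} - q_i, with q_i the mean of the m latest realised payoffs."""
    v = percentage_payoffs(prices, k)  # v[j] holds v_{j+1}
    profits = []
    for j in range(m, len(v)):
        q = sum(v[j - m:j]) / m        # uses only payoffs known when trading
        profits.append(v[j] - q)
    return profits

def cumulated(profits):
    """Running sum of the profits, i.e. the cumulated error of the MA price."""
    out, total = [], 0.0
    for c in profits:
        total += c
        out.append(total)
    return out

if __name__ == "__main__":
    weekly_close = [100.0, 102.0, 101.0, 103.5, 102.0, 104.0, 105.5, 104.0, 106.0]
    print(cumulated(moving_average_profits(weekly_close, k=0.98, m=6)))
```

The first $m$ payoffs are used only to initialise the moving average, so the profit sequence is $m$ terms shorter than the payoff sequence.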

The first thing is to test the stationarity. We have checked by the ADF test with the Matlab software package, and the hypothesis that ‘$\{c_{i+1}\}$ has a unit root’ is rejected with 95% confidence for the trading data of SPY (the ETF tracking the S&P500), QQQ (the ETF tracking the NASDAQ) and DIA (the ETF tracking the DOW) under various $k$ and $T$. In other words, $\{c_{i+1}\}$ can be statistically considered as a stationary process. For example, when we take $S(t)$ as the price of SPY and take k = 0.98 and T = 7, Figure 1 shows the successive weekly trading profit sequence $\{c_{i+1}\}$ when $q_i=M(iT)$ (m = 6) from 6th January 1995 through the end of July 2015.

Figure 1. The successive profit sequence {X(j)} from 6th January 1995.

We also found that if $q_i$ is the percentage price of weekly expired options in the real market (with $kS(jT)$ rounded to the nearest strike price in the market), $\{c_{i+1}\}$ still passes the ADF test for SPY, QQQ and DIA data with various $k$ and $T$.

The parameter $k$ may assume any value. For simplicity, we just list the results for k = 0.98, 1 and 1.02 here. We discuss two time scales for $T$: (a) weekly expired options; (b) monthly expired options. For weekly expired options, we trade call options between the closing times of neighbouring Fridays. Thus if we purchase $1/S(jT)$ $(j=1,2,\ldots,n)$ units of the call option at time $jT$ with maturity time $(j+1)T$ and strike price $K=kS(jT)$, the profit by time $(j+1)T$ will be $c_{j+1}$, and the cumulated profit by time $(j+1)T$ is $\sum_{i=1}^{j}c_{i+1}$. In that case, we can take T = 7 calendar days except for very few holidays. Similarly, when we consider the monthly expired options, we trade the options on the 3rd Friday of neighbouring months. In that case, actually T = 30 or 31 (except for February). We compare the cumulated profits of the following four prices:

  1. $\{q_i\}$ are the moving averages $\{M(iT)\}$;

  2. $\{q_i\}$ are the historical call option prices in the real market;

  3. $\{q_i\}$ are Black–Scholes' prices with unified volatility $\hat\sigma$;

  4. $\{q_i\}$ are Black–Scholes' prices with moving volatility $\{\hat\sigma(iT)\}$.

We should mention some technical points here: (a) when there was no trade at the closing time, we use the mean of the bid and ask prices as our last price; (b) the strike prices $\{kS(jT)\}$ are rounded to the nearest available strike prices in the market; (c) all our data are percentage priced; (d) the parameter $n$ in (4) is chosen according to the usual method applied in the real market (Poon & Granger, 2003), and the selection of $m$ in (2) is less crucial. Actually, we found that there is not much difference when $m$ is chosen between 2 and 12.

4.1. SPY analysis

Figure 2 shows the daily price of SPY from 3rd January 1995 through 2nd July 2015.

Figure 2. The daily price of SPY from 3rd January 1995 through 2nd July 2015.

We have SPY trading data of monthly expired options from 21st January 2005 through 20th March 2015, and its trading data of weekly expired options from 1st July 2011 through 27th March 2015. Thus we can show our cumulative profits of corresponding options in those two periods respectively.

4.1.1. SPY results under different k values

Figures 3–5 show the cumulated profits of both monthly and weekly expired options of SPY under different k values.

Figure 3. The cumulated profits of SPY options when k = 0.98: (a) monthly expired options and (b) weekly expired options.

Figure 4. The cumulated profits of SPY options when k = 1: (a) monthly expired options and (b) weekly expired options.

Figure 5. The cumulated profits of SPY options when k = 1.02: (a) monthly expired options and (b) weekly expired options.

4.2. QQQ analysis

Figure 6 shows the daily price of QQQ from 10th March 1999 through 2nd July 2015.

Figure 6. The daily price of QQQ from 10th March 1999 through 2nd July 2015.

We have QQQ trading data of monthly expired options from 15th February 2002 through 17th April 2015, and trading data of weekly expired options from 1st July 2011 through 27th March 2015.

4.2.1. QQQ results under different k values

Figures 7–9 show the cumulated profits of both monthly and weekly expired options of QQQ under different k values.

Figure 7. The cumulated profits of QQQ options when k = 0.98: (a) monthly expired options and (b) weekly expired options.

Figure 8. The cumulated profits of QQQ options when k = 1: (a) monthly expired options and (b) weekly expired options.

Figure 9. The cumulated profits of QQQ options when k = 1.02: (a) monthly expired options and (b) weekly expired options.

4.3. DIA analysis

Figure 10 shows the daily price of DIA from 20th January 1998 through 2nd July 2015.

Figure 10. The daily price of DIA from 20th January 1998 through 2nd July 2015.

We have DIA trading data of monthly expired options from 21st June 2002 through 17th April 2015, and trading data of weekly expired options from 10th August 2012 through 27th March 2015.

4.3.1. DIA results under different k values

Figures 11–13 show the cumulated profits of both monthly and weekly expired options of DIA under different k values.

Figure 11. The cumulated profits of DIA options when k = 0.98: (a) monthly expired options and (b) weekly expired options.

Figure 12. The cumulated profits of DIA options when k = 1: (a) monthly expired options and (b) weekly expired options.

Figure 13. The cumulated profits of DIA options when k = 1.02: (a) monthly expired options and (b) weekly expired options.

5. Possibility of statistical arbitrage

From the above data analysis, we can easily see that the errors of our moving average pricing of options are mean reverting and the cumulated error is stationary. When the strike price is sufficiently low, the asset price will always be above the strike price and the call option will always be exercised, so that a buyer at the Black–Scholes option price can always obtain statistical arbitrage in the long run. Let us take T = 7. If an investor continuously purchases at time $iT$ a call option for $1/S(iT)$ share expiring at $(i+1)T$ at the Black–Scholes price, then Figure 14 shows his cumulated profit over the last 22 years (01/29/1993–07/02/2015) when k = 0 and k = 0.9.

Figure 14. Cumulated gain against Black–Scholes price: (a) when k = 0 and (b) when k = 0.9.

However, there are still many problems left. For example, what happens if only some of the profits $\{c_{i+1}\}$ occur? As shown in Section 2, the partial sums of a strongly stationary sequence may fail to have a convergent mean if its terms are correlated. However, we can apply Lemma 2.6 to construct statistical arbitrages if the covariance function of $\{c_{i+1}\}$ tends to 0 quickly enough.

We show here another, more practical, example of obtaining statistical arbitrage from Black–Scholes' call option. The following graph shows that when $q_i=M(iT)$, m = 6 and T = 7, the sample autocorrelation function of $c_{i+1}$ tends to 0 (Figure 15).

Figure 15. The autocorrelation function.

Thus if $\{c_{i_k}\}$ is a subsequence such that $\min_k\{i_{k+1}-i_k\}$ is sufficiently large, we can apply Lemma 2.6 to get their mean sufficiently close to 0. Denote by $Q(iT)$ the Black–Scholes price. Our strategy is: (a) when $M(iT)-Q(iT)>0.01$ and the previous trade was made at least 9 weeks ago, buy $1/S(iT)$ of the call option at $Q(iT)$; (b) when $Q(iT)-M(iT)>0.0005$ and the previous trade was made at least 9 weeks ago, sell $1/S(iT)$ of the call option at $Q(iT)$. Figure 16 shows our cumulated profit and Figure 17 gives our profit against the Black–Scholes price with moving volatility. Thus we can get statistical arbitrage from the mean, which is similar to the case of high-frequency trading (see Wang & Zheng, 2014).
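The trading rule (a)–(b) can be sketched as a short loop. In this sketch the inputs are plain lists indexed by week: `ma_prices[i]` stands for $M(iT)$, `bs_prices[i]` for $Q(iT)$ and `payoffs[i]` for the percentage payoff $v_{i+1}$ of the option traded in week $i$. The function name, argument names and sample numbers are hypothetical; the thresholds and the 9-week spacing are those stated above.

```python
def threshold_strategy(ma_prices, bs_prices, payoffs,
                       buy_gap=0.01, sell_gap=0.0005, min_spacing=9):
    """Cumulated profit of the spaced buy/sell rule, one entry per week."""
    cumulated, total, last_trade = [], 0.0, None
    for i, (m_i, q_i, v_next) in enumerate(zip(ma_prices, bs_prices, payoffs)):
        # A trade is allowed only if the previous one is at least min_spacing weeks old.
        spaced = last_trade is None or i - last_trade >= min_spacing
        if spaced and m_i - q_i > buy_gap:
            total += v_next - q_i      # buy: pay Q(iT), receive the payoff
            last_trade = i
        elif spaced and q_i - m_i > sell_gap:
            total += q_i - v_next      # sell: receive Q(iT), pay the payoff
            last_trade = i
        cumulated.append(total)
    return cumulated

if __name__ == "__main__":
    weeks = 30
    print(threshold_strategy([0.05] * weeks, [0.03] * weeks, [0.04] * weeks)[-1])
```

Spacing the trades keeps the traded profits far apart in time, which is the situation covered by Lemma 2.6.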

Figure 16. Cumulated gain against Black–Scholes price.

Figure 17. Cumulated gain against Black–Scholes price with stochastic volatility.

Disclosure statement

No potential conflict of interest was reported by the authors.

Additional information

Notes on contributors

Si Bao

Si Bao now works at Xiangcai security Co., LTD. She studied for her Ph.D. in the School of Statistics at ECNU.

Shi Chen

Shi Chen is a data scientist at PayPal Holding Inc. He received his Ph.D. in statistics from ECNU in 2017.

Wei An Zheng

Wei An Zheng is Professor of ECNU and Professor Emeritus of University of California, Irvine, USA.

Yu Zhou

Yu Zhou works at Guotai Junan Securities. He received his Ph.D. in statistics from ECNU in 2016.

References

  • Ansley, C. F., Spivey, W. A., & Wrobleski, W. J. (1977). On the structure of moving average processes. Journal of Econometrics, 6, 121–134. doi: 10.1016/0304-4076(77)90058-6
  • Black, F., & Scholes, M. (1973). The pricing of options and corporate liabilities. Journal of Political Economy, 81(3), 637–654. doi: 10.1086/260062
  • Canina, L., & Figlewski, S. (1993). The informational content of implied volatility. The Review of Financial Studies, 6(3), 659–681. doi: 10.1093/rfs/5.3.659
  • Christensen, B. J., & Prabhala, N. R. (1998). The relation between implied and realized volatility. Journal of Financial Economics, 50(2), 125–150. doi: 10.1016/S0304-405X(98)00034-8
  • Dumas, B., Fleming, J., & Whaley, R. E. (1998). Implied volatility functions: Empirical tests. The Journal of Finance, 53(6), 2059–2106. doi: 10.1111/0022-1082.00083
  • Fama, E. F., & French, K. R. (1988). Dividend yields and expected stock returns. Journal of Financial Economics, 22, 3–25. doi: 10.1016/0304-405X(88)90020-7
  • Heston, S. L. (1993). A closed form solution for options with stochastic volatility with applications to bond and currency option. The Review of Financial Studies, 6(2), 327–343. doi: 10.1093/rfs/6.2.327
  • Hogan, S., Jarrow, R., & Warachka, M. (2002). Statistical arbitrage and tests of market efficiency. Singapore: Singapore Management University Pre-Prints.
  • Karatzas, I., & Shreve, S. E. (1987). Brownian motion and stochastic calculus. Berlin, Germany: Springer-Verlag.
  • Lo, A. (2010). Hedge funds: An analytic perspective (Revised and Expanded ed, pp. 260). New Jersey, USA: Princeton University Press.
  • Loeve, M. (1977). Probability theory II. Berlin, Germany: Springer-Verlag.
  • Merton, R. C. (1973). A rational theory of option pricing. The Bell Journal of Economics and Management Science, 4(1), 141–183. doi: 10.2307/3003143
  • Mukherji, S. (2011). Are stock returns still mean-reverting? Review of Financial Economics, 20, 22–27. doi: 10.1016/j.rfe.2010.08.001
  • Pole, A. (2007). Statistical arbitrage. Hoboken, New Jersey, USA: John Wiley & Sons Inc.
  • Poon, S. H., & Granger, C. W. J. (2003). Forecasting volatility in financial markets: a review. Journal of Economic Literature, XLI, 478–539. doi: 10.1257/jel.41.2.478
  • Poterba, J. M., & Summers, L. H. (1988). Mean reversion in stock prices. Journal of Financial Economics, 22, 27–59. doi: 10.1016/0304-405X(88)90021-9
  • Wang, Z. D., & Zheng, W. A. (2014). High frequency trading and probability theory. Singapore: World Scientific.
