Original Articles

Stochastic volatility and the goodness-of-fit of the Heston model

Pages 199-211 | Received 19 Dec 2003, Published online: 18 Feb 2007
 

Abstract

Recently, Drăgulescu and Yakovenko proposed an analytical formula for computing the probability density function of stock log returns, based on the Heston model, which they tested empirically. Their research design inadvertently favourably biased the fit of the data to the Heston model, thus overstating their empirical results. Furthermore, Drăgulescu and Yakovenko did not perform any goodness-of-fit statistical tests. This study employs a research design that facilitates statistical tests of the goodness-of-fit of the Heston model to empirical returns. Robustness checks are also performed. In brief, the Heston model outperformed the Gaussian model only at high frequencies and even so does not provide a statistically acceptable fit to the data. The Gaussian model performed (marginally) better at medium and low frequencies, at which points the extra parameters of the Heston model have adverse impacts on the test statistics.

Acknowledgements

We would like to thank Bin Yang for assisting in the data analysis. We are also grateful to the conference participants at the 9th Annual Meeting of the Society of Computational Economics in July 2003 for their helpful comments on an earlier draft of part of this paper. We are very grateful to two anonymous referees of this journal for their invaluable comments on an earlier draft of this paper. The usual disclaimer applies.

Notes

The normality assumption relies on the theory of random walks, beginning with Bachelier (Citation1900). Since the 1960s, many empirical studies have sought to describe the empirical distribution of risky asset returns (see Mandelbrot Citation1963, Fama Citation1965, Bookstaber and McDonald Citation1987).

An α in the interval 0 < α < 2 yields a stable Paretian distribution whose tails are much heavier than those of the normal distribution. With α = 2 the distribution is normal, while α = 1 gives a Cauchy distribution. Beyond those special cases the probability density function (PDF) is generally not known in closed form.
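As a rough illustration of how much heavier the tails become when α falls from 2 to 1, the sketch below compares two-sided tail probabilities for a standard normal and a standard Cauchy variable. Both are used in their textbook standardised forms, which is an assumption made here for simplicity; the function names are illustrative only.

```python
import math

def normal_tail(x: float) -> float:
    """P(|Z| > x) for a standard normal variable (the alpha = 2 case)."""
    return math.erfc(x / math.sqrt(2.0))

def cauchy_tail(x: float) -> float:
    """P(|C| > x) for a standard Cauchy variable (the alpha = 1 case)."""
    return 1.0 - (2.0 / math.pi) * math.atan(x)

# Tail mass beyond a few thresholds, measured in scale units
for x in (1.0, 3.0, 5.0):
    print(f"x = {x}: normal {normal_tail(x):.2e}, Cauchy {cauchy_tail(x):.2e}")
```

The Cauchy tail decays only polynomially, so the gap between the two columns widens rapidly as the threshold grows.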

The early empirical studies (see Mandelbrot Citation1963, Fama Citation1965) focused on the parametric unconditional density of asset returns. Since then, several other statistical models that assume unconditional (e.g. Theodossiou Citation1998) or conditional (e.g. Engle Citation1982) statistical properties for asset price behaviour and returns have been put forward.

It is expected that empirical models that accommodate both skewness and kurtosis will account more fully for the riskiness of returns, even if kurtosis is more pronounced than skewness in asset returns. Skewness is also an important variable in portfolio selection decisions (see Chunhachinda et al. Citation1997).

This is a valid point made by an anonymous referee of this journal.

Bakshi et al. (Citation1997) provide a useful set of citations for some of the more prominent deterministic and stochastic volatility models in this area. Stochastic volatility models are considered more appropriate for empirical examination since they can more readily accommodate certain risk factors, e.g. jumps (see also Fiorentini et al. Citation2002).

The actual data series begins on 4 January 1982 as the first three days of January 1982 were all non-trading days. This would also apply to D-Y's data set, although they did not explicitly state this.

D-Y say they had 5049 data points. After accounting for bank holidays between the dates they gave, there are actually 5050 prices, giving a maximum of 5049 returns.
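The count of 5050 prices against 5049 returns follows directly from how log returns are formed. A minimal sketch, using a synthetic price series rather than the DJIA data, and assuming that the "paths" at a frequency of t days are the t non-overlapping series obtained by starting at offsets 0, 1, ..., t - 1 (the function and variable names are hypothetical):

```python
import math

def log_returns(prices, lag=1, offset=0):
    """Non-overlapping log returns ln(P[i + lag] / P[i]), starting at index `offset`."""
    sampled = prices[offset::lag]
    return [math.log(b / a) for a, b in zip(sampled, sampled[1:])]

# Synthetic prices (NOT the DJIA series): 5050 points growing by 0.1% per step
prices = [100.0 * math.exp(0.001 * i) for i in range(5050)]

daily = log_returns(prices)  # lag 1: n prices give n - 1 returns
weekly_paths = [log_returns(prices, lag=5, offset=k) for k in range(5)]

print(len(daily), [len(p) for p in weekly_paths])
```

Each longer lag yields fewer observations per path, which is why sample sizes shrink quickly at low frequencies.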

D-Y did not indicate that their data was trimmed. However, we were alerted to this fact on observing strange points in our results using their research design. They kindly provided us with the numerical boundaries that they had used for trimming their data.

Those empirical results are reported in Daniel et al. (Citation2003) together with the replicated plots for D-Y's results.

While the exclusion of observations outside certain ranges is not uncommon in empirical work (see, e.g. Theodossiou Citation1998, p. 1658), the impact of the trimming undertaken by D-Y, which seems extensive, can be seen dramatically by comparing the level of kurtosis for the trimmed and untrimmed DJIA8201 dataset. The (excess) kurtosis values for k_trimmed do not appear to be statistically significant under the normal distribution. The corresponding values for k_untrimmed are statistically significant for all frequencies except t = 250 days.
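The sensitivity of kurtosis to trimming can be seen in a toy example. The sketch below uses made-up returns and hypothetical trimming bounds (not D-Y's boundaries, which are not reproduced here) and the usual moment-based estimator of excess kurtosis.

```python
def excess_kurtosis(xs):
    """Moment-based excess kurtosis: m4 / m2**2 - 3 (zero for a normal distribution)."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m4 = sum((x - mean) ** 4 for x in xs) / n
    return m4 / m2 ** 2 - 3.0

def trim(xs, lower, upper):
    """Keep only observations inside [lower, upper]."""
    return [x for x in xs if lower <= x <= upper]

# Illustrative returns with one crash-like observation (not real data)
returns = [-0.02, -0.01, -0.01, 0.0, 0.0, 0.0, 0.01, 0.01, 0.02, -0.25]

k_untrimmed = excess_kurtosis(returns)
k_trimmed = excess_kurtosis(trim(returns, -0.05, 0.05))  # hypothetical bounds
print(k_untrimmed, k_trimmed)
```

Removing a single extreme observation is enough to turn clearly positive excess kurtosis into a negative value here, which is the mechanism by which trimming can flatter a fit.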

Theoretically, a neural network can closely approximate any unknown continuous function to any desired degree of accuracy (see Conti et al. Citation1994). So we would expect the unspecified model generated by the neural network to exhibit the best overall fit and as such it would serve as a useful benchmark for comparison with the other models. The chosen neural network structure was a feed-forward back-propagation (B-P) network, with a five node hidden layer and a single node output layer. The transfer functions are respectively tansig and purelin, where $\mathrm{tansig}(n) = 2/(1 + e^{-2n}) - 1$ and $\mathrm{purelin}(n) = n$. The B-P function used is trainscg where the weight and bias values are updated according to Levenberg–Marquardt optimization. This optimization method minimizes a combination of squared errors and weights and then determines the correct combination so as to produce a network which generalizes well.
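The network structure described in this note can be sketched as a single forward pass. The code assumes that tansig and purelin carry their usual MATLAB toolbox definitions, tansig(n) = 2/(1 + exp(-2n)) - 1 (equivalent to tanh) and purelin(n) = n. The weights below are hypothetical and untrained, and no training routine is shown.

```python
import math

def tansig(n):
    """Hyperbolic tangent sigmoid; numerically equal to tanh(n)."""
    return 2.0 / (1.0 + math.exp(-2.0 * n)) - 1.0

def purelin(n):
    """Linear (identity) transfer function."""
    return n

def forward(x, w_hidden, b_hidden, w_out, b_out):
    """One scalar input -> five tansig hidden nodes -> one purelin output node."""
    hidden = [tansig(w * x + b) for w, b in zip(w_hidden, b_hidden)]
    return purelin(sum(w * h for w, h in zip(w_out, hidden)) + b_out)

# Hypothetical, untrained parameters for the five hidden nodes
w_hidden = [-2.0, -1.0, 0.0, 1.0, 2.0]
b_hidden = [0.5, 0.0, 0.0, 0.0, -0.5]
w_out = [0.1, 0.2, 0.3, 0.2, 0.1]
b_out = 0.05

y = forward(0.0, w_hidden, b_hidden, w_out, b_out)
print(y)
```

In a fitting exercise the input would be a return level and the output the estimated density at that level, with the parameters chosen by the training algorithm.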

This finding has been noted elsewhere, albeit using alternative sampling procedures (see, e.g. Fama Citation1965). Variation in the extent of kurtosis suggests that the system is not ergodic. This finding has important implications for the constancy of the parameter estimates and the experimental design. If the system is ergodic we would expect the shape parameters to be almost constant from one path to the next, within the same frequency. So our test for ergodicity for all t paths at each t seeks to test for such variation. In general, both the mean and standard deviation of the returns increase as the frequency decreases, albeit not proportionally. The

appears to stabilize beyond t > 80 days. This might be because the distribution moves towards the normal distribution beyond this point, whereas for (say) t < 80 days, kurtosis will be less well defined if the distribution is non-normal.

This finding might be thought to be due to seasonal effects in the index returns, as depicted in the finance literature (see, e.g. Keim Citation1989). We believe this to be unlikely in our case. Consider a path beginning on a Monday. A frequency of 5 days will shift to Tuesday after the first public holiday, to Wednesday after the second holiday and so on, returning to Monday after 5 public holidays. There are about 10 public holidays in each year under consideration, so in each year any one path is forwarded about 10 times. So a path for t = 5 will have data from all the days of the week within about 6 months, although not in equal proportions. Thus, while we cannot rule out a seasonal day effect in our paths covering 18 or 20 years, it is not likely to be present.
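The weekday drift argument can be checked with a small calendar simulation. This is a sketch only: it starts on Monday 4 January 1982, treats Saturdays and Sundays as non-trading days, and inserts one hypothetical mid-week closure to show the shift.

```python
from datetime import date, timedelta

NAMES = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]

def trading_days(start, n_calendar_days, holidays):
    """Weekdays in the window that are not listed as holidays."""
    days = []
    for i in range(n_calendar_days):
        d = start + timedelta(days=i)
        if d.weekday() < 5 and d not in holidays:
            days.append(d)
    return days

def path_weekdays(days, lag=5, offset=0):
    """Weekday names of every `lag`-th trading day along one path."""
    return [NAMES[d.weekday()] for d in days[offset::lag]]

start = date(1982, 1, 4)  # a Monday
no_holidays = path_weekdays(trading_days(start, 28, set()))
# Hypothetical closure on Wednesday 13 January 1982, for illustration only
one_holiday = path_weekdays(trading_days(start, 28, {date(1982, 1, 13)}))

print(no_holidays)
print(one_holiday)
```

Without closures the path stays on Mondays; after a single closure it moves to Tuesdays, in line with the reasoning in this note.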

These standard statistical tests typically reject the normality hypothesis for high frequency data. The (asymptotic) Jarque–Bera statistic tests a composite normality hypothesis, which means that it assumes that the unknown parameters for computing the test statistic can be estimated from the data. The Lilliefors statistic also tests a composite hypothesis but, unlike the Jarque–Bera statistic, emphasizes the maximum departure of the empirical distribution from the normal distribution. When the sample size is large almost any goodness-of-fit test would reject normality. However, the magnitude of the test statistics shown here does not suggest that approximate normality is a reasonable conclusion. Since our goodness-of-fit tests are performed on untrimmed data sets, we do not expect as good an empirical fit as D-Y have claimed. Indeed, in an application of both the Jarque–Bera and Lilliefors tests, the trimmed data exhibited a much better fit than the untrimmed data.
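For readers unfamiliar with the two statistics, the following sketch computes them from first principles on synthetic samples. It uses the standard textbook forms: JB = n/6 (S² + K²/4), with S the sample skewness and K the excess kurtosis, and the Lilliefors distance as the largest gap between the empirical distribution function and a normal distribution with estimated mean and standard deviation. Critical values for the Lilliefors test are not computed, and the data are artificial.

```python
import math
from statistics import NormalDist

def jarque_bera(xs):
    """JB = n/6 * (S^2 + K^2/4); asymptotically chi-square with 2 degrees of freedom."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m3 = sum((x - mean) ** 3 for x in xs) / n
    m4 = sum((x - mean) ** 4 for x in xs) / n
    skew = m3 / m2 ** 1.5
    excess_kurt = m4 / m2 ** 2 - 3.0
    return n / 6.0 * (skew ** 2 + excess_kurt ** 2 / 4.0)

def lilliefors_distance(xs):
    """Maximum gap between the empirical CDF and a normal CDF with estimated parameters."""
    n = len(xs)
    mean = sum(xs) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in xs) / (n - 1))
    d = 0.0
    for i, x in enumerate(sorted(xs)):
        cdf = 0.5 * math.erfc(-(x - mean) / (sd * math.sqrt(2.0)))
        d = max(d, (i + 1) / n - cdf, cdf - i / n)
    return d

# Synthetic samples: evenly spaced normal quantiles, then the same with two extreme values
n = 200
near_normal = [NormalDist().inv_cdf((i + 0.5) / n) for i in range(n)]
heavy_tailed = near_normal + [-8.0, 8.0]

jb_normal, jb_heavy = jarque_bera(near_normal), jarque_bera(heavy_tailed)
d_normal, d_heavy = lilliefors_distance(near_normal), lilliefors_distance(heavy_tailed)
print(jb_normal, jb_heavy, d_normal, d_heavy)
```

The 5% critical value of the chi-square distribution with two degrees of freedom is about 5.99, so the first sample passes the Jarque–Bera test while the second, with only two extreme observations, fails it by a wide margin.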

The data for different frequencies have been shifted up by a multiple of 10 in order to separate them out in the figure.

We also fitted the nnPDF to the empPDF. Those plots are not shown due to the need to retain clarity in both the plots and the focus of the paper. These results can be obtained from the authors. The neural network provides the best overall fit, particularly at high frequencies. The overall performance of the neural network in terms of statistical goodness-of-fit is discussed in subsequent subsections.
