Full article: Representations for conditional expectations and applications to pricing and hedging of financial products in Lévy and jump-diffusion setting

Formulae display: $MathJax Logo$ ?Mathematical formulae have been encoded as MathML and are displayed in this HTML version using MathJax in order to improve their display. Uncheck the box to turn MathJax off. This feature requires Javascript. Click on a formula to zoom.

Abstract

In this article, we derive expressions for conditional expectations in terms of regular expectations without conditioning but involving some weights. For this purpose, we apply two approaches: the conditional density method and the Malliavin method. We use these expressions for the numerical estimation of the price of American options and their deltas in a Lévy and jump-diffusion setting. Several examples of applications to financial and energy markets are given including numerical examples.

Keywords:

MATHEMATICS SUBJECT CLASSIFICATION:

1. Introduction

Popular models for the fluctuation of the underlying process in finance and energy markets are determined by Brownian motions and are therefore continuous in the sense that they do not generate jumps in the sample paths. Their popularity follows from the fact that they are easy to handle. However, as it turns out from observed data, the absence of jumps is not realistic. Therefore, the use and study of discontinuous models have seen an important boost in the last decades.

In this article, we consider the problem of computing conditional expectations of functionals of Lévy processes and jump diffusions which are involved in pricing and hedging procedures as, e.g. of American options, and which usually require numerical evaluation techniques. In general, it is not possible to obtain analytical expressions for conditional expectations and thus numerical methods are called for. Fournié et al. [Citation1] derive expressions for the conditional expectations in terms of regular expectations for models which have a density and for diffusion models using, respectively, the density method and the Malliavin calculus. In this article, we generalize these two approaches to obtain representations for conditional expectations and their derivatives (with respect to the underlying) in a jump-diffusion setting. The representations we derive are expressed in terms of regular expectations without conditioning but involving a Heaviside step function and some weights. We apply the developed theory to the numerical estimation of American option prices and their deltas.

There is a wide literature on the use of Malliavin calculus in finance, see, e.g. Montero and Kohatsu-Higa [Citation2]. For an overview, we refer to Nualart [Citation3] for continuous processes and Di Nunnio et al. [Citation4] for jump processes. As for the numerical computation of American options, several approaches appeared in this field. von Sydow et al. [Citation5] compared some approaches grouped as Monte Carlo methods, Fourier methods, finite difference methods, and radial basis function methods but only for continuous underlying processes when pricing American options. Broadie and Glasserman [Citation6] built up a tree in order to obtain a discretisation of the underlying diffusion on a grid (see also Barranquand and Martineau [Citation7]). Longstaff and Schwartz [Citation8] use a regression method on a truncated basis of L² and then choose a basis of polynomials for the numerical estimation of conditional expectations. Shu et al. [Citation9] and Kovalov et al. [Citation10] use a partial differential equations approach. Zhang and Oosterlee [Citation11] use a Fourier method based on Fourier cosine-series expansions. We follow an approach similar to the one by Mrad et al. [Citation12] and Bally and Pagès [Citation13] based on a Monte Carlo method.

Considering a random variable F, a scalar random variable G, and a function f on $R$ , Fournié et al. [Citation1] provide the following representation for the conditional expectation $E [f (F) | G = 0] = \frac{E [f (F) H (G) π]}{E [H (G) π]},$ where π is a random variable called weight and H is the Heaviside step function increased with some constant, $H (x) = 1_{{x \geq 0}} + c, c \in R$ . The authors use two approaches: the density method and the Malliavin method (MM). The density method requires that the couple (F, G) has a density p(x, y), $(x \in R, y \in R)$ such that its log is C¹ in the second argument. In the Malliavin approach, they use a Malliavin derivative of the Wiener process and provide expressions for conditional expectations, where F and G are modeled by continuous diffusions. One of the goals in the present paper is to relax the conditions imposed on the random variables F and G and in particular to allow for random variables which do not necessarily have a known density and which might originate from processes with jumps.

We recall that the density method introduced in [Citation1] requires the knowledge of the density of (F, G). However when F and G are random variables generated from jump processes, the density of the couple (F, G) is in general not known or very hard to compute. To overcome this shortcoming, we use the conditional density method introduced by Benth et al. [Citation14]. For example, in the case of a geometric Lévy process, we only need the knowledge of the joint density of the continuous parts, which we do know. Thus, to apply the conditional density method a separability assumption on the random variables F and G will be required. F and G should consist of a part with known density. The density of the other part is not necessarily known.

For the Malliavin method, we work with the Malliavin derivative for jump processes developed by Petrou [Citation15]. The idea is to use the Malliavin derivative in the direction of the Wiener term in the jump-diffusion process. Using this approach, there is no separability assumption imposed, since the Malliavin calculus as presented in [Citation15] does not require any, as opposed to the Malliavin calculus used in Davis and Johansson [Citation16] or in Benth et al. [Citation17]. We use this approach to compute conditional expectations in our setting as was developed in [Citation1] in a continuous set up.

Furthermore, we provide expressions for the derivative of conditional expectations using both approaches and we illustrate our results with several examples of models which are commonly used in financial and energy markets. Notice that we present our study in the one-dimensional case for the ease of notation, although all results can be extended to a setting in higher dimensions.

The representations that we develop are interesting from a probabilistic point of view. Indeed we derive expressions for the conditional expectations of functionals of random variables involving only unconditional expectations of these functionals. Moreover, these representations are interesting from a numerical point of view. In this article, we apply them to the numerical estimation of American option prices and their deltas, the delta being the sensitivity of the option price with respect to the state of the underlying asset.

To perform the numerical experiments, American options are approximated, through a time discretisation, by Bermudan options as in Bally and Pagès [Citation13] or Abbas-Turki and Lapeyre [Citation18]. We make use of a localization technique and a control variate to minimize the variance, see, e.g. Glasserman [Citation19]. To reduce the memory capacity of the algorithm for the estimation of the American option price and the delta, we suggest to simulate the underlying stock price process backwards in time. This backward simulation technique turns out to be a specific application of Lévy bridges, see Baldeaux [Citation20].

To check the accuracy of the proposed algorithms, we first compute European option prices and their deltas at time t > 0 where we assume a Merton model for the price process. We compare the values obtained by our algorithm to the analytical solutions proposed by Merton [Citation21]. Then considering the same model we estimate the prices and the deltas of American options, which we in turn compare to estimates found in the literature. In addition we compute for the prices a confidence interval according to Bouchard and Warin [Citation22].

In their article [Citation22], Bouchard and Warin performed a numerical comparison and discussed the efficiency and the level of complexity of the regression based Longstaff and Schwartz [Citation8] algorithm and the Malliavin approach algorithm in the continuous setting. In our numerical examples, we investigate how the CDM and the MM perform in the discontinuous setting. The fundamental difference between the Longstaff–Schwartz approach and the (conditional) density or Malliavin approach is the way the conditional expectations are approximated. However, the Longstaff–Schwartz method may show rank deficiency for a certain choice of the parameters of the model (see Mostovyi [Citation23]) and is unable to provide an ad hoc method for the computation of the delta. It has to be combined with other methods such as the likelihood ratio method or pathwise sensitivities based approaches in order to obtain an approximation of the delta. The approaches presented in this article allow the computation of the price of the American option for any parameters choice and lead to representation formulas for the derivative of conditional expectations and consequently provide an estimation of the delta using its own specific method.

The article is organized as follows. In Sections 2 and 3, we develop a representation for conditional expectations via the conditional density method and the MM, respectively. In Section 4, we present variance reduction techniques to obtain acceptable convergence results in numerical applications. In Section 5, we present numerical examples to illustrate our results. Section 6 concludes the article.

2. Conditional expectation via the conditional density method

Let $(Ω, F, P)$ be a complete probability space equipped with a filtration $F : = {F_{t}}_{t \in [0, T]}$ for time horizon T > 0, satisfying the usual conditions (see Protter [Citation24]). We introduce the generic notation $L = {L_{t}}_{0 \leq t \leq T}$ , for a Lévy process on the given probability space. We set $L_{0} = 0$ by convention and work with the right-continuous with left limits version of the Lévy process. Let $Δ L_{t} : = L_{t} - L_{t -}$ indicate the jump of the Lévy process L at time t. Denote the Lévy measure of L by $L (d z)$ . Recall that $L (d z)$ is a σ-finite Borel measure on $R_{0} : = R \ {0}$ .

In this article, we express the realization of a conditional expectation $E [f (S_{t}) | S_{s} = α]$ in terms of regular expectations. Here f is a Borel measurable function (think for instance of the payoff function of a call option), S is an $F$ -adapted price process which may have jumps, and α is a real number. We also rewrite its differential w.r.t. α, i.e. the delta, by only using unconditional expectations.

2.1. Representation results

First, we state a general result for the conditional expectation $E [f (F) | G = α]$ , where F and G are two random variables satisfying the following separability assumptions.

Assumptions 2.1.

(Separability). Let F and G be two random variables such that (2.1) $F = g_{1} (X, Y) and G = g_{2} (U, V) .$ (2.1)

The couple (X, U) is independent of (Y, V). Moreover

(X, U) has a density $p_{(X, U)}$ with respect to the Lebesgue measure,
$log p_{(X, U)} (x, \cdot)$ is differentiable, for all $x \in R$ , and
$\frac{\partial}{\partial u} log p_{(X, U)} \in L^{2} (R^{2}, p_{(X, U)})$ , i.e. $E [{(\frac{\partial}{\partial u} log p_{(X, U)} (X, U))}^{2}] < \infty$ .

The functions g₁ and g₂ are Borel measurable and there exist a Borel measurable function $g^{*}$ and a strictly increasing differentiable function h such that (2.2) $g_{2} (u, v) = h^{- 1} (u + g^{*} (v)),$ (2.2) for all $(u, v) \in Dom g_{2} \cap (R \times Dom g^{*})$ .

We apply the conditional density method (CDM) as it is developed in Benth et al. [Citation14]. The CDM method is based on the density method (DM) introduced in Glasserman [Citation19] and used in Fournié et al. [Citation1], but allows for more general dynamics than the latter. Indeed, given F and G as in (2.1), the CDM does not require the knowledge of the density of the couple (F, G) as it is the case in the DM but only the knowledge of the density (X, U). The density $p_{(X, U)}$ of (X, U) plays the most important role in this method. The results follow from straightforward computations based on properties of (conditional) expectations.

We denote the Heaviside step function increased by an arbitrary number $c \in R$ by (2.3) $H (x) : = 1_{{x \geq 0}} + c,$ (2.3) and its distributional derivative, the Dirac delta function by δ₀.

In the upcoming theorem, we present the first representation result of this paper. In order to make the proof comprehensible, we follow a formal approach. The formal proof follows similar derivations as in [Citation1, Section 4.1] by Fournié et al. The main difference is that we apply the CDM instead of the DM by conditioning on $σ (Y, V)$ . A rigorous proof is included in Appendix A.

Theorem 2.2.

Let F and G be as described in Assumptions 2.1 and let f be a Borel measurable function such that $f (F) \in L^{2} (Ω)$ . Then it holds for any $α \in Dom h$ that (2.4) $E [f (F) | G = α] = \frac{E [f (F) H (G - α) π_{(X, U)}]}{E [H (G - α) π_{(X, U)}]},$ (2.4) where $π_{(X, U)} = - \frac{\partial}{\partial u} log p_{(X, U)} (X, U) .$

Proof.

Formally, for the conditional expectation, we know that (2.5) $E [f (F) | G = α] = \frac{E [f (F) δ_{0} (G - α)]}{E [δ_{0} (G - α)]} .$ (2.5)

Moreover we have that $\begin{matrix} E [f (F) δ_{0} (G - α)] = E [f (g_{1} (X, Y)) δ_{0} (g_{2} (U, V) - α)] \\ = E [E [f (g_{1} (X, Y)) δ_{0} (g_{2} (U, V) - α) | σ (Y, V)]], \end{matrix}$ where $σ (Y, V)$ is the σ-algebra generated by Y and V. From Assumptions 2.1(1) we derive $E [f (F) δ_{0} (G - α)] = E [\int_{R^{2}} f (g_{1} (x, Y)) δ_{0} (g_{2} (u, V) - α) p_{(X, U)} (x, u) dxdu] .$

For a function $ϕ \in C^{1}$ with a single root, the composition rule for the Dirac delta function (see Raju [Citation25]) states that $δ_{0} (ϕ (u)) = \frac{δ_{0} (u - u_{1})}{| ϕ' (u_{1}) |},$ where u₁ is such that $ϕ (u_{1}) = 0$ and $ϕ' (u_{1}) \neq 0$ . Because of relation (2.2), we apply this composition rule to the function $ϕ (u) = h^{- 1} (u + g^{*} (V)) - α$ , with root $u_{1} = h (α) - g^{*} (V)$ , and obtain that $\begin{matrix} \int_{R} δ_{0} (g_{2} (u, V) - α) p_{(X, U)} (x, u) d u = \int_{R} \frac{δ_{0} (u + g^{*} (V) - h (α))}{(h^{- 1})' (h (α))} p_{(X, U)} (x, u) d u \\ = h' (α) \int_{R} δ_{0} (u + g^{*} (V) - h (α)) p_{(X, U)} (x, u) d u . \end{matrix}$

The Dirac delta function is the distributional derivative of the Heaviside step function. Hence by integration by parts, we find that $\begin{matrix} \int_{R} δ_{0} (g_{2} (u, V) - α) p_{(X, U)} (x, u) d u \\ = - h' (α) \int_{R} H (u + g^{*} (V) - h (α)) \frac{\partial}{\partial u} p_{(X, U)} (x, u) d u \\ = - h' (α) \int_{R} H (g_{2} (u, V) - α) (\frac{\partial}{\partial u} log p_{(X, U)} (x, u)) p_{(X, U)} (x, u) d u . \end{matrix}$

Finally we conclude that $\begin{matrix} E [f (F) δ_{0} (G - α)] = E [E [f (F) H (G - α) {- \frac{\partial}{\partial u} log p_{(X, U)} (X, U)} h' (α) | σ (Y, V)]] \\ = E [f (F) H (G - α) {- \frac{\partial}{\partial u} log p_{(X, U)} (X, U)}] h' (α) . \end{matrix}$

Applying the latter result with $f \equiv 1$ for the denominator of (2.5) we prove the statement. □

Note that the weights $π_{(X, U)}$ in the representation (2.4) are not unique and one can derive infinitely many of them. This was discussed in the result (4.25)–(4.26) in [Citation1] for random variables admitting a density. In Proposition 2.1 of this latter article, the authors discussed also the optimality of the weights in a minimal variance setting. The extension of these results to our jump-diffusion setting using the CDM is straightforward and follows similar derivations as in the proof of Theorem 2.2. For more details we refer to Daveloose [Citation26].

Figure 2. Estimates for the Bermudan put option price obtained through the CDM and MM representations with control variate and with localization technique, against the number of simulated paths. In the right graph the vertical axis is restricted to $[9.5, 12]$ .

In the next theorem, we deduce a representation for the sensitivity of (2.4) with respect to α.

Theorem 2.3.

Let F and G be as described in Assumptions 2.1, where $log p_{(X, U)}$ possesses a second order derivative in its second argument and let the Borel measurable function f guarantee $f (F) \in L^{2} (Ω)$ . Then it holds for any $α \in Dom h$ that $\frac{\partial}{\partial α} E [f (F) | G = α] = \frac{B_{F, G} [f] (α) A_{F, G} [1] (α) - A_{F, G} [f] (α) B_{F, G} [1] (α)}{A_{F, G} [1] {(α)}^{2}} h' (α),$ where $\begin{matrix} A_{F, G} [\cdot] (α) = E [\cdot (F) H (G - α) π_{(X, U)}], \\ B_{F, G} [\cdot] (α) = E [\cdot (F) H (G - α) (- π_{(X, U)}^{2} + π_{(X, U)}^{*})], \\ π_{(X, U)} = - \frac{\partial}{\partial u} log p_{(X, U)} (X, U), and \\ π_{(X, U)}^{*} = - \frac{\partial^{2}}{\partial u^{2}} log p_{(X, U)} (X, U) . \end{matrix}$

Proof.

From Theorem 2.2, it follows immediately that $\frac{\partial}{\partial α} E [f (F) | G = α] = \frac{\frac{\partial}{\partial α} {A_{F, G} [f] (α)} A_{F, G} [1] (α) - A_{F, G} [f] (α) \frac{\partial}{\partial α} {A_{F, G} [1] (α)}}{A_{F, G} [1] {(α)}^{2}} .$

For the derivatives in the right hand side, it holds by the square integrability assumption for f(F) that $\frac{\partial}{\partial α} E [f (F) H (G - α) π_{(X, U)}] = - E [f (F) δ_{0} (G - α) π_{(X, U)}] .$

Along the lines of the formal proof of Theorem 2.2, we derive $\begin{matrix} \int_{R} δ_{0} (g_{2} (u, V) - α) (\frac{\partial}{\partial u} log p_{(X, U)} (x, u)) p_{(X, U)} (x, u) d u \\ = \int_{R} δ_{0} (u + g^{*} (V) - h (α)) h' (α) \frac{\partial}{\partial u} p_{(X, U)} (x, u) d u \\ = - \int_{R} H (u + g^{*} (V) - h (α)) h' (α) \frac{\partial^{2}}{\partial u^{2}} p_{(X, U)} (x, u) d u \\ = - \int_{R} H (g_{2} (u, V) - α) h' (α) {{(\frac{\partial}{\partial u} log p_{(X, U)} (x, u))}^{2} + \frac{\partial^{2}}{\partial u^{2}} log p_{(X, U)} (x, u)} p_{(X, U)} (x, u) d u, \end{matrix}$ which concludes the proof.

□

2.2. Examples: Jump-diffusion models

In many applications in mathematical finance, one can make grateful use of Theorem 2.2. In fact, we are able to express a realization of the conditional expectation of the form $E [f (S_{t}) | S_{s} = α]$ in terms of regular expectations. Here, f is a function e.g. a payoff function, ${(S_{t})}_{t \in [0, T]}$ represents a Markovian stock price process, $0 < s < t < T$ , and α is a real number. We will use Monte Carlo simulations to evaluate these conditional expectations using their representations as regular expectations. In the sequel, we consider different jump-diffusion models for the stock price process to illustrate our results with specific examples.

Remark 2.4.

(Processes with independent and identically distributed increments). In case it holds that the stock price process S is modeled in terms of a process with increments which are independent and identically distributed we remark the following about a conditional expectation of the form $E [f (S_{t}) | S_{s} = α]$ . Consider for instance $S = S_{0} e^{L}$ where L denotes a Lévy process. Then, given that $S_{s} = α$ , it holds, due to the independence and the identical distribution of the increments, that for any $0 < s < t < T$ $S_{t} = S_{s} \frac{S_{t}}{S_{s}} = α e^{L_{t} - L_{s}} \overset{d}{=} α e^{L_{t - s}} .$

Therefore $E [f (S_{t}) | S_{s} = α] = E [f (α e^{L_{t - s}})]$ and the conditional expectation equals a regular expectations which can easily be estimated via a Monte Carlo method. Although it is possible to use Monte Carlo simulations exploiting the independence and stationarity property of the increments, the application of the obtained representations avoids nested simulations in the pricing of American options, as explained further on in Remark 5.1. Hence, this allows to obtain estimations based on a smaller number of simulated paths.

2.2.1. Jump-diffusion model

The next proposition considers a representation of conditional expectations in terms of unconditional expectations, when the price process S is an exponential jump-diffusion process. This type of stock price model is common in finance and conditional expectations appear when determining (American) option prices (which we will illustrate in Section 5 with numerical experiments).

Proposition 2.5.

Consider a price process S defined by $S_{t} = e^{L_{t}}, \forall t \in [0, T]$ , where L is a Lévy jump diffusion process with decomposition $L_{t} = μ t + β W_{t} + {\tilde{N}}_{t}$ . Here W is a standard Brownian motion, $\tilde{N}$ is a compound Poisson process independent of W, and μ and β are constant parameters. Then, for any Borel measurable function f fulfilling $f (S_{t}) \in L^{2} (Ω)$ , any strictly positive number α, and $0 < s < t < T$ , it holds that $E [f (S_{t}) | S_{s} = α] = \frac{E [f (S_{t}) H (S_{s} - α) π]}{E [H (S_{s} - α) π]},$ where $π = \frac{t W_{s} - s W_{t}}{β s (t - s)} .$

Proof.

Following the notation of Theorem 2.2, we set F = S_t and G = S_s. The random variables $X = β W_{t}, Y = μ t + {\tilde{N}}_{t}, U = β W_{s}$ and $V = μ s + {\tilde{N}}_{s}$ and the functions $g_{i} (x, y) = e^{x + y}, i \in {1, 2}, g^{*} (v) = v$ , and $h (α) = log α$ , with $α \in Dom h = R_{0}^{+}$ , satisfy Assumptions 2.1. The random variables X and U (resulting from a scaled Brownian motion) have a joint normal distribution with density function $p_{(X, U)} (x, u) = \frac{1}{2 π β^{2} \sqrt{(t - s) s}} exp (- \frac{x^{2} s - 2 x u s + u^{2} t}{2 β^{2} (t - s) s}) .$

To determine the weight in (2.4) we calculate $- \frac{\partial}{\partial u} log p_{(X, U)} (x, u) = \frac{\partial}{\partial u} \frac{x^{2} s - 2 x u s + u^{2} t}{2 β^{2} (t - s) s} = \frac{u t - x s}{β^{2} (t - s) s},$

such that we obtain $π = \frac{U t - X s}{β^{2} s (t - s)} = \frac{t W_{s} - s W_{t}}{β s (t - s)} .$

□

Example:

Merton model: In the light of the numerical experiments that we perform in Section 5, we derive the representation (2.4) for the following Merton model given by (2.6) $S_{t} = s_{0} exp ((r - \frac{β^{2}}{2}) t + β W_{t} + \sum_{i = 1}^{N_{t}} Y_{i}),$ (2.6) where r > 0 is the risk-free interest rate, β is a positive constant, and W is a Wiener process. The jump part is determined by a Poisson process N with jump intensity μ and the random variables Y_i are i.i.d. with distribution $N (- δ^{2} / 2, δ^{2})$ .

Since the Merton model is a special case of the exponential Lévy jump-diffusion model, the representations obtained via the conditional density method, i.e. through Proposition 2.5 and Theorem 2.3, are as follows, for $0 < s < t < T$ and $α \in R_{0}^{+}$ , (2.7) $E [f (S_{t}) | S_{s} = α] = \frac{E [f (S_{t}) H (S_{s} - α) π_{s, t}]}{E [H (S_{s} - α) π_{s, t}]} = : \frac{A_{t, s} [f] (α)}{A_{t, s} [1] (α)},$ (2.7) (2.8) $\frac{\partial}{\partial α} E [f (S_{t}) | S_{s} = α] = \frac{B_{t, s} [f] (α) A_{t, s} [1] (α) - A_{t, s} [f] (α) B_{t, s} [1] (α)}{A_{t, s} [1] {(α)}^{2}} \frac{1}{α},$ (2.8) where $\begin{matrix} B_{t, s} [\cdot] (α) = E [\cdot (S_{t}) H (S_{s} - α) {- π_{s, t}^{2} + π_{s, t}^{*}}], \\ π_{s, t} = \frac{t W_{s} - s W_{t}}{β s (t - s)}, and π_{s, t}^{*} = \frac{t}{β^{2} s (t - s)} . \end{matrix}$

2.2.2. Additive model

Now we observe a model which is often used to price energy products (see for example Benth et al. [Citation27]). The price process is given by an additive model (2.9) $S_{t} = X_{t} + Y_{t}, \forall t \in [0, T] with S_{0} > 0.$ (2.9)

The process Y is adapted to the filtration $F$ and does not have to be specified here. The process X is a so called $Γ (a, b)$ -Ornstein-Uhlenbeck process, see Section 17 in Sato [Citation28]. Namely, it is a process following the dynamics (2.10) $d X_{t} = - λ X_{t} d t + d L_{t}, X_{0} = S_{0},$ (2.10) where $λ > 0$ and L is a subordinator, admitting a stationary distribution for the process X which is here $Γ (a, b)$ . Hence, this means that X_t has a $Γ (a, b)$ -distribution for all t > 0. The solution of the stochastic differential equation (2.10) equals $X_{t} = e^{- λ t} S_{0} + \int_{0}^{t} e^{λ (r - t)} d L_{r} .$

An interesting property of this type of non-Gaussian OU-processes is the fact that the autocorrelation is independent of the stationary distribution, see, e.g. Barndorff–Nielsen and Shephard [Citation29] or Sato [Citation28], it equals (2.11) $Corr (X_{t}, X_{s}) = e^{λ (s - t)}, \forall 0 < s < t .$ (2.11)

Proposition 2.6.

Let us observe the additive model described by (2.9) and (2.10). Then it holds for any Borel measurable function f satisfying $f (S_{t}) \in L^{2} (Ω), 0 < s < t < T$ , and $α \in R$ , that $E [f (S_{t}) | S_{s} = α] = \frac{E [f (S_{t}) H (S_{s} - α) π]}{E [H (S_{s} - α) π]},$ where $π = \frac{1 - a}{X_{s}} + \frac{b}{1 - ρ} - \frac{v (X_{t}, X_{s})}{2 X_{s}} \frac{I_{a} (v (X_{t}, X_{s}))}{I_{a - 1} (v (X_{t}, X_{s}))} .$

Herein, $I_{a}$ is the modified Bessel function of the first kind with index a, $ρ = e^{λ (s - t)} and v (x, u) = \frac{2 \sqrt{ρ b^{2} x u}}{1 - ρ} .$

Proof.

As in Theorem 2.2, we put $F = S_{t} = X_{t} + Y_{t}, G = S_{s} = X_{s} + Y_{s}, (X, U) = (X_{t}, X_{s})$ , and $h (α) = α$ to satisfy Assumptions 2.1. To obtain the weight we need the density function of the vector (X_t, X_s). Since X is a $Γ (a, b)$ -OU process we know that X_t and X_s are both $Γ (a, b)$ distributed and by (2.11) we know that $Corr (X_{t}, X_{s}) = e^{λ (s - t)} = : ρ$ . According to Brewer et al. [Citation30], the density function of this bivariate gamma distribution with non-zero correlation equals $p_{(X_{t}, X_{s})} (x, u) = \frac{{(b^{2} x u)}^{(a - 1) / 2} exp (- (b x + b u) / (1 - ρ))}{ρ^{(a - 1) / 2} (1 - ρ) Γ (a)} I_{a - 1} (\frac{2 \sqrt{ρ b^{2} x u}}{1 - ρ}),$

where $I_{a}$ is the modified Bessel function of the first kind with index a. We compute $\frac{\partial}{\partial u} log p_{(X_{t}, X_{s})} (x, u) = \frac{a - 1}{2 u} - \frac{b}{1 - ρ} + \frac{\partial}{\partial u} log I_{a - 1} (\frac{2 \sqrt{ρ b^{2} x u}}{1 - ρ}) .$

For the function $v (x, u) = 2 \sqrt{ρ b^{2} x u} / (1 - ρ)$ , it holds that $\frac{\partial v}{\partial u} (x, u) = \sqrt{(ρ b^{2} x / u)} / (1 - ρ)$ and $\frac{\partial v}{\partial u} (x, u) = v (x, u) / (2 u)$ . Using the recurrence formulas for modified Bessel functions (see Bowman [Citation31]), we get $\begin{matrix} \frac{\partial}{\partial u} log (I_{a - 1} (v (x, u))) = \frac{1}{I_{a - 1} (v (x, u))} I_{a - 1}^{'} (v (x, u)) \frac{\partial v}{\partial u} (x, u) \\ = \frac{1}{I_{a - 1} (v (x, u))} \frac{1}{2} (I_{a - 2} (v (x, u)) + I_{a} (v (x, u))) \frac{v (x, u)}{2 u} \\ = \frac{1}{I_{a - 1} (v (x, u))} \frac{1}{2} ({I_{a - 2} (v (x, u)) - I_{a} (v (x, u))} + 2 I_{a} (v (x, u))) \frac{v (x, u)}{2 u} \\ = \frac{1}{I_{a - 1} (v (x, u))} \frac{1}{2} (\frac{2 (a - 1)}{v (x, u)} I_{a - 1} (v (x, u)) + 2 I_{a} (v (x, u))) \frac{v (x, u)}{2 u} \\ = (\frac{(a - 1)}{v (x, u)} + \frac{I_{a} (v (x, u))}{I_{a - 1} (v (x, u))}) \frac{v (x, u)}{2 u} = \frac{a - 1}{2 u} + \frac{v (x, u)}{2 u} \frac{I_{a} (v (x, u))}{I_{a - 1} (v (x, u))} . \end{matrix}$

According to (2.4) we conclude the statement. □

3. Conditional expectation via MM

Fournié et al. [Citation1] used Malliavin calculus defined for functions on the Wiener space to obtain representations for the conditional expectations. Therefore, their approach is applied to continuous diffusions. In this section, we extend their method to allow for the computation of conditional expectations in a Lévy and a jump-diffusion framework. For this purpose we use a Malliavin derivative of the combination of Gaussian and pure jump Lévy noises, see, e.g. Di Nunno et al. [Citation4] and Solé et al. [Citation32]. In our setting, we use the Malliavin derivative developed by Petrou [Citation15].

Notice that the MM allows for more general dynamics than the conditional density method (CDM). Indeed, no density is required to be known when using the MM. However, the CDM is easier to apply and follows using basic knowledge in probability theory.

Let $(Ω, F, P)$ be a complete probability space in which Lévy processes are well-defined. The Malliavin derivative in the Brownian direction is defined by Petrou [Citation15] in a subspace of $L^{2} (Ω)$ and is essentially a derivative with respect to the Brownian part of a Lévy process L. We denote it by $D^{(0)}$ . Its dual, the Skorohod integral is also defined in [Citation15] and denoted by $δ^{(0)}$ . In this section we make use of some notations, definitions, computational rules and properties which are summarized in Appendix B. Recall the function H defined in (2.3).

3.1. Representation results

The ideas and concepts of the proofs of the results presented in this section are based on the proofs of Theorem 4.1 and Corollary 4.1 in Fournié et al. [Citation1] and follow similar lines of derivations. The difference in our approach is that we use a Malliavin derivative defined for Lévy processes. This allows us to generalize the results in [Citation1] to processes with jumps.

In the following theorem we derive a first representation result for the conditional expectation $E [f (F) | G = α]$ , where the function f possesses a bounded and continuous derivative.

Theorem 3.1.

Denote by $C_{b}^{1}$ the space of continuously differentiable functions with bounded derivative. Moreover, Let $f \in C_{b}^{1}$ be a Borel measurable function, F and G be in $D^{(0)}$ , u be in $Dom δ^{(0)}$ such that $f (F) u$ is in $L^{2} (Ω \times [0, T])$ , and (3.1) $E [\int_{0}^{T} u_{t} D_{t}^{(0)} G d t | σ (F, G)] = 1.$ (3.1)

Then it holds for any $α \in R$ that (3.2) $E [f (F) | G = α] = \frac{E [f (F) H (G - α) δ^{(0)} (u) - f' (F) H (G - α) \int_{0}^{T} u_{t} D_{t}^{(0)} F d t]}{E [H (G - α) δ^{(0)} (u)]} .$ (3.2)

Proof.

We note that $E [f (F) | G = α] = \lim_{ε \to 0 +} E [f (F) | G \in] α - ε, α + ε [] = \lim_{ε \to 0 +} \frac{E [f (F) 1_{] - ε, ε [} (G - α)]}{E [1_{] - ε, ε [} (G - α)]} .$

As a first step we show that for any $ε > 0$ , (3.3) $\frac{1}{2} E [f (F) 1_{] - ε, ε [} (G - α)] = E [H_{ε} (G - α) δ^{(0)} (f (F) u)],$ (3.3) where $H_{ε} (x) = {\begin{matrix} ε c, & x < - ε, \\ \frac{1}{2} (x + ε) + ε c, & - ε \leq x < ε, \\ ε + ε c, & x \geq ε . \end{matrix}$

Hereto we approximate the function $\frac{1}{2} 1_{] - ε, ε [}$ by a sequence of bounded continuous functions ${(Φ_{ε, n})}_{n \geq 1}$ . Specifically, $Φ_{ε, n}$ has support $] - ε - ε / n, ε + ε / n [$ and equals $\frac{1}{2}$ on $[- ε + ε / n, ε - ε / n]$ . Moreover the graph of $Φ_{ε, n}$ connects the points $(- ε - ε / n, 0)$ and $(- ε + ε / n, 1 / 2)$ and the points $(ε - ε / n, 1 / 2)$ and $(ε + ε / n, 0)$ via straight lines. Besides, we define $Ψ_{ε, n} (x) : = \int_{- \infty}^{x} Φ_{ε, n} (y) d y + ε c$ . Then we have by the duality formula, the chain rule, and relation (3.1) that $\begin{matrix} E [Ψ_{ε, n} (G - α) δ^{(0)} (f (F) u)] = E [\int_{0}^{T} f (F) u_{t} D_{t}^{(0)} (Ψ_{ε, n} (G - α)) d t] \\ = E [f (F) Φ_{ε, n} (G - α) \int_{0}^{T} u_{t} D_{t}^{(0)} G d t] \\ = E [E [f (F) Φ_{ε, n} (G - α) \int_{0}^{T} u_{t} D_{t}^{(0)} G d t | σ (F, G)]] \\ = E [f (F) Φ_{ε, n} (G - α)] . \end{matrix}$

Because of the facts that $| Φ_{ε, n} | \leq \frac{1}{2}$ and $f (F) \in L^{2} (Ω)$ , the latter expression converges to $\frac{1}{2} E [f (F) 1_{] - ε, ε [} (G - α)]$ by the definition of the sequence ${(Φ_{ε, n})}_{n}$ . Besides $| Ψ_{ε, n} | \leq ε + ε c, δ^{(0)} (f (F) u) \in L^{2} (Ω)$ , and the sequence ${(Ψ_{ε, n})}_{n}$ approximates $H_{ε}$ . Hence this proves (3.3).

Next, by the integration by parts formula we find $\begin{matrix} \frac{1}{2} E [f (F) 1_{] - ε, ε [} (G - α)] = E [H_{ε} (G - α) {f (F) δ^{(0)} (u) - \int_{0}^{T} u_{r} D_{r}^{(0)} f (F) d r}] \\ = E [H_{ε} (G - α) {f (F) δ^{(0)} (u) - f' (F) \int_{0}^{T} u_{r} D_{r}^{(0)} F d r}] . \end{matrix}$

Then applying the latter result for $f \equiv 1$ too, shows that $E [f (F) | G = α] = \lim_{ε \to 0 +} \frac{E [\frac{1}{ε} H_{ε} (G - α) {f (F) δ^{(0)} (u) - f' (F) \int_{0}^{T} u_{r} D_{r}^{(0)} F d r}]}{E [\frac{1}{ε} H_{ε} (G - α) δ^{(0)} (u)]} .$

Since it holds that $| \frac{1}{ε} H_{ε} (G - α) | \leq 1 + c, E [| δ^{(0)} (f (F) u) |] < \infty, δ^{(0)} (u) \in L^{2} (Ω)$ , and $\frac{1}{ε} H_{ε} (x) \to H (x)$ for ε tending to zero, this concludes the proof.□

The latter theorem provides us with a representation formula for the conditional expectation $E [f (F) | G = α]$ for f being a continuously differentiable function. However in many applications in finance, we often have to consider non-smooth functions. In order to deal with the potential non-smoothness of f, we include an additional assumption on the process u introduced in Theorem 3.1 leading to the following theorem.

Theorem 3.2.

Let F and G be in $D^{(0)}$ and f be a Borel measurable function such that $f (F) \in L^{2} (Ω)$ . Consider a process u in $Dom δ^{(0)}$ , guaranteeing $f (F) u$ is in $L^{2} (Ω \times [0, T])$ , satisfying (3.1) and, in addition, (3.4) $E [\int_{0}^{T} u_{t} D_{t}^{(0)} F d t | σ (F, G)] = 0.$ (3.4)

Then the following representation holds for $α \in R$ (3.5) $E [f (F) | G = α] = \frac{E [f (F) H (G - α) δ^{(0)} (u)]}{E [H (G - α) δ^{(0)} (u)]} .$ (3.5)

Proof.

i) First, let us consider a Borel measurable function $f \in C_{b}^{1}$ , such that we can apply Theorem 3.1. Because of the properties of conditional expectations and relation (3.4), we have in representation (3.2) that $\begin{matrix} E [f' (F) H (G - α) \int_{0}^{T} u_{t} D_{t}^{(0)} F d t] = E [E [f' (F) H (G - α) \int_{0}^{T} u_{t} D_{t}^{(0)} F d t | σ (F, G)]] \\ = E [f' (F) H (G - α) E [\int_{0}^{T} u_{t} D_{t}^{(0)} F d t | σ (F, G)]] = 0. \end{matrix}$

Thus we obtain representation (3.5).ii) Now we observe a Borel measurable function f for which $f (F) \in L^{2} (Ω)$ . Let $C_{K}^{\infty} (R)$ and $C_{K}^{\infty} (Ω)$ be the space of infinitely differentiable functions with compact support respectively defined in $R$ and Ω. Since $C_{K}^{\infty} (Ω)$ is dense in $L^{2} (Ω)$ there exists a sequence of functions f_n in $C_{K}^{\infty} (R)$ , such that $f_{n} (F)$ converges to f(F) in $L^{2} (Ω)$ . In part i) we concluded that for any function f_n in this sequence representation (3.5) holds. By convergence arguments, we conclude that expression (3.5) also holds for the limit function f as follows. For any f_n we denote $g_{n} (α) : = E [f_{n} (F) | G = α] = \frac{E [f_{n} (F) H (G - α) δ^{(0)} (u)]}{E [H (G - α) δ^{(0)} (u)]} .$

Besides we define $g (α) : = \frac{E [f (F) H (G - α) δ^{(0)} (u)]}{E [H (G - α) δ^{(0)} (u)]} .$

Via the Cauchy–Schwarz inequality, we derive that $\begin{matrix} | g (α) - g_{n} (α) | \leq \frac{E [| f (F) - f_{n} (F) | | H (G - α) δ^{(0)} (u) |]}{| E [H (G - α) δ^{(0)} (u)] |} \\ \leq \frac{E {[| f (F) - f_{n} (F) |^{2}]}^{1 / 2} E {[| H (G - α) δ^{(0)} (u) |^{2}]}^{1 / 2}}{| E [H (G - α) δ^{(0)} (u)] |} . \end{matrix}$

For any $α \in R$ we have $\frac{E {[| H (G - α) δ^{(0)} (u) |^{2}]}^{1 / 2}}{| E [H (G - α) δ^{(0)} (u)] |} < \infty .$

By the density argument $f_{n} (F)$ converges to f(F) in L²-sense, hence we obtain that $| g (α) - g_{n} (α) | \to 0, for n \to \infty, \forall α \in R .$

Thus, $g_{n} (α)$ convergences to $g (α)$ . Moreover it holds that $g_{n} (α)$ converges to $E [f (F) | G = α]$ by the conditional dominated convergence theorem. Therefore, we conclude that $g (α)$ equals the latter conditional expectation. □

Via the MM we also deduce a representation for the delta in terms of unconditional expectations.

Theorem 3.3.

Consider the same setting as in Theorem 3.2 and assume that (3.6) $E [δ^{(0)} (u) \int_{0}^{T} u_{t} D_{t}^{(0)} F d t | σ (F, G)] = 0.$ (3.6)

Then the delta is given by (3.7) $\frac{\partial}{\partial α} E [f (F) | G = α] = \frac{B_{F, G} [f] (α) A_{F, G} [1] (α) - A_{F, G} [f] (α) B_{F, G} [1] (α)}{A_{F, G} [1] {(α)}^{2}},$ (3.7) where $\begin{matrix} A_{F, G} [\cdot] (α) = E [\cdot (F) H (G - α) δ^{(0)} (u)], \\ B_{F, G} [\cdot] (α) = E [\cdot (F) H (G - α) {- δ^{(0)} {(u)}^{2} + \int_{0}^{T} u_{r} D_{r}^{(0)} δ^{(0)} (u) d r}] . \end{matrix}$

Proof.

The structure of formula (3.7) follows clearly from the derivation of representation (3.5). Now we focus on the derivative $\begin{matrix} B_{F, G} [f] (α) = \frac{\partial}{\partial α} E [f (F) H (G - α) δ^{(0)} (u)] = - E [f (F) δ_{0} (G - α) δ^{(0)} (u)] \\ = - \lim_{ε \to 0 +} \frac{1}{2 ε} E [f (F) 1_{] - ε, ε [} (G - α) δ^{(0)} (u)] . \end{matrix}$ i) First, we consider a Borel measurable function $f \in C_{b}^{1}$ . It can be shown, as before in Theorem 3.1, that $\frac{1}{2} E [f (F) 1_{] - ε, ε [} (G - α) δ^{(0)} (u)] = E [H_{ε} (G - α) δ^{(0)} (f (F) δ^{(0)} (u) u)],$ and therefore $\begin{matrix} B_{F, G} [f] (α) = - \lim_{ε \to 0 +} \frac{1}{ε} E [H_{ε} (G - α) δ^{(0)} (f (F) δ^{(0)} (u) u)] \\ = - E [H (G - α) δ^{(0)} (f (F) δ^{(0)} (u) u)] . \end{matrix}$

By the chain rule and the integration by parts formula, we obtain $\begin{matrix} E [H (G - α) δ^{(0)} (f (F) δ^{(0)} (u) u)] \\ = E [H (G - α) {f (F) δ^{(0)} (δ^{(0)} (u) u) - \int_{0}^{T} δ^{(0)} (u) u_{r} D_{r}^{(0)} (f (F)) d r}] \\ = E [H (G - α) {f (F) {δ^{(0)} (u) δ^{(0)} (u) - \int_{0}^{T} u_{r} D_{r}^{(0)} δ^{(0)} (u) d r} - δ^{(0)} (u) f' (F) \int_{0}^{T} u_{r} D_{r}^{(0)} F d r}] \\ = E [f (F) H (G - α) {δ^{(0)} {(u)}^{2} - \int_{0}^{T} u_{r} D_{r}^{(0)} δ^{(0)} (u) d r}] \\ - E [f' (F) H (G - α) δ^{(0)} (u) \int_{0}^{T} u_{r} D_{r}^{(0)} F d r] . \end{matrix}$

By expression (3.6), the latter expectation equals $\begin{matrix} E [f' (F) H (G - α) δ^{(0)} (u) \int_{0}^{T} u_{r} D_{r}^{(0)} F d r] \\ = E [E [f' (F) H (G - α) δ^{(0)} (u) \int_{0}^{T} u_{r} D_{r}^{(0)} F d r | σ (F, G)]] \\ = E [f' (F) H (G - α) E [δ^{(0)} (u) \int_{0}^{T} u_{r} D_{r}^{(0)} F d r | σ (F, G)]] = 0. \end{matrix}$

Hence, we conclude that (3.8) $E [f (F) δ_{0} (G - α) δ^{(0)} (u)] = E [f (F) H (G - α) {δ^{(0)} {(u)}^{2} - \int_{0}^{T} u_{r} D_{r}^{(0)} δ^{(0)} (u) d r}] .$ (3.8) ii) Via a density argument which follows the same lines as the proof of Theorem 3.2, we conclude that Equationequation (3(3.1) $E [\int_{0}^{T} u_{t} D_{t}^{(0)} G d t | σ (F, G)] = 1.$ (3.1) .8) also holds for a Borel measurable function f such that $f (F) \in L^{2} (Ω)$ .

□

Remark 3.4.

Remark that condition (3.6) is fulfilled by a combination of condition (3.4) and $δ^{(0)} (u)$ being $σ (F, G)$ -measurable, since in this case $E [δ^{(0)} (u) \int_{0}^{T} u_{t} D_{t}^{(0)} F d t | σ (F, G)] = δ^{(0)} (u) E [\int_{0}^{T} u_{t} D_{t}^{(0)} F d t | σ (F, G)] = 0.$

Next, some practical examples motivated from financial applications are given.

3.2. Examples: Jump-diffusion models

We start by considering a stock price process which is modeled by a general stochastic differential equation (SDE). For this model we derive representations based on the results developed in Subsection 3.1. Further on, we illustrate these by looking into some specific types of SDEs such as exponential Lévy and stochastic volatility models. Let S satisfy the following stochastic differential equation (3.9) ${\begin{matrix} d S_{t} = μ (t, S_{t -}) d t + β (t, S_{t -}) d W_{t} + \int_{R_{0}} γ (t, S_{t -}, z) \tilde{N} (d t, d z), \\ S_{0} = s_{0} > 0, \end{matrix}$ (3.9) where W is a Wiener process and $\tilde{N}$ is a compensated Poisson random measure with Lévy measure $L$ . We assume that $β (t, x) > 0$ for all $(t, x) \in [0, T] \times R$ . The coefficient functions $μ (t, x), β (t, x), γ (t, x, z) \in C_{b}^{1}$ are Lipschitz continuous in the second argument, for all $(t, z) \in [0, T] \times R_{0}$ . The coefficients also satisfy the following linear growth condition $μ^{2} (t, x) + β^{2} (t, x) + \int_{R_{0}} γ^{2} (t, x, z) L (d z) \leq C (1 + x^{2}),$ for all $t \in [0, T]$ , where C is a positive constant. The existence and uniqueness of the solution S is ensured by Theorem 9.1. Chap IV collected from Ikeda and Watanabe [Citation33].

The first variation process V related to S equals $\frac{\partial S}{\partial s_{0}}$ and satisfies ${\begin{matrix} d V_{t} = μ_{x} (t, S_{t -}) V_{t -} d t + β_{x} (t, S_{t -}) V_{t -} d W_{t} + \int_{R_{0}} γ_{x} (t, S_{t -}, z) V_{t -} \tilde{N} (d t, d z), \\ V_{0} = 1. \end{matrix}$

We assume that the coefficients are such that V is strictly positive. The stock price S_t is in $D^{(0)}$ for all $t \in [0, T]$ , and its Malliavin derivative can be expressed in terms of the first variation process (see Theorem 3 and Proposition 7 in Petrou [Citation15]) (3.10) $D_{s}^{(0)} S_{t} = V_{t} {(V_{s -})}^{- 1} β (s, S_{s -}) 1_{{s \leq t}} .$ (3.10)

The aim is to find a representation formula for the conditional expectation $E [f (S_{t}) | S_{s} = α], 0 < s < t < T$ and $α \in R$ , containing only regular expectations. First we mention the following lemma. We do not present the proof since it is an adaptation of the proof of [Citation1, Lemma 4.1] to our setting.

Lemma 3.5.

It holds that $D_{s}^{(0)} V_{t} = {β_{x} (s, S_{s -}) V_{t} - \frac{β (s, S_{s -}) ζ_{s -} V_{t}}{V_{s -}^{2}} + \frac{β (s, S_{s -}) ζ_{t}}{V_{s -}}} 1_{{s \leq t}},$ where $ζ_{t} : = \frac{\partial^{2} S_{t}}{\partial s_{0}^{2}}$ . In other words ζ is the solution of the SDE ${\begin{matrix} d ζ_{t} = [μ_{x x} (t, S_{t -}) V_{t -}^{2} + μ_{x} (t, S_{t -}) ζ_{t}] d t + [β_{x x} (t, S_{t -}) V_{t -}^{2} + β_{x} (t, S_{t -}) ζ_{t}] d W_{t} \\ + \int_{R_{0}} [γ_{x x} (t, S_{t -}, z) V_{t -}^{2} + γ_{x} (t, S_{t -}, z) ζ_{t}] \tilde{N} (d t, d z), \\ ζ_{0} = 0. \end{matrix}$

Now we have all the ingredients to obtain a useful expression for the conditional expectation $E [f (S_{t}) | S_{s} = α]$ , for $α \in R$ . First we make use of Theorem 3.1 and later on we apply Theorem 3.2.

Proposition 3.6.

Let $f \in C_{b}^{1}, 0 < s < t < T$ , and $α \in R$ . In the setting described by the stochastic differential equation (3.9) we assume that (3.11) $E [\int_{0}^{T} {(\frac{V_{r -}}{β (r, S_{r -})})}^{2} d r] < \infty and E [\int_{0}^{T} {(\frac{1}{s V_{s}} \frac{V_{r -}}{β (r, S_{r -})})}^{2} d r] < \infty .$ (3.11)

Then the following representation holds for the conditional expectation (3.12) $E [f (S_{t}) | S_{s} = α] = \frac{E [f (S_{t}) H (S_{s} - α) π_{1} - f' (S_{t}) H (S_{s} - α) π_{2}]}{E [H (S_{s} - α) π_{1}]},$ (3.12) where the Malliavin weights equal (3.13) $π_{1} = \frac{1}{s V_{s}} ( \int_{0}^{s} \frac{V_{r -}}{β (r, S_{r -})} d W_{r} + s \frac{ζ_{s}}{V_{s}} + \int_{0}^{s} [\frac{β_{x} (r, S_{r -})}{β (r, S_{r -})} V_{r -} - \frac{ζ_{r -}}{V_{r -}}] d r) and π_{2} = \frac{V_{t}}{V_{s}} .$ (3.13)

Proof.

We apply Theorem 3.1 and to fulfill condition (3.1) we define ${\tilde{u}}_{r} = \frac{V_{r -}}{V_{s} β (r, S_{r -})} \frac{1}{s} 1_{{r \leq s}} .$

Note that the process $V_{-} / β (\cdot, S_{-})$ is predictable. By the first condition in (3.11) it turns out that this process is in $Dom δ^{(0)}$ . Moreover by Lemma 3.5 and the chain rule it holds that $1 / V_{s}$ is in $D^{(0)}$ . The second part of condition (3.11) allows us to conclude that $\tilde{u}$ is in $Dom δ^{(0)}$ .

The first weight that we calculate is the Skorohod integral of $\tilde{u}$ . Thereto we perform integration by parts, $δ^{(0)} (\tilde{u}) = \frac{1}{V_{s}} \int_{0}^{T} \frac{V_{r -}}{β (r, S_{r -}) s} 1_{{r \leq s}} d W_{r} - \int_{0}^{T} \frac{V_{r -}}{β (r, S_{r -}) s} 1_{{r \leq s}} D_{r}^{(0)} \frac{1}{V_{s}} d r .$

Because of the chain rule we rewrite this as $δ^{(0)} (\tilde{u}) = \frac{1}{s V_{s}} \int_{0}^{s} \frac{V_{r -}}{β (r, S_{r -})} d W_{r} + \frac{1}{s} \int_{0}^{s} \frac{V_{r -}}{β (r, S_{r -})} \frac{D_{r}^{(0)} V_{s}}{V_{s}^{2}} d r .$

Now we make use of Lemma 3.5 and obtain that the latter equals $\begin{matrix} \frac{1}{s V_{s}} \int_{0}^{s} \frac{V_{r -}}{β (r, S_{r -})} d W_{r} + \frac{1}{s} \int_{0}^{s} \frac{V_{r -}}{β (r, S_{r -})} \frac{1}{V_{s}^{2}} [β_{x} (r, S_{r -}) V_{s} - \frac{β (r, S_{r -}) ζ_{r -} V_{s}}{V_{r -}^{2}} + \frac{β (r, S_{r -}) ζ_{s}}{V_{r -}}] d r \\ = \frac{1}{s V_{s}} (\int_{0}^{s} \frac{V_{r -}}{β (r, S_{r -})} d W_{r} + \int_{0}^{s} [\frac{β_{x} (r, S_{r -})}{β (r, S_{r -})} V_{r -} - \frac{ζ_{r -}}{V_{r -}} + \frac{ζ_{s}}{V_{s}}] d r) \\ = \frac{1}{s V_{s}} (\int_{0}^{s} \frac{V_{r -}}{β (r, S_{r -})} d W_{r} + s \frac{ζ_{s}}{V_{s}} + \int_{0}^{s} [\frac{β_{x} (r, S_{r -})}{β (r, S_{r -})} V_{r -} - \frac{ζ_{r -}}{V_{r -}}] d r), \end{matrix}$ which is the mentioned expression for π₁.

The second weight in (3.2) is $\int_{0}^{T} {\tilde{u}}_{r} D_{r}^{(0)} S_{t} d r = \int_{0}^{T} \frac{1}{V_{s}} \frac{V_{r -}}{β (r, S_{r -}) s} 1_{{r \leq s}} V_{t} {(V_{r -})}^{- 1} β (r, S_{r -}) 1_{{r \leq t}} d r = \int_{0}^{s} \frac{V_{t}}{s V_{s}} d r = \frac{V_{t}}{V_{s}} .$

□

Theorem 3.2 can also be applied in this setting, which is interesting in case of non-differentiable functions f.

Proposition 3.7.

Consider again the setting defined by the stochastic differential equation (3.9). For any Borel measurable function f for which $f (S_{t}) \in L^{2} (Ω), 0 < s < t < T$ , and $α \in R$ it holds, under conditions (3.11), that $E [f (S_{t}) | S_{s} = α] = \frac{E [f (S_{t}) H (S_{s} - α) π]}{E [H (S_{s} - α) π]},$ where the Malliavin weight π differs from π₁ in (3.13) as follows (3.14) $π = π_{1} - \frac{1}{t - s} \frac{1}{V_{s}} \int_{s}^{t} \frac{V_{r -}}{β (r, S_{r -})} d W_{r} .$ (3.14)

Proof.

For the application of Theorem 3.2, we need the process (3.15) ${\hat{u}}_{r} = \frac{V_{r -}}{V_{s} β (r, S_{r -})} {\frac{1}{s} 1_{{r \leq s}} - \frac{1}{t - s} 1_{{s \leq r \leq t}}} = {\tilde{u}}_{r} - \frac{V_{r -}}{V_{s} β (r, S_{r -})} \frac{1}{t - s} 1_{{s \leq r \leq t}} .$ (3.15)

By comparing this with the intermediate process used in the proof of Proposition 3.6, we conclude that $\hat{u}$ is in $Dom δ^{(0)}$ . Moreover by the integration by parts formula and the fact that $V_{-} / β (\cdot, S_{-})$ is predictable, we obtain $\begin{matrix} δ^{(0)} (\hat{u}) = π_{1} + δ^{(0)} (- \frac{V_{r -}}{V_{s} β (r, S_{r -})} \frac{1}{t - s} 1_{{s \leq r \leq t}}) \\ = π_{1} - \frac{1}{V_{s}} \frac{1}{t - s} \int_{0}^{T} \frac{V_{r -}}{β (r, S_{r -})} 1_{{s \leq r \leq t}} d W_{r} + \frac{1}{t - s} \int_{0}^{T} \frac{V_{r -}}{β (r, S_{r -})} 1_{{s \leq r \leq t}} D_{r}^{(0)} \frac{1}{V_{s}} d r, \end{matrix}$ where π₁ is defined in (3.13). The last term equals zero, since by Lemma 3.5 the Malliavin derivative $D_{r}^{(0)} (1 / V_{s})$ introduces a factor $1_{{r \leq s}}$ . This concludes the proof. □

In the sequel, we present some models to illustrate our results from Propositions 3.6 and 3.7. The first two are defined by a linear SDE and the third one concerns stochastic volatility models.

3.2.1. Exponential Lévy model

We consider a stock price process S modeled by a stochastic exponential of a Lévy process, therefore let S satisfy the following linear SDE ${\begin{matrix} d S_{t} = μ S_{t -} d t + β S_{t -} d W_{t} + \int_{R_{0}} (e^{z} - 1) S_{t -} \tilde{N} (d t, d z), \\ S_{0} = s_{0} > 0, \end{matrix}$ where μ and $β > 0$ are constants. We assume that $\int_{R_{0}} {(e^{z} - 1)}^{2} L (d z) < \infty$ . All assumptions imposed on model (3.9) are satisfied. In this particular example, the first variation process V equals $V = S / s_{0}$ and $ζ \equiv 0$ and conditions (3.11) are fulfilled. From Proposition 3.6, we find that the expression (3.12) holds with (3.16) $π_{1} = \frac{s_{0}}{s S_{s}} (\int_{0}^{s} \frac{1}{s_{0} β} d W_{r} + \int_{0}^{s} \frac{1}{s_{0}} d r) = \frac{s_{0}}{s S_{s}} (\frac{W_{s}}{s_{0} β} + \frac{s}{s_{0}}) = \frac{1}{S_{s}} (\frac{W_{s}}{s β} + 1),$ (3.16) and $π_{2} = \frac{S_{t} / s_{0}}{S_{s} / s_{0}} = \frac{S_{t}}{S_{s}} .$

Substitution of the expressions for π₁ and π₂ into (3.12) leads to $E [f (S_{t}) | S_{s} = α] = \frac{E [f (S_{t}) H (S_{s} - α) \frac{1}{S_{s}} (\frac{W_{s}}{s β} + 1) - f' (S_{t}) H (S_{s} - α) \frac{S_{t}}{S_{s}}]}{E [H (S_{s} - α) \frac{1}{S_{s}} (\frac{W_{s}}{s β} + 1)]},$ where $f \in C_{b}^{1}, 0 < s < t < T$ , and $α \in R$ .

On the other hand, we apply Proposition 3.7 for the linear SDE we are observing now. The weight π differs from the weight π₁ in Proposition 3.6, when the intermediate process is of the form (3.15), only by the second term in (3.14). In the present setting, this term equals $\frac{s_{0}}{S_{s}} (- \frac{1}{t - s}) \int_{s}^{t} \frac{1}{s_{0} β} d W_{r} = - \frac{1}{β S_{s}} \frac{W_{t} - W_{s}}{t - s} .$

Hence combining this with (3.16) gives (3.17) $π = \frac{1}{S_{s}} (\frac{W_{s}}{s β} - \frac{1}{β} \frac{W_{t} - W_{s}}{t - s} + 1) = \frac{1}{S_{s}} (\frac{t W_{s} - s W_{t}}{s (t - s) β} + 1) .$ (3.17)

For any Borel measurable function f guaranteeing that $f (S_{t}) \in L^{2} (Ω), 0 < s < t < T$ , and $α \in R$ the conditional expectation can be rewritten as $E [f (S_{t}) | S_{s} = α] = \frac{E [f (S_{t}) H (S_{s} - α) \frac{1}{S_{s}} (\frac{t W_{s} - s W_{t}}{s (t - s) β} + 1)]}{E [H (S_{s} - α) \frac{1}{S_{s}} (\frac{t W_{s} - s W_{t}}{s (t - s) β} + 1)]} .$

3.2.2. Merton model

In the light of the numerical experiments, we consider again the Merton model (2.6) which is a special case of the exponential Lévy model. The representations obtained through the MM, thus via Proposition 3.7 and Theorem 3.3, are as follows, for $0 < s < t < T$ and $α \in R$ , (3.18) $E [f (S_{t}) | S_{s} = α] = \frac{E [f (S_{t}) H (S_{s} - α) π_{s, t}]}{E [H (S_{s} - α) π_{s, t}]} = \frac{A_{t, s} [f] (α)}{A_{t, s} [1] (α)},$ (3.18) (3.19) $\frac{\partial}{\partial α} E [f (S_{t}) | S_{s} = α] = \frac{B_{t, s} [f] (α) A_{t, s} [1] (α) - A_{t, s} [f] (α) B_{t, s} [1] (α)}{A_{t, s} [1] {(α)}^{2}},$ (3.19) with $B_{t, s} [\cdot] (α) = E [\cdot (S_{t}) H (S_{s} - α) {- π_{s, t}^{2} + π_{s, t}^{*}}],$ where the weight $π_{s, t}$ is given by (3.17) and $π_{s, t}^{*} = - \frac{1}{S_{s}^{2}} (\frac{β (t W_{s} - s W_{t}) - t}{β^{2} s (t - s)} + 1)$ is obtained through similar computations as above in this section.

3.2.3. Stochastic volatility models

Let us consider the following model (3.20) ${\begin{matrix} d S_{t} = μ S_{t -} d t + v (Y_{t -}) S_{t -} d W_{t}^{(1)} + \int_{R_{0}} (e^{z} - 1) S_{t -} \tilde{N} (d t, d z), \\ d Y_{t} = a (t, Y_{t -}) d t + b (t, Y_{t -}) d W_{t}^{(2)} + \int_{R_{0}} ψ (z) \tilde{N} (d t, d z), \end{matrix}$ (3.20) with $S_{0} = s_{0} > 0$ and $Y_{0} > 0$ . Herein, $\tilde{N}$ is the jump measure of a compound Poisson process with Lévy measure $L$ , and $W^{(1)}$ and $W^{(2)}$ are two correlated standard Brownian motions with (3.21) $d W_{t}^{(1)} d W_{t}^{(2)} = ρ d t, ρ \in (- 1, 1) .$ (3.21)

Moreover $μ \in R$ , the functions a and b on $[0, T] \times R$ are Lipschitz continuous and differentiable in the second argument for all t, v is a positive function which is Lipschitz continuous and differentiable on $R, \int_{R_{0}} {(e^{z} - 1)}^{2} L (d z) < \infty$ , and ψ is a function on $R$ such that $\int_{R_{0}} ψ^{2} (z) L (d z) < \infty$ . The process S may then perform the role of the stock price process, while v(Y) is interpreted as the stochastic volatility process. In many stochastic volatility models, the volatility v(Y) equals $\sqrt{Y}$ and some conditions should be included to guarantee the non-negativity of the process Y. Some interesting examples are the Bates model (see [Citation34]) and the Ornstein–Uhlenbeck stochastic volatility model (see [Citation29, Citation35]).

From (3.21) we know there exists a Brownian motion $\tilde{W}$ , independent of $W^{(2)}$ , such that we express $W^{(1)}$ in terms of $\tilde{W}$ and $W^{(2)}$ by $W_{t}^{(1)} = ρ W_{t}^{(2)} + \sqrt{1 - ρ^{2}} {\tilde{W}}_{t} .$

Using the notations of Propositions 3.6 and 3.7, where we consider the Malliavin derivative in the direction of the Brownian motion $\tilde{W}$ , we have (3.22) $V_{t} = \frac{S_{t}}{s_{0}}, β (t, S_{t -}) = v (Y_{t -}) S_{t -} \sqrt{1 - ρ^{2}}, and ζ_{t} = 0.$ (3.22)

Applying Proposition 3.6, we find for the weights in representation (3.12) $\begin{matrix} π_{1} = \frac{s_{0}}{s S_{s}} (\int_{0}^{s} \frac{S_{r -} / s_{0}}{v (Y_{r -}) S_{r -} \sqrt{1 - ρ^{2}}} d {\tilde{W}}_{r} + \int_{0}^{s} \frac{v (Y_{r -}) \sqrt{1 - ρ^{2}}}{v (Y_{r -}) S_{r -} \sqrt{1 - ρ^{2}}} \frac{S_{r -}}{s_{0}} d r) \\ = \frac{s_{0}}{s S_{s}} (\frac{1}{s_{0} \sqrt{1 - ρ^{2}}} \int_{0}^{s} \frac{d {\tilde{W}}_{r}}{v (Y_{r -})} + \frac{s}{s_{0}}) = \frac{1}{S_{s}} (\frac{1}{s \sqrt{1 - ρ^{2}}} \int_{0}^{s} \frac{d {\tilde{W}}_{r}}{v (Y_{r -})} + 1) \\ = \frac{1}{S_{s}} (\frac{1}{s (1 - ρ^{2})} {\int_{0}^{s} \frac{d W_{r}^{(1)}}{v (Y_{r -})} - ρ \int_{0}^{s} \frac{d W_{r}^{(2)}}{v (Y_{r -})}} + 1) \end{matrix}$ and $π_{2} = \frac{S_{t} / s_{0}}{S_{s} / s_{0}} = \frac{S_{t}}{S_{s}}$ . When we prefer not to use the derivative of the function f, we apply Proposition 3.7. The weight is then given by $\begin{matrix} π = π_{1} - \frac{1}{S_{s}} \frac{1}{(t - s) \sqrt{1 - ρ^{2}}} \int_{s}^{t} \frac{d {\tilde{W}}_{r}}{v (Y_{r -})} \\ = π_{1} - \frac{1}{S_{s}} \frac{1}{(t - s) (1 - ρ^{2})} {\int_{s}^{t} \frac{d W_{r}^{(1)}}{v (Y_{r -})} - ρ \int_{s}^{t} \frac{d W_{r}^{(2)}}{v (Y_{r -})}} . \end{matrix}$

Considering the model (3.20), we derived a representation for $E [f (S_{t}) | S_{s} = α]$ . In the sequel we observe the conditional expectation (3.23) $E [w (Y_{T}) | S_{T} = α],$ (3.23) for a certain Borel measurable function $w : R \mapsto R_{0}$ . Our motivation to consider the latter expression comes from a paper by Martynov and Rozanova [Citation36], where the authors are interested in the computation of conditional moments of Y. Thus, they consider (3.23) for w(x) = x and $w (x) = x^{2}$ . Moreover, in [Citation37] van der Stoep et al. consider (3.23) for $w (x) = v^{2} (x)$ , which is interesting for the study of stochastic local volatility.

We consider model (3.20) and a function w. It is clear that $D_{r}^{(0)} Y_{t} = 0$ since Y only depends on $W^{(2)}$ , which is independent of $\tilde{W}$ . Thus condition (3.4) is satisfied for any process u in $Dom δ^{(0)}$ . Thus when condition (3.1) is fulfilled, the conditional expectation can be written in the form (3.5). From expression (3.10) and previous derivations (3.22) we deduce that $D_{r}^{(0)} (S_{T} - α) = S_{T} v (Y_{r -}) \sqrt{1 - ρ^{2}}, for r \leq T .$

Therefore the process satisfying condition (3.1) is given by $u_{r} = {(T S_{T} v (Y_{r -}) \sqrt{1 - ρ^{2}})}^{- 1} .$

The Skorohod integral of this process is computed similarly as in the proof of Proposition 3.6 and it equals $δ^{(0)} (u) = \frac{1}{T S_{T} \sqrt{1 - ρ^{2}}} \int_{0}^{T} \frac{d {\tilde{W}}_{r}}{v (Y_{r -})} - \int_{0}^{T} \frac{1}{T v (Y_{r -}) \sqrt{1 - ρ^{2}}} D_{r}^{(0)} (\frac{1}{S_{T}}) d r .$

By the chain rule, the second term in the last equation equals $\int_{0}^{T} \frac{1}{T v (Y_{r -}) \sqrt{1 - ρ^{2}}} \frac{D_{r}^{(0)} S_{T}}{S_{T}^{2}} d r = \int_{0}^{T} \frac{1}{T S_{T}} d r = \frac{1}{S_{T}} .$

Finally we conclude that $E [w (Y_{T}) | S_{T} = α] = \frac{E [w (Y_{T}) H (S_{T} - α) \frac{1}{S_{T}} (\frac{1}{T \sqrt{1 - ρ^{2}}} \int_{0}^{T} \frac{d {\tilde{W}}_{r}}{v (Y_{r -})} + 1)]}{E [H (S_{T} - α) \frac{1}{S_{T}} (\frac{1}{T \sqrt{1 - ρ^{2}}} \int_{0}^{T} \frac{d {\tilde{W}}_{r}}{v (Y_{r -})} + 1)]} .$

4. Variance reduction

In the representations considered in the previous sections the random variables whose expectation should be estimated can have a large variance. To obtain a smaller variance and satisfactory convergence results in the context of Monte Carlo simulations, one might include variance reduction techniques. In Moreni [Citation38], a variance reduction technique based on importance sampling is proposed for the Monte Carlo pricing of American options via the Longstaff–Schwartz algorithm. For our conditional expectation representations, we study in subsection 4.1 the localization technique. This technique was used in Bally et al. [Citation39] but we adapt it here to our setting. Moreover we include control variates to reduce the variance. We handle this approach in subsection 4.2.

4.1. Localisation

We adapt the localization technique of Bally et al. [Citation39] for both methods; the conditional density method and the MM. For the proofs of the two following propositions, we refer the reader to Appendix C.

Proposition 4.1.

Assume the setting of Theorem 2.2. Then for any function $ψ : R \mapsto [0, \infty)$ satisfying $\int_{R} ψ (t) d t = 1$ and for all $α \in Dom h$ , we have $E [f (F) | G = α] = \frac{J_{F, G}^{ψ} [f] (α)}{J_{F, G}^{ψ} [1] (α)},$ where $J_{F, G}^{ψ} [\cdot] (α)$ is given by $J_{F, G}^{ψ} [\cdot] (α) = E [\cdot (F) (ψ (G - α) \frac{\partial}{\partial u} g_{2} (U, V) + π_{(X, U)} [H (G - α) - Ψ (G - α)])]$ where $Ψ (x) = \int_{- \infty}^{x} ψ (t) d t$ .

Proposition 4.2.

Assume the setting of Theorem 3.2, then for any continuous function with bounded derivative $ψ : R \mapsto [0, \infty)$ satisfying $\int_{R} ψ (t) d t = 1$ and for all $α \in R$ , we have $E [f (F) | G = α] = \frac{J_{F, G}^{ψ} [f] (α)}{J_{F, G}^{ψ} [1] (α)},$ where $J_{F, G}^{ψ} [\cdot] (α)$ is given by $J_{F, G}^{ψ} [\cdot] (α) = E [\cdot (F) (ψ (G - α) + δ^{(0)} (u) [H (G - α) - Ψ (G - α)])]$ where $Ψ (x) = \int_{- \infty}^{x} ψ (t) d t$ .

Once we have introduced the localized versions of the representation formulas for the conditional expectation, one natural question arises, namely what is the optimal choice of the localizing function ψ. To find this optimal function, we assume that the additional constant c in the function H is zero, i.e. $H (x) = 1_{{x \geq 0}}$ . Let Z represent either the factor $\frac{\partial}{\partial u} g_{2} (U, V)$ in case of the conditional density method or the factor 1 when the MM is considered. Then, practically speaking, an expectation of the form $J_{F, G}^{ψ} [\cdot] (α) = E [\cdot (F) (ψ (G - α) Z + π [H (G - α) - Ψ (G - α)])]$ is estimated via Monte Carlo simulation. More precisely if we denote by N the number of simulated values of F and G, we have the following estimation $J_{F, G}^{ψ} [\cdot] (α) \approx \frac{1}{N} \sum_{q = 1}^{N} \cdot (F^{q}) (ψ (G^{q} - α) Z^{q} + π^{q} [H (G^{q} - α) - Ψ (G^{q} - α)]) .$

In order to reduce the variance, the idea is to minimize the integrated mean squared error with respect to the localizing function ψ. Thus, we have to solve the following optimization problem (this criterion has been introduced by Kohatsu-Higa and Petterson [Citation40]) (4.1) $\inf_{ψ \in L^{1}} I (ψ),$ (4.1) where $L^{1} = {ψ : R \mapsto [0, \infty) : ψ \in C^{1} (R), ψ (+ \infty) = 0, \int_{R} ψ (t) d t = 1}$ and I equals the integrated variance up to a constant (in terms of ψ) (4.2) $I (ψ) = \int_{R} E [\cdot^{2} (F) {(ψ (G - α) Z + π [H (G - α) - Ψ (G - α)])}^{2}] d α .$ (4.2)

The choice of the optimal localizing function ψ is given in the following proposition. It is obvious that the optimal localization function will be different for the numerator and denominator since the optimization problem is different. (The proof in Bally et al. [Citation39] can easily be extended to the current setting.)

Proposition 4.3.

The infimum of the optimization problem (4.1) with $I (ψ)$ given by (4.2) and $H (x) = 1_{{x \geq 0}}$ , is reached at $ψ^{⋆}$ , where $ψ^{⋆}$ is the probability density of the Laplace distribution with parameter $λ^{⋆}$ , i.e. for all $t \in R, ψ^{⋆} (t) = \frac{λ^{⋆}}{2} e^{- λ^{⋆} | t |}$ , where (4.3) $λ^{⋆} = {(\frac{E [\cdot^{2} (F) π^{2}]}{E [\cdot^{2} (F) Z^{2}]})}^{\frac{1}{2}} .$ (4.3)

The localizing function defined in the previous proposition is optimal in the sense of minimal variance, however it is not optimal in numerical experiments when it comes to the computational effort. Therefore, Bouchard and Warin [Citation22] considered the exponential localizing function (4.4) $ψ (x) = λ^{*} e^{- λ^{*} x} 1_{{x \geq 0}},$ (4.4) where $λ^{*}$ is given by (4.3). In paragraph 5.2.4 we show how the use of this function reduces the computational effort. We perform numerical experiments for both localizing functions in Section 5.

The representations for the derivatives in Theorems 2.3 and 3.3 have a localized version too. We state the localized versions as well as the choice of the optimal localizing function ψ in the following propositions. We do not present the proofs since they follow along similar lines as Propositions 4.1, 4.2, and 4.3.

Proposition 4.4.

Assume the setting of Theorem 2.3, then for any function $ψ : R \mapsto [0, \infty)$ satisfying $\int_{R} ψ (t) d t = 1$ and for all $α \in Dom h$ , we have $B_{F, G} [\cdot] (α) = E [\cdot (F) (ψ (G - α) (- π) Z + (- π^{2} + π^{*}) [H (G - α) - Ψ (G - α)])]$ where $Ψ (x) = \int_{- \infty}^{x} ψ (t) d t$ , $Z = \frac{\partial}{\partial u} g_{2} (U, V), π = π_{(X, U)}, and π^{*} = π_{(X, U)}^{*} .$

Proposition 4.5.

Assume the setting of Theorem 3.3, then for any continuous function with bounded derivative $ψ : R \mapsto [0, \infty)$ satisfying $\int_{R} ψ (t) d t = 1$ and for all $α \in R$ , we have $B_{F, G} [\cdot] (α) = E [\cdot (F) (ψ (G - α) (- π) Z + (- π^{2} + π^{*}) [H (G - α) - Ψ (G - α)])]$ where $Ψ (x) = \int_{- \infty}^{x} ψ (t) d t$ , $Z = 1, π = δ^{(0)} (u), and π^{*} = \int_{0}^{T} u_{r} D_{r}^{(0)} δ^{(0)} (u) d r .$

The optimal localizing functions minimize the integrated variance (4.5) $\tilde{I} (ψ) = \int_{R} E [\cdot^{2} (F) {(ψ (G - α) (- π) Z + (- π^{2} + π^{*}) [H (G - α) - Ψ (G - α)])}^{2}] d α .$ (4.5)

Proposition 4.6.

The infimum of the optimization problem $\inf_{ψ \in L^{1}} \tilde{I} (ψ)$ , with $\tilde{I} (ψ)$ given by (4.5), where $H (x) = 1_{{x \geq 0}}$ , is reached at $\tilde{ψ}$ , where $\tilde{ψ}$ is the probability density of the Laplace distribution with parameter $\tilde{λ}$ , i.e. for all $t \in R, \tilde{ψ} (t) = \frac{\tilde{λ}}{2} e^{- \tilde{λ} | t |}$ , where $\tilde{λ} = {(\frac{E [\cdot^{2} (F) {(- π^{2} + π^{*})}^{2}]}{E [\cdot^{2} (F) π^{2} Z^{2}]})}^{\frac{1}{2}} .$

4.2. Control variate

Another approach to obtain variance reduction (besides localization) is to include a control variate, see, e.g., [Citation19, Section 4.1]. The advantage of adding a control variate is to use the observed error in estimating a known quantity to adjust an estimator for an unknown quantity. In case of American option pricing, the control variate can be the corresponding European option price. The price of the American and respectively the European option with maturity T and payoff function $Φ$ , on an asset with value α at time t is denoted by $P (t, α)$ , respectively $P^{Eu} (t, α)$ . Let us define the function $P_{γ} (t, α) : = P (t, α) - γ P^{Eu} (t, α)$ , for a real number γ close to 1. Then it holds that $P_{γ} (t, α) = \sup_{τ \in T_{t, T}} E_{t, α} [e^{- \int_{t}^{τ} r_{u} d u} {Φ (S_{τ}) - γ P^{Eu} (τ, S_{τ})}],$ where $T_{t, T}$ denotes the set of all stopping times in $[t, T]$ . The price of the American option at time 0 is given by $P (0, s_{0}) = P_{γ} (0, s_{0}) + γ P^{Eu} (0, s_{0})$ and its delta equals $Δ (0, s_{0}) = Δ_{γ} (0, s_{0}) + γ Δ^{Eu} (0, s_{0})$ . We rewrite this formula for the American option price as $P (0, s_{0}) = \sup_{τ \in T_{0, T}} E [e^{- \int_{0}^{τ} r_{u} d u} Φ (S_{τ}) - γ {e^{- \int_{0}^{τ} r_{u} d u} P^{Eu} (τ, S_{τ}) - P^{Eu} (0, s_{0})}] .$

From this expression, the advantage of adding a control variate is clear. Indeed, the error between $P^{Eu} (0, s_{0})$ and an estimation of $E [e^{- \int_{0}^{τ} r_{u} d u} P^{Eu} (τ, S_{τ})]$ for each $τ \in T_{0, T}$ is used to adjust the estimation of the American option price $P (0, s_{0}) = \sup_{τ \in T_{0, T}} E [e^{- \int_{0}^{τ} r_{u} d u} Φ (S_{τ})]$ .

Example 4.7.

(Merton model). The European option price in the Merton model is derived in Merton [Citation21]. In the setting described in paragraph 2.2.1, where the price process is as described in (2.6), the European put option price is given by the series (4.6) $P^{Eu} (t, S_{t}) = P^{Me} (t, S_{t}) = \sum_{n = 0}^{\infty} \frac{e^{- μ (T - t)} {(μ (T - t))}^{n}}{n!} P_{n}^{BS} (t, S_{t}) .$ (4.6)

Herein μ is the jump intensity of the Poisson process N introduced in (2.6) and $P_{n}^{BS} (t, S_{t})$ is the Black-Scholes price of the European put option with the same maturity, strike, and interest rate r, and where the underlying stock price process has variance $v_{n}^{2} = β^{2} + n δ^{2} / 2$ . The first 20 terms in the series are sufficient for a good approximation for the put option price.

5. Numerical experiments

In this section, we apply our results to estimate the price and delta of American options at time zero. We illustrate our methods with numerical results in a specified jump-diffusion model.

5.1. Algorithm to estimate prices and deltas of American options

American options can be executed at any time prior to maturity. Since it is practically impossible to observe the possibility to execute the option at infinitely many times, an American option is often approximated by a Bermudan option with the same maturity and payoff function. To obtain this approximation, the time interval $[0, T]$ is discretised into n time periods with step size $ε = T / n$ . The Bermudan option can then be executed at the n discrete times $i T / n, i = 1, \dots, n$ . When the number of time periods increases, the Bermudan option converges to the American option (see Bally and Pagès [Citation13]). Bermudan options can be priced through a Bellman dynamic programing principle, see Bellman [Citation41] and Bally et al. [Citation39]. Let $Φ$ denote the payoff function and S the underlying stock price process with initial value s₀. Then the price of the Bermudan option $P (0, s_{0})$ follows from the recursive computations (5.1) $\begin{matrix} P (n ε, S_{n ε}) = Φ (S_{n ε}) = Φ (S_{T}), \\ P (k ε, S_{k ε}) = \max {Φ (S_{k ε}), e^{- r ε} E [P ((k + 1) ε, S_{(k + 1) ε}) | S_{k ε}]}, k = n - 1, \dots, 1, 0. \end{matrix}$ (5.1)

The sensitivity of the option price with respect to the initial value of the underlying asset, i.e. the delta of the option $Δ (0, s_{0}) : = \partial_{s_{0}} P (0, s_{0})$ , can be derived as follows $\begin{matrix} Δ (ε, S_{ε}) = {\begin{matrix} e^{- r ε} \partial_{α} E [P (2 ε, S_{2 ε}) | S_{ε} = α] |_{α = S_{ε}} & if P (ε, S_{ε}) > Φ (S_{ε}), \\ \partial_{α} Φ (α) |_{α = S_{ε}} & if P (ε, S_{ε}) = Φ (S_{ε}), \end{matrix} \\ Δ (0, s_{0}) = e^{- r ε} E [Δ (ε, S_{ε}) | S_{ε} = s_{0}] |_{s_{0} = S_{ε}} . \end{matrix}$

Hence to obtain a numerical estimation of the price and the delta at time zero, we proceed by estimating the prices and the deltas recursively and backwards in time. For estimations based on simulated values for the underlying stock price, one can simulate the number of required paths at the discrete time points and store them all before performing the recursive computations. On the other hand, since the pricing program and computations of the deltas go backwards in time, it is more convenient to simulate the stock price process simultaneously. Simulating the stock price process backwards in time too leads to more efficiency concerning memory capacity.

Remark 5.1.

(Nested simulations). To estimate the values for $P (k ε, S_{k ε})$ for $k \in {0, \dots, n - 1}$ , the estimation of the conditional expectation requires the simulation of N values of $S_{(k + 1) ε}$ starting from the value of $S_{k ε}$ . In total this results into N^k used simulated values for $S_{k ε}$ for any k. The Bellman dynamic programing principle requires nested simulations to estimate a function of conditional expectations similarly to the compound option case in [Citation19, Example 1.1.3] and illustrated in Figure 1.3 of that book. However, by the use of the obtained representations, the number of simulated values can be considerably reduced. The resulting algorithm needs one single set of simulated paths. For any $k \in {0, \dots, n - 1}$ only N simulated values for $S_{k ε}$ will be required.

5.2. Implementations for American put options in a Merton model

We consider American put options on a stock price process S defined by the Merton model (2.6). The put payoff function equals $Φ (x) = {(K - x)}^{+}$ . Since we want to compare our results to the analysis in Amin [Citation42], we use the parameter setting as in his paper. That explains our choice of this specific connection between the jump mean and jump variance. This simplifies the Merton formula (4.6).

5.2.1. Representations

The conditional expectations and their derivatives in (5.1) and (5.2) can be estimated based on the representations we developed in the previous sections. In particular, in the present Merton setting, the representations are presented in paragraphs 2.2.1 and 3.2.2. Throughout this section we consider $H (x) = 1_{{x \geq 0}}$ .

The regular expectations appearing in the representations (2.7), (2.8), (3.18), and (3.19) can easily be estimated by a Monte Carlo simulation. For example, consider the estimation of the numerator of representation (2.7). We require N simulated values of S_t, S_s, and $π_{s, t}$ , belonging to the same path. If we denote the j-th simulated values by $S_{t}^{j}, S_{s}^{j}$ , and $π_{s, t}^{j}$ , then we approximate (5.3) $E [\cdot (S_{t}) H (S_{s} - α) π_{s, t}] \approx \frac{1}{N} \sum_{j = 1}^{N} \cdot (S_{t}^{j}) H (S_{s}^{j} - α) π_{s, t}^{j} .$ (5.3)

5.2.2. Variance reduction techniques

As discussed in paragraph 4.1, we include the localizing technique. The estimation (5.3) is then replaced by $E [\cdot (S_{t}) H (S_{s} - α) π_{s, t}] \approx \frac{1}{N} \sum_{j = 1}^{N} \cdot (S_{t}^{j}) (ψ (S_{s}^{j} - α) Z_{s}^{j} + π_{s, t}^{j} [H (S_{s}^{j} - α) - Ψ (S_{s}^{j} - α)]),$ where Z_s equals S_s in case of the CDM and 1 in case of the MM. The functions ψ and $Ψ$ are defined by Proposition 4.3.

On the other hand, we include a control variate, see subsection 4.2. For the estimation of the American option price $P (0, s_{0})$ and delta $Δ (0, s_{0})$ , we include the European option as a control variate. In the current setting, the European option price and delta can be obtained through Merton’s approach, see Example 4.7. Consider the algorithm for the price of Bermudan options (5.1). To introduce the control variate, we proceed in two steps. First we replace the put payoff function at each time $k ε, k = 1, \dots, n$ , by $Φ_{γ} (k ε, S_{k ε}) = Φ (S_{k ε}) - γ P^{Me} (k ε, S_{k ε}),$ where γ is a real number close to one and $P^{Me} (k ε, S_{k ε})$ denotes the European option price, obtained through Merton’s approach, at time $k ε$ . Secondly, in the last step (k = 0) we add $γ P^{Me} (0, s_{0})$ (respectively $γ Δ^{Me} (0, s_{0})$ ) to obtain the American option price $P (0, s_{0})$ (respectively the American option delta $Δ (0, s_{0})$ ).

5.2.3. Backward simulation

As remarked at the end of subsection 5.1 the algorithm for the pricing of a Bermudan option goes backwards in time and we can simulate the different stochastic variables backwards in time too. For the Brownian motion we base the backward simulation on a Brownian bridge (see Bally et al. [Citation39]). To simulate the compound Poisson process backwards in time, we base our method on results of Karatzas and Shreve [Citation43] and Baldeaux [Citation20]. We split the simulation of a compound Poisson process in the simulation of a Poisson process and in the simulation of the sum of the jump sizes. First we mention the following proposition implying a backward simulation algorithm for a Poisson process. This is covered by [Citation20, Lemma 3.1].

Proposition 5.2.

Let N be a Poisson process with intensity μ. For any time t > 0 it holds that N_t has a $Poisson (μ t)$ distribution. Moreover for any $0 < s < t$ it holds that N_s, conditioned on N_t = z, follows a $Binomial (z, s / t)$ distribution.

Secondly we present the following proposition considering the (conditional) distribution of sums of independent and identically normal distributed variables. This result is a consequence of Brownian bridges, see Karatzas and Shreve [Citation43].

Proposition 5.3.

Consider the following sum $C^{(k)} = \sum_{i = 1}^{k} Y_{i},$ where Y_i are i.i.d. $N (η, ν)$ . For any k > 0 it holds that $C^{(k)}$ has a $N (k η, k ν)$ distribution. Moreover for any $0 < j < k$ it holds that $C^{(j)}$ , given that $C^{(k)} = z$ , has a $N ((j / k) z, (j / k) (k - j) ν)$ distribution.

The backward simulation technique is interesting in numerical applications and following Baldeaux [Citation20], this technique can also be derived for the Kou model, see Kou [Citation44].

5.2.4. Reduction of computational effort

Bouchard and Warin [Citation22] observed that the computational effort to estimate the American option prices by a MM is reduced by sorting the estimated stock prices. Consider the Bermudan dynamic programing algorithm (5.1). For a fixed k in ${n - 1, \dots, 1}$ we estimate the conditional expectations for $q = 1, \dots, N$ by our representations, including localization, as follows $E [P ((k + 1) ε, S_{(k + 1) ε}) | S_{k ε} = S_{k ε}^{(q)}] \approx \frac{J [P_{(k + 1) ε}] (S_{k ε}^{(q)})}{J [1] (S_{k ε}^{(q)})},$ where $J [\cdot] (S_{k ε}^{(q)}) = \frac{1}{N} \sum_{j = 1}^{N} \cdot^{(j)} (ψ (S_{k ε}^{(j)} - S_{k ε}^{(q)}) Z_{k}^{(j)} + π_{k}^{(j)} (H (S_{k ε}^{(j)} - S_{k ε}^{(q)}) - Ψ (S_{k ε}^{(j)} - S_{k ε}^{(q)}))) .$

If we consider the exponential localizing function (4.4), then it holds that $J [\cdot] (S_{k ε}^{(q)}) = \frac{1}{N} \sum_{j = 1}^{N} \cdot^{(j)} H (S_{k ε}^{(j)} - S_{k ε}^{(q)}) e^{λ^{*} S_{k ε}^{(q)}} e^{- λ^{*} S_{k ε}^{(j)}} (λ^{*} Z_{k}^{(j)} + π_{k}^{(j)}) .$

Now let us sort the simulated paths such that the values $S_{k ε}^{(q)}$ increase for q going from 1 to N and let us indicate this by the superscript s, say $S_{k ε}^{s, (q)}$ . Then we write for each q $\begin{matrix} E [P ((k + 1) ε, S_{(k + 1) ε}) | S_{k ε} = S_{k ε}^{s, (q)}] \\ = \frac{e^{λ_{P}^{*} S_{k ε}^{s, (q)}} \sum_{j = q}^{N} P ((k + 1) ε, S_{(k + 1) ε}^{s, (j)}) e^{- λ_{P}^{*} S_{k ε}^{s, (j)}} (λ_{P}^{*} Z_{k}^{s, (j)} + π_{k}^{s, (j)})}{e^{λ_{1}^{*} S_{k ε}^{s, (q)}} \sum_{j = q}^{N} e^{- λ_{1}^{*} S_{k ε}^{s, (j)}} (λ_{1}^{*} Z_{k}^{s, (j)} + π_{k}^{s, (j)})} . \end{matrix}$

Thus for q going from N to 1, the sums in the numerator and denominator get only one additional term. Hence to estimate $E [P ((k + 1) ε, S_{(k + 1) ε}) | S_{k ε} = S_{k ε}^{s, (q)}]$ for each q, we make use of the previously performed computations for q + 1.

5.3. Numerical results for the Merton model

In this subsection, we present the numerical results obtained via our representations in the context of Bermudan options. We compare our results for the prices to those reported by Amin [Citation42] and to the regression based method introduced by Longstaff and Schwartz [Citation8]. For the deltas, we compare our results to those of Hilliard and Schwartz [Citation45]. To evaluate the accuracy of our representations, we consider European options since there are analytic formulas at hand in the Merton model, whereas there are non for Bermudan and American options. To overcome this problem, we compute confidence intervals for the prices as discussed in Section 3.1 in Bouchard and Warin [Citation22].

The following parameter set for a put option on the underlying stock price process S is used, (5.4) $\begin{matrix} S modelled by (2.6) : & s_{0} = 40, r = 0.08, β^{2} = 0.05, μ = 5, δ^{2} = 0.05, \\ put option : & T = 1, K \in {30, 35, 40, 45, 50} . \end{matrix}$ (5.4)

5.3.1. Accuracy of the method

European options may only be executed at time of maturity T. However, they can be traded at any moment between time 0 and T. Consider the risk-free interest rate r and the underlying stock price process S, then the price at time t > 0 of a European option with payoff function $Φ$ equals (5.5) $P^{Eu} (t, α) = e^{- r (T - t)} E [Φ (S_{T}) | S_{t} = α] .$ (5.5)

The delta at time t equals (5.6) $Δ^{Eu} (t, α) = e^{- r (T - t)} \frac{\partial}{\partial α} E [Φ (S_{T}) | S_{t} = α] .$ (5.6)

The conditional expectations and their derivatives appearing in (5.5) and (5.6) are estimated following the techniques described in subsection 5.2 (except the control variate). As an example, we estimate the prices $P^{Eu} (t, α)$ and deltas $Δ^{Eu} (t, α)$ of a European put option with maturity T = 1 and strike K = 45 on the underlying S described in (5.4), at times $t \in {0.1, 0.2, \dots, 0.9}$ and for $α \in {35, 36, \dots, 45}$ . We do not consider European option prices or deltas at time zero since they do not involve conditional expectations. The estimation of the prices or deltas based on the CDM or MM approach includes the localizing technique. Each estimate results from the same set of $N = 5000000$ simulated paths. In , we present the CDM and the MM estimates for the option prices for $α \in {35, 40, 42}$ . We also report the relative errors in percentages to the Merton option prices, see Example 4.7. Similar results were obtained for the other values of $α \in {35, 36, \dots, 45}$ . shows the corresponding results for the deltas. It turns out that the relative errors when comparing our approach to the one of Merton [Citation21] are very small. Hence the algorithm we developed based on our representations is accurate for European options.

Table 1. Estimates of European put option prices $P^{Eu} (s, α)$ (5.5) via the CDM and MM approach, with relative errors to the Merton prices in percentages.

Display Table

Table 2. Estimates of European put option deltas $Δ^{Eu} (s, α)$ (5.6) via the CDM and MM approach, with relative errors to the Merton deltas in percentages.

Display Table

5.3.2. Results for Bermudan option prices and deltas

We consider a Bermudan put option on the stock price process S with parameters given in (5.4), the strike is fixed at K = 45. Amin [Citation42] and Hilliard and Schwartz [Citation45] developed a tree method to estimate Bermudan and American option prices. In the current setting their estimate for the option price equals 9.954. The Merton European option price at time zero equals 9.422. We choose n = 10. The dynamic programing algorithm presented in subsection 5.1 and our representations are used to estimate $P (0, s_{0})$ and $Δ (0, s_{0})$ .

and illustrate the influence of the variance reduction techniques on the estimates for the price. The graphs on the right hand side are obtained by zooming in on the left graphs. Notice the difference in scale between the left and the right graph on the vertical axis. For $N = 250 i, i = 1, \dots, 60$ , we simulated N paths of the underlying at the discrete time points jT/n, $j = 1, \dots, n$ , and we estimated the option price at time zero through the CDM and the MM, with and without control variate and optimal localization technique. In case the European option is included as a control variate, we put $γ = 0.9$ .

Figure 1. Estimates for the Bermudan put option price obtained through the CDM and MM representations without control variate and without localization technique, against the number of simulated paths. In the right graph the vertical axis is restricted to $[0, 100]$ .

The variance reduction techniques have a remarkable improvement on the results obtained via the CDM and MM approaches. It appears that the CDM results show some more variation than the MM results.

presents the estimated prices of the Bermudan put option with strikes 30, 35, 40, 45, and 50, obtained through the sorted CDM and MM approach including the control variate and the exponential localization function. For these estimates, a time discretisation is performed for n = 10 and $100000$ paths were simulated. We include the estimates for the prices obtained respectively by Amin [Citation42] and by using a regression based method with control variate as in Longstaff and Schwartz [Citation8].

Table 3. Estimates of Bermudan put option prices and confidence intervals for parameter set (5.4), obtained through the sorted CDM and MM approach with control variate and exponential localization and by the Longstaff–Schwartz method [Citation8] with a control variate (LSM). n = 10 and $N = 100000$ .

Display Table

To evaluate the accuracy of our method, we include confidence intervals for the prices. The computation of these confidence intervals is discussed in [Citation22, Section 3.1]. In our case, we include two confidence intervals which we compute using respectively the CDM and the MM. It turns out that the prices we find lay within the confidence intervals and thus they are in an acceptable range.

As described in paragraph 5.2.4, the computational effort is reduced when we perform a sorted algorithm. presents the CPU time in seconds for the LSM, CDM, and MM, respectively. We do not compare the time to Amin [Citation42] and Hilliard and Schwartz [Citation45] methods since there is no clear indication about how long their algorithms take. It turns out that when considering 10 time steps, our algorithms are comparably fast to the regression based algorithm. As analyzed in Bouchard and Warin [Citation22], the complexity of our algorithms for both the MM and the CDM, when considering the sorted localizing method is of order $O (N ln {(N)}^{(d - 1) \lor 1})$ , where d is the dimension of the underlying factor. The complexity of the LSM method depends on the choice of the basis of polynomials as well as the number and power of these polynomials. However, this is not the aim of the study of this paper and we refer to many articles in the literature that studied this issue for LSM. In, e.g. the article by Zhou [Citation46], the author investigated whether there exists an optimal regression complexity in the LSM framework for options pricing. See also the analysis in Bouchard and Warin [Citation22] and in the thesis of Plavsic [Citation47]. Notice that in this latter thesis, the author pointed out that the computational time for LSM is proportional to the average number of in-the-money paths (i.e. the option’s moneyness). That explains why for increasing values of the strike K, the CPU time for LSM increases in .

Table 4. CPU time in seconds for the different methods to compute the estimates of Bermudan put option prices for parameter set (5.4), obtained through the sorted CDM and MM approach with control variate and exponential localization and by the Longstaff–Schwartz method [Citation8] with control variate (LSM). n = 10 and $N = 100000$ .

Display Table

presents the estimated deltas of the Bermudan put option with strikes 30, 35, 40, 45, and 50, obtained through the sorted CDM and MM approach including the control variate and the exponential localization function. For these estimates a time discretisation is performed for n = 10 and $500000$ paths were simulated.

Table 5. Estimates of Bermudan put option deltas for parameter set (5.4), obtained through the sorted CDM and MM approach with control variate and exponential localization. n = 10 and $N = 500000$ .

Display Table

We conclude that the extension that we provided to the geometric Brownian motion model observed in Bally et al. [Citation39] by adding normally distributed jumps to the driving process leads to numerical results which are in line with those found in the literature. As is the case in Bouchard and warin [Citation22], the Monte Carlo method showed to be very promising in the context of jump-diffusions and could be further improved.

5.4. Numerical results for the additive model

In this subsection, we consider the additive model (2.9) with X as described in (2.10) and Y given by (5.7) $d Y_{t} = - λ' Y_{t} d t + d L'_{t},$ (5.7) where $L'$ is a compound Poisson process with intensity μ and exponentially distributed jumps with parameter ν.

The aim is to compute the price of Bermudan options as time-discrete approximations to American options written on such models using the conditional density representation. For the numerical example to be relevant for the energy markets, we note that Benth et al. [Citation48] showed empirically that the spot model fits the Phelix Base electricity price index at the European Power Exchange (EEx) very well. In their article, they estimated the parameters in the suggested model and found the following values (5.8) $\begin{matrix} X is as in (2.10) with & λ = 0.2008, a = 13.3009, b = 8.5689, \\ Y is as in (5.7) with & λ' = 0.3333, μ = 20 / 250, ν = 0.2, \\ Put option & T = 10 / 365, K \in {5.5, 5.25, 5, 4.5} . \end{matrix}$ (5.8)

We compare our conditional density method to the regression based method as in Longstaff and Schwartz [Citation8]. represents the estimated prices of the American put option with strikes 5.5, 5.25, 5, 4.5 obtained through the sorted conditional density method including the exponential localization function. For these estimates, the number of exercise dates n = 10 and $10000$ paths were simulated. For the same parameters, we include the estimates for the prices obtained by the regression method which we denote in the table by LSM and also the cost in seconds for each method.

Table 6. Estimates of American put option prices for parameter set (5.8), obtained through the sorted CDM approach with exponential localization. n = 10 and $N = 10000$ .

Display Table

In the case, the strike K = 5 and n = 10 the LSM indicates rank deficiency. That could explain the difference in the prices given respectively by the LSM and the CDM in the third row of . This is due to the fact that the LSM algorithm for pricing American options is unstable when the time parameter is small. See for example the discussion in Mostovyi [Citation23].

The CDM with sorting as shown in is comparably fast to the LSM. It can always be applied to compute the price of the American option in contrary to the LSM.

6. Conclusion

Conditional expectations play an important role in the pricing and hedging of financial derivatives. In the literature, there exist several methods for the numerical computations of conditional expectations. One of the methods is to use Monte Carlo by rewriting conditional expectations in terms of expectations without conditioning but involving weights. This was first discussed in Fournié et al. [Citation1] in a continuous framework. In this paper we extended this latter approach to include Lévy and jump-diffusion processes. For this purpose we used two approaches: the conditional density method and the MM. We applied the developed theory to the estimation of Bermudan and American option prices and their deltas. We used a localization technique and a control variate to improve the estimation of the involved expectations. Moreover, we illustrated our results with different examples and found accurate numerical results.

As far as further investigations are concerned, one may study other choices of the weights in the representation of the conditional expectations. Notice that there are infinitely many possibilities for the weights and thus infinitely many representations of the conditional expectations. Moreover, one can study parallelization for the Monte Carlo method with Malliavin calculus to improve the numerical results as discussed in Abbas-Turki and Lapeyre [Citation18].

Finally, one can also study the MM for the computation of conditional expectations where conditioning is considered w.r.t. two random variables. One might think for example of the hedging of Asian options or spread options. In this context, the Longstaff–Schwartz method is not suited since the regression with two-dimensional explanatory data becomes numerically involved. The MM can be promising in this case.

Disclosure statement

No potential conflict of interest was reported by the authors.

Additional information

Funding

The financial support from the Agency for Innovation by Science and Technology in Flanders (IWT, grant number 111262) is gratefully acknowledged by Catherine Daveloose. Asma Khedher thanks the KPMG Center of Excellence in Risk Management for the financial support. Part of the research by Asma Khedher and Michèle Vanmaele was carried out during their stay at the CAS - Centre of Advanced Study of the Norwegian Academy of Science and Letter with the support from the research program SEFE. Michèle Vanmaele also acknowledges the Research Foundation Flanders (FWO) and the Special Research Fund (BOF) of the Ghent University for providing the possibility to go on sabbatical leave to CAS.

References

Fournié, E., Lasry, J.-M., Lebuchoux, J., Lions, P.-L. (2001). Applications of Malliavin calculus to Monte Carlo methods in finance II. Finance Stochast. 5(2):201–236. DOI: 10.1007/PL00013529.
Google Scholar
Montero, M., Kohatsu-Higa, A. (2003). Malliavin calculus applied to finance. Physica A Statis. Mech. Appl. 320:548–570. DOI: 10.1016/S0378-4371(02)01531-5.
Web of Science ®Google Scholar
Nualart, D. (2006). The Malliavin Calculus and Related Topics. Berlin, Germany: Springer.
Google Scholar
Di Nunno, G., Meyer-Brandis, T., Øksendal, B., Proske, F. (2005). Malliavin calculus and anticipative itô formulae for lévy processes. Infin. Dimens. Anal. Quantum Probab. Relat. Top. 8(2):235–258. DOI: 10.1142/S0219025705001950.
Web of Science ®Google Scholar
von Sydow, L., Josef Höök, L., Larsson, E., Lindström, E., Milovanović, S., Persson, J., Shcherbakov, V., Shpolyanskiy, Y., Sirén, S., Toivanen, J., et al., Zhao, Y. (2015). BENCHOP-the BENCH marking project in option pricing. Int. J. Comput. Math. 92(12):2361–2379. DOI: 10.1080/00207160.2015.1072172.
Web of Science ®Google Scholar
Broadie, M., Glasserman, P. (1997). Pricing American-style securities using simulation. J. Econ. Dynam. Control 21(8-9):1323–1352. DOI: 10.1016/S0165-1889(97)00029-8.
Web of Science ®Google Scholar
Barraquand, J., Martineau, D. (1995). Numerical valuation of high dimensional multivariate American securities. J. Finance Quantitative Anal. 30(3):383–405. DOI: 10.2307/2331347.
Web of Science ®Google Scholar
Longstaff, F. A., Schwartz, E. S. (2001). Valuing American options by simulations: a simple least squares approach. Rev. Finan. Stud. 14(1):113–148. DOI: 10.1093/rfs/14.1.113.
Web of Science ®Google Scholar
Shu, J., Gu, Y., Zheng, W. (2002). A novel numerical approach of computing American option. Int. J. Found. Comput. Sci. 13(05):685–693. DOI: 10.1142/S0129054102001394.
Google Scholar
Kovalov, P., Linetsky, V., Marcozzi, M. (2007). Pricing multi-asset American options: a finite element method-of-lines with smooth penalty. J. Sci. Comput. 33(3):209–237. DOI: 10.1007/s10915-007-9150-z.
Web of Science ®Google Scholar
Zhang, B., Oosterlee, C. W. (2012). Fourier cosine expansions and put–call relations for Bermudan options. In: Carmona, A. R., Del Moral, P., Hu, P., and Oudjane, N., (eds.). Numerical Methods in Finance: Bordeaux. Berlin: Springer, pp. 323–350.
Google Scholar
Mrad, M., Touzi, N., Zeghal, A. (2006). Monte Carlo estimation of a joint density using Malliavin calculus, and application to American options. Comput. Econ. 27(4):497–531. DOI: 10.1007/s10614-005-9005-3.
Google Scholar
Bally, V., Pagès, G. (2003). Error analysis of the quantization algorithm for obstacle problems. Stochastic Process. Appl. 106(1):1–40. DOI: 10.1016/S0304-4149(03)00026-7.
Web of Science ®Google Scholar
Benth, F. E., Di Nunno, G., Khedher, A. (2010). Lévy models robustness and sensitivity. In QP-PQ: Quantum Probability and White Noise Analysis, Proceedings of the 29th Conference in Hammamet, Tunisia, 1318 October 2008. H. Ouerdiane and A Barhoumi (eds.). 25:153–184.
Google Scholar
Petrou, E. (2008). Malliavin calculus in lévy spaces and applications to finance. Electron. J. Probab. 13(0):852–879. DOI: 10.1214/EJP.v13-502.
Google Scholar
Davis, M. H. A., Johansson, M. P. (2006). Malliavin Monte Carlo Greeks for jump-diffusions. Stochastic Process. Appl. 116(1):101–129. DOI: 10.1016/j.spa.2005.08.002.
Web of Science ®Google Scholar
Benth, F. E., Di Nunno, G., Khedher, A. (2011). Robustness of option prices and their deltas in markets modelled by jump-diffusions. Commun. Stochastic Anal. 5(2):285–307.
Google Scholar
Abbas-Turki, L. A., Lapeyre, B. (2012). American options by Malliavin calculus and nonparametric variance and bias reduction methods. SIAM J. Finan. Math. 3(1):479–510. DOI: 10.1137/11083890X.
Web of Science ®Google Scholar
Glasserman, P. (2004). Monte Carlo Methods in Financial Engineering. Berlin: Springer.
Google Scholar
Baldeaux, J. (2008). Quasi-Monte Carlo methods for the Kou model. Monte Carlo Methods Appl. 14(4):281–302.
Google Scholar
Merton, R. (1976). Option pricing when underlying stock returns are discontinuous. J. Finan. Econ. 3(1-2):125–144. DOI: 10.1016/0304-405X(76)90022-2.
Web of Science ®Google Scholar
Bouchard, B., Warin, X. (2012). Monte-Carlo valuation of American options: Facts and new algorithms to improve existing methods. Num. Meth. Finan. Springer Proceedings in Mathematics 12:215–255.
Google Scholar
Mostovyi, O. (2013). On the stability the least squares Monte Carlo. Optim. Lett. 7(2):259–265. DOI: 10.1007/s11590-011-0414-z.
Web of Science ®Google Scholar
Protter, P. (2005). Stochastic Integration and Differential Equations. 2nd edn, Version 2.1. No. 21 in Stochastic Modelling and Applied Probability. Berlin: Springer.
Google Scholar
Raju, C. K. (1982). Products and compositions with the Dirac Delta function. J. Phys. A Math. Gen. 15(2):381–396. DOI: 10.1088/0305-4470/15/2/011.
Google Scholar
Daveloose, C. (2016). Essays on Pricing and Hedging in Markets with Imperfections. Ph.D. thesis, Ghent University.
Google Scholar
Benth, F. E., Kallsen, J., Meyer-Brandis, T. (2007). A non-Gaussian Ornstein-Uhlenbeck process for electricity spot price modeling and derivatives pricing. Appl. Math. Finance 14(2):153–169. DOI: 10.1080/13504860600725031.
Google Scholar
Sato, K.-I. (1999). Lévy Processes and Infinitely Divisible Distributions. No. 68 in Cambridge Studies in Advanced Mathematics. Cambridge: Cambridge University Press.
Google Scholar
Barndorff-Nielsen, O. E., Shephard, N. (2001). Modelling by Lévy processes for financial econometrics. Barndorff-Nielsen, O., Mikosch, T., and Resnick, S. (eds.), Lévy Processes: Theory and Applications. Basel, Switzerland: Birkhäuser, pp. 283–318.
Google Scholar
Brewer, D. W., Tubbs, J. D., Smith, O. E. (1987). A differential equations approach to the modal location for a family of bivariate gamma distributions. J. Multivariate Anal. 21(1):53–66. DOI: 10.1016/0047-259X(87)90098-4.
Web of Science ®Google Scholar
Bowman, F. (1958). Introduction to Bessel Functions. New York: Dover Publications Inc.
Google Scholar
Solé, J. L., Utzet, F., Vives, J. (2007). Canonical Lévy process and Malliavin calculus. Stochastic Process. Appl. 117(2):165–187. DOI: 10.1016/j.spa.2006.06.006.
Web of Science ®Google Scholar
Ikeda, N., Watanabe, S. (1981). Stochastic Differential Equations and Diffusion Processes. New York: Elsevier Science.
Google Scholar
Bates, D. (1996). Jumps and stochastic volatility: exchange rate processes implicit in Deutsche Mark options. Rev. Finan. Stud. 9(1):69–107. DOI: 10.1093/rfs/9.1.69.
Web of Science ®Google Scholar
Barndorff-Nielsen, O. E., Shephard, N. (2002). Econometric analysis of realized volatility and its use in estimating stochastic volatility models. J. Royal Stat. Soc. B 64(2):253–280. DOI: 10.1111/1467-9868.00336.
Google Scholar
Martynov, M., Rozanova, O. (2013). On dependence of volatility on return for stochastic volatility models. Stochastics 85(5):917–927. DOI: 10.1080/17442508.2012.673616.
Google Scholar
van der Stoep, A. W., Grzelak, L. A., Oosterlee, C. W. (2014). The Heston stochastic-local volatility model: efficient Monte Carlo simulation. Int. J. Theor. Appl. Finan. 17(7):14050045.
Google Scholar
Moreni, N. (2004). A variance reduction technique for American option pricing. Physica A Statis. Mech. Appl. 338(1-2):292–295. DOI: 10.1016/j.physa.2004.02.055.
Web of Science ®Google Scholar
Bally, V., Caramellino, L., Zanette, A. (2005). Pricing American options by Monte Carlo methods using a Malliavin calculus approach. Monte Carlo Methods Appl. 11:97–133.
Google Scholar
Kohatsu-Higa, A., Pettersson, R. (2002). Variance reduction methods for simulation of densities on wiener space. SIAM J. Numer. Anal. 40(2):431–450. DOI: 10.1137/S0036142901385507.
Web of Science ®Google Scholar
Bellman, R. (1957). Dynamic Programming. Princeton, NJ: Princeton University Press.
Google Scholar
Amin, K. I. (1993). Jump diffusion option valuation in discrete time. J. Finan. 48(5):1833–1863. DOI: 10.2307/2329069.
Web of Science ®Google Scholar
Karatzas, I., Shreve, S. (1991). Brownian Motion and Stochastic Calculus. Berlin: Springer.
Google Scholar
Kou, S. (2002). A jump-diffusion model for option pricing. Manag. Sci. 48(8):1086–1101. DOI: 10.1287/mnsc.48.8.1086.166.
Web of Science ®Google Scholar
Hilliard, J., Schwartz, A. (2005). Pricing European and American derivatives under a jump-diffusion process: a bivariate tree approach. J. Finan. Quant. Anal. 40(03):671–691.
Google Scholar
Zhou, Y. (2004). On the existence of an optimal regression complexity in the Least-Squares Monte Carlo (LSM) framework for options pricing. 39th Actuarial Research Conference, Iowa Memorial Union, The University of Iowa, Iowa City, April.
Google Scholar
Plavsic, M. (2011). Pricing American options - aspects of computation. Ph.D. thesis, Imperial College London.
Google Scholar
Benth, F. E., Kallsen, J., Meyer‐Brandis, T. (2007). A critical empirical study of two electricity spot price modeling and derivatives pricing. Appl. Math. Finance 14(2):153–169. DOI: 10.1080/13504860600725031.
Google Scholar

Appendix A. Proof in Section 2

Proof of Theorem 2.2. We consider the following approximation

\begin{matrix} E [f (F) | G = α] = \lim_{ε \to 0 +} E [f (F) | G \in] α - ε, α + ε [] \\ = \lim_{ε \to 0 +} \frac{E [f (F) 1_{] - ε, ε [} (G - α)]}{E [1_{] - ε, ε [} (G - α)]}, \end{matrix}

and know by the conditional density method and Assumptions 2.1(1) that

\begin{matrix} E [f (F) 1_{] - ε, ε [} (G - α)] = E [E [f (g_{1} (X, Y)) 1_{] - ε, ε [} (g_{2} (U, V) - α) | σ (Y, V)]] \\ = E [\int_{R^{2}} f (g_{1} (x, Y)) 1_{] - ε, ε [} (g_{2} (u, V) - α) p_{(X, U)} (x, u) dxdu] . \end{matrix}

We introduce two functions $\begin{matrix} Φ (x; ε, α, h) : = 1_{] - ε, ε [} (h^{- 1} (x) - α), \\ Ψ (x; ε, α, h) : = \int_{- \infty}^{x} Φ (z; ε, α, h) d z + 2 ε c', where c' = c h' (α) . \end{matrix}$

By relation (2.2) and integration by parts we have $\begin{matrix} \int_{R} 1_{] - ε, ε [} (g_{2} (u, V) - α) p_{(X, U)} (x, u) d u \\ = \int_{R} 1_{] - ε, ε [} (h^{- 1} (u + g^{*} (V)) - α) p_{(X, U)} (x, u) d u \\ = \int_{R} Φ (u + g^{*} (V); ε, α, h) p_{(X, U)} (x, u) d u \\ = - \int_{R} Ψ (u + g^{*} (V); ε, α, h) \frac{\partial}{\partial u} p_{(X, U)} (x, u) d u . \end{matrix}$

Combining the previous steps leads to the observation that (A.1) $E [f (F) | G = α] = \lim_{ε \to 0 +} \frac{E [\int_{R^{2}} f (g_{1} (x, Y)) \frac{1}{2 ε} Ψ (u + g^{*} (V); ε, α, h) (- \frac{\partial}{\partial u} p_{(X, U)} (x, u)) dxdu]}{E [\int_{R^{2}} \frac{1}{2 ε} Ψ (u + g^{*} (V); ε, α, h) (- \frac{\partial}{\partial u} p_{(X, U)} (x, u)) dxdu]} .$ (A.1)

Because of the facts that $\frac{1}{2 ε} Ψ (u + g^{*} (V); ε, α, h)$ is bounded in ε, that $E [f^{2} (F)] < \infty$ , and that $E [π_{(X, U)}^{2}] < \infty$ , the limit can be brought inside the integrals in the numerator and denominator. Thus we determine $\begin{matrix} \lim_{ε \to 0 +} \frac{1}{2 ε} Ψ (u + g^{*} (V); ε, α, h) = \lim_{ε \to 0 +} \frac{1}{2 ε} \int_{- \infty}^{u + g^{*} (V)} Φ (z; ε, α, h) d z + c' \\ = \lim_{ε \to 0 +} \frac{1}{2 ε} \int_{- \infty}^{u + g^{*} (V)} 1_{] - ε, ε [} (h^{- 1} (z) - α) d z + c' = \int_{- \infty}^{u + g^{*} (V)} δ_{0} (h^{- 1} (z) - α) d z + c' \\ = \int_{- \infty}^{u + g^{*} (V)} δ_{0} (z - h (α)) h' (α) d z + c' = h' (α) 1_{{h (α) \in] - \infty, u + g^{*} (V)]}} + c' \\ = h' (α) 1_{{α \leq h^{- 1} (u + g^{*} (V))}} + c' = h' (α) 1_{{g_{2} (u, V) - α \geq 0}} + c' = h' (α) H (g_{2} (u, V) - α) . \end{matrix}$

The numerator of (A.1) therefore equals $\begin{matrix} E [\int_{R^{2}} f (g_{1} (x, Y)) H (g_{2} (u, V) - α) {- \frac{\partial}{\partial u} log p_{(X, U)} (x, u)} p_{(X, U)} (x, u) dxdu] h' (α) \\ = E [E [f (F) H (G - α) {- \frac{\partial}{\partial u} log p_{(X, U)} (X, U)} | σ (Y, V)]] h' (α) \\ = E [f (F) H (G - α) {- \frac{\partial}{\partial u} log p_{(X, U)} (X, U)}] h' (α) . \end{matrix}$

Simiatolarly the denominator of (A.1) equals $E [H (G - α) {- \frac{\partial}{\partial u} log p_{(X, U)} (X, U)}] h' (α) .$

This concludes the proof of Theorem 2.2. □

Appendix B. Malliavin calculus

In this article, we make use of the Malliavin calculus as defined in Petrou [Citation15]. The following properties and definitions concerning the Malliavin derivative in the direction of the Wiener process are applied.

The Malliavin derivative in the direction of the Brownian motion is denoted by $D^{(0)}$ . The space $D^{(0)}$ contains all the random variables in $L^{2} (Ω)$ that are differentiable in the Wiener direction.
Chain rule
Let $F \in D^{(0)}$ and $f \in C_{b}^{1}$ . Then it holds that $f (F) \in D^{(0)}$ and $D^{(0)} f (F) = f' (F) D^{(0)} F .$
Skorohod integral $δ^{(0)}$
Let $δ^{(0)}$ be the adjoint operator of the directional derivative $D^{(0)}$ . The operator $δ^{(0)}$ maps $L^{2} (Ω \times [0, T])$ to $L^{2} (Ω)$ . The set of processes $u \in L^{2} (Ω \times [0, T])$ such that $| E [\int_{0}^{T} u_{t} D_{t}^{(0)} F d t] | \leq C {‖ F ‖}_{L^{2} (Ω)}$ for all $F \in D^{(0)}$ , is the domain of $δ^{(0)}$ , denoted by $Dom δ^{(0)}$ . For every $u \in Dom δ^{(0)}$ we define $δ^{(0)} (u)$ as the Skorohod integral in the Wiener direction of u by $E [F δ^{(0)} (u)] = E [\int_{0}^{T} u_{t} D_{t}^{(0)} F d t],$ for any $F \in D^{(0)}$ . The equation above is called the duality formula.
Integration by parts
Let $F u \in L^{2} (Ω \times [0, T])$ , where $F \in D^{(0)}$ , and $u \in Dom δ^{(0)}$ . Then $F u \in Dom δ^{(0)}$ and $δ^{(0)} (F u) = F δ^{(0)} (u) - \int_{0}^{T} u_{t} D_{t}^{(0)} F d t$ if and only if the right hand side of the latter equation is in $L^{2} (Ω)$ .
Predictable processes
Let u be a predictable process such that $E [\int_{0}^{T} u_{t}^{2} d t] < \infty$ . Then $u \in Dom δ^{(0)}$ and the Skorohod integral coincides with the Ito-integral $δ^{(0)} (u) = \int_{0}^{T} u_{t} d W_{t} .$

Appendix C. Proofs localization

Proof of Proposition 4.1.

Adding and subtracting the same expression, keeping in mind that $ψ = Ψ'$ , we get for the numerator of representation (2.4), the following equality $\begin{matrix} E [f (F) H (G - α) π_{(X, U)}] = E [f (F) ψ (G - α) \frac{\partial}{\partial u} g_{2} (U, V)] + E [f (F) H (G - α) π_{(X, U)}] \\ - E [f (F) Ψ' (G - α) \frac{\partial}{\partial u} g_{2} (U, V)] . \end{matrix}$

The last term equals $E [f (F) Ψ' (G - α) \frac{\partial}{\partial u} g_{2} (U, V)] = E [\int_{R^{2}} f (g_{1} (x, Y)) Ψ' (g_{2} (u, V) - α) \frac{\partial}{\partial u} g_{2} (u, V) p (x, u) dxdu] .$

Using the integration by parts formula, we get $\begin{matrix} E [f (F) Ψ' (G - α) \frac{\partial}{\partial u} g_{2} (U, V)] \\ = E [- \int_{R^{2}} f (g_{1} (x, Y)) Ψ (g_{2} (u, V) - α) \frac{\partial}{\partial u} p (x, u) dxdu] \\ = E [\int_{R^{2}} f (g_{1} (x, Y)) Ψ (g_{2} (u, V) - α) (- \frac{\partial}{\partial u} log p (x, u)) p (x, u) dxdu] \\ = E [E [f (F) Ψ (G - α) π_{(X, U)} | σ (Y, V)]] = E [f (F) Ψ (G - α) π_{(X, U)}], \end{matrix}$ and the result follows.

□

Proof of Proposition 4.2.

Adding and subtracting the same expression, keeping in mind that $ψ = Ψ'$ , we get for the numerator of representation (3.5) that $\begin{matrix} E [f (F) H (G - α) δ^{(0)} (u)] = E [f (F) ψ (G - α)] + E [f (F) H (G - α) δ^{(0)} (u)] \\ - E [f (F) Ψ' (G - α)] . \end{matrix}$

Applying the same arguments as in Theorem 3.2, we show that the last term equals $E [f (F) Ψ' (G - α)] = E [f (F) Ψ (G - α) δ^{(0)} (u)]$ and the result follows.

□

Representations for conditional expectations and applications to pricing and hedging of financial products in Lévy and jump-diffusion setting

Abstract

1. Introduction

2. Conditional expectation via the conditional density method

2.1. Representation results

2.2. Examples: Jump-diffusion models

2.2.1. Jump-diffusion model

2.2.2. Additive model

3. Conditional expectation via MM

3.1. Representation results

3.2. Examples: Jump-diffusion models

3.2.1. Exponential Lévy model

3.2.2. Merton model

3.2.3. Stochastic volatility models

4. Variance reduction

4.1. Localisation

4.2. Control variate

5. Numerical experiments

5.1. Algorithm to estimate prices and deltas of American options

5.2. Implementations for American put options in a Merton model

5.2.1. Representations

5.2.2. Variance reduction techniques

5.2.3. Backward simulation

5.2.4. Reduction of computational effort

5.3. Numerical results for the Merton model

5.3.1. Accuracy of the method

Table 1. Estimates of European put option prices P Eu(s,α) (5.5) via the CDM and MM approach, with relative errors to the Merton prices in percentages.

Table 2. Estimates of European put option deltas ΔEu(s,α) (5.6) via the CDM and MM approach, with relative errors to the Merton deltas in percentages.

5.3.2. Results for Bermudan option prices and deltas

Table 3. Estimates of Bermudan put option prices and confidence intervals for parameter set (5.4), obtained through the sorted CDM and MM approach with control variate and exponential localization and by the Longstaff–Schwartz method [Citation8] with a control variate (LSM). n = 10 and N=100000.

Table 5. Estimates of Bermudan put option deltas for parameter set (5.4), obtained through the sorted CDM and MM approach with control variate and exponential localization. n = 10 and N=500000.

5.4. Numerical results for the additive model

Table 6. Estimates of American put option prices for parameter set (5.8), obtained through the sorted CDM approach with exponential localization. n = 10 and N=10000.

6. Conclusion

Disclosure statement

Additional information

Funding

References

Appendix A. Proof in Section 2

Appendix B. Malliavin calculus

Appendix C. Proofs localization

Related research

To cite this article:

Download citation

Your download is now in progress and you may close this window

Login or register to access this feature

Information for

Open access

Opportunities

Help and information

Keep up to date

Table 1. Estimates of European put option prices $P^{Eu} (s, α)$ (5.5) via the CDM and MM approach, with relative errors to the Merton prices in percentages.

Table 2. Estimates of European put option deltas $Δ^{Eu} (s, α)$ (5.6) via the CDM and MM approach, with relative errors to the Merton deltas in percentages.

Table 5. Estimates of Bermudan put option deltas for parameter set (5.4), obtained through the sorted CDM and MM approach with control variate and exponential localization. n = 10 and $N = 500000$ .

Table 6. Estimates of American put option prices for parameter set (5.8), obtained through the sorted CDM approach with exponential localization. n = 10 and $N = 10000$ .