Abstract
The computation of credit risk measures such as exposure and Credit Value Adjustments (CVA) requires the simulation of future portfolio prices. Recent metrics, such as dynamic Initial Margin (IM) and Margin Value Adjustments (MVA) additionally require the simulation of future conditional sensitivities. For portfolios with non-linear instruments that do not admit closed-form valuation formulas, this poses a significant computational challenge. This problem is addressed by proposing a static replication algorithm for interest rate options with early-exercise features under an affine term-structure model. Under the appropriate conditions, we can find an equivalent portfolio of vanilla options that replicate these products. Specifically, we decompose the product into a portfolio of European swaptions. The weights and strikes of the portfolio are obtained by regressing the target option value with interpretable, feed-forward neural networks. Once an equivalent portfolio of European swaptions is determined, we can leverage on closed-form expressions to obtain the conditional prices and sensitivities, which serve as an input to exposure and SIMM-driven MVA quantification. For a consistent forward sensitivity estimation, this involves the differentiation of the portfolio-weights. The accuracy and convergence of the method is demonstrated through several representative numerical examples, benchmarked against the established least-square Monte Carlo method.
1. Introduction
This research contributes to the quantification of risk measures related to counterparty credit risk (CCR) for interest rate derivatives that are traded over-the-counter (OTC). CCR refers to the risk that the counterparty in a bilateral agreement may default before the final settlement of every cash-flow and will fail to meet its financial obligations (Bank for International Settlements Citation2019). This type of risk has gained a lot of attention in the aftermath of the global financial crisis of 2007–08. Since then, the regulators have introduced numerous measures to mitigate systemic credit risk that have resulted in quantifying CCR through various value adjustments (xVA) (Gregory Citation2020). The family of xVAs includes CVA, which is defined as the expected loss that is incurred from a counterparty defaulting. Other xVAs quantify the cost related to funding, collateral and capital requirements. A general introduction to the challenges related to CCR quantification can be found in Zhu and Pykhtin (Citation2007).
An important means to reduce CCR for uncleared OTC derivatives is to post collateral (Gregory Citation2020). For interest rate derivatives we can in general distinguish two types of collateral: Variation Margin (VM) and Initial Margin (IM). VM matches the current value of the underlying portfolio and needs to be updated on a regular basis. The funding cost related to VM is known as FVA and is reflected by the expected exposure of the portfolio (Burgard and Kjaer Citation2013), similar to CVA. Posting VM significantly reduces the exposure to counterparty risk but does not bring it down to zero. Typically there will be a delay between the event of default and the settlement of all outstanding positions, which is known as the margin period of risk (Gregory Citation2020). IM serves as a protection against exposure changes due to market moves during this close-out period. In general, IM is supposed to cover the potential future exposure of a contract over a time-horizon of 10 business days with a 99% confidence. In that regard, the volume of IM typically reflects the 99% value-at-risk of the 10-day move in portfolio value. A portfolio's VaR varies over time and should therefore (just like VM) be updated on a regular basis.
A VaR measure is highly model-dependent and computationally intensive to quantify. To address these problems, the International Swaps and Derivatives Association (ISDA) has developed the Standard Initial Margin Model (SIMM). Since its publication in 2016, SIMM has become an industry-standard approach to IM quantification. Benefits of such a common methodology include a transparent dispute resolution and a consistent regulatory governance of collateral (International Swaps and Derivatives Association, Inc Citation2016). In short, SIMM is a sensitivity-based approach, in which VaR measures are approximated using the instrument's sensitivity to the shock of an underlying risk-factor. For the user of SIMM, this means that IM quantification reduces to the calculation of a set of bucketed portfolio sensitivities, such as Delta and Vega.
The exchange of collateral needs to be funded, which comes at a cost for the dealer. The total expected funding cost over the life-time of a derivative contract is known as the Margin Value Adjustment (MVA) (Green and Kenyon Citation2015) and is a recent addition to the xVA-collection. An ISDA margin survey reported that market participants collected over $300 billion of IM for non-cleared derivatives at year-end 2021 (International Swaps and Derivatives Association, Inc Citation2021). Due to its volume, an adequate quantification and management of IM-induced funding cost is highly relevant. Under the assumption that the size of IM is implied by SIMM and thus by sensitivities, it follows that MVA is driven by the future distribution of portfolio sensitivities over the life-time of the trade. Within a Monte Carlo simulation framework, this means sensitivities need to be calculated along the Monte Carlo path. For vanilla derivatives, which admit closed-form Greeks, the computational cost may be manageable. However, for exotic derivatives, such as options with early-exercise features, this is a demanding problem.
In this work, we present a universal approach to the computation of prices, Greeks, exposures and sensitivities along the Monte Carlo path for callable interest rate derivatives. Our focus will be on Bermudan swaptions, which is a class of exotic OTC derivatives that is heavily traded in the market. Our methodology is an extension of the work presented in Lokeshwar et al. (Citation2022) and Hoencamp et al. (Citation2023) and relies of the concept of static replication. A static replication is a portfolio of vanilla instruments, that mirrors the value of the original exotic option, until it is either exercised or matured. The portfolio composition of a static replication is constant throughout the life-time of the trade. This is in contrast with a dynamic replication, which needs to be continuously re-balanced (such as a traditional Delta hedge), or a semi-static replication, which needs to be re-balanced on a finite number of instances. The decomposition of a complex product into a portfolio of vanilla options greatly simplifies its risk analysis and allows for an efficient price and sensitivity calculation.
In the context of CVA, methodologies that facilitate efficient exposure calculations, have received a lot of attention in the recent literature. Ordinary least-square regression techniques for American-style options embedded in a simulation framework, have been popularized by Carriere (Citation1996), Tsitsiklis and Van Roy (Citation2001) and Longstaff and Schwartz (Citation2001). For exposure purposes, this regression methodology is extended by Joshi and Kwon (Citation2016) and Feng et al. (Citation2016) to reduce regression bias and noise along the Monte Carlo path. In other works, Monte Carlo methods have been combined with finite-difference estimation (De Graaf et al. Citation2014, Simaitis et al. Citation2016), the COS-method (Shen et al. Citation2013, Feng et al. Citation2016), the Stochastic Grid Bundling Method (SGBM) (Karlsson et al. Citation2016) and Chebyshev interpolation (Glau et al. Citation2021). In recent advances, machine learning has been applied to facilitate efficiency in exposure calculations. This includes a Deep Optimal Stopping algorithm to estimate the exercise strategy (Andersson and Oosterlee Citation2021) and a Deep xVA Solver, which relies on a neural network-based BSDE solver.
The publication of SIMM and the increasing significance of MVA have been incentives to investigate methodologies that facilitate efficient sensitivity calculation. For practitioners, the default approach is ‘bump-and-reval’, due to its simplicity and straight-forward implementation. For spot calculations, this is feasible, but for an application such as MVA, it would imply path-wise model re-calibration and option re-evaluation for each scenario, time-step and sensitivity-component. Thus, the computational burden would be excessive. In Glasserman (Citation2004), unbiased Monte Carlo sensitivity estimators are proposed, which yield better convergence properties than the traditional finite-difference estimator. These are known as the likelihood ratio method, which relies on differentiating the payoff probability density, and the path-wise derivative method, which relies of differentiating the payoff function itself. Yet the computational improvement of these methods can be limited for path-dependent instruments.
The application of algorithmic differentiation (AD) to the path-wise derivative method has been shown to yield a significant efficiency gain. The concept of AD is that the estimation routine is decomposed into a series of basic operations, which are differentiated by a repeated application of the chain rule (Griewank and Walther Citation2008). The differentiation is either performed in a forward iteration (tangent mode) or backward iteration (adjoint mode, AAD). See for example Giles and Glasserman (Citation2006), Capriotti (Citation2011) or Capriotti et al. (Citation2017) for an application with American-style derivatives. The tangent mode is efficient if the number of independent variables (i.e. the dimensionality of the function input) is small. It is therefore less suitable for the calculations of large gradients. In contrast, the adjoint mode is efficient if the number of dependent variables (i.e. the dimensionality of the function output) is small. A limitation of AAD, however, is that it is demanding in terms of memory and its implementation is often technical. Others have combined adjoint differentiation with LSMC (Antonov et al. Citation2017, Caspers and Lichters Citation2018), stochastic automatic differentiation (Fries Citation2019) or SGBM (Jain et al. Citation2019). Alternative approaches to path-wise sensitivity approximations include principal component analysis (Kappen Citation2017), Chebyshev interpolation (Zeron and Ruiz Citation2018) and least-square regression (Lakhany and Zhang Citation2021).
With a focus on Bermudan swaptions under an affine term-structure model, our contribution to the literature is threefold. First we extend the semi-static replication algorithm that is presented in Lokeshwar et al. (Citation2022) and Hoencamp et al. (Citation2023) to a replication approach that is truly static. The method is based on formulating the portfolio optimization as the regression of a shallow neural network to the target's option value. As a result, the trained parameters represent the weights and strikes of the portfolio composition. Second, we show that with the proper constraints on the regression, a Bermudan can be replicated with a portfolio of European swaptions. The accuracy is demonstrated through several numerical examples. Thirdly, we derive efficient estimators for sensitivities along the Monte Carlo path, which serve as an input to dynamic IM quantification. Here we exploit the fact that Deltas and Vegas of European swaptions are available in closed-form. We also show that for complete and accurate sensitivities, a differentiation of the portfolio weights needs to be incorporated. We demonstrate the performance of the algorithm through several numerical examples and convergence analyzes, benchmarking the results to the established least-square Monte Carlo method (Longstaff and Schwartz Citation2001).
The outline of the article is as follows. In section 2, we define the modeling landscape, the considered interest rate derivatives and the risk metrics of interest. In section 3, we present the regression algorithm and show how we can achieve a static replication for Bermudan swaptions. In section 4, we derive the estimators used for price and sensitivity calculation along the Monte Carlo path. Section 5 is subject to several representative numerical examples. We conclude with a summary of the results in section 6.
2. Problem formulation
A problem overview is provided in figures .
Problem overview
Initial Margin (IM) is a form of collateral, which serves as a protection against the 99% value-at-risk of the exposure change during the margin period of risk (figure ).
The industry standard approach to IM quantification is the Standard Initial Margin Model (SIMM), which takes the portfolio sensitivities as input (figure ).
The expected funding cost of posting IM is called Margin Value Adjustment (MVA). Quantifying MVA requires the simulation of future IM (i.e. portfolio sensitivities), which is a computational challenge (figure ).
We propose a method to statically replicate a callable derivative with a portfolio of vanilla options. This simplifies the computation of path-wise sensitivities and future IM (figure ).
2.1. Model and assumptions
First, we fix a finite time-horizon , on which we consider a continuous-time financial market. We assume the market is frictionless and free of arbitrage. We let the probability space represent all possible states of the economy and the filtration all information generated by the economy up to time-t.
We consider the notion of a bank account or money market account. Investments in the money market are assumed to compound a continuous, risk-free interest . We refer to as the short-rate. The time-t value of a unit of currency invested in the money market at time-zero is denoted as , and we assume it satisfies the following dynamics We denote by the risk-neutral measure equivalent to , which is associated with as the numéraire. Attainable claims denominated by the numéraire are assumed to be martingales under , which guarantees the absence of arbitrage (Harrison and Pliska Citation1981).
Throughout this paper we shall consider short-rate dynamics that are captured by an affine term-structure model, as introduced in Duffie and Kan (Citation1996). That means that the short-rate can be written as for some function . The vector denotes a Markovian system of state-variables in . The state-variables are assumed to satisfy an SDE of the form (1) (1) with denoting a d-dimensional Brownian motion under and functions and . We let and be affine functions of , satisfying the standard regularity conditions, such that admits a strong solution. Note that we restrict our scope to the Gaussian subclass of affine short-rate models. This is done to impose an intuitive relationship between model parameters and model-implied option volatilities (see section 3.5). This will facilitate the tractability of the computations; the estimation of Vega in particular. Generalizations to state-dependent diffusion coefficients should be possible but will involve the differentiation of Fourier transforms. This increases the complexity of the problem and falls outside the scope of this paper.
A zero-coupon bond is a contract that guarantees the holder one unit of currency at a pre-specified maturity date T. We will denote its time-t value as for . An important result for affine term-structure models, is that zero-coupon bond prices are exponential affine in . See for example Andersen and Piterbarg (Citation2010) and Duffie and Kan (Citation1996) for details. Next to zero-coupon bonds, we will often refer to a related quantity, known as the zero-rate. We define a zero-rate as where τ denotes the year fraction between date t and T. For simplicity, we will assume that the collateral rate used for discounting and the instantaneous rate used to derive term-rates are both implied by the same short-rate . In other words, our set-up will be a classic single-curve model environment. As term rates, we consider the classic, forward-looking LIBOR. Similarly one could consider a backward-looking, RFR-based term rate. In certain markets, the IBOR benchmark is discontinued, in favor of such set-in-arrears term rates (see Lyashenko and Mercurio Citation2019). As forward rates of forward- and backward-looking term rates are equivalent before the start of the accrual period, their treatment within this framework would mostly be the same.
2.2. Interest rate derivatives
The focus of this work will be on modeling Bermudan swaptions. Here we briefly introduce related derivatives and notation.
Interest rate swap: We consider fixed-for-floating interest rate swaps, where floating LIBOR payments are exchanged against fixed rate payments on dates . The year-fraction between and will be denoted .
European swaption: A European swaption is a contract, which gives the holder the right, but not the obligation to enter an fixed-for-floating interest rate swap with pre-specified fixed rate K at a pre-specified future inception date . Considering the annuity as numéraire, the risk-neutral value of a swaption (see Filipovic Citation2009) can be expressed as (2) (2) where the expectation above is taken under the annuity measure . The parameter δ distinguishes between a payer () and a receiver () swaption. Under the assumption that the swap rate has Gaussian dynamics, the swaption price can evaluated using Bachelier's formula (see Musiela and Rutkowski Citation1997). Given a swaption has market price , the implied normal volatility is defined as the unique scalar that solves the equation
Bermudan swaption: A Bermudan swaption is a contract, which gives the holder the right to enter a fixed-for-floating interest rate swap with maturity at any of the dates at a pre-specified fixed rate K. Should the holder of the Bermudan choose to exercise at , the payoff is given by The value of a Bermudan swaption can be represented through a dynamic programming formulation, in which the price is captured by means of a backward recursion (Glasserman Citation2004), given by where represents the continuation value of the contract.
2.3. Credit risk metrics
In this work, we will consider credit exposure and forward sensitivity profiles of Bermudan swaptions. Exposure is essential for CVA computation. Product sensitivities, Delta and Vega in particular, serve as an input to ISDA SIMM calculations for initial margin.
2.3.1. Exposure and CVA
The expected positive exposure (EPE) of a financial contract with price at time is Expected positive exposure represents the expected loss on a claim in the event of a counterparty defaulting at a given time (Gregory Citation2020). The expectation is evaluated under the risk-neutral measure . Exposure is a key ingredient in the quantification of counterparty risk at trade or counterparty level through CVA. CVA is the difference between the total value of a derivative in a market that is completed with counterparty risk and its default-free value (Burgard and Kjaer Citation2013). Let denote the default time of the counterparty. Assuming τ is independent of exposure (i.e. ignoring wrong- or right-way risk), CVA can be computed as follows Green (Citation2015) (3) (3) with RR denoting the recovery rate, LGD the loss given default and PD the probability of default of the counterparty. A common approach in the literature is to model the default time τ as an adapted stopping time using a Cox-construction (Jeanblanc and Li Citation2020). In that case, the survival function of the counterparty follows an exponential distribution, i.e. with denoting a hazard function. The hazard function can be specified in terms of a hazard rate , which yields . Therefore, the probability of default can be written as (4) (4)
2.3.2. Initial margin and MVA
An important means to reduce the exposure at risk is the exchange of collateral. In this work, we will consider initial margin (IM) and its related funding costs. IM is a protection against exposure changes during the margin period of risk, the time interval after a default when outstanding positions are not yet settled, but VM is no longer updated (Green and Kenyon Citation2015). Typically, IM reflects the 99% value-at-risk of the portfolio change over a 10-day time interval. Throughout this work, we will consider the Standard Initial Margin Model (SIMM) to be the default method for IM computations. The main take-away from SIMM–IM is that it is a sensitivity-based approach. The idea is that the response of an instrument to a risk-factor shock is efficiently approximated by multiplying the instrument's risk-factor sensitivity by the corresponding shock size (International Swaps and Derivatives Association, Inc Citation2016). We will provide a brief summary of SIMM below, but refer to International Swaps and Derivatives Association, Inc (Citation2020) for details.
Consider risk-factors and denote the 10-day risk-factor increments as . The variance of the total response of an instrument to each shock is computed as where denotes the gradient of w.r.t. and denotes the covariance matrix of . If the total response defined above, is assumed to be approximately Gaussian with mean zero, the VaR can be estimated as follows Where denotes the inverse of the standard normal CDF. For interest rate derivatives, ISDA distinguishes three relevant types of risk-factors, namely Delta-risk, Vega-risk and Curvature-risk, the latter referring to the second-order Gamma impact. The total IM is given by (5) (5) where each component represents a VaR estimation w.r.t. a type of underlying risk-factors, i.e. market interest rates and market implied volatilities. The risk-factors of each type are subdivided into twelve buckets, corresponding to the yields at the tenors . Hence for each component one needs to compute the corresponding 12-dimensional sensitivity vector, which are specified as (see also International Swaps and Derivatives Association, Inc Citation2020):
Delta | : The PV01 of the instrument w.r.t. the risk-free interest rate at tenor , i.e. . | ||||
Vega | : The sensitivity of the instrument w.r.t. volatility risk-factor , i.e. . Here, denotes the the implied at-the-money volatility of a swaption with an expiry time equal to . | ||||
Curvature | : Curvature margin represents the second-order impact of the interest rate risk-factors to the VaR estimate. In SIMM, the Gamma is approximated with a Vega-Gamma relationship. Therefore, the computation of Curvature margin takes the Vega sensitivity vector as input, similar to Vega margin. |
For SIMM–IM quantification in practice, the user only needs to compute the portfolio sensitivities. The other parameters are provided by ISDA in an annual publication.
A key difference between VM and IM is that the former is symmetric. VM reflects the value of the underlying trade and can be rehypothecated. This means the receiver can reuse the collateral, for example in the margin agreement of an opposite transaction. In contrast, IM reflects the risk of a trade and must be exchanged in both directions, without netting the amounts. The collateral is often required to be posted on a segregated account and therefore cannot be rehypothecated (Gregory Citation2020). As a result, IM must be funded by the dealer, which comes at a cost. The expected lifetime cost of posting IM against a portfolio is called Margin Value Adjustment (MVA) (Green and Kenyon Citation2015).
The size of MVA is driven by the dynamic IM profile of the portfolio and the funding spread that applies to the IM posting. As the volume of IM is dependent on the state of the economy, represents a stochastic process. The expected initial margin (EIM) of a financial contract at time is defined as EIM represents the expected volume of IM that needs to be posted by the dealer at a given time (Gregory Citation2020). With this expression at hand, MVA can be computed as follows (6) (6) with FS denoting the funding spread, which reflects the spread on the collateral rate w.r.t. the risk-free rate (Green and Kenyon Citation2015).
3. The static replication method
In this section, we describe the algorithm that is used to compose a static replication portfolio for a Bermudan swaption. The algorithm is embedded in a Monte Carlo framework and relies on a regress-later technique. See Glasserman and Yu (Citation2004) for a general introduction to regress-later. The method is strongly inspired by the work of Jain and Oosterlee (Citation2015) and extends the approach that is presented in Lokeshwar et al. (Citation2022) and Hoencamp et al. (Citation2023).
In Jain and Oosterlee (Citation2015), it is shown that early-exercise options can be priced with an application of a regress-later technique. As such, the ‘later’ option price is regressed against the ‘later’ risk-factor realization . The continuation value is subsequently estimated by evaluating the conditional expectation of the regressed option value, i.e. which can be computed in (semi) closed-form for an appropriate choice of basis-functions .
In Lokeshwar et al. (Citation2022), it is proposed to perform the regression with a shallow neural network (i.e. a feed-forward neural network with a single hidden layer). By considering the appropriate network structure, it is shown that the regression can be interpreted as a portfolio of short-maturity options. As a consequence, the continuation value can be evaluated by simply pricing the replication instruments. In Hoencamp et al. (Citation2023), this approach is extended to an interest rate modeling framework. There it is shown that a Bermudan swaption can be semi-statically replicated by an options portfolio written on zero-coupon bonds. The central concept of the latter two studies is that if a portfolio perfectly reproduces the value function of a derivative security at some future time for every realization of the state-variables, the no-arbitrage condition implies that this portfolio will also replicate the security at any time , as long as no cash-flows can occur between t and . Where a dynamic replicating portfolio needs to be rebalanced continuously through time, the semi-static replication only needs rebalancing on a finite number of instances.
The algorithm that is proposed in this paper builds further on the foundations that were laid by Jain and Oosterlee (Citation2015), Lokeshwar et al. (Citation2022) and Hoencamp et al. (Citation2023). We suggest two main novelties compared to the earlier studies
We propose to use the swap rate as the regression variable, which is an implicit risk-factor in our model set-up. As a consequence, the regression can be interpreted as the payoff of a portfolio of European swaptions. We further elaborate on this in Section 3.4.
We propose a new variation to the algorithm, which allows us to compose a replication that is not semi-static, but in fact truly static. In other words, this algorithm lets one compose a portfolio at time zero, which will mirror the value of the Bermudan until it is either exercised or expired, without the need of updating the portfolio at intermediate time-points. We further elaborate on this in Section 3.3.
3.1. The algorithm
The algorithm is executed in an iterative manner, moving backwards in time. Starting at the final exercise date , a regression is performed to determine the weights and strikes of an options portfolio which replicates the Bermudan's payoff. This replication is subsequently used to estimate the value of the Bermudan at the preceding monitor date, yielding the target function of the consecutive regression. This process is repeated until the first monitor date and a portfolio has been composed which statically replicates the Bermudan between time-zero and its final exercise date. Below we will describe the algorithm in more detail.
Start by sampling N trajectories of the underlying state-variable. For each realization of , compute the Bermudan swaption payoff at (the final exercise opportunity) . This will act as the target function of the first regression. Now recursively execute the following steps for :
1. Select and sample the regression variable
Select an asset taking value in that will act as the independent regression variable. For compactness, we will suppress the dependency of on the state variable , when clear from the context. The asset(s) should satisfy the following conditions:
is a square integrable random variable on the probability space .
For each , the realization is uniquely determined by .
The parametrization is continuous.
where denotes a target function, which we define in (Equation8(8) (8) ) below. Subsequently compute for each sample path the realizations of .
2. Regress the target function
Consider a regression function that is defined as (7) (7) Here we denote by , affine functions of the form with , , and . Furthermore we denote by , activation functions of the form Additionally, consider a target function of the form (8) (8) where denotes a possible scaling factor. The rationale behind this target function is that we aim to regress the fraction of the option value that is lost, when an exercise date passes, but the option is not exercised. We provide an elaborate motivation and interpretation in section 3.3.
Subsequently, optimize the parameters , such that fits the target function . The parameters are determined by minimizing the cost function (9) (9) (10) (10) where and denote sampled realizations of the asset and the state variable, respectively, on a corresponding MC path.
3. Estimate the continuation value
Finally, the continuation value of the Bermudan swaption at the preceding exercise date is estimated by calculating (11) (11) With an appropriate choice of , this can be computed in closed-form or approximated with an efficient numerical routine.
3.2. Interpretation of the regression function
The regression function as presented in (Equation7(7) (7) ) can be considered a shallow neural network, i.e. a feed-forward neural network with a single hidden layer. The structure is graphically represented in figure (a). The outcome of the first hidden layer is a vector in , consisting of n neurons. Each individual neuron is calculated as The functional form of each hidden node thus corresponds to the payoff of an arithmetic basket option written on .
The outcome of the second layer is calculated as The functional form of the outcome of the neural network, hence corresponds to the payoff of a weighted portfolio of basket options, plus a zero-coupon bond maturing at . Computing the conditional expectation given in (Equation11(11) (11) ) can be interpreted as evaluating the risk-neutral price of this portfolio.
3.3. A static replication
Let denote an options portfolio maturing at with payoff equal to (12) (12) By the deliberate choice of the target function in (Equation8(8) (8) ), the payoff of this portfolio yields the approximation which is graphically represented by the red area in figure (a). As a consequence of the no-arbitrage condition, it follows that at any time the risk-neutral portfolio value yields the approximation . By a repeated argument, whenever we have The combined portfolio , composed at t = 0, can be interpreted as a fully static replication of the Bermudan swaption. Whenever a monitor date is reached, subportfolio expires. Then, two things could happen:
The holder exercises the Bermudan and the value of plus the payoff of will equal the exercise value.
The holder holds the Bermudan, subportfolio expires worthless and will continue to replicate the Bermudan.
Thus, portfolio Π will mirror the Bermudan swaption value until it is either matured or exercised.
The algorithm presented in this work is a variation to the semi-static replication approach presented in Lokeshwar et al. (Citation2022) and Hoencamp et al. (Citation2023). There, the target function is selected such that the portfolio payoff yields the approximation which is graphically represented by the red area in figure (b). In that case, subportfolio is set-up on , it replicates the Bermudan on the time-interval and expires at . Its payoff will equal the exercise value of the Bermudan, or suffice to set-up subportfolio in case the Bermudan is continued. The advantage of the latter approach is that only needs to be priced in order to value the Bermudan, rather than . A disadvantage is that the replication is only semi-static and needs rebalancing on each monitor date .
3.4. Estimation of the continuation value
How the continuation value as given in (Equation11(11) (11) ) is computed, depends on the specification of the model and the selection of the regression asset. In the general case, the portfolio can be interpreted as a portfolio of arithmetic basket options. The valuation of a basket option is typically still not an easy exercise. However, with the appropriate constraints on the neural network structure used for the regression, the replicating portfolio can be reduced to a set of European options. A first approach would be to consider a one-dimensional regression asset, which corresponds to the design depicted in figure (b). This would be appropriate under a one-factor model. A second approach would be to constrain matrix to have only one non-zero entry in each row. This corresponds to the neural network design depicted in figure (c). In the numerical examples included in this work, we will be considering a 1-factor model environment. Generalizations to multi-factor models have been described in Hoencamp et al. (Citation2023). We will consider the following assumptions:
Regression asset: select .
Scaling factor: set .
Under the assumptions stated above, the (sub)portfolio as defined in (Equation12(12) (12) ), can be interpreted as a portfolio of European swaptions, written on the swap rate .Footnote1 The valuation of for some comes down to where the expectation in the last line is taken under the annuity measure associated to the numéraire . The expression above should be recognized as a linear combination of European swaption prices (see equation (Equation2(2) (2) )). The parameter distinguishes between a payer () and receiver () swaption. The parameter distinguishes between a long () and a short () position in the swaption contract.
3.5. Valuation of the replicating portfolio
Under the assumptions of section 3.4, portfolio consists of European swaptions, which means the valuation comes down to pricing each individual option. We apply a coefficient freezing technique, described in Andersen and Piterbarg (Citation2010) and Schrager and Pelsser (Citation2006). With this technique, the swap rate process is approximated as a generalized arithmetic Brownian motion, by freezing the stochastic terms in its diffusion. The option's implied volatility can then be approximated by integrating the ‘frozen’ diffusion coefficient. We briefly summarize the method below, and refer to Andersen and Piterbarg (Citation2010) and Schrager and Pelsser (Citation2006) for details.
Consider the dynamics of the swap rate. By Itô's lemma, it can be established that satisfies the SDE Considering the annuity as numéraire, the swap rate can regarded as the quotient of some tradable assets and the numéraire itself. As a consequence, the swap rate is a Martingale under the annuity measure (Jamshidian Citation1997). Hence, for we can write (13) (13) where denotes a d-dimensional Brownian motion and . Working out the diffusion for yields (14) (14) The diffusion coefficient is stochastic, due to the terms. However, these terms again represent the quotient of a tradable asset and the numéraire. It is conjectured in Schrager and Pelsser (Citation2006), that these terms are Martingales with low quadratic variation. A good approximation is therefore achieved by freezing them at their time-t value, resulting in a deterministic diffusion coefficient for . It follows (15) (15) The approximation conditioned on is normally distributed. Applying Itô isometry allows us to approximate the implied volatility of an option written on , which yields (16) (16) (17) (17) Substituting into Bachelier's formula results is the desired swaption price.
4. Numerical algorithms and mathematical consistency
In this section, we describe the numerical routines for computing sensitivities along the Monte Carlo path. In short, we show how to compute the sensitivities of the replicating portfolio in a time-consistent manner. This involves the differentiation of the replicating instrument values (i.e. greeking) and the differentiation of the portfolio weights.
Our approach to the latter is inspired by the work of Jain et al. (Citation2019). In Jain et al. (Citation2019) a regress-later approach is considered, which utilizes ordinary least-square regression. As such, the weights of the basis-functions can be obtained explicitly as a function of the risk-factor and consistently differentiated. In this paper, only an implicit relation between the risk-factors and the portfolio weights is known through the minimized cost function (equation (Equation10(10) (10) )). Yet, we show that an estimator of the weight sensitivities can still be obtained through an application of the implicit function theorem.
An advantage of our approach is that it should be easier to evaluate sensitivities to implicit risk-factors, which are not directly modeled. In the context of MVA this concerns the sensitivities to the realizations of future market rates and future implied volatilities. With regard to static replication, the replicating portfolio is selected such that the required future Delta and Vega are tractable to compute. In contrast, with, for example, an application of AD on LSM, this may not be straight-forward. It would require the computation pseudo Jacobian-inverses or adding plenty pseudo-interpolation-nodes to the term-structure of zero-rates and instantaneous volatilities, which may be a tedious task.
4.1. Notation and assumtions
Let and let denote the time-t risk-neutral value of a Bermudan swaption with strike K and tenor structure , conditioned on the event that the option is not yet exercised at t. Next to the Bermudan swaption, we consider a static replicating portfolio Π consisting of European swaptions. We write where denotes a subportfolio of swaptions, written on the swap rate . The subportfolios are defined as Where we denote , and
To keep computations feasible, we will consider two assumptions:
The time-zero zero-rate curve is assumed to be an interpolation between nodes , with . The nodes are assumed to be the result of a calibration bootstrap against market rates, such as swap- or deposit rates.
The instantaneous volatility is assumed to be piece-wise constant between dates . The volatility level between and , is assumed to be the result of a bootstrap against market implied volatilities of European swaptions with maturity .
4.2. Sensitivities along the monte carlo path
In this work, we focus on two types of sensitivities, namely Delta and Vega. The time-zero sensitivities are specified as follows:
Delta: | Let , where denotes the sensitivity of to an infinitesimal bump of the zero-rate node . It is defined as The impact of a bump in a zero-rate node, while keeping other nodes fixed, is graphically represented in figure (a). | ||||
Vega: | Let , where denotes the sensitivity of to an infinitesimal bump of implied volatility corresponding to the swaption with maturity . It is defined as A bump in , while keeping other implied volatilities fixed will impact the level of and . This impact is graphically represented in figure (b). |
The sensitivities along the Monte Carlo path are understood as future-time generalizations of the quantities defined above. That is, future Delta is interpreted as the sensitivity of to an infinitesimal bump of the hypothetical zero-rate node , conditional on . This is graphically represented in figure (c). Future Vega is interpreted as the sensitivity of to an infinitesimal bump of the time-t implied volatility corresponding to an ATM swaption maturing at , conditional on . This is graphically represented in figure (d).
4.3. Computing the sensitivity
First we consider the general case and denote where either or . Later on we will distinguish between the two cases. Without loss of generality, assume that . The aim is to compute We iteratively compute the sensitivities of each subportfolio, for . Starting with i = M−1, a static replication of is achieved with a single European swaption, i.e. . Thus, setting and . The Delta and Vega simply correspond to the sensitivities of a European swaption, i.e. Next, consider the case i = m<M−1 and assume the sensitivities are known. With slight abuse of notation, we have according to the chain rule the following relation (18) (18) We detail each term below:
Options Delta/Vega | The first term represents the standard Delta or Vega of the options composing the portfolio, i.e. (19) (19) | ||||
Weight sensitivities | The weights are optimized, such that the portfolio payoff matches a target function (equation (Equation8(8) (8) )). This target function is sensitive to parameter changes in the zero-rate curve or volatility term-structure. As a consequence we need to consider the weight sensitivities, i.e. (20) (20) | ||||
Strike sensitivities | For the same reason as for the weights, the sensitivities for the strikes should be considered. However, we conjecture the impact of the strike sensitivities is relatively small, which is supported by our numerical experiments. We will ignore this term for the remainder of this work. |
A further derivation of the Delta/Vega sensitivities is the subject of sections 4.4 and 4.5. A further derivation of the weight sensitivities is subject of section 4.6.
4.4. The delta sensitivities
We start by working out the terms of the form and consider the case . We perform the derivation for the general setting and aim to compute The in (Equation19(19) (19) ) corresponds to the case t = T. The case t<T will serve as a path-wise sensitivity estimator, which will be required to compute the weight sensitivities in the subsequent section.
We first characterize the forward Delta sensitivity of a zero-coupon bond in the following lemma.
Lemma 4.1
Let and assume that the zero-rate curve is an interpolation between nodes , where . Then the forward Delta sensitivity is given by where and denotes an -measurable coefficient, dependent on the interpolation scheme of the zero-rate nodes. If the interpolation is linear, the coefficient is given by
A proof to this lemma is given in appendix 1. Now, for brevity denote . Then by an application of the chain rule and the product rule, we can write (21) (21) Working out each of the three partial derivatives above, we find
Bachelier Delta | Under the swap rate dynamics approximation given in (Equation15(15) (15) ), this term is computed as the standard Bachelier Delta sensitivity, i.e. (22) (22) | ||||
Annuity sensitivity | From Lemma 4.1 it directly follows | ||||
Swap rate sensitivity | Applying the chain rule and Lemma 4.1, it follows |
4.5. The vega sensitivities
Now consider the case . Just as for Delta, we perform the derivation for the general setting and aim to compute Again, the in (Equation19(19) (19) ) corresponds to the case t = T. The case t<T will serve as a path-wise sensitivity estimator, which will be required to compute the weight sensitivities in the subsequent section.
We first characterize the forward implied volatility Vega in the following two lemmas.
Lemma 4.2
Let satisfy the Gaussian approximation of (Equation15(15) (15) ) and denote for . Then
A proof is given in appendix 2. The sensitivity of w.r.t. is an -measurable random variable, which is dependent on the calibration procedure of the instantaneous volatility . If are scalars obtained by a bootstrap procedure, the sensitivities can be further characterized by the following lemma.
Lemma 4.3
Let be scalar-valued and piece-wise constant. Let be as in (Equation14(14) (14) ) and let denote the diffusion of the swap rate underlying the option corresponding to . Then the forward implied volatility Vega is given by where and denotes an -measurable coefficient. If by approximation , the coefficient is given by where we denote .
A proof is given in appendix 3. By an application of the chain rule, we obtain (23) (23) Considering each of the partial derivatives above, we find
Bachelier Vega Under the swap rate approximation given in (Equation15(15) (15) ), this term can be computed as the Bachelier Vega sensitivity, i.e.
Bachelier Delta Under the swap rate approximation given in (Equation15(15) (15) ), this term can be computed as the Bachelier Delta sensitivity. Hence, similar to (Equation22(22) (22) ), we have (24) (24)
Imp. vol. sensitivity The forward implied volatility sensitivity is characterized in Lemma 4.2. Under the assumptions of Lemma 4.3, the sensitivity can be approximated as
Swap rate sensitivity Under the swap rate approximation given in (Equation15(15) (15) ), we can write where . It follows that the forward swap rate sensitivity can be estimated as Under the assumptions of Lemma 4.3, this can be further approximated as
4.6. The weight sensitivities
The weights and strikes of are obtained through an optimization scheme as described in section 3.1. The objective function for this optimization can equivalently be expressed as where we denoted . The parameters and are determined by (25) (25) Assuming attains a local minimum in , it follows by the first order optimality condition that . In particular, we have where we can compute In order to characterize the weight sensitivities, we first consider the Jacobian of (i.e. the Hessian of w.r.t. ). The Jacobian can be obtained by again differentiating the expression above, which yields (26) (26) (27) (27) Hence, note that is specified by the auto-correlation matrix of the random vector . In other words, it depends on the cross-sectional co-variance matrix of the portfolio payoffs. With this expression at hand, we characterize the weight sensitivities in the following proposition
Proposition 4.4
Consider a swaption portfolio written on swap rate and let its weights and strikes satisfy equation (Equation25(25) (25) ). Furthermore assume that the strikes are all distinct. Then the derivative of w.r.t. is well-defined and its Jacobian matrix at is given by (28) (28)
This result is a consequence of the implicit function theorem. A rigorous proof is given in appendix 4.
Remark 1
The first term in (Equation28(28) (28) ) is given by the inverse of the matrix in (Equation27(27) (27) ). Note that this term is both path-independent and time-independent. Hence, although its computation requires numerical estimation and matrix inversion, it can be used along each MC path, at each time-step without nested re-evaluation.
What remains to be done is computing . We therefore compute Note that the second term is zero if subportfolio yields a perfect fit to the target function . Therefore, we can expect this term to be small and we will ignore it. Working out the first term, we find In the first term of the expression above, represent the Deltas/Vegas of subportfolio . This term needs to be numerically estimated. The path-wise estimator has been derived in sections 4.4 and 4.5. The second term also needs to be numerically estimated. The path-wise estimator involves , where . Working out this term and denoting the event yields (29) (29) In the expression above, the term is equivalent to with . The terms for were assumed to be known from previous iterations.
4.7. The algorithms
Below we summarize the algorithms for composing the replicating portfolios (Algorithm 1) and computing the path-wise sensitivities (Algorithm 2). For the execution of the path-wise sensitivity algorithm, it is assumed that the static replication has been performed and subportfolios have been derived from the trained parameters .
5. Numerical experiments
We demonstrate the performance of the algorithms presented in this work by considering several case studies. We select a one-factor Gaussian model with piece-wise constant instantaneous volatility to perform our numerical experiments (see Andersen and Piterbarg Citation2010). That is, we consider state-variable dynamics given by where the scalar a denotes a constant mean-reversion rate, is a one-dimensional Brownian motion under and denotes the instantaneous forward rate. The time-zero yield curve is assumed to be a linear interpolation between zero-rates with tenors as given in table . We consider a flat yield curve with all zero-rates equal to 3%. The instantaneous forward rate is assumed to be piece-wise constant between the tenor dates . Each volatility level is bootstrapped against an ATM European payer swaption with maturity and implied volatility , such that . All implied volatility targets are set to 50 bps, resulting in the instantaneous volatilities presented in table . The mean-reversion rate is fixed a priori at a = 0.01. The tenors have been selected to coincide with the SIMM tenor buckets specified by ISDA (International Swaps and Derivatives Association, Inc Citation2020).
Sampling required for regression data, pricing, sensitivities, exposures and IM profiles is done using Monte Carlo simulation. The risk factor is simulated through an Euler discretization scheme. The discretization uses weekly time-steps, i.e. .
As case studies, we consider a and a Bermudan receiver swaption with unit notional. The underlying of the Bermudan is assumed to be a fixed-for-floating interest rate swap with fixed rate K and a tenor structure of annual payments. Unless stated otherwise, the fixed rate K is selected to be ATM, i.e. . The exercise dates of the Bermudan are set to coincide with the fixing dates of the underlying swap, i.e. being the first and being the last exercise opportunity.
The composition of the static replication portfolio relies on the regression of feed-forward neural networks with a single hidden layer. Below, we list some details concerning the fitting procedure of the neural networks.
As an independent regression variable for we select the swap rate . The regression data is normalized before is fitted to the data, to improve the regression robustness (see Hoencamp et al. Citation2023 for details).
Each network is selected to have 8 hidden nodes.
Each regression is performed with a data set of 2000 independently sampled training points.
The neural network parameters are optimized using the AdaMax optimizer, which is a stochastic gradient-descent method (Kingma and Ba Citation2014). The batch-size (i.e. number of training points used per iteration) is set to 32. The learning rate (i.e. the step-size scaling per iteration) is set to 0.01.
The parameters of are randomly initialized from a uniform distribution . For m<M−1, the parameters of are initialized with the optimized parameters of .
5.1. Pricing
We start by analyzing the pricing accuracy of the static replication. As each neural network has 8 hidden nodes, every regression yields a subportfolio of 8 European swaptions. Hence at t-zero, the Bermudan is replicated with 40 swaptions and the Bermudan is replicated with 80 swaptions.
5.1.1. Time-zero prices
Table reports the price estimates implied by the risk-neutral valuation of the static replication portfolio (SR) for different levels of moneyness. As a benchmark, we report price estimates obtained from the least-square method (LSM) introduced by Longstaff and Schwartz (Citation2001). As basis functions for the ordinary least-square regression, we use . The LSM estimate is the mean of 25 independent runs of 80 000 MC paths each. In brackets, we report the standard error generated by these 25 runs.
From table , we observe a close correspondence between SR and LSM, in particular for ATM strikes. The accuracy slightly deteriorates for far OTM options. This can be attributed to a limited number of ITM paths at future monitor dates, yielding only a few non-zero data-points for the regression. SR is not subject to a standard error as the portfolios are evaluated at t-zero. SR is however subject to numerical errors due to the inaccuracy of the neural network regression. On the normalized data, the fits typically show a mean squared error in the order of .
5.1.2. Exposure profiles
Consecutively, we consider the exposure profile and related CVA quantities of the Bermudan swaptions. Exposure depends on the future distribution of the option price. Figure shows the expected positive exposure (EPE) simulated with the static replication next to an LSM implied benchmark. The benchmark is generated with the LSM-bundle algorithm, presented in Feng et al. (Citation2016). We briefly summarize this method below.
The LSM-bundle algorithm allows one to estimate the EPE of a Bermudan option when combined with the LSM algorithm of Longstaff and Schwartz (Citation2001). Within LSM, the continuation value is estimated by regressing realized (discounted) cashflows at against the risk-factor realizations at on in-the-money MC paths. However, to compute the exposure, an unbiased estimate of the continuation value is required on each MC path on every exposure date (not just monitor dates). In Feng et al. (Citation2016) it is therefore proposed to first perform an LSM sweep. Then in a second sweep, for each exposure date t the MC paths are divided into two bundles. If , the two bundles are defined as: Subsequently for both bundles an ordinary least-square regression is performed against the discounted realized cashflows that were computed in the first LSM sweep. The regression functions provide an unbiased estimate of the exposure for a given realization of the risk-factor in their respective domains. We refer to Feng et al. (Citation2016) for details. Note that for the SR method, no second sweep of regressions is required. Exposures are obtained by simply pricing the replicating portfolio along the MC path.
The EPE profiles in figure represent the mean of 25 MC simulations of 3000 paths each. The shaded areas represent a window of one standard error for both respective methods. We observe the typical staircase nature that corresponds to the exposure profile of a Bermudan option. Each discontinuity represents the passing of an exercise opportunity, after which the option is exercised for a certain fraction of the scenarios. Similar to the time-zero prices, we observe a good correspondence between the SR estimates and the benchmark. Additionally, it is noted that the standard error for SR is significantly smaller compared to LSM-bundle. This can be attributed to the benefit of the regress-later approach of SR compared to the regress-now approach of LSM. In the latter, the continuation value is estimated by regressing the (noisy) realizations of the cashflow. In the former, the regression is performed against the deterministic option's target value (numerical errors aside). As a result, the MC noise is integrated out more efficiently with SR. In general, this indicates that a significantly smaller number of MC paths would be required to reach a similar level of accuracy.
We further study the convergence of the standard error, by considering the CVA statistic. To do so, we assume a constant LGD = 1 and constant hazard rate of . The CVA is estimated using equation (Equation3(3) (3) ). Figure shows the mean and standard deviation of the CVA estimate generated with 25 independent MC runs, as a function of the number of MC paths per run. We observe that on average the SR standard error is a factor 4 smaller compared to the LSM-bundle benchmark. For an increasing number of paths, the two methods show a consistent convergence.
5.2. Sensitivities
We continue by analyzing the accuracy in the sensitivity computation for the static replication. We distinguish between time-zero sensitivities and future sensitivities along the MC path. The Delta and Vega sensitivities in this section are understood in accordance with the definitions provided in section 4.2.
5.2.1. Time-zero sensitivities
Figure reports the time-zero Delta sensitivities for the 12 tenor buckets. The sensitivities have been scaled to represent the basis point value w.r.t. the zero-rate . The SR results (presented in blue) are calculated in accordance with the Delta routine described in section 4.2. As a benchmark, we consider an LSM bump-and-reval estimate (presented in orange). For Delta, this means a zero-rate corresponding to one of the predefined tenors is bumped with one basis point and the yield curve is re-interpolated accordingly. The same seed is used for the LSM base run as for the bumped LSM run, to facilitate variance reduction.
Figure reports the time-zero Vega sensitivities for the 12 tenor buckets. These sensitivities have been scaled to represent the basis point value w.r.t. implied volatility . The bump-and-reval benchmark for Vega is obtained by bumping one of the implied volatilities with one basis point and re-bootstrapping the volatilty term-structure accordingly. Also here the same seed is used for the base and the bumped LSM run.
The results shown is figures and are the mean of 25 independent runs of 80,000 paths each. For the LSM bump-and-reval, this means 13 LSM simulations were involved for each run (1 base simulation and 12 bumped simulations). The SR estimate only requires a single simulation, in order to estimate the weight sensitivities of the replicating portfolio. The swaption sensitivities can be readily evaluated. The error bars indicate the standard error of the LSM estimate. The SR standard errors were each 2 to 3 orders of magnitude smaller compared to LSM and are therefore not included in the figures. Overall, we observe a satisfactory correspondence between the SR sensitivities and the LSM benchmark. For the LSM Vega, we observe relatively large standard errors compared to Delta. This can be attributed to the fact that a bumped yield curve translates one-to-one into bumped zero-coupon bond prices. A bumped volatility term-structure only indirectly impacts the risk-factor distribution and is therefore more susceptible to a higher standard error.
Table reports the SIMM Delta, Vega and Curvature margins computed with the sensitivity estimates of SR and LSM, respectively, for different levels of moneyness. In brackets we report the standard error of the LSM implied quantities. For the Delta and Vega margin estimates, we see the accuracy of the SR method confirmed. The Curvature margin LSM estimates demonstrate a significantly larger standard error in the relative sense. Curvature margin reflects the Gamma sensitivity of the option, which in SIMM is approximated through a Vega-Gamma relationship (International Swaps and Derivatives Association, Inc Citation2016). Second-order sensitivity approximations are notorious for amplifying numerical errors of the first order sensitivities, which we see reflected in the LSM benchmark. The SR estimate does not suffer from such standard errors.
The convergence of the margin estimates is demonstrated in figure . Here we present the mean and standard error as a function of the number of MC paths, obtained from 25 independent runs. We observe that the consistency of the SR method is confirmed for an increasing number of MC paths.
5.2.2. Future sensitivities: European swaption
We finally analyze the expected future sensitivity profiles of a Bermudan swaption. Calculation of expected future sensitivities through the LSM bump-and-reval method would impose a severe computational burden. A nested MC simulation is required for each time-step, on each MC path, for each sensitivity. Benchmarking future sensitivity profiles is hence not feasible with LSM and out of scope for this paper.
To verify the accuracy of the SR method along the MC path, we therefore start by considering a European payer swaption. This can be considered a special case of the Bermudan, with only a single exercise opportunity at . We proceed to replicate this swaption, while selecting the swap rate as regression asset. As a result, the swaption is replicated with a portfolio of swaptions. The same neural network structure is used as for the Bermudans. Subsequently, we analyze the sensitivity profiles of the replicating portfolio, which we benchmark with the analytical sensitivities of the swaption.
The results are displayed in figure . The Vega profiles on the right show a near-perfect match between SR and the benchmark. We do observe some numerical errors in the Delta , and buckets. Around t = 0, the latter two are dominant for the swaptions in the replicating portfolio, but zero for the target swaption. If the replication is perfect, the sensitivities in this bucket cancel with the weight sensitivities. Here we note that some small errors remain. Yet, overall, we observe a satisfactory agreement between SR and the analytical profiles.
5.2.3. Future sensitivities: Bermudan swaption
Figure shows the expected future Delta sensitivity profiles of a Bermudan swaption. The profiles are generated by taking the mean of 25 independent MC runs of 3000 paths each. The shaded areas (where visible) represent a single standard error window. As the underlying is a receiver swap, we note that the Delta profiles are mostly negative. The tenor buckets surrounding the time until maturity of the underlying swap are dominant. As time passes, the time to maturity decreases and the dominant sensitivities shift from one bucket to the next, implying the triangular-shaped profiles. We see this for example reflected in the (red-colored) Delta-profile of the bucket in figure (a). In the first year, the sensitivity grows (in the absolute sense) to its maximum until the maturity of the underlying swap is exactly away. Then, the sensitivity of this bucket starts to drop and flows into the bucket. The discontinuous jumps signify the passing of an exercise opportunity. Despite the small number of paths, we observe a satisfactory low standard error.
Figure shows the corresponding expected Vega profiles and their standard error window. The sensitivities are scaled by the implied volatility in basis points of each corresponding bucket, which is in line with the SIMM routines. The Vega sensitivities are bucketed according to their time to expiry. Hence, we observe triangular patterns resulting from Vegas shifting from one bucket into the next as the time until the future monitor dates decreases. Also here, we observe that the shaded areas representing the standard errors are minor or even invisible in the figure, even though the number of paths is relatively small.
Figure shows the expected initial margin profiles, obtained from the simulated sensitivity distributions. The shaded areas again denote a single standard error window. The figure shows the separate profiles corresponding the Delta-, Vega- and Curvature margin. The sum of these three yields the total expected initial margin as a function of time. From this plot, we observe that overall the Delta margin is dominant for the computation of IM and MVA. For Delta, we observe the typical staircase profile. Each discontinuity represents the passing of an exercise date and the subsequent decrease in the option's total Delta sensitivity. The volume of Vega scales with the square root of the times until expiry of the underlying options. Hence the smooth decay of the Vega margin over time, with similar jumps after each monitor date. The contribution of the curvature margin to the total IM and MVA is very limited. In terms of shape, the profile shows a similar nature as Delta.
We study the convergence of the margin estimates by considering MVA as a function of the number of MC paths. Figure shows the Delta-, Vega- and Curvature components of MVA in separate plots. We report the mean and standard error, obtained from 25 independent runs. The MVA estimates have been generated from the future IM distributions, while assuming a constant funding spread of . From the figures, we observe a consistent convergence and see the low standard errors of the sensitivity profiles confirmed. We conclude that generally, 3000 MC paths suffice to guarantee a relative standard error of below 1%.
6. Conclusion
This paper presented a static replication algorithm for Bermudan swaptions under an affine term-structure model. We showed that under the appropriate conditions, one can construct a portfolio of European swaptions that mirrors the value of the exotic instrument until it is either exercised or matured. This replication is static, i.e. despite the early-exercise features of the instrument, the portfolio does not need to be re-balanced throughout the lifetime of the original contract. We continued to derive efficient estimators for the price and sensitivities of the portfolio along the Monte Carlo path. These quantities are essential for the quantification of modern risk metrics such as exposure, CVA, IM and MVA. Exploiting closed-form price and sensitivity approximations for European swaptions allowed us to avoid cumbersome nested simulations that are associated with a naive bump-and-reval approach. Moreover, this enabled us to accurately estimate high dimensional, bucketed Delta- and Vega sensitivities, despite the fact that the risk-factors may be embedded in a low dimensional model. This yielded a richer handle for sensitivity modeling compared to, for example, a parallel shift in the model variables. Through several representative numerical examples, we demonstrated the performance of our approach under a one-factor model, benchmarked to the established least-square Monte Carlo method of Longstaff and Schwartz (Citation2001). Overall, we observed superior convergence for the static replication method with regard to exposures, sensitivities and IM statistics. Thus, we demonstrated static replication to be a suitable alternative for exposure, dynamic IM and MVA quantification.
Disclosure statement
The opinions expressed in this work are solely those of the authors and do not represent in any way those of their current and past employers. No potential conflict of interest was reported by the authors.
Additional information
Funding
Notes
1 For simplicity we will ignore the zero-coupon bond in the replicating portfolio (see section 3.2). This is achieved by constraining . Our numerical experiments indicate a minimal impact on the accuracy of the replication.
References
- Andersen, L.B.G. and Piterbarg, V.V., Interest Rate Modeling, Volume II: Term Structure Models, 2010 (Atlantic Financial Press: London, UK).
- Andersson, K. and Oosterlee, C.W., A deep learning approach for computations of exposure profiles for high-dimensional bermudan options. Appl. Math. Comput., 2021, 408, 126332.
- Antonov, A., Issakov, S. and McClelland, A., Efficient SIMM-MVA calculations for callable exotics. Available at SSRN 3040061, 2017.
- Bank for International Settlements, Counterparty credit risk definitions and terminology, 2019.
- Burgard, C. and Kjaer, M., Funding strategies, funding costs. Risk, 2013, 26, 82.
- Capriotti, L., Fast Greeks by algorithmic differentiation. J. Comput. Finance, 2011, 14, 3.
- Capriotti, L., Jiang, Y. and Macrina, A., AAD and least-square Monte Carlo: Fast bermudan-style options and xVA Greeks. Algorithmic Finance, 2017, 6, 35–49.
- Carriere, J.F., Valuation of the early-exercise price for options using simulations and nonparametric regression. Insur. Math. Econ., 1996, 19, 19–30.
- Caspers, P. and Lichters, R., Initial margin forecast-bermudan swaption methodology and case study. Available at SSRN 3132008, 2018.
- De Graaf, C.S., Feng, Q., Kandhai, D. and Oosterlee, C.W., Efficient computation of exposure profiles for counterparty credit risk. Int. J. Theor. Appl. Finance, 2014, 17, 1450024.
- Duffie, D. and Kan, R., A yield-factor model of interest rates. Math. Financ., 1996, 6, 379–406.
- Feng, Q., Jain, S., Karlsson, P., Kandhai, D. and Oosterlee, C.W., Efficient computation of exposure profiles on real-world and risk-neutral scenarios for bermudan swaptions. Available at SSRN 2790874, 2016.
- Filipovic, D., Term-Structure Models. A Graduate Course, 2009 (Springer: Berlin, Heidelberg).
- Fries, C.P., Fast stochastic forward sensitivities in Monte Carlo simulations using stochastic automatic differentiation (with applications to initial margin valuation adjustments). J. Comput. Finance, 2019, 22, 103–125.
- Giles, M. and Glasserman, P., Smoking adjoints: Fast Monte Carlo Greeks. Risk, 2006, 19, 88–92.
- Glasserman, P., Monte Carlo Methods in Financial Engineering, Vol. 53, 2004 (Springer: Berlin, Heidelberg).
- Glasserman, P. and Yu, B., Simulation for American options: Regression now or regression later? In Monte Carlo and Quasi-Monte Carlo Methods 2002: Proceedings of a Conference held at the National University of Singapore, Republic of Singapore, 25–28 November, 2002, pp. 213–226, 2004, (Springer: Berlin, Heidelberg).
- Glau, K., Pachon, R. and Pötz, C., Speed-up credit exposure calculations for pricing and risk management. Quant. Finance, 2021, 21, 481–499.
- Green, A., XVA: Credit, Funding and Capital Valuation Adjustments, 2015 (John Wiley & Sons: Hoboken, NJ).
- Green, A.D. and Kenyon, C., Mva: Initial margin valuation adjustment by replication and regression. Available at SSRN 2432281, 2015.
- Gregory, J., The XVA Challenge: Counterparty Risk, Funding, Collateral, Capital and Initial Margin, 2020 (John Wiley & Sons: Chichester, UK).
- Griewank, A. and Walther, A., Evaluating Derivatives: Principles and Techniques of Algorithmic Differentiation, 2008 (SIAM: Philadelphia, USA).
- Harrison, J.M. and Pliska, S.R., Martingales and stochastic integrals in the theory of continuous trading. Processes and Their Applications¡/DIFdel¿Stoch. Process. Their. Appl., 1981, 11, 215–260.
- Hoencamp, J., Jain, S. and Kandhai, D., A semi-static replication method for bermudan swaptions under an affine multi-factor model. Risks, 2023, 11, 168.
- International Swaps and Derivatives Association, Inc, Isda SIMM: From principles to model specification, 2016.
- International Swaps and Derivatives Association, Inc, Isda SIMM: Methodology, version 2.3, 2020.
- International Swaps and Derivatives Association, Inc, Isda margin survey year-end 2021, 2021.
- Jain, S. and Oosterlee, C.W., The stochastic grid bundling method: Efficient pricing of bermudan options and their Greeks. Appl. Math. Comput., 2015, 269, 412–431.
- Jain, S., Leitao, Á. and Oosterlee, C.W., Rolling adjoints: Fast Greeks along monte carlo scenarios for early-exercise options. J. Comput. Sci., 2019, 33, 95–112.
- Jamshidian, F., Libor and swap market models and measures. Finance Stoch., 1997, 1, 293–330.
- Jeanblanc, M. and Li, L., Characteristics and constructions of default times. SIAM J. Financ. Math., 2020, 11, 720–749.
- Joshi, M. and Kwon, O.K., Least squares monte carlo credit value adjustment with small and unidirectional bias. Int. J. Theor. Appl. Finance, 2016, 19, 1650048.
- Kappen, C., Computing MVA via regression and principal component analysis, SSRN, 2017.
- Karlsson, P., Jain, S. and Oosterlee, C.W., Counterparty credit exposures for interest rate derivatives using the stochastic grid bundling method. Appl. Math. Finance, 2016, 23, 175–196.
- Kingma, D.P. and Ba, J., Adam: A method for stochastic optimization, 2014, arXiv preprint arXiv:1412.6980.
- Lakhany, A. and Zhang, A., Efficient ISDA initial margin calculations using least squares monte-carlo, 2021, arXiv preprint arXiv:2110.13296.
- Lokeshwar, V., Bharadwaj, V. and Jain, S., Explainable neural network for pricing and universal static hedging of contingent claims. Appl. Math. Comput., 2022, 417, 126775.
- Longstaff, F.A. and Schwartz, E.S., Valuing American options by simulation: A simple least-squares approach. Rev. Financ. Stud., 2001, 14, 113–147.
- Lyashenko, A. and Mercurio, F., Looking forward to backward-looking rates: A modeling framework for term rates replacing LIBOR. Available at SSRN 3330240, 2019.
- Musiela, M. and Rutkowski, M., Martingale Methods in Financial Modeling, 2nd ed., 1997 (Springer: Berlin, Heidelberg).
- Schrager, D.F. and Pelsser, A.A., Pricing swaptions and coupon bond options in affine term structure models. Math. Financ., 2006, 16, 673–694.
- Shen, Y., Van Der Weide, J.A. and Anderluh, J.H., A benchmark approach of counterparty credit exposure of bermudan option under Lévy process: The Monte Carlo-cos method. Procedia. Comput. Sci., 2013, 18, 1163–1171.
- Simaitis, S., de Graaf, C., Hari, N. and Kandhai, D., Smile and default: The role of stochastic volatility and interest rates in counterparty credit risk. Quant. Finance, 2016, 16, 1725–1740.
- Tsitsiklis, J.N. and Van Roy, B., Regression methods for pricing complex American-style options. IEEE Trans. Neural Netw., 2001, 12, 694–703.
- Zeron, M. and Ruiz, I., Dynamic initial margin via Chebyshev spectral decomposition, Tech. Rep., Working paper (24 August), 2018.
- Zhu, S.H. and Pykhtin, M., A guide to modeling counterparty credit risk, GARP Risk Review, July/August 2007.
Appendices
Appendix 1.
Proof of Lemma 4.1
Proof.
Consider the process As y represents a tradable asset denominated by the numéraire, it follows that y is a Martingale under . An application of Itô's lemma and a subsequent measure change shows that y satisfies the following SDE where denotes a d-dimensional Brownian motion under . Denoting and solving the SDE, conditioned on yields Let and . Note that by the definition of the zero-rate, the expression above can be rewritten as Therefore, the forward Delta sensitivity can be computed as where denotes an -measurable coefficient. Under the assumption that the time-t zero-rate curve is an interpolation between nodes , it follows that This concludes the proof.
Appendix 2.
Proof of Lemma 4.2
Proof.
Let . Note that we can write Subsequently applying the chain rule to this expression yields This concludes the proof.
Appendix 3.
Proof of Lemma 4.3
Proof.
Recall that it is assumed that the instantaneous volatility in (Equation1(1) (1) ) takes values in and is piece-wise constant between dates . Furthermore it is assumed that the volatility levels are obtained through a bootstrap procedure, such that the implied volatility approximations given in (Equation17(17) (17) ) agree with the implied volatilities observed in the market. Lastly, note that only and are sensitive to a change in if other market-implied volatilities are kept fixed. For compactness of notation, we show the proof for t = 0. The generalization to t>0 follows the same line of thought.
First consider the case . Note that we can write (A1) (A1) Furthermore, we have the relation It follows by the chain rule where the last equality follows from substituting with (EquationA1(A1) (A1) ).
The case follows a similar line of thought. Note that we can write (A2) (A2) Furthermore, we have the relation Applying the chain rule to the expression above, while substituting and with (EquationA1(A1) (A1) ) and (EquationA2(A2) (A2) ), respectively, we find If by approximation , the expression above reduces to
Appendix 4.
Proof of Proposition 4.4
Proof.
For compactness, we will write . First, note that is continuously differentiable w.r.t. for all . It follows that is continuously differentiable w.r.t. both and . Secondly, we show that is non-singular. By the nature of the ReLu function, the payoffs are linearly independent when the strikes are disjoint. That is, there is no such that a.s. Now consider the matrix . As a property of the auto-correlation matrix, it is known to be positive semi-definite, i.e.
Through an argument by contradiction, we can show the matrix is in fact strictly positive definite. In that regard, assume there exists a such that . It would follow that this is in contradiction with the fact that are linearly independent. Hence we can conclude that is a real-valued, symmetric matrix, which is strictly positive definite. It follows that all its eigenvalues are real and positive. As the determinant of a symmetric matrix is the product of its eigenvalues, we can conclude that has a positive determinant and is non-singular. The result now follows by an application of the implicit function theorem.