Abstract
We consider impulse control of stochastic functional differential equations (SFDEs) driven by Lévy processes under an additional -Lipschitz condition on the coefficients. Our results, which are first derived for a general stochastic optimization problem over infinite horizon impulse controls and then applied to the case of a controlled SFDE, apply to the infinite horizon as well as the random horizon settings. The methodology employed to show existence of optimal controls is a probabilistic one based on the concept of Snell envelopes.
1. Introduction
The standard stochastic impulse control problem is an optimal control problem that arises when an operator controls a dynamical system by intervening on the system at a discrete set of stopping times. Generally, an intervention can be represented by an element in the control set U which we assume to be a compact subset of .
In impulse control the control law thus takes the form , where is a sequence of times when the operator intervenes on the system and is the impulse with which the operator affects the system at time . The standard impulse control problem in infinite horizon can be formulated as finding a control that maximizes

(1)

where is a constant referred to as the discount factor, is an -valued controlled stochastic process that jumps at interventions (e.g. by setting for some deterministic function Γ) and the deterministic functions and give the running reward and the intervention costs, respectively. The quantity thus represents the cost incurred by applying the impulse at time when the state is x.
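To make the objective in (1) concrete, the following sketch evaluates a discounted impulse reward of this form along a single discretized, deterministic path. Everything supplied by the caller (the running reward f, the cost c, the jump map Γ, the mean-reverting drift) is a hypothetical stand-in for illustration, not an object from this article:

```python
import math

def impulse_reward(f, c, Gamma, x0, rho, taus, betas, T, dt=0.01):
    """Approximate, along one deterministic path, a discounted impulse
    reward of the form (1): the integral of e^{-rho*t} f(X_t) dt minus the
    summed discounted intervention costs c(X_{tau-}, beta), where the state
    jumps to Gamma(X_{tau-}, beta) at each intervention time tau."""
    x, t, reward, k = x0, 0.0, 0.0, 0
    while t < T:
        # apply any impulse scheduled at or before the current grid time
        while k < len(taus) and taus[k] <= t:
            reward -= math.exp(-rho * taus[k]) * c(x, betas[k])
            x = Gamma(x, betas[k])
            k += 1
        reward += math.exp(-rho * t) * f(x) * dt
        x += -0.5 * x * dt  # hypothetical mean-reverting uncontrolled drift
        t += dt
    return reward
```

For instance, with a running penalty f(x) = -|x|, a flat cost and Γ(x, b) = b, a single intervention resetting the state to the origin beats doing nothing.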
As impulse control problems appear in a vast number of real-world applications (see e.g. [19,23] for applications in finance and [2,5] for applications in energy), considerable attention has been given to various types of problems where the control is of impulse type. In the standard Markovian setting, where solves a stochastic differential equation (SDE) driven by a Lévy process on , the relation to quasi-variational inequalities has frequently been exploited to find optimal controls (see the seminal work in [3] or turn to [21] for a more recent textbook). In the non-Markovian framework an impulse control problem in finite horizon was solved in [8] by utilizing the link between optimal stopping and reflected BSDEs (originally discovered in [13]), while considering the reward functional where is now a random (and not necessarily Markovian) field and the controlled process takes the particular form , with L an (exogenous) non-controlled process, under the assumption that U is a finite set. Also relevant is the treatment of multi-modes optimal switching problems in a non-Markovian setting in [11].

In [16] the original work of [8] was extended to incorporate delivery lag by setting for a fixed . As in [22], the work in [16] is based on the assumption that . This work was later extended to the infinite horizon setting in [10]. Notable is also the recent work on finite horizon impulse control of SFDEs driven by a Brownian motion in [17].
In the present article we take a different approach to all the above-mentioned works by considering the abstract reward functional

(2)

where the terminal reward φ maps controls to values on the real line and is measurable with respect to , where is the Borel σ-field of , with , for and is the σ-field of a complete probability space . The intervention cost c is also assumed to be a -measurable map, in addition to being bounded from below by a deterministic positive function. We consider the partial information setting and assume that we observe the system through a filtration of sub-σ-fields of and thus restrict our attention to -adapted controls.
To indicate the applicability of the results we consider the special case when

(3)

and

(4)

where solves an impulsively controlled stochastic functional differential equation (SFDE) driven by a Lévy process under an additional -type Lipschitz condition on the coefficients of the SFDE. Furthermore, we will see that the results easily extend to problems with a random horizon, which allows us to model aspects such as default in financial applications. We thus extend the result in [17] on the one hand by considering a more general driving noise, and on the other by considering both the infinite and the random horizon settings. Our treatment of the random horizon problem also motivates the exploration of partial information, as optimal controls may be fundamentally different in the partial information setting.
The main contributions of the present work are twofold. First, we show that the problem of maximizing J has a solution under certain assumptions on φ and c, summarized in the definition of an admissible reward pair, by finding an optimal control in terms of a family of interconnected value processes. We refer to this family of processes as a verification family. Furthermore, we give a set of conditions under which the reward pair defined by (3)–(4) is admissible.
The remainder of the article is organized as follows. In the next section we state the problem, set the notation used throughout the article and detail the set of assumptions that are made. In particular, we introduce the notion of an admissible reward pair. Then, in Section 3 a verification theorem is derived. This verification theorem is an extension of the verification theorem for the multi-modes optimal switching problem with memory developed in [24] and presumes the existence of a verification family. In Section 4 we show that, under the assumptions made, there exists a verification family whenever is an admissible reward pair, thus proving existence of an optimal control for the impulse control problem with the cost functional J defined in (2). Then, in Section 5 we show that a type of impulse control problem for controlled SFDEs satisfies the conditions on φ and c prescribed in the definition of an admissible reward pair, both in the infinite and random horizon settings. Finally, in the appendix, we recall some results, such as those on the Snell envelope, that are useful when showing existence of optimal controls.
2. Preliminaries
Let be a complete probability space and a filtration of sub-σ-fields of satisfying the usual conditions, in addition to being quasi-left continuous. We assume that is the trivial σ-field and define .
Throughout, we will use the following notations. Let:

be the σ-algebra on of -progressively measurable subsets.

For , be the set of all finite, -valued, -measurable, càdlàg processes such that and let be the subset of processes that are quasi-left upper semi-continuous (see Appendix 1 for a definition of quasi-left continuity).

be the set of all -stopping times and for each we let be the corresponding subset of stopping times τ such that , -a.s. Furthermore, we let (resp. ) be the subset of (resp. ) with all stopping times τ for which .

For each , be the set of all -measurable random variables taking values in U.

be the set of all controls , where (the intervention times) is a non-decreasing sequence of -stopping times, (the interventions) and is the (random, -measurable) number of interventions.

denote the subset of for which is -a.s. finite on compacts (i.e. ) and for all we let be the set of all controls truncated at k interventions and set .

For a random interval A (i.e. a set of the type , , or for some ), (and resp. ) be the subset of (and resp. ) with , -a.s. for . When the interval is for some we use the shorthand (and resp. ).

be the subset of with all finite sequences and for we let .

, with and , where n is possibly infinite, denote a generic element of .

For and , we introduce the composition, denoted by °, defined as . For , we define the truncation to interventions as .

The composition operator ° be extended to controls by setting , where , whenever and .

for and set .
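As an informal illustration of how impulse sequences can be manipulated, the sketch below represents a control as a time-sorted list of (time, impulse) pairs and implements a composition and a truncation operation. The article's precise formula for ° is not reproduced above, so the concatenation rule used here is only an assumed reading, and both function names are hypothetical:

```python
def compose(u, v):
    """Hypothetical sketch of a composition u ° v of two impulse controls,
    each a time-sorted list of (time, impulse) pairs.  ASSUMPTION: we keep
    the interventions of u and append those of v occurring strictly after
    u's last intervention time; the paper's actual definition may differ."""
    if not u:
        return list(v)
    t_last = u[-1][0]
    return list(u) + [(t, b) for (t, b) in v if t > t_last]

def truncate(u, k):
    """Truncation of a control to its first k interventions."""
    return list(u[:k])
```

With this reading, composing a control ending at time 1 with one intervening at times 0.5 and 2 keeps only the later intervention of the second control.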
Furthermore, we introduce the following set:
Definition 2.1
We let be the set of all -measurable maps such that the collection is uniformly integrable and (outside of a -null set) we have, for all :

(i) The limit exists;

(ii) ;

(iii) .

Furthermore, let be the set of all such that for any predictable stopping time and any announcing sequence with we have , -a.s.

The sets and will play an important role in the characterization of optimal controls.
2.1. Problem formulation
With the notations above, the problem we deal with is characterized by the following objects:

a complete probability space .

A -measurable map .

A -measurable map .
To obtain existence of optimal controls we need to make some assumptions on the involved objects. The assumptions that we will use are summarized in the definition of what we refer to as an admissible reward pair:
Definition 2.2
We call the pair an admissible reward pair if, for some p>2:

(i) The terminal reward φ and the intervention cost c are both right-continuous in the intervention times (uniformly in the interventions) and satisfy the following bounds:

(a) .

(b) and for all , -a.s., where is a deterministic, continuous, non-increasing and positive function, i.e. , whenever .

(ii) For every and every and T>0, there are maps , and such that for all and we have and , -a.s. (with a -null exception set that can be chosen independently of b).

(iii) We have as .
The conditions in the above definition are mainly standard assumptions for infinite horizon stochastic impulse control problems translated to our setting (see e.g. [10]). Condition (i.a), together with positivity of the intervention cost c in (i.b), implies that the expected maximal reward is finite. Condition (iii) implies that the future has diminishing impact on the total reward and can be seen as a generalization of the deterministic discounting applied in (1). We show below that the boundedness of the intervention costs from below by a positive function, together with (i.a), implies that, with probability one, the optimal control (whenever it exists) can only make a finite number of interventions within any compact time interval.
Remark 2.1
Note that we may hide part of the intervention cost within the function φ, which implies that, similarly to the setting in [20], we can handle problems with negative intervention costs as long as a type of martingale condition is satisfied.
Recall the reward functional given by (2). The problem we deal with can be formulated as:
Problem 2.1
Find , such that
(5)

when is an admissible reward pair.
Throughout Sections 3–4 we will thus assume that is an admissible reward pair, before we state a set of conditions under which we are able to show that a particular of the form (3)–(4) is an admissible reward pair.

As a step in solving Problem 2.1 we need the following proposition, which is a reduction result for impulse control problems.
Proposition 2.3
Suppose there is a such that for all . Then is an optimal control for (5), i.e. for all .
Proof.
Pick . Then there is a such that with , where . Furthermore, by positivity of the intervention costs, for all , by Definition 2.2.(i). However, again by Definition 2.2.(i.a) we have . Hence, is dominated by the strategy of doing nothing and the assertion follows.
2.2. Relevant properties
We note the following properties:
Lemma 2.4
(a) If (resp. ), then and are also in (resp. ).

(b) If then there is a -measurable càdlàg process, , of class [D], such that , -a.s. for each . If then is quasi-left upper semi-continuous.

(c) If , then is -measurable and càdlàg.

(d) If is a sequence in that converges uniformly to some h (outside of a -null set) then .
Proof.
Moving on to property b), we let and note that for
, there is a
such that
-a.s., by Corollary A.5 in Appendix 3. By Theorem A.6 we can define the sequence of càdlàg processes
as
Now, we let
and then recursively define
for
. Then,
is a non-decreasing sequence of càdlàg processes and
-a.s., for all
and all
. Furthermore, for
we let
and get that
Since
we have, by quasi-left continuity of the filtration and uniform integrability that
Now, as
, it follows by Definition 2.1.(ii) that
,
-a.s., as
. We note that
is an increasing, uniformly bounded, sequence of
-measurable càdlàg processes. The sequence, thus, converges to a
-measurable process,
. It remains to show that
is càdlàg, quasi-left upper semi-continuous and that it agrees with
on stopping times.
To show that the limit is càdlàg we let
We note that
has left and right limits and that, furthermore,
. Now, if
then
for some
-measurable
, but then we would have
contradicting the fact that
is right continuous. We conclude that
is a non-increasing sequence with
a
-adapted càdlàg process and
For
we know that there is a non-increasing sequence
, with
a
-measurable r.v. taking values in
such that
Now, as
we have
. It follows that
-a.s., and in particular we find that
,
-a.s. as
. This gives that for any sequence
and
,
and
Letting l tend to infinity we find that
Similarly we get existence of left-limits and by the uniform integrability property imposed on members of
we conclude that
is a càdlàg,
-measurable process of class [D].
For , let
be a non-increasing sequence of stopping times in
(the subset of
with all stopping times taking values in the countable set
) such that
. We may, for example, set
. Since
is countable we have
(6)
-a.s. Now, by right-continuity we get that
. Letting
be a sequence of maximizers for the right-hand side of (Equation6
(6)
(6) ) at times
we get
-a.s. Moreover, for each
there is a subsequence
such that
Since U is compact, there is a subsequence
such that
converges to some
and so we have
Now,
-a.s., by Definition 2.1.(ii) and
-a.s., by the upper semi-continuity declared in Definition 2.1.(iii). Further, as
we conclude that
-a.s. On the other hand, there is a
such that
,
-a.s., and we have
-a.s. This implies that the limit exists with
,
-a.s., establishing that
,
-a.s. Finally, if
then quasi-left upper semi-continuity of
is immediate from Definition 2.1 and (b) follows.
Property (c) follows similarly by noting that for any we can choose such that and we can choose the sequence such that .
Concerning the last property we note that for each we can, by uniform convergence, choose a
-a.s. finite
such that
for all
. Then,
and property (i) in Definition 2.1 follows as
was arbitrary. The remaining properties follow similarly.
3. A verification theorem
Our approach to finding a solution to Problem 2.1 is based on deriving an optimal control under the assumption that a specific family of processes exists, and then showing that this family does indeed exist. We will refer to any such family of processes as a verification family. Before making precise the concept of a verification family, we introduce the notion of consistency:
Definition 3.1
We refer to a family of processes as being consistent if, for each , the map given by is -measurable and, for each and each , we have , -a.s.
We are now ready to state the definition of a verification family:
Definition 3.2
We define a verification family to be a consistent family of càdlàg supermartingales such that for each :

(a) The family satisfies the recursion

(7)

(b) The family is uniformly bounded in the sense that

(c) The map belongs to .

(d) , as .
The purpose of the present section is to reduce the solution of Problem 2.1 to showing existence of a verification family. This is accomplished by the following verification theorem (the proof of which follows along the lines of the proof of Theorem 2 in [9]):
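The Snell envelope machinery that drives the verification argument is easiest to see in a deterministic discrete-time toy, where conditional expectation disappears and the envelope reduces to a running maximum computed by backward recursion; the optimal stopping time is then the first time the envelope touches the reward, mirroring the definition of the intervention times in (8). This is only a cartoon of the general theory:

```python
def snell_envelope(reward):
    """Discrete-time Snell envelope of a deterministic reward sequence:
    the smallest supermartingale dominating the reward.  In this
    deterministic toy (no conditional expectation) the envelope is
    Y_t = max_{s >= t} reward_s, built by backward recursion, and the
    optimal stopping time is the first t with Y_t == reward_t."""
    n = len(reward)
    Y = [0.0] * n
    Y[n - 1] = reward[n - 1]
    for t in range(n - 2, -1, -1):
        # stop now or continue: the backward dynamic-programming step
        Y[t] = max(reward[t], Y[t + 1])
    tau = next(t for t in range(n) if Y[t] == reward[t])
    return Y, tau
```

In the stochastic setting the maximum over the future is replaced by a conditional expectation under the filtration, which is what the results in the appendix supply.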
Theorem 3.3
Assume that there exists a verification family and let:

(i) the sequence be given by

(8)

using the convention that , with and set ;

(ii) the sequence be defined recursively as a measurable selection of

(9)
Then is an optimal control for (5) in the sense that
. Moreover, the family is unique (i.e. there is at most one verification family, up to indistinguishability of the maps
and
) and
(where
).
Proof.
The proof is divided into three steps where we first, in Step 1, show that for any we have
(10)
-a.s. for
. Following this, in Step 2, we show that
. Then in Step 3 we show that
is the optimal control, establishing (i) and (ii). A straightforward generalization to arbitrary initial conditions
then gives that
(11) by which uniqueness follows. Below we refer to the properties of a verification family in Definition 3.2 simply as properties (a), (b), (c) and (d).
Step 1 We start by showing that for each the recursion (7) can be written in terms of a
-stopping time and that the inner supremum is attained,
-a.s. In particular, this will imply the existence of a maximizer in (9). From (7) and consistency we note that
is the smallest supermartingale that dominates
(12) By Property (c) and Definition 2.2.(ii) we have that the map
belongs to
. It thus follows from Lemma 2.4.(b) that
is a càdlàg process of class [D] that is quasi-left upper semi-continuous on
. Furthermore, by Property (d) and positivity of the intervention costs we note that
,
-a.s. By Theorem A.1.(iii) in Appendix 2 and consistency we conclude that for any
, the stopping time
given by
is such that:
Now, since
, the map
is
-measurable and u.s.c. on
for some
-null set
. Corollary A.5 of Appendix 3 and consistency then implies that there is a
such that
-a.s., and in particular (10) holds. As mentioned above, this also implies the existence of a
-measurable
satisfying (9).
Step 2 We now show that . We start by noting that Y is the Snell envelope of
and by Step 1 we thus have (since
is trivial) that
Moving on we pick
and note that
. But then, by Step 1, we have that
By induction we get that for each
we have,
(13) Now, arguing as in the proof of Proposition 2.3 and using Property (b) we find that
. To show that the right-hand side of (13) equals we note that (13) can be rewritten as
which gives
(14) for all
. For the first term on the right-hand side we note that for any T>0 we have
where we have used Hölder's inequality to arrive at the last inequality. As
is an admissible reward pair, Definition 2.2.(iii) gives that the second term can be made arbitrarily small by choosing T sufficiently large and Definition 2.2.(i) implies that the first term tends to zero as
for all finite T, since
We thus conclude that the first term on the right-hand side in (14) tends to zero as .
For the second term we note that letting in (13) and using Property (b) and Definition 2.2.(i.a) we find that
converges increasingly to a limit in
as
. Hence, the second term on the right-hand side of (14) also tends to zero as
.
Conditioning on in the third term of the right-hand side of (14) and noting that
is
-measurable, we find that, similarly to the above case, we have for any
that
where the second term can be made arbitrarily small by Property (d) and we conclude that
.
Step 3 It remains to show that the strategy is optimal. To do this we pick any other strategy
. By Step 2 and the definition of
in (7) we have
but in the same way
-a.s. Repeating this procedure K times gives
Now, we have
where the right-hand side tends to zero as
by repeating the argument in Step 2, which is possible since
. We conclude that
for all
and it follows by Proposition 2.3 that
is an optimal control for Problem 2.1.
4. Existence of the verification family
Theorem 3.3 presumes existence of the verification family . To obtain a satisfactory solution to Problem 2.1, we thus need to establish that a verification family exists. This is the topic of the present section. We will follow the standard existence proof, which goes by applying a Picard iteration (see [5,11,15]). We first show that there exists a sequence of consistent families of processes that satisfy the recursion

(15)

and

(16)

for . Then, we show that the limit family obtained by letting is a verification family.
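The Picard iteration over the number of allowed interventions can be mimicked in a toy discrete-time, finite-horizon problem: Y^0 is the value with no interventions and each further iterate allows one more. The dynamics, rewards and costs below are hypothetical stand-ins, and the monotone improvement in k is the only feature shared with the recursion (15)–(16):

```python
from functools import lru_cache

def value_with_k_interventions(k, x0, impulses, f, cost, gamma, T):
    """Toy analogue of the iterates Y^k: the optimal discounted reward
    over T discrete steps from x0 when at most k interventions
    (instantaneous state resets at a fixed cost) are allowed.  Y^0 allows
    none; Y^{k+1} improves on Y^k by permitting one extra intervention."""

    @lru_cache(maxsize=None)
    def V(t, x, rem):
        if t == T:
            return 0.0
        # continue without intervening: collect f(x) and keep the state
        stay = f(x) + gamma * V(t + 1, x, rem)
        if rem == 0:
            return stay
        # intervene: pay the cost, jump to b, collect f(b) this step
        act = max(-cost + f(b) + gamma * V(t + 1, b, rem - 1)
                  for b in impulses)
        return max(stay, act)

    return V(0, x0, k)
```

By construction the values are non-decreasing in k, which is the discrete shadow of the monotone convergence established below.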
Proposition 4.1
There is a sequence of consistent families of càdlàg supermartingales such that for each :

(a) The sequence satisfies the recursion (15)–(16).

(b) There is a K>0 (that does not depend on k) such that,

(c) For each , the map belongs to .

(d) , as , uniformly in k.
The proof of Proposition 4.1 will be based on two lemmas and the following induction hypothesis:
Hypothesis (VF.k). There is a sequence of consistent families of càdlàg supermartingales such that for
and
:
The relation (15) holds for and (16) holds for .
The map
belongs to
.
We note that Hypothesis VF.k lacks properties (b) and (d) of Proposition 4.1. In the following two lemmas we show that these are implicit.
Lemma 4.2
Assume that Hypothesis VF.k holds for some . Then, the sequence of families of processes
is well defined and uniformly bounded in the sense that there is a K>0 (that does not depend on k) such that,
Furthermore, for each
(whenever it is well defined) the collection
of random variables is uniformly integrable.
Proof.
We note that under Hypothesis VF.k, the sequence of families exists and is uniquely defined up to indistinguishability for each
by repeated application of Theorem A.1 in Appendix 2. By the definition of
and positivity of the intervention costs we have that for any
,
For
, we define the càdlàg supermartingale
and the stopping times
for all
. Then, by Definition 2.2.(i.a) we have
In particular, by right-continuity this implies that
or
where C does not depend on v and k. The first assertion now follows as
Concerning the second claim, note that for each
and each
, repeating the proof of Corollary A.5 in Appendix 3 with
instead of g we find that there is a
such that
Now,
where the right-hand side is bounded, uniformly in
and
, by the above in combination with Definition 2.2.(i.b) and Doob's maximal inequality.
We also have the following diminishing future impact property:
Lemma 4.3
Assume again that Hypothesis VF.k holds for some . Then,
as
, uniformly in k.
Proof.
By the properties of the essential supremum and positivity of the intervention costs we have for every ,
This implies that
The desired result now follows by a similar argument to the one used in the proof of Lemma 4.2 and Definition 2.2.(iii).
Proof of Proposition 4.1
First, note that by Definition 2.2.(ii) there is a such that for
we have
(17) for all
,
-a.s. The statement, thus, holds for k=0.
Moving on, we assume that VF.k holds for some . But then, by Lemmas 4.2 and 4.3, we can apply reasoning similar to that in the proof of Theorem 3.3 to find that
is a càdlàg supermartingale with
(18)

By Definition 2.2.(ii) it follows that there is a consistent family satisfying (18) such that
and we conclude that VF.k+1 holds as well. By induction this extends to all
.
The objective in the remainder of this section is to show that the limit family that we get when letting in
is a verification family.
Proposition 4.4
For each , the limit
, exists as an increasing pointwise limit,
-a.s.
Proof.
Since we have that
,
-a.s. Moreover, by Proposition 4.1 the sequence is bounded
-a.s., thus, it converges
-a.s. for all
.
To assess the type of convergence that we have for the sequence , we introduce a sequence of families of processes corresponding to a truncation of the time interval. For each T>0 and
, we define the consistent family
of càdlàg supermartingales as
for all
with
. Then,
Lemma 4.5
The sequence satisfies:

(i) .

(ii) For each there is a such that , with

(iii) There is a -a.s. finite -measurable random variable ξ and a constant q>0 such that -a.s. for each .
Proof.
The inequality in (i) follows from noting that .
For the second statement we note that by Lemma 2.4.(c), the process is
-measurable and càdlàg. Now, each
can be decomposed as
with
and
, which implies that
(19) where
and
are the number of interventions in
and
, respectively. We thus define the sets
and have by (19) that
for
. Furthermore, as
and
we find that
for all
.
Now, let
(recalling our convention that ) and pick
such that
Then, by right continuity we have
where
We thus only need to show that there is a T>0 such that
for all
. We have,
where the right-hand side is independent of k and tends to 0 as
by Definition 2.2.(iii). We thus conclude that there is a
such that
for all
.
Concerning the third statement, we note that for , we have for each
and all
, that
and similarly
for all
(where the inequalities hold
-a.s.). Now, arguing as in the proof of Lemma 4.2 we have
and we conclude that there is a
-null set
such that for each
we have
.
For , let
be such that
We note that on
we have
and get that for
(in the remainder of the proof
denotes a generic
-null set), we have
and we conclude that
for all
.
Now, for all we have,
where we have introduced
corresponding to the truncation
of
. As the truncation only affects the performance of the controller when
we have
Applying Hölder's inequality we get that for
,
with
. Since
, there is thus a
-a.s. finite
-measurable r.v.
such that (for all τ and β) we have
Since,
was arbitrary we can choose β such that
-a.s. and by right-continuity the last statement follows as
was arbitrary.
Proposition 4.6
For each , we have
as
, outside of a
-null set.
Proof.
By Lemma 4.5.(ii) there exist for each , a
and a measurable set
with
such that
for all
and
. Furthermore, by Lemma 4.5.(iii) there is a
-a.s. finite r.v., ξ, such that
Combining these and taking the limit as
we find that
on
for some
-null set
. Now, as
was arbitrary the statement follows.
We are now ready to show that a verification family exists, establishing the existence of optimal controls for Problem 2.1.
Proposition 4.7
A verification family exists.
Proof.
Letting we have by Proposition 4.6 that
converges uniformly in
to
as
(outside of a
-null set). Since
by Proposition 4.1.(c), we have by Lemma 2.4.(d) that . In particular, we conclude that property (c) in the definition of a verification family holds for the limit family .
Moreover, for each and
we have that
,
-a.s., as
and we conclude by consistency of
for each
that
,
-a.s., implying consistency of
.
We treat each of the remaining properties separately:
(a) By the above and (b) of Lemma 2.4 we have that is a càdlàg, quasi-left upper semi-continuous process of class [D]. In particular we note that
is càdlàg. Applying (iv) of Theorem A.1 in Appendix 2 then gives
(b) By Proposition 4.1 we have that
is uniformly bounded in k. From this it follows immediately that
.
(d) We have that
where the first term on the right-hand side can be made arbitrarily small by choosing k sufficiently large and the last term tends to 0 as
for all
.
5. Application to impulse control of SFDEs
In [24] a finite horizon impulse control problem with a discrete set U was solved when the underlying process followed a stochastic delay differential equation (SDDE) under a loop condition on the impulses. This problem was motivated by hydropower operation, where the flow times between different power plants induce delays in the dynamics of the controlled system.
In this section we extend the results from [24] by considering a discounted infinite horizon setting, allowing an uncountable control set U, and also by taking the dynamics of the underlying process to follow a stochastic functional differential equation. Furthermore, our prior treatment of the problem with abstract reward, φ, and intervention cost, c, allows us to consider a less restrictive set of assumptions on the coefficients in the problem formulation. In particular, we are able to remove the loop condition.

Our treatment of non-Markovian impulse control problems in infinite horizon should also be compared to [10], where an infinite horizon impulse control problem in a non-Markovian framework with a fixed discrete delay is considered. The work presented in this section goes in a different direction by having an underlying dynamics, driven by a Lévy process, that is affected by the impulses in the control, resulting in a more complex relation between the control and the output of the performance functional. Furthermore, we investigate the important extension to a random horizon, which turns out to be a trivial modification of our initial problem.
Throughout this section, we will only consider controls for which ,
-a.s., and restrict our attention to the setting when the underlying uncertainty stems from a process
, with
, defined as
where
(20)

(21)

for some
, the set of all (deterministic) uniformly bounded, càdlàg functions
, and
(22)

(23)

The dynamics of
are driven by a d-dimensional Brownian motion B and a Poisson random measure P with intensity measure
, where
is the Lévy measure on
of P and
is called the compensated jump martingale random measure of P. We assume that
is the natural filtration generated by B and P, with
.
As mentioned above, we assume that all uncertainty comes from the process and consider the discounted setting with a continuous discount factor
. The reward functional is then
(24)
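A crude way to evaluate a discounted reward functional of this form for a fixed control is Monte Carlo simulation. The sketch below estimates only the discounted running-reward part for an uncontrolled Itô diffusion via Euler-Maruyama, ignoring the jump part, the path dependence of the SFDE and the intervention terms; all coefficients are caller-supplied hypothetical stand-ins:

```python
import math
import random

def mc_discounted_reward(f, rho, x0, mu, sigma, T, dt=0.01,
                         n_paths=2000, seed=1):
    """Monte Carlo sketch of a truncated discounted running reward,
    E[ integral_0^T e^{-rho*t} f(X_t) dt ], for an uncontrolled diffusion
    dX = mu(X) dt + sigma(X) dB simulated by Euler-Maruyama."""
    rng = random.Random(seed)
    total = 0.0
    n = int(T / dt)
    for _ in range(n_paths):
        x, acc = x0, 0.0
        for i in range(n):
            t = i * dt
            acc += math.exp(-rho * t) * f(x) * dt  # left Riemann sum
            x += mu(x) * dt + sigma(x) * math.sqrt(dt) * rng.gauss(0.0, 1.0)
        total += acc
    return total / n_paths
```

For constant f and sigma = 0 the estimator reduces to a deterministic Riemann sum of the discount factor, which gives a quick sanity check.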
5.1. Assumptions
We assume that the involved coefficients satisfy the following constraints:
Assumption 5.1
For any , , and and for some and p>2 we have:

(i) The function satisfies the Lipschitz condition and the growth condition for some constant .

(ii) The coefficients and are continuous in t and satisfy the growth condition and the Lipschitz continuity

(iii) There is a , with such that satisfies

(iv) The running reward is -measurable and satisfies the growth condition Moreover, there is a non-decreasing function such that for all , whenever .

(v) There is a finite collection of closed connected subsets of U and corresponding maps that are jointly continuous in , bounded from below, i.e. , of polynomial growth, and locally Lipschitz in x, i.e. there is a non-decreasing function such that for all , whenever , and we have .
Note that the growth condition on Γ in (i) implies that interventions can only increase the magnitude of the state as long as . In particular, this avoids the problem of explosions in finite time due to impulses.
Remark 5.1
To see that the above SFDE is a generalization of discrete delay SDDEs with Lipschitz coefficients note that if satisfies
for each
, then for
we have
Remark 5.2
In the above assumptions the involved coefficients are all deterministic. We remark that a trivial extension is to allow these to depend on ω as well in which case the coefficients in the Lipschitz conditions can be taken to be non-decreasing, -a.s. finite,
-measurable càdlàg processes.
The motivation for allowing intervention costs that are discontinuous in b is the important application of production systems, where increasing the production beyond a certain threshold may necessitate a costly startup of additional production units.
5.2. Existence of optimal controls
In this section we show that the problem of maximizing the reward functional (24) has a solution. Throughout we will, for notational simplicity, only consider the one-dimensional case (d=1), but we note that all results extend trivially to higher dimensions. We start with the following moment estimate:
Proposition 5.2
Under Assumption 5.1, the SFDE (20)–(23) admits a unique solution for each
. Furthermore, the solution has moments of order pq on compacts, in particular we have for T>0, that
(25) where
and for each
, we have
(26) where
.
Proof.
By repeated use of Theorem 3.2 in [1], existence and uniqueness of solutions to (20)–(23) follows since
,
-a.s. By Assumption 5.1.(i) we get, for
, using integration by parts, that
We note that if
and
for some
then there is a largest time
such that
. This means that during the interval
interventions will not increase the magnitude
. By induction, since
is finite, we find that
for all
, where
,
,
for
and
. Letting
we thus find that for
,
Now, since
and
coincide on
we have
and
From Assumption 5.1.(ii)-(iii) and the Burkholder-Davis-Gundy inequality we get that
and Grönwall's lemma gives that
(27)
-a.s., where the constant
does not depend on u or j, and (25) follows by letting t=0. We now give a more straightforward way of showing (26) than the method used in the proof of Lemma 4.2. Applying (27) to the left-hand side of (26) we get
and the desired result follows from (25).
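The estimate above is closed with Grönwall's lemma; for reference, the standard integral form used in bounds of this type reads:

```latex
f(t) \;\le\; a + C\int_0^t f(s)\,\mathrm{d}s \quad \text{for all } t\in[0,T]
\qquad\Longrightarrow\qquad
f(t) \;\le\; a\,e^{Ct} \quad \text{for all } t\in[0,T],
```

for any nonnegative, integrable f and constants a, C ≥ 0.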
Lemma 5.3
For each , there is a
-null set
such that for all
and all
the limit
exists in the topology of uniform convergence on compact subsets of
. Furthermore, for all
, we have
-a.s., for any
,
and
(with an exception set that is independent of
).
Proof.
Our proof will rely on a pre-localization argument and we introduce the following non-decreasing sequence of stopping times
for
and set
By Assumption 5.1.(iii) it then follows that
,
-a.s. as
. Furthermore, we note that on
the magnitude of the jumps of
due to the Poisson jump integral of (20)–(23) is bounded by
and repeating the argument in the proof of Proposition 5.2 gives that
(28) for all
.
For , we let
solve the SFDE (20)–(23) with integrand
in the jump part and let
be the largest integer such that
. Then by Assumption 5.1.(i) we have for
(recalling that
is the truncation of u limiting the number of interventions to l),
with
. We define
,
and let
and set
. Then, since the jump part is deactivated during
and by (28), we have
For
, we have by Assumption 5.1.(i) that
Now, for
,
with
. Taking the absolute value on both sides we get
The Burkholder-Davis-Gundy inequality now gives
Appealing to the boundedness of the jumps and the integral Lipschitz conditions on the coefficients then gives that
for all
. Now, Grönwall's lemma gives
where C does not depend on
. Furthermore, for each
and each
(for some
-null set
) there is a
such that
. Uniform convergence on
, thus, follows by applying a Kolmogorov continuity argument (see e.g. Theorem 72 in Chapter IV of [Citation25]) and uniform right-continuity follows as
,
-a.s. The existence of limits follows similarly.
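For the reader's convenience, the Kolmogorov continuity argument invoked here is, in its standard form for a family of random variables indexed by a parameter b ranging over an m-dimensional set:

```latex
\mathbb{E}\big[\,|Y_b - Y_{b'}|^{p}\,\big] \;\le\; C\,|b - b'|^{\,m+\beta}
\quad \text{for some } p,\beta,C>0,
```

which implies that Y admits a modification that is locally Hölder continuous in b; this is the criterion behind the uniform convergence on compacts.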
Definition 5.4
For all and
we define the map
as
Moreover, for
we define the truncation
of
as
where
, and for
we define the localization
of
as
where
.
Corollary 5.5
For each ,
,
and
the map
has limits everywhere and is
-a.s. continuous on
, where
are the jump times of P.
Proof.
Let
and note that for
we have
Now, by Lemma 5.3 it follows immediately that
and from its proof we have that
whenever
. Concerning the intervention costs we have
(29) where the first term tends to zero as
by joint continuity of ℓ, continuity of ρ and right-continuity of X. By continuity of ℓ, the assertion follows by repeating the argument in the proof of Lemma 5.3.
Lemma 5.6
For each T>0 and there is, for every
, a
such that for all
and
we have
-a.s. (with an exception set that is independent of b).
Proof.
For any it follows by Corollary 5.5 and Theorem A.6 in Appendix 3 that there is for each
an
-optional càdlàg process
such that
-a.s. for any
. Now, pick a sequence
of positive real numbers such that
and for
define
. Then, there is a control
such that
Define the sequence of càdlàg processes
as
and set
. Then,
is an increasing
-a.s. finite sequence of càdlàg processes and it, thus, converges pointwise,
-a.s. to a limit
that, moreover, is
-measurable. We note that for any
and
we have with
and
, that
and as
we get that
Jensen's inequality now gives that
Letting
and using that the map
is right-continuous uniformly in u it follows that
-a.s., for each stopping time
.
We now show that is a càdlàg process. First, since
is the limit of an increasing sequence of càdlàg processes we have that
. For any
and
let
Then as
is non-decreasing, the sequence
is non-increasing. Let
and note that
by right-continuity of the filtration and
on
. Moreover, with
, Fatou's lemma gives
On the other hand, we have
as
and we conclude that
and, since
was arbitrary, it follows that
.
To prove that has left limits we define, for
, the sequence
as
and then recursively let
We note by the above discussion that
and furthermore, by right-continuity that
and
,
-a.s. If not, we would have
on some set A of positive measure. However, as increments in the jump integral part are
-a.s. zero at predictable times we note by Corollary 5.5 that
is continuous in t at
on
for some
-null set
, uniformly in u. Now, as the filtration is quasi-left continuous this implies that
on
, a contradiction. Letting
we find that
is a sequence of càdlàg processes with
and we conclude that
is càdlàg.
By repeating the argument in the proof of Lemma 5.3 we find that
-a.s. for any
and
and it follows that
Hence, by Kolmogorov's continuity theorem and Corollary 5.5 it follows that there is a unique map
such that
-a.s. for all
. By dominated convergence we find that
converges pointwise to some h as
. We define the set
and note that for
, we have
where
and the last step follows by Hölder's inequality and Proposition 5.2. Now, the right-hand side of the last inequality goes to zero as
by the definition of
and Proposition 5.2 and by uniform convergence we conclude that there is a
such that
-a.s. for each
.
It remains to show that we can choose the exception set to be independent of b. Let be a sequence of finite subsets of U with
. For
define
as a measurable selection of
. Then since
takes values in a finite set we have
-a.s. By continuity it follows that
-a.s. Furthermore, by uniform integrability and
-a.s. continuity of
uniformly in u we have that
and we conclude that
-a.s. From this the statement follows as
was arbitrary.
This far we have not made any assumption on the discount factor ρ, other than it being continuous. Clearly, some assumptions on the growth of ρ have to be made in order for the maximization problem to have a finite value. We summarize our assumptions in the following hypothesis:
Hypothesis. [Disc.-A] There is an such that
,
and
for all
and
. Furthermore, for each
there is an
such that for all
for some
we have
and
for all
.
Remark 5.3
An important situation where Hypothesis Disc.-A holds with for any
is when the functions ϕ and ℓ are eventually bounded, i.e. when there is a
such that
and
for all
. Another important case is when
grows linearly in T, where C is the bound in Proposition 5.2.
We are now ready to state the main result of this section, showing that under Assumption 5.1 and Hypothesis Disc.-A an optimal control for the problem of maximizing J exists.
Proposition 5.7
Under Hypothesis Disc.-A there is a such that
for all
. Furthermore,
is given by the recursion (8)–(9), with
and
Proof.
To show that the assertion is true we need to show that the pair is an admissible reward pair. It is clear that the uniform
-bounds on φ and c in Definition 2.2.(i) hold by Hypothesis Disc.-A. In particular, we note that by Jensen's inequality we get that
The decreasing importance property stated in Definition 2.2.(iii) follows similarly by noting that for
with
we have, by Hypothesis Disc.-A, that
which tends to 0 as
.
Concerning the continuity properties listed in Definition 2.2.(ii) we note that for each and
we have that
Now,
and similarly
This implies that
and the Borel-Cantelli lemma gives that
-a.s., as
for all
.
By Lemma 5.6 and uniform convergence it follows from Lemma 2.4.(d) that . The desired result now follows by Lemma 2.4.(a) while noting that by the construction of ℓ in Assumption 5.1.(v), a simplified version of Lemma 5.6 (without having to consider maximization over u) applied to each of the
gives that there is an
such that
,
-a.s. (with an exception set that is independent of b).
Remark 5.4
In a perfect information setting, i.e. when , we note that
can be taken to be any upper semi-continuous function in b that satisfies the remaining properties of polynomial growth and local Lipschitz continuity.
5.3. The random horizon setting
We turn instead to the reward
(30) where η is a
-stopping time and
. A notable convention applied in (30) is that the terminal reward disregards interventions made at the horizon. This is natural from an applications perspective, as it is generally too late to intervene at a default in a financial setting or at the failure of a unit in an engineering application.
In addition to the requirements listed in Assumption 5.1, we make the following assumptions:
Assumption 5.8
The terminal reward is Borel-measurable, satisfies the growth condition
and there is a non-decreasing continuous function
such that for all
, we have
whenever
. Moreover, if there is a sequence
in
such that
on some set
, then there is a
-null set
such that on
we have for every
that
(31)
We introduce the following hypothesis:
Hypothesis. [Disc.-B] The terminal reward satisfies the bound
Furthermore, for each
there is an
such that for all
for some
we have
for all
.
We have the following extension of Proposition 5.7.
Proposition 5.9
Under Hypotheses Disc.-A and Disc.-B there is a such that
for all
. Furthermore,
is given by the recursion (8)–(9), with
and
If, in addition η is an
-stopping time, then
for all
.
Proof.
We note that all details in the proof of Proposition 5.7 transfer immediately to this situation except for the quasi-left upper semi-continuity property in the definition of (Definition 2.1). We thus assume that there is a non-decreasing sequence
of stopping times such that
. When
,
-a.s. left-continuity at θ follows by Lemma 5.3 and the local Lipschitz property of ψ and when
,
-a.s. left-continuity at θ is immediate. We thus assume that
on some measurable set
.
Then, we have
where the first term on the right-hand side tends to zero,
-a.s. Concerning the second term we have
where
and ° denotes composition of functions. The first term on the right-hand side is
-a.s. non-positive by Assumption 5.8 and the last term tends to zero,
-a.s., by the local Lipschitz property of ψ and Lemma 5.3 in combination with Proposition 5.2, the polynomial growth condition on ψ and Hypothesis Disc.-B.
The last assertion follows by noting that since c>0 it will never be optimal to intervene at times greater than or equal to η.
We note the following distinction between the finite (deterministic) horizon and the random horizon settings:
Remark 5.5
In the case when for some
it follows from the proof of Proposition 5.9 that we can relax (31) to
To see that there is an actual distinction here consider the following example:
Example 5.10
We let be the trivial σ-algebra
and assume that
. We take
and set
. Then, with the rewards
,
, the intervention cost
and the discount
, we get
but there is no control that attains this value.
Disclosure statement
No potential conflict of interest was reported by the author(s).
Notes
1 Throughout, we let
2 We let (resp.
) denote minimum (resp. maximum), so that
.
3 Throughout, we generally suppress dependence on ω and refer to as a map
.
4 Requiring that p>2 is for notational convenience only and can easily be loosened to p>1.
5 Throughout, C will denote a generic positive constant that may change value from line to line.
6 By definition , which belongs to
References
- N. Agram and B. Øksendal, Stochastic control of memory mean-field processes, Appl. Math. Optim. 79(1) (2019), pp. 181–204.
- M. Basei, Optimal price management in retail energy markets: an impulse control problem with asymptotic estimates, Math. Meth. Oper. Res. 89(3) (2019), pp. 355–383.
- A. Bensoussan and J.L. Lions, Impulse Control and Quasivariational Inequalities, Gauthier-Villars, Montrouge, France, 1984.
- D.P. Bertsekas and S.E. Shreve, Stochastic Optimal Control: The Discrete-time Case, Academic Press, 1978.
- R. Carmona and M. Ludkovski, Pricing asset scheduling flexibility using optimal switching, Appl. Math. Finance 15(5-6) (2008), pp. 405–447.
- C. Dellacherie and P.-A. Meyer, Probabilités Et Potentiel, I-IV, Hermann, Paris, 1975.
- C. Dellacherie and P.-A. Meyer, Probabilités Et Potentiel, V-VIII, Hermann, Paris, 1980.
- B. Djehiche, S. Hamadène, and I. Hdhiri, Stochastic impulse control of non-markovian processes, Appl. Math. Optim. 61(1) (2010), pp. 1–26.
- B. Djehiche and S. Hamadène, On a finite horizon starting and stopping problem with risk of abandonment, Int. J. Theoret. Appl. Finance 12(04) (2009), pp. 523–543.
- B. Djehiche, S. Hamadène, I. Hdhiri, and H. Zaatra, Infinite horizon stochastic impulse control with delay and random coefficients, Math. Oper. Res. 47(1) (2022), pp. 665–689.
- B. Djehiche, S. Hamadène, and A. Popier, A finite horizon optimal multiple switching problem, SIAM J. Control Optim. 48(4) (2009), pp. 2751–2770.
- N. El Karoui, Les aspects probabilistes du contrôle stochastique, École d'Été de Saint-Flour IX-1979, Lecture Notes in Mathematics, Springer, Berlin, 1981.
- N. El Karoui, C. Kapoudjian, E. Pardoux, S. Peng, and M.C. Quenez, Reflected solutions of backward SDEs and related obstacle problems for PDEs, Ann. Probab. 25(2) (1997), pp. 702–737.
- S. Hamadène, Reflected BSDE's with discontinuous barrier and application, Stoch. Int. J. Probab. Stoch. Process. 74(3-4) (2002), pp. 571–596.
- S. Hamadène and J. Zhang, Switching problem and related system of reflected backward SDEs, Stoch. Process. Appl. 120(4) (2010), pp. 403–426.
- I. Hdhiri and M. Karouf, Optimal stochastic impulse control with random coefficients and execution delay, Stoch. Int. J. Probab. Stoch. Process. 90(2) (2018), pp. 151–164.
- J. Jönsson and M. Perninge, Finite horizon impulse control of stochastic functional differential equations, SIAM J. Control Optim. 61(2) (2023), pp. 924–948.
- N. El Karoui and X. Tan, Capacities, measurable selection and dynamic programming. Part I: Abstract framework, arXiv:1310.3363, 2013.
- R. Korn, Some applications of impulse control in mathematical finance, Math. Meth. Oper. Res. 50(3) (1999), pp. 493–518.
- R. Martyr, Finite-horizon optimal multiple switching with signed switching costs, Math. Oper. Res. 41(4) (2016), pp. 1432–1447.
- B. Øksendal and A. Sulem, Applied Stochastic Control of Jump Diffusions, Springer, 2007.
- B. Øksendal and A. Sulem, Optimal stochastic impulse control with delayed reaction, Appl. Math. Optim. 58(2) (2008), pp. 243–255.
- J. Palczewski and L. Stettner, Impulsive control of portfolios, Appl. Math. Optim. 56(1) (2007), pp. 67–103.
- M. Perninge, A finite horizon optimal switching problem with memory and application to controlled SDDEs, Math. Meth. Oper. Res. 91(3) (2020), pp. 465–500.
- P. Protter, Stochastic Integration and Differential Equations, 2nd ed. Springer, Berlin, 2004.
Appendices
Appendix 1. Quasi-left continuity
A càdlàg process is quasi-left continuous if for each predictable stopping time θ and every announcing sequence of stopping times
we have
,
-a.s. Similarly, X is quasi-left upper semi-continuous if
,
-a.s. A filtration is quasi-left continuous if
for every predictable stopping time θ.
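In symbols, for a predictable stopping time θ and an announcing sequence θ_n ↑ θ, the three notions above read:

```latex
X_{\theta_n} \longrightarrow X_{\theta}\ \ \mathbb{P}\text{-a.s.}
\quad \text{(quasi-left continuity of } X\text{)},
\qquad
\limsup_{n\to\infty} X_{\theta_n} \le X_{\theta}\ \ \mathbb{P}\text{-a.s.}
\quad \text{(quasi-left upper semi-continuity)},
\qquad
\mathcal{F}_{\theta-} = \mathcal{F}_{\theta}
\quad \text{(quasi-left continuity of the filtration)}.
```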
Appendix 2. The Snell envelope
In this section we gather some useful results concerning the Snell envelope. Recall that a progressively measurable process X is of class [D] if the set of random variables is uniformly integrable.
Theorem A.1
The Snell envelope
Let be an
-adapted,
-valued, càdlàg process of class
D
. Then there exists a unique
up to indistinguishability
,
-valued càdlàg process
called the Snell envelope of X, such that Z is the smallest supermartingale that dominates X. Moreover, the following holds (with
):
For any stopping time η,
(A1)
The Doob-Meyer decomposition of the supermartingale Z implies the existence of a triple
where
is a uniformly integrable right-continuous martingale,
is a non-decreasing, predictable, continuous process with
and
is non-decreasing purely discontinuous predictable with
, such that
(A2) Furthermore,
for all
.
Let
be given and assume that for any predictable
and any increasing sequence
with
and
,
-a.s., we have
,
-a.s. Then, the stopping time
defined by
(with the convention that
) is optimal after η, i.e.
Furthermore, in this setting the Snell envelope, Z, is quasi-left continuous, i.e.
.
Let
be a sequence of càdlàg processes converging increasingly and pointwise to the càdlàg process X and let
be the Snell envelope of
. Then the sequence
converges increasingly and pointwise to a process Z, and Z is the Snell envelope of X.
In the above theorem, (i)–(iii) are standard results and proofs can be found in, for example, [Citation12,Citation14]. A finite horizon version of statement (iv), which extends trivially to infinite horizon, was proved in [Citation11].
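For orientation, the characterization (A1) of the Snell envelope is commonly written as an essential supremum over stopping times dominating η:

```latex
Z_{\eta} \;=\; \operatorname*{ess\,sup}_{\tau \in \mathcal{T}_{\eta}}
\mathbb{E}\big[\, X_{\tau} \,\big|\, \mathcal{F}_{\eta} \big],
\qquad
\mathcal{T}_{\eta} := \{\tau \text{ stopping time}:\ \tau \ge \eta,\ \mathbb{P}\text{-a.s.}\}.
```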
Appendix 3. The section and projection theorems
In this section we recall two fundamental results from the general theory of stochastic processes, namely the measurable selection and the optional projection theorems.
We consider a complete filtered probability space , with
a right-continuous filtration. For any space E, we define the projection of a set
onto Ω as
.
Theorem A.2
Measurable projection
Let E be a locally compact Polish space. For every the set
is
-measurable.
A proof can be found in, e.g. [Citation18] (see the proof of Theorem 2.10) or [Citation6] Chapter III. In particular we need the following corollary result:
Corollary A.3
Let be a real valued, measurable function defined on the product space
. Then for all
, the function
(with the convention
) is
-measurable.
Proof.
For each we have
. Now, since h is measurable, the set
is in
and the result follows by the measurable projection theorem.
Theorem A.4
Measurable selection
Let be a Borel space with
. For every
there is a
-measurable r.v. β taking values in
(with ∂ a cemetery point) such that
This is a standard result and a proof can be found in [Citation18] (Theorem 2.20) (see also Chapter 7 in [Citation4], where several extensions are given). In particular, we need the following well-known corollary result:
Corollary A.5
Let be a measurable function defined on the product space
, such that for
-almost every ω the map
is upper semi-continuous. Then, with U a compact subset of
, there exists a
-measurable r.v. β such that
-a.s.
Proof.
Since (where now
) the function
is
-measurable. Furthermore, as h is
-measurable, the set
is in
. Now, by Theorem A.4 there is a
-measurable
-valued r.v. β such that
and
. As U is compact and
is u.s.c. on
with
, we have that
for all
and, hence,
.
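On a finite discretization of the compact set U, the content of Corollary A.5 becomes elementary: choosing, scenario by scenario, a maximizer over the grid is automatically a measurable selection. The sketch below, with a hypothetical reward `h` and a hypothetical grid, illustrates this:

```python
def argmax_selection(h, omegas, U_grid):
    """For each scenario w, select a maximizer beta(w) of u -> h(w, u)
    over a finite grid of the compact control set U. With finitely many
    candidate values, measurability of the selection is immediate."""
    return {w: max(U_grid, key=lambda u: h(w, u)) for w in omegas}

# hypothetical reward whose maximizer moves with the scenario w
h = lambda w, u: -(u - w) ** 2
beta = argmax_selection(h, omegas=[0.0, 0.25, 0.5],
                        U_grid=[i / 100 for i in range(101)])
```

Upper semi-continuity of u ↦ h(ω, u) is what guarantees, in the corollary, that the supremum over the full compact set is attained, so that refining the grid recovers it.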
The last result that we need is the optional projection theorem.
Theorem A.6
Optional projection
Assume that is a measurable process (not necessarily adapted to the filtration
) with
for all stopping times
. Then there exists a unique optional process
such that
for all stopping times
. If, furthermore, X is càdlàg then
is also càdlàg.
A proof of Theorem A.6 can be found in Chapter VI, p. 103 of [Citation7].
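In symbols, the defining property of the optional projection may be written as:

```latex
{}^{o}X_{\tau}\,\mathbf{1}_{\{\tau<\infty\}}
\;=\; \mathbb{E}\big[\, X_{\tau}\,\mathbf{1}_{\{\tau<\infty\}} \,\big|\, \mathcal{F}_{\tau} \big]
\qquad \mathbb{P}\text{-a.s., for every stopping time } \tau.
```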