Stochastics
An International Journal of Probability and Stochastic Processes
Research Article

Infinite horizon impulse control of stochastic functional differential equations driven by Lévy processes

Received 17 Aug 2020, Accepted 08 Sep 2023, Published online: 04 Oct 2023

Abstract

We consider impulse control of stochastic functional differential equations (SFDEs) driven by Lévy processes under an additional $L^p$-Lipschitz condition on the coefficients. Our results, which are first derived for a general stochastic optimization problem over infinite horizon impulse controls and then applied to the case of a controlled SFDE, apply to the infinite horizon as well as the random horizon settings. The methodology employed to show existence of optimal controls is a probabilistic one based on the concept of Snell envelopes.

1. Introduction

The standard stochastic impulse control problem is an optimal control problem that arises when an operator controls a dynamical system by intervening on the system at a discrete set of stopping times. Generally, an intervention can be represented by an element in the control set $U$, which we assume to be a compact subset of $\mathbb{R}^m$.

In impulse control the control law thus takes the form $u=(\tau_1,\ldots,\tau_N;\beta_1,\ldots,\beta_N)$, where $\tau_1\le\tau_2\le\cdots\le\tau_N$ is a sequence of times when the operator intervenes on the system and $\beta_j$ is the impulse that the operator applies to the system at time $\tau_j$. The standard impulse control problem in infinite horizon can be formulated as finding a control that maximizes
$$\mathbb{E}\Big[\int_0^\infty e^{-\rho_0 s}\phi(s,X^u_s)\,ds-\sum_{j=1}^{N}e^{-\rho_0\tau_j}\ell(\tau_j,X^u_{\tau_j-},\beta_j)\Big],\tag{1}$$
where $\rho_0>0$ is a constant referred to as the discount factor, $X^u$ is an $\mathbb{R}^m$-valued controlled stochastic process that jumps at interventions (e.g. by setting $X^u_{\tau_j}=\Gamma(\tau_j,X^u_{\tau_j-},\beta_j)$ for some deterministic function $\Gamma$) and the deterministic functions¹ $\phi:\mathbb{R}_+\times\mathbb{R}^m\to\mathbb{R}$ and $\ell:\mathbb{R}_+\times\mathbb{R}^m\times U\to\mathbb{R}$ give the running reward and the intervention costs, respectively. The quantity $\ell(t,x,b)$ thus represents the cost incurred by applying the impulse $b\in U$ at time $t\in\mathbb{R}_+$ when the state is $x$.

As impulse control problems appear in a vast number of real-world applications (see e.g. [19,23] for applications in finance and [2,5] for applications in energy), a lot of attention has been given to various types of problems where the control is of impulse type. In the standard Markovian setting, where $X^u$ solves a stochastic differential equation (SDE) driven by a Lévy process on $[\tau_j,\tau_{j+1})$, the relation to quasi-variational inequalities has frequently been exploited to find optimal controls (see the seminal work in [3] or turn to [21] for a more recent textbook). In the non-Markovian framework an impulse control problem in finite horizon ($T<\infty$) was solved in [8] by utilizing the link between optimal stopping and reflected BSDEs (originally discovered in [13]) while considering the reward functional
$$\mathbb{E}\Big[\int_0^T\phi(s,\omega,L^u_s)\,ds-\sum_{j=1}^{N}c(\beta_j)\Big],$$
where $\phi:[0,T]\times\Omega\times\mathbb{R}^n\to\mathbb{R}$ is now a random (and not necessarily Markovian) field and the controlled process $L^u$ takes the particular form $L^u_t:=L_t+\sum_{j=1}^{N}\mathbf{1}_{[\tau_j\le t]}\beta_j$, with $L$ an (exogenous) non-controlled process and assuming that $U$ is a finite set. Relevant is also the treatment of multi-modes optimal switching problems in a non-Markovian setting in [11].

In [16] the original work of [8] was extended to incorporate delivery lag by setting $L^u_t:=L_t+\sum_{j=1}^{N}\mathbf{1}_{[\tau_j+\Delta\le t]}\beta_j$ for a fixed $\Delta>0$. As in [22], the work in [16] is based on the assumption that $\tau_{j+1}\ge\tau_j+\Delta$. This work was later extended by considering the infinite horizon setting in [10]. Notable is also the recent work on finite horizon impulse control of SFDEs driven by a Brownian motion in [17].

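In the present article we take a different approach from all the above mentioned works by considering the abstract reward functional
$$J(u):=\mathbb{E}\Big[\varphi(\tau_1,\ldots,\tau_N;\beta_1,\ldots,\beta_N)-\sum_{j=1}^{N}c(\tau_1,\ldots,\tau_j;\beta_1,\ldots,\beta_j)\Big],\tag{2}$$
where the terminal reward $\varphi$ maps controls to values of the real line and is measurable with respect to $\mathcal{G}\otimes\mathcal{B}(\mathcal{D})$, where $\mathcal{B}(\mathcal{D})$ is the Borel σ-field of $\mathcal{D}:=\bigcup_{i=0}^{\infty}\mathcal{D}^i$, with $\mathcal{D}^0:=\emptyset$, $\mathcal{D}^i:=\{(t_1,\ldots,t_i;b_1,\ldots,b_i):0\le t_1\le\cdots\le t_i,\ b_j\in U,\ \text{for } j=1,\ldots,i\}$ for $i\ge1$, and $\mathcal{G}$ is the σ-field of a complete probability space $(\Omega,\mathcal{G},\mathbb{P})$. The intervention cost $c$ is also assumed to be a $\mathcal{G}\otimes\mathcal{B}(\mathcal{D})$-measurable map in addition to being bounded from below by a deterministic positive function. We consider the partial information setting and assume that we observe the system through a filtration $\mathbb{F}:=(\mathcal{F}_t)_{t\ge0}$ of sub-σ-fields of $\mathcal{G}$ and thus restrict our attention to $\mathbb{F}$-adapted controls.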

To indicate the applicability of the results we consider the special case when
$$\varphi(u)=\int_0^\infty e^{-\rho(s)}\phi(s,X^u_s)\,ds,\tag{3}$$
and
$$c(\tau_1,\ldots,\tau_j;\beta_1,\ldots,\beta_j)=e^{-\rho(\tau_j)}\ell\big(\tau_j,X^{(\tau_1,\ldots,\tau_{j-1};\beta_1,\ldots,\beta_{j-1})}_{\tau_j},\beta_j\big),\tag{4}$$
where $X^u$ solves an impulsively controlled stochastic functional differential equation (SFDE) driven by a Lévy process under an additional $L^p$-type Lipschitz condition on the coefficients of the SFDE. Furthermore, we will see that the results easily extend to problems with a random horizon, which allows us to model aspects such as default in financial applications. We thus extend the result in [17], on the one hand by considering a more general driving noise and on the other hand by considering both the infinite and the random horizon settings. Our treatment of the random horizon problem also motivates the exploration of partial information, as optimal controls may be fundamentally different in the partial information setting.

The main contributions of the present work are twofold. First, we show that the problem of maximizing $J$ has a solution under certain assumptions on $\varphi$ and $c$, summarized in the definition of an admissible reward pair, by finding an optimal control in terms of a family of interconnected value processes. We refer to this family of processes as a verification family. Second, we give a set of conditions under which the reward pair defined by (3)–(4) is admissible.

The remainder of the article is organized as follows. In the next section we state the problem, set the notations used throughout the article and detail the set of assumptions that are made. In particular we introduce the notion of an admissible reward pair. Then, in Section 3 a verification theorem is derived. This verification theorem is an extension of the verification theorem for the multi-modes optimal switching problem with memory developed in [24] and presumes the existence of a verification family. In Section 4 we show that, under the assumptions made, there exists a verification family whenever $(\varphi,c)$ is an admissible reward pair, thus proving existence of an optimal control for the impulse control problem with the cost functional $J$ defined in (2). Then, in Section 5 we show that a type of impulse control problems for controlled SFDEs satisfies the conditions on $\varphi$ and $c$ as prescribed in the definition of an admissible reward pair, both in the infinite and random horizon settings. Finally, in the appendix, we recall some results, such as the Snell envelope, that are useful when showing existence of optimal controls.

2. Preliminaries

Let $(\Omega,\mathcal{G},\mathbb{P})$ be a complete probability space, and $\mathbb{F}:=(\mathcal{F}_t)_{t\ge0}$ a filtration of sub-σ-fields of $\mathcal{G}$ satisfying the usual conditions in addition to being quasi-left continuous. We assume that $\mathcal{F}_0$ is the trivial σ-field and define $\mathcal{F}:=\mathcal{F}_\infty$.

Throughout, we will use the following notations. Let:

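  • $\mathcal{P}_{\mathbb{F}}$ be the σ-algebra on $\mathbb{R}_+\times\Omega$ of $\mathbb{F}$-progressively measurable subsets.

  • For $p\ge1$, $\mathcal{S}^p$ be the set of all finite, $\mathbb{R}$-valued, $\mathcal{P}_{\mathbb{F}}$-measurable, càdlàg processes $(Z_t:t\ge0)$ such that $\mathbb{E}[\sup_{t\in\mathbb{R}_+}|Z_t|^p]<\infty$ and let $\mathcal{S}^p_q$ be the subset of processes that are quasi-left upper semi-continuous (see Appendix 1 for a definition of quasi-left continuity).

  • $\mathcal{T}$ be the set of all $\mathbb{F}$-stopping times and for each $\eta\in\mathcal{T}$ we let $\mathcal{T}_\eta$ be the corresponding subset of stopping times $\tau$ such that $\tau\ge\eta$, $\mathbb{P}$-a.s. Furthermore, we let $\mathcal{T}^f$ (resp. $\mathcal{T}^f_\eta$) be the subset of $\mathcal{T}$ (resp. $\mathcal{T}_\eta$) with all stopping times $\tau$ for which $\mathbb{P}[\tau<\infty]=1$.

  • For each $\tau\in\mathcal{T}$, $\mathcal{A}(\tau)$ be the set of all $\mathcal{F}_\tau$-measurable random variables taking values in $U$.

  • $\mathcal{U}$ be the set of all controls $u=(\tau_1,\ldots,\tau_N;\beta_1,\ldots,\beta_N)$, where $(\tau_j)_{j=1}^{\infty}$ (the intervention times) is a non-decreasing sequence of $\mathbb{F}$-stopping times, $\beta_j\in\mathcal{A}(\tau_j)$ (the interventions) and $N:=\sup\{j\ge1:\tau_j<\infty\}\vee0$ is the (random, $\mathcal{F}$-measurable) number of interventions.

  • $\mathcal{U}^f$ denote the subset of $u\in\mathcal{U}$ for which $N(t):=\sup\{j:\tau_j\le t\}$ is $\mathbb{P}$-a.s. finite on compacts (i.e. $\mathcal{U}^f:=\{u\in\mathcal{U}:\mathbb{P}[\{\omega\in\Omega:N(t)>k,\ \forall k>0\}]=0,\ \forall t\in\mathbb{R}_+\}$) and for all $k\ge0$ we let $\mathcal{U}^k$ be the set of all controls $(\tau_1,\ldots,\tau_{N\wedge k};\beta_1,\ldots,\beta_{N\wedge k})$ truncated at $k$ interventions and set $\mathcal{U}^{\lim}:=\bigcup_{k\ge0}\mathcal{U}^k$.

  • For a random interval $A$ (i.e. a set of the type $[\eta_1,\eta_2]$, $[\eta_1,\eta_2)$, $(\eta_1,\eta_2]$ or $(\eta_1,\eta_2)$ for some $\eta_1,\eta_2\in\mathcal{T}$), $\mathcal{U}_A$ (and $\mathcal{U}^f_A$ resp. $\mathcal{U}^k_A$) be the subset of $\mathcal{U}$ (and $\mathcal{U}^f$ resp. $\mathcal{U}^k$) with $\tau_j\in A$, $\mathbb{P}$-a.s. for $j=1,\ldots,N$. When the interval is $A=[\eta,\infty]$ for some $\eta\in\mathcal{T}$ we use the shorthand $\mathcal{U}_\eta$ (and $\mathcal{U}^f_\eta$ resp. $\mathcal{U}^k_\eta$).

  • $\mathcal{D}^f$ be the subset of $\mathcal{D}$ with all finite sequences and for $k\ge0$ we let $\mathcal{D}^k:=\bigcup_{i=0}^{k}\mathcal{D}^i$.

  • $v=(\mathbf{t},\mathbf{b})$, with $\mathbf{t}:=(t_1,\ldots,t_n)$ and $\mathbf{b}:=(b_1,\ldots,b_n)$, where $n$ is possibly infinite, denote a generic element of $\mathcal{D}$.

  • For $v=(\mathbf{t},\mathbf{b})\in\mathcal{D}^f$ and $v'=(\mathbf{t}',\mathbf{b}')\in\mathcal{D}$, we introduce the composition, denoted by $\circ$, defined as² $v\circ v':=(t_1,\ldots,t_n,t'_1\vee t_n,\ldots,t'_{n'}\vee t_n;b_1,\ldots,b_n,b'_1,\ldots,b'_{n'})$. For $v\in\mathcal{D}$, we define the truncation to $k\ge0$ interventions as $[v]_k:=(t_1,\ldots,t_{k\wedge n};b_1,\ldots,b_{k\wedge n})$.

  • The composition operator $\circ$ be extended to controls by setting $u\circ\tilde u:=(\tau_1,\ldots,\tau_{N\wedge k},\tilde\tau_1\vee\tau_k,\ldots,\tilde\tau_{\hat N}\vee\tau_k;\beta_1,\ldots,\beta_{N\wedge k},\tilde\beta_1,\ldots,\tilde\beta_{\hat N})$, where $\hat N:=\sup\{j\ge1:\tilde\tau_j\vee\tau_k<\infty\}\vee0$, whenever $u\in\mathcal{U}^k$ and $\tilde u:=(\tilde\tau_1,\ldots,\tilde\tau_{\tilde N};\tilde\beta_1,\ldots,\tilde\beta_{\tilde N})\in\mathcal{U}$.

  • $\Pi_l:=\{0,1/2^l,2/2^l,\ldots\}$ for $l\ge0$ and set $\Pi:=\bigcup_{l=1}^{\infty}\Pi_l$.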
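The composition and truncation operators defined in the list above act on deterministic sequences in a purely mechanical way, which a small sketch can make concrete. The following Python fragment is only an illustration: the names `Control`, `compose` and `truncate` are hypothetical, and the convention that composing with the empty sequence returns the second argument unchanged is an assumption (the text does not spell out this case).

```python
from typing import NamedTuple, Tuple


class Control(NamedTuple):
    """A finite deterministic element v = (t; b) of D^f (illustrative type)."""
    times: Tuple[float, ...]
    impulses: Tuple[object, ...]


def compose(v: Control, w: Control) -> Control:
    """Composition v o w: times of w are pushed up to the last time of v."""
    if not v.times:
        # Assumption: with no earlier interventions there is nothing to push past.
        return w
    last = v.times[-1]
    shifted = tuple(max(t, last) for t in w.times)  # t'_j v t_n
    return Control(v.times + shifted, v.impulses + w.impulses)


def truncate(v: Control, k: int) -> Control:
    """Truncation [v]_k: keep the first min(k, n) interventions."""
    return Control(v.times[:k], v.impulses[:k])


v = Control((1.0, 2.0), ("a", "b"))
w = Control((0.5, 3.0), ("c", "d"))
vw = compose(v, w)
```

Here `vw` has times `(1.0, 2.0, 2.0, 3.0)`: the first intervention time of `w` lies before the last time of `v` and is therefore moved up to it, which keeps the composed sequence non-decreasing.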

Furthermore, we introduce the following set:

Definition 2.1

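We let $\mathcal{H}_{\mathbb{F}}$ be the set of all $\mathcal{P}_{\mathbb{F}}\otimes\mathcal{B}(U)$-measurable maps³ $h:\mathbb{R}_+\times\Omega\times U\to\mathbb{R}$ such that the collection $\{h(\tau,\beta):\tau\in\mathcal{T}^f,\ \beta\in\mathcal{A}(\tau)\}$ is uniformly integrable and (outside of a $\mathbb{P}$-null set) we have for all $(t,b)\in\mathbb{R}_+\times U$:

  (i) The limit $\lim_{(t',b')\to(t,b)}h(t',b')$ exists;

  (ii) $\lim_{t'\searrow t}\sup_{b\in U}|h(t',b)-h(t,b)|=0$;

  (iii) $\lim_{b'\to b}h(t,b')\le h(t,b)$.

Furthermore, let $\mathcal{H}'_{\mathbb{F}}$ be the set of all $h\in\mathcal{H}_{\mathbb{F}}$ such that for any predictable stopping time $\theta\in\mathcal{T}$ and any announcing sequence $\theta_j\nearrow\theta$ with $\theta_j\in\mathcal{T}^f$ we have $\limsup_{j\to\infty}\sup_{b\in U}\{h(\theta_j,b)-h(\theta,b)\}\le0$, $\mathbb{P}$-a.s.

The sets $\mathcal{H}_{\mathbb{F}}$ and $\mathcal{H}'_{\mathbb{F}}$ will play an important role in the characterization of optimal controls.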

2.1. Problem formulation

With the notations above, the problem we deal with is characterized by the following objects:

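  • $(\Omega,\mathcal{G},\mathbb{F},\mathbb{P})$, a complete filtered probability space.

  • A $\mathcal{G}\otimes\mathcal{B}(\mathcal{D})$-measurable map $\varphi:\mathcal{D}\to\mathbb{R}$.

  • A $\mathcal{G}\otimes\mathcal{B}(\mathcal{D})$-measurable map $c:\mathcal{D}\to\mathbb{R}_+$.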

To obtain existence of optimal controls we need to make some assumptions on the involved objects. The assumptions that we will use are summarized in the definition of what we refer to as an admissible reward pair:

Definition 2.2

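We call the pair $(\varphi,c)$ an admissible reward pair if, for some⁴ $p>2$:

  (i) The terminal reward $\varphi$ and the intervention cost $c$ are both right-continuous in the intervention times (uniformly in the interventions) and satisfy the following bounds:

    (a) $\sup_{u\in\mathcal{U}}\mathbb{E}[|\varphi(u)|^p]<\infty$.

    (b) $\sup_{u\in\mathcal{U}}\mathbb{E}[|c(u)|^2]<\infty$ and $c(v)\ge\delta(t_n)$ for all $v\in\mathcal{D}^f$, $\mathbb{P}$-a.s., where $\delta:\mathbb{R}_+\to\mathbb{R}_+$ is a deterministic, continuous, non-increasing and positive function, i.e. $\delta(s)\ge\delta(t)>0$ whenever $0\le s\le t<\infty$.

  (ii) For every $v\in\mathcal{U}^{\lim}$ and every $k\ge0$ and $T>0$, there are maps $\chi^T\in\mathcal{H}_{\mathbb{F}}$, $\chi\in\mathcal{H}'_{\mathbb{F}}$ and $(t,b)\mapsto c^v_{t,b}\in\mathcal{H}'_{\mathbb{F}}$ such that for all $\tau\in\mathcal{T}$ and $b\in U$ we have
$$\chi(\tau,b)=\operatorname*{ess\,sup}_{u\in\mathcal{U}^k_\tau}\mathbb{E}\Big[\varphi(v\circ(\tau,b)\circ u)-\sum_{j=1}^{N}c(v\circ(\tau,b)\circ[u]_j)\Big|\mathcal{F}_\tau\Big],$$
$$\chi^T(\tau,b)=\operatorname*{ess\,sup}_{u\in\mathcal{U}^k_{[\tau,T)}}\mathbb{E}\Big[\varphi(v\circ(\tau,b)\circ u)-\sum_{j=1}^{N}c(v\circ(\tau,b)\circ[u]_j)\Big|\mathcal{F}_\tau\Big],$$
and $c^v_{\tau,b}=\mathbb{E}[c(v\circ(\tau,b))|\mathcal{F}_\tau]$, $\mathbb{P}$-a.s. (with a $\mathbb{P}$-null exception set that can be chosen independently of $b$).

  (iii) We have $\sup_{u\in\mathcal{U}^{\lim},\,v\in\mathcal{U}^f_T}\mathbb{E}[|\varphi(u\circ v)-\varphi(u)|^p]\to0$ as $T\to\infty$.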

The conditions in the above definition are mainly standard assumptions for infinite horizon stochastic impulse control problems translated to our setting (see e.g. [10]). Condition (i.a) together with positivity of the intervention cost, $c$, in (i.b) implies that the expected maximal reward is finite. Condition (iii) implies that the future has diminishing impact on the total reward and can be seen as a generalization of the deterministic discounting applied in (1). We show below that the boundedness of the intervention costs from below by a positive function together with (i.a) implies that, with probability one, the optimal control (whenever it exists) can only make a finite number of interventions within any compact time interval.

Remark 2.1

Note that we may hide part of the intervention cost within the function $\varphi$, which implies that, similar to the setting in [20], we can handle problems with negative intervention costs as long as a type of martingale condition is satisfied.

Recall the reward functional given by (Equation2). The problem we deal with might be formulated as:

Problem 2.1

Find $u^*\in\mathcal{U}$ such that
$$J(u^*)=\sup_{u\in\mathcal{U}}J(u),\tag{5}$$
when $(\varphi,c)$ is an admissible reward pair.

Throughout Sections 3–4 we will thus assume that $(\varphi,c)$ is an admissible reward pair, before we state a set of conditions under which we are able to show that a particular $(\varphi,c)$ of the form (3)–(4) is an admissible reward pair.

As a step in solving Problem 2.1 we need the following proposition, which is a reduction result for impulse control problems.

Proposition 2.3

Suppose there is a $u^*\in\mathcal{U}$ such that $J(u^*)\ge J(u)$ for all $u\in\mathcal{U}^f$. Then $u^*$ is an optimal control for (5), i.e. $J(u^*)\ge J(u)$ for all $u\in\mathcal{U}$.

Proof.

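Pick $\hat u:=(\hat\tau_1,\ldots;\hat\beta_1,\ldots)\in\mathcal{U}\setminus\mathcal{U}^f$. Then there is a $t\in\mathbb{R}_+$ such that $\mathbb{P}[B]>0$ with $B:=\{\omega\in\Omega:\hat N(t)>k,\ \forall k>0\}$, where $\hat N(t):=\max\{j:\hat\tau_j\le t\}\vee0$. Furthermore, by positivity of the intervention costs⁵
$$J(\hat u)\le\sup_{u\in\mathcal{U}}\mathbb{E}[|\varphi(u)|]-k\delta(t)\mathbb{P}[B]\le C-k\delta(t)\mathbb{P}[B],$$
for all $k\ge0$, by Definition 2.2.(i). Letting $k\to\infty$ shows that $J(\hat u)=-\infty$. However, again by Definition 2.2.(i.a) we have $J(\emptyset)\ge-C$. Hence, $\hat u$ is dominated by the strategy of doing nothing and the assertion follows.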

2.2. Relevant properties of HF

We note the following properties:

Lemma 2.4

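  (a) If $h,h'\in\mathcal{H}_{\mathbb{F}}$ (resp. $h,h'\in\mathcal{H}'_{\mathbb{F}}$), then $h+h'$ and $h\vee h'$ are also in $\mathcal{H}_{\mathbb{F}}$ (resp. $\mathcal{H}'_{\mathbb{F}}$).

  (b) If $h\in\mathcal{H}_{\mathbb{F}}$ then there is a $\mathcal{P}_{\mathbb{F}}$-measurable càdlàg process, $h^*$, of class [D], such that $h^*_\tau=\sup_{b\in U}h(\tau,b)$, $\mathbb{P}$-a.s. for each $\tau\in\mathcal{T}^f$. If $h\in\mathcal{H}'_{\mathbb{F}}$ then $h^*$ is quasi-left upper semi-continuous.

  (c) If $h,h'\in\mathcal{H}_{\mathbb{F}}$, then $(\sup_{b\in U}|h(t,b)-h'(t,b)|:t\ge0)$ is $\mathcal{P}_{\mathbb{F}}$-measurable and càdlàg.

  (d) If $(h_k)_{k\ge0}$ is a sequence in $\mathcal{H}_{\mathbb{F}}$ that converges uniformly to some $h$ (outside of a $\mathbb{P}$-null set) then $h\in\mathcal{H}_{\mathbb{F}}$.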

Proof.

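Moving on to property (b), we let $s^l_j:=j2^{-l}$ and note that for $j\ge0$, there is a $\beta^l_{j+1}\in\mathcal{A}(s^l_j)$ such that
$$\sup_{b\in U}\mathbb{E}[h(s^l_{j+1},b)|\mathcal{F}_{s^l_j}]=\mathbb{E}[h(s^l_{j+1},\beta^l_{j+1})|\mathcal{F}_{s^l_j}],\quad\mathbb{P}\text{-a.s.},$$
by Corollary A.5 in Appendix 3. By Theorem A.6 we can define the sequence of càdlàg processes $(\hat h^l)_{l\ge0}$ as
$$\hat h^l_t:=\sum_{j=0}^{\infty}\mathbf{1}_{[s^l_j,s^l_{j+1})}(t)\,\mathbb{E}[h(t,\beta^l_{j+1})|\mathcal{F}_t].$$
Now, we let $\underline{h}^0:=\hat h^0$ and then recursively define $\underline{h}^l:=\underline{h}^{l-1}\vee\hat h^l$ for $l\ge1$. Then, $(\underline{h}^l)_{l\ge0}$ is a non-decreasing sequence of càdlàg processes and $\sup_{b\in U}h(t,b)\ge\underline{h}^l_t$, $\mathbb{P}$-a.s., for all $t\in[0,\infty)$ and all $l\ge0$. Furthermore, for $t\ge0$ we let $\iota_l:=\max\{j:j2^{-l}\le t\}$ and get that
$$\begin{aligned}\sup_{b\in U}h(t,b)-\underline{h}^l_t&=\sup_{b\in U}h(t,b)-\mathbb{E}[h(s^l_{\iota_l+1},\beta^l_{\iota_l+1})|\mathcal{F}_{s^l_{\iota_l}}]+\mathbb{E}[h(s^l_{\iota_l+1},\beta^l_{\iota_l+1})|\mathcal{F}_{s^l_{\iota_l}}]-\mathbb{E}[h(t,\beta^l_{\iota_l+1})|\mathcal{F}_t]\\&\le\sup_{b\in U}\big\{h(t,b)-\mathbb{E}[h(s^l_{\iota_l+1},b)|\mathcal{F}_{s^l_{\iota_l}}]\big\}+\mathbb{E}[h(s^l_{\iota_l+1},\beta^l_{\iota_l+1})|\mathcal{F}_{s^l_{\iota_l}}]-\mathbb{E}[h(t,\beta^l_{\iota_l+1})|\mathcal{F}_t].\end{aligned}$$
Since $s^l_{\iota_l}\nearrow t$ we have, by quasi-left continuity of the filtration and uniform integrability, that
$$\lim_{l\to\infty}\mathbb{E}[h(s^l_{\iota_l+1},\beta^l_{\iota_l+1})|\mathcal{F}_{s^l_{\iota_l}}]=\lim_{l\to\infty}\mathbb{E}[h(s^l_{\iota_l+1},\beta^l_{\iota_l+1})|\mathcal{F}_t].$$
Now, as $s^l_{\iota_l+1}\searrow t$, it follows by Definition 2.1.(ii) that $\underline{h}^l_t\nearrow\sup_{b\in U}h(t,b)$, $\mathbb{P}$-a.s., as $l\to\infty$. We note that $\underline{h}^l$ is an increasing, uniformly bounded, sequence of $\mathcal{P}_{\mathbb{F}}$-measurable càdlàg processes. The sequence thus converges to a $\mathcal{P}_{\mathbb{F}}$-measurable process, $h^*$. It remains to show that $h^*$ is càdlàg, quasi-left upper semi-continuous and that it agrees with $\sup_{b\in U}h(t,b)$ on stopping times.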

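To show that the limit is càdlàg we let
$$\bar h^l_t:=\sum_{j=0}^{\infty}\mathbf{1}_{[s^l_j,s^l_{j+1})}(t)\sup_{(r,b)\in[t,s^l_{j+1})\times U}h(r,b).$$
We note that $\bar h^l$ has left and right limits and that, furthermore, $\lim_{t'\searrow t}\bar h^l_{t'}\le\bar h^l_t$. Now, if $\lim_{t'\searrow t}\bar h^l_{t'}<\bar h^l_t$ then $\bar h^l_t=\sup_{b\in U}h(t,b)=h(t,\beta_t)$ for some $\mathcal{F}_t$-measurable $\beta_t$, but then we would have
$$h(t,\beta_t)>\lim_{t'\searrow t}\bar h^l_{t'}\ge\lim_{t'\searrow t}h(t',\beta_t),$$
contradicting the fact that $(h(s,\beta_t):s\ge t)$ is right continuous. We conclude that $(\bar h^l)_{l\ge0}$ is a non-increasing sequence with $\bar h^l$ a càdlàg process adapted to the shifted filtration $\{\mathcal{F}_{t+2^{-l}}\}$ and $\sup_{b\in U}h(t,b)\le\bar h^l_t$. For $t\ge0$ we know that there is a non-increasing sequence $(\tilde\tau_l)_{l\ge0}$, with $\tilde\tau_l$ an $\mathcal{F}_{t+2^{-l}}$-measurable r.v. taking values in $[t,t+2^{-l})$, such that
$$\lim_{l\to\infty}\bar h^l_t=\lim_{l\to\infty}\sup_{b\in U}h(\tilde\tau_l,b).$$
Now, as $h\in\mathcal{H}_{\mathbb{F}}$ we have $\lim_{t'\searrow t}\sup_{b\in U}|h(t',b)-h(t,b)|=0$. It follows that
$$\lim_{l\to\infty}\bar h^l_t-\sup_{b\in U}h(t,b)=\lim_{l\to\infty}\sup_{b\in U}h(\tilde\tau_l,b)-\sup_{b\in U}h(t,b)\le\lim_{l\to\infty}\sup_{b\in U}\{h(\tilde\tau_l,b)-h(t,b)\}=0,$$
$\mathbb{P}$-a.s., and in particular we find that $\bar h^l_t\searrow h^*_t$, $\mathbb{P}$-a.s. as $l\to\infty$. This gives that for any sequence $t_j\searrow t$ and $l\ge0$,
$$\liminf_{t_j\searrow t}h^*_{t_j}\ge\liminf_{t_j\searrow t}\underline{h}^l_{t_j}=\underline{h}^l_t\quad\text{and}\quad\limsup_{t_j\searrow t}h^*_{t_j}\le\limsup_{t_j\searrow t}\bar h^l_{t_j}=\bar h^l_t.$$
Letting $l$ tend to infinity we find that $\liminf_{t_j\searrow t}h^*_{t_j}=\limsup_{t_j\searrow t}h^*_{t_j}=h^*_t$. Similarly we get existence of left-limits and by the uniform integrability property imposed on members of $\mathcal{H}_{\mathbb{F}}$ we conclude that $h^*$ is a càdlàg, $\mathcal{P}_{\mathbb{F}}$-measurable process of class [D].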

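For $\tau\in\mathcal{T}$, let $(\tau_l)_{l\ge0}$ be a non-increasing sequence of stopping times in $\mathcal{T}^{\Pi}$ (the subset of $\mathcal{T}$ with all stopping times taking values in the countable set $\Pi$) such that $\tau_l\searrow\tau$. We may, for example, set $\tau_l:=\inf\{s\in\Pi_l:s\ge\tau\}$. Since $\Pi$ is countable we have
$$h^*_{\tau_l}=\sup_{b\in U}h(\tau_l,b),\tag{6}$$
$\mathbb{P}$-a.s. Now, by right-continuity we get that $\lim_{l\to\infty}h^*_{\tau_l}=h^*_\tau$. Letting $(\beta_l)_{l\ge0}$ be a sequence of maximizers for the right-hand side of (6) at times $(\tau_l)_{l\ge0}$ we get
$$\limsup_{l\to\infty}\sup_{b\in U}h(\tau_l,b)=\limsup_{l\to\infty}h(\tau_l,\beta_l),$$
$\mathbb{P}$-a.s. Moreover, for each $\omega\in\Omega$ there is a subsequence $(\iota_j(\omega))_{j\ge1}$ such that $\limsup_{l\to\infty}h(\tau_l,\beta_l)=\lim_{j\to\infty}h(\tau_{\iota_j},\beta_{\iota_j})$. Since $U$ is compact, there is a subsequence $(\iota'_j(\omega))_{j\ge1}\subset(\iota_j(\omega))_{j\ge1}$ such that $(\beta_{\iota'_j(\omega)})_{j\ge0}$ converges to some $\tilde\beta(\omega)\in U$ and so we have
$$\limsup_{l\to\infty}\sup_{b\in U}h(\tau_l,b)=\lim_{j\to\infty}\big(h(\tau_{\iota'_j},\beta_{\iota'_j})-h(\tau,\beta_{\iota'_j})\big)+\lim_{j\to\infty}\big(h(\tau,\beta_{\iota'_j})-h(\tau,\tilde\beta)\big)+h(\tau,\tilde\beta).$$
Now, $\lim_{j\to\infty}(h(\tau_{\iota'_j},\beta_{\iota'_j})-h(\tau,\beta_{\iota'_j}))=0$, $\mathbb{P}$-a.s., by Definition 2.1.(ii) and $\lim_{j\to\infty}(h(\tau,\beta_{\iota'_j})-h(\tau,\tilde\beta))\le0$, $\mathbb{P}$-a.s., by the upper semi-continuity declared in Definition 2.1.(iii). Further, as $h(\tau,\tilde\beta)\le\sup_{b\in U}h(\tau,b)$ we conclude that $\limsup_{l\to\infty}\sup_{b\in U}h(\tau_l,b)\le\sup_{b\in U}h(\tau,b)$, $\mathbb{P}$-a.s. On the other hand, there is a $\beta^*\in\mathcal{A}(\tau)$ such that $\sup_{b\in U}h(\tau,b)=h(\tau,\beta^*)$, $\mathbb{P}$-a.s., and we have
$$\liminf_{l\to\infty}\sup_{b\in U}h(\tau_l,b)=\liminf_{l\to\infty}h(\tau_l,\beta_l)\ge\lim_{l\to\infty}h(\tau_l,\beta^*)=h(\tau,\beta^*),$$
$\mathbb{P}$-a.s. This implies that the limit exists with $\lim_{l\to\infty}\sup_{b\in U}h(\tau_l,b)=\sup_{b\in U}h(\tau,b)$, $\mathbb{P}$-a.s., establishing that $h^*_\tau=\sup_{b\in U}h(\tau,b)$, $\mathbb{P}$-a.s. Finally, if $h\in\mathcal{H}'_{\mathbb{F}}$ then quasi-left upper semi-continuity of $h^*$ is immediate from Definition 2.1 and (b) follows.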

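Property (c) follows similarly by noting that for any $\epsilon_l>0$ we can choose $\beta^l_{j+1}\in\mathcal{A}(s^l_j)$ such that
$$\sup_{b\in U}\mathbb{E}\big[|h(s^l_{j+1},b)-h'(s^l_{j+1},b)|\,\big|\mathcal{F}_{s^l_j}\big]\le\mathbb{E}\big[|h(s^l_{j+1},\beta^l_{j+1})-h'(s^l_{j+1},\beta^l_{j+1})|\,\big|\mathcal{F}_{s^l_j}\big]+\epsilon_l$$
and we can choose the sequence $(\epsilon_l)_{l\ge0}$ such that $\epsilon_l\to0$.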

Concerning the last property we note that for each $\epsilon>0$ we can, by uniform convergence, choose a $\mathbb{P}$-a.s. finite $k(\omega)\ge0$ such that $|h(t,b)-h_k(t,b)|\le\epsilon$ for all $(t,b)\in\mathbb{R}_+\times U$. Then,
$$\limsup_{(t',b')\to(t,b)}h(t',b')-\liminf_{(t',b')\to(t,b)}h(t',b')\le\limsup_{(t',b')\to(t,b)}h_k(t',b')-\liminf_{(t',b')\to(t,b)}h_k(t',b')+2\epsilon=2\epsilon$$
and property (i) in Definition 2.1 follows as $\epsilon>0$ was arbitrary. The remaining properties follow similarly.
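The proof above approximates a stopping time from the right by grid-valued times, $\tau_l:=\inf\{s\in\Pi_l:s\ge\tau\}$. For a fixed outcome this is simply rounding up to the dyadic grid of mesh $2^{-l}$. The sketch below illustrates this for a deterministic value; the function name `dyadic_upper` is hypothetical, and exact rational arithmetic is used only to avoid floating-point rounding in the illustration.

```python
import math
from fractions import Fraction


def dyadic_upper(tau: Fraction, level: int) -> Fraction:
    """Smallest point of the grid {0, 2^-l, 2*2^-l, ...} that is >= tau."""
    scale = 2 ** level
    return Fraction(math.ceil(tau * scale), scale)


tau = Fraction(1, 3)
approximations = [dyadic_upper(tau, l) for l in range(6)]
```

For $\tau=1/3$ the approximations are $1,\ 1/2,\ 1/2,\ 3/8,\ 3/8,\ 11/32$: a non-increasing sequence that stays above $\tau$ and is within $2^{-l}$ of it at level $l$.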

3. A verification theorem

Our approach to finding a solution to Problem 2.1 is based on deriving an optimal control under the assumption that a specific family of processes exists, and then showing that the family does indeed exist. We will refer to any such family of processes as a verification family. Before making precise the concept of a verification family we introduce the notion of consistency:

Definition 3.1

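We refer to a family of processes $((X^v_t)_{t\ge0}:v\in\mathcal{U}^{\lim})$ as being consistent if for each $u\in\mathcal{U}^{\lim}$, the map $h:\mathbb{R}_+\times\Omega\times U\to\mathbb{R}$ given by $h(t,b)=X^{u\circ(t,b)}_t$ is $\mathcal{P}_{\mathbb{F}}\otimes\mathcal{B}(U)$-measurable and for each $\tau\in\mathcal{T}$ and each $\beta\in\mathcal{A}(\tau)$ we have $X^{u\circ(\tau,\beta)}_\tau=h(\tau,\beta)$, $\mathbb{P}$-a.s.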

We are now ready to state the definition of a verification family:

Definition 3.2

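We define a verification family to be a consistent family of càdlàg supermartingales $((Y^v_s)_{s\ge0}:v\in\mathcal{U}^{\lim})$ such that for each $v\in\mathcal{U}^{\lim}$:

  (a) The family satisfies the recursion
$$Y^v_s=\operatorname*{ess\,sup}_{\tau\in\mathcal{T}_s}\mathbb{E}\Big[\mathbf{1}_{[\tau=\infty]}\varphi(v)+\mathbf{1}_{[\tau<\infty]}\sup_{b\in U}\big\{-c^v_{\tau,b}+Y^{v\circ(\tau,b)}_\tau\big\}\Big|\mathcal{F}_s\Big].\tag{7}$$

  (b) The family is uniformly bounded in the sense that $\sup_{u\in\mathcal{U}^{\lim}}\mathbb{E}[\sup_{s\in[0,\infty]}|Y^u_s|^2]<\infty$.

  (c) The map $(t,b)\mapsto Y^{v\circ(t,b)}_t$ belongs to $\mathcal{H}'_{\mathbb{F}}$.

  (d) $\sup_{u\in\mathcal{U}^{\lim}}\mathbb{E}\big[\sup_{s\in[T,\infty]}|Y^u_s-\mathbb{E}[\varphi(u)|\mathcal{F}_s]|\big]\to0$, as $T\to\infty$.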

The purpose of the present section is to reduce the solution of Problem 2.1 to showing existence of a verification family. This is accomplished by the following verification theorem (the proof of which follows along the lines of the proof of Theorem 2 in [9]):

Theorem 3.3

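Assume that there exists a verification family $((Y^v_s)_{s\ge0}:v\in\mathcal{U}^{\lim})$ and let:

  • the sequence $(\tau^*_j)_{j=1}^{\infty}$ be given by
$$\tau^*_j:=\inf\Big\{s\ge\tau^*_{j-1}:Y^{[u^*]_{j-1}}_s=\sup_{b\in U}\big\{-c^{[u^*]_{j-1}}_{s,b}+Y^{[u^*]_{j-1}\circ(s,b)}_s\big\}\Big\},\tag{8}$$
using the convention that $\inf\emptyset=\infty$, with $\tau^*_0=0$ and set $N^*=\sup\{j\ge0:\tau^*_j<\infty\}$;

  • the sequence $(\beta^*_j)_{j=1}^{\infty}$ be defined recursively as a measurable selection of
$$\beta^*_j\in\operatorname*{arg\,max}_{b\in U}\big\{-c^{[u^*]_{j-1}}_{\tau^*_j,b}+Y^{[u^*]_{j-1}\circ(\tau^*_j,b)}_{\tau^*_j}\big\}.\tag{9}$$

Then $u^*=(\tau^*_1,\ldots,\tau^*_{N^*};\beta^*_1,\ldots,\beta^*_{N^*})\in\mathcal{U}^f$ is an optimal control for (5) in the sense that $J(u^*)=\sup_{u\in\mathcal{U}}J(u)$. Moreover, the family is unique (i.e. there is at most one verification family, up to indistinguishability of the maps $t\mapsto Y^v_t$ and $(t,b)\mapsto Y^{v\circ(t,b)}_t$) and $Y_0=\sup_{u\in\mathcal{U}}J(u)$ (where $Y:=Y^{\emptyset}$).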
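The construction in the theorem, "intervene the first time the value equals the best post-intervention value, and pick a maximizing impulse", can be illustrated on a deterministic, discrete-time toy problem. The sketch below is not the setting of the theorem: there is no randomness, the horizon is finite, at most one intervention per time step is allowed, and all names and numbers (`HORIZON`, `IMPULSES`, `COST`, the reward `x` and the decay `x -> max(x - 1, 0)`) are assumptions chosen for illustration.

```python
HORIZON = 6       # number of time steps (toy stand-in for the time axis)
IMPULSES = (3,)   # hypothetical finite control set U
COST = 2.0        # constant intervention cost, bounded below by a positive number


def reward(x: int) -> float:
    return float(x)


def decay(x: int) -> int:
    return max(x - 1, 0)


def solve():
    """Backward induction for the toy value function."""
    states = range(max(IMPULSES) + 1)
    value = [{x: 0.0 for x in states} for _ in range(HORIZON + 1)]
    wait = [{x: 0.0 for x in states} for _ in range(HORIZON)]
    for t in reversed(range(HORIZON)):
        for x in states:
            # Collect the reward at t without intervening, then continue optimally.
            wait[t][x] = reward(x) + value[t + 1][decay(x)]
        jump = max(-COST + wait[t][b] for b in IMPULSES)
        for x in states:
            value[t][x] = max(wait[t][x], jump)
    return value, wait


def optimal_control(x0: int = 0):
    """Read off intervention times and impulses, mimicking rules (8) and (9)."""
    value, wait = solve()
    times, impulses, total, x = [], [], 0.0, x0
    for t in range(HORIZON):
        best = max(IMPULSES, key=lambda b: -COST + wait[t][b])  # analogue of (9)
        if value[t][x] == -COST + wait[t][best]:                # analogue of (8)
            times.append(t)
            impulses.append(best)
            total -= COST
            x = best
        total += reward(x)
        x = decay(x)
    return times, impulses, total, value[0][x0]


times, impulses, total, v0 = optimal_control()
```

In this toy the rule produces interventions at times `[0, 2, 4]`, each with impulse `3`, and the realized reward `total` coincides with the value `v0` at the initial state, which is the discrete counterpart of the identity $Y_0=J(u^*)$.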

Proof.

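The proof is divided into three steps where we first, in Step 1, show that for any $0\le j\le N^*$ we have
$$Y^{[u^*]_j}_s=\mathbb{E}\Big[\mathbf{1}_{[\tau^*_{j+1}=\infty]}\varphi(u^*)+\mathbf{1}_{[\tau^*_{j+1}<\infty]}\big\{-c([u^*]_{j+1})+Y^{[u^*]_{j+1}}_{\tau^*_{j+1}}\big\}\Big|\mathcal{F}_s\Big],\tag{10}$$
$\mathbb{P}$-a.s. for $s\in[\tau^*_j,\tau^*_{j+1}]$. Following this, in Step 2, we show that $Y_0=J(u^*)$. Then in Step 3 we show that $u^*$ is the optimal control, establishing (i) and (ii). A straightforward generalization to arbitrary initial conditions $v\in\mathcal{U}^{\lim}$ then gives that
$$Y^v_s=\operatorname*{ess\,sup}_{u\in\mathcal{U}_s}\mathbb{E}\Big[\varphi(v\circ u)-\sum_{j=1}^{N}c(v\circ[u]_j)\Big|\mathcal{F}_s\Big],\tag{11}$$
by which uniqueness follows. Below we refer to the properties of a verification family in Definition 3.2 simply as properties (a), (b), (c) and (d).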

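Step 1 We start by showing that for each $v\in\mathcal{U}^{\lim}$ the recursion (7) can be written in terms of an $\mathbb{F}$-stopping time and that the inner supremum is attained, $\mathbb{P}$-a.s. In particular, this will imply the existence of a maximizer in (9). From (7) and consistency we note that $Y^v$ is the smallest supermartingale that dominates
$$R^v:=\Big(\mathbf{1}_{[s=\infty]}\mathbb{E}[\varphi(v)|\mathcal{F}]+\mathbf{1}_{[s<\infty]}\sup_{b\in U}\big\{-c^v_{s,b}+Y^{v\circ(s,b)}_s\big\}:0\le s\le\infty\Big).\tag{12}$$
By Property (c) and Definition 2.2.(ii) we have that the map $(t,b)\mapsto-c^v_{t,b}+Y^{v\circ(t,b)}_t$ belongs to $\mathcal{H}'_{\mathbb{F}}$. It thus follows from Lemma 2.4.(b) that $R^v$ is a càdlàg process of class [D] that is quasi-left upper semi-continuous on $[0,\infty)$. Furthermore, by Property (d) and positivity of the intervention costs we note that $\limsup_{t\to\infty}R^v_t\le R^v_\infty$, $\mathbb{P}$-a.s. By Theorem A.1.(iii) in Appendix 2 and consistency we conclude that for any $\theta\in\mathcal{T}^f$, the stopping time $\tau_\theta\in\mathcal{T}_\theta$ given by
$$\tau_\theta:=\inf\Big\{s\ge\theta:Y^v_s=\sup_{b\in U}\big\{-c^v_{s,b}+Y^{v\circ(s,b)}_s\big\}\Big\}$$
is such that
$$Y^v_\theta=\mathbb{E}\Big[\mathbf{1}_{[\tau_\theta=\infty]}\varphi(v)+\mathbf{1}_{[\tau_\theta<\infty]}\sup_{b\in U}\big\{-c^v_{\tau_\theta,b}+Y^{v\circ(\tau_\theta,b)}_{\tau_\theta}\big\}\Big|\mathcal{F}_\theta\Big].$$
Now, since $(t,b)\mapsto-c^v_{t,b}+Y^{v\circ(t,b)}_t\in\mathcal{H}'_{\mathbb{F}}$, the map $b\mapsto-c^v_{\tau_\theta,b}+Y^{v\circ(\tau_\theta,b)}_{\tau_\theta}$ is $\mathcal{F}_{\tau_\theta}\otimes\mathcal{B}(U)$-measurable and u.s.c. on $\{\tau_\theta<\infty\}\setminus\mathcal{N}$ for some $\mathbb{P}$-null set $\mathcal{N}$. Corollary A.5 of Appendix 3 and consistency then imply that there is a $\beta_\theta\in\mathcal{A}(\tau_\theta)$ such that
$$Y^v_\theta=\mathbb{E}\Big[\mathbf{1}_{[\tau_\theta=\infty]}\varphi(v)+\mathbf{1}_{[\tau_\theta<\infty]}\big\{-c(v\circ(\tau_\theta,\beta_\theta))+Y^{v\circ(\tau_\theta,\beta_\theta)}_{\tau_\theta}\big\}\Big|\mathcal{F}_\theta\Big],$$
$\mathbb{P}$-a.s., and in particular (10) holds. As mentioned above, this also implies the existence of an $\mathcal{F}_{\tau^*_j}$-measurable $\beta^*_j$ satisfying (9).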

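Step 2 We now show that $Y_0=J(u^*)$. We start by noting that $Y$ is the Snell envelope of
$$\Big(\mathbf{1}_{[s=\infty]}\mathbb{E}[\varphi(\emptyset)|\mathcal{F}]+\mathbf{1}_{[s<\infty]}\sup_{b\in U}\big\{-c_{s,b}+Y^{(s,b)}_s\big\}:0\le s\le\infty\Big)$$
and by Step 1 we thus have (since $\mathcal{F}_0$ is trivial) that
$$Y_0=\mathbb{E}\Big[\mathbf{1}_{[\tau^*_1=\infty]}\varphi(\emptyset)+\mathbf{1}_{[\tau^*_1<\infty]}\big\{-c(\tau^*_1,\beta^*_1)+Y^{(\tau^*_1,\beta^*_1)}_{\tau^*_1}\big\}\Big].$$
Moving on we pick $j\in\{1,\ldots,N^*\}$ and note that $[u^*]_j\in\mathcal{U}^j$. But then, by Step 1, we have that
$$Y^{[u^*]_j}_{\tau^*_j}=\mathbb{E}\Big[\mathbf{1}_{[\tau^*_{j+1}=\infty]}\varphi(u^*)+\mathbf{1}_{[\tau^*_{j+1}<\infty]}\big\{-c([u^*]_{j+1})+Y^{[u^*]_{j+1}}_{\tau^*_{j+1}}\big\}\Big|\mathcal{F}_{\tau^*_j}\Big].$$
By induction we get that for each $K\ge0$ we have
$$Y_0=\mathbb{E}\Big[\mathbf{1}_{[N^*\le K]}\varphi(u^*)-\sum_{j=1}^{N^*\wedge K}c([u^*]_j)+\mathbf{1}_{[N^*>K]}\big\{-c([u^*]_{K+1})+Y^{[u^*]_{K+1}}_{\tau^*_{K+1}}\big\}\Big].\tag{13}$$
Now, arguing as in the proof of Proposition 2.3 and using Property (b) we find that $u^*\in\mathcal{U}^f$. To show that the right-hand side of (13) equals $J(u^*)$ we note that (13) can be rewritten as
$$Y_0=\mathbb{E}\Big[\varphi([u^*]_{K+1})-\sum_{j=1}^{N^*\wedge(K+1)}c([u^*]_j)+\mathbf{1}_{[N^*>K]}\big\{Y^{[u^*]_{K+1}}_{\tau^*_{K+1}}-\varphi([u^*]_{K+1})\big\}\Big],$$
which gives
$$|Y_0-J(u^*)|\le\mathbb{E}\big[|\varphi([u^*]_{K+1})-\varphi(u^*)|\big]+\mathbb{E}\Big[\sum_{j=K+2}^{N^*}c([u^*]_j)\Big]+\Big|\mathbb{E}\Big[\mathbf{1}_{[N^*>K]}\big\{Y^{[u^*]_{K+1}}_{\tau^*_{K+1}}-\varphi([u^*]_{K+1})\big\}\Big]\Big|\tag{14}$$
for all $K\ge0$. For the first term on the right-hand side we note that for any $T>0$ we have
$$\begin{aligned}\mathbb{E}\big[|\varphi([u^*]_{K+1})-\varphi(u^*)|\big]&\le\mathbb{E}\big[\mathbf{1}_{[\tau^*_{K+2}<T]}|\varphi([u^*]_{K+1})-\varphi(u^*)|\big]+\sup_{u\in\mathcal{U}^{\lim},\,v\in\mathcal{U}_T}\mathbb{E}\big[|\varphi(u\circ v)-\varphi(u)|\big]\\&\le2\,\mathbb{P}[\tau^*_{K+2}<T]^{1/2}\sup_{u\in\mathcal{U}}\mathbb{E}\big[|\varphi(u)|^2\big]^{1/2}+\sup_{u\in\mathcal{U}^{\lim},\,v\in\mathcal{U}_T}\mathbb{E}\big[|\varphi(u\circ v)-\varphi(u)|\big],\end{aligned}$$
where we have used Hölder's inequality to arrive at the last inequality. As $(\varphi,c)$ is an admissible reward pair, Definition 2.2.(iii) gives that the second term can be made arbitrarily small by choosing $T$ sufficiently large and Definition 2.2.(i) implies that the first term tends to zero as $K\to\infty$ for all finite $T$, since $u^*\in\mathcal{U}^f$. We thus conclude that the first term on the right-hand side in (14) tends to zero as $K\to\infty$.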

For the second term we note that letting $K\to\infty$ in (13) and using Property (b) and Definition 2.2.(i.a) we find that $\sum_{j=1}^{N^*\wedge l}c([u^*]_j)$ converges increasingly to a limit in $L^2(\Omega,\mathbb{P})$ as $l\to\infty$. Hence, the second term on the right-hand side of (14) also tends to zero as $K\to\infty$.

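Conditioning on $\mathcal{F}_{\tau^*_{K+1}}$ in the third term of the right-hand side of (14) and noting that $\{N^*>K\}$ is $\mathcal{F}_{\tau^*_{K+1}}$-measurable⁶ we find that, similar to the above case, we have for any $T\ge0$ that
$$\begin{aligned}\Big|\mathbb{E}\Big[\mathbf{1}_{[N^*>K]}\big\{Y^{[u^*]_{K+1}}_{\tau^*_{K+1}}-\varphi([u^*]_{K+1})\big\}\Big]\Big|&\le\mathbb{E}\Big[\big|Y^{[u^*]_{K+1}}_{\tau^*_{K+1}}-\mathbb{E}[\varphi([u^*]_{K+1})|\mathcal{F}_{\tau^*_{K+1}}]\big|\Big]\\&\le C\,\mathbb{P}[\tau^*_{K+1}<T]^{1/2}+\sup_{u\in\mathcal{U}^{\lim}}\mathbb{E}\Big[\sup_{s\in[T,\infty]}\big|Y^u_s-\mathbb{E}[\varphi(u)|\mathcal{F}_s]\big|\Big],\end{aligned}$$
where the second term can be made arbitrarily small by Property (d) and we conclude that $Y_0=J(u^*)$.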

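Step 3 It remains to show that the strategy $u^*$ is optimal. To do this we pick any other strategy $\hat u:=(\hat\tau_1,\ldots,\hat\tau_{\hat N};\hat\beta_1,\ldots,\hat\beta_{\hat N})\in\mathcal{U}^f$. By Step 2 and the definition of $Y_0$ in (7) we have
$$\begin{aligned}J(u^*)=Y_0&\ge\mathbb{E}\Big[\mathbf{1}_{[\hat N=0]}\varphi(\emptyset)+\mathbf{1}_{[\hat N>0]}\sup_{b\in U}\big\{-c_{\hat\tau_1,b}+Y^{(\hat\tau_1;b)}_{\hat\tau_1}\big\}\Big]\\&\ge\mathbb{E}\Big[\mathbf{1}_{[\hat N=0]}\varphi(\emptyset)+\mathbf{1}_{[\hat N>0]}\big\{-c(\hat\tau_1;\hat\beta_1)+Y^{(\hat\tau_1;\hat\beta_1)}_{\hat\tau_1}\big\}\Big],\end{aligned}$$
but in the same way
$$Y^{(\hat\tau_1,\hat\beta_1)}_{\hat\tau_1}\ge\mathbb{E}\Big[\mathbf{1}_{[\hat N=1]}\varphi(\hat\tau_1,\hat\beta_1)+\mathbf{1}_{[\hat N>1]}\big\{-c(\hat\tau_1,\hat\tau_2;\hat\beta_1,\hat\beta_2)+Y^{(\hat\tau_1,\hat\tau_2;\hat\beta_1,\hat\beta_2)}_{\hat\tau_2}\big\}\Big|\mathcal{F}_{\hat\tau_1}\Big],$$
$\mathbb{P}$-a.s. Repeating this procedure $K$ times gives
$$J(u^*)\ge\hat Y^K:=\mathbb{E}\Big[\mathbf{1}_{[\hat N\le K]}\varphi(\hat u)-\sum_{j=1}^{\hat N\wedge K}c([\hat u]_j)+\mathbf{1}_{[\hat N>K]}\big\{-c([\hat u]_{K+1})+Y^{[\hat u]_{K+1}}_{\hat\tau_{K+1}}\big\}\Big].$$
Now, we have
$$\begin{aligned}J(\hat u)-\hat Y^K&\le\mathbb{E}\Big[\mathbf{1}_{[\hat N>K]}\big\{\varphi(\hat u)-Y^{[\hat u]_{K+1}}_{\hat\tau_{K+1}}\big\}\Big]\\&=\mathbb{E}\Big[\mathbf{1}_{[\hat N>K]}\big\{\varphi([\hat u]_{K+1})-Y^{[\hat u]_{K+1}}_{\hat\tau_{K+1}}\big\}\Big]+\mathbb{E}\Big[\mathbf{1}_{[\hat N>K]}\big\{\varphi(\hat u)-\varphi([\hat u]_{K+1})\big\}\Big]\\&\le\mathbb{E}\Big[\big|\mathbb{E}[\varphi([\hat u]_{K+1})|\mathcal{F}_{\hat\tau_{K+1}}]-Y^{[\hat u]_{K+1}}_{\hat\tau_{K+1}}\big|\Big]+\mathbb{E}\big[|\varphi(\hat u)-\varphi([\hat u]_{K+1})|\big],\end{aligned}$$
where the right-hand side tends to zero as $K\to\infty$ by repeating the argument in Step 2, which is possible since $\hat u\in\mathcal{U}^f$. We conclude that $J(u^*)\ge J(\hat u)$ for all $\hat u\in\mathcal{U}^f$ and it follows by Proposition 2.3 that $u^*$ is an optimal control for Problem 2.1.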

4. Existence of the verification family

Theorem 3.3 presumes existence of the verification family $((Y^v_s)_{s\ge0}:v\in\mathcal{U}^{\lim})$. To obtain a satisfactory solution to Problem 2.1, we thus need to establish that a verification family exists. This is the topic of the present section. We will follow the standard existence proof which goes by applying a Picard iteration (see [5,11,15]). We first show that there exists a sequence of consistent families of processes $((Y^{v,k}_s)_{s\ge0}:v\in\mathcal{U}^{\lim})_{k\ge0}$ that satisfy the recursion
$$Y^{v,0}_s:=\mathbb{E}[\varphi(v)|\mathcal{F}_s]\tag{15}$$
and
$$Y^{v,k}_s:=\operatorname*{ess\,sup}_{\tau\in\mathcal{T}_s}\mathbb{E}\Big[\mathbf{1}_{[\tau=\infty]}\varphi(v)+\mathbf{1}_{[\tau<\infty]}\sup_{b\in U}\big\{-c^v_{\tau,b}+Y^{v\circ(\tau,b),k-1}_\tau\big\}\Big|\mathcal{F}_s\Big]\tag{16}$$
for $k\ge1$. Then, we show that the limit family obtained by letting $k\to\infty$ is a verification family.
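The iteration (15)–(16) indexes the value by the number of interventions still allowed: level $0$ never intervenes, and level $k$ either waits or pays the cost and continues at level $k-1$. A deterministic, discrete-time caricature of this recursion is sketched below. It is only meant to show the monotone, stabilizing behaviour in $k$; the finite horizon, the state dynamics and all constants (`HORIZON`, `IMPULSES`, `COST`) are illustrative assumptions and not part of the article's setting.

```python
from functools import lru_cache

HORIZON = 6       # number of time steps (toy horizon)
IMPULSES = (3,)   # hypothetical finite control set U
COST = 2.0        # constant intervention cost, bounded below by a positive number


def reward(x: int) -> float:
    return float(x)


def decay(x: int) -> int:
    return max(x - 1, 0)


@lru_cache(maxsize=None)
def value(k: int, t: int, x: int) -> float:
    """Toy value at time t and state x with at most k interventions left."""
    if t == HORIZON:
        return 0.0
    wait = reward(x) + value(k, t + 1, decay(x))
    if k == 0:
        return wait  # level 0: never intervene, cf. (15)
    # Level k: intervene now and continue at level k - 1, cf. (16).
    intervene = max(-COST + value(k - 1, t, b) for b in IMPULSES)
    return max(wait, intervene)


levels = [value(k, 0, 0) for k in range(6)]
```

In this toy, `levels` evaluates to `[0.0, 4.0, 8.0, 9.0, 9.0, 9.0]`: the values increase with the number of allowed interventions and stop growing once additional interventions no longer pay for their cost, mirroring the increasing limit taken in this section.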

Proposition 4.1

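There is a sequence of consistent families of càdlàg supermartingales $((Y^{v,k}_s)_{s\ge0}:v\in\mathcal{U}^{\lim})_{k\ge0}$ such that for each $v\in\mathcal{U}^{\lim}$:

  (a) The sequence satisfies the recursion (15)–(16).

  (b) There is a $K>0$ (that does not depend on $k$) such that $\sup_{u\in\mathcal{U}^{\lim}}\mathbb{E}[\sup_{s\in[0,\infty]}|Y^{u,k+1}_s|^2]\le K$.

  (c) For each $k\ge0$, the map $(t,b)\mapsto Y^{v\circ(t,b),k}_t$ belongs to $\mathcal{H}'_{\mathbb{F}}$.

  (d) $\sup_{u\in\mathcal{U}^{\lim}}\mathbb{E}\big[\sup_{t\in[T,\infty]}|Y^{u,k}_t-\mathbb{E}[\varphi(u)|\mathcal{F}_t]|\big]\to0$, as $T\to\infty$, uniformly in $k$.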

The proof of Proposition 4.1 will be based on two lemmas and the following induction hypothesis:

Hypothesis (VF.k). There is a sequence of consistent families of càdlàg supermartingales $((Y^{v,k'}_s)_{s\ge0}:v\in\mathcal{U}^{\lim})_{0\le k'\le k}$ such that for $k'=0,\ldots,k$ and $v\in\mathcal{U}^{\lim}$:

  (a) The relation (15) holds for $k'=0$ and (16) holds for $k'>0$.

  (b) The map $(t,b)\mapsto Y^{v\circ(t,b),k'}_t$ belongs to $\mathcal{H}'_{\mathbb{F}}$.

We note that Hypothesis VF.k lacks properties (b) and (d) of Proposition 4.1. In the following two lemmas we show that these are implicit.

Lemma 4.2

Assume that Hypothesis VF.k holds for some $k\ge0$. Then, the sequence of families of processes $((Y^{v,k'}_s)_{s\ge0}:v\in\mathcal{U}^{\lim})_{0\le k'\le k}$ is well defined and uniformly bounded in the sense that there is a $K>0$ (that does not depend on $k$) such that
$$\sup_{u\in\mathcal{U}^{\lim}}\mathbb{E}\Big[\sup_{s\in[0,\infty]}|Y^{u,k+1}_s|^2\Big]\le K.$$
Furthermore, for each $v\in\mathcal{D}^f$ (whenever it is well defined) the collection $\{\sup_{b\in U}\{-c^v_{\tau,b}+Y^{v\circ(\tau,b),k}_\tau\}:\tau\in\mathcal{T}^f,\ k\ge0\}$ of random variables is uniformly integrable.

Proof.

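We note that under Hypothesis VF.k, the sequence of families $((Y^{v,k'}_s)_{s\ge0}:v\in\mathcal{U}^{\lim})_{0\le k'\le k+1}$ exists and is uniquely defined up to indistinguishability for each $Y^{v,k'}$ by repeated application of Theorem A.1 in Appendix 2. By the definition of $Y^{v,k+1}$ and positivity of the intervention costs we have that for any $v\in\mathcal{U}^{\lim}$,
$$\mathbb{E}[\varphi(v)|\mathcal{F}_s]\le Y^{v,k+1}_s\le\operatorname*{ess\,sup}_{u\in\mathcal{U}^{k+1}_s}\mathbb{E}[\varphi(v\circ u)|\mathcal{F}_s].$$
For $k\ge0$, we define the càdlàg supermartingale $Z^k_s:=\operatorname*{ess\,sup}_{u\in\mathcal{U}^k_s}\mathbb{E}[\varphi(v\circ u)|\mathcal{F}_s]$ and the stopping times $\tau^{z,k}:=\inf\{s\ge0:|Z^k_s|^p\ge z\}$ for all $z\ge0$. Then, by Definition 2.2.(i.a) we have
$$\mathbb{E}\big[|Z^k_{\tau^{z,k}}|^p\big]\le\mathbb{E}\Big[\operatorname*{ess\,sup}_{u\in\mathcal{U}^k_{\tau^{z,k}}}\mathbb{E}\big[|\varphi(v\circ u)|^p\big|\mathcal{F}_{\tau^{z,k}}\big]\Big]\le\sup_{u\in\mathcal{U}^{\lim}}\mathbb{E}[|\varphi(u)|^p]\le C.$$
In particular, by right-continuity this implies that
$$\mathbb{P}\Big[\sup_{s\in[0,\infty)}|Z^k_s|^p\ge z\Big]\le Cz^{-1}\quad\text{or}\quad\mathbb{P}\Big[\sup_{s\in[0,\infty)}|Z^k_s|^2\ge z\Big]\le Cz^{-p/2}\wedge1,$$
where $C$ does not depend on $v$ and $k$. The first assertion now follows as
$$\mathbb{E}\Big[\sup_{s\in[0,\infty)}|Z^k_s|^2\Big]\le\int_0^\infty\big(Cz^{-p/2}\wedge1\big)\,dz<\infty.$$
Concerning the second claim, note that for each $\tau\in\mathcal{T}^f$ and each $\epsilon>0$, repeating the proof of Corollary A.5 in Appendix 3 with $g^\epsilon:=g-\epsilon$ instead of $g$ we find that there is a $\beta^\epsilon\in\mathcal{A}(\tau)$ such that
$$\sup_{b\in U}\big\{-c^v_{\tau,b}+Y^{v\circ(\tau,b),k}_\tau\big\}\le-c^v_{\tau,\beta^\epsilon}+Y^{v\circ(\tau,\beta^\epsilon),k}_\tau+\epsilon.$$
Now,
$$\begin{aligned}\mathbb{E}\Big[\sup_{b\in U}\big|-c^v_{\tau,b}+Y^{v\circ(\tau,b),k}_\tau\big|^2\Big]&\le2\,\mathbb{E}\Big[\big|-c^v_{\tau,\beta^\epsilon}+Y^{v\circ(\tau,\beta^\epsilon),k}_\tau\big|^2\Big]+2\epsilon^2\\&\le4\,\mathbb{E}\big[|c(v\circ(\tau,\beta^\epsilon))|^2\big]+4\,\mathbb{E}\Big[\sup_{t\in[0,\infty)}\big|Y^{v\circ(\tau,\beta^\epsilon),k}_t\big|^2\Big]+2\epsilon^2,\end{aligned}$$
where the right-hand side is bounded, uniformly in $(\tau,\beta^\epsilon)$ and $k\ge0$, by the above in combination with Definition 2.2.(i.b) and Doob's maximal inequality.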
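The role of the requirement $p>2$ in the first assertion can be seen by evaluating the tail integral explicitly. The short computation below is a sketch under the assumption that the tail bound has the form $\mathbb{P}[\sup_s|Z^k_s|^2\ge z]\le Cz^{-p/2}\wedge1$ with a constant $C>0$; the threshold $z_0$ is introduced here only for the calculation.

```latex
\begin{align*}
\int_0^\infty \big(Cz^{-p/2}\wedge 1\big)\,dz
  &= \int_0^{z_0} 1\,dz + C\int_{z_0}^\infty z^{-p/2}\,dz,
     \qquad z_0 := C^{2/p},\\
  &= C^{2/p} + \frac{2C}{p-2}\,z_0^{\,1-p/2}
   = C^{2/p} + \frac{2}{p-2}\,C^{2/p}
   = \frac{p}{p-2}\,C^{2/p}.
\end{align*}
```

The second integral converges precisely because $p/2>1$, and the resulting bound depends only on $C$ and $p$, which is why it is uniform in $v$ and $k$.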

We also have the following diminishing future impact property:

Lemma 4.3

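Assume again that Hypothesis VF.k holds for some $k\ge0$. Then,
$$\sup_{u\in\mathcal{U}^{\lim}}\mathbb{E}\Big[\sup_{t\in[T,\infty]}\big|Y^{u,k+1}_t-\mathbb{E}[\varphi(u)|\mathcal{F}_t]\big|\Big]\to0,$$
as $T\to\infty$, uniformly in $k$.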

Proof.

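By the properties of the essential supremum and positivity of the intervention costs we have for every $v\in\mathcal{U}^{\lim}$,
$$\mathbb{E}[\varphi(v)|\mathcal{F}_t]\le Y^{v,k+1}_t\le\operatorname*{ess\,sup}_{u\in\mathcal{U}^f_t}\mathbb{E}[\varphi(v\circ u)|\mathcal{F}_t].$$
This implies that
$$\big|Y^{v,k+1}_t-\mathbb{E}[\varphi(v)|\mathcal{F}_t]\big|\le\operatorname*{ess\,sup}_{u\in\mathcal{U}^f_t}\mathbb{E}\big[|\varphi(v\circ u)-\varphi(v)|\,\big|\mathcal{F}_t\big].$$
The desired result now follows by a similar argument to the one used in the proof of Lemma 4.2 and Definition 2.2.(iii).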

Proof of Proposition 4.1.

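First, note that by Definition 2.2.(ii) there is an $h\in\mathcal{H}'_{\mathbb{F}}$ such that for $\tau\in\mathcal{T}^f$ we have
$$h(\tau,\beta)=\mathbb{E}\big[-c(v\circ(\tau,\beta))+\varphi(v\circ(\tau,\beta))\big|\mathcal{F}_\tau\big],\tag{17}$$
for all $\beta\in\mathcal{A}(\tau)$, $\mathbb{P}$-a.s. The statement thus holds for $k=0$.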

Moving on we assume that VF.k holds for some $k\ge0$. But then, by Lemmas 4.2 and 4.3 we can apply a reasoning similar to that in the proof of Theorem 3.3 to find that $Y^{v,k+1}$ is a càdlàg supermartingale with
$$Y^{v,k+1}_t=\operatorname*{ess\,sup}_{u\in\mathcal{U}^{k+1}_t}\mathbb{E}\Big[\varphi(v\circ u)-\sum_{j=1}^{N}c(v\circ[u]_j)\Big|\mathcal{F}_t\Big].\tag{18}$$
By Definition 2.2.(ii) it follows that there is a consistent family satisfying (18) such that $(t,b)\mapsto Y^{v\circ(t,b),k+1}_t\in\mathcal{H}'_{\mathbb{F}}$ and we conclude that VF.k+1 holds as well. By induction this extends to all $k\ge0$.

The objective in the remainder of this section is to show that the limit family that we get when letting $k\to\infty$ in $((Y^{v,k}_s)_{s\ge0}:v\in\mathcal{U}^{\lim})$ is a verification family.

Proposition 4.4

For each $v\in\mathcal{U}^{\lim}$, the limit $\bar Y^v:=\lim_{k\to\infty}Y^{v,k}$ exists as an increasing pointwise limit, $\mathbb{P}$-a.s.

Proof.

Since $\mathcal{U}^k_t\subset\mathcal{U}^{k+1}_t$ we have that $Y^{v,k}_t\le Y^{v,k+1}_t$, $\mathbb{P}$-a.s. Moreover, by Proposition 4.1 the sequence is bounded $\mathbb{P}$-a.s.; thus, it converges $\mathbb{P}$-a.s. for all $t\in[0,\infty]$.

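To assess the type of convergence that we have for the sequence $Y^{v,k}$, we introduce a sequence of families of processes corresponding to a truncation of the time interval. For each $T>0$ and $k\ge0$, we define the consistent family $(({}^{T}Y^{v,k}_t)_{t\ge0}:v\in\mathcal{U}^{\lim})$ of càdlàg supermartingales as
$${}^{T}Y^{v,k}_t=\operatorname*{ess\,sup}_{u\in\mathcal{U}^k_{[t,T)}}\mathbb{E}\Big[\varphi(v\circ u)-\sum_{j=1}^{N}c(v\circ[u]_j)\Big|\mathcal{F}_t\Big]$$
for all $v\in\mathcal{U}^{\lim}$ with $(t,b)\mapsto{}^{T}Y^{v\circ(t,b),k}_t\in\mathcal{H}_{\mathbb{F}}$. Then,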

Lemma 4.5

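The sequence $(({}^{T}Y^{v,k}_s)_{s\ge0}:v\in\mathcal{U}^{\lim})_{k\ge0}$ satisfies:

  (i) $Y^{v,0}\le{}^{T}Y^{v,k}\le Y^{v,k}$.

  (ii) For each $\epsilon>0$ there is a $T\ge0$ such that $\mathbb{P}\big[\bigcup_{k=0}^{\infty}B^{T,\epsilon}_k\big]<\epsilon$, with $B^{T,\epsilon}_k:=\big\{\omega\in\Omega:\sup_{s\in[0,\infty]}\sup_{b\in U}\big|Y^{v\circ(s,b),k}_s-{}^{T}Y^{v\circ(s,b),k}_s\big|>\epsilon\big\}$.

  (iii) There is a $\mathbb{P}$-a.s. finite $\mathcal{F}$-measurable random variable $\xi$ and a constant $q>0$ such that $\sup_{t\in[0,\infty]}\sup_{b\in U}\big|{}^{T}Y^{v\circ(t,b),k}_t-{}^{T}Y^{v\circ(t,b),k'}_t\big|\le\xi/(k')^q$, $\mathbb{P}$-a.s. for each $0<k'\le k$.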

Proof.

The inequality in (i) follows from noting that $\mathcal{U}^k_{[t,T)}\subset\mathcal{U}^k_t$.

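For the second statement we note that by Lemma 2.4.(c), the process $\big(\sup_{b\in U}\big|Y^{v\circ(s,b),k}_s-{}^{T}Y^{v\circ(s,b),k}_s\big|:s\ge0\big)$ is $\mathcal{P}_{\mathbb{F}}$-measurable and càdlàg. Now, each $u\in\mathcal{U}^k_\tau$ can be decomposed as $u=u_1\circ u_2$ with $u_1\in\mathcal{U}^k_{[\tau,T)}$ and $u_2\in\mathcal{U}^k_T$, which implies that

$$\begin{aligned}Y^{v\circ(\tau,\beta),k}_\tau-{}^{T}Y^{v\circ(\tau,\beta),k}_\tau&=\operatorname*{ess\,sup}_{u\in\mathcal{U}^k_\tau}\mathbb{E}\Big[\varphi(v\circ(\tau,\beta)\circ u)-\sum_{j=1}^{N}c(v\circ(\tau,\beta)\circ[u]_j)\Big|\mathcal{F}_\tau\Big]\\&\quad-\operatorname*{ess\,sup}_{u\in\mathcal{U}^k_{[\tau,T)}}\mathbb{E}\Big[\varphi(v\circ(\tau,\beta)\circ u)-\sum_{j=1}^{N}c(v\circ(\tau,\beta)\circ[u]_j)\Big|\mathcal{F}_\tau\Big]\\&\le\operatorname*{ess\,sup}_{u_1\in\mathcal{U}^k_{[\tau,T)},\,u_2\in\mathcal{U}^k_T}\mathbb{E}\Big[\varphi(v\circ(\tau,\beta)\circ u_1\circ u_2)-\sum_{j=1}^{N_1+N_2}c(v\circ(\tau,\beta)\circ[u_1\circ u_2]_j)\Big|\mathcal{F}_\tau\Big]\\&\quad-\operatorname*{ess\,sup}_{u\in\mathcal{U}^k_{[\tau,T)}}\mathbb{E}\Big[\varphi(v\circ(\tau,\beta)\circ u)-\sum_{j=1}^{N}c(v\circ(\tau,\beta)\circ[u]_j)\Big|\mathcal{F}_\tau\Big]\\&\le\operatorname*{ess\,sup}_{u_1\in\mathcal{U}^k_{[\tau,T)},\,u_2\in\mathcal{U}^k_T}\mathbb{E}\big[|\varphi(v\circ(\tau,\beta)\circ u_1\circ u_2)-\varphi(v\circ(\tau,\beta)\circ u_1)|\,\big|\mathcal{F}_\tau\big],\end{aligned}\tag{19}$$
where $N_1$ and $N_2$ are the number of interventions in $u_1$ and $u_2$, respectively. We thus define the sets
$$\tilde B^{T,\epsilon}_k:=\Big\{\omega\in\Omega:\sup_{s\in[0,\infty]}\sup_{b\in U}\operatorname*{ess\,sup}_{u_1\in\mathcal{U}^k_{[0,T)},\,u_2\in\mathcal{U}^k_T}\mathbb{E}\big[|\varphi(v\circ(s,b)\circ u_1\circ u_2)-\varphi(v\circ(s,b)\circ u_1)|\,\big|\mathcal{F}_s\big]\ge\epsilon\Big\}$$
and have by (19) that $B^{T,\epsilon}_k\subset\tilde B^{T,\epsilon}_k$ for $k\ge0$. Furthermore, as $\mathcal{U}^k_{[\tau,T)}\subset\mathcal{U}^{k+1}_{[\tau,T)}$ and $\mathcal{U}^k_T\subset\mathcal{U}^{k+1}_T$ we find that $\tilde B^{T,\epsilon}_k\subset\tilde B^{T,\epsilon}_{k+1}$ for all $k\ge0$.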

Now, let τkϵ:=inf{s0:supbUesssupu1U[0,T)k,u2UTkE[|φ(v(s,b)u1u2)φ(v(s,b)u1)||Fs]ϵ}(recalling our convention that inf=) and pick βkϵ such that supbUesssupu1U[0,T)k,u2UTkE[|φ(v(τkϵ,b)u1u2)φ(v(τkϵ,b)u1)||Fτkϵ]esssupu1U[0,T)k,u2UTkE[|φ(v(τkϵ,βkϵ)u1u2)φ(v(τkϵ,βkϵ)u1)||Fτkϵ]+ϵ/2.Then, by right continuity we have B~kT,ϵBˆkT,ϵ where BˆkT,ϵ:={ωΩ:esssupu1U[0,T)k,u2UTkE[|φ(v(τkϵ,βkϵ)u1u2)φ(v(τkϵ,βkϵ)u1)||Fτkϵ]ϵ/2}.We thus only need to show that there is a T>0 such that P[BˆkT,ϵ]<ϵ for all k0. We have, E[esssupu1U[0,T)k,u2UTkE[|φ(v(τkϵ,βkϵ)u1u2)φ(v(τkϵ,βkϵ)u1)||Fτkϵ]]supu1U[0,T)lim,u2UTfE[|φ(u1u2)φ(u1)||],where right-hand side is independent of k and tends to 0 as T by Definition 2.2.(iii). We thus conclude that there is a T=T(ϵ) such that P[BˆkT,ϵ]<ϵ for all k0.

Concerning the third statement, we note that for r(1,2), we have for each τT and all βA(τ), that TYτv(τ,β),ksups[0,]Ysv(τ,β),ksups[0,]esssupuUsfE[|φ(v(τ,β)u)||Fs]1+sups[0,]esssupuUsfE[|φ(v(τ,β)u)|r|Fs]=:K(ω)and similarly TYτv(τ,β),kK(ω)for all k0 (where the inequalities hold P-a.s.). Now, arguing as in the proof of Proposition 4.2 we have E[sups[0,]esssupuUsfE[|φ(v(τ,β)u)|r|Fs]2/r]<and we conclude that there is a P-null set N such that for each ωΩN we have K(ω)<.

For ϵ>0, let uk,ϵ:=(τ1k,ϵ,,τNk,ϵk,ϵ;β1k,ϵ,,βNk,ϵk,ϵ)U[τ,T)k be such that TYτv(τ,β),kE[φ(v(τ,β)uk,ϵ)j=1Nk,ϵc(v(τ,β)[uk,ϵ]j)|Fτ]+ϵ.We note that on [τT] we have Nk,ϵ=0 and get that for ωΩN (in the remainder of the proof N denotes a generic P-null set), we have KϵE[φ(v(τ,β)uk,ϵ)j=1Nk,ϵc(v(τ,β)[uk,ϵ]j)|Fτ]Kδ(T)E[Nk,ϵ|Fτ]and we conclude that E[1[Nk,ϵ>k]|Fτ]<(2K+ϵ)/(δ(T)k) for all k0.

Now, for all 0kk we have, TY˘τv(τ,β),k,k:=E[φ(v(τ,β)[uk,ϵ]k)j=1Nk,ϵkc(v(τ,β)[uk,ϵ]j)|Fτ]TYτv(τ,β),kTYτv(s,b),k,where we have introduced TY˘τv(τ,β),k,k corresponding to the truncation [uk,ϵ]k:=

(τ1k,ϵ,,τNk,ϵkk,ϵ;β1k,ϵ,,βNk,ϵkk,ϵ) of uk,ϵ. As the truncation only affects the performance of the controller when Nk,ϵ>k we have TYτv(τ,β),kTY˘τv(τ,β),k,kE[1[Nk,ϵ>k](φ(vuk,ϵ)j=k+1Nk,ϵc(v(τ,β)[uk,ϵ]j)φ(v(τ,β)[uk,ϵ]k))|Fτ]+ϵE[1[Nk,ϵ>k](φ(v(τ,β)uk,ϵ)φ(v(τ,β)[uk,ϵ]k))|Fτ]+ϵ.Applying Hölder's inequality we get that for ωΩN, TYτv(τ,β),kTY˘τv(τ,β),k,k2E[1[Nk,ϵ>k]|Fτ]1/qesssupuUτfE[|φ(v(τ,β)u)|r|Fτ]1/r+ϵ21+1qK(ω)+ϵ(δ(T)k)1/q+ϵ,with 1r+1q=1. Since δ(T)>0, there is thus a P-a.s. finite F-measurable r.v. ξ=ξ(ω) such that (for all τ and β) we have |TYτv(τ,β),kTYτv(τ,β),k|ξ(k)1/q+Cϵ.Since, βA(τ) was arbitrary we can choose β such that supbU|TYτv(τ,b),kTYτv(τ,b),k||TYτv(τ,β),kTYτv(τ,β),k|+ϵξ(k)1/q+Cϵ,P-a.s. and by right-continuity the last statement follows as ϵ>0 was arbitrary.

Proposition 4.6

For each $v\in\mathcal{U}^{lim}$, we have $\sup_{t\in[0,\infty]}\sup_{b\in U}|Y_t^{v\circ(t,b),k}-\bar Y_t^{v\circ(t,b)}|\to 0$ as $k\to\infty$, outside of a P-null set.

Proof.

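By Lemma 4.5.(ii) there exist, for each $\epsilon>0$, a $T\ge 0$ and a measurable set $B\subset\Omega$ with $\mathbb P[B]\ge 1-\epsilon$ such that $\sup_{t\in[0,\infty]}\sup_{b\in U}|Y_t^{v\circ(t,b),k}-Y_t^{v\circ(t,b),k'}|\le\sup_{t\in[0,\infty]}\sup_{b\in U}|{}^TY_t^{v\circ(t,b),k}-{}^TY_t^{v\circ(t,b),k'}|+2\epsilon$ for all $0\le k'\le k$ and $\omega\in B$. Furthermore, by Lemma 4.5.(iii) there is a P-a.s. finite r.v. $\xi$ such that $\sup_{t\in[0,\infty]}\sup_{b\in U}|{}^TY_t^{v\circ(t,b),k}-{}^TY_t^{v\circ(t,b),k'}|\le\xi(k')^{-q}$. Combining these and taking the limit as $k,k'\to\infty$ we find that $\lim_{k'\to\infty}\sup_{t\in[0,\infty]}\sup_{b\in U}|\bar Y_t^{v\circ(t,b)}-Y_t^{v\circ(t,b),k'}|\le\lim_{k'\to\infty}\lim_{k\to\infty}\sup_{t\in[0,\infty]}\sup_{b\in U}|Y_t^{v\circ(t,b),k}-Y_t^{v\circ(t,b),k'}|\le 2\epsilon$ on $B\setminus N$ for some P-null set $N$. Now, as $\epsilon>0$ was arbitrary the statement follows.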

We are now ready to show that a verification family exists, establishing the existence of optimal controls for Problem 1.

Proposition 4.7

A verification family exists.

Proof.

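Letting $h^k(t,b):=Y_t^{v\circ(t,b),k}$ we have by Proposition 4.6 that $h^k(t,b)$ converges uniformly in $(t,b)$ to $\bar h(t,b):=\bar Y_t^{v\circ(t,b)}$ as $k\to\infty$ (outside of a P-null set). Since $h^k\in\mathcal{H}_{\mathbb F}$ by Proposition 4.1.c), we have by Lemma 2.4.(d) that $\bar h\in\mathcal{H}_{\mathbb F}$. In particular, we conclude that property c) in the definition of a verification family holds for the limit family $((\bar Y^v_s)_{s\ge 0}:v\in\mathcal{U}^{lim})$.

Moreover, for each $\tau\in\mathcal T$ and $\beta\in\mathcal A(\tau)$ we have that $Y_\tau^{v\circ(\tau,\beta),k}\to\bar Y_\tau^{v\circ(\tau,\beta)}$, P-a.s., as $k\to\infty$ and we conclude by consistency of $((Y^{v,k}_s)_{s\ge 0}:v\in\mathcal{U}^{lim})$ for each $k\ge 0$ that $\bar Y_\tau^{v\circ(\tau,\beta)}=\bar h(\tau,\beta)$, P-a.s., implying consistency of $((\bar Y^v_s)_{s\ge 0}:v\in\mathcal{U}^{lim})$.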

We treat each of the remaining properties separately:

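(a) By the above and (b) of Lemma 2.4 we have that $\sup_{b\in U}\{-c^v_{s,b}+\bar Y_s^{v\circ(s,b)}\}$ is a càdlàg, quasi-left upper semi-continuous process of class [D]. In particular we note that $1_{[s=\infty]}\varphi(v)+1_{[s<\infty]}\sup_{b\in U}\{-c^v_{s,b}+\bar Y_s^{v\circ(s,b)}\}$ is càdlàg. Applying (iv) of Theorem A.1 in Appendix 2 then gives $\bar Y^v_s=\operatorname*{ess\,sup}_{\tau\in\mathcal T_s}\mathbb E[1_{[\tau=\infty]}\varphi(v)+1_{[\tau<\infty]}\sup_{b\in U}\{-c^v_{\tau,b}+\bar Y_\tau^{v\circ(\tau,b)}\}\,|\,\mathcal F_s]$.

(b) By Proposition 4.1 we have that $\sup_{u\in\mathcal{U}^{lim}}\mathbb E[\sup_{t\in[0,\infty]}|Y^{u,k}_t|^2]$ is uniformly bounded in $k$. From this it follows immediately that $\sup_{u\in\mathcal{U}^{lim}}\mathbb E[\sup_{t\in[0,\infty]}|\bar Y^{u}_t|^2]<\infty$.

(d) We have that $\sup_{u\in\mathcal{U}^{lim}}\mathbb E[\sup_{t\in[T,\infty]}|\bar Y^u_t-\mathbb E[\varphi(u)|\mathcal F_t]|]\le\sup_{u\in\mathcal{U}^{lim}}\mathbb E[\sup_{t\in[0,\infty]}|\bar Y^u_t-Y^{u,k}_t|]+\sup_{u\in\mathcal{U}^{lim}}\mathbb E[\sup_{t\in[T,\infty]}|Y^{u,k}_t-\mathbb E[\varphi(u)|\mathcal F_t]|]$, where the first term on the right-hand side can be made arbitrarily small by choosing $k$ sufficiently large and the last term tends to 0 as $T\to\infty$ for all $k\ge 0$.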

5. Application to impulse control of SFDEs

In [Citation24] a finite horizon impulse control problem with a discrete set U was solved when the underlying process followed a stochastic delay differential equation (SDDE) under a loop condition on the impulses. This problem was motivated by hydropower operation where the flow-times between different power plants induce delays in the dynamics of the controlled system.

In this section we extend the results from [Citation24] by considering a discounted infinite horizon setting, allowing an uncountable control set U and also by taking the dynamics of the underlying process to follow a stochastic functional differential equation. Furthermore, our prior treatment of the problem with abstract reward, φ, and intervention cost, c, allows us to consider a less restrictive set of assumptions on the coefficients in the problem formulation. In particular, we are able to remove the loop condition.

Our treatment of non-Markovian impulse control problems in infinite horizon should also be compared to [Citation10] where an infinite horizon impulse control problem in a non-Markovian framework with a fixed discrete delay is considered. The work presented in this section goes in a different direction by having an underlying dynamics driven by a Lévy process that is affected by the impulses in the control, resulting in a more complex relation between the control and the output of the performance functional. Furthermore, we investigate the important extension to random horizon which turns out to be a trivial modification of our initial problem.

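Throughout this section, we will only consider controls for which $\tau_j\to\infty$, P-a.s., and restrict our attention to the setting when the underlying uncertainty stems from a process $X^u$, with $u\in\mathcal{U}^f$, defined as $X^u:=\lim_{k\to\infty}X^{u,k}$ where

(20) $X^{u,0}_t=x(t)$, for $t\in(-\infty,0)$,

(21) $X^{u,0}_t=x(0)+\int_0^t a(s,(X^{u,0}_r)_{r\le s})\,ds+\int_0^t\sigma(s,(X^{u,0}_r)_{r\le s})\,dB_s+\int_0^t\int_{\mathbb R^d\setminus\{0\}}\gamma(s,(X^{u,0}_r)_{r<s},z)\,\tilde P(ds,dz)$, for $t\ge 0$,

for some $x\in\mathbb D$, the set of all (deterministic) uniformly bounded, càdlàg functions $x:\mathbb R\to\mathbb R^d$, and

(22) $X^{u,k}_t=X^{u,k-1}_t$, for $t\in[0,\tau_k)$,

(23) $X^{u,k}_t=\Gamma(\tau_k,(X^{u,k-1}_s)_{s\le\tau_k},\beta_k)+\int_{\tau_k}^t a(s,(X^{u,k}_r)_{r\le s})\,ds+\int_{\tau_k}^t\sigma(s,(X^{u,k}_r)_{r\le s})\,dB_s+\int_{\tau_k+}^t\int_{\mathbb R^d\setminus\{0\}}\gamma(s,(X^{u,k}_r)_{r<s},z)\,\tilde P(ds,dz)$, for $t\ge\tau_k$.

The dynamics of $X^u$ are driven by a $d$-dimensional Brownian motion $B$ and a Poisson random measure $P$ with intensity measure $\varrho(ds;dz)=ds\times\mu(dz)$, where $\mu(dz)$ is the Lévy measure on $\mathbb R^d$ of $P$ and $\tilde P(ds;dz):=(P-\varrho)(ds;dz)$ is called the compensated jump martingale random measure of $P$. We assume that $\mathbb G:=\{\mathcal G_t\}_{t\ge 0}$ is the natural filtration generated by $B$ and $P$, with $\mathcal G=\mathcal G_\infty:=\lim_{t\to\infty}\mathcal G_t$.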
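The recursive construction above (evolve the path, restart it from the post-impulse state at each intervention time) can be illustrated with a rough Euler scheme. The following is only a sketch under simplifying assumptions that are not part of the paper: the state is scalar, time is discretized on a uniform grid, the jump measure has a single unit mark with a hypothetical rate `lam`, and the Poisson increment is approximated by a Bernoulli draw. The function name `euler_controlled_path` and all argument names are illustrative.

```python
import math
import random


def euler_controlled_path(x0, a, sigma, gamma, Gamma, impulses, T, n, lam=0.0, seed=0):
    """Euler approximation of a scalar controlled path on the grid t_i = i*T/n.

    a, sigma, gamma take (t, history), where history is the list of grid values
    simulated so far, so they may depend on the whole past of the path.
    Gamma takes (t, history, beta) and returns the post-impulse state.
    impulses is a list of (tau, beta) pairs.
    """
    rng = random.Random(seed)
    dt = T / n
    path = [x0]
    pending = sorted(impulses)
    j = 0
    for i in range(n):
        t = i * dt
        # Apply every impulse scheduled at or before the current grid time.
        while j < len(pending) and pending[j][0] <= t + 1e-12:
            path[-1] = Gamma(t, path, pending[j][1])
            j += 1
        x = path[-1]
        dW = rng.gauss(0.0, math.sqrt(dt))
        # Bernoulli approximation of the number of unit-mark jumps in [t, t+dt).
        dN = 1.0 if rng.random() < lam * dt else 0.0
        # Drift, diffusion and compensated jump increments.
        x_next = (x + a(t, path) * dt + sigma(t, path) * dW
                  + gamma(t, path) * (dN - lam * dt))
        path.append(x_next)
    return path


# Degenerate check: no drift, noise or jumps, one additive impulse at time 0.5.
demo = euler_controlled_path(
    1.0,
    a=lambda t, h: 0.0,
    sigma=lambda t, h: 0.0,
    gamma=lambda t, h: 0.0,
    Gamma=lambda t, h, beta: h[-1] + beta,
    impulses=[(0.5, 2.0)],
    T=1.0,
    n=10,
)
```

A delayed drift can be passed as, for example, `a=lambda t, h: -h[max(0, len(h) - 1 - lag)]` for a hypothetical integer `lag` measured in grid steps.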

As mentioned above, we assume that all uncertainty comes from the process $X^u$ and consider the discounted setting with a continuous discount factor $\rho:\mathbb R_+\to\mathbb R_+$. The reward functional is then

(24) $J(u)=\mathbb E\Big[\int_0^\infty e^{-\rho(t)}\phi(t,X^u_t)\,dt-\sum_{j=1}^N e^{-\rho(\tau_j)}\ell(\tau_j,X^{u,j-1}_{\tau_j},\beta_j)\Big]$.
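For a single simulated path, the quantity inside the expectation in the reward functional can be approximated by a Riemann sum for the running reward minus the discounted intervention costs. The sketch below assumes a uniform time grid and a finite truncation of the horizon; the function `discounted_reward` and its argument layout are hypothetical and only meant to make the structure of the functional concrete. Averaging the result over many simulated paths would give a Monte Carlo estimate of $J(u)$.

```python
import math


def discounted_reward(path, dt, rho, phi, ell, interventions):
    """Left Riemann approximation of the pathwise reward on a truncated horizon.

    path[i] approximates X at time i*dt; interventions is a list of
    (tau, x_before, beta) triples, where x_before is the state that is
    fed to the intervention cost.
    """
    running = sum(math.exp(-rho(i * dt)) * phi(i * dt, x) * dt
                  for i, x in enumerate(path[:-1]))
    costs = sum(math.exp(-rho(tau)) * ell(tau, x, beta)
                for tau, x, beta in interventions)
    return running - costs


# Constant running reward 1, discount rho(t) = t, one intervention at time 0
# with cost 0.5: the exact value on an infinite horizon would be 1 - 0.5.
n, dt = 20000, 0.001
value = discounted_reward(
    [0.0] * (n + 1), dt,
    rho=lambda t: t,
    phi=lambda t, x: 1.0,
    ell=lambda t, x, beta: 0.5,
    interventions=[(0.0, 0.0, 1.0)],
)
```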

5.1. Assumptions

We assume that the involved coefficients satisfy the following constraints:

Assumption 5.1

For any t,t0, b,bU, x,xRd and y,yD and for some q2 and p>2 we have:

  1. The function Γ:R+×D×URd satisfies the Lipschitz condition |Γ(t,(ys)st,b)Γ(t,(ys)st,b)|C(tt|ysys|ds+|ytyt|+(|tt|+|bb|)(1+supst|ys|+supst|ys|))and the growth condition |Γ(t,(ys)st,b)|KΓ|yt|.for some constant KΓ>0.

  2. The coefficients a:R+×DRd and σ:R+×DRd×d are continuous in t and satisfy the growth condition |a(t,(ys)st)|+|σ(t,(ys)st)|C(1+supst|ys|)and the Lipschitz continuity |a(t,(ys)st)a(t,(ys)st)|+|σ(t,(ys)st)σ(t,(ys)st)|Csupst|ysys|,0t|a(s,(yr)rs)a(s,(yr)rs)|dsCt|ysys|ds0t|σ(s,(yr)rs)σ(s,(yr)rs)|2dsCt|ysys|2ds.

  3. There is a γ¯:RdR+, with γ¯pq(z)μ(dz)< such that γ:R+×D×RdRd satisfies |γ(t,(ys)s<t,z)|γ¯(z)(1+sups<t|ys|),|γ(t,(ys)s<t,z)γ(t,(ys)s<t),z)|γ¯(z)sups<t|ysys|,0t|γ(s,(yr)r<s,z)γ(s,(yr)r<s,z)|2(m+2)dsγ¯2(m+2)(z)t|ysys|2(m+2)ds.

  4. The running reward ϕ:R+×RdR is B(R+×Rd)-measurable and satisfies the growth condition |ϕ(t,x)|C(1+|x|q).Moreover, there is a non-decreasing function Cϕ:R+R+ such that for all L0, |ϕ(t,x)ϕ(t,x)|Cϕ(L)|xx|,whenever |x||x|L.

  5. There is a finite collection of closed connected subsets (Ui)i=1M of U and corresponding maps i:R+×Rd×UiR that are jointly continuous in (t,x,b), bounded from below, i.e.  i(t,x,b)δ>0,of polynomial growth, |i(t,x,b)|C(1+|x|q),and locally Lipschitz in x, i.e.  there is a non-decreasing function C:R+R+ such that for all L0, |i(t,x,b)i(t,x,b)|C(L)|xx|,whenever |x||x|L, and we have (t,x,b)=mini:bUii(t,x,b).

Note that the growth condition on $\Gamma$ in (i), i.e. $|\Gamma(t,(y_s)_{s\le t},b)|\le K_\Gamma\vee|y_t|$, implies that interventions can only increase the magnitude of the state $X_t$ as long as $|X_t|<K_\Gamma$. In particular, this avoids the problem of explosions in finite time due to impulses.

Remark 5.1

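To see that the above SFDE is a generalization of discrete delay SDDEs with Lipschitz coefficients note that if $\chi:\mathbb R_+\times(\mathbb R^d)^{k+1}\to\mathbb R$ satisfies $|\chi(t,x_1,\ldots,x_{k+1})-\chi(t,x'_1,\ldots,x'_{k+1})|\le C(|x'_1-x_1|+\cdots+|x'_{k+1}-x_{k+1}|)$ for each $(x_1,\ldots,x_{k+1}),(x'_1,\ldots,x'_{k+1})\in(\mathbb R^d)^{k+1}$, then for $l\ge 0$ we have $\int_0^t|\chi(s,y'_s,y'_{s-\delta_1},\ldots,y'_{s-\delta_k})-\chi(s,y_s,y_{s-\delta_1},\ldots,y_{s-\delta_k})|^l\,ds\le C\int_0^t(|y'_s-y_s|+|y'_{s-\delta_1}-y_{s-\delta_1}|+\cdots+|y'_{s-\delta_k}-y_{s-\delta_k}|)^l\,ds\le C\int_{-\infty}^t|y'_s-y_s|^l\,ds$.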

Remark 5.2

In the above assumptions the involved coefficients are all deterministic. We remark that a trivial extension is to allow these to depend on ω as well in which case the coefficients in the Lipschitz conditions can be taken to be non-decreasing, P-a.s. finite, PG-measurable càdlàg  processes.

The motivation for allowing intervention costs that are discontinuous in b is the important application of production systems, where increasing the production beyond a certain threshold may necessitate a costly startup of additional production units.
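The startup-cost situation can be made concrete with a small sketch of a cost of the form "minimum over the pieces that contain $b$". Everything here is illustrative: the control set is taken to be the interval [0, 2], split into two closed pieces that overlap at the threshold 1, the dependence on $(t,x)$ is suppressed, and the numbers are made up.

```python
def intervention_cost(b, pieces):
    """Return the minimum of ell_i(b) over the pieces (lo, hi, ell_i) with lo <= b <= hi."""
    candidates = [ell_i(b) for lo, hi, ell_i in pieces if lo <= b <= hi]
    if not candidates:
        raise ValueError("b lies outside the control set")
    return min(candidates)


pieces = [
    (0.0, 1.0, lambda b: 1.0 + b),  # production covered by the unit already running
    (1.0, 2.0, lambda b: 3.0 + b),  # second unit needed: additional startup cost
]
```

At the threshold both pieces apply and the cheaper one is selected, so the cost jumps from 2 to values near 4 as soon as $b$ exceeds 1, while each individual piece stays continuous.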

5.2. Existence of optimal controls

In this section we show that the problem of maximizing the reward functional (24) has a solution. Throughout we will, for notational simplicity, only consider the one-dimensional case ($d=1$), but we note that all results extend trivially to higher dimensions. We start with the following moment estimate:

Proposition 5.2

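Under Assumption 5.1, the SFDE (20)–(23) admits a unique solution for each $u\in\mathcal{U}^f$. Furthermore, the solution has moments of order $pq$ on compacts; in particular we have, for $T>0$, that

(25) $\sup_{u\in\mathcal{U}^f}\mathbb E[\sup_{t\in[0,T]}|X^u_t|^{pq}]\le C$,

where $C=C(T,pq)$, and for each $v\in\mathcal{U}^{lim}$, we have

(26) $\mathbb E\big[\sup_{t\in[0,T]}\operatorname*{ess\,sup}_{u\in\mathcal{U}^f_t}\big|\mathbb E[\sup_{s\in[t,T]}|X^{v\circ u}_s|^{q}\,|\,\mathcal F_t]\big|^2\big]\le C$,

where $C=C(T,q)$.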

Proof.

By repeated use of Theorem 3.2 in [Citation1] existence and uniqueness of solutions to (Equation20)–(Equation23) follows since τj, P-a.s. By Assumption 5.1.(i) we get, for t[τj,T], using integration by parts, that |Xtu,j|2=|Xτju,j|2+2τj+tXsu,jdXsu,j+τj+td[Xu,j,Xu,j]sKΓ2|Xτju,j1|2+2τj+tXsu,jdXsu,j+τj+td[Xu,j,Xu,j]s.We note that if |Xtu,j|>KΓ and |Xsu,j|KΓ for some s[0,t) then there is a largest time ζ<t such that |Xζu,j|KΓ. This means that during the interval (ζ,t] interventions will not increase the magnitude |Xu,j|. By induction, since |x0| is finite, we find that |Xtu,j|2C+i=0j{2ζ(τ~i+)tτ~i+1Xsu,idXsu,i+ζ(τ~i+)tτ~i+1d[Xu,i,Xu,i]s}for all t[0,T], where ζ=sup{s0:|Xsu|KΓ}0, τ~0+=0, τ~i=τi for i=1,,j and τ~j+1=. Letting Rt:=i=0j{2τ~i+tτ~i+1Xsu,idXsu,i+τ~i+tτ~i+1d[Xu,i,Xu,j]s}we thus find that for p2, E[sups[t,T]|Xsu,j|p|Ft]C(1+E[|Xtu,j|p+sups[t,T]|RsRt|p/2|Ft]).Now, since Xu,i and Xu,j coincide on [0,τi+1j+1) we have i=0jτ~i+tτ~i+1Xsu,idXsu,i=0tXsu,ja(s,(Xru,j)rs)ds+0tXsu,jσ(s,(Xru,j)rs)dWs+0tRd{0}Xsu,jγ(s,(Xru,j)r<s,z)P~(ds,dz),and i=0jτ~i+tτ~i+1d[Xu,i,Xu,j]s=0tσ2(s,(Xru,j)rs)ds+0tRd{0}γ2(s,(Xru,j)r<s,z)P(ds,dz).From Assumption 5.1.(ii)-(iii) and the Burkholder-Davis-Gundy inequality we get that E[suptsT|RsRt|p/2|Ft]CE[|tT(1+suprs|Xru,j|4)ds|p/4+|tT(1+suprs|Xru,j|2)ds|p/2|Ft]C(1+Tp/21)E[tT(1+suprs|Xru,j|p)ds|Ft]and Grönwall's lemma gives that (27) E[sups[t,T]|Xsu,j|p|Ft]C(1+E[sups[0,t]|Xsu,j|p|Ft]),(27) P-a.s., where the constant C=C(T,q) does not depend on u or j and (Equation25) follows by letting t=0. We now give a more straightforward way of showing (Equation26) than the method used in the proof of Lemma 4.2. Applying (Equation27) to the left-hand side of (Equation26) we get E[supt[0,T]esssupuUtf|E[sups[t,T]|Xsvu|q|Ft]|2]C(1+E[supt[0,T]|E[sups[0,t]|Xsv|q|Ft]|2+sup(t,b)[0,T]×U|E[|Γ(t,Xtv,b)|q|Ft]|2])C(1+E[supt[0,T]|E[sups[0,t]|Xsv|q|Ft]|2])C(1+E[supt[0,T]|Xtv|2q])and the desired result follows from (Equation25).

Lemma 5.3

For each k0, there is a P-null set N such that for all ωΩN and all (t,b)Dk the limit lim(t,b)(t,b)Xt,b exists in the topology of uniform convergence on compact subsets of R+{t1,,tk}. Furthermore, for all (t,b)R+×U, we have limttlimbbsups[t,T]|Xsv(t,b)uXsv(t,b)u|=0,P-a.s., for any T0, vUlim and uUk (with an exception set that is independent of (t,b)).

Proof.

Our proof will rely on a pre-localization argument and we introduce the following non-decreasing sequence of stopping times $\kappa_K:=\inf\{s\ge 0:\int_{\mathbb R^d\setminus\{0\}}\bar\gamma(z)P(\{s\},dz)\ge K\}$, for $K\ge 0$, and set $\Lambda_K:=[0,\kappa_K)$. By Assumption 5.1.(iii) it then follows that $\kappa_K\to\infty$, P-a.s. as $K\to\infty$. Furthermore, we note that on $\Lambda_K$ the magnitudes of the jumps of $X^u$ due to the Poisson jump integral of (20)–(23) are bounded by $C+K\sup_{s\le t}|X^u_s|$ and repeating the argument in the proof of Proposition 5.2 gives that

(28) $\sup_{u\in\mathcal{U}^f}\mathbb E[\sup_{s\in[0,T]\cap\Lambda_K}|X^u_s|^{l}]\le C$,

for all $l\ge 0$.
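Since a Poisson random measure with intensity $ds\times\mu(dz)$ charges at most one mark at any given time, the stopping time $\kappa_K$ is simply the first jump time whose mark $z$ has $\bar\gamma(z)\ge K$. A minimal sketch of this reading, for one realized list of jump times and marks (the function name and the sample data are made up):

```python
import math


def prelocalization_time(jump_times, marks, gamma_bar, K):
    """First jump time s at which gamma_bar(mark) >= K; math.inf if there is none."""
    for s, z in sorted(zip(jump_times, marks)):
        if gamma_bar(z) >= K:
            return s
    return math.inf


times = [0.3, 1.2, 2.5]
marks = [0.1, -2.0, 0.5]
kappas = [prelocalization_time(times, marks, abs, K) for K in (0.05, 1.0, 5.0)]
```

The resulting values are non-decreasing in `K`, and the convention that the infimum of the empty set is infinite corresponds to returning `math.inf`.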

For 0ttT, we let t,tXv solve the SFDE (Equation20)–(Equation23) with integrand (11(t,t](s))γ(s,,) in the jump part and let j0 be the largest integer such that τjt. Then by Assumption 5.1.(i) we have for l=1,,j+1 (recalling that [u]l=(τ1,,τNl;β1,,βNl) is the truncation of u limiting the number of interventions to l), |t,tXtv[(t,b)u]lt,tXtv[(t,b)u]l|C(tτl1|t,tXsv[(t,b)u]l1t,tXsv[(t,b)u]l1|ds+|t,tXtv[(t,b)u]l1t,tXtv[(t,b)u]l1|+|t,tXtv[(t,b)u]l1t,tXτl1v[(t,b)u]l1|+((tτl1)+1[l=1]|bb|)(1+supst|t,tXsv[(t,b)u]l1|+supsτl|t,tXsv[(t,b)u]l1|)),with τ0:=t. We define 1Xl:=t,tXv[(t,b)u]l, 2Xl:=t,tXv[(t,b)u]l and let δXl:=2Xl1Xl and set δX:=δXk+1. Then, since the jump part is deactivated during (t,t] and by (Equation28), we have E[|δXt|2(m+2)]C(|tt|m+2+|bb|m+2).For l=j+2,,N, we have by Assumption 5.1.(i) that |δXτll+1|C(0τl|δXsl|ds+|δXτll|).Now, for st, δXs=δXt+l=jN(tτl)+sτl+1d(δXl+1)r+l=j+1N1[sτl](δXτll+1δXτll),with τN+1:=. Taking the absolute value on both sides we get |δXs||δXt|+C(ts|a(r,(2Xζ)ζr)a(r,(1Xζ)ζr)|dr+l=jN|tτlτl+1sσ(r,(2Xζ)ζr)σ(r,(1Xζ)ζr)dWr|+t+sRd{0}|γ(r,(2Xζ)ζ<r,z)γ(r,(1Xζ)ζ<s,z)|P~(dr,dz)+ts|δXr|dr).The Burkholder-Davis-Gundy inequality now gives E[supr[t,s]|δXr|2(m+2)]CE[|δXt|2(m+2)+(ts|a(r,(2Xζ)ζr)a(r,(1Xζ)ζr)|dr)2(m+2)+(ts|σ(r,(2Xζ)ζr)σ(r,(1Xζ)ζr)|2dr)m+2+(t+sRd{0}|γ(r,(2Xζ)ζ<r,z)γ(r,(1Xζ)ζ<r,z)|2P~(dr,dz))m+2+ts|δXr|2(m+2)dr].Appealing to the boundedness of the jumps and the integral Lipschitz conditions on the coefficients then gives that E[supr[t,s]ΛK|δXr|2(m+2)]C(tsE[supζ[t,r]ΛK|δXζ|2(m+2)]dr+|tt|m+2+|bb|m+2),for all s[0,T]. Now, Grönwall's lemma gives E[sups[t,T]ΛK|δXs|2(m+2)]C(|tt|m+2+|bb|m+2),where C does not depend on uUk. Furthermore, for each t[0,T] and each ωΩN (for some P-null set N) there is a t>t such that P(ω,(t,t],Rd)=0. Uniform convergence on [t,T]ΛK, thus, follows by applying a Kolmogorov continuity argument (see e.g.  Theorem 72 in Chapter IV of [Citation25]) and uniform right-continuity follows as κK, P-a.s. 
The existence of limits follows similarly.

Definition 5.4

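For all $v\in\mathcal{U}^{lim}$ and $u\in\mathcal{U}^f$ we define the map $\Psi^{v,u}:\mathbb R_+\times\Omega\times U\to\mathbb R$ as

$\Psi^{v,u}(t,b):=\int_0^\infty e^{-\rho(s)}\phi(s,X_s^{v\circ(t,b)\circ u})\,ds-\sum_{j=1}^{N}e^{-\rho(\tau_j\vee t)}\ell(\tau_j\vee t,X_{\tau_j\vee t}^{v\circ(t,b)\circ[u]_{j-1}},\beta_j)$.

Moreover, for $T\ge 0$ we define the truncation $\Psi_T^{v,u}:\mathbb R_+\times\Omega\times U\to\mathbb R$ of $\Psi^{v,u}$ as

$\Psi_T^{v,u}(t,b):=\int_0^T e^{-\rho(s)}\phi(s,X_s^{v\circ(t,b)\circ u})\,ds-\sum_{j=1}^{N(T)}e^{-\rho(\tau_j\vee t)}\ell(\tau_j\vee t,X_{\tau_j\vee t}^{v\circ(t,b)\circ[u]_{j-1}},\beta_j)$,

where $N(T):=\max\{j:\tau_j<T\}$, and for $L\ge 0$ we define the localization $\Psi_{T,L}^{v,u}:\mathbb R_+\times\Omega\times U\to\mathbb R$ of $\Psi_T^{v,u}$ as

$\Psi_{T,L}^{v,u}(t,b):=\int_0^T e^{-\rho(s)}\phi(s,{}^LX_s^{v\circ(t,b)\circ u})\,ds-\sum_{j=1}^{N(T)}e^{-\rho(\tau_j\vee t)}\ell(\tau_j\vee t,{}^LX_{\tau_j\vee t}^{v\circ(t,b)\circ[u]_{j-1}},\beta_j)$,

where ${}^LX^u_s:=\frac{L}{L\vee|X^u_s|}X^u_s$.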
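The localization in the definition above is read here as the radial truncation $x\mapsto\frac{L}{L\vee|x|}x$, which leaves states with $|x|\le L$ untouched and projects larger states onto the sphere of radius $L$. A minimal sketch under that reading (the helper `localize` is hypothetical and uses the Euclidean norm on a plain list):

```python
import math


def localize(x, L):
    """Radial truncation x -> L / max(L, |x|) * x for a vector x given as a list."""
    norm = math.sqrt(sum(c * c for c in x))
    if norm <= L:
        return list(x)  # inside the ball of radius L: unchanged
    return [L / norm * c for c in x]


inside = localize([3.0, 4.0], 10.0)
clipped = localize([3.0, 4.0], 2.5)
```

Because the truncated state is bounded by $L$, the local Lipschitz constants $C_\phi(L)$ and $C_\ell(L)$ from Assumption 5.1 apply uniformly along the localized path.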

Corollary 5.5

For each $T,L\ge 0$, $k\ge 0$, $v\in\mathcal{U}^{lim}$ and $u\in\mathcal{U}^k$ the map $(t,b)\mapsto\Psi_{T,L}^{v,u}(t,b)$ has limits everywhere and is P-a.s. continuous on $[\eta_j,\eta_{j+1})\times U$, where $(\eta_j)_{j\ge 1}$ are the jump times of $P$.

Proof.

Let φT,L(u):=0Teρ(s)ϕ(s,LXsu)dsand note that for 0tt we have |φT,L(v(t,b)u)φT,L(v(t,b)u)|tteρ(s)(|ϕ(s,LXsv(t,b)u)|+|ϕ(s,LXsv(t,b)u)|)ds+tTeρ(s)|ϕ(s,LXsv(t,b)u)ϕ(s,LXsv(t,b)u)|dsC(|tt|+tTeρ(s)|LXsv(t,b)uLXsv(t,b)u|ds)C(|tt|+tTeρ(s)|Xsv(t,b)uXsv(t,b)u|ds).Now, by Lemma 5.3 it follows immediately that limttlimbb|φT(v(t,b)u)φT(v(t,b)u)|=0,and from its proof we have that limttlimbb|φT(v(t,b)u)φT(v(t,b)u)|=0,whenever t{η1,η2,}. Concerning the intervention costs we have (29) j=1N(T)|eρ(τjt)(τjt,LXτjtv(t,b)[u]j1,βj)eρ(τjt)(τjt,LXτjtv(t,b)[u]j1,βj)|j=1N(T){1[0,t)(τj)|eρ(t)(t,LXtv(t,b)[u]j1,βj)eρ(τjt)(τjt,LXτjtv(t,b)[u]j1,βj)|+|(τjt,LXτjtv(t,b)[u]j1,βj)(τjt,LXτjtv(t,b)[u]j1,βj)|}(29) where the first term tends to zero as tt by joint continuity of ℓ, continuity of ρ and right continuity of X. By continuity of ℓ, the assertion follows by repeating the argument in the proof of Lemma 5.3.

Lemma 5.6

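For each $T>0$ and $k\ge 0$ there is, for every $v\in\mathcal{U}^{lim}$, a $J_T\in\mathcal{H}_{\mathbb F}$ such that for all $\tau\in\mathcal T$ and $b\in U$ we have $J_T(\tau,b)=\operatorname*{ess\,sup}_{u\in\mathcal{U}^k_\tau}\mathbb E[\Psi_T^{v,u}(\tau,b)\,|\,\mathcal F_\tau]$, P-a.s. (with an exception set that is independent of $b$).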

Proof.

For any K,L0 it follows by Corollary 5.5 and Theorem A.6 in Appendix 3 that there is for each bU an F-optional càdlàg  process (Ztb,u:t0) such that Zτb,u=E[ΨTκK,Lv,u(τ,b)|Fτ],P-a.s. for any τT. Now, pick a sequence (ϵl)l0 of positive real numbers such that ϵl0 and for j,l0 define sjl:=j2l. Then, there is a control ujlUsjlk such that E[ΨTκK,Lv,ujl(sjl,b)|Fsjl]esssupuUsjlkE[ΨTκK,Lv,u(sjl,b)|Fsjl]ϵl.Define the sequence of càdlàg  processes (Z~tb,l:t0)l0 as Z~tb,l:=j=01[sjl,sj+1l)(t)Ztb,uj+1land set Zˆtb,l:=maxi{0,,l}Z~tb,i. Then, Zˆb,l is an increasing P-a.s. finite sequence of càdlàg  processes and it, thus, converges pointwisely, P-a.s. to a limit Zb, that, moreover, is PF-measurable. We note that for any l0 and τTf we have with τl:=inf{sτ:sΠl} and ul:=j=11[τl=sjl]ujl, that esssupuUτkE[ΨTκK,Lv,u(τ,b)|Fτ]Zτb,esssupuUτkE[ΨTκK,Lv,u(τ,b)|Fτ]E[ΨTκK,Lv,ul(τ,b)|Fτ]=esssupuUτkE[ΨTκK,Lv,u(τ,b)|Fτ]esssupuUτkE[ΨTκK,Lv,u(τl,b)|Fτ]+esssupuUτkE[ΨTκK,Lv,u(τl,b)|Fτ]E[ΨTκK,Lv,ul(τ,b)|Fτ]and as |esssupuUτkE[ΨTκK,Lv,u(τl,b)|Fτ]E[ΨTκK,Lv,ul(τ,b)|Fτ]|esssupuUτkE[|ΨTκK,Lv,u(τ,b)ΨTκK,Lv,u(τl,b)||Fτ]+ϵlwe get that |Zτb,esssupuUτkE[ΨTκK,Lv,u(τ,b)|Fτ]|2esssupuUτkE[|ΨTκK,Lv,u(τ,b)ΨTκK,Lv,u(τl,b)||Fτ]+ϵl.Jensen's inequality now gives that E[|Zτb,esssupuUτkE[ΨTκK,Lv,u(τ,b)|Fτ]|2]8supuUτkE[|ΨTκK,Lv,u(τ,b)ΨTκK,Lv,u(τl,b)|2]+2ϵl2.Letting l and using that the map tΨTκK,Lv,u(t,b) is right-continuous uniformly in u it follows that Zτb,=esssupuUτkE[ΨTκK,Lv,u(τ,b)|Fτ],P-a.s., for each stopping time τT.

We now show that Zb, is a càdlàg  process. First, since Zb, is the limit of an increasing sequence of càdlàg  processes we have that lim infttZtb,Ztb,. For any τTf and ϵ>0 let τl:=inf{sτ:Zˆsb,lZτb,+ϵ}.Then as (Zb,l)l0 is non-decreasing, the sequence (τl)l0 is non-increasing. Let B:={ωΩ:limlτl=τ} and note that BFτ by right-continuity of the filtration and lim supttZtb,<Ztb,+ϵ on Bc:=ΩB. Moreover, with τˆl:=1Bτl+1Bcτ, Fatou's lemma gives lim inflE[Zτˆlb,Zτb,]=lim inflE[1B(Zτˆlb,Zτb,)]P[B]ϵ.On the other hand, we have E[Zτˆlb,Zτb,]supuUτkE[ΨTκK,Lv,u(τ,b)ΨTκK,Lv,u(τˆl,b)|]0as l and we conclude that P[B]=0 and, since ϵ>0 was arbitrary, it follows that lim supttZtb,=Ztb,.

To prove that Zb, has left limits we define, for ϵ>0, the sequence (ϑjϵ)j0 as ϑ0ϵ=0 and then recursively let ϑjϵ:=inf{sϑj1ϵ:esssupuUsk|Zsb,uZϑj1ϵb,u|ϵ}.We note by the above discussion that ϑjϵT and furthermore, by right-continuity that ϑjϵ>ϑj1ϵ and ϑjϵ, P-a.s. If not, we would have ϑjϵϑϵTf on some set A of positive measure. However, as increments in the jump integral part is P-a.s. zero at predictable times we note by Corollary 5.5 that Ψv,u(t,b) is continuous in t at ϑϵ on AN for some P-null set N, uniformly in u. Now, as the filtration is quasi-left continuous this implies that lim supjesssupuUksupbU|Zϑjϵb,uZϑj1ϵb,u|=0,on AN, a contradiction. Letting, Z˘tb,l:=j=01[ϑj1/l,ϑj11/l)(t)Zϑj1/lb,,we find that (Z˘b,l)l0 is a sequence of càdlàg  processes with supt[0,T]|Ztb,Z˘tb,l|1/l and we conclude that Zb, is càdlàg. 

By repeating the argument in the proof of Lemma 5.3 we find that supuUtkE[|ΨTκK,Lv,u(t,b)ΨTκK,Lv,u(t,b)|2(m+1)]C|bb|m+1,P-a.s. for any tR+ and b,bU and it follows that E[supt[0,]|Ztb,Ztb,|m+1]C|bb|m+1.Hence, by Kolmogorov's continuity theorem and Corollary 5.5 it follows that there is a unique map hK,LHF such that hK,L(τ,b)=esssupuUkE[ΨTκK,Lv,u(τ,b)|Fτ],P-a.s. for all bU. By dominated convergence we find that hK,L converges pointwisely to some h as K,L. We define the set ΞL:={ωΩ:supt[0,T]esssupuUtfE[sups[t,T]|Xsvu||Ft]>L}and note that for r(1,2), we have E[sup(t,b)[0,T]×U|esssupuUkE[ΨTv,u(t,b)|Ft]hK,L(t,b)|r]E[sup(t,b)[0,T]×UesssupuUtk|E[ΨTv,u(t,b)ΨTκK,Lv,u(t,b)|Ft]|r]CE[(1[κK<T]+1Ξ)supt[0,T]esssupuUtk+1|E[sups[t,T]|Xsvu|q|Ft]|r]CE[1[κK<T]+1ΞL]1/rwhere 1r+r2=1 and the last step follows by Hölder's inequality and Proposition 5.2. Now, the right-hand side of the last inequality goes to zero as K,L by the definition of κK and Proposition 5.2 and by uniform convergence we conclude that there is a JTHF such that JT(τ,b)=esssupuUkE[ΨTv,u(τ,b)|Fτ],P-a.s. for each bU.

It remains to show that we can choose the exception set to be independent of b. Let U¯0U¯1 be a sequence of finite subsets of U with minbU¯lmaxbU|bb|2l. For βA(τ) define (βl)l0 as a measurable selection of βlargminbU¯l|βb|. Then since βl takes values in a finite set we have JT(τ,βl)=esssupuUkE[ΨTv,u(τ,βl)|Fτ],P-a.s. By continuity it follows that limlJT(τ,βl)=JT(τ,β),P-a.s. Furthermore, by uniform integrability and P-a.s. continuity of ΨTv,u uniformly in u we have that limlesssupuUkE[|ΨTv,u(τ,β)ΨTv,u(τ,βl)||Fτ]=0and we conclude that JT(τ,β)=esssupuUkE[ΨTv,u(τ,β)|Fτ],P-a.s. From this the statement follows as βA(τ) was arbitrary.

Thus far we have not made any assumption on the discount factor ρ, other than it being continuous. Clearly, some assumptions on the growth of ρ have to be made in order for the maximization problem to have a finite value. We summarize our assumptions in the following hypothesis:

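Hypothesis. [Disc.-A] There is an $\epsilon>0$ such that $\rho(t)\ge\epsilon t$, $\sup_{u\in\mathcal{U}^f}\mathbb E[\int_T^\infty e^{-\rho(t)}|\phi(t,X^u_t)|^p\,dt]\le Ce^{-\epsilon T}$ and $\sup_{u\in\mathcal{U}^f}\mathbb E[\sup_{t\in[0,\infty)}e^{-p\rho(t)}|\ell(t,X^u_t,b)|^p]<\infty$ for all $T\ge 0$ and $b\in U$. Furthermore, for each $k\ge 0$ there is an $\epsilon>0$ such that for all $T\ge T'$ for some $T'>0$ we have $\mathbb E\big[\sup_{t\in[0,\infty)}\operatorname*{ess\,sup}_{u\in\mathcal{U}^k_t}\mathbb E\big[\big|\int_T^\infty e^{-\rho(s)}\phi(s,X_s^{v\circ u})\,ds\big|\,\big|\,\mathcal F_t\big]\big]\le e^{-\epsilon T}$, and $\mathbb E\big[\sup_{t\in[0,\infty)}\operatorname*{ess\,sup}_{u\in\mathcal{U}^k_t}\mathbb E\big[1_{[N\ge 1]}1_{[\tau_N\ge T]}e^{-\rho(\tau_N)}|\ell(\tau_N,X_{\tau_N}^{v\circ[u]_{N-1}},\beta_N)|\,\big|\,\mathcal F_t\big]\big]\le e^{-\epsilon T}$, for all $v\in\mathcal{U}^{lim}$.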

Remark 5.3

An important situation where Hypothesis Disc.-A holds with ρ(t)=ρ0t for any ρ0>0 is when the functions ϕ and ℓ are eventually bounded, i.e.  when there is a T>0 such that |ϕ(t,x)|C and |(u(t,x))|C for all (t,x)[T,)×Rd. Another important case is when ρ(T)ln(C(T,pq)) grows linearly in T, where C is the bound in Proposition 5.2.

We are now ready to state the main result of this section, showing that under Assumption 5.1 and Hypothesis Disc.-A an optimal control for the problem of maximizing J exists.

Proposition 5.7

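Under Hypothesis Disc.-A there is a $u^*\in\mathcal{U}^f$ such that $J(u^*)\ge J(u)$ for all $u\in\mathcal{U}^f$. Furthermore, $u^*$ is given by the recursion (8)–(9), with $\varphi(u):=\int_0^\infty e^{-\rho(t)}\phi(t,X^u_t)\,dt$ and $c(u\circ(t,b)):=e^{-\rho(t\vee\tau_N)}\ell(t\vee\tau_N,X^u_{t\vee\tau_N},b)$.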

Proof.

To show that the assertion is true we need to show that the pair (φ,c) is an admissible reward pair. It is clear that the uniform L2-bounds on φ and c in Definition 2.2.(i) hold by Hypothesis Disc.-A. In particular, we note that by Jensen's inequality we get that E[|φ(u)|p]=E[|0eρ(t)ϕ(t,Xtu)dt|p]CE[0eρ(t)|ϕ(t,Xtu)dt|pdt]C.The decreasing importance property stated in Definition 2.2.(iii) follows similarly by noting that for vUTf with TT we have, by Hypothesis Disc.-A, that E[|φ(uv)φ(u)|p]=E[|Teρ(t)(ϕ(t,Xtuv)ϕ(t,Xtu))dt|p]CE[Teρ(t)(|ϕ(t,Xtuv)|p+|ϕ(t,Xtu)|p)dt]CeϵT,which tends to 0 as T.

Concerning the continuity properties listed in Definition 2.2.(ii) we note that for each k0 and vUlim we have that |Ψv,u(t,b)Ψv,u(t,b)||ΨTv,u(t,b)ΨTv,u(t,b)|+|Ψv,u(t,b)ΨTv,u(t,b)|+|Ψv,u(t,b)ΨTv,u(t,b)|.Now, E[sup(t,b)[0,)×UesssupuUtkE[|φ(v(t,b)u)φT(v(t,b)u)||Ft]]E[supt[0,)esssupuUtkE[|Teρ(s)ϕ(s,Xsvu)ds||Ft]]eϵT.and similarly E[sup(t,b)[0,)×UesssupuUtkE[|c(v(t,b)u)1[τNt<T]c(v(t,b)u)||Ft]]E[supt[0,)esssupuUtk+1E[1[N1]1[τNT]eρ(τN)|(τN,XτNv[u]N1,βN)||Ft]]eϵT.This implies that P[sup(t,b)[0,)×UesssupuUtkE[|Ψv,u(t,b)ΨTv,u(t,b)||Ft]eϵT/2]CeϵT/2and the Borel-Cantelli lemma gives that sup(t,b)R+×UesssupuUtkE[|Ψv,u(t,b)ΨTv,u(t,b)||Ft]0,P-a.s., as T for all vUlim.

By Lemma 5.6 and uniform convergence it follows from Lemma 2.4.(d) that J:=limTJTHF. The desired result now follows by Lemma 2.4.(a) while noting that by the construction of ℓ in Assumption 5.1.(v), a simplified version of Lemma 5.6 (without having to consider maximization over u) applied to each of the i gives that there is an hHF such that h(τ,b)=E[(τ,Xτv,b)|Fτ], P-a.s. (with an exception set that is independent of b).

Remark 5.4

In a perfect information setting, i.e. when $\mathbb F=\mathbb G$, we note that $\ell(t,x,b)$ can be taken to be any upper semi-continuous function in $b$ that satisfies the remaining properties of polynomial growth and local Lipschitz continuity.

5.3. The random horizon setting

We turn instead to the reward

(30) $J^\eta(u)=\mathbb E\Big[\int_0^\eta e^{-\rho(t)}\phi(t,X^u_t)\,dt+e^{-\rho(\eta)}\psi(\eta,X_\eta^{[u]_{N(\eta)}})-\sum_{j=1}^N e^{-\rho(\tau_j)}\ell(\tau_j,X^{u,j-1}_{\tau_j},\beta_j)\Big]$,

where $\eta$ is a $\mathbb G$-stopping time and $N(\eta):=\sup\{j:\tau_j<\eta\}\vee 0$. A notable convention applied in (30) is that the terminal reward disregards interventions made at the horizon. This is natural from an applications perspective as it is generally too late to intervene at a default in a financial setting or at the failure of a unit in an engineering application.
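The convention that interventions made exactly at the horizon do not enter the terminal reward comes from the strict inequality in $N(\eta)$. A small sketch for one realization, assuming the intervention times are given as a non-decreasing list (the function name is hypothetical):

```python
def interventions_before(taus, eta):
    """N(eta): number of intervention times strictly before eta (0 if there are none)."""
    n = 0
    for tau in taus:
        if tau < eta:
            n += 1
        else:
            break  # taus is non-decreasing, so no later time can be < eta
    return n
```

For example, with intervention times 0.5, 1.0, 1.0, 2.0 and horizon 1.0 only the first intervention is reflected in the terminal state.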

In addition to the requirements listed in Assumption 5.1, we make the following assumptions:

Assumption 5.8

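The terminal reward $\psi:\mathbb R_+\times\mathbb R^d\to\mathbb R$ is Borel-measurable, satisfies the growth condition $|\psi(t,x)|\le C(1+|x|^q)$ and there is a non-decreasing continuous function $C_\psi:\mathbb R_+\to\mathbb R_+$ such that for all $L\ge 0$, we have $|\psi(t,x)-\psi(t,x')|\le C_\psi(L)|x-x'|$ whenever $|x|\vee|x'|\le L$. Moreover, if there is a sequence $(\theta_j)_{j\ge 0}$ in $\mathcal T^f$ such that $\theta_j\nearrow\eta$ on some set $B\in\mathcal G$, then there is a P-null set $N$ such that on $B\setminus N$ we have for every $(y,b)\in\mathbb D\times U$ that

(31) $\psi(\eta,y_\eta)\ge\psi(\eta,\Gamma(\eta,(y_s)_{s\le\eta},b))$.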

We introduce the following hypothesis:

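Hypothesis. [Disc.-B] The terminal reward satisfies the bound $\sup_{u\in\mathcal{U}^f}\mathbb E[e^{-p\rho(\eta)}|\psi(\eta,X^u_\eta)|^p]<\infty$. Furthermore, for each $k\ge 0$ there is an $\epsilon>0$ such that for all $T\ge T'$ for some $T'>0$ we have $\mathbb E\big[\sup_{t\in[0,\infty)}\operatorname*{ess\,sup}_{u\in\mathcal{U}^k_t}\mathbb E\big[1_{[\eta\ge T]}e^{-\rho(\eta)}|\psi(\eta,X_\eta^{v\circ u})|\,\big|\,\mathcal F_t\big]\big]\le e^{-\epsilon T}$ for all $v\in\mathcal{U}^{lim}$.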

We have the following extension of Proposition 5.7.

Proposition 5.9

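Under Hypotheses Disc.-A and Disc.-B there is a $u^*\in\mathcal{U}^f$ such that $J^\eta(u^*)\ge J^\eta(u)$ for all $u\in\mathcal{U}^f$. Furthermore, $u^*$ is given by the recursion (8)–(9), with $\varphi(u):=\int_0^\eta e^{-\rho(t)}\phi(t,X^u_t)\,dt+e^{-\rho(\eta)}\psi(\eta,X_\eta^{[u]_{N(\eta)}})$ and $c(u\circ(t,b)):=e^{-\rho(t\vee\tau_N)}\ell(t\vee\tau_N,X^u_{t\vee\tau_N},b)$. If, in addition, $\eta$ is an $\mathbb F$-stopping time, then $\tau^*_j<\eta$ for all $1\le j\le N^*$.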

Proof.

We note that all details in the proof of Proposition 5.7 transfer immediately to this situation except for the quasi-left upper semi-continuity property in the definition of $\mathcal{H}_{\mathbb F}$ (Definition 2.1). We thus assume that there is a non-decreasing sequence $(\theta_j)_{j\ge 0}$ of stopping times such that $\theta_j\nearrow\theta\in\mathcal T$. When $\theta<\eta$, P-a.s., left-continuity at $\theta$ follows by Lemma 5.3 and the local Lipschitz property of $\psi$, and when $\theta>\eta$, P-a.s., left-continuity at $\theta$ is immediate. We thus assume that $\theta_j\nearrow\eta$ on some measurable set $B\subset\Omega$.

Then, we have 1B(φ(v(θj,b)u)φ(v(θ,b)u))θjθeρ(t)|ϕ(t,Xtv(θj,b)u)ϕ(t,Xtv(θ,b)u)|dt+1Beρ(η)(ψ(η,Xηv(θj,b)u)ψ(η,Xηv)),where the first term on the right-hand side tends to zero, P-a.s. Concerning the second term we have 1Beρ(η)(ψ(η,Xηv(θj,b)u)ψ(η,Xηv))1Beρ(η)(ψ(η,ΓβN(η)Γβ1Γb(η,Xηv)ψ(η,Xηv))+eρ(η)|ψ(η,Xηv(θj,b)[u]N(η))ψ(η,Xηv(θ,b)[u]N(η))|,where Γb(,):=Γ(,,b) and ° denotes composition of functions. The first term on the right-hand side is P-a.s. non-positive by Assumption 5.8 and the last term tends to zero, P-a.s., by the local Lipschitz property of ψ and Lemma 5.3 in combination with Proposition 5.2, the polynomial growth condition on ψ and Hypothesis Disc.-B.

The last assertion follows by noting that since c>0 it will never be optimal to intervene at times greater than or equal to η.

We note the following distinction between the finite (deterministic) horizon and the random horizon settings:

Remark 5.5

In the case when $\mathbb P(\eta=T)=1$ for some $T\ge 0$ it follows from the proof of Proposition 5.9 that we can relax (31) to $\psi(T,y_T)\ge\psi(T,\Gamma(T,(y_s)_{s\le T},b))-\ell(T,y_T,b)$.

To see that there is an actual distinction here consider the following example:

Example 5.10

We let $\mathcal F$ be the trivial σ-algebra $\{\emptyset,\Omega\}$ and assume that $\mathbb P(\eta=1)=\mathbb P(\eta=2)=0.5$. We take $U:=\{1\}$ and set $X_t=1_{[\tau_1,\infty)}(t)$. Then, with the rewards $\phi\equiv 0$, $\psi(t,x)=xe^{|t-1|}$, the intervention cost $\ell(t,x,b)=e^{|t-1|}$ and the discount $\rho\equiv 0$, we get $\sup_{u\in\mathcal{U}^f}J^\eta(u)=0.5(e^{1}-1)$, but there is no control that attains this value.
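The example can be checked numerically. The sketch below rests on several readings that are assumptions rather than statements from the source: the exponents are taken as $e^{|t-1|}$ for both $\psi$ and $\ell$, the intervention cost is charged whether or not the intervention precedes $\eta$, and, since the filtration is trivial, only controls with a single deterministic intervention time `tau` are compared (further interventions would only add cost, and not intervening gives reward 0).

```python
import math


def psi(t, x):
    """Terminal reward x * exp(|t - 1|)."""
    return x * math.exp(abs(t - 1.0))


def ell(t):
    """Intervention cost exp(|t - 1|); it does not depend on the state or on b."""
    return math.exp(abs(t - 1.0))


def reward_single_intervention(tau):
    """Reward for one deterministic intervention at time tau (eta is 1 or 2, prob. 0.5 each)."""
    terminal = 0.0
    for eta in (1.0, 2.0):
        # Interventions at the horizon are disregarded in the terminal reward.
        x_eta = 1.0 if tau < eta else 0.0
        terminal += 0.5 * psi(eta, x_eta)
    return terminal - ell(tau)


supremum = 0.5 * (math.e - 1.0)
grid_values = [reward_single_intervention(k / 1000.0) for k in range(3000)]
```

Under these assumptions the reward increases towards `supremum` as `tau` approaches 1 from below, but at `tau = 1` the intervention no longer counts for the event that $\eta=1$ and the reward drops to $0.5e-1$, so the supremum is approached but not attained.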

Disclosure statement

No potential conflict of interest was reported by the author(s).

Additional information

Funding

This work was supported by the Swedish Energy Agency through grant number 48405-1

Notes

1 Throughout, we let $\mathbb R_+:=[0,\infty)$.

2 We let $\wedge$ (resp. $\vee$) denote minimum (resp. maximum), so that $x\wedge y:=\min(x,y)$.

3 Throughout, we generally suppress dependence on ω and refer to $h\in\mathcal{H}_{\mathbb F}$ as a map $(t,b)\mapsto h(t,b)$.

4 Requiring that p>2 is for notational convenience only and can easily be loosened to p>1.

5 Throughout, C will denote a generic positive constant that may change value from line to line.

6 By definition $\{N>K\}=\{\tau_{K+1}<\infty\}$, which belongs to $\mathcal F_{\tau_{K+1}}$.

References

  • N. Agram and B. Øksendal, Stochastic control of memory mean-field processes, Appl. Math. Optim. 79(1) (2019), pp. 181–204.
  • M. Basei, Optimal price management in retail energy markets: an impulse control problem with asymptotic estimates, Math. Meth. Oper. Res. 89(3) (2019), pp. 355–383.
  • A. Bensoussan and J.L. Lions, Impulse Control and Quasivariational Inequalities, Gauthier-Villars, Montrouge, France, 1984.
  • D.P. Bertsekas and S.E. Shreve, Stochastic Optimal Control: The Discrete-time Case, Academic Press, 1978.
  • R. Carmona and M. Ludkovski, Pricing asset scheduling flexibility using optimal switching, Appl. Math. Finance 15(5-6) (2008), pp. 405–447.
  • C. Dellacherie and P.-A. Meyer, Probabilités Et Potentiel, I-IV, Hermann, Paris, 1975.
  • C. Dellacherie and P.-A. Meyer, Probabilités Et Potentiel, V-VIII, Hermann, Paris, 1980.
  • B. Djehiche, S. Hamadène, and I. Hdhiri, Stochastic impulse control of non-markovian processes, Appl. Math. Optim. 61(1) (2010), pp. 1–26.
  • B. Djehiche and S. Hamadène, On a finite horizon starting and stopping problem with risk of abandonment, Int. J. Theoret. Appl. Finance 12(04) (2009), pp. 523–543.
  • B. Djehiche, S. Hamadène, I. Hdhiri, and H. Zaatra, Infinite horizon stochastic impulse control with delay and random coefficients, Math. Oper. Res. 47(1) (2022), pp. 665–689.
  • B. Djehiche, S. Hamadène, and A. Popier, A finite horizon optimal multiple switching problem, SIAM J. Control Optim. 48(4) (2009), pp. 2751–2770.
  • N. El Karoui, Les aspects probabilistes du contrôle stochastique. Ecole d'Eté de Saint-Flour IX 1979. Lecture Notes in Math, Berlin, Springer, 1981.
  • N. El-Karoui, C. Kapoudjian, E. Pardoux, S. Peng, and M.C. Quenez, Reflected solutions of backward SDEs and related obstacle problems for PDEs, Ann. Probab. 25(2) (1997), pp. 702–737.
  • S. Hamadène, Reflected BSDE's with discontinuous barrier and application, Stoch. Int. J. Probab. Stoch. Process. 74(3-4) (2002), pp. 571–596.
  • S. Hamadène and J. Zhang, Switching problem and related system of reflected backward SDEs, Stoch. Process. Appl. 120(4) (2010), pp. 403–426.
  • I. Hdhiri and M. Karouf, Optimal stochastic impulse control with random coefficients and execution delay, Stoch. Int. J. Probab. Stoch. Process. 90(2) (2018), pp. 151–164.
  • J. Jönsson and M. Perninge, Finite horizon impulse control of stochastic functional differential equations, SIAM J. Control Optim. 61(2) (2023), pp. 924–948.
  • N. El Karoui and X. Tan, Capacities, measurable selection and dynamic programming part I: Abstract framework, arXiv:1310.3363, 2013.
  • R. Korn, Some applications of impulse control in mathematical finance, Math. Meth. Oper. Res. 50(3) (1999), pp. 493–518.
  • R. Martyr, Finite-horizon optimal multiple switching with signed switching costs, Math. Oper. Res. 41(4) (2016), pp. 1432–1447.
  • B. Øksendal and A. Sulem, Applied Stochastic Control of Jump Diffusions, Springer, 2007.
  • B. Øksendal and A. Sulem, Optimal stochastic impulse control with delayed reaction, Appl. Math. Optim. 58(2) (2008), pp. 243–255.
  • J. Palczewski and L. Stettner, Impulsive control of portfolios, Appl. Math. Optim. 56(1) (2007), pp. 67–103.
  • M. Perninge, A finite horizon optimal switching problem with memory and application to controlled SDDEs, Math. Meth. Oper. Res. 91(3) (2020), pp. 465–500.
  • P. Protter, Stochastic Integration and Differential Equations, 2nd ed. Springer, Berlin, 2004.

Appendices

Appendix 1. Quasi-left continuity

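A càdlàg process $(X_t : t \ge 0)$ is quasi-left continuous if for each predictable stopping time $\theta$ and every announcing sequence of stopping times $\theta_k \nearrow \theta$ we have $X_{\theta-} := \lim_{k\to\infty} X_{\theta_k} = X_\theta$, $\mathbb{P}$-a.s. Similarly, $X$ is quasi-left upper semi-continuous if $X_{\theta-} \le X_\theta$, $\mathbb{P}$-a.s. A filtration is quasi-left continuous if $\mathcal{F}_\theta = \mathcal{F}_{\theta-}$ for every predictable stopping time $\theta$.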

Appendix 2. The Snell envelope

In this section we gather some useful results concerning the Snell envelope. Recall that a progressively measurable process $X$ is of class [D] if the set of random variables $\{X_\tau : \tau \in \mathcal{T}^f\}$ is uniformly integrable.

Theorem A.1

The Snell envelope

Let $X = (X_t)_{t \ge 0}$ be an $\mathbb{F}$-adapted, $\mathbb{R}$-valued, càdlàg process of class [D]. Then there exists a unique (up to indistinguishability) $\mathbb{R}$-valued càdlàg process $Z = (Z_t)_{t \ge 0}$, called the Snell envelope of $X$, such that $Z$ is the smallest supermartingale that dominates $X$. Moreover, the following holds (with $\Delta X_t := X_t - X_{t-}$):

  1. For any stopping time $\eta$, $Z_\eta = \operatorname*{ess\,sup}_{\tau \in \mathcal{T}_\eta} \mathbb{E}[X_\tau \mid \mathcal{F}_\eta]$. (A1)

  2. The Doob-Meyer decomposition of the supermartingale $Z$ implies the existence of a triple $(M, K^c, K^d)$, where $(M_t : t \ge 0)$ is a uniformly integrable right-continuous martingale, $(K^c_t : t \ge 0)$ is a non-decreasing, predictable, continuous process with $K^c_0 = 0$ and $(K^d_t : t \ge 0)$ is a non-decreasing, purely discontinuous, predictable process with $K^d_0 = 0$, such that $Z_t = M_t - K^c_t - K^d_t$. (A2) Furthermore, $\{\Delta K^d_t > 0\} \subset \{\Delta X_t < 0\} \cap \{Z_{t-} = X_{t-}\}$ for all $t \ge 0$.

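  3. Let $\eta \in \mathcal{T}$ be given and assume that for any predictable $\theta \in \mathcal{T}_\eta$ and any increasing sequence $\{\theta_j\}_{j \ge 0}$ with $\theta_j \in \mathcal{T}^f_\eta$ and $\lim_{j\to\infty} \theta_j = \theta$, $\mathbb{P}$-a.s., we have $\limsup_{j\to\infty} X_{\theta_j} \le X_\theta$, $\mathbb{P}$-a.s. Then, the stopping time $\tau_\eta$ defined by $\tau_\eta := \inf\{s \ge \eta : Z_s = X_s\}$ (with the convention that $\inf \emptyset = \infty$) is optimal after $\eta$, i.e. $Z_\eta = \mathbb{E}[X_{\tau_\eta} \mid \mathcal{F}_\eta]$. Furthermore, in this setting the Snell envelope, $Z$, is quasi-left continuous, i.e. $K^d \equiv 0$.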

  4. Let $X^k$ be a sequence of càdlàg processes converging increasingly and pointwise to the càdlàg process $X$, and let $Z^k$ be the Snell envelope of $X^k$. Then the sequence $Z^k$ converges increasingly and pointwise to a process $Z$, and $Z$ is the Snell envelope of $X$.

In the above theorem, statements 1–3 are standard results and proofs can be found in, for example, [12,14]. A finite horizon version of statement 4, which extends trivially to infinite horizon, was proved in [11].
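As an illustration of the first and third statements of Theorem A.1, the following is a minimal discrete-time sketch, not the continuous-time construction used in the paper. It assumes a recombining binomial tree with up-probability `p` and a reward that depends only on the time step and the number of up-moves; the function names `snell_envelope` and `value_of_first_hitting_rule` and the sample rewards are hypothetical. The envelope is obtained by the backward recursion Z_n = max(X_n, E[Z_{n+1} | F_n]), and the value of stopping the first time Z equals X is computed separately so that it can be compared with Z_0.

```python
import numpy as np


def snell_envelope(reward, p=0.5):
    """Discrete-time Snell envelope on a recombining binomial tree.

    reward[n][k] is the reward X_n at step n after k up-moves (k = 0..n).
    Returns a list Z with Z[n][k] = max(X_n, E[Z_{n+1} | node (n, k)]).
    """
    N = len(reward) - 1
    Z = [np.asarray(r, dtype=float) for r in reward]
    for n in range(N - 1, -1, -1):
        # Conditional expectation of Z_{n+1}: up-move to k + 1, down-move to k.
        continuation = p * Z[n + 1][1:] + (1 - p) * Z[n + 1][:-1]
        Z[n] = np.maximum(Z[n], continuation)
    return Z


def value_of_first_hitting_rule(reward, Z, p=0.5):
    """Expected reward when stopping the first time Z_n equals X_n."""
    X = [np.asarray(r, dtype=float) for r in reward]
    V = X[-1].copy()  # at the final step Z = X, so the rule always stops
    for n in range(len(X) - 2, -1, -1):
        continuation = p * V[1:] + (1 - p) * V[:-1]
        stop = np.isclose(Z[n], X[n])
        V = np.where(stop, X[n], continuation)
    return V[0]


# Hypothetical rewards on a two-step tree.
reward = [[1.0], [0.0, 6.0], [0.0, 2.0, 8.0]]
Z = snell_envelope(reward)
value = value_of_first_hitting_rule(reward, Z)
print(Z[0][0], value)
```

In this toy example the two printed numbers coincide, which is the discrete-time counterpart of the optimality of the first time at which the envelope touches the reward.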

Appendix 3. The section and projection theorems

In this section we recall two fundamental results from the general theory of stochastic processes, namely the measurable selection and the optional projection theorems.

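We consider a complete filtered probability space $(\Omega, \mathcal{F}, \mathbb{F}, \mathbb{P})$, with $\mathbb{F} := \{\mathcal{F}_t\}_{t \ge 0}$ a right-continuous filtration. For any space $E$, we define the projection of a set $A \subset \Omega \times E$ onto $\Omega$ as $\pi_\Omega(A) := \{\omega \in \Omega : \exists x \in E,\ (\omega, x) \in A\}$.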

Theorem A.2

Measurable projection

Let $E$ be a locally compact Polish space. For every $A \in \mathcal{F} \otimes \mathcal{B}(E)$ the set $\pi_\Omega(A)$ is $\mathcal{F}$-measurable.

A proof can be found in, e.g., [18] (see the proof of Theorem 2.10) or [6], Chapter III. In particular, we need the following corollary:

Corollary A.3

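Let $h(\omega, x)$ be a real-valued, measurable function defined on the product space $(\Omega \times \mathbb{R}^m, \mathcal{F} \otimes \mathcal{B}(\mathbb{R}^m))$. Then for all $A \in \mathcal{F} \otimes \mathcal{B}(\mathbb{R}^m)$, the function $g(\omega) := \sup_{x \in \mathbb{R}^m} \{h(\omega, x) : (\omega, x) \in A\}$ (with the convention $\sup \emptyset = -\infty$) is $\mathcal{F}$-measurable.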

Proof.

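For each $K \in \mathbb{R}$ we have $\{\omega : g(\omega) > K\} = \pi_\Omega(A \cap h^{-1}((K, \infty]))$. Now, since $h$ is measurable, the set $A \cap h^{-1}((K, \infty])$ is in $\mathcal{F} \otimes \mathcal{B}(\mathbb{R}^m)$ and the result follows by the measurable projection theorem.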

Theorem A.4

Measurable selection

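Let $(E, \mathcal{E})$ be a Borel space with $\mathcal{E} := \mathcal{B}(E)$. For every $A \in \mathcal{F} \otimes \mathcal{B}(E)$ there is an $\mathcal{F}$-measurable r.v. $\beta$ taking values in $\bar{E} := E \cup \{\partial\}$ (with $\partial$ a cemetery point) such that $\{(\omega, \beta(\omega)) \in \Omega \times E\} \subset A$ and $\{\omega \in \Omega : \beta(\omega) \in E\} = \pi_\Omega(A)$.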

This is a standard result and a proof can be found in [18] (Theorem 2.20) (see also Chapter 7 in [4], where several extensions are given). In particular, we need the following well-known corollary:

Corollary A.5

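Let $h(\omega, x)$ be a measurable function defined on the product space $(\Omega \times \mathbb{R}^m, \mathcal{F} \otimes \mathcal{B}(\mathbb{R}^m))$, such that for $\mathbb{P}$-almost every $\omega$ the map $x \mapsto h(\omega, x)$ is upper semi-continuous. Then, with $U$ a compact subset of $\mathbb{R}^m$, there exists an $\mathcal{F}$-measurable r.v. $\beta$ such that $h(\omega, \beta(\omega)) = \sup_{x \in \mathbb{R}^m} \{h(\omega, x) : (\omega, x) \in \Omega \times U\}$, $\mathbb{P}$-a.s.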

Proof.

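Since $A := \Omega \times U \in \mathcal{F} \otimes \mathcal{B}(E)$ (where now $E = \mathbb{R}^m$), the function $g(\omega) = \sup_{x \in E} \{h(\omega, x) : (\omega, x) \in A\}$ is $\mathcal{F}$-measurable by Corollary A.3. Furthermore, as $h$ is $\mathcal{F} \otimes \mathcal{B}(E)$-measurable, the set $B := \{(\omega, x) \in \Omega \times U : h(\omega, x) = g(\omega)\}$ is in $\mathcal{F} \otimes \mathcal{B}(E)$. Now, by Theorem A.4 there is an $\mathcal{F}$-measurable $\bar{E}$-valued r.v. $\beta$ such that $\{(\omega, \beta(\omega)) \in \Omega \times E\} \subset B$ and $\{\omega \in \Omega : \beta(\omega) \in E\} = \pi_\Omega(B)$. As $U$ is compact and $b \mapsto h(\omega, b)$ is u.s.c. for all $\omega \in \Omega \setminus N$, where $\mathbb{P}(N) = 0$, we have that $B_\omega := \{b \in U : (\omega, b) \in B\} = \{b \in U : h(\omega, b) = g(\omega)\} \neq \emptyset$ for all $\omega \in \Omega \setminus N$ and, hence, $\mathbb{P}(\pi_\Omega(B)) = 1$.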
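The construction in this proof can be mimicked numerically when both the sample space and the control set are replaced by finite sets, in which case measurability is automatic and the selector is simply a row-wise argmax. The sketch below is only a finite analogue under that assumption; the function name `select_maximizer`, the sample points and the reward `h` are hypothetical and not taken from the paper.

```python
import numpy as np


def select_maximizer(h, omegas, grid):
    """Finite analogue of Corollary A.5.

    h      : callable h(omega, b) returning a float
    omegas : finite list of sample points
    grid   : finite list of candidate controls (a discretised compact set U)
    Returns (beta, g) with g[i] = max_b h(omegas[i], b) and
    h(omegas[i], beta[i]) = g[i].
    """
    values = np.array([[h(w, b) for b in grid] for w in omegas])
    g = values.max(axis=1)
    # argmax returns the first maximiser, which fixes one selector
    # among the possibly many elements of the set B_omega.
    beta = np.asarray(grid)[values.argmax(axis=1)]
    return beta, g


# Hypothetical reward: quadratic penalty for missing the target omega.
def h(omega, b):
    return -(b - omega) ** 2


omegas = [0.0, 0.5, 2.0]
grid = np.linspace(0.0, 1.0, 5)
beta, g = select_maximizer(h, omegas, grid)
print(beta, g)
```

Taking the first maximiser is one arbitrary tie-breaking rule; any other rule that depends only on the row of values would give another valid selector in this finite setting.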

The last result that we need is the optional projection theorem.

Theorem A.6

Optional projection

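Assume that $(X_t : t \ge 0)$ is a measurable process (not necessarily adapted to the filtration $\mathbb{F}$) with $\mathbb{E}[|X_\tau|] < \infty$ for all stopping times $\tau \in \mathcal{T}$. Then there exists a unique optional process $({}^oX_t : t \ge 0)$ such that $\mathbb{1}_{[\tau < \infty]}\, {}^oX_\tau = \mathbb{E}[\mathbb{1}_{[\tau < \infty]} X_\tau \mid \mathcal{F}_\tau]$ for all stopping times $\tau \in \mathcal{T}$. If, furthermore, $X$ is càdlàg, then ${}^oX$ is also càdlàg.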

A proof of Theorem A.6 can be found in Chapter VI, p. 103 of [7].
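To make the statement of Theorem A.6 concrete, the following sketch computes a discrete-time analogue of the optional projection on a finite coin-flip space, where it reduces to the conditional expectation of X_n given the first n flips. This is an assumption-laden toy model rather than the continuous-time object of the theorem; the function name `optional_projection` and the example process (the total number of heads, which is not adapted) are hypothetical.

```python
import itertools

import numpy as np


def optional_projection(X, n_flips, p=0.5):
    """Discrete-time analogue of the optional projection.

    X       : callable X(omega, n), possibly depending on future flips
    n_flips : number of coin flips; omega is a tuple of 0/1 of this length
    Returns a dict mapping omega to [E[X_n | first n flips] for n = 0..n_flips].
    """
    omegas = list(itertools.product((0, 1), repeat=n_flips))
    prob = {w: p ** sum(w) * (1 - p) ** (n_flips - sum(w)) for w in omegas}
    oX = {w: [] for w in omegas}
    for n in range(n_flips + 1):
        for w in omegas:
            # Atom of F_n containing w: outcomes sharing the first n flips.
            atom = [v for v in omegas if v[:n] == w[:n]]
            mass = sum(prob[v] for v in atom)
            oX[w].append(sum(prob[v] * X(v, n) for v in atom) / mass)
    return oX


# Hypothetical non-adapted process: total number of heads over all flips.
def total_heads(omega, n):
    return float(sum(omega))


oX = optional_projection(total_heads, n_flips=3)
print(oX[(1, 0, 1)])
```

For this example the projection at step n equals the number of heads seen so far plus the expected number of heads still to come, so the projected process is adapted even though the original one is not.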