Sequential Analysis
Design Methods and Applications
Volume 38, 2019 - Issue 1

Monotonicity and robustness in Wiener disorder detection

Pages 57-68 | Received 30 Oct 2017, Accepted 16 Nov 2018, Published online: 13 May 2019

Abstract

We study the problem of detecting a drift change of a Brownian motion under various extensions of the classical case. Specifically, we consider the case of a random post-change drift and examine monotonicity properties of the solution with respect to different model parameters. Moreover, robustness properties – effects of misspecification of the underlying model – are explored.


1. Introduction

In the classical version of the quickest disorder detection (QDD) problem, see Shiryaev (1967), one observes a one-dimensional process $Y$ which satisfies
$$Y_t = b\,(t-\Theta)^+ + \sigma W_t,$$
where $b$ and $\sigma$ are non-zero constants, $W$ is a standard Brownian motion and the disorder time $\Theta$ is an exponentially distributed random variable (with intensity $\lambda>0$) such that $W$ and $\Theta$ are independent. The associated Bayes risk (expected cost) corresponding to a stopping rule $\tau$ is defined as
$$P(\Theta>\tau) + c\,E[(\tau-\Theta)^+], \tag{1.1}$$
where $c>0$ is the cost of one unit of detection delay. It is well known (see Shiryaev, 1978, Chapter 4) that to minimise the Bayes risk one should stop the first time the conditional probability process $\Pi_t := P(\Theta\le t \mid \mathcal{F}_t^Y)$ reaches a certain level $a$. Moreover, the level $a$ is characterized as the unique solution of a transcendental equation.
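As an illustration of this classical threshold rule, the following Python sketch simulates the observation process and runs the detector. All numerical values ($b$, $\sigma$, $\lambda$ and the threshold $a$) are assumptions chosen for the example, not quantities derived in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative parameters (assumed for this sketch, not taken from the paper)
b, sigma, lam, a = 1.0, 1.0, 0.5, 0.8
dt, T = 1e-3, 30.0
n = int(T / dt)

theta = rng.exponential(1.0 / lam)            # disorder time, exponential with rate lam
t = np.arange(1, n + 1) * dt
dW = rng.normal(0.0, np.sqrt(dt), n)
dY = b * (t > theta) * dt + sigma * dW        # observation increments of Y

# Euler scheme for the conditional probability Pi_t = P(Theta <= t | F^Y_t):
# d Pi = lam (1 - Pi) dt + (b / sigma^2) Pi (1 - Pi) (dY - b Pi dt)
pi = np.empty(n)
p = 0.0
for k in range(n):
    p += lam * (1 - p) * dt + (b / sigma**2) * p * (1 - p) * (dY[k] - b * p * dt)
    p = min(max(p, 0.0), 1.0)                 # keep the discretized process in [0, 1]
    pi[k] = p

# Declare the disorder the first time Pi reaches the level a
alarm = float(t[np.argmax(pi >= a)]) if (pi >= a).any() else float("inf")
```

The threshold `a = 0.8` is a placeholder; in the classical theory the optimal level solves a transcendental equation in $(b,\sigma,\lambda,c)$.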

In many situations, however, it is natural not to know the exact value of the disorder magnitude b, but merely its distribution. This is the case for example when a specific machine is monitored continuously, and the machine can break down in several possible ways. To study such a situation, we allow for the new drift to be a random variable B with distribution μ such that B is independent of the other sources of randomness. In this setting we study monotonicity properties of the QDD problem, i.e. whether the (minimal) expected cost is monotone with respect to various model parameters. In particular, we study the dependence of the expected cost on the volatility σ, the distribution μ, and the disorder intensity λ. We also study robustness in the QDD problem, i.e. what happens if one misspecifies various model parameters. More specifically, we aim at estimates for the increased cost associated with the use of suboptimal strategies. Clearly, such estimates are helpful in situations where the model is badly calibrated, but also in situations where one chooses to use a simpler suboptimal strategy rather than a computationally more demanding optimal strategy.

As mentioned above, the classical version of the QDD problem was studied in Shiryaev (1967), see also Shiryaev (1978, Chapter 4) and Peskir and Shiryaev (2006, Section 22); for extensions to the case of detecting a change in the intensity of a Poisson process, see Peskir and Shiryaev (2002), Bayraktar et al. (2005) and Bayraktar et al. (2006). For the case of a random disorder magnitude, Beibel (1997) obtains asymptotic results for a problem with normally distributed drift. Concavity of the value function in a related hypothesis testing problem with two possible post-change drift values in a time-homogeneous case was obtained in Muravlev and Shiryaev (2014). Finally, the practical significance of the disorder detection problem in modern engineering applications is explained in Zucca et al. (2016).

2. General model formulation

We model a signal-processing activity on a stochastic basis $(\Omega,\mathcal F,\mathbb F,P)$, where the filtration $\mathbb F=\{\mathcal F_t\}_{t\ge0}$ satisfies the usual conditions. We are interested in the signal process $X$, which is not directly observable, but we can continuously observe the noisy process
$$Y_t = \int_0^t X_u\,du + \int_0^t \sigma(u)\,dW_u, \qquad t\ge0. \tag{2.1}$$

Here $W$ is a Brownian motion independent of $X$, the dispersion $\sigma$ is deterministic and strictly positive, and the signal process follows
$$X_t = B_0\,1_{\{\Theta=0\}} + B_1\,1_{\{0<\Theta\le t\}}, \tag{2.2}$$
where $\Theta$ is a $[0,\infty)$-valued random variable representing the disorder occurrence time. Moreover, $B_0$, $B_1$ are real-valued random variables corresponding to disorder magnitudes in the cases 'disorder occurs before we start observing $Y$' and 'disorder occurs while we observe $Y$', respectively. Also, $\Theta$, $B_0$, and $B_1$ are independent. Let $\Theta$ have the distribution $\tilde\pi\,\delta_0 + (1-\tilde\pi)\,\nu$, where $\nu$ is a probability measure on $(0,\infty)$ with a continuously differentiable distribution function $F_\nu$. In addition, denote the distributions of $B_0$ and $B_1$ by $\mu_0$ and $\mu_1$, respectively. When referring to $\mu_0$ and $\mu_1$ collectively, we will simply say that the prior is $\mu$. Let us introduce the notation
$$D_n := \{\pi\in[0,\infty)^n : \|\pi\|_1 \le 1\} \qquad\text{and}\qquad \Delta_n := \{\pi\in[0,\infty)^n : \|\pi\|_1 = 1\},$$

where $\|\pi\|_1 = \sum_{i=1}^n \pi_i$. We assume that
$$\mu_0 = \sum_{i=1}^n \check p_i\,\delta_{b_i}, \qquad \mu_1 = \sum_{i=1}^n p_i\,\delta_{b_i},$$
where $b_1,\dots,b_n \in \mathbb R\setminus\{0\}$ and $(\check p_1,\dots,\check p_n),\,(p_1,\dots,p_n)\in\Delta_n$.
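To fix ideas, here is a minimal Python sketch of this prior structure for a hypothetical two-point support ($n=2$; all numbers are illustrative assumptions), sampling a path of the signal $X$ according to (2.2):

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical two-point disorder-magnitude priors (n = 2); values are assumptions
b_vals = np.array([-1.0, 2.0])   # support points b_1, b_2 (nonzero)
p_check = np.array([0.3, 0.7])   # weights of mu_0, an element of Delta_2
p = np.array([0.5, 0.5])         # weights of mu_1, an element of Delta_2
pi_tilde = 0.2                   # P(Theta = 0)
lam = 0.5                        # rate of nu (taken exponential for the example)

def sample_signal(t_grid):
    """Sample X on a time grid following (2.2): drift B0 if Theta = 0, B1 after Theta."""
    if rng.random() < pi_tilde:
        theta = 0.0
        drift = rng.choice(b_vals, p=p_check)   # magnitude drawn from mu_0
    else:
        theta = rng.exponential(1.0 / lam)
        drift = rng.choice(b_vals, p=p)         # magnitude drawn from mu_1
    return np.where(t_grid >= theta, drift, 0.0)

x = sample_signal(np.linspace(0.0, 10.0, 101))
```

Each sampled path of $X$ is zero before the disorder and equal to one of the support points afterwards, as prescribed by the finite-support assumption.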

The model studied in the paper is a generalisation of the classical disorder occurrence model of Shiryaev (1967). Firstly, the exponential disorder distribution used in the classical problem is replaced by an arbitrary distribution with time-dependent intensity. This generalisation is advantageous in situations where the intensity of the disorder occurrence changes with time. For example, if the disorder corresponds to a component failure in a system, then for many physical systems the failure intensity is known to increase with age. Also, if occurrence of the disorder depends on external factors such as weather, then such dependency can be incorporated into the time-dependent disorder intensity, given an accurate weather forecast. Moreover, in contrast to the classical problem, in which the disorder magnitude is known in advance, in this generalisation the magnitude takes a value from a range of possible values. Returning to the component failure example, the different possible disorder magnitudes would represent different types of component failure. In the problem of detecting malfunctioning atomic clocks (Zucca et al., 2016), the disorder corresponds to a systematic drift of a clock. The sign of the disorder magnitude reflects whether a clock starts to go too slow or too fast, while the absolute value represents the severity of the drift. In addition, the different distributions $\mu_0,\mu_1$ of $B_0$ and $B_1$ and the weight $\tilde\pi$ reflect the prior knowledge about how likely different disorder magnitudes are if the disorder happened before or while observing $Y$. For instance, such model flexibility is relevant when we start observing the system after a particular incident (e.g. a storm, if the system is affected by the weather) and we know that the distribution of possible disorder magnitudes after the incident differs from that under normal operating conditions.
From a mathematical point of view, π˜ and B0 allow us to give a statistical interpretation to an arbitrary starting point in the Markovian embedding (2.7) of the original optimal stopping problem studied later.

Remark 2.1.

We point out that the finite support assumption on $\mu$ is made for notational convenience. As any distribution can be approximated arbitrarily well by finitely supported ones, our monotonicity results below can be extended to general disorder magnitude distributions.

We are interested in a disorder detection strategy $\tau$ incorporating two objectives: short detection delay and a small portion of false alarms. As noted in the introduction, a classical choice of Bayes risk for a detection strategy to minimize is given by (1.1). In the present paper, we consider a slightly more flexible risk structure by allowing a time-dependent cost for the detection delay. More precisely, we consider the Bayes risk
$$R(\tau) := E\Big[1_{\{\tau<\Theta\}} + \int_\Theta^{\tau} c(u)\,du\Big],$$
where $1_{\{\tau<\Theta\}}$ is a fixed penalty for a false alarm and the term $\int_\Theta^\tau c(u)\,du$ is a penalty for detection delay (understood to vanish on $\{\tau<\Theta\}$). Here $t\mapsto c(t)$ is a deterministic function with $c(t)>0$ for all $t\ge0$. Writing $\mathbb F^Y=\{\mathcal F_t^Y\}_{t\ge0}$ for the filtration generated by $Y$ (which is our observation filtration), let us introduce $\tilde\Pi_t := E[1_{\mathbb R\setminus\{0\}}(X_t)\mid\mathcal F_t^Y]$. Then
$$R(\tau) = E\big[1 - E[1_{\{\Theta\le\tau\}}\mid\mathcal F_\tau^Y]\big] + \int_0^\infty c(t)\,E\big[1_{\{t\le\tau\}}\,E[1_{\{\Theta\le t\}}\mid\mathcal F_t^Y]\big]\,dt = E\Big[1 - \tilde\Pi_\tau + \int_0^\tau c(t)\,\tilde\Pi_t\,dt\Big].$$
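As a sanity check on this risk functional, one can compare a Monte Carlo estimate of $R(\tau)$ for a deterministic rule $\tau = t_0$ with the closed form available in the special case $\tilde\pi = 0$, $\Theta$ exponential with rate $\lambda$ and $c$ constant. The parameter values below are assumptions for the example.

```python
import numpy as np

rng = np.random.default_rng(2)

# Assumed constants for the check: pi_tilde = 0, Theta ~ Exp(lam), c constant, tau = t0
lam, c, t0 = 0.5, 1.0, 2.0
theta = rng.exponential(1.0 / lam, size=200_000)

# Monte Carlo estimate of R(tau) = E[ 1_{tau < Theta} + c * (tau - Theta)^+ ]
risk_mc = float(np.mean((t0 < theta) + c * np.maximum(t0 - theta, 0.0)))

# Closed form: P(Theta > t0) + c * ( t0 - (1 - e^{-lam t0}) / lam )
risk_cf = float(np.exp(-lam * t0) + c * (t0 - (1.0 - np.exp(-lam * t0)) / lam))
```

The closed form follows from $E[(t_0-\Theta)^+] = \int_0^{t_0}(t_0-\theta)\lambda e^{-\lambda\theta}\,d\theta = t_0 - (1-e^{-\lambda t_0})/\lambda$.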

Hence the optimal stopping problem to solve is
$$V = \inf_{\tau\in\mathcal T^Y} E\Big[1-\tilde\Pi_\tau + \int_0^\tau c(t)\,\tilde\Pi_t\,dt\Big], \tag{2.3}$$
where $\mathcal T^Y$ denotes the set of $\mathbb F^Y$-stopping times.

2.1. Filtering equations

Let us define $\Pi_t^{(i)} := E[1_{\{X_t=b_i\}}\mid\mathcal F_t^Y]$, where $i=1,\dots,n$. By the Kallianpur-Striebel formula, see Crisan and Rozovskii (2011, Theorem 2.9 on p. 39),
$$\Pi_t^{(i)} = \frac{\tilde\pi\,\check p_i\,e^{\int_0^t \frac{b_i}{\sigma(u)^2}dY_u - \int_0^t \frac{b_i^2}{2\sigma(u)^2}du} + (1-\tilde\pi)\,p_i \int_{[0,t]} e^{\int_\theta^t \frac{b_i}{\sigma(u)^2}dY_u - \int_\theta^t \frac{b_i^2}{2\sigma(u)^2}du}\,\nu(d\theta)}{\tilde\pi \sum_j \check p_j\,e^{\int_0^t \frac{b_j}{\sigma(u)^2}dY_u - \int_0^t \frac{b_j^2}{2\sigma(u)^2}du} + (1-\tilde\pi)\Big(\sum_j p_j \int_{[0,t]} e^{\int_\theta^t \frac{b_j}{\sigma(u)^2}dY_u - \int_\theta^t \frac{b_j^2}{2\sigma(u)^2}du}\,\nu(d\theta) + \nu((t,\infty))\Big)} \tag{2.4}$$
for $i=1,\dots,n$. Moreover, from the Kushner-Stratonovich equation, see Crisan and Rozovskii (2011, Theorem 3.1 on p. 58), we know that $\Pi^{(i)}$ satisfies
$$d\Pi_t^{(i)} = p_i\,\lambda(t)\Big(1-\sum_{j=1}^n \Pi_t^{(j)}\Big)dt + \frac{\Pi_t^{(i)}}{\sigma(t)}\Big(b_i - \sum_{j=1}^n b_j\,\Pi_t^{(j)}\Big)d\hat W_t, \qquad i=1,\dots,n. \tag{2.5}$$

Here $\lambda(t) = F_\nu'(t)/(1-F_\nu(t))$ is the intensity of the disorder occurring at time $t>0$ (conditional on not having occurred yet), and
$$\hat W_t = \int_0^t \frac{1}{\sigma(u)}\big(dY_u - E[X_u\mid\mathcal F_u^Y]\,du\big)$$
is a standard Brownian motion with respect to $\{\mathcal F_t^Y\}_{t\ge0}$, see Bain and Crisan (2009) (the process $\hat W$ is referred to as the innovation process). Note that $\tilde\Pi_t = \sum_{i=1}^n \Pi_t^{(i)}$ yields
$$d\tilde\Pi_t = \lambda(t)\,(1-\tilde\Pi_t)\,dt + \frac{\hat X_t}{\sigma(t)}\,(1-\tilde\Pi_t)\,d\hat W_t, \tag{2.6}$$
where $\hat X_t = E[X_t\mid\mathcal F_t^Y]$.
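Since $\hat W$ is a standard Brownian motion under $P$, the system (2.5) can be simulated directly by an Euler scheme driven by i.i.d. Gaussian increments. The following sketch (two-point support, constant $\lambda$ and $\sigma$; all values are assumptions for illustration) adds a projection step to keep the discretized weights in the sub-probability set $D_n$.

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical two-point support and constant coefficients (illustrative only)
b = np.array([-1.0, 2.0])      # support points b_i of the magnitude prior
p = np.array([0.5, 0.5])       # weights p_i of mu_1
lam, sigma = 0.5, 1.0          # constant disorder intensity and volatility
dt, n = 1e-3, 20_000

Pi = np.zeros(2)               # start from pi = (0, 0): disorder has not occurred yet
path = np.empty((n, 2))
for k in range(n):
    dW = rng.normal(0.0, np.sqrt(dt))   # innovation increment (hat W is a standard BM)
    xhat = b @ Pi                        # E[X_t | F_t^Y]
    Pi = Pi + p * lam * (1.0 - Pi.sum()) * dt + (Pi / sigma) * (b - xhat) * dW
    Pi = np.clip(Pi, 0.0, 1.0)
    s = Pi.sum()
    if s > 1.0:                          # project back onto D_2 to control Euler error
        Pi = Pi / s
    path[k] = Pi
```

The clipping/projection step is a numerical device, not part of the filtering equations; it only controls discretization error.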

The posterior distribution satisfies $P(X_t\in\cdot\mid\mathcal F_t^Y) = \sum_{i=1}^n \Pi_t^{(i)}\,\delta_{b_i}(\cdot) + (1-\tilde\Pi_t)\,\delta_0(\cdot)$, so the $n$-tuple $\Pi_t=(\Pi_t^{(1)},\dots,\Pi_t^{(n)})$ fully describes the posterior. As a result, (2.4) and (2.5) provide two different representations of the posterior distribution.

2.2. Markovian embedding

Following standard lines in optimal stopping theory, we embed our optimal stopping problem into a Markovian framework. To do that, define a Markovian value function $V$ by
$$V(t,\pi) := \inf_{\tau\in\mathcal T_t^\Pi} E_{t,\pi}\Big[1-\tilde\Pi_{t+\tau} + \int_t^{t+\tau} c(u)\,\tilde\Pi_u\,du\Big], \qquad (t,\pi)\in[0,\infty)\times D_n, \tag{2.7}$$
where $\mathcal T_t^\Pi$ denotes the stopping times with respect to the $n$-dimensional process $\{\Pi_{t+s}^{t,\pi}\}_{s\ge0}$ starting from $\pi$ at time $t$ and satisfying (2.5). It is worth noting that $V(t,\pi)$ corresponds to the value of the problem (2.3) in which the initial time is $t$ and $\mu_0 = \sum_{i=1}^n \pi_i\,\delta_{b_i}$.

Remark 2.2.

The value function $V(t,\cdot)$ in (2.7) is concave for any $t\ge0$. Indeed, the concavity proof in Muravlev and Shiryaev (2014) extends to the current setting. Since concavity is not used in the monotonicity results below, however, we omit the details.

2.2.1. The classical Shiryaev solution

In this subsection we recall the solution in the classical case where the cost $c$, the intensity $\lambda$, and the post-change drift $b$ are constants. In that case, we have the optimal stopping problem
$$U(\pi) = \inf_{\tau\in\mathcal T^\Pi} E_\pi\Big[1-\Pi_\tau + c\int_0^\tau \Pi_t\,dt\Big] \tag{2.8}$$
with an underlying diffusion process
$$d\Pi_t = \lambda\,(1-\Pi_t)\,dt + \frac{b}{\sigma}\,\Pi_t(1-\Pi_t)\,d\hat W_t.$$

It is well known (see Shiryaev, 1978, Chapter 4, or Peskir and Shiryaev, 2006, Section 22) that $U$ solves the free-boundary problem
$$\begin{cases} \dfrac{b^2\pi^2(1-\pi)^2}{2\sigma^2}\,\partial_\pi^2 U + \lambda(1-\pi)\,\partial_\pi U + c\pi = 0, & \pi\in(0,a)\\[2pt] U(\pi) = 1-\pi, & \pi\in[a,1]\\[2pt] \partial_\pi U(a) = -1. \end{cases} \tag{2.9}$$

Here $a\in(0,1)$ is the free boundary, and it can be determined as the solution of a certain transcendental equation. Moreover, the stopping time $\tau^* := \inf\{t\ge0 : \Pi_t\ge a\}$ is optimal in (2.8), and one can check that the value function $U$ is decreasing and concave.
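Because the classical process $\Pi$ is autonomous under the innovation Brownian motion, the Bayes risk of each threshold rule can also be estimated by direct simulation, and the risk-minimizing threshold over a grid approximates $a$. A rough Monte Carlo sketch (all parameter values assumed for illustration):

```python
import numpy as np

rng = np.random.default_rng(4)

# Assumed constants for the sketch
b, sigma, lam, c = 1.0, 1.0, 0.5, 1.0
dt, n_steps, n_paths = 5e-3, 4000, 1000

def bayes_risk(a):
    """MC estimate of E[1 - Pi_tau + c int_0^tau Pi dt] for tau = inf{t : Pi_t >= a}."""
    p = np.zeros(n_paths)
    cost = np.zeros(n_paths)
    for _ in range(n_steps):
        alive = p < a                       # paths that have not hit the threshold yet
        if not alive.any():
            break
        cost[alive] += c * p[alive] * dt
        dW = rng.normal(0.0, np.sqrt(dt), int(alive.sum()))
        p_a = p[alive]
        p[alive] = np.clip(p_a + lam * (1 - p_a) * dt
                           + (b / sigma) * p_a * (1 - p_a) * dW, 0.0, 1.0)
    return float(np.mean(1.0 - p + cost))

grid = np.linspace(0.5, 0.95, 10)
risks = [bayes_risk(a) for a in grid]
a_star = float(grid[int(np.argmin(risks))])
```

This brute-force search is only a numerical illustration of the threshold structure; the paper's $a$ is characterized analytically via the transcendental equation.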

3. Value dependencies and robustness

3.1. Monotonicity properties of the value function

In this section, we study parameter dependence of the optimal stopping problem (2.7). In particular, we investigate how the value function changes when we alter parameters of the probabilistic model, which include the prior for the drift magnitude and the prior for the disorder time.

The effects of adding more noise, stretching out the prior by scaling, and increasing the observation cost are explained by the following theorem.

Theorem 3.1

(General monotonicity properties of the value function V).

  1. V is increasing in the volatility σ(·).

  2. Given a prior $\mu$ for the drift magnitude, let $V_k$ denote the Markovian value function (2.7) in the case when the drift prior is $\mu(\cdot/k)$. Then the map $k\mapsto V_k(t,\pi)$ is decreasing on $(0,\infty)$ for any $(t,\pi)$.

  3. V is increasing in the cost function c(·).

Proof.

For simplicity of notation, and without loss of generality, we consider the case t = 0 in the proofs below.

  1. For the volatility, let $t\mapsto\sigma_1(t)$ and $t\mapsto\sigma_2(t)$ be two time-dependent volatility functions satisfying $\sigma_1(t)\le\sigma_2(t)$ for all $t\ge0$. Also, let
$$Y_t^i := \int_0^t X_u\,du + \int_0^t \sigma_i(u)\,dW_u, \qquad i=1,2,$$
and let $V_i$, $i=1,2$, be the corresponding value functions. In addition, let $W^\perp$ be a standard Brownian motion independent of $W$ and $X$. Then, clearly,
$$V_1 = \inf_{\tau\in\mathcal T^{Y^1}} E\Big[1_{\{\tau<\Theta\}} + \int_\Theta^\tau c(u)\,du\Big] = \inf_{\tau\in\mathcal T^{Y^1,W^\perp}} E\Big[1_{\{\tau<\Theta\}} + \int_\Theta^\tau c(u)\,du\Big].$$
Moreover, the process
$$\tilde Y_t^2 := Y_t^1 + \int_0^t \sqrt{\sigma_2^2(u)-\sigma_1^2(u)}\,dW_u^\perp$$
coincides in law with $Y^2$ and $\mathcal T^{\tilde Y^2}\subseteq\mathcal T^{Y^1,W^\perp}$. Hence it follows that
$$V_1 = \inf_{\tau\in\mathcal T^{Y^1,W^\perp}} E\Big[1_{\{\tau<\Theta\}} + \int_\Theta^\tau c(u)\,du\Big] \le \inf_{\tau\in\mathcal T^{\tilde Y^2}} E\Big[1_{\{\tau<\Theta\}} + \int_\Theta^\tau c(u)\,du\Big] = V_2,$$
which finishes the proof of the claim.
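The coupling in this argument is easy to verify numerically: adding the independent top-up noise to $Y^1$ reproduces the law of $Y^2$. A minimal sketch with constant volatilities and zero signal (pre-disorder); the values $\sigma_1=0.8$, $\sigma_2=1.2$ are assumptions for the check.

```python
import numpy as np

rng = np.random.default_rng(5)

# Constant volatilities sigma1 <= sigma2, zero signal (pre-disorder); assumed values
sigma1, sigma2, t, n = 0.8, 1.2, 1.0, 200_000

W = rng.normal(0.0, np.sqrt(t), n)        # terminal value of the driving BM at time t
W_perp = rng.normal(0.0, np.sqrt(t), n)   # terminal value of an independent BM

Y1 = sigma1 * W                                          # Y^1_t with X = 0
Y2_tilde = Y1 + np.sqrt(sigma2**2 - sigma1**2) * W_perp  # coupled copy of Y^2_t
```

The sample variance of `Y2_tilde` should match $\sigma_2^2 t$, the variance of $Y^2_t$ in this zero-signal case.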

  2. Note that for $k>0$, the process
$$Y_t^k := \int_0^t kX_u\,du + \int_0^t \sigma(u)\,dW_u$$
satisfies $Y_t^k = k\tilde Y_t$, where
$$\tilde Y_t := \int_0^t X_u\,du + \int_0^t \frac{\sigma(u)}{k}\,dW_u.$$
Moreover, the set of $\mathbb F^{Y^k}$-stopping times coincides with the set of $\mathbb F^{\tilde Y}$-stopping times, so monotonicity in $k$ is implied by monotonicity in the volatility. Thus claim 2 follows from claim 1.

  3. The fact that the value is increasing in $c$ is obvious from the definition (2.7) of the value function. □

The monotonicity of the minimal Bayes’ risk with respect to volatility σ is of course not surprising: more noise in the observation process gives a smaller signal-to-noise ratio, which slows down the speed of learning. It is less clear how a change in the disorder intensity λ should affect the value function under a general disorder magnitude distribution. However, we have the following comparison result for the case of constant parameters.

Theorem 3.2

(Monotonicity in the intensity for constant parameters). Assume that the disorder magnitude can only take one value $b\in\mathbb R\setminus\{0\}$. Let the cost $c$, the volatility $\sigma$, and the intensity $\lambda$ be constants, and assume that $\lambda \ge \lambda(\cdot)$. Let $U$ be the value function for Shiryaev's problem with parameters $(b,\sigma,\lambda,c)$, and let $V$ denote the value function for the problem specification $(b,\sigma,\lambda(\cdot),c)$. Then $U(\pi)\le V(t,\pi)$ for all $\pi\in[0,1]$ and $t\ge0$.

Proof.

Without loss of generality, we only consider the case $t=0$. Let $\pi\in[0,1]$, denote by $Y$ the observation process corresponding to the model specification $(b,\sigma,\lambda(\cdot),c)$, and let $\Pi$ denote the corresponding conditional probability process started from $\pi$ at time 0. Let $\tau\in\mathcal T^Y$ be a bounded stopping time. Then, applying (a generalised version of) Ito's formula and taking expectations at the stopping time $\tau$, we get
$$\begin{aligned} U(\pi) &= E[U(\Pi(\tau))] - E\Big[\int_0^\tau \Big(\lambda(s)(1-\Pi(s))\,\partial_\pi U(\Pi(s)) + \frac{b^2}{2\sigma^2}\,\Pi^2(s)(1-\Pi(s))^2\,\partial_\pi^2 U(\Pi(s))\Big)\,ds\Big] \\ &\le E[U(\Pi(\tau))] - E\Big[\int_0^\tau \Big(\lambda(1-\Pi(s))\,\partial_\pi U(\Pi(s)) + \frac{b^2}{2\sigma^2}\,\Pi^2(s)(1-\Pi(s))^2\,\partial_\pi^2 U(\Pi(s))\Big)\,ds\Big] \\ &\le E[U(\Pi(\tau))] + E\Big[c\int_0^\tau \Pi(s)\,ds\Big] \le E[1-\Pi(\tau)] + E\Big[c\int_0^\tau \Pi(s)\,ds\Big], \end{aligned}$$

where we used the monotonicity of $U$, the bound $U(\pi)\le1-\pi$, and the fact that
$$\lambda(1-\pi)\,\partial_\pi U(\pi) + \frac{b^2}{2\sigma^2}\,\pi^2(1-\pi)^2\,\partial_\pi^2 U(\pi) + c\pi \ge 0 \tag{3.1}$$
at all points away from the optimal stopping boundary of Shiryaev's classical problem, compare (2.9). Taking the infimum over bounded stopping times $\tau$, we get $U(\pi)\le V(0,\pi)$, which finishes the proof. □

Remark 3.1.

  1. The monotonicity in intensity does not easily extend to cases with unknown post-change drift by the same argument. In fact, one can check that in higher dimensions the partial derivatives $\partial_{\pi_i}V$ are not necessarily all negative, which implies difficulties with extending the above proof to a more general setting. However, the robustness result in Theorem 3.3 below provides a partial extension in which models with general support for the drift magnitude and general intensities are compared with a fixed-parameter model.

  2. Though the authors expect the inequality in Theorem 3.2 to hold also when one time-dependent intensity dominates another, the comparison with the constant intensity case was chosen to avoid additional mathematical complications that need to be resolved in order to apply Ito’s formula to the value function of a time-dependent disorder detection problem.

3.2. Robustness

Robustness concerns how a possible misspecification of the model parameters affects the performance of the detection strategy when evaluated under the real physical measure. In this section, we use coupling arguments to study robustness properties with respect to the disorder magnitude and disorder time. For simplicity, we assume that the parameters λ, c and σ are constant so that we have a time-independent case; generalizations to the time-dependent case are straightforward but notationally more involved.

Thus we assume that the signal process follows
$$X_t = B_0\,1_{\{\Theta=0\}} + B_1\,1_{\{0<\Theta\le t\}}, \tag{3.2}$$
where $B_0$, $B_1$ are random variables with distributions $\mu_0,\mu_1$ respectively, and $\Theta$ has the distribution $\nu_{\tilde\pi} := \tilde\pi\,\delta_0 + (1-\tilde\pi)\,\nu$, where $\nu$ is an exponential distribution with intensity $\lambda$. Let us simply write $\mu := (\mu_0,\mu_1)$.

For a given $l\in\mathbb R\setminus\{0\}$, let $\Theta_l$ satisfy $\Theta_l\ge\Theta$ with distribution $\tilde\pi\,\delta_0 + (1-\tilde\pi)\,\nu_l$, where $\nu_l$ is an exponential distribution with intensity $\lambda_l\le\lambda$. Let
$$g_l(t,\tilde\pi,Y) := \frac{\tilde\pi\,e^{\frac{l}{\sigma^2}Y_t - \frac{l^2}{2\sigma^2}t} + (1-\tilde\pi)\,\lambda_l\int_0^t e^{\frac{l}{\sigma^2}(Y_t-Y_\theta) - \frac{l^2}{2\sigma^2}(t-\theta)}\,e^{-\lambda_l\theta}\,d\theta}{\tilde\pi\,e^{\frac{l}{\sigma^2}Y_t - \frac{l^2}{2\sigma^2}t} + (1-\tilde\pi)\Big(\lambda_l\int_0^t e^{\frac{l}{\sigma^2}(Y_t-Y_\theta) - \frac{l^2}{2\sigma^2}(t-\theta)}\,e^{-\lambda_l\theta}\,d\theta + e^{-\lambda_l t}\Big)},$$
compare (2.4). Also, we introduce the notation
$$Y_t^{\mu} := \int_0^t X_u\,du + \sigma W_t, \qquad Y_t^{\delta_l} := l\,(t-\Theta_l)^+ + \sigma W_t,$$
$$\tilde\Pi_{\delta_l}^{\delta_l}(t) := g_l(t,\tilde\pi,Y^{\delta_l}) \qquad\text{and}\qquad \tilde\Pi_{\delta_l}^{\mu}(t) := g_l(t,\tilde\pi,Y^{\mu}).$$

Here $Y^{\mu}$ is the observation process for a setting in which the post-change drift has distribution $\mu$ and the disorder happens at $\Theta$. The process $Y^{\delta_l}$ is the observation process and $\tilde\Pi_{\delta_l}^{\delta_l}$ is the corresponding conditional probability process in the situation of a post-change drift $l$ that occurs at $\Theta_l$. Moreover, the process $\tilde\Pi_{\delta_l}^{\mu}$ represents the conditional probability process calculated as if the drift change is described by $(\delta_l,\Theta_l)$ in the scenario where the true drift change is given by $(\mu,\Theta)$.
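For a discretely observed path, the statistic $g_l$ can be evaluated by approximating the $\theta$-integral with the trapezoidal rule. The sketch below is a direct discretization; the parameter values in the usage lines ($l=1$, $\sigma=1$, $\lambda_l=0.5$, $\tilde\pi=0.1$) are assumptions for illustration.

```python
import numpy as np

def g_l(t_grid, Y, pi_tilde, l, sigma, lam_l):
    """Discretized g_l(t, pi_tilde, Y): the conditional probability of the disorder,
    computed under the (possibly misspecified) one-point model (delta_l, Theta_l)."""
    t = t_grid[-1]
    lead = np.exp((l / sigma**2) * Y[-1] - (l**2 / (2 * sigma**2)) * t)
    integrand = (lam_l * np.exp(-lam_l * t_grid)
                 * np.exp((l / sigma**2) * (Y[-1] - Y)
                          - (l**2 / (2 * sigma**2)) * (t - t_grid)))
    # trapezoidal rule for the integral over theta in [0, t]
    integral = np.sum((integrand[1:] + integrand[:-1]) * np.diff(t_grid)) / 2.0
    num = pi_tilde * lead + (1 - pi_tilde) * integral
    den = num + (1 - pi_tilde) * np.exp(-lam_l * t)
    return float(num / den)

# Usage with two synthetic noiseless paths (assumed parameters)
t_grid = np.linspace(0.0, 5.0, 1001)
g_flat = g_l(t_grid, np.zeros_like(t_grid), 0.1, 1.0, 1.0, 0.5)   # no drift observed
g_drift = g_l(t_grid, 1.0 * t_grid, 0.1, 1.0, 1.0, 0.5)           # drift l from time 0
```

As expected, a path exhibiting the drift $l$ pushes the statistic closer to one than a driftless path.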

Now, let $a := a_l$ denote the optimal stopping boundary for the classical Shiryaev one-dimensional problem in the model $(\delta_l,\Theta_l)$, and define
$$\tau_{\delta_l}^{\delta_l} := \inf\{t\ge0 : \tilde\Pi_{\delta_l}^{\delta_l}(t)\ge a\}, \qquad \tau_{\delta_l}^{\mu} := \inf\{t\ge0 : \tilde\Pi_{\delta_l}^{\mu}(t)\ge a\},$$
and
$$V_{\delta_l}^{\mu} := E\big[1_{\{\tau_{\delta_l}^{\mu}<\Theta\}} + c\,(\tau_{\delta_l}^{\mu}-\Theta)^+\big].$$

Here $\tau_{\delta_l}^{\delta_l}$ is the optimal stopping time in the model $(\delta_l,\Theta_l)$, and $\tau_{\delta_l}^{\mu}$ is the (sub-optimal) stopping time and $V_{\delta_l}^{\mu}$ is the corresponding cost for someone who believes in $(\delta_l,\Theta_l)$, whereas the true model is $(\mu,\Theta)$.

Finally, let
$$\tilde\Pi_t^{\mu} := P\big(X_t\ne0 \mid \mathcal F_t^{Y^\mu}\big) = \Pi_t^{(1)} + \dots + \Pi_t^{(n)}$$
as in Section 2, and define
$$\gamma_{\delta_l}^{\mu} := \inf\{t\ge0 : \tilde\Pi_t^{\mu}\ge a\}.$$

Theorem 3.3

(Robustness with respect to disorder magnitude and intensity).

  1. Suppose that $\inf(\operatorname{supp}\mu)>0$ or $\sup(\operatorname{supp}\mu)<0$, and let $l := \operatorname{argmin}_{x\in\operatorname{supp}(\mu)}|x|$. Then
$$V^{\mu} \le V_{\delta_l}^{\mu} \le V^{\delta_l} + c\,\frac{\lambda-\lambda_l}{\lambda\lambda_l}\,(1-\tilde\pi), \tag{3.3}$$

where $V^{\mu}$ and $V^{\delta_l}$ denote the minimal associated Bayes risks for the models $(\mu,\Theta)$ and $(\delta_l,\Theta_l)$, respectively.

  2. Also,
$$V^{\mu} \le P(\Theta>\gamma_{\delta_l}^{\mu}) + c\,E[(\gamma_{\delta_l}^{\mu}-\Theta)^+] \le V^{\delta_l}. \tag{3.4}$$

  3. Suppose $r := \operatorname{argmax}_{x\in\operatorname{supp}(\mu)}|x|$, and define $V_{\delta_r}^{\mu}$ like $V_{\delta_l}^{\mu}$ for $l=r$. If $\lambda_r\ge\lambda$, then
$$V^{\delta_r} \le V^{\mu} \le V_{\delta_r}^{\mu}. \tag{3.5}$$

Remark 3.2.

Note that (3.3) and (3.5) correspond to situations in which the tester uses a misspecified model. More precisely, filtering and stopping are performed as if the underlying model had a one-point distribution as the disorder magnitude prior (the classical Shiryaev model). Such a situation may appear due to model miscalibration but is also relevant in situations with limited computational resources as the tester can deliberately choose to under/overestimate the actual parameters in order to use a simpler detection strategy. Equation (3.3) thus gives an upper bound for the expected loss when the classical Shiryaev model is employed. In (3.4), on the other hand, filtering is performed according to the correct model but the simple Shiryaev threshold strategy (suboptimal) is used for stopping.

Proof.

1. (a) For definiteness, we consider the case $\inf(\operatorname{supp}\mu)>0$ so that $l>0$; the other case is completely analogous. First note that the suboptimality of $\tau_{\delta_l}^{\mu}$ yields $V^{\mu}\le V_{\delta_l}^{\mu}$. Next, observe that we have $Y_t^{\delta_l} = Y_t^{\mu}$ for all $0\le t\le\Theta$ and $Y_t^{\delta_l}\le Y_t^{\mu}$ for all $t\ge0$, and therefore
$$\tilde\Pi_{\delta_l}^{\delta_l}(t) = \tilde\Pi_{\delta_l}^{\mu}(t) \text{ for } t\in[0,\Theta] \qquad\text{and}\qquad \tilde\Pi_{\delta_l}^{\delta_l}(t) \le \tilde\Pi_{\delta_l}^{\mu}(t) \text{ for all } t\ge0$$
by the filtering formula defining $g_l$. Consequently, $\tau_{\delta_l}^{\delta_l} \ge \tau_{\delta_l}^{\mu}$, so
$$E[(\tau_{\delta_l}^{\delta_l}-\Theta_l)^+] \ge E[(\tau_{\delta_l}^{\mu}-\Theta)^+] - E[(\Theta_l-\Theta)^+] = E[(\tau_{\delta_l}^{\mu}-\Theta)^+] - \frac{\lambda-\lambda_l}{\lambda\lambda_l}(1-\tilde\pi). \tag{3.6}$$

Moreover, since $\tilde\Pi_{\delta_l}^{\delta_l}(t) = \tilde\Pi_{\delta_l}^{\mu}(t)$ on the time interval $[0,\Theta]$, we have
$$P(\tau_{\delta_l}^{\delta_l}<\Theta_l) \ge P(\tau_{\delta_l}^{\delta_l}<\Theta) = P(\tau_{\delta_l}^{\mu}<\Theta),$$

which together with (3.6) yields
$$V^{\delta_l} = E\big[1_{\{\tau_{\delta_l}^{\delta_l}<\Theta_l\}} + c(\tau_{\delta_l}^{\delta_l}-\Theta_l)^+\big] \ge E\big[1_{\{\tau_{\delta_l}^{\mu}<\Theta\}} + c(\tau_{\delta_l}^{\mu}-\Theta)^+\big] - c\,\frac{\lambda-\lambda_l}{\lambda\lambda_l}(1-\tilde\pi) = V_{\delta_l}^{\mu} - c\,\frac{\lambda-\lambda_l}{\lambda\lambda_l}(1-\tilde\pi).$$

(b) The first inequality is immediate by suboptimality of $\gamma_{\delta_l}^{\mu}$. For the second one, let $U$ be the value function of the classical Shiryaev problem so that $U(\tilde\pi)=V^{\delta_l}$. Then $U$ is $C^2$ on $[0,a_l)\cup(a_l,1]$ and $C^1$ on $[0,1]$, so applying Itô's formula to $U(\tilde\Pi_t)$ and taking expectations at the bounded stopping time $\gamma_{\delta_l}^{\mu}\wedge k$, we get
$$\begin{aligned} U(\tilde\pi) &= E\big[U(\tilde\Pi_{\gamma_{\delta_l}^{\mu}\wedge k})\big] - E\Big[\int_0^{\gamma_{\delta_l}^{\mu}\wedge k} \Big(\lambda(1-\tilde\Pi_u)\,U'(\tilde\Pi_u) + \frac{\hat X_u^2}{2\sigma^2}(1-\tilde\Pi_u)^2\,U''(\tilde\Pi_u)\Big)\,du\Big] \\ &\ge E\big[U(\tilde\Pi_{\gamma_{\delta_l}^{\mu}\wedge k})\big] - E\Big[\int_0^{\gamma_{\delta_l}^{\mu}\wedge k} \Big(\lambda_l(1-\tilde\Pi_u)\,U'(\tilde\Pi_u) + \frac{l^2}{2\sigma^2}\tilde\Pi_u^2(1-\tilde\Pi_u)^2\,U''(\tilde\Pi_u)\Big)\,du\Big] \\ &= E\big[U(\tilde\Pi_{\gamma_{\delta_l}^{\mu}\wedge k})\big] + E\Big[c\int_0^{\gamma_{\delta_l}^{\mu}\wedge k}\tilde\Pi_u\,du\Big], \end{aligned}$$
where monotonicity and concavity of $U$ were used in the inequality. Letting $k\to\infty$ gives
$$U(\tilde\pi) \ge E\big[1-\tilde\Pi_{\gamma_{\delta_l}^{\mu}}\big] + E\Big[c\int_0^{\gamma_{\delta_l}^{\mu}}\tilde\Pi_u\,du\Big],$$
which finishes the proof of the claim.

2. Recall that
$$d\tilde\Pi_t = \lambda(1-\tilde\Pi_t)\,dt + \frac{\hat X_t}{\sigma}(1-\tilde\Pi_t)\,d\hat W_t.$$

Let $U(\tilde\pi) = V^{\delta_r}(\tilde\pi)$. Since $U$ is $C^1$ on $[0,1]$ and $C^2$ on $[0,a)\cup(a,1]$, where $a=a_r$ is the boundary in Shiryaev's problem with drift $r$ and intensity $\lambda_r$, applying Itô's formula to $U(\tilde\Pi_t)$ and taking expectations at a bounded stopping time $\tau$ yields
$$\begin{aligned} U(\tilde\pi) &= E[U(\tilde\Pi_\tau)] - E\Big[\int_0^\tau \Big(\lambda(1-\tilde\Pi_u)\,U'(\tilde\Pi_u) + \frac{\hat X_u^2}{2\sigma^2}(1-\tilde\Pi_u)^2\,U''(\tilde\Pi_u)\Big)\,du\Big] \\ &\le E[U(\tilde\Pi_\tau)] - E\Big[\int_0^\tau \Big(\lambda_r(1-\tilde\Pi_u)\,U'(\tilde\Pi_u) + \frac{r^2}{2\sigma^2}\tilde\Pi_u^2(1-\tilde\Pi_u)^2\,U''(\tilde\Pi_u)\Big)\,du\Big] \\ &\le E[U(\tilde\Pi_\tau)] + E\Big[c\int_0^\tau \tilde\Pi_u\,du\Big] \qquad (3.7) \\ &\le E[1-\tilde\Pi_\tau] + E\Big[c\int_0^\tau \tilde\Pi_u\,du\Big]. \qquad (3.8) \end{aligned}$$

Here monotonicity and concavity of $U$ were used for the first inequality, (3.7) follows from the fact that
$$\lambda_r(1-\tilde\pi)\,U'(\tilde\pi) + \frac{r^2}{2\sigma^2}\tilde\pi^2(1-\tilde\pi)^2\,U''(\tilde\pi) + c\tilde\pi \ge 0, \qquad \tilde\pi\in[0,a)\cup(a,1],$$
and the inequality (3.8) holds because $U(\tilde\pi)\le1-\tilde\pi$. Hence, since the same value $V^{\mu}$ is obtained if one in (2.3) restricts the infimum to only bounded stopping times,
$$V^{\delta_r} = U(\tilde\pi) \le V^{\mu}.$$

Lastly, since $\tau_{\delta_r}^{\mu}$ is a suboptimal strategy, we also have $V^{\mu}\le V_{\delta_r}^{\mu}$, which finishes the claim. □

Corollary 3.1.

In the notation above, assume that $\lambda=\lambda_l$ so that there is no misspecification of the intensity. Moreover, assume that $\operatorname{supp}(\mu)\subseteq[l,r]$, where $0<l<r$. Then
$$V^{\delta_r} \le V^{\mu} \le V^{\delta_l},$$
so monotonicity in the disorder magnitude holds when comparing with deterministic magnitudes. Furthermore,
$$0 \le V_{\delta_l}^{\mu} - V^{\mu} \le V^{\delta_l} - V^{\delta_r},$$
so the increase in the Bayes risk due to underestimation (with a constant) of the disorder magnitude is bounded by the difference of two value functions of the classical Shiryaev problem.

We finish with some implications concerning the stopping strategy $\tau_D := \inf\{t\ge0 : \Pi_t\in D\}$, where $D=\{\pi\in D_n : V(\pi)=1-\|\pi\|_1\}$ is a standard abstractly defined optimal stopping set, see Peskir and Shiryaev (2006) (we now assume that we are in the case of time-independent coefficients so that the value function is merely a function of $\pi\in D_n$). The concavity of $V$, compare Remark 2.2, yields the existence of a boundary $\gamma\subset D_n$ separating $D$ from its complement $D_n\setminus D$. The following result provides a more accurate location of the boundary $\gamma$.

Corollary 3.2

(Confined stopping boundary). Assume that the coefficients $c$, $\sigma$, and $\lambda$ are constant and that $\operatorname{supp}(\mu)\subseteq[l,r]$, where $0<l<r$. Let $a_l$ and $a_r$ denote the boundaries in the classical Shiryaev problem with disorder magnitude $l$ and $r$, respectively. Then
$$a_l \le \inf\{\|\pi\|_1 : \pi\in\gamma\} \le \sup\{\|\pi\|_1 : \pi\in\gamma\} \le a_r,$$

i.e. the stopping boundary is contained in a strip. Moreover, the optimal strategy $\tau_D$ satisfies
$$1-a_r \le P\big(\tau_D<\Theta \mid \mathcal F_{\tau_D}^Y\big) \le 1-a_l.$$

Acknowledgements

We thank the Associate Editor and an anonymous referee for their suggestions to improve the paper.

References

  • Bain, A. and Crisan, D. (2009). Fundamentals of Stochastic Filtering. Stochastic Modelling and Applied Probability, 60, New York: Springer.
  • Beibel, M. (1997). Sequential Change-Point Detection in Continuous Time When the Post-Change Drift is Unknown, Bernoulli 3: 457–478.
  • Bayraktar, E., Dayanik, S., and Karatzas, I. (2005). The Standard Poisson Disorder Problem Revisited, Stochastic Processes and Their Applications 115: 1437–1450.
  • Bayraktar, E., Dayanik, S., and Karatzas, I. (2006). Adaptive Poisson Disorder Problem, Annals of Applied Probability 16: 1190–1261.
  • Crisan, D. and Rozovskii, B. (2011). The Oxford Handbook of Nonlinear Filtering, Oxford: Oxford University Press.
  • Muravlev, A. and Shiryaev, A. (2014). Two-Sided Disorder Problem for a Brownian Motion in a Bayesian Setting, Proceedings of Steklov Institute of Mathematics 287: 202–224.
  • Peskir, G. and Shiryaev, A. (2002). Solving the Poisson Disorder Problem, in Advances in Finance and Stochastics, pp. 295–312, Berlin: Springer.
  • Peskir, G. and Shiryaev, A. (2006). Optimal Stopping and Free-Boundary Problems, Lectures in Mathematics, ETH Zurich, Basel: Birkhäuser.
  • Shiryaev, A. N. (1967). Two Problems of Sequential Analysis, Cybernetics 3: 63–69.
  • Shiryaev, A. N. (1978). Optimal Stopping Rules, New York: Springer.
  • Zucca, C., Tavella, P., and Peskir, G. (2016). Detecting Atomic Clock Frequency Trends Using an Optimal Stopping Method, Metrologia 53: 89–95.