Applicable Analysis
An International Journal
Volume 99, 2020 - Issue 6

On the second-order asymptotical regularization of linear ill-posed inverse problems

Ye Zhang & Bernd Hofmann
Pages 1000-1025 | Received 06 Jul 2018, Accepted 26 Aug 2018, Published online: 19 Sep 2018

ABSTRACT

In this paper, we establish an initial theory regarding the second-order asymptotical regularization (SOAR) method for the stable approximate solution of ill-posed linear operator equations in Hilbert spaces, which are models for linear inverse problems with applications in the natural sciences, imaging and engineering. We show the regularizing properties of the new method, as well as the corresponding convergence rates. We prove that, under appropriate source conditions and by using Morozov's conventional discrepancy principle, SOAR exhibits the same power-type convergence rate as the classical version of asymptotical regularization (Showalter's method). Moreover, we propose a new total energy discrepancy principle for choosing the terminating time of the dynamical solution from SOAR, which corresponds to the unique root of a monotonically non-increasing function and allows us to also show an order optimal convergence rate for SOAR. A damped symplectic iterative regularizing algorithm is developed for the realization of SOAR. Several numerical examples are given to show the accuracy and the acceleration effect of the proposed method. A comparison with other state-of-the-art methods is provided as well.


1. Introduction

We are interested in solving linear operator equations, (1) $Ax=y$, where $A$ is an injective and compact linear operator acting between two infinite dimensional Hilbert spaces $X$ and $Y$. For simplicity, we denote by $\langle\cdot,\cdot\rangle$ and $\|\cdot\|$ the inner products and norms, respectively, for both Hilbert spaces. Since $A$ is injective, the operator equation (1) has a unique solution $x^\dagger\in X$ for every $y$ from the range $\mathcal{R}(A)$ of the linear operator $A$. In this context, $\mathcal{R}(A)$ is assumed to be an infinite dimensional subspace of $Y$.

Suppose that, instead of the exact right-hand side $y=Ax^\dagger$, we are given noisy data $y^\delta\in Y$ obeying the deterministic noise model $\|y^\delta-y\|\le\delta$ with noise level $\delta>0$. Since $A$ is compact and $\dim\mathcal{R}(A)=\infty$, we have $\mathcal{R}(A)\ne\overline{\mathcal{R}(A)}$ and the problem (1) is ill-posed. Therefore, regularization methods should be employed for obtaining stable approximate solutions.

Loosely speaking, two groups of regularization methods exist: variational regularization methods and iterative regularization methods. Tikhonov regularization is certainly the most prominent variational regularization method (cf., e.g. [1]), while the Landweber iteration is the most famous iterative regularization approach (cf., e.g. [2,3]). In this paper, our focus is on the latter, since from a computational viewpoint the iterative approach seems more attractive, especially for large-scale problems.

For the linear problem (1), the Landweber iteration is defined by (2) $x_{k+1}=x_k+\Delta t\,A^*(y^\delta-Ax_k)$, $\Delta t\in(0,2/\|A\|^2)$, where $A^*$ denotes the adjoint operator of $A$. We refer to [2, Section 6.1] for the regularization property of the Landweber iteration. The continuous analogue of (2) can be considered as a first-order evolution equation in Hilbert spaces, (3) $\dot x(t)+A^*Ax(t)=A^*y^\delta$, if an artificial scalar time $t$ is introduced and $\Delta t\to 0$ in (2). Here and later on, we use Newton's conventions for the time derivatives. The formulation (3) is known as Showalter's method, or asymptotic regularization [4,5]. The regularization property of (3) can be analysed through a proper choice of the terminating time. Moreover, it has been shown that by using Runge–Kutta integrators, all of the properties of asymptotic regularization (3) carry over to its numerical realization [6].
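As a minimal illustration of (2), the following Python sketch runs the Landweber iteration for a finite-dimensional discretization, with a matrix A standing in for the operator; the function name and interface are assumptions of this illustration, not part of the paper.

import numpy as np

def landweber(A, y_delta, x0, dt, n_iter):
    # Landweber iteration (2): x_{k+1} = x_k + dt * A^T (y^delta - A x_k).
    # For convergence the step size must satisfy dt in (0, 2/||A||^2).
    x = x0.copy()
    for _ in range(n_iter):
        x = x + dt * (A.T @ (y_delta - A @ x))
    return x

Stopping the iteration early, e.g. by the discrepancy principles discussed in Section 5, is what turns this scheme into a regularization method.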

From a computational viewpoint, the Landweber iteration, as well as the steepest descent method and the minimal error method, is quite slow. Therefore, in practice, accelerating strategies are usually used; see [3,7] and references therein for details.

Over the last few decades, besides the first-order iterative methods, there has been increasing evidence that discrete second-order iterative methods also enjoy remarkable acceleration properties for ill-posed inverse problems. Well-known methods are the Levenberg–Marquardt method [8], the iteratively regularized Gauss–Newton method [9], the ν-method [2, Section 6.3], and the Nesterov acceleration scheme [10]. Recently, a more general second-order iterative method – the two-point gradient method – has been developed in [11]. In order to better understand the intrinsic properties of discrete second-order iterative regularization, we consider in this paper the continuous version (4) $\ddot x(t)+\eta\dot x(t)+A^*Ax(t)=A^*y^\delta$, $x(0)=x_0$, $\dot x(0)=\dot x_0$, of the second-order iterative method in the form of an evolution equation, where $x_0\in X$ and $\dot x_0\in X$ are the prescribed initial data and $\eta>0$ is a constant damping parameter.

From a physical viewpoint, the system (4) describes the motion of a heavy ball that rolls over the graph of the residual norm square functional $\Phi(x)=\|y^\delta-Ax\|^2$ and keeps rolling under its own inertia until friction stops it at a critical point of $\Phi(x)$. This nonlinear oscillator with damping, which is called the Heavy Ball with Friction (HBF) system, has been considered by several authors from an optimization viewpoint, establishing different convergence results and identifying the circumstances under which the rate of convergence of HBF is better than that of first-order methods; see [12–14]. Numerical algorithms based on (4) for solving some special problems, e.g. inverse source problems in partial differential equations, large systems of linear equations, and the nonlinear Schrödinger problem, can be found in [15–18]. The main goal of this paper is the intrinsic structure analysis of the theory of second-order iterative regularization and the development of new iterative regularization methods based on the framework (4).

The remainder of the paper is structured as follows: in Section 2, we extend the theory of general affine regularization schemes for solving linear ill-posed problems to a more general setting, adapted to the analysis of the second-order model (4). Then, the existence and uniqueness of the second-order flow (4), as well as some of its properties, are discussed in Section 3. Section 4 is devoted to the study of the regularization property of the dynamical solution of (4), while Section 5 presents the results about convergence rates under the assumption of conventional source conditions. In Section 6, based on the Störmer–Verlet method, we develop a novel iterative regularization method for the numerical implementation of the second-order asymptotical regularization (SOAR). Some numerical examples, as well as a comparison with four existing iterative regularization methods, are presented in Section 7. Finally, concluding remarks are given in Section 8.

2. General affine regularization methods

In this section, we consider general affine regularization schemes based on a family of pairs of piecewise continuous functions $\{g_\alpha(\lambda),\varphi_\alpha(\lambda)\}_\alpha$ $(0<\lambda\le\|A\|^2)$ for regularization parameters $\alpha\in(0,\bar\alpha]$. Once a pair of generating functions $\{g_\alpha(\lambda),\varphi_\alpha(\lambda)\}$ is chosen, the approximate solution to (1) can be given by the procedure (5) $x_\alpha^\delta=(I-A^*Ag_\alpha(A^*A))x_0+\varphi_\alpha(A^*A)\dot x_0+g_\alpha(A^*A)A^*y^\delta$.
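Procedure (5) can be realized spectrally. The sketch below, under the assumption of a finite-dimensional matrix A with accessible SVD, applies a given filter pair (g, phi) to the data; the function name and interface are illustrative assumptions only.

import numpy as np

def affine_regularized_solution(A, y_delta, x0, xdot0, g, phi):
    # Procedure (5) via the SVD A = U diag(s) V^T, so that functions of
    # A^*A act as multipliers on lam = s**2 in the right singular basis.
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    lam = s**2
    x = x0 - Vt.T @ (lam * g(lam) * (Vt @ x0))    # (I - A^*A g(A^*A)) x0
    x += Vt.T @ (phi(lam) * (Vt @ xdot0))         # phi(A^*A) xdot0
    x += Vt.T @ (g(lam) * s * (U.T @ y_delta))    # g(A^*A) A^* y^delta
    return x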

Remark 2.1

The affine regularization procedure defined by formula (5) is designed in particular for the second-order evolution equation (4). If one sets $(x_0,\dot x_0)=(0,0)$, the proposed regularization method coincides with the classical linear regularization scheme for general linear ill-posed problems; see, e.g. [19]. However, as the numerical experiments in Section 7 will show, the initial data influence the behaviour of the regularized solutions obtained by (4). By finding an appropriate choice of the triple $(x_0,\dot x_0,\eta)$, the second-order analogue of asymptotical regularization yields an accelerated procedure with approximate solutions of higher accuracy.

To evaluate the regularization error $e(x^\dagger,\alpha,\delta):=\|x_\alpha^\delta-x^\dagger\|$ for the procedure (5) in combination with the noise-free intermediate quantity (6) $x_\alpha:=(I-A^*Ag_\alpha(A^*A))x_0+\varphi_\alpha(A^*A)\dot x_0+g_\alpha(A^*A)A^*Ax^\dagger$, where evidently $e(x^\dagger,\alpha,\delta)\le\|x_\alpha-x^\dagger\|+\|x_\alpha^\delta-x_\alpha\|$, we introduce the concepts of index functions and profile functions from [19,20], as follows:

Definition 2.1

A real function $\phi:(0,\infty)\to(0,\infty)$ is called an index function if it is continuous, strictly increasing and satisfies the condition $\lim_{\lambda\to 0^+}\phi(\lambda)=0$.

Definition 2.2

An index function $f$, for which $\|x_\alpha-x^\dagger\|\le f(\alpha)$ $(\alpha\in(0,\bar\alpha])$ holds, is called a profile function to $x^\dagger$ under the assumptions stated above.

Having a profile function $f$ estimating the noise-free error $\|x_\alpha-x^\dagger\|$ and taking into account that $\delta\|g_\alpha(A^*A)A^*\|$ is an upper bound for the noise-propagation error $\|x_\alpha^\delta-x_\alpha\|$, which is independent of $x^\dagger$, we can estimate the total regularization error $e(x^\dagger,\alpha,\delta)$ as (7) $e(x^\dagger,\alpha,\delta)\le f(\alpha)+\delta\|g_\alpha(A^*A)A^*\|$ for all $\alpha\in(0,\bar\alpha]$. If we denote by (8) $r_\alpha(\lambda)=1-\lambda g_\alpha(\lambda)$, $\lambda\in(0,\|A\|^2]$, the bias function related to the major part $g_\alpha$ of the regularization method (5), then $f(\alpha)=\|r_\alpha(A^*A)(x_0-x^\dagger)\|+\|\varphi_\alpha(A^*A)\dot x_0\|$ is evidently a profile function to $x^\dagger$ and we have (9) $e(x^\dagger,\alpha,\delta)\le\|r_\alpha(A^*A)(x_0-x^\dagger)\|+\|\varphi_\alpha(A^*A)\dot x_0\|+\delta\|g_\alpha(A^*A)A^*\|$.

Proposition 2.1

Assume that the pairs of functions $\{g_\alpha(\lambda),\varphi_\alpha(\lambda)\}_\alpha$ are piecewise continuous in $\alpha$ and satisfy the following three conditions:

  1. For any fixed $\lambda\in(0,\|A\|^2]$: $\lim_{\alpha\to 0}r_\alpha(\lambda)=0$ and $\lim_{\alpha\to 0}\varphi_\alpha(\lambda)=0$.

  2. Two constants $\gamma_1$ and $\gamma_2$ exist such that $|r_\alpha(\lambda)|\le\gamma_1$ and $|\varphi_\alpha(\lambda)|\le\gamma_2$ hold for all $\alpha\in(0,\bar\alpha]$ and $\lambda\in(0,\|A\|^2]$.

  3. A constant $\gamma_*$ exists such that $\sqrt{\lambda}\,|g_\alpha(\lambda)|\le\gamma_*/\sqrt{\alpha}$ for all $\alpha\in(0,\bar\alpha]$ and $\lambda\in(0,\|A\|^2]$.

Then, if the regularization parameter $\alpha=\alpha(\delta,y^\delta)$ is chosen so that $\lim_{\delta\to 0}\alpha=\lim_{\delta\to 0}\delta^2/\alpha=0$, the approximate solution in (5) converges to the exact solution $x^\dagger$ as $\delta\to 0$.

Proof.

From the properties (i) and (ii) of Proposition 2.1 we deduce for $\alpha\to 0$ the point-wise convergences $\|r_\alpha(A^*A)x_1\|\to 0$ and $\|\varphi_\alpha(A^*A)x_2\|\to 0$ for any $x_1,x_2\in X$ (see, e.g. [2, Theorem 4.1]). Therefore, by the estimate (9) we can derive that $e(x^\dagger,\alpha,\delta)\le\|r_\alpha(A^*A)(x_0-x^\dagger)\|+\|\varphi_\alpha(A^*A)\dot x_0\|+\gamma_*\delta/\sqrt{\alpha}\to 0$ as $\delta\to 0$.

Proposition 2.1 motivates us to call the procedure (5) a regularization method for the linear inverse problem (1) if the pair of functions $\{g_\alpha(\lambda),\varphi_\alpha(\lambda)\}_\alpha$ satisfies the three requirements (i), (ii) and (iii).

Example 2.1

For the Landweber iteration (2) with step size $\Delta t\in(0,2/\|A\|^2)$, we have $\varphi_\alpha(\lambda)=0$ and $g_\alpha(\lambda)=(1-(1-\Delta t\lambda)^{1/\alpha})/\lambda$. It is not difficult to show, see, e.g. [19,21], that $g_\alpha(\lambda)$ yields a regularization method in the sense of Proposition 2.1 with constants $\gamma_2=0$, $\gamma_1=1$ and $\gamma_*=\sqrt{2\Delta t}$.

Consider the continuous version of the Landweber iteration (3), i.e. Showalter's method. It is not difficult to show that $\varphi_\alpha(\lambda)=0$ and $g_\alpha(\lambda)=(1-e^{-\lambda/\alpha})/\lambda$, and hence $r_\alpha(\lambda)=e^{-\lambda/\alpha}$. Obviously, $g_\alpha(\lambda)$ yields a regularization method with $\gamma_2=0$, $\gamma_1=1$ and $\gamma_*=\theta$ by noting that $\sup_{\lambda\in\mathbb{R}^+}\sqrt{\lambda}\,g_\alpha(\lambda)=\theta/\sqrt{\alpha}$ with $\theta=\sup_{\lambda\in\mathbb{R}^+}(1-e^{-\lambda})/\sqrt{\lambda}\approx 0.6382$ [5].
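For instance, Showalter's generator functions from this example can be plugged into the spectral sketch given after (5); the concrete value of alpha below is arbitrary and purely illustrative.

import numpy as np

alpha = 1e-3
g = lambda lam: (1.0 - np.exp(-lam / alpha)) / lam   # g_alpha for Showalter
phi = lambda lam: np.zeros_like(lam)                 # phi_alpha = 0
# x_alpha = affine_regularized_solution(A, y_delta, x0, xdot0, g, phi)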

Note that the three requirements (i)–(iii) in Proposition 2.1 are not enough to ensure rates of convergence for the regularized solutions. More precisely, for rates in the case of ill-posed problems, additional smoothness assumptions on $x^\dagger$, in correspondence with the forward operator $A$ and the regularization method under consideration, have to be fulfilled. This allows us to verify the specific profile functions $f(\alpha)$ in formula (7) that are specified for our second-order method in formula (9). Once a profile function $f$ is given, together with property (iii) in Proposition 2.1, we obtain from the estimate (7) that (10) $e(x^\dagger,\alpha,\delta)\le f(\alpha)+\gamma_*\delta/\sqrt{\alpha}$ for all $\alpha\in(0,\bar\alpha]$. Moreover, if we consider the auxiliary index function (11) $\Theta(\alpha):=\sqrt{\alpha}\,f(\alpha)$ and choose the regularization parameter a priori as $\alpha_*=\Theta^{-1}(\delta)$, then we can easily see that (12) $e(x^\dagger,\alpha_*,\delta)\le(1+\gamma_*)f(\Theta^{-1}(\delta))$. Hence, the convergence rate $f(\Theta^{-1}(\delta))$ of the total regularization error as $\delta\to 0$ depends on the profile function $f$ only; but for our suggested approach, $f$ is a function of $x^\dagger$, $g_\alpha$, $\varphi_\alpha$, $x_0$, $\dot x_0$, $A$ and of the damping parameter $\eta$.

In order to specify the profile function $f$ in detail, it is of interest to consider how sensitive the regularization method is with respect to a priori smoothness assumptions. In this context, the concept of qualification can be exploited for answering this question: the higher its qualification, the more the method is capable of reacting to smoothness assumptions. Expressing the qualification by means of index functions $\psi$, the traditional concept of qualifications with monomials $\psi(\lambda)=\lambda^\kappa$ for $\kappa>0$ from [5] (see also [2]) has been extended in [19,20] to general index functions $\psi$. We adapt this generalized concept to our class of methods (5) in the following definition.

Definition 2.3

Let $\psi$ be an index function. A regularization method (5) for the linear operator equation (1), generated by the pair $\{g_\alpha(\lambda),\varphi_\alpha(\lambda)\}$ $(0<\lambda\le\|A\|^2)$, is said to have the qualification $\psi$ with constant $\gamma>0$ if both inequalities (13) $\sup_{\lambda\in(0,\|A\|^2]}|r_\alpha(\lambda)|\,\psi(\lambda)\le\gamma\,\psi(\alpha)$ and $\sup_{\lambda\in(0,\|A\|^2]}|\varphi_\alpha(\lambda)|\,\psi(\lambda)\le\gamma\,\psi(\alpha)$ are satisfied for all $0<\alpha\le\|A\|^2$.

Remark 2.2

Since the bias function of Showalter's method equals $r_\alpha(\lambda)=e^{-\lambda/\alpha}$ and $\varphi_\alpha(\lambda)=0$, set $\xi=\lambda$ in the following identity: (14) $\sup_{0\le\xi<\infty}e^{-\xi/\alpha}\xi^p=(p/e)^p\alpha^p$ to conclude that for all exponents $p>0$ the monomials $\psi(\lambda)=\lambda^p$ are qualifications for Showalter's method. We will show that an analogous result also holds for the SOAR method – see Proposition 4.1 – and will apply this fact to obtain the associated convergence rates.

3. Properties of the second-order flow

We first prove the existence and uniqueness of strong global solutions of the second-order equation (4). Then, we study the long-term behaviour of the dynamical solution $x(t)$ of (4) and of the residual norm functional $\|Ax(t)-y^\delta\|$.

Definition 3.1

$x:[0,+\infty)\to X$ is a strong global solution of (4) with initial data $(x_0,\dot x_0)$ if $x(0)=x_0\in X$, $\dot x(0)=\dot x_0\in X$, and

  • $x(\cdot),\dot x(\cdot):[0,+\infty)\to X$ are locally absolutely continuous [22],

  • $\ddot x(t)+\eta\dot x(t)+A^*Ax(t)=A^*y^\delta$ holds for almost every $t\in[0,+\infty)$.

Theorem 3.1

For any pair $(x_0,\dot x_0)\in X\times X$ there exists a unique strong global solution of the second-order dynamical system (4).

Proof.

Denote $z=(x,\dot x)^T$ and rewrite (4) as a first-order differential equation (15) $\dot z(t)=Bz(t)+d$, where $B=\begin{pmatrix}0 & I\\ -A^*A & -\eta I\end{pmatrix}$, $d=(0,\,A^*y^\delta)^T$ and $I$ denotes the identity operator in $X$. Since $A$ is a bounded linear operator, both $A^*$ and $B$ are also bounded linear operators. Hence, by the Cauchy–Lipschitz–Picard theorem, the first-order autonomous system (15) has a unique global solution for the given initial data $(x_0,\dot x_0)$.

Now, we investigate the long-term behaviour of the dynamical solution and of the residual norm functional. These properties will be used for the study of convergence rates in Section 5.

Lemma 3.1

Let $x(t)$ be the solution of (4). Then $\dot x(\cdot)\in L^2([0,\infty),X)$ and $\dot x(t)\to 0$ as $t\to\infty$. Moreover, we have the following two limit relations: (16) $\lim_{t\to\infty}\|Ax(t)-y^\delta\|\le\delta$ and (17) $\lim_{t\to\infty}\left(\|Ax(t)-y^\delta\|^2+\|\dot x(t)\|^2\right)=\inf_{x\in X}\|Ax-y^\delta\|^2$.

The proof of the above lemma uses the idea given in [14], and can be found in Appendix A.1. If $y^\delta$ does not belong to the domain $\mathcal{D}(A^\dagger)$ of the Moore–Penrose inverse $A^\dagger$ of $A$, it is not difficult to show that there is a 'blow-up' of the solution $x(t)$ of the dynamical system (4) in the sense that $\|x(t)\|\to\infty$ as $t\to\infty$. Conversely, for $y^\delta\in\mathcal{D}(A^\dagger)$, i.e. if the noisy data $y^\delta$ satisfy the Picard condition, one can show further assertions concerning the long-term behaviour of the solution to the evolution equation (4), and we refer to Lemma 3.2 for results in the case of noise-free data $y=Ax^\dagger$. In this work, for the inverse problem with noisy data, we are first and foremost interested in the case that $y^\delta\notin\mathcal{D}(A^\dagger)$ may occur, since the set $\mathcal{D}(A^\dagger)$ is of the first category and the chance of meeting such an element is negligible.

At the end of this section, we show some properties of the solution $x(t)$ of (4) with noise-free data.

Lemma 3.2

Let $x(t)$ be the solution of (4) with the exact right-hand side $y$ as data. Then, in the case $\eta\ge\|A\|$, we have

  1. $x(\cdot)\in L^\infty([0,\infty),X)$.

  2. $\dot x(\cdot)\in L^\infty([0,\infty),X)\cap L^2([0,\infty),X)$ and $\dot x(t)\to 0$ as $t\to\infty$.

  3. $\ddot x(\cdot)\in L^\infty([0,\infty),X)\cap L^2([0,\infty),X)$ and $\ddot x(t)\to 0$ as $t\to\infty$.

  4. $\|Ax(t)-y\|=o(t^{-1/2})$ as $t\to\infty$.

The proof of Lemma 3.2 follows as a special case for $f(x)=\frac12\|Ax-y\|^2$ from [22], and it is given in Appendix A.2. The rate $\|Ax(t)-y\|=o(t^{-1/2})$ as $t\to\infty$ given in Lemma 3.2 for the second-order evolution equation (4) should be compared with the corresponding result for first-order methods, i.e. the gradient descent methods, where one only obtains $\|Ax(t)-y\|=O(t^{-1/2})$ as $t\to\infty$. If we consider a discrete iterative method with iteration number $k$, assertion (iv) in Lemma 3.2 indicates that, in comparison with gradient descent methods, the second-order method (4) needs the same computational complexity for the number $k$ of iterations, but can achieve the higher order $o(k^{-1/2})$ of accuracy for the objective functional as $k\to\infty$.

4. Convergence analysis for noisy data

This section is devoted to the verification of the pair $\{g_\alpha(\lambda),\varphi_\alpha(\lambda)\}_\alpha$ of generator functions occurring in formula (6) associated with the second-order problem (4) with the inexact right-hand side $y^\delta$, and of the corresponding regularization properties.

Let $\{\sigma_j;u_j,v_j\}_{j=1}^\infty$ be the well-defined singular system of the compact and injective linear operator $A$, i.e. we have $Au_j=\sigma_jv_j$ and $A^*v_j=\sigma_ju_j$ with ordered singular values $\|A\|=\sigma_1\ge\sigma_2\ge\cdots\ge\sigma_j\ge\sigma_{j+1}\ge\cdots\to 0$ as $j\to\infty$. Since the eigenelements $\{u_j\}_{j=1}^\infty$ and $\{v_j\}_{j=1}^\infty$ form complete orthonormal systems in $X$ and $Y$, respectively, the equation in (4) is equivalent to (18) $\langle\ddot x(t),u_j\rangle+\eta\langle\dot x(t),u_j\rangle+\sigma_j^2\langle x(t),u_j\rangle=\sigma_j\langle y^\delta,v_j\rangle$, $j=1,2,\dots$. Using the decomposition $x(t)=\sum_j\xi_j(t)u_j$ with respect to the basis $\{u_j\}_{j=1}^\infty$ in $X$, we obtain (19) $\ddot\xi_j(t)+\eta\dot\xi_j(t)+\sigma_j^2\xi_j(t)=\sigma_j\langle y^\delta,v_j\rangle$, $j=1,2,\dots$.

In order to solve the above differential equations, we have to distinguish three different cases: (a) the overdamped case: $\eta>2\|A\|$; (b) the underdamped case: there is an index $j_0$ such that $2\sigma_{j_0+1}<\eta<2\sigma_{j_0}$; and (c) the critical damping case: an index $j_0$ exists such that $\eta=2\sigma_{j_0}$. In this section, we discuss for simplicity the overdamped case only. The other two cases are studied similarly, and the corresponding details can be found in Appendix 2. We remark that all results obtained in the overdamped case also hold for the other two cases, but with different values of the positive constants $\gamma_1,\gamma_2,\gamma_*$ in Proposition 2.1 and $\gamma$ in Definition 2.3.

In the overdamped case, the homogeneous counterpart $\ddot\xi_j(t)+\eta\dot\xi_j(t)+\sigma_j^2\xi_j(t)=0$ of (19) has two independent solutions $\xi_j^1=e^{-\eta t/2}e^{\omega_jt/2}$ and $\xi_j^2=e^{-\eta t/2}e^{-\omega_jt/2}$ for all $j=1,2,\dots$, where $\omega_j=\sqrt{\eta^2-4\sigma_j^2}>0$. Hence, it is not difficult to show that the general solution to (19) in the overdamped case is (20) $\xi_j(t)=c_j^1e^{-(\eta-\omega_j)t/2}+c_j^2e^{-(\eta+\omega_j)t/2}+\sigma_j^{-1}\langle y^\delta,v_j\rangle$, $j=1,2,\dots$.

Introducing the initial conditions in (4) yields a system for $\{c_j^1,c_j^2\}$: (21) $\sum_j\left(c_j^1+c_j^2+\sigma_j^{-1}\langle y^\delta,v_j\rangle\right)u_j=x_0$, $\sum_j\left(-\frac{\eta-\omega_j}{2}c_j^1-\frac{\eta+\omega_j}{2}c_j^2\right)u_j=\dot x_0$, or equivalently, with the decompositions $x_0=\sum_j\langle x_0,u_j\rangle u_j$ and $\dot x_0=\sum_j\langle\dot x_0,u_j\rangle u_j$, (22) $c_j^1+c_j^2+\sigma_j^{-1}\langle y^\delta,v_j\rangle=\langle x_0,u_j\rangle$, $-\frac{\eta-\omega_j}{2}c_j^1-\frac{\eta+\omega_j}{2}c_j^2=\langle\dot x_0,u_j\rangle$, which gives (23) $c_j^1=\frac{\eta+\omega_j}{2\omega_j}\langle x_0,u_j\rangle+\frac{1}{\omega_j}\langle\dot x_0,u_j\rangle-\frac{\eta+\omega_j}{2\omega_j\sigma_j}\langle y^\delta,v_j\rangle$, $c_j^2=-\frac{\eta-\omega_j}{2\omega_j}\langle x_0,u_j\rangle-\frac{1}{\omega_j}\langle\dot x_0,u_j\rangle+\frac{\eta-\omega_j}{2\omega_j\sigma_j}\langle y^\delta,v_j\rangle$, $j=1,2,\dots$. By a combination of (23), the definition of $\omega_j$ and the decomposition of $x(t)$, we obtain
$x(t)=\sum_j\left[\frac{\eta+\omega_j}{2\omega_j}e^{-\frac{\eta-\omega_j}{2}t}-\frac{\eta-\omega_j}{2\omega_j}e^{-\frac{\eta+\omega_j}{2}t}\right]\langle x_0,u_j\rangle u_j+\sum_j\frac{1}{\omega_j}\left[e^{-\frac{\eta-\omega_j}{2}t}-e^{-\frac{\eta+\omega_j}{2}t}\right]\langle\dot x_0,u_j\rangle u_j+\sum_j\frac{1}{\sigma_j}\left[1-\frac{\eta+\omega_j}{2\omega_j}e^{-\frac{\eta-\omega_j}{2}t}+\frac{\eta-\omega_j}{2\omega_j}e^{-\frac{\eta+\omega_j}{2}t}\right]\langle y^\delta,v_j\rangle u_j=:(I-A^*Ag(t,A^*A))x_0+\varphi(t,A^*A)\dot x_0+g(t,A^*A)A^*y^\delta,$
where (24) $g(t,\lambda)=\frac{1}{\lambda}\left[1-\frac{\eta+\sqrt{\eta^2-4\lambda}}{2\sqrt{\eta^2-4\lambda}}e^{-\frac{\eta-\sqrt{\eta^2-4\lambda}}{2}t}+\frac{\eta-\sqrt{\eta^2-4\lambda}}{2\sqrt{\eta^2-4\lambda}}e^{-\frac{\eta+\sqrt{\eta^2-4\lambda}}{2}t}\right]$ and $\varphi(t,\lambda)=\frac{1}{\sqrt{\eta^2-4\lambda}}\left[e^{-\frac{\eta-\sqrt{\eta^2-4\lambda}}{2}t}-e^{-\frac{\eta+\sqrt{\eta^2-4\lambda}}{2}t}\right]$. We find the form required for the generator functions in formula (6) if we set (25) $g_\alpha(\lambda):=g(1/\alpha,\lambda)$ and $\varphi_\alpha(\lambda):=\varphi(1/\alpha,\lambda)$. Then the corresponding bias function $r_\alpha(\lambda)=1-\lambda g(1/\alpha,\lambda)$ is (26) $r_\alpha(\lambda)=\frac{\eta+\sqrt{\eta^2-4\lambda}}{2\sqrt{\eta^2-4\lambda}}e^{-\frac{\eta-\sqrt{\eta^2-4\lambda}}{2\alpha}}-\frac{\eta-\sqrt{\eta^2-4\lambda}}{2\sqrt{\eta^2-4\lambda}}e^{-\frac{\eta+\sqrt{\eta^2-4\lambda}}{2\alpha}}$.
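The closed form can be checked numerically. The following sketch compares (20) with the coefficients (23), for one spectral component with hypothetical values of eta, sigma and the data coefficient b, against a standard ODE integrator.

import numpy as np
from scipy.integrate import solve_ivp

eta, sigma, b = 3.0, 0.5, 1.0            # overdamped: eta > 2*sigma
xi0, v0 = 2.0, -1.0                      # <x_0,u_j> and <xdot_0,u_j>
omega = np.sqrt(eta**2 - 4.0 * sigma**2)

# coefficients c_j^1, c_j^2 from (23), with <y^delta,v_j> = b
c1 = (eta + omega)/(2*omega)*xi0 + v0/omega - (eta + omega)/(2*omega*sigma)*b
c2 = -(eta - omega)/(2*omega)*xi0 - v0/omega + (eta - omega)/(2*omega*sigma)*b

def xi_closed(t):
    # general solution (20) of the scalar equation (19)
    return c1*np.exp(-(eta - omega)*t/2) + c2*np.exp(-(eta + omega)*t/2) + b/sigma

sol = solve_ivp(lambda t, z: [z[1], sigma*b - eta*z[1] - sigma**2*z[0]],
                (0.0, 10.0), [xi0, v0], rtol=1e-10, atol=1e-12)
print(np.max(np.abs(sol.y[0] - xi_closed(sol.t))))   # close to solver tolerance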

Theorem 4.1

The functions $\{g_\alpha(\lambda),\varphi_\alpha(\lambda)\}_\alpha$ in (25), based on (4), satisfy the conditions (i)–(iii) of Proposition 2.1, which means that we consequently have a regularization method given by the procedure (5) for the linear inverse problem (1).

Proof.

We check all three requirements of Proposition 2.1. The first condition obviously holds for $\varphi_\alpha(\lambda)$ and $r_\alpha(\lambda)$, defined in (25) and (26), respectively.

The second condition can be obtained by using (27) $|r_\alpha(\lambda)|\le\frac{\eta+\sqrt{\eta^2-4\lambda}}{2\sqrt{\eta^2-4\lambda}}=\frac{\eta}{2\sqrt{\eta^2-4\lambda}}+\frac12\le\gamma_1:=\frac{\eta}{2\sqrt{\eta^2-4\|A\|^2}}+\frac12$ and $|\varphi_\alpha(\lambda)|\le\frac{1}{\sqrt{\eta^2-4\lambda}}e^{-\frac{\eta-\sqrt{\eta^2-4\lambda}}{2\alpha}}\le\gamma_2:=\frac{1}{\sqrt{\eta^2-4\|A\|^2}}$. It remains to bound $\gamma_*$ in Proposition 2.1. By the inequality $1-e^{-at}\le\sqrt{at}$ for $a>0$, we obtain $1-e^{-\frac{\eta-\sqrt{\eta^2-4\lambda}}{2\alpha}}\le\sqrt{\frac{\eta-\sqrt{\eta^2-4\lambda}}{2\alpha}}$, which implies that
$\sqrt{\lambda}\,|g_\alpha(\lambda)|=\frac{1}{\sqrt{\lambda}}\left[\frac{\eta+\sqrt{\eta^2-4\lambda}}{2\sqrt{\eta^2-4\lambda}}\left(1-e^{-\frac{\eta-\sqrt{\eta^2-4\lambda}}{2\alpha}}\right)-\frac{\eta-\sqrt{\eta^2-4\lambda}}{2\sqrt{\eta^2-4\lambda}}\left(1-e^{-\frac{\eta+\sqrt{\eta^2-4\lambda}}{2\alpha}}\right)\right]\le\frac{1}{\sqrt{\lambda}}\,\frac{\eta+\sqrt{\eta^2-4\lambda}}{2\sqrt{\eta^2-4\lambda}}\sqrt{\frac{\eta-\sqrt{\eta^2-4\lambda}}{2}}\,\frac{1}{\sqrt{\alpha}}=\frac{1}{\sqrt{\eta^2-4\lambda}}\sqrt{\frac{\eta+\sqrt{\eta^2-4\lambda}}{2}}\,\frac{1}{\sqrt{\alpha}}\le\sqrt{\frac{\eta}{\eta^2-4\|A\|^2}}\,\frac{1}{\sqrt{\alpha}},$
where we used $\frac{\eta-\sqrt{\eta^2-4\lambda}}{2}=\frac{2\lambda}{\eta+\sqrt{\eta^2-4\lambda}}$. Therefore, the third requirement in Proposition 2.1 holds for $g_\alpha(\lambda)$ with (28) $\gamma_*=\sqrt{\eta/(\eta^2-4\|A\|^2)}$. Finally, by the proof above, we see that the upper bound $\bar\alpha$ for the affine regularization method with $\{g_\alpha(\lambda),\varphi_\alpha(\lambda)\}_\alpha$ can be selected arbitrarily.

Proposition 4.1

For all exponents $p>0$ the monomials $\psi(\lambda)=\lambda^p$ are qualifications with the constant (29) $\gamma=\left(\frac{p\eta}{e}\right)^p\left(\frac{\eta}{2\sqrt{\eta^2-4\|A\|^2}}+\frac12\right)$ for the SOAR method in the overdamped case.

Proof.

Set $\xi=(\eta-\sqrt{\eta^2-4\lambda})/2$ in (14), use the relation $\frac{\eta-\sqrt{\eta^2-4\lambda}}{2}=\frac{2\lambda}{\eta+\sqrt{\eta^2-4\lambda}}\ge\frac{\lambda}{\eta}$ and the inequality (27); then we can derive that
$\sup_{\lambda\in(0,\|A\|^2]}|r_\alpha(\lambda)|\psi(\lambda)\le\gamma_1\sup_{\lambda\in(0,\|A\|^2]}e^{-\frac{\eta-\sqrt{\eta^2-4\lambda}}{2}\frac{1}{\alpha}}\lambda^p\le\gamma_1\eta^p\sup_{\lambda\in(0,\|A\|^2]}e^{-\frac{\eta-\sqrt{\eta^2-4\lambda}}{2}\frac{1}{\alpha}}\left(\frac{\eta-\sqrt{\eta^2-4\lambda}}{2}\right)^p\le\gamma_1\eta^p\sup_{\xi\in(0,(\eta-\sqrt{\eta^2-4\|A\|^2})/2]}e^{-\xi/\alpha}\xi^p\le\gamma_1\eta^p\left(\frac{p}{e}\right)^p\alpha^p=\gamma\alpha^p.$
Similarly, we have $\sup_{\lambda\in(0,\|A\|^2]}|\varphi_\alpha(\lambda)|\psi(\lambda)\le\gamma_2\sup_{\lambda\in(0,\|A\|^2]}e^{-\frac{\eta-\sqrt{\eta^2-4\lambda}}{2}\frac{1}{\alpha}}\lambda^p\le\gamma_2\eta^p\left(\frac{p}{e}\right)^p\alpha^p\le\gamma\alpha^p$, which completes the proof.

The assertion of Theorem 4.1 and analogues of Proposition 4.1 can be found in Appendix 2 for the other values of the constant $\eta>0$ occurring as a parameter in the second-order differential equation of problem (4). In particular, this means the underdamped case (b), as well as the critical damping case (c).

5. Convergence rate results

Under the general assumptions of the previous sections, the rate of convergence of $\|x(T)-x^\dagger\|$ as $T\to\infty$ in the case of precise data, and of $\|x(T(\delta))-x^\dagger\|$ as $\delta\to 0$ in the case of noisy data, can be arbitrarily slow (cf. [23]) for solutions $x^\dagger$ which are not smooth enough. In order to prove convergence rates, some kind of smoothness assumptions imposed on the exact solution must be employed. Such smoothness assumptions can be expressed by range-type source conditions (cf., e.g. [2]), approximate source conditions (cf. [24]), and variational source conditions occurring in the form of variational inequalities (cf. [25]). Range-type source conditions have the advantage that, in many cases, interpretations in the form of differentiability of the exact solution, boundary conditions, or similar properties are accessible. Hence, we focus in the following on the traditional range-type source conditions only. More precisely, we assume that an element $v_0\in X$ and numbers $p>0$ and $\rho\ge 0$ exist such that (30) $x_0-x^\dagger=(A^*A)^pv_0$ with $\|v_0\|\le\rho$. Moreover, the initial data $\dot x_0$ are supposed to satisfy such a source condition as well, i.e. (31) $\dot x_0=(A^*A)^pv_1$ with $\|v_1\|\le\rho$. For the choice $\dot x_0=0$, the condition (31) is trivially satisfied. However, following the discussions in Sections 2 and 6, the regularized solutions essentially depend on the value of $\dot x_0$. A good choice of $\dot x_0$ provides an acceleration of the regularization algorithm. In practice, one can choose a relatively small value of $\|\dot x_0\|$ to balance the source condition and the acceleration effect.

Proposition 5.1

Under the source conditions (30) and (31), $f(\alpha)=2\gamma\rho\alpha^p$ is a profile function for the SOAR method, where the constant $\gamma$ is defined in (29).

Proof.

Combining the formulas (9), (30) and (31) yields $\|x(1/\alpha)-x^\dagger\|\le\|r_\alpha(A^*A)(x_0-x^\dagger)\|+\|\varphi_\alpha(A^*A)\dot x_0\|\le\|r_\alpha(A^*A)(A^*A)^pv_0\|+\|\varphi_\alpha(A^*A)(A^*A)^pv_1\|\le\rho\sup_{0<\lambda\le\|A\|^2}|r_\alpha(\lambda)|\lambda^p+\rho\sup_{0<\lambda\le\|A\|^2}|\varphi_\alpha(\lambda)|\lambda^p\le 2\gamma\rho\alpha^p$. This proves the proposition.

Theorem 5.1

A priori choice of the regularization parameter

If the terminating time $T_*$ of the second-order flow (4) is selected by the a priori parameter choice (32) $T_*(\delta)=c_0\rho^{2/(2p+1)}\delta^{-2/(2p+1)}$ with the constant $c_0=(2\gamma)^{2/(2p+1)}$, then we have, for $\delta\in(0,\delta_0]$, the error estimate (33) $\|x(T_*)-x^\dagger\|\le c_*\rho^{1/(2p+1)}\delta^{2p/(2p+1)}$, where the constant $c_*=(1+\gamma_*)(2\gamma)^{1/(2p+1)}$ and $\delta_0=2\gamma\rho\eta^{2p+1}$.

Proof.

By the discussion in Section 2, we choose the value of $\alpha$ such that $\delta=\Theta(\alpha)=\sqrt{\alpha}f(\alpha)=2\gamma\rho\alpha^{p+1/2}$. Solving this equation, we directly obtain $\alpha_*=(2\gamma)^{-2/(2p+1)}\rho^{-2/(2p+1)}\delta^{2/(2p+1)}$. Setting $T_*=1/\alpha_*$ and using the estimate (12), this gives the relation (32) and $\|x(T_*)-x^\dagger\|=e(x^\dagger,\alpha_*,\delta)\le(1+\gamma_*)f(\alpha_*)=(1+\gamma_*)2\gamma\rho(T_*)^{-p}=\left\{(1+\gamma_*)(2\gamma)^{1/(2p+1)}\right\}\rho^{1/(2p+1)}\delta^{2p/(2p+1)}$. Finally, we use the inequality $\alpha_*\le\bar\alpha=\eta^2$ to get the bound $\delta_0$ (the upper bound $\bar\alpha=\eta^2$ is required for the affine regularization (5) in both the underdamped and the critical damping cases; see the appendix for details).

In practice, the stopping rule (32) is not realistic, since a good terminating time point $T_*$ requires knowledge of $\rho$ (a characteristic of the unknown exact solution). Such knowledge, however, is not necessary in the case of a posteriori parameter choices. In the following two subsections, we consider two types of discrepancy principles for choosing the terminating time point a posteriori.

5.1. Morozov's conventional discrepancy principle

In our setting, Morozov's conventional discrepancy principle means searching for values $T>0$ satisfying the equation (34) $\chi(T):=\|Ax(T)-y^\delta\|-\tau\delta=0$, where $\tau>\gamma_1\ge 1$ is a constant, and the number $\gamma_1$ is defined in Proposition 2.1.

Lemma 5.1

If $\|Ax_0-y^\delta\|>\tau\delta$, then the function $\chi(T)$ has at least one root.

Proof.

The continuity of $\chi(T)$ is obvious according to Theorem 3.1. On the other hand, from (16) and the assumption of the lemma, we conclude that $\lim_{T\to\infty}\chi(T)\le(1-\tau)\delta<0$ and $\chi(0)=\|Ax_0-y^\delta\|-\tau\delta>0$, which implies the existence of a root of $\chi(T)$.

Theorem 5.2

A posteriori choice I of the regularization parameter

Suppose that $\|Ax_0-y^\delta\|>\tau\delta$ and that the source conditions (30) and (31) hold. If the terminating time $T_*$ of the second-order flow (4) is chosen according to the discrepancy principle (34), we have for any $\delta\in(0,\delta_0]$ and $p>0$ the error estimates (35) $T_*\le C_0\rho^{2/(2p+1)}\delta^{-2/(2p+1)}$ and (36) $\|x(T_*)-x^\dagger\|\le C_1\rho^{1/(2p+1)}\delta^{2p/(2p+1)}$, where $\delta_0$ is defined in Theorem 5.1, $C_0:=(\tau-\gamma_1)^{-2/(2p+1)}(2\gamma)^{2/(2p+1)}$, and $C_1:=(\tau+\gamma_1)^{2p/(2p+1)}(\gamma_1+\gamma_2)^{1/(2p+1)}+\gamma_*(\tau-\gamma_1)^{-1/(2p+1)}(2\gamma)^{1/(2p+1)}$.

Proof.

Using the moment inequality $\|B^pu\|\le\|B^qu\|^{p/q}\|u\|^{1-p/q}$ ($0<p\le q$) and the source conditions (30)–(31), we deduce that (37)
$\|r_\alpha(A^*A)(x_0-x^\dagger)+\varphi_\alpha(A^*A)\dot x_0\|=\|(A^*A)^p\left(r_\alpha(A^*A)v_0+\varphi_\alpha(A^*A)v_1\right)\|\le\|(A^*A)^{p+1/2}\left(r_\alpha(A^*A)v_0+\varphi_\alpha(A^*A)v_1\right)\|^{2p/(2p+1)}\,\|r_\alpha(A^*A)v_0+\varphi_\alpha(A^*A)v_1\|^{1/(2p+1)}=\|Ar_\alpha(A^*A)(x_0-x^\dagger)+A\varphi_\alpha(A^*A)\dot x_0\|^{2p/(2p+1)}\,\|r_\alpha(A^*A)v_0+\varphi_\alpha(A^*A)v_1\|^{1/(2p+1)}.$
Since the terminating time $T_*$ is chosen according to the discrepancy principle (34), we derive that (38)
$\tau\delta=\|Ax(T_*)-y^\delta\|=\|Ar_{1/T_*}(A^*A)(x_0-x^\dagger)+A\varphi_{1/T_*}(A^*A)\dot x_0-r_{1/T_*}(AA^*)(y^\delta-y)\|\ge\|Ar_{1/T_*}(A^*A)(x_0-x^\dagger)+A\varphi_{1/T_*}(A^*A)\dot x_0\|-\|r_{1/T_*}(AA^*)(y^\delta-y)\|.$
Now we combine the estimates (37) and (38) to obtain, with the source conditions, that (39)
$\|r_\alpha(A^*A)(x_0-x^\dagger)+\varphi_\alpha(A^*A)\dot x_0\|\le\left(\tau\delta+\|r_{1/T_*}(AA^*)(y^\delta-y)\|\right)^{2p/(2p+1)}\left((\gamma_1+\gamma_2)\rho\right)^{1/(2p+1)}\le c_1\rho^{1/(2p+1)}\delta^{2p/(2p+1)},$
where $c_1:=(\tau+\gamma_1)^{2p/(2p+1)}(\gamma_1+\gamma_2)^{1/(2p+1)}$.

On the other hand, in a similar fashion to (38), it is easy to show that $\tau\delta\le\|Ar_{1/T_*}(A^*A)(x_0-x^\dagger)+A\varphi_{1/T_*}(A^*A)\dot x_0\|+\|r_{1/T_*}(AA^*)(y^\delta-y)\|\le\|Ar_{1/T_*}(A^*A)(x_0-x^\dagger)+A\varphi_{1/T_*}(A^*A)\dot x_0\|+\gamma_1\delta$. If we combine this inequality with the source conditions (30)–(31) and the qualification inequality (13) for $\psi(\lambda)=\lambda^{p+1/2}$, we obtain $(\tau-\gamma_1)\delta\le\|Ar_{1/T_*}(A^*A)(x_0-x^\dagger)+A\varphi_{1/T_*}(A^*A)\dot x_0\|\le\|(A^*A)^{p+1/2}r_{1/T_*}(A^*A)v_0\|+\|(A^*A)^{p+1/2}\varphi_{1/T_*}(A^*A)v_1\|\le 2\rho\gamma(T_*)^{-(p+1/2)}$, which yields the estimate (35). Finally, using (35) and (39), we conclude that $\|x(T_*)-x^\dagger\|\le\|r_\alpha(A^*A)(x_0-x^\dagger)+\varphi_\alpha(A^*A)\dot x_0\|+\gamma_*\sqrt{T_*}\,\delta\le c_1\rho^{1/(2p+1)}\delta^{2p/(2p+1)}+\gamma_*\sqrt{C_0}\,\rho^{1/(2p+1)}\delta^{2p/(2p+1)}=C_1\rho^{1/(2p+1)}\delta^{2p/(2p+1)}$. This completes the proof.

Remark 5.1

If the function $\chi(T)$ has more than one root, we recommend selecting $T_*$ as the first one, i.e. $\chi(T_*)=0<\chi(T)$ for all $T<T_*$. In other words, $T_*$ is the first time point at which the size of the residual $\|Ax(T)-y^\delta\|$ is of about the order of the data error. By Lemma 5.1 such a $T_*$ always exists.

As the proof of Proposition 5.2 will show, $\chi(T)$ is bounded by a decreasing function. Roughly speaking, the trend of $\chi(T)$ is that of a decreasing function on which oscillations may occur; we refer to Figure 1. On the other hand, one can anticipate that the smaller the damping parameter $\eta$ is, the more oscillations of the discrepancy function $\chi(T)$ occur. This is an expected result due to the behaviour of damped Hamiltonian systems.

Figure 1. The behaviour of $\chi(T)$ from (34) with different damping parameters $\eta$.

5.2. The total energy discrepancy principle

For presenting the newly developed discrepancy principle, we introduce the total energy discrepancy function (40) $\chi^{te}(T):=\|Ax(T)-y^\delta\|^2+\|\dot x(T)\|^2-\tau^2\delta^2$, where $\tau>\gamma_1$ as before.

Proposition 5.2

The function $\chi^{te}(T)$ is continuous and monotonically non-increasing. If $\|Ax_0-y^\delta\|^2+\|\dot x_0\|^2>\tau^2\delta^2$, then $\chi^{te}(T)=0$ has a unique root.

Proof.

The continuity of $\chi^{te}(T)$ is obvious according to Theorem 3.1. The non-growth of $\chi^{te}(T)$ is straightforward according to $\dot\chi^{te}(T)=-2\eta\|\dot x(T)\|^2$; cf. (A1). Furthermore, from (16), (17) and the assumption of the proposition, we derive that (41) $\lim_{T\to\infty}\chi^{te}(T)\le\delta^2(1-\tau^2)<0$ and that, moreover, $\chi^{te}(0)=\|Ax_0-y^\delta\|^2+\|\dot x_0\|^2-\tau^2\delta^2>0$. This implies the existence of roots of $\chi^{te}(T)$.

Finally, let us show that $\chi^{te}(T)=0$ has a unique root. We prove this by contradiction. Since $\chi^{te}(T)$ is non-increasing, otherwise a number $T_0$ and some $\varepsilon>0$ would exist such that $\chi^{te}(T)=0$ for $T\in[T_0,T_0+\varepsilon]$. This means that $\dot\chi^{te}(T)=-2\eta\|\dot x(T)\|^2\equiv 0$ in $(T_0,T_0+\varepsilon)$, hence $\dot x\equiv 0$ and $\ddot x\equiv 0$ in $(T_0,T_0+\varepsilon)$. Using equation (4) we conclude that $x(T)\equiv x(T_0)$ for all $T>T_0$. Since $\chi^{te}(T_0)=0$, we obtain $\chi^{te}(T)\equiv 0$ for $T>T_0$, which implies $\lim_{T\to\infty}\chi^{te}(T)=0$. This contradicts (41).

Theorem 5.3

A posteriori choice II of the regularization parameter 

Assume that $\|Ax_0-y^\delta\|^2+\|\dot x_0\|^2>\tau^2\delta^2$ and that a positive number $\delta_1$ exists such that for all $\delta\in(0,\delta_1]$ the unique root $T_*$ of $\chi^{te}(T)$ satisfies the inequality $\|Ax(T_*)-y^\delta\|\ge\tau_1\delta$, where $\tau_1>\gamma_1$ is a constant independent of $\delta$. Then, under the source conditions (30) and (31), for any $\delta\in(0,\delta_0]$ and $p>0$ we have the error estimates (42) $T_*\le C_0\rho^{2/(2p+1)}\delta^{-2/(2p+1)}$ and $\|x(T_*)-x^\dagger\|\le C_1\rho^{1/(2p+1)}\delta^{2p/(2p+1)}$, where $\delta_0$ is defined in Theorem 5.1, and the constants $C_0$ and $C_1$ are the same as in Theorem 5.2.

Proof.

The proof can be done along the lines and using the tools of the proof of Theorem 5.2.

In the simulation Section 7.1, we will show computationally that the assumptions of the above theorem can be met in practice. Empirically, when the value of the initial velocity is not too small ($\|\dot x_0\|>0$) or the noise is small enough ($\delta\ll 1$), the additional assumption $\|Ax(T_*)-y^\delta\|\ge\tau_1\delta$ in Theorem 5.3 always holds.
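In computations, the total energy discrepancy principle amounts to monitoring one scalar along the trajectory. A minimal sketch of the check (40), assuming discretized quantities and a matrix A:

import numpy as np

def tedp_reached(A, y_delta, x, v, tau, delta):
    # total energy discrepancy function (40) at the current state (x, v):
    # stop as soon as ||Ax - y^delta||^2 + ||v||^2 <= tau^2 * delta^2
    chi_te = np.linalg.norm(A @ x - y_delta)**2 + np.linalg.norm(v)**2
    return chi_te <= (tau * delta)**2

Since chi_te is monotonically non-increasing by Proposition 5.2, the first time this check succeeds approximates the unique root T_* up to the time discretization.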

6. A novel iterative regularization method

Roughly speaking, the second-order evolution equation (4), together with an appropriate numerical discretization scheme for the artificial time variable, yields a concrete second-order iterative method. Just as with the Runge–Kutta integrators [6] or the exponential integrators [26] for solving first-order equations, damped symplectic integrators are extremely attractive for solving second-order equations, since such schemes are closely related to canonical transformations [27], and the trajectories of the discretized second-order flow are usually more stable.

The simplest discretization scheme is the Euler method. Denote $v=\dot x$, and consider the following Euler iteration: (43) $x_{k+1}=x_k+\Delta t_kv_k$, $v_{k+1}=v_k+\Delta t_k\left(A^*(y^\delta-Ax_{k+1})-\eta v_k\right)$, with $x_0=x_0$, $v_0=\dot x_0$. By elementary calculations, the scheme (43) takes the form of the following three-term semi-iterative method: (44) $x_{k+1}=x_k+\mu_k(x_k-x_{k-1})+\omega_kA^*(y^\delta-Ax_k)$ with the specially defined parameters $\omega_k=\Delta t_k^2$ and $\mu_k=1-\Delta t_k\eta$. It is well known that the semi-iterative method (44), equipped with an appropriate stopping rule, yields an order optimal regularization method with (asymptotically) far fewer iterations than the classical Landweber iteration [2, Section 6.2].

In this paper, we develop a new iterative regularization method based on the Störmer–Verlet method, which also belongs to the family of symplectic integrators and takes the form (45) $v_{k+1/2}=v_k+\frac{\Delta t}{2}\left(A^*(y^\delta-Ax_k)-\eta v_{k+1/2}\right)$, $x_{k+1}=x_k+\Delta t\,v_{k+1/2}$, $v_{k+1}=v_{k+1/2}+\frac{\Delta t}{2}\left(A^*(y^\delta-Ax_{k+1})-\eta v_{k+1/2}\right)$, with $x_0=x_0$, $v_0=\dot x_0$. Note that the first equation is only formally implicit: it is linear in $v_{k+1/2}$, which can therefore be computed explicitly.
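A possible Python realization of the scheme (45), with Morozov's discrepancy principle (34) as the stopping rule, reads as follows; the matrix A, the function name and the parameter defaults are illustrative assumptions, not prescribed by the paper.

import numpy as np

def soar_stormer_verlet(A, y_delta, x0, v0, dt, eta, tau, delta, max_iter=400000):
    # damped Stoermer-Verlet scheme (45); dt should satisfy condition (46)
    x, v = x0.copy(), v0.copy()
    for k in range(max_iter):
        r = y_delta - A @ x                   # residual y^delta - A x_k
        if np.linalg.norm(r) <= tau * delta:  # discrepancy principle (34)
            break
        # half step for the velocity: the damping term is implicit but
        # linear, so v_{k+1/2} is obtained explicitly
        v_half = (v + 0.5 * dt * (A.T @ r)) / (1.0 + 0.5 * dt * eta)
        x = x + dt * v_half                   # full step for the state
        r_new = y_delta - A @ x
        v = v_half + 0.5 * dt * (A.T @ r_new - eta * v_half)
    return x, k

Replacing the stopping test by the total energy check sketched in Section 5.2 yields the 'TEDP' variant used in Section 7.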

Proposition 6.1

For any fixed damping parameter $\eta$, if the step size is chosen such that (46) $\Delta t\le\min\left(\sqrt{2}/\|A\|,\,2/\eta\right)$, then the scheme (45) is convergent. Consequently, for any fixed $T$, there exists a pair of parameters $(k,\Delta t)$, satisfying $k\Delta t=T$ and the condition (46), such that $x_k=x(T)+O(\Delta t^2)$ as $\Delta t\to 0$. Here $x_k$ and $x(\cdot)$ are the solutions to (45) and (4), respectively.

Proof.

Denote $z=(x,v)^T$ and rewrite (45) as (47) $z_{k+1}=Bz_k+\frac{\Delta t}{2+\Delta t\eta}\,b$, where $b=\begin{pmatrix}\Delta t\,A^*y^\delta\\ \left(2I-\frac{\Delta t^2}{2}A^*A\right)A^*y^\delta\end{pmatrix}$ and
$B=\begin{pmatrix}I-\frac{\Delta t^2}{2+\Delta t\eta}A^*A & \frac{2\Delta t}{2+\Delta t\eta}I\\[4pt] -\frac{\Delta t}{2(2+\Delta t\eta)}\left(4I-\Delta t^2A^*A\right)A^*A & \frac{2-\Delta t\eta}{2+\Delta t\eta}I-\frac{\Delta t^2}{2+\Delta t\eta}A^*A\end{pmatrix}.$

By Taylor's theorem and the finite difference formula, it is not difficult to show the consistency of the scheme (45). It is well known that stability implies the convergence of consistent schemes [28]; hence, it suffices to show the stability of the scheme (45). The asymptotic behaviour $x_k=x(T)+O(\Delta t^2)$ then follows from the convergence result and the second order of the Störmer–Verlet method. Furthermore, a sufficient condition for the stability of the iterative algorithm (45) is that the operator $B$ is non-expansive. Hence, it suffices to prove that the eigenvalues of $B$ do not exceed one in modulus.

Using the singular value decomposition, we have $A^*A=\Phi\Lambda\Phi^T$ in the discrete setting, where $\Phi$ is a unitary matrix and $\Lambda=\mathrm{diag}(\lambda_i)$ with $\lambda_i\ge 0$, $i=1,\dots,n$.

By elementary calculations, we obtain that the eigenvalues of $B$ equal (48) $\mu_i^{\pm}=\frac{1}{2+\Delta t\eta}\left\{(2-\Delta t^2\lambda_i)\pm\sqrt{(2-\Delta t^2\lambda_i)^2-(4-\Delta t^2\eta^2)}\right\}$, $i=1,\dots,n$.

Denote by $i^*$ the index of the $\lambda_i$ corresponding to the maximal absolute value of $\mu_i^{\pm}$, i.e. $|\mu_{\max}|=\frac{1}{2+\Delta t\eta}\max_{\pm}\left|(2-\Delta t^2\lambda_{i^*})\pm\sqrt{(2-\Delta t^2\lambda_{i^*})^2-(4-\Delta t^2\eta^2)}\right|$.

There are three possible cases here: the overdamped case ($(2-\Delta t^2\lambda_{i^*})^2>4-\Delta t^2\eta^2$), the underdamped case ($(2-\Delta t^2\lambda_{i^*})^2<4-\Delta t^2\eta^2$), and the critical damping case ($(2-\Delta t^2\lambda_{i^*})^2=4-\Delta t^2\eta^2$).

Let us consider these cases separately. For the time step size $\Delta t$ chosen according to (46), we have $2-\Delta t^2\lambda_{i^*}\ge 0$. Therefore, in the overdamped case, $|\mu_{\max}|=\frac{1}{2+\Delta t\eta}\left\{(2-\Delta t^2\lambda_{i^*})+\sqrt{(2-\Delta t^2\lambda_{i^*})^2-(4-\Delta t^2\eta^2)}\right\}$. Define $2-\Delta t^2\lambda_{i^*}=a\sqrt{4-\Delta t^2\eta^2}$ with $a>1$ (by the condition (46), it holds that $4-\Delta t^2\eta^2\ge 0$); then we have (49) $|\mu_{\max}|=\frac{(a+\sqrt{a^2-1})\sqrt{4-\Delta t^2\eta^2}}{2+\Delta t\eta}=\frac{(a+\sqrt{a^2-1})(2-\Delta t^2\lambda_{i^*})}{a(2+\Delta t\eta)}$. Note that (50) $\Delta t\eta=\sqrt{4-\frac{(2-\Delta t^2\lambda_{i^*})^2}{a^2}}$. Combining (49) and (50), we obtain
$|\mu_{\max}|=\frac{(a+\sqrt{a^2-1})(2-\Delta t^2\lambda_{i^*})}{2a+\sqrt{4a^2-(2-\Delta t^2\lambda_{i^*})^2}}=\frac{(a+\sqrt{a^2-1})\left(1-\frac{\Delta t^2\lambda_{i^*}}{2}\right)}{a+\sqrt{a^2-\left(1-\frac{\Delta t^2\lambda_{i^*}}{2}\right)^2}}\le 1,$
where the last inequality holds since for $s:=1-\Delta t^2\lambda_{i^*}/2\in[0,1]$ one has $(a+\sqrt{a^2-1})\,s\le a+\sqrt{a^2-s^2}$, with equality at $s=1$. Now, consider the underdamped case. In this case, the complex eigenvalue $\mu_{\max}$ satisfies $|\mu_{\max}|^2=\frac{1}{(2+\Delta t\eta)^2}\left\{(2-\Delta t^2\lambda_{i^*})^2+\left[(4-\Delta t^2\eta^2)-(2-\Delta t^2\lambda_{i^*})^2\right]\right\}$, which implies that $|\mu_{\max}|^2=\frac{4-\Delta t^2\eta^2}{(2+\Delta t\eta)^2}=\frac{2-\Delta t\eta}{2+\Delta t\eta}<1$. Finally, consider the critical damping case. Similarly, we have $|\mu_{\max}|=\frac{\sqrt{4-\Delta t^2\eta^2}}{2+\Delta t\eta}=\sqrt{\frac{2-\Delta t\eta}{2+\Delta t\eta}}<1$, which yields the desired result.
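The eigenvalue bound established above is easy to probe numerically. The sketch below builds the matrix B from (47) for a random symmetric positive semi-definite A*A (an assumption made purely for illustration) and prints the largest eigenvalue modulus, which should not exceed one for step sizes obeying (46).

import numpy as np

rng = np.random.default_rng(0)
M = rng.standard_normal((30, 30))
AtA = M.T @ M                                   # plays the role of A^*A
eta = 0.5
dt = min(np.sqrt(2.0) / np.sqrt(np.linalg.norm(AtA, 2)), 2.0 / eta)
n = AtA.shape[0]
I = np.eye(n)
c = dt / (2.0 + dt * eta)
B = np.block([
    [I - c * dt * AtA, 2.0 * c * I],
    [-(c / 2.0) * (4.0 * I - dt**2 * AtA) @ AtA,
     ((2.0 - dt * eta) / (2.0 + dt * eta)) * I - c * dt * AtA],
])
print(np.max(np.abs(np.linalg.eigvals(B))))     # <= 1 up to rounding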

At the end of this section, we show the convergence rate of the scheme (45).

Theorem 6.1

Under the assumptions of Theorem 5.2 or 5.3, if the time step size is chosen as $\Delta t=C_t\delta^{p/(2p+1)}$, then for any $\delta\in(0,\delta_s]$ and $p>0$ we have the convergence rate (51) $\|x_{k_*}-x^\dagger\|=O(\delta^{2p/(2p+1)})$, where $k_*=T_*/\Delta t$, $\delta_s=\min\left\{(\sqrt{2}/C_t)^{2+1/p}\|A\|^{-(2+1/p)},\,(2/C_t)^{2+1/p}\eta^{-(2+1/p)},\,\delta_0\right\}$, $C_t=\left(T_*/\delta^{p/(2p+1)}\right)/\left\lfloor T_*/\delta^{p/(2p+1)}\right\rfloor$, and $T_*$ is the root of (34) or (40). Here, $\lfloor\cdot\rfloor$ denotes the standard floor function and $\delta_0$ is defined in Theorem 5.1.

Proof.

It follows from Proposition 6.1 that the choice $\Delta t=C_t\delta^{p/(2p+1)}$ yields $\|x_{k_*}-x(T_*)\|\le C\delta^{2p/(2p+1)}$ with some constant $C$ for all $\delta\in(0,\delta_s]$.

Combining this inequality with the results in Theorems 5.2 and 5.3, we obtain $\|x_{k_*}-x^\dagger\|\le\|x_{k_*}-x(T_*)\|+\|x(T_*)-x^\dagger\|\le C_r\delta^{2p/(2p+1)}$ for some constant $C_r$. This gives the estimate (51).

7. Numerical simulations

In this section, we present some numerical results for the following integral equation: (52) $Ax(s):=\int_0^1K(s,t)x(t)\,dt=y(s)$, $K(s,t)=s(1-t)\chi_{s\le t}+t(1-s)\chi_{s>t}$. If we choose $X=Y=L^2[0,1]$, the operator $A$ is compact, self-adjoint and injective. It is well known that the integral equation (52) has the solution $x^\dagger=-y''$ if $y\in H^2[0,1]\cap H_0^1[0,1]$. Moreover, the operator $A$ has the eigensystem $Au_j=\sigma_ju_j$, where $\sigma_j=(j\pi)^{-2}$ and $u_j(t)=\sqrt{2}\sin(j\pi t)$. Furthermore, using interpolation theory (see e.g. [29]) it is not difficult to show that, for $4p-1/2\notin\mathbb{N}$, $\mathcal{R}((A^*A)^p)=\{x\in H^{4p}[0,1]:\ x^{(2l)}(0)=x^{(2l)}(1)=0,\ l=0,1,\dots,\lfloor 2p-1/4\rfloor\}$. In general, a regularization procedure becomes numerically feasible only after an appropriate discretization. Here, we apply linear finite elements to solve (52). Let $Y_n$ be the finite element space of piecewise linear functions on a uniform grid with step size $1/(n-1)$. Denote by $P_n$ the orthogonal projection operator acting from $Y$ onto $Y_n$. Define $A_n:=P_nA$ and $X_n:=A_n^*Y_n$. Let $\{\varphi_j\}_{j=1}^n$ be a basis of the finite element space $Y_n$; then, instead of the original problem (52), we solve the following system of linear equations: (53) $A_nx_n=y_n$, where $[A_n]_{ij}=\int_0^1\left(\int_0^1K(s,t)\varphi_i(s)\,ds\right)\varphi_j(t)\,dt$ and $[y_n]_j=\int_0^1y(t)\varphi_j(t)\,dt$.
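For illustration, the integral operator (52) can also be discretized by a simple midpoint quadrature rule instead of the Galerkin finite element approach used in the paper; the following sketch is such a simplification, with hypothetical function names.

import numpy as np

def assemble_kernel_matrix(n):
    # midpoint-rule collocation of (52): A[i,j] = K(s_i, t_j) * h with
    # nodes s_i at the cell midpoints of a uniform grid on [0,1]
    h = 1.0 / n
    s = (np.arange(n) + 0.5) * h
    S, T = np.meshgrid(s, s, indexing="ij")
    K = np.where(S <= T, S * (1.0 - T), T * (1.0 - S))
    return K * h, s

A, s = assemble_kernel_matrix(400)
y = s * (1.0 - s)                    # Example 1 below: x^dagger = 2
print(np.max(np.abs(A @ np.full_like(s, 2.0) - y)))   # O(h^2) residual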

As shown in [2], the finite dimensional projection error $\epsilon_n:=\|(I-P_n)A\|$ plays an important role in the convergence rate analysis. For the compact operator $A$, $\epsilon_n\to 0$ as $n\to\infty$. Moreover, if the noise level $\delta\to 0$ slowly enough as $n\to\infty$, the quantity $\epsilon_n$ has no influence and we obtain the same convergence rates as in Theorems 5.2 and 5.3.

Uniformly distributed noise of magnitude $\delta'$ is added to the discretized exact right-hand side: (54) $[y_n^\delta]_j:=[1+\delta'(2\,\mathrm{Rand}-1)]\,[y_n]_j$, $j=1,\dots,n$, where $\mathrm{Rand}$ returns a pseudo-random value drawn from the uniform distribution on $[0,1]$. The noise level of the measurement data is then calculated by $\delta=\|y_n^\delta-y_n\|_2$, where $\|\cdot\|_2$ denotes the standard vector norm in $\mathbb{R}^n$.

To assess the accuracy of the approximate solutions, we define the $L^2$-norm relative error of an approximate solution $x_n^{k_*}$ ($k_*=T_*/\Delta t$): $L^2\mathrm{Err}:=\|x_n^{k_*}-x^\dagger\|_{L^2[0,1]}/\|x^\dagger\|_{L^2[0,1]}$, where $x^\dagger$ is the exact solution to the corresponding model problem.
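Under the same finite-dimensional conventions, the noise model (54) and the relative error can be coded directly; the generator seed is of course arbitrary.

import numpy as np

def add_noise(y_n, delta_mag, seed=1):
    # multiplicative uniform noise as in (54); returns the noisy data and
    # the resulting noise level delta = ||y_n^delta - y_n||_2
    rng = np.random.default_rng(seed)
    y_d = (1.0 + delta_mag * (2.0 * rng.random(y_n.shape) - 1.0)) * y_n
    return y_d, np.linalg.norm(y_d - y_n)

def l2_rel_err(x_approx, x_exact):
    # discrete counterpart of L2Err
    return np.linalg.norm(x_approx - x_exact) / np.linalg.norm(x_exact)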

7.1. Influence of parameters

The purpose of this subsection is to explore the dependence of the solution accuracy and the convergence rate on the initial data $(x_0,\dot x_0)$, the damping parameter $\eta$ and the discrepancy functions $\chi$ and $\chi^{te}$, and thus to give guidance on their choice in practice.

In this subsection, we solve the integral equation (52) with the exact right-hand side $y=s(1-s)$. Then, the exact solution is $x^\dagger\equiv 2$, and $x^\dagger\in\mathcal{R}((A^*A)^p)$ for all $p<1/8$. Denote by 'DP' and 'TEDP' the newly developed iterative scheme (45) equipped with Morozov's conventional discrepancy function $\chi(T)$ and with the total energy discrepancy function $\chi^{te}(T)$, respectively.

The results on the influence of the initial data $(x_0,\dot x_0)$ on the solution accuracy ($L^2$Err) and the convergence rate (iteration numbers $k_*(\delta)$) are given in Tables 1 and 2, respectively. As we can see, both $x_0$ and $\dot x_0$ influence the solution accuracy as well as the convergence rate. Moreover, when the value of the damping parameter is not too small (see Tables 1 and 2), the results (both solution accuracy and convergence rate) of the methods 'DP' and 'TEDP' almost coincide. This verifies that the assumption in Theorem 5.3 is reasonable.

Table 1. The dependence of the solution accuracy and the convergence rate on the initial data $x_0$. $\Delta t=19.4946$, $\eta=2.5648\times 10^{-4}$, $\dot x_0=0$, $\tau=2$, $p=0.1125$.

Table 2. The dependence of the solution accuracy and the convergence rate on the initial data $\dot x_0$. $\Delta t=19.4946$, $\eta=0.0154$, $x_0=1$, $\tau=2$, $p=0.1125$.

Table 3. The dependence of the solution accuracy and the convergence rate on the damping parameter $\eta$. $\Delta t=19.4946$, $x_0=1$, $\dot x_0=0$, $\tau=2$, $p=0.1125$.

In Table 3, we display the numerical results for different values of the damping parameter $\eta$. With an appropriate choice of the damping parameter, say $\eta=2.5648\times 10^{-3}$ in our example, SOAR not only gives the most accurate result but also exhibits an acceleration effect. The critical value of the damping parameter, $\eta=2/\Delta t$, also provides an accurate result, but it requires a few more steps. The influence of the damping parameter on the residual functional can be seen in Figure 1: at the same time point, the larger the damping parameter, the smaller the residual norm functional.

7.2. Comparison with other methods

In order to demonstrate the advantages of our algorithm over traditional approaches, we solve the same problems by four well-known iterative regularization methods – the Landweber method, Nesterov's method, the ν-method and the conjugate gradient method for the normal equation (CGNE, cf., e.g. [30]). The Landweber method is given in (2), while Nesterov's method is defined as [31] (55) $z_k=x_k+\frac{k-1}{k+\alpha-1}(x_k-x_{k-1})$, $x_{k+1}=z_k+\Delta t\,A^*(y^\delta-Az_k)$, where $\alpha>3$ (we choose $\alpha=3.1$ in our simulations). Moreover, we select the Chebyshev method as our particular ν-method, i.e. $\nu=1/2$ [2, Section 6.3]. For all four traditional iterative regularization methods, we use Morozov's conventional discrepancy principle as the stopping rule.
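For reference, Nesterov's scheme (55) with the discrepancy-principle stopping rule can be sketched as follows; again, the matrix A and the interface are assumptions of this illustration.

import numpy as np

def nesterov(A, y_delta, x0, dt, tau, delta, alpha=3.1, max_iter=400000):
    # Nesterov acceleration (55) stopped by Morozov's discrepancy principle
    x_prev, x = x0.copy(), x0.copy()
    for k in range(1, max_iter + 1):
        z = x + (k - 1.0) / (k + alpha - 1.0) * (x - x_prev)
        x_prev = x
        x = z + dt * (A.T @ (y_delta - A @ z))
        if np.linalg.norm(A @ x - y_delta) <= tau * delta:
            break
    return x, k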

We consider the following two different right-hand sides for the integral equation (52).

  • Example 1: $y(s)=s(1-s)$. Then $x^\dagger\equiv 2$, and $x^\dagger\in\mathcal{R}((A^*A)^p)$ for all $p<1/8$. This example uses the discretization size $n=400$. The other parameters are $\Delta t=19.4946$, $\eta=2.5648\times 10^{-4}$, $x_0=1$, $\dot x_0=0$, $\tau=2$, $p=0.1125$.

  • Example 2: $y(s)=s^4(1-s)^3$. Then $x^\dagger=-6t^2(1-t)(2-8t+7t^2)$, and $x^\dagger\in\mathcal{R}((A^*A)^p)$ for all $p<5/8$. This example uses the discretization size $n=400$. The other parameters are $\Delta t=19.4946$, $\eta=0.0051$, $x_0=0$, $\dot x_0=0$, $\tau=2$, $p=0.5625$.

The results of the simulation are presented in Table 4, from which we can conclude that, in general, SOAR needs fewer iterations and less CPU time, and offers a more accurate regularized solution. Moreover, compared with Morozov's conventional discrepancy principle, the newly developed total energy discrepancy principle provides more accurate results in most cases. Concerning the number of iterations, the CGNE method performs much better than all of the other methods. However, the accuracy of the CGNE method is much worse than that of the other methods (including DP and TEDP), since the step size of CGNE is too large to capture the optimal point, and the semi-convergence effect disturbs the iteration rather early. Note that we set a maximal iteration number $k_{\max}=400{,}000$ in all of our simulations.

Table 4. Comparison with the Landweber method, Nesterov's method, the Chebyshev method, and the CGNE method.

8. Conclusion and outlook

In this paper, we have investigated the SOAR method for solving the ill-posed linear inverse problem $Ax=y$ with a compact operator $A$ mapping between infinite dimensional Hilbert spaces. Instead of $y$, we are given noisy data $y^\delta$ obeying the inequality $\|y^\delta-y\|\le\delta$. We have shown regularization properties for the dynamical solution of the second-order equation (4). Moreover, by using Morozov's conventional discrepancy principle on the one hand and a newly developed total energy discrepancy principle on the other hand, we have proven the order optimality of SOAR. Furthermore, based on the framework of SOAR and using the Störmer–Verlet method, we have derived a novel iterative regularization algorithm. The convergence property of the proposed numerical algorithm is proven as well. Numerical experiments in Section 7 show that, in comparison with conventional iterative regularization methods, SOAR is a fast regularization method for solving linear inverse problems with a high level of accuracy.

Various numerical results show that the damping parameter $\eta$ in the second-order equation (4) plays a prominent role in regularization and acceleration. Therefore, how to choose an optimal damping parameter should be studied in the future. Moreover, using the results on the nonlinear Landweber iteration, it should be possible to develop a theory of SOAR for wide classes of nonlinear ill-posed problems. Furthermore, it would be very interesting to investigate the case of a dynamical damping parameter $\eta=\eta(t)$. For instance, the second-order equation (4) with $\eta=r/t$ ($r\ge 3$) represents the continuous version of Nesterov's scheme [32], and the discretization of (4) with $\eta_k=\frac{(k+2\nu-1)(2k+4\nu-1)(2k+2\nu-3)-(k-1)(2k-3)(3k+3\nu-1)}{4(2k+2\nu-3)(2k+2\nu-1)(k+\nu-1)}$, $\Delta t_k=\frac{4(2k+2\nu-1)(k+\nu-1)}{(k+2\nu-1)(2k+4\nu-1)}$ yields the well-known ν-methods [2, Section 6.3]. Even in the linear case (1), to the best of our knowledge, it is not quite clear whether Nesterov's approach equipped with an a posteriori choice of the regularization parameter is an accelerated regularization method for solving ill-posed inverse problems. In our opinion, however, SOAR can serve as a candidate for the systematic analysis of general second-order regularization methods.

Disclosure statement

No potential conflict of interest was reported by the authors.

Additional information

Funding

The work of Y. Zhang is supported by the Alexander von Humboldt-Stiftung through a postdoctoral researcher fellowship, and the work of B. Hofmann is supported by the Deutsche Forschungsgemeinschaft [DFG-grant HO 1454/12-1].

References

  • Tikhonov A, Leonov A, Yagola A. Nonlinear ill-posed problems. Vols. I and II. London: Chapman and Hall; 1998.
  • Engl H, Hanke M, Neubauer A. Regularization of inverse problems. New York (NY): Springer; 1996.
  • Kaltenbacher B, Neubauer A, Scherzer O. Iterative regularization methods for nonlinear ill-posed problems. Berlin: Walter de Gruyter GmbH & Co. KG; 2008.
  • Tautenhahn U. On the asymptotical regularization of nonlinear ill-posed problems. Inverse Probl. 1994;10:1405–1418. doi: 10.1088/0266-5611/10/6/014
  • Vainikko G, Veretennikov A. Iteration procedures in ill-posed problems. Moscow: Nauka; 1986. (In Russian)
  • Rieder A. Runge–Kutta integrators yield optimal regularization schemes. Inverse Probl. 2005;21:453–471. doi: 10.1088/0266-5611/21/2/003
  • Neubauer A. On Landweber iteration for nonlinear ill-posed problems in Hilbert scales. Numer Math. 2000;85:309–328. doi: 10.1007/s002110050487
  • Jin Q. On a regularized Levenberg–Marquardt method for solving nonlinear inverse problems. Numer Math. 2010;115:229–259. doi: 10.1007/s00211-009-0275-x
  • Jin Q. On the discrepancy principle for some Newton type methods for solving nonlinear inverse problems. Numer Math. 2009;111:509–558. doi: 10.1007/s00211-008-0198-y
  • Neubauer A. On Nesterov acceleration for Landweber iteration of linear ill-posed problems. J Inverse Ill-Posed Probl. 2017;25:381–390.
  • Hubmer S, Ramlau R. Convergence analysis of a two-point gradient method for nonlinear ill-posed problems. Inverse Probl. 2017;33:095004. doi: 10.1088/1361-6420/aa7ac7
  • Alvarez F. On the minimizing property of a second-order dissipative system in Hilbert spaces. SIAM J Control Optim. 2000;38:1102–1119. doi: 10.1137/S0363012998335802
  • Alvarez F, Attouch H, Bolte J, et al. A second-order gradient-like dissipative dynamical system with hessian-driven damping. Application to optimization and mechanics. J Math Pures Appl. 2002;81:747–779. doi: 10.1016/S0021-7824(01)01253-3
  • Attouch H, Goudou X, Redont P. The heavy ball with friction method. I. The continuous dynamical system. Commun Contemp Math. 2000;2:1–34.
  • Zhang Y, Gong R, Cheng X, et al. A dynamical regularization algorithm for solving inverse source problems of elliptic partial differential equations. Inverse Probl. 2018;34:065001.
  • Edvardsson S, Neuman M, Edström P, et al. Solving equations through particle dynamics. Comput Phys Commun. 2015;197:169–181. doi: 10.1016/j.cpc.2015.08.028
  • Edvardsson S, Gulliksson M, Persson J. The dynamical functional particle method: an approach for boundary value problems. J Appl Mech. 2012;79:021012. doi: 10.1115/1.4005563
  • Sandin P, Gulliksson M. Numerical solution of the stationary multicomponent nonlinear Schrödinger equation with a constraint on the angular momentum. Phys Rev E. 2016;93:033301. doi: 10.1103/PhysRevE.93.033301
  • Hofmann B, Mathé P. Analysis of profile functions for general linear regularization methods. SIAM J Numer Anal. 2007;45:1122–1141. doi: 10.1137/060654530
  • Mathé P, Pereverzev SV. Geometry of linear ill-posed problems in variable Hilbert scales. Inverse Probl. 2003;19(3):789–803. doi: 10.1088/0266-5611/19/3/319
  • Mathé P. Saturation of regularization methods for linear ill-posed problems in Hilbert spaces. SIAM J Numer Anal. 2006;42:968–973. doi: 10.1137/S0036142903420947
  • Boţ R, Csetnek E. Second order forward–backward dynamical systems for monotone inclusion problems. SIAM J Control Optim. 2016;54:1423–1443. doi: 10.1137/15M1012657
  • Schock E. Approximate solution of ill-posed equations: arbitrarily slow convergence vs. superconvergence. In: Hämmerlin G, Hoffmann KH, editors. Constructive methods for the practical treatment of integral equations. Vol. 73. Basel: Birkhäuser; 1985. p. 234–243.
  • Hein T, Hofmann B. Approximate source conditions for nonlinear ill-posed problems – chances and limitations. Inverse Probl. 2009;25:035003.
  • Hofmann B, Kaltenbacher B, Pöschl C, et al. A convergence rates result for Tikhonov regularization in Banach spaces with non-smooth operators. Inverse Probl. 2007;23:987–1010. doi: 10.1088/0266-5611/23/3/009
  • Hochbruck M, Lubich C, Selhofer H. Exponential integrators for large systems of differential equations. SIAM J Sci Comput. 1998;19:1152–1174.
  • Hairer E, Wanner G, Lubich C. Geometric numerical integration: structure-preserving algorithms for ordinary differential equations. 2nd ed. New York (NY): Springer; 2006.
  • Tadmor E. A review of numerical methods for nonlinear partial differential equations. Bull Am Math Soc. 2012;49(4):507–554. doi: 10.1090/S0273-0979-2012-01379-4
  • Lions J, Magenes E. Non-homogeneous boundary value problems and applications. Vol. I. Berlin: Springer; 1972.
  • Hanke M. Conjugate gradient type methods for ill-posed problems. New York (NY): Wiley; 1995.
  • Nesterov Y. A method of solving a convex programming problem with convergence rate O(1/k²). Sov Math Dokl. 1983;27:372–376.
  • Su W, Boyd S, Candes E. A differential equation for modeling Nesterov's accelerated gradient method: theory and insights. J Mach Learn Res. 2016;17:1–43.

Appendix 1. Proofs in Section 3

A.1. Proof of Lemma 3.1

Define the Lyapunov function of (4) by $E(t)=\|Ax(t)-y^\delta\|^2+\|\dot x(t)\|^2$. It is not difficult to show that (A1) $\dot E(t)=-2\eta\|\dot x(t)\|^2$ by looking at equation (4) and the differentiation of the energy function, $\dot E(t)=2\langle\dot x(t),\ \ddot x(t)+A^*(Ax(t)-y^\delta)\rangle$. Hence, $E(t)$ is non-increasing, and consequently $\|\dot x(t)\|^2\le E(0)$. Therefore, $\dot x(\cdot)$ is uniformly bounded. Integrating both sides of (A1), we obtain $\int_0^\infty\|\dot x(t)\|^2\,dt\le E(0)/(2\eta)<\infty$, which yields $\dot x(\cdot)\in L^2([0,\infty),X)$.

Now, let us show that for any $x\in X$ the following inequality holds: (A2) $\limsup_{t\to\infty}\|Ax(t)-y^\delta\|\le\|Ax-y^\delta\|$.

Consider for every $t\in[0,\infty)$ the function $e(t)=e(t;x):=\frac12\|x(t)-x\|^2$. Then $\dot e(t)=\langle x(t)-x,\dot x(t)\rangle$ and $\ddot e(t)=\|\dot x(t)\|^2+\langle x(t)-x,\ddot x(t)\rangle$ for every $t\in[0,\infty)$. Taking into account (4), we get (A3) $\ddot e(t)+\eta\dot e(t)+\langle x(t)-x,A^*(Ax(t)-y^\delta)\rangle=\|\dot x(t)\|^2$.

On the other hand, by the convexity of the residual norm square functional $\|Ax-y^\delta\|^2$, we derive (A4) $\|Ax(t)-y^\delta\|^2+2\langle x-x(t),A^*(Ax(t)-y^\delta)\rangle\le\|Ax-y^\delta\|^2$.

Combining (A3) and (A4) with the definition of $E(t)$, we obtain $\ddot e(t)+\eta\dot e(t)\le\frac12\|Ax-y^\delta\|^2-\frac12E(t)+\frac32\|\dot x(t)\|^2$. By (A1), $E(t)$ is non-increasing; hence, given $t>0$, for all $\tau\in[0,t]$ we have $\ddot e(\tau)+\eta\dot e(\tau)\le\frac12\|Ax-y^\delta\|^2-\frac12E(t)+\frac32\|\dot x(\tau)\|^2$. Multiplying this inequality by $e^{\eta\tau}$ and integrating from $0$ to $\theta$, we obtain $\dot e(\theta)\le e^{-\eta\theta}\dot e(0)+\frac{1-e^{-\eta\theta}}{2\eta}\left(\|Ax-y^\delta\|^2-E(t)\right)+\frac32\int_0^\theta e^{-\eta(\theta-\tau)}\|\dot x(\tau)\|^2\,d\tau$. Integrating once more from $0$ to $t$, together with the fact that $E(t)$ decreases, gives (A5) $e(t)\le e(0)+\frac{1-e^{-\eta t}}{\eta}\dot e(0)+\frac{\eta t-1+e^{-\eta t}}{2\eta^2}\left(\|Ax-y^\delta\|^2-E(t)\right)+h(t)$, where $h(t):=\frac32\int_0^t\int_0^\theta e^{-\eta(\theta-\tau)}\|\dot x(\tau)\|^2\,d\tau\,d\theta$.

Since $e(t)\ge 0$ and $E(t)\ge\|Ax(t)-y^\delta\|^2$, it follows from (A5) that $\frac{\eta t-1+e^{-\eta t}}{2\eta^2}\|Ax(t)-y^\delta\|^2\le e(0)+\frac{1-e^{-\eta t}}{\eta}\dot e(0)+\frac{\eta t-1+e^{-\eta t}}{2\eta^2}\|Ax-y^\delta\|^2+h(t)$. Dividing this inequality by $(\eta t-1+e^{-\eta t})/(2\eta^2)$ and letting $t\to\infty$, we deduce that $\limsup_{t\to\infty}\|Ax(t)-y^\delta\|^2\le\|Ax-y^\delta\|^2+\limsup_{t\to\infty}\frac{2\eta}{t}h(t)$.

Hence, for proving (A2), it suffices to show that $h(\cdot)\in L^\infty([0,\infty))$. This holds by noting the inequalities $0\le h(t)=\frac{3}{2\eta}\int_0^t\left(1-e^{-\eta(t-\tau)}\right)\|\dot x(\tau)\|^2\,d\tau\le\frac{3}{2\eta}\int_0^\infty\|\dot x(\tau)\|^2\,d\tau<\infty$. From the inequality $\|Ax(t)-y^\delta\|\ge\inf_{x\in X}\|Ax-y^\delta\|$, we conclude together with (A2) that (A6) $\lim_{t\to\infty}\|Ax(t)-y^\delta\|=\inf_{x\in X}\|Ax-y^\delta\|$. Consequently, we have $\lim_{t\to\infty}\|Ax(t)-y^\delta\|\le\|Ax^\dagger-y^\delta\|\le\delta$.

Now, let us show the remaining parts of the assertion. Since $E(t)$ is non-increasing and bounded from below by $\inf_{x\in X}\|Ax-y^\delta\|^2$, it converges as $t\to\infty$. If $\lim_{t\to\infty}E(t)>\inf_{x\in X}\|Ax-y^\delta\|^2$, then $\liminf_{t\to\infty}\|\dot x(t)\|>0$ by noting (A6). This contradicts the fact that $\dot x(\cdot)\in L^2([0,\infty),X)$. Therefore, the limit (17) holds and $\dot x(t)\to 0$ as $t\to\infty$.

A.2. Proof of Lemma 3.2

(i) Consider for every $t\in[0,\infty)$ the function $e(t)=e(t;x^\dagger)=\frac12\|x(t)-x^\dagger\|^2$. Similarly as in (A3), it holds that $\ddot e(t)+\eta\dot e(t)+\langle x(t)-x^\dagger,A^*(Ax(t)-y)\rangle=\|\dot x(t)\|^2$, which implies that (A7) $\ddot e(t)+\eta\dot e(t)+\frac{1}{\|A\|^2}\|A^*(y-Ax(t))\|^2\le\|\dot x(t)\|^2$, or, equivalently (by using the evolution equation (4)), (A8) $\ddot e(t)+\eta\dot e(t)+\frac{\eta}{\|A\|^2}\frac{d\|\dot x(t)\|^2}{dt}+\left(\frac{\eta^2}{\|A\|^2}-1\right)\|\dot x(t)\|^2+\frac{1}{\|A\|^2}\|\ddot x(t)\|^2\le 0$.

By the assumption $\eta\ge\|A\|$, we deduce that (A9) $\ddot e(t)+\eta\dot e(t)+\frac{\eta}{\|A\|^2}\frac{d\|\dot x(t)\|^2}{dt}\le 0$, which means that the function $t\mapsto\dot e(t)+\eta e(t)+(\eta/\|A\|^2)\|\dot x(t)\|^2$ is monotonically decreasing. Hence, a real number $C$ exists such that (A10) $\dot e(t)+\eta e(t)+\frac{\eta}{\|A\|^2}\|\dot x(t)\|^2\le C$, which implies $\dot e(t)+\eta e(t)\le C$. Multiplying this inequality by $e^{\eta t}$ and then integrating from $0$ to $T$, we obtain the inequality $e(T)\le e(0)e^{-\eta T}+C(1-e^{-\eta T})/\eta\le e(0)+C/\eta$. Hence, $e(\cdot)$ is uniformly bounded and, consequently, the trajectory $x(\cdot)$ is uniformly bounded.

(ii) follows from Lemma 3.1.

Now, we prove assertion (iv). Define (A11) $h(t)=\frac{\eta}{2}\|x(t)-x^\dagger\|^2+\langle\dot x(t),x(t)-x^\dagger\rangle$. By elementary calculations, we derive that $\dot h(t)=\eta\langle\dot x(t),x(t)-x^\dagger\rangle+\langle\ddot x(t),x(t)-x^\dagger\rangle+\|\dot x(t)\|^2=\|\dot x(t)\|^2-\|Ax(t)-y\|^2$, which implies (by noting $\dot E(t)=-2\eta\|\dot x(t)\|^2$) that $\dot E(t)+\eta E(t)+\eta\dot h(t)=0$.

Integrating the above equation on $[0,T]$, together with the nonnegativity of $E(t)$, we obtain (A12) $\int_0^TE(t)\,dt=\frac{1}{\eta}\left(E(0)-E(T)\right)-\left(h(T)-h(0)\right)\le\frac{1}{\eta}E(0)+h(0)-h(T)$.

On the other hand, since both $x(t)$ and $x^\dagger$ are uniformly bounded, and $\dot x(t)\to 0$ as $t\to\infty$, a constant $M$ exists such that $|h(t)|\le M$. Hence, letting $T\to\infty$ in (A12), we obtain (A13) $\int_0^\infty E(t)\,dt<\infty$.

Since $E(t)$ is non-increasing, we deduce that (A14) $\int_{T/2}^TE(t)\,dt\ge\frac{T}{2}E(T)$. Using (A13), the left-hand side of (A14) tends to $0$ as $T\to\infty$, which implies that $\lim_{T\to\infty}TE(T)=0$. Hence, we conclude $\lim_{T\to\infty}T\|Ax(T)-y\|^2=0$, which yields the desired result in (iv).

Finally, let us show the long-term behaviour of $\ddot x(\cdot)$. Integrating the inequality (A8) from $0$ to $T$, we obtain that a real number $C$ exists such that for every $T\in[0,\infty)$ (A15) $\dot e(T)+\eta e(T)+\frac{\eta}{\|A\|^2}\|\dot x(T)\|^2+\left(\frac{\eta^2}{\|A\|^2}-1\right)\int_0^T\|\dot x(t)\|^2\,dt+\frac{1}{\|A\|^2}\int_0^T\|\ddot x(t)\|^2\,dt\le C$. Since both $e(\cdot)$ and $\dot e(\cdot)$ are globally bounded (note that $x(\cdot),\dot x(\cdot)\in L^\infty([0,\infty),X)$), inequality (A15) gives $\ddot x(\cdot)\in L^2([0,\infty),X)$. The relations $\ddot x(\cdot)\in L^\infty([0,\infty),X)$ and $\ddot x(t)\to 0$ as $t\to\infty$ are obvious by noting assertions (i), (ii), (iv) and the evolution equation (4).

Appendix 2. Convergence analysis of SOAR for the case $\eta\in(0,2\|A\|]$

B.1 The underdamped case: $2\sigma_{j_0+1}<\eta<2\sigma_{j_0}$

In this case, the solution to the second-order differential equation (4) reads $x(t)=(I-A^*Ag^{ab}(t,A^*A))x_0+\varphi^{ab}(t,A^*A)\dot x_0+g^{ab}(t,A^*A)A^*y^\delta$, where $g^{ab}(t,\lambda)=g(t,\lambda)$ if $\lambda<\eta^2/4$ and $g^{ab}(t,\lambda)=g^b(t,\lambda)$ if $\lambda>\eta^2/4$, and analogously $\varphi^{ab}(t,\lambda)=\varphi(t,\lambda)$ if $\lambda<\eta^2/4$ and $\varphi^{ab}(t,\lambda)=\varphi^b(t,\lambda)$ if $\lambda>\eta^2/4$, with $g(t,\lambda)$ and $\varphi(t,\lambda)$ defined in (24), and (B1) $g^b(t,\lambda)=\frac{1}{\lambda}\left[1-e^{-(\eta/2)t}\left(\frac{\eta}{\sqrt{4\lambda-\eta^2}}\sin\frac{\sqrt{4\lambda-\eta^2}}{2}t+\cos\frac{\sqrt{4\lambda-\eta^2}}{2}t\right)\right]$, $\varphi^b(t,\lambda)=\frac{2}{\sqrt{4\lambda-\eta^2}}\,e^{-(\eta/2)t}\sin\frac{\sqrt{4\lambda-\eta^2}}{2}t$.

As in the overdamped case, we define (B2) $g_\alpha^{ab}(\lambda)=g^{ab}(1/\alpha,\lambda)$ and $\varphi_\alpha^{ab}(\lambda)=\varphi^{ab}(1/\alpha,\lambda)$. In this case, the corresponding bias function becomes $r_\alpha^{ab}(\lambda)=r_\alpha(\lambda)$ if $\lambda<\eta^2/4$ and $r_\alpha^{ab}(\lambda)=r_\alpha^b(\lambda):=e^{-\eta/(2\alpha)}\left(\frac{\eta}{\sqrt{4\lambda-\eta^2}}\sin\frac{\sqrt{4\lambda-\eta^2}}{2\alpha}+\cos\frac{\sqrt{4\lambda-\eta^2}}{2\alpha}\right)$ if $\lambda>\eta^2/4$, where $r_\alpha(\lambda)$ is given in (26).

Theorem B.1

The functions $\{g_\alpha^{ab}(\lambda),\varphi_\alpha^{ab}(\lambda)\}$, defined in (B2), satisfy the conditions (i)–(iii) of Proposition 2.1.

Proof.

The first requirement in Proposition 2.1 is obvious. Furthermore, using the inequalities $|\sin\xi|\le|\xi|$ and $e^{-\xi}\le(\xi+1)^{-1}$, we obtain (B3) $\left|e^{-\eta/(2\alpha)}\left(\frac{\eta}{\sqrt{4\lambda-\eta^2}}\sin\frac{\sqrt{4\lambda-\eta^2}}{2\alpha}+\cos\frac{\sqrt{4\lambda-\eta^2}}{2\alpha}\right)\right|\le e^{-\eta/(2\alpha)}\left(\frac{\eta}{2\alpha}+1\right)\le 1$, which implies the second condition in Proposition 2.1: $|r_\alpha^{ab}(\lambda)|\le\gamma_1^{ab}$ with $\gamma_1^{ab}:=\max\left\{\frac{\eta}{2\sqrt{\eta^2-4\sigma_{j_0+1}^2}}+\frac12,\ 1\right\}$. Similarly, we have (B4) $|\varphi_\alpha^b(\lambda)|=\frac{2}{\sqrt{4\lambda-\eta^2}}\,e^{-\eta/(2\alpha)}\left|\sin\frac{\sqrt{4\lambda-\eta^2}}{2\alpha}\right|\le\frac{1}{\alpha}e^{-\eta/(2\alpha)}\le\frac{2}{e\eta}$, which means that $|\varphi_\alpha^{ab}(\lambda)|\le\gamma_2^{ab}$ with $\gamma_2^{ab}:=\max\left\{\frac{1}{\sqrt{\eta^2-4\sigma_{j_0+1}^2}},\ \frac{2}{e\eta}\right\}$.

Now, let us check the third condition in Proposition 2.1. Using the inequality (B3), we obtain for $\lambda>\eta^2/4$ that $\sqrt{\lambda}\,|g_\alpha^b(\lambda)|=\frac{1}{\sqrt{\lambda}}\left|1-e^{-\eta/(2\alpha)}\left(\frac{\eta}{\sqrt{4\lambda-\eta^2}}\sin\frac{\sqrt{4\lambda-\eta^2}}{2\alpha}+\cos\frac{\sqrt{4\lambda-\eta^2}}{2\alpha}\right)\right|\le\frac{2}{\sqrt{\lambda}}\le\frac{4}{\eta}$. Hence, in the case $\lambda>\eta^2/4$ and $\alpha\le\eta^2$, we have $\sqrt{\lambda}\,|g_\alpha^b(\lambda)|\le 4/\eta\le 4/\sqrt{\alpha}$.

Finally, by defining (B5) $\gamma_*^{ab}=\max\left\{\sqrt{\eta/(\eta^2-4\sigma_{j_0+1}^2)},\ 4\right\}$, we can deduce that $\sqrt{\lambda}\,|g_\alpha^{ab}(\lambda)|\le\gamma_*^{ab}/\sqrt{\alpha}$ for $\alpha\in(0,\bar\alpha]$ with $\bar\alpha=\eta^2$.

Proposition B.1

For all exponents $p>0$ the monomials $\psi(\lambda)=\lambda^p$ are qualifications with the constant $\gamma^{ab}=\max\{\gamma,\gamma^b\}$ for the SOAR method in the underdamped case, where $\gamma$ is defined in (29) and (B6) $\gamma^b:=\frac{\eta+2\|A\|^2}{2}\left(\frac{2(p+1)}{e\eta}\right)^{p+1}\|A\|^{2p}$.

Proof.

By Proposition 4.1, we only need to show that (B7) $\sup_{\lambda\in(\eta^2/4,\|A\|^2]}|r_\alpha^b(\lambda)|\lambda^p\le\gamma^b\alpha^p$ and $\sup_{\lambda\in(\eta^2/4,\|A\|^2]}|\varphi_\alpha^b(\lambda)|\lambda^p\le\gamma^b\alpha^p$ for all $\alpha\in(0,\|A\|^2]$. Setting $\xi=\eta/2$ and replacing $p$ by $p+1$ in (14), we obtain (B8) $e^{-\eta/(2\alpha)}\le\left(\frac{2(p+1)}{e\eta}\right)^{p+1}\alpha^{p+1}$. Then, using (B3) and (B8), we can derive for all $\alpha\in(0,\|A\|^2]$ that
$\left|e^{-\eta/(2\alpha)}\left(\frac{\eta}{\sqrt{4\lambda-\eta^2}}\sin\frac{\sqrt{4\lambda-\eta^2}}{2\alpha}+\cos\frac{\sqrt{4\lambda-\eta^2}}{2\alpha}\right)\right|\lambda^p\le e^{-\eta/(2\alpha)}\left(\frac{\eta}{2\alpha}+1\right)\lambda^p\le e^{-\eta/(2\alpha)}\,\frac{\eta+2\|A\|^2}{2\alpha}\,\|A\|^{2p}\le\left(\frac{2(p+1)}{e\eta}\right)^{p+1}\alpha^{p+1}\,\frac{\eta+2\|A\|^2}{2\alpha}\,\|A\|^{2p}=\gamma^b\alpha^p,$
which yields the first inequality in (B7).

Finally, from (B4) and the above result, we can deduce for all $\alpha\in(0,\|A\|^2]$ that $|\varphi_\alpha^b(\lambda)|\lambda^p\le e^{-\eta/(2\alpha)}\,\frac{1}{\alpha}\,\lambda^p\le e^{-\eta/(2\alpha)}\left(\frac{\eta}{2\alpha}+1\right)\lambda^p\le\gamma^b\alpha^p$, which completes the proof.

B.2 The critical damping case: $\eta=2\sigma_{j_0}$

In this case, the solution of (4) is $x(t)=(I-A^*Ag^{abc}(t,A^*A))x_0+\varphi^{abc}(t,A^*A)\dot x_0+g^{abc}(t,A^*A)A^*y^\delta$, where $g^{abc}(t,\lambda)=g(t,\lambda)$ if $\lambda<\eta^2/4$, $g^{abc}(t,\lambda)=g^b(t,\lambda)$ if $\lambda>\eta^2/4$, and $g^{abc}(t,\lambda)=g^c(t,\lambda)$ if $\lambda=\eta^2/4$; analogously, $\varphi^{abc}(t,\lambda)$ equals $\varphi(t,\lambda)$, $\varphi^b(t,\lambda)$ or $\varphi^c(t,\lambda)$ in the respective cases, with $g^c(t,\lambda):=\frac{4}{\eta^2}\left\{1-e^{-(\eta/2)t}\left(\frac{\eta}{2}t+1\right)\right\}$ and $\varphi^c(t,\lambda):=te^{-(\eta/2)t}$.

Define (B9) $g_\alpha^{abc}(\lambda)=g^{abc}(1/\alpha,\lambda)$ and $\varphi_\alpha^{abc}(\lambda)=\varphi^{abc}(1/\alpha,\lambda)$, and obtain the corresponding bias function $r_\alpha^{abc}(\lambda)=r_\alpha(\lambda)$ if $\lambda<\eta^2/4$, $r_\alpha^{abc}(\lambda)=r_\alpha^b(\lambda)$ if $\lambda>\eta^2/4$, and $r_\alpha^{abc}(\lambda)=r_\alpha^c(\lambda):=e^{-\eta/(2\alpha)}\left(\frac{\eta}{2\alpha}+1\right)$ if $\lambda=\eta^2/4$.

Theorem B.2

The functions $\{g_\alpha^{abc}(\lambda),\varphi_\alpha^{abc}(\lambda)\}$, given in (B9), satisfy the conditions (i)–(iii) of Proposition 2.1.

Proof.

By Theorem B.1, we only need to check the case $\lambda=\eta^2/4$. In this case, it is easy to verify that $\lim_{\alpha\to 0}\varphi_\alpha^c(\lambda)=\lim_{\alpha\to 0}e^{-\eta/(2\alpha)}/\alpha=0$ and $\lim_{\alpha\to 0}r_\alpha^c(\lambda)=\lim_{\alpha\to 0}e^{-\eta/(2\alpha)}(\eta/(2\alpha)+1)=0$, as well as $|\varphi_\alpha^c(\lambda)|\le 2/(e\eta)$ and $|r_\alpha^c(\lambda)|\le 1$ for all $\alpha>0$. Finally, by the inequality (assuming $\alpha\le\eta^2$) $\frac{1}{\sqrt{\lambda}}\left\{1-e^{-\eta/(2\alpha)}\left(\frac{\eta}{2\alpha}+1\right)\right\}\le\frac{2}{\sqrt{\lambda}}=\frac{4}{\eta}\le\frac{4}{\sqrt{\alpha}}$ and Theorem B.1, we complete the proof: $\sqrt{\lambda}\,|g_\alpha^{abc}(\lambda)|\le\gamma_*^{ab}/\sqrt{\alpha}$ for $\alpha\in(0,\bar\alpha]$ with $\bar\alpha=\eta^2$, where $\gamma_*^{ab}$ is defined in (B5).

Proposition B.2

For all exponents $p>0$ the monomials $\psi(\lambda)=\lambda^p$ are qualifications with the constant $\gamma^{abc}=\max\{\gamma,\gamma^b,\gamma^c\}$ for the SOAR method in the critical damping case, where (B10) $\gamma^c:=\frac{\eta+2\|A\|^2}{2}\left(\frac{p+1}{e}\right)^{p+1}\left(\frac{\eta}{2}\right)^{p-1}\max\left\{\frac{2}{\eta},\ 1\right\}$, and the constants $\gamma$ and $\gamma^b$ are defined in (29) and (B6), respectively.

Proof.

By Propositions 4.1 and B.1, we only need to show that for all $\alpha\in(0,\|A\|^2]$ $|r_\alpha^c(\eta^2/4)|(\eta^2/4)^p\le\gamma^c\alpha^p$ and $|\varphi_\alpha^c(\eta^2/4)|(\eta^2/4)^p\le\gamma^c\alpha^p$. By (B8) and elementary calculations, we derive that
$r_\alpha^c\!\left(\frac{\eta^2}{4}\right)\left(\frac{\eta^2}{4}\right)^p=e^{-\eta/(2\alpha)}\left(\frac{\eta}{2\alpha}+1\right)\left(\frac{\eta^2}{4}\right)^p\le\left(\frac{2(p+1)}{e\eta}\right)^{p+1}\alpha^{p+1}\,\frac{\eta+2\|A\|^2}{2\alpha}\left(\frac{\eta^2}{4}\right)^p=\frac{\eta+2\|A\|^2}{2}\left(\frac{p+1}{e}\right)^{p+1}\left(\frac{\eta}{2}\right)^{p-1}\alpha^p\le\gamma^c\alpha^p$
and
$\varphi_\alpha^c\!\left(\frac{\eta^2}{4}\right)\left(\frac{\eta^2}{4}\right)^p=e^{-\eta/(2\alpha)}\,\frac{1}{\alpha}\left(\frac{\eta^2}{4}\right)^p\le\frac{2}{\eta}\,e^{-\eta/(2\alpha)}\left(\frac{\eta}{2\alpha}+1\right)\left(\frac{\eta^2}{4}\right)^p\le\gamma^c\alpha^p,$
which yields the required result.