
Monotonicity of error of regularized solution and its use for parameter choice

Pages 10-30 | Received 10 Jul 2013, Accepted 13 Jul 2013, Published online: 20 Aug 2013

Abstract

We consider an ill-posed equation in a Hilbert space with a noisy operator and a noisy right-hand side. The noise level information is given in a general form, as the norm of a certain operator applied to the noise. We derive the monotone error rule (ME-rule) for the choice of the regularization parameter in many methods, giving a parameter such that the error increases monotonically for larger parameters in the Tikhonov method and for smaller stopping indices in iteration methods. The regularization methods considered include Y-scale regularization in the (iterated) Tikhonov method and in iteration methods (the Landweber method, CG-type methods, semi-iterative methods). We also consider modifications of the ME-rule and show in numerical experiments (test problems from Hansen’s Regularization Toolbox, including the sideways heat equation) their advantages over the discrepancy principle.


Introduction

In this paper, we consider linear ill-posed problems
$$A_0x = y_0, \qquad y_0 \in R(A_0), \tag{1}$$
where $A_0$ is a bounded linear operator with non-closed range $R(A_0)$ and $X$, $Y$ are infinite-dimensional real Hilbert spaces with inner products $(\cdot,\cdot)$ and norms $\|\cdot\|$. We are interested in the minimum-norm solution $x_*$ of problem (1) and assume that instead of the exact data $y_0$ and $A_0$, we are given noisy data $y \in Y$ and $A \in \mathcal{L}(X,Y)$ with
$$\|y_0 - y\| \le \delta, \qquad \|A_0 - A\| \le h \tag{2}$$
and known noise levels $\delta$, $h$. Later, we also consider the case of noise level information in the more general form (4).

Ill-posed problems (1) arise in a wide variety of problems in the applied sciences. For their stable numerical solution, regularization methods are necessary, see [Citation1, Citation2]. Regularization methods include Tikhonov regularization $x_r = (A^*A + r^{-1}I)^{-1}A^*y$ with regularization parameter $r \in \mathbb{R}^+$, and iterative and projection methods, where the stopping index $r = n \in \mathbb{N}$ is the regularization parameter. Here, $A^* \in \mathcal{L}(Y,X)$ is the adjoint of $A \in \mathcal{L}(X,Y)$. Traditional regularization methods possess the property that in the case of exact data the error $\|x_r^0 - x_*\|$ of the regularized solution $x_r^0$, as a function of $r$, is monotonically decreasing in $r$. This property is no longer true for the error $\|x_r - x_*\|$ in the case of noisy data. The monotone decrease of the error $\|x_r - x_*\|$ for growing $r$ can only be guaranteed for small $r$. Typically, $\|x_r - x_*\|$ diverges as $r \to \infty$. Therefore, a rule for the proper choice of the regularization parameter $r$ is necessary.

In the monotone error rule (ME-rule) for choosing a proper regularization parameter, the idea consists in searching for the largest computable regularization parameter $r = r_{ME}$ for which we can guarantee the ME-property: the error $\|x_r - x_*\|$ is monotonically decreasing for $r \in (0, r_{ME}]$. For continuous regularization methods, this means that
$$\frac{d}{dr}\|x_r - x_*\|^2 \le 0 \quad \text{for all } r \in (0, r_{ME}];$$
for iteration methods and other methods with $r = n \in \mathbb{N}$, this means that
$$\|x_n - x_*\| \le \|x_{n-1} - x_*\|, \qquad n = 1, 2, \ldots, n_{ME}. \tag{3}$$
From the derivation of the ME-rule for concrete regularization methods, one can see for which perturbations of the operator and the right-hand side the ME-rule gives the optimal regularization parameter.

Consider now the case of the exact operator. In theoretical works on ill-posed problems, often the worst-case error $\sup\{\|x_r - x_*\| : y \in Y,\ Ax_* = y_0,\ \|y - y_0\| \le \delta\}$ is considered. Here, the exact data $y_0$ are fixed and the supremum is taken over the noisy data $y \in Y$. In applications, the exact data $y_0$ are unknown and the noisy data $y$ are known. Then we are interested in finding the parameter $r$ which minimizes the analogue of the worst-case error $\sup\{\|x_r - \tilde x\| : \tilde y \in Y,\ A\tilde x = \tilde y,\ \|\tilde y - y\| \le \delta\}$, where the given data $y$ are fixed and the supremum is taken over “candidates of exact data” $\tilde y \in Y$. This analogue of the worst-case error is minimized by the parameter $r_{ME}$. This can be seen from the derivation of the ME-rule for concrete regularization methods: for $r \in (0, r_{ME}]$, the error $\|x_r - \tilde x\|$ is decreasing for all “candidates of exact data” $\tilde y \in Y$ with $A\tilde x = \tilde y$, but for $r > r_{ME}$ there exists $\tilde y \in Y$ with $A\tilde x = \tilde y$ for which $\|x_r - \tilde x\| > \|x_{r_{ME}} - \tilde x\|$.

All regularization methods considered in this paper have the property $x_r \in R(A^*)$, because we need this property for the formulation of the ME rule.

The ME-rule was proposed and studied for continuous methods in [Citation3–Citation8] and for iterative methods in [Citation4, Citation7, Citation8]; analogous rules were proposed in [Citation9, Citation10]. In this paper, we extend these results in the following directions: we consider the case of a noisy operator ($h > 0$), the noise level information may be given in the general form (4), and we derive the ME-rule for a more general class of regularization methods (including regularization in the Y-scale [Citation11], also for iterative and semi-iterative methods). We also give convergence results and error estimates for the ME-rule.

In practical applications, often several regularization parameters are selected and the corresponding regularized solutions are studied online. With our ME-rule, we provide some help for the selection of suitable values of $r$, since we can guarantee that $\|x_{r_{ME}} - x_*\| \le \|x_r - x_*\|$ holds for $r < r_{ME}$. This information may also be used for improving other a posteriori rules R for the choice of the regularization parameter, since the error for the parameter $\max(r_R, r_{ME})$ is less than or equal to the error for rule R. Unfortunately, this observation does not help to improve the discrepancy principle, which gives a parameter $r_D \le r_{ME}$, and for a smooth solution $r_D$ is often too large in regularization methods with finite qualification. In the (iterated) Tikhonov method, the ME-rule is, in contrast to the discrepancy principle, a quasi-optimal parameter choice.

Note that typically the error of the regularized solution decreases monotonically also somewhat further, up to some $r_{opt} \ge r_{ME}$. Our numerical experiments suggest using $r_{MEe} = c\,r_{ME}$ (or its integer part) with a certain $c > 1$.

The plan of this paper is as follows. In Sections 2 and 3, well-known regularization methods and parameter choice rules are introduced, using the noise-level information (4). In Sections 4 and 5, the ME-rule is derived for the continuous regularization methods and for the iterative methods, respectively. In Section 6, convergence conditions and quasi-optimality results for the ME-rule are given. The paper closes with extensive numerical experiments.

Well-known regularization methods in Y-scale

We assume that the noise level information is given in the form
$$\|D(y_0 - y)\| \le \delta, \qquad \|D(A_0 - A)\| \le h, \tag{4}$$
where $D$ is a linear, injective, possibly unbounded operator in $Y$ with domain $\mathcal{D}(D)$. We assume that $y_0, y \in \mathcal{D}(D)$. The standard case (2) is the special case $D = I$, where $I$ is the identity operator. In [Citation12], the operator $D = L^t$ with $t \in \mathbb{R}$ was used. Here, the operator $L$ is a densely defined, unbounded, self-adjoint and strictly positive operator in the space $Y$, inducing the Hilbert scale $Y_s$ with norm $\|y\|_s := \|L^sy\|$. In the standard case (2), $t = 0$. The case $t < 0$ corresponds to large noise, the case $t > 0$ to small noise. The noise-level information (4) with different operators $D$ (only the case $h = 0$ was considered) was used in the works [Citation13–Citation16].

In many papers, the $X_s$-scale is considered, with the generating operator $L$ in the space $X$; the equation $Ax = y$ is transformed into the equation $L^{-2s}A^*Ax = L^{-2s}A^*y$ and the regularized solutions are constructed in the form $x_r = g_r(L^{-2s}A^*A)L^{-2s}A^*y$. Here, $r$ is the regularization parameter and the function $g_r(\lambda)$ satisfies the conditions
$$\sup_{0\le\lambda\le\Lambda}\sqrt{\lambda}\,|g_r(\lambda)| \le \gamma\sqrt{r}, \qquad r \ge 0, \tag{5}$$
$$\sup_{0\le\lambda\le\Lambda}\lambda^p|1 - \lambda g_r(\lambda)| \le \gamma_p r^{-p}, \qquad r \ge 0, \quad 0 \le p \le p_0. \tag{6}$$
Here, $p_0$, $\gamma$ and $\gamma_p$ are positive constants, $\Lambda$ is at least the norm of the operator in the argument of the function $g_r$, $\gamma_0 = 1$, and the greatest value of $p_0$ for which the inequality (6) holds is called the qualification of the method. We can formulate the ME rule for the choice of the parameter $r$ only for regularized solutions $x_r \in R(A^*)$. Therefore, we prefer to use the Y-scale instead of the X-scale. The Y-scale regularization was proposed by Egger [Citation11, Citation12]. Here, the equation $Ax = y$ is transformed into the equation
$$Bx = \bar y, \qquad B = L^{-s}A, \qquad \bar y = L^{-s}y,$$
assuming that $B : X \to Y$ is bounded and $\bar y \in Y$. If $s \ge 0$ this is always satisfied; if $s < 0$, this means that $R(A) \subset Y_{-s}$ and $y \in Y_{-s}$. The regularized solutions are constructed in the form
$$x_r = g_r(B^*B)B^*\bar y = B^*g_r(BB^*)\bar y = A^*w_r, \qquad w_r = L^{-s}g_r(BB^*)\bar y. \tag{7}$$
Special cases of regularization methods of this form are the following well-known methods (cf. [Citation1, Citation2, Citation17]); a small numerical sketch of two of them follows the list.

  1. The Tikhonov method $x_\alpha = (\alpha I + B^*B)^{-1}B^*\bar y$. Here, $r = \alpha^{-1}$, $g_r(\lambda) = (\lambda + r^{-1})^{-1}$, $p_0 = 1$, $\gamma = 1/2$, $\gamma_p = p^p(1-p)^{1-p}$.

  2. Iterative variants of the Tikhonov method. Let $m \in \mathbb{N}$, $m \ge 1$, $x_{0,\alpha} \in X$ be an initial approximation and
$$x_{n,\alpha} = (\alpha I + B^*B)^{-1}(\alpha x_{n-1,\alpha} + B^*\bar y), \qquad n = 1, 2, \ldots, m.$$
Here $r = \alpha^{-1}$, $g_r(\lambda) = \frac{1}{\lambda}\left(1 - \left(\frac{1}{1+r\lambda}\right)^m\right)$, $p_0 = m$, $\gamma = \sqrt{m}$, $\gamma_p = \left(\frac{p}{m}\right)^p\left(1 - \frac{p}{m}\right)^{m-p}$.

  3. Explicit iteration scheme (Landweber’s method). Let $x_0 = 0$,
$$x_n = x_{n-1} - \mu(B^*Bx_{n-1} - B^*\bar y), \qquad \mu \in (0, 1/\|B^*B\|), \quad n = 1, 2, \ldots$$
Here, $r = n$, $g_r(\lambda) = \frac{1}{\lambda}\left(1 - (1-\mu\lambda)^r\right)$, $p_0 = \infty$, $\gamma = \sqrt{\mu}$, $\gamma_p = \left(\frac{p}{\mu e}\right)^p$.

  4. Implicit iteration scheme. Let $\alpha > 0$ be a constant, $x_0 = 0$ and
$$x_n = (\alpha I + B^*B)^{-1}(\alpha x_{n-1} + B^*\bar y), \qquad n = 1, 2, \ldots$$
Here, $r = n$, $g_r(\lambda) = \frac{1}{\lambda}\left(1 - \left(\frac{\alpha}{\alpha+\lambda}\right)^r\right)$, $p_0 = \infty$, $\gamma = b_0/\sqrt{\alpha}$, where $b_0 = \sup_{0<\lambda<\infty}\lambda^{-1/2}(1 - e^{-\lambda}) \approx 0.6382$, and $\gamma_p = (\alpha p)^p$.

  5. The method of asymptotical regularization: the approximation $x_r$ solves the Cauchy problem
$$x'(r) + B^*Bx(r) = B^*\bar y, \qquad x(0) = 0.$$
Here, $g_r(\lambda) = \frac{1}{\lambda}\left(1 - e^{-r\lambda}\right)$, $p_0 = \infty$, $\gamma = b_0$, $\gamma_p = (p/e)^p$.
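As an illustration of this filter-function framework, the following minimal sketch implements methods M1 and M3 in the standard case $s = 0$ (so that $B = A$) with dense linear algebra. It is a sketch under these assumptions, not the authors' implementation; the function names are ours.

```python
import numpy as np

def tikhonov(A, y, alpha):
    """Method M1 with s = 0 (B = A): x_alpha = (alpha I + A^T A)^{-1} A^T y."""
    n = A.shape[1]
    return np.linalg.solve(alpha * np.eye(n) + A.T @ A, A.T @ y)

def landweber(A, y, mu, n_iter):
    """Method M3 with s = 0: x_n = x_{n-1} - mu (A^T A x_{n-1} - A^T y),
    with mu in (0, 1/||A^T A||) as stated in the text."""
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        x = x - mu * (A.T @ (A @ x) - A.T @ y)
    return x
```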

In the iterated Tikhonov method, we may use a different parameter $\alpha_n$ at every iteration step $n$; then we obtain the same approximation as in the extrapolated Tikhonov method [Citation18, Citation19]
$$x_{\alpha_1,\ldots,\alpha_m} = \sum_{i=1}^m c_ix_{\alpha_i}, \qquad c_i = \prod_{\substack{j=1\\ j\ne i}}^m\frac{\alpha_j}{\alpha_j - \alpha_i}, \tag{8}$$
where $x_{\alpha_i}$, $i = 1, \ldots, m$, are the approximations of the Tikhonov method with the parameters $\alpha_i$.

Note that the regularized solutions in the $X_s$-scale and $Y_s$-scale variants of the Tikhonov method have the form
$$x_\alpha = (\alpha I + L^{-2s}A^*A)^{-1}L^{-2s}A^*y, \qquad x_\alpha = (\alpha I + A^*L^{-2s}A)^{-1}A^*L^{-2s}y, \tag{9}$$
and they minimize the corresponding functionals
$$\|Ax - y\|^2 + \alpha\|L^sx\|^2, \qquad \|L^{-s}(Ax - y)\|^2 + \alpha\|x\|^2.$$
If the operator $L$ acts in both spaces $X$ and $Y$ and the operators $A$ and $L$ commute, then these regularized approximations coincide. Of these two approximations, the first one is much more widespread (cf. [Citation20]). For the operators $L$ and $L^{-1}$, one may use, for example, differentiation and integration operators. The fractional powers $L^s$ and $L^{-s}$ can be implemented efficiently, e.g. via FFT or multi-level techniques.

Positive values of $s$ are good for delaying saturation in the Tikhonov and iterated Tikhonov methods [Citation12]; negative values of $s$ are good for preconditioning in the iteration methods, which drastically decreases the number of iteration steps [Citation11].

Well-known rules for choice of the regularization parameter

For the choice of the regularization parameter in any regularization method under the noise information (4), some estimate or approximation of the value $\|D(y - Ax_*)\|$ is needed. This value may be estimated as follows:
$$\|D(y - Ax_*)\| = \|D(y - y_0 + (A_0 - A)x_*)\| \le \delta + h\|x_*\| =: \Delta_1. \tag{10}$$
Typically, the exact value of $\|x_*\|$ is unknown. Substituting $\|x_*\|$ by some upper bound $M \ge \|x_*\|$ gives the rougher estimate $\|D(y - Ax_*)\| \le \delta + hM =: \Delta_2$. If an upper bound $M$ is not available, the approximation of $\|D(y - Ax_*)\|$ by $\Delta_3 := \delta + h\|x_\alpha\|$ (in iteration methods $\Delta_3 := \delta + h\|x_n\|$) may be used.

In the following, we formulate some a posteriori rules for choosing the regularization parameter which use $\Delta \in \{\Delta_1, \Delta_2, \Delta_3\}$ and are well known in the case $D = I$, $s = 0$. We assume $R(AA^*) \subset \mathcal{D}(D)$.

  1. Discrepancy principle [Citation1, Citation2, Citation17, Citation21]. In the continuous regularization methods, this principle (D principle) chooses the parameter $\alpha = \alpha_D$ as the solution of the equation
$$d_D(\alpha) := \|D(y - Ax_\alpha)\| = C\Delta \quad\text{with } C \ge 1. \tag{11}$$
In iteration methods, the discrepancy principle finds $n_D$ as the first index for which $\|D(y - Ax_n)\| \le C\Delta$.

  2. Modified discrepancy principle [Citation22]. In this rule for the methods M1, M2, the parameter $\alpha = \alpha_{MD}$ is chosen as the solution of the equation
$$d_{MD}(\alpha) := \|DL^{s}(I - g_\alpha(BB^*)BB^*)^{1/(2p_0)}L^{-s}(y - Ax_\alpha)\| = C\Delta \tag{12}$$
with $C \ge 1$, assuming $Y_s \subset \mathcal{D}(D)$ (if $D = L^t$, then a sufficient condition is $s \ge t$). Here, $p_0$ is the qualification of the regularization method, see (6). One may also use the alternative MD′ rule
$$d_{MD'}(\alpha) = \left(D(y - Ax_\alpha),\; DL^{s}(I - g_\alpha(BB^*)BB^*)^{1/p_0}L^{-s}(y - Ax_\alpha)\right)^{1/2} = C\Delta. \tag{13}$$

In the case $D = L^{-s}$, the functions $d_{MD}(\alpha)$ and $d_{MD'}(\alpha)$ coincide. For regularization methods with $p_0 = \infty$, both these forms of the modified discrepancy principle coincide with the discrepancy principle.
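As a small illustration, the discrepancy principle (11) can be realized for the Tikhonov method in the standard case $s = 0$, $D = I$, $h = 0$ by searching a geometric grid of parameters, in the spirit of the $\alpha$-sequence used in the numerical section below. This is a hedged sketch with our own function name, not the authors' code.

```python
import numpy as np

def discrepancy_principle_tikhonov(A, y, delta, C=1.0, q=0.9, alpha0=1.0, max_steps=500):
    """Take the first alpha on the grid alpha0 * q**i (i = 0, 1, ...) with
    ||y - A x_alpha|| <= C * delta, where x_alpha is the Tikhonov solution."""
    n = A.shape[1]
    alpha, x = alpha0, np.zeros(n)
    for _ in range(max_steps):
        x = np.linalg.solve(alpha * np.eye(n) + A.T @ A, A.T @ y)
        if np.linalg.norm(y - A @ x) <= C * delta:
            break
        alpha *= q
    return alpha, x
```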

ME-rule for the continuous regularization methods

Derivation of the ME-rule

Let us consider the ME-rule for the continuous regularization methods. A prominent example of these methods is the Tikhonov method, where the regularization parameter is traditionally denoted by $\alpha$. Therefore, in this section we use the notation $\alpha = 1/r$ instead of $r$. We assume that the corresponding function $g_\alpha(\lambda)$ is differentiable with respect to $\alpha$.

Let us consider the regularized approximation
$$x_\alpha = A^*w_\alpha, \qquad z_\alpha := \frac{d}{d\alpha}w_\alpha \in Y, \tag{14}$$
assuming that $z_\alpha \in \mathcal{D}((D^{-1})^*)$. An example of a regularized approximation of this form is the approximation $x_\alpha = A^*w_\alpha$ with $w_\alpha = L^{-s}g_\alpha(BB^*)\bar y$ from (7). The reformulation of the general idea of the ME-rule in terms of $\alpha = 1/r$ means: search for the smallest computable regularization parameter $\alpha = \alpha_{ME}$ for which we can guarantee that
$$\frac{d}{d\alpha}\|x_\alpha - x_*\|^2 \ge 0 \quad\text{for all } \alpha \in [\alpha_{ME}, \infty). \tag{15}$$
In order to guarantee this property, we estimate the derivative of the squared error $\|x_\alpha - x_*\|^2$ with respect to $\alpha$ under condition (10) as follows:
$$\frac12\frac{d}{d\alpha}\|x_\alpha - x_*\|^2 = (x_\alpha - x_*, A^*z_\alpha) = (D(Ax_\alpha - y + y - Ax_*), (D^{-1})^*z_\alpha) \ge \|(D^{-1})^*z_\alpha\|\left(\frac{(Ax_\alpha - y, z_\alpha)}{\|(D^{-1})^*z_\alpha\|} - (\delta + h\|x_*\|)\right). \tag{16}$$
This estimate leads us to the following ME-rule for the continuous regularization method (14).

ME rule. Choose $\alpha = \alpha_{ME}$ as the largest solution of the equation
$$d_{ME}(\alpha) := \frac{(Ax_\alpha - y, z_\alpha)}{\|(D^{-1})^*z_\alpha\|} = \Delta \tag{17}$$
with $\Delta \in \{\Delta_1, \Delta_2\}$. If $d_{ME}(\alpha) > \Delta$ for all $\alpha > 0$, take $\alpha_{ME} = 0$. If $d_{ME}(\alpha) < \Delta$ for all $\alpha > 0$, take $x_\alpha = 0$ (this corresponds to $\alpha = \alpha_{ME} = \infty$). For the regularized approximation (7), this equation has the form
$$d_{ME}(\alpha) := \frac{\left(Ax_\alpha - y,\; L^{-s}\frac{d}{d\alpha}g_\alpha(BB^*)L^{-s}y\right)}{\left\|(D^{-1})^*L^{-s}\frac{d}{d\alpha}g_\alpha(BB^*)L^{-s}y\right\|} = \Delta. \tag{18}$$
Here, the assumption $z_\alpha \in \mathcal{D}((D^{-1})^*)$ is satisfied if $Y_s \subset \mathcal{D}((D^{-1})^*)$.

Note that for $\Delta = \Delta_3$, the ME-property is not guaranteed, but quasi-optimality is retained (see Theorem 6.3). However, $\Delta = \Delta_3$ does not need the norm of the exact solution or a bound for it, and in practice it may give better results than $\Delta \in \{\Delta_1, \Delta_2\}$.

From the definition of $d_{ME}(\alpha)$ follows the inequality $d_{ME}(\alpha) \le \|D(y - Ax_\alpha)\|$; therefore $\alpha_D \le \alpha_{ME}$ if $C = 1$ in the discrepancy principle. The derivation of the ME-rule in (16) uses only one inequality, which turns into an equality in the case
$$y - Ax_* = -\frac{\delta + h\|x_*\|}{\|(D^{-1})^*z_\alpha\|}\,(D^*D)^{-1}z_\alpha.$$
Therefore, in this case the ME-rule gives the optimal parameter, if the equation (18) has a unique solution.

ME-rule and modifications for the (iterated) Tikhonov regularization

Let $x_0 = 0$. Let $\rho_{m,\alpha}$ denote the discrepancy of the (iterated) Tikhonov approximation $x_\alpha = x_{m,\alpha}$, i.e. $\rho_{m,\alpha} := y - Ax_{m,\alpha}$.

Using the identities
$$1 - \lambda g_\alpha(\lambda) = \left(\frac{\alpha}{\lambda+\alpha}\right)^m \quad\text{and}\quad \frac{d}{d\alpha}g_\alpha(\lambda) = -\frac{m}{\alpha^2}\left(\frac{\alpha}{\lambda+\alpha}\right)^{m+1}$$
we obtain
$$L^{-s}\rho_{m,\alpha} = \bar y - Bx_{m,\alpha} = \left[I - BB^*g_\alpha(BB^*)\right]\bar y = \left[\alpha(BB^*+\alpha I)^{-1}\right]^m\bar y$$
and
$$\frac{d}{d\alpha}g_\alpha(BB^*)\bar y = -\frac{m}{\alpha^2}L^{-s}\rho_{m+1,\alpha}.$$
From these representations and from (18), we conclude that the functions $d_{MD'}(\alpha)$ in (13) and $d_{ME}(\alpha)$ for the MD′- and ME-rules have the form
$$d_{MD'}(\alpha) = (D\rho_{m,\alpha}, D\rho_{m+1,\alpha})^{1/2} \quad\text{and}\quad d_{ME}(\alpha) = \frac{(\rho_{m,\alpha}, L^{-2s}\rho_{m+1,\alpha})}{\|(D^{-1})^*L^{-2s}\rho_{m+1,\alpha}\|}. \tag{19}$$
In the case $D = I$, $h = s = 0$, convergence and order optimal error estimates for the (iterated) Tikhonov method are proved for the MD′-rule in [Citation5, Citation22, Citation23] and for the ME-rule in [Citation3, Citation5, Citation6, Citation8, Citation24].
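For the ordinary Tikhonov method ($m = 1$) in the standard case $s = 0$, $D = I$, $h = 0$, the ME-function in (19) reduces to $d_{ME}(\alpha) = (\rho_{1,\alpha}, \rho_{2,\alpha})/\|\rho_{2,\alpha}\|$, where $\rho_{2,\alpha}$ is the residual of the twice-iterated Tikhonov approximation. The following sketch evaluates this function and selects $\alpha_{ME}$ on a geometric grid; it is an illustrative implementation under these assumptions, not the authors' code.

```python
import numpy as np

def d_me_tikhonov(A, y, alpha):
    """ME-function (19) for ordinary Tikhonov (m = 1, s = 0, D = I)."""
    n = A.shape[1]
    M = alpha * np.eye(n) + A.T @ A
    x1 = np.linalg.solve(M, A.T @ y)                 # Tikhonov approximation x_alpha
    x2 = np.linalg.solve(M, alpha * x1 + A.T @ y)    # second Tikhonov iterate x_{2,alpha}
    rho1, rho2 = y - A @ x1, y - A @ x2
    return rho1 @ rho2 / np.linalg.norm(rho2)

def alpha_me(A, y, delta, q=0.9, alpha0=1.0, max_steps=500):
    """Approximate the largest solution of d_ME(alpha) = delta by taking the first
    alpha on the decreasing grid alpha0 * q**i with d_ME(alpha) <= delta."""
    alpha = alpha0
    for _ in range(max_steps):
        if d_me_tikhonov(A, y, alpha) <= delta:
            break
        alpha *= q
    return alpha
```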

One may also use an analogue of the ME-rule with the function
$$d_{MEa}(\alpha) = \frac{(D\rho_{m,\alpha}, D\rho_{m+1,\alpha})}{\|D\rho_{m+1,\alpha}\|}, \tag{20}$$
which coincides with the ME-rule in the case $D = L^{-s}$.

Consider now some modifications of the ME-rule. We know from (15) that $\alpha_{ME} \ge \alpha_{opt} := \arg\min\{\|x_{m,\alpha} - x_*\| : \alpha \ge 0\}$. This means that typically a somewhat smaller parameter $\alpha_{ME}/c$ with a proper $c > 1$ is a better parameter than $\alpha_{ME}$. For the case $D = I$, $h = s = 0$, we optimized the value of $c$ (and other constants below) in extensive numerical experiments and recommend the estimated parameter $\alpha_{MEe} = \alpha_{ME}/2.3$. Note that in the case of a rough estimate of the noise level, better rules than the ME-rule and the discrepancy principle are the rules from the recently derived family of rules [Citation25], see also [Citation26–Citation29].

In the extrapolated Tikhonov method (8), $x_\alpha := x_{\alpha_1,\ldots,\alpha_m}$, one may use the logarithmically uniform mesh of parameters $\alpha_n = \alpha q^{n-(m+1)/2}$, $n = 1, \ldots, m$, $q < 1$, and choose $\alpha$ by the same rules as in the iterated Tikhonov method.

Note that in [Citation30], an analogue of the ME-rule in the case $D = I$, $h = s = 0$ was proposed for the ($m \ge 1$ times iterated) Lavrentiev method, finding the parameter $\alpha = \alpha_{MEa}$ as the solution of the equation $\|\rho_{m+1,\alpha}\|^2/\|\rho_{m+2,\alpha}\| = C_m\delta$, where $C_1 = 1.55$, $C_2 = 1.6$. Numerical experiments in [Citation30, Citation31] showed that with this parameter choice, the error of the regularized solution was on average only 5% larger (in modifications of this rule, 3% larger) than with the optimal parameter. Note also that several other analogues of the ME-rule for the ($m \ge 1$ times iterated) Lavrentiev method were proposed and analysed in [Citation32].

ME-rule for asymptotical regularization

In this regularization method, the regularized solution $x_\alpha$ is given as the solution of the initial value problem
$$\frac{d}{dt}x(t) + B^*Bx(t) = B^*\bar y \quad\text{for } 0 < t \le r, \qquad x(0) = 0,$$
with $r = 1/\alpha$. In this method, we have $g_\alpha(\lambda) = (1 - e^{-\lambda/\alpha})/\lambda$. Using the identities
$$\frac{d}{d\alpha}g_\alpha(\lambda) = -\frac{1}{\alpha^2}e^{-\lambda/\alpha} = -\frac{1}{\alpha^2}(1 - \lambda g_\alpha(\lambda))$$
we obtain for the method of asymptotical regularization
$$d_{ME}(\alpha) = \frac{(Ax_\alpha - y, L^{-2s}(Ax_\alpha - y))}{\|(D^{-1})^*L^{-2s}(Ax_\alpha - y)\|}.$$
In the case $s = 0$ and $D = I$, the ME-rule coincides with the discrepancy principle, for which convergence and order optimal error estimates are well known (cf. [Citation1, Citation2, Citation8, Citation17]).

ME-rule for iterative regularization methods

Derivation of the ME-rule

For solving ill-posed problems (1), we now consider iteration methods of the general form
$$x_n = x_{n-1} + A^*z_{n-1}, \qquad n = 1, 2, \ldots \tag{21}$$
with $z_n \in Y$. The elements $z_n$ characterize the particular iteration method. We assume that $z_n \in \mathcal{D}((D^{-1})^*)$. Simple iteration methods are the explicit iteration scheme (Landweber method) M3 with $z_n = \mu L^{-2s}\rho_n$, $\mu \in (0, 1/\|B^*B\|)$, and the implicit iteration scheme M4 with $z_n = L^{-s}(\alpha I + BB^*)^{-1}L^{-s}\rho_n = \alpha^{-1}L^{-2s}\rho_{n+1}$, $\alpha > 0$, where $\rho_n = y - Ax_n$.

In the monotone error rule (ME-rule), we search for the largest computable iteration number $n_{ME}$ for which the monotonicity property (3),
$$\|x_n - x_*\| \le \|x_{n-1} - x_*\| \quad\text{for all } n = 1, 2, \ldots, n_{ME}, \tag{22}$$
can be guaranteed. Exploiting (21) and (10), we obtain
$$\begin{aligned}\|x_{n+1} - x_*\|^2 - \|x_n - x_*\|^2 &= \|x_n + A^*z_n - x_*\|^2 - \|x_n - x_*\|^2 = 2(x_n - x_*, A^*z_n) + \|A^*z_n\|^2\\ &= 2(D(Ax_n - y + y - Ax_*), (D^{-1})^*z_n) + \|A^*z_n\|^2\\ &\le 2\|(D^{-1})^*z_n\|(\delta + h\|x_*\|) - 2(\rho_n, z_n) + \|A^*z_n\|^2 =: \Phi(z_n).\end{aligned}\tag{23}$$
This estimate leads us to the following ME-rule for iteration methods (21).

ME rule. Choose $n_{ME}$ as the first index $n$ satisfying
$$d_{ME}(n) := \frac{2(\rho_n, z_n) - \|A^*z_n\|^2}{2\|(D^{-1})^*z_n\|} \le \Delta \tag{24}$$
with $\Delta \in \{\Delta_1, \Delta_2\}$.

The derivation of the ME-rule in (22) uses only one inequality, which turns into an equality in the case
$$y - Ax_* = \frac{\delta + h\|x_*\|}{\|(D^{-1})^*z_n\|}\,(D^*D)^{-1}z_n.$$
Therefore, in this case the ME-rule gives the optimal parameter.

Note that for $\Delta = \Delta_3$, the ME-property is not guaranteed, but for certain methods it is known that quasi-optimality still holds (see Theorem 6.3). However, typically the monotonicity of the error also holds up to somewhat larger iteration numbers; hence $\Delta = \Delta_3$ may also give better results than $\Delta \in \{\Delta_1, \Delta_2\}$.

From the definition of $d_{ME}(n)$ follows the inequality $d_{ME}(n) \le \|D\rho_n\| = d_D(n)$.

Due to the equality $\rho_{n+1} = \rho_n - AA^*z_n$, the function $d_{ME}(n)$ may also be presented in the forms
$$d_{ME}(n) = \frac{(\rho_n + \rho_{n+1}, z_n)}{2\|(D^{-1})^*z_n\|} = \frac{2(\rho_{n+1}, z_n) + \|A^*z_n\|^2}{2\|(D^{-1})^*z_n\|}. \tag{25}$$
Inequalities (23) and (24) may be used as follows (a small numerical sketch of the first use is given after the list):

  1. for stopping the iterations (21): stop at the first $n$ for which $d_{ME}(n) \le \Delta$;

  2. for choosing $z_n$, minimizing the function $\Phi(z_n)$ in the error estimate (23) (for example, choosing the steplength $\mu$ in (26));

  3. for using several iteration methods, switching from a faster method (for example CGLS) to a slower method (for example the Landweber method): if inequality (23) does not guarantee a decrease of the error in the fast method, the accuracy of the approximation may be further increased by the slower method.
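The first of these uses (stopping the iterations) is illustrated below for the Landweber method in the standard case $s = 0$, $D = I$, $h = 0$, where $z_n = \mu\rho_n$ and (25) gives $d_{ME}(n) = (\rho_n + \rho_{n+1}, \rho_n)/(2\|\rho_n\|)$. The sketch is ours and only indicates one possible realization.

```python
import numpy as np

def landweber_me(A, y, delta, mu=None, max_iter=5000):
    """Landweber iteration stopped by the ME-rule (24)/(25) for s = 0, D = I, h = 0.
    A step is accepted only while d_ME(n) > delta, i.e. while the error is still
    guaranteed to decrease."""
    if mu is None:
        mu = 0.99 / np.linalg.norm(A, 2) ** 2          # a step size in (0, 1/||A^T A||)
    x = np.zeros(A.shape[1])
    rho = y - A @ x
    for n in range(max_iter):
        x_next = x + mu * (A.T @ rho)
        rho_next = y - A @ x_next
        d_me = (rho + rho_next) @ rho / (2.0 * np.linalg.norm(rho))
        if d_me <= delta:                              # guaranteed decrease ends here
            return x, n
        x, rho = x_next, rho_next
    return x, max_iter
```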

ME-rule in gradient-type methods

Let us consider gradient-type methods of the form (21) with $z_n = \mu_nL^{-2s}\rho_n$, where $\rho_n = y - Ax_n$ is the discrepancy and $\mu_n > 0$ is a properly chosen stepsize:
$$x_n = x_{n-1} + \mu_{n-1}A^*L^{-2s}(y - Ax_{n-1}), \qquad n = 1, 2, \ldots \tag{26}$$
Special cases of method (26) are the Landweber method with $\mu_n = \mu \in (0, 1/\|B^*B\|)$, the minimal error method with
$$\mu_n = \frac{\|L^{-s}\rho_n\|^2}{\|A^*L^{-2s}\rho_n\|^2}$$
and the steepest descent method with
$$\mu_n = \frac{\|A^*L^{-2s}\rho_n\|^2}{\|L^{-s}AA^*L^{-2s}\rho_n\|^2}.$$

Here, the ME-function (25) has the form
$$d_{ME}(n) = \frac{(\rho_n + \rho_{n+1}, L^{-2s}\rho_n)}{2\|(D^{-1})^*L^{-2s}\rho_n\|}.$$
The estimate (23) which led us to the ME-rule can also be exploited for finding stepsizes $\mu_n > 0$ in the iteration methods (26) which guarantee that $x_{n+1}$ is a better approximation for $x_*$ than $x_n$. Here, the stepsize $\mu_n$ may depend not only on $y$ but also on the noise level $\Delta$. Exploiting $z_n = \mu_nL^{-2s}\rho_n$, the estimate (23) takes the form
$$\|x_{n+1} - x_*\|^2 - \|x_n - x_*\|^2 \le -2\mu_n\left(\|L^{-s}\rho_n\|^2 - \|(D^{-1})^*L^{-2s}\rho_n\|\Delta\right) + \mu_n^2\|A^*L^{-2s}\rho_n\|^2. \tag{27}$$
Therefore, $x_{n+1}$ is a better approximation for $x_*$ than $x_n$ if
$$\Delta < \frac{\|L^{-s}\rho_n\|^2}{\|(D^{-1})^*L^{-2s}\rho_n\|}, \qquad 0 < \mu_n < \frac{2\left(\|L^{-s}\rho_n\|^2 - \|(D^{-1})^*L^{-2s}\rho_n\|\Delta\right)}{\|A^*L^{-2s}\rho_n\|^2}.$$
Minimizing the right-hand side of (27) with respect to $\mu_n$ yields
$$\mu_n = \frac{\|L^{-s}\rho_n\|^2 - \|(D^{-1})^*L^{-2s}\rho_n\|\Delta}{\|A^*L^{-2s}\rho_n\|^2}.$$
Substituting into (27) shows that for this stepsize, the improvement of the squared error can be estimated by
$$\|x_{n+1} - x_*\|^2 - \|x_n - x_*\|^2 \le -\frac{\left(\|L^{-s}\rho_n\|^2 - \|(D^{-1})^*L^{-2s}\rho_n\|\Delta\right)^2}{\|A^*L^{-2s}\rho_n\|^2}.$$
In the case $D = I$, $h = s = 0$, the last two relations were given in [Citation8, Citation10], and convergence results and order optimal error bounds for stopping many gradient methods [Citation2, Citation9, Citation10, Citation17, Citation33] with the ME-rule were stated in [Citation4, Citation7–Citation10, Citation34].
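A minimal sketch of the resulting noise-adapted gradient iteration in the standard case $s = 0$, $D = I$, $h = 0$ follows; there the conditions above read $\|\rho_n\| > \Delta$ and $\mu_n = (\|\rho_n\|^2 - \|\rho_n\|\Delta)/\|A^T\rho_n\|^2$. The names are ours and the code is only one possible realization.

```python
import numpy as np

def gradient_me_stepsize(A, y, delta, max_iter=2000):
    """Gradient iteration (26) with the stepsize minimizing the right-hand side of (27),
    specialized to s = 0, D = I, h = 0.  Iterates while ||rho_n|| > delta, since otherwise
    the guaranteed improvement would be non-positive."""
    x = np.zeros(A.shape[1])
    for _ in range(max_iter):
        rho = y - A @ x
        nr = np.linalg.norm(rho)
        if nr <= delta:
            break
        g = A.T @ rho
        mu = (nr ** 2 - nr * delta) / np.linalg.norm(g) ** 2   # stepsize with guaranteed improvement
        x = x + mu * g
    return x
```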

ME-rule in conjugate gradient methods

Some gradient-type methods that do not fit into the class of methods (26) are the conjugate gradient methods (see [Citation9, Citation10, Citation33, Citation36]).

The methods CGLS and CGME are the conjugate gradient method applied to $B^*Bx = B^*\bar y$ and to $BB^*v = \bar y$ with $x = B^*v$, respectively. In both methods, we fix the initial approximation $x_0$, the initial value $u_0 = 0$ and $\beta_0 = 0$, and compute $\tau_0 = \bar y - Bx_0$. Further, in CGLS we compute $p_0 = B^*\tau_0$ and, for $n = 1, 2, \ldots$, iteratively
$$u_n = \beta_{n-1}u_{n-1} - \tau_{n-1},\quad \gamma_n = \frac{\|p_{n-1}\|^2}{\|BB^*u_n\|^2},\quad x_n = x_{n-1} - \gamma_nB^*u_n,\quad \tau_n = \tau_{n-1} + \gamma_nBB^*u_n,\quad p_n = B^*\tau_n,\quad \beta_n = \frac{\|p_n\|^2}{\|p_{n-1}\|^2}.$$
In the method CGME, one computes for $n = 1, 2, \ldots$ iteratively
$$u_n = \beta_{n-1}u_{n-1} - \tau_{n-1},\quad \gamma_n = \frac{\|\tau_{n-1}\|^2}{\|B^*u_n\|^2},\quad x_n = x_{n-1} - \gamma_nB^*u_n,\quad \tau_n = \tau_{n-1} + \gamma_nBB^*u_n,\quad \beta_n = \frac{\|\tau_n\|^2}{\|\tau_{n-1}\|^2}.$$
Both the CGLS and the CGME method have the form (21) with $z_{n-1} = -\gamma_nL^{-s}u_n$; hence the ME-rule (24) can be used. In the case $D = I$, $h = s = 0$, the ME-rule for the CGLS and CGME methods was proposed and numerically tested in [Citation38].
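The sketch below shows one way to combine a textbook CGLS recursion with the ME stopping rule in the case $s = 0$, $D = I$, $h = 0$. It is not a transcription of the recursion above: instead, the search direction $d_n = A^Tw_n$ is tracked through its Y-space representative $w_n$, so that the increment $x_n - x_{n-1} = \gamma_nA^Tw_n$ has the form (21) and $d_{ME}(n) = (\rho_{n-1} + \rho_n, w_n)/(2\|w_n\|)$ can be evaluated. All names are ours.

```python
import numpy as np

def cgls_me(A, y, delta, max_iter=200):
    """CGLS (conjugate gradients on A^T A x = A^T y) with ME stopping (s = 0, D = I)."""
    x = np.zeros(A.shape[1])
    r = y - A @ x                      # residual in Y
    p = A.T @ r                        # gradient in X
    w = r.copy()                       # Y-representative of the search direction d = A^T w
    d = p.copy()
    for n in range(1, max_iter + 1):
        q = A @ d
        gamma = (p @ p) / (q @ q)
        x_next = x + gamma * d
        r_next = y - A @ x_next
        d_me = (r + r_next) @ w / (2.0 * np.linalg.norm(w))
        if d_me <= delta:              # error decrease no longer guaranteed: stop
            return x, n - 1
        p_next = A.T @ r_next
        beta = (p_next @ p_next) / (p @ p)
        w = r_next + beta * w          # keeps d = A^T w in sync
        d = p_next + beta * d
        x, r, p = x_next, r_next, p_next
    return x, max_iter
```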

ME-rule for sequence of approximations

In contrast to iterative methods, where the approximation at step $n$ is computed using the approximation at the previous step $n-1$, one may consider an arbitrary sequence of approximations
$$x_n = A^*w_n, \qquad n = 1, 2, \ldots, \quad w_n \in Y \tag{28}$$
and ask which of these approximations to take as the approximate solution.

Theorem 5.1

If in the sequence (28) the index $n = n_{ME}$ is chosen as the first index $n$ satisfying
$$d_{ME}(n) = \frac{(\rho_n + \rho_{n+1}, w_{n+1} - w_n)}{2\|(D^{-1})^*(w_{n+1} - w_n)\|} \le \Delta \tag{29}$$
with $\Delta \in \{\Delta_1, \Delta_2\}$, then the ME-property (3) holds.

Proof

This follows from (25) with $z_n = w_{n+1} - w_n$.

To use the functional $d_{ME}(n)$, the elements $w_n$ are needed. They can be found by first computing $w_n$ and then $x_n = A^*w_n$.

Note that the last theorem also gives an alternative way to derive the ME-rule for continuous regularization methods, using the sequence $\alpha_n = q^n$, $q < 1$, and taking the limit $q \to 1$.

The last theorem may be applied as an alternative or a complement to any other parameter choice rule for finding a regularized approximation from a sequence of the form $x_n = A^*w_n$, $n = 1, 2, \ldots$ (for example, in the (iterated) Tikhonov method). Note that the computation of the sequence of all approximate solutions for $n = 1, 2, \ldots, N$ with some $N$ is required, for example, in the balancing principle (see [Citation11, Citation35, Citation37, Citation39]). This theorem guarantees that the stopping index $n_{ME,B} = \max(n_{ME}, n_B)$ is at least as good as the index $n_B$ from an arbitrary balancing principle: $\|x_{n_{ME,B}} - x_*\| \le \|x_{n_B} - x_*\|$.

ME-rule for semi-iterative regularization methods

Let $p_n$ be residual polynomials, orthogonal on the interval $[0, \|B^*B\|]$ with respect to some measure (see [Citation40]). Then $p_0 = 1$ and
$$p_n(\lambda) = p_{n-1}(\lambda) + (1 - \mu_n)(p_{n-2}(\lambda) - p_{n-1}(\lambda)) - \omega_n\lambda p_{n-1}(\lambda), \qquad n \ge 2,$$
with some constants $\mu_n$ and $\omega_n$, where $\mu_1 = 1$. Then the semi-iterative approximations are found as
$$x_1 = x_0 + \omega_1A^*L^{-2s}(y - Ax_0),$$
$$x_n = x_{n-1} + (1 - \mu_n)(x_{n-2} - x_{n-1}) + \omega_nA^*L^{-2s}(y - Ax_{n-1}), \qquad n \ge 2.$$
In the case $x_0 = A^*w_0$, we have $x_n = A^*w_n$ with $w_1 = w_0 + \omega_1L^{-2s}(y - AA^*w_0)$ and
$$w_n = w_{n-1} + (1 - \mu_n)(w_{n-2} - w_{n-1}) + \omega_nL^{-2s}(y - AA^*w_{n-1}), \qquad n \ge 2. \tag{30}$$
Thus, the iterations of the semi-iterative methods may be stopped by the ME-rule (29). We give formulas for $w_n$ for the following examples of semi-iterative methods (see [Citation40]).

  1. The Chebyshev method of Stiefel:
$$w_n = \frac{2n}{n+1}w_{n-1} - \frac{n-1}{n+1}w_{n-2} + \frac{4n}{n+1}L^{-2s}(y - AA^*w_{n-1}), \qquad n = 1, 2, \ldots$$

  2. The Chebyshev method of Nemirovskii and Polyak:
$$w_1 = \tfrac{4}{3}L^{-2s}(y - AA^*w_0),$$
$$w_n = \frac{2(2n-1)}{2n+1}w_{n-1} - \frac{2n-3}{2n+1}w_{n-2} + \frac{4(2n-1)}{2n+1}L^{-2s}(y - AA^*w_{n-1}), \qquad n = 2, 3, \ldots$$

  3. The ν-method of Brakhage is the method where $w_n$ in (30) has, for fixed $\nu > 0$, the coefficients
$$\mu_1 = 1, \qquad \mu_n = 1 + \frac{(n-1)(2n-3)(2n+2\nu-1)}{(n+2\nu-1)(2n+4\nu-1)(2n+2\nu-3)}, \qquad n = 2, 3, \ldots,$$
$$\omega_1 = \frac{4\nu+2}{4\nu+1}, \qquad \omega_n = \frac{4(2n+2\nu-1)(n+\nu-1)}{(n+2\nu-1)(2n+4\nu-1)}, \qquad n = 2, 3, \ldots$$

ME-rule in implicit iteration methods

Let us consider implicit iteration methods of the form
$$x_{n+1} = x_n + A^*L^{-s}(BB^* + \alpha_nI)^{-1}L^{-s}(y - Ax_n) = (B^*B + \alpha_nI)^{-1}(\alpha_nx_n + B^*L^{-s}y), \qquad n = 0, 1, \ldots \tag{31}$$
This method has the form (21) with
$$z_n = L^{-s}(\alpha_nI + BB^*)^{-1}L^{-s}\rho_n = \alpha_n^{-1}L^{-2s}\rho_{n+1}. \tag{32}$$
Therefore, the function $d_{ME}(n)$ in (25) takes the form
$$d_{ME}(n) = \frac{(\rho_n + \rho_{n+1}, L^{-2s}\rho_{n+1})}{2\|(D^{-1})^*L^{-2s}\rho_{n+1}\|}.$$
In the case $D = I$, $h = s = 0$, the ME-rule for method (31) was studied in [Citation4, Citation7, Citation8, Citation34].

Let us now consider the question of choosing the parameter $\alpha_n$ such that the guaranteed improvement of the accuracy of the approximation $x_{n+1}$ over $x_n$ is maximal.

Proposition 5.2

In the regularization method (31), the guaranteed improvement $\Phi(z_n)$ of accuracy in the estimate (23) is maximized for $z_n$ in (32) with $\alpha_n$ being the solution $\alpha$ of the equation
$$\alpha\,\|(D^{-1})^*L^{-s}(\alpha I + BB^*)^{-1}L^{-s}\rho_n\|\;\|(\alpha I + BB^*)^{-3/2}L^{-s}\rho_n\|^2 = \left((D^{-1})^*L^{-s}(\alpha I + BB^*)^{-1}L^{-s}\rho_n,\;(D^{-1})^*L^{-s}(\alpha I + BB^*)^{-2}L^{-s}\rho_n\right)\Delta_1. \tag{33}$$

Proof

Substituting $z_n$ from (32) into the function $\Phi(z_n)$ in (23) shows that the minimization of $\Phi(z_n)$ is equivalent to the minimization of the function
$$\phi(\alpha) := 2\|(D^{-1})^*L^{-s}(\alpha I + BB^*)^{-1}\tau_n\|\Delta_1 - 2(\tau_n, (\alpha I + BB^*)^{-1}\tau_n) + \|B^*(\alpha I + BB^*)^{-1}\tau_n\|^2,$$
where $\tau_n := L^{-s}\rho_n$. Differentiation, using the formula $\frac{d}{d\alpha}\|\cdot\| = \big(\frac{d}{d\alpha}\|\cdot\|^2\big)/(2\|\cdot\|)$, gives
$$\frac12\phi'(\alpha) = -\frac{\left((D^{-1})^*L^{-s}(\alpha I + BB^*)^{-1}\tau_n,\;(D^{-1})^*L^{-s}(\alpha I + BB^*)^{-2}\tau_n\right)}{\|(D^{-1})^*L^{-s}(\alpha I + BB^*)^{-1}\tau_n\|}\,\Delta_1 + \|(\alpha I + BB^*)^{-1}\tau_n\|^2 - \|B^*(\alpha I + BB^*)^{-3/2}\tau_n\|^2.$$
The equality
$$\|(\alpha I + BB^*)^{-1}\tau_n\|^2 = \alpha\|(\alpha I + BB^*)^{-3/2}\tau_n\|^2 + \|B^*(\alpha I + BB^*)^{-3/2}\tau_n\|^2$$
shows that $\phi'(\alpha) = 0$ iff $\alpha$ solves the equation (33).

Theorem 5.3

In the case $D = L^{-s}$, in the regularization method (31) the guaranteed improvement $\Phi(z_n)$ of accuracy in the estimate (23) is maximized if $\alpha_n$ is chosen by the analogue of the discrepancy principle $\|L^{-s}\rho_{n+1}\| = \Delta_1$.

Proof

Indeed, in the case $D = L^{-s}$, the equation (33) for finding $\alpha = \alpha_n$ has the form $\|\alpha_n(\alpha_nI + BB^*)^{-1}L^{-s}\rho_n\| = \Delta_1$, which is equivalent to $\|L^{-s}\rho_{n+1}\| = \Delta_1$ due to the equality $\alpha_n(\alpha_nI + BB^*)^{-1}L^{-s}\rho_n = L^{-s}\rho_{n+1}$.
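A minimal sketch of one step of the implicit iteration (31) with $\alpha_n$ chosen as in Theorem 5.3, for $s = 0$, $D = I$: the identity $\rho_{n+1} = \alpha(AA^T + \alpha I)^{-1}\rho_n$ is used and $\alpha$ is searched on a geometric grid, as in the numerical section below. The function name and the grid are ours.

```python
import numpy as np

def implicit_step_discrepancy(A, x_n, y, Delta1, q=0.9, alpha0=1.0, max_steps=500):
    """One step of (31) for s = 0, D = I, with alpha_n such that ||rho_{n+1}|| ~ Delta1."""
    m, k = A.shape
    rho = y - A @ x_n
    alpha = alpha0
    for _ in range(max_steps):
        rho_next = alpha * np.linalg.solve(A @ A.T + alpha * np.eye(m), rho)
        if np.linalg.norm(rho_next) <= Delta1:    # largest alpha on the grid with this property
            break
        alpha *= q
    x_next = np.linalg.solve(A.T @ A + alpha * np.eye(k), alpha * x_n + A.T @ y)
    return x_next, alpha
```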

Convergence and quasi-optimality for the ME-rule

In the ME-rule considered in the previous sections, the regularization parameter $r$ in continuous regularization methods is found as the solution (the largest solution in the case of several solutions) of a certain equation $d(r, y, A) = \Delta$ with $\Delta \in \{\Delta_1, \Delta_2\}$. In regularization methods where the regularization parameter is a natural number, as in iteration methods, we choose $r_{ME}$ as the first $r = 1, 2, \ldots$ for which $d(r, y, A) \le \Delta$. Let us denote by $x_r^0$ the regularized solution for exact data $y_0$, $A_0$. We give the following convergence result for the ME-rule.

Theorem 6.1

Let $\Delta \in \{\Delta_1, \Delta_2\}$. Let $x_r^0$, $x_r$ satisfy the following properties:

  1. $x_r^0 \to x_*$ as $r \to \infty$;

  2. $\|x_{r_1}^0 - x_{r_1}\| \le \|x_{r_2}^0 - x_{r_2}\|$ for $r_1 \le r_2$;

  3. if $r(\Delta) \le \mathrm{const} < \infty$ as $\Delta \to 0$, then $\|x_{r(\Delta)} - x_{r(\Delta)}^0\| \to 0$ as $\Delta \to 0$;

  4. if for some sequence $r_k \le \mathrm{const} < \infty$ $(k \to \infty)$ it holds that $d(r_k, y_0, A_0) \to 0$ $(k \to \infty)$, then $x_{r_k}^0 \to x_*$ $(k \to \infty)$.

Then $x_{r_{ME}} \to x_*$ as $\Delta \to 0$.

Proof

For every $r$ it holds that
$$\|x_r - x_*\| \le \|x_r - x_r^0\| + \|x_r^0 - x_*\|. \tag{34}$$
First we consider the main case, where $r_{ME}(\Delta) \to \infty$ as $\Delta \to 0$. Then $x_{r_{ME}}^0 \to x_*$ due to assumption (1). It remains to show that $\|x_{r_{ME}}^0 - x_{r_{ME}}\| \to 0$ as $\Delta \to 0$. Let $r_0(\Delta)$ be a monotonically increasing function giving some a priori regularization parameter guaranteeing the convergence $x_{r_0} \to x_*$ as $\Delta \to 0$. If $r_{ME} \le r_0$, then due to assumption (2),
$$\|x_{r_{ME}}^0 - x_{r_{ME}}\| \le \|x_{r_0}^0 - x_{r_0}\|.$$
If $r_{ME} \ge r_0$, then the ME-property says that
$$\|x_{r_{ME}} - x_*\| \le \|x_{r_0} - x_*\|.$$
Both estimates converge to 0 as $\Delta \to 0$.

Consider now the case where $r_{ME}(\Delta) \le \mathrm{const}$ as $\Delta \to 0$. Then, for $r = r_{ME}$, both terms in the right-hand side of (34) converge to zero as $\Delta \to 0$ due to assumptions (3) and (4).

Assumptions (1)–(3) of Theorem 6.1 are general conditions on the regularization method. Only the last assumption (4) uses the concrete form of the function $d(r, y, A)$; for the regularization method (7) it can be proved similarly to Lemma 3.2 in [Citation2, p. 66].

Consider now the case $D = L^{-s}$. The problem $Ax = y$ may be rewritten in the form $Bx = \bar y$ with the operator $B \in \mathcal{L}(X, Y)$, and for $x_r$ in (7),
$$x_r = B^*g_r(BB^*)\bar y = g_r(B^*B)B^*\bar y, \tag{35}$$
the standard regularization theory applies. We get the following results.

Theorem 6.2

Let (4) hold with $D = L^{-s}$. Let $x_r$ be defined by one of the methods M1–M5. Then the choice of $r$ by one of the rules D, MD, ME or MEe guarantees the convergence $x_r \to x_*$ as $\Delta \to 0$, and under the additional assumption $x_* \in R((B^*B)^{p/2})$ the order optimal error estimate $\|x_r - x_*\| \le O(\Delta^{\frac{p}{p+1}})$ holds with $p \le 2p_0 - 1$ for the rule D and with $p \le 2p_0$ for the rules MD, ME and MEe.

Proof

Let $D = L^{-s}$. Then the results of this theorem for the discrepancy principle follow from the results in [Citation1, Citation2, Citation17], and for the MD rule from [Citation22, Citation41]. Let $C = 1$ in the rules D, MD. In the methods M1, M2 it holds that $d_{MD}(\alpha) \le d_{ME}(\alpha) \le d_D(\alpha)$, therefore $\alpha_D \le \alpha_{ME} \le \alpha_{MD}$ [Citation5–Citation8]; in the methods M3, M4, $n_D - 1 \le n_{ME} \le n_D$ [Citation4, Citation7, Citation8]; in the method M5, $d_{ME}(\alpha) = d_D(\alpha)$. Therefore, the results for the ME rule follow from the results for the D rule and the MD rule. The same results hold for the parameter $r_{MEe} = \mathrm{const}\cdot r_{ME}$.

Let us consider quasi-optimality results for the regularization methods. To characterize the quality of a rule for the a posteriori choice of the regularization parameter, we use in the following the quasi-optimality property (see [Citation41, Citation42]). We say that a rule R for the a posteriori choice of the regularization parameter $r = r(\mathrm{R})$ is quasi-optimal for a given regularization method $x_r = A^*g_r(AA^*)y$ if there exists a constant $C$ (not depending on $A_0$, $A$, $x_*$, $y$) such that for $\|y - y_0\| \le \delta$, $\|A - A_0\| \le h$ the error estimate
$$\|x_{r(\mathrm{R})} - x_*\| \le C\inf_{r\ge 0}\psi(r) + O(\delta + h\|x_*\|) \tag{36}$$
holds, where the function
$$\psi(r) = \|(I - A^*Ag_r(A^*A))x_*\| + \gamma\sqrt{r}\,(\delta + h\|x_*\|) \tag{37}$$
is an upper bound of the error $\|x_r - x_*\|$. The following result holds.

Theorem 6.3

Let $D = L^{-s}$. The modified discrepancy principle with $\Delta \in \{\Delta_1, \Delta_2, \Delta_3\}$ and $C > 1$ is a quasi-optimal rule for solving the problem $Bx = \bar y$ by the methods M1–M5: the inequality (36) holds, where the operators $A$, $A^*$ are replaced by the operators $B$, $B^*$ in (37). This quasi-optimality is also guaranteed for the ME-rule with $\Delta \in \{\Delta_1, \Delta_2\}$, and also with $\Delta = \Delta_3$ if in (17) and (24) $\Delta$ is replaced by $C\Delta$ with $C > 1$.

Proof

Applying the results of [Citation41] (Theorem 3) (see also [Citation34] (Theorem 2, Remark 2)) to the approximation (35) instead of $x_r = A^*g_r(AA^*)y$, we get the assertions of the theorem about the MD rule, and also for the ME rule with $\Delta \in \{\Delta_1, \Delta_2, \Delta_3\}$ if in (17) and (24) $C\Delta$ with $C > 1$ is used. In the case $\Delta \in \{\Delta_1, \Delta_2\}$, an increase of $C \ge 1$ in the right-hand side $C\Delta$ of (17) and (24) leads to a decrease of the parameter $r \le r_{ME}$ and therefore to an increase of the error $\|x_r - x_*\|$.

Numerical examples

Our tests are performed on the well-known set of test problems by Hansen [Citation43]: baart, deriv2, foxgood, gravity, heat, ilaplace, phillips, shaw, spikes, wing. In all tests, we used the discretization parameter n=100.

For example, the test problem heat (see [Citation43, Citation44]) represents heat conduction in a one-dimensional semi-infinite rod $[0, \infty)$ with thermal diffusivity 1. The initial temperature of the rod is 0 and the boundary condition at the point $\xi = 0$ is not known. Instead, one measures the temperature at $\xi = 1$ for the time $t \in [0, 1]$ and tries to determine the temperature at $\xi = 0$ from this measurement. (NB! In this paragraph and in the following paragraphs, $t$ and $s$ have a different meaning from the rest of the paper.)

This results in the Volterra integral equation of the first kind
$$\int_0^t\frac{(t-s)^{-3/2}}{2\sqrt{\pi}}\exp\!\left(-\frac{1}{4(t-s)}\right)x(s)\,ds = y(t), \qquad t \in [0, 1],$$
where $y(t)$ is the measured temperature at $\xi = 1$ and $x(t)$ is the temperature at $\xi = 0$.

The integral equation is discretized by means of a simple collocation method at the points $t_i = i/n$ and the midpoint rule with $n = 100$ points. An exact discrete solution $z$ with components
$$z_i = \begin{cases} 75\,t_i^2, & t_i \le 0.1,\\ 0.75 + (20t_i - 2)(3 - 20t_i), & 0.1 < t_i < 0.15,\\ 0.75\exp(-2(20t_i - 3)), & 0.15 \le t_i \le 0.5,\\ 0, & \text{otherwise},\end{cases}$$
is constructed, and then the right-hand side $b$ is computed as $b = Az$.
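A sketch of this construction (kernel matrix by collocation and the midpoint rule, exact solution $z$ and right-hand side $b = Az$) is given below. It follows the description above and heat.m from Hansen's Regularization Tools only in spirit; details such as whether $z$ is sampled at the collocation points or at the midpoints are our assumptions.

```python
import numpy as np

def heat_problem(n=100):
    """Discretized sideways heat equation: collocation at t_i = i/n, midpoint rule at
    s_j = (j - 1/2)/n, kernel k(d) = d^{-3/2} / (2 sqrt(pi)) * exp(-1/(4 d))."""
    h = 1.0 / n
    t = (np.arange(n) + 1.0) * h           # collocation points t_i
    s = (np.arange(n) + 0.5) * h           # midpoint quadrature nodes s_j
    A = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1):             # kernel vanishes for s_j >= t_i
            d = t[i] - s[j]
            A[i, j] = h * d ** (-1.5) / (2.0 * np.sqrt(np.pi)) * np.exp(-1.0 / (4.0 * d))
    z = np.zeros(n)
    for i, ti in enumerate(t):             # piecewise exact solution from the text
        if ti <= 0.1:
            z[i] = 75.0 * ti ** 2
        elif ti < 0.15:
            z[i] = 0.75 + (20.0 * ti - 2.0) * (3.0 - 20.0 * ti)
        elif ti <= 0.5:
            z[i] = 0.75 * np.exp(-2.0 * (20.0 * ti - 3.0))
    b = A @ z
    return A, z, b
```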

Since the performance of the methods and rules generally depends on the smoothness $p$ of the exact solution, we complemented the standard solutions $x_*$ of the (now discrete) test problems with smoothed solutions $(A^*A)^{p/2}x_*$ ($p = 2$) in all test problems (computing the right-hand side as $A((A^*A)^{p/2}x_*)$). After discretization, all problems were scaled (normalized) in such a way that the Euclidean norms of the operator and of the right-hand side were 1. We used the operator $D = L^t$ with $t \in \{-0.5, 0, 0.5\}$. We computed $L^ty_0$ and added normally distributed noise such that $\|L^t(y - y_0)\|$ had the values 0.3, $10^{-1}$, $10^{-2}$, $10^{-3}$, $10^{-4}$, $10^{-5}$, $10^{-6}$. Here, $\|\cdot\|$ denotes the Euclidean norm.

We generated 10 noise vectors and used these vectors in all of the problems. The problems were regularized by the Tikhonov method in the $Y_s$-scale (the second formula in (9)), where the regularization parameters were chosen according to the rules that we wanted to compare. We used the $Y_s$-scale with the operator $L$ acting as $Lu = u - u''$. In discrete form, we used the approximating matrix $(l_{ij})$ with the elements $l_{ii} = 2n^2 + 1$, $i = 1, \ldots, n$, $l_{i,i+1} = l_{i+1,i} = -n^2$, $i = 1, \ldots, n-1$, and $l_{ij} = 0$ otherwise.
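The following sketch builds the tridiagonal matrix for $L$ described above and computes the $Y_s$-scale Tikhonov solution via the second formula in (9), taking the fractional power $L^{-2s}$ through an eigendecomposition. It is a minimal illustration under these assumptions (dense matrices; $s = 0$ recovers ordinary Tikhonov), not the authors' implementation.

```python
import numpy as np

def ys_tikhonov(A, y, alpha, s):
    """x_alpha = (alpha I + A^T L^{-2s} A)^{-1} A^T L^{-2s} y with L ~ I - d^2/dxi^2:
    diagonal entries 2 n^2 + 1, off-diagonal entries -n^2 (n = number of grid points)."""
    n = A.shape[0]
    L = (np.diag(np.full(n, 2.0 * n ** 2 + 1.0))
         + np.diag(np.full(n - 1, -float(n ** 2)), 1)
         + np.diag(np.full(n - 1, -float(n ** 2)), -1))
    lam, V = np.linalg.eigh(L)                        # L = V diag(lam) V^T with lam > 0
    L_m2s = (V * lam ** (-2.0 * s)) @ V.T             # L^{-2s}
    M = alpha * np.eye(A.shape[1]) + A.T @ L_m2s @ A
    return np.linalg.solve(M, A.T @ (L_m2s @ y))
```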

Since in these model equations the exact solution is known, it is possible to find the regularization parameter $\alpha = \alpha_{opt}$ which gives the smallest error: $\|x_{\alpha_{opt}} - x_*\| = \min_{\alpha>0}\|x_\alpha - x_*\|$. For every rule R, the error ratio $\|x_{\alpha_R} - x_*\|/\|x_{\alpha_{opt}} - x_*\|$ describes the performance of the rule R on this particular problem. For a better comparison of the cases with different $s$, in the denominator we always use $s = 0$. To compare the rules or to present their properties, the following tables show averages of these error ratios over various parameters of the data set (problems 1–10, noise levels $\delta$, noise vectors).

In Tables 1 and 2 we compare the following rules. Earlier we introduced the discrepancy principle D (see page 5, formula (11)), the rule MD′ (page 7, (19)), the monotone error rule ME (page 7, (19)) and the rule MEa as the analogue of the ME-rule (page 7, (20)). In the rule MEe (recall the motivation on page 7), we chose the regularization parameter as $\alpha_{MEe} = \alpha_{ME}/c$ with $c = 2.3\cdot10^{c_1}$, $c_1 = (s - t)(1.5 + 1.3s - 0.5t - 0.8s^2 + 2st - t^2)$; this form of the constant was obtained by extensive numerical experiments for different values of $s$ and $t$. In the rule MEae, we chose the parameter $\alpha_{MEae} = \alpha_{MEa}/c$, where $c = 2.3$ for $t = 0$ and $c = 3$ for $t = -0.5$ or $t = 0.5$.

Table 1 Error ratios for different rules.

Note that in the case $s > t$, the ME rule (unlike the rules D, MD′, MEa) allows mild underestimation of the noise level. In the columns ME08 and ME07, we present the error ratios obtained by using the noise levels $0.8\delta$ and $0.7\delta$ instead of $\delta$. Typically, ME08 and ME07 give better results than the ME rule, but sometimes they can give large errors. If instead of the exact noise level only an approximate noise level is given, we recommend using the MEe-rule with $s > t$. In the case $s = t$, all considered rules failed if the noise level was underestimated.

In Table 1, we present the results separately for large $\delta$ (l means $\delta \ge 10^{-2}$) and small $\delta$ (s means $\delta \le 10^{-3}$). We present results for $s = t$ and $s = t + 0.5$. For a given $t$, in most cases the best choice of $s$ is $s = t$, in which case all of the considered rules gave similar results. But in the case $s = t = -0.5$, we could not obtain good results for smooth solutions, especially in the case of small noise levels. Then the error function had a very sharp decrease at its minimum, which was hard to locate. Therefore, in the case $t = -0.5$ we recommend taking $s = 0$. Then the ME and MEe rules gave good results. However, other rules failed in this case in the problems baart, foxgood and wing.

In most problems, using values of $s$ and $t$ other than $s = t$ and $s = t + 0.5$ gave slightly worse results for the rules ME and MEe and essentially worse results for the rules D and MD′. However, in the problems deriv2 and phillips in the case of a smooth solution ($p \ge 2$), increasing $s$ increased the accuracy of the rules D, MD′, MEe for all $t$ (in the problem deriv2 the accuracy of the rule D increased for $s$ up to 0.5).

Table 1 shows that the discrepancy principle works well in the cases $s = t = 0$ and $s = t = 0.5$, but it is not as good for $p > 1$. Typically, the MEe-rule gives the best results. Besides the MD′ rule (13), we also made computations with the MD rule (12); the results were almost the same.

Table 2 presents the results for the problem heat. In the column ‘min’, we present the averages of the minimal errors over 10 runs for given $s$ and $t$. Unlike in Table 1, the rules D, MD′, MEa, MEae gave almost as good results as the rule MEe. These rules were not as good as the rule MEe only in the case of a smooth solution ($p = 2$): the rule D for $t = s = 0$ and the rules MD′, MEa, MEae for $t = -0.5$, $s = 0$.

Table 2 Results for the problem heat.

In both Tables 1 and 2, in the case $t = 0.5$, $s = 1$, the best results were obtained by using the MEae rule.

Some remarks about the numerical realization of the algorithms. In our numerical computations, the discretization parameter $n$ was 100. This low dimension allowed us to use the advantages of the singular value decomposition of the matrix for fast computations. Note also that instead of solving the Equations (11), (19) and (20), we made computations on the $\alpha$-sequence $\alpha_i = 0.9^i$, $i = 0, 1, \ldots$, and used as the corresponding regularization parameter the first $\alpha_i$ for which the left-hand side of the equation was smaller than the right-hand side.

The computations were made with the exact operator, since an inexact operator is typically accompanied by a significant overestimation of the noise level. In this case, it is preferable to use the rules from the article [Citation25], where the exact operator was considered. We plan to extend these results to the case of a noisy operator in a forthcoming paper.

It is worthwhile to think about the possibility of choosing the operator $L$ depending on the operator $A$.

Note that the ME-rule can also be used in projection methods. It was studied for the exact operator in [Citation45]. Generalizations to noisy operators are to be published in a further paper.

Conclusion

We have considered a linear ill-posed problem $A_0x = y_0$ with the operator $A_0 \in \mathcal{L}(X, Y)$. A noisy right-hand side $y$ and an operator $A$ with noise levels $\|D(y_0 - y)\| \le \delta$, $\|D(A_0 - A)\| \le h$, where $D$ is a certain operator, are given. We derived the monotone error rule (ME rule) for the parameter choice in many regularization methods in the $Y_s$-scale, generated by powers $s \in \mathbb{R}$ of a certain operator $L$. These regularization methods include the (iterated) Tikhonov method and many iterative methods (the Landweber method, the steepest descent method, conjugate gradient-type methods, semi-iterative methods, the implicit iteration method). It was shown for which data the ME rule gives the optimal parameter. For a class of methods, convergence and quasi-optimality results are given. In extensive numerical experiments, the ME-rule and its modification MEe gave good results, whereby in the case $D = L^t$ with $t < s$ these two rules also allowed moderate underestimation of the noise level (other rules fail in the case of an underestimated noise level).

Acknowledgments

This work was started while the first author held an appointment as Visiting Professor at the University of Applied Sciences Zittau/Görlitz from June until August 2010 and was supported by DFG (Deutsche Forschungsgemeinschaft) under a cooperative bilateral research project grant. In addition, the work was supported by the Estonian Science Foundation, Grant 9120.

References

  • Engl HW, Hanke M, Neubauer A. Regularization of inverse problems. Vol. 375, Mathematics and its applications. Dordrecht: Kluwer; 1996.
  • Vainikko GM, Veretennikov AY. Iteration procedures in ill-posed problems. Moscow: Nauka; 1986 (in Russian).
  • Tautenhahn U. On a new parameter choice rule for ill-posed inverse problems. In: Lipitakis EA, editor. HERCMA ’98: Proc. 4th Hellenic-European Conference on Computer Mathematics and its Applications. Athens: Lea Publishers; 1999. p. 118–125.
  • Hämarik U. Monotonicity of error and choice of the stopping index in iterative regularization methods. In: Pedas A, editor. Differential and integral equations: theory and numerical analysis. Tartu: Estonian Mathematical Society; 1999. p. 15–30.
  • Hämarik U, Raus T. On the a posteriori parameter choice in regularization methods. Proc. Estonian Acad. Sci. Phys. Math. 1999;48:133–145.
  • Tautenhahn U, Hämarik U. The use of monotonicity for choosing the regularization parameter in ill-posed problems. Inverse Probl. 1999;15:1487–1505.
  • Hämarik U, Tautenhahn U. On the monotone error rule for parameter choice in iterative and continuous regularization methods. BIT Numerical Math. 2001;41:1029–1038.
  • Hämarik U, Tautenhahn U. On the monotone error rule for choosing the regularization parameter in ill-posed problems. In: Lavrentiev MM, Kabanikhin SI, editors. Ill-posed and non-classical problems of mathematical physics and analysis. Utrecht: VSP; 2003. p. 27–55.
  • Alifanov OM, Rumyantsev SV. On the stability of iterative methods for the solution of linear ill-posed problems. Soviet Math. Dokl. 1979;20:1133–1136.
  • Alifanov OM, Artyukhin EA, Rumyantsev SV. Extreme methods for solving ill-posed problems with applications to inverse heat transfer problems. New York: Begell House; 1995.
  • Egger H. Y-scale regularization. SIAM J. Numer. Anal. 2008;46:419–436.
  • Egger H. Regularization of inverse problems with large noise. J. Phys.: Conf. Ser. 2008; 124: 9pp.
  • Morozov VA. Regularization under large noise. Comput. Math. Math. Phys. 1996;36:1175–1181.
  • Eggermont PPB, LaRiccia VN, Nashed MZ. On weakly bounded noise in ill-posed problems. Inverse Prob. 2009; 25: 14 p.
  • Mathé P, Tautenhahn U. Enhancing linear regularization to treat large noise. J. Inverse Ill-Posed Prob. 2011;19:859–879.
  • Mathé P, Tautenhahn U. Regularization under general noise assumptions. Inverse Prob. 2011; 27:035016 (15 p).
  • Vainikko GM. The discrepancy principle for a class of regularization methods. USSR Comp. Math. Math. Phys. 1982;22:1–19.
  • Hämarik U, Palm R, Raus T. Use of extrapolation in regularization methods. J. Inverse Ill-Posed Prob. 2007;15:277–294.
  • Hämarik U, Palm R, Raus T. Extrapolation of Tikhonov regularization method. Math. Model. Anal. 2010;15:55–68.
  • Tautenhahn U. Error estimates for regularization methods in Hilbert scales. SIAM J. Numer. Anal. 1996;33:2120–2130.
  • Morozov VA. On the solution of functional equations by the method of regularization. Soviet Math. Dokl. 1966;7:414–417.
  • Raus T. On the discrepancy principle for solution of ill-posed problems with non-selfadjoint operators. Acta et Comment. Univ. Tartuensis. 1985;715:12–20 (in Russian).
  • Gfrerer H. An a posteriori parameter choice for ordinary and iterated Tikhonov regularization of ill-posed problems leading to optimal convergence rates. Math. Comp. 1987;49:507–522.
  • Tautenhahn U. On order optimal regularization under general source conditions. Proc. Est. Acad. Sci. Phys. Math 2004;53: 116–123.
  • Hämarik U, Palm R, Raus T. A family of rules for parameter choice in Tikhonov regularization of ill-posed problems with inexact noise level. J. Comput. Appl. Math. 2012;236:2146–2157.
  • Hämarik U, Palm R, Raus T. Comparison of parameter choices in regularization algorithms in case of different information about noise level. Calcolo. 2011;48:47–59.
  • Hämarik U, Kangro U, Palm R, Raus T. On parameter choice in the regularization of ill-posed problems with rough estimate of the noise level of the data. In: Numerical analysis and applied mathematics ICNAAM 2012. AIP Conference Proceedings, 1479. New York (NY): American Institute of Physics; 2012. p. 2332–2335.
  • Hämarik U, Raus T. On the choice of the regularization parameter in ill-posed problems with approximately given noise level of data. J. Inverse Ill-Posed Prob. 2006;14:251–266.
  • Hämarik U, Palm R, Raus T. On minimization strategies for choice of the regularization parameter in ill-posed problems. Numer. Funct. Anal. Opt. 2009;30:924–950.
  • Hämarik U, Raus T, Palm R. On the analog of the monotone error rule for parameter choice in the (iterated) Lavrentiev regularization. Comput. Meth. Appl. Math. 2008;8:237–252.
  • Palm R. Numerical comparison of regularization algorithms for solving ill-posed problems. University of Tartu; 2010. Available from: http://hdl.handle.net/10062/14623.
  • Hämarik U, Palm R, Raus T. A family of rules for the choice of the regularization parameter in the Lavrentiev method in the case of rough estimate of the noise level of the data. J. Inverse Ill-Posed Prob. 2012;20:831–854.
  • Gilyazov SF, Goldman NL. Regularization of ill-posed problems by iteration methods. Vol. 499, Mathematics and its applications. Dordrecht: Kluwer; 2000.
  • Hämarik U, Raus T. On the choice of the stopping index in iteration methods for solving problems with noisy data. In: Lipitakis EA, editor. HERCMA 2001: Proceedings of the 5th Hellenic-European conference on computer mathematics and its applications, Athens, Greece, September 20–22, 2001. Athens: Lea Publishers; 2002. p. 524–529.
  • Mathé P, Pereverzev SV. Geometry of linear ill-posed problems in variable Hilbert scales. Inverse Prob. 2003;19:789–803.
  • Hanke M. Conjugate gradient type methods for ill-posed problems. Harlow: Longman Scientific & Technical; 1995.
  • Pereverzev SV, Schock E. On the adaptive selection of the parameter in the regularization of ill-posed problems. SIAM J. Numer. Anal. 2005;43:2060–2076.
  • Hämarik U, Palm R. On rules for stopping the conjugate gradient type methods in ill-posed problems. Math. Model. Anal. 2007;12:61–70.
  • Hämarik U, Raus T. About the balancing principle for choice of the regularization parameter. Numer. Funct. Anal. Opt. 2009;30:951–970.
  • Hanke M. Accelerated Landweber iterations for the solution of ill-posed equations. Numer. Math. 1991;60:341–373.
  • Raus T, Hämarik U. On the quasi-optimal rules for the choice of the regularization parameter in case of a noisy operator. Adv. Comput. Math. 2012;36:221–233.
  • Hämarik U, Palm R, Raus T. On the quasioptimal regularization parameter choices for solving ill-posed problems. J. Inverse Ill-Posed Prob. 2007;15:419–439.
  • Hansen PC. Regularization tools: a Matlab package for analysis and solution of discrete ill-posed problems. Numerical Algorithms. 1994;6:1–35.
  • Eldén L. The numerical solution of a non-characteristic Cauchy problem for a parabolic equation. In: Numerical treatment of inverse problems in differential and integral equations, Proceedings of an international workshop, Heidelberg, 1982. Boston: Birkhäuser; 1983. p. 246–268.
  • Hämarik U, Avi E, Ganina A. On the solution of ill-posed problems by projection methods with a posteriori choice of the discretization level. Math. Model. Anal. 2002;7:241–252.
