
Monotonicity of error of regularized solution and its use for parameter choice

Pages 10-30 | Received 10 Jul 2013, Accepted 13 Jul 2013, Published online: 20 Aug 2013

Abstract

We consider an ill-posed equation in a Hilbert space with a noisy operator and a noisy right-hand side. The noise level information is given in a general form, as the norm of a certain operator applied to the noise. We derive the monotone error rule (ME-rule) for the choice of the regularization parameter in many methods, giving a parameter such that the error increases monotonically for larger parameters in the Tikhonov method and for smaller stopping indices in iteration methods. The regularization methods considered include Y-scale regularization in the (iterated) Tikhonov method and in iteration methods (the Landweber method, CG-type methods, semi-iterative methods). We also consider modifications of the ME-rule and show in numerical experiments (test problems from Hansen’s Regularization Toolbox, including the sideways heat equation) their advantages over the discrepancy principle.


Introduction

In this paper, we consider linear ill-posed problems
$$A_0x = y_0, \qquad y_0 \in R(A_0), \tag{1}$$
where $A_0$ is a bounded linear operator with non-closed range $R(A_0)$ and $X$, $Y$ are infinite-dimensional real Hilbert spaces with inner products $(\cdot,\cdot)$ and norms $\|\cdot\|$. We are interested in the minimum-norm solution $x_*$ of problem (1) and assume that instead of the exact data $y_0$ and $A_0$, we are given noisy data $y \in Y$ and $A \in \mathcal{L}(X,Y)$ with
$$\|y_0 - y\| \le \delta, \qquad \|A_0 - A\| \le h \tag{2}$$
and known noise levels $\delta$, $h$. Later, we also consider the case of noise level information in the more general form (4).

Ill-posed problems (1) arise in a wide variety of problems in the applied sciences. For their stable numerical solution, regularization methods are necessary, see [Citation1, Citation2]. Regularization methods include Tikhonov regularization $x_r = (A^*A + r^{-1}I)^{-1}A^*y$ with regularization parameter $r \in \mathbb{R}^+$, and iterative and projection methods, where the stopping index $r = n \in \mathbb{N}$ is the regularization parameter. Here, $A^* \in \mathcal{L}(Y,X)$ is the adjoint of $A \in \mathcal{L}(X,Y)$. Traditional regularization methods possess the property that in the case of exact data the error $\|x_r^0 - x_*\|$ of the regularized solution $x_r^0$, as a function of $r$, is monotonically decreasing in $r$. This property is no longer true for the error $\|x_r - x_*\|$ in the case of noisy data. The monotone decrease of the error $\|x_r - x_*\|$ for growing $r$ can only be guaranteed for small $r$. Typically, $\|x_r - x_*\|$ diverges as $r \to \infty$. Therefore, a rule for the proper choice of the regularization parameter $r$ is necessary.

In the monotone error rule (ME-rule) for choosing a proper regularization parameter, the idea consists in searching for the largest computable regularization parameter $r = r_{ME}$ for which we can guarantee the ME-property: the error $\|x_r - x_*\|$ is monotonically decreasing for $r \in (0, r_{ME}]$. For continuous regularization methods, this means that
$$\frac{d}{dr}\|x_r - x_*\|^2 \le 0 \quad \text{for all } r \in (0, r_{ME}];$$
for iteration methods and other methods with $r = n \in \mathbb{N}$, this means that
$$\|x_n - x_*\| \le \|x_{n-1} - x_*\|, \qquad n = 1, 2, \ldots, n_{ME}. \tag{3}$$
From the derivation of the ME-rule for concrete regularization methods, one can see for which perturbations of the operator and the right-hand side the ME-rule gives the optimal regularization parameter.

Consider now the case of the exact operator. In theoretical works on ill-posed problems, often the worst-case error $\sup\{\|x_r - x_*\| : y \in Y,\ Ax_* = y_0,\ \|y - y_0\| \le \delta\}$ is considered. Here, the exact data $y_0$ are fixed and the supremum is taken over the noisy data $y \in Y$. In applications, the exact data $y_0$ are unknown and the noisy data $y$ are known. Then we are interested in finding the parameter $r$ which minimizes the analogue of the worst-case error $\sup\{\|x_r - \tilde x\| : \tilde y \in Y,\ A\tilde x = \tilde y,\ \|\tilde y - y\| \le \delta\}$, where the given data $y$ are fixed and the supremum is taken over “candidates of exact data” $\tilde y \in Y$. This analogue of the worst-case error is minimized by the parameter $r_{ME}$. This can be seen from the derivation of the ME-rule for concrete regularization methods: for $r \in (0, r_{ME}]$, the error $\|x_r - \tilde x\|$ is decreasing for all “candidates of exact data” $\tilde y \in Y$ with $A\tilde x = \tilde y$, but for $r > r_{ME}$ there exists $\tilde y \in Y$ with $A\tilde x = \tilde y$ for which $\|x_r - \tilde x\| > \|x_{r_{ME}} - \tilde x\|$.

All regularization methods considered in this paper have the property $x_r \in R(A^*)$, because we need this property for the formulation of the ME rule.

The ME-rule was proposed and studied for continuous methods in [Citation3–Citation8] and for iterative methods in [Citation4, Citation7, Citation8]; analogous rules were proposed in [Citation9, Citation10]. In this paper, we extend these results in the following directions: we consider the case of a noisy operator ($h > 0$), the noise level information may be given in the general form (4), and we derive the ME-rule for a more general class of regularization methods (including regularization in the Y-scale [Citation11], also for iterative and semi-iterative methods). We also give convergence results and error estimates for the ME-rule.

In practical applications, often several regularization parameters are selected and the corresponding regularized solutions are studied online. With our ME-rule, we provide some help for the selection of suitable values of $r$, since we can guarantee that $\|x_{r_{ME}} - x_*\| \le \|x_r - x_*\|$ holds for $r < r_{ME}$. This information may also be used for improving other a posteriori rules R for the choice of the regularization parameter, since the error for the parameter $\max(r_R, r_{ME})$ is less than or equal to the error for rule R. Unfortunately, this observation does not help to improve the discrepancy principle, which gives a parameter $r_D \le r_{ME}$, and for a smooth solution $r_D$ is often too large in regularization methods with finite qualification. In the (iterated) Tikhonov method, the ME-rule is, in contrast to the discrepancy principle, a quasi-optimal parameter choice.

Note that typically the error of the regularized solution decreases monotonically also somewhat further, up to some $r_{opt} \ge r_{ME}$. Our numerical experiments suggest using $r_{MEe} = c\,r_{ME}$ (or its integer part) with a certain $c > 1$.

The plan of this paper is as follows. In Sections 2 and 3, well-known regularization methods and parameter choice rules are introduced, using the noise-level information (4). In Sections 4 and 5, the ME-rule is derived for the continuous regularization methods and for the iterative methods, respectively. In Section 6, convergence conditions and quasi-optimality results for the ME-rule are given. The paper closes with extensive numerical experiments.

Well-known regularization methods in Y-scale

We assume that the noise level information is given in the form
$$\|D(y_0 - y)\| \le \delta, \qquad \|D(A_0 - A)\| \le h, \tag{4}$$
where $D$ is a linear, injective, possibly unbounded operator in $Y$ with domain $\mathcal{D}(D)$. We assume that $y_0, y \in \mathcal{D}(D)$. The standard case (2) is the special case $D = I$, where $I$ is the identity operator. In [Citation12], the operator $D = L^t$ with $t \in \mathbb{R}$ was used. Here, the operator $L$ is a densely defined, unbounded, self-adjoint and strictly positive operator in the space $Y$, inducing the Hilbert scale $Y_s$ with norm $\|y\|_s := \|L^sy\|$. In the standard case (2), $t = 0$. The case $t < 0$ corresponds to large noise, the case $t > 0$ to small noise. The noise-level information (4) with different operators $D$ (only the case $h = 0$ was considered) was used in the works [Citation13–Citation16].

In many papers, the $X_s$-scale is considered, with the generating operator $L$ in the space $X$; the equation $Ax = y$ is transformed into the equation $L^{-2s}A^*Ax = L^{-2s}A^*y$ and the regularized solutions are constructed in the form $x_r = g_r(L^{-2s}A^*A)L^{-2s}A^*y$. Here, $r$ is the regularization parameter and the function $g_r(\lambda)$ satisfies the conditions
$$\sup_{0\le\lambda\le\Lambda}\sqrt{\lambda}\,|g_r(\lambda)| \le \gamma\sqrt{r}, \qquad r \ge 0, \tag{5}$$
$$\sup_{0\le\lambda\le\Lambda}\lambda^p|1 - \lambda g_r(\lambda)| \le \gamma_p r^{-p}, \qquad r \ge 0, \quad 0 \le p \le p_0. \tag{6}$$
Here, $p_0$, $\gamma$ and $\gamma_p$ are positive constants, $\Lambda$ is at least the norm of the operator in the argument of the function $g_r$, $\gamma_0 = 1$, and the greatest value of $p_0$ for which the inequality (6) holds is called the qualification of the method. We can formulate the ME rule for the choice of the parameter $r$ only for regularized solutions $x_r \in R(A^*)$. Therefore, we prefer to use the Y-scale instead of the X-scale. The Y-scale regularization was proposed by Egger [Citation11, Citation12]. Here, the equation $Ax = y$ is transformed into the equation
$$Bx = \bar y, \qquad B = L^{-s}A, \qquad \bar y = L^{-s}y,$$
assuming that $B : X \to Y$ is bounded and $\bar y \in Y$. If $s \ge 0$ this is always satisfied; if $s < 0$, this means that $R(A) \subset Y_{-s}$ and $y \in Y_{-s}$. The regularized solutions are constructed in the form
$$x_r = g_r(B^*B)B^*\bar y = B^*g_r(BB^*)\bar y = A^*w_r, \qquad w_r = L^{-s}g_r(BB^*)\bar y. \tag{7}$$
Special cases of regularization methods of this form are the following well-known methods (cf. [Citation1, Citation2, Citation17]); a small numerical sketch of two of them follows the list.

  1. The Tikhonov method $x_\alpha = (\alpha I + B^*B)^{-1}B^*\bar y$. Here, $r = \alpha^{-1}$, $g_r(\lambda) = (\lambda + r^{-1})^{-1}$, $p_0 = 1$, $\gamma = 1/2$, $\gamma_p = p^p(1-p)^{1-p}$.

  2. Iterative variants of the Tikhonov method. Let $m \in \mathbb{N}$, $m \ge 1$, $x_{0,\alpha} \in X$ be an initial approximation and
$$x_{n,\alpha} = (\alpha I + B^*B)^{-1}(\alpha x_{n-1,\alpha} + B^*\bar y), \qquad n = 1, 2, \ldots, m.$$
Here $r = \alpha^{-1}$, $g_r(\lambda) = \frac{1}{\lambda}\left(1 - \left(\frac{1}{1+r\lambda}\right)^m\right)$, $p_0 = m$, $\gamma = \sqrt{m}$, $\gamma_p = \left(\frac{p}{m}\right)^p\left(1 - \frac{p}{m}\right)^{m-p}$.

  3. Explicit iteration scheme (Landweber’s method). Let $x_0 = 0$,
$$x_n = x_{n-1} - \mu(B^*Bx_{n-1} - B^*\bar y), \qquad \mu \in (0, 1/\|B^*B\|), \quad n = 1, 2, \ldots$$
Here, $r = n$, $g_r(\lambda) = \frac{1}{\lambda}\left(1 - (1-\mu\lambda)^r\right)$, $p_0 = \infty$, $\gamma = \sqrt{\mu}$, $\gamma_p = \left(\frac{p}{\mu e}\right)^p$.

  4. Implicit iteration scheme. Let $\alpha > 0$ be a constant, $x_0 = 0$ and
$$x_n = (\alpha I + B^*B)^{-1}(\alpha x_{n-1} + B^*\bar y), \qquad n = 1, 2, \ldots$$
Here, $r = n$, $g_r(\lambda) = \frac{1}{\lambda}\left(1 - \left(\frac{\alpha}{\alpha+\lambda}\right)^r\right)$, $p_0 = \infty$, $\gamma = b_0/\sqrt{\alpha}$, where $b_0 = \sup_{0<\lambda<\infty}\lambda^{-1/2}(1 - e^{-\lambda}) \approx 0.6382$, and $\gamma_p = (\alpha p)^p$.

  5. The method of asymptotical regularization: the approximation $x_r$ solves the Cauchy problem
$$x'(r) + B^*Bx(r) = B^*\bar y, \qquad x(0) = 0.$$
Here, $g_r(\lambda) = \frac{1}{\lambda}\left(1 - e^{-r\lambda}\right)$, $p_0 = \infty$, $\gamma = b_0$, $\gamma_p = (p/e)^p$.
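As an illustration of this filter-function framework, the following minimal sketch implements methods M1 and M3 in the standard case $s = 0$ (so that $B = A$) with dense linear algebra. It is a sketch under these assumptions, not the authors' implementation; the function names are ours.

```python
import numpy as np

def tikhonov(A, y, alpha):
    """Method M1 with s = 0 (B = A): x_alpha = (alpha I + A^T A)^{-1} A^T y."""
    n = A.shape[1]
    return np.linalg.solve(alpha * np.eye(n) + A.T @ A, A.T @ y)

def landweber(A, y, mu, n_iter):
    """Method M3 with s = 0: x_n = x_{n-1} - mu (A^T A x_{n-1} - A^T y),
    with mu in (0, 1/||A^T A||) as stated in the text."""
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        x = x - mu * (A.T @ (A @ x) - A.T @ y)
    return x
```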

In the iterated Tikhonov method, we may use a different parameter $\alpha_n$ at every iteration step $n$; then we obtain the same approximation as in the extrapolated Tikhonov method [Citation18, Citation19]
$$x_{\alpha_1,\ldots,\alpha_m} = \sum_{i=1}^m c_ix_{\alpha_i}, \qquad c_i = \prod_{\substack{j=1\\ j\ne i}}^m\frac{\alpha_j}{\alpha_j - \alpha_i}, \tag{8}$$
where $x_{\alpha_i}$, $i = 1, \ldots, m$, are the approximations of the Tikhonov method with the parameters $\alpha_i$.

Note that the regularized solutions in the $X_s$-scale and $Y_s$-scale variants of the Tikhonov method have the form
$$x_\alpha = (\alpha I + L^{-2s}A^*A)^{-1}L^{-2s}A^*y, \qquad x_\alpha = (\alpha I + A^*L^{-2s}A)^{-1}A^*L^{-2s}y, \tag{9}$$
and they minimize the corresponding functionals
$$\|Ax - y\|^2 + \alpha\|L^sx\|^2, \qquad \|L^{-s}(Ax - y)\|^2 + \alpha\|x\|^2.$$
If the operator $L$ acts in both spaces $X$ and $Y$ and the operators $A$ and $L$ commute, then these regularized approximations coincide. Of these two approximations, the first one is much more widespread (cf. [Citation20]). For the operators $L$ and $L^{-1}$, one may use, for example, differentiation and integration operators. The fractional powers $L^s$ and $L^{-s}$ can be implemented efficiently, e.g. via FFT or multi-level techniques.

Positive values of $s$ are good for delaying saturation in the Tikhonov and iterated Tikhonov methods [Citation12]; negative values of $s$ are good for preconditioning in the iteration methods, which drastically decreases the number of iteration steps [Citation11].

Well-known rules for choice of the regularization parameter

For the choice of the regularization parameter in any regularization method under the noise information (4), some estimate or approximation of the value $\|D(y - Ax_*)\|$ is needed. This value may be estimated as follows:
$$\|D(y - Ax_*)\| = \|D(y - y_0 + (A_0 - A)x_*)\| \le \delta + h\|x_*\| =: \Delta_1. \tag{10}$$
Typically, the exact value of $\|x_*\|$ is unknown. Substituting $\|x_*\|$ by some upper bound $M \ge \|x_*\|$ gives the rougher estimate $\|D(y - Ax_*)\| \le \delta + hM =: \Delta_2$. If an upper bound $M$ is not available, the approximation of $\|D(y - Ax_*)\|$ by $\Delta_3 := \delta + h\|x_\alpha\|$ (in iteration methods $\Delta_3 := \delta + h\|x_n\|$) may be used.

In the following, we formulate some a posteriori rules for choosing the regularization parameter which use $\Delta \in \{\Delta_1, \Delta_2, \Delta_3\}$ and are well known in the case $D = I$, $s = 0$. We assume $R(AA^*) \subset \mathcal{D}(D)$.

  1. Discrepancy principle [Citation1, Citation2, Citation17, Citation21]. In the continuous regularization methods, this principle (D principle) chooses the parameter $\alpha = \alpha_D$ as the solution of the equation
$$d_D(\alpha) := \|D(y - Ax_\alpha)\| = C\Delta \quad\text{with } C \ge 1. \tag{11}$$
In iteration methods, the discrepancy principle finds $n_D$ as the first index for which $\|D(y - Ax_n)\| \le C\Delta$.

  2. Modified discrepancy principle [Citation22]. In this rule for the methods M1, M2, the parameter $\alpha = \alpha_{MD}$ is chosen as the solution of the equation
$$d_{MD}(\alpha) := \|DL^{s}(I - g_\alpha(BB^*)BB^*)^{1/(2p_0)}L^{-s}(y - Ax_\alpha)\| = C\Delta \tag{12}$$
with $C \ge 1$, assuming $Y_s \subset \mathcal{D}(D)$ (if $D = L^t$, then a sufficient condition is $s \ge t$). Here, $p_0$ is the qualification of the regularization method, see (6). One may also use the alternative MD′ rule
$$d_{MD'}(\alpha) = \left(D(y - Ax_\alpha),\; DL^{s}(I - g_\alpha(BB^*)BB^*)^{1/p_0}L^{-s}(y - Ax_\alpha)\right)^{1/2} = C\Delta. \tag{13}$$

In the case $D = L^{-s}$, the functions $d_{MD}(\alpha)$ and $d_{MD'}(\alpha)$ coincide. For regularization methods with $p_0 = \infty$, both these forms of the modified discrepancy principle coincide with the discrepancy principle.
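As a small illustration, the discrepancy principle (11) can be realized for the Tikhonov method in the standard case $s = 0$, $D = I$, $h = 0$ by searching a geometric grid of parameters, in the spirit of the $\alpha$-sequence used in the numerical section below. This is a hedged sketch with our own function name, not the authors' code.

```python
import numpy as np

def discrepancy_principle_tikhonov(A, y, delta, C=1.0, q=0.9, alpha0=1.0, max_steps=500):
    """Take the first alpha on the grid alpha0 * q**i (i = 0, 1, ...) with
    ||y - A x_alpha|| <= C * delta, where x_alpha is the Tikhonov solution."""
    n = A.shape[1]
    alpha, x = alpha0, np.zeros(n)
    for _ in range(max_steps):
        x = np.linalg.solve(alpha * np.eye(n) + A.T @ A, A.T @ y)
        if np.linalg.norm(y - A @ x) <= C * delta:
            break
        alpha *= q
    return alpha, x
```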

ME-rule for the continuous regularization methods

Derivation of the ME-rule

Let us consider the ME-rule for the continuous regularization methods. A prominent example of these methods is the Tikhonov method, where the regularization parameter is traditionally denoted by $\alpha$. Therefore, in this section we use the notation $\alpha = 1/r$ instead of $r$. We assume that the corresponding function $g_\alpha(\lambda)$ is differentiable with respect to $\alpha$.

Let us consider the regularized approximation
$$x_\alpha = A^*w_\alpha, \qquad z_\alpha := \frac{d}{d\alpha}w_\alpha \in Y, \tag{14}$$
assuming that $z_\alpha \in \mathcal{D}((D^{-1})^*)$. An example of a regularized approximation of this form is the approximation $x_\alpha = A^*w_\alpha$ with $w_\alpha = L^{-s}g_\alpha(BB^*)\bar y$ from (7). The reformulation of the general idea of the ME-rule in terms of $\alpha = 1/r$ means: search for the smallest computable regularization parameter $\alpha = \alpha_{ME}$ for which we can guarantee that
$$\frac{d}{d\alpha}\|x_\alpha - x_*\|^2 \ge 0 \quad\text{for all } \alpha \in [\alpha_{ME}, \infty). \tag{15}$$
In order to guarantee this property, we estimate the derivative of the squared error $\|x_\alpha - x_*\|^2$ with respect to $\alpha$ under condition (10) as follows:
$$\frac12\frac{d}{d\alpha}\|x_\alpha - x_*\|^2 = (x_\alpha - x_*, A^*z_\alpha) = (D(Ax_\alpha - y + y - Ax_*), (D^{-1})^*z_\alpha) \ge \|(D^{-1})^*z_\alpha\|\left(\frac{(Ax_\alpha - y, z_\alpha)}{\|(D^{-1})^*z_\alpha\|} - (\delta + h\|x_*\|)\right). \tag{16}$$
This estimate leads us to the following ME-rule for the continuous regularization method (14).

ME rule. Choose $\alpha = \alpha_{ME}$ as the largest solution of the equation
$$d_{ME}(\alpha) := \frac{(Ax_\alpha - y, z_\alpha)}{\|(D^{-1})^*z_\alpha\|} = \Delta \tag{17}$$
with $\Delta \in \{\Delta_1, \Delta_2\}$. If $d_{ME}(\alpha) > \Delta$ for all $\alpha > 0$, take $\alpha_{ME} = 0$. If $d_{ME}(\alpha) < \Delta$ for all $\alpha > 0$, take $x_\alpha = 0$ (this corresponds to $\alpha = \alpha_{ME} = \infty$). For the regularized approximation (7), this equation has the form
$$d_{ME}(\alpha) := \frac{\left(Ax_\alpha - y,\; L^{-s}\frac{d}{d\alpha}g_\alpha(BB^*)L^{-s}y\right)}{\left\|(D^{-1})^*L^{-s}\frac{d}{d\alpha}g_\alpha(BB^*)L^{-s}y\right\|} = \Delta. \tag{18}$$
Here, the assumption $z_\alpha \in \mathcal{D}((D^{-1})^*)$ is satisfied if $Y_s \subset \mathcal{D}((D^{-1})^*)$.

Note that for $\Delta = \Delta_3$, the ME-property is not guaranteed, but quasi-optimality is retained (see Theorem 6.3). However, $\Delta = \Delta_3$ does not need the norm of the exact solution or a bound for it, and in practice it may give better results than $\Delta \in \{\Delta_1, \Delta_2\}$.

From the definition of $d_{ME}(\alpha)$ follows the inequality $d_{ME}(\alpha) \le \|D(y - Ax_\alpha)\|$; therefore $\alpha_D \le \alpha_{ME}$ if $C = 1$ in the discrepancy principle. The derivation of the ME-rule in (16) uses only one inequality, which turns into an equality in the case
$$y - Ax_* = -\frac{\delta + h\|x_*\|}{\|(D^{-1})^*z_\alpha\|}\,(D^*D)^{-1}z_\alpha.$$
Therefore, in this case the ME-rule gives the optimal parameter, if the equation (18) has a unique solution.

ME-rule and modifications for the (iterated) Tikhonov regularization

Let $x_0 = 0$. Let $\rho_{m,\alpha}$ denote the discrepancy of the (iterated) Tikhonov approximation $x_\alpha = x_{m,\alpha}$, i.e. $\rho_{m,\alpha} := y - Ax_{m,\alpha}$.

Using the identities
$$1 - \lambda g_\alpha(\lambda) = \left(\frac{\alpha}{\lambda+\alpha}\right)^m \quad\text{and}\quad \frac{d}{d\alpha}g_\alpha(\lambda) = -\frac{m}{\alpha^2}\left(\frac{\alpha}{\lambda+\alpha}\right)^{m+1}$$
we obtain
$$L^{-s}\rho_{m,\alpha} = \bar y - Bx_{m,\alpha} = \left[I - BB^*g_\alpha(BB^*)\right]\bar y = \left[\alpha(BB^*+\alpha I)^{-1}\right]^m\bar y$$
and
$$\frac{d}{d\alpha}g_\alpha(BB^*)\bar y = -\frac{m}{\alpha^2}L^{-s}\rho_{m+1,\alpha}.$$
From these representations and from (18), we conclude that the functions $d_{MD'}(\alpha)$ in (13) and $d_{ME}(\alpha)$ for the MD′- and ME-rules have the form
$$d_{MD'}(\alpha) = (D\rho_{m,\alpha}, D\rho_{m+1,\alpha})^{1/2} \quad\text{and}\quad d_{ME}(\alpha) = \frac{(\rho_{m,\alpha}, L^{-2s}\rho_{m+1,\alpha})}{\|(D^{-1})^*L^{-2s}\rho_{m+1,\alpha}\|}. \tag{19}$$
In the case $D = I$, $h = s = 0$, convergence and order optimal error estimates for the (iterated) Tikhonov method are proved for the MD′-rule in [Citation5, Citation22, Citation23] and for the ME-rule in [Citation3, Citation5, Citation6, Citation8, Citation24].
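For the ordinary Tikhonov method ($m = 1$) in the standard case $s = 0$, $D = I$, $h = 0$, the ME-function in (19) reduces to $d_{ME}(\alpha) = (\rho_{1,\alpha}, \rho_{2,\alpha})/\|\rho_{2,\alpha}\|$, where $\rho_{2,\alpha}$ is the residual of the twice-iterated Tikhonov approximation. The following sketch evaluates this function and selects $\alpha_{ME}$ on a geometric grid; it is an illustrative implementation under these assumptions, not the authors' code.

```python
import numpy as np

def d_me_tikhonov(A, y, alpha):
    """ME-function (19) for ordinary Tikhonov (m = 1, s = 0, D = I)."""
    n = A.shape[1]
    M = alpha * np.eye(n) + A.T @ A
    x1 = np.linalg.solve(M, A.T @ y)                 # Tikhonov approximation x_alpha
    x2 = np.linalg.solve(M, alpha * x1 + A.T @ y)    # second Tikhonov iterate x_{2,alpha}
    rho1, rho2 = y - A @ x1, y - A @ x2
    return rho1 @ rho2 / np.linalg.norm(rho2)

def alpha_me(A, y, delta, q=0.9, alpha0=1.0, max_steps=500):
    """Approximate the largest solution of d_ME(alpha) = delta by taking the first
    alpha on the decreasing grid alpha0 * q**i with d_ME(alpha) <= delta."""
    alpha = alpha0
    for _ in range(max_steps):
        if d_me_tikhonov(A, y, alpha) <= delta:
            break
        alpha *= q
    return alpha
```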

One may also use an analogue of the ME-rule with the function
$$d_{MEa}(\alpha) = \frac{(D\rho_{m,\alpha}, D\rho_{m+1,\alpha})}{\|D\rho_{m+1,\alpha}\|}, \tag{20}$$
which coincides with the ME-rule in the case $D = L^{-s}$.

Consider now some modifications of the ME-rule. We know from (15) that $\alpha_{ME} \ge \alpha_{opt} := \arg\min\{\|x_{m,\alpha} - x_*\| : \alpha \ge 0\}$. This means that typically a somewhat smaller parameter $\alpha_{ME}/c$ with a proper $c > 1$ is a better parameter than $\alpha_{ME}$. For the case $D = I$, $h = s = 0$, we optimized the value of $c$ (and other constants below) in extensive numerical experiments and recommend the estimated parameter $\alpha_{MEe} = \alpha_{ME}/2.3$. Note that in the case of a rough estimate of the noise level, better rules than the ME-rule and the discrepancy principle are the rules from the recently derived family of rules [Citation25], see also [Citation26–Citation29].

In the extrapolated Tikhonov method (8), $x_\alpha := x_{\alpha_1,\ldots,\alpha_m}$, one may use the logarithmically uniform mesh of parameters $\alpha_n = \alpha q^{n-(m+1)/2}$, $n = 1, \ldots, m$, $q < 1$, and choose $\alpha$ by the same rules as in the iterated Tikhonov method.

Note that in [Citation30], an analogue of the ME-rule in the case $D = I$, $h = s = 0$ was proposed for the ($m \ge 1$ times iterated) Lavrentiev method, finding the parameter $\alpha = \alpha_{MEa}$ as the solution of the equation $\|\rho_{m+1,\alpha}\|^2/\|\rho_{m+2,\alpha}\| = C_m\delta$, where $C_1 = 1.55$, $C_2 = 1.6$. Numerical experiments in [Citation30, Citation31] showed that with this parameter choice, the error of the regularized solution was on average only 5% larger (in modifications of this rule, 3% larger) than with the optimal parameter. Note also that several other analogues of the ME-rule for the ($m \ge 1$ times iterated) Lavrentiev method were proposed and analysed in [Citation32].

ME-rule for asymptotical regularization

In this regularization method, the regularized solution $x_\alpha$ is given as the solution of the initial value problem
$$\frac{d}{dt}x(t) + B^*Bx(t) = B^*\bar y \quad\text{for } 0 < t \le r, \qquad x(0) = 0,$$
with $r = 1/\alpha$. In this method, we have $g_\alpha(\lambda) = (1 - e^{-\lambda/\alpha})/\lambda$. Using the identities
$$\frac{d}{d\alpha}g_\alpha(\lambda) = -\frac{1}{\alpha^2}e^{-\lambda/\alpha} = -\frac{1}{\alpha^2}(1 - \lambda g_\alpha(\lambda))$$
we obtain for the method of asymptotical regularization
$$d_{ME}(\alpha) = \frac{(Ax_\alpha - y, L^{-2s}(Ax_\alpha - y))}{\|(D^{-1})^*L^{-2s}(Ax_\alpha - y)\|}.$$
In the case $s = 0$ and $D = I$, the ME-rule coincides with the discrepancy principle, for which convergence and order optimal error estimates are well known (cf. [Citation1, Citation2, Citation8, Citation17]).

ME-rule for iterative regularization methods

Derivation of the ME-rule

For solving ill-posed problems (1), we now consider iteration methods of the general form
$$x_n = x_{n-1} + A^*z_{n-1}, \qquad n = 1, 2, \ldots \tag{21}$$
with $z_n \in Y$. The elements $z_n$ characterize the particular iteration method. We assume that $z_n \in \mathcal{D}((D^{-1})^*)$. Simple iteration methods are the explicit iteration scheme (Landweber method) M3 with $z_n = \mu L^{-2s}\rho_n$, $\mu \in (0, 1/\|B^*B\|)$, and the implicit iteration scheme M4 with $z_n = L^{-s}(\alpha I + BB^*)^{-1}L^{-s}\rho_n = \alpha^{-1}L^{-2s}\rho_{n+1}$, $\alpha > 0$, where $\rho_n = y - Ax_n$.

In the monotone error rule (ME-rule), we search for the largest computable iteration number $n_{ME}$ for which the monotonicity property (3),
$$\|x_n - x_*\| \le \|x_{n-1} - x_*\| \quad\text{for all } n = 1, 2, \ldots, n_{ME}, \tag{22}$$
can be guaranteed. Exploiting (21) and (10), we obtain
$$\begin{aligned}\|x_{n+1} - x_*\|^2 - \|x_n - x_*\|^2 &= \|x_n + A^*z_n - x_*\|^2 - \|x_n - x_*\|^2 = 2(x_n - x_*, A^*z_n) + \|A^*z_n\|^2\\ &= 2(D(Ax_n - y + y - Ax_*), (D^{-1})^*z_n) + \|A^*z_n\|^2\\ &\le 2\|(D^{-1})^*z_n\|(\delta + h\|x_*\|) - 2(\rho_n, z_n) + \|A^*z_n\|^2 =: \Phi(z_n).\end{aligned}\tag{23}$$
This estimate leads us to the following ME-rule for iteration methods (21).

ME rule. Choose $n_{ME}$ as the first index $n$ satisfying
$$d_{ME}(n) := \frac{2(\rho_n, z_n) - \|A^*z_n\|^2}{2\|(D^{-1})^*z_n\|} \le \Delta \tag{24}$$
with $\Delta \in \{\Delta_1, \Delta_2\}$.

The derivation of the ME-rule in (22) uses only one inequality, which turns into an equality in the case
$$y - Ax_* = \frac{\delta + h\|x_*\|}{\|(D^{-1})^*z_n\|}\,(D^*D)^{-1}z_n.$$
Therefore, in this case the ME-rule gives the optimal parameter.

Note that for $\Delta = \Delta_3$, the ME-property is not guaranteed, but for certain methods it is known that quasi-optimality still holds (see Theorem 6.3). However, typically the monotonicity of the error also holds up to somewhat larger iteration numbers; hence $\Delta = \Delta_3$ may also give better results than $\Delta \in \{\Delta_1, \Delta_2\}$.

From the definition of $d_{ME}(n)$ follows the inequality $d_{ME}(n) \le \|D\rho_n\| = d_D(n)$.

Due to the equality $\rho_{n+1} = \rho_n - AA^*z_n$, the function $d_{ME}(n)$ may also be presented in the forms
$$d_{ME}(n) = \frac{(\rho_n + \rho_{n+1}, z_n)}{2\|(D^{-1})^*z_n\|} = \frac{2(\rho_{n+1}, z_n) + \|A^*z_n\|^2}{2\|(D^{-1})^*z_n\|}. \tag{25}$$
Inequalities (23) and (24) may be used as follows (a small numerical sketch of the first use is given after the list):

  1. for stopping the iterations (21): stop at the first $n$ for which $d_{ME}(n) \le \Delta$;

  2. for choosing $z_n$, minimizing the function $\Phi(z_n)$ in the error estimate (23) (for example, choosing the steplength $\mu$ in (26));

  3. for using several iteration methods, switching from a faster method (for example CGLS) to a slower method (for example the Landweber method): if inequality (23) does not guarantee a decrease of the error in the fast method, the accuracy of the approximation may be further increased by the slower method.
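The first of these uses (stopping the iterations) is illustrated below for the Landweber method in the standard case $s = 0$, $D = I$, $h = 0$, where $z_n = \mu\rho_n$ and (25) gives $d_{ME}(n) = (\rho_n + \rho_{n+1}, \rho_n)/(2\|\rho_n\|)$. The sketch is ours and only indicates one possible realization.

```python
import numpy as np

def landweber_me(A, y, delta, mu=None, max_iter=5000):
    """Landweber iteration stopped by the ME-rule (24)/(25) for s = 0, D = I, h = 0.
    A step is accepted only while d_ME(n) > delta, i.e. while the error is still
    guaranteed to decrease."""
    if mu is None:
        mu = 0.99 / np.linalg.norm(A, 2) ** 2          # a step size in (0, 1/||A^T A||)
    x = np.zeros(A.shape[1])
    rho = y - A @ x
    for n in range(max_iter):
        x_next = x + mu * (A.T @ rho)
        rho_next = y - A @ x_next
        d_me = (rho + rho_next) @ rho / (2.0 * np.linalg.norm(rho))
        if d_me <= delta:                              # guaranteed decrease ends here
            return x, n
        x, rho = x_next, rho_next
    return x, max_iter
```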

ME-rule in gradient-type methods

Let us consider gradient-type methods of the form (21) with $z_n = \mu_nL^{-2s}\rho_n$, where $\rho_n = y - Ax_n$ is the discrepancy and $\mu_n > 0$ is a properly chosen stepsize:
$$x_n = x_{n-1} + \mu_{n-1}A^*L^{-2s}(y - Ax_{n-1}), \qquad n = 1, 2, \ldots \tag{26}$$
Special cases of method (26) are the Landweber method with $\mu_n = \mu \in (0, 1/\|B^*B\|)$, the minimal error method with
$$\mu_n = \frac{\|L^{-s}\rho_n\|^2}{\|A^*L^{-2s}\rho_n\|^2}$$
and the steepest descent method with
$$\mu_n = \frac{\|A^*L^{-2s}\rho_n\|^2}{\|L^{-s}AA^*L^{-2s}\rho_n\|^2}.$$

Here, the ME-function (25) has the form
$$d_{ME}(n) = \frac{(\rho_n + \rho_{n+1}, L^{-2s}\rho_n)}{2\|(D^{-1})^*L^{-2s}\rho_n\|}.$$
The estimate (23) which led us to the ME-rule can also be exploited for finding stepsizes $\mu_n > 0$ in the iteration methods (26) which guarantee that $x_{n+1}$ is a better approximation for $x_*$ than $x_n$. Here, the stepsize $\mu_n$ may depend not only on $y$ but also on the noise level $\Delta$. Exploiting $z_n = \mu_nL^{-2s}\rho_n$, the estimate (23) takes the form
$$\|x_{n+1} - x_*\|^2 - \|x_n - x_*\|^2 \le -2\mu_n\left(\|L^{-s}\rho_n\|^2 - \|(D^{-1})^*L^{-2s}\rho_n\|\Delta\right) + \mu_n^2\|A^*L^{-2s}\rho_n\|^2. \tag{27}$$
Therefore, $x_{n+1}$ is a better approximation for $x_*$ than $x_n$ if
$$\Delta < \frac{\|L^{-s}\rho_n\|^2}{\|(D^{-1})^*L^{-2s}\rho_n\|}, \qquad 0 < \mu_n < \frac{2\left(\|L^{-s}\rho_n\|^2 - \|(D^{-1})^*L^{-2s}\rho_n\|\Delta\right)}{\|A^*L^{-2s}\rho_n\|^2}.$$
Minimizing the right-hand side of (27) with respect to $\mu_n$ yields
$$\mu_n = \frac{\|L^{-s}\rho_n\|^2 - \|(D^{-1})^*L^{-2s}\rho_n\|\Delta}{\|A^*L^{-2s}\rho_n\|^2}.$$
Substituting into (27) shows that for this stepsize, the improvement of the squared error can be estimated by
$$\|x_{n+1} - x_*\|^2 - \|x_n - x_*\|^2 \le -\frac{\left(\|L^{-s}\rho_n\|^2 - \|(D^{-1})^*L^{-2s}\rho_n\|\Delta\right)^2}{\|A^*L^{-2s}\rho_n\|^2}.$$
In the case $D = I$, $h = s = 0$, the last two relations were given in [Citation8, Citation10], and convergence results and order optimal error bounds for stopping many gradient methods [Citation2, Citation9, Citation10, Citation17, Citation33] with the ME-rule were stated in [Citation4, Citation7–Citation10, Citation34].
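A minimal sketch of the resulting noise-adapted gradient iteration in the standard case $s = 0$, $D = I$, $h = 0$ follows; there the conditions above read $\|\rho_n\| > \Delta$ and $\mu_n = (\|\rho_n\|^2 - \|\rho_n\|\Delta)/\|A^T\rho_n\|^2$. The names are ours and the code is only one possible realization.

```python
import numpy as np

def gradient_me_stepsize(A, y, delta, max_iter=2000):
    """Gradient iteration (26) with the stepsize minimizing the right-hand side of (27),
    specialized to s = 0, D = I, h = 0.  Iterates while ||rho_n|| > delta, since otherwise
    the guaranteed improvement would be non-positive."""
    x = np.zeros(A.shape[1])
    for _ in range(max_iter):
        rho = y - A @ x
        nr = np.linalg.norm(rho)
        if nr <= delta:
            break
        g = A.T @ rho
        mu = (nr ** 2 - nr * delta) / np.linalg.norm(g) ** 2   # stepsize with guaranteed improvement
        x = x + mu * g
    return x
```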

ME-rule in conjugate gradient methods

Some gradient-type methods that do not fit into the class of methods (26) are the conjugate gradient methods (see [Citation9, Citation10, Citation33, Citation36]).

The methods CGLS and CGME are the conjugate gradient method applied to $B^*Bx = B^*\bar y$ and to $BB^*v = \bar y$ with $x = B^*v$, respectively. In both methods, we fix the initial approximation $x_0$, the initial value $u_0 = 0$ and $\beta_0 = 0$, and compute $\tau_0 = \bar y - Bx_0$. Further, in CGLS we compute $p_0 = B^*\tau_0$ and, for $n = 1, 2, \ldots$, iteratively
$$u_n = \beta_{n-1}u_{n-1} - \tau_{n-1},\quad \gamma_n = \frac{\|p_{n-1}\|^2}{\|BB^*u_n\|^2},\quad x_n = x_{n-1} - \gamma_nB^*u_n,\quad \tau_n = \tau_{n-1} + \gamma_nBB^*u_n,\quad p_n = B^*\tau_n,\quad \beta_n = \frac{\|p_n\|^2}{\|p_{n-1}\|^2}.$$
In the method CGME, one computes for $n = 1, 2, \ldots$ iteratively
$$u_n = \beta_{n-1}u_{n-1} - \tau_{n-1},\quad \gamma_n = \frac{\|\tau_{n-1}\|^2}{\|B^*u_n\|^2},\quad x_n = x_{n-1} - \gamma_nB^*u_n,\quad \tau_n = \tau_{n-1} + \gamma_nBB^*u_n,\quad \beta_n = \frac{\|\tau_n\|^2}{\|\tau_{n-1}\|^2}.$$
Both the CGLS and the CGME method have the form (21) with $z_{n-1} = -\gamma_nL^{-s}u_n$; hence the ME-rule (24) can be used. In the case $D = I$, $h = s = 0$, the ME-rule for the CGLS and CGME methods was proposed and numerically tested in [Citation38].
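The sketch below shows one way to combine a textbook CGLS recursion with the ME stopping rule in the case $s = 0$, $D = I$, $h = 0$. It is not a transcription of the recursion above: instead, the search direction $d_n = A^Tw_n$ is tracked through its Y-space representative $w_n$, so that the increment $x_n - x_{n-1} = \gamma_nA^Tw_n$ has the form (21) and $d_{ME}(n) = (\rho_{n-1} + \rho_n, w_n)/(2\|w_n\|)$ can be evaluated. All names are ours.

```python
import numpy as np

def cgls_me(A, y, delta, max_iter=200):
    """CGLS (conjugate gradients on A^T A x = A^T y) with ME stopping (s = 0, D = I)."""
    x = np.zeros(A.shape[1])
    r = y - A @ x                      # residual in Y
    p = A.T @ r                        # gradient in X
    w = r.copy()                       # Y-representative of the search direction d = A^T w
    d = p.copy()
    for n in range(1, max_iter + 1):
        q = A @ d
        gamma = (p @ p) / (q @ q)
        x_next = x + gamma * d
        r_next = y - A @ x_next
        d_me = (r + r_next) @ w / (2.0 * np.linalg.norm(w))
        if d_me <= delta:              # error decrease no longer guaranteed: stop
            return x, n - 1
        p_next = A.T @ r_next
        beta = (p_next @ p_next) / (p @ p)
        w = r_next + beta * w          # keeps d = A^T w in sync
        d = p_next + beta * d
        x, r, p = x_next, r_next, p_next
    return x, max_iter
```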

ME-rule for sequence of approximations

In contrast to iterative methods, where the approximation at step $n$ is computed using the approximation at the previous step $n-1$, one may consider an arbitrary sequence of approximations
$$x_n = A^*w_n, \qquad n = 1, 2, \ldots, \quad w_n \in Y \tag{28}$$
and ask which of these approximations to take as the approximate solution.

Theorem 5.1

If in the sequence (28) the index $n = n_{ME}$ is chosen as the first index $n$ satisfying
$$d_{ME}(n) = \frac{(\rho_n + \rho_{n+1}, w_{n+1} - w_n)}{2\|(D^{-1})^*(w_{n+1} - w_n)\|} \le \Delta \tag{29}$$
with $\Delta \in \{\Delta_1, \Delta_2\}$, then the ME-property (3) holds.

Proof

This follows from (25) with $z_n = w_{n+1} - w_n$.

To use the functional $d_{ME}(n)$, the elements $w_n$ are needed. They can be found by first computing $w_n$ and then $x_n = A^*w_n$.

Note that the last theorem also gives an alternative way to derive the ME-rule for continuous regularization methods, using the sequence $\alpha_n = q^n$, $q < 1$, and taking the limit $q \to 1$.

The last theorem may be applied as an alternative or a complement to any other parameter choice rule for finding a regularized approximation from a sequence of the form $x_n = A^*w_n$, $n = 1, 2, \ldots$ (for example, in the (iterated) Tikhonov method). Note that the computation of the sequence of all approximate solutions for $n = 1, 2, \ldots, N$ with some $N$ is required, for example, in the balancing principle (see [Citation11, Citation35, Citation37, Citation39]). This theorem guarantees that the stopping index $n_{ME,B} = \max(n_{ME}, n_B)$ is at least as good as the index $n_B$ from an arbitrary balancing principle: $\|x_{n_{ME,B}} - x_*\| \le \|x_{n_B} - x_*\|$.

ME-rule for semi-iterative regularization methods

Let $p_n$ be residual polynomials, orthogonal on the interval $[0, \|B^*B\|]$ with respect to some measure (see [Citation40]). Then $p_0 = 1$ and
$$p_n(\lambda) = p_{n-1}(\lambda) + (1 - \mu_n)(p_{n-2}(\lambda) - p_{n-1}(\lambda)) - \omega_n\lambda p_{n-1}(\lambda), \qquad n \ge 2,$$
with some constants $\mu_n$ and $\omega_n$, where $\mu_1 = 1$. Then the semi-iterative approximations are found as
$$x_1 = x_0 + \omega_1A^*L^{-2s}(y - Ax_0),$$
$$x_n = x_{n-1} + (1 - \mu_n)(x_{n-2} - x_{n-1}) + \omega_nA^*L^{-2s}(y - Ax_{n-1}), \qquad n \ge 2.$$
In the case $x_0 = A^*w_0$, we have $x_n = A^*w_n$ with $w_1 = w_0 + \omega_1L^{-2s}(y - AA^*w_0)$ and
$$w_n = w_{n-1} + (1 - \mu_n)(w_{n-2} - w_{n-1}) + \omega_nL^{-2s}(y - AA^*w_{n-1}), \qquad n \ge 2. \tag{30}$$
Thus, the iterations of the semi-iterative methods may be stopped by the ME-rule (29). We give formulas for $w_n$ for the following examples of semi-iterative methods (see [Citation40]).

  1. The Chebyshev method of Stiefel:
$$w_n = \frac{2n}{n+1}w_{n-1} - \frac{n-1}{n+1}w_{n-2} + \frac{4n}{n+1}L^{-2s}(y - AA^*w_{n-1}), \qquad n = 1, 2, \ldots$$

  2. The Chebyshev method of Nemirovskii and Polyak:
$$w_1 = \tfrac{4}{3}L^{-2s}(y - AA^*w_0),$$
$$w_n = \frac{2(2n-1)}{2n+1}w_{n-1} - \frac{2n-3}{2n+1}w_{n-2} + \frac{4(2n-1)}{2n+1}L^{-2s}(y - AA^*w_{n-1}), \qquad n = 2, 3, \ldots$$

  3. The ν-method of Brakhage is the method where $w_n$ in (30) has, for fixed $\nu > 0$, the coefficients
$$\mu_1 = 1, \qquad \mu_n = 1 + \frac{(n-1)(2n-3)(2n+2\nu-1)}{(n+2\nu-1)(2n+4\nu-1)(2n+2\nu-3)}, \qquad n = 2, 3, \ldots,$$
$$\omega_1 = \frac{4\nu+2}{4\nu+1}, \qquad \omega_n = \frac{4(2n+2\nu-1)(n+\nu-1)}{(n+2\nu-1)(2n+4\nu-1)}, \qquad n = 2, 3, \ldots$$

ME-rule in implicit iteration methods

Let us consider implicit iteration methods of the form
$$x_{n+1} = x_n + A^*L^{-s}(BB^* + \alpha_nI)^{-1}L^{-s}(y - Ax_n) = (B^*B + \alpha_nI)^{-1}(\alpha_nx_n + B^*L^{-s}y), \qquad n = 0, 1, \ldots \tag{31}$$
This method has the form (21) with
$$z_n = L^{-s}(\alpha_nI + BB^*)^{-1}L^{-s}\rho_n = \alpha_n^{-1}L^{-2s}\rho_{n+1}. \tag{32}$$
Therefore, the function $d_{ME}(n)$ in (25) takes the form
$$d_{ME}(n) = \frac{(\rho_n + \rho_{n+1}, L^{-2s}\rho_{n+1})}{2\|(D^{-1})^*L^{-2s}\rho_{n+1}\|}.$$
In the case $D = I$, $h = s = 0$, the ME-rule for method (31) was studied in [Citation4, Citation7, Citation8, Citation34].

Let us now consider the question of choosing the parameter $\alpha_n$ such that the guaranteed improvement of the accuracy of the approximation $x_{n+1}$ over $x_n$ is maximal.

Proposition 5.2

In the regularization method (31), the guaranteed improvement $\Phi(z_n)$ of accuracy in the estimate (23) is maximized for $z_n$ in (32) with $\alpha_n$ being the solution $\alpha$ of the equation
$$\alpha\,\|(D^{-1})^*L^{-s}(\alpha I + BB^*)^{-1}L^{-s}\rho_n\|\;\|(\alpha I + BB^*)^{-3/2}L^{-s}\rho_n\|^2 = \left((D^{-1})^*L^{-s}(\alpha I + BB^*)^{-1}L^{-s}\rho_n,\;(D^{-1})^*L^{-s}(\alpha I + BB^*)^{-2}L^{-s}\rho_n\right)\Delta_1. \tag{33}$$

Proof

Substituting $z_n$ from (32) into the function $\Phi(z_n)$ in (23) shows that the minimization of $\Phi(z_n)$ is equivalent to the minimization of the function
$$\phi(\alpha) := 2\|(D^{-1})^*L^{-s}(\alpha I + BB^*)^{-1}\tau_n\|\Delta_1 - 2(\tau_n, (\alpha I + BB^*)^{-1}\tau_n) + \|B^*(\alpha I + BB^*)^{-1}\tau_n\|^2,$$
where $\tau_n := L^{-s}\rho_n$. Differentiation, using the formula $\frac{d}{d\alpha}\|\cdot\| = \big(\frac{d}{d\alpha}\|\cdot\|^2\big)/(2\|\cdot\|)$, gives
$$\frac12\phi'(\alpha) = -\frac{\left((D^{-1})^*L^{-s}(\alpha I + BB^*)^{-1}\tau_n,\;(D^{-1})^*L^{-s}(\alpha I + BB^*)^{-2}\tau_n\right)}{\|(D^{-1})^*L^{-s}(\alpha I + BB^*)^{-1}\tau_n\|}\,\Delta_1 + \|(\alpha I + BB^*)^{-1}\tau_n\|^2 - \|B^*(\alpha I + BB^*)^{-3/2}\tau_n\|^2.$$
The equality
$$\|(\alpha I + BB^*)^{-1}\tau_n\|^2 = \alpha\|(\alpha I + BB^*)^{-3/2}\tau_n\|^2 + \|B^*(\alpha I + BB^*)^{-3/2}\tau_n\|^2$$
shows that $\phi'(\alpha) = 0$ iff $\alpha$ solves the equation (33).

Theorem 5.3

In the case $D = L^{-s}$, in the regularization method (31) the guaranteed improvement $\Phi(z_n)$ of accuracy in the estimate (23) is maximized if $\alpha_n$ is chosen by the analogue of the discrepancy principle $\|L^{-s}\rho_{n+1}\| = \Delta_1$.

Proof

Indeed, in the case $D = L^{-s}$, the equation (33) for finding $\alpha = \alpha_n$ has the form $\|\alpha_n(\alpha_nI + BB^*)^{-1}L^{-s}\rho_n\| = \Delta_1$, which is equivalent to $\|L^{-s}\rho_{n+1}\| = \Delta_1$ due to the equality $\alpha_n(\alpha_nI + BB^*)^{-1}L^{-s}\rho_n = L^{-s}\rho_{n+1}$.
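A minimal sketch of one step of the implicit iteration (31) with $\alpha_n$ chosen as in Theorem 5.3, for $s = 0$, $D = I$: the identity $\rho_{n+1} = \alpha(AA^T + \alpha I)^{-1}\rho_n$ is used and $\alpha$ is searched on a geometric grid, as in the numerical section below. The function name and the grid are ours.

```python
import numpy as np

def implicit_step_discrepancy(A, x_n, y, Delta1, q=0.9, alpha0=1.0, max_steps=500):
    """One step of (31) for s = 0, D = I, with alpha_n such that ||rho_{n+1}|| ~ Delta1."""
    m, k = A.shape
    rho = y - A @ x_n
    alpha = alpha0
    for _ in range(max_steps):
        rho_next = alpha * np.linalg.solve(A @ A.T + alpha * np.eye(m), rho)
        if np.linalg.norm(rho_next) <= Delta1:    # largest alpha on the grid with this property
            break
        alpha *= q
    x_next = np.linalg.solve(A.T @ A + alpha * np.eye(k), alpha * x_n + A.T @ y)
    return x_next, alpha
```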

Convergence and quasi-optimality for the ME-rule

In the ME-rule considered in the previous sections, the regularization parameter $r$ in continuous regularization methods is found as the solution (the largest solution in the case of several solutions) of a certain equation $d(r, y, A) = \Delta$ with $\Delta \in \{\Delta_1, \Delta_2\}$. In regularization methods where the regularization parameter is a natural number, as in iteration methods, we choose $r_{ME}$ as the first $r = 1, 2, \ldots$ for which $d(r, y, A) \le \Delta$. Let us denote by $x_r^0$ the regularized solution for exact data $y_0$, $A_0$. We give the following convergence result for the ME-rule.

Theorem 6.1

Let $\Delta \in \{\Delta_1, \Delta_2\}$. Let $x_r^0$, $x_r$ satisfy the following properties:

  1. $x_r^0 \to x_*$ as $r \to \infty$;

  2. $\|x_{r_1}^0 - x_{r_1}\| \le \|x_{r_2}^0 - x_{r_2}\|$ for $r_1 \le r_2$;

  3. if $r(\Delta) \le \mathrm{const} < \infty$ as $\Delta \to 0$, then $\|x_{r(\Delta)} - x_{r(\Delta)}^0\| \to 0$ as $\Delta \to 0$;

  4. if for some sequence $r_k \le \mathrm{const} < \infty$ $(k \to \infty)$ it holds that $d(r_k, y_0, A_0) \to 0$ $(k \to \infty)$, then $x_{r_k}^0 \to x_*$ $(k \to \infty)$.

Then $x_{r_{ME}} \to x_*$ as $\Delta \to 0$.

Proof

For every $r$ it holds that
$$\|x_r - x_*\| \le \|x_r - x_r^0\| + \|x_r^0 - x_*\|. \tag{34}$$
First we consider the main case, where $r_{ME}(\Delta) \to \infty$ as $\Delta \to 0$. Then $x_{r_{ME}}^0 \to x_*$ due to assumption (1). It remains to show that $\|x_{r_{ME}}^0 - x_{r_{ME}}\| \to 0$ as $\Delta \to 0$. Let $r_0(\Delta)$ be a monotonically increasing function giving some a priori regularization parameter guaranteeing the convergence $x_{r_0} \to x_*$ as $\Delta \to 0$. If $r_{ME} \le r_0$, then due to assumption (2),
$$\|x_{r_{ME}}^0 - x_{r_{ME}}\| \le \|x_{r_0}^0 - x_{r_0}\|.$$
If $r_{ME} \ge r_0$, then the ME-property says that
$$\|x_{r_{ME}} - x_*\| \le \|x_{r_0} - x_*\|.$$
Both estimates converge to 0 as $\Delta \to 0$.

Consider now the case where $r_{ME}(\Delta) \le \mathrm{const}$ as $\Delta \to 0$. Then, for $r = r_{ME}$, both terms in the right-hand side of (34) converge to zero as $\Delta \to 0$ due to assumptions (3) and (4).

Assumptions (1)–(3) of Theorem 6.1 are general conditions on the regularization method. Only the last assumption (4) uses the concrete form of the function $d(r, y, A)$; for the regularization method (7) it can be proved similarly to Lemma 3.2 in [Citation2, p. 66].

Consider now the case $D = L^{-s}$. The problem $Ax = y$ may be rewritten in the form $Bx = \bar y$ with the operator $B \in \mathcal{L}(X, Y)$, and for $x_r$ in (7),
$$x_r = B^*g_r(BB^*)\bar y = g_r(B^*B)B^*\bar y, \tag{35}$$
the standard regularization theory applies. We get the following results.

Theorem 6.2

Let (4) hold with $D = L^{-s}$. Let $x_r$ be defined by one of the methods M1–M5. Then the choice of $r$ by one of the rules D, MD, ME or MEe guarantees the convergence $x_r \to x_*$ as $\Delta \to 0$, and under the additional assumption $x_* \in R((B^*B)^{p/2})$ the order optimal error estimate $\|x_r - x_*\| \le O(\Delta^{\frac{p}{p+1}})$ holds with $p \le 2p_0 - 1$ for the rule D and with $p \le 2p_0$ for the rules MD, ME and MEe.

Proof

Let $D = L^{-s}$. Then the results of this theorem for the discrepancy principle follow from the results in [Citation1, Citation2, Citation17], and for the MD rule from [Citation22, Citation41]. Let $C = 1$ in the rules D, MD. In the methods M1, M2 it holds that $d_{MD}(\alpha) \le d_{ME}(\alpha) \le d_D(\alpha)$, therefore $\alpha_D \le \alpha_{ME} \le \alpha_{MD}$ [Citation5–Citation8]; in the methods M3, M4, $n_D - 1 \le n_{ME} \le n_D$ [Citation4, Citation7, Citation8]; in the method M5, $d_{ME}(\alpha) = d_D(\alpha)$. Therefore, the results for the ME rule follow from the results for the D rule and the MD rule. The same results hold for the parameter $r_{MEe} = \mathrm{const}\cdot r_{ME}$.

Let us consider quasi-optimality results for the regularization methods. To characterize the quality of a rule for the a posteriori choice of the regularization parameter, we use in the following the quasi-optimality property (see [Citation41, Citation42]). We say that a rule R for the a posteriori choice of the regularization parameter $r = r(\mathrm{R})$ is quasi-optimal for a given regularization method $x_r = A^*g_r(AA^*)y$ if there exists a constant $C$ (not depending on $A_0$, $A$, $x_*$, $y$) such that for $\|y - y_0\| \le \delta$, $\|A - A_0\| \le h$ the error estimate
$$\|x_{r(\mathrm{R})} - x_*\| \le C\inf_{r\ge 0}\psi(r) + O(\delta + h\|x_*\|) \tag{36}$$
holds, where the function
$$\psi(r) = \|(I - A^*Ag_r(A^*A))x_*\| + \gamma\sqrt{r}\,(\delta + h\|x_*\|) \tag{37}$$
is an upper bound of the error $\|x_r - x_*\|$. The following result holds.

Theorem 6.3

Let $D = L^{-s}$. The modified discrepancy principle with $\Delta \in \{\Delta_1, \Delta_2, \Delta_3\}$ and $C > 1$ is a quasi-optimal rule for solving the problem $Bx = \bar y$ by the methods M1–M5: the inequality (36) holds, where the operators $A$, $A^*$ are replaced by the operators $B$, $B^*$ in (37). This quasi-optimality is also guaranteed for the ME-rule with $\Delta \in \{\Delta_1, \Delta_2\}$, and also with $\Delta = \Delta_3$ if in (17) and (24) $\Delta$ is replaced by $C\Delta$ with $C > 1$.

Proof

Applying the results of [Citation41] (Theorem 3) (see also [Citation34] (Theorem 2, Remark 2)) to the approximation (35) instead of $x_r = A^*g_r(AA^*)y$, we get the assertions of the theorem about the MD rule, and also for the ME rule with $\Delta \in \{\Delta_1, \Delta_2, \Delta_3\}$ if in (17) and (24) $C\Delta$ with $C > 1$ is used. In the case $\Delta \in \{\Delta_1, \Delta_2\}$, an increase of $C \ge 1$ in the right-hand side $C\Delta$ of (17) and (24) leads to a decrease of the parameter $r \le r_{ME}$ and therefore to an increase of the error $\|x_r - x_*\|$.

Numerical examples

Our tests are performed on the well-known set of test problems by Hansen [Citation43]: baart, deriv2, foxgood, gravity, heat, ilaplace, phillips, shaw, spikes, wing. In all tests, we used the discretization parameter n=100.

For example, the test problem heat (see [Citation43, Citation44]) represents heat conduction in a one-dimensional semi-infinite rod $[0, \infty)$ with thermal diffusivity 1. The initial temperature of the rod is 0 and the boundary condition at the point $\xi = 0$ is not known. Instead, one measures the temperature at $\xi = 1$ for the time $t \in [0, 1]$ and tries to determine the temperature at $\xi = 0$ from this measurement. (NB! In this paragraph and in the following paragraphs, $t$ and $s$ have a different meaning from the rest of the paper.)

This results in the Volterra integral equation of the first kind
$$\int_0^t\frac{(t-s)^{-3/2}}{2\sqrt{\pi}}\exp\!\left(-\frac{1}{4(t-s)}\right)x(s)\,ds = y(t), \qquad t \in [0, 1],$$
where $y(t)$ is the measured temperature at $\xi = 1$ and $x(t)$ is the temperature at $\xi = 0$.

The integral equation is discretized by means of a simple collocation method at the points $t_i = i/n$ and the midpoint rule with $n = 100$ points. An exact discrete solution $z$ with components
$$z_i = \begin{cases} 75\,t_i^2, & t_i \le 0.1,\\ 0.75 + (20t_i - 2)(3 - 20t_i), & 0.1 < t_i < 0.15,\\ 0.75\exp(-2(20t_i - 3)), & 0.15 \le t_i \le 0.5,\\ 0, & \text{otherwise},\end{cases}$$
is constructed, and then the right-hand side $b$ is computed as $b = Az$.
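A sketch of this construction (kernel matrix by collocation and the midpoint rule, exact solution $z$ and right-hand side $b = Az$) is given below. It follows the description above and heat.m from Hansen's Regularization Tools only in spirit; details such as whether $z$ is sampled at the collocation points or at the midpoints are our assumptions.

```python
import numpy as np

def heat_problem(n=100):
    """Discretized sideways heat equation: collocation at t_i = i/n, midpoint rule at
    s_j = (j - 1/2)/n, kernel k(d) = d^{-3/2} / (2 sqrt(pi)) * exp(-1/(4 d))."""
    h = 1.0 / n
    t = (np.arange(n) + 1.0) * h           # collocation points t_i
    s = (np.arange(n) + 0.5) * h           # midpoint quadrature nodes s_j
    A = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1):             # kernel vanishes for s_j >= t_i
            d = t[i] - s[j]
            A[i, j] = h * d ** (-1.5) / (2.0 * np.sqrt(np.pi)) * np.exp(-1.0 / (4.0 * d))
    z = np.zeros(n)
    for i, ti in enumerate(t):             # piecewise exact solution from the text
        if ti <= 0.1:
            z[i] = 75.0 * ti ** 2
        elif ti < 0.15:
            z[i] = 0.75 + (20.0 * ti - 2.0) * (3.0 - 20.0 * ti)
        elif ti <= 0.5:
            z[i] = 0.75 * np.exp(-2.0 * (20.0 * ti - 3.0))
    b = A @ z
    return A, z, b
```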

Since the performance of the methods and rules generally depends on the smoothness $p$ of the exact solution, we complemented the standard solutions $x_*$ of the (now discrete) test problems with smoothed solutions $(A^*A)^{p/2}x_*$ ($p = 2$) in all test problems (computing the right-hand side as $A((A^*A)^{p/2}x_*)$). After discretization, all problems were scaled (normalized) in such a way that the Euclidean norms of the operator and of the right-hand side were 1. We used the operator $D = L^t$ with $t \in \{-0.5, 0, 0.5\}$. We computed $L^ty_0$ and added normally distributed noise such that $\|L^t(y - y_0)\|$ had the values 0.3, $10^{-1}$, $10^{-2}$, $10^{-3}$, $10^{-4}$, $10^{-5}$, $10^{-6}$. Here, $\|\cdot\|$ denotes the Euclidean norm.

We generated 10 noise vectors and used these vectors in all of the problems. The problems were regularized by the Tikhonov method in the $Y_s$-scale (the second formula in (9)), where the regularization parameters were chosen according to the rules that we wanted to compare. We used the $Y_s$-scale with the operator $L$ acting as $Lu = u - u''$. In discrete form, we used the approximating matrix $(l_{ij})$ with the elements $l_{ii} = 2n^2 + 1$, $i = 1, \ldots, n$, $l_{i,i+1} = l_{i+1,i} = -n^2$, $i = 1, \ldots, n-1$, and $l_{ij} = 0$ otherwise.
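The following sketch builds the tridiagonal matrix for $L$ described above and computes the $Y_s$-scale Tikhonov solution via the second formula in (9), taking the fractional power $L^{-2s}$ through an eigendecomposition. It is a minimal illustration under these assumptions (dense matrices; $s = 0$ recovers ordinary Tikhonov), not the authors' implementation.

```python
import numpy as np

def ys_tikhonov(A, y, alpha, s):
    """x_alpha = (alpha I + A^T L^{-2s} A)^{-1} A^T L^{-2s} y with L ~ I - d^2/dxi^2:
    diagonal entries 2 n^2 + 1, off-diagonal entries -n^2 (n = number of grid points)."""
    n = A.shape[0]
    L = (np.diag(np.full(n, 2.0 * n ** 2 + 1.0))
         + np.diag(np.full(n - 1, -float(n ** 2)), 1)
         + np.diag(np.full(n - 1, -float(n ** 2)), -1))
    lam, V = np.linalg.eigh(L)                        # L = V diag(lam) V^T with lam > 0
    L_m2s = (V * lam ** (-2.0 * s)) @ V.T             # L^{-2s}
    M = alpha * np.eye(A.shape[1]) + A.T @ L_m2s @ A
    return np.linalg.solve(M, A.T @ (L_m2s @ y))
```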

Since in these model equations the exact solution is known, it is possible to find the regularization parameter $\alpha = \alpha_{opt}$ which gives the smallest error: $\|x_{\alpha_{opt}} - x_*\| = \min_{\alpha>0}\|x_\alpha - x_*\|$. For every rule R, the error ratio $\|x_{\alpha_R} - x_*\|/\|x_{\alpha_{opt}} - x_*\|$ describes the performance of the rule R on this particular problem. For a better comparison of the cases with different $s$, in the denominator we always use $s = 0$. To compare the rules or to present their properties, the following tables show averages of these error ratios over various parameters of the data set (problems 1–10, noise levels $\delta$, noise vectors).

In Tables 1 and 2 we compare the following rules. Earlier we introduced the discrepancy principle D (see page 5, formula (11)), the rule MD′ (page 7, (19)), the monotone error rule ME (page 7, (19)) and the rule MEa as the analogue of the ME-rule (page 7, (20)). In the rule MEe (recall the motivation on page 7), we chose the regularization parameter as $\alpha_{MEe} = \alpha_{ME}/c$ with $c = 2.3\cdot10^{c_1}$, $c_1 = (s - t)(1.5 + 1.3s - 0.5t - 0.8s^2 + 2st - t^2)$; this form of the constant was obtained by extensive numerical experiments for different values of $s$ and $t$. In the rule MEae, we chose the parameter $\alpha_{MEae} = \alpha_{MEa}/c$, where $c = 2.3$ for $t = 0$ and $c = 3$ for $t = -0.5$ or $t = 0.5$.

Table 1 Error ratios for different rules.

Note that in the case $s > t$, the ME rule (unlike the rules D, MD′, MEa) allows mild underestimation of the noise level. In the columns ME08 and ME07, we present the error ratios obtained by using the noise levels $0.8\delta$ and $0.7\delta$ instead of $\delta$. Typically, ME08 and ME07 give better results than the ME rule, but sometimes they can give large errors. If instead of the exact noise level only an approximate noise level is given, we recommend using the MEe-rule with $s > t$. In the case $s = t$, all considered rules failed if the noise level was underestimated.

In Table 1, we present the results separately for large $\delta$ (l means $\delta \ge 10^{-2}$) and small $\delta$ (s means $\delta \le 10^{-3}$). We present results for $s = t$ and $s = t + 0.5$. For a given $t$, in most cases the best choice of $s$ is $s = t$, in which case all of the considered rules gave similar results. But in the case $s = t = -0.5$, we could not obtain good results for smooth solutions, especially in the case of small noise levels. Then the error function had a very sharp decrease at its minimum, which was hard to locate. Therefore, in the case $t = -0.5$ we recommend taking $s = 0$. Then the ME and MEe rules gave good results. However, other rules failed in this case in the problems baart, foxgood and wing.

In most problems, using values of $s$ and $t$ other than $s = t$ and $s = t + 0.5$ gave slightly worse results for the rules ME and MEe and essentially worse results for the rules D and MD′. However, in the problems deriv2 and phillips in the case of a smooth solution ($p \ge 2$), increasing $s$ increased the accuracy of the rules D, MD′, MEe for all $t$ (in the problem deriv2 the accuracy of the rule D increased for $s$ up to 0.5).

Table 1 shows that the discrepancy principle works well in the cases $s = t = 0$ and $s = t = 0.5$, but it is not as good for $p > 1$. Typically, the MEe-rule gives the best results. Besides the MD′ rule (13), we also made computations with the MD rule (12); the results were almost the same.

Table 2 presents the results for the problem heat. In the column ‘min’, we present the averages of the minimal errors over 10 runs for given $s$ and $t$. Unlike in Table 1, the rules D, MD′, MEa, MEae gave almost as good results as the rule MEe. These rules were not as good as the rule MEe only in the case of a smooth solution ($p = 2$): the rule D for $t = s = 0$ and the rules MD′, MEa, MEae for $t = -0.5$, $s = 0$.

Table 2 Results for the problem heat.

In both Tables 1 and 2, in the case $t = 0.5$, $s = 1$, the best results were obtained by using the MEae rule.

Some remarks about the numerical realization of the algorithms. In our numerical computations, the discretization parameter $n$ was 100. This low dimension allowed us to use the advantages of the singular value decomposition of the matrix for fast computations. Note also that instead of solving the Equations (11), (19) and (20), we made computations on the $\alpha$-sequence $\alpha_i = 0.9^i$, $i = 0, 1, \ldots$, and used as the corresponding regularization parameter the first $\alpha_i$ for which the left-hand side of the equation was smaller than the right-hand side.

The computations were made with the exact operator, since an inexact operator is typically accompanied by a significant overestimation of the noise level. In this case, it is preferable to use the rules from the article [Citation25], where the exact operator was considered. We plan to extend these results to the case of a noisy operator in a forthcoming paper.

It is worthwhile to think about the possibility of choosing the operator $L$ depending on the operator $A$.

Note that the ME-rule can also be used in projection methods. It was studied for the exact operator in [Citation45]. Generalizations to noisy operators are to be published in a further paper.

Conclusion

We have considered a linear ill-posed problem $A_0x = y_0$ with the operator $A_0 \in \mathcal{L}(X, Y)$. A noisy right-hand side $y$ and an operator $A$ with noise levels $\|D(y_0 - y)\| \le \delta$, $\|D(A_0 - A)\| \le h$, where $D$ is a certain operator, are given. We derived the monotone error rule (ME rule) for the parameter choice in many regularization methods in the $Y_s$-scale, generated by powers $s \in \mathbb{R}$ of a certain operator $L$. These regularization methods include the (iterated) Tikhonov method and many iterative methods (the Landweber method, the steepest descent method, conjugate gradient-type methods, semi-iterative methods, the implicit iteration method). It was shown for which data the ME rule gives the optimal parameter. For a class of methods, convergence and quasi-optimality results are given. In extensive numerical experiments, the ME-rule and its modification MEe gave good results, whereby in the case $D = L^t$ with $t < s$ these two rules also allowed moderate underestimation of the noise level (other rules fail in the case of an underestimated noise level).

Acknowledgments

This work was started while the first author held an appointment as Visiting Professor at the University of Applied Sciences Zittau/Görlitz from June until August 2010 and was supported by DFG (Deutsche Forschungsgemeinschaft) under a cooperative bilateral research project grant. In addition, the work was supported by the Estonian Science Foundation, Grant 9120.

References

  • Engl HW, Hanke M, Neubauer A. Regularization of inverse problems. Vol. 375, Mathematics and its applications. Dordrecht: Kluwer; 1996.
  • Vainikko GM, Veretennikov AY. Iteration procedures in ill-posed problems. Moscow: Nauka; 1986 (in Russian).
  • Tautenhahn U. On a new parameter choice rule for ill-posed inverse problems. In: Lipitakis EA, editor. HERCMA ’98: Proc. 4th Hellenic-European Conference on Computer Mathematics and its Applications. Athens: Lea Publishers; 1999. p. 118–125.
  • Hämarik U. Monotonicity of error and choice of the stopping index in iterative regularization methods. In: Pedas A, editor. Differential and integral equations: theory and numerical analysis. Tartu: Estonian Mathematical Society; 1999. p. 15–30.
  • Hämarik U, Raus T. On the a posteriori parameter choice in regularization methods. Proc. Estonian Acad. Sci. Phys. Math. 1999;48:133–145.
  • Tautenhahn U, Hämarik U. The use of monotonicity for choosing the regularization parameter in ill-posed problems. Inverse Probl. 1999;15:1487–1505.
  • Hämarik U, Tautenhahn U. On the monotone error rule for parameter choice in iterative and continuous regularization methods. BIT Numerical Math. 2001;41:1029–1038.
  • Hämarik U, Tautenhahn U. On the monotone error rule for choosing the regularization parameter in ill-posed problems. In: Lavrentiev MM, Kabanikhin SI, editors. Ill-posed and non-classical problems of mathematical physics and analysis. Utrecht: VSP; 2003. p. 27–55.
  • Alifanov OM, Rumyantsev SV. On the stability of iterative methods for the solution of linear ill-posed problems. Soviet Math. Dokl. 1979;20:1133–1136.
  • Alifanov OM, Artyukhin EA, Rumyantsev SV. Extreme methods for solving ill-posed problems with applications to inverse heat transfer problems. New York: Begell House; 1995.
  • Egger H. Y-scale regularization. SIAM J. Numer. Anal. 2008;46:419–436.
  • Egger H. Regularization of inverse problems with large noise. J. Phys.: Conf. Ser. 2008; 124: 9pp.
  • Morozov VA. Regularization under large noise. Comput. Math. Math. Phys. 1996;36:1175–1181.
  • Eggermont PPB, LaRiccia VN, Nashed MZ. On weakly bounded noise in ill-posed problems. Inverse Prob. 2009; 25: 14 p.
  • Mathé P, Tautenhahn U. Enhancing linear regularization to treat large noise. J. Inverse Ill-Posed Prob. 2011;19:859–879.
  • Mathé P, Tautenhahn U. Regularization under general noise assumptions. Inverse Prob. 2011; 27:035016 (15 p).
  • Vainikko GM. The discrepancy principle for a class of regularization methods. USSR Comp. Math. Math. Phys. 1982;22:1–19.
  • Hämarik U, Palm R, Raus T. Use of extrapolation in regularization methods. J. Inverse Ill-Posed Prob. 2007;15:277–294.
  • Hämarik U, Palm R, Raus T. Extrapolation of Tikhonov regularization method. Math. Model. Anal. 2010;15:55–68.
  • Tautenhahn U. Error estimates for regularization methods in Hilbert scales. SIAM J. Numer. Anal. 1996;33:2120–2130.
  • Morozov VA. On the solution of functional equations by the method of regularization. Soviet Math. Dokl. 1966;7:414–417.
  • Raus T. On the discrepancy principle for solution of ill-posed problems with non-selfadjoint operators. Acta et Comment. Univ. Tartuensis. 1985;715:12–20 (in Russian).
  • Gfrerer H. An a posteriori parameter choice for ordinary and iterated Tikhonov regularization of ill-posed problems leading to optimal convergence rates. Math. Comp. 1987;49:507–522.
  • Tautenhahn U. On order optimal regularization under general source conditions. Proc. Est. Acad. Sci. Phys. Math 2004;53: 116–123.
  • Hämarik U, Palm R, Raus T. A family of rules for parameter choice in Tikhonov regularization of ill-posed problems with inexact noise level. J. Comput. Appl. Math. 2012;236:2146–2157.
  • Hämarik U, Palm R, Raus T. Comparison of parameter choices in regularization algorithms in case of different information about noise level. Calcolo. 2011;48:47–59.
  • Hämarik U, Kangro U, Palm R, Raus T. On parameter choice in the regularization of ill-posed problems with rough estimate of the noise level of the data. In: Numerical analysis and applied mathematics ICNAAM 2012. AIP Conference Proceedings, 1479. New York (NY): American Institute of Physics; 2012. p. 2332–2335.
  • Hämarik U, Raus T. On the choice of the regularization parameter in ill-posed problems with approximately given noise level of data. J. Inverse Ill-Posed Prob. 2006;14:251–266.
  • Hämarik U, Palm R, Raus T. On minimization strategies for choice of the regularization parameter in ill-posed problems. Numer. Funct. Anal. Opt. 2009;30:924–950.
  • Hämarik U, Raus T, Palm R. On the analog of the monotone error rule for parameter choice in the (iterated) Lavrentiev regularization. Comput. Meth. Appl. Math. 2008;8:237–252.
  • Palm R. Numerical comparison of regularization algorithms for solving ill-posed problems. University of Tartu; 2010. Available from: http://hdl.handle.net/10062/14623.
  • Hämarik U, Palm R, Raus T. A family of rules for the choice of the regularization parameter in the Lavrentiev method in the case of rough estimate of the noise level of the data. J. Inverse Ill-Posed Prob. 2012;20:831–854.
  • Gilyazov SF, Goldman NL. Regularization of ill-posed problems by iteration methods. Vol. 499, Mathematics and its applications. Dordrecht: Kluwer; 2000.
  • Hämarik U, Raus T. On the choice of the stopping index in iteration methods for solving problems with noisy data. In: Lipitakis EA, editor. HERCMA 2001: Proceedings of the 5th Hellenic-European conference on computer mathematics and its applications, Athens, Greece, September 20–22, 2001. Athens: Lea Publishers; 2002. p. 524–529.
  • Mathé P, Pereverzev SV. Geometry of linear ill-posed problems in variable Hilbert scales. Inverse Prob. 2003;19:789–803.
  • Hanke M. Conjugate gradient type methods for ill-posed problems. Harlow: Longman Scientific & Technical; 1995.
  • Pereverzev SV, Schock E. On the adaptive selection of the parameter in the regularization of ill-posed problems. SIAM J. Numer. Anal. 2005;43:2060–2076.
  • Hämarik U, Palm R. On rules for stopping the conjugate gradient type methods in ill-posed problems. Math. Model. Anal. 2007;12:61–70.
  • Hämarik U, Raus T. About the balancing principle for choice of the regularization parameter. Numer. Funct. Anal. Opt. 2009;30:951–970.
  • Hanke M. Accelerated Landweber iterations for the solution of ill-posed equations. Numer. Math. 1991;60:341–373.
  • Raus T, Hämarik U. On the quasi-optimal rules for the choice of the regularization parameter in case of a noisy operator. Adv. Comput. Math. 2012;36:221–233.
  • Hämarik U, Palm R, Raus T. On the quasioptimal regularization parameter choices for solving ill-posed problems. J. Inverse Ill-Posed Prob. 2007;15:419–439.
  • Hansen PC. Regularization tools: a Matlab package for analysis and solution of discrete ill-posed problems. Numerical Algorithms. 1994;6:1–35.
  • Eldén L. The numerical solution of a non-characteristic Cauchy problem for a parabolic equation. In: Numerical treatment of inverse problems in differential and integral equations, Proceedings of an international workshop, Heidelberg, 1982. Boston: Birkhäuser; 1983. p. 246–268.
  • Hämarik U, Avi E, Ganina A. On the solution of ill-posed problems by projection methods with a posteriori choice of the discretization level. Math. Model. Anal. 2002;7:241–252.
