Abstract
In the iterative solution of an ill-posed linear system $\mathbf{B}\mathbf{x}=\mathbf{b}$, how to select a fast and easily established descent direction $\mathbf{u}$ to reduce the residual $\mathbf{r}=\mathbf{b}-\mathbf{B}\mathbf{x}$ is an important issue. A mathematical procedure to find a double optimal descent direction $\mathbf{u}$ in $\mathbb{R}^{n}$, without inverting $\mathbf{B}$, is developed in an $m$-dimensional Krylov subspace. The novelty is that we expand $\mathbf{u}$ in an affine Krylov subspace with undetermined coefficients, and then two optimization techniques are used to determine these coefficients in closed form, which greatly accelerates the convergence in solving ill-posed linear problems. The double optimal descent algorithm is proven to be absolutely convergent; it is fast, accurate and robust against noise, which is confirmed by numerical tests of several linear inverse problems, including the heat source identification problem, the backward heat conduction problem, the inverse Cauchy problem and the external force identification problem.
1 Introduction
In this paper, we develop a double optimal descent algorithm (DODA) in an affine Krylov subspace to iteratively solve the following linear equations system:
(1) $\mathbf{B}\mathbf{x}=\mathbf{b},$
where $\mathbf{x}\in\mathbb{R}^{n}$ is an unknown vector, to be determined from a given non-singular coefficient matrix $\mathbf{B}\in\mathbb{R}^{n\times n}$ and the input $\mathbf{b}\in\mathbb{R}^{n}$. For the existence of the solution $\mathbf{x}$, we suppose that $\det\mathbf{B}\neq 0$, and hence $\mathbf{B}$ has full rank with $\operatorname{rank}(\mathbf{B})=n$.
The linear inverse problems are sometimes obtained via an $n$-dimensional discretization of a bounded linear operator equation under a noisy input. When $\mathbf{B}$ is severely ill conditioned and the data are corrupted by noise, we may encounter the problem that the numerical solution of Equation (1) deviates from the exact one to a great extent, and in general a generalized solution $\mathbf{x}=\mathbf{B}^{\dagger}\mathbf{b}$ is found, where $\mathbf{B}^{\dagger}$ is a pseudo-inverse of $\mathbf{B}$ in the Penrose sense.
If we only know perturbed input data $\mathbf{b}^{\delta}$ with $\|\mathbf{b}^{\delta}-\mathbf{b}\|\le\delta$, and if the problem is ill posed, i.e. the range of the underlying operator is not closed or, equivalently, its pseudo-inverse is unbounded, we have to solve Equation (1) by a regularization method.
The ill-posedness of Equation (1) can be measured by the condition number of $\mathbf{B}$ [Citation1]:
(2) $\operatorname{cond}(\mathbf{B})=\|\mathbf{B}\|_{F}\,\|\mathbf{B}^{-1}\|_{F},$
where $\|\mathbf{B}\|_{F}$ denotes the Frobenius norm of $\mathbf{B}$.
For every matrix norm $\|\cdot\|$, we have $\|\mathbf{B}\|\ge\rho(\mathbf{B})$, where $\rho(\mathbf{B})$ is the radius of the spectrum of $\mathbf{B}$. The Householder theorem states that for every $\varepsilon>0$ and every matrix $\mathbf{B}$, there exists a matrix norm $\|\cdot\|_{\varepsilon}$, depending on $\varepsilon$ and $\mathbf{B}$, such that $\|\mathbf{B}\|_{\varepsilon}\le\rho(\mathbf{B})+\varepsilon$. In any case, the spectral condition number can be used as an estimation of the condition number of $\mathbf{B}$ by
(3) $\operatorname{cond}(\mathbf{B})=\frac{\max_{\lambda\in\sigma(\mathbf{B})}|\lambda|}{\min_{\lambda\in\sigma(\mathbf{B})}|\lambda|},$
where $\sigma(\mathbf{B})$ is the collection of all the eigenvalues of $\mathbf{B}$. Turning back to the Frobenius norm, we have
(4)
In particular, when $\mathbf{B}$ is symmetric, the condition number reduces to the ratio in Equation (3). Roughly speaking, the numerical solution of Equation (1) may lose the accuracy of $k$ decimal points when $\operatorname{cond}(\mathbf{B})=10^{k}$.
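As a small numerical illustration of Equations (2) and (3) and of the digit-loss rule of thumb (the matrix below is an arbitrary nearly singular choice, not one of the paper's examples):

```python
import numpy as np

# An arbitrary nearly singular symmetric matrix (illustration, not from the paper).
B = np.array([[1.0, 1.0],
              [1.0, 1.0001]])

# Equation (2): condition number based on the Frobenius norm.
cond_F = np.linalg.norm(B, 'fro') * np.linalg.norm(np.linalg.inv(B), 'fro')

# Equation (3): spectral estimate, the ratio of largest to smallest |eigenvalue|.
lam = np.abs(np.linalg.eigvals(B))
cond_spec = lam.max() / lam.min()

# Rule of thumb: about log10(cond) decimal digits are lost in the solution.
digits_lost = np.log10(cond_spec)
print(cond_F, cond_spec, digits_lost)
```

Since the Frobenius norm dominates the spectral norm, the Frobenius-based value of Equation (2) is an upper estimate of the spectral one of Equation (3).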
Due to the ill-posedness of the problem, small errors in the data may lead to a computed solution far off the correct one, rendering straightforward approaches to solving Equation (1) useless. To overcome the sensitivity to noise, one often uses a regularization method to solve the ill-posed problem,[Citation2–Citation5] where a suitable regularization parameter is used to suppress the approximation error and the propagated data error, ensuring stability of the solution $\mathbf{x}$ with respect to the perturbed data $\mathbf{b}^{\delta}$.
The iterative algorithm for solving a linear equations system can be derived from the discretization of a certain ordinary differential equations system.[Citation6] In particular, some descent methods can be interpreted as discretizations of gradient flows.[Citation7] For a large-scale linear equations system, the major choice is an iterative algorithm, where an early stopping criterion is used to prevent the reconstruction of the noisy component in the approximate solution, which has a regularization effect. Recently, some useful and simple methods have been developed by the author for solving Equation (1), like the optimally generalized regularization method,[Citation8] the optimally scaled vector regularization method,[Citation9] the adaptive Tikhonov regularization method,[Citation10] the optimal preconditioner method,[Citation11] the globally optimal iterative method,[Citation12] as well as the optimal and globally optimal tri-vector iterative algorithms (OTVIA).[Citation13, Citation14]
There are a lot of numerical methods that converge significantly faster than the steepest descent method (SDM) and that, unlike the conjugate gradient method (CGM), insist that their search direction be the gradient vector at each iteration.[Citation15] The SDM performs poorly, yielding iteration counts that grow linearly with the condition number,[Citation16] and its unwelcome slowness has to do with the choice of the gradient descent direction and the original steplength detected by the SDM. Liu [Citation13] has explored a variant of the SDM by introducing the concept of an optimal descent tri-vector to solve ill-posed linear equations systems, namely an optimal combination of the steepest descent vector, the residual vector and a supplemental vector, where not only the direction but also the steplength is modified on a theoretical foundation of optimization realized on an invariant manifold. This novel method outperformed the generalized minimal residual method (GMRES),[Citation17] the CGM, and other gradient descent variants. The concept of an optimal vector driving algorithm was explored by Liu and Atluri [Citation18] for solving linear equations. The idea of using two optimizations in the solution of Equation (1) in a Krylov subspace was proposed by Liu [Citation19]. Then, Liu [Citation20] used the scaling invariant property of the merit function for a maximal projection to derive a maximal projection solution of a linear problem in an affine Krylov subspace, which is proven better than the least-squares solution. As a continuation, we further explore the concept of a double optimal iterative algorithm with an $m$-vector in a Krylov subspace as a descent direction to solve Equation (1).
The remaining parts of this paper are arranged as follows. In Section 2, we review the invariant-manifold method. The Krylov subspace method used to derive an optimal descent direction is described in Section 3. In Section 4, we solve for the optimal parameters appearing in the optimal descent direction by two optimization techniques, for which the doubly optimized solutions of these parameters are given in closed form. Section 5 outlines the numerical procedure of the $m$-vector DODA in terms of the weighting parameters, which are optimized explicitly from two different merit functions, and the convergence of the DODA is proven. Numerical examples of linear inverse problems solved by the iterative DODA are given in Section 6, where we compare the numerical performance of the DODA with several other algorithms. Finally, conclusions and discussions are drawn in Section 7.
2 Invariant manifold
For the linear equations system (1), which is expressed as $\mathbf{r}=\mathbf{0}$ in terms of the residual vector:
(5) $\mathbf{r}:=\mathbf{b}-\mathbf{B}\mathbf{x},$
we can introduce a scalar function:
(6)
where $Q(t)>0$ is a monotonically increasing function of $t$, $\mathbf{x}_{0}$ is an initial value of $\mathbf{x}$ and $Q(0)=1$.
In order to keep the iterative orbit evolving on the manifold, Liu and Atluri [Citation18] have derived an iterative algorithm:
(7)
where
(8)
(9)
(10)
In the above, the quantity
(11)
is a relaxation parameter.
Because the convergence speed is closely correlated to the quantity defined by
(12)
where
(13)
the author has previously developed several optimal iterative algorithms based on the minimization of this quantity. On the other hand, the term appearing in Equation (7) is a steplength. Liu [Citation21] has proposed a new algorithm to solve Equation (1) by maximizing the steplength in a Krylov subspace.
3 The Krylov subspace method
There are several numerical solution methods for Equation (1) which originate from the idea of minimization. For a positive definite linear system, solving Equation (1) by the SDM is equivalent to solving the following minimization problem [Citation22, Citation23]:
(14) $\min_{\mathbf{x}\in\mathbb{R}^{n}}\ \left\{\frac{1}{2}\mathbf{x}^{\mathrm{T}}\mathbf{B}\mathbf{x}-\mathbf{b}^{\mathrm{T}}\mathbf{x}\right\},$
where $\mathbf{B}$ is a positive definite matrix.
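A minimal steepest descent sketch for the minimization in Equation (14), with exact line search along the residual; the matrix and data are arbitrary illustrations, not the paper's examples:

```python
import numpy as np

def sdm(B, b, x0, steps=200):
    """Steepest descent for an SPD system B x = b, equivalently for minimizing
    phi(x) = x.B.x/2 - b.x as in Equation (14); a sketch, not the paper's code."""
    x = x0.astype(float)
    for _ in range(steps):
        r = b - B @ x                    # residual = negative gradient of phi
        if np.linalg.norm(r) < 1e-14:
            break
        alpha = (r @ r) / (r @ (B @ r))  # exact line-search steplength
        x = x + alpha * r
    return x

B = np.array([[4.0, 1.0], [1.0, 3.0]])   # an illustrative SPD matrix
b = np.array([1.0, 2.0])
x = sdm(B, b, np.zeros(2))
print(x, np.linalg.norm(b - B @ x))
```

For a well-conditioned matrix this converges quickly, but, as noted above, the iteration count deteriorates as the condition number grows.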
Liu [Citation12–Citation14] has proposed new methods by minimizing the following merit function:
(15)
to obtain a fast descent direction $\mathbf{u}$, in terms of two or three vectors, for the iterative solution of Equation (1). Because the above minimization is quite difficult when $\mathbf{u}$ is expressed as a linear combination of multiple vectors, Liu [Citation24] has solved, instead of Equation (15), the following minimization problem in a Krylov subspace:
(16) $\min_{\mathbf{u}\in\mathcal{K}_{m}}\ \|\mathbf{r}-\mathbf{B}\mathbf{u}\|^{2}.$
In the solution of linear equations systems, the Krylov subspace method is one of the most important classes of numerical methods, and the iterative algorithms applied to solve large-scale linear systems are mostly preconditioned Krylov subspace methods.[Citation24–Citation29] In the past few decades, the Krylov subspace methods for solving Equation (1) have been studied in depth since the appearance of the pioneering works,[Citation30, Citation31] such as the minimum residual algorithm,[Citation32] the GMRES,[Citation17, Citation27] the quasi-minimal residual method,[Citation28] the biconjugate gradient method,[Citation33] the conjugate gradient squared method [Citation34] and the biconjugate gradient stabilized method.[Citation35] There are more discussions on the Krylov subspace methods in the review papers [Citation26, Citation36] and textbooks.[Citation37, Citation38] Most of them pay attention to the residual vector and the 'optimality' of some residual errors in the Krylov subspace to derive the 'best approximation'.
It is known that for an iterative algorithm to solve the linear equations system (1), the best descent direction $\mathbf{u}$ is given by $\mathbf{u}=\mathbf{B}^{-1}\mathbf{r}$.[Citation39] Then, in order to find the best descent direction $\mathbf{u}$, we need to solve the following linear problem:
(17) $\mathbf{B}\mathbf{u}=\mathbf{r}.$
Suppose that we have an $m$-dimensional Krylov subspace generated by the coefficient matrix $\mathbf{B}$ from the right-hand side vector $\mathbf{r}$ in Equation (17):
(18) $\mathcal{K}_{m}:=\operatorname{span}\{\mathbf{r},\mathbf{B}\mathbf{r},\ldots,\mathbf{B}^{m-1}\mathbf{r}\}.$
Let $\mathbf{B}\mathcal{K}_{m}:=\operatorname{span}\{\mathbf{B}\mathbf{r},\ldots,\mathbf{B}^{m}\mathbf{r}\}$. The idea of the GMRES is to use the Galerkin method to search for the solution $\mathbf{u}\in\mathcal{K}_{m}$, such that the residual $\mathbf{r}-\mathbf{B}\mathbf{u}$ is perpendicular to $\mathbf{B}\mathcal{K}_{m}$.[Citation17] It can be shown that this solution minimizes the residual [Citation37] in Equation (16). It is known that the GMRES does not always perform well for ill-posed linear systems.[Citation40–Citation44] In order to solve ill-posed linear problems based on the GMRES, Calvetti et al. [Citation40] have proposed the range restricted GMRES (RRGMRES), which expands the solution in the following Krylov subspace:
(19) $\mathcal{K}_{m}^{\prime}:=\operatorname{span}\{\mathbf{B}\mathbf{r},\mathbf{B}^{2}\mathbf{r},\ldots,\mathbf{B}^{m}\mathbf{r}\}.$
This method restricts the Krylov subspace so as to generate an approximate solution in the range of the coefficient matrix $\mathbf{B}$. Calvetti et al. [Citation40] confirmed that the RRGMRES performs well for ill-posed linear problems.
Let $p(\lambda)=0$ be the characteristic equation of the coefficient matrix $\mathbf{B}$, which can be written as:
(20) $\lambda^{n}+c_{1}\lambda^{n-1}+\cdots+c_{n-1}\lambda+c_{n}=0,$
where $c_{n}\neq 0$ because we suppose that $\mathbf{B}$ is non-singular. The Cayley–Hamilton theorem [Citation45] asserts that
(21) $\mathbf{B}^{n}+c_{1}\mathbf{B}^{n-1}+\cdots+c_{n-1}\mathbf{B}+c_{n}\mathbf{I}_{n}=\mathbf{0}.$
From the above equation, we can expand $\mathbf{B}^{-1}$ by
(22) $\mathbf{B}^{-1}=-\frac{1}{c_{n}}\left(\mathbf{B}^{n-1}+c_{1}\mathbf{B}^{n-2}+\cdots+c_{n-1}\mathbf{I}_{n}\right),$
and hence, the solution of Equation (17) is given by
(23) $\mathbf{u}=\mathbf{B}^{-1}\mathbf{r}=-\frac{1}{c_{n}}\left(\mathbf{B}^{n-1}\mathbf{r}+c_{1}\mathbf{B}^{n-2}\mathbf{r}+\cdots+c_{n-1}\mathbf{r}\right).$
The above process to find the optimal descent direction $\mathbf{u}$ is quite difficult to realize in practice, since the coefficients $c_{1},\ldots,c_{n-1}$ are hard to find when $n$ is a quite large positive integer. Moreover, the computation of the higher order powers of $\mathbf{B}$ is very expensive. In order to have an effective iterative algorithm, the above series should be truncated properly. Hence, motivated by Equation (23), we can suppose that $\mathbf{u}$ is expressed by
(24) $\mathbf{u}=a_{0}\mathbf{r}+\mathbf{u}_{m},$
which is to be determined as an optimal combination of $\mathbf{r}$ and the $m$-vector $\mathbf{u}_{m}$, when the coefficients $\boldsymbol{\alpha}$ of $\mathbf{u}_{m}$ and $a_{0}$ are optimized in Sections 4.2 and 4.3, respectively.
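The expansion in Equations (22) and (23) can be checked numerically on a tiny matrix; the sketch below (an illustration of why the full expansion is impractical for large $n$, not part of the algorithm) obtains the characteristic coefficients with `numpy.poly` and accumulates the polynomial in Horner form:

```python
import numpy as np

# Check Equations (22)-(23) on a tiny matrix: B^{-1} expanded through the
# characteristic coefficients (an illustration; infeasible for large n).
B = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])
n = B.shape[0]
c = np.poly(B)            # c[0] = 1, c[1..n] = characteristic coefficients
M = np.zeros_like(B)
for k in range(n):        # Horner form of B^{n-1} + c1 B^{n-2} + ... + c_{n-1} I
    M = M @ B + c[k] * np.eye(n)
Binv = -M / c[n]          # Equation (22)
r = np.array([1.0, 0.0, -1.0])
u = Binv @ r              # Equation (23): the exact descent direction
print(np.allclose(B @ u, r))
```

The loop already needs $n-1$ matrix products, which is exactly the cost the truncated expansion of Equation (24) avoids.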
Now we describe how to set up the $m$-vector $\mathbf{u}_{m}$ by the Krylov subspace method. Suppose that we have an $m$-dimensional Krylov subspace generated by the coefficient matrix $\mathbf{B}$ from the right-hand side vector $\mathbf{r}$ in Equation (17):
(25) $\operatorname{span}\{\mathbf{B}\mathbf{r},\ldots,\mathbf{B}^{m}\mathbf{r}\}.$
Then, the Arnoldi process is used to normalize and orthogonalize the Krylov vectors, such that the resultant vectors $\mathbf{u}_{i}$ satisfy $\mathbf{u}_{i}\cdot\mathbf{u}_{j}=\delta_{ij}$, $i,j=1,\ldots,m$, where $\delta_{ij}$ is the Kronecker delta symbol.
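The orthonormalization step can be sketched as follows (a modified Gram–Schmidt Arnoldi pass over the Krylov vectors; the function and variable names are hypothetical, not the paper's):

```python
import numpy as np

def arnoldi_basis(B, r, m):
    """Orthonormalize the Krylov vectors B r, B^2 r, ..., B^m r by the Arnoldi
    (modified Gram-Schmidt) process; a sketch of the construction in the text."""
    U = np.zeros((B.shape[0], m))
    v = B @ r
    U[:, 0] = v / np.linalg.norm(v)
    for j in range(1, m):
        v = B @ U[:, j - 1]
        for i in range(j):                  # subtract projections on earlier vectors
            v -= (U[:, i] @ v) * U[:, i]
        U[:, j] = v / np.linalg.norm(v)
    return U

rng = np.random.default_rng(0)
B = rng.standard_normal((8, 8))
r = rng.standard_normal(8)
U = arnoldi_basis(B, r, 4)
print(np.allclose(U.T @ U, np.eye(4)))    # u_i . u_j = delta_ij
```

In exact arithmetic the columns span the same subspace as the raw Krylov vectors, but they form a well-conditioned orthonormal basis.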
Let
(26) $\mathbf{U}:=[\mathbf{u}_{1},\ldots,\mathbf{u}_{m}]$
be an $n\times m$ Krylov matrix with its $j$th column being the vector $\mathbf{u}_{j}$. Because $\mathbf{u}_{1},\ldots,\mathbf{u}_{m}$ are linearly independent vectors and $m<n$, the rank of $\mathbf{U}$ is $m$. Hence, Equation (24) can be written as:
(27) $\mathbf{u}=a_{0}\mathbf{r}+\mathbf{U}\boldsymbol{\alpha},$
where
(28) $\boldsymbol{\alpha}:=(\alpha_{1},\ldots,\alpha_{m})^{\mathrm{T}}$
and $\mathbf{u}_{m}=\mathbf{U}\boldsymbol{\alpha}$. The superscript $\mathrm{T}$ denotes the transpose.
Remark 1
The Krylov subspace method is an iterative method, of which the $m$th step approximation to the solution of Equation (1) is found in the Krylov subspace. The approximation is of the form $p(\mathbf{B})\mathbf{b}$, where $p$ is a polynomial of degree at most $m-1$. Usually, the number $m$ is smaller than the degree of the minimal polynomial of $\mathbf{B}$, and indeed in the numerical algorithm we can choose $m\ll n$.
4 Doubly optimized descent direction of $\mathbf{u}$
Most of the existing numerical methods that solve the linear system (1) in the Krylov subspace pay attention to the minimization of the residual norm or to the fulfilment of some Galerkin conditions. To the best knowledge of the author, there exists no numerical method that finds the numerical solution of Equation (1) simultaneously based on the two minimizations in Equations (15) and (16).
4.1 The first optimization
Let
(29) $\mathbf{v}=\mathbf{B}\mathbf{u},$
and we attempt to establish a merit function, such that its minimization leads to the best fit of $\mathbf{v}$ to $\mathbf{r}$, because $\mathbf{B}\mathbf{u}=\mathbf{r}$ is just the equation we use to find the best descent direction $\mathbf{u}$.
We consider finding the best approximation of $\mathbf{v}$ to $\mathbf{r}$. The orthogonal projection of $\mathbf{r}$ onto $\mathbf{v}$ is regarded as the approximation of $\mathbf{r}$ by $\mathbf{v}$, whose error vector is written as:
(30) $\mathbf{e}=\mathbf{r}-\frac{\langle\mathbf{r},\mathbf{v}\rangle}{\|\mathbf{v}\|^{2}}\,\mathbf{v},$
where the brace denotes the inner product. The best approximation can be found by minimizing the square norm of the error vector:
(31) $\min\ \|\mathbf{e}\|^{2}=\|\mathbf{r}\|^{2}-\frac{\langle\mathbf{r},\mathbf{v}\rangle^{2}}{\|\mathbf{v}\|^{2}},$
or by maximizing the square norm of the orthogonal projection of $\mathbf{r}$ onto $\mathbf{v}$, i.e.
(32) $\max\ \frac{\langle\mathbf{r},\mathbf{v}\rangle^{2}}{\|\mathbf{v}\|^{2}}.$
Let us define the following merit function:
(33) $\min\ \frac{\|\mathbf{r}\|^{2}\,\|\mathbf{v}\|^{2}}{\langle\mathbf{r},\mathbf{v}\rangle^{2}}\ \ge 1,$
of which the inequality follows from the Cauchy–Schwarz inequality: $\langle\mathbf{r},\mathbf{v}\rangle\le\|\mathbf{r}\|\,\|\mathbf{v}\|$.
Let $\bar{\mathbf{U}}$ be an $n\times(m+1)$ matrix:
(34)
where $\mathbf{U}$ is defined by Equation (26). Then, with the aid of Equations (27) and (29), $\mathbf{v}$ can be written as:
(35)
where
(36)
Inserting Equation (35) for $\mathbf{v}$ into Equation (33), we encounter the following minimization problem:
(37)
where
(38)
(39)
(40)
(41)
in which $\nabla$ denotes the gradient with respect to the coefficient vector. Equation (41) is obtained from Equation (40) by taking the derivative with respect to the coefficients, which is given below in componential form:
(42)
where $v_{i}$ is the $i$th component of
(43)
which is a positive definite matrix. From this, we can deduce Equation (41). Here, we have to emphasize that the minimization in Equation (42) is fully equivalent to the minimization of the quantity defined by Equation (10), because $\mathbf{r}$ is a known vector.
By using
(44)
we can derive the following equation to solve for the coefficients:
(45)
4.2 A closed-form solution of $\boldsymbol{\alpha}$
In view of Equation (43), Equations (40) and (41) can be written as:
(46)
(47)
Because the Krylov matrix has full column rank, the positivity of the matrix in Equation (43) is guaranteed. From Equation (45), we can observe that the coefficient vector $\boldsymbol{\alpha}$ is proportional to the right-hand side, which is supposed to be
(48)
where $\lambda$ is a multiplier to be determined.
Then, by Equations (39), (47) and (48), we have
(49)
where
(50)
is a positive definite matrix. Inserting Equation (49) into Equations (38) and (46), we have
(51)
(52)
where
(53)
is a positive semi-definite matrix.
Now, from Equations (48), (51) and (52), we can derive the following equation:
(54)
By cancelling the quadratic term on both sides, we can obtain a linear equation, which renders a closed-form solution of $\lambda$:
(55)
and from Equation (49), we can obtain the closed-form solution of $\boldsymbol{\alpha}$:
(56)
Inserting the above $\boldsymbol{\alpha}$ into Equation (27), we can obtain
(57)
where
(58)
(59)
are, respectively, a constant matrix and a constant scalar, both being fully determined by the coefficient matrix $\mathbf{B}$ and the right-hand side vector $\mathbf{r}$ in Equation (17).
4.3 The second optimization to find $a_{0}$
Upon letting
(60)
$\mathbf{u}$ in Equation (57) can be expressed as
(61)
We can derive the closed-form solution of $a_{0}$ by the second optimization of the second merit function:
(62)
where
(63)
Taking the derivative of Equation (62) with respect to $a_{0}$ and equating it to zero, we can obtain
(64)
Inserting it into Equation (61), we arrive at
(65)
where the two quantities involved are defined by Equations (60) and (63), respectively.
4.4 The proof of Equation (73)
Inserting Equation (34) into the first term in Equation (53) and comparing the result with Equation (58), it immediately follows that
(66)
Then, by using Equations (53) and (50), we have
(67)
such that the matrix so defined is a projection operator. Indeed, it is a well-known result that it is an orthogonal projection operator.
In terms of this projection operator, the quantity defined in Equation (63) can be written as:
(68)
It follows that
(69)
(70)
where Equation (67) was used in the first equation. With the aid of Equation (59), Equation (70) is further reduced to
(71)
Now, after inserting Equation (71) into Equation (64), we can obtain
(72)
however, in view of Equation (69), the right-hand side can be simplified. Hence, we have
(73)
As a consequence, we have a neater form of the doubly optimized descent direction $\mathbf{u}$, which is given by
(74)
where
(75)
Equation (74) is better than Equation (65) for saving computations.
Sometimes it is better to use the following normal equation:
(76) $\mathbf{B}^{\mathrm{T}}\mathbf{B}\mathbf{x}=\mathbf{B}^{\mathrm{T}}\mathbf{b},$
to find the best descent direction $\mathbf{u}$, because the coefficient matrix $\mathbf{B}^{\mathrm{T}}\mathbf{B}$ is now positive definite. The process to find the best direction $\mathbf{u}$ is the same, where we only need to replace $\mathbf{B}$ by $\mathbf{B}^{\mathrm{T}}\mathbf{B}$, and $\mathbf{b}$ by $\mathbf{B}^{\mathrm{T}}\mathbf{b}$.
5 A double optimal descent algorithm
5.1 The numerical algorithm
The numerical procedure of the DODA is described in this section. Before that, we need to compute the inverse of the matrix in Equation (50). The accuracy of this inverse is crucial for the DODA, and we can use the following matrix conjugate gradient method (MCGM), developed by Liu et al. [Citation46], to find the inverse matrix:
(i) Assume an initial value.
(ii) Calculate the initial residual and descent matrices.
(iii) For $k=1,2,\ldots$, repeat the following iterations:
(77)
If the iterate converges according to a given stopping criterion, such that
(78)
then stop; otherwise, go to step (iii). In the above, the boldfaced capital letters denote matrices, the norm is the Frobenius norm and the inner product symbol is used for matrices. Because for the inverse problems to be computed we will use a rather small $m$, the inverse matrix is easily computed by the above MCGM algorithm, which converges very fast.
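The MCGM iteration itself is given by Equation (77) in [Citation46] and is not reproduced here. As a stand-in sketch that only illustrates the task it solves, namely inverting the small positive definite matrix of Equation (50), one can apply an ordinary conjugate gradient solve to each column of the identity (all names below are hypothetical):

```python
import numpy as np

def cg_solve(C, e, tol=1e-12, itmax=500):
    """Plain conjugate gradient for an SPD system C y = e; a stand-in sketch,
    not the MCGM iteration of Liu et al. [Citation46]."""
    y = np.zeros_like(e)
    r = e - C @ y
    p = r.copy()
    for _ in range(itmax):
        Cp = C @ p
        a = (r @ r) / (p @ Cp)
        y = y + a * p
        r_new = r - a * Cp
        if np.linalg.norm(r_new) < tol:
            break
        p = r_new + ((r_new @ r_new) / (r @ r)) * p
        r = r_new
    return y

def inverse_spd(C):
    # Invert column by column, i.e. solve C X = I.
    k = C.shape[0]
    return np.column_stack([cg_solve(C, col) for col in np.eye(k)])

m = 5
A = np.random.default_rng(1).standard_normal((12, m))
C = A.T @ A + 0.1 * np.eye(m)      # a small SPD matrix of the kind to be inverted
X = inverse_spd(C)
print(np.linalg.norm(X @ C - np.eye(m)))
```

Since the matrix is only of a small size, this inversion is cheap, which is consistent with the remark above that a rather small $m$ is used.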
Thus, we arrive at the following DODA:
(i) Select $m$ and give an initial guess $\mathbf{x}_{0}$.
(ii) For $k=0,1,2,\ldots$, compute the doubly optimized descent direction and update the solution by
(79)
until the convergence criterion is satisfied.
Remark 2
Note that the expansion vectors are constructed from the Krylov subspace method followed by the Arnoldi process to orthonormalize the Krylov vectors, so the resultant optimal algorithm is the DODA with the Krylov subspace method. In the Krylov subspace method, the expansion matrix $\mathbf{U}$ is not fixed, and it can adjust its configuration at each iterative step to accelerate the convergence, which, however, leads to a consumption of computational time in the calculation of the matrix inverse in Equation (50) at every iterative step.
5.2 The proof of the convergence of DODA
In this section, we prove that the algorithm DODA is convergent. From the second equality in Equation (48), by cancelling the common term on both sides, we have
(80)
where the multiplier $\lambda$, by using Equations (55) and (36), can be written as
(81)
With the help of Equations (59) and (73), we have
(82)
hence, Equation (80) becomes
(83)
In view of Equation (83), the iterative algorithm (79) reduces to
(84)
By using Equations (8) and (5), it follows that
(85)
Taking the square norms of both sides and using Equation (83), we can derive
(86)
Because the subtracted terms are positive, it immediately leads to
(87)
which indicates that the algorithm DODA is absolutely convergent.
Remark 3
The present algorithm DODA can provide a fast reduction of the residual. Indeed, for the DODA, from Equations (29), (74), (75) and (66), we have
(88)
where Equation (67) was used in the derivation of the second equation. In terms of the intersection angle $\theta$ between $\mathbf{r}$ and $\mathbf{v}$, we have
(89)
If $\theta=0$, the DODA converges in one step, by Equation (86). On the other hand, if we take $m=n$, then the DODA also converges in one step. We can see that if a suitable value of $m$ is taken, then the DODA can converge within a few steps. Therefore, we have the following convergence criterion of the DODA. If
(90)
then the iterations in Equation (79) terminate.
6 Numerical examples
In order to evaluate the performance of the newly developed DODA, we test some linear inverse problems. Some numerical results are compared with those computed by the GMRES,[Citation17] the RRGMRES,[Citation40] the OMVIA developed by Liu [Citation24], the OTVIA developed by Liu [Citation13] and a recent non-iterative algorithm based on the double optimal solution (DOS).[Citation19] The algorithm in [Citation19] is applied to solve the linear problem by selecting a suitable value of $m$ for the dimension of the Krylov subspace.
In the comparison of the numerical solution with the exact solution, we use two criteria to measure the accuracy. First, the numerical error is defined to be the absolute value of the difference between the numerical solution and the exact solution. In addition, we also compare the root-mean-square error (RMSE), which is defined by
(91) $\mathrm{RMSE}=\sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(x_{i}^{\mathrm{num}}-x_{i}^{\mathrm{exact}}\right)^{2}},$
where $x_{i}^{\mathrm{num}}$ and $x_{i}^{\mathrm{exact}}$ are, respectively, the numerical solution and the exact solution, and $N$ is the total number of data points to be compared.
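Equation (91) transcribes directly to code (the helper name is hypothetical):

```python
import numpy as np

def rmse(x_num, x_exact):
    """Equation (91): square root of the mean squared difference over the
    N compared data points (hypothetical helper name)."""
    x_num = np.asarray(x_num, dtype=float)
    x_exact = np.asarray(x_exact, dtype=float)
    return np.sqrt(np.mean((x_num - x_exact) ** 2))

print(rmse([1.0, 2.0, 3.0], [1.0, 2.0, 4.0]))  # -> 0.5773..., i.e. 1/sqrt(3)
```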
6.1 Example 1
Finding an $m$th-order polynomial function $p(x)$ to best match a continuous function $f(x)$ in the interval $[0,1]$:
(92) $\min\ \int_{0}^{1}\left[p(x)-f(x)\right]^{2}\mathrm{d}x,$
leads to a problem governed by Equation (1). The coefficient matrix $\mathbf{B}$ is the Hilbert matrix defined by
(93) $B_{ij}=\frac{1}{i+j-1},$
$\mathbf{x}$ is composed of the coefficients of $p(x)$, and
(94) $b_{i}=\int_{0}^{1}x^{i-1}f(x)\,\mathrm{d}x$
is uniquely determined by the function $f(x)$.
The Hilbert matrix is a notorious example of a highly ill-conditioned matrix. For Equation (1) with a matrix $\mathbf{B}$ having a large condition number, an arbitrarily small perturbation of the data on the right-hand side may lead to an arbitrarily large perturbation of the solution on the left-hand side.
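This sensitivity can be demonstrated numerically; the sketch below builds the Hilbert matrix of Equation (93) and perturbs the data of a small system along its worst-conditioned direction (the size and the perturbation are illustrative choices, not the paper's test parameters):

```python
import numpy as np

# Hilbert matrix of Equation (93); sizes and the perturbation below are
# illustrative choices, not the paper's actual test parameters.
def hilbert(n):
    i, j = np.meshgrid(np.arange(1, n + 1), np.arange(1, n + 1), indexing='ij')
    return 1.0 / (i + j - 1.0)

n = 8
B = hilbert(n)
x_exact = np.ones(n)
b = B @ x_exact

# Perturb the data along the worst-conditioned direction by a tiny amount.
U, s, Vt = np.linalg.svd(B)
b_noisy = b + 1e-10 * U[:, -1]
x = np.linalg.solve(B, b_noisy)
err = np.max(np.abs(x - x_exact))
print(np.linalg.cond(B), err)   # a 1e-10 data change produces an O(1) solution change
```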
In this example, we consider a highly ill-conditioned linear system (1) with $\mathbf{B}$ given by Equation (93). The ill-posedness of Equation (1) increases fast with $n$. We consider an exact solution given by
(95)
whose components involve random numbers.
It is known that the condition number of the Hilbert matrix grows exponentially when $n$ is very large, and for the case computed here it is extremely huge. We solve this problem by using the DODA with the Krylov subspace method. Under noise, the DODA converges very fast, in only three steps, under the given convergence criterion. It is very time saving because we only need to calculate the small matrix inverse three times, using the MCGM developed by Liu et al. [Citation46]. Under the convergence criterion in Equation (78), the process to find the inverse converges very fast with five or six iterations, although we do not show the details here. The numerical results are shown in Figure 1. It is interesting that even for a large value of $n$, we do not need a large value of $m$. Also, we can observe that the accuracy is not lost although the ill-posedness of the linear Hilbert problem is highly increased.
Figure 1. For example 1 solved by the DODA with Krylov subspace, showing (a) residual, (b) $a_{0}$ and $\beta$ and (c) numerical error.
As mentioned, the algorithm in [Citation19], which has been named the DOS, is applied to solve the linear problem by selecting a suitable value of $m$ for the dimension of the Krylov subspace. Under a large noise, we apply the GMRES, the DODA and the DOS to solve this problem, whose numerical errors are compared in Figure 2. The maximum error of the GMRES is 0.579 and that of the DOS is 0.566, while that of the DODA is smaller. The RMSE obtained by the DODA is 0.154, while that obtained by the DOS is 0.214. It can be seen that the DODA is slightly more accurate than the GMRES and the DOS.
6.2 Example 2
In this section, we apply the DODA to identify an unknown space-dependent heat source function $H(x)$ for a one-dimensional heat conduction equation:
(96)
(97)
(98)
In order to identify $H(x)$, we can impose an extra condition:
(99)
We propose a numerical differential method by letting $v=\partial u/\partial t$. Taking the differentials of Equations (96), (97) and (99) with respect to $t$, and expressing them in terms of $v$, we can derive
(100)
(101)
(102)
(103)
This is an inverse heat conduction problem (IHCP) for $v$ without using the initial condition.
Therefore, we can first solve the above IHCP for $v$ by using the method of fundamental solutions (MFS) to obtain a linear equations system, and then the method introduced in Section 5 is used to solve the resultant linear equations system; hence, we can construct $u$ by
(104) $u(x,t)=u(x,0)+\int_{0}^{t}v(x,\tau)\,\mathrm{d}\tau,$
which automatically satisfies the initial condition in Equation (98).
From Equation (104), it follows that
(105)
which, together with Equation (104), being inserted into Equation (96) leads to
(106)
Inserting Equation (100) for $v$ into the above equation and integrating it, we can derive the following equation to recover $H(x)$:
(107)
For the purpose of comparison, we consider the following exact solutions:
(108)
In Equation (107), we disregard the ill-posedness of the numerical differential, and suppose that the data are given exactly. A random noise is added on the data. Under the chosen parameters, we solve this problem by the DODA with 200 steps. In Figure 3(a) and (b), we plot the residual, $a_{0}$ and $\beta$, and the numerical solution is compared with the exact solution in Figure 3(c). The numerical error is shown in Figure 4(b), whose maximum value is 0.00421. It can be seen that the present DODA can provide a very accurate numerical result. This result is better than that calculated by Liu [Citation47] using the vector regularization iterative method, whose maximum error is 0.0086. Then, under the same parameters as those used in the DODA, we apply the GMRES,[Citation17] the RRGMRES [Citation40] and the OMVIA [Citation24] to solve this problem. The residuals and numerical errors are compared in Figure 4. While the maximum error of the GMRES is 0.053, the maximum error of the RRGMRES is 0.02 and the maximum error of the OMVIA is 0.0265. The maximum error of the DODA, as shown above, is 0.00421, which is the best among the four numerical methods. The RMSE of the RRGMRES is 0.014, the RMSE of the OMVIA is 0.023 and the RMSE of the DODA is 0.0024. It can be seen that the DODA is much more accurate than the other three numerical algorithms.
6.3 Example 3
When the backward heat conduction problem (BHCP) is considered in a spatial interval by subjecting to the boundary conditions at two ends of a slab:
(109)
(110)
we solve $u$ under a final time condition:
(111)
The fundamental solution of Equation (109) is given as follows:
(112) $K(x,t)=\frac{H(t)}{2\sqrt{\pi t}}\exp\!\left(\frac{-x^{2}}{4t}\right),$
where $H(t)$ is the Heaviside function.
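Assuming the standard heat kernel form of Equation (112) (unit diffusivity; the helper name is hypothetical), a small sketch evaluating it and checking that it integrates to one forward in time:

```python
import numpy as np

def heat_kernel(x, t, s, tau):
    """Fundamental solution of the 1D heat equation u_t = u_xx (unit diffusivity),
    in the standard form of Equation (112): K = H(t - tau)/(2 sqrt(pi (t - tau)))
    * exp(-(x - s)^2 / (4 (t - tau))), with H the Heaviside function."""
    dt = t - tau
    if dt <= 0.0:   # Heaviside factor: no influence backward in time
        return 0.0
    return np.exp(-(x - s) ** 2 / (4.0 * dt)) / (2.0 * np.sqrt(np.pi * dt))

# The kernel conserves total heat: it integrates to one in x for any t > tau.
xs = np.linspace(-30.0, 30.0, 20001)
vals = np.array([heat_kernel(x, 1.0, 0.0, 0.0) for x in xs])
total = np.sum(vals) * (xs[1] - xs[0])
print(total)
```

The Heaviside factor is what makes the basis functions below vanish for source times later than the field time.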
The MFS has the serious drawback that the resulting linear equations system is always highly ill conditioned when the number of source points is increased or when the distances of the source points are increased.
In the MFS, the solution of $u$ at the field point $\mathbf{z}$ can be expressed as a linear combination of the fundamental solutions $U(\mathbf{z},\mathbf{s}_{j})$:
(113)
where $n$ is the number of source points, the $c_{j}$ are unknown coefficients and the $\mathbf{s}_{j}$ are source points located in the complement of the problem domain. For the heat conduction equation, we have the basis functions
(114)
It is known that the location of the source points in the MFS has a great influence on the accuracy and stability. In a practical application of the MFS to solve the BHCP, the source points are uniformly located on two vertical straight lines parallel to the $t$-axis, not over the final time, which was adopted by Hon and Li [Citation48] and Liu [Citation49], showing a large improvement over the line location of source points below the initial time. After imposing the boundary conditions and the final time condition on Equation (113), we can obtain a linear equations system:
(115)
where
(116)
and the unknown vector collects the coefficients $c_{j}$.
Since the BHCP is highly ill posed, the ill-conditioning of the coefficient matrix in Equation (115) is serious. To overcome the ill-posedness of Equation (115), we can use the DODA to solve this problem. Here, we compare the numerical solution with an exact solution.
For the case considered, the value of the final time data is small in comparison with the value of the initial temperature to be retrieved. First, a relative random noise is imposed on the final time data. Under the chosen parameters, we solve this problem by the DODA, which converges within five steps. In Figure 5(a) and (b), we plot the residual, $a_{0}$ and $\beta$, and the numerical solution is compared with the exact solution in Figure 5(c). It can be seen that the present DODA converges very fast and is very robust against noise, and it can provide a very accurate numerical result.
Figure 5. For example 3 solved by the DODA with Krylov subspace, showing (a) residual, (b) $a_{0}$ and $\beta$ and (c) comparing numerical and exact solutions.
Then, under the same parameters as those used in the DODA, we also apply the GMRES,[Citation17] the RRGMRES,[Citation40] the OMVIA [Citation24] and the OTVIA [Citation13] to solve this problem. The residuals and numerical errors are compared in Figure 6. The OMVIA converges within 142 steps and its maximum error is 0.041. The GMRES converges within 16 steps and its maximum error is 0.148. The RRGMRES converges within four steps and its maximum error is 0.0124. The OTVIA does not converge within 1000 steps and its maximum error is 0.0994. Then, we apply the DOS with a suitable $m$ to solve this problem, whose numerical error, as shown in Figure 6(b), is better than that of the GMRES, the OMVIA and the OTVIA, but worse than that of the DODA. It can be seen that the DODA is the best among the six numerical methods, converging faster and being more accurate than the other five algorithms DOS, GMRES, RRGMRES, OMVIA and OTVIA. For this ill-posed problem, the DODA is also better than the RRGMRES in terms of the RMSE.
Figure 6. For example 3 solved by the DODA, DOS, GMRES, RRGMRES, OMVIA and OTVIA, comparing (a) residuals and (b) numerical errors.
![Figure 6. For example 3 solved by the DODA, DOS, GMRES, RRGMRES, OMVIA and OTVIA, comparing (a) residuals and (b) numerical errors.](/cms/asset/42e76924-b400-4fc8-b717-4020e7405100/gipe_a_880905_f0006_oc.gif)
Figure 7. For example 4 solved by the DODA with Krylov subspace, showing (a) residual, (b) a0 and β and (c) comparing numerical and exact solutions.
![Figure 7. For example 4 solved by the DODA with Krylov subspace, showing (a) residual, (b) a0 and β and (c) comparing numerical and exact solutions.](/cms/asset/9de408cd-7a01-4e94-a394-44baab05742b/gipe_a_880905_f0007_oc.gif)
6.4 Example 4
Let us consider the inverse Cauchy problem for the Laplace equation:
(117)
(118)
(119)
where
and
are given functions. The inverse Cauchy problem is specified as follows:
To seek an unknown boundary function on the part
of the boundary under Equations (117)–(119) with the overspecified data being given on
.
It is well known that the MFS can be used to solve the Laplace equation when a fundamental solution is known. In the MFS, the solution of at the field point
can be expressed as a linear combination of fundamental solutions
:
(120)
For the Laplace equation (117), we have the fundamental solutions:
(121)
In the practical application of MFS, by imposing the boundary conditions (118) and (119) at
points on Equation (120), we can obtain a linear equations system:
(122)
where
(123)
in which
, and
(124)
The above
with an offset
can be used to locate the source points along a contour with a radius
.
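As a self-contained sketch of this kind of MFS discretization, the collocation matrix can be assembled from the 2D logarithmic fundamental solution ln r. The circular boundary, source radius R and point count m below are illustrative assumptions, not the amoeba-shaped domain of this example.

```python
import numpy as np

def mfs_matrix(field_pts, source_pts):
    """Collocation matrix of 2D fundamental solutions ln ||x - s||."""
    diff = field_pts[:, None, :] - source_pts[None, :, :]
    return np.log(np.linalg.norm(diff, axis=2))

# m collocation points on a unit circle, and m source points placed on a
# concentric circle of radius R outside the domain (the offset contour)
m, R = 40, 2.5
theta = 2.0 * np.pi * np.arange(m) / m
bdry = np.column_stack([np.cos(theta), np.sin(theta)])
srcs = R * np.column_stack([np.cos(theta), np.sin(theta)])

# Dirichlet data of the harmonic test function u = x + y on the boundary
A = mfs_matrix(bdry, srcs)
c = np.linalg.solve(A, bdry[:, 0] + bdry[:, 1])   # ill-conditioned system

# evaluate the MFS expansion at an interior point; exactly u(0.3, 0.2) = 0.5
u0 = mfs_matrix(np.array([[0.3, 0.2]]), srcs) @ c
```

The matrix A is the kind of severely ill-conditioned system that the iterative solvers discussed in this paper target; a direct solve works for this toy geometry but degrades quickly as m grows or noise is added to the boundary data.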
For the purpose of comparison, we consider the following exact solution:
(125)
defined in a domain with a complex amoeba-like irregular shape as a boundary:
(126)
After imposing the boundary conditions (118) and (119) at
points on Equation (120), we can obtain a linear equations system. The noise imposed on the measured data
and
is
.
We solve this problem by the DODA with and
. The descent direction
is solved from
. Through 700 steps the residual and the values of
and
are shown in Figure (a) and (b). The numerical solution and exact solution are compared in Figure (c), whose maximum error is smaller than 0.0884. It can be seen that the DODA can accurately recover the unknown boundary condition.
Then, under the same parameters as those used in the DODA, we apply the GMRES,[Citation17] the RRGMRES,[Citation40] and the OMVIA [Citation24] to solve this problem. The residuals and numerical errors are compared in Figure . While the maximum error of GMRES is 5.02 (a failure) and that of RRGMRES is 1.16, the maximum error of OMVIA is 0.197. The maximum error of the DODA shown above is much smaller than that calculated by Liu [Citation13] using the OTVIA, whose maximum error is 0.34. Then we apply the DOS with to solve this problem, whose numerical error, as shown in Figure (b), with the maximum error being
is better than that of GMRES, but is worse than that of RRGMRES, OMVIA, OTVIA and DODA. The RMSE of GMRES is
, the RMSE of RRGMRES is
, the RMSE of DOS is
, the RMSE of OMVIA is
and the RMSE of OTVIA is
. The RMSE of DODA is
, which is better than those of the other five algorithms. Again, it can be seen that the DODA is the best among the six numerical methods, converging faster and being much more accurate than the other five algorithms.
6.5 Example 5
Let us consider the following inverse problem to recover the external force for
(127)
In a time interval of
, the discretized data
are supposed to be measurable, which are subjected to random noise with an intensity
. Usually, it is very difficult to recover the external force
from Equation (127) by direct differentiation of the noisy displacement data, because differentiation is an ill-posed linear operator.
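This ill-posedness can be seen in a short numerical experiment; the signal, grid and noise level below are illustrative, not the data of this example. A noise of size s in the data becomes a perturbation of size O(s/h) in a finite-difference derivative:

```python
import numpy as np

rng = np.random.default_rng(0)
n, s = 201, 0.01                 # number of samples and noise intensity
t = np.linspace(0.0, 1.0, n)
h = t[1] - t[0]

x_exact = np.sin(2.0 * np.pi * t)
x_noisy = x_exact + s * rng.standard_normal(n)

# the data error stays of size s, but central differencing divides
# neighbouring noise values by 2h, amplifying the error to O(s / h)
d_exact = np.gradient(x_exact, h)
d_noisy = np.gradient(x_noisy, h)
err = np.max(np.abs(d_noisy - d_exact))
```

Halving the grid spacing h makes the data fit better but doubles the noise amplification, which is exactly the trade-off that motivates the stabilized interpolation developed next.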
To approach this inverse problem by polynomial interpolation, we begin with
(128)
Now, the coefficient
is split into two coefficients
and
to absorb more interpolation points; meanwhile,
and
are introduced to reduce the condition number of the coefficient matrix. We suppose that
(129)
and
(130)
The problem domain is
, and the interpolating points are:
(131)
Substituting Equation (129) into Equation (128), we can obtain
(132)
where we let
. Here,
and
are unknown coefficients. In order to obtain them, we impose the following
interpolation conditions:
(133)
Thus, we obtain a linear equations system to determine
and
:
(134)
We note that the norm of the first column of the above coefficient matrix is
. According to the concept of equilibrated matrix,[Citation50] we can derive the optimal scales for the current interpolation with a half-order technique as
(135)
(136)
where
is a scaling factor.[Citation50] The improved method uses
-order polynomial to interpolate
data nodes, while the regular method with a full order can only interpolate
data points.
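The benefit of scaling the columns can be sketched on a plain Vandermonde matrix. The nodes, degree and simple Euclidean column equilibration below are illustrative assumptions; the optimal scales of Equations (135) and (136) from [Citation50] are not reproduced here.

```python
import numpy as np

n = 12
t = np.linspace(0.0, 10.0, n)          # interpolation nodes (illustrative)
V = np.vander(t, n, increasing=True)   # plain full-order Vandermonde matrix

# equilibrate: divide each column by its Euclidean norm, so that every
# column of the scaled matrix has unit length
s = np.linalg.norm(V, axis=0)
Vs = V / s

cond_plain = np.linalg.cond(V)
cond_scaled = np.linalg.cond(Vs)
```

Because the column norms of V grow roughly geometrically with the polynomial degree, even this crude equilibration lowers the condition number by many orders of magnitude, which is the same effect the paper's optimal scales are designed to achieve.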
Now we fix (of the above
) and
and consider the exact solution to be
, which is obtained by inserting the exact value of
into Equation (127). The parameters used are
, and
. When we use the DODA with
, we let it run 300 steps. The descent direction
is solved from
. The residual and the values of
and
are shown in Figure (a) and (b). The numerical solution and exact solution are compared in Figure (c), whose maximum error is smaller than 0.027. It can be seen that the DODA can accurately recover the unknown external force.
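A minimal sketch of why a polynomial representation stabilizes the differentiation is given below; it uses a generic least-squares fit, not the paper's scaled interpolation of Equations (128)–(136), and the signal and noise level are illustrative. Differentiating the fitted polynomial filters the noise that direct differencing amplifies:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 101
t = np.linspace(0.0, 1.0, n)
h = t[1] - t[0]
x_noisy = np.sin(np.pi * t) + 0.01 * rng.standard_normal(n)
d_exact = np.pi * np.cos(np.pi * t)

# direct finite differences amplify the noise by O(1 / h) ...
d_direct = np.gradient(x_noisy, h)
err_direct = np.max(np.abs(d_direct - d_exact))

# ... while differentiating a low-order least-squares polynomial fit
# stays stable, since the fit averages the noise over all data points
p = np.polynomial.Polynomial.fit(t, x_noisy, deg=6)
err_fit = np.max(np.abs(p.deriv()(t) - d_exact))
```

The fitted-polynomial derivative acts as a regularized inverse of the integration operator, trading a small truncation error for a large reduction in noise amplification.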
When we apply the GMRES to solve this problem, the parameter is changed to
, and the other parameters are unchanged. In Figure , we plot the results obtained by the GMRES with dashed-dotted lines; as shown in Figure (c) and (d), the GMRES is less accurate, with the maximum error being 0.35. The RMSE obtained by the DODA is
, while that obtained by the GMRES is 0.117. It can be seen that the DODA is much more accurate than the GMRES.
7 Conclusions and discussion
In the present paper, we have derived a double optimal algorithm, including an -vector optimal search direction in an
-dimensional affine Krylov subspace to solve a highly ill-posed linear system. This algorithm is a multi-vector DODA, whose expansion coefficients in the descent direction are solved in closed form through double optimization of two basic merit functions that measure the distance between
and
. The DODA has good computational efficiency and accuracy in solving the ill-posed linear equations system. Numerical tests on the linear inverse problems have confirmed the robustness of the DODA against noisy disturbances even with an intensity as large as
For the test examples, we found that the DODA converges fast and stably, with the iteration count smaller than the dimension of the considered problem. As with the usual Krylov subspace methods, the value of
cannot be too large; otherwise, much computational time is required to construct the Krylov matrix by the Arnoldi process at each iteration step. When
is large, the Krylov matrix will be highly ill conditioned for the ill-posed problem. The DODA converges faster than the other numerical methods investigated in this paper, and indeed it can achieve smaller residual errors than the GMRES, the RRGMRES and the OMVIA under the same value of
. In the proof of the convergence of the DODA, we have derived an exact equation to estimate the difference of two consecutive square residual norms:
How to maximize the quantity in the square brackets, i.e.
might be an important issue, which will lead to a further study of the structure of the Krylov subspace, resulting in the best configuration of the projection operator
. On the other hand, based on Theorem 5 in [Citation20], the square residual obtained by the DODA is smaller than that obtained by the algorithms based on the least square of residual with the following relation:
This fact reveals that the DODA is more efficient than the other algorithms, and all the numerical examples investigated in this paper, assessed in terms of convergence speed, maximum error, absolute error and RMSE, confirm the superiority of the DODA.
Acknowledgments
The author highly appreciates the constructive comments from the anonymous referees, which improved the quality of this paper. Highly appreciated are the project NSC-102-2221-E-002-125-MY3 and the 2011 Outstanding Research Award from the National Science Council of Taiwan, and the 2011 Taiwan Research Front Award from Thomson Reuters. The author also acknowledges having been promoted to Lifetime Distinguished Professor of National Taiwan University since 2013.
References
- Stewart G. Introduction to matrix computations. New York (NY): Academic Press; 1973.
- Kunisch K, Zou J. Iterative choices of regularization parameters in linear inverse problems. Inverse Probl. 1998;14:1247–1264.
- Wang Y, Xiao T. Fast realization algorithms for determining regularization parameters in linear inverse problems. Inverse Probl. 2001;17:281–291.
- Xie J, Zou J. An improved model function method for choosing regularization parameters in linear inverse problems. Inverse Probl. 2002;18:631–643.
- Resmerita E. Regularization of ill-posed problems in Banach spaces: convergence rates. Inverse Probl. 2005;21:1303–1314.
- Chehab JP, Laminie J. Differential equations and solution of linear systems. Numer. Algorithms. 2005;40:103–124.
- Helmke U, Moore JB. Optimization and dynamical systems. Berlin: Springer; 1994.
- Liu CS. Optimally generalized regularization methods for solving linear inverse problems. CMC: Comput. Mater. Con. 2012;29:103–127.
- Liu CS. Scaled vector regularization method to solve ill-posed linear problems. Appl. Math. Comput. 2012;218:10602–10616.
- Liu CS. A dynamical Tikhonov regularization for solving ill-posed linear algebraic systems. Acta Appl. Math. 2012;123:285–307.
- Liu CS. An optimal preconditioner with an alternate relaxation parameter used to solve ill-posed linear problems. CMES: Comput. Model. Eng. Sci. 2013;92:241–269.
- Liu CS. A globally optimal iterative algorithm to solve an ill-posed linear system. CMES: Comput. Model. Eng. Sci. 2012;84:383–403.
- Liu CS. An optimal tri-vector iterative algorithm for solving ill-posed linear inverse problems. Inverse Probl. Sci. Eng. 2013;21:650–681.
- Liu CS. A globally optimal tri-vector method to solve an ill-posed linear system. J. Comput. Appl. Math. 2014;260:18–35.
- Barzilai J, Borwein JM. Two point step size gradient methods. IMA J. Numer. Anal. 1988;8:141–148.
- Akaike H. On a successive transformation of probability distribution and its application to the analysis of the optimum gradient method. Ann. Inst. Stat. Math. Tokyo. 1959;11:1–16.
- Saad Y, Schultz MH. GMRES: a generalized minimal residual algorithm for solving nonsymmetric linear systems. SIAM J. Sci. Stat. Comput. 1986;7:856–869.
- Liu CS, Atluri SN. An iterative method using an optimal descent vector, for solving an ill-conditioned system Bx = b, better and faster than the conjugate gradient method. CMES: Comput. Model. Eng. Sci. 2011;80:275–298.
- Liu CS. A doubly optimized solution of linear equations system expressed in an affine Krylov subspace. J. Comput. Appl. Math. 2014;260:375–394.
- Liu CS. Discussing a more fundamental concept than the minimal residual method for solving linear system in a Krylov subspace. J. Math. Res. 2013;5:58–70.
- Liu CS. An optimally generalized steepest-descent algorithm for solving ill-posed linear systems. J. Appl. Math. 2013; ID 154358, 15 p.
- Jacoby SLS, Kowalik JS, Pizzo JT. Iterative methods for nonlinear optimization problems. New Jersey (NJ): Prentice-Hall; 1972.
- Ostrowski AM. Solution of equations in Euclidean and Banach spaces. 3rd ed. New York (NY): Academic Press; 1973.
- Liu CS. An optimal multi-vector iterative algorithm in a Krylov subspace for solving the ill-posed linear inverse problems. CMC: Comput. Mater. Con. 2013;33:175–198.
- Dongarra J, Sullivan F. Guest editors’ introduction to the top 10 algorithms. Comput. Sci. Eng. 2000;2:22–23.
- Simoncini V, Szyld DB. Recent computational developments in Krylov subspace methods for linear systems. Numer. Linear Algebra Appl. 2007;14:1–59.
- Saad Y. Krylov subspace methods for solving large unsymmetric linear systems. Math. Comput. 1981;37:105–126.
- Freund RW, Nachtigal NM. QMR: a quasi-minimal residual method for non-Hermitian linear systems. Numer. Math. 1991;60:315–339.
- van Den Eshof J, Sleijpen GLG. Inexact Krylov subspace methods for linear systems. SIAM J. Matrix Anal. Appl. 2004;26:125–153.
- Hestenes MR, Stiefel EL. Methods of conjugate gradients for solving linear systems. J. Res. Nat. Bur. Stand. 1952;49:409–436.
- Lanczos C. Solution of systems of linear equations by minimized iterations. J. Res. Nat. Bur. Stand. 1952;49:33–53.
- Paige CC, Saunders MA. Solution of sparse indefinite systems of linear equations. SIAM J. Numer. Anal. 1975;12:617–629.
- Fletcher R. Conjugate gradient methods for indefinite systems. Lecture notes in Math. Vol. 506. Berlin: Springer-Verlag; 1976. p. 73–89.
- Sonneveld P. CGS: a fast Lanczos-type solver for nonsymmetric linear systems. SIAM J. Sci. Stat. Comput. 1989;10:36–52.
- van der Vorst HA. Bi-CGSTAB: a fast and smoothly converging variant of Bi-CG for the solution of nonsymmetric linear systems. SIAM J. Sci. Stat. Comput. 1992;13:631–644.
- Saad Y, van der Vorst HA. Iterative solution of linear systems in the 20th century. J. Comput. Appl. Math. 2000;123:1–33.
- Saad Y. Iterative methods for sparse linear systems. 2nd ed. Pennsylvania (PA): SIAM; 2003.
- van der Vorst HA. Iterative Krylov methods for large linear systems. New York (NY): Cambridge University Press; 2003.
- Liu CS. The concept of best vector used to solve ill-posed linear inverse problems. CMES: Comput. Model. Eng. Sci. 2012;83:499–525.
- Calvetti C, Lewis B, Reichel L. GMRES-type methods for inconsistent systems. Linear Algebra Appl. 2000;316:157–169.
- Kuroiwa N, Nodera T. A note on the GMRES method for linear discrete ill-posed problems. Adv. Appl. Math. Mech. 2009;1:816–829.
- Matinfar M, Zareamoghaddam H, Eslami M, Saeidy M. GMRES implementations and residual smoothing techniques for solving ill-posed linear systems. Comput. Math. Appl. 2012;63:1–13.
- Morikuni K, Reichel L, Hayami K. FGMRES for linear discrete ill-posed problems. Appl. Numer. Math. 2014;75:175–187.
- Yin JF, Hayami K. Preconditioned GMRES methods with incomplete Givens orthogonalization method for large sparse least-squares problems. J. Comput. Appl. Math. 2009;226:177–186.
- Horn RA, Johnson CR. Matrix analysis. New York (NY): Cambridge University Press; 1985.
- Liu CS, Hong HK, Atluri SN. Novel algorithms based on the conjugate gradient method for inverting ill-conditioned matrices, and a new regularization method to solve ill-posed linear systems. CMES: Comput. Model. Eng. Sci. 2010;60:279–308.
- Liu CS. A vector regularization method to solve linear inverse problems. Inverse Probl. Sci. Eng. 2013. dx.doi.org/10.1080/17415977.2013.823415.
- Hon YC, Li M. A discrepancy principle for the source points location in using the MFS for solving the BHCP. Int. J. Comput. Methods. 2009;6:181–197.
- Liu CS. The method of fundamental solutions for solving the backward heat conduction problem with conditioning by a new post-conditioner. Numer. Heat Transfer B: Fundam. 2011;60:57–72.
- Liu CS. A two-side equilibration method to reduce the condition number of an ill-posed linear system. CMES: Comput. Model. Eng. Sci. 2013;91:17–42.