Research Article

Super resolution reconstruction for medical image based on adaptive multi-dictionary learning and structural self-similarity


Abstract

To improve the quality of super-resolution (SR) reconstructed medical images, an improved adaptive multi-dictionary learning method is proposed that combines information from the medical image itself with a natural image database. In the dictionary training stage, the upper-layer images of a pyramid generated from the structural self-similarity of the low-resolution image are used as training samples. In the reconstruction stage, the top-layer image of the pyramid is taken as the initial estimate, and SR reconstruction of the medical image is achieved with a regularization term based on the non-local structural self-similarity of the image. The method therefore exploits similar information of the medical image at both the same scale and across scales. Simulation experiments on natural and medical images show that the proposed method effectively improves medical image SR reconstruction.

1. Introduction

With the continuous innovation of medical imaging and computer technology, medical image processing is becoming increasingly important in clinical diagnosis. It helps doctors fully understand the internal pathological structure of patients so that an accurate treatment plan can be developed. High-quality medical images not only improve the level of clinical diagnosis, but also provide necessary support for medical research, teaching, and surgery. Commonly used super-resolution (SR) reconstruction methods are interpolation methods such as B-spline and bicubic interpolation. Through SR reconstruction, the clarity of medical images, and hence the accuracy of disease diagnosis, can be improved, which helps doctors deliver precise treatment to patients.

SR methods generate a single high-resolution (HR) image from one or more low-resolution (LR) images of the same scene [Citation1]. The main difficulty of SR is recovering the high-frequency information lost when the LR inputs are produced [Citation2,Citation3].

Yang et al. [Citation4] propose an SR algorithm based on sparse representation. In this algorithm, the feature-sign search (FSS) method is used to learn LR and HR dictionaries from pairs of HR and LR image patches. Sparse coefficients of the LR patches are computed over the LR dictionary, and the same coefficients are then applied to the HR dictionary to produce the HR patches.

In Dong's method [Citation5], a set of dictionaries is pre-learned from a dataset of high-quality example patches, and for each local patch the most suitable dictionary is selected. During reconstruction, non-local structural self-similarity (NLSS) and an autoregressive model are used as regularization terms. Exploiting the self-similarity of the image increases the accuracy and reliability of the recovered information. However, the method still trains its dictionaries from an external image database, so not all image patches can be sparsely represented well. To further improve the reconstruction, Pan [Citation6] proposed an SR method based on adaptive multi-dictionary learning (MDL): patches from an LR image pyramid are first classified into several groups according to predefined conditions, and a dictionary is then learned for each group.

This paper proposes an improved method based on adaptive MDL and structural self-similarity. Unlike Pan's method, the upper-layer images of a pyramid generated by structural self-similarity are used as dictionary training samples, which captures more self-similar information across different scales than directly decomposing the input image. In addition, the top-layer image of the pyramid is used as the initial reconstruction, and the SR image is reconstructed with a regularization term based on the non-local structural self-similarity of the image, so that self-similar information at the same scale is also exploited. Experimental results demonstrate that the HR images reconstructed by the proposed method are of good quality for medical images.

2. SR model

2.1. SR model based on sparse representation

Supposing $\Phi \in \mathbb{R}^{N \times M}$ ($N < M$) is an over-complete dictionary of $M$ prototype signal-atoms, a signal can be represented as a sparse linear combination of these atoms. The signal vector $x$ can be expressed as: (1) $x \approx \Phi\alpha$, where $\alpha \in \mathbb{R}^{M}$ is the sparse coefficient vector with very few ($\ll M$) nonzero terms. The sparse decomposition coefficient can be calculated by solving an $\ell_0$-minimization problem, formulated as: (2) $\alpha = \arg\min_{\alpha} \|\alpha\|_0 \ \text{ s.t. } \|x - \Phi\alpha\|_2 \le \varepsilon$, where $\|\cdot\|_0$ is a pseudo-norm counting the number of non-zero entries in $\alpha$, and $\varepsilon$ is a small constant bounding the approximation error. Since $\ell_0$-minimization is an NP-hard combinatorial optimization problem, it is relaxed to the convex $\ell_1$-minimization, and the sparse coding problem becomes: (3) $\alpha = \arg\min_{\alpha} \|\alpha\|_1 \ \text{ s.t. } \|x - \Phi\alpha\|_2 \le \varepsilon$

Eq. (3) can also be written in Lagrangian form: (4) $\alpha = \arg\min_{\alpha}\{\|x - \Phi\alpha\|_2^2 + \lambda\|\alpha\|_1\}$, where $\lambda$ is a constant regularization parameter.
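For illustration, Eq. (4) can be solved with any off-the-shelf $\ell_1$ solver. The following is a minimal sketch using scikit-learn's Lasso as an assumed stand-in (not the solver used in the paper); note that scikit-learn minimizes $(1/2N)\|x - \Phi\alpha\|_2^2 + \lambda\|\alpha\|_1$, so its `alpha` parameter corresponds to $\lambda$ only up to that scaling. The dictionary and signal here are synthetic.

```python
# Minimal sketch: sparse coding over an over-complete dictionary (Eq. (4)),
# using scikit-learn's Lasso as a stand-in l1 solver.
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
N, M = 64, 256                      # signal length N < number of atoms M
Phi = rng.standard_normal((N, M))
Phi /= np.linalg.norm(Phi, axis=0)  # unit-norm atoms

# Synthesize a signal from a few atoms so a sparse code exists
alpha_true = np.zeros(M)
alpha_true[rng.choice(M, size=5, replace=False)] = rng.standard_normal(5)
x = Phi @ alpha_true

# scikit-learn's objective is (1/(2*N)) * ||x - Phi a||^2 + alpha * ||a||_1,
# so `alpha` plays the role of lambda only up to that 1/(2N) scaling.
solver = Lasso(alpha=1e-3, fit_intercept=False, max_iter=10000)
solver.fit(Phi, x)
alpha_hat = solver.coef_

print("non-zeros:", np.count_nonzero(np.abs(alpha_hat) > 1e-6))
print("reconstruction error:", np.linalg.norm(x - Phi @ alpha_hat))
```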

The single-image SR problem can be stated as follows: given an LR image $Y$, recover the HR image $X$ of the same scene. The LR image is assumed to be a blurred and down-sampled version of $X$: (5) $Y = DBX$, where $B$ is a blurring operator and $D$ is a down-sampling operator. Recovering $X$ from $Y$ is the process of SR reconstruction, which amounts to solving the following least-squares problem: (6) $X = \arg\min_{X} \|Y - DBX\|_2^2$
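As a concrete illustration of the observation model in Eq. (5), the sketch below assumes $B$ is a Gaussian blur and $D$ is simple decimation; the kernel size, standard deviation, and factor mirror the experimental settings reported in Section 4, but the exact operators used by the authors are not specified here.

```python
# Sketch of the observation model Y = DBX: Gaussian blur followed by decimation.
import numpy as np
from scipy.ndimage import gaussian_filter

def degrade(X, sigma=1.6, factor=3):
    """Blur X with a Gaussian kernel (B) and down-sample by `factor` (D)."""
    blurred = gaussian_filter(X, sigma=sigma, truncate=2.0)  # ~7x7 support for sigma=1.6
    return blurred[::factor, ::factor]

X = np.random.rand(96, 96)   # stand-in HR image
Y = degrade(X)               # simulated LR observation (32 x 32)
print(Y.shape)
```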

A sparse representation prior on small patches of $X$ is used so that the HR image $X$ satisfies the reconstruction constraint for the given LR input $Y$.

Let $x$ be the column vector reshaped from image $X$, and let $R_i$ be an extraction matrix such that $x_i = R_i X$, $i = 1, 2, \ldots, N$, is the $i$th patch (size $n \times n$) of $x$. $\{\Phi_k\}$, $k = 1, 2, \ldots, M$, is defined as a set of orthonormal sub-dictionaries. The sparse coding $\hat{x}_i = \Phi_{k_i}\alpha_i$ with $\|\alpha_i\|_1 \le T$ approximately represents $x_i$, and the reconstructed patches $\hat{x}_i$ are averaged to reconstruct the whole image: (7) $\hat{X} = \Phi\alpha = \left(\sum_{i=1}^{N} R_i^T R_i\right)^{-1} \sum_{i=1}^{N} \left(R_i^T \Phi_{k_i}\alpha_i\right)$, where $\alpha$ denotes the collection of all $\alpha_i$ and $\Phi$ denotes the collection of all sub-dictionaries $\{\Phi_k\}$.

For $Y = DBX$, our goal is to recover the original image $X$ from $Y$. Combining Eqs. (6) and (7), the SR reconstruction model based on sparse representation can be written as: (8) $\hat{\alpha} = \arg\min_{\alpha}\{\|Y - DB\Phi\alpha\|_2^2 + \lambda\|\alpha\|_1\}$, where the $\ell_1$-norm sparsity regularization term $\|\alpha\|_1$ is weighted by the constant $\lambda$. Since a reweighted $\ell_1$-norm approximates the $\ell_0$-norm sparsity more closely than a constant weight does, according to Ref. [13], Eq. (8) can be rewritten as: (9) $\hat{\alpha} = \arg\min_{\alpha}\left\{\|Y - DB\Phi\alpha\|_2^2 + \sum_{i=1}^{N}\sum_{j=1}^{n}\lambda_{i,j}|\alpha_{i,j}|\right\}$, where $\lambda_{i,j}$ is the weight assigned to $\alpha_{i,j}$, and $\alpha_{i,j}$ is the coefficient associated with the $j$th atom of $\Phi_{k_i}$. The weight $\lambda_{i,j}$ is obtained by: (10) $\lambda_{i,j} = \dfrac{c}{|\alpha_{i,j}^{(l)}| + \varepsilon}$, where $c$ is a constant, $\alpha_{i,j}^{(l)}$ is the estimate of $\alpha_{i,j}$ at the $l$th iteration, and $\varepsilon$ is a small constant.
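The reweighting rule of Eq. (10) is straightforward to compute from the coefficients of the previous iteration; a minimal sketch follows, with $c$ and $\varepsilon$ as placeholder values.

```python
import numpy as np

def reweight(alpha_prev, c=1.0, eps=1e-6):
    """Eq. (10): lambda_{i,j} = c / (|alpha_{i,j}^{(l)}| + eps).
    Large coefficients receive small penalties and small coefficients large ones,
    which pushes the l1 penalty closer to l0-like behaviour."""
    return c / (np.abs(alpha_prev) + eps)

alpha_prev = np.array([0.0, 0.5, -2.0, 0.01])
print(reweight(alpha_prev))
```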

2.2. SR model combined with structure self-similarity

In general, images contain many repeated structures and therefore a lot of redundant information, which can be exploited to significantly improve the quality of the reconstructed image. Accordingly, a non-local self-similarity regularization term is added to the sparsity-based SR framework. For each local patch $x_i$, similar patches $x_i^l$ ($l = 1, 2, \ldots, L$) are searched in the whole image $\hat{X}$. A patch $x_i^l$ is accepted as similar if $e_i^l = \|\hat{x}_i - \hat{x}_i^l\|_2^2 \le t$, where $\hat{x}_i$ and $\hat{x}_i^l$ are the current estimates and $t$ is a preset threshold; among the accepted patches only the $L$ closest to $\hat{x}_i$ are kept, where $L$ is also preset. Let $\chi_i$ and $\chi_i^l$ denote the center pixels of $x_i$ and $x_i^l$, respectively. The pixel $\chi_i$ is predicted by the weighted average $\sum_{l=1}^{L} b_i^l \chi_i^l$, where the weight assigned to $\chi_i^l$ is $b_i^l = \exp(-e_i^l/h) / \sum_{l=1}^{L}\exp(-e_i^l/h)$ and $h$ is a weight-control factor. In this paper, the difference between $\chi_i$ and its prediction is used as a regularization term, i.e. $\|\chi_i - \sum_{l=1}^{L} b_i^l \chi_i^l\|_2^2$. Combining sparse representation and structural self-similarity yields the image SR reconstruction model: (11) $\hat{\alpha} = \arg\min_{\alpha}\left\{\|Y - DB\Phi\alpha\|_2^2 + \sum_{i=1}^{N}\sum_{j=1}^{n}\lambda_{i,j}|\alpha_{i,j}| + \eta\sum_{i=1}^{N}\left\|\chi_i - \sum_{l=1}^{L} b_i^l\chi_i^l\right\|_2^2\right\}$, where $\eta$ is a constant balancing the contribution of the non-local regularization term.
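A minimal sketch of how the non-local weights $b_i^l$ might be computed for a single reference patch, assuming an exhaustive search over candidate patches; the thresholds $t$ and $L$ and the control factor $h$ are placeholder values, not the settings used in the paper.

```python
# Sketch: non-local weights b_i^l for one reference patch (Section 2.2).
import numpy as np

def nonlocal_weights(ref_patch, candidate_patches, L=10, t=np.inf, h=65.0):
    """Return indices of the L most similar candidates and their normalized weights."""
    errs = np.array([np.sum((ref_patch - p) ** 2) for p in candidate_patches])
    order = np.argsort(errs)
    keep = [idx for idx in order[:L] if errs[idx] <= t]   # L closest patches under threshold t
    w = np.exp(-errs[keep] / h)
    return keep, w / w.sum()                              # weights sum to 1

rng = np.random.default_rng(1)
ref = rng.random((7, 7))
cands = [ref + 0.05 * rng.standard_normal((7, 7)) for _ in range(50)]
idx, b = nonlocal_weights(ref, cands)
print(len(idx), b.sum())   # L selected patches, weights sum to 1
```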

By merging the first and third terms, Eq. (11) can be written in the simplified form: (12) $\hat{\alpha} = \arg\min_{\alpha}\left\{\|\tilde{Y} - F\Phi\alpha\|_2^2 + \sum_{i=1}^{N}\sum_{j=1}^{n}\lambda_{i,j}|\alpha_{i,j}|\right\}$, where (13) $\tilde{Y} = \begin{bmatrix} Y \\ 0 \end{bmatrix}, \quad F = \begin{bmatrix} DB \\ \sqrt{\eta}\,(I - A) \end{bmatrix}$, $A$ is the weight matrix, and $I$ is the identity matrix.

3. Acquisition and training of adaptive multi-dictionary samples

3.1. Selection of samples

To recover the missing high-frequency information, the dictionary must itself contain high-frequency information. In this paper, the dictionary training samples consist of the image to be processed and a natural image database. The natural image database is built mainly from high-resolution images with rich texture details. For the image itself, an image pyramid is generated by self-similarity to produce training samples that contain the image's own information, and the upper-layer images of the pyramid are selected as dictionary samples. The principle of generating the image pyramid by self-similarity is as follows.

Figure 1 shows the input image $I_{in}$. First, LR images $I_{in}^{-k}$ ($k = 1, 2, \ldots, m$) are generated as the lower-layer images of the pyramid. For a source patch $P_s$ in $I_{in}$, similar patches ($P_1$ and $P_2$) are searched in the lower-resolution images ($I_{in}^{-1}$ or $I_{in}^{-2}$). For each found patch ($P_1$ or $P_2$), a corresponding region ($Q_1$ or $Q_2$) in $I_{in}$ is determined. The target region ($D_1$ or $D_2$) depends on two factors: (1) the position of the source patch $P_s$, and (2) the layer index of the found patch ($-1$ for $I_{in}^{-1}$ or $-2$ for $I_{in}^{-2}$). Finally, $Q_1$ is copied to the corresponding position of the enlarged region $D_1$, and likewise for $Q_2$. Since $P_s$ is not exactly equal to $P_k$, and $Q_k$ is not exactly equal to $D_k$, $D_k$ is obtained by weighting $Q_k$: (14) $D_k = \omega_k Q_k$, where the weights are (15) $\omega_k = \exp(-\|P_s - P_k\|^2 / \sigma^2)$

Figure 1. The image pyramid generated by self-similarity.

Here $\sigma$ controls the degree of similarity. $I_{in}^{+1}$ and $I_{in}^{+2}$ denote the HR images in Figure 1. Although patches are copied into $I_{in}^{+1}$ and $I_{in}^{+2}$, many areas remain uncovered (i.e., for some source patches in $I_{in}$ no similar patch is found in the pyramid). The uncovered areas are filled by back projection to improve the image resolution, following the method of [Citation6].
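A minimal sketch of the weighting step of Eqs. (14)–(15): a patch $P_k$ found in a lower-resolution layer has a corresponding region $Q_k$ in $I_{in}$, which contributes to the upper layer after weighting. Patch sizes and $\sigma$ are placeholder assumptions, and the search for similar patches and the back-projection step are omitted.

```python
# Sketch of the self-similarity weighting in Eqs. (14)-(15).
import numpy as np

def patch_weight(P_s, P_k, sigma=10.0):
    """Eq. (15): similarity weight between a source patch and a found patch."""
    return np.exp(-np.sum((P_s - P_k) ** 2) / sigma ** 2)

def weighted_copy(Q_k, P_s, P_k, sigma=10.0):
    """Eq. (14): the target region D_k is the corresponding region Q_k
    weighted by the similarity of the matched patch pair."""
    return patch_weight(P_s, P_k, sigma) * Q_k

rng = np.random.default_rng(2)
P_s = rng.random((5, 5))                       # source patch in I_in
P_k = P_s + 0.1 * rng.standard_normal((5, 5))  # similar patch found in a lower layer
Q_k = rng.random((10, 10))                     # corresponding region in I_in
D_k = weighted_copy(Q_k, P_s, P_k)             # contribution to the upper layer
print(D_k.shape, patch_weight(P_s, P_k))
```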

3.2. Training of adaptive multi-dictionary

As mentioned above, the dictionary has an important impact on the reconstruction result. To enrich the high-frequency information contained in the dictionary, this paper combines the image itself with a natural image database. Firstly, the upper-layer images of the pyramid are taken as the training samples $S$. Since semantic information is conveyed mainly by image edges, to which the human visual system is sensitive, high-pass filtering is applied to obtain the feature of each $n \times n$ patch of the training samples $S$ for clustering. Only patches with a clear edge structure are used in dictionary learning, so smooth patches are excluded by keeping only patches whose intensity variance exceeds a threshold $\Delta$. The selected patches are denoted by $s_i$, $i = 1, 2, \ldots, m$, and $g_i$, $i = 1, 2, \ldots, m$, is the high-frequency component of $s_i$. The high-frequency patches $g_i$ are clustered into $K$ clusters by the K-means algorithm, denoted by $C_k = \{g_1^k, g_2^k, \ldots, g_{m_k}^k\}$, $k = 1, 2, \ldots, K$, where $m_k$ is the number of samples in each class and $\sum_{k=1}^{K} m_k = m$. Then the class centers $\mu_k = \frac{1}{m_k}\sum_{i=1}^{m_k} g_i^k$ and radii $r_k = \max_i \|g_i^k - \mu_k\|_2$, $i = 1, \ldots, m_k$, are computed for each class.
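A minimal sketch of this clustering stage, assuming a simple Gaussian low-pass to obtain the high-frequency component and scikit-learn's KMeans for clustering; the patch size, variance threshold $\Delta$, and $K$ are placeholder values.

```python
# Sketch: variance-based patch selection, K-means clustering, class centers and radii.
import numpy as np
from scipy.ndimage import gaussian_filter
from sklearn.cluster import KMeans

def highpass_patches(image, patch=7, delta=1e-3, step=3):
    """Crop patches, keep those with intensity variance above delta,
    and return their high-frequency components (image minus low-pass)."""
    hp = image - gaussian_filter(image, sigma=2.0)
    feats = []
    for r in range(0, image.shape[0] - patch, step):
        for c in range(0, image.shape[1] - patch, step):
            if image[r:r+patch, c:c+patch].var() > delta:
                feats.append(hp[r:r+patch, c:c+patch].ravel())
    return np.array(feats)

rng = np.random.default_rng(3)
S = rng.random((128, 128))                 # stand-in for the pyramid upper layers
G = highpass_patches(S)                    # selected high-frequency patches g_i
K = 8
km = KMeans(n_clusters=K, n_init=10, random_state=0).fit(G)
centers = km.cluster_centers_              # class centers mu_k
radii = np.array([np.max(np.linalg.norm(G[km.labels_ == k] - centers[k], axis=1))
                  for k in range(K)])      # class radii r_k
print(G.shape, centers.shape, radii.shape)
```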

As for the upper-layer images of the pyramid, high-pass filtering is applied to the images of the natural image database. Abundant $n \times n$ image patches are cropped from the natural images, and those whose intensity variances exceed the threshold $\Delta$ are selected. The selected patches are denoted by $\tilde{s}_i$, $i = 1, 2, \ldots, n$, and $\tilde{g}_i$, $i = 1, 2, \ldots, n$, is the high-frequency component of $\tilde{s}_i$. The distance between $\tilde{g}_i$ and each class center $\mu_k$ is computed and denoted by $d_i^k$. If $\min_k(d_i^k) < \delta r_k$, then $\tilde{g}_i$ is added to $C_k$; otherwise the patch is discarded. Here $\delta$ is a parameter controlling how similar a patch must be to a class center. After this operation the samples are expanded, and each expanded class is recorded as $C_k = \{g_1^k, g_2^k, \ldots, g_{q_k}^k\}$, where $q_k$ is the number of samples after expansion.
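The expansion of the clusters with natural-image patches can then be sketched as follows, continuing the hypothetical centers and radii of the previous sketch; $\delta$ and the dimensions are placeholder values.

```python
# Sketch: add a natural-image patch to the nearest cluster only if it is close enough.
import numpy as np

def expand_clusters(g_tilde, centers, radii, delta=1.2):
    """Return the cluster index k for patch g_tilde, or None if it is discarded."""
    d = np.linalg.norm(centers - g_tilde, axis=1)    # distances d_i^k to all centers
    k = int(np.argmin(d))
    return k if d[k] < delta * radii[k] else None    # accept only if within delta * r_k

rng = np.random.default_rng(4)
centers = rng.random((8, 49))        # hypothetical class centers mu_k
radii = np.full(8, 4.0)              # hypothetical class radii r_k
print(expand_clusters(rng.random(49), centers, radii))
```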

Then, corresponding to the high-frequency sample set $C_k$, the image patch sample matrix $S_k = [s_1^k, s_2^k, \ldots, s_{q_k}^k]$, $k = 1, 2, \ldots, K$, is formed. The design of sub-dictionary $\Phi_k$ can be formulated by the following objective function: (16) $(\hat{\Phi}_k, \hat{\Lambda}_k) = \arg\min_{\Phi_k, \Lambda_k}\{\|S_k - \Phi_k\Lambda_k\|_F^2 + \lambda\|\Lambda_k\|_1\}$, where $\Lambda_k$ is the representation coefficient matrix and $\lambda$ is the parameter controlling the degree of sparsity. To reduce the amount of computation, principal component analysis (PCA) is used to compute the principal components of each sub-dataset $S_k$. The orthogonal transformation matrix $P_k$ produced by PCA is taken as the dictionary, and it satisfies $Z_k = P_k^T S_k$, so that $\|S_k - P_k Z_k\|_F^2 = \|S_k - P_k P_k^T S_k\|_F^2 = 0$. To obtain a dictionary $\Phi_r = [p_1, p_2, \ldots, p_r]$, the $r$ most important eigenvectors of $P_k$ are retained; in this way, the $\ell_1$-norm regularization term and the $\ell_2$-norm approximation term in Eq. (16) are better balanced. According to Eq. (16), the reconstruction error $\|S_k - \Phi_r\Lambda_r\|_F^2$ increases as $r$ decreases, while the term $\|\Lambda_r\|_1$ decreases, so the best value $r_0$ of $r$ is defined as: (17) $r_0 = \arg\min_r\{\|S_k - \Phi_r\Lambda_r\|_F^2 + \lambda\|\Lambda_r\|_1\}$, where $\lambda$ is the weight of the $\ell_1$-norm sparsity regularization term. Finally, the sub-dictionary learned from sub-dataset $S_k$ is $\Phi_k = [p_1, p_2, \ldots, p_{r_0}]$, $k = 1, 2, \ldots, K$.
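A minimal sketch of learning one PCA sub-dictionary from a cluster's sample matrix $S_k$ and choosing $r_0$ by the criterion of Eq. (17); $\lambda$ is a placeholder constant and $S_k$ is synthetic.

```python
# Sketch: PCA sub-dictionary for one cluster and selection of r0 via Eq. (17).
import numpy as np

def pca_subdictionary(S_k, lam=0.1):
    """S_k: (dim, q_k) sample matrix of one cluster. Returns Phi_k = first r0 eigenvectors."""
    # Eigenvectors of S_k S_k^T, sorted by decreasing eigenvalue (no centering, as in Eq. (16))
    eigvals, P_k = np.linalg.eigh(S_k @ S_k.T / S_k.shape[1])
    P_k = P_k[:, ::-1]
    best_r, best_cost = 1, np.inf
    for r in range(1, S_k.shape[0] + 1):
        Phi_r = P_k[:, :r]
        Lambda_r = Phi_r.T @ S_k                        # coefficients over Phi_r
        cost = (np.linalg.norm(S_k - Phi_r @ Lambda_r, 'fro') ** 2
                + lam * np.abs(Lambda_r).sum())         # objective of Eq. (17)
        if cost < best_cost:
            best_r, best_cost = r, cost
    return P_k[:, :best_r]

rng = np.random.default_rng(5)
S_k = rng.standard_normal((49, 200))
Phi_k = pca_subdictionary(S_k)
print(Phi_k.shape)
```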

Finally, an adaptive dictionary is selected for each image patch to be reconstructed. To span an adaptive sparse domain, a sub-dictionary is assigned adaptively to each local patch of $x$. Since $X$ is unknown at the beginning, it must be estimated in advance; the top-layer image of the pyramid is taken as the initial estimate of $X$. Let $\hat{X}$ denote the estimate of $X$, $\hat{x}_i$ a local patch of $\hat{X}$, and $\hat{g}_i$ its high-pass filtered version. Among the available clusters, the sub-dictionary best fitted to $\hat{x}_i$ is the one whose centroid $\mu_k$ is closest to $\hat{g}_i$, i.e. (18) $k_i = \arg\min_k \|\hat{g}_i - \mu_k\|_2$
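Sub-dictionary selection for a patch then reduces to Eq. (18); a minimal sketch with hypothetical cluster centers.

```python
import numpy as np

def select_subdictionary(g_hat, centers):
    """Eq. (18): index of the sub-dictionary whose cluster center is nearest to g_hat."""
    return int(np.argmin(np.linalg.norm(centers - g_hat, axis=1)))

rng = np.random.default_rng(7)
centers = rng.random((8, 49))          # hypothetical cluster centers mu_k
print(select_subdictionary(rng.random(49), centers))
```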

3.3. Algorithm flow

To solve Eq. (12), we use the iterative shrinkage algorithm [Citation7]. The algorithm flow chart is shown in Figure 2.

Figure 2. The algorithm flow chart of this paper.

The overall algorithm process is as follows:

  1. Initialization:

    1. From the image degradation process, the down-sampling matrix D and the blurring matrix B are determined;

    2. The sub-dictionaries $\Phi_{k_i}$ are established by PCA from samples drawn from the natural images and from the upper-layer images of the pyramid generated by structural self-similarity;

    3. The weight matrix A of the non-local structure self-similarity is obtained by searching the similar image patches of the same scale in the image;

    4. The termination error $e$, the maximal iteration number Max_Iter, the constant $\eta$ controlling the non-local regularization term, and the parameter-update interval $P$ are set;

  2. Iterate on $k$ until $k \ge \text{Max\_Iter}$ or $\|\hat{X}^{(k)} - \hat{X}^{(k-1)}\|_2^2 / N \le e$ is satisfied (a minimal code sketch of this iteration is given after the list).

    1. The current estimate is updated using Eq. (19): (19) $\hat{X}^{(k+1/2)} = \hat{X}^{(k)} + F^{T}(\tilde{Y} - F\hat{X}^{(k)})$

    2. The sparse representation coefficients are updated using Eqs. (20) and (21): (20) $\alpha_{i,j}^{(k+1/2)} = \big[\Phi_{k_i}^{T} R_i \hat{X}^{(k+1/2)}\big]_j$ (21) $\alpha_{i,j}^{(k+1)} = \mathrm{soft}\big(\alpha_{i,j}^{(k+1/2)}, \tau_{i,j}\big)$, where $\mathrm{soft}(\cdot, \tau_{i,j})$ is the soft-thresholding function with threshold $\tau_{i,j}$: (22) $\mathrm{soft}(x, \tau) = \mathrm{sgn}(x)\max(|x| - \tau, 0)$, where $\mathrm{sgn}(x)$ is the sign function;

    3. Update the current estimate $\hat{X}^{(k+1)} = \Phi\alpha^{(k+1)}$;

    4. If $\mathrm{mod}(k, P) = 0$, the improved estimate $\hat{X}^{(k+1)}$ is used to update the weight matrix $A$ and the adaptive sparse domain of $X$.
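The sketch below is a compact, runnable illustration of the iteration above, reduced to a single orthonormal dictionary and a generic matrix F standing in for [DB; √η(I − A)]; the per-patch sub-dictionaries, the weight matrix A, and the periodic re-estimation of step 2.4 are omitted, and a step size is added to the gradient update for stability (Eq. (19) assumes a normalized F). It is an illustrative skeleton under these assumptions, not the authors' implementation.

```python
# Minimal runnable sketch of the iterative shrinkage steps 2.1-2.3.
import numpy as np

def soft(x, tau):
    """Eq. (22): soft-thresholding."""
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

rng = np.random.default_rng(6)
n = 64
Phi = np.linalg.qr(rng.standard_normal((n, n)))[0]   # orthonormal dictionary
F = rng.standard_normal((32, n)) / np.sqrt(n)        # stand-in for [DB; sqrt(eta)(I - A)]
alpha_true = soft(rng.standard_normal(n), 1.0)       # sparse ground-truth coefficients
Y_tilde = F @ Phi @ alpha_true

X = np.zeros(n)                                      # crude start (paper: pyramid top layer)
tau = 0.01
step = 1.0 / np.linalg.norm(F, 2) ** 2               # step size added for stability
for k in range(500):
    X_half = X + step * F.T @ (Y_tilde - F @ X)      # Eq. (19): gradient step
    alpha = soft(Phi.T @ X_half, tau)                # Eqs. (20)-(21): code and shrink
    X_new = Phi @ alpha                              # step 2.3: X = Phi * alpha
    if np.sum((X_new - X) ** 2) / n <= 1e-10:        # stopping rule of step 2
        break
    X = X_new
print("residual:", np.linalg.norm(Y_tilde - F @ X))
```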

4. Experimental results and analysis

This section presents the results of SR reconstruction of medical images by the proposed method. Three sets of experiments are performed in both noiseless and noisy cases. The three test images (Head1, Head2, and Chest) were acquired in hospital and include MRI and CT images. In the experiments, the degraded LR image is generated by blurring the original image with a 7 × 7 Gaussian kernel of standard deviation 1.6 and then down-sampling it by a factor of 3. The low-resolution color images are first converted to YCbCr format, and the bicubic interpolator is applied to the color components. We compare our method with two state-of-the-art methods: the bicubic interpolation method and the ASDS method [Citation8]. The statistical experimental results are analyzed and discussed to verify the advantages of the proposed method for medical image SR reconstruction. The noiseless results are shown in Figures 3–5 and Table 1, and the noisy results in Figures 6–8 and Table 2.
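A sketch of the color handling described above using scikit-image; applying the SR reconstruction to the luminance channel only is the usual convention assumed here, and `super_resolve` is a placeholder for the reconstruction algorithm (bicubic is substituted in the example).

```python
# Sketch of the experimental color pipeline: SR on luminance, bicubic on chrominance.
import numpy as np
from skimage.color import rgb2ycbcr, ycbcr2rgb
from skimage.transform import resize

def sr_color(lr_rgb, factor=3, super_resolve=None):
    ycbcr = rgb2ycbcr(lr_rgb)
    h, w = ycbcr.shape[0] * factor, ycbcr.shape[1] * factor
    y = super_resolve(ycbcr[..., 0])              # reconstruct luminance (placeholder)
    cb = resize(ycbcr[..., 1], (h, w), order=3)   # bicubic for the color components
    cr = resize(ycbcr[..., 2], (h, w), order=3)
    return ycbcr2rgb(np.stack([y, cb, cr], axis=-1))

# Example with bicubic standing in for the SR step:
lr = np.random.rand(32, 32, 3)
hr = sr_color(lr, super_resolve=lambda ch: resize(ch, (96, 96), order=3))
print(hr.shape)
```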

Figure 3. Comparison of the reconstructed images of various methods for noiseless medical image (1) (a) Original image (b) LR image (c) Bicubic (d) ASDS (e) Our.

Figure 4. Comparison of the reconstructed images of various methods for noiseless medical image (2) (a) Original image (b) LR image (c) Bicubic (d) ASDS (e) Our.

Figure 5. Comparison of the reconstructed images of various methods for noiseless medical image (3) (a) Original image (b) LR image (c) Bicubic (d) ASDS (e) Our.

Figure 6. Comparison of the reconstructed images of various methods for noisy medical image (1) (a) Original image (b) LR image (c) Bicubic (d) ASDS (e) Our.

Figure 7. Comparison of the reconstructed images of various methods for noisy medical image (2) (a) Original image (b) LR image (c) Bicubic (d) ASDS (e) Our.

Figure 8. Comparison of the reconstructed images of various methods for noisy medical image (3).

Table 1. Comparison of PSNR (dB)/SSIM of the medical images (σ=0) based on various methods.

Table 2. Comparison of PSNR (dB)/SSIM of the medical images (σ=5) based on various methods.

Firstly, Figures 3–8 show the reconstructed results for the three images Head1, Head2, and Chest. The texture of the head and the texture of the lungs in the chest can be observed clearly, and blur and noise are greatly reduced. Compared with the bicubic algorithm and the ASDS method, the medical images reconstructed by our method are of better quality and contain more detail.

Secondly, to quantitatively demonstrate the advantages of our method, we compare it with the two other methods on the medical images. The experimental results are evaluated with PSNR and SSIM [Citation9]. A larger PSNR indicates a higher-quality reconstructed image. SSIM measures the structural difference between the reconstructed image and the original HR image; the larger the SSIM, the better the reconstruction. Tables 1 and 2 list the PSNR and SSIM results, from which it can be seen that the method presented in this paper performs well.
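The PSNR/SSIM evaluation can be reproduced with scikit-image's reference implementations; a minimal sketch with synthetic 8-bit images follows (the data range of 255 is an assumption for 8-bit data).

```python
# Sketch: evaluating a reconstruction with PSNR and SSIM.
import numpy as np
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

reference = (np.random.rand(128, 128) * 255).astype(np.uint8)   # stand-in HR image
reconstructed = np.clip(reference + np.random.randint(-5, 6, reference.shape),
                        0, 255).astype(np.uint8)                # stand-in SR result

psnr = peak_signal_noise_ratio(reference, reconstructed, data_range=255)
ssim = structural_similarity(reference, reconstructed, data_range=255)
print(f"PSNR = {psnr:.2f} dB, SSIM = {ssim:.4f}")
```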

In addition, we compare the complexity and running time of the bicubic interpolation method, the ASDS method, and our method. The time complexity is expressed as $T(n) = O(f(n))$, where $f(n)$ is the execution count of the most frequently executed statement in the algorithm. The timing tests are run in Matlab R2014b, and the average running time on a 512 × 512 pixel test image is measured. The results are shown in Table 3.

Table 3. Comparison of complexity.

Furthermore, to fully demonstrate the advantages of our method, we give experimental results on a public image library. We randomly selected 20 high-quality natural images from the sample database. Figure 9 shows the comparison of our method with two state-of-the-art methods: bicubic interpolation and Yang's method. The proposed method achieves the best visual quality; its results are clearer than those of the other algorithms because more details are restored. To quantify this advantage, we again evaluate the results with PSNR and SSIM. Table 4 lists the PSNR and SSIM values of these algorithms, from which it can be seen that our method is more effective than the others: the average PSNR gains over the other two methods are up to 3.4 dB and 2.57 dB, and the average SSIM gains are up to 0.0706 and 0.0455, respectively.

Figure 9. Comparison of the reconstructed images of various methods for Butterfly image (a) Original image (b) Bicubic (c) Yang (d) Our.

Table 4. Comparison of PSNR (dB)/SSIM of the reconstructed images based on various methods.

5. Conclusions

In this paper, we introduce a single-image super-resolution reconstruction method based on adaptive MDL and structural self-similarity, which improves upon the existing adaptive MDL method. In the proposed method, an image pyramid is generated by structural self-similarity and its upper-layer images are used as training samples, which contain more self-similar information across scales than a direct pyramid decomposition of the image itself. In addition, the top-layer image of the pyramid is used as the initial reconstruction, and the non-local structural self-similarity of the image is used as the regularization term, so that self-similar information at the same scale is also incorporated into the reconstruction. Experimental results demonstrate that, compared with several SR reconstruction methods, our method achieves better results in terms of both visual perception and the PSNR and SSIM metrics.

Disclosure statement

No potential conflict of interest was reported by the authors.

Additional information

Funding

This study was sponsored by the National Natural Science Foundation of China (NSFC) (grant no. 61601325), the Tianjin Natural Science Foundation (grant no. 17JCQNJC01400), and the Program for Innovative Research Team in University of Tianjin (grant no. TD13-5034).

References

  • Xie B, Duan ZM, Ma PG, et al. SR reconstruction algorithm of infrared image based on dynamic pyramid model. Infrared and Laser Eng. 2018;001(1):277–282.
  • Pan ZX, Yu J, Hu SX, et al. Super-resolution based on compressive sensing and structural self-similarity for remote sensing images. IEEE Trans Geosci Remote Sens. 2013;51(9):4864–4876.
  • Zhao H, Wei JX, Pang ZH, et al. Wave-front coded super-resolution imaging technique. Infrared and Laser Eng. 2016;45(4):227–236.
  • Yang J, Wright J, Huang TS, et al. Image super-resolution via sparse representation. IEEE Trans Image Process. 2010;19(11):2861.
  • Dong WS, Zhang L, Shi GM, et al. Image deblurring and super-resolution by adaptive sparse domain selection and adaptive regularization. IEEE Trans Image Process. 2011;20(7):1838–1857.
  • Pan ZX, Yu J, Xiao CB, et al. Single image super resolution based on adaptive multi-dictionary learning. Acta Electronica Sinica. 2015;43(2):209–216.
  • Dong WS, Zhang L, Lukac R, et al. Sparse representation based image interpolation with nonlocal autoregressive modeling. IEEE Trans Image Process. 2013;22(4):1382–1394.
  • Zhang A, Guan C, Jiang H, et al. An image super-resolution scheme based on compressive sensing with PCA sparse representation. In: Shi YQ, Kim HJ, editors. The International Workshop on Digital Forensics and Watermarking 2012. Berlin Heidelberg: Springer; 2013.
  • Zhang X, Zhou W, Duan Z. Image super-resolution reconstruction based on fusion of K-SVD and semi-coupled dictionary learning. IEEE Signal and Information Processing Association Summit and Conference; 2017 December 13–16, Jeju, South Korea.