Duality Mapping for Schatten Matrix Norms

Abstract

In this paper, we fully characterize the duality mapping over the space of matrices that are equipped with Schatten norms. Our approach is based on the analysis of the saturation of the Hölder inequality for Schatten norms. We prove in our main result that, for $p \in (1, \infty),$ the duality mapping over the space of real-valued matrices with Schatten-p norm is a continuous and single-valued function and provide an explicit form for its computation. For the special case p = 1, the mapping is set-valued; by adding a rank constraint, we show that it can be reduced to a Borel-measurable single-valued function for which we also provide a closed-form expression.

1. Introduction

In linear algebra and matrix analysis, Schatten norms are a family of spectral matrix norms that are defined via the singular-value decomposition [Citation1]. They have appeared in many applications such as image reconstruction [Citation2, Citation3], image denoising [Citation4], and tensor decomposition [Citation5], to name a few.

Generally, the Schatten-p norm of a matrix is the $ℓ_{p}$ norm of its singular values [Citation6]. The family contains some well-known matrix norms: The Frobenius and the spectral (operator) norms are special cases in the family, with p = 2 and $p = \infty,$ respectively. The case p = 1 (trace or nuclear norm) is of particular interest for applications as it can be used to recover low-rank matrices [Citation7]. This is the current paradigm in matrix completion, where the goal is to recover an unknown matrix given some of its entries [Citation8]. Prominent examples of applications that can be reduced to low-rank matrix-recovery problems are phase retrieval [Citation9], sensor-array processing [Citation10], system identification [Citation11], and index coding [Citation12, Citation13].

In addition to their many applications in data science, Schatten norms have been extensively studied from a theoretical point of view. Various inequalities concerning Schatten norms have been proven [Citation14–22]; sharp bounds for commutators in Schatten spaces have been given [Citation23, Citation24]; moreover, facial structure [Citation25], Fréchet differentiability [Citation26], and various other aspects [Citation27, Citation28] have already been studied.

Our objective in this paper is to investigate the duality mapping in spaces of matrices that are equipped with Schatten norms. The duality mapping is a powerful tool to understand the topological structure of Banach spaces [Citation29, Citation30]. It has been used to derive powerful characterizations of the solution of variational problems in function spaces [Citation31, Citation32] and also to determine generalized linear inverse operators [Citation33]. Here, we prove that the duality mapping over Schatten-p spaces with $p \in (1, + \infty)$ is a single-valued and continuous function which, in fact, highlights the strict convexity of these spaces. Although the provided characterization is intuitive, we could not find it in the literature and this is, to the best of our knowledge, the first work which provides a direct way of computing this mapping in this case. For the special case p = 1, the mapping is set-valued. However, we prove that, by adding a rank constraint, it reduces to a single-valued Borel-measurable function. In both cases, we also derive closed-form expressions that allow one to compute them explicitly.

The paper is organized as follows: In Section 2, we present relevant mathematical tools and concepts that are used in this paper. We study the duality mapping of Schatten spaces and propose our main result in Section 3. We provide further discussions regarding the introduced mappings in Section 4.

2. Preliminaries

2.1. Dual norms, Hölder inequality, and duality mapping

Let V be a finite-dimensional vector space that is equipped with an inner product $〈 \cdot, \cdot 〉 : V \times V \to R$ and let $| | \cdot | |_{X} : V \to R_{\geq 0}$ be an arbitrary norm on V. We then denote by X the space V equipped with $| | \cdot | |_{X} .$ Clearly, X is a Banach space, because all finite-dimensional normed spaces are complete. The dual norm of X, denoted by $| | \cdot | |_{X'} : V \to R_{\geq 0},$ is defined as $| | v | |_{X'} = \sup_{u \in V \setminus \{0\}} \frac{〈 v, u 〉}{| | u | |_{X}},$ (1) for any $v \in V .$ From this definition, one directly obtains the generic duality bound $〈 v, u 〉 \leq | | v | |_{X'} | | u | |_{X},$ (2) for any $v, u \in V .$ The saturation of Inequality (2) is the key concept behind dual conjugates, as formalized in the following definition.

Definition 1.

Let V be a finite-dimensional vector space and let $(| | \cdot | |_{X}, | | \cdot | |_{X'})$ be a pair of dual norms that are defined over V. The pair $(u, v) \in V \times V$ is said to be a $(X, X')$ -conjugate, if

$〈 v, u 〉 = | | v | |_{X'} | | u | |_{X},$
$| | v | |_{X'} = | | u | |_{X} .$

For any $u \in V,$ the set of all elements $v \in V$ such that $(u, v)$ forms an $(X, X')$ -conjugate is denoted by $J_{X} (u) \subseteq V .$ We refer to the set-valued mapping $J_{X} : V \to 2^{V}$ as the duality mapping. If, for all $u \in V,$ the set $J_{X} (u)$ is a singleton, then, with a slight abuse of notation, we also denote by $J_{X} : V \to V$ the single-valued function that maps $u$ to the unique element of this set.

It is worth mentioning that, for any $u \in V,$ the set $J_{X} (u)$ is nonempty. In fact, the closed ball $B = \{v \in V : | | v | |_{X'} \leq | | u | |_{X}\}$ is compact and, hence, the function $v \mapsto 〈 v, u 〉$ attains its maximum value at some $v^{*} \in B .$ Now, following Definition 1, one readily verifies that $(u, v^{*})$ is an $(X, X')$ -conjugate.

We conclude this part by providing a classical and illustrative example. Let $V = R^{n}$ for some $n \in N .$ For any $p \in [1, + \infty],$ the $ℓ_{p}$ -norm of a vector $u = (u_{i}) \in R^{n}$ is defined as $| | u | |_{p} = \begin{cases} {(\sum_{i = 1}^{n} | u_{i} |^{p})}^{\frac{1}{p}}, & p < + \infty \\ \max_{i} | u_{i} |, & p = + \infty . \end{cases}$ (3)

It is widely known that the dual norm of $ℓ_{p}$ is the $ℓ_{q}$ -norm, where (p, q) are Hölder conjugates (i.e., $1 / p + 1 / q = 1$ ) [Citation34]. This stems from the Hölder inequality, which states that $〈 v, u 〉 \leq | | u | |_{p} | | v | |_{q},$ (4) for all $u = (u_{i}), v = (v_{i}) \in R^{n} .$ In the sequel, we exclude the trivial cases $u = 0$ and $v = 0$ to avoid unnecessary complexities in our statements.

When $1 < p < + \infty,$ Inequality (4) is saturated if and only if $u_{i} v_{i} \geq 0$ for $i = 1, \dots, n$ and there exists a constant c > 0 such that $| u |^{p} = c | v |^{q},$ where $| u |^{p} = (| u_{i} |^{p}) .$ This ensures that the duality mapping is single-valued and also yields the map $J_{p} (u) = sign (u) \frac{| u |^{p - 1}}{| | u | |_{p}^{p - 2}},$ (5) where the sign, the absolute value, and the powers are applied entrywise.
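Formula (5) is easy to check numerically. The following is a minimal sketch in plain Python, not code from the paper; the helper names `lp_norm` and `duality_map_lp` are hypothetical, and the zero vector is mapped to zero by convention. It evaluates $J_{p} (u)$ and tests the two conditions of Definition 1 up to rounding.

```python
import math

def lp_norm(u, p):
    """l_p norm of a sequence of floats, for 1 <= p <= inf."""
    if math.isinf(p):
        return max(abs(x) for x in u)
    return sum(abs(x) ** p for x in u) ** (1.0 / p)

def duality_map_lp(u, p):
    """Entrywise formula (5), intended for 1 < p < inf (illustrative helper)."""
    norm = lp_norm(u, p)
    if norm == 0.0:
        # Convention used in this sketch: the zero vector maps to zero.
        return [0.0 for _ in u]
    scale = norm ** (p - 2)
    # copysign restores the sign of u_i on |u_i|^(p-1)
    return [math.copysign(abs(x) ** (p - 1), x) / scale for x in u]

p = 3.0
q = p / (p - 1.0)              # Hölder conjugate of p
u = [1.0, -2.0, 0.0, 0.5]
v = duality_map_lp(u, p)

inner = sum(a * b for a, b in zip(u, v))
# Saturation of (4) and equality of the two norms, up to rounding:
print(abs(inner - lp_norm(u, p) * lp_norm(v, q)) < 1e-12)
print(abs(lp_norm(v, q) - lp_norm(u, p)) < 1e-12)
```

For p = 2 the scaling factor equals one and the map reduces to the identity, as expected in a Hilbert space.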

For p = 1, one can verify that equality holds if and only if, for any index $i = 1, \dots, n$ with $u_{i} \neq 0,$ one has that $v_{i} = sign (u_{i}) | | v | |_{\infty} .$ (6)

In other words, the vector $v$ should attain its extreme values at places where $u$ has nonzero values, with the sign being determined by the corresponding element in $u .$

Due to (6), the set $J_{1} (u)$ is not necessarily a singleton. However, if we add a sparsity constraint, then the mapping becomes single-valued. This leads us to introduce the new notion of sparse duality mapping in Definition 2.

Definition 2.

Let V be a finite-dimensional vector space and let $s_{0} : V \to N$ be an integer-valued function that acts as a sparsity measure. Assuming a pair $(| | \cdot | |_{X}, | | \cdot | |_{X'})$ of dual norms over V, we call the pair $(u, v) \in V \times V$ a sparse $(X, X')$ -conjugate if

$(u, v)$ forms an $(X, X')$ -conjugate pair. In other words, $v \in J_{X} (u) .$
The quantity $s_{0} (v)$ attains its minimal value over the set $J_{X} (u) .$

We denote the set of sparse conjugates of $u$ by $J_{X, s_{0}} (u) .$ Whenever $J_{X, s_{0}} (u)$ is a singleton for any $u \in V,$ we refer to the single-valued function $J_{X, s_{0}} : V \to V$ that maps $u$ to the unique element of this set as the sparse duality mapping.

Following Definition 2, if we use the $ℓ_{0}$ -norm as the sparsity measure, that is, $s_{0} (u) = | | u | |_{0} = Card (\{i : u_{i} \neq 0\}),$ then we have the sparse duality mapping $J_{1, 0} : R^{n} \to R^{n} : u = (u_{i}) \mapsto v = (v_{i}) = J_{1, 0} (u),$ with $v_{i} = \begin{cases} sign (u_{i}) | | u | |_{1}, & u_{i} \neq 0 \\ 0, & u_{i} = 0. \end{cases}$ (7)
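To make the role of the sparsity constraint concrete, here is a small sketch in plain Python (the function names are hypothetical and chosen for illustration only). It evaluates (7) and compares the result with a second, denser element of $J_{1} (u)$ that also satisfies Definition 1.

```python
import math

def sparse_duality_map_l1(u):
    """Formula (7): sign(u_i) * ||u||_1 on the support of u, zero elsewhere."""
    norm1 = sum(abs(x) for x in u)
    return [math.copysign(norm1, x) if x != 0 else 0.0 for x in u]

def is_conjugate_l1(u, v, tol=1e-12):
    """Check both conditions of Definition 1 for the (l_1, l_inf) pair."""
    norm_u = sum(abs(x) for x in u)
    norm_v = max(abs(x) for x in v)
    inner = sum(a * b for a, b in zip(u, v))
    return abs(inner - norm_u * norm_v) <= tol and abs(norm_u - norm_v) <= tol

u = [2.0, 0.0, -1.0]
v_sparse = sparse_duality_map_l1(u)     # supported where u is nonzero
v_dense = [3.0, 1.5, -3.0]              # free entry where u_i = 0, bounded by ||u||_1

print(v_sparse, is_conjugate_l1(u, v_sparse))
print(v_dense, is_conjugate_l1(u, v_dense))
```

Both vectors are conjugates of `u`, but only `v_sparse` minimizes the number of nonzero entries, which is what singles it out in Definition 2.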

Finally, we mention that, for p = + ∞, the reduced set $J_{\infty, 0}$ is not single-valued. Indeed, let us define $I_{\max} (u) = \{i : | u_{i} | = | | u | |_{\infty}\} \subseteq \{1, \dots, n\} .$ We readily deduce from (6) that $v = (v_{1}, \dots, v_{n}) \in J_{\infty} (u)$ if and only if $v_{i} = 0$ whenever $i \notin I_{\max} (u)$ and $sign (v_{i}) = sign (u_{i})$ for $i \in I_{\max} (u)$ with $\sum_{i \in I_{\max} (u)} | v_{i} | = | | u | |_{\infty} .$ This shows that $J_{\infty} (u)$ is a convex set with $J_{\infty, 0} (u)$ being its extreme points, where $J_{\infty, 0} (u) = \{u_{i} e_{i} : i \in I_{\max} (u)\} .$
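The lack of uniqueness for p = + ∞ can be seen on a three-dimensional example. The sketch below is plain Python with hypothetical helper names. It lists the extreme points $u_{i} e_{i}$ and checks that they, as well as their midpoint, satisfy Definition 1 for the $(ℓ_{\infty}, ℓ_{1})$ pair.

```python
def extreme_conjugates_linf(u):
    """Extreme points u_i e_i, i in I_max(u), of the set J_inf(u)."""
    top = max(abs(x) for x in u)
    points = []
    for i, x in enumerate(u):
        if abs(x) == top:
            e = [0.0] * len(u)
            e[i] = x
            points.append(e)
    return points

def is_conjugate_linf(u, v, tol=1e-12):
    """Check both conditions of Definition 1 for the (l_inf, l_1) pair."""
    norm_u = max(abs(x) for x in u)
    norm_v = sum(abs(x) for x in v)
    inner = sum(a * b for a, b in zip(u, v))
    return abs(inner - norm_u * norm_v) <= tol and abs(norm_u - norm_v) <= tol

u = [2.0, -2.0, 1.0]                    # the maximum modulus is attained twice
points = extreme_conjugates_linf(u)
midpoint = [0.5 * (a + b) for a, b in zip(points[0], points[1])]

print(points, [is_conjugate_linf(u, v) for v in points])
print(midpoint, is_conjugate_linf(u, midpoint))
```

Two distinct sparse conjugates appear as soon as the maximum modulus is attained at more than one index, which is why no single-valued sparse mapping exists in this case.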

2.2. Schatten p-norm

It is widely known that any matrix $A \in R^{m \times n}$ can be decomposed as $A = U S V^{T},$ (8) where $U \in R^{m \times m}$ and $V \in R^{n \times n}$ are orthogonal matrices and $S$ is an m by n rectangular diagonal matrix with nonnegative real entries $σ_{1} \geq σ_{2} \geq \dots \geq σ_{\min (m, n)} \geq 0$ sorted in descending order [Citation35]. In the literature, (8) is known as the singular-value decomposition (SVD) and the entries $σ_{i}$ are the singular values of $A .$ In general, the SVD of a matrix $A$ is not unique. However, the diagonal matrix $S$ and, consequently, its entries, are fully determined by $A .$ In other words, the values of $σ_{i}$ do not depend on the specific choice of decomposition. This is why one can refer to the diagonal entries of $S$ as the “singular values” of $A .$

When $A$ is not full rank, one can obtain a reduced version of (8). Indeed, if we denote the rank of $A$ by r, then we have that $A = U_{r} S_{r} V_{r}^{T},$ (9) where $U_{r} \in R^{m \times r}$ and $V_{r} \in R^{n \times r}$ are (sub)-orthogonal matrices such that $U_{r}^{T} U_{r} = V_{r}^{T} V_{r} = I_{r}$ and $S_{r} = diag (σ)$ is a diagonal matrix that contains the positive singular values $σ = (σ_{1}, \dots, σ_{r}) \in R^{r}$ of $A .$

For any $p \in [1, + \infty],$ the Schatten-p norm of $A$ is defined as $| | A | |_{S_{p}} = \begin{cases} {(\sum_{i = 1}^{r} σ_{i}^{p})}^{\frac{1}{p}}, & p < + \infty \\ σ_{1}, & p = + \infty . \end{cases}$ (10)
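Definition (10) translates directly into a few lines of code. The sketch below assumes NumPy, which the paper itself does not use, and the function name `schatten_norm` is illustrative. The test matrix has singular values 4 and 3, so the three printed values should be close to 7, 5, and 4.

```python
import numpy as np

def schatten_norm(A, p):
    """Schatten-p norm (10): the l_p norm of the singular values of A."""
    sigma = np.linalg.svd(np.asarray(A, dtype=float), compute_uv=False)
    if np.isinf(p):
        return float(sigma[0])      # singular values are returned in descending order
    return float(np.sum(sigma ** p) ** (1.0 / p))

A = np.array([[3.0, 0.0],
              [0.0, 4.0],
              [0.0, 0.0]])

print(schatten_norm(A, 1), schatten_norm(A, 2), schatten_norm(A, np.inf))

# Cross-check against NumPy's built-in nuclear, Frobenius, and spectral norms.
print(np.linalg.norm(A, "nuc"), np.linalg.norm(A, "fro"), np.linalg.norm(A, 2))
```

Zero singular values contribute nothing to the sum, so using all $\min (m, n)$ singular values instead of only the first r gives the same result.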

We remark that (10) defines a family of quasi-norms for $p \in (0, 1) .$ In the extreme case p = 0, the Schatten-0 norm actually coincides with the rank of the matrix, i.e., $| | A | |_{S_{0}} = rank (A) .$ The Schatten quasi-norms have also been studied in the literature from both theoretical and practical points of view (see [Citation36–39] and references therein).

3. Duality mapping in Schatten spaces

For any $p \in [1, \infty],$ the dual of the Schatten-p norm is the Schatten-q norm, where $q \in [1, \infty]$ is such that $\frac{1}{p} + \frac{1}{q} = 1$ [Citation1]. This is due to the generalized version of Hölder’s inequality for Schatten norms, as stated in Proposition 1. While this is a known result (see, for example, [Citation40]), it is also the basis for the present work, which is why we provide a proof in Appendix A.

Proposition 1.

For any pair $(p, q) \in {[1, + \infty]}^{2}$ of Hölder conjugates with $\frac{1}{p} + \frac{1}{q} = 1$ and any pair of matrices $A, B \in R^{m \times n}$ , we have that $〈 A, B 〉 = Tr (A^{T} B) \leq | | A | |_{S_{p}} | | B | |_{S_{q}} .$ (11)
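A quick numerical sanity check of (11) on random matrices can be written as follows. This is only a sketch under the assumption that NumPy is available; the helper `schatten_norm` and the chosen exponent pairs are illustrative, and a random test of course does not replace the proof.

```python
import numpy as np

def schatten_norm(A, p):
    """Schatten-p norm: l_p norm of the singular values (p may be np.inf)."""
    sigma = np.linalg.svd(A, compute_uv=False)
    return float(np.linalg.norm(sigma, p))

rng = np.random.default_rng(0)
pairs = [(1.0, np.inf), (1.5, 3.0), (2.0, 2.0), (4.0, 4.0 / 3.0)]  # Hölder conjugates

worst_gap = np.inf
for p, q in pairs:
    for _ in range(200):
        A = rng.standard_normal((4, 3))
        B = rng.standard_normal((4, 3))
        # Right-hand side minus left-hand side of (11); should never be negative.
        gap = schatten_norm(A, p) * schatten_norm(B, q) - np.trace(A.T @ B)
        worst_gap = min(worst_gap, gap)

print(worst_gap >= 0.0)
```

For independent random matrices the gap is typically far from zero; saturation requires the very specific alignment of singular vectors that Proposition 2 describes.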

In Proposition 2, we investigate the case where the Hölder inequality is saturated, in the sense that $Tr (A^{T} B) = | | A | |_{S_{p}} | | B | |_{S_{q}} .$ (12)

This saturation is central to our work, as it is tightly linked to the notion of duality mapping.

Proposition 2.

Let (p, q) be a pair of Hölder conjugates and let $A, B \in R^{m \times n}$ be a pair of nonzero matrices with reduced SVDs of the form $A = U_{r} diag (σ) V_{r}^{T}, \quad B = {\tilde{U}}_{\tilde{r}} diag (\tilde{σ}) {\tilde{V}}_{\tilde{r}}^{T} .$ (13)

If $p \in (1, \infty)$ , then the Hölder inequality is saturated if and only if we have that $B = c U_{r} diag (J_{p} (σ)) V_{r}^{T}$ (14) or, equivalently, $A = c^{- 1} {\tilde{U}}_{\tilde{r}} diag (J_{q} (\tilde{σ})) {\tilde{V}}_{\tilde{r}}^{T},$ (15)
where $c = \frac{| | B | |_{S_{q}}}{| | A | |_{S_{p}}}$ and $J_{p} (\cdot)$ and $J_{q} (\cdot)$ are the duality mappings for the $ℓ_{p}$ and $ℓ_{q}$ norms, respectively (see (5)).

If p = 1, then a necessary condition for the saturation of the Hölder inequality is that $rank (A) \leq r_{1} \leq rank (B),$ (16)

where $r_{1} = Card (\{i : {\tilde{σ}}_{i} = {\tilde{σ}}_{1}\})$ is the multiplicity of the first singular value of $B$ . Moreover, if we denote the first $r_{1}$ singular vectors of $B$ in (13) by ${\tilde{U}}_{1} \in R^{m \times r_{1}}$ and ${\tilde{V}}_{1} \in R^{n \times r_{1}}$ , then the Hölder inequality is saturated if and only if there exists a symmetric matrix $X \in R^{r_{1} \times r_{1}}$ such that $A = {\tilde{U}}_{1} X {\tilde{V}}_{1}^{T} .$ (17)

Finally, in the rank-equality case $rank (A) = rank (B)$ , we have saturation if and only if $B = c U_{r} V_{r}^{T},$ (18)

where $c = | | B | |_{S_{\infty}}$ and the matrices $U_{r}$ and $V_{r}$ are defined in (13).

Remark 1.

Note that even though the reduced SVD is not unique (i.e., there are multiple choices for the sub-orthogonal matrices in (13)), the parametric forms given in Proposition 2 do not depend on a specific decomposition and the results are invariant to any arbitrary choice of these reduced SVDs, primarily due to the “only if” parts of the statements.

The proof of Proposition 2 can be found in Appendix B. We observe that, in the case $p \in (1, \infty),$ the saturation of the Hölder inequality provides a very tight link between the two matrices: If we know one of them, then the other lies on a one-dimensional ray that is parameterized by the constant c > 0. However, in the special case p = 1, the identification is not as simple. There again, for a given matrix $B,$ one can fully characterize the set of admissible matrices $A .$ However, for the reverse direction, an additional rank-equality constraint is essential to reduce the set of admissible matrices $B$ to just one ray.

Inspired by Proposition 2, we now propose our main result in Theorem 1, where we explicitly characterize the duality mapping for the Schatten p-norms. The proof of Theorem 1 can be found in Appendix C.

Theorem 1.

Let $p, q \in [1, + \infty]$ be a pair of Hölder conjugates with $\frac{1}{p} + \frac{1}{q} = 1$ and $A \in R^{m \times n}$ a matrix whose reduced SVD is specified in (9).

If $1 < p < + \infty$ , then the single-valued duality mapping $J_{S_{p}} : R^{m \times n} \to R^{m \times n}$ is well-defined and can be expressed as $J_{S_{p}} : A = U_{r} diag (σ) V_{r}^{T} \mapsto A^{*} = U_{r} diag (J_{p} (σ)) V_{r}^{T} .$ (19)
If p = 1 and if we consider the rank function as the sparsity measure in Definition 2, then the sparse duality mapping $J_{S_{1}, rank} : R^{m \times n} \to R^{m \times n}$ is well-defined (singleton) and is given as $J_{S_{1}, rank} : A = U_{r} diag (σ) V_{r}^{T} \mapsto A^{*} = | | σ | |_{1} U_{r} V_{r}^{T} .$ (20)
If p = + ∞, then the set-valued mapping $J_{S_{\infty}} (\cdot)$ can be described as $J_{S_{\infty}} (A) = \{σ_{1} U_{1} X V_{1}^{T} : X \in R^{r_{1} \times r_{1}} \text{ is symmetric and } | | X | |_{S_{1}} = 1\},$ (21)

where $r_{1}$ denotes the multiplicity of the first singular value $σ_{1}$ of $A$ and $U_{1}, V_{1}$ are singular vectors that correspond to $σ_{1}$ in (9). Finally, the set of sparse dual conjugates is the collection of rank-1 elements of $J_{S_{\infty}} (A),$ which can be characterized as $J_{S_{\infty}, rank} (A) = \{σ_{1} U_{1} p p^{T} V_{1}^{T} : p \in R^{r_{1}}, | | p | |_{2} = 1\} .$ (22)
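The closed forms (19) and (20) can be evaluated with any SVD routine. The following is a sketch under stated assumptions rather than a reference implementation: it relies on NumPy, the function names are hypothetical, and the numerical rank is decided with a relative tolerance `rel_tol` that is a choice of this example, not of the paper.

```python
import numpy as np

def schatten_norm(A, p):
    """Schatten-p norm: l_p norm of the singular values (p may be np.inf)."""
    sigma = np.linalg.svd(A, compute_uv=False)
    return float(np.linalg.norm(sigma, p))

def reduced_svd(A, rel_tol=1e-12):
    """Reduced SVD (9); singular values below rel_tol * sigma_1 count as zero."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    r = int(np.sum(s > rel_tol * s[0])) if s[0] > 0 else 0
    return U[:, :r], s[:r], Vt[:r, :]

def duality_map_schatten(A, p):
    """Formula (19), intended for 1 < p < inf."""
    U, s, Vt = reduced_svd(A)
    if s.size == 0:
        return np.zeros_like(A, dtype=float)
    j = s ** (p - 1) / np.linalg.norm(s, p) ** (p - 2)   # J_p applied to sigma
    return (U * j) @ Vt                                  # U diag(j) V^T

def sparse_duality_map_schatten1(A):
    """Formula (20): the minimal-rank conjugate for p = 1."""
    U, s, Vt = reduced_svd(A)
    return s.sum() * (U @ Vt)

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 2)) @ rng.standard_normal((2, 3))   # a rank-2 matrix

p = 3.0
q = p / (p - 1.0)
A_star = duality_map_schatten(A, p)
gap_p = np.trace(A.T @ A_star) - schatten_norm(A, p) * schatten_norm(A_star, q)

A_one = sparse_duality_map_schatten1(A)
gap_1 = np.trace(A.T @ A_one) - schatten_norm(A, 1) * schatten_norm(A_one, np.inf)

# Both gaps should vanish up to rounding, i.e., the Hölder inequality is saturated.
print(abs(gap_p) < 1e-9, abs(gap_1) < 1e-9)
```

The rank threshold matters for (20): the factor $U_{r} V_{r}^{T}$ gives equal weight to every retained singular direction, so counting a numerically zero singular value as positive would change the output substantially. This sensitivity is the numerical counterpart of the discontinuity of $J_{S_{1}, rank} .$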

4. Discussion

Theorem 1 provides an interesting characterization of the duality mapping in three scenarios: The first case is $1 < p < + \infty,$ which is the most straightforward one. Theorem 1 tells us that the mapping is single-valued and also gives a formula to compute the dual conjugate $A^{*}$ of any matrix $A \in R^{m \times n} .$ We use this result to deduce the continuity of the duality mapping as well as the strict convexity of the Schatten space in this case (see Corollary 1). In the second case, with p = 1, the mapping is not single-valued. However, there is a unique element in the set of dual conjugates with the minimal rank (that is equal to the rank of $A$ ) and, hence, we can construct a single-valued sparse duality mapping. Finally, we showed in the third case, characterized by p = + ∞, that neither the dual conjugate nor the dual conjugate of minimal rank is unique in general.

In Corollary 1, we highlight some consequences of Theorem 1 concerning the strict convexity of Schatten spaces and the continuity of the duality mapping.

Corollary 1.

The Banach space of m by n matrices equipped with the Schatten-p norm is strictly convex, if and only if $p \in (1, + \infty)$ . In this case, the function $J_{S_{p}} : R^{m \times n} \to R^{m \times n}$ is continuous.

Proof.

For $p \in (1, + \infty),$ we know from Theorem 1 that the duality mapping $J_{S_{p}}$ is bijective. Moreover, it is known that all finite-dimensional Banach spaces are reflexive. Now, following [Citation41], we deduce the strict convexity of the space of m by n matrices with Schatten-p norm.

For p = 1 and p = + ∞, we can readily verify that $\begin{matrix} {| | α (\begin{matrix} 1 & \dots & 0 \\ ⋮ & ⋱ & ⋮ \\ 0 & \dots & 0 \end{matrix}) + (1 - α) (\begin{matrix} 0 & \dots & 0 \\ ⋮ & ⋱ & ⋮ \\ 0 & \dots & 1 \end{matrix}) | |}_{S_{1}} = {| | (\begin{matrix} α & \dots & 0 \\ ⋮ & ⋱ & ⋮ \\ 0 & \dots & (1 - α) \end{matrix}) | |}_{S_{1}} = 1, \\ {| | α (\begin{matrix} 1 & \dots & 0 \\ ⋮ & ⋱ & ⋮ \\ 0 & \dots & 0 \end{matrix}) + (1 - α) (\begin{matrix} 1 & \dots & 0 \\ ⋮ & ⋱ & ⋮ \\ 0 & \dots & 1 \end{matrix}) | |}_{S_{\infty}} = {| | (\begin{matrix} 1 & \dots & 0 \\ ⋮ & ⋱ & ⋮ \\ 0 & \dots & (1 - α) \end{matrix}) | |}_{S_{\infty}} = 1, \end{matrix}$ for all $α \in (0, 1),$ which shows that the Schatten space is not strictly convex for $p = 1, + \infty .$

Finally, the Schatten-p norm is known to be Fréchet differentiable for $p \in (1, + \infty)$ [Citation26]. Moreover, the duality mapping of any Banach space with Fréchet-differentiable norms is guaranteed to be continuous [Citation42, Citation43]. Combining the two statements, we deduce the continuity of the duality mapping in this case. □
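The two identities used in the proof for p = 1 and p = + ∞ can be reproduced numerically. The sketch below assumes NumPy and works with 3 by 3 matrices (an arbitrary choice); it also evaluates the Frobenius norm (p = 2) on the same convex combination for contrast.

```python
import numpy as np

n = 3
E_first = np.zeros((n, n))
E_first[0, 0] = 1.0            # single 1 in the top-left corner
E_last = np.zeros((n, n))
E_last[-1, -1] = 1.0           # single 1 in the bottom-right corner

results = []
for alpha in (0.25, 0.5, 0.75):
    # Convex combination of two unit-norm matrices in the Schatten-1 norm.
    nuclear = np.linalg.norm(alpha * E_first + (1 - alpha) * E_last, "nuc")
    # Convex combination of two unit-norm matrices in the Schatten-infinity norm.
    spectral = np.linalg.norm(alpha * E_first + (1 - alpha) * (E_first + E_last), 2)
    # Same combination as in the first line, measured in the Schatten-2 norm.
    frobenius = np.linalg.norm(alpha * E_first + (1 - alpha) * E_last, "fro")
    results.append((alpha, nuclear, spectral, frobenius))

for row in results:
    print(row)
```

The nuclear and spectral norms stay equal to one along the whole segment, so the unit sphere contains a line segment in both cases, whereas the Frobenius norm of the combination drops strictly below one.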

By contrast, the sparse duality mapping $J_{S_{1}, rank} (\cdot)$ is not continuous. This is best explained by providing a counterexample. Specifically, let us consider the sequence of 2 by 2 matrices $S_{k} = (\begin{matrix} 1 & 0 \\ 0 & \frac{1}{k} \end{matrix}), k \in N .$

It is clear that $S_{k} \to S_{\infty} = (\begin{matrix} 1 & 0 \\ 0 & 0 \end{matrix}) .$ However, (20) gives $J_{S_{1}, rank} (S_{k}) = (1 + \frac{1}{k}) (\begin{matrix} 1 & 0 \\ 0 & 1 \end{matrix})$ for all $k \in N,$ which converges to the identity matrix, while $J_{S_{1}, rank} (S_{\infty}) = (\begin{matrix} 1 & 0 \\ 0 & 0 \end{matrix}) .$ This shows the discontinuity of $J_{S_{1}, rank}$ in the space of 2 by 2 matrices. The construction generalizes to spaces of matrices with arbitrary dimensions $m, n \in N .$
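The counterexample can be traced numerically. The sketch below assumes NumPy and uses a hypothetical helper that evaluates (20) with a relative rank tolerance (a choice of this example). By (20), the image of $S_{k}$ is $(1 + \frac{1}{k})$ times the identity, so its distance to the image of $S_{\infty}$ never drops below one even though $S_{k}$ approaches $S_{\infty} .$

```python
import numpy as np

def sparse_duality_map_schatten1(A, rel_tol=1e-12):
    """Formula (20): ||sigma||_1 * U_r V_r^T, with a relative rank threshold."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    r = int(np.sum(s > rel_tol * s[0])) if s[0] > 0 else 0
    return s[:r].sum() * (U[:, :r] @ Vt[:r, :])

S_inf = np.diag([1.0, 0.0])
J_inf = sparse_duality_map_schatten1(S_inf)

distances = []
for k in (1, 10, 100, 1000):
    S_k = np.diag([1.0, 1.0 / k])
    J_k = sparse_duality_map_schatten1(S_k)
    # (distance between inputs, distance between outputs), both in Frobenius norm
    distances.append((np.linalg.norm(S_k - S_inf), np.linalg.norm(J_k - J_inf)))

for d_in, d_out in distances:
    print(d_in, d_out)
```

The first column shrinks like 1/k while the second stays above one, which is the jump described in the text.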

Although $J_{S_{1}, rank}$ is not continuous, we now show that it is Borel-measurable and, hence, that it can be approximated with arbitrary precision by a continuous mapping due to Lusin’s theorem [Citation34].

Proposition 3.

For any $m, n \in N$ , the sparse duality mapping $J_{S_{1}, rank}$ is a Borel-measurable matrix-valued function over the space of m by n matrices.

Before going into the proof of Proposition 3, we present a preliminary result.

Lemma 1.

The set $R_{r} \subseteq R^{m \times n}$ of m by n matrices of rank r is Borel-measurable.

Proof.

First note that $R_{1} = \{u v^{T} : u \in R^{m}, v \in R^{n}\} .$

The set $R_{1}$ is the image of the continuous mapping $R^{m} \times R^{n} \to R^{m \times n} : (u, v) \mapsto u v^{T}$ and, hence, is Borel-measurable.

Now, denote by $R_{\leq r} \subseteq R^{m \times n}$ the set of matrices with rank no more than r. Using the identity $R_{\leq r} = R_{1} + \dots + R_{1}$ (r times), we deduce that $R_{\leq r}$ and, consequently, $R_{r} = R_{\leq r} \setminus R_{\leq (r - 1)}$ are also Borel-measurable sets. □

Proof of Proposition 3.

Consider a Borel-measurable set $\mathcal{B} \subseteq R^{m \times n} .$ We show that $\mathcal{B}_{inv} = J_{S_{1}, rank}^{- 1} (\mathcal{B})$ is also Borel-measurable. By defining $\mathcal{B}_{inv, r} = \mathcal{B}_{inv} \cap R_{r},$ we can partition $\mathcal{B}_{inv}$ as $\mathcal{B}_{inv} = \cup_{r = 1}^{\min (m, n)} \mathcal{B}_{inv, r} .$

Hence, it is sufficient to show that each part $\mathcal{B}_{inv, r}$ is Borel-measurable.

Define the set $P_{r} \subseteq R_{r}^{2}$ as $P_{r} = \{(A, B) \in R_{r} \times \mathcal{B} : Tr (A^{T} B) = | | A | |_{S_{1}} | | B | |_{S_{\infty}}, | | A | |_{S_{1}} = | | B | |_{S_{\infty}}\} .$

The set $P_{r}$ introduces a relation over $R_{r}$ whose domain is $\mathcal{B}_{inv, r} .$ In other words, we have that $\mathcal{B}_{inv, r} = \{A \in R_{r} : \exists B \in \mathcal{B}, (A, B) \in P_{r}\} .$

Since the trace and the norms are continuous (and, consequently, Borel-measurable) functions and $R_{r} \times \mathcal{B}$ is a Borel-measurable set (using Lemma 1), we deduce that the relation induced by $P_{r}$ is Borel-measurable as well. Finally, we use [Citation44, Proposition 2.1] to show that its domain is Borel-measurable. □

5. Conclusion

In this paper, we studied the duality mapping in finite-dimensional Schatten spaces. Based on a careful investigation of the cases where the Hölder inequality saturates, we provided an explicit form for this mapping when $p \in (1, + \infty) .$ Furthermore, by adding a rank constraint, we proved that the mapping becomes single-valued for the special case p = 1. As for p = + ∞, we showed that the mapping yields a convex set whose elements are explicitly characterized. Finally, we discussed our theorem and studied the continuity of the introduced mappings as well as the strict convexity of the Schatten spaces. A possible future direction of research is to extend the results of this paper to infinite-dimensional Schatten spaces and even, in full generality, to linear operators over Hilbert spaces.

A. Proof of Proposition 1

Proof.

Let us recall the reduced SVD of the matrix $A$ as $A = U_{r} S_{r} V_{r}^{T},$ (23) where $r = rank (A), U_{r} = [u_{1} \dots u_{r}] \in R^{m \times r}, V_{r} = [v_{1} \dots v_{r}] \in R^{n \times r},$ and $S_{r} = diag (σ_{1}, \dots, σ_{r}) .$ Similarly, for the matrix $B,$ we have that $B = {\tilde{U}}_{\tilde{r}} {\tilde{S}}_{\tilde{r}} {\tilde{V}}_{\tilde{r}}^{T},$ (24) where $\tilde{r} = rank (B), {\tilde{U}}_{\tilde{r}} = [{\tilde{u}}_{1} \dots {\tilde{u}}_{\tilde{r}}] \in R^{m \times \tilde{r}}, {\tilde{V}}_{\tilde{r}} = [{\tilde{v}}_{1} \dots {\tilde{v}}_{\tilde{r}}] \in R^{n \times \tilde{r}},$ and ${\tilde{S}}_{\tilde{r}} = diag ({\tilde{σ}}_{1}, \dots, {\tilde{σ}}_{\tilde{r}}) .$ A direct computation then reveals that $Tr (A^{T} B) = \sum_{i = 1}^{r} \sum_{j = 1}^{\tilde{r}} σ_{i} {\tilde{σ}}_{j} u_{i}^{T} {\tilde{u}}_{j} v_{i}^{T} {\tilde{v}}_{j} .$ (25)

By using the weighted Hölder inequality for vectors [Citation45], we obtain for $p \neq 1$ that $\sum_{i = 1}^{r} \sum_{j = 1}^{\tilde{r}} σ_{i} {\tilde{σ}}_{j} u_{i}^{T} {\tilde{u}}_{j} v_{i}^{T} {\tilde{v}}_{j} \leq {(\sum_{i = 1}^{r} σ_{i}^{p} \sum_{j = 1}^{\tilde{r}} | u_{i}^{T} {\tilde{u}}_{j} v_{i}^{T} {\tilde{v}}_{j} |)}^{\frac{1}{p}} {(\sum_{j = 1}^{\tilde{r}} {\tilde{σ}}_{j}^{q} \sum_{i = 1}^{r} | u_{i}^{T} {\tilde{u}}_{j} v_{i}^{T} {\tilde{v}}_{j} |)}^{\frac{1}{q}}$ (26) and for p = 1 that $\sum_{i = 1}^{r} \sum_{j = 1}^{\tilde{r}} σ_{i} {\tilde{σ}}_{j} u_{i}^{T} {\tilde{u}}_{j} v_{i}^{T} {\tilde{v}}_{j} \leq (\sum_{i = 1}^{r} σ_{i} \sum_{j = 1}^{\tilde{r}} | u_{i}^{T} {\tilde{u}}_{j} v_{i}^{T} {\tilde{v}}_{j} |) | | \tilde{σ} | |_{\infty} .$ (27)

Finally, by invoking the Cauchy-Schwarz inequality and the orthonormality of the columns of the matrices $U_{r}, V_{r}, {\tilde{U}}_{\tilde{r}}, {\tilde{V}}_{\tilde{r}},$ we deduce for $i = 1, \dots, r$ that $\sum_{j = 1}^{\tilde{r}} | u_{i}^{T} {\tilde{u}}_{j} v_{i}^{T} {\tilde{v}}_{j} | \leq {(\sum_{j = 1}^{\tilde{r}} {(u_{i}^{T} {\tilde{u}}_{j})}^{2})}^{\frac{1}{2}} {(\sum_{j = 1}^{\tilde{r}} {(v_{i}^{T} {\tilde{v}}_{j})}^{2})}^{\frac{1}{2}} \leq | | u_{i} | |_{2} | | v_{i} | |_{2} = 1.$ (28)

For $j = 1, \dots, \tilde{r},$ we deduce that $\sum_{i = 1}^{r} | u_{i}^{T} {\tilde{u}}_{j} v_{i}^{T} {\tilde{v}}_{j} | \leq {(\sum_{i = 1}^{r} {(u_{i}^{T} {\tilde{u}}_{j})}^{2})}^{\frac{1}{2}} {(\sum_{i = 1}^{r} {(v_{i}^{T} {\tilde{v}}_{j})}^{2})}^{\frac{1}{2}} \leq | | {\tilde{u}}_{j} | |_{2} | | {\tilde{v}}_{j} | |_{2} = 1.$ (29)

The combination of these inequalities completes the proof. □
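The ingredients of this proof can be checked numerically on random matrices. The sketch below assumes NumPy and uses variable names of its own choosing. It verifies the trace expansion (25) and the bounds (28) and (29) on the row and column sums of the weights $| u_{i}^{T} {\tilde{u}}_{j} v_{i}^{T} {\tilde{v}}_{j} | .$

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((5, 3))
B = rng.standard_normal((5, 3))

# Thin SVDs; generic random matrices have full rank, so these are reduced SVDs.
U, s, Vt = np.linalg.svd(A, full_matrices=False)
Ub, sb, Vbt = np.linalg.svd(B, full_matrices=False)

# W[i, j] = (u_i^T u~_j) (v_i^T v~_j), the weights appearing in (25)
W = (U.T @ Ub) * (Vt @ Vbt.T)

double_sum = s @ W @ sb                 # right-hand side of (25)
trace = np.trace(A.T @ B)               # left-hand side of (25)

row_sums = np.abs(W).sum(axis=1)        # left-hand sides of (28), one per i
col_sums = np.abs(W).sum(axis=0)        # left-hand sides of (29), one per j

print(abs(double_sum - trace) < 1e-10)
print(row_sums.max() <= 1.0 + 1e-12, col_sums.max() <= 1.0 + 1e-12)
```

Since the row and column sums are at most one, the two factors on the right-hand side of (26) are bounded by $| | σ | |_{p}$ and $| | \tilde{σ} | |_{q},$ which yields (11).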

B. Proof of Proposition 2

Proof.

We separate the two cases and analyze each one independently.

Case 1: $1 < p < + \infty .$ We prove (14) and deduce (15) by symmetry. Following the proof of Proposition 1 and considering the reduced SVDs of the matrices $A$ and $B$ given in (23) and (24), we immediately see that the inequalities (26), (28), and (29) must all be saturated. The equality condition of the weighted Hölder inequality implies the existence of a positive constant $α > 0$ such that, for all $(i, j) \in \{1, \dots, r\} \times \{1, \dots, \tilde{r}\},$ we have one of the following conditions: $u_{i}^{T} {\tilde{u}}_{j} v_{i}^{T} {\tilde{v}}_{j} = 0,$ (30) or $u_{i}^{T} {\tilde{u}}_{j} v_{i}^{T} {\tilde{v}}_{j} > 0$ and ${\tilde{σ}}_{j}^{q} = α σ_{i}^{p} .$ (31)

Moreover, the saturation of (28) implies that $u_{i} \in Range ({\tilde{U}}_{\tilde{r}}), v_{i} \in Range ({\tilde{V}}_{\tilde{r}}), \forall i = 1, \dots, r$ (32) and also that there exists a positive constant $β_{i} > 0$ (positivity follows from (30) and (31)) such that $u_{i}^{T} {\tilde{u}}_{j} = β_{i} v_{i}^{T} {\tilde{v}}_{j}, \forall j = 1, \dots, \tilde{r} .$ (33)

However, from the normality of $u_{i}$ and (32), we have that $1 = | | u_{i} | |_{2}^{2} = \sum_{j = 1}^{\tilde{r}} | u_{i}^{T} {\tilde{u}}_{j} |^{2} = β_{i}^{2} \sum_{j = 1}^{\tilde{r}} | v_{i}^{T} {\tilde{v}}_{j} |^{2} = β_{i}^{2} | | v_{i} | |_{2}^{2} = β_{i}^{2},$ (34) which, together with the positivity of $β_{i},$ leads to the conclusion that $β_{i} = 1$ for $i = 1, \dots, r .$ Using this, we rewrite (33) in matrix form as $U_{r}^{T} {\tilde{U}}_{\tilde{r}} = V_{r}^{T} {\tilde{V}}_{\tilde{r}} .$ (35)

Similarly, the saturation of (29) implies that ${\tilde{u}}_{j} \in Range (U_{r}), {\tilde{v}}_{j} \in Range (V_{r}),$ (36) for all $j = 1, \dots, \tilde{r} .$ Putting together (32) and (36), we deduce that $r = \tilde{r}$ and $Range (U_{r}) = Range ({\tilde{U}}_{\tilde{r}}), Range (V_{r}) = Range ({\tilde{V}}_{\tilde{r}}) .$ (37)

This implies the existence of two orthogonal matrices $P, Q \in R^{r \times r}$ such that ${\tilde{U}}_{\tilde{r}} = U_{r} P, {\tilde{V}}_{\tilde{r}} = V_{r} Q .$ (38)

However, substituting (38) into (35), we conclude that $P = U_{r}^{T} U_{r} P = U_{r}^{T} {\tilde{U}}_{\tilde{r}} = V_{r}^{T} {\tilde{V}}_{\tilde{r}} = V_{r}^{T} V_{r} Q = Q .$ (39)

This implies that the matrix $B$ can be represented as $B = U_{r} P {\tilde{S}}_{\tilde{r}} P^{T} V_{r}^{T} = U_{r} S_{0} V_{r}^{T},$ (40) where $S_{0} = P {\tilde{S}}_{\tilde{r}} P^{T} .$ We now show that $S_{0}$ is a diagonal matrix. Indeed, by denoting the (i, j)-th entry of $P$ as $p_{i, j}$ and its i-th row as $p_{i}^{T},$ we rewrite (30) and (31) as $p_{i, j} = 0,$ (41) or $p_{i, j} > 0$ and ${\tilde{σ}}_{j}^{q} = α σ_{i}^{p},$ (42) for all $(i, j) \in \{1, \dots, r\}^{2} .$ Moreover, by expanding the (i, j)-th entry of the matrix $S_{0},$ we have that $\begin{aligned} {[S_{0}]}_{i, j} &= {[P {\tilde{S}}_{\tilde{r}} P^{T}]}_{i, j} = \sum_{k = 1}^{r} p_{i, k} {\tilde{σ}}_{k} p_{j, k} = \sum_{k = 1}^{r} p_{i, k} σ_{i}^{\frac{p}{q}} α^{\frac{1}{q}} p_{j, k} \\ &= σ_{i}^{\frac{p}{q}} α^{\frac{1}{q}} p_{i}^{T} p_{j} = {[J_{p} (σ)]}_{i} c_{B} δ [i - j], \end{aligned}$ where $δ [\cdot]$ denotes the Kronecker delta and, since $\frac{p}{q} = p - 1,$ the quantity $c_{B} = α^{\frac{1}{q}} | | σ | |_{p}^{p - 2} > 0$ is a positive constant. Finally, we obtain the announced expression in (14), with $c = c_{B},$ by substituting the above characterization of $S_{0}$ into (40).

For the converse, we note that, if the matrix $B$ is of the form (14) with constant $c = c_{B} > 0,$ then we have that $\begin{aligned} \mathrm{Tr}(A^{T} B) &= \mathrm{Tr}(A B^{T}) = c_{B} \mathrm{Tr}\left(U_{r} \mathrm{diag}(\sigma) V_{r}^{T} \left(U_{r} \mathrm{diag}(J_{p}(\sigma)) V_{r}^{T}\right)^{T}\right) \\ &= c_{B} \mathrm{Tr}\left(\mathrm{diag}(\sigma) V_{r}^{T} V_{r} \mathrm{diag}(J_{p}(\sigma)) U_{r}^{T} U_{r}\right) \\ &= c_{B} \sigma^{T} J_{p}(\sigma) \\ &= c_{B} \|\sigma\|_{p} \|J_{p}(\sigma)\|_{q} = \|A\|_{S_{p}} \|B\|_{S_{q}}, \end{aligned}$ which shows that the Hölder inequality is indeed saturated in this case.
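The converse direction can be checked numerically. The following is a minimal sketch, not the authors' code: it assumes that $J_{p}$ acts entrywise as $[J_{p}(\sigma)]_{i} = \sigma_{i}^{p-1} / \|\sigma\|_{p}^{p-2}$ (the usual duality mapping of $\ell_{p}$, consistent with the exponent $p/q = p - 1$ appearing in the proof), and the helper names `schatten_norm` and `dual_candidate` are hypothetical.

```python
import numpy as np

def schatten_norm(M, p):
    """Schatten-p norm: the l_p norm of the singular values of M."""
    s = np.linalg.svd(M, compute_uv=False)
    return s.max() if np.isinf(p) else (s ** p).sum() ** (1.0 / p)

def dual_candidate(A, p, c=1.0, tol=1e-10):
    """Return c * U_r diag(J_p(sigma)) V_r^T built from the reduced SVD of A.

    Assumption: J_p(sigma)_i = sigma_i^(p-1) / ||sigma||_p^(p-2).
    """
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    r = int((s > tol * s[0]).sum())          # numerical rank of A
    U_r, s_r, Vt_r = U[:, :r], s[:r], Vt[:r, :]
    J = s_r ** (p - 1) / np.linalg.norm(s_r, p) ** (p - 2)
    return c * (U_r * J) @ Vt_r              # scales column i of U_r by J[i]

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 3)) @ rng.standard_normal((3, 4))  # rank 3 (generically)
p = 3.0
q = p / (p - 1.0)                            # Hölder conjugate exponent

B = dual_candidate(A, p, c=2.5)              # any positive constant c works
lhs = np.trace(A.T @ B)                      # <A, B>
rhs = schatten_norm(A, p) * schatten_norm(B, q)
print(lhs, rhs)                              # the two values should agree up to round-off
```

With $c = 1$ the candidate additionally satisfies $\|B\|_{S_{q}} = \|A\|_{S_{p}}$ under the stated assumption on $J_{p}$, which is the normalization used for conjugate pairs in the proof of Theorem 1.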

Case 2: p = 1. In this case, the saturation of the weighted Hölder inequality implies that, for all $(i, j) \in \{1, \dots, r\} \times \{1, \dots, \tilde{r}\},$ we have that $u_{i}^{T} \tilde{u}_{j} v_{i}^{T} \tilde{v}_{j} = 0, \text{ or}$ (43) $u_{i}^{T} \tilde{u}_{j} v_{i}^{T} \tilde{v}_{j} > 0 \text{ and } \tilde{\sigma}_{j} = \tilde{\sigma}_{1}.$ (44)

For equality, we also need the saturation of (28), which we showed to be equivalent to (32) and (35). From (32), we deduce the existence of matrices $P_{1}, P_{2} \in \mathbb{R}^{\tilde{r} \times r}$ such that $U_{r} = \tilde{U}_{\tilde{r}} P_{1}, \quad V_{r} = \tilde{V}_{\tilde{r}} P_{2}.$ (45)

Substituting these into (35) implies that $P_{1}^{T} = P_{1}^{T} \tilde{U}_{\tilde{r}}^{T} \tilde{U}_{\tilde{r}} = U_{r}^{T} \tilde{U}_{\tilde{r}} = V_{r}^{T} \tilde{V}_{\tilde{r}} = P_{2}^{T} \tilde{V}_{\tilde{r}}^{T} \tilde{V}_{\tilde{r}} = P_{2}^{T},$ (46) and, hence, that $P_{1} = P_{2} = [p_{i, j}] \in \mathbb{R}^{\tilde{r} \times r}.$ Now, one can rewrite the conditions (43) and (44) and deduce that, for any $j = 1, \dots, \tilde{r},$ we have that $p_{j, i} = 0 \ \ \forall i = 1, \dots, r, \quad \text{or} \quad \tilde{\sigma}_{j} = \tilde{\sigma}_{1}.$ (47)

From condition (47) and the definition of $r_{1}$ (the multiplicity of the largest singular value), we deduce that $P_{1} = \begin{bmatrix} P \\ 0_{r_{\mathrm{res}} \times r} \end{bmatrix},$ (48) where $P \in \mathbb{R}^{r_{1} \times r}$ and $r_{\mathrm{res}} = \tilde{r} - r_{1}.$ Using this form and the definition of $\tilde{U}_{1}$ and $\tilde{V}_{1}$ (given in the statement of the proposition), we rewrite (45) as $U_{r} = \tilde{U}_{1} P, \quad V_{r} = \tilde{V}_{1} P.$ (49)

Therefore, $I_{r} = U_{r}^{T} U_{r} = P^{T} \tilde{U}_{1}^{T} \tilde{U}_{1} P = P^{T} P.$ (50)

Hence, $P$ is a sub-orthogonal matrix (it has orthonormal columns) and $\mathrm{rank}(B) = \tilde{r} \geq r_{1} \geq \mathrm{rank}(P) \geq r = \mathrm{rank}(A).$

Substituting (49) into the reduced SVD of $A$ yields the announced expression with $X = P S P^{T}.$

Based on the definitions of $r_{1},$ $\tilde{U}_{1},$ and $\tilde{V}_{1},$ we note that one can rewrite the reduced SVD of $B$ as $B = \tilde{\sigma}_{1} \tilde{U}_{1} \tilde{V}_{1}^{T} + \tilde{U}_{\mathrm{res}} \tilde{S}_{\mathrm{res}} \tilde{V}_{\mathrm{res}}^{T},$ (51)

where $\tilde{U}_{\mathrm{res}} \in \mathbb{R}^{m \times r_{\mathrm{res}}},$ $\tilde{S}_{\mathrm{res}} \in \mathbb{R}^{r_{\mathrm{res}} \times r_{\mathrm{res}}},$ and $\tilde{V}_{\mathrm{res}} \in \mathbb{R}^{n \times r_{\mathrm{res}}}$ contain the remaining singular values and vectors such that $\tilde{U} = [\tilde{U}_{1} \ \tilde{U}_{\mathrm{res}}], \quad \tilde{V} = [\tilde{V}_{1} \ \tilde{V}_{\mathrm{res}}], \quad \tilde{S} = \begin{bmatrix} \tilde{\sigma}_{1} I_{r_{1}} & 0 \\ 0 & \tilde{S}_{\mathrm{res}} \end{bmatrix}.$

Now, if $A$ admits the form (17) and if we consider the decomposition $X = P S P^{T}$ (the assumption that $X$ is symmetric ensures that it has an orthogonal eigendecomposition), then $\begin{aligned} \mathrm{Tr}(A^{T} B) &= \mathrm{Tr}\left(\tilde{V}_{1} P S P^{T} \tilde{U}_{1}^{T} \left(\tilde{\sigma}_{1} \tilde{U}_{1} \tilde{V}_{1}^{T} + \tilde{U}_{\mathrm{res}} \tilde{S}_{\mathrm{res}} \tilde{V}_{\mathrm{res}}^{T}\right)\right) \\ &= \tilde{\sigma}_{1} \mathrm{Tr}\left(\tilde{V}_{1} P S P^{T} \tilde{U}_{1}^{T} \tilde{U}_{1} \tilde{V}_{1}^{T}\right) + \mathrm{Tr}\left(\tilde{V}_{1} P S P^{T} \tilde{U}_{1}^{T} \tilde{U}_{\mathrm{res}} \tilde{S}_{\mathrm{res}} \tilde{V}_{\mathrm{res}}^{T}\right) \\ &= \tilde{\sigma}_{1} \mathrm{Tr}\left(\tilde{V}_{1} P S P^{T} I_{r_{1}} \tilde{V}_{1}^{T}\right) + \mathrm{Tr}\left(\tilde{V}_{1} P S P^{T} 0_{r_{1} \times r_{\mathrm{res}}} \tilde{S}_{\mathrm{res}} \tilde{V}_{\mathrm{res}}^{T}\right) \\ &= \tilde{\sigma}_{1} \mathrm{Tr}\left(S P^{T} \tilde{V}_{1}^{T} \tilde{V}_{1} P\right) + 0 \\ &= \tilde{\sigma}_{1} \mathrm{Tr}\left(S P^{T} P\right) \\ &= \tilde{\sigma}_{1} \mathrm{Tr}(S) = \|B\|_{S_{\infty}} \|A\|_{S_{1}}, \end{aligned}$ which establishes the sufficiency in this case.
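The sufficiency computation above can be reproduced on a small random instance. This is only an illustrative sketch under stated assumptions: the sizes, the singular values of $B$, and the variable names are made up for the example, and $X$ is drawn as a symmetric positive-semidefinite matrix so that $\mathrm{Tr}(X)$ coincides with its Schatten-1 norm.

```python
import numpy as np

rng = np.random.default_rng(1)
m, n, r1, r_res = 6, 5, 2, 2               # illustrative sizes (assumption)

# Random orthonormal singular vectors of B, obtained from reduced QR factorizations.
U_t, _ = np.linalg.qr(rng.standard_normal((m, r1 + r_res)))
V_t, _ = np.linalg.qr(rng.standard_normal((n, r1 + r_res)))

# The largest singular value of B has multiplicity r1.
sigma1 = 3.0
s_t = np.array([sigma1] * r1 + [1.2, 0.4])
B = (U_t * s_t) @ V_t.T

# A = U1 X V1^T, where U1 and V1 hold the singular vectors attached to sigma1.
# Assumption: X is symmetric positive semidefinite.
U1, V1 = U_t[:, :r1], V_t[:, :r1]
G = rng.standard_normal((r1, r1))
X = G @ G.T
A = U1 @ X @ V1.T

lhs = np.trace(A.T @ B)                    # <A, B>
nuc_A = np.linalg.norm(A, 'nuc')           # Schatten-1 (nuclear) norm of A
spec_B = np.linalg.norm(B, 2)              # Schatten-infinity (spectral) norm of B
print(lhs, nuc_A * spec_B)                 # the two values should agree up to round-off
```

Note that the residual block of $B$ (singular values 1.2 and 0.4 here) plays no role in the inner product, exactly as in the derivation where the cross term vanishes.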

Finally, assuming that $r = r_{1} = \tilde{r},$ we deduce that $P \in \mathbb{R}^{r \times r}$ is an orthogonal matrix and, hence, that $P^{-1} = P^{T}.$ Now, using (49) and the rank assumption, we can simplify the expansion (51) as $B = \tilde{\sigma}_{1} \tilde{U}_{1} \tilde{V}_{1}^{T} = \tilde{\sigma}_{1} U_{r} P^{T} (V_{r} P^{T})^{T} = \tilde{\sigma}_{1} U_{r} P^{T} P V_{r}^{T} = \tilde{\sigma}_{1} U_{r} V_{r}^{T}.$ (52)

C. Proof of Theorem 1

Proof.

Case I: $1 < p < +\infty.$ Assume that $(A, B)$ forms an $(S_{p}, S_{q})$-conjugate pair. Hence, we have that $\langle A, B \rangle = \|A\|_{S_{p}} \|B\|_{S_{q}}$ which, together with Proposition 2, implies that $B$ admits the form $B = \frac{\|B\|_{S_{q}}}{\|A\|_{S_{p}}} U_{r} \mathrm{diag}(J_{p}(\sigma)) V_{r}^{T} = U_{r} \mathrm{diag}(J_{p}(\sigma)) V_{r}^{T},$ where the last equality follows from the normalization $\|B\|_{S_{q}} = \|A\|_{S_{p}}$ of conjugate pairs.

Case II: p = 1. Similarly to the previous case, consider $A \in \mathbb{R}^{m \times n}$ and $B \in J_{S_{1}, \mathrm{rank}}(A).$ We have that $\mathrm{Tr}(A^{T} B) = \|A\|_{S_{1}} \|B\|_{S_{\infty}},$ (53) $\|A\|_{S_{1}} = \|B\|_{S_{\infty}},$ (54) $\mathrm{rank}(B) \leq \mathrm{rank}(C), \quad \forall C \in J_{S_{1}}(A).$ (55)

From (53) and using Proposition 2, we deduce that $\mathrm{rank}(B) \geq \mathrm{rank}(A)$ which, together with (55), implies that $B$ must be equal to $B = \|B\|_{S_{\infty}} U_{r} V_{r}^{T} = \|\sigma\|_{1} U_{r} V_{r}^{T},$ where the last equality is obtained using (54).
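As a sanity check of this closed form, the sketch below builds $\|\sigma\|_{1} U_{r} V_{r}^{T}$ from the reduced SVD of a random low-rank matrix and verifies the three properties (53)–(55) can be met with $\mathrm{rank}(B) = \mathrm{rank}(A)$. The function name `sparse_dual_nuclear` and the rank tolerance are hypothetical choices made for this example.

```python
import numpy as np

def sparse_dual_nuclear(A, tol=1e-10):
    """Return ||sigma||_1 * U_r V_r^T, built from the reduced SVD of A."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    r = int((s > tol * s[0]).sum())        # numerical rank of A
    return s[:r].sum() * U[:, :r] @ Vt[:r, :]

rng = np.random.default_rng(3)
A = rng.standard_normal((6, 2)) @ rng.standard_normal((2, 5))  # rank 2 (generically)
B = sparse_dual_nuclear(A)

nuc_A = np.linalg.norm(A, 'nuc')           # ||A||_{S_1}
spec_B = np.linalg.norm(B, 2)              # ||B||_{S_inf}

print(np.isclose(np.trace(A.T @ B), nuc_A * spec_B))   # Hölder saturation, as in (53)
print(np.isclose(nuc_A, spec_B))                        # normalization, as in (54)
print(np.linalg.matrix_rank(B), np.linalg.matrix_rank(A))
```

Since Proposition 2 gives $\mathrm{rank}(B) \geq \mathrm{rank}(A)$ for every conjugate, a matrix that attains equality of ranks is necessarily of minimum rank, which is what the last printed line illustrates.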

Case III: $p = +\infty.$ Following Proposition 2, any matrix $B \in J_{S_{\infty}}(A)$ can be expressed as $B = U_{1} \tilde{X} V_{1}^{T},$ where $\tilde{X} \in \mathbb{R}^{r_{1} \times r_{1}}$ is a symmetric matrix. By defining $X = \sigma_{1}^{-1} \tilde{X},$ one readily verifies that $B = \sigma_{1} U_{1} X V_{1}^{T}.$ By recalling the normalization constraint $\|A\|_{S_{\infty}} = \|B\|_{S_{1}},$ we therefore obtain that $\sigma_{1} = \|A\|_{S_{\infty}} = \|B\|_{S_{1}} = \sigma_{1} \|X\|_{S_{1}},$ which implies that $\|X\|_{S_{1}} = 1.$ To show that $J_{S_{\infty}}(A)$ is convex, consider two symmetric matrices $X_{0}$ and $X_{1}$ of unit Schatten-1 norm and define $B_{\alpha} = \sigma_{1} U_{1} X_{\alpha} V_{1}^{T}, \quad X_{\alpha} = \alpha X_{1} + (1 - \alpha) X_{0}$ for $\alpha \in [0, 1].$ On the one hand, from the linearity of the trace, we have that $\mathrm{Tr}(A^{T} B_{\alpha}) = \mathrm{Tr}(A^{T} (\alpha B_{1} + (1 - \alpha) B_{0})) = \alpha \mathrm{Tr}(A^{T} B_{1}) + (1 - \alpha) \mathrm{Tr}(A^{T} B_{0}).$

On the other hand, from the definition of $X_{0}$ and $X_{1},$ we deduce that $B_{0}, B_{1} \in J_{S_{\infty}}(A).$ Hence, $\mathrm{Tr}(A^{T} B_{\alpha}) = \alpha \|A\|_{S_{\infty}}^{2} + (1 - \alpha) \|A\|_{S_{\infty}}^{2} = \|A\|_{S_{\infty}}^{2}.$

However, from the Hölder inequality and the convexity of norms, we have that $\mathrm{Tr}(A^{T} B_{\alpha}) \leq \|A\|_{S_{\infty}} \|B_{\alpha}\|_{S_{1}} \leq \|A\|_{S_{\infty}} \left(\alpha \|B_{1}\|_{S_{1}} + (1 - \alpha) \|B_{0}\|_{S_{1}}\right) = \|A\|_{S_{\infty}}^{2},$ where the last equality uses $\|B_{0}\|_{S_{1}} = \|B_{1}\|_{S_{1}} = \|A\|_{S_{\infty}}.$

This implies that the Hölder inequality is saturated and also that $\|B_{\alpha}\|_{S_{1}} = \|A\|_{S_{\infty}}$ which, altogether, implies that $B_{\alpha} \in J_{S_{\infty}}(A)$ for all $\alpha \in [0, 1].$

Finally, we observe that the set $J_{S_{\infty}}(A)$ contains all matrices of the form $B = \sigma_{1} U_{1} p p^{T} V_{1}^{T}$ for any vector $p \in \mathbb{R}^{r_{1}}$ with $\|p\|_{2} = 1,$ where the factor $\sigma_{1}$ enforces the normalization $\|B\|_{S_{1}} = \|A\|_{S_{\infty}}.$ These are indeed all the rank-1 elements of $J_{S_{\infty}}(A)$ which, by Definition 2, form the set of sparse dual conjugates. □
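The rank-1 conjugates and the convexity argument of Case III can be illustrated numerically. The sketch below is an assumption-laden example rather than part of the proof: the matrix sizes, the singular values, and the helper names `rank_one_dual` and `is_dual` are hypothetical, and the rank-1 elements are scaled by $\sigma_{1}$ so that they satisfy the normalization $\|B\|_{S_{1}} = \|A\|_{S_{\infty}}$.

```python
import numpy as np

rng = np.random.default_rng(2)
m, n, r1 = 5, 4, 2                          # illustrative sizes (assumption)

# Build A whose largest singular value sigma1 has multiplicity r1 = 2.
U, _ = np.linalg.qr(rng.standard_normal((m, 3)))
V, _ = np.linalg.qr(rng.standard_normal((n, 3)))
sigma1 = 2.0
A = (U * np.array([sigma1, sigma1, 0.7])) @ V.T
U1, V1 = U[:, :r1], V[:, :r1]               # singular vectors attached to sigma1

def rank_one_dual(p):
    """Rank-1 candidate sigma1 * U1 p p^T V1^T with p normalized to unit length."""
    p = p / np.linalg.norm(p)
    return sigma1 * U1 @ np.outer(p, p) @ V1.T

def is_dual(B):
    """Check Hölder saturation <A, B> = ||A||^2 and the normalization ||B||_{S_1} = ||A||_{S_inf}."""
    spec_A = np.linalg.norm(A, 2)
    return bool(np.isclose(np.trace(A.T @ B), spec_A ** 2)
                and np.isclose(np.linalg.norm(B, 'nuc'), spec_A))

B0 = rank_one_dual(rng.standard_normal(r1))
B1 = rank_one_dual(rng.standard_normal(r1))
alpha = 0.3
B_alpha = alpha * B1 + (1 - alpha) * B0     # convex combination of two conjugates

print(is_dual(B0), is_dual(B1), is_dual(B_alpha))
print(np.linalg.matrix_rank(B0), np.linalg.matrix_rank(B_alpha))
```

The convex combination remains a conjugate of $A$ but generically has rank 2, which is why only the extreme elements $\sigma_{1} U_{1} p p^{T} V_{1}^{T}$ qualify as sparse (minimum-rank) conjugates.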

Additional information

Funding

This work was supported in part by the European Research Council (H2020-ERC Project GlobalBioIm) under Grant 692726 and in part by the Swiss National Science Foundation, Grant 200020_184646/1.

Notes

1 Although this functional does not satisfy the homogeneity property of a norm, it is widely referred to as the $ℓ_{0}$-norm.

References

Bhatia, R. (1997). Matrix Analysis, Vol. 169. New York: Springer-Verlag.
Lefkimmiatis, S., Unser, M. (2013). Poisson image reconstruction with Hessian Schatten-norm regularization. IEEE Trans. Image Process. 22(11):4314–4327. DOI: 10.1109/TIP.2013.2271852.
Lefkimmiatis, S., Ward, J., Unser, M. (2013). Hessian Schatten-norm regularization for linear inverse problems. IEEE Trans. Image Process. 22(5):1873–1888. DOI: 10.1109/TIP.2013.2237919.
Xie, Y., Gu, S., Liu, Y., Zuo, W., Zhang, W., Zhang, L. (2016). Weighted Schatten p-norm minimization for image denoising and background subtraction. IEEE Trans. Image Process. 25(10):4842–4857. DOI: 10.1109/TIP.2016.2599290.
Gao, S., Fan, Q. (2020). Robust Schatten-p norm based approach for tensor completion. J. Sci. Comput. 82(1):1–23.
Horn, R. A., Johnson, C. R. (2012). Matrix Analysis. Cambridge University Press.
Davenport, M. A., Romberg, J. (2016). An overview of low-rank matrix recovery from incomplete observations. IEEE J. Sel. Top. Signal Process. 10(4):608–622. DOI: 10.1109/JSTSP.2016.2539100.
Candès, E. J., Recht, B. (2009). Exact matrix completion via convex optimization. Found. Comput. Math. 9(6):717–772. DOI: 10.1007/s10208-009-9045-5.
Candès, E. J., Eldar, Y. C., Strohmer, T., Voroninski, V. (2015). Phase retrieval via matrix completion. SIAM Rev. 57(2):225–251. DOI: 10.1137/151005099.
Davies, M. E., Eldar, Y. C. (2012). Rank awareness in joint sparse recovery. IEEE Trans. Inform. Theory 58(2):1135–1146. DOI: 10.1109/TIT.2011.2173722.
Fazel, M., Pong, T. K., Sun, D., Tseng, P. (2013). Hankel matrix rank minimization with applications to system identification and realization. SIAM J. Matrix Anal. Appl. 34(3):946–977. DOI: 10.1137/110853996.
Asadi, E., Aziznejad, S., Amerimehr, M. H., Amini, A. (2017). A fast matrix completion method for index coding. In: Proceedings of the Twenty-Fifth European Signal Processing Conference (EUSIPCO’17). Kos Island, Greece: IEEE, pp. 2606–2610.
Esfahanizadeh, H., Lahouti, F., Hassibi, B. (2014). A matrix completion approach to linear index coding problem. In: Proceedings of the Information Theory Workshop (ITW 2014). Hobart, Australia: IEEE, pp. 531–535.
Kittaneh, F. (1985). Inequalities for the Schatten p-norm. Glasgow Math. J. 26(2):141–143. DOI: 10.1017/S0017089500005905.
Kittaneh, F. (1987). Inequalities for the Schatten p-norm II. Glasgow Math. J. 29(1):99–104. DOI: 10.1017/S0017089500006716.
Kittaneh, F. (1986). Inequalities for the Schatten p-norm III. Commun. Math. Phys. 104(2):307–310. DOI: 10.1007/BF01211597.
Kittaneh, F. (1986). Inequalities for the Schatten p-norm IV. Commun. Math. Phys. 106(4):581–585. DOI: 10.1007/BF01463397.
Kittaneh, F., Kosaki, H. (1987). Inequalities for the Schatten p-norm V. Publ. Res. Inst. Math. Sci. 23(2):433–443. DOI: 10.2977/prims/1195176547.
Bourin, J.-C. (2006). Matrix versions of some classical inequalities. Linear Algebra Appl. 416(2–3):890–907. DOI: 10.1016/j.laa.2006.01.002.
Hirzallah, O., Kittaneh, F., Moslehian, M. (2010). Schatten p-norm inequalities related to a characterization of inner product spaces. Math. Inequal. Appl. 13(2):235–241. DOI: 10.7153/mia-13-19.
Moslehian, M. S., Tominaga, M., Saito, K.-S. (2011). Schatten p-norm inequalities related to an extended operator parallelogram law. Linear Algebra Appl. 435(4):823–829. DOI: 10.1016/j.laa.2011.01.046.
Conde, C., Moslehian, M. S. (2016). Norm inequalities related to p-Schatten class. Linear Algebra Appl. 498:441–449. DOI: 10.1016/j.laa.2015.11.031.
Wenzel, D., Audenaert, K. M. (2010). Impressions of convexity: An illustration for commutator bounds. Linear Algebra Appl. 433(11–12):1726–1759. DOI: 10.1016/j.laa.2010.06.039.
Cheng, C.-M., Lei, C. (2015). On Schatten p-norms of commutators. Linear Algebra Appl. 484:409–434. DOI: 10.1016/j.laa.2015.07.009.
So, W. (1990). Facial structures of Schatten p-norms. Linear Multilinear Algebra. 27(3):207–212. DOI: 10.1080/03081089008818012.
Potapov, D., Sukochev, F. (2014). Fréchet differentiability of Sp norms. Adv. Math. 262:436–475.
Kittaneh, F. (1989). On the continuity of the absolute value map in the Schatten classes. Linear Algebra Appl. 118:61–68. DOI: 10.1016/0024-3795(89)90571-5.
Bhatia, R., Kittaneh, F. (2000). Cartesian decompositions and Schatten norms. Linear Algebra Appl. 318(1-3):109–116. DOI: 10.1016/S0024-3795(00)00206-8.
Beurling, A., Livingston, A. (1962). A theorem on duality mappings in Banach spaces. Ark. Mat. 4(5):405–411. DOI: 10.1007/BF02591622.
Cioranescu, I. (1990). Geometry of Banach Spaces. Duality Mappings and Nonlinear Problems, Vol. 62, Netherlands: Springer.
de Boor, C. (1976). On “best” interpolation. J. Approximation Theory. 16(1):28–42. DOI: 10.1016/0021-9045(76)90093-9.
Unser, M. (2020). A Unifying Representer Theorem for Inverse Problems and Machine Learning. Foundations of Computational Mathematics.
Liu, P., Wang, Y.-W. (2007). The best generalized inverse of the linear operator in normed linear space. Linear Algebra Appl. 420(1):9–19. DOI: 10.1016/j.laa.2006.04.024.
Rudin, W. (1991). Functional Analysis. International Series in Pure and Applied Mathematics. New York: McGraw-Hill, Inc.
Johnson, C. R., Horn, R. A. (1985). Matrix Analysis. Cambridge University Press.
Nie, F., Huang, H., Ding, C. (2012). Low-rank matrix recovery via efficient Schatten p-norm minimization. In: Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 26.
Shang, F., Liu, Y., Cheng, J. (2016). Scalable algorithms for tractable Schatten quasi-norm minimization. In: Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 30.
Shang, F., Liu, Y., Shang, F., Liu, H., Kong, L., Jiao, L. (2020). A unified scalable equivalent formulation for schatten quasi-norms. Mathematics. 8(8):1325. DOI: 10.3390/math8081325.
Giampouras, P., Vidal, R., Rontogiannis, A., Haeffele, B. (2020). A novel variational form of the Schatten-p quasi-norm. arXiv preprint arXiv:2010.13927.
Lefkimmiatis, S., Roussos, A., Maragos, P., Unser, M. (2015). Structure tensor total variation. SIAM J. Imaging Sci. 8(2):1090–1122. DOI: 10.1137/14098154X.
Petryshyn, W. (1970). A characterization of strict convexity of Banach spaces and other uses of duality mappings. J. Funct. Anal. 6(2):282–291. DOI: 10.1016/0022-1236(70)90061-3.
Giles, J., Gregory, D., Sims, B. (1978). Geometrical implications of upper semi-continuity of the duality mapping on a Banach space. Pacific J. Math. 79(1):99–109. DOI: 10.2140/pjm.1978.79.99.
Contreras, M. D., Payá, R. (1994). On upper semicontinuity of duality mappings. Proc. Am. Math. Soc. 121(2):451–459. DOI: 10.1090/S0002-9939-1994-1215199-4.
Himmelberg, C. J., Parthasarathy, T. (1975). Measurable relations. Fund. Math. 87(1):53–72. DOI: 10.4064/fm-87-1-53-72.
Cvetkovski, Z. (2012). Inequalities: Theorems, Techniques and Selected Problems. Berlin Heidelberg: Springer-Verlag.
