1,569
Views
2
CrossRef citations to date
0
Altmetric
Research Article

Duality Mapping for Schatten Matrix Norms

ORCID Icon &
Pages 679-695 | Received 18 Jan 2021, Accepted 22 Apr 2021, Published online: 10 May 2021

Abstract

In this paper, we fully characterize the duality mapping over the space of matrices that are equipped with Schatten norms. Our approach is based on the analysis of the saturation of the Hölder inequality for Schatten norms. We prove in our main result that, for p(1,), the duality mapping over the space of real-valued matrices with Schatten-p norm is a continuous and single-valued function and provide an explicit form for its computation. For the special case p = 1, the mapping is set-valued; by adding a rank constraint, we show that it can be reduced to a Borel-measurable single-valued function for which we also provide a closed-form expression.

1. Introduction

In linear algebra and matrix analysis, Schatten norms are a family of spectral matrix norms that are defined via the singular-value decomposition [Citation1]. They have appeared in many applications such as image reconstruction [Citation2, Citation3], image denoising [Citation4], and tensor decomposition [Citation5], to name a few.

Generally, the Schatten-p norm of a matrix is the p norm of its singular values [Citation6]. The family contains some well-known matrix norms: The Frobenius and the spectral (operator) norms are special cases in the family, with p = 2 and p=, respectively. The case p = 1 (trace or nuclear norm) is of particular interest for applications as it can be used to recover low-rank matrices [Citation7]. This is the current paradigm in matrix completion, where the goal is to recover an unknown matrix given some of its entries [Citation8]. Prominent examples of applications that can be reduced to low-rank matrix-recovery problems are phase retrieval [Citation9], sensor-array processing [Citation10], system identification [Citation11], and index coding [Citation12, Citation13].

In addition to their many applications in data science, Schatten norms have been extensively studied from a theoretical point of view. Various inequalities concerning Schatten norms have been proven [Citation14–22]; sharp bounds for commutators in Schatten spaces have been given [Citation23, Citation24]; moreover, facial structure [Citation25], Fréchet differentiablity [Citation26], and various other aspects [Citation27, Citation28] have been studied already.

Our objective in this paper is to investigate the duality mapping in spaces of matrices that are equipped with Schatten norms. The duality mapping is a powerful tool to understand the topological structure of Banach spaces [Citation29, Citation30]. It has been used to derive powerful characterizations of the solution of variational problems in function spaces [Citation31, Citation32] and also to determine generalized linear inverse operators [Citation33]. Here, we prove that the duality mapping over Schatten-p spaces with p(1,+) is a single-valued and continuous function which, in fact, highlights the strict convexity of these spaces. Although the provided characterization is intuitive, we could not find it in the literature and this is, to the best of our knowledge, the first work which provides a direct way of computing this mapping in this case. For the special case p = 1, the mapping is set-valued. However, we prove that, by adding a rank constraint, it reduces to a single-valued Borel-measurable function. In both cases, we also derive closed-form expressions that allow one to compute them explicitly.

The paper is organized as follows: In Section 2, we present relevant mathematical tools and concepts that are used in this paper. We study the duality mapping of Schatten spaces and propose our main result in Section 3. We provide further discussions regarding the introduced mappings in Section 4.

2. Preliminaries

2.1. Dual norms, Hölder inequality, and duality mapping

Let V be a finite-dimensional vector space that is equipped with an inner-product ·,·:V×VR and let ||·||X:VR0 be an arbitrary norm on V. We then denote by X the space V equipped with ||·||X. Clearly, X is a Banach space, because all finite-dimensional normed spaces are complete. The dual norm of X, denoted by ||·||X:VR0, is defined as (1) ||v||X=supuV\{0}v,u||u||X,(1) for any vV. Following this definition, one would directly obtain the generic duality bound (2) v,u||v||X||u||X,(2) for any v,uV. Saturation of Inequality (Equation2) is the key concept of dual conjugates that is formulated in the following definition.

Definition 1.

Let V be a finite-dimensional vector space and let (||·||X,||·||X) be a pair of dual norms that are defined over V. The pair (u,v)V×V is said to be a (X,X)-conjugate, if

  • v,u=||v||X||u||X,

  • ||v||X=||u||X.

For any uV, the set of all elements vV such that (u,v) forms an (X,X)-conjugate is denoted by JX(u)V. We refer to the set-valued mapping JX:V2V as the duality mapping. If, for all uV, the set JX(u) is a singleton, then we indicate the duality mapping for the X-norm via the single-valued function JX:VV with JX(u)={JX(u)}.

It is worth mentioning that, for any uV, the set JX(u) is nonempty. In fact, the closed ball B={vV:||v||X=||u||X} is compact and, hence, the function vv,u attains its maximum value at some v*B. Now, following Definition 1, one readily verifies that (u,v*) is an (X,X)-conjugate.

We conclude this part by providing a classical and illustrative example. Let V=Rn for some nN. For any p[1,+], the p-norm of a vector u=(ui)Rn is defined as (3) ||u||p={(i=1n|ui|p)1p,p<+maxi|ui|,p=+.(3)

It is widely known that the dual norm of p is the q-norm, where (p, q) are Hölder conjugates (i.e., 1/p+1/q=1) [Citation34]. This stems from the Hölder inequality which states that (4) v,u||u||p||v||q,(4) for all u=(ui),v=(vi)Rn. In the sequel, we exclude the trivial cases u=0 and v=0 to avoid unnecessary complexities in our statements.

When 1<p<+, Inequality (Equation4) is saturated if and only if uivi0 for i=1,,n and there exists a constant c > 0 such that |u|p=c|v|q, where |u|p=(|ui|p). This ensures that the duality mapping is single-valued and also yields the map (5) Jp(u)=sign(u)|u|p1||u||pp2.(5)

For p = 1, one can verify that the equality happens if and only if, for any index i=1,,n with ui0, one has that (6) vi=sign(ui)||v||.(6)

In other words, the vector v should attain its extreme values at places where u has nonzero values, with the sign being determined by the corresponding element in u.

Due to (Equation6), the set J1(u) is not necessarily a singleton. However, if we add an additional sparsity constraint, then the mapping becomes single-valued. This leads us to introduce the new notion of sparse duality mapping in Definition 2.

Definition 2.

Let V be a finite-dimensional vector space and let s0:VN be an integer-valued function that acts as a sparsity measure. Assuming a pair (||·||X,||·||X) of dual norms over V, we call the pair (u,v)V×V a sparse (X,X)-conjugate if

  • (u,v) forms an (X,X)-conjugate pair. In other words, vJ(u).

  • The quantity s0(v) attains its minimal value over the set J(u).

We denote the set of sparse conjugates of u by JX,s0(u). Whenever JX,s0(u) is a singleton for any uV, we refer to the single-valued function JX,s0:VV with JX,s0(u)={JX,s0(u)} as the sparse duality mapping.

Following Definition 2, if we use the 0-norm as the sparsity measure, that is s0(u)=||u||0=Card({i:ui0})Footnote1, then we have the sparse duality mapping (7) J1,0:RnRn:u=(ui)v=(vi)=J1,0(u),vi={sign(ui)||u||1,ui00,ui=0.(7)

Finally, we mention that, for p = + , the reduced set J,0 is not single-valued. Indeed, let us define Imax(u)={i:|ui|=||u||}{1,,n}. We readily deduce from (6) that v=(v1,,vn)J(u) if and only if vi = 0 whenever iImax(u) and sign(vi)=sign(ui) for iImax(u) with iImax(u)|vi|=||u||. This shows that J(u) is a convex set with J,0(u) being its extreme points, where J,0(u)={uiei:iImax(u)}.

2.2. Schatten p-norm

It is widely known that any matrix ARm×n can be decomposed as (8) A=USVT,(8) where URm×m and VRn×n are orthogonal matrices and S is an m by n rectangular diagonal matrix with nonnegative real entries σ1σ2σmin(m,n)0 sorted in descending order [Citation35]. In the literature, (8) is known as the singular-value decomposition (SVD) and the entries σi are the singular values of A. In general, the SVD of a matrix A is not unique. However, the diagonal matrix S and, consequently, its entries, are fully determined from A. In other words, the values of σi are invariant to a specific choice of decomposition. This is why one can refer to the diagonal entries of S as the “singular values” of A.

When A is not full rank, one can obtain a reduced version of (8). Indeed, if we denote the rank of A by r, then we have that (9) A=UrSrVrT,(9) where UrRm×r and VrRn×r are (sub)-orthogonal matrices such that UrTUr=VrTVr=Ir and Sr=diag(σ) is a diagonal matrix that contains positive singular values σ=(σ1,,σr)Rr of A.

For any p[1,+], the Schatten-p norm of A is defined as (10) ||A||Sp={(i=1rσip)1p,p<+σ1,p=+.(10)

We remark that (Equation10) defines a family of quasi norms for p(0,1). In the extreme case p = 0, the Schatten-0 norm actually coincides with the rank of the matrix, i.e. ||A||S0=rank(A). The Schatten quasi norms have also been studied in the literature from both theoretical and practical point of views (see, [Citation36–39], and references therein).

3. Duality mapping in Schatten spaces

For any p[1,], the dual of the Schatten-p norm is the Schatten-q norm, where q[1,] is such that 1p+1q=1 [Citation1]. This is due to the generalized version of Hölder’s inequality for Schatten norms, as stated in Proposition 1. While this is a known result (see, for example, [Citation40]), it is also the basis for the present work, which is the reason why we provide a proof in A.

Proposition 1.

For any pair (p,q)[1,+]2 of Hölder conjugates with 1p+1q=1 and any pair of matrices A,BRm×n, we have that (11) A,B=Tr(ATB)||A||Sp||B||Sq.(11)

In Proposition 2, we investigate the case where the Hölder inequality is saturated, in the sense that (12) Tr(ATB)=||A||Sp||B||Sq.(12)

This saturation is central to our work, as it is tightly linked to the notion of duality mapping.

Proposition 2.

Let (p, q) be a pair of Hölder conjugates and let A,BRm×n be a pair of nonzero matrices with reduced SVDs of the form (13) A=Urdiag(σ)VrT,B=U˜r˜diag(σ˜)V˜r˜T.(13)

  • If p(1,), then the Hölder inequality is saturated if and only if we have that (14) B=cUrdiag(Jp(σ))VrT(14) or, equivalently, (15) A=c1U˜r˜diag(Jq(σ˜))V˜r˜T,(15)

    where c=||B||Sq||A||Sp and Jp(·) and Jq(·) are the duality mappings for the p and q norms, respectively (see (Equation5)).

  • If p = 1, then a necessary condition for the saturation of the Hölder inequality is that (16) rank(A)r1rank(B),(16)

where r1=Card({i:σ˜i=σ˜1}) is the multiplicity of the first singular value of B. Moreover, if we denote the first r1 singular vectors of B in (13) by U˜1Rm×r1 and V˜1Rn×r1, then the Hölder inequality is saturated if and only if there exists a symmetric matrix XRr1×r1 such that (17) A=U˜1XV˜1T.(17)

Finally in the rank-equality case rank(A)=rank(B), we have saturation if and only if (18) B=cUrVrT,(18)

where c=||B||S and the matrices Ur and Vr are defined in (13).

Remark 1.

Note that even though the reduced SVD is not unique (i.e. there are multiple choices for the sub-orthogonal matrices in (Equation13)), the parametric forms given in Proposition 2 do not depend on a specific decomposition and the results are invariant to any arbitrary choice of these reduced SVDs, primarily due to the “only if” parts of the statements.

The proof of Proposition 2 can be found in B. We observe that, in the case p(1,), the saturation of Hölder inequality provides a very tight link between the two matrices: If we know one of them, then the other lies in a one-dimensional ray that is parameterized by the constant c > 0. However, in the special case p = 1, the identification is not as simple. There again, for a given matrix B, one can fully characterize the set of admissible matrices A. However, for the reverse direction, an additional rank-equality constraint is essential to reduce the set of admissible matrices B to just one ray.

Inspired from Proposition 2, we now propose our main result in Theorem 1, where we explicitly characterize the duality mapping for the Schatten p-norms. The proof of Theorem 1 can be found in C.

Theorem 1.

Let p,q[1,+] be a pair of Hölder conjugates with 1p+1q=1 and ARm×n a matrix whose reduced SVD is specified in (Equation9).

  • If 1<p<+, then the single-valued duality mapping JSp:Rm×nRm×n is well-defined and can be expressed as (19) JSp:A=Urdiag(σ)VrTA*=Urdiag(Jp(σ))VrT.(19)

  • If p = 1 and if we consider the rank function as the sparsity measure in Definition 2, then the sparse duality mapping JS1,rank:Rm×nRm×n is well-defined (singleton) and is given as (20) JS1,rank:A=Urdiag(σ)VrTA*=||σ||1UrVrT.(20)

  • If p = + ∞, then the set-valued mapping JS(·) can be described as (21) JS(A)={σ1U1XV1T:XRr1×r1 is symmetric and ||X||S1=1},(21)

where r1 denotes the multiplicity of the first singular value σ1 of A and U1,V1 are singular vectors that correspond to σ1 in (Equation9). Finally, the set of sparse dual conjugates is the collection of rank-1 elements of JS(A) which can be characterized as (22) JS,rank(A)={σ1U1ppTV1T:pRr1,||p||2=1}.(22)

4. Discussion

Theorem 1 provides an interesting characterization of the duality mapping in three scenarios: The first case is 1<p<+ which is the most straightforward one. Theorem 1 tells us that the mapping is single-valued and also gives a formula to compute the dual conjugate A* of any matrix ARm×n. We use this result to deduce the continuity of the duality mapping as well as the strict convexity of the Schatten space in this case (see Corollary 1). In the second case, with p = 1, the mapping is not single-valued. However, there is a unique element in the set of dual conjugates with the minimal rank (that is equal to the rank of A) and, hence, we can construct a single-valued sparse duality mapping. Finally, we showed in the third case, characterized by p = + , that neither the set of dual conjugates nor the ones with the minimal rank are unique.

In Corollary 1, we highlight some consequences of Theorem 1 concerning the strict convexity of Schatten spaces and the continuity of the duality mapping.

Corollary 1.

The Banach space of m by n matrices equipped with the Schatten-p norm is strictly convex, if and only if p(1,+). In this case, the function JSp:Rm×nRm×n is continuous.

Proof.

For p(1,+), we know from Theorem 1 that the duality mapping JSp is bijective. Moreover, it is known that all finite-dimensional Banach spaces are reflexive. Now, following [Citation41], we deduce the strict convexity of the space of m by n matrices with Schatten-p norm.

For p = 1 and p = + , we can readily verify that ||α(1000)+(1α)(0001)||S1=||(α00(1α))||S1=1,||α(1000)+(1α)(1001)||S=||(100(1α))||S=1, for all α(0,1), which shows that the Schatten space is not strictly convex for p=1,+.

Finally, the Schatten-p norm is known to be Fréchet differentiable for p(1,+) [Citation26]. Moreover, the duality mapping of any Banach space with Fréchet-differentiable norms is guaranteed to be continuous [Citation42, Citation43]. Combining the two statements, we deduce the continuity of the duality mapping in this case. □

By contrast, the sparse duality mapping JS1,rank(·) is not continuous. This is best explained by providing a counterexample. Specifically, let us consider the sequence of 2 by 2 matrices Sk=(1001k),kN.

It is clear that SkS=(1000). However, we have that kN:JS1,rank(Sk)=(1001),whileJS1,rank(S)=(1000), which shows the discontinuity of JS1,rank in the space of 2 by 2 matrices. This can be generalized to space of matrices with arbitrary dimensions m,nN.

Although JS1,rank is not continuous, we now show that it is Borel-measurable and, hence, that it can be approximated with arbitrary precision by a continuous mapping due to Lusin’s theorem [Citation34].

Proposition 3.

For any m,nN, the sparse duality mapping JS1,rank is a Borel-measurable matrix-valued function over the space of m by n matrices.

Before going into the proof of Proposition 3, we present a preliminary result.

Lemma 1.

The set RrRm×n of m by n matrices of rank r is Borel-measurable.

Proof.

First note that R1={uvT:uRm,vRn}.

The set R1 is the image of the continuous mapping Rm×RnRm×n:(u,v)uvT and, hence, is Borel-measurable.

Now, denote by RrRm×n, the set of matrices with rank no more than r. Using the identity Rr=R1++R1,(r times), we deduce that Rr and, consequently, Rr=Rr\R(r1) are also Borel-measurable sets. □

Proof of Proposition 3.

Consider a Borel-measurable set BRm×n. We show that Binv=JS1,rank1(B) is also Borel-measurable. By defining Binv,r=BinvRr, we can partition Binv as Binv=r=1min(m,n)Binv,r.

Hence, it is sufficient to show that each partition Binv,r is Borel-measurable.

Define the set PrRr2 as Pr={(A,B)Rr×B:Tr(ATB)=||A||S1||B||S,||A||S1=||B||S}.

The set Pr introduces a relation over Rr whose domain is Binv,r. In other words, we have that Binv,r={ARr:BB,(A,B)Pr}.

Since the trace and norm are continuous (and, consequently, Borel-measurable) functions and Rr×B is a Borel-measurable set (using Lemma 1), we deduce that the relation induced from Pr is Borel-measurable as well. Finally, we use [Citation44, Proposition 2.1] to show that its domain is Borel-measurable. □

5. Conclusion

In this paper, we studied the duality mapping in finite-dimensional Schatten spaces. Based on a careful investigation of the cases where the Hölder inequality saturates, we provided an explicit form for this mapping when p(1,+). Furthermore, by adding a rank constraint, we proved that the mapping becomes single-valued for the special case p = 1. As for p = + , we showed that the mapping yields a convex set whose elements are explicitly characterized. Finally, we discussed our theorem and studied the continuity of the introduced mappings as well as the strict convexity of the Schatten spaces. A possible future direction of research is to extend the results of this paper to infinite-dimensional Schatten spaces and even, in full generality, to linear operators over Hilbert spaces.

A. Proof of Proposition 1

Proof.

Let us recall the reduced SVD of the matrix A as (23) A=UrSrVrT,(23) where r=rank(A),Ur=[u1ur]Rm×r,Vr=[v1vr]Rn×r, and S=diag(σ1,,σr). Similarly, for the matrix B, we have that (24) B=U˜r˜S˜r˜V˜r˜T,(24) where r˜=rank(A),U˜r˜=[u˜1u˜r˜]Rm×r˜,V˜r˜=[v˜1v˜r]Rn×r˜, and S˜=diag(σ˜1,,σ˜r˜). A direct computation then reveals that (25) Tr(ATB)=i=1rj=1r˜σiσ˜juiTu˜jviTv˜j.(25)

By using the weighted Hölder inequality for vectors [Citation45], we obtain for p1 that (26) i=1rj=1r˜σiσ˜juiTu˜jviTv˜j(i=1rσipj=1r˜|uiTu˜jviTv˜j|)1p(j=1r˜σjpi=1r|uiTu˜jviTv˜j|)1q(26) and for p = 1 that (27) i=1rj=1r˜σiσ˜juiTu˜jviTv˜j(i=1rσij=1r˜|uiTu˜jviTv˜j|)||σ˜||.(27)

Finally, by invoking Cauchy-Schwartz and the orthonormality of the matrices Ur,Vr,U˜r˜,V˜r˜, we deduce for i=1,,r that (28) j=1r˜|uiTu˜jviTv˜j|(j=1r˜(uiTu˜j)2)12(j=1r˜(viTv˜j)2)12||ui||2||vi||2=1,(28)

For j=1,,r˜, we deduce that (29) i=1r|uiTu˜jviTv˜j|(i=1r(uiTu˜j)2)12(i=1r(viTv˜j)2)12||u˜i||2||v˜i||2=1.(29)

The combination of these inequalities completes the proof. □

B. Proof of Proposition 2

Proof.

We separate the two cases and analyze each one independently.

Case 1: 1<p<+. We prove (Equation14) and deduce (Equation15) by symmetry. Following the proof of Proposition 2 and considering the reduced SVD of the matrices A and B given in (Equation23) and (Equation24), we immediately see that the inequalities (Equation26), (Equation28), and (Equation29) should all be saturated. The equality condition of the weighted Hölder implies the existence of a positive constant α>0 such that, for all (i,j){1,,r}×{1,,r˜}, we have one of the following conditions: (30) uiTu˜jviTv˜j=0,or(30) (31) uiTu˜jviTv˜j>0andσ˜jq=ασip.(31)

Moreover, the saturation of (Equation28) implies that (32) uiRange(U˜r˜),viRange(V˜r˜)i=1,,r(32) and also that there exists a positive constant βi>0 (positivity follows from (Equation31) and (Equation30)) such that (33) uiTu˜j=βiviTv˜j,j=1,,r˜.(33)

However, from the normality of ui and (Equation32), we have that (34) 1=||ui||22=j=1r˜|uiTu˜j|2=βi2j=1r˜|viTv˜j|2=βi2||v||22=βi2(34) which, together with the positivity of βi, leads to the conclusion that βi=1 for i=1,,r. Using this, we rewrite (Equation33) in matrix form as (35) UrTU˜r˜=VrTV˜r˜.(35)

Similarly, the saturation of (29) implies that (36) u˜jRange(Ur),v˜iRange(Vr),(36) for all j=1,,r˜. Putting together (Equation32) and (Equation36), we deduce that r=r˜ and (37) Range(Ur)=Range(U˜r˜),Range(Vr)=Range(V˜r˜).(37)

This implies the existence of two orthogonal matrices P,QRr×r such that (38) U˜r˜=UrP,V˜r˜=VrQ.(38)

However, replacing (Equation38) in (Equation35), we conclude that (39) P=UrTUrP=UrTU˜r˜=VrTV˜r˜=VrTVrQ=Q.(39)

This implies that the matrix B can be represented as (40) B=UrPS˜r˜PTVrT=UrS0VrT,(40) where S0=PS˜r˜PT. We now show that S0 is a diagonal matrix. Indeed, by denoting the (i, j)-th entry of P as pi,j such that P=[p1pr]=[pi,j], we rewrite (Equation30) and (Equation31) as (41) pi,j=0,or(41) (42) pi,j>0andσ˜jq=ασip,(42) for all (i,j){1,,r}2. Moreover, by expanding the (i, j)-th entry of the matrix S0, we have that [S0]i,j=[PS˜r˜PT]i,j=k=1rpi,kσ˜kpj,k=k=1rpi,kσipqα1qpj,k=σipqα1qpiTpj=[Jp(σ)]icBδ[ij], where δ[·] denotes the Kronecker delta and cB=α1q>0 is a positive constant. Finally, we obtain the announced expression in (Equation14) by replacing the above characterization of S0 in (Equation40).

For the converse, we note that, if the matrix B is in the form of (Equation14), then we have that Tr(ATB)=Tr(Urdiag(σ)VrT(Urdiag(Jp(σ))VrT)T)=cBTr(diag(σ)VrTVrdiag(Jp(σ))UrTUr)=cBσTJp(σ)=cB||σ||p||Jp(σ)||q=||A||Sp||B||Sq, which shows that the equality is indeed saturated in this case.

Case 2: p = 1. In this case, the saturation of the weighted Hölder inequality implies that, for all (i,j){1,,r}×{1,,r˜}, we have that (43) uiTu˜jviTv˜j=0,or(43) (44) uiTu˜jviTv˜j>0andσ˜j=σ˜1.(44)

For equality, we also need to have the saturation of (Equation28), which we showed to be equivalent to (Equation32) and (Equation35). From (Equation32), we deduce the existence of matrices P1,P2Rr˜×r such that (45) Ur=U˜r˜P1,Vr=V˜r˜P2.(45)

The replacement of these in (Equation35) implies that (46) P1T=P1TU˜r˜TU˜r˜=UrTU˜r˜=VrTV˜r˜=P2TV˜r˜TV˜r˜=P2T,(46) and, hence, that P1=P2=[pi,j]Rr˜×r. Now, one can rewrite the conditions (Equation43) and (Equation44) and deduce that, for any j=1,,r˜, we have that (47) pj,i=0,i=1,,rorσ˜j=σ˜1.(47)

From Conditions (Equation47) and following the definition of r1 (the multiplicity of the largest singular value), we deduce that (48) P1=[P0rres×r],(48) where PRr1×r and rres=(r˜r1). Using this form and the definition of U˜1 and V˜1 (given in the statement of the proposition), we rewrite (Equation45) as (49) Ur=U˜1P,Vr=V˜1P.(49)

Therefore, (50) Ir=UrTUr=PTU˜1TU˜1P=PTP.(50)

Hence, P is a sub-orthogonal matrix and rank(B)=r˜r1rank(P)r=rank(A).

The replacement of (Equation49) in the reduced SVD of A yields the announced expression with X=PSPT.

Based on the definitions of r1, U˜1, and V˜1, we note that one can rewrite the reduced SVD of B as (51) B=σ˜1U˜1V˜1T+U˜resS˜resV˜resT,(51)

where U˜resRm×rres,S˜resRrres×rres, and V˜resRn×rres are the remaining singular values and vectors such that U˜=[U˜1U˜res],V˜=[V˜1V˜res],S˜=[σ˜1Ir100S˜res].

Now, if A admits the form (Equation17) and if we consider the SVD of X=PSPT (the assumption that X is symmetric ensures that is has an orthogonal eigen-decomposition), then Tr(ATB)=Tr(V˜1PSPTU˜1T(σ˜1U˜1V˜1T+U˜resS˜resV˜resT))=σ˜1Tr(V˜1PSPTU˜1TU˜1V˜1T)+Tr(V˜1PSPTU˜1TU˜resS˜resV˜resT)=σ˜1Tr(V˜1PSPTIr1V˜1T)+Tr(V˜1PSPT0r1×rresS˜resV˜resT)=σ˜1Tr(SPTV˜1TV˜1P)+0=σ˜1Tr(SPTP)=σ˜1Tr(S)=||B||S||A||S1, which establishes the sufficiency in this case.

Finally, assuming that r=r1=r˜, we deduce that PRr×r is an orthogonal matrix and, hence, that P1=PT. Now, using (Equation49) and the rank assumption, we can simplify the expansion (Equation51) as (52) B=σ˜1U˜1V˜1T=σ˜1UrPT(VrPT)T=σ˜1UrPTPVrT=σ˜1UrVrT.(52)

C. Proof of Theorem 1

Proof.

Case I: 1<p<+. Assume that (A,B) forms an (Sp, Sq)-conjugate pair. Hence, we have that A,B=||A||Sp||B||Sq which, together with Proposition 2, implies that B admits the form B=||B||Sq||A||SpUrdiag(Jp(σ))VrT=Urdiag(Jp(σ))VrT.

Case II: p = 1. Similarly to the previous case, consider ARm×n and BJS1,rank(A). We have that (53) Tr(ATB)=||A||S1||B||S(53) (54) ||A||S1=||B||S(54) (55) rank(B)rank(C),CJS1(A).(55)

From (Equation53) and using Proposition 2, we deduce that rank(B)rank(A) which, together with (Equation55), implies that B should be equal to B=||B||SUrVrT=||σ||1UrVrT, where the last equality is obtained using (Equation54).

Case III: p = +. Following Proposition 2, any matrix BJS(A) can be expressed as B=U1X˜V1T, where X˜Rr1×r1 is a symmetric matrix. By defining X=σ11X˜, one readily verifies that B=σ1U1XV1T. By recalling the normalization constraint ||A||S=||B||S1, we therefore obtain that σ1=||A||S=||B||S1=σ1||X||S1, which implies that ||X||S1=1. To show that JS(A) is convex, consider two symmetric matrices X0 and X1 in the unit ball of Schatten-1 norm and define Bα=σ1U1XαV1T,Xα=αX0+(1α)X1 for α[0,1]. On one hand, from the linearity of traces, we have that Tr(ATBα)=Tr(AT(αB1+(1α)B0))=αTr(ATB1)+(1α)Tr(ATB0).

On the other hand, from the definition of X0 and X1, we deduce that B0,B1JS(A). Hence, Tr(ATBα)=α||A||S2+(1α)||A||S2=||A||S2.

However, from the Hölder inequality and the convexity of norms, we have that Tr(ATBα)||A||S||Bα||S1||A||S(α||B1||S1+(1α)||B0||S1)=||A||S.

This implies that the Hölder inequality is saturated and also that ||Bα||S1=||A||S which, altogether, implies that BαJS(A) for all α[0,1].

Finally, we observe that the set JS(A) contains all matrices of the form B=U1ppTV1T for any vector pRr1 with ||p||2=1. These are indeed all the rank-1 elements of JS(A) which, due to the Definition 2, forms the set of sparse dual conjugates. □

Additional information

Funding

This work was supported in part by the European Research Council (H2020-ERC Project GlobalBioIm) under Grant 692726 and in part by the Swiss National Science Foundation, Grant 200020_184646/1.

Notes

1 Although this functional does not satisfy the homogeneity property of a norm, it has been widely referred to as the 0-norm.

References

  • Bhatia, R. (1997). Matrix Analysis, Vol. 169. New York: Springer-Verlag.
  • Lefkimmiatis, S., Unser, M. (2013). Poisson image reconstruction with Hessian Schatten-norm regularization. IEEE Trans. Image Process. 22(11):4314–4327. DOI: 10.1109/TIP.2013.2271852.
  • Lefkimmiatis, S., Ward, J., Unser, M. (2013). Hessian Schatten-norm regularization for linear inverse problems. IEEE Trans. Image Process. 22(5):1873–1888. DOI: 10.1109/TIP.2013.2237919.
  • Xie, Y., Gu, S., Liu, Y., Zuo, W., Zhang, W., Zhang, L. (2016). Weighted Schatten p-norm minimization for image denoising and background subtraction. IEEE Trans. Image Process. 25(10):4842–4857. DOI: 10.1109/TIP.2016.2599290.
  • Gao, S., Fan, Q. (2020). Robust Schatten-p norm based approach for tensor completion. J. Sci. Comput. 82(1):1–23.
  • Horn, R. A., Johnson, C. R. (2012). Matrix Analysis. Cambridge University Press.
  • Davenport, M. A., Romberg, J. (2016). An overview of low-rank matrix recovery from incomplete observations. IEEE J. Sel. Top. Signal Process. 10(4):608–622. DOI: 10.1109/JSTSP.2016.2539100.
  • Candès, E. J., Recht, B. (2009). Exact matrix completion via convex optimization. Found. Comput. Math. 9(6):717–772. DOI: 10.1007/s10208-009-9045-5.
  • Candès, E. J., Eldar, Y. C., Strohmer, T., Voroninski, V. (2015). Phase retrieval via matrix completion. SIAM Rev. 57(2):225–251. DOI: 10.1137/151005099.
  • Davies, M. E., Eldar, Y. C. (2012). Rank awareness in joint sparse recovery. IEEE Trans. Inform. Theory 58(2):1135–1146. DOI: 10.1109/TIT.2011.2173722.
  • Fazel, M., Pong, T. K., Sun, D., Tseng, P. (2013). Hankel matrix rank minimization with applications to system identification and realization. SIAM J. Matrix Anal. Appl. 34(3):946–977. DOI: 10.1137/110853996.
  • Asadi, E., Aziznejad, S., Amerimehr, M. H., Amini, A. (2017). A fast matrix completion method for index coding. In: Proceedings of the Twenty-Fifth European Signal Processing Conference (EUSIPCO’17). Kos Island, Greece: IEEE, pp. 2606–2610.
  • Esfahanizadeh, H., Lahouti, F., Hassibi, B. (2014). A matrix completion approach to linear index coding problem. In: Proceedings of the Information Theory Workshop (ITW 2014). Hobart, Australia: IEEE, pp. 531–535.
  • Kittaneh, F. (1985). Inequalities for the Schatten p-norm. Glasgow Math. J. 26(2):141–143. DOI: 10.1017/S0017089500005905.
  • Kittaneh, F. (1987). Inequalities for the Schatten p-norm II. Glasgow Math. J. 29(1):99–104. DOI: 10.1017/S0017089500006716.
  • Kittaneh, F. (1986). Inequalities for the Schatten p-norm III. Communmath. Phys. 104(2):307–310. DOI: 10.1007/BF01211597.
  • Kittaneh, F. (1986). Inequalities for the Schatten p-norm IV. Communmath. Phys. 106(4):581–585. DOI: 10.1007/BF01463397.
  • Kittaneh, F., Kosaki, H. (1987). Inequalities for the Schatten p-norm V. Publ. Res. Inst. Math. Sci. 23(2):433–443. DOI: 10.2977/prims/1195176547.
  • Bourin, J.-C. (2006). Matrix versions of some classical inequalities. Linear Algebra Appl. 416(2–3):890–907. DOI: 10.1016/j.laa.2006.01.002.
  • Hirzallah, O., Kittaneh, F., Moslehian, M. (2010). Schatten p-norm inequalities related to a characterization of inner product spaces. Math. Inequal. Appl. 13(2):235–241. DOI: 10.7153/mia-13-19.
  • Moslehian, M. S., Tominaga, M., Saito, K.-S. (2011). Schatten p-norm inequalities related to an extended operator parallelogram law. Linear Algebra Appl. 435(4):823–829. DOI: 10.1016/j.laa.2011.01.046.
  • Conde, C., Moslehian, M. S. (2016). Norm inequalities related to p-Schatten class. Linear Algebra Appl. 498:441–449. DOI: 10.1016/j.laa.2015.11.031.
  • Wenzel, D., Audenaert, K. M. (2010). Impressions of convexity: An illustration for commutator bounds. Linear Algebra Appl. 433(11–12):1726–1759. DOI: 10.1016/j.laa.2010.06.039.
  • Cheng, C.-M., Lei, C. (2015). On Schatten p-norms of commutators. Linear Algebra Appl. 484:409–434. DOI: 10.1016/j.laa.2015.07.009.
  • So, W. (1990). Facial structures of Schatten p-norms. Linear Multilinear Algebra. 27(3):207–212. DOI: 10.1080/03081089008818012.
  • Potapov, D., Sukochev, F. (2014). Fréchet differentiability of Sp norms. Adv. Math. 262:436–475.
  • Kittaneh, F. (1989). On the continuity of the absolute value map in the Schatten classes. Linear Algebra Appl. 118:61–68. DOI: 10.1016/0024-3795(89)90571-5.
  • Bhatia, R., Kittaneh, F. (2000). Cartesian decompositions and Schatten norms. Linear Algebra Appl. 318(1-3):109–116. DOI: 10.1016/S0024-3795(00)00206-8.
  • Beurling, A., Livingston, A. (1962). A theorem on duality mappings in Banach spaces. Ark. Mat. 4(5):405–411. DOI: 10.1007/BF02591622.
  • Cioranescu, I. (1990). Geometry of Banach Spaces. Duality Mappings and Nonlinear Problems, Vol. 62, Netherlands: Springer.
  • de Boor, C. (1976). On “best” interpolation. J. Approximation Theory. 16(1):28–42. DOI: 10.1016/0021-9045(76)90093-9.
  • Unser, M. (2020). A Unifying Representer Theorem for Inverse Problems and Machine Learning. Foundations of Computational Mathematics.
  • Liu, P., Wang, Y.-W. (2007). The best generalized inverse of the linear operator in normed linear space. Linear Algebra Appl. 420(1):9–19. DOI: 10.1016/j.laa.2006.04.024.
  • Rudin, W. (1991). Functional Analysis. International Series in Pure and Applied Mathematics. New York: McGraw-Hill, Inc.
  • Johnson, C. R., Horn, R. A. (1985). Matrix Analysis. Cambridge University Press,
  • Nie, F., Huang, H., Ding, C. (2012). Low-rank matrix recovery via efficient Schatten p-norm minimization. In: Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 26.
  • Shang, F., Liu, Y., Cheng, J. (2016). Scalable algorithms for tractable Schatten quasi-norm minimization. In: Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 30.
  • Shang, F., Liu, Y., Shang, F., Liu, H., Kong, L., Jiao, L. (2020). A unified scalable equivalent formulation for schatten quasi-norms. Mathematics. 8(8):1325. DOI: 10.3390/math8081325.
  • Giampouras, P., Vidal, R., Rontogiannis, A., Haeffele, B. (2020). A novel variational form of the Schatten-p quasi-norm. arXiv preprint arXiv:2010.13927.
  • Lefkimmiatis, S., Roussos, A., Maragos, P., Unser, M. (2015). Structure tensor total variation. SIAM J. Imaging Sci. 8(2):1090–1122. DOI: 10.1137/14098154X.
  • Petryshyn, W. (1970). A characterization of strict convexity of Banach spaces and other uses of duality mappings. J. Funct. Anal. 6(2):282–291. DOI: 10.1016/0022-1236(70)90061-3.
  • Giles, J., Gregory, D., Sims, B. (1978). Geometrical implications of upper semi-continuity of the duality mapping on a Banach space. Pacific J. Math. 79(1):99–109. DOI: 10.2140/pjm.1978.79.99.
  • Contreras, M. D., Payá, R. (1994). On upper semicontinuity of duality mappings. Proc. Am. Math. Soc. 121(2):451–459. DOI: 10.1090/S0002-9939-1994-1215199-4.
  • Himmelberg, C. J., Parthasarathy, T. (1975). Measurable relations. Fund. Math. 87(1):53–72. DOI: 10.4064/fm-87-1-53-72.
  • Cvetkovski, Z. (2012). Inequalities: Theorems, Techniques and Selected Problems. Berlin Heidelberg: Springer-Verlag.