859
Views
0
CrossRef citations to date
0
Altmetric
APPLIED & INTERDISCIPLINARY MATHEMATICS

Solving equations with real Jordan canonical forms

&
Article: 2068740 | Received 27 Aug 2021, Accepted 15 Apr 2022, Published online: 03 May 2022

ABSTRACT

A new equivalent version of Gordan’s theorem of the alternative is presented based on the system of linear inequalities. Such a version has a more intuitive geometric interpretation than Gordan’s theorem. The real Jordan canonical form theorem is utilized in proving the theorem.

PUBLIC INTEREST STATEMENT

This note examines whether a system of linear inequalities is solvable or a system of linear equations has a positive solution. In this note, we also investigated the geometrical properties of real linear operators by the use of its Jordan canonical form.

1. Introduction

Let Rn denote the set of vectors with size n, and Rm×n denote the collection of m-by-n real matrices. The following theorem examines circumstances under which a system of linear equations has a positive solution.

Theorem 1 (Gordan’s theorem). Let ARm×n be arbitrary. Then, either the system

(1.1) Ax=0andx0(1.1)

has a nonzero solution xRn, or the system

(1.2) ATy>0(1.2)

has a solution yRm, but never both.

Theorem 1, due to Gordan (Gordan, Citation1873), has a long history and has been reproved numerous times. The survey by Kjeldsen (Kjeldsen, Citation2002) attributed the historical development of the theory of the systems of linear inequalities (1.1) and (1.2). It was rediscovered by Stiemke (Stiemke, Citation1915), representing a large class of theorems of the alternative that play an important role in linear and nonlinear programming. Such theorems are crucial in deriving optimality conditions for wide classes of extremal problems. For more background information and applications on the theorems of the alternative and other relevant matters, we refer to the textbooks (Ciarlet, Citation1989; Gill et al., Citation1991; Mccormick, Citation1983; Osborne, Citation1985), the surveys (Dax, Citation1993; Saunders & Schneider, Citation1979), and research papers (Dax & Sreedharan, Citation1997; Galán, Citation2017; Giannessi, Citation1984; Mangasarian, Citation1981).

Gordan (Gordan, Citation1873) and Stiemke (Stiemke, Citation1915) proved Theorem 1 by induction independently. However, they did not give any motivation for this investigation. Such an investigation do not make clear why the theorem works; see e.g. (Gale, Citation1960). Recent proofs are usually based on separation theorems of the convex sets with a simple geometrical interpretation. In this note, we present a simple self-contained proof which is based on elementary arguments in linear algebra.

The paper is organized as follows. In the next section, we will present a new equivalent form of Theorem 1. And in Section 3, we will give a simple, elementary proof of the equivalent form by the use of the real Jordan canonical form. The last section contains conclusions and some remarks.

2. An equivalent form

Let r denote the rank of the matrix A. Clearly, one has

rmin{m,n}.

For the sake of avoiding trivial cases, we always assume that r is greater than zero. By the theory of the linear algebra (Wilkinson & Reinsch, Citation1971), there is an invertible matrix MRm×m and a permutation matrix PRn×n such that

(2.1) MA=IB00P,(2.1)

where B is a r-by-(nr) real matrix, and I denotes a unit matrix with a suitable size. The left product is equivalent to performing a series of elementary row operations on the matrix A. Then the system of equations in (1.1) can be rewritten as follows:

(2.2) IB00Px=0.(2.2)

If uv=Px, where uRr and vRnr, then x0 means that both u0 and v0 hold. This results in our main theorem, an equivalent algebraic version of Gordan’s theorem (Theorem 1).

Theorem 2. (The Main Theorem). Let BRr×n1 be arbitrary. Then, either the system

(2.3) Bv0andv0(2.3)

has a nonzero solution vRn1, or the system

(2.4) BTy>0andy>0(2.4)

has a solution yRr, but never both.

From this it is simple to prove Gordan’s theorem (Theorem 1). Our main theorem has a more intuitive geometric interpretation than Gordan’s theorem in the sense discussed below but despite this is perhaps simpler to prove.

In the following statement, we assume without loss of generality that r=n1, i.e. the matrix B is square (if not, make it a square matrix by adding zero rows or columns).

Let R++r and R+r denote the positive and the nonnegative quadrant cones, respectively, that is,

R++r={vRr|v>0}andR+r={vRr|v0}.

If B is a linear operator represented by the matrix B, then the solvability of the system (2.3) is equivalent to that

(2.5) B(R+r)(R+r),(2.5)

while the solvability of the system (2.4) is equivalent to that

(2.6) B(R++r)R++r,(2.6)

where B is the adjoint operator of B, and it can be represented by the matrix BT when B is a real matrix.

3. Proofs

The Jordan canonical form is described with reference to matrices known as elementary Jordan blocks. An elementary Jordan block of size l×l associated with an eigenvalue λ will be denoted by Jl(λ), and its general form is adequately illustrated by the definition

(3.1) Jl(λ)=λ1λ1λ1λ.(3.1)

The basic theorem is that given any r×r matrix with complex entries, there exists a nonsingular matrix S such that

(3.2) SBS1=J,SB=JS,(3.2)

where J, the Jordan canonical form of B, is block diagonal, each diagonal matrix being an elementary Jordan block with λ=λi. Apart from the ordering of the blocks along the diagonal of J (which can be arbitrary), the Jordan canonical form is unique, although S is far from unique.

Following we mainly consider the Jordan canonical form when the matrix B has only real entries. In this case, all the nonreal eigenvalues must occur in conjugate pairs. Moreover, if ξ+iη is an eigenvector of the real matrix B associated with the complex eigenvalue λ=a+bi, then ξiη must be an eigenvector of B associated with the conjugate eigenvalue λˉ=abi, where a,bR, ξ,ηRr and i is the imaginary unit, i.e. i2=1. That is

B(ξ+iη)=(a+bi)(ξ+iη)andB(ξiη)=(abi)(ξiη),

or equivalently,

B(ξ,η)=(ξ,η)abba,

in which the right 2×2 real matrix, denoted by C(a,b), has eigenvalues a±bi. If θ=arcsinba2+b2, then

C(a,b)=(a2+b2)cosθsinθsinθcosθ.

So the linear operator represented by the matrix C(a,b) on the real space span{ξ,η} is a superposition of rotation and stretching. Of course, the rotation angle of the corresponding adjoint operator, represented by the matrix CT(a,b), is the opposite of the rotational angle of C(a,b). When θ0 or π, such two linear operators have no eigenvector on the 2-dimensional real space span{ξ,η}.

Note that the structure of the Jordan blocks corresponding to the conjugate eigenvalue λˉ must be the same as the structure of the Jordan blocks corresponding to the eigenvalue λ, e.g. see, (Horn & Johnson, Citation1990). For example, if λ is a complex eigenvalue of the real matrix B, and if J2(λ) appears in the Jordan canonical form of B with a certain multiplicity, J2(λˉ) must also appear with the same multiplicity. Then, the block matrix

J2(λ)00J2(λˉ)=λ1000λ0000λˉ1000λˉ

is permutation similar to the block matrix

λ0100λˉ0100λ0000λˉ.

If λ=a+bi and λˉ=abi, then it is also similar to the matrix

C(a,b)I0C(a,b).

In general, each block pair of conjugate l-by-l Jordan block

Jl(λ)00Jl(λˉ)

with nonreal λ=a+bi is similar to a real 2l-by-2l block of the form

(3.3) Cl(a,b)=C(a,b)IC(a,b)IC(a,b).(3.3)

These observations lead us to the real Jordan canonical form, see, (Horn & Johnson, Citation1990).

Theorem 3. Each real matrix BRr×r is similar to a block diagonal real matrix of the form

(3.4) JR=diag(Cr1(a1,b1),n,Crp(arp,brp),Jrp+1(λp+1),,Jrk(λk)),(3.4)

where λj=aj+bji is a nonreal eigenvalue of B for j=1,2,,p, aj,bj are real, λp+1,,λk are real eigenvalues of B. The real Jordan block Crj(aj,bj) is of the form (3.3), and the Jordan block Jrj(λj) are exactly the Jordan blocks in (3.2) with λ=λj.

The motivation to prove our main theorem drives us to investigate the connection between the real Jordan canonical forms of the real matrix B and its transpose. A simple observation is that if the real matrix B has a real Jordan canonical form JR, then its transpose has a real Jordan canonical form JRT. This yields the following result.

Corollary 4. Let the real Jordan canonical form JR of the real matrix B be defined by (3.4).

(1) If ζj is an eigenvector of BT associated with the Jordan block JrjT(λj), then

Wj(B)=span{ζj,Bζj,B2ζj,,Brj1ζj}

is an invariant subspace of B, in which B has one linearly independent real eigenvector on Wj(B);

(2) If ξj±iηj are eigenvectors of BT on the complex space Cr associated with the real Jordan block CrjT(aj,bj), where ξj,ηjRr, then

Wj(B)=span{ξj,ηj,Bξj,Bηj,,Brj1ξj,Brj1ηj}

is an invariant subspace of B on Rr;

(3) Rr is a direct sum of the subspaces Wj(B),j=1,2,,k, i.e.

(3.5) Rr=W1(B)W2(B)Wk(B).(3.5)

Proof. If rj=1, then the result is trivial. Otherwise, each Jordan block Jrj(λj) and its transpose JrjT(λˉj) have only one linearly independent real eigenvector erj and e1, respectively, where erj and e1 denote the last column and the first column of the unit matrix with order rj. Then, it is easy to verify that the above results for the real Jordan canonical form JR hold. Then, the desired results hold since the similarity invariance of the subspaces Wj(J).□

If the real matrix B in Corollary 4 is replaced by its transpose, then the corresponding result also is true. We may obtain another decomposition of Rr in the form of the matrix BT

(3.6) Rr=W1(BT)W2(BT)Wk(BT).(3.6)

Of course, we have Wj(BT)=Wj(B) for all j=1,2,,k.

It follows from Corollary 4 that if gj and Vλj(BT) denote the geometric multiplicity and the eigensubspace of λj associated with the matrix BT, then the null space of (BλI)gj is equal to

span{VλjT,BVλj(BT),,Bgj1Vλj(BT)}.

In addition, the first and last terms of Corollary 4 is still true when B is a complex matrix, only if the transpose matrix BT is replaced by the conjugate transpose matrix B, and the corresponding Jordan block JrjT(λj) is replaced by JrjT(λˉj).

Let

Vj+(B)=span{ξj+iηj,B(ξj+iηj),,Brj(ξj+iηj)}

and

Vj(B)=span{ξjiηj,B(ξjiηj),,Brj(ξjiηj)}

be invariant subspaces of B, where ξj±iηj are eigenvalues of BT associated with the eigenvalues aj±bji. Then ξj±iηjVj+(B)Vj(B) such that B on one 2-dimensional real subspace V=span{ξ,η}Wj(B) rotates the angles θj=arcsinbjaj2+bj2, while the adjoint operator B on the subspace VjWjBT rotates the angles θj.

Now we ready to Theorem 2.

Proof of the main theorem. It is easy to show that both two inequalities (2.5) and (2.6) cannot be true together. From Corollary 4, we only need prove that at least one of them holds for each Wj=Wj(B)=Wj(BT), j=1,2,,k. The proof requires consideration of two cases.

When the eigenvalue λj is a real number, the inequality (2.6) holds on Wj if and only if λj>0, while the inequality (2.5) holds on Wj if and only if λj0.

When the eigenvalue λj=aj+bji is a complex number, the inequality (2.6) holds on Wj if and only if 00<arcsinbjaj2+bj2<900, while the inequality (2.5) holds on Wj if and only if 900arcsinbjaj2+bj21800. The proof is finished.

Finally, Gordan’s theorem follows from our main theorem trivially.

4. Concluding remarks

In this note, we have given an alternative proof of Gordan’s theorem, a proof that is based on the real Jordan canonical form. Another purely algebraic ways are to apply the property of the orthogonal or antisymmetric matrix. See, for example, (Broyden, Citation1998) and (Tucker, Citation1956; Vajda, Citation1961). This approach still don’t make clear why the theorem works. In our proof, a new geometrical interpretation for a pair of real linear adjoint operators is presented.

Other theorems of the alternative seem to be also related to the above geometrical interpretation. This will be our further work.

Acknowledgements

This study was funded by the National Natural Science Foundation of China (11871118). The author has received research grants from Yangtze University. The author declares that he has no conflict of interest. The author is indebted to the anonymous referees, whose detailed remarks helped in improving the manuscript.

Disclosure statement

No potential conflict of interest was reported by the author(s).

Additional information

Funding

Supported by the National Natural Science Foundation of China (11871118).

Notes on contributors

Zi-Zong Yan

Zi-Zong Yan is currently a professor in the school of information and mathematics at Yangtze university. He received his Ph.D. degree in the school of mathematics and statistics from Wuhan university in 2005. His current research interests are in the area of optimization theory and matrix analysis.

Zhi-Jia Shu

Zhi-Jia Shu was born in Hubei, China, in 1995. He received his Bachelor’s degree in the school of information and mathematics at Yangtze university, in 2019. Currently, he is a master in reading at Yangtze university. His research interest is optimization theory.

References

  • Broyden, C. G. (1998). A simple algebraic proof of Farkas’s lemma and related theorems. Optimization Methods and Software, 8(3–4), 185–5. https://doi.org/10.1080/10556789808805676
  • Ciarlet, P. G. (1989). Introduction to numerical linear algebra and optimization. Cambridge University Press.
  • Dax, A. (1993). The relationship between theorems of the alternative, least norm problems, steepest descent directions, and degeneracy: A review. Annals of Operations Research, 46(1), 11–60 https://doi.org/10.1007/BF02096256
  • Dax, A., & Sreedharan, P. (1997). Theorems of the alternative and duality. Journal of Optimization Theory and Applications, 94(3), 561–590. https://doi.org/10.1023/A:1022644832111
  • Galán, M. R. (2017). A theorem of the alternative with an arbitrary number of inequalities and quadraticprogramming. Journal of Global Optimization, 69(2), 427–442. https://doi.org/10.1007/s10898-017-0525-x
  • Gale, D. (1960). The theory of linear economic models. McGraw-Hill.
  • Giannessi, F. (1984). Theorems of the alternative and optimality conditions. Journal of Optimization Theory and Applications, 42(3), 331–365. https://doi.org/10.1007/BF00935321
  • Gill, P. E., Murry, W., & Wright, M. H. (1991). Numerical linear algebra and optimization. Vol. 1 Addison-Wesley.
  • Gordan, P. (1873). Über die auflösung linearer gleichungen mit reelen coefficienten. Mathematische Annalen, 6(1), 23–28. https://doi.org/10.1007/BF01442864
  • Horn, R. A., & Johnson, C. R. (1990). Matrix analysis. Cambridge University Press.
  • Kjeldsen, T. H. (2002). Different motivations and goals in the historical development of the theory of systems of linear inequalities. Archive for History of Exact Sciences, 56(6), 469–538. https://doi.org/10.1007/s004070200057
  • Mangasarian, O. L. (1981). A stable theorem of the alternative: An extension of the Gordan theorem. Linear Algebra and Its Applications, 41, 209–223. https://doi.org/10.1016/0024-3795(81)90100-2
  • Mccormick, G. P. (1983). Nonlinear programming. John Wiley.
  • Osborne, M. R. (1985). Finite algorithms in optimization and data analysis. John Wiley & Sons.
  • Saunders, B. D., & Schneider, H. (1979). Applications of the Gordan-Stiemke theorem in combinatorial matrix theory. SIAM Review, 21(4), 528–541. https://doi.org/10.1137/1021094
  • Stiemke, E. (1915). Über positive Lösungen homogener linearer Gleichungen. Mathematische Annalen, 76(2–3), 340–342. https://doi.org/10.1007/BF01458147
  • Tucker, A. W. (1956). Dual systems of homogeneous linear relations. In H. W. Kuhn & A. W. Tucker (Eds.), Linear Inequalities and Related Systems (pp. 3–18). Ann Math Stud. Princeton Univ. Press.
  • Vajda, S. Mathematical programming, Addison Wesley: 1961.
  • Wilkinson, J. H., & Reinsch, C. (1971). Linear algebra, handbook for automatic computation (Vol. 2). Springer-Verlag.