Solving equations with real Jordan canonical forms

ABSTRACT

A new equivalent version of Gordan’s theorem of the alternative is presented, based on a system of linear inequalities. This version has a more intuitive geometric interpretation than Gordan’s theorem. The real Jordan canonical form theorem is used in the proof.

PUBLIC INTEREST STATEMENT

This note examines whether a system of linear inequalities is solvable or a system of linear equations has a positive solution. We also investigate the geometrical properties of a real linear operator by means of its Jordan canonical form.

1. Introduction

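Let $R^{n}$ denote the set of real column vectors of dimension $n$, and let $R^{m \times n}$ denote the set of $m$-by-$n$ real matrices. Inequalities between vectors, such as $x \geq 0$ or $y > 0$, are understood componentwise. The following theorem describes the circumstances under which a homogeneous system of linear equations has a nonzero nonnegative solution.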

Theorem 1 (Gordan’s theorem). Let $A \in R^{m \times n}$ be arbitrary. Then, either the system

$$A x = 0 \quad \text{and} \quad x \geq 0 \tag{1.1}$$

has a nonzero solution $x \in R^{n}$, or the system

$$A^{T} y > 0 \tag{1.2}$$

has a solution $y \in R^{m}$, but never both.
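The two alternatives in Theorem 1 can be checked mechanically once a candidate vector is in hand. The following is a minimal sketch using only the Python standard library; the function names and the two small matrices are illustrative assumptions, not taken from the article, and the code only verifies given certificates rather than searching for them.

```python
def mat_vec(M, v):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def transpose(M):
    """Transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*M)]


def is_primal_certificate(A, x):
    """True if x is a nonzero solution of (1.1): A x = 0 and x >= 0."""
    return (any(xi != 0 for xi in x)
            and all(xi >= 0 for xi in x)
            and all(c == 0 for c in mat_vec(A, x)))


def is_dual_certificate(A, y):
    """True if y solves (1.2): A^T y > 0 componentwise."""
    return all(c > 0 for c in mat_vec(transpose(A), y))


# Hypothetical example where (1.1) is solvable: the columns of A1 cancel.
A1 = [[1, -1],
      [2, -2]]
x1 = [1, 1]

# Hypothetical example where (1.2) is solvable: A2^T y2 = (1, 3).
A2 = [[1, 2],
      [0, 1]]
y2 = [1, 1]

print(is_primal_certificate(A1, x1), is_dual_certificate(A2, y2))
```

Exact integer arithmetic is used so that the test `A x = 0` is meaningful; with floating-point data a tolerance would be needed.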

Theorem 1, due to Gordan (1873), has a long history and has been reproved numerous times. The survey by Kjeldsen (2002) traces the historical development of the theory of systems of linear inequalities such as (1.1) and (1.2). The theorem was rediscovered by Stiemke (1915), and it represents a large class of theorems of the alternative that play an important role in linear and nonlinear programming. Such theorems are crucial in deriving optimality conditions for wide classes of extremal problems. For more background information and applications of the theorems of the alternative and other relevant matters, we refer to the textbooks (Ciarlet, 1989; Gill et al., 1991; McCormick, 1983; Osborne, 1985), the surveys (Dax, 1993; Saunders & Schneider, 1979), and the research papers (Dax & Sreedharan, 1997; Galán, 2017; Giannessi, 1984; Mangasarian, 1981).

Gordan (1873) and Stiemke (1915) independently proved Theorem 1 by induction. However, they did not give any motivation for this investigation, and such inductive proofs do not make clear why the theorem works; see, e.g., Gale (1960). Recent proofs are usually based on separation theorems for convex sets, which have a simple geometrical interpretation. In this note, we present a simple self-contained proof based on elementary arguments in linear algebra.

The paper is organized as follows. In the next section, we present a new equivalent form of Theorem 1. In Section 3, we give a simple, elementary proof of the equivalent form by means of the real Jordan canonical form. The last section contains conclusions and some remarks.

2. An equivalent form

Let $r$ denote the rank of the matrix $A$. Clearly, one has

$$r \leq \min \{m, n\}.$$

To avoid trivial cases, we always assume that $r$ is greater than zero. By standard linear algebra (Wilkinson & Reinsch, 1971), there are an invertible matrix $M \in R^{m \times m}$ and a permutation matrix $P \in R^{n \times n}$ such that

$$M A = \begin{pmatrix} I & B \\ 0 & 0 \end{pmatrix} P, \tag{2.1}$$

where $B$ is an $r$-by-$(n - r)$ real matrix, and $I$ denotes an identity matrix of suitable size. Left multiplication by $M$ is equivalent to performing a series of elementary row operations on the matrix $A$. Then the system of equations in (1.1) can be rewritten as follows:

$$\begin{pmatrix} I & B \\ 0 & 0 \end{pmatrix} P x = 0. \tag{2.2}$$
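The reduction behind (2.1) can be sketched with Gauss-Jordan elimination and column swaps. The block below is an illustrative sketch under stated assumptions, not the procedure of the cited handbook: the function name `reduce_to_block_form` is hypothetical, exact rational arithmetic from `fractions` is used to avoid rounding, and the permutation is returned as a list `perm` in which `perm[k]` is the original index of the column that ends up in position `k`.

```python
from fractions import Fraction


def reduce_to_block_form(A):
    """Row-reduce A (with column swaps) to [[I, B], [0, 0]]; return (r, B, perm)."""
    m, n = len(A), len(A[0])
    R = [[Fraction(v) for v in row] for row in A]
    perm = list(range(n))  # perm[k] = original index of the column now at position k
    r = 0
    while r < min(m, n):
        # Look for a nonzero pivot in the not-yet-reduced part of the matrix.
        pivot = next(((i, j) for j in range(r, n) for i in range(r, m)
                      if R[i][j] != 0), None)
        if pivot is None:
            break
        i, j = pivot
        R[r], R[i] = R[i], R[r]              # row swap (part of M)
        for row in R:                         # column swap (part of P)
            row[r], row[j] = row[j], row[r]
        perm[r], perm[j] = perm[j], perm[r]
        p = R[r][r]
        R[r] = [v / p for v in R[r]]          # scale the pivot row
        for k in range(m):                    # clear the pivot column elsewhere
            if k != r and R[k][r] != 0:
                f = R[k][r]
                R[k] = [x - f * y for x, y in zip(R[k], R[r])]
        r += 1
    B = [row[r:] for row in R[:r]]
    return r, B, perm


# Hypothetical example: the first column is zero, so a column swap is required.
A = [[0, 1, 1],
     [0, 2, -1]]
r, B, perm = reduce_to_block_form(A)
print(r, B, perm)
```

Here the rank is 2 and the single column of $B$ corresponds to the zero column of $A$, so the reduced system involves only one free variable.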

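Write $\begin{pmatrix} u \\ v \end{pmatrix} = P x$, where $u \in R^{r}$ and $v \in R^{n - r}$. Since $P$ is a permutation matrix, $x \geq 0$ holds if and only if both $u \geq 0$ and $v \geq 0$ hold. The system (2.2) reads $u + B v = 0$, so $u = - B v$, and the condition $u \geq 0$ becomes $B v \leq 0$. Moreover, $x$ is nonzero exactly when $v$ is nonzero, since $v = 0$ forces $u = 0$. With $n_{1} = n - r$, this results in our main theorem, an equivalent algebraic version of Gordan’s theorem (Theorem 1).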

Theorem 2 (The main theorem). Let $B \in R^{r \times n_{1}}$ be arbitrary. Then, either the system

$$B v \leq 0 \quad \text{and} \quad v \geq 0 \tag{2.3}$$

has a nonzero solution $v \in R^{n_{1}}$, or the system

$$B^{T} y > 0 \quad \text{and} \quad y > 0 \tag{2.4}$$

has a solution $y \in R^{r}$, but never both.
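The link between Theorem 2 and Gordan’s theorem for the matrix $(I \; B)$ can be made concrete: a solution $v$ of (2.3) lifts to $x = (- B v, v)$, and a solution $y$ of (2.4) satisfies $(I \; B)^{T} y = (y, B^{T} y) > 0$. The sketch below illustrates this on two small hypothetical matrices; the helper names `lift_primal` and `stack_identity` are assumptions introduced for illustration.

```python
def mat_vec(M, v):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def transpose(M):
    """Transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*M)]


def stack_identity(B):
    """Return the matrix A = (I B) with an r-by-r identity block on the left."""
    r = len(B)
    return [[1 if i == j else 0 for j in range(r)] + list(B[i]) for i in range(r)]


def lift_primal(B, v):
    """Given v >= 0 with B v <= 0, return x = (u, v) with u = -B v >= 0."""
    u = [-c for c in mat_vec(B, v)]
    return u + list(v)


# Hypothetical B1 for which (2.3) is solvable: B1 v1 = (-1, 0) <= 0.
B1 = [[1, -2],
      [-1, 1]]
v1 = [1, 1]
x1 = lift_primal(B1, v1)
residual = mat_vec(stack_identity(B1), x1)   # should be the zero vector

# Hypothetical B2 for which (2.4) is solvable with y2 > 0.
B2 = [[2, 1],
      [1, 3]]
y2 = [1, 1]
dual_values = mat_vec(transpose(stack_identity(B2)), y2)  # (y2, B2^T y2)

print(x1, residual, dual_values)
```

The lifted vector `x1` is nonnegative and nonzero, so it solves (1.1) for $(I \; B_{1})$, while every entry of `dual_values` is positive, so `y2` solves (1.2) for $(I \; B_{2})$.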

Gordan’s theorem (Theorem 1) follows easily from Theorem 2. Our main theorem has a more intuitive geometric interpretation than Gordan’s theorem, in the sense discussed below, and is perhaps also simpler to prove.

In what follows, we assume without loss of generality that $r = n_{1}$, i.e. the matrix $B$ is square (if not, make it a square matrix by adding zero rows or columns).

Let $R_{+ +}^{r}$ and $R_{+}^{r}$ denote the positive and the nonnegative orthants, respectively, that is,

$$R_{+ +}^{r} = \{v \in R^{r} \mid v > 0\} \quad \text{and} \quad R_{+}^{r} = \{v \in R^{r} \mid v \geq 0\}.$$

If $B$ also denotes the linear operator represented by the matrix $B$, then, because a nonzero solution is required, the solvability of the system (2.3) is equivalent to

$$B (R_{+}^{r} \setminus \{0\}) \cap (- R_{+}^{r}) \neq \emptyset, \tag{2.5}$$

while the solvability of the system (2.4) is equivalent to

$$B^{*} (R_{+ +}^{r}) \cap R_{+ +}^{r} \neq \emptyset, \tag{2.6}$$

where $B^{*}$ is the adjoint operator of $B$, which is represented by the matrix $B^{T}$ when $B$ is a real matrix.

3. Proofs

The Jordan canonical form is described in terms of matrices known as elementary Jordan blocks. An elementary Jordan block of size $l \times l$ associated with an eigenvalue $λ$ is denoted by $J_{l} (λ)$, and its general form is

$$J_{l} (λ) = \begin{pmatrix} λ & 1 & & & \\ & λ & 1 & & \\ & & \ddots & \ddots & \\ & & & λ & 1 \\ & & & & λ \end{pmatrix}. \tag{3.1}$$

The basic theorem states that, given any $r \times r$ matrix $B$ with complex entries, there exists a nonsingular matrix $S$ such that

$$S B S^{- 1} = J, \quad \text{i.e.} \quad S B = J S, \tag{3.2}$$

where $J$, the Jordan canonical form of $B$, is block diagonal, each diagonal block being an elementary Jordan block $J_{l} (λ_{i})$ associated with an eigenvalue $λ_{i}$ of $B$. Apart from the ordering of the blocks along the diagonal of $J$ (which can be arbitrary), the Jordan canonical form is unique, although $S$ is far from unique.

In what follows, we mainly consider the Jordan canonical form when the matrix $B$ has only real entries. In this case, all the nonreal eigenvalues occur in conjugate pairs. Moreover, if $ξ + i η$ is an eigenvector of the real matrix $B$ associated with the complex eigenvalue $λ = a + b i$, then $ξ - i η$ is an eigenvector of $B$ associated with the conjugate eigenvalue $\bar{λ} = a - b i$, where $a, b \in R$, $ξ, η \in R^{r}$ and $i$ is the imaginary unit, i.e. $i^{2} = - 1$. That is,

$$B (ξ + i η) = (a + b i) (ξ + i η) \quad \text{and} \quad B (ξ - i η) = (a - b i) (ξ - i η),$$

or equivalently, comparing real and imaginary parts,

$$B (ξ, η) = (ξ, η) \begin{pmatrix} a & b \\ - b & a \end{pmatrix},$$

in which the $2 \times 2$ real matrix on the right, denoted by $C (a, b)$, has eigenvalues $a \pm b i$. If $ρ = \sqrt{a^{2} + b^{2}}$ and $θ$ is the angle determined by $\cos θ = a / ρ$ and $\sin θ = b / ρ$, then

$$C (a, b) = ρ \begin{pmatrix} \cos θ & \sin θ \\ - \sin θ & \cos θ \end{pmatrix}.$$

So the linear operator represented by the matrix $C (a, b)$ on the real space $\operatorname{span} \{ξ, η\}$ is a composition of a rotation and a stretching. The rotation angle of the corresponding adjoint operator, represented by the matrix $C^{T} (a, b)$, is the opposite of the rotation angle of $C (a, b)$. When $θ \neq 0$ and $θ \neq π$, these two linear operators have no eigenvector in the $2$-dimensional real space $\operatorname{span} \{ξ, η\}$.
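A small numerical check may help fix the conventions. The sketch below uses a hypothetical $2 \times 2$ matrix with eigenvalues $2 \pm i$ (not taken from the article) and verifies the real relation $B (ξ, η) = (ξ, η) C (a, b)$ exactly. It then checks the rotation-and-stretching form of $C (a, b)$, taking the stretching factor to be the modulus $\sqrt{a^{2} + b^{2}}$ and the angle to be `atan2(b, a)`; both choices are assumptions of this sketch.

```python
import math


def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


# Hypothetical real matrix with eigenvalues 2 +/- i.
B = [[1, -2],
     [1, 3]]
a, b = 2, 1
# xi + i*eta = (2, -1 - i) is an eigenvector of B for the eigenvalue 2 + i.
xi, eta = [2, -1], [0, -1]

V = [[xi[0], eta[0]],
     [xi[1], eta[1]]]            # columns are xi and eta
C = [[a, b],
     [-b, a]]

relation_ok = matmul(B, V) == matmul(V, C)   # exact integer comparison

# Rotation-and-stretching form: C = rho * R(theta).
rho = math.hypot(a, b)
theta = math.atan2(b, a)
R = [[math.cos(theta), math.sin(theta)],
     [-math.sin(theta), math.cos(theta)]]
polar_ok = all(math.isclose(C[i][j], rho * R[i][j], abs_tol=1e-12)
               for i in range(2) for j in range(2))

print(relation_ok, polar_ok)
```

Replacing `theta` by `-theta` in `R` reproduces the transpose of `C`, which is the sense in which the adjoint operator rotates through the opposite angle.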

Note that the structure of the Jordan blocks corresponding to the conjugate eigenvalue $\bar{λ}$ must be the same as the structure of the Jordan blocks corresponding to the eigenvalue $λ$; see, e.g., Horn & Johnson (1990). For example, if $λ$ is a complex eigenvalue of the real matrix $B$, and if $J_{2} (λ)$ appears in the Jordan canonical form of $B$ with a certain multiplicity, then $J_{2} (\bar{λ})$ must also appear with the same multiplicity. Then, the block matrix

$$\begin{pmatrix} J_{2} (λ) & 0 \\ 0 & J_{2} (\bar{λ}) \end{pmatrix} = \begin{pmatrix} λ & 1 & 0 & 0 \\ 0 & λ & 0 & 0 \\ 0 & 0 & \bar{λ} & 1 \\ 0 & 0 & 0 & \bar{λ} \end{pmatrix}$$

is permutation similar to the block matrix

$$\begin{pmatrix} λ & 0 & 1 & 0 \\ 0 & \bar{λ} & 0 & 1 \\ 0 & 0 & λ & 0 \\ 0 & 0 & 0 & \bar{λ} \end{pmatrix}.$$

If $λ = a + b i$ and $\bar{λ} = a - b i$, then it is also similar to the matrix

$$\begin{pmatrix} C (a, b) & I \\ 0 & C (a, b) \end{pmatrix}.$$
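Both similarity steps can be verified by direct multiplication for a concrete eigenvalue. In the sketch below, the value $λ = 2 + i$, the transformation `T` (built from the eigenvectors $(1, i)$ and $(1, - i)$ of $C (a, b)$), and the index order `sigma` are choices made for illustration and are not prescribed by the article.

```python
def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


a, b = 2, 1
lam, lam_bar = complex(a, b), complex(a, -b)

# Real block [[C, I], [0, C]] with C = C(a, b) = [[a, b], [-b, a]].
C2 = [[a, b, 1, 0],
      [-b, a, 0, 1],
      [0, 0, a, b],
      [0, 0, -b, a]]

# Permuted complex form [[D, I], [0, D]] with D = diag(lam, lam_bar).
N = [[lam, 0, 1, 0],
     [0, lam_bar, 0, 1],
     [0, 0, lam, 0],
     [0, 0, 0, lam_bar]]

# T = diag(U, U), where the columns of U = [[1, 1], [i, -i]] are eigenvectors of C.
T = [[1, 1, 0, 0],
     [1j, -1j, 0, 0],
     [0, 0, 1, 1],
     [0, 0, 1j, -1j]]

# C2 T = T N with T invertible means that C2 and N are similar.
similar_ok = matmul(C2, T) == matmul(T, N)

# Reordering the basis as (1, 3, 2, 4) turns N into diag(J_2(lam), J_2(lam_bar)).
sigma = [0, 2, 1, 3]
permuted = [[N[si][sj] for sj in sigma] for si in sigma]
jordan_pair = [[lam, 1, 0, 0],
               [0, lam, 0, 0],
               [0, 0, lam_bar, 1],
               [0, 0, 0, lam_bar]]

print(similar_ok, permuted == jordan_pair)
```

Small integer values keep the complex arithmetic exact, so plain equality can be used; for general data a tolerance-based comparison would be more appropriate.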

In general, each pair of conjugate $l$-by-$l$ Jordan blocks

$$\begin{pmatrix} J_{l} (λ) & 0 \\ 0 & J_{l} (\bar{λ}) \end{pmatrix}$$

with nonreal $λ = a + b i$ is similar to a real $2 l$-by-$2 l$ block of the form

$$C_{l} (a, b) = \begin{pmatrix} C (a, b) & I & & \\ & C (a, b) & \ddots & \\ & & \ddots & I \\ & & & C (a, b) \end{pmatrix}. \tag{3.3}$$

These observations lead us to the real Jordan canonical form; see Horn & Johnson (1990).

Theorem 3. Each real matrix $B \in R^{r \times r}$ is similar to a block diagonal real matrix of the form

$$J_{R} = \operatorname{diag} (C_{r_{1}} (a_{1}, b_{1}), \dots, C_{r_{p}} (a_{p}, b_{p}), J_{r_{p + 1}} (λ_{p + 1}), \dots, J_{r_{k}} (λ_{k})), \tag{3.4}$$

where $λ_{j} = a_{j} + b_{j} i$ is a nonreal eigenvalue of $B$ for $j = 1, 2, \dots, p$, with $a_{j}, b_{j}$ real, and $λ_{p + 1}, \dots, λ_{k}$ are real eigenvalues of $B$. The real Jordan blocks $C_{r_{j}} (a_{j}, b_{j})$ are of the form (3.3), and the Jordan blocks $J_{r_{j}} (λ_{j})$ are exactly the Jordan blocks in (3.2) with $λ = λ_{j}$.
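The block structure in (3.1), (3.3) and (3.4) is easy to assemble programmatically. The helpers below are a sketch with hypothetical names (`jordan_block`, `real_jordan_block`, `block_diag`); they only build the matrices and do not compute the canonical form of a given $B$.

```python
def jordan_block(l, lam):
    """Elementary Jordan block J_l(lam): lam on the diagonal, 1 on the superdiagonal."""
    return [[lam if i == j else (1 if j == i + 1 else 0) for j in range(l)]
            for i in range(l)]


def real_jordan_block(l, a, b):
    """Real block C_l(a, b): C(a, b) on the block diagonal, I on the block superdiagonal."""
    n = 2 * l
    M = [[0] * n for _ in range(n)]
    for k in range(l):
        i = 2 * k
        M[i][i], M[i][i + 1] = a, b
        M[i + 1][i], M[i + 1][i + 1] = -b, a
        if k < l - 1:                 # identity block to the right of C(a, b)
            M[i][i + 2] = 1
            M[i + 1][i + 3] = 1
    return M


def block_diag(*blocks):
    """Place square blocks along the diagonal of a zero matrix."""
    n = sum(len(blk) for blk in blocks)
    M = [[0] * n for _ in range(n)]
    offset = 0
    for blk in blocks:
        for i, row in enumerate(blk):
            for j, v in enumerate(row):
                M[offset + i][offset + j] = v
        offset += len(blk)
    return M


# Hypothetical example: one nonreal pair 0 +/- i (block size 1) and a real
# eigenvalue 3 with a Jordan block of size 2.
J_R = block_diag(real_jordan_block(1, 0, 1), jordan_block(2, 3))
for row in J_R:
    print(row)
```

Keeping the blocks as plain nested lists makes the layout of $J_{R}$ visible when printed; a numerical library would normally be used for larger matrices.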

To prove our main theorem, we investigate the connection between the real Jordan canonical forms of the real matrix $B$ and of its transpose. A simple observation is that if the real matrix $B$ has a real Jordan canonical form $J_{R}$, then its transpose is similar to $J_{R}^{T}$. This yields the following result.

Corollary 4. Let the real Jordan canonical form $J_{R}$ of the real matrix $B$ be defined by (3.4).

(1) If $ζ_{j}$ is an eigenvector of $B^{T}$ associated with the Jordan block $J_{r_{j}}^{T} (λ_{j})$, then

$$W_{j} (B) = \operatorname{span} \{ζ_{j}, B ζ_{j}, B^{2} ζ_{j}, \dots, B^{r_{j} - 1} ζ_{j}\}$$

is an invariant subspace of $B$, on which $B$ has exactly one linearly independent real eigenvector;

(2) If $ξ_{j} \pm i η_{j}$ are eigenvectors of $B^{T}$ on the complex space $C^{r}$ associated with the real Jordan block $C_{r_{j}}^{T} (a_{j}, b_{j})$, where $ξ_{j}, η_{j} \in R^{r}$, then

$$W_{j} (B) = \operatorname{span} \{ξ_{j}, η_{j}, B ξ_{j}, B η_{j}, \dots, B^{r_{j} - 1} ξ_{j}, B^{r_{j} - 1} η_{j}\}$$

is an invariant subspace of $B$ on $R^{r}$;

(3) $R^{r}$ is a direct sum of the subspaces $W_{j} (B)$, $j = 1, 2, \dots, k$, i.e.

$$R^{r} = W_{1} (B) \oplus W_{2} (B) \oplus \dots \oplus W_{k} (B). \tag{3.5}$$

Proof. If $r_{j} = 1$, then the result is trivial. Otherwise, each Jordan block $J_{r_{j}} (λ_{j})$ with real $λ_{j}$ and its transpose $J_{r_{j}}^{T} (λ_{j})$ have only one linearly independent real eigenvector, namely $e_{1}$ and $e_{r_{j}}$, respectively, where $e_{1}$ and $e_{r_{j}}$ denote the first column and the last column of the identity matrix of order $r_{j}$. It is then easy to verify that the above results hold for the real Jordan canonical form $J_{R}$. The desired results for $B$ follow because the decomposition into the subspaces $W_{j} (J_{R})$ is carried over by the similarity transformation. □
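The role of the transpose in item (1) can be seen on a single block. For the upper triangular block (3.1) of size 3, the last unit vector is an eigenvector of the transpose, and the vectors $ζ, J ζ, J^{2} ζ$ generated from it span the whole space. The sketch below checks this for the hypothetical value $λ = 2$ using a hand-written $3 \times 3$ determinant; the helper names are assumptions.

```python
def mat_vec(M, v):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def transpose(M):
    """Transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*M)]


def det3(M):
    """Determinant of a 3-by-3 matrix by cofactor expansion along the first row."""
    return (M[0][0] * (M[1][1] * M[2][2] - M[1][2] * M[2][1])
            - M[0][1] * (M[1][0] * M[2][2] - M[1][2] * M[2][0])
            + M[0][2] * (M[1][0] * M[2][1] - M[1][1] * M[2][0]))


lam = 2
J = [[lam, 1, 0],
     [0, lam, 1],
     [0, 0, lam]]

zeta = [0, 0, 1]                       # last unit vector e_3
is_eigvec_of_transpose = mat_vec(transpose(J), zeta) == [lam * z for z in zeta]

# Generate zeta, J zeta, J^2 zeta and test linear independence.
k1 = mat_vec(J, zeta)
k2 = mat_vec(J, k1)
krylov_det = det3([zeta, k1, k2])      # nonzero means the three vectors span R^3

print(is_eigvec_of_transpose, k1, k2, krylov_det)
```

Starting instead from the first unit vector, which is an eigenvector of `J` itself, would generate only a one-dimensional subspace, which is why the eigenvector of the transpose is the natural starting point.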

If the real matrix $B$ in Corollary 4 is replaced by its transpose, then the corresponding result is also true. We thus obtain another decomposition of $R^{r}$ in terms of the matrix $B^{T}$:

$$R^{r} = W_{1} (B^{T}) \oplus W_{2} (B^{T}) \oplus \dots \oplus W_{k} (B^{T}). \tag{3.6}$$

Of course, we have $W_{j} (B^{T}) = W_{j} (B)$ for all $j = 1, 2, \dots, k$.

It follows from Corollary 4 that if $g_{j}$ and $V_{λ_{j}} (B^{T})$ denote the geometric multiplicity and the eigensubspace of $λ_{j}$ associated with the matrix $B^{T}$, then the null space of $(B - λ_{j} I)^{g_{j}}$ is equal to

$$\operatorname{span} \{V_{λ_{j}} (B^{T}), B V_{λ_{j}} (B^{T}), \dots, B^{g_{j} - 1} V_{λ_{j}} (B^{T})\}.$$

In addition, the first and last items of Corollary 4 remain true when $B$ is a complex matrix, provided that the transpose matrix $B^{T}$ is replaced by the conjugate transpose matrix $B^{*}$, and the corresponding Jordan block $J_{r_{j}}^{T} (λ_{j})$ is replaced by $J_{r_{j}}^{T} (\bar{λ}_{j})$.

Let

$$V_{j}^{+} (B) = \operatorname{span} \{ξ_{j} + i η_{j}, B (ξ_{j} + i η_{j}), \dots, B^{r_{j} - 1} (ξ_{j} + i η_{j})\}$$

and

$$V_{j}^{-} (B) = \operatorname{span} \{ξ_{j} - i η_{j}, B (ξ_{j} - i η_{j}), \dots, B^{r_{j} - 1} (ξ_{j} - i η_{j})\}$$

be invariant subspaces of $B$, where $ξ_{j} \pm i η_{j}$ are eigenvectors of $B^{T}$ associated with the eigenvalues $a_{j} \pm b_{j} i$. Then $ξ_{j} \pm i η_{j} \in V_{j}^{+} (B) \oplus V_{j}^{-} (B)$, and $B$ acts on a 2-dimensional real subspace $V = \operatorname{span} \{ξ, η\} \subset W_{j} (B)$ as a rotation through the angle $θ_{j}$ combined with a stretching, where $θ_{j}$ is the argument of $a_{j} + b_{j} i$, that is, $\cos θ_{j} = a_{j} / ρ_{j}$ and $\sin θ_{j} = b_{j} / ρ_{j}$ with $ρ_{j} = \sqrt{a_{j}^{2} + b_{j}^{2}}$, while the adjoint operator $B^{*}$ acts on a subspace $V_{j} \subset W_{j} (B^{T})$ as a rotation through the angle $- θ_{j}$.

We are now ready to prove Theorem 2.

Proof of the main theorem. The conditions (2.5) and (2.6) cannot hold together. Indeed, if $v \geq 0$ is nonzero with $B v \leq 0$ and $y > 0$ satisfies $B^{T} y > 0$, then $y^{T} B v = (B^{T} y)^{T} v > 0$, whereas $y^{T} (B v) \leq 0$, a contradiction. By Corollary 4, we only need to prove that at least one of the two conditions holds for each $W_{j} = W_{j} (B) = W_{j} (B^{T})$, $j = 1, 2, \dots, k$. The proof requires consideration of two cases.

When the eigenvalue $λ_{j}$ is a real number, the condition (2.6) holds on $W_{j}$ if and only if $λ_{j} > 0$, while the condition (2.5) holds on $W_{j}$ if and only if $λ_{j} \leq 0$.

When the eigenvalue $λ_{j} = a_{j} + b_{j} i$ is a nonreal complex number, we may take $b_{j} > 0$, so that the rotation angle $θ_{j}$, the argument of $λ_{j}$, lies strictly between $0^{\circ}$ and $180^{\circ}$. Then the condition (2.6) holds on $W_{j}$ if and only if $0^{\circ} < θ_{j} < 90^{\circ}$, while the condition (2.5) holds on $W_{j}$ if and only if $90^{\circ} \leq θ_{j} < 180^{\circ}$. The proof is finished.
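For a single $2 \times 2$ block $B = C (a, b)$ the angle criterion can be explored numerically. The sketch below is only an illustration for that special case and not a substitute for the proof: the function name `scan_block`, the grid of directions in the first quadrant, and the tolerance are all assumptions, and a coarse grid could miss witnesses when the angle is very close to $0^{\circ}$ or $90^{\circ}$.

```python
import math


def mat_vec(M, v):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def transpose(M):
    """Transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*M)]


def scan_block(a, b, steps=180, tol=1e-12):
    """Search unit directions in the first quadrant for witnesses of (2.3) and (2.4)
    when B is the single 2-by-2 block C(a, b). Returns (has_23, has_24)."""
    C = [[a, b], [-b, a]]
    Ct = transpose(C)
    angles = [0.5 * math.pi * k / steps for k in range(steps + 1)]
    dirs = [(math.cos(t), math.sin(t)) for t in angles]
    # (2.3): a nonzero v >= 0 with C v <= 0 (closed quadrant, axes included).
    has_23 = any(all(c <= tol for c in mat_vec(C, v)) for v in dirs)
    # (2.4): y > 0 with C^T y > 0 (open quadrant, axes excluded).
    has_24 = any(all(c > tol for c in mat_vec(Ct, y)) for y in dirs[1:-1])
    return has_23, has_24


# Arguments of a + b*i: 45, 90 and 135 degrees, respectively.
for a, b in [(1, 1), (0, 1), (-1, 1)]:
    print(a, b, scan_block(a, b))
```

In each of the three sampled cases exactly one of the two systems has a witness on the grid, which agrees with the alternative stated in Theorem 2 for this special block.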

Finally, Gordan’s theorem follows from our main theorem trivially.

4. Concluding remarks

In this note, we have given an alternative proof of Gordan’s theorem, based on the real Jordan canonical form. Other purely algebraic approaches apply properties of orthogonal or antisymmetric matrices; see, for example, Broyden (1998), Tucker (1956) and Vajda (1961). These approaches still do not make clear why the theorem works. In our proof, a new geometrical interpretation for a pair of real linear adjoint operators is presented.

Other theorems of the alternative also seem to be related to the above geometrical interpretation. This is left for future work.

Acknowledgements

This study was funded by the National Natural Science Foundation of China (11871118). The author has received research grants from Yangtze University. The author declares that he has no conflict of interest. The author is indebted to the anonymous referees, whose detailed remarks helped in improving the manuscript.

Disclosure statement

No potential conflict of interest was reported by the author(s).

Additional information

Funding

Supported by the National Natural Science Foundation of China (11871118).

Notes on contributors

Zi-Zong Yan

Zi-Zong Yan is currently a professor in the School of Information and Mathematics at Yangtze University. He received his Ph.D. degree from the School of Mathematics and Statistics at Wuhan University in 2005. His current research interests are in the areas of optimization theory and matrix analysis.

Zhi-Jia Shu

Zhi-Jia Shu was born in Hubei, China, in 1995. He received his Bachelor’s degree from the School of Information and Mathematics at Yangtze University in 2019. He is currently a master’s student at Yangtze University. His research interest is optimization theory.

References

Broyden, C. G. (1998). A simple algebraic proof of Farkas’s lemma and related theorems. Optimization Methods and Software, 8(3–4), 185–199. https://doi.org/10.1080/10556789808805676
Ciarlet, P. G. (1989). Introduction to numerical linear algebra and optimization. Cambridge University Press.
Dax, A. (1993). The relationship between theorems of the alternative, least norm problems, steepest descent directions, and degeneracy: A review. Annals of Operations Research, 46(1), 11–60. https://doi.org/10.1007/BF02096256
Dax, A., & Sreedharan, P. (1997). Theorems of the alternative and duality. Journal of Optimization Theory and Applications, 94(3), 561–590. https://doi.org/10.1023/A:1022644832111
Galán, M. R. (2017). A theorem of the alternative with an arbitrary number of inequalities and quadratic programming. Journal of Global Optimization, 69(2), 427–442. https://doi.org/10.1007/s10898-017-0525-x
Gale, D. (1960). The theory of linear economic models. McGraw-Hill.
Giannessi, F. (1984). Theorems of the alternative and optimality conditions. Journal of Optimization Theory and Applications, 42(3), 331–365. https://doi.org/10.1007/BF00935321
Gill, P. E., Murray, W., & Wright, M. H. (1991). Numerical linear algebra and optimization (Vol. 1). Addison-Wesley.
Gordan, P. (1873). Über die Auflösung linearer Gleichungen mit reellen Coefficienten. Mathematische Annalen, 6(1), 23–28. https://doi.org/10.1007/BF01442864
Horn, R. A., & Johnson, C. R. (1990). Matrix analysis. Cambridge University Press.
Kjeldsen, T. H. (2002). Different motivations and goals in the historical development of the theory of systems of linear inequalities. Archive for History of Exact Sciences, 56(6), 469–538. https://doi.org/10.1007/s004070200057
Mangasarian, O. L. (1981). A stable theorem of the alternative: An extension of the Gordan theorem. Linear Algebra and Its Applications, 41, 209–223. https://doi.org/10.1016/0024-3795(81)90100-2
McCormick, G. P. (1983). Nonlinear programming. John Wiley.
Osborne, M. R. (1985). Finite algorithms in optimization and data analysis. John Wiley & Sons.
Saunders, B. D., & Schneider, H. (1979). Applications of the Gordan-Stiemke theorem in combinatorial matrix theory. SIAM Review, 21(4), 528–541. https://doi.org/10.1137/1021094
Stiemke, E. (1915). Über positive Lösungen homogener linearer Gleichungen. Mathematische Annalen, 76(2–3), 340–342. https://doi.org/10.1007/BF01458147
Tucker, A. W. (1956). Dual systems of homogeneous linear relations. In H. W. Kuhn & A. W. Tucker (Eds.), Linear Inequalities and Related Systems (pp. 3–18). Ann Math Stud. Princeton Univ. Press.
Vajda, S. (1961). Mathematical programming. Addison-Wesley.
Wilkinson, J. H., & Reinsch, C. (1971). Linear algebra, handbook for automatic computation (Vol. 2). Springer-Verlag.
