Norm statement considered harmful: comment on ‘evolution of unconditional dispersal in periodic environments’

ABSTRACT

The mathematical symbol for the norm, which is heavily overloaded with multiple definitions that have both universal and specific properties, lends itself to confusion. This is manifest in the proof of an important theorem for population dynamics by Schreiber and Li on how dispersal increases population growth in a periodic environment. Here the theorem is placed in context, the proof is clarified, and the confusing but inconsequential errors corrected.

In a classic paper in computer science, ‘Go To Statement Considered Harmful’, Dijkstra [Citation7] points out that while there is nothing incorrect in using goto in computer programs, ‘The go to statement as it stands is just too primitive; it is too much an invitation to make a mess of one's program.’ Ponder and Bush's homage to Dijkstra, ‘Polymorphism considered harmful’ [Citation24] considers the problem of operator overloading – using a single symbol to represent multiple different operations – which creates a situation of polymorphism: ‘Polymorphism is a powerful and flexible programming mechanism, but like many others it presents opportunities for abuses. These abuses can ruin program understandability.’

Understandability in mathematics is similarly at risk from operator overloading. The norm symbol, $∥ \cdot ∥$, is heavily overloaded due to the large family of functions used as norms, and has the added complication that some mathematical results hold for all norms while others hold only for specific norms. The hazards of norm notation prompted Mathias to write, in The Handbook of Linear Algebra [Citation22, p. 24.6]:

Warning: There is potential for considerable confusion. For example, ${∥ A ∥}_{2} = {∥ A ∥}_{K, 1} = {∥ A ∥}_{S, \infty}$, while ${∥ \cdot ∥}_{\infty} \neq {∥ \cdot ∥}_{S, \infty}$ (unless $m = 1$), and generally ${∥ A ∥}_{2}, {∥ A ∥}_{S, 2}$ and ${∥ A ∥}_{K, 2}$ are all different, as are ${∥ A ∥}_{1}, {∥ A ∥}_{S, 1}$ and ${∥ A ∥}_{K, 1}$.
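To make the warning concrete, the following minimal Python sketch computes three of the norms named above for one $2 \times 2$ matrix. It assumes the standard reading of the subscripts, with $S, p$ denoting the Schatten $p$-norm and $K, k$ the Ky Fan $k$-norm, both defined through singular values. The example matrix and the function name are hypothetical, chosen only for illustration.

```python
import math


def singular_values_2x2(a):
    """Singular values (largest first) of a real 2x2 matrix.

    They are the square roots of the eigenvalues of A^T A, which for a
    2x2 matrix follow from the trace and determinant of A^T A.
    """
    (p, q), (r, s) = a
    trace = p * p + q * q + r * r + s * s   # trace of A^T A
    det = (p * s - q * r) ** 2              # det(A^T A) = det(A)^2
    disc = math.sqrt(max(trace * trace - 4.0 * det, 0.0))
    largest = (trace + disc) / 2.0
    smallest = max((trace - disc) / 2.0, 0.0)
    return math.sqrt(largest), math.sqrt(smallest)


A = [[1.0, 2.0], [3.0, 4.0]]  # hypothetical example matrix
s1, s2 = singular_values_2x2(A)

spectral = s1                    # ||A||_2 = ||A||_{K,1} = ||A||_{S,inf}
schatten_2 = math.hypot(s1, s2)  # ||A||_{S,2}, the Frobenius norm
ky_fan_2 = s1 + s2               # ||A||_{K,2}

print("||A||_2     =", spectral)
print("||A||_{S,2} =", schatten_2)
print("||A||_{K,2} =", ky_fan_2)
```

All three subscripts carry a ‘2’, yet the three values differ for this matrix.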

This brings us to the paper by Schreiber and Li, ‘Evolution of unconditional dispersal in periodic environments’ [Citation26]. Schreiber and Li prove a theorem of broad interest to population biology, and their method of proof thus merits study. Norm notation in their proof, however, causes some confusion.

First, let us have some context for their theorem. Karlin [Citation17] proved a deep theorem on the asymptotic growth rate of populations that combine (1) a Markov chain with (2) heterogeneous growth rates:

Theorem 1.1

(Theorem 5.2 in [Citation17])

Let $P$ be an irreducible stochastic matrix and $D$ be a non-negative non-scalar matrix. Let $M (α) := (1 - α) I + α P$ represent a family of stochastic matrices parameterized by a mixing value $α \in [0, 1]$. Then the spectral radius of $M (α) D$ strictly decreases in $α$.

Karlin's motivation for the theorem was to analyse the effect of dispersal in structured populations upon the protection of genetic diversity. In the theorem, $α$ is the dispersal rate, and $P$ is the matrix of probabilities $P_{i j}$ of moving from deme $j$ to deme $i$ given that an organism disperses. $D$ is the diagonal matrix of growth rates of the rare allele in each deme. The spectral radius $r (M (α) D)$ represents the asymptotic growth rate of a rare allele in the metapopulation, and the increase in $r (M (α) D)$ as $α$ decreases means that reducing dispersal gives rare alleles greater protection against extinction.
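The monotonicity in Theorem 1.1 can be illustrated numerically. The sketch below uses a hypothetical two-deme example (the matrices `P` and `D` and all function names are assumptions made for illustration, not values from Karlin) and evaluates $r (M (α) D)$ on a grid of dispersal rates.

```python
import math


def spectral_radius_2x2(m):
    """Spectral radius of a 2x2 real matrix with real eigenvalues.

    Entrywise non-negative 2x2 matrices always have real eigenvalues,
    because the discriminant equals (a - d)^2 + 4bc >= 0.
    """
    (a, b), (c, d) = m
    trace, det = a + d, a * d - b * c
    disc = math.sqrt(max(trace * trace - 4.0 * det, 0.0))
    return max(abs((trace + disc) / 2.0), abs((trace - disc) / 2.0))


def matmul_2x2(x, y):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def mixing_matrix(alpha, p):
    """M(alpha) = (1 - alpha) I + alpha P."""
    return [[(1.0 - alpha) * (1.0 if i == j else 0.0) + alpha * p[i][j]
             for j in range(2)] for i in range(2)]


P = [[0.5, 0.5], [0.5, 0.5]]  # hypothetical irreducible stochastic matrix
D = [[2.0, 0.0], [0.0, 0.5]]  # hypothetical non-scalar growth rates

alphas = [k / 10 for k in range(11)]
radii = [spectral_radius_2x2(matmul_2x2(mixing_matrix(a, P), D))
         for a in alphas]

for a, r in zip(alphas, radii):
    print(f"alpha = {a:.1f}   r(M(alpha) D) = {r:.4f}")
```

For these values the spectral radius falls from $2$ at $α = 0$, where the allele simply grows at the best deme's rate, to $1.25$ at $α = 1$, the average of the two growth rates.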

The generality of his theorem allows it to be employed towards a seemingly unrelated problem – the evolution of information transmission in organisms – to prove that evolution would favour reductions in migration rates, mutation rates, recombination rates, and even rates of cultural change [Citation1, Citation4], producing a unification of the ‘Reduction Principle’ found for specific models by Feldman and coworkers (cf. [Citation10, Citation11]). Karlin's theorem and its application to the evolution of dispersal rates were independently rediscovered by Kirkland, Li and Schreiber [Citation18]. The theorem's extension to linear operators on Banach spaces is provided in [Citation3], which unifies the result that ‘the slower diffuser wins’ found in a number of reaction diffusion models for the evolution of dispersal [Citation5, Citation8, Lemma 2.1, Citation12, Citation14, Lemma 5.2, Citation15].

Empirically, it is clear that dispersal, mutation, and recombination rates have not evolved to zero in organisms, so one is directed to look for mathematical sources of departure from the reduction principle. The characterization of these conditions remains largely an open question. Just as the reduction principle appears in many contexts, we are seeing new situations in which departures from reduction hold, for example, for linear stochastic differential equations [Citation9] (S. Schreiber, personal communication).

For linear systems, departure from reduction means that the spectral radius increases with the mixing rate $α$. One such departure occurs when the variation in the stochastic matrix has the general form $M (α) = B [(1 - α) I + α P]$, where $B$ and $P$ are specially related stochastic matrices representing multiple transformation processes [Citation2].

Schreiber and Li [Citation26] prove a general result for another source of departure: temporally changing environments, where the growth rate matrix $D$ alternates every time step with its inverse $D^{- 1}$. They generalize, to arbitrary $n \times n$ transition matrices of reversible Markov chains, the behaviour found for $2 \times 2$ matrices in [Citation6, Citation16, Citation19, Citation20, Result 2], and $4 \times 4$ in [Citation25].

Theorem 1.2

([Citation26, Appendix 1])

Let $D = \operatorname{diag} [d_{1}, \dots, d_{n}]$ with $d_{1}, \dots, d_{n} \in (0, \infty)$, and let $S$ be an $n \times n$ column stochastic matrix such that $R S R^{- 1}$ is symmetric for some diagonal matrix $R$. For $t \in [0, 1]$, denote by $r (F (t))$ the Perron (largest) eigenvalue of $F (t) = D [(1 - t) I + t S] D^{- 1} [(1 - t) I + t S]$, where $I$ is the order-$n$ identity matrix. Then $r (F (t))$ is either an increasing function on $[0, 1]$ or a constant function on $[0, 1]$.
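A numerical illustration of Theorem 1.2, again for a hypothetical two-patch case: the values of `d` and `S` below are assumptions chosen so that $S$ is symmetric (so $R = I$ satisfies the hypothesis), and the function names are illustrative. The sketch evaluates $r (F (t))$ on a grid of $t$.

```python
import math


def spectral_radius_2x2(m):
    """Spectral radius of a 2x2 real matrix with real eigenvalues
    (guaranteed here because F(t) is entrywise non-negative)."""
    (a, b), (c, d) = m
    trace, det = a + d, a * d - b * c
    disc = math.sqrt(max(trace * trace - 4.0 * det, 0.0))
    return max(abs((trace + disc) / 2.0), abs((trace - disc) / 2.0))


def matmul_2x2(x, y):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def growth_matrix(t, d, s):
    """F(t) = D [(1 - t) I + t S] D^{-1} [(1 - t) I + t S] for 2x2 inputs."""
    mix = [[(1.0 - t) * (1.0 if i == j else 0.0) + t * s[i][j]
            for j in range(2)] for i in range(2)]
    d_mat = [[d[0], 0.0], [0.0, d[1]]]
    d_inv = [[1.0 / d[0], 0.0], [0.0, 1.0 / d[1]]]
    return matmul_2x2(matmul_2x2(d_mat, mix), matmul_2x2(d_inv, mix))


d = (2.0, 0.5)                # hypothetical growth rates d_1, d_2
S = [[0.5, 0.5], [0.5, 0.5]]  # symmetric, column stochastic (R = I)

ts = [k / 10 for k in range(11)]
radii = [spectral_radius_2x2(growth_matrix(t, d, S)) for t in ts]

for t, r in zip(ts, radii):
    print(f"t = {t:.1f}   r(F(t)) = {r:.4f}")
```

With no dispersal the alternating environment cancels itself, $F (0) = D D^{- 1} = I$, so $r (F (0)) = 1$; for these values the growth rate then rises with $t$, reaching $1.5625$ at $t = 1$.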

To prove this theorem, they provide another theorem in which norm notation enters:

Theorem 1.3

([Citation26, Appendix 2])

Denote by $∥ A ∥$ the operator norm of the matrix $A$. Suppose $A \in M_{n}$ is non-zero and satisfies $∥ I + A ∥ \geq 1$. Then $∥ I + t A ∥ \geq ∥ I + A ∥$ for all $t \geq 1$.
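The statement can be spot-checked numerically once a norm is fixed. The sketch below assumes the spectral norm, computed for real $2 \times 2$ matrices as $r (A^{⊤} A)^{1 / 2}$; the matrices `A` and `B` and the function names are hypothetical. The second matrix violates the hypothesis $∥ I + A ∥ \geq 1$ and shows that the conclusion can then fail.

```python
import math


def spectral_norm_2x2(a):
    """Spectral norm ||A||_2 = sqrt(r(A^T A)) of a real 2x2 matrix."""
    (p, q), (r, s) = a
    trace = p * p + q * q + r * r + s * s   # trace of A^T A
    det = (p * s - q * r) ** 2              # det(A^T A) = det(A)^2
    disc = math.sqrt(max(trace * trace - 4.0 * det, 0.0))
    return math.sqrt((trace + disc) / 2.0)


def identity_plus(t, a):
    """Return I + t A for a 2x2 matrix A."""
    return [[(1.0 if i == j else 0.0) + t * a[i][j] for j in range(2)]
            for i in range(2)]


# Hypothetical matrix satisfying the hypothesis ||I + A|| >= 1.
A = [[0.5, 1.0], [-0.3, -0.2]]
base = spectral_norm_2x2(identity_plus(1.0, A))
scaled = [spectral_norm_2x2(identity_plus(t, A))
          for t in (1.0, 1.5, 2.0, 3.0, 5.0)]
holds = base >= 1.0 and all(value >= base for value in scaled)
print("hypothesis and conclusion hold for A:", holds)

# Hypothetical matrix violating the hypothesis: ||I + B|| = 0.5 < 1.
B = [[-0.5, 0.0], [0.0, -0.5]]
base_b = spectral_norm_2x2(identity_plus(1.0, B))
at_two = spectral_norm_2x2(identity_plus(2.0, B))  # I + 2B is the zero matrix
print("||I + B|| =", base_b, "  ||I + 2B|| =", at_two)
```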

The term ‘operator norm’ is polymorphic in the literature. In some uses (e.g. [Citation23]) ‘operator norm’ is synonymous with the spectral norm, ${∥ A ∥}_{2} := r (A^{*} A)^{1 / 2}$, where $A^{*}$ is the conjugate transpose of $A$, and $r (\cdot)$ is the spectral radius.

Others use ‘operator norm’ more generally as the norm of a matrix induced by a chosen vector norm $∥ \cdot ∥$ on $C^{n}$ (e.g. [Citation13, Citation21]):

$$∥ A ∥ := \max_{x \neq 0} \frac{∥ A x ∥}{∥ x ∥} . \tag{1}$$

Throughout the main text of Schreiber and Li [Citation26] and in Theorem 1.2, the vector norm used is $∥ x ∥ = \sum_{i = 1}^{m} x_{i}$ for $x_{i} \geq 0$. However, in Theorem 1.3, used to prove Theorem 1.2, we find this inequality:

$$∥ (I + A) u ∥ = ∥ (1 + α) u + β v ∥ = | 1 + α |^{2} + | β |^{2} \geq 1, \tag{2}$$

where $u$ and $v$ are vectors such that $∥ u ∥ = ∥ v ∥ = 1$, $∥ I + A ∥ = ∥ (I + A) u ∥$, $A u = α u + β v$, and $\{u, v\}$ is an orthonormal family.

Two things are clear from Equation (2):

The vector norm is no longer $∥ x ∥ = \sum_{i = 1}^{m} x_{i}$ but is now $∥ x ∥ := {∥ x ∥}_{2} = \sqrt{\sum_{i = 1}^{n} x_{i} {x_{i}}^{*}}$, so that $∥ A ∥ = \max_{x \neq 0} {∥ A x ∥}_{2} / {∥ x ∥}_{2}$.
Squares are missing, and Equation (2) should read:

$${∥ (I + A) u ∥}^{2} = {∥ (1 + α) u + β v ∥}^{2} = | 1 + α |^{2} + | β |^{2} \geq 1 . \tag{3}$$
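A small worked case makes both points visible. Take the hypothetical values $u = (1, 0)^{⊤}$, $v = (0, 1)^{⊤}$, $α = 2$ and $β = 4$ (chosen only for illustration), so that $(I + A) u = (1 + α) u + β v = (3, 4)^{⊤}$:

```latex
\begin{align*}
\lVert (I + A) u \rVert_{2} &= \sqrt{3^{2} + 4^{2}} = 5, \\
\lVert (I + A) u \rVert_{2}^{2} &= \lvert 1 + \alpha \rvert^{2} + \lvert \beta \rvert^{2} = 9 + 16 = 25, \\
\sum_{i} \bigl\lvert \bigl( (I + A) u \bigr)_{i} \bigr\rvert &= 3 + 4 = 7 .
\end{align*}
```

Only the squared Euclidean norm reproduces $| 1 + α |^{2} + | β |^{2}$; neither the unsquared 2-norm nor the sum norm does.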

The squares were also dropped from the next sequence of calculations, which should read ${∥ I + t A ∥}^{2} \geq {∥ (I + t A) u ∥}^{2} = | 1 + t α |^{2} + | t β |^{2} = \dots \geq {∥ I + A ∥}^{2} .$ Fortunately, the final inequality ${∥ I + t A ∥}^{2} \geq {∥ I + A ∥}^{2}$ remains true when the squares are dropped, so the errors are inconsequential to the proof.
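The step elided by the dots is not reproduced here. One way to fill it in, offered as a sketch and not necessarily the route taken by Schreiber and Li, is to observe that the squared quantity is a convex quadratic in $t$. The name $f$ is introduced only for this sketch:

```latex
\begin{align*}
f(t) &:= \lvert 1 + t\alpha \rvert^{2} + \lvert t\beta \rvert^{2}
      = 1 + 2t \operatorname{Re}\alpha
        + t^{2} \bigl( \lvert \alpha \rvert^{2} + \lvert \beta \rvert^{2} \bigr), \\
f(0) &= 1 \leq f(1) = \lVert I + A \rVert^{2}
      \quad \text{(the hypothesis } \lVert I + A \rVert \geq 1 \text{)}, \\
f(1) &\leq \Bigl( 1 - \tfrac{1}{t} \Bigr) f(0) + \tfrac{1}{t} f(t)
      \quad \text{for } t \geq 1
      \text{ (convexity, since } 1 = (1 - \tfrac{1}{t}) \cdot 0 + \tfrac{1}{t} \cdot t \text{)}, \\
f(t) &\geq f(1) + (t - 1) \bigl( f(1) - f(0) \bigr) \geq f(1) .
\end{align*}
```

Because the square root preserves order on non-negative numbers, the same comparison holds between the unsquared norms.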

Norm notation allows Theorem 1.3 to be stated quite simply, but non-transparently, and the polymorphism in the literature for the usage of ‘operator norm’ makes us unsure of exactly which vector norm is being used. (It is, however, hard to imagine a vector norm that would provide a counterexample to Theorem 1.3.) The theorem could have been expressed by unpacking the spectral norm explicitly as $r ((I + t A) (I + t A)^{*}) \geq r ((I + A) (I + A)^{*})$ for all $t \geq 1$, which leaves no room for ambiguity or error, and segues directly into Schreiber and Li's sagacious decomposition (p. 134), $r (F (t)) = r (D^{- 1 / 2} F (t) D^{1 / 2}) = r (B (t) B (t)^{⊤})$.

Norms are of course fundamental to mathematical analysis. The ‘information hiding’ achieved by $∥ \cdot ∥$ makes norm notation concise, but both the reader – and the writer – may benefit from always explicitly stating its content.

ORCID details

Lee Altenberg http://orcid.org/0000-0001-9704-6811

Disclosure statement

No potential conflict of interest was reported by the author.

Additional information

Funding

This work was supported by the Konrad Lorenz Institute for Evolution and Cognition Research, Klosterneuburg, Austria, and the Mathematical Biosciences Institute at The Ohio State University, USA, through National Science Foundation Award #DMS 0931642.

References

L. Altenberg, A Generalization of Theory on the Evolution of Modifier Genes, ProQuest, 1984 October. Available at http://search.proquest.com/docview/303425436/abstract. ProQuest, document ID: 303425436.
L. Altenberg, The evolution of dispersal in random environments and the principle of partial control, Ecol. Monogr. 82 (2012), pp. 297–333. doi: 10.1890/11-1136.1
L. Altenberg, Resolvent positive linear operators exhibit the reduction phenomenon, Proc. Natl. Acad. Sci. USA 109 (2012), pp. 3705–3710.
L. Altenberg and M.W. Feldman, Selection, generalized transmission, and the evolution of modifier genes. I. The reduction principle, Genetics 117 (1987), pp. 559–572.
R. Cantrell, C. Cosner, and Y. Lou, Evolution of dispersal in heterogeneous landscapes, in Spatial Ecology, R. Cantrell, C. Cosner, and S. Ruan, eds., Mathematical and Computational Biology, Chapman & Hall/CRC Press, London, 2010, pp. 213–229.
O. Carja, U. Liberman, and M.W. Feldman, Evolution in changing environments: Modifiers of mutation, recombination, and migration, Proc. Natl. Acad. Sci. USA 111 (2014), pp. 17935–17940.
E.W. Dijkstra, Go to statement considered harmful, Commun. ACM 11 (1968), pp. 147–148. doi: 10.1145/362929.362947
J. Dockery, V. Hutson, K. Mischaikow, and M. Pernarowski, The evolution of slow dispersal rates: A reaction diffusion model, J. Math. Biol. 37 (1998), pp. 61–83. doi: 10.1007/s002850050120
S.N. Evans, P.L. Ralph, S.J. Schreiber, and A. Sen, Stochastic population growth in spatially heterogeneous environments, J. Math. Biol. 66 (2013), pp. 423–476. doi: 10.1007/s00285-012-0514-0
M.W. Feldman, Selection for linkage modification: I. Random mating populations, Theor. Popul. Biol. 3 (1972), pp. 324–346. doi: 10.1016/0040-5809(72)90007-X
M.W. Feldman, F.B. Christiansen, and L.D. Brooks, Evolution of recombination in a constant environment, Proc. Natl. Acad. Sci. USA 77 (1980), pp. 4838–4841.
A. Hastings, Can spatial variation alone lead to selection for dispersal? Theor. Popul. Biol. 24 (1983), pp. 244–251.
R.A. Horn and C.R. Johnson, Matrix Analysis, 2nd ed., Cambridge University Press, Cambridge, 2013.
V. Hutson, J. López-Gómez, K. Mischaikow, and G. Vickers, Limit behavior for a competing species problem with diffusion, in Dynamical Systems and Applications, R.P. Agarwal, ed., World Scientific, Singapore, 1995, pp. 343–358.
V. Hutson, S. Martinez, K. Mischaikow, and G.T. Vickers, The evolution of dispersal, J. Math. Biol. 47 (2003), pp. 483–517. doi: 10.1007/s00285-003-0210-1
K. Ishii, H. Matsuda, Y. Iwasa, and A. Sasaki, Evolutionarily stable mutation rate in a periodically changing environment, Genetics 121 (1989), pp. 163–174.
S. Karlin, Classifications of selection–migration structures and conditions for a protected polymorphism, in Evolutionary Biology, M.K. Hecht, B. Wallace, and G.T. Prance, eds., Vol. 14, Plenum Publishing Corporation, New York, 1982, pp. 61–204.
S. Kirkland, C.-K. Li, and S.J. Schreiber, On the evolution of dispersal in patchy landscapes, SIAM J. Appl. Math. 66 (2006), pp. 1366–1382. doi: 10.1137/050628933
M. Lachmann and E. Jablonka, The inheritance of phenotypes: An adaptation to fluctuating environments, J. Theor. Biol. 181 (1996), pp. 1–9. doi: 10.1006/jtbi.1996.0109
U. Liberman, J. Van Cleve, and M.W. Feldman, On the evolution of mutation in changing environments: Recombination and phenotypic switching, Genetics 187 (2011), pp. 837–851. doi: 10.1534/genetics.110.123620
P.-L. Loh, M.J. Wainwright et al., High-dimensional regression with noisy and missing data: Provable guarantees with nonconvexity, Ann. Statist. 40 (2012), pp. 1637–1664. doi: 10.1214/12-AOS1018
R. Mathias, Singular values and singular value inequalities, in Handbook of Linear Algebra, L. Hogben, ed., 2nd ed., Chap. 24, Chapman and Hall, Boca Raton, FL, 2014, pp. 24-1–24-17.
L. Molnár and P. Szokol, Transformations on positive definite matrices preserving generalized distance measures, Linear Algebra Appl. 466 (2015), pp. 141–159. doi: 10.1016/j.laa.2014.09.045
C. Ponder and B. Bush, Polymorphism considered harmful, ACM SIGSOFT Softw. Eng. Notes 19 (1994), pp. 35–37. doi: 10.1145/181628.181635
M. Salathé, J. Van Cleve, and M.W. Feldman, Evolution of stochastic switching rates in asymmetric fitness landscapes, Genetics 182 (2009), pp. 1159–1164. doi: 10.1534/genetics.109.103333
S.J. Schreiber and C.-K. Li, Evolution of unconditional dispersal in periodic environments, J. Biol. Dyn. 5 (2011), pp. 120–134. doi: 10.1080/17513758.2010.525667
