982
Views
0
CrossRef citations to date
0
Altmetric
Interactive Theorem Provers

Irrationality and Transcendence Criteria for Infinite Series in Isabelle/HOL

ORCID Icon, ORCID Icon & ORCID Icon

ABSTRACT

We give an overview of our formalizations in the proof assistant Isabelle/HOL of certain irrationality and transcendence criteria for infinite series from three different research papers: by Erdős and Straus, Hančl, and Hančl and Rucki. Our formalizations in Isabelle/HOL can be found on the Archive of Formal Proofs. Here we describe selected aspects of the formalization and discuss what this reveals about the use and potential of Isabelle/HOL in formalizing modern mathematical research, particularly in these parts of number theory and analysis.

1 Introduction

The libraries of formalized mathematics in various proof assistants (or interactive theorem provers) have been growing for years. However, the field is still far from the goal propounded in 1994 as the QED Manifesto [Citation5] “to build a computer system that effectively represents all important mathematical knowledge and techniques.” We are light years from the vision of having an interactive tool that could “converse” with human mathematicians to aid in the discovery of new results, as colorfully described by Timothy Gowers [Citation17]. Our goal within the ALEXANDRIA Project at Cambridge [Citation37] is to contribute to making proof assistants (and more specifically Isabelle/HOL) more useful to research mathematicians. To gather information about what specifically needs to be done, we have been formalizing material to contribute to the Isabelle/HOL LibrariesFootnote1 and the Archive of Formal Proofs (AFP).Footnote2 As research mathematics is our main target, and given that a considerable amount of basic mathematics has already been formalized in Isabelle/HOL over the years, here we wish to explore the following question:

How receptive to formalization (particularly in Isabelle/HOL) is mainstream published mathematical research?

In this paper we are not working towards the ambitious goal of formalizing the proof of a landmark result. Nor are we concerned with standard material from undergraduate or graduate textbooks. Those proofs have been thoroughly studied and checked, and repeatedly reworked and polished by many people over the years. Instead, we are formalizing mainstream mathematical research from journal papers in their raw form, not reworked or clarified as textbook proofs normally are. To explore the question above (within a specific case study), we ask questions such as these:

  • To what extent does the current Isabelle/HOL library cover the necessary preliminaries?

  • How helpful is the automatic proof tools?

  • How much longer than the original proofs are the formalized proofs (i.e. the de Bruijn factor [Citation40])?

  • Are we going to discover any issues or gaps in the proofs?

Clearly, the answers to these questions depend on the chosen proofs. Here, we attempt some case studies in number theory, on transcendence and irrationality criteria for infinite series. In particular, we choose several results from three research papers by Erdős and Straus (1974) [Citation13], Hančl (2002) [Citation23], and Hančl and Rucki (2005) [Citation24]. This line of research makes heavy use of standard background material from analysis, e.g., convergence tests for series and (for Erdős and Straus [Citation13]) certain results about prime numbers. The material from Hančl and Rucki [Citation24] applies Roth’s theorem on rational approximations to algebraic numbers.Footnote3 Inevitably, we had to pick topics in which we a priori knew that at least some of the preliminaries were already available in Isabelle/HOL. The material we chose is mathematically interesting in its own right, and moreover—unlike most results formalized nowadays—it comes from fairly recent research papers instead of textbooks.

Paul Erdős was keenly interested in establishing criteria for the irrationality and transcendence of infinite series [Citation11, Citation12, Citation14]. Several techniques for showing the transcendence of series are given by Nishioka [Citation33], who presents, among other, methods originated by Mahler [Citation31]. Later, Sándor [Citation39], Nyblom [Citation34, Citation35], and Hančl [Citation20–22] obtained many more results in this area.

In this article, we merely present in standard mathematical language the statements (without the informal proofs) of the theorems that we formalized and afterwords we discuss our formalization in Isabelle/HOL focussing on its most interesting aspects. For the informal proofs, the reader may refer to the respective papers [Citation13, Citation23, Citation24]. Our full Isabelle/HOL formalizations can be found online, on the AFP [Citation28–30]. As we had to fill in many intermediate arguments that were implicit in the original proofs, providing more details and clarifications, our formalizations may serve not only as case studies to explore the question of how receptive to formalization in Isabelle/HOL mathematical research is and as proof verification, but also as helpful supplementary material for a detailed study of the original proofs.

The plan of this article is as follows: in the next section we give a brief introduction to Isabelle/HOL. In Sections 3–5, we present the material formalized and discuss the main challenges and lessons learned through our formalizations for each article, respectively. In Section 6, we briefly discuss the de Bruijn factors of the formalizations and finally, in Section 7 we present our conclusion.

2 Isabelle/HOL

IsabelleFootnote4 is an interactive theorem prover first developed in the 1980s by Lawrence Paulson and Tobias Nipkow [Citation32, Citation36]. Today, Isabelle provides the Isar language for writing proofs. Isar proofs are hierarchically structured, and they include enough redundancy—consisting of explicitly stated assumptions and goals—to be understandable to humans as well as machines. Isabelle supports multiple different logical formalisms, with Isabelle/HOL specifically based on higher order logic. Unlike proof assistants based on constructive type theories with dependent types, it implements simple type theory. Isabelle/HOL admits classical (non-constructive) proofs, i.e. it accepts the law of the excluded middle, P¬P, as well as the axiom of choice.

Simple type theory and classical logic allow for powerful automation, which is a significant advantage. Isabelle/HOL’s main automation tool is Sledgehammer [Citation2], which searches for proofs by calling external automated theorem provers, and then converts the proofs discovered to Isar. Two counterexample finding tools, nitpick and Quickcheck, are also provided. Moreover, the command try0 calls a number of built-in proof methods (such as simplification) in search of a proof. The user interface is Isabelle/jEdit, which provides real-time proof checking, as well as rich semantic information for the formal text and direct links to the user manuals and tutorials.Footnote5 We have written an introduction to Isabelle aimed at mathematicians [Citation27].

The Isabelle/HOL libraries and the AFP contain a vast amount of formalized material, covering many areas of pure mathematics as well as in computer science, logic and even philosophy.

3 On the irrationality of certain series

3.1 Results formalized

The following results are by Erdős and Straus [Citation13].

Theorem 3.1

(Erdős and Straus [13, Theorem 2.1]). Let {bn}n=1 be a sequence of integers and {αn}n=1 a sequence of positive integers with αn>1 for all large n and limn=1,n|bn|αn1αn=0.

Then the sum n=1bni=1nαiis rational if and only if there exists a positive integer B and a sequence of integers {cn}n=1 so that for all large n, Bbn=cnαncn+1, |cn+1|<αn/2.

Corollary 3.1 (Erdős and Straus [13, Corollary 2.10]). Let {αn}n=1 and {bn}n=1 satisfy the hypotheses of the theorem above and in addition that for all large n we have bn>0,αn+1αn,limn(bn+1bn)/αn0 and liminfnαn/bn=0. Then the sum n=1bni=1nαiis irrational.

Theorem 3.2

(Erdős and Straus [13, Theorem 3.1]).

Let pn be the nth prime number and let {αn}n=1 be a monotonic sequence of positive integers satisfying limnpn/αn2=0 and liminfnαn/pn=0. Then the sum n=1pni=1nαiis irrational.

3.2 On the formalization

Our full formalization is online [Citation30] and here we discuss certain key points. The formalization process turned out to be time-consuming: we had to fill in a number of intermediate reasoning steps that had to be made explicit in the formal proofs. For example, in the proof of Theorem 3.2 ([13, Theorem 3.1]), for the sequence of integers {cn}n=1 (for which cn>0 for large n), Erdős and Straus claim that since it is unbounded, for large n there must exist an index mn so that cmcn<cm+1. In order to prove this claim, we had first to show that for large n the sequence {cn}n=1 is neither monotone increasing nor monotone decreasing, which required some work.

We moreover had to do a certain amount of restructuring for the formalization of the proof of Theorem 3.1 ([13, Theorem 2.1]), as the original proof is based on a sketch of a construction for obtaining the terms of the desired sequence {cn}n=1, starting with the construction of the first two terms for large enough n (“Proceeding in this manner we get the desired sequence”). In particular, in the sufficient direction, we need to obtain a positive integer B and a sequence of integers {cn}n=1 that satisfy Bbn=cnαncn+1 and |cn+1|<αn/2 for all large enough n, where {bn}n=1 and {αn}n=1 are sequences of integers as in the assumptions of the theorem. In the middle of the proof, we will have for a large enough N (3.1) BbNαNRNis an integer,(3.1) where {Rn}n=1 is a sequence such that for all nN,|Rn|<1/4 and(3.2) Rn+1=αnRnBbn+1αn+1.(3.2)

To construct a suitable {cn}n=1, we let cN be the integer nearest to (BbN)/αN and inductively construct cn+1=cnαnBbn (for all nN). This was achieved by defining a recursive function in Isabelle/HOL:

  • fun get_c::"(nat int) (nat int)

  • int nat (nat int)" where

  • "get_c a’b’ B N 0 = round (B * b’ N/a’ N)" |

  • "get_c a’ b’ B N (Suc n) = get_c a’ b’ B N n * a’ (n + N)

  • - B * b’ (n + N)"

where round returns the nearest integer of an input number. By construction, we already have Bbn=cnαncn+1 so the goal is to show |cn+1|<αn/2 (for all nN). To achieve this, we deploy proof by induction to derivecn(Bbn)/αn=Rn, where (3.1) has been used for the base case and (3.2) for the inductive one. We can then close the proof by having|cn+1|αn=|Rn|<12.

The original proof repeatedly invokes variants of (3.1) and tries to prove cn is the closest integer to (Bbn)/αn for all nN (e.g., (2.7)-(2.9) in the original proof). We believe our altered proof might be more straightforward.

We also noted that in the original proofs of Corollary 3.1 ([13, Corollary 2.10]) and Theorem 3.2 ([13, Theorem 3.1]) certain inequalities required some minor corrections (for details we refer to our formalization [Citation30]), which fortunately did not affect the correctness of the statements nor the original proof structures.

Our formalized versions of the statements of Theorem 3.1 ([13, Theorem 2.1]), Corollary 3.1 ([13, Corollary 2.10]), and Theorem 3.2 ([13, Theorem 3.1]) read as follows:

  • theorem theorem_2_1_Erdos_Straus:

  • fixes a b:: "nat⇒int"

  • assumes "∀ n. a n >0" and "∀F n in sequentially. a n > 1"

  • and "(λn. |b n|/(a (n-1)*a n))→ 0"

  • shows "(Σn. (b n/(π i ≤ n. a i))) ∈ Q $

  • (∃ (B::int)>0. ∃ c::nat⇒ int.

  • (∀F n in sequentially. B*b n = c n * a n - c(n + 1)

  • ∧|c(n + 1)|<a n/2)) "

  • corollary corollary_2_10_Erdos_Straus:

  • fixes a b:: "nat⇒int"

  • assumes "∀ n. a n >0" and "∀F n in sequentially. a n > 1"

  • and "(λn. |b n|/(a (n-1)*a n)) → 0"

  • and "∀F n in sequentially. b n > 0 ∧ a (n + 1) ≥ a n"

  • and "lim (λn. (b(n + 1) - b n)/a n) ≤ 0"

  • and "convergent (λn. (b(n + 1) - b n)/a n)"

  • and "liminf (λn. a n/b n) = 0 "

  • shows "(Σn. (b n/(π i ≤ n. a i)))/ ∉ Q"

  • theorem theorem_3_10_Erdos_Straus:

  • fixes a::"nat ⇒ int"

  • assumes "∀ n. a n >0" and "mono a"

  • and "(λn. nth_prime n/(a n)ˆ2) → 0"

  • and "liminf (λn. a n/nth_prime n) = 0"

  • shows "(Σn. (nth _prime n/(π i ≤ n. a i)))/ ∉ Q "

Above, sequentially is a filter [Citation25] for expressing limits indexed by positive integers. So F n in sequentially. P n expresses that P n holds for all sufficiently large n.

3.2.1 Calculations with real asymptotics

Asymptotic arguments (e.g., limits and statements of the form “for all sufficiently large n”) present real difficulties for mechanized proofs. Justifications omitted as obvious in paper proofs need to be written out. For example, in the proof of Theorem 3.2 ([13, Theorem 3.1]) we need to formally prove(3.3) 8Blnn+1<n4for all largen,(3.3) where B is a positive constant. Since the proof of (3.3) involves routine calculations of limits, this step is completely omitted in the paper proof. In Isabelle/HOL, (3.3) is encoded as

"∀F n in sequentially. 8*B*ln n + 1 < sqrt n/4"

Our first attempt resulted in the following proof:

  • have "∀F n in sequentially. 8*B*ln n + 1 < sqrt n/4"

  • proof -

  • have "(λn. ln n/sqrt n) → 0"

  • using lim_ln_over_power[of "1/2"] powr_half_sqrt by simp

  • from tendsto_mult_right_zero[OF this]

  • have "(λn. 8 * B * ln n/sqrt n) → 0" by auto

  • moreover have "(λn. 1/sqrt n) → 0"

  • apply (rule real_tendsto_divide_at_top)

  • by (auto simp add: filterlim_of_nat_at_top sqrt_at_top)

  • ultimately have "(λn. 8*B*ln n/sqrt n + 1/sqrt n) → 0"

  • by (auto intro:tendsto_add_zero)

  • from tendstoD[OF this, of "1/4"]

  • have "∀F n in sequentially. 8*B*ln n/sqrt n + 1/sqrt n < 1/4"

  • unfolding abs_less_iff dist_real_def

  • by (auto elim!:eventually_mono)

  • then show? thesis

  • using eventually_gt_at_top[of 0]

  • by (eventually_elim, auto simp:field_simps)

  • qed

The proof is tedious. Let us work through the text above. We start with primitive properties of the ln and square root functions,limnlnnn=0andlimn1n=0,written” (λ n. ln n/sqrt n) → 0” and” (λ n. 1/sqrt n) → 0” in the script. Then we need to manually transform a property of limits,limn8Blnnn+1n=0 to the proposition that8Blnnn+1n<14for all large enoughn, by applying the lemma tendstoD:if limxFf(x)=l and 0<e then eventually6|f(x)l|<e.

Fortunately, Manuel Eberl [Citation8] had just implemented the tactic real_asymp that solves (3.3) automatically, so we can reduce our initial proof to just one line:

  • have "∀F n in sequentially. 8*B*ln n + 1 < sqrt n/4"

  • by real_asymp

The basic idea behind real_asymp is to approximate a continuous function g:RR when its argument approaches a limit L using Poincaré expansion (or asymptotic series):n=1Nanφn(x),where anR and {φn(x)} is a sequence of continuous functions such that φn+1(x)o(φn(x)) for all n as xL.Footnote6 Through arithmetic on the formal series, the asymptotic behavior of a function at L can then be decided. In general, real_asymp handles functions built from basic arithmetic and elementary functions including exp, ln, and sin.

In situations involving limits, due to the presence of bound variables, Sledgehammer’s automation helps little, so real_asymp makes a real contribution. Such domain-specific automation inspired from computer algebra systems will surely prove to be of great use when mechanising mathematical proofs.

But in general we have found Sledgehammer to be of tremendous value. A telltale sign that it has been used is any occurrence of the proof methods metis or smt, which are littered throughout our formal proofs [28–30]. These methods are difficult to use manually, and any instances of them are almost certain to have been generated by Sledgehammer.

3.2.2 Reasoning with prime numbers

Elementary properties concerning prime numbers are often assumed in informal proofs, with supporting calculations largely omitted. Thanks to the mechanised proof of the prime number theorem (PNT) [Citation10] as well as another development on primes [Citation9] from the AFP, basic reasoning with prime numbers is feasible. However, there are still some tedious, repetitious calculations, which could be optimised in the future. For example, the first step in proving Theorem 3.2 ([13, Theorem 3.1]) is to demonstrate that the hypotheses of Theorem 3.1 ([13, Theorem 2.1]) are satisfied by the former’s assumptions. Although this step has been completely omitted in the informal proof, we need to explicitly derive limnpn/(αn1αn) from limnpn/αn2, where pn is the nth prime number and {αn}n=1 is a monotonic sequence of positive integers. The derivation works as follows: limnpnαn1anlimnpnαn12limn2pn1αn12=2limnpnαn2=0.

The second inequality requires a basic property of the prime numbers:(3.4) limnpn+1pn=1.(3.4)

Starting with a form of the PNT, namely pnnln(n), this property alone takes 28 lines of tedious limit calculations similar to those in our first attempt to prove (3.3). Another example from the proof of Theorem 3.2 ([13, Theorem 3.1]) is the following: considering the interval [N,2N) for large enough N and given that cn+1>cn implies pn+1>pn+pn, we need to deduce that cn+1>cn happens for fewer than half of the integers within [N,2N). The derivation works because(3.5) p2NpNpN<N12+ϵforallϵ>0and all large enough N,(3.5) and N12+ϵ<N/2 when ϵ<1/2 and N is large enough. Property (3.5) is formulated in Isabelle/HOL as

  • lemma nth_prime_double_sqrt_less:

  • assumes "ε > 0"

  • shows " F n in sequentially. (nth_prime (2*n) - nth_prime n)

  • /sqrt (nth_prime n) < n powr (1/2 + ε)"

where we remind that F n in sequentially stands for the “large enough” quantifier. Despite (3.5) being merely claimed without proof in the paper, it took us 56 lines to prove it in Isabelle/HOL starting from the PNT. In short, we are happy that properties related to the prime numbers are now viable in Isabelle/HOL, but we wish we could derive their asymptotic properties like (3.4) and (3.5) with fewer tears. Automating this process through the tactic real_asymp is desirable but the tactic does not yet support the function nth_prime: in contrast to elementary functions like exp, nth_prime cannot be asymptotically expanded to an arbitrary order. Incorporating nth_prime into real_asymp may require special treatment.

4 Irrational rapidly convergent series

4.1 Results formalized

The following results are by Hančl [Citation23].

Theorem 4.1

(Hančl [23, Theorem 3]). Let A > 1 be a real number. Let {dn}n=1 be a sequence of real numbers greater than one. Let {αn}n=1 and {bn}n=1 be sequences of positive integers such that limnαn12n=A and for all sufficiently large n Aαn12n>j=ndjandlimndn2nbn=.

Then the sum n=1bnαnis irrational.

Corollary 4.1 (Hančl [23, Corollary 2]). Let A > 1 and let {αn}n=1 and {bn}n=1 be sequences of positive integers such that limnαn12n=A. Then, assuming that for every sufficiently large positive integer n, αn12n(1+4(2/3)n)A and bn2(4/3)n1, the sum n=1bn/αn is irrational.

This corollary follows directly from the theorem by an appropriate choice of the sequence {dn}n=1, in particular by choosing dn=1+(2/3)n so that the theorem can be directly applied.

4.2 On the formalization

Our full formalization is online on the AFP [Citation28]. This was the first work in this area that we formalized, starting in early 2018. We chose this research paper as the proofs are interesting but not too complicated, and even though this is a topic in number theory, they are based on fundamental material from analysis. Of course, we also had to show quite a few auxiliary facts, including lemmas on summability criteria for series. For instance, we modified the classic ratio test for series (which was already available in the HOL Library):

  • lemma summable_ratio_test:

  • fixes c:: real and N:: nat

  • and f:: "nat ⇒ ’a:: real_normed_vector"

  • assumes "c < 1" and "∀n ≥ N. norm (f (Suc n)) ≤ c * norm (f n)"

  • shows "summable f"

  • The version modified in terms of limits is given below:

  • lemma summable_ratio_test_tendsto:

  • fixes c:: real and f:: "nat ⇒ ’a:: real_normed_vector"

  • assumes "c < 1" and "∀n. f n _= 0"

  • and tendsto_c: "(λn. norm (f (Suc n))/norm (f n)) → c"

  • shows "summable f"

  • proof -

  • from hc < 1i tendsto_c

  • obtain N where N_dist:

  • "∀n ≥ N. dist (norm (f (Suc n))/norm (f n)) c < (1-c)/2"

  • unfolding tendsto_iff eventually_sequentially

  • by (meson diff_gt_0_iff_gt zero_less_divide_iff zero_less_numeral)

  • have "norm (f (Suc n))/norm (f n) ≤ (1 + c)/2" when "n ≥ N" for n

  • using N_dist[rule_format,OF hn ≥ Ni] hc < 1i

  • by (auto simp add:field_simps dist_norm,argo)

  • then have "norm (f (Suc n)) ≤ (1 + c)/2 * norm (f n)"

  • when "n ≥ N" for n

  • using hn ≥ Ni h∀n. f n ≠ 0i by (auto simp add:divide_simps)

  • moreover have "(1 + c)/2 < 1" using hc < 1i by auto

  • ultimately show "summable f"

  • using summable_ratio_test by blast

  • qed

Our formalized version of Theorem 4.1 ([23, Theorem 3]) is presented below. We include its formalized proof in the Appendix. The reader will notice that the formal proof is only around 50 lines of Isar code; however, it comes after about 800 lines of auxiliary lemmas that are required for this specific proof. As in the original—and after having established the summability of the series in question—the proof is by contradiction. Assuming that the sum is rational, for a quantity ALPHA(n) we show that ALPHA(n)1 for all nN. After that, we show the existence of an nN for which ALPHA(n)<1, a contradiction: hence, the sum of the series is irrational. As already mentioned, Isar admits structured proofs, which are more readable than the usual sequence of proof commands. In our proof, the structure of a proof by contradiction is visible:

  • proof (rule ccontr) […] show False […] qed.

  • theorem Hancl3:

  • fixes d::"nat⇒real" and a b:: "nat⇒int"

  • assumes "A > 1" and d: "∀n. d n > 1"

  • and a: "∀n. a n > 0" and b: "∀n. b n > 0" and "s > 0"

  • and "(λn. (a n) powr(1/of_int(2ˆn))) → A"

  • and "∀n ≥ s. A/(a n) powr (1/of_int(2ˆn)) > (πj. d (n + j))"

  • and "LIM n sequentially. d n ˆ 2 ˆ n/b n:> at_top"

  • and "convergent_prod d"

  • shows "(Σn. b n/a n)/∉ Q

  • Our formalized version of Corollary 4.1 ([23, Corollary 2]) is as follows:

  • corollary Hancl3corollary:

  • fixes A::real and a b:: "nat⇒int"

  • assumes "A > 1" and a: "∀n. a n > 0" and b: "∀n. b n > 0"

  • and "(λn. (a n) powr(1/of_int(2ˆn))) → A"

  • and "∀n ≥ 6. a n powr(1/of_int (2ˆn)) * (1 + 4*(2/3)ˆn) ≤ A

  • ∧ b n ≤ 2 powr (4/3)ˆ(n-1)"

  • shows "(Σn. b n/a n) ∉ Q "

It is interesting to note that while in Corollary 4.1 ([23, Corollary 2]) we have an assumption that holds “for all sufficiently large positive integer n”, in our formalized version above this assumption is actually specified as “for all n6”. In our formalized version of Theorem 4.1 ([23, Theorem 3]) above, “for all sufficiently large n” had been written simply as “ns” where s > 0 is some fixed, unknown number. Statements of the form “for all sufficiently large” can also be encoded in a more abstract fashion using the keyword eventually, which does not require an additional variable.

4.2.1 Infinite products

This material requires the notion of an infinite product, which was then (early 2018) not available in Isabelle except in a small development by Manuel Eberl. Infinite sums were available, and because we were only dealing with positive reals, the infinite products could be reduced to infinite sums via logarithms. However, since one objective of this work was to identify and fill gaps in the libraries, we decided to formalize infinite products directly. We extended Eberl’s development to a comprehensive Infinite_Products library and included it in the next Isabelle release (Isabelle 2018).

Consider the following technical lemma, where we make use of :

  • lemma show8:

  • fixes d::"nat⇒real " and a::"nat⇒int" and s k::nat

  • assumes "A > 1" and d: "∀n. d n> 1" and "∀n. a n > 0" and "s > 0"

  • and "convergent_prod d"

  • and "∀n ≥ s. A/(a n) powr(1/(2::int)ˆn) > (πj. d(n + j))"

  • shows "n ≥ s. (πj. d (j + n)) < A

  • /(MAX j∈{s.n}. a j powr (1/(2::int) ˆ j))"

This had been originally formulated using ∑ via the natural logarithm in the first version, so the conclusion instead read

  • shows "∀n ≥ s. exp (Σj. ln(d (j + n))) < A

  • /(MAX j∈{s.n}. a j powr (1/(2::int) ˆ j))"

By convention, infinite products are defined in two stages [1, p. 241].

  1. For nonzero complex numbers, the infinite product k=0uk is defined to converge if the finite partial products converge to a nonzero limit. If that limit is zero, then the infinite product diverges to zero.

  2. If finitely many of the uk equal zero and the infinite product of the nonzero terms converges in the sense of (1), then k=0uk converges to zero.

One advantage of this approach is that k=0uk=u0k=1uk and similar identities hold.

5 The transcendence of certain infinite series

5.1 Results formalized

The following results are by Hančl and Rucki in [Citation24].

Theorem 5.1

(Hančl and Rucki [24, Theorem 2.1]). Let δ be a positive real number. Let {αn}n=1 and {bn}n=1 be sequences of positive integers such thatlimsupnαn+1(i=1nαi)2+δ·1bn+1=andliminfnαn+1αn·bnbn+1>1.

Then the sumn=1bnαnis transcendental.

Theorem 5.2

(Hančl and Rucki [24, Theorem 2.2]). Let δ and ϵ be positive real numbers. Let {αn}n=1 and {bn}n=1 be sequences of positive integers such that limsupnαn+1(i=1nαi)2+2/ϵ+δ·1bn+1=and for every sufficiently large n αn+1bn+11+ϵαnbn1+ϵ+1.

Then the sum n=1bnαnis transcendental.

The two theorems above are in fact corollaries of Roth’s celebrated result on rational approximations to algebraic numbers from 1955:

Theorem 5.3

(Roth [Citation38]). Let a be any algebraic number, not rational. If the inequality |apq|<1qκhas infinitely many solutions in coprime integers p and q where q > 0, then κ2.

In particular, the two theorems above by Hančl and Rucki [Citation24] asserting the transcendence of the sum of certain series are shown by finding infinitely many such integer solutions of the inequality with a the sum of the series in question and for some κ>2.

5.2 On the formalization

Our full formalization is online [Citation29].Footnote7 In this work, too, we had to fill in certain arguments that had been omitted in the original paper, for example to first show that the series in question were summable.

5.2.1 Implementing Roth’s Theorem

As already mentioned, for the proofs of Theorems 5.1 ([24, Theorem 2.1]) and 5.2 ([24, Theorem 2.2]), we applied Theorem 5.3: Roth’s theorem on rational approximations to algebraic numbers [Citation38]. The proof of this celebrated theorem is long and elaborate; it has not been formalized in any proof assistant, to our knowledge. In our formalization, we have thus used the statement of Roth’s theorem merely as an assumption, instead of formalizing Roth’s proof itself beforehand. Roth’s theorem was implemented within a locale as follows:

  • locale RothsTheorem =

  • assumes RothsTheorem:

  • "ξ κ. algebraic ξξ/∉ Q ∧ infinite {(p ,q ). (q::int)>0

  • coprime p q ∧ |ξ - p/q | < 1/q powr κ} → κ ≤ 2"

A locale collects parameters and assumptions, which it packages as a context in which to work. This locale simply packages the assumption that Roth’s theorem is true. It is a safer option than assuming the theorem as an axiom. Theorems 5.1 ([24, Theorem 2.1]) and 5.2 ([24, Theorem 2.2]) were then formalized within this locale:

  • theorem (in RothsTheorem) HanclRucki1:

  • fixes a b:: "nat ⇒ int" and δ:: real

  • assumes "∀k. a k > 0" and "∀k. b k > 0" and "δ > 0"

  • and "limsup (λk. a(k + 1)/

  • (πi = 0.k. a i) powr (2 + δ) * (1/b(k + 1))) = ∞"

  • and "liminf (λk. a (k + 1)/a k * b k/b (k + 1)) > 1"

  • shows "¬ algebraic (Σk. b k/a k)"

  • theorem (in RothsTheorem) HanclRucki2:

  • fixes a b:: "nat⇒int" and δ ε:: real and t:: nat

  • assumes "∀k. a k > 0" and "∀k. b k > 0" and "δ > 0" "ε > 0"

  • and "limsup (λk. a(k + 1)/(πi = 0.k. a i) powr (2 + 2/ε + δ)

  • * (1/b(k + 1))) = ∞"

  • and "∀k ≥ t. (a (k + 1)/b (k + 1)) powr (1/(1 + ε))

  • ≥ (a k/b k) powr (1/(1 + ε)) + 1"

  • shows "¬ algebraic (Σ k. b k/a k)"

Assuming a key theorem whose proof has not been formalized could be seen as compromising the vision of absolute correctness, which demands a bottom-up approach: to formalize the proofs of all prerequisites. However, ours is a more realistic approach that reflects actual mathematical practice. It still guarantees the correctness of the arguments we formalized, which assume Roth’s theorem. Normally, whenever mathematicians make use of a theorem in their work, they do not prove it themselves all over again but they trust the proof in the literature. But when writing a proof with a proof assistant, one would ideally expect that every result used would also be formally verified as well. Here, however, Roth’s theorem is used even though its proof hasn’t been formalized; we just trust its informal proof [38], just like a traditional mathematician would do.

Our approach has the significant advantage of reaching the formalization of more advanced mathematics faster. Nevertheless, care must be taken to avoid propagating errors in the literature. We recommend that a formally unverified theorem should be taken as an assumption within a formalization only if it is a fundamental result that has been checked by many mathematicians in the past. Roth’s famous theorem can be considered safe in this sense. And all such assumptions must be declared openly [27, Section 5].

5.2.2 A small issue in the original proof

The process of the formalization helped to reveal a slight mistake in one of the original proofs. It puzzled us for some time, although the eventual fix was straightforward. In the proof of Theorem 5.1 [24, page 534] it is claimed that from the assumptions it follows that for each real number A > 1, there exists a positive integer k0 such that for all k>k0,1A·bkαk>bk+1αk+1.

During the formalization process, we noticed that there is a problem with this claim. Consider the counterexample where we set αk+1=k(i=1kαi)2+δ if k is odd, and αk+1=2αk otherwise, with bk = 1 for all k. To resolve this, we had suggested a slightly modified version of the original statement and a different proof. However, this turned out to be unnecessary; the authors suggested to us via email that the problem could be resolved by replacing “for each real number A > 1” with “there exists a real number A > 1” above. This suggestion repaired the proof, and the rest of our formalization [Citation29] followed the original proof [24] without any further problems.

6 On the de Bruijn factors

The de Bruijn factor, introduced in 1977 by van Benthem Jutting [Citation26], is the ratio of the size of a formalization (in symbols) to the size of the original mathematical text (in words). Freek Wiedijk [Citation40] recommends making this more precise by comparing the number of bytes in the computer encodings (using the LATEX source of the mathematics), even compressing both files to ensure independence from such factors as the lengths of identifiers. Such precision is of questionable value, given the enormous variations in the density of mathematical texts.

Here, we only give approximate estimates of the de Bruijn factors for our formalizations, counting the number of lines in our formalizations and in the respective published informal proofs. We consider the entire amount of the material (i.e. statements together with their proofs as well as corollaries together with their proofs) in each formalization work. Given our formalization [Citation30] which spans around 1960 lines, we estimate the de Bruijn factor for Theorem 3.1 ([13, Theorem 2.1]), Corollary 3.1 ([13, Corollary 2.10]), and Theorem 3.2 ([13, Theorem 3.1]) and their respective proofs altogether to be around 25. Given our formalization [Citation28] which spans around 1054 lines, we estimate the de Bruijn factor for Theorem 4.1 ([23, Theorem 3]), Corollary 4.1 ([23, Corollary 2]) and their respective proofs altogether to be around 21. Finally, given our formalization [Citation29] which spans around 990 lines, we estimate the de Bruijn factor for Theorems 5.1 ([24, Theorem 2.1]) and 5.2 ([24, Theorem 2.2]) and their respective proofs altogether to be around 13. These de Bruijn factors can be considered high—Wiedijk reports lower factors for some formalizations in HOL [Citation41]—because we had to fill in a considerable amount of intermediate arguments which had not been made explicit in the original proofs, especially in the case of Erdős–Straus [Citation13].

7 Conclusion

We have formalized, in Isabelle/HOL [Citation28–30], several results on irrationality and transcendence criteria for infinite series from three research papers: by Erdős–Straus (1974) [Citation13], Hančl (2002) [Citation23] and Hančl–Rucki (2005) [Citation24].

We formalized results from mainstream journal papers in their original, unpolished form. Landmarks in the world of formalized mathematics include proving the four-colour theorem in Coq by Georges Gonthier [Citation15], the Kepler conjecture [Citation18] in HOL Light and Isabelle by Thomas C. Hales et al. [Citation19] and, more recently, the solution to the Cap Set Problem in Lean by Dahmen, Hölzl and Lewis [Citation7], the formalization of perfectoid spaces in Lean by Buzzard, Commelin and Massot [Citation6] and the formalization of Grothendieck’s Schemes in Isabelle/HOL by Bordg, Paulson and Li [Citation3, Citation4]. More in the spirit of our work is the formalization in Isabelle/HOL by Gouëzel of a recent paper by Shchur in the theory of Gromov-hyperbolic spaces, where a mistake in the original paper was moreover discovered through the formalization process [Citation16].

Summarizing the lessons that we learned through our formalization, we can now revisit the questions posed in the Introduction:

  • The amount of preliminary content in the current Isabelle/HOL libraries was indeed satisfactory. We had a priori chosen material that we expected would be within reach, and indeed, we didn’t run into significant deficiencies in the library. Some prerequisites that had to be added were a detailed formalization of infinite products, a modification of the ratio test for series and some facts about prime numbers. All these additions were doable within a reasonable amount of time and we can expect that they will be used in many other contexts. Our formalizations were thus an incentive to enrich the library.

  • In general we have found Sledgehammer’s automation to be of immense help, as hinted by the many occurrences throughout our formal proofs of proof methods such as metis and smt, which are generated by Sledgehammer.

  • As discussed in Section 6, the formal proofs were significantly longer than the original, which is reflected by the fairly high de Bruijn factors (approximately 25, 21, and 13 for the material [Citation13, Citation23, Citation24] respectively). This was certainly not unexpected.

  • We did not discover any major issues in the original proofs. We did however need to fill in some missing steps or calculations, some of which were proofs of summability of infinite series. These omissions contributed to the high de Bruijn factors. Regarding Erdős–Straus [Citation13], one of the proofs was slightly reformulated, while in other proofs a few inequalities required some minor corrections that did not, however, affect the correctness of the statements or the general proof structure. In Hančl–Rucki [Citation24], a slight mistake was easily fixed thanks to a hint from the authors. Our detailed, complete formal proofs may even be useful to the readers who wish to study the original proofs.

Overall, we found the formalization process to be interesting and educational. We hope that our feedback may be useful both to experienced proof assistant users and to mathematicians interested in attempting to formalize some research work for the first time.

Acknowledgements

The authors thank the ERC for their support through the ERC Advanced Grant ALEXANDRIA (Project GA 742178). We thank the anonymous reviewers for their useful suggestions that improved our exposition. We also thank Jaroslav Hančl and Pavel Rucki for a helpful email suggesting a fix for the slight mistake in their proof [24, Theorem 2.1], Iosif Pinelis for his useful explanation on MathOverflow regarding an argument in the proof of Theorem 2.2 in the same articleFootnote8 and Anthony Bordg for a discussion on the necessity of formalizing advanced mathematics even if the proofs of the prerequisites are not formalized.

7 Declaration of Interest

No potential conflict of interest was reported by the author(s).

Correction Statement

This article has been republished with minor changes. These changes do not impact the academic content of the article.

Notes

1 isabelle.in.tum.de/dist/library/HOL

3 The proof of Roth’s theorem itself has not been formalized; see Section 5.

5 These can also be found under “Documentation” on the Isabelle webpage.

6 as x approaches the filter F

7 Here, o(·) is the ‘little o’ notation, and φn+1(x)o(φn(x)) as xL is equivalent to limxLφn+1(x)/φn(x)=0.

8 The reader may notice that this development too seems to depend on the AFP entry for the Prime Number Theorem [10]. However we only use some technical lemmas from that development. Primes play no role here.

References

Appendix

Our formalized version of Theorem 4.1 ([23, Theorem 3]) along with its formalized proof as in [Citation28] is presented below:

  • theorem Hancl3:

  • fixes d:: "nat⇒real " and a b::"nat⇒int" and A::real and s::nat

  • assumes "A > 1" and d:"∀n. d n> 1" and a:"∀n. a n > 0"

  • and b:"∀n. b n > 0" and "s > 0"

  • and assu1: "( (λ n. a n powr (1/(2::int)ˆn))→ A) sequentially"

  • and assu2: "∀n ≥ s. A/a n powr (1/(2::int)ˆn)> (πj. d (n + j))"

  • and assu3: "LIM n sequentially. d n ˆ 2 ˆ n/b n:> at_top"

  • and "convergent_prod d"

  • shows "(Σ n. b n/a n)/∉ Q"

  • proof(rule ccontr)

  • assume asm:"((Σn. b n/a n) /∉ Q)"

  • have ab_sum:"summable (λj. b j/a j)"

  • using issummable[OF hA>1i d a b assu1 assu3 hconvergent_prod di].

  • obtain p q::int where "q > 0"

  • and pq_def:"(λ n. b (n + 1)/a (n + 1)) sums (p/q)"

  • proof -

  • from asm have "(Σn. b n/a n) ∈ Q" by auto

  • then have "(Σn. b (n + 1)/a (n + 1)) ∈ Q "

  • by (subst suminf_minus_initial_segment[OF ab_sum,of 1]) auto

  • then obtain p’ q’::int where "q’_=0"

  • and pq_def:"(λ n. b (n + 1)/a (n + 1)) sums (p’/q’)"

  • unfolding Rats_eq_int_div_int

  • using summable_ignore_initial_segment[OF ab_sum,of 1,THEN summable_sums]

  • by force

  • define p q where "p=(if q’<0 then - p’ else p’)"

  • and "q=(if q’<0 then - q’ else q’)"

  • have "p’/q’=p/q" and "q > 0"

  • using hq’≠0i unfolding p_def q_def by auto

  • then show? thesis using that[of q p] pq_def by auto

  • qed

  • define ALPHA where

  • "ALPHA = (λn. q * (πj = 1.n. a j) * (Σj. b (j + n + 1)/a (j + n + 1)))"

  • have "ALPHA n ≥ 1" for n

  • proof -

  • have "(Σn. b (n + 1)/a (n + 1)) > 0"

  • apply (rule suminf_pos)

  • using summable_ignore_initial_segment[OF ab_sum,of 1] a b by auto

  • then have "p/q > 0" using sums_unique[OF pq_def,symmetric] by auto

  • then have "q ≥ 1" and "p ≥ 1" using hq > 0i by (auto simp add:divide_simps)

  • moreover have "∀n. 1 ≤ a n" and "∀n. 1 ≤ b n"

  • using a b by (auto simp add: int_one_le_iff_zero_less)

  • ultimately show? thesis unfolding ALPHA_def

  • using show7[OF _ _ _ _ pq_def] by auto

  • qed

  • moreover have "∃n. ALPHA n < 1" unfolding ALPHA_def

  • proof (rule lasttoshow[OF d a hs > 0i hq > 0i hA > 1i b _ assu3])

  • show " F n in sequentially. (Σj. b (n + j)/a(n + j))

  • ≤ 2 * b n/a n"

  • using show5[OF hA > 1 i d a b assu1 assu3 hconvergent_prod di] by simp

  • show "∀m ≥ s. ∃n ≥ m. d (n + 1)

  • * (MAX j∈{s.n}. a j powr (1/(2::int) ˆ j))

  • ≤ a (n + 1) powr (1/(2::int) ˆ (n + 1))"

  • apply (rule showFootnote9[OF hA > 1i d a hs > 0i assu1 hconvergent_prod di])

  • using show8[OF hA > 1i d a hs > 0i hconvergent_prod di assu2]

  • by (simp add:algebra_simps)

  • qed

  • ultimately show False using not_le by blast

  • qed