
Why Is Proof the Only Way to Acquire Mathematical Knowledge?

Pages 333-353 | Received 12 Mar 2022, Accepted 26 Sep 2022, Published online: 15 Oct 2023
 

ABSTRACT

This paper proposes an account of why proof is the only way to acquire knowledge of some mathematical proposition’s truth. Admittedly, non-deductive arguments for mathematical propositions can be strong and play important roles in mathematics. But this paper proposes a necessary condition for knowledge that can be satisfied by putative proofs (and proof sketches), as well as by non-deductive arguments in science, but not by non-deductive arguments from mathematical evidence. The necessary condition concerns whether we can justly expect that if the mathematical proposition is actually false, despite the evidence for its truth, then this fact has an explanation.

Acknowledgements

I am very grateful to the Journal’s referees for their kind suggestions.

Disclosure Statement

No potential conflict of interest was reported by the author(s).

Notes

1 This view is defended in, for example, Zeilberger 1993; Davis 1995: 212; Fallis 1996, 1997, 2002; Womach and Farach 2003; Paseau 2015; and Brown 2017, 2020. Paseau (2015: 775, 786), for instance, maintains that ‘non-deductive evidence can yield knowledge of a mathematical proposition … inductive knowledge can exist in mathematics just as it can elsewhere’. Womach and Farach endorse the same view: Besides proof, they say,

we now have at our disposal other methods which do not meet standards of absolute rigor, but do provide results to mathematical problems. The methods used are probabilistic … [W]e argue that epistemological worries about confidence in the correctness of a procedure do not preclude use of [these methods] as an acceptable method for acquiring mathematical knowledge. (Womach and Farach 2003: 72–73)

Womach and Farach (2003: 78) distinguish their view from the following alternative, which they reject: that probabilistic methods can ‘give us strong reason to believe mathematical claims and can be useful heuristically (in directions for research and offering reason to look for proofs), but they do not convey actual mathematical knowledge’ (their emphasis). Davis characterizes the standard view that proofs alone provide mathematical knowledge as manifesting an unreasonable preoccupation with proof:

Obsessed over the millennia by the vision that mathematics can provide absolute, rock bottom ‘certainty’, the mathematical establishment has often expressed its displeasure with certain types of ‘proof’: visual, mechanical, experimental, probabilistic. … Computer proof, theorem discovery, and mathematical experimentation are now openly acknowledged as legitimate methodologies and roads to mathematical knowledge. Thus, absolutely rigorous mathematical proof, as an ideal, is giving way and is now seen as a part of a wider, more generous and more flexible notion that I like to call ‘mathematical evidence’. … Classical proof must move over and share the educational stage and time with other means of arriving at mathematical evidence and knowledge. (Davis 1995: 212)

(Perhaps Pólya (1954: viii) should be added on the strength of remarks like this: ‘the role of inductive evidence in mathematical investigation is similar to its role in physical research’.)

2 Of course, I recognize testimonial knowledge as a way of acquiring knowledge in mathematics without the knower’s having a proof of the proposition known. But I will be interested in alleged non-testimonial sources of mathematical knowledge. (It is common for philosophical discussions of the topic I am addressing to set aside testimonial knowledge in mathematics; see, e.g., Peressini 2003: 229 and De Toffoli 2021: 837.)

I will be concerned only with the means by which mathematicians acquire knowledge of theorems given knowledge of the axioms, so I will set aside the question of how mathematicians acquire knowledge of the axioms. (In so doing, I follow others who have discussed the topic I am addressing—e.g., Fallis (2002: 374) and De Toffoli (2021: 823, 828).) If you wish, you can avoid the question of how mathematicians acquire knowledge of the axioms by adopting an if-thenist view of what mathematicians come to know from a proof of a given mathematical proposition (namely, they come to know that those axioms entail that proposition). My topic concerns workaday mathematics (where theorems are confirmed and proved and the relevant axioms are presumed), not foundational inquiries (where questions about the proper axioms are investigated).

There are some controversial cases concerning whether certain kinds of proof (such as computer-assisted proofs or otherwise unsurveyable proofs) can supply mathematical knowledge. I will set these interesting issues aside, along with questions about how rigorous or explicit a proof must be for it to supply mathematical knowledge. Typical ‘proofs’ in mathematics journals are perhaps better characterized as informal proofs, proof sketches, or recipes for providing proofs. I will refer to all of these items simply as ‘proofs’ since they all contrast with the sorts of non-deductive arguments for mathematical propositions with which I will be concerned. (However, I will briefly return to this topic in note 17.)

3 Rabin (1976: 36–37) reports having used his test (with 100 randomly selected numbers) to ‘identify’ $2^{400} - 593$ as prime; this 121-digit number was later proved to be prime (Williams and Holte 1978: 915).
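For readers unfamiliar with the procedure, here is a minimal sketch of a probabilistic primality test of this kind. It assumes the now-standard Miller–Rabin formulation, which may differ in detail from the procedure Rabin actually ran; the function names, the fixed seed, and the default of 100 trials are illustrative assumptions rather than anything taken from Rabin’s paper.

```python
import random


def is_witness(a: int, n: int) -> bool:
    """Return True if a shows that the odd number n > 2 is composite."""
    # Write n - 1 as 2**s * d with d odd.
    d, s = n - 1, 0
    while d % 2 == 0:
        d //= 2
        s += 1
    x = pow(a, d, n)
    if x in (1, n - 1):
        return False  # a is consistent with n being prime
    for _ in range(s - 1):
        x = pow(x, 2, n)
        if x == n - 1:
            return False  # still consistent with primality
    return True  # a certifies that n is composite


def probably_prime(n: int, trials: int = 100, seed: int = 0) -> bool:
    """Declare n 'probably prime' if none of the randomly drawn numbers is a witness."""
    if n < 2:
        return False
    if n in (2, 3):
        return True
    if n % 2 == 0:
        return False
    rng = random.Random(seed)  # fixed seed only so that the sketch is repeatable
    return not any(is_witness(rng.randrange(2, n - 1), n) for _ in range(trials))


n = 2**400 - 593
print(len(str(n)), probably_prime(n))
```

A verdict of ‘probably prime’ from such a test is non-deductive evidence: it reports only that no witness turned up among the numbers drawn.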

4 To make this point seems to be one reason that De Toffoli (2021) introduces the notion of a ‘simil-proof’: an argument that looks like a proof to the relevant experts. Not every simil-proof turns out to be a proof.

5 For criticism of Easwaran 2009, see Jackson 2009 and Fallis 2011.

6 Nelkin (2000), Smith (2010), Buchak (2014), and Staffel (2016), for example, draw this lesson. The ‘paradox’ originated with Kyburg (1961: 197). I will not review the vast literature on it.

7 Both Smith (2016: 48–49) and Hamami (2022) have argued that the reason why we cannot know in advance that a given ticket will lose (whatever that reason is) is the same as the reason why ‘probabilistic proofs’ cannot supply knowledge. (They do not discuss the sorts of non-deductive arguments used, for example, to support Goldbach’s conjecture.)

8 An agent in the lottery case would have had the same evidence (such as knowledge that the lottery is fair) even if her ticket had turned out to win. The agent’s belief that her ticket will lose therefore does not ‘track the truth’; it is ‘insensitive’ to whether the ticket wins. This failure to truth-track has been proposed as helping to explain why the agent does not know that her ticket will lose—and has also been cited as differentiating the lottery case from some non-deductive reasoning in mathematics. Consider an agent who checks many numbers less than n, finds none that witnesses n’s compositionality, and accordingly believes that n is prime—and suppose that n is prime. Hamami (2022) argues that the agent’s belief ‘tracks the truth’ in that had n been composite, then probably her evidence would have been different in that she would have encountered a witness. Easwaran (2009: 348) agrees; so does Paseau (2015: 788). Paseau concludes that the lottery case provides no reason to think that the agent using the Rabin test does not know that n is prime, whereas Hamami believes that the ‘tracking requirement’ is mistaken in not precluding the agent from knowing that n is prime. I will not address this disagreement because I will not invoke the ‘tracking requirement’ to explain why neither the lottery-ticket holder nor the Rabin-test user knows. I am hesitant to appeal to counterfactuals with mathematically impossible antecedents such as ‘Had n been composite’. (Compare Roland and Cogburn 2011: 551.)

But (setting that hesitation aside) suppose that we are prepared to accept that had n been composite, then probably the agent would have encountered a witness (considering that at least 3/4 of the numbers less than n are witnesses). It is not evident that we should then likewise accept (if Goldbach’s conjecture is true) that had Goldbach’s conjecture been false, then probably there would have been a counterexample below $4\times10^{18}$. If Goldbach’s conjecture is true, then had Goldbach’s conjecture been false, why should its ‘closest calls’ (one of which turned out to be fatal) still likely have been smaller numbers (that is, numbers we have checked)? Compare the countermathematical ‘Had Goldbach’s conjecture been false … ’ to the counterfactual ‘Had the match been struck … ’. Regarding an unstruck match, we can ask which features of its environment would still have obtained, had it been struck. Ordinarily, it is straightforward to say confidently that the match would still have been dry and oxygenated since the causes of its being dry and oxygenated have nothing to do with the causes of its remaining unstruck. But it is not obvious which features of mathematics would still have held, had a given mathematical fact not obtained, since various mathematical facts are highly interconnected; a counterfactual alteration to one mathematical fact may well ramify very widely. (Had 23 not been prime, then what other mathematical facts would—or might—have been different? What would 23’s factors have been?) In particular, it is not obvious that (if Goldbach’s conjecture is true) had Goldbach’s conjecture been false, smaller numbers would still have stood a greater chance than larger numbers of violating it. Had it been false, its smallest exception might have been a large number—and (mathematical facts having been very different) larger numbers might have been more likely to violate it. Likewise, if some very large n had been the smallest number violating Goldbach’s conjecture, then perhaps another way that mathematics would have been different is that smaller numbers would not have stood a greater chance of violating it. See also note 16. (I return to ‘sensitivity’ at the end of section 5.)
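The figure of ‘at least 3/4’ can be checked by brute force for small composite numbers. The following is a rough sketch only, again assuming the standard Miller–Rabin notion of a witness; the helper names are hypothetical, and the exhaustive count is feasible only for small n.

```python
def is_witness(a: int, n: int) -> bool:
    """Return True if a shows that the odd number n > 2 is composite."""
    d, s = n - 1, 0
    while d % 2 == 0:  # write n - 1 as 2**s * d with d odd
        d //= 2
        s += 1
    x = pow(a, d, n)
    if x in (1, n - 1):
        return False
    for _ in range(s - 1):
        x = pow(x, 2, n)
        if x == n - 1:
            return False
    return True


def witness_count(n: int) -> int:
    """Count the numbers a with 1 <= a < n that witness n's being composite."""
    return sum(is_witness(a, n) for a in range(1, n))


def is_composite(n: int) -> bool:
    """Trial division; adequate for the small n used here."""
    return n > 3 and any(n % p == 0 for p in range(2, int(n**0.5) + 1))


for n in (9, 15, 21, 25, 561):
    # Print n, its number of witnesses, and the number of candidates n - 1.
    print(n, witness_count(n), n - 1)
```

For each odd composite n tried, the count of witnesses is at least three quarters of the n − 1 candidates, which is what makes a long run of failures to find a witness so improbable if n is composite.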

9 Roy Sorensen called this example to my attention in a lovely unpublished paper. I have discussed it further in Lange 2017: 276–77, 280–81, 289, 353, 355–56, 394–95.

10 The article appears (unsigned, as a ‘gleaning’) on p. 283 of volume 70, number 454 (the December 1986 issue).

11 I give a much fuller treatment of explanation in mathematics in Lange 2017: 231–346.

12 For fuller accounts of contrastive causal explanations, see Lewis 1986: 229–31; Garfinkel 1990; and Carroll 1999. My argument will not rely on the details of any particular account of contrastive explanation. (Lewis gives an example about explaining why he lectured at one place rather than another that inspired my example below.)

13 For example, Bressoud (1994: 69) gives the above argument and says that we can thereby ‘explain why it is that sometimes you can differentiate an infinite series by differentiating each term, and sometimes you cannot’. Bressoud (1994: 65) gives this argument as accounting for the contrast with the finite case—that is, as answering the question ‘What goes wrong with the infinite series?’ (where ‘going wrong’ is differing from finite series in this respect). This argument (as I argued in Lange 2018) is regarded in mathematical practice as explaining why infinite sums sometimes differ from finite sums, despite this argument’s not being a proof that there exist infinite sums where $(f_1 + f_2 + f_3 + \cdots)'(a) \neq f_1'(a) + f_2'(a) + f_3'(a) + \cdots$.

It might be useful to see an example where the derivative at x=a of a sum of infinitely many terms, each differentiable at x=a, does not equal the sum of the terms’ derivatives at x=a. Consider (following Bressoud (1994: 73–74)) the Fourier series $F(x) = \frac{4}{\pi}\left[\cos\frac{\pi x}{2} - \frac{1}{3}\cos\frac{3\pi x}{2} + \frac{1}{5}\cos\frac{5\pi x}{2} - \frac{1}{7}\cos\frac{7\pi x}{2} + \cdots\right]$. For any $-1 < x < 1$, $F(x) = 1$, and so its derivative $F'(x) = 0$. However, differentiating $F(x)$ term by term, we obtain $G(x) = -2\left[\sin\frac{\pi x}{2} - \sin\frac{3\pi x}{2} + \sin\frac{5\pi x}{2} - \sin\frac{7\pi x}{2} + \cdots\right]$ and so, for instance, $G(.5) = -\sqrt{2}\,[1 - 1 - 1 + 1 + 1 - 1 - 1 + 1 + \cdots]$, which fails to converge. Therefore, $G(.5) \neq F'(.5)$.
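The contrast can also be seen numerically. The sketch below is an illustration under stated assumptions, not part of Bressoud’s treatment: it evaluates partial sums of the series for F and of the term-by-term derivative G at x = .5, with function names and term counts chosen only for the purpose.

```python
import math


def F_partial(x: float, terms: int) -> float:
    """Partial sum of (4/pi) * sum_k (-1)^k cos((2k+1) pi x / 2) / (2k+1)."""
    total = sum(
        (-1) ** k * math.cos((2 * k + 1) * math.pi * x / 2) / (2 * k + 1)
        for k in range(terms)
    )
    return 4 / math.pi * total


def G_partial(x: float, terms: int) -> float:
    """Partial sum of the term-by-term derivative: -2 * sum_k (-1)^k sin((2k+1) pi x / 2)."""
    return -2 * sum(
        (-1) ** k * math.sin((2 * k + 1) * math.pi * x / 2) for k in range(terms)
    )


# The partial sums of F settle near 1 at x = .5 ...
print(F_partial(0.5, 2000))
# ... whereas the partial sums of G keep cycling and never settle.
print([round(G_partial(0.5, t), 3) for t in range(1, 9)])
```

The partial sums of G at x = .5 cycle through $-\sqrt{2}$, 0, $\sqrt{2}$, 0 indefinitely, so they approach no limit, let alone the value 0 of the derivative of F there.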

14 In Leavitt 2007: 182, for instance. Sorensen (see note 9) also gives this example. For more discussion of what mathematical coincidences are, see Lange 2017: 276–311.

15 Spivak goes on to point out that ‘In this case there does happen to be an explanation … ’—namely, involving the impure proof appealing to the radius of convergence theorem applied to the complex plane. I have previously discussed this example in Lange 2017: 255, 290–92, 331, 344–45, 451, 454. There I also cite prior philosophical discussions of it.

16 This conditional (‘If Goldbach’s conjecture breaks down … ’) and all similar conditionals below should be read as indicative conditionals, not as subjunctive (including counterfactual) conditionals. (Nor are they material—i.e., truth-functional—conditionals.) The contrast between counterfactual and indicative conditionals is often drawn with examples such as Adams’s (1970: 90) famous pair ‘Had Oswald not shot Kennedy, someone else would have’ (false, presuming Oswald to have been a lone gunman rather than part of a conspiracy) and ‘If Oswald did not shoot Kennedy, someone else did’ (true, considering Kennedy’s grievous injuries). The same sort of contrast can be drawn among conditionals in math. Suppose that we have been reviewing the evidence for Goldbach’s conjecture (including that it has been verified for all numbers up to $4\times10^{18}$) and we regard this evidence as rendering Goldbach’s conjecture highly plausible. Suppose that we then entertain the indicative antecedent ‘If Goldbach’s conjecture has an exception for some number much larger than $4\times10^{18}$’. This antecedent has us hold fixed that all numbers up to $4\times10^{18}$ accord with the conjecture (just as we hold fixed Kennedy’s injuries in contemplating ‘If Oswald did not shoot Kennedy’). By contrast, suppose that we contemplate the (presumed-to-be) counterfactual antecedent ‘Had Goldbach’s conjecture had an exception for some number much larger than $4\times10^{18}$’. Here we may well not hold fixed that all numbers up to $4\times10^{18}$ accord with the conjecture; had Goldbach’s conjecture had an exception for such a large number, then it might have had a much smaller exception—perhaps one that would already have been discovered (just as when we consider ‘Had Oswald not shot Kennedy’, we do not hold fixed our evidence that Kennedy was shot).
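To make concrete what ‘verified for all numbers up to’ a bound amounts to, here is a toy sketch that checks every even number from 4 up to a small limit for a decomposition into two primes. The limit of 10,000 and the function names are assumptions chosen for illustration; the actual verification up to $4\times10^{18}$ relied on far more sophisticated methods.

```python
def prime_table(limit: int) -> bytearray:
    """Sieve of Eratosthenes: table[k] is 1 exactly when k is prime."""
    table = bytearray([1]) * (limit + 1)
    table[0:2] = b"\x00\x00"
    for p in range(2, int(limit**0.5) + 1):
        if table[p]:
            # Cross out the multiples of p starting at p * p.
            table[p * p :: p] = bytearray(len(range(p * p, limit + 1, p)))
    return table


def goldbach_exceptions(limit: int) -> list[int]:
    """Return the even numbers in [4, limit] that are NOT a sum of two primes."""
    table = prime_table(limit)
    primes = [k for k in range(2, limit + 1) if table[k]]
    exceptions = []
    for n in range(4, limit + 1, 2):
        found = False
        for p in primes:
            if p > n // 2:
                break
            if table[n - p]:
                found = True
                break
        if not found:
            exceptions.append(n)
    return exceptions


print(goldbach_exceptions(10_000))
```

An empty result is exactly the kind of evidence that the indicative antecedent holds fixed: every number checked accords with the conjecture, and nothing is thereby shown about numbers beyond the limit.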

17 For a brief autobiographical account of what it is like for a mathematician to discover that a purported proof (or, more accurately, a pair of purported proofs) was in fact incorrect and later to discover why it was incorrect, see Jones 1998: 208–9.

Easwaran (2015: 151) proposes that the proof sketches (see note 2) publishable in mathematics journals must ‘contain enough detail that … if (hypothetically) someone were to find a counterexample, one would be able to trace the counterexample through the proof and find out which particular step failed’. This is similar to my view that there will be an explanation, if a purported proof turns out to fail (though Easwaran does not talk about explanation). Easwaran (2015: 156) sees this requirement as ‘rul[ing] out a reliance on statistical arguments’ in mathematics. He remarks briefly that this requirement may ‘rul[e] out certain inductive or abductive arguments as well’ in mathematics. (It is unclear to me whether he has in mind ruling them out in mathematics from being publishable, from being strong, or from providing knowledge. Only the last is my concern.)

18 This is not a sufficient condition for knowledge since a non-deductive argument would also have to be strong enough.

My proposed necessary condition requires that the agent ‘justly expect’ something. By this, I mean (roughly) that it is reasonable (i.e., epistemically appropriate) for her to believe it, i.e., that it is epistemically permissible for her to do so, which is entailed by the agent’s having adequate epistemic reason (evidence, argument, etc.) for so believing. Shortly I will say a bit more about what sort of knowledge would provide some reason to believe that if P is false despite the mathematician’s evidence for it, then there is an explanation of why it is false despite that evidence. (Scientists routinely draw on their background knowledge of the explanations of other facts in judging how confident to be that a given, as yet unexplained fact has a given sort of explanation. See Lange 2022.)

In requiring that the agent justly expect that if P is false (despite the agent’s evidence for P’s truth) then there is an explanation of why P is false (despite that evidence), the necessary condition does not require that the agent justly be certain of this. The agent’s evidence may not be so strong as to make it impossible for P to be false (despite the evidence) just as a matter of bad luck.

19 As another example of the difference between non-contrastive and contrastive explanations here, consider the gas example. Suppose that the nth sample is CO₂ whereas all of the previous samples were other gases. The pressure-volume-temperature behaviour of the nth sample can presumably be explained by CO₂’s equation of state for some range of conditions that includes the conditions experienced by the sample. But to explain why this sample departed from the pattern exhibited by the previous (n − 1) gas samples, we presumably need a theory that covers not only CO₂, but also the gases in the previous samples, since the explanation must reveal some aspect of the nth sample in which it differs from the previous samples and where this difference is responsible for their behaving differently.

20 Smith (2010, 2016, 2018) has proposed a similar necessary condition for knowledge as resolving the ‘lottery paradox’ and addressing legal examples like the Blue Bus case. Smith (2016: 49) has briefly suggested that this condition could offer a rationale for rejecting ‘probabilistic proofs’ as providing knowledge in mathematics, but he has not pursued this suggestion. Hamami (2022: 87) has also briefly raised these possibilities (citing Smith’s proposal) without pursuing them. Neither Smith nor Hamami has combined this proposal with ideas about explanations in mathematics, and neither has raised extending this rationale to other sorts of non-deductive arguments in mathematics besides ‘probabilistic proofs’.

There are some relatively slight differences between Smith’s proposal and mine. Here is a representative statement of Smith’s proposal: ‘What I suggest is a standard of proof that is met only if a proposition is normically supported by the evidence—only if the evidence makes the falsity of that proposition less normal, in the sense of calling for more explanation, than its truth’ (Smith 2018: 1209). My proposal does not characterize one hypothetical circumstance as ‘calling for more explanation’ (or ‘requiring more explanation’ (Smith 2016: 39–40)) than another. I am unsure what it means for some fact to have ‘more’ or ‘less’ explanation (rather than to have or lack an explanation, or for us to be more or less confident that it has an explanation). When we say that a hypothetical circumstance ‘calls for explanation’ of some kind, we usually mean that we justly expect that if it came to pass, it would have such an explanation. Unlike Smith’s proposal, mine focuses on contrastive explanation.

Furthermore, Smith (2016: 98) uses possible worlds to elaborate which hypothetical circumstances call for more explanation. (Littlejohn and Dutant 2020 do too, though without discussing mathematical knowledge.) But any false mathematical proposition holds in no possible world. I am unsure how this affects the applicability of Smith’s formulation to mathematical knowledge.

21 Smith (see previous note) might not agree with this motivation or with the rest of the paragraph in the main text (or with the rest of this note).

The proposed necessary condition is not that the agent not justly expect that there is no explanation. (Sorry for that unavoidable triple negative!) Rather, the necessary condition is steeper: that the agent justly expect that there is an explanation. (The weaker condition is satisfied by an agent who had no expectation one way or the other.)

In the lottery case, the agent knows that the lottery is fair and so that if a given ticket wins, despite the odds against it, then there is no explanation. But concerning Goldbach’s conjecture, the agent may not have sufficient evidence to justify believing that if Goldbach’s conjecture is violated by some large number, despite holding of all numbers up to $4\times10^{18}$, then there is no explanation of this departure from the previous pattern. But she also has insufficient evidence to justly conclude that there is then an explanation. Therefore, she fails to satisfy the necessary condition for knowledge.

It is crucial that the necessary condition for knowledge concern what the agent justly expects explanation-wise if the confirmed proposition turns out to be false despite the evidence for it. Even if the agent justly expects there to be (if Goldbach’s conjecture is false) no explanation of why it is false despite the evidence of its truth, nevertheless the agent may justly expect the truth of Goldbach’s conjecture to have an explanation if it is true.

Perhaps the proposed necessary condition could be given this broader motivation: My being rational requires my being rationally committed to the existence of the appropriate sort of connection between my evidence and the facts. In particular, my being rational in believing P requires my being rationally committed to the existence of an explanation (if P is true) of why the evidence by which I am being guided in believing P leads to the truth—and to the existence of an explanation (if P is false) of why P turns out to be false despite my evidence for P’s truth. But in this paper, I will resist appealing to any such broad picture of knowledge.

22 My proposed necessary condition for the agent to know that P is that the agent justly expect that if P is false, despite the evidence marshalled by the agent’s argument, then there is a contrastive explanation of why P is false despite the agent’s evidence that P is true. Here are some of the reasons why I have included the italicized reference to the agent’s justified expectations – that is, why I have not let the necessary condition be the externalist ‘simple conditional’ that if P is false, despite the agent’s evidence, then there is a contrastive explanation of this.

Suppose that we are evaluating whether a given agent who believes mathematical proposition P knows that P. Suppose that we know that P is true because we have proved P, but the agent has no proof, only some non-deductive argument for P. What should we say about whether the ‘simple conditional’ is true? It is unclear to me what we take such an indicative conditional (see note 16) to mean when we believe for certain that its antecedent (‘If P is false … ’) is false. Obviously, a counterfactual conditional has a clear role to play when its antecedent is believed for certain to be false, but an indicative conditional generally does not. We have no practice for assessing an indicative conditional under those circumstances—unless we are understood to be imaginatively entering into the epistemic situation of another (perhaps hypothetical) agent who does not believe its antecedent for sure to be false; we are then understood to be assessing what they ought to make of the conditional. My proposal avoids employing the ‘simple conditional’, instead specifying that the conditional is to be assessed from the epistemic situation of the agent who is being evaluated for whether she knows P. (She believes P.)

Notice also that the motivation for my proposal that I gave in the final paragraph of footnote 21 does not motivate the ‘simple conditional’ version of that proposal.

23 Another difference between my proposal and safety is that my proposal is internalist (since it concerns what the agent justly expects) whereas safety is externalist. I have discussed this aspect of my proposal in notes 18 and 22. Another difference is that safety is expressed in terms of a subjunctive (or counterfactual) conditional (e.g., that the agent would believe P only if P were true), whereas my proposal is expressed in terms of an indicative conditional. See note 16.

24 Before being generalized, sensitivity required that a counterfactual conditional (‘Had P been false … ’) obtain. This counterfactual is a countermathematical if P is a mathematical fact. In note 8, I discussed some concerns regarding a sensitivity requirement involving countermathematicals. But I have set aside these concerns here by generalizing sensitivity so that the counterfactual has been replaced by a fact about explanation.

Sensitivity (in either its original or generalized form) is an externalist requirement, whereas my proposal is internalist (as I discussed in note 22). This difference is reflected in the contrast that I am about to draw between them.

25 In this paper, I have been concerned with purely mathematical evidence for a mathematical proposition. But there could be scientific evidence as well. I have not argued that that evidence cannot supply mathematical knowledge.

Furthermore, some deductive arguments for mathematical claims are frequently mislabeled ‘non-deductive’ and so might seem to be precluded by my account from supplying mathematical knowledge when, in fact, they are not so precluded. For instance, suppose that some observations coupled with a well-established scientific theory suffice to entail some (as yet unproved) mathematical proposition. This argument from our knowledge to the mathematical proposition is deductive. My account does not preclude this argument from supplying us with knowledge that the mathematical proposition is true. Nevertheless, since our knowledge of the scientific theory was obviously arrived at non-deductively, this argument for the mathematical proposition might seem non-deductive.

For instance, Plateau famously took a closed curve of wire, dipped it into soap, withdrew it slowly, and observed the soap film bounded by the wire. A well-established scientific theory (concerning the soap film’s potential energy) together with auxiliary hypotheses concerning the experimental conditions entailed that the resulting film had the minimum area of any surface bounded by the wire’s shape. In this way, Plateau deduced a mathematical proposition (concerning the surface of minimum area bounded by a given closed curve) from an observation, scientific theory, and auxiliary hypotheses. My account permits this experiment to yield mathematical knowledge. (The proposed necessary condition for knowledge does not preclude this argument from yielding knowledge. Knowledge of the scientific theory entitles us to believe that if the mathematical proposition arrived at by Plateau’s experiment turned out to be false, despite the evidence, then there would be a contrastive explanation of this fact, such as that the external forces on the film were non-negligible or that the film when it was observed had not yet settled down to its equilibrium shape—that is, some auxiliary hypothesis was false.) Although this experiment obviously provides no mathematical proof, the route from prior knowledge and experimental observation to the mathematical proposition is deductive. (Peressini (2003: 221), by contrast, terms this reasoning ‘non-deductive’.)
