Summary
Not many mathematical problems can be successfully introduced around a dinner table or in other social settings, but a quick internet search tells us that broken stick problems are among the classics that still engage. The fact that solutions may require basic knowledge of calculus, probability, and combinatorics helps demonstrate the power and necessity of mathematical formalism. In this article, the general intention has been to show how multiple integrals calculate probabilities in relation to polygons made by pieces from a randomly broken stick of unit length. Most of this article is devoted to proving a probability formula for the case where side lengths have predetermined lower bounds. This formula generalizes a well known corresponding formula without these requirements.
We will consider the probability of making an -edged polygon (), originating from pieces of a randomly broken stick of unit length, meaning that the n breaking points are independent realizations of a uniform distribution on . For a polygon to be accepted, we also require that piece number i has fixed minimum length (). To identify piece number i, we transfer the stick to the real line, where the n breaking points are indexed according to their position, .
The piece lengths are also called spacings, and spacing number i is denoted by . If and , we have (). If one of the spacings happens to be at least , the sum of the remaining spacings cannot exceed , and the result would be that no polygon could be made. Hence we conclude that the requirement (1) (1) must be fulfilled.
Definition
A single piece between and is called legal if , and a constellation , where , is called legal if (1) is satisfied.
A triangle probability formula
Situations with broken sticks are well suited for simulation, and 100,000 simulations of a broken stick into three pieces resulted in an estimated probability of forming a triangle when . We note that the sum of the lower bounds is in this case, fulfilling the general requirement
The good news is that double integrals will take us to the exact answer . To reach that far, we first consider geometrically all possible constellations , where . tells us what the set U satisfying these inequalities looks like.
It’s no surprise that since the double integral represents the area of U. The fact that the breaking points are independent realizations of a uniform distribution on implies that every constellation has the same probability of occurring. If we let L denote the set of all legal constellations, we thus find that the triangle probability must be
So next we must describe L and find its area. We will do that by writing , where A contains all legal and illegal constellations inside U satisfying , , and , while B contains all illegal constellations in A. We claim that A and B may be described as follows:
Regarding A, we know that , since the first piece is at least of length . To ensure sufficient space for the two pieces to the right, we require . Wherever is situated, we must be sure that . Thus, we must have , and in addition , since we must clear enough space for the last piece.
The illegal constellations of B are characterized by the fact that exactly one of the spacings exceeds . When , we can have , since in this case. The fact that and makes it impossible to achieve illegal constellations of the kinds where or . Altogether, this justifies the claimed description of B. In we see the legal region L together with the illegal region B ().
Now we are ready to compute the probability . We could of course have found the area of the trapezoid by the standard formula, but let us instead do it by double integration, since we want to generalize the process later. If we omit details, we get and
Hence we obtain which is in good compliance with the simulated result (it deviates less than one standard deviation).
We have so far intentionally chosen to keep the notation , , and instead of inserting specific numbers immediately. For , we found (2) (2)
In mathematics there is always a desire to generalize, and this is also the case here. Under the right circumstances, we would have subtracted three terms in (2) instead of one. If in general fulfills the requirements , , and , we might see illegal constellations of all three kinds, referring to the possible outcome that all three spacings might exceed 1/2 (one at a time). To make a unified formula, taking this into consideration, we need a notation that cancels a numerical expression a when and keeps a when . We define by
So, if , where (), a generalization of (2) gives us (3) (3)
In the next section we will see that a further generalization of (3) to pieces will look very similar.
Example 1.
If in (3), we get to be the probability of making a triangle of the pieces when there are no minimum length restrictions. This will be a well known result for those who have made a Google search with keywords “broken stick.”
Example 2.
If , , and , the use of (3) will give
A polygon probability formula
The triangle probability was calculated as a fraction between a legal area L and a possible area U. When , legal and possible regions will be subsets of , and volumes will measure the space these constellations occupy. When the number of breaking points increases to , we must consider “volumes” of n-dimensional regions, also called hypervolumes, to be able to calculate the relevant probabilities. If denotes the set of all legal constellations for a given sequence of lower bounds , the probability of being able to make a polygon from the pieces may be expressed by the proportion between favorable and possible outcomes, expressed by hypervolumes. The idea now is to follow the same reasoning that was used for the triangle probability. The main challenge will be to compute the corresponding n-tuple integrals over the regions U, A, and B, and thereby find . The region A consists of all constellations in U satisfying (). Since there will be both legal and illegal constellations in A, we must remove the region B of illegal ones to obtain L. So, in line with the triangular case, we have , telling us that
Integrating over the region U
Computation of may be performed as an iterated integral with limits induced by . If U’s hypervolume is denoted by , we have
The innermost integral gives the result . The next iteration gives
Computing yet another one, we obtain
A natural guess is that k iterations will give us the result
Since we get
we can conclude, by induction, that
Integrating over the region A
Computation of is performed in much the same way as , but the integration limits are a bit more complicated. We remember that A consists of constellations in U such that for all i. As in an ordinary multivariable calculus course, describing the integration range is an important part of the job. The basic idea, establishing the limits, will be to make room for the pieces, with minimum total length , to the right of the breaking point . The lower integration limit, integrating with respect to , is obtained by adding the corresponding minimum tolerance to the neighboring variable , in line with the requirement . We therefore get the following description of A: (4) (4)
The resulting multiple integral becomes (5) (5) where . To evaluate (5), our strategy will again be to observe the pattern after a few iterations. The innermost integration gives since .
Next, integrating with respect to , we obtain
Our guess should be that k iterations will result in (6) (6)
so we will have to prove that
We omit the details here and conclude that, by induction, it is proved that the result after n iterations becomes (7) (7)
Formula (7) is not a new one. In [Citation1] an alternative derivation of the corresponding probability formula is suggested.
Integrating over the region B
Computation of will follow the same path as for U and A. We repeat that B consists of the illegal constellations in A, a constellation being illegal if exactly one of the spacings, say , is too big, that is, . For simplicity, let us first assume . Then the first n spacings cannot exceed 1/2 in total length. The condition is therefore necessary if one is to avoid zero probability, telling us that the situation is analogous to the one handled for the region A. In analogy with (4), we get the following integration ranges for the n breaking points: (8) (8) where the upper limit of (8) is when . The hypervolume of observing an illegal constellation, with piece number illegally big, may therefore be inherited from (7), and we get
The question now is whether one can expect something else to happen if one of the other pieces is illegally big. The answer is no because the spacings () are identically distributed. To see that, we follow the reasoning in [Citation1], and consider a circle with circumference equal to 1. Then we imagine a sample , positioned randomly and independently on the circle. By symmetry, the spacings must be identically distributed. Next we cut the circle at and straighten it out to become a stick of length 1. We have then obtained that the s () constitute an independent and random sample on , with identically distributed spacings. This is why the hypervolume of constellations with an illegally big piece number i is given by (9) (9)
Alternatively, assuming , we must have , which enables us to derive the nonzero variant of (9), integrating over the region characterized by
The spacing is usually located somewhere in the middle, and we must therefore do the integration with respect to separately. We can reuse the expression in (6) to obtain the result
after the first iterative integrations. Then, integrating with respect to , we get
We do the last integrations following the same recipe, but will not be included in the sums. The final result is
Since every single piece has the opportunity to become illegally big (one at a time), we have the disjoint union , and therefore . From , we obtain
We summarize what we have found in the following theorem.
Theorem 1.
Suppose a stick of unit length is broken into pieces. The breaking points are assumed to be independent realizations from a uniform distribution on , and organized in rising order. If ( ) and , the probability that the pieces are legal, and thus may form a polygon, is given by (10) (10)
Example 3.
Let and . In this case the polygon probability in (10) becomes
100,000 simulations gave an estimated probability , which agrees well with the true value.
In the following examples we will assume . Therefore we simplify notation and write in these cases.
Example 4.
We will find a piecewise expression of . We get for every i. This means that if , and if . Hence, we get (11) (11)
Example 5.
If in (11), we find (12) (12)
There are several derivations of (12) in the literature, one of them in [Citation2] and another one in [Citation3]. The last reference is based on multiple integration. demonstrates how the probabilities increase as the number of pieces increases.
Example 6.
In we have evaluated according to (11). Compared to the case of , we see how the polygon probabilities decrease when the number of pieces gets sufficiently big.
Example 7.
We take a closer look at (11) when and get
Since as , we obtain the asymptotic relation
Example 8.
We notice that the piecewise expression of in (11) is times differentiable at . indicates this smoothness at , when .
Further reading
The original problem of forming a triangle from pieces of a randomly broken stick has been expanded and generalized in several directions. Reference [Citation4] provides some insight into the historical development, as well as a few areas of application. The professional content of this article can be perceived as quite advanced, but some of the references should nevertheless be worth a closer look.
Acknowledgments
I’m very thankful for the guidance and inputs from the referee and the editor. This was crucial to give the article a necessary increase in readability and appeal. I would also like to thank my former colleague, Ivar Henning Skau, who inspired me to study and write about “broken stick” problems.
Disclosure Statement
No potential conflict of interest was reported by the author(s).
Additional information
Notes on contributors
Kai Forsberg Kristensen
Kai Forsberg Kristensen ([email protected]) is a docent of mathematics at the University of South-Eastern Norway. He received his cand. scient. degree from the University of Oslo in 1985 and has a long career in teaching mathematics and statistics to engineering students. He takes special interest in problems that are intuitively easy to grasp but might require a certain degree of mathematical insight to solve. If one nonprofessional interest was to be mentioned, it would have to be fishing—an activity he would like to spend more time on.
References
- Feller W. An introduction to probability theory and its applications. Vol. 2, 2nd ed. New York: Wiley; 1970.
- D’Andrea C, Gómez E. The broken spaghetti noodle. Amer Math Monthly. 2006;113(6):555–557. DOI: 10.1080/00029890.2006.11920336.
- Kaushik V. A simple multiple integral solution to the broken stick problem; 2021. Available from: https://arxiv.org/pdf/2001.03644.pdf
- Verreault W. On the probability of forming polygons from a broken stick. Stat Probab Lett. 2022;180:109237. https://www.sciencedirect.com/science/article/pii/S0167715221001991?via%3Dihub DOI: 10.1016/j.spl.2021.109237.