The dice roll with a given sum problem


you could also be interested in:
Toric sections

Overview


In some exercise taken from combinatorics textbooks you might find a question like: “ if you roll 2 dice what is the probability to get 9 for the sum of the 2 dice faces?”
The answer is not much difficult and can be intuitively given by enumerating the number of cases that satisfy the given event and dividing that number with the number of all possible outcomes.
But what if you want to find the probability of, say, a sum of 31 when rolling 10 dice, where simple enumeration would require a very long time? Or to find a more general formula for the probability of getting p points as the sum of rolling n dice? Is there such a formula, computable without having to enumerate all the cases?

To try to give an answer to above general questions I’ve made some research and this article reports the results I’ve found.
But before getting to the core of the problem it’s better to clarify some of the notation and terminology used in the rest of the page.


PART 1:

Combinatorics: formulas, definitions and notation


In combinatorics there are many different naming conventions and symbols, also coming from different cultural traditions.
Here is the meaning of the main notation and terminology used in this article.
I’ve also tried to mention some alternative terminology to help with the translations from different conventions.

GENERAL TERMS

A set is a (unordered) collection of distinct objects. The cardinality |S| of a set S is the number of members of S . A subset is a portion of a set. So B is a subset of A if B is “contained” inside A , that is, if all elements of B are also elements of A .
A multiset is a (unordered) collection of objects in which members are allowed to appear more than once. The number of times an element is repeated in the multiset is the multiplicity of that member. The total number of elements in a multiset, including repeated memberships, is the cardinality of the multiset.
For example, in the multiset M={a, a, b, b, b, c, d, d, d, d} the multiplicities of the members a, b, c and d are respectively 2, 3, 1 and 4, and the cardinality |M| of the multiset is 10.
In sets and multisets the order of the objects doesn’t count .
On the contrary a list (or tuple) is an ordered sequence of objects.

1) PERMUTATIONS

A permutation is an ordered sequence of elements taken from a collection.

1.A) The permutations of the set S are the different possible ordered sequences of all the elements of S.
If the set S has n elements, then the number of all possible different permutations of this set is:
P(n,n){{=}_{n}}{{P}_{n}}={{P}_{n}}=n(n-1)(n-2)...\text{3}\cdot \text{2}\cdot \text{1}=n!.
In fact there are n possible choices for the element in the first position, for each of these there are n1 choices for the element in the second position (since one of the n elements has already been used) and so on till filling all n positions.
For example there are 6 permutations of the elements {a, b, c}: abc, acb, bac, bca, cab, cba.

1.B) Another different case of permutations (this time with repetition) occur when, instead of choosing the elements from a set S (that, as such, has unique and distinguishable elements) we choose them from a multiset M (in which members are allowed to appear more than once).
In this case a multiset permutation is a sequence of all the elements of M in which each element appears exactly as often as is its multiplicity in M.
If the multiplicities of the elements of M (taken in some order) are {{k}_{1}},{{k}_{2}},...{{k}_{m}} and their sum (the cardinality of M ) is n ({{k}_{1}},{{k}_{2}},...{{k}_{m}}=n), then the number of multiset permutations of M (multinomial coefficient) is given by:
\displaystyle \text{M}{{\text{P}}_{n}}^{{{k}_{1}},{{k}_{2}},{{k}_{3}}...}=\frac{n!}{{{k}_{1}}!{{k}_{2}}!{{k}_{3}}!...}
A typical example is that of the number of different anagrams of a word with some duplicate letter. For example the number of possible anagrams of the 6-letters word “ tattoo ” (that has 3 “ t ”, 2 “ o ” and 1 “ a ”) is \displaystyle \frac{6!}{3!\cdot 2!\cdot 1!}=60.

In case not all of the n elements of the collection are included in the permutations (and we take only k elements out of the n available) we have the k-permutations (repetitions not allowed) and the permutation with repetitions (repetitions allowed) detailed herebelow.

1.C) k-permutations (or variations) are the ordered sequences of k elements taken from a set S of n elements. In other words k-permutations are those ordered arrangements in which no element occurs more than once, but without the requirement of using all the elements from a given set. The concept of k-permutations is then a generalization of that of permutations.
The number of the “kpermutations of k elements taken from a set of n elements” (or of the “lists of length k with n possible elements”) is:
\displaystyle P(n,k)=n(n-1)(n-2)...(n-k+1)=\frac{n!}{(n-k)!}.
For example there are P\left( {8,3} \right) = 8 \cdot 7 \cdot 6 = 336 possible arriving orders of the first 3 horses in a race with 8 runners.
As a special case, If k equals n, we get back to the notion of permutations.

1.D) If repetitions are allowed then the number of the “permutations with repetition” of k elements taken from a set of n elements” (or of lists of length k with n possible elements in each position and repetitions are allowed) is:
PR\left( n,k \right)={{n}^{k}}
For example there are PR\left( 3,2 \right)={{3}^{2}}=9 permutations with repetition of two letters taken from the set {a, b, c}: aa, ab, ac, ba, bb, bc, ca, cb, cc.

2) COMBINATIONS

A combination is a selection of elements from a collection, where (unlike permutations) order does not matter.

2.A) COMBINATIONS WITHOUT REPETITIONS

If repetitions are not allowed then a “combination of k elements taken from a set S of n elements” (or k-combinations) is just a subset of S with k elements. Since the order of the elements is not taken into account, two lists with the same elements but with different orderings are considered to be the same combination. The number of possible combinations without repetition (binomial coefficient) is:
\displaystyle C(n,k)=\binom{n}{k}= \frac{{n!}}{{k! \cdot \left( {n - k} \right)!}}
The notation \displaystyle \binom{n}{k} can be pronounced “n choose k“.
For example the number of different basketball teams of 5 players that can be formed from a group of 9 players is \displaystyle \left( \begin{array}{c}9\\5\end{array} \right) = \frac{{9!}}{{5! \cdot 4!}} = 126.

2.B) COMBINATIONS WITH REPETITIONS

If repetitions are allowed then a “combination with repetition of k elements taken from a set S of n elements” (or k-multiset or k-combination with repetition or k-multicombination) is a multiset with cardinality k whose elements belong to S .
The number of possible combinations with repetition (multiset coefficient) is:
\text{MC}(n,k)={{C}^{*}}_{n,k}=\left( \left( \begin{matrix} n \\ k \\ \end{matrix} \right) \right)=\left(\begin{matrix}n+k-1 \\k \\\end{matrix} \right).
The notation \displaystyle \left( \left( \begin{matrix} n \\ k \\ \end{matrix} \right) \right) can be pronounced “n multichoose k“.
Example 1: let’s say we can choose between five flavors of ice cream: banana, chocolate, lemon, strawberry and vanilla and we can have three scoops. How many different ice creams will there be? Let’s use letters for the flavors: {b, c, l, s, v}. Example selections include:
{c, c, c}: 3 scoops of chocolate
{b, l, v}: one each of banana, lemon and vanilla
{b, v, v}: one of banana, two of vanilla.
In this example there are n=5 things to choose from, and we choose k=3 of them. Order doesn’t matter for the different ice creams and it’s possible to choose more scoops with the same flavor (that is, repetitions are allowed).
The number of possible ice creams we could have is then:
\displaystyle \left( {\left( {\begin{array}{*{20}{c}} 5\\3\end{array}} \right)} \right) = \left( {\begin{array}{*{20}{c}}7\\3\end{array}} \right) = \frac{{7!}}{{3! \cdot 4!}} = 35.

Example 2: Another possible and relevant example for the use of the combinations with repetitions is in problems such as “find the number of possible ways to distribute 3 balls in 5 boxes”, allowing some box to be empty and some to contain one or more balls.
This example is part of a more general kind of problems about finding the number of possible ways to distribute k indistinguishable objects (balls, pencils, white t-shirts…), in n containers, implicitly assuming that, after the distribution, some of the containers might remain empty or contain more objects (even all). Furthermore the number k of objects to distribute in the containers may be greater, equal or less than the number of containers n.
The formula answering this question is:
\displaystyle \left( \left( \begin{matrix}5 \\3 \\\end{matrix} \right) \right)=\left( \begin{matrix}7 \\3 \\\end{matrix} \right)=\frac{7!}{3!\cdot 4!}=35.
Actually that’s the same result we got for the problem of the ice cream variations. We can see that both problems have the same structure if we create the analogy “boxes \leftrightarrow flavors” and “balls \leftrightarrow scoops“.
That is, “a ball in a box” means that the associated flavor is chosen and a box may contain from 0 to 3 balls (flavor not chosen or chosen for 1,2,3 scoops).
So the analogy is built upon the fact that a ball in a box can be interpreted as a pointer to mean that that particular box has been chosen (one or more times).
In applying this formula it’s important to make clear what’s n and what’s k:
n is the number of things to choose from (flavors or boxes) whilst k is the number of choices we have (scoops or balls).
Since here, unlike in the combinations without repetitions, it’s possible that k>n, possible variations to the previous examples could be those to count the number of different ice creams we could have when choosing from 3 flavors and having 5 scoops or the problem of distributing 5 balls in 3 boxes. In these latter cases the formula would be still valid but we should set n=3 e k=5 and we’d get a different result:
\displaystyle \left( {\left( {\begin{array}{*{20}{c}}3\\5\end{array}} \right)} \right) = \left( {\begin{array}{*{20}{c}}7\\5\end{array}} \right) = \frac{{7!}}{{5! \cdot 2!}} = 21.

Note about combinations with repetition
Since combinations with repetition will have an important role later in this article, it’s worth to go deeper into it.
The general formula for the combinations with repetition
\left( \left( \begin{matrix}n \\k \\\end{matrix} \right) \right)=\left( \begin{matrix}n+k-1 \\k \\\end{matrix} \right)
can be explained, making reference to the “3 balls in 5 boxes” problem, in the following way.
Let’s assign a symbol to the 3 balls (say “⊕”) and another symbol to the boxes’ separators (say “|” ). Note that to create 5 boxes we just need 4 boxes’ separators (and for n boxes, n-1 separators are enough).
With these assignments we can interpret the sequence ||||⊕⊕ as representing the distribution “0 ball in the first box; 1 ball in the second box; 0 balls in the third box; 1 ball in the fourth box; 2 balls in the fifth box;” whilst the sequence ⊕⊕ ||||⊕ would represent the distribution “ 2 balls in the first box; 1 balls in the fifth box ”.
Every other possible distribution could be translated in this way by a proper sequence of the two symbols and every sequence of the two symbols represents a different balls distribution.
So the problem of the distribution of 3 balls in 5 boxes can be translated in the possible different multiset permutations of 2 symbols, forming a sequence of length 7.
In this example one symbol (⊕, balls) must be repeated 3 times and the other one (|, separators) must be repeated 4 times.
So the multiset has cardinality 7 and is formed by two elements with multiplicity 3 and 4.
These multiset permutations can also be considered as the number of possible different anagrams of a 7-letters word, in which only two different characters (⊕ and |) are used.
So we can say that \displaystyle \left( \left( \begin{matrix}5 \\3 \\\end{matrix} \right) \right)=\left( \begin{matrix}7 \\3 \\\end{matrix} \right)= \frac{7!}{3!\cdot 4!} and that, in the general case, \displaystyle \left( \left( \begin{matrix}n \\k \\\end{matrix} \right) \right)=\left( \begin{matrix}n+k-1 \\k \\\end{matrix} \right)= \frac{\left( n+k-1 \right)!}{\left( n-1 \right)!k!}.
Later we’ll use this concept to find the number of ways in which (say) 10 dice can yield a sum of (say) 31. In that case we’ll have to distribute 31 points (balls) between 10 dice (boxes).


PART 2:

The dice roll with a given sum problem

Origin of the problem: a simple case


To answer a question like: “If you roll 2 dice what is the probability to get 9 for the sum of the 2 dice faces?” we could define the sample space U (set of all the possible outcomes):
\begin{matrix} \{\{1,1\}, & \{1,2\}, & \{1,3\}, & \{1,4\}, & \{1,5\}, & \{1,6\}, \\ \{2,1\}, & \{2,2\}, & \{2,3\}, & \{2,4\}, & \{2,5\}, & \{2,6\}, \\ \{3,1\}, & \{3,2\}, & \{3,3\}, & \{3,4\}, & \{3,5\}, & \{3,6\}, \\ \{4,1\}, & \{4,2\}, & \{4,3\}, & \{4,4\}, & \{4,5\}, & \{4,6\}, \\ \{5,1\}, & \{5,2\}, & \{5,3\}, & \{5,4\}, & \{5,5\}, & \{5,6\}, \\ \{6,1\}, & \{6,1\}, & \{6,3\}, & \{6,4\}, & \{6,5\}, & \{6,6\}\} \\ \end{matrix}
and notice that there are 36 elements in it (36 possible outcomes).
Then we should define the event space (set of all the subsets of U).
The particular event “the sum of the two dice is 9” is the following element of the event space:
E=\left\{ \{3,6\},\{4,5\},\{5,4\}\{6,3\} \right\} and there are 4 elements in it. If the two dice are fair then each possibility of U is equally likely (with probability 1/36) and the probability of the event E is then \displaystyle P\left(E \right)=\frac{4}{36}=\frac{1}{9}

Note : One possible mistake could be to consider the event E as made up by just the unordered dice outcomes:  E'=\left\{ \left\{ 3,6 \right\},\left\{ 4,5 \right\} \right\}. In this case we should also consider the sample space as made up of unordered couples of dice numbers:
 \begin{matrix} \{\{1,1\}, & \{1,2\}, & \{1,3\}, & \{1,4\}, & \{1,5\}, & \{1,6\}, \\ {} & \{2,2\}, & \{2,3\}, & \{2,4\}, & \{2,5\}, & \{2,6\}, \\ {} & {} & \{3,3\}, & \{3,4\}, & \{3,5\}, & \{3,6\}, \\ {} & {} & {} & \{4,4\}, & \{4,5\}, & \{4,6\}, \\ {} & {} & {} & {} & \{5,5\}, & \{5,6\}, \\ {} & {} & {} & {} & {} & \{6,6\}\} \\\end{matrix}
There are now 21 elements in U. But in this case, saying that \displaystyle  P\left( E' \right)=\frac{2}{21} would be a wrong answer. That’s because the elements in the sample space are no more equally likely. Outcomes with different numbers have in fact a double probability with respect to outcomes with a double number.
We should adjust things by giving a weight of 2 to the outcomes with different numbers and a weight of 1 to the ones with the same number. With this adjustment our probability would become \displaystyle  P\left( E' \right)=\frac{2+2}{6+(2\cdot 15)}=\frac{1}{9} , as before.

 

Generalization


In the previous example we found the number of elements of the event E (roll sum equal 9) by enumeration .
In that example it didn’t take much time to find that this number was 4. The denominator was 36 ({{6}^{2}}), that is the number of permutations with repetition of 2 elements taken from a set of 6 elements.

But what if someone wants to find the probability of, say, a sum of 31 when rolling 10 dice? In this case the denominator (number of elements in the sample space) would be {{6}^{10}}=60466176, but what about the numerator? How long would it take to enumerate and count all the possible different outcomes that results in a sum of 31 with 10 dice (one of these would be: {6,3,4,3,5,2,1,1,2,4} )?

Isn’t there a general formula to find the number of the possible outcomes that satisfy the event statement?
To give the problem a wider generality one would wish to find a general formula to compute the probability of getting p points as the sum of rolling n dice, each with s sides.

The answer is: yes, there is such a formula:
\displaystyle P\left( {p,n,s} \right) = \frac{1}{{{s^n}}} \cdot \sum\limits_{k=0}^{{{k}_{\max }}}{{{(-1)}^{k}}}\left( \begin{matrix} n \\ k \\\end{matrix} \right)\left( \begin{matrix} p-s\cdot k-1 \\ p-s\cdot k-n \\\end{matrix} \right)(1)
where \displaystyle {{k}_{\max }}=\left[ \frac{p-n}{s} \right] and \left[ x \right] is the floor function, that gives the largest integer less than or equal to x (ie [6.83]=6).

I found the formula in http://mathworld.wolfram.com/Dice.html but couldn’t find some intuitive justification of it (even if I bet there is something somewhere in the net).
Actually the proof presented in the mathworld page is mostly based on algebraic methods rather than on a pure combinatorics reasoning, and, as such, didn’t satisfy much my need for a deeper understanding of its structure.

I was a little puzzled by that formula, with its use of an alternating sign term in the summation and with the use of the floor function. Where do these elements come from?

So I tried to understand it better and the following part of this article describes the path I followed.

 

Understanding the formula


First let’s apply the formula to the following concrete example case: “ find the probability to get a sum of 31 when rolling 10 dice ”.
In this case the simple enumeration of the favorable cases is not possible (or would require a very long time) so some general reasoning is necessary.
In this exploratory case we consider the common cubic dice with 6 sides (s = 6). Furthermore p = 31, n = 10 and \displaystyle {{k}_{\max }}=\left[ \frac{31-10}{6} \right]=[3.5]=3. So the formula (1) becomes:
\displaystyle P\left( 31,10,6 \right)=\frac{1}{{{6}^{10}}} \cdot \sum\limits_{k=0}^{3}{{{(-1)}^{k}}}\left( \begin{matrix} 10 \\ k \\\end{matrix} \right)\left( \begin{matrix} 31-6k-1 \\ 31-6k-10 \\\end{matrix} \right)
We can ignore the first factor \frac{1}{{{6}^{10}}} (that’s just the number of possible outcomes needed to calculate the probability) and concentrate on the second factor (the sum) that represents the number of favorable cases, that is the number of the (ordered) outcomes that produce a sum of 31. Let’s call it {{N}^{10}}_{31}.
By expanding the summation we get:
{{N}^{10}}_{31}=\left( \begin{matrix} 30 \\ 21 \\ \end{matrix} \right)-\left( \begin{matrix} 10 \\ 1 \\ \end{matrix} \right)\left( \begin{matrix} 24 \\ 15 \\ \end{matrix} \right)+\left( \begin{matrix} 10 \\ 2 \\ \end{matrix} \right)\left( \begin{matrix} 18 \\ 9 \\ \end{matrix} \right)-\left( \begin{matrix} 10 \\ 3 \\ \end{matrix} \right)\left( \begin{matrix} 12 \\ 3 \\ \end{matrix} \right)=3393610

To face the dice roll problem I have chosen to use the model of the combinations with repetition .
Its formula addresses problems such as “ find the number of possible ways to distribute 3 balls in 5 boxes ” (allowing some box to be empty) and the formula answering this question would be \displaystyle \left( \left( \begin{matrix}5 \\3 \\\end{matrix} \right) \right)=\left( \begin{matrix}5+3-1 \\3 \\\end{matrix} \right)=\left( \begin{matrix}7 \\3 \\\end{matrix} \right)=\frac{7!}{3!\cdot 4!}=35   (see the “Combinations with repetitions” section above).

But how is the combination with repetitions formula related with a roll of 10 dice with the given sum of 31?
Well, in this case we don’t have balls and boxes, but points and dice. So, using the correspondence balls in the boxes  \to  points in the dice our problem will be that of counting all the possible ways to distribute 31 points (balls) in 10 dice (boxes).
Here the number of elements to choose from (n) is 10 and the number of choices (k) is 31:
\displaystyle \left( \left( \begin{matrix} 10\\ 31 \\ \end{matrix} \right) \right)=\left(\begin{matrix}10+31-1 \\31 \\\end{matrix} \right)=\binom{40}{31}= \frac{{40!}}{{31! \cdot 9!}}.
The fact that here  k>n seems a little strange initially, since it can’t happen in the more common case of combinations without repetitions. But, since here repetitions are allowed, there’s actually nothing wrong with it. For example, If the 3rd die gives the number “4”, than it’s like it had been chosen “4” times, and since we must total a number of 31 points (choices) with 10 dice it’s inevitable that there must be repetitions.

Following the same reasoning made above in “Note about combinations with repetition” an alternative explanation of the formula of the multiset coefficient  \displaystyle \left( \left( \begin{matrix} n\\ k \\ \end{matrix} \right) \right) is to interpret it in the form permutations with repetition, using the symbol “|” for the box (dice) separator and the symbol “p” for a point assigned to a dice.
For example, the sequence
ppp|p|ppppp|pp|p|pppp|ppp|pppp|pppppp|pp
would be the dice outcome {3,1,5,2,1,4,3,4,6,2}, whose sum is 31.
The number of ways we can arrange the 31 “p” symbols and the 9 “|” symbols (separators) in a string of length 40 is just the number of different anagrams of a 40 length word that can be made with two letters, that is  \displaystyle \frac{{40!}}{{31! \cdot 9!}}.

But here there are two problems: one is rather simple to face, the second is more complex.
Both problems arise from the fact that, differently form the balls and boxes examples (in which one box could be empty, contain some balls and also contain all the balls, leaving the other boxes empty), here a die (box) must be assigned a minimum of 1 point (first problem) and a maximum of 6 points (second problem).
To deal with the fact that each die have at least 1 point is quite simple. We just force the placement of 1 point in each die (box), for a total of 10 points.
After that we’ll have to place the remaining 21 points in the 10 dice.

If there wasn’t an upper limit for the points to assign to the dice (second problem) we’d be done.
We could just apply the combination with repetition formula to find all the possible distributions:
\displaystyle \left( \left( \begin{matrix}10 \\21 \\\end{matrix} \right) \right)=\left( \begin{matrix}10+21-1 \\21 \\\end{matrix} \right)=\left( \begin{matrix}30 \\21 \\\end{matrix} \right)=\frac{30!}{21!9!}=14307150.
This formula gives the number of all the possible distributions of 21 points to 10 dice and each die could then have a value ranging from 1 to 22 (considering the pre-assigned point).


To check this result let’s apply the general formula to dice with an unrestrained number of faces. Actually it’s enough to consider dice with 22 faces (s = 22).
With this choice for s each die could then produce from 1 to 22 points and this guarantees that a single die could provide all the points needed to reach the sum of 31 points when the other 9 dice are producing a minimum sum of 9 (with their minimum value of 1). So, in this case,  \displaystyle {{k}_{\max }}=\left[ \frac{31-10}{22} \right]=\left[ 0.95 \right]=0 and the number of possible distributions given by the formula (1) would be:
 \displaystyle \sum\limits_{k=0}^{0}{{{(-1)}^{k}}}\left( \begin{matrix}n \\k \\\end{matrix} \right)\left( \begin{matrix}p-s \cdot k-1 \\p-s \cdot k-n \\\end{matrix} \right)=\left( \begin{matrix}10 \\0 \\\end{matrix} \right)\left( \begin{matrix}30 \\21 \\\end{matrix} \right)=\left( \begin{matrix}30 \\21 \\\end{matrix} \right)=\frac{30!}{21!9!}=14307150
That’s the same result we got by using the combinations with repetition  \displaystyle \left( \left( \begin{matrix}10 \\21 \\\end{matrix} \right) \right).
That means that there’s a promising matching between our reasoning (based on the combination with repetition) and the formula (1), at least if we ignore the second problem about the maximum value that a regular 6 sided die can have.

That’s good! :). Let’s go on!


The next step is to adjust this temporary result by taking into account the fact that there’s actually an upper limit for the points assigned to each die.
To find the correct number of outcomes with the common 6 sided dice we must identify and remove from above number of outcomes (with unrestrained dice) the number of prohibited outcomes coming from the outlaw dice.
Considering that, after the adjustment to meet the minimum value of 1 each die has already received 1 point, the number of new fresh points to assign to each legal die must be between 0 and 5. An outlaw die will then be one that receives a number of points \geq 6.

 

Prohibited cases removal


Here comes the toughest part.
We must find a way to enumerate and count all the prohibited distributions, that is, count the distributions in which one or more dice produce a number of points ≥ 6. If we manage to do that then we’ll be able to say that:
number of allowed distributions = number of unrestrained distributions number of prohibited distributions .
We already know that the number of unrestrained distribution is \displaystyle \left( \begin{matrix}30 \\21 \\\end{matrix} \right). So all we have to do is to find the number of prohibited distributions.
A prohibited distribution occurs when 1 or more dice presents a face with value ≥ 6.
Let’s first consider the cases when just one die is forced to be an outlaw die (with points ≥6).
If we pre-assign to a die (say die 1) 6 points, then there will be 15 points left to distribute again to all the 10 dice (after assigning the 6 points to die 1 we can assign other points to the same die and let its total grow above 6). Using our formula of the combination with repetitions we have
\left( \left( \begin{matrix}10 \\15 \\\end{matrix} \right) \right)=\left( \begin{matrix}15+10-1 \\15 \\\end{matrix} \right)=\left( \begin{matrix}24 \\15 \\\end{matrix} \right)=1307504
but, considering that we can make this choice (of making a die an outlaw die ) in 10 different ways (1 for each die), the total number of prohibited distributions generated will be:
10\left( \left( \begin{matrix}10 \\15 \\\end{matrix} \right) \right)=10\left( \begin{matrix}15+10-1 \\15 \\\end{matrix} \right)=10\left( \begin{matrix}24 \\15 \\\end{matrix} \right)=13075040
We’ll call this group of prohibited distributions of the points to the 10 dice as {{A}_{1}} (or “≥1six” group).
Above procedure could be summarized in the following 4 steps:
(1) choose the first die
(2) assign 6 points to it
(3) distribute the remaining 15 points to the ten dice (including the one with the fixed assignment)
(4) move to the next die and go to step (2)
The cycle will stop after processing the 10th die.

It’s important to notice that the procedure of assigning 6 points to a single die (die k ) in the  step (2) and then the remaining points to all the 10 dice in the step (3) doesn’t prevent that:
a) after the step (3) the die k could have a total of more than 6 points (it could also even receive all the other 15 points at stake)
b) some other die could receive, in step (3), 6 or more points and become so another outlaw die even if it were not forced to be such in the step (2).

At this point we have built a group of distributions ({{A}_{1}}) that includes all the prohibited distributions with at least 1 outlaw die. The number of distributions in this group is:
\displaystyle \left| {{A_1}} \right| = 10\left( \begin{matrix}24 \\15 \\\end{matrix} \right)=13075040.

 

Removal of the duplicated cases within the prohibited cases


Unfortunately there’s a serious problem in above procedure: it doesn’t guarantee that all the distributions generated in the {{A}_{1}} group are unique. That also means that {{A}_{1}} is not a set but rather a multiset. Let’s see this with an example:
Let’s suppose that the current die forced as outlaw is die 2 so we assign 6 points to it. Then we assign the other 15 points to the dice and get the following distribution (one of the many possible): {1,6,0,4,0,0,3,1,6,0}.
The total of points is 21 and the actual corresponding dice roll (we have to sum 1 pre-assigned point to each die) would be {2,7,1,5,1,1,4,2,7,1}, with sum 31 but with two outlaw dice.
Then we arrive at dice 9, assign 6 points to it and assign the remaining 15 points to the dice. We could have again (amongst the other) the same distribution {1,6,0,4,0,0,3,1,6,0} . So this same distribution will be produced (and counted) two times: one with die 2 forced to be an outlaw die and 1 with die 9 forced to be outlaw.
The same thing happens whenever there are 2 outlaw dice in the resulting distribution.
So, to find the duplicates in the {{A}_{1}} group, we must consider the distributions with at least two outlaw dice.
We’ll call this group {{A}_{2}} (or “≥2six” group).

To build the group{{A}_{2}} we start by pre-assigning to 2 dice 6 points. There will then be 9 residual points to distribute again to all the 10 dice, and that can be made in
\left( \left( \begin{matrix}10 \\9 \\\end{matrix} \right) \right)=\left( \begin{matrix}10+9-1 \\9 \\\end{matrix} \right)=\left( \begin{matrix}18 \\9 \\\end{matrix} \right)=48620 ways.
But we can make this choice (of making two dice outlaw dice ) in \left( \begin{matrix}10 \\2 \\\end{matrix} \right)=45 different ways; so the total number of prohibited distributions generated will be:
\displaystyle \left| {{A_2}} \right| = \left( \begin{matrix}10 \\2 \\\end{matrix} \right)\left( \left( \begin{matrix}10 \\9 \\\end{matrix} \right) \right)=\left( \begin{matrix}10 \\2 \\\end{matrix} \right)\left( \begin{matrix}10+9-1 \\9 \\\end{matrix} \right)=\left( \begin{matrix}10 \\2 \\\end{matrix} \right)\left( \begin{matrix}18 \\9 \\\end{matrix} \right)=2187900
But also in this group there are duplicates and also this group is a multiset and not a set.
In fact if the pre-assigned dice were the die 3 and the die 7 we could have another die with a six, say in die 9, and a possible example could be the sequence {1,0,6,0,0,0,6,2,6,0}.
But the same sequence could occur also when die 3 and die 9 are pre-assigned and when die 7 and die 9 are pre-assigned. That means there are \left( \begin{matrix}3 \\2 \\\end{matrix} \right)=3 duplicates for each double pre-assignment.
This duplication occurs whenever there are 3 outlaw dice in the distribution.
So, to find the duplicates in the {{A}_{2}} group, we must consider the distributions with at least three outlaw dice.
We’ll call this group {{A}_{3}}.

It seems there’s no end to this constant need to find the number of distributions with more and more outlaw dice, but luckily now we see that things change.
In fact we can recognize (with some relief) that actually there cannot be more than 3 outlaw dice (with value \geq 6) in the possible distributions if the sum is to be 21.
In fact 4 six would produce a sum of 24 that is greater than the total sum we want to get, breaking the condition on the sum of the dice roll.
So the {{A}_{3}} group doesn’t include duplicated elements (it’s actually a set and not a multiset) and can be denoted as “=3six” group.
To build the group {{A}_{3}} we start by pre-assigning to 3 dice 6 points. There will then be only 3 residual points to distribute again to all the 10 dice (and with so few points no other dice can become another outlaw die), and the number of possible distributions is:
\left( \left( \begin{matrix}10 \\3 \\\end{matrix} \right) \right)=\left( \begin{matrix}10+3-1 \\3 \\\end{matrix} \right)=\left( \begin{matrix}12 \\3 \\\end{matrix} \right)=220
but we can make this choice (of making three dice outlaw dice) in \left( \begin{matrix}10 \\3 \\\end{matrix} \right)=120 different ways; so the total number of prohibited distributions generated in the “=3six” group will be:
\displaystyle \left| {{A_3}} \right|= \left( \begin{matrix}10 \\3 \\\end{matrix} \right)\left( \left( \begin{matrix}10 \\3 \\\end{matrix} \right) \right)=\left( \begin{matrix}10 \\3 \\\end{matrix} \right)\left( \begin{matrix}10+3-1 \\3 \\\end{matrix} \right)=\left( \begin{matrix}10 \\3 \\\end{matrix} \right)\left( \begin{matrix}12 \\3 \\\end{matrix} \right)=26400.
Since there cannot be more than 3 six in the distribution of the 21 points to the dice the group {{A}_{3}} will have no duplicates and we have finally reached the end of the chain.

It’s important to notice that there are duplicated sequences of the  {{A}_{3}} group not only in the  {{A}_{2}} but also in the  {{A}_{1}} group. In fact if the pre-assigned dice was the die 6 we could also have other 2 dice with a six, say die 3 and 8.
An example could be the sequence {1,1,6,0,0,6,0,6,0,1}. The same sequence could occur also when die 3 is pre-assigned and when and die 8 is pre-assigned. That is \left( \begin{matrix}3 \\1 \\\end{matrix} \right)=3 duplicates for each pre-assignment of 1 six to a die. These sequences belong to both the group  {{A}_{1}}, the group  {{A}_{2}} and the set  {{A}_{3}}.

We can now go backward to re-build the unique distributions, cleaned up from duplicated distributions.
To do so it may be useful to define the following numbers:
{{x}_{3}}: number of unique distributions with exactly 3 six;
{{x}_{2}}: number of unique distributions with exactly 2 six;
{{x}_{1}}: number of unique distributions with exactly 1 six.
Adopting above definitions we can say that:
{{A}_{3}} is made of unique distributions, so {{x}_{3}} is the cardinality of {{A}_{3}}:
{{x}_{3}}=\left|{{A}_{3}} \right|.
{{A}_{2}} includes 3 times the distributions with exactly 3 six. So to get he number of unique distributions with exactly 2 six we must subtract 3{{x}_{3}} to the cardinality of {{A}_{2}}:
{{x}_{2}} =\left| {{A}_{2}} \right|-3{{x}_{3}}=\left| {{A}_{2}} \right|-3|{{A}_{3}}|
{{A}_{1}} includes 2 times the distributions with exactly 2 six and 3 times the distributions with exactly 3 six. So the number of distributions with exactly 1 six is:
{{x}_{1}}=\left|{{A}_{1}} \right| -2{{x}_{2}} -3 {{x}_{3}} =\left| {{A}_{1}} \right|-2\left( \left| {{A}_{2}}\left| -3 \right|{{A}_{3}} \right| \right)-3\left| {{A}_{3}} \right|=\left| {{A}_{1}}\left| -2 \right|{{A}_{2}}\left| +3 \right|{{A}_{3}} \right|

Finally we can get the actual number of all the possible prohibited distributions, cleaned up from duplicates:
{x_1} + {x_2} + {x_3} = \left( {\left| {A_1} \right| - 2\left| {A_2} \right| + 3\left| {A_3} \right|} \right) + \left( {\left| {A_2} \right| - 3\left| {A_3} \right|} \right) + \left| {A_3} \right| = \left| {A_1} \right| - \left| {A_2} \right| + \left| {A_3} \right|
This is the number of unique distributions that include sequences with 1 or 2 or 3 six. And this is also the set of prohibited distributions.

The following figure visually shows the duplicated elements in the {{A}_{1}}, {{A}_{2}} and {{A}_{3}} groups and the result {{x}_{1}} +{{x}_{2}}+ {{x}_{3}}=\left| {{A}_{1}} \right|-\left| {{A}_{2}} \right|+\left| {{A}_{3}} \right|.


DiceSumProblem_Figure1PSFigure 1: The group {{A}_{1}} includes all combinations with just 1 six (counted 1 time) but also the combinations with 2 six, that are counted 2 times and the ones with 3 six, that are counted 3 times.
Similarly the group
{{A}_{2}} includes all combinations with just 2 six (counted 1 time) but also the combinations with 3 six, that are counted 3 times.
In general, in the multiset {{A}_{k}} the combinations with j six would be counted \displaystyle  \left( \begin{matrix}j \\k \\\end{matrix} \right) times.


The same result could be found algebraically by solving the following system in the unknowns {x_1}{x_2} and {x_3}:
\left\{ {\begin{array}{*{20}{l}} {\left| {{A_1}} \right| = {x_1} + 2{x_2} + 3{x_3}}\\ {\left| {{A_2}} \right| = {x_2} + 3{x_3}}\\ {\left| {{A_3}} \right| = {x_3}} \end{array}} \right.
which gives:
{x_1} = \left| {{A_1}} \right| - 2\left| {{A_2}} \right| + 3\left| {{A_3}} \right|,{\rm{ }}{x_2} = \left| {{A_2}} \right| - 3\left| {{A_3}} \right|,{\rm{ }}{x_3} = \left| {{A_3}} \right|
and form which we get:
{x_1} + {x_2} + {x_3} = \left| {{A_1}} \right| - \left| {{A_2}} \right| + \left| {{A_3}} \right|

The following figure shows the nesting structure and the duplicated distributions of the {{A}_{1}}, {{A}_{2}}, {{A}_{3}}  groups.


Figure 2: Another representation of the multisets structure of {{A}_{1}}, {{A}_{2}} and {{A}_{3}} and of their the duplicated elements, visually showing that the unique prohibited cases {{x}_{1}}, {{x}_{2}}, {{x}_{3}} can be obtained with the formulas {x_1} = \left| {{A_1}} \right| - 2\left| {{A_2}} \right| + 3\left| {{A_3}} \right|,{\rm{ }}{x_2} = \left| {{A_2}} \right| - 3\left| {{A_3}} \right|,{\rm{ }}{x_3} = \left| {{A_3}} \right|.


 

Conclusions for the 10 dice roll with sum 31 case


After having finally found the number of prohibited cases and how to remove the number of duplicated prohibited cases we can finally apply the equation
number of allowed distributions = number of unrestrained distributions number of prohibited distributions
where the number of unrestrained distributions is:
\left( \left( \begin{matrix}10 \\21 \\\end{matrix} \right) \right)=\left( \begin{matrix}30 \\21 \\\end{matrix} \right)=14307150
the number of unique prohibited distributions is:
\left( \begin{matrix} 10 \\ 1 \\ \end{matrix} \right)\left( \begin{matrix} 24 \\ 15 \\ \end{matrix} \right)-\left( \begin{matrix} 10 \\ 2 \\ \end{matrix} \right)\left( \begin{matrix} 18 \\ 9 \\ \end{matrix} \right)+\left( \begin{matrix} 10 \\ 3 \\ \end{matrix} \right)\left( \begin{matrix} 12 \\ 3 \\ \end{matrix} \right) =13075040 - 2187900 + 26400=10913540
and the number of allowed distributions is:
{{N}^{10}}_{31}=\left( \begin{matrix} 30 \\ 21 \\ \end{matrix} \right)-\left( \begin{matrix} 10 \\ 1 \\ \end{matrix} \right)\left( \begin{matrix} 24 \\ 15 \\ \end{matrix} \right)+\left( \begin{matrix} 10 \\ 2 \\ \end{matrix} \right)\left( \begin{matrix} 18 \\ 9 \\ \end{matrix} \right)-\left( \begin{matrix} 10 \\ 3 \\ \end{matrix} \right)\left( \begin{matrix} 12 \\ 3 \\ \end{matrix} \right) =14307150-10913540=3393610
in accordance with the general formula (1).
For the records, considering the number of different possible outcomes ({6^{10}} = 60466176) the probability of having a sum of 31 in a 10 dice roll is
\displaystyle P\left( {31,10,6} \right) = \frac{{3393610}}{{60466176}} = 0.05612 \to 5.612\%
Furthermore, by applying the formula in a spreadsheet or a mathematical software program, we can say that the maximum probability for a given sum would occur for a roll of 35 (7.269%), the probability to have a sum between 30 and 40 (extremes included) is 68.699% and the probability to have a sum between 25 and 45 (extremes included) is 94.950%.

 

Conclusions for the general formula


With the long process of building the expression for the case of a 10 dice roll with sum 31, expression that, as was verified, matches the general formula (1), we have reached a better understanding about how the formula works.
For example the floor function comes out because in the prohibited cases groups there cannot be more than 3 six in order to have a sum of 21 and the alternating signs come out from the cleaning up of the duplicated elements of the multisets {{A}_{1}}, {{A}_{2}}, (whilst the last group, the set {{A}_{3}}, has no duplicates) to get the actual set of the unique prohibited cases, whose cardinality is {{x}_{1}}+{{x}_{2}}+{{x}_{3}}.
The technique used in deriving the formula for removing the duplicated cases is based on the inclusion-exclusion principle even if we had to face the further complication for having to deal with multisets and not with simple sets.

 

Side note: a dual view


We have used the combinations with repetitions to get the number of unrestrained distributions: \left( \left( \begin{matrix}10 \\21 \\\end{matrix} \right) \right)=\left( \begin{matrix}30 \\21 \\\end{matrix} \right). This formula has the meaning of counting the possible ways to distribute 21 points between 10 dice.
But since \displaystyle \left( \begin{matrix}30 \\21 \\\end{matrix} \right)=\left( \begin{matrix}30 \\9 \\\end{matrix} \right)=\frac{30!}{21!9!} we could also have used the multiset coefficient \left( \left( \begin{matrix}22 \\9 \\\end{matrix} \right) \right)=\left( \begin{matrix}30 \\9 \\\end{matrix} \right).
(In general  \left( \left( \begin{matrix}n \\k \\\end{matrix} \right) \right)=\left( \begin{matrix}n+k-1 \\k \\\end{matrix} \right)=\left( \begin{matrix}n+k-1 \\n-1 \\\end{matrix} \right)=\left( \left( \begin{matrix}k+1 \\n-1 \\\end{matrix} \right) \right) ).

The multiset coefficient  \left( \left( \begin{matrix}22 \\9 \\\end{matrix} \right) \right) can be interpreted as the number of possible ways to distribute 9 dice between the 22 steps of a staircase, numbered from 0 to 21.
If the dice are ordered by their height along the staircase (from the lowest to the highest positions), the integer number associated to the step i occupied by the k-th die can represent the cumulated sum of the dice from 1 to k.
For example if the step with value 13 is occupied by the 3rd die (the 1st and 2nd dice will be in equal or lower steps), this would mean that the sum of the dice 1, 2, and 3 is 13.
Furthermore the value of the 3rd die will be equal to the difference between the number of the step in which that die is placed and the number of the step in which is placed the 2nd die.
The 10th die is not considered in the multiset coefficient formula. In fact we can think of it as automatically assigned to the 22nd step (whose number is 21) to guarantee a total of 21 for the sum of the 10 dice, and, being in a fixed position, it doesn’t contribute to the number of the possible distributions of the dice.
With above interpretation we can recognize that there is a one-to-one correspondence between the distributions counted by the multiset coefficient \left( \left( \begin{matrix}22 \\9 \\\end{matrix} \right) \right) (9 dice placed in the 22 steps of a staircase, case a) and those counted by the multiset coefficient \left( \left( \begin{matrix}10 \\21 \\\end{matrix} \right) \right) (21 points placed in 10 boxes/dice, case b).
In both cases a die may have a value ranging from 0 (i.e. two dice sharing the same step (a) or an empty box (b)) to 21 (i.e. 9 dice placed in the first step having number 0 with the 10th virtual dice being by default located in the last step with number 21 (a) or all 21 points placed in the 10th box (b)).

The corresponding binomial coefficient \left( \begin{matrix}30 \\9 \\\end{matrix} \right) associated to the multiset coefficient \left( \left( \begin{matrix}22 \\9 \\\end{matrix} \right) \right) also suggests to interpret this number as the count of the combinations of 9 different symbols taken from a set of 30.
In our case the 30 symbols may simply be the natural numbers from 1 to 30 and \left( \begin{matrix}30 \\9 \\\end{matrix} \right) is the number of the possible drawings of 9 numbers from an urn containing all the 30 numbers.
Since in combinations the order doesn’t count we can consider each single combination as a sequence of 9 of the 30 elements (natural numbers)  listed in the ascending lexicographical order.
For instance, one possible combination could be: {5,11,17,22,23,24,26,28,30}.
This sequence could be interpreted as the cumulative sum of the dice in a 9 dice roll. So above list could be associated with this roll of 9 dice: {5,6,6,5,1,1,2,2,2}.
Since we have actually 10 dice (and not 9) and since the sum of the 10 dice must be 31 (and not 30) we can just append the number 31 at the end of the sequence.
In this way each dice roll with sum 31 can be represented by the ordered sequence of the 30 natural numbers (with the number 31 appended at the end) and each ordered sequence of the 30 natural numbers (with 31 appended at the end) can be translated in a 10 dice roll. So there’s a one-to-one correspondence between dice rolls and the ordered sequences of the 9 numbers extracted.
It’s important to notice that dice rolls with the same points for the 10 dice but with different distributions would be represented by different combinations of the symbols.
For instance the 10 dice roll {3,5,2,2,5,4,1,1,2,6} is associated to the symbol sequence {3,8,10,12,17,21,22,23,25,31}.
If we invert die 1 and die 2 we have the (different) 10 dice roll {5,3,2,2,5,4,1,1,2,6} that is associated to a different symbol sequence {5,8,10,12,17,21,22,23,25,31} (that is a different combination of the symbols).
This must be so as, to calculate the probability of a given sum, we must consider the two dice rolls as different dice rolls (remember the case of two dice with sum 9 discussed in the previous section “Origin of the problem: a simple case” ) even if the numbers produced by the ten dice are the same.

This dual view seems simpler than the one presented in the main section of this article: here the fact that each die contribute to the sum with at least 1 point needs no artificial fix and is naturally addressed by the fact that two consecutive natural numbers in the symbol sequence always differ by at least 1 unit.
Anyway this dual view leaves open the problem of the outlaws dice (the ones with more than 6 points) in the dice rolls.
For example the symbol sequence {1,3,12,15,16,18,20,21,29,31} would translate in a prohibited dice roll {1,2,9,3,1,2,2,1,8,2} with 2 outlaw dice.
A prohibited dice rolls occur whenever in the sequence of the 9 natural numbers (with the number 31 appended at the end) the first number or the difference between 2 consecutive numbers is greater than 6.

But to find the right way to remove the prohibited cases in this new scenario is another story…

 

References


Mathworld – The dice roll problem and the general formula: http://mathworld.wolfram.com/Dice.html
Counting and Probability (Johan Claeys): http://home.scarlet.be/math/tel.htm
Notes on Probability and Statistics (Andrew Forrester, pdf): http://aforrester.bol.ucla.edu/educate/Math/Notes_Probability.pdf
Mathisfun – Combinations and Permutations: http://www.mathsisfun.com/combinatorics/combinations-permutations.html
Wikipedia – Permutation: http://en.wikipedia.org/wiki/Permutation
Wikipedia – Combination: http://en.wikipedia.org/wiki/Combination
Wikipedia – Multiset: http://en.wikipedia.org/wiki/Multiset
Wikipedia – Floor and ceiling functions: http://en.wikipedia.org/wiki/Floor_and_ceiling _functions
Wikipedia – Inclusion-exclusion principle: http://en.wikipedia.org/wiki/Inclusion–exclusion_principle
Wikipedia – Partition (number theory): http://en.wikipedia.org/wiki/Partition_(number_theory)

Last edited 08 mar 2015