2 Coding Game-A
In the preceding chapter we described Game-A, and we were able to derive the probability of winning this game:
\[\begin{align*} Prob(\text{at least one six in 4 rolls}) &= 1 - Prob(\text{no six in 4 rolls}) \\ &= 1 - \left( \frac{5}{6} \right)^4 \\ &= 0.517747 \end{align*}\]
In this example, we are able to find the exact probability associated to the event of “winning Game-A”. However, not all probability problems have an analytical solution. When this is the case, we often rely on computers to implement simulations that allow us to find an approximate solution, and have a better understanding of the systematic long-term patterns that arise in random processes.
To go ahead with our exploration of simulations, let us pretend that the probability of winning Game-A had no analytical solution. Instead, let’s see how to use R for running various simulations to get an idea of the probabilities involved behind Game-A.
2.1 Rolling one die
Because Game-A involves rolling a die four times, the first step is to create—in a computational sense—an object die, and learn how to roll it.
Perhaps the most straightforward way to create a die
object is with a numeric
vector. One option for this is with a numeric sequence as follows:
= 1:6
die
die
## [1] 1 2 3 4 5 6
Once we have an object die
, how do we roll it? Rolling a die, computationally
speaking, implies getting a sample of size 1 from the elements in vector die
.
To do this in R, we can use the function sample()
. This function takes an
input vector and draws a random sample—of a given size—from the vector’s
elements.
# random seed (for reproducibility purposes)
set.seed(5)
sample(die, size = 1)
## [1] 2
Because sample()
is a function that generates random numbers, every time
you invoke it you will get a different output.
# another "roll"
sample(die, size = 1)
## [1] 3
# one more "roll"
sample(die, size = 1)
## [1] 1
To be able to reproduce results with any random generator function, you can use
set.seed()
to set the so-called random seed that the algorithms behind
sample()
and friends employ to generate those values. By setting this seed,
you can obtain the same reproducible output:
# random seed (for reproducibility purposes)
set.seed(5)
sample(die, size = 1)
## [1] 2
2.2 Rolling dice with sample()
What about rolling a pair of dice? We could repeat the sample()
command
twice:
set.seed(5)
= sample(die, size = 1)
roll1 = sample(die, size = 1) roll2
To avoid repeating the sample()
command multiple times we can also change the
value of the size
argument in sample()
:
set.seed(5)
sample(die, size = 2)
## [1] 2 3
The issue, not evident here, is that sample()
–by default—draws samples
without replacement. This means that if we want to get a sample of size
six, we would get a simple rearrangement of the elements in die
, which may
not be what we really want:
set.seed(5)
sample(die, size = 6)
## [1] 2 3 1 5 4 6
Even worse, for bigger samples of size 7 or greater, we would run into some limitations:
set.seed(5)
sample(die, size = 7)
## Error in sample.int(length(x), size, replace, prob): cannot take a sample larger than the population when 'replace = FALSE'
So, how do we draw samples with replacement? To draw samples with
replacement we need to use the replace
argument as follows:
set.seed(5)
sample(die, size = 2, replace = TRUE)
## [1] 2 3
2.3 Rolling four dice
So far, so good. We have a mechanism to simulate rolling a die four times:
set.seed(20)
= sample(die, size = 4, replace = TRUE)
four_rolls four_rolls
## [1] 6 3 2 1
With an output vector of four rolls, the next step involves determining if
any of the numbers is a six. To do this, we use a comparison to see which
elements in four_rolls
are equal to 6:
== 6 four_rolls
## [1] TRUE FALSE FALSE FALSE
This comparison returns a logical vector, which we can then pass to sum()
in order to count the number of TRUE
values:
= sum(four_rolls == 6)
count_sixes count_sixes
## [1] 1
Because all we care about is knowing if there’s at least one six, we can also
use the any()
function; here are a couple of examples of what this function
does when testing equality to 6:
any(c(6, 3, 2, 1) == 6)
## [1] TRUE
any(c(5, 3, 2, 1) == 6)
## [1] FALSE
any(c(4, 6, 2, 6) == 6)
## [1] TRUE
If any()
returns TRUE
, we win that game. If it returns FALSE
, we lose.