6 Running Various Simulations
A single series of 100 games may not be enough to fully understand the probability behavior of Game-A. So let’s see how to run various simulations, all of them involving playing 100 games.
Let’s start small. Meaning, let’s begin with a handful of games, and a few simulations. This will allow us to get a better idea of the objects and commands we need to use in order to later generalize things with more games and simulations.
Our starting point will involve 3 simulations, each one consisting of 5 games as illustrated in the picture below.
Here’s some naive code to run these simulations. We are momentarily considering this piece of code to get a high-level intuition of the things we might need later to write more efficient commands.
# -------------------------------------
# three simulations, each of 5 games
# -------------------------------------
set.seed(753)
# main inputs
= 1:6
die = 5
number_games
# simulation 1
= matrix(0, nrow = number_games, ncol = 4)
games_sim1 for (game in 1:number_games) {
= sample(die, size = 4, replace = TRUE)
games_sim1[game, ]
}= apply(games_sim1, 1, function(x) any(x == 6))
wins_sim1 = sum(wins_sim1) / number_games
prop_wins_sim1
# simulation 2
= matrix(0, nrow = number_games, ncol = 4)
games_sim2 for (game in 1:number_games) {
= sample(die, size = 4, replace = TRUE)
games_sim2[game, ]
}= apply(games_sim2, 1, function(x) any(x == 6))
wins_sim2 = sum(wins_sim2) / number_games
prop_wins_sim2
# simulation 3
= matrix(0, nrow = number_games, ncol = 4)
games_sim3 for (game in 1:number_games) {
= sample(die, size = 4, replace = TRUE)
games_sim3[game, ]
}= apply(games_sim3, 1, function(x) any(x == 6))
wins_sim3 = sum(wins_sim3) / number_games
prop_wins_sim3
# summary: proportion of wins
c('sim1' = prop_wins_sim1,
'sim2' = prop_wins_sim2,
'sim3' = prop_wins_sim3)
## sim1 sim2 sim3
## 0.4 0.2 0.8
This piece of code has a lot of unnecessary redundancy. But let’s ignore this
fact for a second. In each simulation, we create a matrix games_sim
to store
the rolls of each game. Then we use apply()
to determine which games are wins
(wins_sim
), and finally we calculate the proportion of wins (prop_wins_sim
).
If you inspect the vectors wins_sim1
, wins_sim2
, and wins_sim3
, you’ll
see that these are of "logical"
data-type. TRUE
means that a given game
is a win, whereas FALSE
means lose.
Alternatively, instead of dealing with the logical vectors wins_sim
, we
can convert their values into numbers: 1
instead of TRUE
, and -1
instead
of FALSE
. Why do we need this change from logical to numbers? Quick
answer, you don’t really need it. But as we’ll see, working with 1
and -1
is more convenient later down the road when we calculate cumulative gains.
The diagram below depicts this idea of storing the end result of each game
into 1
(win) or -1
(lose), for every simulation:
6.1 Embedded for()
loops
Let’s take the preceding naive code and try to make it more compact. We are
going to use embedded for()
loops: one of them to take care of the simulations,
and the other one to take care of the games.
The idea is to iterate through each simulation: 1, 2, and 3. And then, in a given simulation, we iterate through each game: 1, 2, 3, 4, and 5.
set.seed(753)
# main inputs
= 1:6
die = 5
number_games = 3
number_sims
# matrix to store simulation outputs
= matrix(0, nrow = number_games, ncol = number_sims)
simulations
# 1st loop) iterate through each simulation
for (sim in 1:number_sims) {
= matrix(0, nrow = number_games, ncol = 4)
games
# 2nd loop) iterate through each game
for (game in 1:number_games) {
= sample(die, size = 4, replace = TRUE)
games[game, ]
}= apply(games, 1, function(x) any(x == 6))
any_sixes = ifelse(any_sixes, 1, -1)
simulations[ ,sim]
}
rownames(simulations) = paste0("game", 1:number_games)
colnames(simulations) = paste0("sim", 1:number_sims)
simulations
## sim1 sim2 sim3
## game1 1 -1 -1
## game2 1 1 1
## game3 -1 -1 1
## game4 -1 -1 1
## game5 -1 -1 1
What’s going on with this code? Before starting the iterations, we create a
matrix simulations
which will store the numeric values of all games, and
all simulations. The rows of this matrix are associated to the games, and
the columns are associated to the simulations. At the end of the iterative
process, the cells of this matrix will be populated with either 1
(win game)
and -1
(lose game).
The first for()
loop—the outer loop—iterates through the simulations.
At the beginning of each simulation, an auxiliary games
matrix is initialized,
which then gets populated in the embedded for()
loop. This second for()
loop—the inner loop—iterates through the games, obtaining the rolls of
the five games. Right after the games are completed in the inner loop, we use
apply()
to determine which games are wins (i.e. have at least one 6), and
then we convert the TRUE
’s and FALSE
’s into 1
and -1
, respectively, by
using ifelse()
. This numeric vector is stored in the corresponding column of
the simulations
matrix.
Having obtained the matrix simulations
, we then proceed to count the number
of wins total_wins
(see code below). Notice the use of apply()
to count the
number of wins for each column of simulations
. The last step involves
computing the proportion of wins prop_wins
:
= apply(simulations, 2, function(x) sum(x==1))
total_wins = total_wins / number_games
prop_wins prop_wins
## sim1 sim2 sim3
## 0.4 0.2 0.8