4 Vectorized loops with apply()
We finished the last chapter by writing code that simulates playing Game-A 10 times. As a recap, that code is displayed below:
set.seed(133)
die = 1:6
number_games = 10

# initialize output matrix (to be populated in for-loop)
games = matrix(0, nrow = number_games, ncol = 4)

for (game in 1:number_games) {
  games[game, ] = sample(die, size = 4, replace = TRUE)
}
rownames(games) = paste0("game", 1:number_games)
colnames(games) = paste0("roll", 1:4)
games
The key part of this code is the for() loop, which stores the outcome of each game in the corresponding row of the games matrix.
We also wrote a second for() loop to determine whether each game (that is, each row in games) had at least one six. This was done with the any() function, as depicted in the following diagram:

Figure 4.1: Diagram depicting the application of any() to all the rows of matrix games.
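For reference, that second loop might look like the following minimal sketch. The previous chapter's exact code is not reproduced here, so the names toy_games and has_six, as well as the hard-coded rolls, are assumptions used only for illustration.

```r
# hypothetical 3-game matrix, hard-coded so the result does not depend on sampling
toy_games = matrix(
  c(1, 6, 4, 1,
    2, 3, 5, 2,
    6, 6, 1, 3),
  nrow = 3, ncol = 4, byrow = TRUE)

# logical vector to be populated in the loop (assumed name)
has_six = logical(nrow(toy_games))

for (game in 1:nrow(toy_games)) {
  # TRUE if at least one roll in this game is a six
  has_six[game] = any(toy_games[game, ] == 6)
}

has_six
## [1]  TRUE FALSE  TRUE
```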
4.1 Function apply()
Instead of writing a loop to see which games are wins, and which games are losses, we can take advantage of a very interesting function called apply(), which R users refer to as a vectorized loop function.
As the name indicates, apply() lets you apply a function to the elements of a matrix. The elements of a matrix can be:
its rows: MARGIN = 1
its columns: MARGIN = 2
both rows and columns (that is, each individual cell): MARGIN = c(1, 2)
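To make the three MARGIN values concrete, here is a small sketch on a hand-made matrix. The matrix m and its values are hypothetical and chosen only so that the results are easy to check by hand.

```r
# small hand-made matrix (hypothetical values), filled column by column:
# row 1 is 1, 3, 5 and row 2 is 2, 4, 6
m = matrix(1:6, nrow = 2, ncol = 3)

# MARGIN = 1: one result per row
apply(X = m, MARGIN = 1, FUN = sum)
## [1]  9 12

# MARGIN = 2: one result per column
apply(X = m, MARGIN = 2, FUN = sum)
## [1]  3  7 11

# MARGIN = c(1, 2): the function is applied to each cell,
# so the result has the same dimensions as the input
apply(X = m, MARGIN = c(1, 2), FUN = function(x) x^2)
##      [,1] [,2] [,3]
## [1,]    1    9   25
## [2,]    4   16   36
```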
For example, say you want to get the sum() of all the elements in each row of games. Here’s how to do that with apply():
# row sum
apply(X = games, MARGIN = 1, FUN = sum)
## game1 game2 game3 game4 game5 game6 game7 game8 game9 game10
## 12 13 16 15 7 13 13 8 10 17
We pass three inputs to apply(). The first argument, X, is the input matrix; the second, MARGIN, specifies whether to operate on rows or columns; and the third, FUN, is the function to be applied. MARGIN = 1 means that the function FUN is applied row by row.
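One way to see what "applied row by row" means is to compare apply() against an explicit loop. The following is a minimal sketch on a hypothetical hard-coded matrix; the names toy_games, loop_sums and apply_sums are assumptions made for illustration.

```r
# hypothetical 3-game matrix (hard-coded values)
toy_games = matrix(
  c(1, 6, 4, 1,
    2, 3, 5, 2,
    6, 6, 1, 3),
  nrow = 3, ncol = 4, byrow = TRUE)

# explicit loop: compute one sum per row
loop_sums = numeric(nrow(toy_games))
for (i in 1:nrow(toy_games)) {
  loop_sums[i] = sum(toy_games[i, ])
}

# same computation expressed with apply()
apply_sums = apply(X = toy_games, MARGIN = 1, FUN = sum)

loop_sums
## [1] 12 12 16

all.equal(loop_sums, apply_sums)
## [1] TRUE
```

For the specific case of row sums, base R also provides rowSums(), which gives the same values here.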
Here’s another example. Say you want to obtain the product of all the elements in each column of games. This requires specifying MARGIN = 2 and FUN = prod:
# column product
apply(X = games, MARGIN = 2, FUN = prod)
## roll1 roll2 roll3 roll4
## 38400 19440 5760 720
Or what if you want to get the minimum in each row of games? All you have to do is apply() the min function:
# row minimum
apply(X = games, MARGIN = 1, FUN = min)
## game1 game2 game3 game4 game5 game6 game7 game8 game9 game10
## 1 1 2 1 1 2 1 1 1 1
4.2 Anonymous functions and apply()
Sometimes, there is no built-in function to be used for the argument FUN. For instance, say you want to obtain the range of each row, that is, the maximum minus the minimum. R has a range() function, but it does not return a single value; it returns the min() and the max() of an input vector:
game_1 = c(1, 6, 4, 1)
range(game_1)
## [1] 1 6
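As a side note, range() can still be passed to apply(). Because it returns two values for each row, the result is a matrix rather than a vector, with one column per row of the input. A small sketch, assuming a hypothetical hard-coded matrix toy_games:

```r
# hypothetical 3-game matrix (hard-coded values)
toy_games = matrix(
  c(1, 6, 4, 1,
    2, 3, 5, 2,
    6, 6, 1, 3),
  nrow = 3, ncol = 4, byrow = TRUE)

# range() returns c(min, max), so each game contributes one column
ranges = apply(X = toy_games, MARGIN = 1, FUN = range)
ranges
##      [,1] [,2] [,3]
## [1,]    1    2    1
## [2,]    6    5    6

dim(ranges)
## [1] 2 3
```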
If you want the range as a single value, you need to compute the max() minus the min():
game_1 = c(1, 6, 4, 1)
range_1 = max(game_1) - min(game_1)
range_1
## [1] 5
Because R does not have a built-in function that returns the range as a single value, we need to write this function ourselves and provide it to the FUN argument of apply(). When the function to be provided is fairly simple, we can create an anonymous function inside apply(). Here’s how we do it:
# row ranges (with anonymous function)
apply(
  X = games,
  MARGIN = 1,
  FUN = function(x) max(x) - min(x))
## game1 game2 game3 game4 game5 game6 game7 game8 game9 game10
## 5 5 3 5 3 2 5 3 5 5
The function provided to the argument FUN is called an anonymous function because it is created without being assigned a name.
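The same pattern works with other function bodies. One possible alternative, sketched below on a hypothetical hard-coded matrix toy_games, combines range() with diff() to obtain the same max-minus-min value:

```r
# hypothetical 3-game matrix (hard-coded values)
toy_games = matrix(
  c(1, 6, 4, 1,
    2, 3, 5, 2,
    6, 6, 1, 3),
  nrow = 3, ncol = 4, byrow = TRUE)

# anonymous function: maximum minus minimum
apply(X = toy_games, MARGIN = 1, FUN = function(x) max(x) - min(x))
## [1] 5 3 5

# alternative body: difference between the two values returned by range()
apply(X = toy_games, MARGIN = 1, FUN = function(x) diff(range(x)))
## [1] 5 3 5
```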
An alternative option is to first create a function outside apply(), and then pass this function like any other function. This alternative is often preferred when the body of the function to be passed to apply() involves several lines of code.
For example, in the following code chunk we create a function vector_range(), which computes the statistical range, and then we pass this function to apply() in order to get the range of each row of the matrix games:
# auxiliary function to compute range
vector_range = function(x) {
  max(x) - min(x)
}
# row ranges
apply(
  X = games,
  MARGIN = 1,
  FUN = vector_range)
## game1 game2 game3 game4 game5 game6 game7 game8 game9 game10
## 5 5 3 5 3 2 5 3 5 5
4.2.1 Number of wins with apply()
Let’s go back to the task of finding which games are wins.
Because there’s no built-in function that checks whether any element of a vector is equal to six, we need to create an anonymous function, based on any(), for the FUN argument:
wins = apply(
  X = games,
  MARGIN = 1,
  FUN = function(x) any(x == 6))
wins
## game1 game2 game3 game4 game5 game6 game7 game8 game9 game10
## TRUE TRUE FALSE TRUE FALSE FALSE TRUE FALSE TRUE TRUE
We can now compute the proportion of wins:
prop_wins = sum(wins) / number_games
prop_wins
## [1] 0.6
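Because TRUE counts as 1 and FALSE as 0 in arithmetic, the same proportion can also be obtained with mean(). The sketch below hard-codes the wins vector printed above so that it runs on its own; it is an illustration, not a replacement for the simulation code.

```r
# wins vector copied from the output shown above (hard-coded for illustration)
wins = c(TRUE, TRUE, FALSE, TRUE, FALSE, FALSE, TRUE, FALSE, TRUE, TRUE)

# proportion of wins as a sum divided by the number of games
sum(wins) / length(wins)
## [1] 0.6

# equivalent: the mean of a logical vector is the proportion of TRUE values
mean(wins)
## [1] 0.6
```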