26 Loops
The majority of functions that work with vectors are vectorized. Remember that vectorized operations are calculations that are applied to all the elements in a vector (element-wise operations).
In order to learn about loops and iterations, it’s good to forget about
vectorized operations in R. This means that we will be writing code,
using some sort of loop structure, to perform tasks for which there is
already a vectorized implementation. For example, in this chapter you will have
to write code with various types of loops to calculate the mean of a numeric
vector. This can easily be done using the function mean(). But we don’t
want you to use mean(). We want you to think about control-flow structures,
which are essential in any programming activity.
- Many times we need to perform a procedure several times
- We perform the same operation several times as long as some condition is fulfilled
- For this purpose we use loops
- The main idea is that of iteration
- R provides three basic loop constructs: for, repeat, and while
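To make the three constructs concrete, here is a minimal sketch that computes the same sum (1 + 2 + ... + 5) with each of them. The variable names (n, total_for, total_while, total_repeat) are made up for this illustration.

```r
# Sum the integers 1 to n with each of the three loop constructs.
n <- 5

# for: the number of iterations is known in advance
total_for <- 0
for (i in 1:n) {
  total_for <- total_for + i
}

# while: keep iterating as long as the condition holds
total_while <- 0
i <- 1
while (i <= n) {
  total_while <- total_while + i
  i <- i + 1
}

# repeat: loop indefinitely until break is reached
total_repeat <- 0
i <- 1
repeat {
  total_repeat <- total_repeat + i
  i <- i + 1
  if (i > n) break
}

c(total_for, total_while, total_repeat)
```

All three loops produce the same total; they differ only in how the stopping rule is expressed.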
26.1 Motivation Example
Consider a numeric vector with prices of five items:
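A sketch of such a vector, assuming the values are the ones that appear in the printed output of this section. Only four prices are shown there, so only those four are used here.

```r
# Prices taken from the output displayed in this section
# (only four values are shown, so only four are included)
prices <- c(2.50, 2.95, 3.45, 3.25)
prices
```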
26.1.1 Printing prices “manually”
Say you are interested in printing each price individually. You can manually display them one by one, by typing the same command several times:
cat("Price 1 is", prices[1], "\n")
cat("Price 2 is", prices[2], "\n")
cat("Price 3 is", prices[3], "\n")
cat("Price 4 is", prices[4], "\n")
#> Price 1 is 2.5
#> Price 2 is 2.95
#> Price 3 is 3.45
#> Price 4 is 3.25
26.1.2 Printing prices with a for loop
Or you can use a loop structure in which you tell the computer to display the prices a given number of times, but using one command instead of typing it various times:
for (i in 1:4) {
  cat("Price", i, "is", prices[i], "\n")
}
#> Price 1 is 2.5
#> Price 2 is 2.95
#> Price 3 is 3.45
#> Price 4 is 3.25
Let’s make it less simple by creating a vector of prices with the names of the associated coffees:
coffee_prices <- c(
  expresso = 2.50,
  latte = 2.95,
  mocha = 3.45,
  cappuccino = 3.25)
coffee_prices
#>   expresso      latte      mocha cappuccino
#>       2.50       2.95       3.45       3.25
Without using a loop, you can display the prices one by one via cat()
(this, of course, involves a lot of repetition):
cat("Expresso has a price of", coffee_prices[1], "\n")
cat("Latte has a price of", coffee_prices[2], "\n")
cat("Mocha has a price of", coffee_prices[3], "\n")
cat("Cappuccino has a price of", coffee_prices[4], "\n")
#> Expresso has a price of 2.5
#> Latte has a price of 2.95
#> Mocha has a price of 3.45
#> Cappuccino has a price of 3.25
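The repetition can be avoided with a for loop. The following is one possible sketch, assuming we iterate over the names of the vector; the iterator name coffee is chosen just for this illustration. Note that the names are printed as stored (lower case).

```r
coffee_prices <- c(
  expresso = 2.50,
  latte = 2.95,
  mocha = 3.45,
  cappuccino = 3.25)

# iterate over the names, and use each name to select its price
for (coffee in names(coffee_prices)) {
  cat(coffee, "has a price of", coffee_prices[coffee], "\n")
}
```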
26.2 For loops
Let’s start with a super simple example. Consider a vector vec <- c(3, 1, 4).
And suppose you want to add 1 to every element of vec. You know that this
can easily be achieved using vectorized code:
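A minimal sketch of the vectorized version, assuming vec is defined as in the text:

```r
vec <- c(3, 1, 4)

# vectorized addition: 1 is added to every element at once
vec + 1
```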
In order to learn about loops, I’m going to ask you to forget about the notion of vectorized code in R. That is, pretend that R does not have vectorized functions.
Think about what you would manually need to do in order to add 1 to the elements
in vec. This addition would involve taking the first element in vec and
adding 1, then taking the second element in vec and adding 1, and finally taking
the third element in vec and adding 1, something like this:
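A sketch of those three manual steps, with vec redefined so the snippet runs on its own:

```r
vec <- c(3, 1, 4)

vec[1] + 1   # first element plus 1
vec[2] + 1   # second element plus 1
vec[3] + 1   # third element plus 1
```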
The code above does the job. From a purely arithmetic standpoint, the three
lines of code reflect the operation that you would need to carry out to add
1 to all the elements in vec.
From a programming point of view, you are performing the same type of operation
three times: selecting an element in vec and adding 1 to it. But there’s
a lot of (unnecessary) repetition.
This is where loops come in very handy. Here’s how to use a for() loop
to add 1 to each element in vec:
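A minimal sketch of such a loop, assuming the iterator is called j and that each result is displayed with print():

```r
vec <- c(3, 1, 4)

for (j in 1:3) {
  # select the j-th element, add 1, and display the result
  print(vec[j] + 1)
}
```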
In the code above we are taking each vec element vec[j], adding 1 to it,
and printing the outcome with print() so you can visualize the additions
at each iteration of the loop.
Your turn: rewrite the for loop in order to triple every element in vec,
and print the output at each step of the loop:
vec <- c(3, 1, 4)  # Change this value!
for (j in c()) {   # Replace c() with an appropriate sequence.
  # Fill in.
}
What if you want to create a vector vec2, in which you store the values
produced at each iteration of the loop? Here’s one possibility:
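One way to sketch this, assuming the values to store are the "add 1" results from the earlier loop, is to create vec2 with the right length before the loop and then fill it position by position:

```r
vec <- c(3, 1, 4)

# vector of zeros with the same length as vec, to be filled in the loop
vec2 <- rep(0, length(vec))

for (j in seq_along(vec)) {
  vec2[j] <- vec[j] + 1
}

vec2
```

Creating vec2 up front is a design choice: the loop only overwrites existing positions instead of growing the vector at every iteration.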
26.3 For Loops
- Often we want to repeatedly carry out some computation a fixed number of times.
- For instance, repeat an operation for each element of a vector.
- In R this can be done with a for loop.
- for loops are used when we know exactly how many times we want the code to repeat.
The anatomy of a for loop is as follows:
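A minimal sketch of that structure. The names iterator and times are placeholders chosen for illustration; they are not R keywords.

```r
times <- 1:3   # the vector to iterate through

for (iterator in times) {
  # body of the loop: this code runs once per element of `times`,
  # and `iterator` takes the value of the current element
  print(iterator)
}
```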
for() takes an iterator variable and a vector of times to iterate through.
value <- 2
for (i in 1:5) {
  value <- value * 2
  print(value)
}
#> [1] 4
#> [1] 8
#> [1] 16
#> [1] 32
#> [1] 64
The vector of times does NOT have to be a numeric vector; it can be any vector:
value <- 2
times <- c('one', 'two', 'three', 'four')
for (i in times) {
  value <- value * 2
  print(value)
}
#> [1] 4
#> [1] 8
#> [1] 16
#> [1] 32
However, if the iterator is used inside the loop in a numerical computation, then the vector of times will almost always be a numeric vector:
set.seed(4321)
numbers <- rnorm(5)
for (h in 1:length(numbers)) {
  if (numbers[h] < 0) {
    value <- sqrt(-numbers[h])
  } else {
    value <- sqrt(numbers[h])
  }
  print(value)
}
#> [1] 0.653
#> [1] 0.473
#> [1] 0.847
#> [1] 0.917
#> [1] 0.358
26.3.1 For Loops and Next statement
Sometimes we need to skip a loop iteration if a given condition is met. This can be done with a next statement.
Example:
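A small sketch of next in action, assuming we want to skip the even numbers in a vector and keep track of the values that were not skipped (the names x and kept are made up for this illustration):

```r
x <- 1:6
kept <- c()

for (i in x) {
  if (i %% 2 == 0) {
    next   # skip the rest of this iteration when i is even
  }
  kept <- c(kept, i)
  print(i)
}

kept
```

When next is reached, the remaining code in the body is not run for that iteration, and the loop moves on to the following element.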
26.3.2 Nested Loops
It is common to have nested loops, that is, a loop placed inside the body of another loop.
Example: Nested loops
# some matrix
A <- matrix(1:12, nrow = 3, ncol = 4)
A
#>      [,1] [,2] [,3] [,4]
#> [1,]    1    4    7   10
#> [2,]    2    5    8   11
#> [3,]    3    6    9   12
Example: Nested Loops
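A minimal sketch of two nested loops over the matrix A defined above, assuming the goal is simply to visit every cell; here the cells are added up in a variable called total, a name chosen for this illustration:

```r
A <- matrix(1:12, nrow = 3, ncol = 4)

total <- 0
for (i in 1:nrow(A)) {       # outer loop: rows
  for (j in 1:ncol(A)) {     # inner loop: columns
    total <- total + A[i, j]
  }
}

total
```

For each value of i, the inner loop runs through all the columns, so the body is executed nrow(A) * ncol(A) times.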
26.3.3 About for Loops and Vectorized Computations
- R loops have a bad reputation for being slow.
- Experienced users will tell you: “tend to avoid for loops in R” (me included).
- It is not really that the loops are slow; the slowness has more to do with the way R handles the boxing and unboxing of data objects, which may be a bit inefficient.
- R provides a family of functions that are usually more efficient than loops (e.g. the apply() functions).
- For this course, especially if you have NO programming experience, you should ignore any advice about avoiding loops in R.
- You should learn how to write loops, and understand how they work; every programming language provides some type of loop structure.
- In practice, many (programming) problems can be tackled using some loop structure.
- When using R, you may need to start solving a problem using a loop. Once you have solved it, try to see if you can find a vectorized alternative.
- It takes practice and experience to find alternative solutions to for loops.
- There are cases when using for loops is not that bad.
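As a small illustration of the "loop first, vectorize later" workflow described above, the following sketch squares the elements of a vector in three ways. The vector x and the object names are made up for this example.

```r
x <- c(2, 4, 6, 8)

# 1) first solution: an explicit for loop
squares_loop <- numeric(length(x))
for (i in seq_along(x)) {
  squares_loop[i] <- x[i]^2
}

# 2) vectorized alternative
squares_vec <- x^2

# 3) alternative with a function from the apply family
squares_apply <- sapply(x, function(v) v^2)

# all three approaches give the same values
all.equal(squares_loop, squares_vec)
all.equal(squares_loop, squares_apply)
```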
26.4 Practice Examples
Below are a bunch of practice examples.
Your Turn: Summation Series
Write a for loop to compute the following two series. Your loop should start at step \(k=0\) and stop at step \(n\). Store the k-th term at each iteration, and test your code with different values for \(n\). Does the series converge as \(n\) increases?
\[ \sum_{k=0}^{n} \frac{1}{2^k} = 1 + \frac{1}{2} + \frac{1}{4} + \frac{1}{8} + \dots + \frac{1}{2^n} \]
\[ \sum_{k=0}^{n} \frac{1}{9^k} =1 + \frac{1}{9} + \frac{1}{81} + \dots + \frac{1}{9^n} \]
Your Turn: Arithmetic Series
Write a for loop to compute the following arithmetic series \(a_n = a_1 + (n-1)d\) when \(a_1 = 3\), and \(d = 3\). For instance: \(3 + 6 + 9 + 12 + 15 + \dots\).
\[ a_n = a_1 + (n-1)d \]
Store the n-th term at each iteration, and test your code with different values for \(n\). Does the series converge as \(n\) increases?
Your Turn: Geometric Sequence
A sequence such as \(3, 6, 12, 24, 48\) is an example of a geometric sequence. In this type of sequence, the \(n\)-th term is obtained as:
\[ a_n = a_1 \times r^{n-1} \]
where: \(a_1\) is the first term, \(r\) is the common ratio, and \(n\) is the number of terms.
Write a for loop to compute the sum of the first \(n\) terms of: \(3 + 6 + 12 + 24 + \dots\). Test your code with different values for \(n\). Does the series converge as \(n\) increases?
Your Turn: Sine Approximation
Consider the following series that is used to approximate the function \(\sin(x)\):
\[ \sin(x) \approx x - \frac{x^3}{3!} + \frac{x^5}{5!} - \frac{x^7}{7!} + \dots \]
Write a for loop to approximate \(\sin(x)\). Try different numbers of terms,
\(n = 5, 10, 50, 100\). Compare the result of your loop with the sin() function.
26.5 For loop with a matrix
Consider the following matrix A:
A <- matrix(1:20, nrow = 5, ncol = 4)
A
#>      [,1] [,2] [,3] [,4]
#> [1,]    1    6   11   16
#> [2,]    2    7   12   17
#> [3,]    3    8   13   18
#> [4,]    4    9   14   19
#> [5,]    5   10   15   20
Say we want to add 1 to all elements in row 1, add 2 to all elements in
row 2, add 3 to all elements in row 3, and so on. To do this without using
vectorized code, you need to work with two nested for() loops. One loop will
control how you traverse the matrix by rows, the other loop will control how
you traverse the matrix by columns. Here’s how:
# empty matrix B
B <- matrix(NA, nrow = 5, ncol = 4)
# for loop to get matrix B
for (i in 1:nrow(A)) {
  for (j in 1:ncol(A)) {
    B[i,j] <- A[i,j] + i
  }
}
B
#>      [,1] [,2] [,3] [,4]
#> [1,]    2    7   12   17
#> [2,]    4    9   14   19
#> [3,]    6   11   16   21
#> [4,]    8   13   18   23
#> [5,]   10   15   20   25
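Once the loop version works, it can be checked against a vectorized alternative. The sketch below assumes the base function row(), which returns a matrix of row indices with the same dimensions as its argument; the name B_vec is made up for this illustration.

```r
A <- matrix(1:20, nrow = 5, ncol = 4)

# loop version (same as above)
B <- matrix(NA, nrow = 5, ncol = 4)
for (i in 1:nrow(A)) {
  for (j in 1:ncol(A)) {
    B[i, j] <- A[i, j] + i
  }
}

# vectorized version: add to each cell the index of its row
B_vec <- A + row(A)

# both approaches should agree in every cell
all(B == B_vec)
```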
Your Turn
Consider the following matrix X:
set.seed(123)
X <- matrix(rnorm(12), nrow = 4, ncol = 3)
X
#>         [,1]   [,2]   [,3]
#> [1,] -0.5605  0.129 -0.687
#> [2,] -0.2302  1.715 -0.446
#> [3,]  1.5587  0.461  1.224
#> [4,]  0.0705 -1.265  0.360
Write code in R, using loops, to get a matrix Y such that the negative
numbers in X are transformed into squared values, while the positive
numbers in X are transformed into square root values.
26.6 Dividing a number by 2 multiple times
The following examples involve dividing a number by 2 until it becomes odd.
Using a repeat loop
# Divide a number by 2 until it becomes odd.
val_rep <- 898128000 # Change this value!
repeat {
  print(val_rep)
  if (val_rep %% 2 == 1) { # If val_rep is odd,
    break                  # end the loop.
  }
  val_rep <- val_rep / 2   # Divide val_rep by 2 since val_rep was even.
  # When the end of the loop is reached, return to the beginning of the loop.
}
#> [1] 8.98e+08
#> [1] 4.49e+08
#> [1] 2.25e+08
#> [1] 1.12e+08
#> [1] 56133000
#> [1] 28066500
#> [1] 1.4e+07
#> [1] 7016625
Using a while loop
# Divide a number by 2 until it becomes odd.
val_while <- 898128000 # Change this value!
while (val_while %% 2 == 0) { # Continue the loop as long as val_while is even.
  print(val_while)
  val_while <- val_while / 2
}
#> [1] 8.98e+08
#> [1] 4.49e+08
#> [1] 2.25e+08
#> [1] 1.12e+08
#> [1] 56133000
#> [1] 28066500
#> [1] 1.4e+07
print(val_while)
#> [1] 7016625
Make a reduce() function
Now generalize the above code to create a function reduce() which performs
the same operation. (You should change very little.)
Your Turn: Average
The average of \(n\) numbers \(x_1, x_2, \dots, x_n\) is given by the following formula:
\[ \bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i = \frac{x_1 + x_2 + \dots + x_n}{n} \]
Write R code, using each type of loop (i.e. for, while, repeat), to
compute the arithmetic mean of the vector x = 1:100.
Your Turn: Standard Deviation
The sample standard deviation of a list of \(n\) numbers \(x_1, x_2, \dots, x_n\) is given by the following formula:
\[ SD = \sqrt{ \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2 } \]
Write R code, using each type of loop (i.e. for, while, repeat), to
compute the sample standard deviation of the vector x = 1:100.
Your Turn: Geometric Mean
The geometric mean of \(n\) numbers \(x_1, x_2, \dots, x_n\) is given by the following formula:
\[ \bar{x} = \left ( \prod_{i=1}^{n} x_i \right )^{1/n} \]
Write R code, using each type of loop (i.e. for, while, repeat), to
compute the geometric mean of the vector x = 1:50.
Your Turn: Distance Matrix of Letters
The following code generates a random matrix distances with arbitrary
distance values among letters in English:
# random distance matrix
num_letters <- length(LETTERS)
set.seed(123)
values <- sample.int(num_letters)
distances <- values %*% t(values)
diag(distances) <- 0
dimnames(distances) <- list(LETTERS, LETTERS)
The first 5 rows and columns of distances are:
distances[1:5, 1:5]
#>     A   B   C  D   E
#> A   0 285 210 45 150
#> B 285   0 266 57 190
#> C 210 266   0 42 140
#> D  45  57  42  0  30
#> E 150 190 140 30   0
Consider the following character vector vec <- c('E', 'D', 'A'). The idea is
to use the values in matrix distances to compute the total distance between
the letters: that is from E to D, and then from D to A:
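A sketch of this computation. To keep the snippet independent of the random number generator, it hard-codes the 5 x 5 excerpt of distances displayed above in an object called distances5, a name made up for this illustration.

```r
# 5 x 5 excerpt of the distance matrix, copied from the output shown above
ltrs5 <- c("A", "B", "C", "D", "E")
distances5 <- matrix(
  c(  0, 285, 210, 45, 150,
    285,   0, 266, 57, 190,
    210, 266,   0, 42, 140,
     45,  57,  42,  0,  30,
    150, 190, 140, 30,   0),
  nrow = 5, byrow = TRUE,
  dimnames = list(ltrs5, ltrs5)
)

# distance from E to D, plus distance from D to A
total <- distances5["E", "D"] + distances5["D", "A"]
total   # 30 + 45 = 75
```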
Hence, you can say that the letters in the word 'E' 'D' 'A'
have a total distance value of 75 (30 from E to D, plus 45 from D to A).
Your Turn
Write a function get_dist() that takes two inputs:
- distances = the matrix of distances among letters.
- ltrs = a character vector of upper case letters.
The function must return a numeric value with the total distance. Also, include
a stopping condition, via stop(), for when a value in ltrs does not match
any capital letter. The error message should be "Unrecognized character".
Here’s an example of how you should be able to invoke get_dist():
And here’s an example that should raise an error:
Test your function with the following character vectors:
cal <- c('C', 'A', 'L')
stats <- c('S', 'T', 'A', 'T', 'S')
oski <- c('O', 'S', 'K', 'I')
zzz <- rep('Z', 3)
lets <- LETTERS
- a vector first with letters for your first name, e.g. c('G', 'A', 'S', 'T', 'O', 'N')
- a vector last for your last name, e.g. c('S', 'A', 'N', 'C', 'H', 'E', 'Z')
Your turn: Assuming that you already created the objects listed above,
now create an R list strings like this:
# use your own 'first' and 'last' objects
strings <- list(
  cal = cal,
  stats = stats,
  oski = oski,
  zzz = zzz,
  lets = lets,
  first = first,
  last = last
)
Write a for() loop to iterate over the elements in strings, and compute
their distances. At each iteration, store the calculated distances in a list
called strings_dists; this list should have the same names as strings.
What does your list strings_dists look like?