26 Loops

The majority of functions that work with vectors are vectorized. Remember that vectorized operations are calculations that are applied to all the elements in a vector (element-wise operations).

In order to learn about loops and iterations, it’s good to forget about vectorized operations in R. This means that we will be writing code, using some sort of loop structure, to perform tasks for which there is already a vectorized implementation. For example, in this chapter you will have to write code with various types of loops to calculate the mean of a numeric vector. This can easily be done using the function mean(). But we don’t want you to use mean(). We want you to think about control-flow structures, which are essential in any programming activity.

  • Many times we need to perform a procedure several times
  • We perform the same operation several times as long as some condition is fulfilled
  • For this purpose we use loops
  • The main idea is that of iteration
  • R provides three basic paradigms: for, repeat, while
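
As a rough preview of these three constructs, the sketch below counts from 1 to 3 with each of them. The variable names (count_for, count_while, count_repeat) are made up for this illustration; each construct is covered in more detail in the rest of the chapter.

```r
# for: the number of iterations is known in advance
count_for <- 0
for (i in 1:3) {
  count_for <- count_for + 1
}

# while: keep iterating as long as the condition is TRUE
count_while <- 0
while (count_while < 3) {
  count_while <- count_while + 1
}

# repeat: iterate indefinitely until a break is reached
count_repeat <- 0
repeat {
  count_repeat <- count_repeat + 1
  if (count_repeat >= 3) {
    break
  }
}

c(count_for, count_while, count_repeat)
#> [1] 3 3 3
```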

26.1 Motivating Example

Consider a numeric vector with prices of four items:

prices <- c(2.50, 2.95, 3.45, 3.25)
prices
#> [1] 2.50 2.95 3.45 3.25

26.1.1 Printing prices “manually”

Say you are interested in printing each price individually. You can manually display them one by one, by typing the same command several times:

cat("Price 1 is", prices[1], "\n")
cat("Price 2 is", prices[2], "\n")
cat("Price 3 is", prices[3], "\n")
cat("Price 4 is", prices[4], "\n")
#> Price 1 is 2.5
#> Price 2 is 2.95
#> Price 3 is 3.45
#> Price 4 is 3.25

26.1.2 Printing prices with a for loop

Or you can use a loop structure in which you tell the computer to display the prices a given number of times, writing the command once instead of typing it several times:

for (i in 1:4) {
  cat("Price", i, "is", prices[i], "\n")
}
#> Price 1 is 2.5 
#> Price 2 is 2.95 
#> Price 3 is 3.45 
#> Price 4 is 3.25

Let’s make it less simple by creating a vector of prices with the names of the associated coffees:

coffee_prices <- c(
  expresso = 2.50,
  latte = 2.95,
  mocha = 3.45, 
  cappuccino = 3.25)
coffee_prices
#>   expresso      latte      mocha cappuccino 
#>       2.50       2.95       3.45       3.25

Without using a loop, you can display the prices one by one via cat(). This, of course, involves a lot of repetition:

cat("Expresso has a price of", coffee_prices[1], "\n")
cat("Latte has a price of", coffee_prices[2], "\n")
cat("Mocha has a price of", coffee_prices[3], "\n")
cat("Cappuccino has a price of", coffee_prices[4], "\n")
#> Expresso has a price of 2.5
#> Latte has a price of 2.95
#> Mocha has a price of 3.45
#> Cappuccino has a price of 3.25

26.1.3 Printing coffee prices with a for loop

for (i in 1:4) {
  cat(names(coffee_prices)[i], "has a price of", 
      coffee_prices[i], "\n")
}
#> expresso has a price of 2.5 
#> latte has a price of 2.95 
#> mocha has a price of 3.45 
#> cappuccino has a price of 3.25
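
The loops above hard-code the range 1:4. One possible alternative, sketched here under the assumption that every element of the vector has a name, is to iterate directly over the names and use each name to extract its price. The iterator name coffee is made up for this illustration.

```r
coffee_prices <- c(
  expresso = 2.50,
  latte = 2.95,
  mocha = 3.45,
  cappuccino = 3.25)

# iterate over the names instead of the positions 1:4
for (coffee in names(coffee_prices)) {
  # a character index extracts the element with that name
  cat(coffee, "has a price of", coffee_prices[coffee], "\n")
}
#> expresso has a price of 2.5 
#> latte has a price of 2.95 
#> mocha has a price of 3.45 
#> cappuccino has a price of 3.25
```

With this version the loop keeps working if a coffee is added to or removed from the vector.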

26.2 For loops

Let’s start with a super simple example. Consider a vector vec <- c(3, 1, 4). And suppose you want to add 1 to every element of vec. You know that this can easily be achieved using vectorized code:

vec <- c(3, 1, 4) 

vec + 1
#> [1] 4 2 5

In order to learn about loops, I’m going to ask you to forget about the notion of vectorized code in R. That is, pretend that R does not have vectorized functions.

Think about what you would manually need to do in order to add 1 to the elements in vec. This addition would involve taking the first element in vec and adding 1, then taking the second element in vec and adding 1, and finally taking the third element in vec and adding 1, something like this:

vec[1] + 1
vec[2] + 1
vec[3] + 1

The code above does the job. From a purely arithmetic standpoint, the three lines of code reflect the operation that you would need to carry out to add 1 to all the elements in vec.

From a programming point of view, you are performing the same type of operation three times: selecting an element in vec and adding 1 to it. But there’s a lot of (unnecessary) repetition.

This is where loops come in very handy. Here’s how to use a for() loop to add 1 to each element in vec:

vec <- c(3, 1, 4)

for (j in 1:3) {
  print(vec[j] + 1)
}
#> [1] 4
#> [1] 2
#> [1] 5

In the code above we are taking each vec element vec[j], adding 1 to it, and printing the outcome with print() so you can visualize the additions at each iteration of the loop.

Your turn: rewrite the for loop in order to triple every element in vec, and print the output at each step of the loop:

vec <- c(3, 1, 4) # Change this value!

for (j in c()) { # Replace c() with an appropriate sequence.
  # Fill in.
  
}

What if you want to create a vector vec2, in which you store the values produced at each iteration of the loop? Here’s one possibility:

vec <- c(3, 1, 4)  # Change this value!
vec2 <- rep(0, length(vec))  # "empty" vector of zeros to be filled in the loop

for (i in c()) { # Replace c() with an appropriate sequence.
  # Fill in.
}

26.3 For Loops

  • Often we want to repeatedly carry out some computation a fixed number of times.
  • For instance, repeat an operation for each element of a vector.
  • In R this can be done with a for loop.
  • for loops are used when we know exactly how many times we want the code to repeat

The anatomy of a for loop is as follows:

for (iterator in times) { 
  do_something
}

for() takes an iterator variable and a vector of times to iterate through.

value <- 2
for (i in 1:5) { 
  value <- value * 2 
  print(value)
}
#> [1] 4
#> [1] 8
#> [1] 16
#> [1] 32
#> [1] 64

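The vector of times does NOT have to be a numeric vector; it can be any vector. In the following loop the iterator takes each of the four character values in turn, so the body runs four times: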

value <- 2
times <- c('one', 'two', 'three', 'four')
for (i in times) { 
  value <- value * 2 
  print(value)
}
#> [1] 4
#> [1] 8
#> [1] 16
#> [1] 32

However, if the iterator is used inside the loop in a numerical computation, then the vector of times will almost always be a numeric vector:

set.seed(4321)
numbers <- rnorm(5)
for (h in 1:length(numbers)) {
  if (numbers[h] < 0) {
    value <- sqrt(-numbers[h])
  } else {
    value <- sqrt(numbers[h])
  }
  print(value)
}
#> [1] 0.653
#> [1] 0.473
#> [1] 0.847
#> [1] 0.917
#> [1] 0.358
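
A caveat about 1:length(numbers), offered as a side note rather than something the example above depends on: when the vector is empty, 1:0 counts down and yields two values, so the body still runs. The sketch below compares it with seq_along(), which returns no positions for an empty vector. The names empty, runs_colon and runs_seq are made up for this illustration.

```r
empty <- numeric(0)   # a vector with no elements

1:length(empty)       # counts down from 1 to 0
#> [1] 1 0
seq_along(empty)      # no positions at all
#> integer(0)

# count how many times the body of each loop runs
runs_colon <- 0
for (h in 1:length(empty)) {
  runs_colon <- runs_colon + 1
}

runs_seq <- 0
for (h in seq_along(empty)) {
  runs_seq <- runs_seq + 1
}

c(runs_colon, runs_seq)
#> [1] 2 0
```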

26.3.1 For Loops and Next statement

Sometimes we need to skip a loop iteration if a given condition is met. This can be done with a next statement:

for (iterator in times) { 
  expr1
  expr2
  if (condition) {
    next
  }
  expr3
  expr4
}

Example:

x <- 2
for (i in 1:5) {
  y <- x * i
  if (y == 8) {
    next
  }
  print(y)
}
#> [1] 2
#> [1] 4
#> [1] 6
#> [1] 10
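
For contrast, a break statement (used with repeat later in this chapter) ends the whole loop instead of skipping a single iteration. The following sketch reuses the example above with break in place of next; the vector printed is a made-up helper that records what gets displayed.

```r
x <- 2
printed <- c()  # hypothetical vector that records the printed values
for (i in 1:5) {
  y <- x * i
  if (y == 8) {
    break   # leave the loop entirely; iteration 5 never runs
  }
  print(y)
  printed <- c(printed, y)
}
#> [1] 2
#> [1] 4
#> [1] 6
```

With next the value 10 was still printed; with break the loop stops as soon as y equals 8.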

26.3.2 Nested Loops

It is common to have nested loops:

for (iterator1 in times1) { 
  for (iterator2 in times2) {
    expr1
    expr2
    ...
  }
}

Example: Nested loops

# some matrix
A <- matrix(1:12, nrow = 3, ncol = 4)
A
#>      [,1] [,2] [,3] [,4]
#> [1,]    1    4    7   10
#> [2,]    2    5    8   11
#> [3,]    3    6    9   12

The nested loops below visit every cell of A and replace each value less than 6 with its reciprocal:

# reciprocal of values less than 6
for (i in 1:nrow(A)) { 
  for (j in 1:ncol(A)) {
    if (A[i,j] < 6) A[i,j] <- 1 / A[i,j] 
  }
}
A
#>       [,1] [,2] [,3] [,4]
#> [1,] 1.000 0.25    7   10
#> [2,] 0.500 0.20    8   11
#> [3,] 0.333 6.00    9   12

26.3.3 About for Loops and Vectorized Computations

  • R loops have a bad reputation for being slow.

  • Experienced users will tell you: “tend to avoid for loops in R” (me included).

  • It is not really that the loops are slow; the slowness has more to do with the way R handles the boxing and unboxing of data objects, which may be a bit inefficient.

  • R provides a family of functions that are usually more efficient than loops (e.g. the apply() functions).

  • For this course, especially if you have NO programming experience, you should ignore any advice about avoiding loops in R.

  • You should learn how to write loops, and understand how they work; every programming language provides some type of loop structure.

  • In practice, many (programming) problems can be tackled using some loop structure.

  • When using R, you may need to start solving a problem using a loop. Once you have solved it, try to see if you can find a vectorized alternative.

  • It takes practice and experience to find alternative solutions to for loops.

  • There are cases when using for loops is not that bad.
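
To illustrate the step from a loop to a vectorized alternative, here is a minimal sketch based on the nested-loop example with the 3 x 4 matrix. The names A_loop, A_vec and small are made up for this comparison; logical indexing is one possible vectorized approach, not the only one.

```r
A_loop <- matrix(1:12, nrow = 3, ncol = 4)
A_vec <- A_loop

# loop version: visit every cell
for (i in 1:nrow(A_loop)) {
  for (j in 1:ncol(A_loop)) {
    if (A_loop[i, j] < 6) A_loop[i, j] <- 1 / A_loop[i, j]
  }
}

# vectorized version: a logical matrix selects all cells below 6 at once
small <- A_vec < 6
A_vec[small] <- 1 / A_vec[small]

all.equal(A_loop, A_vec)
#> [1] TRUE
```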

26.4 Practice Examples

Below are a bunch of practice examples.

Your Turn: Summation Series

Write a for loop to compute each of the following two series. Your loop should start at step \(k=0\) and stop at step \(n\). Test your code with different values of \(n\), and store the k-th term at each iteration. Does the series converge as \(n\) increases?

\[ \sum_{k=0}^{n} \frac{1}{2^k} = 1 + \frac{1}{2} + \frac{1}{4} + \frac{1}{8} + \dots + \frac{1}{2^n} \]

\[ \sum_{k=0}^{n} \frac{1}{9^k} =1 + \frac{1}{9} + \frac{1}{81} + \dots + \frac{1}{9^n} \]

Your Turn: Arithmetic Series

Write a for loop to compute the arithmetic series whose \(n\)-th term is \(a_n = a_1 + (n-1)d\), with \(a_1 = 3\) and \(d = 3\). For instance: \(3 + 6 + 9 + 12 + 15 + \dots\).

\[ a_n = a_1 + (n-1)d \]

Test your code with different values of \(n\), and store the n-th term at each iteration. Does the series converge as \(n\) increases?

Your Turn: Geometric Sequence

A sequence such as \(3, 6, 12, 24, 48\) is an example of a geometric sequence. In this type of sequence, the \(n\)-th term is obtained as:

\[ a_n = a_1 \times r^{n-1} \]

where \(a_1\) is the first term, \(r\) is the common ratio, and \(n\) is the position of the term in the sequence.

Write a for loop to compute the sum of the first \(n\) terms of: \(3 + 6 + 12 + 24 + \dots\). Test your code with different values of \(n\). Does the series converge as \(n\) increases?

Your Turn: Sine Approximation

Consider the following series that is used to approximate the function \(\sin(x)\):

\[ \sin(x) \approx x - \frac{x^3}{3!} + \frac{x^5}{5!} - \frac{x^7}{7!} + \dots \]

Write a for loop to approximate \(\sin(x)\). Try different numbers of terms, \(n = 5, 10, 50, 100\). Compare the result of your loop with the sin() function.

26.5 For loop with a matrix

Consider the following matrix A:

A <- matrix(1:20, nrow = 5, ncol = 4)
A
#>      [,1] [,2] [,3] [,4]
#> [1,]    1    6   11   16
#> [2,]    2    7   12   17
#> [3,]    3    8   13   18
#> [4,]    4    9   14   19
#> [5,]    5   10   15   20

Say we want to add 1 to all elements in row 1, add 2 to all elements in row 2, add 3 to all elements in row 3, and so on. To do this without using vectorized code, you need to work with two nested for() loops. One loop will control how you traverse the matrix by rows, the other loop will control how you traverse the matrix by columns. Here’s how:

# empty matrix B
B <- matrix(NA, nrow = 5, ncol = 4)

# for loop to get matrix B
for (i in 1:nrow(A)) {
  for (j in 1:ncol(A)) {
    B[i,j] <- A[i,j] + i
  }
}

B
#>      [,1] [,2] [,3] [,4]
#> [1,]    2    7   12   17
#> [2,]    4    9   14   19
#> [3,]    6   11   16   21
#> [4,]    8   13   18   23
#> [5,]   10   15   20   25
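
Once the loop works, a vectorized version can serve as a cross-check. The sketch below is one possible way to do it, using row(), which returns a matrix of the same shape as its argument holding the row index of each cell. The name B_vec is made up for this comparison.

```r
A <- matrix(1:20, nrow = 5, ncol = 4)

# loop version from above
B <- matrix(NA, nrow = 5, ncol = 4)
for (i in 1:nrow(A)) {
  for (j in 1:ncol(A)) {
    B[i, j] <- A[i, j] + i
  }
}

# vectorized version: add each cell's row index to the cell
B_vec <- A + row(A)

all.equal(B, B_vec)
#> [1] TRUE
```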

Your Turn

Consider the following matrix X:

set.seed(123)
X <- matrix(rnorm(12), nrow = 4, ncol = 3)
X
#>         [,1]   [,2]   [,3]
#> [1,] -0.5605  0.129 -0.687
#> [2,] -0.2302  1.715 -0.446
#> [3,]  1.5587  0.461  1.224
#> [4,]  0.0705 -1.265  0.360

Write code in R, using loops, to get a matrix Y such that the negative numbers in X are transformed into squared values, while the positive numbers in X are transformed into square root values.

26.6 Dividing a number by 2 multiple times

The following examples involve dividing a number by 2 until it becomes odd.

Using a repeat loop

# Divide a number by 2 until it becomes odd.
val_rep <- 898128000 # Change this value!

repeat {
  print(val_rep)
  if (val_rep %% 2 == 1) { # If val_rep is odd,
    break                  # end the loop.
  }
  val_rep <- val_rep / 2 # Divide val_rep by 2 since val_rep was even.
  # When the end of the loop is reached, return to the beginning of the loop.
}
#> [1] 8.98e+08
#> [1] 4.49e+08
#> [1] 2.25e+08
#> [1] 1.12e+08
#> [1] 56133000
#> [1] 28066500
#> [1] 1.4e+07
#> [1] 7016625

Using a while Loop

# Divide a number by 2 until it becomes odd.
val_while <- 898128000 # Change this value!

while (val_while %% 2 == 0) { # Continue the loop as long as val_while is even.
  print(val_while)
  val_while <- val_while / 2
}
#> [1] 8.98e+08
#> [1] 4.49e+08
#> [1] 2.25e+08
#> [1] 1.12e+08
#> [1] 56133000
#> [1] 28066500
#> [1] 1.4e+07
print(val_while)
#> [1] 7016625
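
The two loops differ in where the condition is checked: while tests it before every iteration, whereas repeat runs until a break inside its body is reached. As a small sketch with a made-up task (doubling a number until it exceeds 100), the same computation can be written both ways. The names val_w and val_r are hypothetical.

```r
# while: the condition is checked before each iteration
val_w <- 3
while (val_w <= 100) {
  val_w <- val_w * 2
}

# repeat: the check is written explicitly inside the body, with break
val_r <- 3
repeat {
  if (val_r > 100) {
    break
  }
  val_r <- val_r * 2
}

c(val_w, val_r)
#> [1] 192 192
```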

Make a reduce() function

Now generalize the above code to create a function reduce() which performs the same operation. (You should change very little.)

# your reduce() function
reduce <- function(x) {
  # Fill in.
  
}

reduce(898128000)

Your Turn: Average

The average of \(n\) numbers \(x_1, x_2, \dots, x_n\) is given by the following formula:

\[ \bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i = \frac{x_1 + x_2 + \dots + x_n}{n} \]

Write R code, using each type of loop (i.e. for, while, repeat), to compute the arithmetic mean of the vector x = 1:100.

Your Turn: Standard Deviation

The sample standard deviation of a list of \(n\) numbers \(x_1, x_2, \dots, x_n\) is given by the following formula:

\[ SD = \sqrt{ \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2 } \]

Write R code, using each type of loop (i.e. for, while, repeat), to compute the sample standard deviation of the vector x = 1:100.

Your Turn: Geometric Mean

The geometric mean of \(n\) numbers \(x_1, x_2, \dots, x_n\) is given by the following formula:

\[ \bar{x} = \left ( \prod_{i=1}^{n} x_i \right )^{1/n} \]

Write R code, using each type of loop (i.e. for, while, repeat), to compute the geometric mean of the vector x = 1:50.

Your Turn: Distance Matrix of Letters

The following code generates a random matrix distances with arbitrary distance values among letters in English:

# random distance matrix
num_letters <- length(LETTERS)
set.seed(123)
values <- sample.int(num_letters) 
distances <- values %*% t(values)
diag(distances) <- 0
dimnames(distances) <- list(LETTERS, LETTERS)

The first 5 rows and columns of distances are:

distances[1:5, 1:5]
#>     A   B   C  D   E
#> A   0 285 210 45 150
#> B 285   0 266 57 190
#> C 210 266   0 42 140
#> D  45  57  42  0  30
#> E 150 190 140 30   0

Consider the following character vector vec <- c('E', 'D', 'A'). The idea is to use the values in matrix distances to compute the total distance between consecutive letters, that is, from E to D, and then from D to A:

# (E to D) + (D to A)
30 + 45
#> [1] 75

Hence, you can say that the letters in the word 'E' 'D' 'A' have a total distance value of 75.
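
The individual distances can also be looked up by name instead of being read off by eye. The sketch below rebuilds only the 5 x 5 excerpt shown above; the vector values5 (15, 19, 14, 3, 10) is inferred from that excerpt and is an assumption, as is the name d5.

```r
# values for A to E, inferred from the 5 x 5 excerpt (an assumption)
values5 <- c(A = 15, B = 19, C = 14, D = 3, E = 10)
d5 <- values5 %*% t(values5)
diag(d5) <- 0
dimnames(d5) <- list(names(values5), names(values5))

# look up individual distances by row name and column name
d5["E", "D"]
#> [1] 30
d5["D", "A"]
#> [1] 45
d5["E", "D"] + d5["D", "A"]
#> [1] 75
```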

Your Turn

Write a function get_dist() that takes two inputs:

  • distances = the matrix of distances among letters.
  • ltrs = a character vector of upper case letters.

The function must return a numeric value with the total distance. Also, include a stopping condition (via stop()) for when a value in ltrs does not match any capital letter. The error message should be "Unrecognized character".

Here’s an example of how you should be able to invoke get_dist():

vec <- c('E', 'D', 'A')
get_dist(distances, vec)

And here’s an example that should raise an error:

err <- c('E', 'D', ')')
get_dist(distances, err)
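
As a reminder of how stop() behaves, here is a minimal sketch on an unrelated task. The function safe_sqrt() and its error message are hypothetical, chosen only to show the mechanism; tryCatch() is used so that the error message can be captured and inspected.

```r
# hypothetical helper: square root that refuses negative input
safe_sqrt <- function(x) {
  if (x < 0) {
    stop("Negative input")  # halts the function and signals an error
  }
  sqrt(x)
}

safe_sqrt(16)
#> [1] 4

# tryCatch() intercepts the error and returns its message
msg <- tryCatch(safe_sqrt(-1), error = function(e) conditionMessage(e))
msg
#> [1] "Negative input"
```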

Test your function with the following character vectors:

  • cal <- c('C', 'A', 'L')
  • stats <- c('S', 'T', 'A', 'T', 'S')
  • oski <- c('O', 'S', 'K', 'I')
  • zzz <- rep('Z', 3)
  • lets <- LETTERS
  • a vector first with letters for your first name, e.g. c('G', 'A', 'S', 'T', 'O', 'N')
  • a vector last for your last name, e.g. c('S', 'A', 'N', 'C', 'H', 'E', 'Z')

Your turn: Assuming that you already created the objects listed above, now create an R list strings like this:

# use your own 'first' and 'last' objects
strings <- list(
  cal = cal,
  stats = stats,
  oski = oski,
  zzz = zzz,
  lets = lets,
  first = first,
  last = last
)

Write a for() loop to iterate over the elements in strings, and compute their distances. At each iteration, store the calculated distances in a list called strings_dists; this list should have the same names as strings.

What does your list strings_dists look like?
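
The general pattern of filling a named list inside a loop can be sketched on a simpler task, namely counting the elements of each vector. The objects words and word_lengths are hypothetical and only illustrate the mechanics; they are not part of the exercise.

```r
# hypothetical input: a named list of character vectors
words <- list(
  ab = c('A', 'B'),
  xyz = c('X', 'Y', 'Z')
)

# pre-allocate a list of the same length, with the same names
word_lengths <- vector("list", length(words))
names(word_lengths) <- names(words)

# fill one element per iteration, using the name as the index
for (nm in names(words)) {
  word_lengths[[nm]] <- length(words[[nm]])
}

str(word_lengths)
```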