14 Iterations: While Loop

In the previous chapter you were introduced to your first iterative construct: for loops. You use this type of loop when you know how many times a given computation needs to be repeated. But what about those situations in which you have to repeat a process without necessarily knowing how many times it will be repeated? This is where we need a more general type of loop, namely, the while loop.

14.1 Motivation

Let’s begin with the same toy example discussed in the previous chapter. Say you have a vector vec <- c(3, 1, 4), and suppose you want to obtain a new vector vec2 that adds 1 to every element in vec. You know that this can easily be achieved using vectorized code:

vec <- c(3, 1, 4) 

vec2 <- vec + 1
vec2

[1] 4 2 5

Again, in order to explain the concept of a while loop, I am going to ask you to pretend that R does not have vectorized code.

What would you need to do in order to add 1 to the elements in vec? As we mentioned in the preceding chapter, you would need to do something like this:

# new vector to be updated
vec2 <- rep(0, 3)

# repetitive steps
vec2[1] <- vec[1] + 1
vec2[2] <- vec[2] + 1
vec2[3] <- vec[3] + 1

That is, take the first element in vec and add 1, then take the second element in vec and add 1, and finally the third element in vec and add 1. Basically, you are performing the same type of operation several times: selecting an element in vec and adding 1 to it. But there’s a lot of (unnecessary) repetition.

We’ve seen how to write a for loop to take care of the addition computation. Alternatively, we can also approach this problem from a slightly different perspective by considering a stopping condition to decide when to terminate the repetitive process of adding 1 to the elements in vec.

What stopping condition can we use? Well, one example may involve: “let’s keep selecting a single element in vec and adding 1 to it, until we exhaust all elements in vec”. In other words, let’s keep iterating until we reach the last element in vec.

As usual, the first step involves identifying the common structure of the repetitive steps. We can make the repetitive code a bit more general by referring to each position as pos:

vec2[pos] <- vec[pos] + 1

Once we have the correct abstraction for the code that needs to be repeated, we can encapsulate it with a while loop. Let me first show you an example and then we’ll examine it in detail:

# input vector
vec <- c(3, 1, 4)

# initialize output vector
vec2 <- rep(0, 3)

# declare auxiliary iterator
pos <- 1

# while loop
while (pos <= length(vec)) {
  vec2[pos] <- vec[pos] + 1
  pos <- pos + 1  # update iterator
}

The first thing that I should mention is that writing an R while loop is a bit more complex than writing a for loop. The complexity has to do with some of the things that R does not automatically take care of in a while loop.

One main difference between a for loop and a while loop is that in the latter we must explicitly declare the auxiliary iterator and give it an initial value: pos <- 1.
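To make this difference concrete, here is a small sketch that places the two loops side by side. The names vec2_for and vec2_while are introduced here only for illustration; they are not used elsewhere in the chapter.

```r
vec <- c(3, 1, 4)

# for loop: R creates the iterator pos and advances it for us
vec2_for <- rep(0, length(vec))
for (pos in seq_along(vec)) {
  vec2_for[pos] <- vec[pos] + 1
}

# while loop: we create the iterator and advance it ourselves
vec2_while <- rep(0, length(vec))
pos <- 1
while (pos <= length(vec)) {
  vec2_while[pos] <- vec[pos] + 1
  pos <- pos + 1
}

# both loops should produce the same result
identical(vec2_for, vec2_while)
```

If everything works as intended, the last line should return TRUE.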

Next we have the while statement. This statement is technically a function, but I prefer to think of it, and call it, a statement (like the if and the for statements). What you pass inside the parentheses of the while declaration is a condition. This is basically any piece of code that R will evaluate and coerce into a single logical value: TRUE or FALSE. The while loop iterates as long as the condition is TRUE. As soon as the condition becomes FALSE, the loop terminates.
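Here is a minimal sketch of what this coercion looks like in practice. The objects n, steps, and outcome are made up for this illustration. The first loop passes a number as the condition; the second part shows that a missing value cannot be used as a condition, which I capture with tryCatch() so the script keeps running.

```r
# a numeric condition is coerced: nonzero counts as TRUE, zero as FALSE
n <- 3
steps <- 0
while (n) {
  n <- n - 1
  steps <- steps + 1
}
steps  # 3

# a missing value is neither TRUE nor FALSE, so R signals an error
outcome <- tryCatch({
  while (NA) {
    # never reached
  }
  "loop finished"
}, error = function(e) "error")
outcome  # "error"
```

In practice I would still write the condition explicitly, as in while (n > 0), because it is easier to read.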

The code of the repetitive steps consists of an R expression { ... }. This is where we indicate what to do at each step. Often, an important piece of code that we need to include here involves increasing the value of the auxiliary iterator: pos <- pos + 1. In this particular example, if we did not increase the iterator pos, the condition would never become FALSE and the loop would iterate forever.
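To see the effect of a forgotten update without freezing your R session, you could experiment with something like the following sketch. The counter iterations and the cap max_iterations are assumptions of mine, added only as a safety net; they are not part of the original example.

```r
vec <- c(3, 1, 4)
vec2 <- rep(0, length(vec))

pos <- 1
iterations <- 0
max_iterations <- 10  # hypothetical safety cap

# the update pos <- pos + 1 is deliberately missing
while (pos <= length(vec) && iterations < max_iterations) {
  vec2[pos] <- vec[pos] + 1
  iterations <- iterations + 1
}

pos         # still 1
iterations  # 10: the loop only stopped because of the cap
vec2        # 4 0 0: only the first element was ever updated
```

Without the cap, the condition pos <= length(vec) would stay TRUE forever, because pos never changes.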

Note that the condition is the stopping condition, which in turn depends on the auxiliary iterator: pos <= length(vec). You can think of this condition as: “let’s keep iterating until we reach the last element in vec”.

14.2 Anatomy of a While Loop

Now that you’ve seen a first example of a while loop, I can give you a generic template for this kind of iterative construct:

iterator <- initial

while (condition) {
  do_something
  iterator <- iterator + 1
}

What’s going on?

you need to declare the auxiliary iterator with some initial value
you declare the while statement by giving a condition inside parentheses
the condition must be a piece of code that gets evaluated into a single logical value: TRUE or FALSE
the condition is used as the stopping condition: if the condition is TRUE the loop keeps iterating; when the condition becomes FALSE the loop is terminated
we use an R compound expression { ... } to embrace the code that will be repeated at each iteration
inside the loop, you typically need to increase the value of the iterator; even if the condition does not depend on the iterator, it’s a good idea to keep track of the number of iterations in the loop
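As an illustration of the last point, consider this sketch of a loop whose condition does not depend on the iterator at all. The objects value and iterations are hypothetical names chosen for this example: we keep halving a number until it drops below 1, and use the iterator only to count how many halvings were needed.

```r
value <- 100
iterations <- 0

while (value >= 1) {
  value <- value / 2              # do_something
  iterations <- iterations + 1    # keep track of the number of iterations
}

value       # 0.78125
iterations  # 7
```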

14.3 Another Example

Let’s see a more interesting example.

Say we generate a vector with 10 different integer numbers between 1 and 100, arranged in increasing order. To make things more interesting, we are going to generate these numbers in a random way using the sample.int() function that allows us to get a random sample of size = 10 integers, sampling without replacement (replace = FALSE):

set.seed(234)  # for replication purposes

# vector of 10 random integers between 1 and 100
random_numbers = sample.int(n = 100, size = 10, replace = FALSE)
random_numbers = sort(random_numbers)
random_numbers

 [1]  1 18 31 34 46 56 68 92 97 98

What are we going to do with these random_numbers? We are going to compute a cumulative sum until its value becomes greater than 100. And we are going to consider these two questions:

What is the value of the cumulative sum?
How many numbers were added to reach the sum’s value?

My recommendation is to always start with baby steps. Simply put, start writing code for a couple of concrete steps so that you understand what kind of computations will be repeated, and what things they have in common:

# initialize output sum
total_sum = 0

# accumulate numbers
total_sum = total_sum + random_numbers[1]
total_sum = total_sum + random_numbers[2]
total_sum = total_sum + random_numbers[3]
# ... keep adding numbers as long as total_sum <= 100

There are three important aspects to keep in mind:

we need an object to store the cumulative sum: total_sum
we need an iterator to move through the elements of random_numbers
and of course we need to determine a stopping condition: we keep iterating as long as total_sum <= 100

Here’s the code:

# initialize object of cumulative sum
total_sum = 0

# declare iterator
pos = 0

# repetitive steps
while (total_sum <= 100) {
  pos = pos + 1
  total_sum = total_sum + random_numbers[pos]
}

# what is the value of the cumulative sum?
total_sum

[1] 130

# how many iterations were necessary?
pos

[1] 5

Observe that in this example we declared pos = 0. Then, at each iteration, we first increase its value with pos = pos + 1, and then add random_numbers[pos] to the previous value of total_sum, effectively updating the cumulative sum.
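If you want to see how the cumulative sum evolves, one option is to record its value at every iteration. The sketch below assumes the random_numbers shown in the output above (typed in by hand so the code is self-contained), and uses a helper vector called running, which is a name I made up for this illustration.

```r
# values copied from the output printed above
random_numbers <- c(1, 18, 31, 34, 46, 56, 68, 92, 97, 98)

total_sum <- 0
pos <- 0
running <- numeric(0)  # hypothetical helper: stores the sum after each step

while (total_sum <= 100) {
  pos <- pos + 1
  total_sum <- total_sum + random_numbers[pos]
  running[pos] <- total_sum
}

running  # 1 19 50 84 130
```

The loop stops right after the fifth number is added, because that is the first time the sum exceeds 100.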

For comparison purposes, consider this other while loop. It looks extremely similar to the preceding loop but there is an important difference.

# initialize object of cumulative sum
total_sum = 0

# declare iterator
pos = 1

# repetitive steps
while (total_sum <= 100) {
  total_sum = total_sum + random_numbers[pos]
  pos = pos + 1
}

# what is the value of the cumulative sum?
total_sum

[1] 130

# how many iterations were necessary?
pos

[1] 6

Can you see the difference between these two while loops?

In this second loop, the iterator is declared as pos = 1, and its value is increased after updating the cumulative sum. While total_sum has the correct value, pos no longer indicates the right number of iterations: it is one more than the number of elements that were added.

I wanted to show you this second example to make a point: in a while loop you not only need to declare the iterator before entering the loop, but you also need to think carefully about what initial value you’ll use, as well as when to increase its value inside the loop. Sometimes the very first thing to do in each iteration is to increase the value of the iterator; sometimes that’s the last thing to do. It all depends on the specific way you are approaching a given iterative task.
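If you prefer the second arrangement, where the iterator starts at 1 and is increased at the end, one possible adjustment is to subtract 1 after the loop. This is only a sketch: numbers_added is a hypothetical name, and random_numbers is typed in by hand from the output shown earlier.

```r
# values copied from the output printed earlier
random_numbers <- c(1, 18, 31, 34, 46, 56, 68, 92, 97, 98)

total_sum <- 0
pos <- 1

while (total_sum <= 100) {
  total_sum <- total_sum + random_numbers[pos]
  pos <- pos + 1   # after the last addition, pos points one step too far
}

numbers_added <- pos - 1  # hypothetical name: undo the extra increment

total_sum      # 130
numbers_added  # 5
```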

14.3.1 While Loops and Next Statement

Sometimes we need to skip a certain iteration if a given condition is met. This can be done with the next statement. The following code chunk contains an abstract template that uses next:

iterator <- initial

while (condition) {
  # update the iterator before any call to next;
  # otherwise a skipped iteration would never advance
  iterator <- iterator + 1
  if (skip_condition) {
    next
  }
  do_something
}

As a less abstract example, let’s bring back the while loop of the cumulative sum of random numbers, but this time say we want to skip any numbers between 30 and 39. This means we need an if statement to check whether a given element of random_numbers is between 30 and 39. If yes, we should skip that element and go to the next iteration. Here is how to do it:

total_sum = 0
pos = 0

while (total_sum <= 100) {
  pos = pos + 1
  if (random_numbers[pos] %in% 30:39) {
    next
  }
  total_sum = total_sum + random_numbers[pos]
}

total_sum

[1] 121

pos

[1] 6
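To double check which elements were skipped, you could collect them in an extra vector. The following sketch assumes the same random_numbers as before (typed in by hand), and skipped is a hypothetical name introduced only for this illustration.

```r
# values copied from the output printed earlier
random_numbers <- c(1, 18, 31, 34, 46, 56, 68, 92, 97, 98)

total_sum <- 0
pos <- 0
skipped <- numeric(0)  # hypothetical helper: numbers that were skipped

while (total_sum <= 100) {
  pos <- pos + 1
  if (random_numbers[pos] %in% 30:39) {
    skipped <- c(skipped, random_numbers[pos])
    next   # jump to the next iteration without adding this number
  }
  total_sum <- total_sum + random_numbers[pos]
}

skipped    # 31 34
total_sum  # 121
pos        # 6
```

Notice that pos is increased before the if statement. If the update came after next, a skipped element would leave pos unchanged and the loop would get stuck on it.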

14.3.2 While Loops and Break Statement

In addition to skipping certain iterations, sometimes we need to stop a loop from iterating if a given condition is met. This can be done with the break statement, which is shown below in an abstract code template:

while (condition) { 
  expr1
  expr2
  if (stop_condition) {
    break
  }
  expr3
  expr4
}

Let’s go back to the cumulative sum example. Say we want to stop iterating as soon as we encounter a number greater than or equal to 40. As before, we need an if statement to check whether a given element of random_numbers is greater than or equal to 40. If yes, we stop the loop from iterating by using the break statement as follows:

total_sum = 0
pos = 0

while (total_sum <= 100) {
  pos = pos + 1
  if (random_numbers[pos] >= 40) {
    break
  }
  total_sum = total_sum + random_numbers[pos]
}

total_sum

[1] 84

pos

[1] 5
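Another situation where break can help is when the loop might run out of numbers before the sum exceeds the limit. The sketch below is my own variation, not part of the examples above: it assumes a hypothetical limit called threshold that is larger than the sum of all ten numbers, and stops once the last element has been added.

```r
# values copied from the output printed earlier
random_numbers <- c(1, 18, 31, 34, 46, 56, 68, 92, 97, 98)

threshold <- 1000  # hypothetical limit, larger than the sum of all numbers
total_sum <- 0
pos <- 0

while (total_sum <= threshold) {
  if (pos == length(random_numbers)) {
    break   # no more numbers to add
  }
  pos <- pos + 1
  total_sum <- total_sum + random_numbers[pos]
}

total_sum  # 541
pos        # 10
```

Without the check, pos would move past the last element, random_numbers[pos] would be NA, and the condition of the while statement would fail with an error.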