# 13 Iterations: For Loop

In the previous chapter you got introduced to conditional statements, better known as if-else statements. In this chapter we introduce another common programming structure known as **iterations**. Simply put *iterative procedures* commonly referred to as **loops**, allow you to repeat a series of common steps.

## 13.1 Motivation

Say you decide to invest $1000 in an investment fund that tracks the performance of the total US stock market. This type of financial asset, commonly referred to as a *total stock fund* is a mutual fund or an exchange traded fund (ETF) that holds every stock in a selected market. The purpose of a total stock fund is to replicate the broad market by holding the stock of every security that trades on a certain exchange. From the investing point of view, these funds are ideal for investors who want exposure to the overall equity market at a very low cost.

Examples of such funds are:

VTSAX: Vanguard Total Stock Market Index Fund

FSKAX: Fidelity Total Stock Market Index Fund

SWTSX: Schwab Total Stock Market Index Fund

As we were saying, you decide to invest $1000 in a total stock market index fund (today).

How much money would you

expectto get in 10 years?

If we knew the annual rate of return, we could use the future value formula (of compound interest)

\[ \text{FV} = \$1000 \times (1 + r)^{10} \]

The problem, or the “interesting part”—depending on how you want to see it—is that *total stock funds* don’t have a constant annual rate of return. Why not? Well, because the stock market is **volatile**, permanently moving up and down every day, with prices of stocks fluctuating every minute.

Despite the variability in prices of stocks, it turns out that in most (calendar) years, the annual return is positive. But not always. There are some years in which the annual return can be negative.

### 13.1.1 US Stock Market Historical Annual Returns

We can look at historical data to get an idea of the average annual return for investing in the total US stock market. One interesting resource that gives a nice perspective of the historical distribution of US stock market returns comes from amateur investor Joachim Klement

https://klementoninvesting.substack.com/p/the-distribution-of-stock-market

As you can tell, on an annual scale, market returns are basically random and follow the normal distribution fairly well.

So let us assume that annual rates of return for the total US Stock Market have a **Normal distribution** with mean \(\mu = 10\%\) and standard deviation \(\sigma = 18\%\).

Mathematically, we can write something like this:

\[ r_t \sim \mathcal{N}(\mu = 0.10, \ \sigma = 0.18) \]

where \(r_t\) represents the annual rate of return in any given year \(t\).

This means that, on average, we expect 10% return every year. But a return between 28% = 10% + 18%, and -2% = 10% - 18%, it’s also a perfectly reasonable return in every year.

In other words, there is nothing surprising if you see years where the return is any value between -2% and 28%.

Admittedly, we don’t know what’s going to happen with the US Stock Market in the next 10 years. But we can use our knowledge of the long-term regular behavior of the market to guesstimate an expected return in 10 years. If we assume that the (average) annual return for investing in a total market index fund is 10%, then we could estimate the expected future value as:

\[ \textsf{expected } \text{FV} = \$ 1000 \times (1 + 0.10)^{10} = \$ 2593.742 \]

Keep in mind that one of the limitations behind this assumption is that we are not taking into account the volatility of the stock market. This is illustrated in the following figure that compares a theoretical scenario with constant 10% annual returns, versus a plausible scenario with variable returns in each year.

## 13.2 Simulating Normal Random Numbers

Because the stock market is volatile, we need a way to simulate random rates of return. The good news is that we can use `rnorm()`

to simulate generating numbers from a Normal distribution. By default, `rnorm()`

generates a random number from a standard normal distribution (mean = 0, standard deviation = 1)

```
set.seed(345) # for replication purposes
rnorm(n = 1, mean = 0, sd = 1)
```

`[1] -0.7849082`

Here’s how to simulate three rates from a Normal distribution with mean \(\mu = 0.10\) and \(\sigma = 0.18\)

```
= rnorm(n = 3, mean = 0.10, sd = 0.18)
rates rates
```

`[1] 0.04968742 0.07093758 0.04769262`

### 13.2.1 Investing in the US stock market during three years

To understand what could happen with your investment, let’s focus on a three year horizon. For each yar, we need a random rate of return \(r_t\) (\(t\) = 1, 2, 3)

As we mentioned, we can use `rnorm()`

to generate three random rates of return from a normal distribution with \(\mu = 0.10\) and \(\sigma = 0.18\)

```
# inputs, consider investing during three years
set.seed(345) # for replication purposes
= 1000
amount0 = rnorm(n = 3, mean = 0.10, sd = 0.18)
rates rates
```

`[1] -0.04128347 0.04968742 0.07093758`

With these rates, we can then calculate the amounts in the investment fund that we could expect to have at the end of each year:

```
# output: balance amount at the end of each year
= amount0 * (1 + rates[1])
amount1 = amount1 * (1 + rates[2])
amount2 = amount2 * (1 + rates[3])
amount3
c(amount1, amount2, amount3)
```

`[1] 958.7165 1006.3527 1077.7409`

Notice the common structure in the commands used to obtain an `amount`

value. Moreover, notice that we are **repeating** the same operation three times. At this point you may ask yourself whether we could vectorize this code. Let’s see if we can:

```
# vectorized attempt 1
= amount0 * (1 + rates)
amounts amounts
```

`[1] 958.7165 1049.6874 1070.9376`

If we use the vector `rates`

, we obtain some `amounts`

but these are not the values that we are looking for.

What if we vectorize both `rates`

and years `1:3`

?

```
# vectorized attempt 2
= amount0 * (1 + rates)^(1:3)
amounts amounts
```

`[1] 958.7165 1101.8437 1228.2661`

Once again, we obtain some `amounts`

but these are not the values that we are looking for.

Can you see why the above vectorized code options fail to capture the correct compounding return?

### 13.2.2 Investing during ten years

Let’s expand our time horizon from three to ten years, generating random rates of return for each year, and calculating a simulated amount year by year:

```
set.seed(345) # for replication purposes
= 1000
amount0 = rnorm(n = 10, mean = 0.10, sd = 0.18)
rates
= amount0 * (1 + rates[1])
amount1 = amount1 * (1 + rates[2])
amount2 = amount2 * (1 + rates[3])
amount3 = amount3 * (1 + rates[4])
amount4 = amount4 * (1 + rates[5])
amount5 = amount5 * (1 + rates[6])
amount6 = amount6 * (1 + rates[7])
amount7 = amount7 * (1 + rates[8])
amount8 = amount8 * (1 + rates[8])
amount9 = amount9 * (1 + rates[10]) amount10
```

Note: we know that this is too repetitive, time consuming, boring and error prone. Can you spot the error?

## 13.3 Iterations to the Rescue

Let’s first write some “inefficient” code in order to understand what is going on at each step.

```
= amount0 * (1 + rates[1])
amount1 = amount1 * (1 + rates[2])
amount2 = amount2 * (1 + rates[3])
amount3 = amount3 * (1 + rates[4])
amount4 # etc
```

What do these commands have in common?

They all take an input amount, that gets compounded for one year at a certain rate. The output amount at each step is used as the input for the next amount.

Instead of calculating a single `amount`

at each step, we can start with an almost “empty” vector `amounts()`

. This vector will contain the initial amount, as well as the amounts at the end of every year.

```
set.seed(345) # for replication purposes
= 1000
amount0 = rnorm(n = 10, mean = 0.10, sd = 0.18)
rates
# output vector (to be populated)
= c(amount0, double(length = 10))
amounts
# repetitive commands
2] = amounts[1] * (1 + rates[1])
amounts[3] = amounts[2] * (1 + rates[2])
amounts[# etc ...
10] = amounts[9] * (1 + rates[9])
amounts[11] = amounts[10] * (1 + rates[10]) amounts[
```

From the commands in the previous code chunk, notice that at step `s`

, to compute the next value `amounts[s+1]`

, we do this:

`+1] = amounts[s] * (1 + rates[s]) amounts[s`

This operation is repeated 10 times (one for each year). At the end of the computations, the vector `amounts`

contains the initial investment, and the 10 values resulting from the compound return year-by-year.

## 13.4 For Loop Example

One programming structure that allows us to write code for carrying out these repetitive steps is a **for** loop, which is one of the iterative *control flow* structures in every programming language.

In R, a `for()`

loop has the following syntax

```
for (s in 1:10) {
+1] = amounts[s] * (1 + rates[s])
amounts[s }
```

You use the

`for`

statementInside parenthesis, you specify three ingredients separated by blank spaces:

an auxiliary iterator, e.g.

`s`

the keyword

`in`

a vector to iterate through, e.g.

`1:10`

The code for the repetitive steps gets wrapped inside braces

Let’s take a look at the entire piece of code:

```
set.seed(345) # for replication purposes
= 1000
amount0 = rnorm(n = 10, mean = 0.10, sd = 0.18)
rates
# output vector (to be populated)
= c(amount0, double(length = 10))
amounts
# for loop
for (s in 1:10) {
+1] = amounts[s] * (1 + rates[s])
amounts[s
}
amounts
```

```
[1] 1000.0000 958.7165 1006.3527 1077.7409 1129.1412
[6] 1228.3298 1211.0918 1129.9604 1590.9151 2223.8740
[11] 3170.9926
```

There are a couple of important things worth noticing:

You don’t need to declare or create the auxiliary iterator outside the loop

R will automatically handle the auxiliary iterator (no need to explicitly increase its value)

The length of the “iterations vector” determines the number of times the code inside the loop has to be repeated

You use

**for**loops when you know how many times a series of calculations need to be repeated.

Having obtained the vector of `amounts`

, we can then plot a timeline to visualize the behavior of the simulated returns against the hypothetical investment with a constant rate of return:

```
# random annual rates of return (blue)
plot(0:10, amounts, type = "l", lwd = 2, col = "blue",
xlab = "years", ylab = "amount (dollars)", las = 1)
points(0:10, amounts, col = "blue", pch = 20)
# assuming constant 10% annual return (orange)
lines(0:10, amount0 * (1+0.10)^(0:10), col = "orange", lwd = 2)
points(0:10, amount0 * (1+0.10)^(0:10), col = "orange", pch = 20)
```

## 13.5 About For Loops

To describe more details about `for`

loops in R, let’s consider a super simple example. Say you have a vector `vec <- c(3, 1, 4)`

, and suppose you want to add 1 to every element of `vec`

. You know that this can easily be achieved using vectorized code:

```
<- c(3, 1, 4)
vec
+ 1 vec
```

`[1] 4 2 5`

In order to learn about loops, I’m going to ask you to forget about the notion of vectorized code in R. That is, pretend that R does not have vectorized functions.

Think about what you would need to do in order to add 1 to the elements in `vec`

. This addition would involve taking the first element in `vec`

and add 1, then taking the second element in `vec`

and add 1, and finally the third element in `vec`

and add 1, something like this:

```
1] + 1
vec[2] + 1
vec[3] + 1 vec[
```

The code above does the job. From a purely arithmetic standpoint, the three lines of code reflect the operation that you would need to carry out to add 1 to all the elements in `vec`

.

From a programming point of view, you are performing the same type of operation three times: selecting an element in `vec`

and adding 1 to it. But there’s a lot of (unnecessary) repetition.

This is where loops come very handy. Here’s how to use a `for ()`

loop to add 1 to each element in `vec`

:

```
<- c(3, 1, 4)
vec
for (j in 1:3) {
print(vec[j] + 1)
}
```

```
[1] 4
[1] 2
[1] 5
```

In the code above we are taking each element `vec[j]`

, adding 1 to it, and printing the outcome with `print()`

so that we can visualize the additions at each iteration of the loop.

What if you want to create a vector `vec2`

, in which you store the values produced at each iteration of the loop? Here’s one possibility:

```
<- c(3, 1, 4) # you can change these values
vec <- rep(0, length(vec)) # vector of zeros to be filled in the loop
vec2
for (j in 1:3) {
= vec[j] + 1
vec2[j] }
```

### 13.5.1 Anatomy of a For Loop

The anatomy of a `for`

loop is as follows:

```
for (iterator in times) {
do_something }
```

`for()`

takes an **iterator** variable and a vector of **times** to iterate through. For example, in the following code the auxiliary iterator is `i`

and the vector of times is the numeric sequence `1:5`

.

```
<- 2
value
for (i in 1:5) {
<- value * 2
value print(value)
}
```

```
[1] 4
[1] 8
[1] 16
[1] 32
[1] 64
```

As you can tell, `print()`

is used to display the updated `value`

at each iteration. This print statement is used here for illustration purposes; in practice you rarely need to print any output in a loop.

The vector of *times* does NOT have to be a numeric vector; it can be **any** vector. The important thing about this vector is its length, which R uses to determine the number of iterations in the `for`

loop.

```
<- 2
value <- c('one', 'two', 'three', 'four')
times
for (i in times) {
<- value * 2
value print(value)
}
```

```
[1] 4
[1] 8
[1] 16
[1] 32
```

However, if the *iterator* is used inside the loop in a numerical computation, then the vector of *times* will almost always be a numeric vector:

```
set.seed(4321)
<- rnorm(5)
numbers
for (h in 1:length(numbers)) {
if (numbers[h] < 0) {
<- sqrt(-numbers[h])
value else {
} <- sqrt(numbers[h])
value
}print(value)
}
```

```
[1] 0.6532667
[1] 0.4728761
[1] 0.8471168
[1] 0.9173035
[1] 0.3582698
```

### 13.5.2 For Loops and Next statement

Sometimes we need to skip a certain iteration if a given condition is met, this can be done with the `next`

statement

```
for (iterator in times) {
expr1
expr2if (condition) {
next
}
expr3
expr4 }
```

Example:

```
<- 2
x for (i in 1:5) {
<- x * i
y if (y == 8) {
next
}print(y)
}
```

```
[1] 2
[1] 4
[1] 6
[1] 10
```

### 13.5.3 For Loops and Break statement

In addition to skipping certain iterations, sometimes we need to stop a loop from iterating if a given condition is met, this can be done with the `break`

statement:

```
for (iterator in times) {
expr1
expr2if (stop_condition) {
break
}
expr3
expr4 }
```

Example:

```
<- 2
x for (i in 1:5) {
<- x * i
y if (y == 8) {
break
}print(y)
}
```

```
[1] 2
[1] 4
[1] 6
```

### 13.5.4 Nested Loops

It is common to have nested loops

```
for (iterator1 in times1) {
for (iterator2 in times2) {
expr1
expr2
...
} }
```

#### Example: Nested loops

Consider a matrix with 3 rows and 4 columns

```
# some matrix
<- matrix(1:12, nrow = 3, ncol = 4)
A A
```

```
[,1] [,2] [,3] [,4]
[1,] 1 4 7 10
[2,] 2 5 8 11
[3,] 3 6 9 12
```

Suppose you want to transform those values less than 6 into their reciprocals (that is, dividing them by 1). You can use a pair of embedded loops: one to traverse the rows of the matrix, the other one to traverse the columns of the matrix:

```
# reciprocal of values less than 6
for (i in 1:nrow(A)) {
for (j in 1:ncol(A)) {
if (A[i,j] < 6) A[i,j] <- 1 / A[i,j]
}
} A
```

```
[,1] [,2] [,3] [,4]
[1,] 1.0000000 0.25 7 10
[2,] 0.5000000 0.20 8 11
[3,] 0.3333333 6.00 9 12
```

### 13.5.5 About `for`

Loops and Vectorized Computations

R loops have a bad reputation for being slow.

Experienced users will tell you: “tend to avoid

`for`

loops in R” (me included).It is not really that the loops are slow; the slowness has more to do with the way R handles the

*boxing and unboxing*of data objects, which may be a bit inefficient.R provides a family of functions that are usually more efficient than loops (i.e.

`apply()`

functions).If you have NO programming experience, you should ignore any advice about avoiding loops in R.

You should learn how to write loops, and understand how they work; every programming language provides some type of loop structure.

In practice, many (programming) problems can be tackled using some loop structure.

When using R, you may need to start solving a problem using a loop. Once you solved it, try to see if you can find a vectorized alternative.

It takes practice and experience to find alternative solutions to

`for`

loops.There are cases when using

`for`

loops is not that bad.