6 The Pipe Operators

This part introduces the pipe operators, which allow you to write function calls in a more human-readable way. These operators can be extremely useful within tidyverse operations that require many steps.

We should note that there are two pipe operators: %>% and |>.

  • %>% is the "magrittr" operator, and it is the older of the two pipes available in R.

  • |> is a more recent operator, and it is now part of "base" R (since version 4.1.0), so it requires no additional package.
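As a quick illustration that the two operators behave the same way in simple calls, here is a minimal sketch. It assumes the "magrittr" package is installed and that you are running R 4.1.0 or later; the vector x is a made-up example, not data from this chapter.

```r
library(magrittr)  # provides %>%; |> needs no package

# hypothetical toy vector
x = c(1, 4, 9)

# the same call written with each pipe
res_magrittr = x %>% sqrt()
res_native = x |> sqrt()

# both are equivalent to sqrt(x)
identical(res_magrittr, res_native)
## [1] TRUE
```

For simple chains like this one the two pipes are interchangeable; they differ in some more advanced features, which are not covered here.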

6.1 Piping

The behavior of "dplyr" is functional in the sense that function calls don’t have side-effects: each call returns a new object and leaves its input unchanged. This implies that you must always assign the result to a name if you want to keep it as an object (in memory).
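The following sketch shows what this means in practice. The tibble storms_demo and its values are hypothetical, invented only for this illustration; they are not the sep2010 data used in the rest of the chapter.

```r
library(dplyr)

# hypothetical toy data (not sep2010)
storms_demo = tibble(
  name = c("A", "B", "C"),
  wind = c(40, 90, 120))

# the result is printed, but nothing is stored
filter(storms_demo, wind > 50)

# the input is left untouched: it still has 3 rows
nrow(storms_demo)
## [1] 3

# to keep the filtered rows, assign them to a name
strong = filter(storms_demo, wind > 50)
nrow(strong)
## [1] 2
```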

The “ugly” side of this functional behavior is that it doesn’t lead to particularly elegant code, especially if you want to do many operations at once.

For example, say we are interested in the wind speed of hurricanes, and that in addition to having speeds measured in knots we also want them in miles per hour (mph) and kilometers per hour (kph). On top of that, we want to arrange the hurricanes by wind speed in increasing order.

One option is to do calculations step-by-step, storing the intermediate results in their own data objects.

# manipulation step-by-step
dat1 = filter(sep2010, category != "ts")
dat2 = select(dat1, name, wind)
dat3 = mutate(
  dat2,
  wind_mph = wind * 1.15078,
  wind_kph = wind * 1.852)
dat4 = arrange(dat3, wind)
dat4
## # A tibble: 4 × 4
##   name   wind wind_mph wind_kph
##   <chr> <dbl>    <dbl>    <dbl>
## 1 Lisa     75     86.3     139.
## 2 Karl    110    127.      204.
## 3 Julia   120    138.      222.
## 4 Igor    135    155.      250.

Another option, if you don’t want to name the intermediate results, requires wrapping the function calls inside each other:

# inside-out style (hard to read)
arrange(
  mutate(
    select(
      filter(sep2010, category != "ts"), 
      name, wind),
    wind_mph = wind * 1.15078,
    wind_kph = wind * 1.852),
  wind)
## # A tibble: 4 × 4
##   name   wind wind_mph wind_kph
##   <chr> <dbl>    <dbl>    <dbl>
## 1 Lisa     75     86.3     139.
## 2 Karl    110    127.      204.
## 3 Julia   120    138.      222.
## 4 Igor    135    155.      250.

This is difficult to read because the operations are ordered from the inside out, so the arguments end up a long way from the function they belong to. To get around this problem, you can use a pipe: either %>% or |>.

6.1.1 The pipe operator

x |> f(y) turns into f(x, y), so you can use it to rewrite a sequence of operations as a chain that reads left-to-right, top-to-bottom:

# same manipulation, written with the pipe
sep2010 |> 
  filter(category != "ts") |>
  select(name, wind) |>
  mutate(
    wind_mph = wind * 1.15078,
    wind_kph = wind * 1.852) |>
  arrange(wind)
## # A tibble: 4 × 4
##   name   wind wind_mph wind_kph
##   <chr> <dbl>    <dbl>    <dbl>
## 1 Lisa     75     86.3     139.
## 2 Karl    110    127.      204.
## 3 Julia   120    138.      222.
## 4 Igor    135    155.      250.
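The rule that x |> f(y) turns into f(x, y) can be checked directly in base R (version 4.1.0 or later), because the native pipe is rewritten when the code is parsed. In the sketch below, x, f and y are placeholder names that do not need to exist, since quote() captures the expression without evaluating it.

```r
# quote() returns the parsed expression without evaluating it
piped = quote(x |> f(y))
piped
## f(x, y)

# the piped form and the regular call are the same expression
identical(piped, quote(f(x, y)))
## [1] TRUE

# the left-hand side becomes the first argument of the call
rounded = c(3.14159, 2.71828) |> round(2)
rounded
## [1] 3.14 2.72
```

This is why each step of the pipeline above can omit its data argument: the result of the previous step is supplied as the first argument of the next function.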