6 The Pipe Operators
This part introduces the pipe operators, which allow you to write function calls in a more human-readable way. These operators can be extremely useful in tidyverse operations that require many steps.
We should note that there are two operators: %>% and |>. %>% is the "magrittr" operator, and it is the older of the two pipes available in R. |> is a more recent operator (introduced in R 4.1.0), and it is now part of "base" R.
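As a minimal sketch of the two operators side by side, the snippet below applies the same two steps to a small numeric vector with each pipe. The vector x and the result names are made up for illustration, and the example assumes that the "magrittr" package is installed and that R is version 4.1.0 or later.

```r
library(magrittr)  # provides %>% (assumed to be installed)

# illustrative input, not taken from the hurricane data
x <- c(1, 4, 9, 16)

# magrittr pipe: take square roots, then add them up
res_magrittr <- x %>% sqrt() %>% sum()

# native pipe: same two steps (requires R >= 4.1.0)
res_native <- x |> sqrt() |> sum()

res_magrittr
## [1] 10
res_native
## [1] 10
```

For simple chains like this one the two pipes are interchangeable; both are equivalent to sum(sqrt(x)).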
6.1 Piping
The behavior of "dplyr" is functional in the sense that function calls don’t have side effects: they return a new object and leave their input unchanged. This implies that you must always assign their results to an object if you want to keep them (in memory).
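A short sketch of what "no side effects" means in practice. Since the hurricane table is not defined at this point, the example uses the built-in mtcars data as a stand-in; the names cars and four_cyl are made up for illustration.

```r
library(dplyr)

# stand-in data: the built-in mtcars data frame (32 rows)
cars <- mtcars

# calling filter() without assignment only returns (and prints) a result;
# the input object is left untouched
invisible(filter(cars, cyl == 4))
nrow(cars)
## [1] 32

# to keep the filtered rows, assign the result to an object
four_cyl <- filter(cars, cyl == 4)
nrow(four_cyl)
## [1] 11
```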
The “ugly” side of this functional behavior is that it doesn’t lead to particularly elegant code, especially if you want to do many operations at once.
For example, say we are interested in the wind speed of hurricanes, and that in addition to having speeds measured in knots we also want them in mph and kph. And not only that, but we want to arrange the hurricanes by wind speed in increasing order.
One option is to do calculations step-by-step, storing the intermediate results in their own data objects.
# manipulation step-by-step
dat1 = filter(sep2010, category != "ts")
dat2 = select(dat1, name, wind)
dat3 = mutate(
  dat2,
  wind_mph = wind * 1.15078,
  wind_kph = wind * 1.852)
dat4 = arrange(dat3, wind)
dat4
## # A tibble: 4 × 4
## name wind wind_mph wind_kph
## <chr> <dbl> <dbl> <dbl>
## 1 Lisa 75 86.3 139.
## 2 Karl 110 127. 204.
## 3 Julia 120 138. 222.
## 4 Igor 135 155. 250.
Another option, if you don’t want to name the intermediate results, requires wrapping the function calls inside each other:
# inside-out style (hard to read)
arrange(
  mutate(
    select(
      filter(sep2010, category != "ts"),
      name, wind),
    wind_mph = wind * 1.15078,
    wind_kph = wind * 1.852),
  wind)
## # A tibble: 4 × 4
## name wind wind_mph wind_kph
## <chr> <dbl> <dbl> <dbl>
## 1 Lisa 75 86.3 139.
## 2 Karl 110 127. 204.
## 3 Julia 120 138. 222.
## 4 Igor 135 155. 250.
This is difficult to read because the operations have to be read from the inside out. As a result, the arguments end up a long way from the function they belong to.
To get around this problem, you can use a pipe operator, either %>% or |>.
6.1.1 The pipe operator
x |> f(y) turns into f(x, y), so you can use it to rewrite multiple operations in a way that you can read left-to-right, top-to-bottom:
# same manipulation, written with the pipe
sep2010 |>
  filter(category != "ts") |>
  select(name, wind) |>
  mutate(
    wind_mph = wind * 1.15078,
    wind_kph = wind * 1.852) |>
  arrange(wind)
## # A tibble: 4 × 4
## name wind wind_mph wind_kph
## <chr> <dbl> <dbl> <dbl>
## 1 Lisa 75 86.3 139.
## 2 Karl 110 127. 204.
## 3 Julia 120 138. 222.
## 4 Igor 135 155. 250.
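To see the rewriting rule on its own, the sketch below uses only base R (version 4.1.0 or later is assumed). The symbols x, f and y in the first call are placeholders that are never evaluated, and the names piped and nested are made up for illustration.

```r
# the native pipe is rewritten when the code is parsed:
# quoting the expression shows the call that R actually builds
quote(x |> f(y))
## f(x, y)

# a concrete case: the left-hand side becomes the first argument
piped  <- pi |> round(2)
nested <- round(pi, 2)

piped
## [1] 3.14
identical(piped, nested)
## [1] TRUE
```

Because the left-hand side always lands in the first argument, the pipe fits "dplyr" verbs naturally: they all take the data as their first argument.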