Column-wise operations with colwise
Posted on February 19, 2013
In a previous post I described different options in R to do some calculations using tapply(), ddply(), and sqldf().
I used a simple example in which the goal was to apply a function to some data by groups. More specifically: how to calculate the average of a single variable within each level of a grouping variable (e.g., a categorical variable).
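As a quick reminder of that single-variable case, here is a minimal sketch with tapply(). It assumes the built-in iris data set, with Species as the grouping variable; both the data set and the name avg_length are chosen only for illustration.

```r
# Mean of one variable (Sepal.Length) within each level of a grouping factor
avg_length <- tapply(iris$Sepal.Length, iris$Species, mean)
avg_length
```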
This time I want to continue the discussion with another interesting task that involves grouping variables. Say we have a categorical variable (like gender, geographic region, or political affiliation) along with several quantitative variables. More often than not, we want to calculate descriptive statistics that take the categorical variable into account. Maybe we want to calculate the average of all the quantitative variables by gender. How do you do that in R?
There are a number of different options to get the answer. One option is to use the function colwise() from the package "plyr" (by Hadley Wickham). The idea of colwise() is to turn a function that operates on a vector into a function that operates column-wise on a data.frame. The trick is to use colwise() inside the function ddply(). Here’s a simple example:
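A minimal sketch of this approach, assuming the built-in iris data set with Species as the grouping variable. The data set and the names col_means and means_by_species are assumptions made for illustration.

```r
library(plyr)

# colwise() wraps a vector function so that it is applied to every column
col_means <- colwise(mean)
col_means(iris[, 1:4])

# Inside ddply(): split iris by Species, then average each remaining column
means_by_species <- ddply(iris, .(Species), colwise(mean))
means_by_species
```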
You should get something like this:
We have the average values of all the variables for each group. Now let’s visualize them to get a better idea of what’s going on:
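One possible way to draw such a picture is a grouped bar chart with base R graphics. The sketch below again assumes the iris data set, and the choice of barplot() is only an illustration, not necessarily the plot used originally.

```r
library(plyr)

# Group means for every numeric column (iris is an assumed example data set)
means_by_species <- ddply(iris, .(Species), colwise(mean))

# barplot() expects a numeric matrix: one row per variable, one column per group
means_matrix <- t(as.matrix(means_by_species[, -1]))
colnames(means_matrix) <- as.character(means_by_species$Species)

# One cluster of bars per group, one bar per variable
barplot(means_matrix, beside = TRUE,
        legend.text = rownames(means_matrix),
        args.legend = list(x = "topleft"),
        ylab = "Mean value")
```

Each cluster of bars corresponds to one group, so differences between groups can be compared variable by variable.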