6 Intro to R and RStudio

In this chapter, we provide basic coverage of R and RStudio. We assume that you have installed R and RStudio on your computer. If that’s not the case, follow the instructions to download and install them:

  1. Download and Install R
  2. Download and Install RStudio (Desktop free version)

Both R and RStudio are free, and available for Mac (OS X), Windows, and Linux (e.g. Ubuntu, Fedora, Debian).

Keep in mind that R and RStudio are not the same thing. R is the software, the “engine” or computational core. RStudio is a convenient layer that talks directly to R and gives us a working space to organize our files, type in code, run commands, visualize plots, interact with our filesystem, etc. The tools provided by RStudio are designed to make our life easier while working with R. However, everything that happens in RStudio can be done in R alone. You may need to write more code and work in a more rudimentary way, but nothing should stop your work in R if one day RStudio disappears from the face of the earth.

Figure 6.1: Main computational tools: R and RStudio

6.1 R as a scientific calculator

Launch RStudio and notice the locations of the panes (or panels); the layout on your computer may be different, but it should have four panes:

  • Source (top left in image below)
  • Console (top right in image below)
  • Environment/History (bottom left in image below)
  • Files/Plots/Packages/Help (bottom right in image below)

Figure 6.2: Screenshot of RStudio panes

FYI: you can change the default location of the panes, among many other things: Customizing RStudio. If you have no experience working with R/RStudio, you don’t have to customize anything right now. It’s better to wait a few days until you get a better feel for the working environment. You will probably spend some time experimenting (trial and error) with the customization options until you find what works for you.

6.1.1 First contact with the R console

If you have never used software in which you have to type commands and code, our best suggestion is that you begin typing basic things in the console, using R as a scientific calculator.

For instance, consider the monthly bills of an undergrad student:

  • cell phone $80
  • transportation $20
  • groceries $527
  • gym $10
  • rent $1500
  • other $83

You can use R to find the student’s total expenses by typing these commands in the console:

# total expenses
80 + 20 + 527 + 10 + 1500 + 83
#> [1] 2220

Often, it will be more convenient to create objects or variables that store one or more values. To do this, type the name of the variable, followed by the assignment or “arrow” operator <-, followed by the assigned value. For example, you can create an object phone for the cell phone bill, and then inspect the object by typing its name:

phone <- 80
phone
#> [1] 80

All R statements where you create objects are known as “assignments”, and they have this form:

object <- value

This means you assign a value to a given object; one easy way to read the earlier assignment phone <- 80 is “phone gets 80”.
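
As a small sketch of how assignments combine with arithmetic, the snippet below stores two values and reuses them. The names books and supplies, and their amounts, are made up for illustration and are not part of the student's bills listed above.

```r
# hypothetical amounts, chosen only for illustration
books <- 120
supplies <- 35

# objects can be used anywhere a number can
school_costs <- books + supplies
school_costs
#> [1] 155

# assigning again replaces the previous value of an object
books <- 100
books + supplies
#> [1] 135
```

Note that school_costs keeps the value it was given when it was created; changing books afterwards does not update it.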

RStudio has a keyboard shortcut for the arrow operator <-: Alt + - (the minus sign); on a Mac, use Option + -. In fact, there is a large set of keyboard shortcuts. In the menu bar, go to the Help menu, and then click on the option Keyboard Shortcuts Help to find information about all the available shortcuts.

You will be working with RStudio a lot, and you will have time to learn most of the bells and whistles RStudio provides. Think about RStudio as your “workbench” that gives you an environment that makes it easier to work with R, while taking care of many of the little tasks that can be a hassle.

6.1.2 Your Turn

  • Make more assignments to create variables transportation, groceries, gym, rent, and other with their corresponding amounts.

  • Now that you have all the variables, create a total object with the sum of the expenses.

  • Assuming that the student has the same expenses every month, how much would she spend during a school “semester”? (assume the semester involves five months).

  • Maintaining the same assumption about the monthly expenses, how much would she spend during a school “year”? (assume the academic year is 10 months).

6.1.3 Object Names

There are certain rules you have to follow when creating objects and variables. Object names cannot start with a digit and cannot contain certain other characters such as a comma or a space. People use different naming styles, and at some point you should also adopt a convention for naming things. Some of the common styles are:

i_use_snake_case

other.people.use.periods

evenOthersUseCamelCase

Pretty much all the objects and variables created in this book follow the “snake_case” style. It is certainly possible that you may end up working with a team that has a style guide with a specific naming convention. Feel free to try various styles, and once you feel comfortable with one of them, stick to it.

The following are invalid names (and invalid assignments):

# cannot start with a number
5variable <- 5

# cannot start with an underscore
_invalid <- 10

# cannot contain comma
my,variable <- 3

# cannot contain spaces
my variable <- 1

This is fine but a little bit too much:

this_is_a_really_long_name <- 3.5
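
If you are unsure whether a name is valid, one possible check is sketched below using the base R function make.names(), which converts a string into a syntactically valid name. The helper is_valid_name() is a hypothetical function defined only for this illustration; it is not something R provides.

```r
# hypothetical helper: a name is valid if make.names() leaves it unchanged
is_valid_name <- function(x) {
  make.names(x) == x
}

is_valid_name("my_variable")
#> [1] TRUE

is_valid_name("5variable")
#> [1] FALSE

is_valid_name("my variable")
#> [1] FALSE

# make.names() shows how R would repair an invalid name
make.names("5variable")
#> [1] "X5variable"
```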

6.1.4 Functions

R has many functions. To use a function, type its name followed by parentheses. Inside the parentheses you pass an input. Most functions will produce some type of output:

# absolute value
abs(10)
#> [1] 10
abs(-4)
#> [1] 4

# square root
sqrt(9)
#> [1] 3

# natural logarithm
log(2)
#> [1] 0.6931472
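
Some functions accept more than one input, separated by commas, and the output of one function can be passed as the input of another. The short sketch below uses base R functions only; the argument names base and digits are the ones listed in the help pages of log() and round().

```r
# logarithm of 8 in base 2
log(8, base = 2)
#> [1] 3

# round a number to two decimal places
round(3.14159, digits = 2)
#> [1] 3.14

# the output of abs() becomes the input of sqrt()
sqrt(abs(-16))
#> [1] 4
```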

6.1.5 Comments in R

All programming languages use a set of characters to indicate that a specific part or line of code is a comment, that is, something that is not to be executed. R uses the hash or pound symbol # to specify comments. Anything to the right of # will not be executed by R.

# this is a comment
# this is another comment
2 * 9

4 + 5  # you can place comments like this

You will notice that we have included comments in almost all of the code snippets shown in the book. To be honest, some examples may have too many comments, but we’ve done that to be very explicit, and so that those of you who lack coding experience understand what’s going on. In real life, programmers use comments, but not as much as we do in the book. The main purpose of writing comments is to describe, conceptually, what is happening with certain lines of code.

6.1.6 Case Sensitive

R is case sensitive. This means that phone is not the same as Phone or PHONE:

# case sensitive
phone <- 80
Phone <- -80
PHONE <- 8000

phone + Phone
#> [1] 0

PHONE - phone
#> [1] 7920
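
Case sensitivity applies to function names as well as to objects. As a minimal sketch, the snippet below calls sqrt() and then a misspelled Sqrt(), which we assume does not exist in your session. The call is wrapped in tryCatch() so that the error message is stored in msg instead of stopping the code; the exact wording of the message depends on your R version and language settings.

```r
# sqrt() exists in R
sqrt(16)
#> [1] 4

# Sqrt() (capital S) is assumed not to exist, so calling it is an error;
# tryCatch() captures the error message as text
msg <- tryCatch(Sqrt(16), error = function(e) conditionMessage(e))
msg
```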

6.1.7 Your turn

Take your objects (i.e. variables) phone, transportation, groceries, gym, rent, and other and pass them inside the combine function c(), separating each variable with a comma, to create an object expenses.

Now, use the graphing function barplot() to produce a bar chart of expenses:

barplot(expenses)

Find out how to use sort() to sort the elements in expenses, in order to produce a bar chart with bars in decreasing order. Also, see if you can figure out how to display the names of the variables below each of the bars. As an optional challenge, see if you can find out how to display the value of each variable at the top of its bar.
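
If you get stuck, here is a small sketch of c() and sort() on a different, made-up vector. The objects math, physics, and history and their values are hypothetical; adapting the idea to expenses is left to you.

```r
# hypothetical weekly study hours (not the student's expenses)
math <- 6
physics <- 4
history <- 9

# c() combines values into a vector; name = value pairs label each element
hours <- c(math = math, physics = physics, history = history)
hours
#>    math physics history 
#>       6       4       9 

# sort() orders the elements; see ?sort for its arguments
sort(hours, decreasing = TRUE)
#> history    math physics 
#>       9       6       4 
```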

6.2 Getting Help

Because we work with functions all the time, it’s important to know certain details about how to use them: what inputs are required, and what output is returned.

There are several ways to get help.

If you know the name of the function you want to learn more about, you can use the function help() and pass it the name of that function:

# documentation about the 'abs' function
help(abs)

# documentation about the 'mean' function
help(mean)

Alternatively, you can use a shortcut using the question mark ? followed by the name of the function:

# documentation about the 'abs' function
?abs

# documentation about the 'mean' function
?mean

  • How to read the manual documentation (main sections of a help page):
    • Title
    • Description
    • Usage of function
    • Arguments
    • Details
    • See Also
    • Examples!!!
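
Two base R functions can bring parts of a help page straight to the console. This is only a sketch of one possible workflow: args() prints a function's arguments, similar to the Usage section, and example() runs the code found in the Examples section.

```r
# show the arguments of sd(), similar to the Usage section of ?sd
args(sd)

# run the code from the Examples section of the help page of mean()
example(mean)
```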

help() only works if you know the name of the function you are looking for. Sometimes, however, you don’t know the name but you may know some keywords. To look for functions related to a keyword, use help.search() or simply the double question mark ??:

# search for 'absolute'
help.search("absolute")

# alternatively you can also search like this:
??absolute

Notice the use of quotes surrounding the input name inside help.search().
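
A related base R function, not covered above, is apropos(), which lists the objects whose names match a pattern. It searches names only, not the text of the help pages, so treat the sketch below as a complement to help.search() rather than a replacement.

```r
# list objects whose names contain "mean"
apropos("mean")

# the pattern is a regular expression: ^abs matches names starting with "abs"
abs_matches <- apropos("^abs")
abs_matches
```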

6.3 Installing Packages

R comes with a large set of functions and packages. A package is a collection of functions that have been designed for a specific purpose. One of the great advantages of R is that many analysts, scientists, programmers, and users can create their own packages and make them available for everybody to use. R packages can be shared in different ways. The most common way to share a package is to submit it to what is known as CRAN, the Comprehensive R Archive Network.

You can install a package using the install.packages() function. To do this, we recommend that you run this command directly on the console. Do NOT include this command in a code chunk of an Rmd file: you will very likely get an error message when knitting the Rmd file.

To use install.packages(), just give it the name of a package, surrounded by quotes, and R will look for it in CRAN. If it finds it, R will download and install it on your computer.

# installing (run this on the console!)
install.packages("knitr")

You can also install a bunch of packages at once:

# run this command on the console!
install.packages(c("readr", "ggplot2"))

Once you have installed a package, you can start using its functions by loading the package with the function library(). By the way, when working on an Rmd file that uses functions from a given package, you MUST include a code chunk with the library() command.

# (this command can be included in an Rmd file)
library(knitr)
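
Before calling library(), it can be handy to check whether a package is already installed. The sketch below uses the base R function requireNamespace() for that check and only prints a reminder; it does not install anything. The object names pkg and is_installed are hypothetical. The last lines show the package::function() syntax, which calls a function from a package without loading it, using the stats package that ships with R.

```r
# name of the package to check (any package name would do)
pkg <- "knitr"

# TRUE if the package is installed, FALSE otherwise
is_installed <- requireNamespace(pkg, quietly = TRUE)

if (!is_installed) {
  message("Package '", pkg, "' is not installed; ",
          "run install.packages(\"", pkg, "\") on the console.")
}

# package::function() calls a function from a specific package
stats::sd(c(1, 2, 3))
#> [1] 1
```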

6.4 Exercises

Type commands directly on the console pane of RStudio to:

1) Install packages "stringr", "RColorBrewer", and "XML"

2) Calculate: \(3x^2 + 4x + 8\) when \(x = 2\)

3) Calculate: \(3x^2 + 4x + 8\) but now with a numeric sequence for \(x\) using x <- -3:3

4) Find out how to look for information about math binary operators like + or ^ (without using ?Arithmetic).

5) In RStudio, one of the panes has tabs Files, Plots, Packages, Help, Viewer.

  1. Find out what the Files tab is good for.
  2. In the Files tab, what happens when you click the button with a House icon?
  3. Find out what the Help tab is good for.
  4. In the Help tab, what happens when you click the button with a House icon?

6) In RStudio, one of the panes has the tabs Environment, History, Connections.

  1. Find out what the History tab is for.
  2. Find out what the buttons of the menu bar in the History tab are for.
  3. Likewise, what can you say about the Environment tab?

7) When you start a new R session in RStudio, a message with similar content to the text below appears on the console (the exact content will depend on your R version):

   R version 3.5.1 (2018-07-02) -- "Feather Spray"
   Copyright (C) 2018 The R Foundation for Statistical Computing
   Platform: x86_64-apple-darwin15.6.0 (64-bit)

   R is free software and comes with ABSOLUTELY NO WARRANTY.
   You are welcome to redistribute it under certain conditions.
   Type 'license()' or 'licence()' for distribution details.

     Natural language support but running in an English locale   

   R is a collaborative project with many contributors.
   Type 'contributors()' for more information and
   'citation()' on how to cite R or R packages in publications.

   Type 'demo()' for some demos, 'help()' for on-line help, or
   'help.start()' for an HTML browser interface to help.
   Type 'q()' to quit R.

  1. What happens when you type: license()?

  2. What happens when you type: contributors()?

  3. What happens when you type: citation()?

  4. What happens when you type: demo()?