6 Intro to R Markdown Files

Most of the times you won’t be working directly on the console. Instead, you will be typing your commands in some source file. The most basic type of source files are known as R script files. But there are more flavors of source files. A very convenient type of source file that allow you to mix R code with narrative is an R markdown file commonly referred to as Rmd file.

6.1 Get to know `Rmd` files

In the menu bar of RStudio, click on File, then New File, and choose R Markdown. Select the default option (Document), and click Ok.

Rmd files are a special type of file, referred to as a dynamic document, that allows to combine narrative (text) with R code. Because you will be turning in most homework assignments as Rmd files, it is important that you quickly become familiar with this resource.

Locate the button Knit HTML (the one with a knitting icon) and click on it so you can see how Rmd files are rendered and displayed as HTML documents.

6.1.1 What is an `Rmd` file?

Rmd files are a special type of file, referred to as a dynamic document. This is the fancy term we use to describe a document that allows us to combine narrative (text) with R code in one single file.

R markdown files use a special syntax called markdown. To be more precise, Rmd files let you type text using either: 1) R syntax for code that needs to be executed; 2) markdown syntax to write your narrative, and 3) latex syntax for math equations and symbols.

Rmd files are plain text files. This means that you can open an Rmd file with any text editor (not just RStudio) and being able to see and edit its contents.

The main idea behind dynamic documents is simple yet very powerful: instead of working with two separate files, one that contains the R code, and another one that contains the narrative, you use an .Rmd file to include both the commands and the narrative.

One of the main advantages of this paradigm, is that you avoid having to copy results from your computations and paste them into a report file. In fact, there are more complex ways to work with dynamic documents and source files. But the core idea is the same: combine narrative and code in a way that you let the computer do the manual, repetitive, and time consuming job.

Rmd files is just one type of dynamic document that you will find in RStudio. In fact, RStudio provides other file formats that can be used as dynamic documents: e.g. .Rnw, .Rpres, .Rhtml, etc.

6.1.2 Anatomy of an `Rmd` file

The structure of an .Rmd file can be divided in two parts: 1) a YAML header, and 2) the body of the document. In addition to this structure, you should know that .Rmd files use three types of syntaxes: YAML, Markdown, and R.

The YAML header consists of the first few lines at the top of the file. This header is established by a set of three dashes --- as delimiters (one starting set, and one ending set). This part of the file requires you to use YAML syntax (Yet Another Markup Language.) Within the delimiter sets of dashes, you specify settings (or metadata) that will apply to the entire document. Some of the common options are things like:

title
author
date
output

The body of the document is everything below the YAML header. It consists of a mix of narrative and R code. All the text that is narrative is written in a markup syntax called Markdown (although you can also use LaTeX math notation). In turn, all the text that is code is written in R syntax inside blocks of code.

There are two types of blocks of code: 1) code chunks, and 2) inline code. Code chunks are lines of text separated from any lines of narrative text. Inline code is code inserted within a line of narrative text .

6.1.3 How does an Rmd file work?

Rmd files are plain text files. All that matters is the syntax of its content. The content is basically divided in the header, and the body.

The header uses YAML syntax.
The narrative in the body uses Markdown syntax.
The code and commands use R syntax.

The process to generate a nice rendered document from an Rmd file is known as knitting. When you knit an Rmd file, various R packages and programs run behind the scenes. But the process can be broken down in three main phases: 1) Parsing, 2) Execution, and 3) Rendering.

Parsing: the content of the file is parsed (examined line by line) and each component is identified as yaml header, or as markdown text, or as R code.

Each component receives a special treatment and formatting.

The most interesting part is in the pieces of text that are R code. Those are separated and executed if necessary. The commands may be included in the final document. Also, the output may be included in the final document. Sometimes, nothing is executed nor included.

Depending on the specified output format (e.g. HTML, pdf, word), all the components are assembled, and one single document is generated.

6.1.4 Yet Another Syntax to Learn

R markdown (Rmd) files use markdown as the main syntax to write content. Markdown is a very lightweight type of markup language, and it is relatively easy to learn.

One of the most common sources of confusion when learning about R and Rmd files has to do with the hash symbol #. As you know, # is the character used by R to indicate comments. The issue is that the # character has a different meaning in markdown syntax. Hashes in markdown are used to define levels of headings.

In an Rmd file, a hash # that is inside a code chunk will be treated as an R comment. A hash outside a code chunk, will be treated as markdown syntax, making its associated text a given type of heading.

6.2 Code chunks

There are dozens of options available to control the executation of the code, the formatting and display of both the commands and the output, the display of images, graphs, and tables, and other fancy things. Here’s a list of the basic options you should become familiar with:

eval: whether the code should be evaluated
- TRUE
- FALSE
echo: whether the code should be displayed
- TRUE
- FALSE
- numbers indicating lines in a chunk
error: whether to stop execution if there is an error
- TRUE
- FALSE
results: how to display the output
- markup
- asis
- hold
- hide
comment: character used to indicate output lines
- the default is a double hash ##
- "" empty character (to have a cleaner display)

6.2.1 Resources for Markdown

In RStudio’s menu bar select the Help tab. Then click on the option Markdown Quick Reference.

Work through the markdown tutorial: www.markdown-tutorial.com

Your turn: After lab discussion, find some time to go through this additional markdown tutorial www.markdowntutorial.com

RStudio has a very comprehensive R Markdown tutorial: Rstudio markdown tutorial