7 Intro to R Markdown Files
Most of the time you won’t be working directly on the console.
Instead, you will be typing your commands in some source file.
The most basic type of source files are known as R script files.
But there are more flavors of source files. A very convenient type of source
file that allows you to mix R code with narrative is an R markdown file,
commonly referred to as an Rmd file.
7.1 Get to know Rmd files
In the menu bar of RStudio, click on File, then New File, and choose R Markdown. Select the default option (Document), and click OK.
Rmd files are a special type of file, referred to as a dynamic document,
that allows you to combine narrative (text) with R code. Because you will
be turning in most homework assignments as Rmd files, it is important
that you quickly become familiar with this resource.
Locate the button Knit HTML (the one with a knitting icon) and click on it
so you can see how Rmd files are rendered and displayed as HTML documents.
7.1.1 What is an Rmd file?
Rmd files are a special type of file, referred to as a dynamic document. This is the fancy term we use to describe a document that allows us to combine narrative (text) with R code in one single file.
R markdown files use a special syntax called markdown. To be more precise, Rmd files let you type text using three syntaxes: 1) R syntax for code that needs to be executed; 2) markdown syntax to write your narrative; and 3) LaTeX syntax for math equations and symbols.
Rmd files are plain text files. This means that you can open an Rmd file with any text editor (not just RStudio) and be able to see and edit its contents.
The main idea behind dynamic documents is simple yet very powerful: instead of
working with two separate files, one that contains the R code, and
another one that contains the narrative, you use an .Rmd file to include
both the commands and the narrative.
One of the main advantages of this paradigm is that you avoid having to copy results from your computations and paste them into a report file. There are more complex ways to work with dynamic documents and source files, but the core idea is the same: combine narrative and code so that the computer does the manual, repetitive, and time-consuming job.
Rmd files are just one type of dynamic document that you will find in RStudio.
In fact, RStudio provides other file formats that can be used
as dynamic documents: e.g. .Rnw, .Rpres, .Rhtml, etc.
7.1.2 Anatomy of an Rmd file
The structure of an .Rmd file can be divided into two parts: 1) a YAML header,
and 2) the body of the document. In addition to this structure, you should
know that .Rmd files use three types of syntax: YAML, Markdown, and R.
The YAML header consists of the first few lines at the top of the file.
This header is delimited by two sets of three dashes (---): one opening set
and one closing set. This part of the file requires you to use YAML syntax
(YAML originally stood for Yet Another Markup Language; its official
expansion is now YAML Ain't Markup Language).
Within the delimiter sets of dashes, you specify settings (or metadata) that
will apply to the entire document. Some of the common
options are things like:
- title
- author
- date
- output
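As a rough illustration, a YAML header with these four options might look like the lines printed by the sketch below. The title, author, and date values are made-up placeholders, and html_document is just one of the possible output formats.

```r
# Hypothetical YAML header for an Rmd file, stored as a character vector.
# The title, author, and date values are placeholders for illustration.
yaml_header <- c(
  "---",
  'title: "My first Rmd file"',
  'author: "Jane Doe"',
  'date: "2024-01-15"',
  "output: html_document",
  "---"
)

# Print the header as it would appear at the top of the file
cat(yaml_header, sep = "\n")
```

Note that the first and last lines are the two sets of three dashes that delimit the header.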
The body of the document is everything below the YAML header. It consists of a mix of narrative and R code. All the text that is narrative is written in a markup syntax called Markdown (although you can also use LaTeX math notation). In turn, all the text that is code is written in R syntax inside blocks of code.
There are two types of blocks of code: 1) code chunks, and 2) inline code. Code chunks are lines of code set apart from the lines of narrative text. Inline code is code inserted within a line of narrative text.
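To make the anatomy concrete, here is a minimal sketch that writes a small Rmd file from R and reads it back. The file contents are hypothetical; the three backticks that open and close a code chunk are built with strrep() only so that the example displays cleanly.

```r
# Three backticks: the delimiter that opens and closes a code chunk
fence <- strrep("`", 3)

# Contents of a small, hypothetical Rmd file
rmd_lines <- c(
  "---",
  'title: "Minimal example"',
  "output: html_document",
  "---",
  "",
  "Some narrative written in Markdown.",
  "",
  paste0(fence, "{r}"),            # start of a code chunk
  "x <- c(2, 4, 6)",
  "mean(x)",
  fence,                           # end of the code chunk
  "",
  "The mean of x is `r mean(x)`."  # inline code inside a sentence
)

# Save the lines in a temporary file (the file name is arbitrary)
rmd_file <- tempfile(fileext = ".Rmd")
writeLines(rmd_lines, rmd_file)

# Rmd files are plain text, so they can be read back like any text file
cat(readLines(rmd_file), sep = "\n")
```

The code chunk sits on its own lines between the two fences, while the inline code lives inside a sentence, wrapped in single backticks and starting with the letter r.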
7.1.3 How does an Rmd file work?
Rmd files are plain text files. All that matters is the syntax of their content. The content is basically divided into the header and the body.
- The header uses YAML syntax.
- The narrative in the body uses Markdown syntax.
- The code and commands use R syntax.
The process of generating a nicely rendered document from an Rmd file is known as knitting. When you knit an Rmd file, various R packages and programs run behind the scenes. But the process can be broken down into three main phases: 1) Parsing, 2) Execution, and 3) Rendering.
- Parsing: the content of the file is parsed (examined line by line) and each component is identified as YAML header, Markdown text, or R code. Each component receives a special treatment and formatting.
- Execution: the most interesting part is in the pieces of text that are R code. Those are separated and executed if necessary. Depending on the chunk options, the commands may be included in the final document, the output may be included, or nothing may be executed or included at all.
- Rendering: depending on the specified output format (e.g. HTML, PDF, Word), all the components are assembled, and one single document is generated.
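Knitting can also be triggered from the console. The sketch below assumes that the rmarkdown package and pandoc are installed (both come with a typical RStudio setup) and skips the knitting step otherwise; the contents of the Rmd file are hypothetical.

```r
fence <- strrep("`", 3)  # three backticks, the chunk delimiter

# Write a small, hypothetical Rmd file to a temporary location
rmd_file <- tempfile(fileext = ".Rmd")
writeLines(c(
  "---",
  'title: "Knitting from the console"',
  "output: html_document",
  "---",
  "",
  paste0(fence, "{r}"),
  "summary(c(1, 5, 9))",
  fence
), rmd_file)

# Knit only if rmarkdown and pandoc are available on this machine
can_knit <- requireNamespace("rmarkdown", quietly = TRUE) &&
  rmarkdown::pandoc_available()

if (can_knit) {
  # render() parses the file, runs the chunks, and assembles the output;
  # it returns (invisibly) the path of the generated document
  html_file <- rmarkdown::render(rmd_file, quiet = TRUE)
  print(basename(html_file))
} else {
  message("rmarkdown or pandoc not found; skipping the knitting step")
}
```

The output format is taken from the YAML header, so changing the output field is enough to produce a different type of document.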
7.1.4 Yet Another Syntax to Learn
R markdown (Rmd) files use markdown
as the main syntax to write content. Markdown is a very lightweight type of
markup language, and it is relatively easy to learn.
One of the most common sources of confusion when learning about R and Rmd
files has to do with the hash symbol #. As you know, # is the character
used by R to indicate comments. The issue is that the # character has a
different meaning in markdown syntax. Hashes in markdown are used to define
levels of headings.
In an Rmd file, a hash # that is inside a code chunk will be treated as
an R comment. A hash at the start of a line outside a code chunk will be
treated as markdown syntax, making its associated text a given type of heading.
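The two roles of the hash can be illustrated with a toy classifier. This is only a sketch of the rule described above, not how knitr is actually implemented, and the lines of the Rmd body are hypothetical.

```r
fence <- strrep("`", 3)  # three backticks, the chunk delimiter

# Body of a hypothetical Rmd file with a hash in two different places
body_lines <- c(
  "# Data summary",
  "",
  paste0(fence, "{r}"),
  "# compute the average",
  "mean(c(1, 2, 3))",
  fence
)

# Toy rule: a line is inside a chunk when an odd number of fence lines
# has been seen so far, and the line is not itself a fence line
is_fence <- startsWith(body_lines, fence)
in_chunk <- (cumsum(is_fence) %% 2 == 1) & !is_fence

# Label every line that starts with a hash
has_hash <- startsWith(body_lines, "#")
role <- ifelse(in_chunk, "R comment", "Markdown heading")
data.frame(line = body_lines[has_hash], role = role[has_hash])
```

The first hash sits outside any chunk, so it marks a heading; the second one sits between the fences, so R treats it as a comment.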
7.2 Code chunks
There are dozens of options available to control the execution of the code, the formatting and display of both the commands and the output, the display of images, graphs, and tables, and other fancy things. Here’s a list of the basic options you should become familiar with:
- eval: whether the code should be evaluated
  - TRUE
  - FALSE
- echo: whether the code should be displayed
  - TRUE
  - FALSE
  - numbers indicating lines in a chunk
- error: what to do if the code produces an error
  - TRUE: keep going and show the error message in the output
  - FALSE: stop execution at the error
- results: how to display the output
  - markup
  - asis
  - hold
  - hide
- comment: character used to indicate output lines
  - the default is a double hash ##
  - "" empty character (to have a cleaner display)
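Chunk options are written inside the braces of the chunk header, separated by commas. The sketch below prints a hypothetical chunk that uses three of the options listed above, and then shows how the same options could be set for all chunks with knitr::opts_chunk$set(), assuming the knitr package is installed.

```r
fence <- strrep("`", 3)  # three backticks, the chunk delimiter

# A chunk whose header sets three options: the code is hidden (echo),
# but it still runs (eval) and its output has no "##" prefix (comment)
chunk <- c(
  paste0(fence, '{r, echo=FALSE, eval=TRUE, comment=""}'),
  "summary(c(1, 5, 9))",
  fence
)
cat(chunk, sep = "\n")

# The same options can be set for every chunk in a document,
# typically from a setup chunk near the top of the file
if (requireNamespace("knitr", quietly = TRUE)) {
  knitr::opts_chunk$set(echo = FALSE, comment = "")
  print(knitr::opts_chunk$get("echo"))
}
```

Options given in an individual chunk header take precedence over the global settings for that chunk.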
7.2.1 Resources for Markdown
In RStudio’s menu bar select the Help tab. Then click on the option Markdown Quick Reference.
Work through the markdown tutorial: www.markdown-tutorial.com
Your turn: After lab discussion, find some time to go through this additional markdown tutorial www.markdowntutorial.com
RStudio has a very comprehensive R Markdown tutorial: RStudio markdown tutorial
7.3 Exercises
1) Open an Rmd file and write content in markdown syntax to replicate, as much as possible, the format of the following sample text.
2) The table below shows different examples of marked-up text.
| | Example | | Example |
|---|---|---|---|
| A) | **Some text** | K) | -Some text- |
| B) | (Some text) | L) | - Some text |
| C) | {Some text} | M) | > Some text |
| D) | (Some text)[../folder/file] | N) | [Some text](../folder/file) |
| E) | # Some text | O) | Some text! |
| F) | `Some text` | P) | \| Some text \| |
| G) | "Some text" | Q) | :Some text: |
| H) | __*Some text*__ | R) | _Some text_ |
| I) | ~~Some text~~ | S) | <Some text> |
| J) | 1. Some text | T) | *_Some text_* |
Indicate which letter corresponds to each of the following Markdown options:
____ Text in italics
____ Text in bold
____ Text in code format (i.e. monospace)
____ Heading text (title)
____ Strikethrough text
____ Unordered item (unordered bullet)
____ Link (hyperlink)
____ Blockquote
____ Ordered item (ordered bullet)
____ Italicized text in bold