38 Introduction

In this part of the book, you will learn about a number of tools that give you more power and flexibility for handling data beyond what we’ve seen so far. For lack of a better name, we’ll use the label Data Technologies to group topics such as:

  • Regular Expressions
  • XML and HTML
  • JSON data
  • The Web and basics of HTTP
  • Web Scraping

The following R packages are used throughout this part of the book:

library(stringr)   # for strings and regular expressions
library(xml2)      # for parsing XML and HTML documents
library(rvest)     # for web scraping (extracting content from HTML pages)
library(jsonlite)  # for handling data in JSON format
library(httr)      # for working with HTTP requests (e.g. with APIs)
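
The library() calls above assume that each package is already installed. As a minimal sketch (the variable names `pkgs` and `to_install` are ours, and installing from CRAN requires an internet connection), you could check for missing packages and install only those:

```r
# Packages used in this part of the book
pkgs <- c("stringr", "xml2", "rvest", "jsonlite", "httr")

# TRUE for each package that can be found in your R library
installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)

# Names of the packages that still need to be installed
to_install <- pkgs[!installed]

# Install only what is missing (assumes a CRAN mirror is configured)
if (length(to_install) > 0) {
  install.packages(to_install)
}
```

Running this once should be enough; afterwards the library() calls listed above should load each package without errors.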

At the beginning of each chapter, we will also indicate which specific packages you need to load.

Keep in mind that this list of packages is not comprehensive. You may find alternative packages that are good substitutes for the ones listed here (e.g. "XML" instead of "xml2", "crul" instead of "httr", "rjson" instead of "jsonlite").