16 Matrices and Arrays

In this chapter we introduce R arrays, which are multidimensional data objects including 2-dimensional arrays better known as matrices, and N-dimensional arrays (generic arrays).

16.1 From Vectors to Arrays

In thre previous three chapters, we discussed a number of ideas and concepts that basically have to do with vectors. As you know, a vector is a one-dimensional atomic object. While many data sets can be handled with a vector, there are occasions in which one dimension is not enough. The classic example for when one-dimensional objects are not enough involves working with data that fits better into a tabular structure. This type of structure requires two dimensions: one for rows, and another one for columns.

R provides a handful of rectangular or 2-dimensional objects. One of them is matrix which is a two-dimensional atomic object. In turns out that an R matrix is a special type of multi-dimensional atomic object called array.

Figure 16.1: Atomic data objects in R

So, how do you go from a vector into a generic array? Conceptually, we can transform a vector into an n-dimensional array by giving it a dimensions "dim" attribute with dim():

# simple vector
x <- 1:8
x
#> [1] 1 2 3 4 5 6 7 8

# adding 'dim' attribute: 2 rows, 4 columns
dim(x) <- c(2, 4)
x
#>      [,1] [,2] [,3] [,4]
#> [1,]    1    3    5    7
#> [2,]    2    4    6    8

A couple of things to notice:

a vector can be given a dim attribute
a dim attribute is a numeric vector of length n > 2
R will reorganize the elements of the vector into n dimensions
each dimension will have as many rows (or columns, etc.) as the n-th value of the dim vector

Here’s another example converting a vector into an array with three dimensions:

# simple vector
x <- 1:8

# dim attribute with 3 dimensions
dim(x) <- c(2, 2, 2)
x
#> , , 1
#> 
#>      [,1] [,2]
#> [1,]    1    3
#> [2,]    2    4
#> 
#> , , 2
#> 
#>      [,1] [,2]
#> [1,]    5    7
#> [2,]    6    8

The vector c(2, 2, 2) passed to dim() indicates that the produced array should have 2 slices (first index 2), each slice containing 2-rows (second index 2) and 2-columns (third index 2).

One important detail to keep in mind when converting a vector into either a matrix or a generic array, is to make sure that the length of the vector matches the product of the values given for each dimension. In the previous examples, we first converted vector x of length 8, into a matrix of 2 rows and 4 columns (\(2 \times 4 = 8\)); and then we converted x into an array of three dimensions each one having 2 elements (\(2 \times 2 \times 2 = 8\)).

We should say that these examples are purely conceptual, and the main purpose is to illustrate the notion of dimensions in atomic objects, and how R can internally generate a matrix or an array from an input vector by giving it a dimensions attribute via the function dim(). We should also say that using dim() like in the above examples, is not really how you should create matrices and arrays. You should instead use dedicated functions such as matrix() and array():

# create a matrix with matrix()
mat <- matrix(1:12, nrow = 3, ncol = 4)
mat
#>      [,1] [,2] [,3] [,4]
#> [1,]    1    4    7   10
#> [2,]    2    5    8   11
#> [3,]    3    6    9   12

# create an array with array()
arr <- array(1:12, dim = c(2, 2, 3))
arr
#> , , 1
#> 
#>      [,1] [,2]
#> [1,]    1    3
#> [2,]    2    4
#> 
#> , , 2
#> 
#>      [,1] [,2]
#> [1,]    5    7
#> [2,]    6    8
#> 
#> , , 3
#> 
#>      [,1] [,2]
#> [1,]    9   11
#> [2,]   10   12

16.2 Matrices

The most common type of array in R is the two-dimensional array also know as matrix. We saw how to take a vector and convert it into a matrix with dim(). In practice, however, the formal way to create matrices is with matrix(), which has the following usage syntax:

matrix(data = NA, nrow = 1, ncol = 1, byrow = FALSE, dimnames = NULL)

To make a 2-by-4 matrix from an input vector x <- 1:8, you use matrix() and specify the number of rows and columns with arguments nrow = 2 and ncol = 4 as follows:

# vector to matrix
x <- 1:8
x
#> [1] 1 2 3 4 5 6 7 8

X <- matrix(x, nrow = 2, ncol = 4)
X
#>      [,1] [,2] [,3] [,4]
#> [1,]    1    3    5    7
#> [2,]    2    4    6    8

What happens when a matrix is created with matrix():

R always fills up each matrix by columns.
You can also assign dim-names to an array.
If you don’t specify names for rows and columns, which is the default value of argument dimnames = NULL, rows and columns will be unnamed.
As mentioned before, the number of cells in a matrix—given by the product of number of rows and columns—has to match the length of the input vector, otherwise R will give you an error.

Internally, R always stores matrices by columns, and this is technically known as column-major. Many other languages store matrices or 2-dimensional arrays in this form, such as Fortran, Matlab, and Julia (but not like C or Python with "numpy").

Even though internally a matrix is always stored by columns, sometimes you may want to create a matrix by rows. This is possible thanks to the argument byrow.

# vector to matrix (default behavior)
matrix(1:8, nrow = 2, ncol = 4, byrow = FALSE)
#>      [,1] [,2] [,3] [,4]
#> [1,]    1    3    5    7
#> [2,]    2    4    6    8

# vector to matrix (by rows)
matrix(1:8, nrow = 2, ncol = 4, byrow = TRUE)
#>      [,1] [,2] [,3] [,4]
#> [1,]    1    2    3    4
#> [2,]    5    6    7    8

16.2.1 Bracket Notation

The matrix mat is a 2-dimensional object: the 1st dimension corresponds to the rows, while the 2nd dimension corresponds to the columns. Because mat has two dimensions, the bracket notation involves working with matrices in this form: mat[ , ].

Figure 16.2: Bracket notation in matrices

In other words, you have to specify values inside the brackets for the 1st index, and the 2nd index: mat[index1, index2].

Selecting cells

By specifying index vectors in the row and column margins, you can select sets of cells in a matrix. Typically, the input index vectors consist of positive numeric values indicating the position of the row(s) and column(s) to focus on. Likewise, it is also possible to exclude certain rows-and-columns by passing negative numeric indices.

Figure 16.3: Several ways to select cells

Selecting rows

With bracket notation, you can focus just on the rows of a matrix. This is done by specifying a vector for the row index, while leaving empty the position of the columns.

Figure 16.4: Several ways to select rows

Selecting columns

You can also focus just on the columns of a matrix by specifying a vector for the column index, while leaving empty the position of the rows.

Figure 16.5: Several ways to select columns

Row and Column Binding

To combine two or more matrices into a single one, you can use the binding functions rbind() and cbind(). Row binding with rbind() allows you to stack two or more matrices, as long as they have the same number of columns. In turn, column binding with cbind() allows you to join two or more matrices, as long as they have the same number of rows.

Figure 16.6: Combining matrices either by rows or by columns

R objects versus Algebra objects

It is important to distinguish vectors and matrices, especially in R:

In matrix algebra we use the convention that vectors are column vectors (i.e. they are \(n \times 1\) matrices).
In R, a vector with \(n\) elements is not the same as an \(n \times 1\) matrix.
Vectors in R behave more like “row vectors” (especially when displayed).
However, depending on the type of functions you apply to vectors, sometimes R will handle vectors like if they were column vectors.

R matrices versus R data frames

It’s also important to distinguish between a matrix and a data.frame. We haven’t talked that much about data frames (to be formally covered in chapter R Data Frames), but you’ve already worked with them in part III of the book. Both matrices and data frames are tabular structures provided by R, with some similarities and some important differences:

Both objects allow us to store data in a 2-dimensional object.
In many cases, both R matrices and data frames have similar behaviors.
This is mostly the case when they are displayed on the screen.
And sometimes it is hard to distinguish between a matrix and a data frame by just looking at the displayed content on the screen.

About Arrays

Arrays store values in an n-dimensional array
To create an array, give a vector to array() and specify number of dimensions

16.2.2 Basic Matrix Operations

We first quickly go through the basic matrix operations in R

transpose
addition
scalar multiplication
matrix-vector multiplication
matrix-matrix multiplication

16.2.3 Transpose

The transpose of a \(n \times p\) matrix \(\mathbf{X}\) is the \(p \times n\) matrix \(\mathbf{X}^{\mathsf{T}}\). In R the transpose is given by the function t()

# matrix X
X <- matrix(1:6, 2, 3)
X
#>      [,1] [,2] [,3]
#> [1,]    1    3    5
#> [2,]    2    4    6

# transpose of X
t(X)
#>      [,1] [,2]
#> [1,]    1    2
#> [2,]    3    4
#> [3,]    5    6

16.2.4 Matrix Addition

Matrix addition of two matrices \(\mathbf{A} + \mathbf{B}\) is defined when \(\mathbf{A}\) and \(\mathbf{B}\) have the same dimensions:

A <- matrix(1:6, 2, 3)
B <- matrix(7:9, 2, 3)

A + B
#>      [,1] [,2] [,3]
#> [1,]    8   12   13
#> [2,]   10   11   15

16.2.5 Scalar Multiplication

We can multiply a matrix by a scalar using the usual product operator *, moreover it doesn’t matter if we pre-multiply or post-multiply:

X <- matrix(1:3, 3, 4)

# (pre)multiply X by 0.5
(1/2) * X
#>      [,1] [,2] [,3] [,4]
#> [1,]  0.5  0.5  0.5  0.5
#> [2,]  1.0  1.0  1.0  1.0
#> [3,]  1.5  1.5  1.5  1.5

You can also postmultiply by a scalar (although this is not recommended because may confuse readers):

X <- matrix(1:3, 3, 4)

# (post)multiply X by 0.5
X * (1/2)
#>      [,1] [,2] [,3] [,4]
#> [1,]  0.5  0.5  0.5  0.5
#> [2,]  1.0  1.0  1.0  1.0
#> [3,]  1.5  1.5  1.5  1.5

16.2.6 Matrix-Matrix Multiplication

The matrix product operator in R is %*%. We can multiply matrices \(\mathbf{A}\) and \(\mathbf{B}\) if the number of columns of \(\mathbf{A}\) is equal to the number of rows of \(\mathbf{B}\)

A <- matrix(1:6, 2, 3)
B <- matrix(7:9, 3, 2)

A %*% B
#>      [,1] [,2]
#> [1,]   76   76
#> [2,]  100  100

We can multiply matrices \(\mathbf{A}\) and \(\mathbf{B}\) if the number of columns of \(\mathbf{A}\) is equal to the number of rows of \(\mathbf{B}\)

A <- matrix(1:6, 2, 3)
B <- matrix(7:9, 3, 2)

B %*% A
#>      [,1] [,2] [,3]
#> [1,]   21   49   77
#> [2,]   24   56   88
#> [3,]   27   63   99

16.2.7 Cross-Products

A very common type of products in multivariate data analysis are \(\mathbf{X^{\mathsf{T}} X}\) and \(\mathbf{X X^{\mathsf{T}}}\), sometimes known as cross-products:

# cross-product
t(A) %*% A
#>      [,1] [,2] [,3]
#> [1,]    5   11   17
#> [2,]   11   25   39
#> [3,]   17   39   61

# cross-product
A %*% t(A)
#>      [,1] [,2]
#> [1,]   35   44
#> [2,]   44   56

R provides functions crossprod() and tcrossprod() which are formally equivalent to:

\(\texttt{crossprod(X, X)} \equiv \texttt{t(X) \%*\% X}\)
\(\texttt{tcrossprod(X, X)} \equiv \texttt{X \%*\% t(X)}\)

However, crossprod() and tcrossprod() are usually faster than using t() and %*%

16.2.8 Matrix-Vector Multiplication

We can post-multiply an \(n \times p\) matrix \(\mathbf{X}\) with a vector \(\mathbf{b}\) with \(p\) elements. This means making linear combinations (weighted sums) of the columns of \(\mathbf{X}\):

X <- matrix(1:12, 3, 4)
b <- seq(0.25, 1, by = 0.25)

X %*% b
#>      [,1]
#> [1,] 17.5
#> [2,] 20.0
#> [3,] 22.5

In R we can pre-multiply a vector \(\mathbf{a}\) (with \(n\) elements) with an \(n \times p\) matrix \(\mathbf{X}\). This means making linear combinations (weighted sums) of the rows of \(\mathbf{X}\):

X <- matrix(1:12, 3, 4)
a <- 1:3

a %*% X
#>      [,1] [,2] [,3] [,4]
#> [1,]   14   32   50   68

Notice that when we use the product operator %*%, R is smart enough to use the convention that vectors are \(n \times 1\) matrices.

Notice also that if we ask for a vector-matrix multiplication, we can use both formulas:

a %*% X
t(a) %*% X

In case you are curious, here are some other interesting functions for matrices

det(): determinant
diag(): extract or replace the diagonal elements
solve(): solve system of equations
svd(): singular value decomposition
eigen(): eigen-decomposition
qr(): QR decomposition
chol(): Choleski decomposition

16.3 Exercises

1) Given the following vector hp, how would you create the matrix displayed below?

# vector of names
hp <- c("Harry", "Potter", "Ron", "Weasley", "Hermione", "Granger")

     [,1]     [,2]      
[1,] "Harry"  "Weasley" 
[2,] "Potter" "Hermione"
[3,] "Ron"    "Granger"

2) Given the following vector sw, how would you create the matrix displayed below?

# vector of names
sw <- c("Luke", "Skywalker", "Leia", "Organa", "Han", "Solo")

     [,1]   [,2]       
[1,] "Luke" "Skywalker"
[2,] "Leia" "Organa"   
[3,] "Han"  "Solo"

3) Consider the following vectors a1, a2, a3:

a1 <- c(2, 3, 6, 7, 10)
a2 <- c(1.88, 2.05, 1.70, 1.60, 1.78)
a3 <- c(80, 90, 70, 50, 75)

Column-bind the vectors a1, a2, a3 to form the following matrix A:

#>   a1   a2 a3
#> 1  2 1.88 80
#> 2  3 2.05 90
#> 3  6 1.70 70
#> 4  7 1.60 50
#> 5 10 1.78 75

4) Consider the following vectors b1, b2, b3:

b1 <- c(1, 4, 5, 8, 9)
b2 <- c(1.22, 1.05, 3.60, 0.40, 2.54)
b3 <- c(20, 40, 30, 80, 100)

Row-bind the vectors b1, b2, b3 to form the following matrix B:

#>        1     2    3    4      5
#> b1  1.00  4.00  5.0  8.0   9.00
#> b2  1.22  1.05  3.6  0.4   2.54
#> b3 20.00 40.00 30.0 80.0 100.00

5) With matrices A and B created above, use the matrix-multiplication operator %*% and the transpose function t() to compute the matrix products:

\(\mathbf{AB}\)
\(\mathbf{BA}\)
\(\mathbf{A^\mathsf{T} B^\mathsf{T}}\)
\(\mathbf{B^\mathsf{T} A^\mathsf{T}}\)