12.1 Eigenvectors and Eigenvalues

Let’s begin with the classic definition of eigenvectors and eigenvalues. Consider a square matrix \(\mathbf{A}\) of order \(p \times p\). Any nonzero vector \(\mathbf{v}\) such that:

\[ \mathbf{Av} = \lambda \mathbf{v} \]

with \(\lambda\) some scalar, is called an eigenvector of \(\mathbf{A}\), and \(\lambda\) an eigenvalue. (The restriction \(\mathbf{v} \neq \mathbf{0}\) matters: the zero vector would satisfy the equation trivially for any \(\lambda\), whereas \(\lambda = 0\) is a perfectly legitimate eigenvalue.)

For example, consider the following matrix \(\mathbf{A}\) and vector \(\mathbf{v_1}\):

\[ \mathbf{A} = \ \begin{bmatrix} 2 & 3 \\ 4 & 1 \\ \end{bmatrix}, \ \qquad \mathbf{v_1} = \ \begin{bmatrix} 1 \\ 1 \\ \end{bmatrix} \]

The vector resulting from the product \(\mathbf{Av_1}\) is:

\[ \mathbf{A v_1} = \ \begin{bmatrix} 2 & 3 \\ 4 & 1 \\ \end{bmatrix} \ \begin{bmatrix} 1 \\ 1 \\ \end{bmatrix} \ = \begin{bmatrix} 5 \\ 5 \\ \end{bmatrix} = 5 \begin{bmatrix} 1 \\ 1 \\ \end{bmatrix} \]

As you can tell,

\[ \mathbf{Av_1} = 5 \mathbf{v_1} \]

Therefore, \(\mathbf{v_1}\) and \(\lambda_1 = 5\) are an eigenvector and eigenvalue, respectively, of \(\mathbf{A}\).
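As a quick numerical check, here is a minimal sketch in NumPy (assuming only that numpy is available) that verifies the multiplication and asks for the eigen-elements directly:

```python
import numpy as np

A = np.array([[2, 3],
              [4, 1]])
v1 = np.array([1, 1])

print(A @ v1)    # [5 5], i.e. 5 * v1

# np.linalg.eig() returns all eigenvalues and unit-length eigenvectors
evals, evecs = np.linalg.eig(A)
print(evals)     # 5 and -2 (the order is not guaranteed)
print(evecs)     # eigenvectors stored as columns
```

Note that `np.linalg.eig` normalizes its eigenvectors to unit length, so the column paired with \(\lambda = 5\) comes out as a scaled version of \((1, 1)\).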

More formally, if we think of a square matrix \(\mathbf{A}\) as a transformation, an eigenvector \(\mathbf{v}\) of \(\mathbf{A}\) is a special kind of vector: under the transformation \(\mathbf{A}\), it maps into itself or a multiple of itself. Another way to put it is to say that eigenvectors are invariant vectors under a given transformation.

12.1.1 Where do eigenvectors come from?

The definition of an eigenvector and an eigenvalue tells us what they are, but it does not tell us where they come from. To understand how these eigen-elements come into existence, we need to take a closer look at the matrix equation:

\[ \mathbf{Av} = \lambda \mathbf{v} \]

We can rearrange the terms as follows:

\[ \mathbf{Av} - \lambda \mathbf{v} = \mathbf{0} \]

or equivalently:

\[ \mathbf{Av} - \lambda \mathbf{Iv} = \mathbf{0} \]

Notice that \(\mathbf{0}\) is not the scalar 0, but a \(p\)-element vector of zeros. Likewise, the matrix \(\mathbf{I}\) is the \(p \times p\) identity matrix.

Next, we can factor out \(\mathbf{v}\) to get:

\[ \left( \mathbf{A} - \lambda \mathbf{I} \right) \mathbf{v} = \mathbf{0} \]

This matrix equation represents a system of \(p\) homogeneous equations in the \(p\) unknown elements of \(\mathbf{v}\). Such a system always admits the trivial solution \(\mathbf{v} = \mathbf{0}\); a nontrivial solution exists only if the matrix of coefficients has rank smaller than \(p\), that is, only if the determinant of the \(p \times p\) matrix \(( \mathbf{A} - \lambda \mathbf{I} )\) equals zero:

\[ | \mathbf{A} - \lambda \mathbf{I} | = 0 \]

Notice that this is an equation in \(\lambda\) alone, so it can be solved to find the eigenvalues.
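Once an eigenvalue \(\lambda\) is in hand, the corresponding eigenvector is any nontrivial solution of \(( \mathbf{A} - \lambda \mathbf{I} ) \mathbf{v} = \mathbf{0}\); numerically, this amounts to finding the null space of \(\mathbf{A} - \lambda \mathbf{I}\). The sketch below does this with the SVD, taking \(\lambda = 5\) from the opening example (one possible approach, not the only one):

```python
import numpy as np

A = np.array([[2, 3],
              [4, 1]])
lam = 5.0

M = A - lam * np.eye(2)     # (A - lambda I), singular by construction

# right-singular vectors with zero singular value span the null space;
# singular values come out sorted, so the last row of Vt is the one we want
U, s, Vt = np.linalg.svd(M)
v = Vt[-1]

print(s)        # the last singular value is (numerically) zero
print(v)        # proportional to [1, 1]
print(M @ v)    # approximately the zero vector
```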

In the example above, we have:

\[ | \mathbf{A} - \lambda \mathbf{I} | = \left | \begin{bmatrix} 2 & 3 \\ 4 & 1 \\ \end{bmatrix} - \lambda \begin{bmatrix} 1 & 0 \\ 0 & 1 \\ \end{bmatrix} \right | = \left | \begin{matrix} 2 - \lambda & 3 \\ 4 & 1 - \lambda \\ \end{matrix} \right | = 0 \]

The determinant can be rewritten as:

\[ (2 - \lambda) (1 - \lambda) - 12 = 0 \]

or

\[ \lambda^2 - 3\lambda - 10 = 0 \]

with solutions \(\lambda_1 = 5\) and \(\lambda_2 = -2\).
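As a sanity check, we can hand the coefficients \((1, -3, -10)\) to a numerical root finder; NumPy can also recover the characteristic polynomial directly from the matrix (a quick sketch):

```python
import numpy as np

# coefficients of lambda^2 - 3*lambda - 10, highest degree first
print(np.roots([1, -3, -10]))    # roots 5 and -2

# np.poly() goes the other way: from a square matrix to the
# coefficients of its characteristic polynomial
A = np.array([[2, 3],
              [4, 1]])
print(np.poly(A))                # [ 1. -3. -10.]
```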

Expanding the determinant \(| \mathbf{A} - \lambda \mathbf{I} |\) yields a polynomial of degree \(p\) in \(\lambda\); setting it equal to zero gives the so-called characteristic equation, whose roots \(\lambda_1, \lambda_2, \dots, \lambda_p\) are the desired eigenvalues. In general the equation has \(p\) roots, although they need not all be distinct. This means that finding eigenvalues is nothing other than finding the roots of a \(p\)-th degree polynomial. The problem is that, in general, these roots cannot be obtained from the characteristic equation in an analytical way.

So how do people find the roots of the polynomial associated with the characteristic equation? This is a problem that has attracted the attention of mathematicians for centuries. The bad news is that there is no general formula for the roots of a polynomial of degree greater than 4. The good news is that we can use iterative procedures based on the idea of successive approximation or linearization. Roughly speaking, successive approximation starts from one or more initial approximations to a given root and produces a sequence \(l_0, l_1, l_2, \dots\) that presumably converges to the desired root.
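As one concrete illustration of successive approximation (the discussion above does not single out any particular method, so take this as a sketch), the power method repeatedly applies \(\mathbf{A}\) to a starting vector and renormalizes, converging to the eigenvector of the dominant eigenvalue, which for our example matrix is \(\lambda_1 = 5\):

```python
import numpy as np

def power_iteration(A, num_iters=50, seed=0):
    """Approximate the dominant eigenpair of A by repeatedly
    applying A to a vector and renormalizing."""
    v = np.random.default_rng(seed).random(A.shape[0])
    for _ in range(num_iters):
        w = A @ v
        v = w / np.linalg.norm(w)   # renormalize to keep numbers bounded
    lam = v @ A @ v                 # Rayleigh quotient (v has unit length)
    return lam, v

A = np.array([[2.0, 3.0],
              [4.0, 1.0]])
lam, v = power_iteration(A)
print(lam)    # approximately 5
print(v)      # approximately [1, 1] / sqrt(2)
```

Each iteration stretches the component of \(\mathbf{v}\) along the dominant eigenvector by a factor of 5 and the other component by only 2 in absolute value, which is why the sequence converges to the desired eigenvector.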