7.4: Singular Value Decompositions

The Spectral Theorem has animated the past few sections. In particular, we applied the fact that symmetric matrices can be orthogonally diagonalized to simplify quadratic forms, which enabled us to use principal component analysis to reduce the dimension of a dataset. But what can we do with matrices that are not symmetric or even square? For instance, the following matrices are not diagonalizable, much less orthogonally so:

\begin{equation*} \begin{bmatrix} 2 & 1 \\ 0 & 2 \end{bmatrix}, \hspace{24pt} \begin{bmatrix} 1 & 1 & 0 \\ -1 & 0 & 1 \end{bmatrix}. \end{equation*}

In this section, we will develop a description of matrices called the singular value decomposition that is, in many ways, analogous to an orthogonal diagonalization. For example, we have seen that any symmetric matrix can be written in the form \(QDQ^T\) where \(Q\) is an orthogonal matrix and \(D\) is diagonal. A singular value decomposition will have the form \(U\Sigma V^T\) where \(U\) and \(V\) are orthogonal and \(\Sigma\) is diagonal. Most notably, we will see that every matrix has a singular value decomposition whether it's symmetric or not.

Preview Activity 7.4.1.

  1. Suppose that \(A\) is any matrix. Explain why the matrix \(G = A^TA\) is symmetric.
  2. Suppose that \(A = \begin{bmatrix} 1 & 2 \\ -2 & -1 \end{bmatrix}\text{.}\) Find the matrix \(G=A^TA\) and write out the quadratic form \(q_G\left(\twovec{x_1}{x_2}\right)\) as a function of \(x_1\) and \(x_2\text{.}\)
  3. What is the maximum value of \(q_G(\mathbf x)\) and in which direction does it occur?

Finding singular value decompositions

We will begin by explaining what a singular value decomposition is and how we can find one for a given matrix \(A\text{.}\)

Recall how the orthogonal diagonalization of a symmetric matrix is formed: if \(A\) is symmetric, we write \(A = QDQ^T\) where the diagonal entries of \(D\) are the eigenvalues of \(A\) and the columns of \(Q\) are the associated eigenvectors. Moreover, the eigenvalues are related to the maximum and minimum values of the associated quadratic form \(q_A(\mathbf u)\) among all unit vectors.

A general matrix, particularly a matrix that is not square, may not have eigenvalues and eigenvectors, but we can discover analogous features, called singular values and singular vectors, by studying a function somewhat similar to a quadratic form. More specifically, any matrix \(A\) defines a function

\begin{equation*} l_A(\mathbf x) = |A\mathbf x|, \end{equation*}

which measures the length of \(A\mathbf x\text{.}\) For example, the diagonal matrix \(D=\begin{bmatrix} 3 & 0 \\ 0 & -2 \end{bmatrix}\) gives the function \(l_D(\mathbf x) = \sqrt{9x_1^2+4x_2^2}\text{.}\) The presence of the square root means that this function is not a quadratic form. We can, however, define the singular values and vectors by looking for the maximum and minimum of this function \(l_A(\mathbf u)\) among all unit vectors \(\mathbf u\text{.}\)

While \(l_A(\mathbf x)\) is not itself a quadratic form, it becomes one if we square it:

\begin{equation*} \left(l_A(\mathbf x)\right)^2 = |A\mathbf x|^2 = (A\mathbf x)\cdot(A\mathbf x) = \mathbf x\cdot(A^TA\mathbf x)=q_{A^TA}(\mathbf x)\text{.} \end{equation*}

We call \(G=A^TA\) the Gram matrix associated to \(A\) and note that

\begin{equation*} l_A(\mathbf x) = \sqrt{q_G(\mathbf x)}\text{.} \end{equation*}

This is important in the next activity, which introduces singular values and singular vectors.
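If we would like to check this relationship with software, here is a short sketch using Python and NumPy (our own illustration, not part of the text) that verifies \(|A\mathbf x| = \sqrt{q_G(\mathbf x)}\) for the matrix from the preview activity and a randomly chosen vector.

```python
import numpy as np

# Matrix from the preview activity (any matrix would work here).
A = np.array([[1, 2],
              [-2, -1]])
G = A.T @ A                    # Gram matrix G = A^T A; note that G is symmetric

x = np.random.randn(2)         # any vector x
l_A = np.linalg.norm(A @ x)    # l_A(x) = |Ax|
q_G = x @ (G @ x)              # quadratic form q_G(x) = x . (Gx)

print(np.isclose(l_A, np.sqrt(q_G)))   # True: |Ax| = sqrt(q_G(x))
```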

Activity 7.4.2.

The following interactive figure will help us explore singular values and vectors geometrically before we begin a more algebraic approach. This figure is also available at gvsu.edu/s/0YE.

The four sliders at the top of this figure enable us to choose a \(2\times2\) matrix \(A\text{.}\) Below on the left, we see the unit circle and the red unit vector \(\mathbf x\text{,}\) which may be varied by clicking on the head of the vector and dragging it to a new unit vector.

Figure 7.4.1. Singular values, right singular vectors and left singular vectors

Select the matrix \(A=\begin{bmatrix} 1 & 2 \\ -2 & -1 \end{bmatrix}\text{.}\) As we vary the vector \(\mathbf x\text{,}\) we see the vector \(A\mathbf x\) in gray while the height of the blue bar to the right tells us \(l_A(\mathbf x) = |A\mathbf x|\text{.}\)

  1. The first singular value \(\sigma_1\) is the maximum value of \(l_A(\mathbf x)\) and an associated right singular vector \(\mathbf v_1\) is a unit vector describing a direction in which this maximum occurs. Use the diagram to find the first singular value \(\sigma_1\) and an associated right singular vector \(\mathbf v_1\text{.}\)
  2. The second singular value \(\sigma_2\) is the minimum value of \(l_A(\mathbf x)\) and an associated right singular vector \(\mathbf v_2\) is a unit vector describing a direction in which this minimum occurs. Use the diagram to find the second singular value \(\sigma_2\) and an associated right singular vector \(\mathbf v_2\text{.}\)
  3. Here's how we can find the singular values and right singular vectors without using the diagram. Remember that \(l_A(\mathbf x) = \sqrt{q_G(\mathbf x)}\) where \(G=A^TA\) is the Gram matrix associated to \(A\text{.}\) Since \(G\) is symmetric, it is orthogonally diagonalizable. Find \(G\) and an orthogonal diagonalization of it.

  4. Use your work, together with the left singular vectors shown in the figure, to form the matrices
\begin{equation*} U = \begin{bmatrix}\mathbf u_1 & \mathbf u_2 \end{bmatrix}, \hspace{24pt} \Sigma = \begin{bmatrix} \sigma_1 & 0 \\ 0 & \sigma_2 \end{bmatrix}, \hspace{24pt} V = \begin{bmatrix}\mathbf v_1 & \mathbf v_2 \end{bmatrix}\text{,} \end{equation*}
and verify that \(A = U\Sigma V^T\text{.}\)

As this activity shows, the singular values of \(A\) are the maximum and minimum values of \(l_A(\mathbf x)=|A\mathbf x|\) among all unit vectors and the right singular vectors \(\mathbf v_1\) and \(\mathbf v_2\) are the directions in which they occur. The key to finding the singular values and vectors is to utilize the Gram matrix \(G\) and its associated quadratic form \(q_G(\mathbf x)\text{.}\) We will illustrate with some more examples.

Example 7.4.2

We will find a singular value decomposition of the matrix \(A=\begin{bmatrix} 1 & 2 \\ -1 & 2 \end{bmatrix}\text{.}\) Notice that this matrix is not symmetric so it cannot be orthogonally diagonalized.

We begin by constructing the Gram matrix \(G = A^TA = \begin{bmatrix} 2 & 0 \\ 0 & 8 \end{bmatrix}\text{.}\) Since \(G\) is symmetric, it can be orthogonally diagonalized with

\begin{equation*} D = \begin{bmatrix} 8 & 0 \\ 0 & 2 \end{bmatrix},\hspace{24pt} Q = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}\text{.} \end{equation*}

We now know that the maximum value of the quadratic form \(q_G(\mathbf x)\) is 8, which occurs in the direction \(\twovec{0}{1}\text{.}\) Since \(l_A(\mathbf x) = \sqrt{q_G(\mathbf x)}\text{,}\) this tells us that the maximum value of \(l_A(\mathbf x)\text{,}\) the first singular value, is \(\sigma_1=\sqrt{8}\) and that this occurs in the direction of the first right singular vector \(\mathbf v_1=\twovec{0}{1}\text{.}\)

In the same way, we also know that the second singular value \(\sigma_2=\sqrt{2}\) with associated right singular vector \(\mathbf v_2=\twovec{1}{0}\text{.}\)

The first left singular vector \(\mathbf u_1\) is defined by \(A\mathbf v_1 = \twovec{2}{2} = \sigma_1\mathbf u_1\text{.}\) Because \(\sigma_1 = \sqrt{8}\text{,}\) we have \(\mathbf u_1 = \twovec{1/\sqrt{2}}{1/\sqrt{2}}\text{.}\) Notice that \(\mathbf u_1\) is a unit vector because \(\sigma_1 = |A\mathbf v_1|\text{.}\)

In the same way, the second left singular vector is defined by \(A\mathbf v_2 = \twovec{1}{-1} = \sigma_2\mathbf u_2\text{,}\) which gives us \(\mathbf u_2 = \twovec{1/\sqrt{2}}{-1/\sqrt{2}}\text{.}\)

We then construct

\begin{align*} U & = \begin{bmatrix} \mathbf u_1 & \mathbf u_2 \end{bmatrix} = \begin{bmatrix} 1/\sqrt{2} & 1/\sqrt{2} \\ 1/\sqrt{2} & -1/\sqrt{2} \end{bmatrix}\\ \Sigma & = \begin{bmatrix} \sigma_1 & 0 \\ 0 & \sigma_2 \end{bmatrix} = \begin{bmatrix} \sqrt{8} & 0 \\ 0 & \sqrt{2} \end{bmatrix}\\ V & = \begin{bmatrix} \mathbf v_1 & \mathbf v_2 \end{bmatrix} = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix} \end{align*}

We now have \(AV=U\Sigma\) because

\begin{equation*} AV = \begin{bmatrix} A\mathbf v_1 & A\mathbf v_2 \end{bmatrix} = \begin{bmatrix} \sigma_1\mathbf u_1 & \sigma_2\mathbf u_2 \end{bmatrix} = U\Sigma\text{.} \end{equation*}

Because the right singular vectors, the columns of \(V\text{,}\) are eigenvectors of the symmetric matrix \(G\text{,}\) they form an orthonormal basis, which means that \(V\) is orthogonal. Therefore, we have \((AV)V^T = A = U\Sigma V^T\text{.}\) This gives the singular value decomposition

\begin{equation*} A = \begin{bmatrix} 1 & 2 \\ -1 & 2 \end{bmatrix} = \begin{bmatrix} 1/\sqrt{2} & 1/\sqrt{2} \\ 1/\sqrt{2} & -1/\sqrt{2} \end{bmatrix} \begin{bmatrix} \sqrt{8} & 0 \\ 0 & \sqrt{2} \end{bmatrix} \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}^T = U\Sigma V^T\text{.} \end{equation*}

To summarize, we find a singular value decomposition of a matrix \(A\) in the following way:

  • Construct the Gram matrix \(G=A^TA\) and find an orthogonal diagonalization of it.
  • The singular values of \(A\) are the square roots of the eigenvalues of \(G\text{,}\) and the right singular vectors \(\mathbf v_j\) are the associated eigenvectors.
  • The left singular vectors \(\mathbf u_j\) are determined from the relationship \(A\mathbf v_j = \sigma_j\mathbf u_j\text{.}\)

Notice that the left singular vectors found in this way are themselves orthogonal: if \(i\neq j\text{,}\) then

\begin{align*} \sigma_i\sigma_j(\mathbf u_i\cdot\mathbf u_j) & = (\sigma_i\mathbf u_i)\cdot(\sigma_j\mathbf u_j) = (A\mathbf v_i)\cdot(A\mathbf v_j)\\ & = \mathbf v_i\cdot(A^TA\mathbf v_j) \\ & = \mathbf v_i\cdot(G\mathbf v_j) = \lambda_j\mathbf v_i\cdot\mathbf v_j = 0\text{.} \end{align*}
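For readers who would like to experiment, here is a sketch, using Python and NumPy, of how this procedure might be carried out for the matrix of Example 7.4.2. It is our own illustration: it relies on `numpy.linalg.eigh`, which orthogonally diagonalizes a symmetric matrix, and it assumes every singular value is nonzero so that each left singular vector can be found from \(A\mathbf v_j = \sigma_j\mathbf u_j\text{.}\)

```python
import numpy as np

A = np.array([[1., 2.],
              [-1., 2.]])

# Step 1: form the Gram matrix and orthogonally diagonalize it.
G = A.T @ A
eigvals, Q = np.linalg.eigh(G)       # eigh returns eigenvalues in increasing order

# Step 2: sort eigenvalues in decreasing order; singular values are their square roots
# (the eigenvalues of a Gram matrix are nonnegative).
order = np.argsort(eigvals)[::-1]
sigma = np.sqrt(eigvals[order])      # approximately [sqrt(8), sqrt(2)]
V = Q[:, order]                      # right singular vectors as columns

# Step 3: left singular vectors from A v_j = sigma_j u_j (valid when sigma_j != 0).
U = (A @ V) / sigma

Sigma = np.diag(sigma)
print(np.allclose(A, U @ Sigma @ V.T))   # True: A = U Sigma V^T
```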

Example 7.4.3

Let's find a singular value decomposition for the symmetric matrix \(A=\begin{bmatrix} 1 & 2 \\ 2 & 1 \end{bmatrix}\text{.}\) The associated Gram matrix is

\begin{equation*} G = A^TA = \begin{bmatrix} 5 & 4 \\ 4 & 5 \end{bmatrix}\text{,} \end{equation*}

which has an orthogonal diagonalization with

\begin{equation*} D = \begin{bmatrix} 9 & 0 \\ 0 & 1 \end{bmatrix},\hspace{24pt} Q = \begin{bmatrix} 1/\sqrt{2} & 1/\sqrt{2} \\ 1/\sqrt{2} & -1/\sqrt{2} \end{bmatrix}\text{.} \end{equation*}

This gives singular values and vectors

\begin{align*} \sigma_1 = 3, \hspace{24pt} & \mathbf v_1 = \twovec{1/\sqrt{2}}{1/\sqrt{2}}, \hspace{24pt} \mathbf u_1 = \twovec{1/\sqrt{2}}{1/\sqrt{2}},\\ \sigma_2 = 1, \hspace{24pt} & \mathbf v_2 = \twovec{1/\sqrt{2}}{-1/\sqrt{2}}, \hspace{24pt} \mathbf u_2 = \twovec{-1/\sqrt{2}}{1/\sqrt{2}}, \end{align*}

and the singular value decomposition \(A=U\Sigma V^T\) where

\begin{equation*} U = \begin{bmatrix} 1/\sqrt{2} & -1/\sqrt{2} \\ 1/\sqrt{2} & 1/\sqrt{2} \end{bmatrix},\hspace{24pt} \Sigma = \begin{bmatrix} 3 & 0 \\ 0 & 1 \end{bmatrix},\hspace{24pt} V = \begin{bmatrix} 1/\sqrt{2} & 1/\sqrt{2} \\ 1/\sqrt{2} & -1/\sqrt{2} \end{bmatrix}. \end{equation*}

This example is special because \(A\) is symmetric. With a little thought, it's possible to relate this singular value decomposition to an orthogonal diagonalization of \(A\) using the fact that \(G=A^TA = A^2\text{.}\)
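The following short NumPy sketch (ours, for illustration only) compares the eigenvalues and singular values of this symmetric matrix and confirms that the singular values are the absolute values of the eigenvalues.

```python
import numpy as np

A = np.array([[1., 2.],
              [2., 1.]])

eigenvalues = np.linalg.eigvalsh(A)                      # eigenvalues of the symmetric matrix: [-1, 3]
singular_values = np.linalg.svd(A, compute_uv=False)     # singular values: [3, 1]

# For a symmetric matrix, the singular values are the absolute values of the eigenvalues.
print(np.allclose(np.sort(np.abs(eigenvalues)), np.sort(singular_values)))  # True
```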

Activity 7.4.3.

In this activity, we will construct the singular value decomposition of \(A=\begin{bmatrix} 1 & 0 & -1 \\ 1 & 1 & 1 \end{bmatrix}\text{.}\) Notice that this matrix is not square so there are no eigenvalues and eigenvectors associated to it.

  1. Construct the Gram matrix \(G=A^TA\) and find an orthogonal diagonalization of it.
  2. Use your work to identify the singular values and right singular vectors of \(A\text{,}\) and find the left singular vectors using the relationship \(A\mathbf v_j = \sigma_j\mathbf u_j\text{.}\)
  3. Form the matrices \(U\text{,}\) \(V\text{,}\) and
\begin{equation*} \Sigma = \begin{bmatrix} \sigma_1 & 0 & 0 \\ 0 & \sigma_2 & 0 \end{bmatrix} \end{equation*}
and verify that \(A = U\Sigma V^T\text{.}\)

Example 7.4.4

We will find a singular value decomposition of the matrix \(A=\begin{bmatrix} 2 & -2 & 1 \\ -4 & -8 & -8 \end{bmatrix}\text{.}\)

Finding an orthogonal diagonalization of \(G=A^TA\) gives

\begin{equation*} D=\begin{bmatrix} 144 & 0 & 0 \\ 0 & 9 & 0 \\ 0 & 0 & 0 \end{bmatrix},\hspace{24pt} Q=\begin{bmatrix} 1/3 & 2/3 & 2/3 \\ 2/3 & -2/3 & 1/3 \\ 2/3 & 1/3 & -2/3 \end{bmatrix}\text{,} \end{equation*}

which gives singular values \(\sigma_1=\sqrt{144}=12\text{,}\) \(\sigma_2 = \sqrt{9}= 3\text{,}\) and \(\sigma_3 = 0\text{.}\) The right singular vectors \(\mathbf v_i\) appear as the columns of \(Q\) so that \(V = Q\text{.}\) The left singular vectors are found using the relationship \(A\mathbf v_j = \sigma_j\mathbf u_j\text{:}\)

\begin{align*} A\mathbf v_1 = \twovec{0}{-12} = 12\mathbf u_1, \hspace{24pt} & \mathbf u_1 = \twovec{0}{-1}\\ A\mathbf v_2 = \twovec{3}{0} = 3\mathbf u_2, \hspace{24pt} & \mathbf u_2 = \twovec{1}{0}\\ A\mathbf v_3 = \twovec{0}{0}\text{.} & \end{align*}

Notice that it's not possible to find a third left singular vector since \(A\mathbf v_3=\zerovec\text{.}\) We therefore form the matrices

\begin{equation*} U = \begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix},\hspace{24pt} \Sigma = \begin{bmatrix} 12 & 0 & 0 \\ 0 & 3 & 0 \end{bmatrix},\hspace{24pt} V=\begin{bmatrix} 1/3 & 2/3 & 2/3 \\ 2/3 & -2/3 & 1/3 \\ 2/3 & 1/3 & -2/3 \end{bmatrix}\text{,} \end{equation*}

which gives the singular value decomposition \(A=U\Sigma V^T\text{.}\)

Notice that \(U\) is a \(2\times2\) orthogonal matrix because \(A\) has two rows, and \(V\) is a \(3\times3\) orthogonal matrix because \(A\) has three columns.
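If we check this example with software, `numpy.linalg.svd` returns the three pieces of the decomposition directly. Here is a minimal sketch (ours); note that the computed singular vectors may differ from the ones found above by a sign, since singular vectors are only determined up to sign.

```python
import numpy as np

A = np.array([[2., -2., 1.],
              [-4., -8., -8.]])

U, s, Vt = np.linalg.svd(A)     # full SVD: U is 2x2, Vt is 3x3, s holds the singular values

print(s)                         # approximately [12. 3.]
Sigma = np.zeros(A.shape)
Sigma[:len(s), :len(s)] = np.diag(s)     # embed the singular values in a 2x3 matrix
print(np.allclose(A, U @ Sigma @ Vt))    # True: A = U Sigma V^T
```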

As we'll see in the next section, some additional work may be needed to construct the left singular vectors \(\mathbf u_j\) if more of the singular values are zero, but we won't worry about that now. For the time being, let's record our work in the following theorem.

Theorem 7.4.5. The singular value decomposition.

An \(m\times n\) matrix \(A\) may be written as \(A=U\Sigma V^T\) where \(U\) is an orthogonal \(m\times m\) matrix, \(V\) is an orthogonal \(n\times n\) matrix, and \(\Sigma\) is an \(m\times n\) matrix whose entries are zero except for the singular values of \(A\) which appear in decreasing order on the diagonal.

Notice that a singular value decomposition of \(A\) gives us a singular value decomposition of \(A^T\text{.}\) More specifically, if \(A=U\Sigma V^T\text{,}\) then

\begin{equation*} A^T = (U\Sigma V^T)^T = V\Sigma^T U^T. \end{equation*}

Proposition 7.4.6.

If \(A=U\Sigma V^T\text{,}\) then \(A^T = V\Sigma^T U^T\text{.}\) In other words, \(A\) and \(A^T\) share the same singular values, and the left singular vectors of \(A\) are the right singular vectors of \(A^T\) and vice-versa.

As we said earlier, the singular value decomposition should be thought of as a generalization of an orthogonal diagonalization. For instance, the Spectral Theorem tells us that a symmetric matrix can be written as \(QDQ^T\text{.}\) Many matrices are not symmetric, however, and so they are not orthogonally diagonalizable; nevertheless, every matrix has a singular value decomposition \(U\Sigma V^T\text{.}\) The price of this generalization is that we usually have two sets of singular vectors that form the orthogonal matrices \(U\) and \(V\) whereas a symmetric matrix has a single set of eigenvectors that form the orthogonal matrix \(Q\text{.}\)

The structure of singular value decompositions

Now that we have an understanding of what a singular value decomposition is and how to construct it, let's explore the ways in which a singular value decomposition reveals the underlying structure of the matrix. As we'll see, the matrices \(U\) and \(V\) in a singular value decomposition provide convenient bases for some important subspaces, such as the column and null spaces of the matrix. This observation will provide the key to some of our uses of these decompositions in the next section.

Activity 7.4.4.

Let's suppose that a matrix \(A\) has a singular value decomposition \(A=U\Sigma V^T\) where

\begin{equation*} U=\begin{bmatrix} \mathbf u_1 & \mathbf u_2 & \mathbf u_3 & \mathbf u_4 \end{bmatrix},\hspace{24pt} \Sigma = \begin{bmatrix} 20 & 0 & 0 \\ 0 & 5 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix},\hspace{24pt} V=\begin{bmatrix} \mathbf v_1 & \mathbf v_2 & \mathbf v_3 \end{bmatrix}. \end{equation*}

  1. What are the dimensions of \(A\text{,}\) that is, how many rows and columns does \(A\) have?
  2. Suppose we write a three-dimensional vector \(\mathbf x\) as a linear combination of right singular vectors,
\begin{equation*} \mathbf x = c_1\mathbf v_1 + c_2\mathbf v_2 + c_3\mathbf v_3\text{,} \end{equation*}
and suppose that \(A\mathbf x = \mathbf b\text{.}\) Explain why
\begin{equation*} A\mathbf x = 20c_1\mathbf u_1 + 5c_2\mathbf u_2 = \mathbf b\text{.} \end{equation*}

This activity shows how a singular value decomposition of a matrix encodes important information about its null and column spaces. This is, in fact, the key observation that makes singular value decompositions so useful: the left and right singular vectors provide orthonormal bases for \(Nul(A)\) and \(Col(A)\text{.}\)

Example 7.4.7

Suppose we have a singular value decomposition \(A=U\Sigma V^T\) where \(\Sigma = \begin{bmatrix} \sigma_1 & 0 & 0 & 0 & 0 \\ 0 & \sigma_2 & 0 & 0 & 0 \\ 0 & 0 & \sigma_3 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix}\text{.}\) This means that \(A\) has four rows and five columns just as \(\Sigma\) does.

As in the activity, if \(\mathbf x = c_1 \mathbf v_1 + c_2\mathbf v_2 + \ldots + c_5\mathbf v_5\text{,}\) we have

\begin{equation*} A\mathbf x = \sigma_1c_1\mathbf u_1 + \sigma_2c_2\mathbf u_2 + \sigma_3c_3\mathbf u_3\text{.} \end{equation*}

If \(\mathbf b\) is in \(Col(A)\text{,}\) then \(\mathbf b\) must have the form

\begin{equation*} \mathbf b = \sigma_1c_1\mathbf u_1 + \sigma_2c_2\mathbf u_2 + \sigma_3c_3\mathbf u_3\text{,} \end{equation*}

which says that \(\mathbf b\) is a linear combination of \(\mathbf u_1\text{,}\) \(\mathbf u_2\text{,}\) and \(\mathbf u_3\text{.}\) These three vectors therefore form a basis for \(Col(A)\text{.}\) In fact, since they are columns in the orthogonal matrix \(U\text{,}\) they form an orthonormal basis for \(Col(A)\text{.}\)

Remembering that \(Rank(A)=\dim Col(A)\text{,}\) we see that \(Rank(A) = 3\text{,}\) which results from the three nonzero singular values. In general, the rank \(r\) of a matrix \(A\) equals the number of nonzero singular values, and \(\mathbf u_1, \mathbf u_2, \ldots,\mathbf u_r\) form an orthonormal basis for \(Col(A)\text{.}\)
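In floating-point computations, "nonzero" is usually interpreted as "larger than a small tolerance," which is essentially how software estimates rank from an SVD. Here is a sketch of the idea (the helper function and tolerance are our own choices, not a universal standard):

```python
import numpy as np

def rank_from_svd(A, tol=1e-10):
    """Estimate Rank(A) by counting singular values larger than a tolerance."""
    s = np.linalg.svd(A, compute_uv=False)
    return int(np.sum(s > tol))

A = np.array([[2., -2., 1.],
              [-4., -8., -8.]])
print(rank_from_svd(A))    # 2, matching the two nonzero singular values 12 and 3
```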

Moreover, if \(\mathbf x = c_1 \mathbf v_1 + c_2\mathbf v_2 + \ldots + c_5\mathbf v_5\) satisfies \(A\mathbf x = \zerovec\text{,}\) then

\begin{equation*} A\mathbf x = \sigma_1c_1\mathbf u_1 + \sigma_2c_2\mathbf u_2 + \sigma_3c_3\mathbf u_3=\zerovec\text{,} \end{equation*}

which implies that \(c_1=0\text{,}\) \(c_2=0\text{,}\) and \(c_3=0\text{.}\) Therefore, \(\mathbf x = c_4\mathbf v_4+c_5\mathbf v_5\) so \(\mathbf v_4\) and \(\mathbf v_5\) form an orthonormal basis for \(Nul(A)\text{.}\)

More generally, if \(A\) is an \(m\times n\) matrix and if \(Rank(A) = r\text{,}\) the last \(n-r\) right singular vectors form an orthonormal basis for \(Nul(A)\text{.}\)

Generally speaking, if the rank of an \(m\times n\) matrix \(A\) is \(r\text\) then there are \(r\) nonzero singular values and \(\Sigma\) has the form

\begin{equation*} \begin{bmatrix} \sigma_1 & \ldots & 0 & \ldots & 0 \\ \vdots & \ddots & \vdots & & \vdots \\ 0 & \ldots & \sigma_r & \ldots & 0 \\ \vdots & & \vdots & \ddots & \vdots \\ 0 & \ldots & 0 & \ldots & 0 \end{bmatrix}\text{.} \end{equation*}

The first \(r\) columns of \(U\) form an orthonormal basis for \(Col(A)\text{,}\)

\begin{equation*} U = \left[ \underbrace{\mathbf u_1 \ \ldots\ \mathbf u_r}_{Col(A)} \hspace{10pt} \mathbf u_{r+1} \ \ldots\ \mathbf u_m \right] \end{equation*}

and the last \(n-r\) columns of \(V\) form an orthonormal basis for \(Nul(A)\text{,}\)

\begin{equation*} V = \left[ \mathbf v_1 \ \ldots\ \mathbf v_r \hspace{10pt} \underbrace{\mathbf v_{r+1} \ \ldots\ \mathbf v_n}_{Nul(A)} \right]\text{.} \end{equation*}

In fact, we can say more. Remember that Proposition 7.4.6 says that \(A\) and its transpose \(A^T\) share the same singular values. Since the rank of a matrix equals its number of nonzero singular values, this means that \(Rank(A)=Rank(A^T)\text{,}\) a fact that we cited back in Section 6.2.

Proposition 7.4.8.

For any matrix \(A\text{,}\)

\begin{equation*} Rank(A) = Rank(A^T)\text{.} \end{equation*}

If we have a singular value decomposition of an \(m\times n\) matrix \(A=U\Sigma V^T\text{,}\) Proposition 7.4.6 also tells us that the left singular vectors of \(A\) are the right singular vectors of \(A^T\text{.}\) Therefore, \(U\) is the \(m\times m\) matrix whose columns are the right singular vectors of \(A^T\text{.}\) This means that the last \(m-r\) vectors form an orthonormal basis for \(Nul(A^T)\text{.}\) Therefore, the columns of \(U\) provide orthonormal bases for \(Col(A)\) and \(Nul(A^T)\text{,}\)

\begin{equation*} U = \left[ \underbrace{\mathbf u_1 \ \ldots\ \mathbf u_r}_{Col(A)} \hspace{10pt} \underbrace{\mathbf u_{r+1} \ \ldots\ \mathbf u_m}_{Nul(A^T)} \right]\text{.} \end{equation*}

This reflects the familiar fact that \(Nul(A^T)\) is the orthogonal complement of \(Col(A)\text{.}\)

In the same way, \(V\) is the \(n\times n\) matrix whose columns are the left singular vectors of \(A^T\text{,}\) which means that the first \(r\) vectors form an orthonormal basis for \(Col(A^T)\text{.}\) Because the columns of \(A^T\) are the rows of \(A\text{,}\) this subspace is sometimes called the row space of \(A\) and denoted \(Row(A)\text{.}\) While we have yet to have an occasion to use \(Row(A)\text{,}\) there are times when it is important to have an orthonormal basis for it. Fortunately, a singular value decomposition provides just that. To summarize, the columns of \(V\) provide orthonormal bases for \(Col(A^T)\) and \(Nul(A)\text{,}\)

\begin{equation*} V = \left[ \underbrace{\mathbf v_1 \ \ldots\ \mathbf v_r}_{Col(A^T)} \hspace{10pt} \underbrace{\mathbf v_{r+1} \ \ldots\ \mathbf v_n}_{Nul(A)} \right]\text{.} \end{equation*}

Considered altogether, the subspaces \(Col(A)\text{,}\) \(Nul(A)\text{,}\) \(Col(A^T)\text{,}\) and \(Nul(A^T)\) are called the four fundamental subspaces associated to \(A\text{.}\) In addition to telling us the rank of a matrix, a singular value decomposition gives us orthonormal bases for all four fundamental subspaces.

Theorem 7.4.9.

Suppose \(A\) is an \(m\times n\) matrix having a singular value decomposition \(A=U\Sigma V^T\text{.}\) Then

  • \(r=Rank(A)\) is the number of nonzero singular values.
  • The columns \(\mathbf u_1,\mathbf u_2,\ldots,\mathbf u_r\) form an orthonormal basis for \(Col(A)\text{.}\)
  • The columns \(\mathbf u_{r+1},\ldots,\mathbf u_m\) form an orthonormal basis for \(Nul(A^T)\text{.}\)
  • The columns \(\mathbf v_1,\mathbf v_2,\ldots,\mathbf v_r\) form an orthonormal basis for \(Col(A^T)\text{.}\)
  • The columns \(\mathbf v_{r+1},\ldots,\mathbf v_n\) form an orthonormal basis for \(Nul(A)\text{.}\)

When we previously outlined a procedure for finding a singular value decomposition of an \(m\times n\) matrix \(A\text{,}\) we found the left singular vectors \(\mathbf u_j\) using the expression \(A\mathbf v_j = \sigma_j\mathbf u_j\text{.}\) This produces left singular vectors \(\mathbf u_1, \mathbf u_2,\ldots,\mathbf u_r\text{,}\) where \(r=Rank(A)\text{.}\) If \(r\lt m\text{,}\) however, we still need to find the left singular vectors \(\mathbf u_{r+1},\ldots,\mathbf u_m\text{.}\) Theorem 7.4.9 tells us how to do that: because those vectors form an orthonormal basis for \(Nul(A^T)\text{,}\) we can find them by solving \(A^T\mathbf x = \zerovec\) to find a basis for \(Nul(A^T)\) and applying the Gram-Schmidt algorithm.

We won't worry about this issue too much, however, as we will frequently use software to find singular value decompositions for us.
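Since we will usually rely on software, here is a hedged sketch showing how the orthonormal bases described in Theorem 7.4.9 can be read off from the output of `numpy.linalg.svd`; the helper function and the tolerance for deciding which singular values count as nonzero are our own choices.

```python
import numpy as np

def fundamental_subspace_bases(A, tol=1e-10):
    """Return orthonormal bases (as columns) for Col(A), Nul(A^T), Col(A^T), Nul(A)."""
    U, s, Vt = np.linalg.svd(A)      # full SVD, so U is m x m and Vt is n x n
    r = int(np.sum(s > tol))         # rank = number of nonzero singular values
    col_A  = U[:, :r]                # first r left singular vectors
    nul_AT = U[:, r:]                # last m - r left singular vectors
    col_AT = Vt[:r, :].T             # first r right singular vectors
    nul_A  = Vt[r:, :].T             # last n - r right singular vectors
    return col_A, nul_AT, col_AT, nul_A

A = np.array([[2., -2., 1.],
              [-4., -8., -8.]])
col_A, nul_AT, col_AT, nul_A = fundamental_subspace_bases(A)
print(col_A.shape, nul_AT.shape, col_AT.shape, nul_A.shape)   # (2, 2) (2, 0) (3, 2) (3, 1)
```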

Reduced singular value decompositions

As we'll see in the next section, there are times when it is helpful to express a singular value decomposition in a slightly different form.

Activity 7.4.5.

Suppose we have a singular value decomposition \(A = U\Sigma V^T\) where

\begin{equation*} U = \begin{bmatrix} \mathbf u_1 & \mathbf u_2 & \mathbf u_3 & \mathbf u_4 \end{bmatrix},\hspace{24pt} \Sigma = \begin{bmatrix} 18 & 0 & 0 \\ 0 & 4 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix},\hspace{24pt} V = \begin{bmatrix} \mathbf v_1 & \mathbf v_2 & \mathbf v_3 \end{bmatrix}\text{.} \end{equation*}

  1. What are the dimensions of \(A\text{?}\) What is \(Rank(A)\text{?}\)
  2. Identify bases for \(Col(A)\) and \(Col(A^T)\text{.}\)
  3. Explain why
\begin{equation*} U\Sigma = \begin{bmatrix} \mathbf u_1 & \mathbf u_2 \end{bmatrix} \begin{bmatrix} 18 & 0 & 0 \\ 0 & 4 & 0 \end{bmatrix}\text{.} \end{equation*}
  4. Explain why
\begin{equation*} \begin{bmatrix} 18 & 0 & 0 \\ 0 & 4 & 0 \end{bmatrix}V^T = \begin{bmatrix} 18 & 0 \\ 0 & 4 \end{bmatrix} \begin{bmatrix} \mathbf v_1 & \mathbf v_2 \end{bmatrix}^T\text{,} \end{equation*}
and conclude that \(A = \begin{bmatrix} \mathbf u_1 & \mathbf u_2 \end{bmatrix} \begin{bmatrix} 18 & 0 \\ 0 & 4 \end{bmatrix} \begin{bmatrix} \mathbf v_1 & \mathbf v_2 \end{bmatrix}^T\text{.}\)

We call this a reduced singular value decomposition.

Proposition 7.4.10. Reduced singular value decomposition.

If \(A\) is an \(m\times n\) matrix having rank \(r\text{,}\) then \(A=U_r \Sigma_r V_r^T\) where

  • \(U_r\) is an \(m\times r\) matrix whose columns form an orthonormal basis for \(Col(A)\text{,}\)
  • \(\Sigma_r=\begin{bmatrix} \sigma_1 & 0 & \ldots & 0 \\ 0 & \sigma_2 & \ldots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \ldots & \sigma_r \end{bmatrix}\) is an \(r\times r\) diagonal, invertible matrix, and
  • \(V_r\) is an \(n\times r\) matrix whose columns form an orthonormal basis for \(Col(A^T)\text{.}\)

Example 7.4.11

In Example 7.4.4, we found the singular value decomposition

\begin{equation*} A=\begin{bmatrix} 2 & -2 & 1 \\ -4 & -8 & -8 \end{bmatrix} = \begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix} \begin{bmatrix} 12 & 0 & 0 \\ 0 & 3 & 0 \end{bmatrix} \begin{bmatrix} 1/3 & 2/3 & 2/3 \\ 2/3 & -2/3 & 1/3 \\ 2/3 & 1/3 & -2/3 \end{bmatrix}^T\text{.} \end{equation*}

Since there are two nonzero singular values, \(Rank(A) =2\) so that the reduced singular value decomposition is

\begin{equation*} A=\begin{bmatrix} 2 & -2 & 1 \\ -4 & -8 & -8 \end{bmatrix} = \begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix} \begin{bmatrix} 12 & 0 \\ 0 & 3 \end{bmatrix} \begin{bmatrix} 1/3 & 2/3 \\ 2/3 & -2/3 \\ 2/3 & 1/3 \end{bmatrix}^T\text{.} \end{equation*}
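For completeness, NumPy can produce a factorization of this shape directly: passing `full_matrices=False` to `numpy.linalg.svd` keeps only \(\min(m,n)\) singular values, which agrees with the reduced decomposition of Proposition 7.4.10 here because the rank of this matrix equals \(\min(m,n)=2\text{.}\) A small sketch (ours):

```python
import numpy as np

A = np.array([[2., -2., 1.],
              [-4., -8., -8.]])

# Thin SVD: U_r is 2x2, Vt_r is 2x3, keeping only min(m, n) = 2 singular values.
# Here the rank is also 2, so this is exactly the reduced SVD of Proposition 7.4.10;
# if some singular values were zero, we would truncate those columns as well.
U_r, s, Vt_r = np.linalg.svd(A, full_matrices=False)

print(s)                                          # approximately [12. 3.]
print(np.allclose(A, U_r @ np.diag(s) @ Vt_r))    # True
```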

Summary

This section has explored singular value decompositions, how to find them, and how they organize important information about a matrix.

  • A singular value decomposition of a matrix \(A\) is a factorization \(A=U\Sigma V^T\text{.}\) The matrix \(\Sigma\) has the same shape as \(A\text{,}\) and its only nonzero entries are the singular values of \(A\text{,}\) which appear in decreasing order on the diagonal. The matrices \(U\) and \(V\) are orthogonal and contain the left and right singular vectors, respectively.
  • To find a singular value decomposition of a matrix, we construct the Gram matrix \(G=A^TA\text{,}\) which is symmetric. The singular values of \(A\) are the square roots of the eigenvalues of \(G\text{,}\) and the right singular vectors \(\mathbf v_j\) are the associated eigenvectors of \(G\text{.}\) The left singular vectors \(\mathbf u_j\) are determined from the relationship \(A\mathbf v_j=\sigma_j\mathbf u_j\text{.}\)
  • A singular value decomposition organizes fundamental information about a matrix. For instance, the number of nonzero singular values is the rank \(r\) of the matrix. The first \(r\) left singular vectors form an orthonormal basis for \(Col(A)\) with the remaining left singular vectors forming an orthonormal basis of \(Nul(A^T)\text{.}\) The first \(r\) right singular vectors form an orthonormal basis for \(Col(A^T)\) while the remaining right singular vectors form an orthonormal basis of \(Nul(A)\text{.}\)
  • If \(A\) is a rank \(r\) matrix, we can write a reduced singular value decomposition as \(A=U_r\Sigma_rV_r^T\) where the columns of \(U_r\) form an orthonormal basis for \(Col(A)\text{,}\) the columns of \(V_r\) form an orthonormal basis for \(Col(A^T)\text{,}\) and \(\Sigma_r\) is an \(r\times r\) diagonal, invertible matrix.

Exercises

1

Consider the matrix \(A = \begin{bmatrix} 1 & 2 & 1 \\ 0 & -1 & 2 \end{bmatrix}\text{.}\)

  1. Find the Gram matrix \(G=A^TA\) and use it to find the singular values and right singular vectors of \(A\text{.}\)
  2. Find the left singular vectors.
  3. Form the matrices \(U\text{,}\) \(\Sigma\text{,}\) and \(V\) and verify that \(A=U\Sigma V^T\text{.}\)
  4. What is \(Rank(A)\) and what does this say about \(Col(A)\text{?}\)
  5. Determine an orthonormal basis for \(Nul(A)\text{.}\)
2

Find singular value decompositions for the following matrices:

  1. \(\begin{bmatrix} 0 & 0 \\ 0 & -8 \end{bmatrix}\text{.}\)
  2. \(\begin{bmatrix} 2 & 3 \\ 0 & 2 \end{bmatrix}\text{.}\)
  3. \(\displaystyle \begin{bmatrix} 4 & 0 & 0 \\ 0 & 0 & 2 \end{bmatrix}\)
  4. \(\displaystyle \begin{bmatrix} 4 & 0 \\ 0 & 0 \\ 0 & 2 \end{bmatrix}\)
3

Consider the matrix \(A = \begin{bmatrix} 2 & 1 \\ 1 & 2 \end{bmatrix}\text{.}\)

  1. Find a singular value decomposition of \(A\) and verify that it is also an orthogonal diagonalization of \(A\text{.}\)
  2. If \(A\) is a symmetric, positive semidefinite matrix, explain why a singular value decomposition of \(A\) is an orthogonal diagonalization of \(A\text{.}\)
4

Suppose that the matrix \(A\) has the singular value decomposition

\begin{equation*} \begin{bmatrix} -0.46 & 0.52 & 0.46 & 0.55 \\ -0.82 & 0.00 & -0.14 & -0.55 \\ -0.04 & 0.44 & -0.85 & 0.28 \\ -0.34 & -0.73 & -0.18 & 0.55 \end{bmatrix} \begin{bmatrix} 6.2 & 0.0 & 0.0 \\ 0.0 & 4.1 & 0.0 \\ 0.0 & 0.0 & 0.0 \\ 0.0 & 0.0 & 0.0 \end{bmatrix} \begin{bmatrix} -0.74 & 0.62 & -0.24 \\ 0.28 & 0.62 & 0.73 \\ -0.61 & -0.48 & 0.64 \end{bmatrix}. \end{equation*}

  1. What are the dimensions of \(A\text{?}\)
  2. What is \(Rank(A)\text{?}\)
  3. Find orthonormal bases for \(Col(A)\text{,}\) \(Nul(A)\text{,}\) \(Col(A^T)\text{,}\) and \(Nul(A^T)\text{.}\)
  4. Find the orthogonal projection of \(\mathbf b=\fourvec102\) onto \(Col(A)\text{.}\)
5

Consider the matrix \(A = \begin{bmatrix} 1 & 0 & -1 \\ 2 & 2 & 0 \\ -1 & 1 & 2 \end{bmatrix}\text{.}\)

  1. Construct the Gram matrix \(G\) and use it to find the singular values and right singular vectors \(\mathbf v_1\text{,}\) \(\mathbf v_2\text{,}\) and \(\mathbf v_3\) of \(A\text{.}\) What are the matrices \(\Sigma\) and \(V\) in a singular value decomposition?
  2. What is \(Rank(A)\text{?}\)
  3. Find as many left singular vectors \(\mathbf u_j\) as you can using the relationship \(A\mathbf v_j=\sigma_j\mathbf u_j\text{.}\)
  4. Find an orthonormal basis for \(Nul(A^T)\) and use it to construct the matrix \(U\) so that \(A=U\Sigma V^T\text{.}\)
  5. State an orthonormal basis for \(Nul(A)\) and an orthonormal basis for \(Col(A)\text{.}\)
6

Consider the matrix \(B=\begin{bmatrix} 1 & 0 \\ 2 & -1 \\ 1 & 2 \end{bmatrix}\) and notice that \(B=A^T\) where \(A\) is the matrix in Exercise 7.4.5.1.

  1. Use your result from Exercise 7.4.5.1 to find a singular value decomposition of \(B=U\Sigma V^T\text{.}\)
  2. What is \(Rank(B)\text{?}\) Determine a basis for \(Col(B)\) and \(Col(B)^\perp\text{.}\)
  3. Suppose that \(\mathbf b=\threevec47\text{.}\) Use the bases you found in the previous part of this exercise to write \(\mathbf b=\widehat{\mathbf b}+\mathbf b^\perp\text{,}\) where \(\widehat{\mathbf b}\) is in \(Col(B)\) and \(\mathbf b^\perp\) is in \(Col(B)^\perp\text{.}\)
  4. Find the least squares approximate solution to the equation \(B\mathbf x=\mathbf b\text{.}\)
7

Suppose that \(A\) is a square \(m\times m\) matrix with singular value decomposition \(A=U\Sigma V^T\text{.}\)

  1. If \(A\) is invertible, find a singular value decomposition of \(A^{-1}\text{.}\)
  2. What condition on the singular values must hold for \(A\) to be invertible?
  3. How are the singular values of \(A\) and the singular values of \(A^{-1}\) related to one another?
  4. How are the right and left singular vectors of \(A\) related to the right and left singular vectors of \(A^{-1}\text{?}\)
8
  1. If \(Q\) is an orthogonal matrix, remember that \(Q^TQ=I\text{.}\) Explain why \(\det Q = \pm 1\text{.}\)
  2. If \(A=U\Sigma V^T\) is a singular value decomposition of a square matrix \(A\text{,}\) explain why \(|\det A|\) is the product of the singular values of \(A\text{.}\)
  3. What does this say about the singular values of \(A\) if \(A\) is invertible?
9

If \(A\) is a matrix and \(G=A^TA\) its Gram matrix, remember that

\begin{equation*} \mathbf x\cdot(G\mathbf x) = \mathbf x\cdot(A^TA\mathbf x) = (A\mathbf x)\cdot(A\mathbf x) = |A\mathbf x|^2. \end{equation*}

  1. For a general matrix \(A\text{,}\) explain why the eigenvalues of \(G\) are nonnegative.
  2. Given a symmetric matrix \(A\) having an eigenvalue \(\lambda\text{,}\) explain why \(\lambda^2\) is an eigenvalue of \(G\text{.}\)
  3. If \(A\) is symmetric, explain why the singular values of \(A\) equal the absolute value of its eigenvalues: \(\sigma_j = |\lambda_j|\text{.}\)
10

Determine whether the following statements are true or false and explain your reasoning.

  1. If \(A=U\Sigma V^T\) is a singular value decomposition of \(A\text{,}\) then \(G=V(\Sigma^T\Sigma)V^T\) is an orthogonal diagonalization of its Gram matrix.
  2. If \(A=U\Sigma V^T\) is a singular value decomposition of a rank 2 matrix \(A\text{,}\) then \(\mathbf v_1\) and \(\mathbf v_2\) form an orthonormal basis for the column space \(Col(A)\text{.}\)
  3. If \(A\) is a diagonalizable matrix, then its set of singular values is the same as its set of eigenvalues.
  4. If \(A\) is a \(10\times7\) matrix and \(\sigma_7 = 4\text{,}\) then the columns of \(A\) are linearly independent.
  5. The Gram matrix is always orthogonally diagonalizable.
11

Suppose that \(A=U\Sigma V^T\) is a singular value decomposition of the \(m\times n\) matrix \(A\text{.}\) If \(\sigma_1,\ldots,\sigma_r\) are the nonzero singular values, the general form of the matrix \(\Sigma\) is

\begin{equation*} \Sigma = \begin{bmatrix} \sigma_1 & \ldots & 0 & \ldots & 0 \\ \vdots & \ddots & \vdots & & \vdots \\ 0 & \ldots & \sigma_r & \ldots & 0 \\ \vdots & & \vdots & \ddots & \vdots \\ 0 & \ldots & 0 & \ldots & 0 \end{bmatrix}. \end{equation*}

  1. If you know that the columns of \(A\) are linearly independent, what more can you say about the form of \(\Sigma\text{?}\)
  2. If you know that the columns of \(A\) span \(\mathbb R^m\text{,}\) what more can you say about the form of \(\Sigma\text{?}\)
  3. If you know that the columns of \(A\) are linearly independent and span \(\mathbb R^m\text{,}\) what more can you say about the form of \(\Sigma\text{?}\)

This page titled 7.4: Singular Value Decompositions is shared under a CC BY 4.0 license and was authored, remixed, and/or curated by David Austin via source content that was edited to the style and standards of the LibreTexts platform.
