1.3 An Introduction to Singular Value Decomposition (SVD) and the Theory of Total Least Squares (TLS)

1.3.1 Singular Value Decomposition
As has been described in [https://davetang.org/file/Singular_Value_Decomposition_Tutorial.pdf], “Singular value decomposition (SVD) can be looked at from three mutually compatible points of view. On the one hand, we can see it as a method for transforming correlated variables into a set of uncorrelated ones that better expose the various relationships among the original data items. At the same time, SVD is a method for identifying and ordering the dimensions along which data points exhibit the most variation. This ties in to the third way of viewing SVD, which is that once we have identified where the most variation is, it's possible to find the best approximation of the original data points using fewer dimensions. Hence, SVD can be seen as a method for data reduction.” We shall illustrate this last point with an example later on.
First, we will introduce a critical component in solving a Total Least Squares problem: the Singular Value Decomposition (SVD). The singular value decomposition is one of the most important concepts in linear algebra [2]. To start, we first need to understand what eigenvectors and eigenvalues are as related to a dynamic system. If we multiply a vector x by a matrix [A], we get a new vector, [A]x. The next equation, (1.16), shows the simple relation between a matrix [A], an eigenvector x, an eigenvalue λ (just a number), and the original x:

\[
[A]\,x = \lambda\,x \tag{1.16}
\]
[A] is assumed to be a square matrix, x is the eigenvector, and λ is a value called the eigenvalue. Normally, when any vector x is multiplied by any matrix [A], a new vector results with components pointing in different directions than the original x. However, eigenvectors are special vectors that come out in the same direction even after they are multiplied by the matrix [A]. From (1.16) we can see that when one multiplies an eigenvector by [A], the new vector [A]x is just the eigenvalue λ times the original x. This eigenvalue determines whether the vector x is shrunk, stretched, reversed, or unchanged. Eigenvectors and eigenvalues play crucial roles in linear algebra, ranging from simplifying matrix algebra, such as taking the 500th power of [A], to solving differential equations. To take the 500th power of [A], one only needs to find the eigenvalues and eigenvectors of [A] and take the 500th power of the eigenvalues. The eigenvectors do not change direction, and combining them with the 500th power of the eigenvalues yields [A]^{500}. As we will see in later sections, the eigenvalues can also provide important parameters of a system transfer function, such as the poles.
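As a simple numerical illustration of these two facts, the following Python/NumPy sketch checks the relation (1.16) for an arbitrary 2 × 2 matrix (its entries are chosen purely for illustration) and then forms a high power of [A] from its eigenvalues and eigenvectors.

```python
import numpy as np

# A small square matrix whose eigen-decomposition we can inspect directly.
# (Illustrative values only; any diagonalizable square matrix would do.)
A = np.array([[4.0, 1.0],
              [2.0, 3.0]])

eigvals, eigvecs = np.linalg.eig(A)   # columns of eigvecs are the eigenvectors x

# Check the defining relation (1.16): [A] x = lambda x for each pair.
for lam, x in zip(eigvals, eigvecs.T):
    assert np.allclose(A @ x, lam * x)

# A high power of [A] via its eigenvalues: [A]^n = [S] [Lambda]^n [S]^-1.
n = 50   # 500 would overflow double precision for this matrix, so use 50 here
S = eigvecs
Lambda_n = np.diag(eigvals ** n)
A_n = S @ Lambda_n @ np.linalg.inv(S)

assert np.allclose(A_n, np.linalg.matrix_power(A, n))
```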
One way to characterize and extract the eigenvalues of a matrix [A] is to diagonalize it. Diagonalizing a matrix not only provides a quick way to extract the eigenvalues, but also makes important parameters such as the rank and dimension of the matrix easy to determine. To diagonalize the matrix [A], the eigenvalues of [A] must first be placed in a diagonal matrix, [Λ]. This is accomplished by forming an eigenvector matrix [S], with the eigenvectors of [A] placed in the columns of [S], and multiplying as

\[
[S]^{-1}[A][S] = [\Lambda] \tag{1.17}
\]

Equation (1.17) can now be rearranged so that [A] can also be written as

\[
[A] = [S][\Lambda][S]^{-1} \tag{1.18}
\]
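As a concrete worked check of (1.17) and (1.18), take the same illustrative 2 × 2 matrix used in the sketch above. Its eigenvalues are 5 and 2 with eigenvectors (1, 1) and (1, −2), so that

\[
[A] = \begin{bmatrix} 4 & 1 \\ 2 & 3 \end{bmatrix},\qquad
[S] = \begin{bmatrix} 1 & 1 \\ 1 & -2 \end{bmatrix},\qquad
[\Lambda] = \begin{bmatrix} 5 & 0 \\ 0 & 2 \end{bmatrix},
\]
\[
[S][\Lambda][S]^{-1}
= \begin{bmatrix} 1 & 1 \\ 1 & -2 \end{bmatrix}
\begin{bmatrix} 5 & 0 \\ 0 & 2 \end{bmatrix}
\cdot\tfrac{1}{3}\begin{bmatrix} 2 & 1 \\ 1 & -1 \end{bmatrix}
= \begin{bmatrix} 4 & 1 \\ 2 & 3 \end{bmatrix} = [A],
\]

in agreement with (1.18), and multiplying out [S]^{-1}[A][S] likewise recovers the diagonal matrix [Λ] of (1.17).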
We start to encounter problems when matrices are not square but rectangular. Previously we assumed that [A] was an n by n square matrix. Now we will assume [A] is any m by n rectangular matrix. We would still like to simplify the matrix or “diagonalize” it, but using [S]^{-1}[A][S] is no longer ideal for a few reasons: the eigenvectors in [S] are not always orthogonal, there are sometimes not enough eigenvectors, and using [A]x = λx requires [A] to be a square matrix. However, this problem can be solved with the singular value decomposition, but of course at a cost. The SVD of [A] results in the following

\[
[A]_{m\times n} \;=\; [U]_{m\times r}\,[\Sigma]_{r\times r}\,[V]^{*}_{r\times n} \tag{1.19}
\]
where m is the number of rows of [A], n is the number of columns of [A], and r is the rank of [A]. The SVD of [A], which can now be rectangular or square, will have two sets of singular vectors, the u’s and the v’s. The u’s are the eigenvectors of [A][A]* and the v’s are the eigenvectors of [A]*[A]. The columns of [U] and [V] are orthonormal, which means that [U]*[U] = [I] and [V]*[V] = [I], where * denotes the complex conjugate transpose. The σ’s are the singular values, which happen to be the square roots of the nonzero eigenvalues of both [A][A]* and [A]*[A]. We are not totally finished, however, because [U] and [V] are not square matrices. While (1.19) diagonalizes [A], the matrices [U] and [V] in this form are rectangular rather than square, so they cannot be truly unitary. To make them square we will need n – r more v’s and m – r more u’s. We can get these required u’s and v’s from the nullspace N(A) and the left nullspace N(A*). Once the new u’s and v’s are added, the matrices are square and [A] will still equal [U][Σ][V]*. The true SVD of [A] will now be

\[
[A]_{m\times n} \;=\; [U]_{m\times m}\,[\Sigma]_{m\times n}\,[V]^{*}_{n\times n} \tag{1.20}
\]
The new singular value matrix [Σ] is the same as the old r × r matrix but with m – r new rows of zeros and n – r new columns of zeros added. The theory of total least squares (TLS) heavily utilizes the SVD, as will be seen in the next section.
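The following NumPy sketch illustrates these properties for an arbitrary rectangular matrix (a random 5 × 3 example, not one tied to the later discussion): the orthonormality of the columns of [U] and [V], the relation between the singular values and the eigenvalues of [A]*[A], and the full-size reconstruction of (1.20).

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 3))          # an arbitrary 5 x 3 (m x n) real matrix

# Full SVD: U is m x m, Vh is n x n, s holds the min(m, n) singular values.
U, s, Vh = np.linalg.svd(A, full_matrices=True)

# The columns of [U] and [V] are orthonormal.
assert np.allclose(U.conj().T @ U, np.eye(5))
assert np.allclose(Vh @ Vh.conj().T, np.eye(3))

# The squared singular values are the eigenvalues of [A]*[A].
eigs = np.linalg.eigvalsh(A.conj().T @ A)        # ascending order
assert np.allclose(eigs[::-1], s ** 2)

# Rebuild [A] from the full-size factors of (1.20): pad s into an m x n [Sigma].
Sigma = np.zeros((5, 3))
Sigma[:3, :3] = np.diag(s)
assert np.allclose(U @ Sigma @ Vh, A)
```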
As an example, consider the letter X. We discretize the image on a 20×20 grid as seen in Figure 1.1. We place a 1 where there is no portion of the letter X and a zero where there is some portion of the letter. This results in the matrix A shown in (1.21), which represents the letter X digitally on the 20×20 grid. Now we ask the question: “What is the information content of the matrix in (1.21), consisting of 1’s and 0’s?” Clearly one would not require 20×20 = 400 pieces of information to represent X in (1.21), as there is a lot of redundancy in the image. This is where the SVD comes in and addresses the issue in quite a satisfactory way. If one performs an SVD of the matrix A given by (1.21), one will find that the diagonal matrix [Σ] in (1.20) has a lot of zero entries. In fact, only seven of the singular values are not close to zero, as presented in Table 1.1. This implies that the information content of the picture in Figure 1.1 is small. The SVD can therefore be used to reconstruct the original data matrix from reduced information without sacrificing accuracy.
Figure 1.1 Discretization of the letter X on a 20×20 grid.
Table 1.1 List of all the singular values of the matrix A.
16.9798798959208
4.57981833410903
3.55675452579199
2.11591593148044
1.74248432449203
1.43326538670857
0.700598651862344
8.40316310453450 × 10^−16
2.67120960711993 × 10^−16
1.98439889201509 × 10^−16
1.14953653060279 × 10^−16
4.74795444425376 × 10^−17
1.98894713470634 × 10^−17
1.64625309682086 × 10^−18
5.03269559457945 × 10^−32
1.17624286081108 × 10^−32
4.98861204114981 × 10^−33
4.16133005156854 × 10^−49
1.35279056788548 × 10^−80
1.24082758075064 × 10^−112
(1.21)
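The exact pixel pattern of (1.21) is not reproduced here, but the following NumPy sketch builds a blocky stand-in X on the same kind of 20 × 20 grid (1 where there is no letter, 0 where there is) and computes its singular values. The numbers differ from those in Table 1.1, yet the conclusion is the same: only a handful of singular values are numerically non-zero.

```python
import numpy as np

# An illustrative stand-in for the 20 x 20 matrix in (1.21): a coarse 5 x 5
# "X" pattern (1 = background, 0 = letter) upsampled to 20 x 20.
X5 = np.array([[0, 1, 1, 1, 0],
               [1, 0, 1, 0, 1],
               [1, 1, 0, 1, 1],
               [1, 0, 1, 0, 1],
               [0, 1, 1, 1, 0]], dtype=float)
A = np.kron(X5, np.ones((4, 4)))        # 20 x 20 grid of 1s and 0s

s = np.linalg.svd(A, compute_uv=False)  # singular values, largest first
print(np.round(s, 4))

# Only a few singular values are numerically non-zero, so far fewer than
# 20 x 20 = 400 numbers are needed to describe the image.
print("numerical rank:", np.sum(s > 1e-10 * s[0]))
```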
To illustrate the point, we now perform a rank-1 approximation of the matrix, for which we require

\[
[A_1] \;=\; \sigma_1\,U_1\,V_1^{*} \tag{1.22}
\]
where U1 is a column matrix of size 20×1, and so is V1. The rank-1 approximation of A is seen in Figure 1.2.
The advantage of the SVD is that the error in the reconstruction of the image can be predicted without actually knowing the true solution; this is accomplished by looking at the second largest singular value. The rank-1 result is not good, and we did not expect it to be, since the second singular value is still sizable. So if we now perform a rank-2 reconstruction of the image, it will be given by

\[
[A_2] \;=\; \sigma_1\,U_1\,V_1^{*} \;+\; \sigma_2\,U_2\,V_2^{*} \tag{1.23}
\]
Figure 1.2 Rank‐1 approximation of the image X.
where now two of the right and left singular vectors are required. The rank-2 approximation is shown in Figure 1.3; in this case we have captured the two largest singular values, as seen from Table 1.1.
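Continuing with the stand-in image from the sketch above (rather than the book's matrix), the rank-1 and rank-2 approximations of (1.22) and (1.23) are simply the leading outer products of the singular vectors, and their errors can be predicted from the neglected singular values alone:

```python
import numpy as np

# Illustrative 20 x 20 stand-in image from the earlier sketch (not the exact
# matrix of (1.21)).
X5 = np.array([[0, 1, 1, 1, 0], [1, 0, 1, 0, 1], [1, 1, 0, 1, 1],
               [1, 0, 1, 0, 1], [0, 1, 1, 1, 0]], dtype=float)
A = np.kron(X5, np.ones((4, 4)))

U, s, Vh = np.linalg.svd(A)

# Rank-1 and rank-2 approximations, as in (1.22) and (1.23).
A1 = s[0] * np.outer(U[:, 0], Vh[0, :])
A2 = A1 + s[1] * np.outer(U[:, 1], Vh[1, :])

# The Frobenius error of each truncation equals the root-sum-square of the
# neglected singular values, so it is known before the reconstruction is made.
print(np.linalg.norm(A - A1), np.sqrt(np.sum(s[1:] ** 2)))
print(np.linalg.norm(A - A2), np.sqrt(np.sum(s[2:] ** 2)))
```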
Again, as we study how the picture evolves as we retain more and more of the singular values and singular vectors, the accuracy of the reconstruction increases. For example, Figure 1.4 provides the rank-3 reconstruction, Figure 1.5 the rank-4 reconstruction, Figure 1.6 the rank-5 reconstruction, and Figure 1.7 the rank-6 reconstruction. It is seen that with each higher-rank approximation the reconstructed picture more closely resembles the actual one. The error in the approximation decreases as the magnitudes of the neglected singular values become small. This is the greatest advantage of the singular value decomposition over, for example, the Tikhonov regularization: the SVD provides an estimate of the error in the reconstruction even though the actual solution is unknown.
Finally, it is seen that there are seven large singular values and after that they become essentially zero. Therefore, we should achieve a perfect reconstruction with rank 7, which is seen in Figure 1.8, and going to rank 8, which is presented in Figure 1.9, will not make any difference in the quality of the image. This is now illustrated through the mean squared error between the true solution and the approximate ones.
Figure 1.3 Rank‐2 approximation of the image X.
Figure 1.4 Rank‐3 approximation of the image X.
Figure 1.5 Rank‐4 approximation of the image X.
Figure 1.6 Rank‐5 approximation of the image X.
Figure 1.7 Rank‐6 approximation of the image X.
Figure 1.8 Rank‐7 approximation of the image X.
Figure 1.9 Rank‐8 approximation of the image X.
In Figure 1.10, the mean squared error between the actual picture and the approximate ones is presented. As expected, the error of a rank-k reconstruction is set by the neglected singular values, namely

\[
\bigl\|[A] - [A_k]\bigr\|_F^{2} \;=\; \sum_{i=k+1}^{r} \sigma_i^{2} \tag{1.24}
\]
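The same behaviour can be checked numerically on the stand-in image used in the earlier sketches: for each rank k, the mean squared error of the reconstruction coincides with the value predicted from the neglected singular values via (1.24).

```python
import numpy as np

# Illustrative stand-in image (not the exact matrix of (1.21)).
X5 = np.array([[0, 1, 1, 1, 0], [1, 0, 1, 0, 1], [1, 1, 0, 1, 1],
               [1, 0, 1, 0, 1], [0, 1, 1, 1, 0]], dtype=float)
A = np.kron(X5, np.ones((4, 4)))
U, s, Vh = np.linalg.svd(A)

# Mean squared error of the rank-k reconstruction versus the prediction from
# (1.24), divided by the number of pixels to give a mean squared error.
for k in range(1, 6):
    Ak = (U[:, :k] * s[:k]) @ Vh[:k, :]          # rank-k truncated SVD
    mse = np.mean((A - Ak) ** 2)
    predicted = np.sum(s[k:] ** 2) / A.size
    print(k, mse, predicted)
```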
This is a very desirable property of the SVD. Next, the principles of total least squares are presented.