
1.3.2 The Theory of Total Least Squares


The method of total least squares (TLS) is a linear parameter estimation technique used in a wide variety of disciplines such as signal processing, general engineering, statistics, physics, and the like. We start out with a set of m measured data points {(x1, y1), …, (xm, ym)} and a set of n linear coefficients (a1, …, an) that describe a model ŷ(x; a), where m > n [3, 4]. The objective of total least squares is to find the linear coefficients that best approximate the model when there are missing data or errors in the measurements. We can describe the approximation by the simple linear expression


Figure 1.10 Mean squared error of the approximation.

$$X\,\mathbf{a} = \mathbf{y} \qquad (1.25)$$

Since m > n, there are more equations than unknowns, and therefore (1.25) represents an overdetermined set of equations. Typically, an overdetermined system of equations is solved by ordinary least squares, where the unknown is given by

$$\mathbf{a} = \left(X^{H} X\right)^{-1} X^{H}\,\mathbf{y} \qquad (1.26)$$

where X^H represents the complex conjugate transpose of the matrix X. Ordinary least squares can account for uncertainties such as noise in y, since it provides a least squares fit to y. However, if there is uncertainty in the elements of the matrix X, then ordinary least squares cannot address it. This is where total least squares comes in. In total least squares, the matrix equation (1.25) is cast into a different form in which uncertainty in the elements of both the matrix X and the vector y can be taken into account.

$$\begin{bmatrix} X & \mathbf{y} \end{bmatrix} \begin{bmatrix} \mathbf{a} \\ -1 \end{bmatrix} = \mathbf{0} \qquad (1.27)$$

In this form one solves for the null space of the composite matrix by searching for the eigenvector/singular vector corresponding to the zero eigenvalue/singular value. If the matrix is rectangular, the eigenvalue concept does not apply and one must instead work with the singular vectors and singular values.
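To make the two formulations concrete, the following is a minimal numerical sketch (not taken from the book) using NumPy. It builds an arbitrary noise-free example, solves (1.26) by ordinary least squares, and then recovers the same coefficients from the right singular vector of the composite matrix in (1.27) associated with its zero singular value. The matrix sizes, the random data, and the variable names are illustrative assumptions.

```python
import numpy as np

# Illustrative sketch: ordinary least squares (1.26) versus the
# composite-matrix view of (1.27) for noise-free data.
rng = np.random.default_rng(0)
m, n = 20, 3                              # arbitrary sizes with m > n
X = rng.standard_normal((m, n))
a_true = np.array([2.0, -1.0, 0.5])
y = X @ a_true                            # consistent data: y lies in the range of X

# Ordinary least squares, a = (X^H X)^{-1} X^H y, cf. (1.26)
a_ls = np.linalg.solve(X.conj().T @ X, X.conj().T @ y)

# Composite-matrix form [X  y][a; -1] = 0, cf. (1.27).  For noise-free data
# the smallest singular value of [X  y] is zero and the corresponding right
# singular vector is proportional to [a; -1].
C = np.column_stack((X, y))
U, s, Vh = np.linalg.svd(C)
v = Vh[-1, :]                             # right singular vector of the smallest singular value
a_null = -v[:n] / v[n]

print(np.allclose(a_ls, a_true), np.allclose(a_null, a_true), s[-1])
```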

The best approximation according to total least squares is the one that minimizes the norm of the difference between the measured data and the model ŷ(x; a), as well as the perturbation of the independent variables X. Considering the errors in both the measured data vector y and the independent variables X, (1.25) can be re-written as

$$\left(X + \Delta X\right)\mathbf{a} = \mathbf{y} + \Delta\mathbf{y} \qquad (1.28)$$

where ΔX and Δy are the errors in the independent variable measurements and the dependent variable measurements, respectively. We then seek the approximation that minimizes these errors in the dependent and independent variables. This can be expressed by

$$\min_{\Delta X,\;\Delta\mathbf{y}} \left\| \begin{bmatrix} \Delta X & \Delta\mathbf{y} \end{bmatrix} \right\|_{F} \quad \text{subject to} \quad \left(\mathbf{y} + \Delta\mathbf{y}\right) \in \operatorname{Range}\!\left(X + \Delta X\right) \qquad (1.29)$$

where [ΔX Δy] is the augmented matrix formed by concatenating the columns of the error matrix ΔX with the error vector Δy. The operator ‖•‖F represents the Frobenius norm of the augmented matrix. The Frobenius norm is defined as the square root of the sum of the absolute squares of all of the elements of a matrix. This can be expressed in equation form as follows, where A is any matrix with elements aij:

$$\left\| A \right\|_{F} = \sqrt{\sum_{i}\sum_{j} \left| a_{ij} \right|^{2}} = \sqrt{\sum_{i} \sigma_{i}^{2}} \qquad (1.30)$$

and where σi is the i‐th singular value of matrix A.
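As a quick check of (1.30), the following sketch (our own, not from the book) verifies numerically that the entrywise definition, the singular value form, and NumPy's built-in Frobenius norm agree on an arbitrary test matrix.

```python
import numpy as np

# Illustrative check of (1.30): the Frobenius norm computed from the matrix
# entries equals the one computed from the singular values.
A = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])   # arbitrary test matrix

frob_entries  = np.sqrt(np.sum(np.abs(A) ** 2))                        # sqrt(sum |a_ij|^2)
frob_singular = np.sqrt(np.sum(np.linalg.svd(A, compute_uv=False) ** 2))
frob_numpy    = np.linalg.norm(A, 'fro')                               # NumPy built-in

print(frob_entries, frob_singular, frob_numpy)       # all three agree
```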

We will now bring the right‐hand side of (1.28) over to the left side of the equation and equate it to zero as such

$$\left( \begin{bmatrix} X & \mathbf{y} \end{bmatrix} + \begin{bmatrix} \Delta X & \Delta\mathbf{y} \end{bmatrix} \right) \begin{bmatrix} \mathbf{a} \\ -1 \end{bmatrix} = \mathbf{0} \qquad (1.31)$$

If the concatenated matrix [X y] has a rank of n + 1, its n + 1 m-dimensional columns are linearly independent, so y does not lie in the n-dimensional column space spanned by the columns of X and (1.31) has no exact solution. In order to have a nontrivial solution for the coefficients a, the matrix [X + ΔX  y + Δy] must have only n linearly independent columns. However, this matrix has n + 1 columns in total, and therefore its rank must be deficient by 1. We then must find the smallest matrix [ΔX Δy] that changes the matrix [X y], with a rank of n + 1, to a matrix {[X y] + [ΔX Δy]} with rank n. According to the Eckart–Young–Mirsky theorem, we can achieve this by defining {[X y] + [ΔX Δy]} as the best rank-n approximation to [X y], obtained by eliminating the smallest singular value of [X y], which contains the least amount of system information; this also provides a unique solution. The Eckart–Young–Mirsky theorem (https://en.wikipedia.org/wiki/Low-rank_approximation) states that low-rank approximation is a minimization problem in which the cost function measures the fit between a given matrix (the data) and an approximating matrix (the optimization variable), subject to a constraint that the approximating matrix has reduced rank. To illustrate how this is accomplished, we take the SVD of [X y] as follows

$$\begin{bmatrix} X & \mathbf{y} \end{bmatrix} = \begin{bmatrix} U_{x} & \mathbf{u}_{y} \end{bmatrix} \begin{bmatrix} \Sigma_{x} & \mathbf{0} \\ \mathbf{0}^{H} & \sigma_{y} \end{bmatrix} \begin{bmatrix} V_{xx} & \mathbf{v}_{xy} \\ \mathbf{v}_{yx}^{H} & v_{yy} \end{bmatrix}^{H} \qquad (1.32)$$

where Ux has n columns, uy is a column vector, Σx is a diagonal matrix containing the n largest singular values, σy is the smallest singular value, Vxx is an n × n matrix, and vyy is a scalar. Let us multiply both sides by the matrix V.

$$\begin{bmatrix} X & \mathbf{y} \end{bmatrix} \begin{bmatrix} V_{xx} & \mathbf{v}_{xy} \\ \mathbf{v}_{yx}^{H} & v_{yy} \end{bmatrix} = \begin{bmatrix} U_{x} & \mathbf{u}_{y} \end{bmatrix} \begin{bmatrix} \Sigma_{x} & \mathbf{0} \\ \mathbf{0}^{H} & \sigma_{y} \end{bmatrix} \qquad (1.33)$$

Next, we equate just the last columns of the matrix products in (1.33).

$$\begin{bmatrix} X & \mathbf{y} \end{bmatrix} \begin{bmatrix} \mathbf{v}_{xy} \\ v_{yy} \end{bmatrix} = \sigma_{y}\,\mathbf{u}_{y} \qquad (1.34)$$
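The following short sketch (our own illustration, with arbitrary data and sizes) verifies the column bookkeeping of (1.33) and (1.34) numerically: multiplying the augmented matrix by its right singular matrix reproduces the left singular vectors scaled by the singular values, and the last column of that product isolates σy uy.

```python
import numpy as np

# Illustrative check of (1.33)-(1.34) on an arbitrary matrix standing in
# for [X  y].
rng = np.random.default_rng(2)
m, n = 20, 3
C = rng.standard_normal((m, n + 1))                  # plays the role of [X  y]

U, s, Vh = np.linalg.svd(C, full_matrices=False)
V = Vh.conj().T

print(np.allclose(C @ V, U * s))                     # (1.33): [X  y] V = U Sigma
print(np.allclose(C @ V[:, -1], s[-1] * U[:, -1]))   # (1.34): [X  y][v_xy; v_yy] = sigma_y u_y
```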

From the Eckart–Young theorem, we know that {[X y] + [ΔX Δy]} is the closest rank-n approximation to [X y]. The matrix {[X y] + [ΔX Δy]} has the same singular vectors and the same singular values contained in Σx above, but with σy set equal to zero. We can then write the SVD of {[X y] + [ΔX Δy]} as

$$\begin{bmatrix} X & \mathbf{y} \end{bmatrix} + \begin{bmatrix} \Delta X & \Delta\mathbf{y} \end{bmatrix} = \begin{bmatrix} U_{x} & \mathbf{u}_{y} \end{bmatrix} \begin{bmatrix} \Sigma_{x} & \mathbf{0} \\ \mathbf{0}^{H} & 0 \end{bmatrix} \begin{bmatrix} V_{xx} & \mathbf{v}_{xy} \\ \mathbf{v}_{yx}^{H} & v_{yy} \end{bmatrix}^{H} \qquad (1.35)$$

To obtain [ΔX Δy] we must solve the following

$$\begin{bmatrix} \Delta X & \Delta\mathbf{y} \end{bmatrix} = \left( \begin{bmatrix} X & \mathbf{y} \end{bmatrix} + \begin{bmatrix} \Delta X & \Delta\mathbf{y} \end{bmatrix} \right) - \begin{bmatrix} X & \mathbf{y} \end{bmatrix} \qquad (1.36)$$

Equation (1.36) can be solved by first substituting (1.32) and (1.35), which results in

$$\begin{bmatrix} \Delta X & \Delta\mathbf{y} \end{bmatrix} = -\,\sigma_{y}\,\mathbf{u}_{y} \begin{bmatrix} \mathbf{v}_{xy} \\ v_{yy} \end{bmatrix}^{H} \qquad (1.37)$$

Then, from (1.34) we can rewrite (1.37) as

$$\begin{bmatrix} \Delta X & \Delta\mathbf{y} \end{bmatrix} = -\,\begin{bmatrix} X & \mathbf{y} \end{bmatrix} \begin{bmatrix} \mathbf{v}_{xy} \\ v_{yy} \end{bmatrix} \begin{bmatrix} \mathbf{v}_{xy} \\ v_{yy} \end{bmatrix}^{H} \qquad (1.38)$$
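As a numerical illustration of (1.37) and (1.38) (our own sketch, with arbitrary data), the correction [ΔX Δy] is a rank-one matrix whose Frobenius norm equals the discarded singular value σy, and adding it to [X y] reduces the rank from n + 1 to n, exactly as the Eckart–Young–Mirsky argument above requires.

```python
import numpy as np

# Illustrative check of (1.37)-(1.38) on an arbitrary matrix standing in
# for [X  y].
rng = np.random.default_rng(3)
m, n = 20, 3
C = rng.standard_normal((m, n + 1))              # plays the role of [X  y]

U, s, Vh = np.linalg.svd(C, full_matrices=False)
v = Vh.conj().T[:, -1]                           # [v_xy; v_yy], last right singular vector

Delta = -C @ np.outer(v, v.conj())               # rank-one correction, cf. (1.38)
print(np.allclose(np.linalg.norm(Delta, 'fro'), s[-1]))   # ||[dX  dy]||_F = sigma_y
print(np.linalg.matrix_rank(C + Delta), n)                # corrected matrix has rank n
```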

Finally, {[X y] + [ΔX Δy]} can be written as

$$\begin{bmatrix} X & \mathbf{y} \end{bmatrix} + \begin{bmatrix} \Delta X & \Delta\mathbf{y} \end{bmatrix} = \begin{bmatrix} X & \mathbf{y} \end{bmatrix} \left( I - \begin{bmatrix} \mathbf{v}_{xy} \\ v_{yy} \end{bmatrix} \begin{bmatrix} \mathbf{v}_{xy} \\ v_{yy} \end{bmatrix}^{H} \right) \qquad (1.39)$$

After multiplying each term in (1.39) by the vector [vxy; vyy] we get the following

$$\left( \begin{bmatrix} X & \mathbf{y} \end{bmatrix} + \begin{bmatrix} \Delta X & \Delta\mathbf{y} \end{bmatrix} \right) \begin{bmatrix} \mathbf{v}_{xy} \\ v_{yy} \end{bmatrix} = \begin{bmatrix} X & \mathbf{y} \end{bmatrix} \begin{bmatrix} \mathbf{v}_{xy} \\ v_{yy} \end{bmatrix} - \begin{bmatrix} X & \mathbf{y} \end{bmatrix} \begin{bmatrix} \mathbf{v}_{xy} \\ v_{yy} \end{bmatrix} \begin{bmatrix} \mathbf{v}_{xy} \\ v_{yy} \end{bmatrix}^{H} \begin{bmatrix} \mathbf{v}_{xy} \\ v_{yy} \end{bmatrix} \qquad (1.40)$$

Since [vxy; vyy] is a column of the unitary matrix V, its norm is unity, so the two terms on the right-hand side cancel and we are left with

$$\left( \begin{bmatrix} X & \mathbf{y} \end{bmatrix} + \begin{bmatrix} \Delta X & \Delta\mathbf{y} \end{bmatrix} \right) \begin{bmatrix} \mathbf{v}_{xy} \\ v_{yy} \end{bmatrix} = \mathbf{0} \qquad (1.41)$$

From (1.31) and (1.41) we can solve for the model coefficients a as

$$\mathbf{a} = -\,\frac{\mathbf{v}_{xy}}{v_{yy}} \qquad (1.42)$$
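To spell out the step between (1.41) and (1.42): because the corrected matrix has rank n, its null space is one-dimensional, so the vector [a; −1] appearing in (1.31) must be a scalar multiple of [vxy; vyy]. Choosing the scale so that the last entry equals −1 (which assumes vyy ≠ 0) gives

$$\begin{bmatrix} \mathbf{a} \\ -1 \end{bmatrix} = -\,\frac{1}{v_{yy}} \begin{bmatrix} \mathbf{v}_{xy} \\ v_{yy} \end{bmatrix} \quad\Longrightarrow\quad \mathbf{a} = -\,\frac{\mathbf{v}_{xy}}{v_{yy}}$$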

The vector vxy consists of the first n elements of the (n + 1)-th column of the right singular matrix V of [X y], and vyy is the (n + 1)-th element of that column. The best approximation of the model is then given by

$$\hat{\mathbf{y}} = X\,\mathbf{a} = -\,\frac{X\,\mathbf{v}_{xy}}{v_{yy}} \qquad (1.43)$$

This completes the total least squares solution.
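Collecting the steps above into one place, the following is a minimal end-to-end sketch (our own, not the book's code): it forms the augmented matrix [X y], takes its SVD, splits the last right singular vector into vxy and vyy, and returns a = −vxy/vyy as in (1.42). The function name, the noise model, and the data sizes are illustrative assumptions, and the routine presumes vyy is nonzero.

```python
import numpy as np

def total_least_squares(X, y):
    """Estimate a in X a ~ y when both X and y are noisy, via the SVD of [X  y]."""
    m, n = X.shape
    C = np.column_stack((X, y))              # augmented matrix [X  y]
    _, _, Vh = np.linalg.svd(C)
    v = Vh.conj().T[:, -1]                   # right singular vector of the smallest singular value
    v_xy, v_yy = v[:n], v[n]
    return -v_xy / v_yy                      # cf. (1.42); assumes v_yy != 0

# Usage with noise added to both X and y (illustrative data only).
rng = np.random.default_rng(4)
m, n = 200, 3
X_true = rng.standard_normal((m, n))
a_true = np.array([1.0, -2.0, 0.5])
X_noisy = X_true + 0.05 * rng.standard_normal((m, n))
y_noisy = X_true @ a_true + 0.05 * rng.standard_normal(m)

a_tls = total_least_squares(X_noisy, y_noisy)
a_ols = np.linalg.lstsq(X_noisy, y_noisy, rcond=None)[0]
print("TLS:", a_tls)
print("OLS:", a_ols)                          # both should be close to a_true
```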

