Design Example 2.5
Assume that we are analyzing scientific articles related to a specific domain. Each article is represented by a vector $x$ of word frequencies; that is, we choose a set of $M$ words representative of our scientific area, and we count how many times each word appears in each article. Each vector $x$ is then orthogonally projected onto the new subspace defined by the vectors $w_i$. Each vector $w_i$ has dimension $M$ and can be understood as a "topic" (i.e., a topic is characterized by the relative frequencies of the $M$ different words; two different topics differ in the relative frequencies of the $M$ words). The projection of $x$ onto each $w_i$ indicates how important topic $w_i$ is for representing the article: important topics have large projection values and, therefore, large values in the corresponding component of $\chi$.
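As a quick illustration of this representation, the following sketch (NumPy; the vocabulary, the article word counts, and the topic vector `w_i` are purely illustrative, not taken from the book) builds the word-frequency vectors and computes the projection of each article onto one candidate topic direction:

```python
# Minimal sketch: each article becomes a length-M vector of word counts over a
# fixed vocabulary, and the "importance" of a topic w_i for an article is the
# projection of x onto w_i (a simple dot product).
import numpy as np

vocabulary = ["channel", "qubit", "antenna", "entanglement", "fading"]  # M = 5 words
M = len(vocabulary)

# Word counts per article (rows = articles, columns = vocabulary words).
X = np.array([
    [12,  0,  9,  1,  7],   # article on wireless propagation
    [ 1, 15,  0, 11,  0],   # article on quantum computing
    [ 8,  2,  6,  0,  5],   # another wireless article
], dtype=float)

# A candidate "topic" direction: relative frequencies of the M words (unit norm).
w_i = np.array([0.6, 0.0, 0.5, 0.0, 0.6])
w_i /= np.linalg.norm(w_i)

# Projection of each article onto the topic: large values = topic is important.
projections = X @ w_i
print(projections)
```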
It can be shown [43, 47], as already indicated in Section 2.1, that when the input vectors $x$ are zero-mean (if they are not, we can transform the input data simply by subtracting the sample average vector), the solution of the minimization of $J_{PCA}$ is given by the $m$ eigenvectors associated with the $m$ largest eigenvalues of the covariance matrix of $x$, $C_x = E\{xx^T\}$ (note that $C_x$ is an $M \times M$ matrix with $M$ eigenvalues). If the eigenvalue decomposition of the input covariance matrix is $C_x = W_M \Lambda_M W_M^T$ (since $C_x$ is a real-symmetric matrix), then the feature vectors are constructed as $\chi = \Lambda_m^{-1/2} W_m^T x$, where $\Lambda_m$ is a diagonal matrix with the $m$ largest eigenvalues of the matrix $\Lambda_M$ and $W_m$ contains the corresponding $m$ columns of the eigenvector matrix $W_M$. We could have constructed all the feature vectors at the same time by projecting the whole matrix $X$, $\mathbf{X}_\chi = \Lambda_m^{-1/2} W_m^T X$. Note that the $i$-th feature is the projection of the input vector $x$ onto the $i$-th eigenvector, $\chi_i = \lambda_i^{-1/2} w_i^T x$. The computed feature vectors have an identity covariance matrix, $C_\chi = I$, meaning that the different features are decorrelated.
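The construction above can be summarized in a short numerical sketch. It assumes synthetic zero-mean data, uses the sample covariance with a $1/N$ normalization, and relies on `numpy.linalg.eigh` for the eigendecomposition of the real-symmetric $C_x$; variable names such as `X`, `W_m`, and `Chi` are illustrative, not from the book:

```python
# Sketch of the PCA feature construction described above (NumPy only).
# Rows of X are the observations x^T.
import numpy as np

rng = np.random.default_rng(0)
N, M, m = 500, 8, 3
X = rng.standard_normal((N, M)) @ rng.standard_normal((M, M))  # synthetic data

X = X - X.mean(axis=0)                 # make the inputs zero-mean
C_x = (X.T @ X) / N                    # sample covariance matrix (M x M)

# Eigendecomposition of the real-symmetric covariance: C_x = W_M Lambda_M W_M^T.
eigvals, W_M = np.linalg.eigh(C_x)     # eigh returns eigenvalues in ascending order
order = np.argsort(eigvals)[::-1]      # reorder so the largest come first
eigvals, W_M = eigvals[order], W_M[:, order]

W_m = W_M[:, :m]                       # eigenvectors of the m largest eigenvalues

# Feature vectors chi = Lambda_m^{-1/2} W_m^T x, computed for all inputs at once.
Chi = X @ W_m @ np.diag(eigvals[:m] ** -0.5)

# The features are decorrelated with unit variance: C_chi is (numerically) I.
C_chi = (Chi.T @ Chi) / N
print(np.round(C_chi, 3))
```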
Univariate variance is a second-order statistical measure of the departure of the input observations from the sample mean. A generalization of the univariate variance to multivariate variables is the trace of the input covariance matrix. By choosing the $m$ largest eigenvalues of the covariance matrix $C_x$, we guarantee that the representation in the feature space explains as much variance of the input space as possible with only $m$ variables. As already indicated in Section 2.1, $w_1$ is the direction in which the data exhibit the largest variability, $w_2$ is the direction with the largest variability once the variability along $w_1$ has been removed, $w_3$ is the direction with the largest variability once the variability along $w_1$ and $w_2$ has been removed, and so on. Thanks to the orthogonality of the $w_i$ vectors, and the subsequent decorrelation of the feature vectors, the total variance explained by the PCA decomposition can be conveniently measured as the sum of the variances of each feature, $\sum_{i=1}^{m} \lambda_i$, which can be compared with the total input variance $\operatorname{tr}(C_x) = \sum_{i=1}^{M} \lambda_i$.
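The explained-variance bookkeeping from this paragraph can be checked with a small self-contained sketch (synthetic data; the choice `m = 3` is arbitrary and only for illustration):

```python
# Short sketch of the explained-variance computation: the variance captured by
# the m leading eigenvectors is the sum of the m largest eigenvalues, and the
# total input variance is trace(C_x) = sum of all M eigenvalues.
import numpy as np

rng = np.random.default_rng(1)
X = rng.standard_normal((500, 8)) @ rng.standard_normal((8, 8))
X = X - X.mean(axis=0)                             # zero-mean inputs
C_x = (X.T @ X) / X.shape[0]                       # sample covariance

eigvals = np.sort(np.linalg.eigvalsh(C_x))[::-1]   # eigenvalues, descending
m = 3
print("explained variance fraction:", eigvals[:m].sum() / np.trace(C_x))
```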