Читать книгу Matrix and Tensor Decompositions in Signal Processing - Gérard Favier - Страница 14
I.4. With what tensor decompositions?
ОглавлениеIt is important to note that, for an Nth-order tensor the number of elements is and, assuming In = I for n ∈ 〈N〉, this number becomes IN, which induces an exponential increase with the tensor order N. This is called the curse of dimensionality (Oseledets and Tyrtyshnikov 2009). For big data tensors, tensor decompositions play a fundamental role in alleviating this curse of dimensionality, due to the fact that the number of parameters that characterize the decompositions is generally much smaller than the number of elements in the original tensor.
We now introduce three basic decompositions: PARAFAC/CANDECOMP/CPD, TD and TT5. The first two are studied in depth in Chapter 5, whereas the third, briefly introduced in Chapter 3, will be considered in more detail in Volume 3.
Table I.3 gives the expression of the element of a tensor of order N and size I1 × · • · × IN, either real ( = ℝ) or complex ( = ℂ), for each of the three decompositions cited above. Their parametric complexity is compared in terms of the size of each matrix and tensor factor, assuming In = I and Rn = R for all n ∈ 〈N〉.
Figures I.1–I.3 show graphical representations of the PARAFAC model and the TD model for a third-order tensor and of the TT model [[A(1), (2), (3), A(4)]] for a fourth-order tensor χ ∈ I1×I2×I3×I4. In the case of the PARAFAC model, we define using its columns, for n ∈ {1, 2, 3}.
Figure I.1. Third-order PARAFAC model
We can make a few remarks about each of these decompositions:
– The PARAFAC decomposition (Harshman 1970), also known as CANDECOMP (Carroll and Chang 1970) or CPD (Hitchcock 1927), of a Nth-order tensor χ is a sum of R rank-one tensors, each defined as the outer product of one column from each of the N matrix factors A(n) ∈ In×R. When R is minimal, it is called the rank of the tensor. If the matrix factors satisfy certain conditions, this decomposition has the essential uniqueness property. See Figure I.1 for a third-order tensor (N = 3), and Chapter 5 for a detailed presentation.
Table I.3. Parametric complexity of the CPD, TD, and TT decompositions
Decompositions | Notation | Element xi1,··· ,iN |
PARAFAC / CPD | [[A(1), · · · , A(N); R ]] | |
TD | ||
TT | ||
Decompositions | Parameters | Complexity |
CPD | A(n) ∈ In×R, ∀n ∈ 〈N〉 | O(N I R) |
TD | ||
TT |
Figure I.2. Third-order Tucker model
Figure I.3. Fourth-order TT model
– The Tucker decomposition (Tucker 1966) can be viewed as a generalization of the PARAFAC decomposition that takes into account all the interactions between the columns of the matrix factors A(n) ∈ In×Rn via the introduction of a core tensor This decomposition is not unique in general. Note that, if Rn ≤ In for ∀n ∈ 〈N〉, then the core tensor provides a compressed form of χ. If Rn, for n ∈ 〈N〉, is chosen as the rank of the mode-n matrix unfolding6 of χ, then the N-tuple (R1, · · · , RN) is minimal, and it is called the multilinear rank of the tensor.
Such a Tucker decomposition can be obtained using the truncated high-order SVD (THOSVD), under the constraint of column-orthonormal matrices A(n) (de Lathauwer et al. 2000a). This algorithm is described in section 5.2.1.8.
See Figure I.2 for a third-order tensor, and Chapter 5 for a detailed presentation.
– The TT decomposition (Oseledets 2011) is composed of a train of third-order tensors the first and last carriages of the train being matrices which implies r0 = rN = 1, and therefore The dimensions Rn, for n ∈ 〈N − 1〉, called the TT ranks, are given by the ranks of some matrix unfoldings of the original tensor.
This decomposition has been used to solve the tensor completion problem (Grasedyck et al. 2015; Bengua et al. 2017), for facial recognition (Brandoni and Simoncini 2020) and for modeling MIMO communication channels (Zniyed et al. 2020), among many other applications. A brief description of the TT decomposition is given in section 3.13.4 using the mode-(p, n) product. Note that a specific SVD-based algorithm, called TT-SVD, was proposed by Oseledets (2011) for computing a TT decomposition.
This decomposition and the hierarchical Tucker (HT) one (Grasedyck and Hackbush 2011; Ballani et al. 2013) are special cases of tensor networks (TNs) (Cichocki 2014), as will be discussed in more detail in the next volume.
6 See definition [3.41], in Chapter 3, of the mode-n matrix unfolding Xn of a tensor χ, whose columns are the mode-n vectors obtained by fixing all but n indices.
From this brief description of the three tensor models, one can conclude that, unlike matrices, the notion of rank is not unique for tensors, since it depends on the decomposition used. Thus, as mentioned above, one defines the tensor rank (also called the canonical rank or Kruskal’s rank) associated with the PARAFAC decomposition, the multilinear rank that relies on the Tucker’s model, and the TT-ranks linked with the TT decomposition.
It is important to note that the number of characteristic parameters of the PARAFAC and TT decompositions is proportional to N, the order of the tensor, whereas the parametric complexity of the Tucker decomposition increases exponentially with N. This is why the first two decompositions are especially valuable for large-scale problems. Although the Tucker model is not unique in general, imposing an orthogonality constraint on the matrix factors yields the HOSVD decomposition, a truncated form of which gives an approximate solution to the best low multilinear rank approximation problem (de Lathauwer et al. 2000a). This solution, which is based on an a priori choice of the dimensions Rn of the core tensor, is to be compared with the truncated SVD in the matrix case, although it does not have the same optimality property. It is widely used to reduce the parametric complexity of data tensors.
From the above, it can be concluded that the TT model combines the advantages of the other two decompositions, in terms of parametric complexity (like PARAFAC) and numerical stability (like Tucker’s model), due to a parameter estimation algorithm based on a calculation of SVDs.
To illustrate the use of the PARAFAC decomposition, let us consider the case of multi-user mobile communications with a CDMA (code-division multiple access) encoding system. The multiple access technique allows multiple emitters to simultaneously transmit information over the same communication channel by assigning a code to each emitter. The information is transmitted as symbols sn,m, with n ∈ 〈N〉 and m ∈ 〈M〉, where N and M are the number of transmission time slots, i.e. the number of symbol periods, and the number of emitting antennas, respectively. The symbols belong to a finite alphabet that depends on the modulation being used. They are encoded with a space-time coding that introduces code diversity by repeating each symbol P times with a code cp,m assigned to the mth emitting antenna, p ∈ 〈P〉, where P denotes the length of the spreading code. The signal received by the kth receiving antenna, during the nth symbol period and the pth chip period, is a linear combination of the symbols encoded and transmitted by the M emitting antennas:
[I.1]
where hk,m is the fading coefficient of the communication channel between the receiving antenna k and the emitting antenna m.
The received signals, which are complex-valued, therefore form a third-order tensor whose modes are: space × time × code, associated with the indices (k, n, p). This signal tensor satisfies a PARAFAC decomposition [[H, S, C; M]] whose rank is equal to the number M of emitting antennas and whose matrix factors are the channel the matrix of transmitted symbols This example is a simplified form of the DS-CDMA (direct-sequence CDMA) system proposed by (Sidiropoulos et al. 2000b).