Multiblock Data Fusion in Statistics and Machine Learning - Tormod Næs

1.8 Common and Distinct Components

A crucial concept that plays a role in almost all methods in this book is the idea of common and distinct subspaces and components (Smilde et al., 2017). Figure 1.9 illustrates these concepts schematically, and Chapter 2 gives a more detailed exposition.

Figure 1.9 The idea of common and distinct components. Legend: blue is common variation; dark yellow and dark red are distinct variation; shaded areas are noise (unsystematic variation).

Suppose there are two data blocks, X1 and X2, that share the same samples, i.e., different variables are measured on the same set of samples (see Chapter 3). These two blocks can have variation in common (the blue part). This common variation spans a subspace, and the common components form a basis for that subspace.

Each block also contains systematic variation that is not shared with the other block (the dark yellow and dark red parts). These parts have nothing in common and are therefore called distinct. They also span subspaces, and the distinct components (two sets, one for each block) are bases for these subspaces. What is left in the matrices is unsystematic variation, or noise (the shaded parts).
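The decomposition described above can be made concrete with a small simulation. The following is a minimal sketch, not a method taken from this book: it assumes one common and one distinct component per block, builds two blocks from mutually orthogonal scores, and then looks for shared directions through the cosines of the principal angles between the two column spaces. The names `column_basis`, `t_common`, `t_dist1`, and `t_dist2`, as well as the block sizes and the noise level, are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p1, p2 = 50, 6, 8  # samples, variables in block 1, variables in block 2 (assumed sizes)

# Three mutually orthogonal score vectors: one common, one distinct per block.
Q, _ = np.linalg.qr(rng.standard_normal((n, 3)))
T = Q * np.sqrt(n)
t_common, t_dist1, t_dist2 = T[:, [0]], T[:, [1]], T[:, [2]]

# Each block = common part + distinct part + small unsystematic noise.
X1 = (t_common @ rng.standard_normal((1, p1))
      + t_dist1 @ rng.standard_normal((1, p1))
      + 0.01 * rng.standard_normal((n, p1)))
X2 = (t_common @ rng.standard_normal((1, p2))
      + t_dist2 @ rng.standard_normal((1, p2))
      + 0.01 * rng.standard_normal((n, p2)))


def column_basis(X, rank):
    """Orthonormal basis for the dominant rank-dimensional column space of X."""
    U, _, _ = np.linalg.svd(X, full_matrices=False)
    return U[:, :rank]


# Systematic variation in each block is two-dimensional by construction.
U1 = column_basis(X1, 2)
U2 = column_basis(X2, 2)

# Singular values of U1' U2 are the cosines of the principal angles between
# the two subspaces: close to 1 for a shared direction, close to 0 otherwise.
W, cosines, Vt = np.linalg.svd(U1.T @ U2)

# Estimated common direction, expressed in sample space via block 1.
common_est = U1 @ W[:, 0]
alignment = abs(common_est @ t_common[:, 0]) / np.sqrt(n)

print("cosines of principal angles:", cosines)
print("alignment with the simulated common score:", alignment)
```

In this toy setting one cosine should be close to 1 (the common direction) and the other close to 0 (the distinct directions), and the estimated common direction should align closely with the simulated common score. With real data the number of systematic components per block is unknown and has to be chosen.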

The division of each data block into common, distinct, and unsystematic variation should be read in terms of subspaces, not in terms of individual variables being common or distinct. Hence, part of the variation of a variable in block 1 may be in common with the variation of some variables in block 2, whereas the remaining part of that variable's variation may be distinct; see Elaboration 1.8.
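To illustrate that a single variable can be partly common and partly distinct, here is a minimal noise-free sketch. It assumes the common subspace is already known and one-dimensional; the helper `split_variable` and the weights 3 and 4 are hypothetical choices made only for this illustration.

```python
import numpy as np


def split_variable(x, common_basis):
    """Split x into its projection on the common subspace and the remainder.

    common_basis must have orthonormal columns.
    """
    common_part = common_basis @ (common_basis.T @ x)
    return common_part, x - common_part


rng = np.random.default_rng(1)

# Two orthonormal score vectors: one common, one distinct (assumed known here).
Q, _ = np.linalg.qr(rng.standard_normal((20, 2)))
t_common, t_distinct = Q[:, 0], Q[:, 1]

# One variable of block 1 that mixes both sources of variation.
x = 3.0 * t_common + 4.0 * t_distinct

common_part, distinct_part = split_variable(x, t_common[:, None])

# Fraction of the variable's sum of squares in each part.
total_ss = x @ x
frac_common = (common_part @ common_part) / total_ss
frac_distinct = (distinct_part @ distinct_part) / total_ss

print("fraction common:", frac_common)
print("fraction distinct:", frac_distinct)
```

Because the two score vectors are orthonormal, the sums of squares split as 9 and 16 out of 25, so the variable is neither purely common nor purely distinct. In real data the remainder would also contain noise, not only distinct variation.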
