Multiblock Data Fusion in Statistics and Machine Learning - Tormod Næs
1.7 Fundamental Choices
In any multiblock data analysis, choices have to be made, such as which method to use and what kind of pre-processing to apply. Two fundamental questions that should always be considered (and dealt with) are highlighted below.
Variation explained: Do we only want to explain variation between blocks or also within blocks?
Fairness: Should all blocks play a role in the final solution or can we allow some of the blocks to be dominant in this respect?
The first concept, variation explained, pertains to the choice of whether to model the variation within the blocks, the variation between the blocks, or both. Each multiblock data analysis method makes a different choice in this respect, and it is thus up to the user to decide which aspect is the most important: between- or within-variation. A more detailed account is given in Section 2.4.
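The difference between within- and between-block variation can be illustrated with a small simulation. This is only a sketch under stated assumptions: the two blocks, the shared score `t`, the distinct score `d`, and all loadings are hypothetical and simulated. A PCA of one block stands in for "modelling within-block variation", and the leading singular vectors of the cross-product matrix stand in for "modelling between-block variation". Neither is claimed to be the specific method the text has in mind.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 500

# Hypothetical latent scores: t is shared by both blocks, d is distinct to block 1.
t = rng.normal(size=n)
d = rng.normal(size=n)
t -= t.mean()
d -= d.mean()
d -= t * (t @ d) / (t @ t)  # make d exactly uncorrelated with t in this sample

# Hypothetical loadings; q is orthogonal to p1.
p1 = np.ones(5) / np.sqrt(5)
q = np.array([1.0, -1.0, 0.0, 0.0, 0.0]) / np.sqrt(2)
p2 = np.ones(3) / np.sqrt(3)

# Block 1: weak shared part plus strong distinct part. Block 2: shared part only.
X1 = np.outer(t, p1) + 3.0 * np.outer(d, q) + 0.2 * rng.normal(size=(n, 5))
X2 = np.outer(t, p2) + 0.2 * rng.normal(size=(n, 3))
X1 -= X1.mean(axis=0)
X2 -= X2.mean(axis=0)

# Within-block view: first principal component of block 1 alone.
_, _, Vt = np.linalg.svd(X1, full_matrices=False)
within_score = X1 @ Vt[0]

# Between-block view: leading left singular vector of the cross-product X1' X2.
U, _, _ = np.linalg.svd(X1.T @ X2, full_matrices=False)
between_score = X1 @ U[:, 0]

# How strongly does each score track the shared latent score t?
corr_within = abs(np.corrcoef(within_score, t)[0, 1])
corr_between = abs(np.corrcoef(between_score, t)[0, 1])
print(f"|corr(within score, t)|  = {corr_within:.2f}")
print(f"|corr(between score, t)| = {corr_between:.2f}")
```

In this simulated setting the within-block component follows the large distinct source in block 1, whereas the between-block component follows the shared source, which is why the choice between the two matters.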
The concept of fairness relates to the notion that each block should participate in the final solution to a certain degree. Multiblock data analysis methods also differ in this respect: some methods are fair and some are ‘block selectors’ (Smilde et al., 2003; Tenenhaus et al., 2017). To some extent, fairness can be influenced by block-scaling (see Section 2.6), but some methods are invariant to this type of scaling. The method ROSA (see Section 7.5) builds on the principle of fairness: each block is allowed to enter the solution in competition with the other blocks, and it is included if it is important. The fairness concept is related to the concept of invariance discussed in Chapter 7.
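How block-scaling can influence fairness may be sketched with a PCA on concatenated blocks. Everything here is an assumption made for illustration: the data are simulated, the helper names `center`, `scale_block` and `block_shares` are hypothetical, and scaling each block to unit Frobenius norm is just one common choice of block-scaling, not necessarily the one described in Section 2.6.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50

# Two hypothetical blocks on the same n samples, driven by one shared score t.
# Block 2 is recorded on a numerical scale 100 times larger than block 1.
t = rng.normal(size=(n, 1))
X1 = t @ np.ones((1, 5)) + 0.5 * rng.normal(size=(n, 5))
X2 = 100.0 * (t @ np.ones((1, 3)) + 0.5 * rng.normal(size=(n, 3)))

def center(X):
    """Column-center a block."""
    return X - X.mean(axis=0)

def scale_block(X):
    """Center a block and scale it to unit Frobenius norm (one form of block-scaling)."""
    Xc = center(X)
    return Xc / np.linalg.norm(Xc)

def block_shares(blocks):
    """Fraction of the squared first PCA loading vector that falls in each block."""
    X = np.hstack([center(B) for B in blocks])
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    loading = Vt[0]  # unit-norm loading of the first component
    edges = np.cumsum([0] + [B.shape[1] for B in blocks])
    return np.array([np.sum(loading[a:b] ** 2)
                     for a, b in zip(edges[:-1], edges[1:])])

raw_shares = block_shares([X1, X2])
scaled_shares = block_shares([scale_block(X1), scale_block(X2)])
print("shares without block-scaling:", np.round(raw_shares, 3))
print("shares with block-scaling:   ", np.round(scaled_shares, 3))
```

Without block-scaling, the first component is determined almost entirely by the block with the larger numerical scale. After block-scaling, both blocks contribute to a comparable degree in this simulation. Methods that are invariant to block-scaling would not show this difference.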