Multiblock Data Fusion in Statistics and Machine Learning - Tormod Næs

1.7 Fundamental Choices

Any multiblock data analysis involves choices, such as which method to use and what kind of pre-processing to apply. Two fundamental questions that should always be considered (and dealt with) are highlighted below.

Variation explained: Do we only want to explain variation between blocks or also within blocks?

Fairness: Should all blocks play a role in the final solution or can we allow some of the blocks to be dominant in this respect?

The first concept, variation explained, concerns whether we model the variation within the blocks, the variation between the blocks, or both. Each multiblock data analysis method makes a different choice in this respect, so it is up to the user to decide which aspect is the most important: between-block or within-block variation. A more detailed account is given in Section 2.4.

The concept of fairness relates to the notion that each block should participate in the final solution to a certain degree. Multiblock data analysis methods also differ in this respect: some methods are fair and some are ‘block selectors’ (Smilde et al., 2003; Tenenhaus et al., 2017). To some extent, fairness can be influenced by block-scaling (see Section 2.6), but some methods are invariant to this type of scaling. The method ROSA (see Section 7.5) builds on the principle of fairness: each block is allowed to enter the solution in competition with the other blocks, and it is included if it is important. The fairness concept is related to the concept of invariance discussed in Chapter 7.
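
The effect of block-scaling on fairness can be illustrated with a small numerical sketch. The example below is not taken from the book: it assumes synthetic data in which two blocks share one common component, uses PCA on the concatenated blocks as a stand-in for a scaling-sensitive multiblock method, and takes scaling each centred block to unit Frobenius norm as one common form of block-scaling (Section 2.6 may define it differently). The function names `block_scale` and `block_contributions` are hypothetical.

```python
import numpy as np

def block_scale(blocks):
    """Centre each block column-wise and scale it to unit Frobenius norm."""
    scaled = []
    for X in blocks:
        Xc = X - X.mean(axis=0)
        scaled.append(Xc / np.linalg.norm(Xc))  # Frobenius norm for 2-D input
    return scaled

def block_contributions(blocks):
    """Share of the squared first PCA loading vector that falls in each block."""
    X = np.hstack(blocks)
    X = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    p = Vt[0] ** 2  # squared loadings of the first component, sums to 1
    edges = np.cumsum([0] + [b.shape[1] for b in blocks])
    return [float(p[a:b].sum()) for a, b in zip(edges[:-1], edges[1:])]

# Synthetic data (an assumption for illustration): both blocks carry the
# same common scores t, but block 2 is measured on a 100 times larger scale.
rng = np.random.default_rng(0)
n = 50
t = rng.normal(size=(n, 1))
X1 = t @ rng.normal(size=(1, 5)) + 0.1 * rng.normal(size=(n, 5))
X2 = 100 * (t @ rng.normal(size=(1, 8)) + 0.1 * rng.normal(size=(n, 8)))

raw = block_contributions([X1, X2])
scaled = block_contributions(block_scale([X1, X2]))
print("without block-scaling:", raw)
print("with block-scaling:   ", scaled)
```

Without block-scaling, the first component is determined almost entirely by the large-scale block; after block-scaling, the two blocks contribute in roughly equal shares. A method that is invariant to block-scaling would, by definition, show no such difference.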
