Multiblock Data Fusion in Statistics and Machine Learning - Tormod Næs
1.9 Overview and Links
In this book we will consider a multitude of methods. To streamline this, we give a summary at the beginning of each chapter of the methods and aspects that will be discussed, in the form of a table. We will specify the following aspects of the methods:
A. A method for unsupervised (U), supervised (S) or complex (C) data structures.
B. The method can deal with heterogeneous data (HET, i.e., different measurement scales) or can only deal with homogeneous data (HOM).
C. A method that uses a sequential (SEQ) or simultaneous (SIM) approach.
D. The method is defined in terms of a model (MOD) or in terms of an algorithm (ALG).
E. A method for finding common (C); common and distinct (CD); or common, local and distinct (CLD) components.
F. Estimation of the model parameters is based on least squares (LS), maximum likelihood (ML), eigenvalue decompositions (ED) or maximising covariances or correlations (MC).
The first item (A) is used to organise the different chapters. Some methods can deal with data of different measurement scales (heterogeneous data), whereas others can only handle homogeneous data. The difference between simultaneous and sequential methods is explained in more detail in Chapter 2. Some methods are defined by a clear model and some are based on an algorithm. The already discussed topic of common and distinct variation is also an important distinguishing feature of the methods, and the sections in some of the chapters are organised according to this principle. Finally, there are different ways of estimating the parameters (weights, scores, loadings, etc.) of the multiblock models. This is also explained in more detail in Chapter 2.
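As an aside, the six aspects can be read as a small data structure. The following is a minimal Python sketch of that reading, not something taken from the book: the class names (`Structure`, `DataType`, `Approach`, `Definition`, `Components`, `Estimation`, `MethodProfile`) are hypothetical, and the `DemoMethod` entry and its properties are invented purely for illustration. It also shows why the two uses of the abbreviation C (complex under aspect A, common under aspect E) need to be kept apart.

```python
from dataclasses import dataclass
from enum import Enum


class Structure(Enum):       # aspect A
    U = "unsupervised"
    S = "supervised"
    C = "complex"


class DataType(Enum):        # aspect B
    HOM = "homogeneous"
    HET = "heterogeneous"


class Approach(Enum):        # aspect C
    SEQ = "sequential"
    SIM = "simultaneous"


class Definition(Enum):      # aspect D
    MOD = "model-based"
    ALG = "algorithm-based"


class Components(Enum):      # aspect E
    C = "common"
    CD = "common/distinct"
    CLD = "common/local/distinct"


class Estimation(Enum):      # aspect F
    LS = "least squares"
    ML = "maximum likelihood"
    ED = "eigendecomposition"
    MC = "maximising correlations/covariances"


# Column order of the overview table: aspects A to F, members in listed order.
ASPECTS = (Structure, DataType, Approach, Definition, Components, Estimation)


@dataclass(frozen=True)
class MethodProfile:
    name: str
    section: str
    properties: frozenset  # the enum members that apply to this method

    def row(self, mark="x"):
        """Return the table row: name, section, then one cell per property."""
        cells = [self.name, self.section]
        for aspect in ASPECTS:
            for member in aspect:
                cells.append(mark if member in self.properties else "")
        return cells


def header():
    """Column labels matching MethodProfile.row()."""
    return ["Method", "Section"] + [m.name for a in ASPECTS for m in a]


# Hypothetical entry: the properties below are made up, NOT from the book.
demo = MethodProfile(
    name="DemoMethod",
    section="0.0",
    properties=frozenset({
        Structure.U, DataType.HOM, Approach.SIM,
        Definition.MOD, Components.CD, Estimation.LS,
    }),
)

print(" | ".join(header()))
print(" | ".join(demo.row()))
```

Storing the properties as a set rather than one value per aspect is a deliberate choice in this sketch: it leaves room for a method to be marked in more than one column of the same aspect.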
Table 1.1 is an example of such a table for Chapter 6. This table presents a bird's-eye view of the properties of the methods. Each chapter discussing methods will start with such a table to set the scene. We will end most chapters with some recommendations for practitioners on which method to use in which situation.
Table 1.1 Overview of methods. Legend: U = unsupervised, S = supervised, C = complex, HOM = homogeneous data, HET = heterogeneous data, SEQ = sequential, SIM = simultaneous, MOD = model-based, ALG = algorithm-based, C = common, CD = common/distinct, CLD = common/local/distinct, LS = least squares, ML = maximum likelihood, ED = eigendecomposition, MC = maximising correlations/covariances. For abbreviations of the methods, see Section 1.11.
| Method | Section | A: U | A: S | A: C | B: HOM | B: HET | C: SEQ | C: SIM | D: MOD | D: ALG | E: C | E: CD | E: CLD | F: LS | F: ML | F: ED | F: MC |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| ASCA | 6.1 | | | | | | | | | | | | | | | | |
| ASCA+ | 6.1.3 | | | | | | | | | | | | | | | | |
| LiMM-PCA | 6.1.3 | | | | | | | | | | | | | | | | |
| MSCA | 6.2 | | | | | | | | | | | | | | | | |
| PE-ASCA | 6.3 | | | | | | | | | | | | | | | | |