Multiblock Data Fusion in Statistics and Machine Learning - Tormod Næs

ELABORATION 1.2 High Level Supervised Fusion

High-level supervised fusion focuses on combining classification or prediction results for improved precision. Instead of using a method that takes the different data sets into account when building a predictor or classifier, high-level fusion takes the results of individual predictions and combines them in the best possible way. In other words, high-level fusion refers to combining results from already established prediction or classification methods.

A possible drawback of this strategy, compared to low-level and feature-level fusion, is that it does not provide further insight into how the different measurements relate to each other and how they can best be combined in the prediction of the outcome. On the other hand, high-level fusion of prediction results for new samples does not generally require the individual predictors to be developed from the same samples. In other words, when two (or more) predictors are to be combined for a new sample, they do not need to come from the same data source. It is possible to simply plug in the new data and obtain predictions that can be combined as described below. In this sense, it is more flexible than low- and feature-level fusion (Ballabio et al., 2019). Doeswijk et al. (2011) showed that fusing classifiers most often gives similar or improved prediction results compared to using only one of them. An overview of the use of high-level fusion (and other methods) can be found in Borràs et al. (2015).

A simple way of combining classifiers is voting, which counts how many of the classifiers agree on each class. Several voting schemes have been proposed in the literature. One of them is simple democratic majority voting, in which the group/class that gets the highest number of votes is chosen. In the case of ties, the result is inconclusive. An alternative strategy is 75% voting, in which 75% of the votes must be for the same class before a decision can be made.
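The two voting schemes can be illustrated with a minimal Python sketch. The function name `fuse_votes` and the parameter `min_share` are hypothetical and not taken from the book, and the sketch assumes that "75% of the votes" means at least 75%. An inconclusive result is returned as `None`.

```python
from collections import Counter


def fuse_votes(votes, min_share=None):
    """Fuse class labels predicted by several classifiers for one sample.

    votes     : list of class labels, one per classifier
    min_share : None for simple majority voting, or a fraction such as
                0.75 for threshold voting (assumed to mean "at least")
    Returns the winning label, or None if the result is inconclusive.
    """
    if not votes:
        raise ValueError("at least one vote is required")

    # Class labels sorted by descending vote count.
    counts = Counter(votes).most_common()
    top_label, top_count = counts[0]

    if min_share is None:
        # Simple majority voting: a tie for first place is inconclusive.
        if len(counts) > 1 and counts[1][1] == top_count:
            return None
        return top_label

    # Threshold voting: the winner needs the required share of all votes.
    if top_count / len(votes) >= min_share:
        return top_label
    return None
```

For example, `fuse_votes(["A", "A", "B"])` selects class `"A"` under majority voting, whereas `fuse_votes(["A", "A", "B"], min_share=0.75)` is inconclusive because only two of the three votes agree.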

Fusing quantitative predictors is most easily done using averages or weighted averages, with weights depending on the prediction error of the individual predictors, as determined by, for instance, cross-validation. This strategy has similarities with so-called bagging (see, e.g., Freund (1995)). In machine learning, high-level supervised fusion belongs to the sub-domain ‘ensemble learning’.
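A weighted average of quantitative predictions might be sketched as follows. The text only states that the weights depend on the prediction error, so the specific choice here, weights inversely proportional to each predictor's cross-validated error (for instance a mean squared error), is an assumption made for illustration. The names `fuse_predictions` and `cv_errors` are likewise hypothetical.

```python
def fuse_predictions(predictions, cv_errors=None):
    """Fuse quantitative predictions for one sample by (weighted) averaging.

    predictions : list of predicted values, one per predictor
    cv_errors   : optional list of cross-validated prediction errors
                  (e.g. mean squared errors), one per predictor
    Assumption: weights are taken as 1 / error, so more accurate
    predictors count more. Other weighting schemes are possible.
    """
    if not predictions:
        raise ValueError("at least one prediction is required")

    if cv_errors is None:
        # Plain average: every predictor gets the same weight.
        weights = [1.0] * len(predictions)
    else:
        if len(cv_errors) != len(predictions):
            raise ValueError("need one error per prediction")
        if any(e <= 0 for e in cv_errors):
            raise ValueError("errors must be positive")
        weights = [1.0 / e for e in cv_errors]

    # Normalise by the total weight so the result is a weighted mean.
    total = sum(weights)
    return sum(w * p for w, p in zip(weights, predictions)) / total
```

With predictions 2.0 and 4.0, the plain average is 3.0. If the second predictor has three times the cross-validated error of the first, the fused value moves towards the first prediction.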
