Multiblock Data Fusion in Statistics and Machine Learning - Tormod Næs

Glossary of terms

Data set: The total collection of all data that is under consideration for a particular problem.

Data block: One block of data organised in a matrix (array) with rows and columns as a part of a data set.

Multiblock data set: The organisation of the data set in blocks of data.

Multiblock data analysis: The process of analysing the whole multiblock data set simultaneously using multiblock methods.

Object, Subject, Sample: Entity for which measurements are obtained. They can be random drawings from a population and/or they can come from an experimental design. The general term is a sample but if these samples pertain to human beings they may be called subjects. They constitute the row entries of a matrix.

Variable: A measured property of an entity collected in the columns of a matrix; this is called a feature in machine learning.

Measurement scale: The scale on which a variable is measured (ratio, interval, ordinal, or nominal-scaled).

Homogeneous versus heterogeneous data: If a data set contains blocks of data all measured on the same scale then this is called homogeneous data; if not, then the data are called heterogeneous. In most cases, homogeneous data will refer to blocks containing quantitative data (at least interval-scaled).
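As an informal illustration of this vocabulary, the sketch below represents a multiblock data set as a list of blocks, each with samples in the rows and variables in the columns. The `DataBlock` class, the `is_homogeneous` helper, the block names, and all numbers are hypothetical and chosen only for illustration. The sketch also assumes that each block carries a single measurement scale and that the two blocks describe the same three samples, which the glossary does not require.

```python
from dataclasses import dataclass


@dataclass
class DataBlock:
    """One block of data: rows are samples, columns are variables (hypothetical helper)."""
    name: str
    values: list  # list of rows, each row a list of measurements
    scale: str    # assumed to be one of "ratio", "interval", "ordinal", "nominal"

    @property
    def n_samples(self) -> int:
        # Samples (objects, subjects) are the row entries.
        return len(self.values)

    @property
    def n_variables(self) -> int:
        # Variables (features) are the column entries.
        return len(self.values[0]) if self.values else 0


def is_homogeneous(blocks) -> bool:
    """Return True if all blocks are measured on the same scale."""
    return len({block.scale for block in blocks}) <= 1


# Made-up numbers: three samples measured in two blocks.
spectra = DataBlock(
    "spectra",
    [[0.1, 0.4, 0.3], [0.2, 0.5, 0.1], [0.0, 0.3, 0.6]],
    "ratio",
)
liking = DataBlock("liking", [[3, 5], [2, 4], [5, 1]], "ordinal")

# The multiblock data set is the organisation of the data in blocks.
multiblock = [spectra, liking]

print(spectra.n_samples, spectra.n_variables)  # 3 3
print(is_homogeneous(multiblock))              # False: ratio and ordinal scales are mixed
```

Here the data set is heterogeneous in the glossary's sense because a ratio-scaled block is combined with an ordinal-scaled one.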

Elaboration 1.1 suggests a consistent vocabulary to be used in the book. The difference between variables and objects is not always clear, however (for examples, see Chapter 8 on complex relations). We will nevertheless try to remain as consistent as possible and give extra explanations of terms at the appropriate places. In the rest of this chapter we delineate our potential audience, give some examples of why multiblock methods are necessary, and give an overview of the types of problems encountered. We also give some history and briefly discuss some fundamental concepts that we need in the rest of the book. We end with the notation used in this book and a list of abbreviations.
