Reservoir Characterization, collective of authors, page 46
3.3 Basic Anomaly Detection Classifiers
Three basic classifiers are introduced, analyzed and tested in this paper:
1. Distance from the center of the training set: (3.4) where $y_m$ and $c_{tr,m}$ are the $m$-th coordinates of the tested record and of the center of the training set, respectively. The center of the training set is defined as the mean over the training set records, so its coordinates are of the form $c_{tr,m} = \frac{1}{K}\sum_{k=1}^{K} y_{k,m}$, where $y_{k,m}$ is the $m$-th coordinate of the $k$-th record in the training set and $K$ is the total number of records in the training set.
2. Nearest neighbors sparsity: (3.5) where $\mathrm{dist}(Y, \mathrm{neighbor}_l)$ is the distance between the tested record $Y$ and its $l$-th nearest neighbor from the training set. The farther a tested record lies from the training set records in parameter space, the larger both its sparsity and its distance from the center of the training set. These two classifiers are universal: their performance is not affected by the properties of records in the training set.
3. Divergence is defined as follows: (3.6)
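The first two classifiers can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the distance is taken to be Euclidean, and the sparsity is taken to be the mean distance to the nearest training records. The function names, the `n_neighbors` parameter, and the sample data are hypothetical.

```python
import math

def training_center(train):
    """Center of the training set: per-coordinate mean over its K records."""
    k = len(train)
    return [sum(column) / k for column in zip(*train)]

def distance_from_center(record, train):
    """Distance of a tested record from the training-set center.
    Assumption: Euclidean metric."""
    return math.dist(record, training_center(train))

def neighbor_sparsity(record, train, n_neighbors=3):
    """Mean distance from a tested record to its nearest training records.
    Assumption: Euclidean metric, simple average over the neighbors."""
    dists = sorted(math.dist(record, t) for t in train)
    nearest = dists[:n_neighbors]
    return sum(nearest) / len(nearest)

# Hypothetical two-parameter training set and two tested records
train = [[0.0, 0.0], [2.0, 0.0], [0.0, 2.0], [2.0, 2.0]]
tested = [[1.0, 1.0], [5.0, 5.0]]

for record in tested:
    print(record,
          distance_from_center(record, train),
          neighbor_sparsity(record, train, n_neighbors=2))
```

In this toy data the second tested record lies well outside the training cloud, so both scores are larger for it than for the first record, which sits at the center.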
The divergence defined by Eq. 3.6 is of the “Bregman divergence” type (Bregman [20]). It is similar to a distance, but satisfies neither the triangle inequality nor the symmetry condition. Applications of Bregman divergence to machine learning problems are presented, for example, in Banerjee et al. [21]. The Bregman-type divergence of Eq. 3.6 is a new, highly specialized AD classifier whose coefficients $a_m$ depend on the anomaly type, so it needs prior information about the type of potential anomaly. This classifier may be efficient, for example, if all coordinates of the anomaly records tend to be smaller than the respective coordinates of the records in the training set. This is expected to be the case for such parameters as Vp/Vs and Poisson’s ratio, if the training set is a compilation of records obtained in brine-filled sands or shales and the anomaly of interest is gas-filled sands. In this case reasonable values for the coefficients in Eq. 3.6 are $a_m = 1$.
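The statement that a Bregman divergence need not be symmetric can be checked numerically. The sketch below uses the general textbook definition, $D_\phi(x, y) = \phi(x) - \phi(y) - \langle \nabla\phi(y), x - y \rangle$, with two standard generating functions. It does not reproduce Eq. 3.6; the function names and sample vectors are hypothetical.

```python
import math

def bregman_divergence(x, y, phi, grad_phi):
    """General Bregman divergence: phi(x) - phi(y) - <grad phi(y), x - y>."""
    inner = sum(g * (xi - yi) for g, xi, yi in zip(grad_phi(y), x, y))
    return phi(x) - phi(y) - inner

# phi(v) = sum of v_i^2 generates the squared Euclidean distance (symmetric)
def sq_norm(v):
    return sum(vi * vi for vi in v)

def sq_norm_grad(v):
    return [2.0 * vi for vi in v]

# phi(v) = sum of v_i * log(v_i) generates the generalized KL divergence
# (asymmetric); requires strictly positive coordinates
def neg_entropy(v):
    return sum(vi * math.log(vi) for vi in v)

def neg_entropy_grad(v):
    return [math.log(vi) + 1.0 for vi in v]

x = [1.0, 1.0]
y = [2.0, 4.0]

d_sq = bregman_divergence(x, y, sq_norm, sq_norm_grad)
d_xy = bregman_divergence(x, y, neg_entropy, neg_entropy_grad)
d_yx = bregman_divergence(y, x, neg_entropy, neg_entropy_grad)
print(d_sq, d_xy, d_yx)
```

With the entropy-based generator, swapping the arguments changes the value, which is the asymmetry mentioned above. The squared Euclidean case is symmetric but still violates the triangle inequality: for the one-dimensional points 0, 1 and 2 the direct value is 4, while the two-step sum is 2.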
We also construct and test adaptive aggregated anomaly classifiers designed to identify anomalies with unknown properties. They are built as a linear combination of measured parameters:
Weights $s_m$ in Eq. 3.7 should be adjusted according to the properties of a specific anomaly. In this paper, the authors present a technique for optimizing these coefficients to detect an anomaly with unknown properties.
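As a rough illustration of an aggregated classifier, the sketch below assumes that the score is a plain weighted sum of the measured parameters, $\sum_m s_m y_m$. That is one reading of "linear combination" and may differ in detail from Eq. 3.7. The weights are fixed by hand here; the optimization technique is not reproduced. The function name, weights, and sample values are hypothetical.

```python
def aggregated_score(record, weights):
    """Weighted sum of measured parameters: sum over m of s_m * y_m.
    Assumption: no bias term and no normalization of the parameters."""
    if len(record) != len(weights):
        raise ValueError("record and weights must have the same length")
    return sum(s * y for s, y in zip(weights, record))

# Illustrative (made-up) records: (Vp/Vs, Poisson's ratio)
background = [1.9, 0.31]
candidate = [1.6, 0.18]

# Hypothetical weights: negative signs make lower parameter values
# produce a higher anomaly score
weights = [-1.0, -1.0]

print(aggregated_score(background, weights))
print(aggregated_score(candidate, weights))
```

With these hand-picked weights the candidate record, whose parameters are lower than the background values, receives the higher score. In practice the sign and magnitude of each weight would come from the optimization step rather than from manual choice.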