
3.2 Literature Survey


In this section, we briefly review the studies and research most relevant to our work in the fields of EEG analysis and neuromarketing.

Primarily, we got a solid foundation [1] from the paper by Dr. Partha Pratim Roy and his associates on the analysis of EEG signals and their application to neuromarketing. In that paper, he used the Hidden Markov Model and recorded the dataset using a user-independent test approach. He also proposed a predictive modelling framework to acquire knowledge of which sample products a consumer likes or dislikes, using an Emotiv EPOC+ sensor. We borrowed his dataset for our initial study as an ice-breaker, and it helped us immensely.

After reading his paper, we naturally searched for the part of the mind that consciously decides whether a person likes or dislikes a product in a natural environment. We encountered many factors, such as presentation, material composition, past experiences, cost and brand value, that a person uses to determine likeability. But this alone was not enough, so we decided to delve into emotion recognition to identify which areas of the brain elicit an emotion. What follows are our concise notes on emotion recognition, after which we provide the methodological research of the models.

This paper [2], by Samarth Tripathi and his associates, concerns automatic emotion classification from EEG data, applying Deep and Convolutional Neural Networks to the DEAP dataset. Earlier emotion recognition relied on text, speech, facial expressions, etc. as the analyzed parameters. An emotion is a psychophysiological process triggered by the voluntary or involuntary perception of a situation.

In this work, peripheral physiological signals of 32 subjects were recorded while they watched videos and rated them on levels of arousal and valence. A 32-channel, 512 Hz Biosemi ActiveTwo device with active AgCl electrodes was used to compile the data.

Neural networks approximate functions from large datasets of unknown inputs through training and statistical models. Here, two neural models are used: (1) a Deep Neural Network (DNN) and (2) a Convolutional Neural Network (CNN). The dataset contains 8,064 signal readings from each of 40 channels per subject, for a total of 322,560 readings for the models to process. The first model, the DNN, used 4 layers, with the output of each layer becoming the input of the next. As the dataset was limited, they applied the dropout technique with a high epoch count, keeping account of all training vectors when updating weights. The data were divided into batches that pass through the learning algorithm before each epoch update; training used batches of 310 over 250 epochs. For the second model, the CNN, the DEAP data were converted into 2D images of 101 readings per channel, giving a size of 4,040 units (40 channels × 101 readings). The CNN's first layer used the hyperbolic tangent ('tanh') as the activation function in the valence classification model and 'ReLU' in the arousal model. The subsequent layers used 100 filters with 3 × 3 kernels and the same 'tanh' activation in both classifier models. The last dense layer used 'softplus' as its activation function, with categorical cross-entropy (CCE) as the loss function and stochastic gradient descent (SGD) as the optimizer.

The learning rates were found to be 0.00001 for valence and 0.001 for arousal, with a gradient momentum of 0.9. These models yielded improvements of 4.51% and 4.96% in classifying valence and arousal respectively, across 2 classes (High/Low) for valence and 3 classes (High/Normal/Low) for arousal. The learning rate is marginally useful, but the dropout probability secures the best classification across levels. They also noted that a wrong choice of activation function, especially in the first CNN layer, severely degrades the models. The models were highly accurate relative to previous work and support the view that neural networks are key to EEG-based emotion classification, a step toward unlocking the brain.
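To make the CNN pipeline concrete, the following is a minimal Keras sketch of the valence variant as described above (tanh in the first convolutional layer, 100 filters with 3 × 3 kernels, a softplus output, CCE loss, and SGD with the quoted learning rate and momentum); the layer count, input shape and names are our illustrative assumptions, not the authors' code:

```python
# Illustrative sketch, not the authors' implementation: a DEAP trial is
# assumed to be reshaped into a 40 x 101 single-channel 2D "image".
from tensorflow.keras import layers, models, optimizers

def build_valence_cnn(n_classes=2):
    model = models.Sequential([
        # First conv layer: 'tanh' for the valence model
        # (the arousal variant would use 'relu' here instead).
        layers.Conv2D(100, (3, 3), activation="tanh",
                      input_shape=(40, 101, 1)),
        # Subsequent layers: 100 filters, 3x3 kernels, tanh activation.
        layers.Conv2D(100, (3, 3), activation="tanh"),
        layers.Flatten(),
        # Last dense layer with softplus, as reported above.
        layers.Dense(n_classes, activation="softplus"),
    ])
    model.compile(
        optimizer=optimizers.SGD(learning_rate=1e-5, momentum=0.9),
        loss="categorical_crossentropy",  # CCE
        metrics=["accuracy"],
    )
    return model
```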

Deep Neural Networks are thus used to analyze human emotions and classify them by PSD and frontal asymmetry features. A training model for the emotion dataset is created to identify its instances. Emotions are of two types: discrete, classified as a synchronized response in neural anatomy, physiology and morphological expressions; and dimensional, i.e., representable by a small number of underlying affective dimensions, in other words, vectors in a multidimensional space.

The aim of this paper is to identify excitement, meditation, boredom and frustration from the DEAP emotion dataset with a classification algorithm. The Python language is used, including the SciKit-Learn toolbox, SciPy and the Keras library. The DEAP dataset contains physiological readings of 32 participants recorded at a sampling rate of 512 Hz, band-pass filtered to 4.0–45.0 Hz with EOG artifacts eliminated. Power Spectral Density (PSD), based on the Fast Fourier Transform, decomposes the data into 4 distinct frequency ranges, i.e., theta (4–8 Hz), alpha (8–13 Hz), beta (13–30 Hz) and gamma (30–40 Hz), using an average-power function from a Python signal processing toolbox. The left hemisphere of the brain activates more frequently with positive valence, and the right hemisphere with negative valence.
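As a concrete illustration of this PSD step, here is a hedged sketch using SciPy's Welch estimator in place of the unspecified average-power helper; the function name and segment length are our assumptions:

```python
import numpy as np
from scipy.signal import welch

BANDS = {"theta": (4, 8), "alpha": (8, 13), "beta": (13, 30), "gamma": (30, 40)}

def band_powers(eeg_channel, fs=512):
    """Average PSD power per EEG band for one channel (1-D array)."""
    freqs, psd = welch(eeg_channel, fs=fs, nperseg=2 * fs)
    return {
        name: np.trapz(psd[(freqs >= lo) & (freqs < hi)],
                       freqs[(freqs >= lo) & (freqs < hi)])
        for name, (lo, hi) in BANDS.items()
    }
```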


Emotion estimation from EEG frontal asymmetry:

Ramirez et al. classified emotional states by computing arousal levels from the prefrontal cortex and valence levels as follows: arousal was computed as the ratio of beta to alpha activity in the frontal cortex, and valence as the relative frontal alpha activity in the right lobe compared to the left lobe:

arousal = (β_F3 + β_F4) / (α_F3 + α_F4)

valence = α_F4 − α_F3

A time-frequency transform was used to extract the spectral features alpha (8–11 Hz) and beta (12–29 Hz). Finally, the Mean Absolute Error (MAE), Mean Squared Error (MSE) and Pearson correlation (Corr) were used.


By scaling the 1–9 ratings into valence and arousal classes (High and Low), we see that frustration and excitement register as high arousal in the low-valence and high-valence areas respectively, whereas meditation and boredom register as low arousal in the high-valence and low-valence areas respectively.

The DNN classifier takes 2,184 input units, with each hidden layer having 60% of its predecessor's units. Training used roughly 10% of the dataset, divided into a training set, a validation set and a test set. After setting a dropout of 0.2 for the input layer and 0.5 for the hidden layers, the model recognized arousal (and valence) at rates of 73.06% (73.14%), 60.7% (62.33%), and 46.69% (45.32%) for 2, 3, and 5 classes, respectively. The kernel-based classifier was observed to have better accuracy than other methods such as Naïve Bayes and SVM. The result was a set of 2,184 unique features describing EEG activity during each trial; these extracted features were used to train the DNN classifier and a random forest classifier. This approach is especially successful for BCI, where datasets are huge.
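A minimal Keras sketch of such a shrinking DNN with the stated dropout rates follows; the hidden-layer count, ReLU activation, softmax output and Adam optimizer are our assumptions, since the text specifies only the input width, the 60% shrinkage and the dropout values:

```python
from tensorflow.keras import layers, models

def build_dnn(n_features=2184, n_classes=2, n_hidden=3):
    model = models.Sequential()
    # Dropout of 0.2 applied directly to the input features.
    model.add(layers.Dropout(0.2, input_shape=(n_features,)))
    units = n_features
    for _ in range(n_hidden):                  # depth is assumed
        units = int(units * 0.6)               # 60% of predecessor's units
        model.add(layers.Dense(units, activation="relu"))
        model.add(layers.Dropout(0.5))         # hidden-layer dropout
    model.add(layers.Dense(n_classes, activation="softmax"))
    model.compile(optimizer="adam", loss="categorical_crossentropy",
                  metrics=["accuracy"])
    return model
```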

Emotion monitoring using LVQ and EEG identifies emotions for medical therapy and rehabilitation purposes. [3] proposes a system for monitoring human emotions in real time through wavelets and "Learning Vector Quantization" (LVQ). Training data from 10 subjects with 10 trials each, "3 classes and 16 segments (equal to 480 sets of data)", is processed within 10 seconds and decomposed into 4 frequency bands. These bands then become the input for the LVQ, which sorts them into excited, relaxed or sad emotions. Alpha waves appear frequently when people are relaxed, beta waves occur when people think, theta waves occur when people are stressed, tired or sleepy, and delta waves occur in deep sleep. EEG data were captured with a wireless Emotiv Insight headset on 10 participants, using electrodes at "AF3", "T7", "T8" and "AF4" with a 128 Hz sampling frequency, recorded in the morning, at noon and at night. 1,280 points are recorded in a set, which occurs every 3 minutes, segmented every 10 s. Each participant is analyzed for the excited, relaxed and sad states. Using the wavelet transform, the EEG was decomposed into the required frequencies. The discrete wavelet transform (DWT) of a signal X(n) is described as follows:

W(j, k) = Σ_n X(n) · 2^(j/2) · ψ(2^j · n − k)

where ψ(·) is known as the wavelet basis function. The approximation signal is generated by convolving the original signal with the low-pass filter, and the detail signal with the high-pass filter:

y_approx(k) = Σ_n x(n) · g(2k − n)

y_detail(k) = Σ_n x(n) · h(2k − n)


Where, x(n) = original signal

 g(n) = low-pass filter coefficients

 h(n) = high-pass filter coefficients

 k, n = indices, 1 … length of signal

 Scale function coefficients (low-pass filter): g0 = (1 − √3)/(4√2), g1 = (3 − √3)/(4√2), g2 = (3 + √3)/(4√2), g3 = (1 + √3)/(4√2)

 Wavelet function coefficients (high-pass filter): h0 = (1 − √3)/(4√2), h1 = −(3 − √3)/(4√2), h2 = (3 + √3)/(4√2), h3 = −(1 + √3)/(4√2)
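These are the coefficients of the 4-tap Daubechies wavelet, so in practice the decomposition can be sketched with PyWavelets' built-in 'db2' filter; the level count and band mapping below are our assumptions for a 128 Hz signal:

```python
import pywt

def dwt_bands(signal, wavelet="db2", levels=4):
    """Multi-level DWT of one EEG segment.

    At fs = 128 Hz the detail coefficients D1..D4 roughly cover the
    gamma, beta, alpha and theta ranges, and the final approximation
    A4 covers delta.
    """
    coeffs = pywt.wavedec(signal, wavelet, level=levels)
    approximation, details = coeffs[0], coeffs[1:]
    return approximation, details
```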

When the class label of each input is known, a supervised version of vector quantization called "Learning Vector Quantization" can be used to assign the class based on the Euclidean distance between the input and the reference (weight) vectors. Each training sample's class was compared using:

D(x, w_i) = √( Σ_j (x_j − w_ij)² )


[Figure: the series of stages in the input identification system.]


“As stated, the LVQ algorithm attempts to correct the winning weight W_i with minimum distance D by shifting it as follows:

1 If the input x_i and the winning w_i have the same class label, move them closer together by ΔW_i(j) = B(j)(X_ij − W_ij).

2 If the input x_i and the winning w_i have different class labels, move them apart by ΔW_i(j) = −B(j)(X_ij − W_ij).

3 Voronoi vectors/weights w_j corresponding to other input regions are left unchanged, with Δw_j(t) = 0.”
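A NumPy sketch of one such LVQ update step follows; the data layout (one weight row per reference vector) and the fixed learning-rate argument are our assumptions:

```python
import numpy as np

def lvq_step(x, x_label, weights, weight_labels, lr):
    """Shift only the winning reference vector for a single input x."""
    dists = np.linalg.norm(weights - x, axis=1)  # Euclidean distance D
    i = int(np.argmin(dists))                    # winning weight W_i
    if weight_labels[i] == x_label:
        weights[i] += lr * (x - weights[i])      # same class: pull closer
    else:
        weights[i] -= lr * (x - weights[i])      # different class: push away
    return weights                               # all other weights unchanged
```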

The parameters used to train the LVQ model were a learning rate of 0.01 to 0.05, a learning-rate reduction of 0.001 and a maximum of 10,000 epochs. A learning rate of 0.05 gave the highest accuracy. An accuracy of 72% was achieved without feature extraction, 87% with extraction using a subset of symmetric channel pairs called the asymmetric wave, and 84% without the asymmetric wave. Using LVQ kept computation time under a minute without any loss of accuracy. Generalization during LVQ training was faster and more stable than with a Multilayer Perceptron; 10 seconds of signal data was identified in 0.44 s in each test.

Six different emotional states, namely sorrow, fear, happiness, frustration, satisfaction and enjoyment, can be classified by extracting features from EEG signals with different methods. Decent accuracy was achieved by extracting features appropriate to the emotional states using discrete wavelet transforms and an ANN recognizer [4]. In this model, valence ranges from negative to positive whereas arousal goes from calm to excitement; the model used is the 2-dimensional arousal-valence model, and discrete wavelet transforms were applied to the brain signals to derive the feature sets. Stimuli were evoked in the participants' neural signals using the IAPS dataset, which contains 956 images spanning the emotional states; the IAPS participants rated every picture for valence and arousal. 18 electrodes of a 21-electrode headset were used, positioned by the standard 10–20 system, at a sampling rate of 128 Hz. Since every subject's emotions differ, each subject rated his or her emotion with a self-assessment manikin (SAM) over the 2-dimensional arousal/valence space, each dimension having 5 levels of intensity. The test was attended by 5 participants between the ages of 25 and 32. Each participant was given a stimulus of 5 s, since the duration of each emotion is about 0.5 to 4 s.

To do this, the data are decomposed into 4 frequency bands: alpha, beta, theta and delta. ECG (heart) artifacts at about 1.2 Hz, EOG (blinking) artifacts below 4 Hz, EMG (muscle) artifacts at about 30 Hz and non-physiological power-line artifacts above 50 Hz were removed in preprocessing. In the DWT all frequency bands are used, and for each trial the feature vector has 18 × 3 × 9 × 4 = 1,944 elements (18 electrodes, 3 statistical features, 9 temporal windows and 4 frequency bands). An artificial neural network trained with the backpropagation algorithm was used as the classifier; the architecture consists of 6 outputs, one for each emotional state, and 10 hidden layers. A 10-fold cross-validation technique was used to avoid overfitting while estimating the classifiers' accuracies. As a user's emotion can be affected by many factors, such as their emotional state during the experiment, the best accuracy achieved by the network was 55.58%.
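For the artifact-removal step, a zero-phase Butterworth band-pass keeping roughly the 4–45 Hz range (thereby rejecting sub-4 Hz EOG drift and power-line noise above 50 Hz) is one plausible realization; the cutoffs and filter order here are our assumptions, not the study's stated values:

```python
from scipy.signal import butter, filtfilt

def bandpass(eeg, fs=128, lo=4.0, hi=45.0, order=4):
    """Zero-phase band-pass filtering of a 1-D EEG trace."""
    nyq = fs / 2
    b, a = butter(order, [lo / nyq, hi / nyq], btype="band")
    return filtfilt(b, a, eeg)   # filtfilt avoids phase distortion
```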

They [5] applied a Support Vector Machine to explore the relationship between neural signals elicited in the prefrontal cortex and taste in music, and explored the effects of music on mental illnesses such as dementia. It was observed that music enabled listeners to regulate negative behaviors and thoughts occurring in the mind. A BCI-based music system can analyze neural activity in real time and provide physiological information to the therapist for understanding the patient's emotions. The methods used to evaluate the data depended on the subjects.

The BCI music system consisted of an EEG capture system and a Bluetooth module for transmitting the EEG signals, so that activity could be analyzed in real time and the music controlled accordingly. Three major categories of music were considered to trigger positive emotions in the brain: the subject's favorite songs, K448 and high-focus audio. The high-focus audio comprised non-vocal instrumentals produced by "The Brain Sync Company", consisting of classic white-noise elements found in nature, such as falling rain or a cricket's cry. The company claims these audio clips help a person reach their peak-performance brain state with relative ease and without distraction. 28 participants with a mean age of 21.75 years attended the experiment.

 A Fast Fourier Transform with 0.5 overlap was used to average power within each frequency band.

 Each power value was normalized by the value of the baseline in the same frequency band across the scalp (N = 3):

 NEEG_{S,F} = EEG_{S,F} / ( (1/N) · Σ_{S=1}^{N} BEEG_{S,F} )

 To investigate the asymmetric response of PFC alpha power, a relation ratio was computed as RR = (RP − LP)/(RP + LP) × 100, where RP = alpha power from the right hemisphere of the PFC (FP2) and LP is from the left (FP1).
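The two computations above reduce to a few lines; this NumPy sketch assumes the baseline band powers are already available:

```python
import numpy as np

def normalize_power(power, baseline_powers):
    """Normalize a band power by the mean of the N baseline powers."""
    return power / np.mean(baseline_powers)     # NEEG = EEG / mean(BEEG)

def relation_ratio(rp, lp):
    """RR from right (FP2) and left (FP1) prefrontal alpha power."""
    return (rp - lp) / (rp + lp) * 100
```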

An SVM with a non-linear kernel function was used to recognize the EEG responses. In a one-sample test, with the median set to 128 within a range of 0 to 255, values ran from highest to lowest for favorite songs, "K448" and high-focus audio, in that order, showing that the SVM recognized emotions with high accuracy. This approach differs markedly from approaches that use musical properties such as tempo and melody as metrics of emotional response.

This study [6] pre-processes EEG signals for emotion recognition, with emotions and their states divided broadly into optimistic and pessimistic; it helps explain emotion-driven events such as rash driving and creativity. The "DEAP" dataset was used: the EEG signals were divided into sub-bands analyzed with "Fisher's Linear Discriminant", and a "Naive Bayes Classifier" then classified emotions as optimistic or pessimistic. 40 different locations on the brain were tracked when recording the EEG signals.

 The filters determine the size of the result of the input X. Defining h^k as the k-th convolution feature map of any depth, the sampled feature is: h^k = f(W^k ∗ X + b^k), where

 W = weight of the filter, b = bias of the filter, ∗ = convolution,

 f(·) = non-linear activation function.

 When the CNN is trained, the cross-entropy function is usually used as the cost function:

 Cost = −(1/n) Σ_x [ y ln ŷ + (1 − y) ln(1 − ŷ) ], where n = number of training samples, x = input sample, ŷ = actual output, y = target output. The smaller the cost function, the closer the classification result is to the target output. The convolution layer's input samples are {X, Y} = {{X_1, Y_1}, {X_2, Y_2}, …, {X_i, Y_i}}, i = 1, 2, …, n.

 X_i = features of the i-th sample, Y_i = label of the i-th sample. Each X has size A × B × C, where A = number of EEG channels, B = length of the down-sampled EEG signal (f = sampling frequency), and C = duration of the EEG signal (t = time of the video); C is the depth of the 2-dimensional feature vector.

 The labels are:

 Y_i = 0 if 0 < label_i < 4.5, 1 if 4.5 ≤ label_i ≤ 9 (2 categories)

 Y_i = 0 if 0 < label_i < 3, 1 if 3 ≤ label_i < 6, 2 if 6 ≤ label_i ≤ 9 (3 categories)

 In the 2-category recognition algorithm, 0 = optimism and 1 = pessimism; in the 3-category algorithm, 0 = optimism, 1 = calm and 2 = pessimism. The convolution layers use the hyperbolic tangent activation: tanh(h^k) = (e^{h^k} − e^{−h^k}) / (e^{h^k} + e^{−h^k})

 The fully connected layers use the following activation function: Softplus(y) = ln(1 + e^y)

When training, stochastic gradient descent is used as the optimization algorithm to update the weights:

w ← w − η · ∂J(w)/∂w

where ŷ(·) is the output of the CNN and J(·) is the loss value, the mean of the individual cost-function values. The program is written in Python and implemented using the Keras library toolkit with Theano.
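The label binning described above is easy to mirror in code; this helper is our illustrative sketch of the two thresholding schemes, not the authors' implementation:

```python
def bin_label(rating, n_classes=2):
    """Map a DEAP self-rating (0-9 scale) to 2 or 3 emotion classes."""
    if n_classes == 2:
        return 0 if rating < 4.5 else 1        # 0 = optimism, 1 = pessimism
    # 3-class scheme: 0 = optimism, 1 = calm, 2 = pessimism
    return 0 if rating < 3 else (1 if rating < 6 else 2)
```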

Regarding neuromarketing techniques, we read up on the recent research linking EEG signals with the prediction of consumer behavior and emotions against self-reported ratings.

The correlation between neural activity and the decision-making process during shopping has been studied [7] extensively to ascertain the bond between brain mapping and decision-making while visualizing a supermarket. Participants were asked to select one of 3 brands at each of 90 stops. The researchers found improvements in brand-wise choice prediction and established significant correlations between right-parietal-cortex activation and the participant's previous experience with the brand.

The researchers [8] explored the neuro-signals of 18 participants while they evaluated products for like/dislike. They also incorporated eye-tracking methodology, recording each participant's choice from a set of 3 images while capturing neuro-signals at the same time. They implemented PCA and FFT to preprocess the EEG data. After processing the mutual information between preference and the various EEG bands, they noticed major activity in the "theta bands" of the frontal, occipital and parietal lobes.

The authors [9] analyzed and predicted 10 participants' preferences regarding consumer products in a visualization scenario. The products were then grouped into pairs and presented to the participants, which produced increased frequencies in the mid-frontal lobe; the theta-band EEG signals correlating with the products were also studied.

An application-oriented solution [10] was implemented for the footwear retail industry, putting forward a pre-market prediction system that uses EEG data to forecast demand. 40 consumers were recorded in-store while viewing and evaluating the products, then asked to label each as bought/not bought, with an additional rating-based questionnaire. They concluded that 60–80% accuracy was achieved in classifying products into the 2 categories.

A suggestive system [11] was created on an EEG signal-analysis platform that coupled pre- and post-purchase ratings in a virtual 3D product display. Here, the subjects' emotions were factored in by analyzing the beta and alpha EEG bands.

The authors created a portable preference-prediction system [12] for automobile brands, trialed on 12 participants as they watched promotional ads. A Laplacian filter and a Butterworth band-pass filter were applied for preprocessing, and 3 spectral features, "Power Spectral Density", "Spectral Energy" and "Spectral Centroid", were extracted from the alpha band. Prediction used "K-Nearest Neighbor" and "Probabilistic Neural Network" classification with 96% accuracy.
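As a hedged sketch of that feature set, the three alpha-band quantities can be computed from a Welch PSD as below; these are standard definitions and may differ in detail from the authors' exact ones:

```python
import numpy as np
from scipy.signal import welch
from sklearn.neighbors import KNeighborsClassifier

def alpha_features(eeg, fs=128, lo=8.0, hi=13.0):
    """[mean PSD, spectral energy, spectral centroid] of the alpha band."""
    freqs, psd = welch(eeg, fs=fs, nperseg=2 * fs)
    band = (freqs >= lo) & (freqs <= hi)
    f, p = freqs[band], psd[band]
    return [np.mean(p),                   # power spectral density (mean)
            np.sum(p),                    # spectral energy
            np.sum(f * p) / np.sum(p)]    # spectral centroid

# A K-Nearest-Neighbor classifier can then be fit on such feature rows:
# knn = KNeighborsClassifier(n_neighbors=3).fit(X_train, y_train)
```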

They predicted consumers' choices from EEG signal analysis [13] while the consumers viewed trailers, finding significant high-frequency gamma and beta activity strongly correlated with individual and average preferences.

Participants were assessed on self-reported arousal and valence while watching particular scenes of a movie [14]. The data were analyzed alongside 5 peripheral physiological signals and related to the movie's content-based features, from which it was inferred that such signals can be used to categorize and rank videos.

Here, 19 participants were shown 2 colors for intervals of 1 s while their EEG oscillations were analyzed [15] to study the neural mechanisms underlying color preference.

They subjected 18 participants to a set of choices and analyzed their neuro-activity and eye-tracking activity to map the brain regions associated with decision-making and the interdependence of those regions for the task [16]. They found high synchronization between the frontal and occipital lobes, with major frequencies in the theta, alpha and beta bands.

They attempt to establish a link between neuro-signals and the learning capacity of a software model [17], assuming the model can train itself on participants with dominant alpha waves.

"Independent Component Analysis (ICA)" was used to separate multivariate signals coming from 120 channels of electro-cortical activity [18], converting those signals into additive subcomponents. Patterns of sensory impulses matching the movement of the body were recorded.

They used the S-Golay filter as a stabilizing and filtering element on the ECG data of 26 volunteers, then applied Approximate Entropy for inter-subject evaluation of the data as part of a retrospective approach [19], adding confidence to the entropy windows given their stable distribution. This filter is used very extensively in signal processing, which led us to adopt it.

The study [20] is an experiment on the ECG signals of 26 participants in which the approximate entropy method is implemented to examine concentration. A smaller approximate-entropy window was taken for intra-patient than for inter-patient comparison, and the S-Golay method was implemented to filter the noisy signals.
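Approximate entropy itself is a short algorithm; the following direct NumPy implementation, with the usual embedding dimension m and tolerance r (the parameter defaults are our assumptions), shows the computation:

```python
import numpy as np

def approximate_entropy(x, m=2, r=None):
    """ApEn(m, r) of a 1-D signal: regularity of length-m patterns."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    if r is None:
        r = 0.2 * np.std(x)   # a common tolerance choice

    def phi(m):
        # All overlapping length-m templates of the signal.
        t = np.array([x[i:i + m] for i in range(n - m + 1)])
        # Chebyshev distance between every pair of templates.
        d = np.max(np.abs(t[:, None, :] - t[None, :, :]), axis=2)
        # Fraction of templates within tolerance r of each template.
        c = np.mean(d <= r, axis=1)
        return np.mean(np.log(c))

    return phi(m) - phi(m + 1)
```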

They innovatively preprocessed the ECG signal using the S-Golay filter technique [21]. Combining quadratic smoothing and differentiation filters, they processed ECG signals sampled at 500 Hz with a seventeen-point window length.
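SciPy's Savitzky-Golay filter exposes exactly these knobs, so the described setting can be sketched as follows (pairing the smoothing and first-derivative outputs is our reading of the description):

```python
from scipy.signal import savgol_filter

FS = 500  # Hz, sampling rate of the ECG as stated above

def sgolay_smooth_and_diff(ecg):
    """Quadratic S-Golay smoothing and differentiation, 17-point window."""
    smoothed = savgol_filter(ecg, window_length=17, polyorder=2)
    derivative = savgol_filter(ecg, window_length=17, polyorder=2,
                               deriv=1, delta=1.0 / FS)
    return smoothed, derivative
```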

A unique two-class motor-imagery Brain-Computer Interface was implemented with a Recurrent Quantum Neural Network model for filtering EEG signals [22].

In paper [23], the S-Golay filter is used to detect artifacts due to eye blinks, which are then eliminated by an adaptive noise-removal method.

