Artificial Intelligence and Quantum Computing for Advanced Wireless Networks - Savo G. Glisic
Design Example 2.1
Suppose we observe a street guitar player who plays different types of music, say, jazz, rock, or country. Passersby leave a tip in a box in front of him depending on whether or not they like what he is playing. The player chooses which songs to play independently of the tips he receives. Below we have a training dataset of songs and the corresponding target variable “tip” (which indicates whether a given song earned a tip). Now, we need to classify whether the player will get a tip or not based on the song he is playing. Let us follow the steps involved in this task.
1 Convert the dataset into a frequency table

Data Table
song   jazz  rock  country  jazz  jazz  rock  country  country  jazz
tip    no    yes   yes      yes   yes   yes   no       no       yes

song   country  jazz  rock  rock  country
tip    yes      no    yes   yes   no

Frequency Table
song      no   yes
rock      0    4
country   3    2
jazz      2    3
sum       5    9
2 Create a Likelihood table by finding the probabilities, for example, rock probability = 4/14 = 0.29 and probability of getting a tip is 9/14 = 0.64.

Likelihood Table
song      no            yes           P(song)
rock      0             4             4/14 = 0.29
country   3             2             5/14 = 0.36
jazz      2             3             5/14 = 0.36
sum       5             9
P(tip)    5/14 = 0.36   9/14 = 0.64
3 Now, use the Naive Bayesian equation to calculate the posterior probability for each class. The class with the highest posterior probability is the outcome of prediction.
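The three steps above can be sketched in base R. This is a minimal illustration rather than the book's own code: the two vectors reproduce the data table from step 1, and the helper names `posterior` and `predict_tip` are hypothetical.

```r
# Training data from the Data Table in step 1 (14 observations).
song <- c("jazz", "rock", "country", "jazz", "jazz", "rock", "country",
          "country", "jazz", "country", "jazz", "rock", "rock", "country")
tip  <- c("no", "yes", "yes", "yes", "yes", "yes", "no",
          "no", "yes", "yes", "no", "yes", "yes", "no")

# Step 1: frequency table (rows = song, columns = tip).
freq <- table(song, tip)

# Step 2: the quantities of the likelihood table.
prior_tip  <- colSums(freq) / sum(freq)    # P(tip):  no = 5/14, yes = 9/14
prior_song <- rowSums(freq) / sum(freq)    # P(song): e.g. rock = 4/14
lik        <- prop.table(freq, margin = 2) # P(song | tip), column-wise proportions

# Step 3: Bayes rule, P(tip | song) = P(song | tip) * P(tip) / P(song).
# 'posterior' is a hypothetical helper name used only in this sketch.
posterior <- function(s, t) {
  unname(lik[s, t] * prior_tip[t] / prior_song[s])
}

# Predicted class = the tip value with the highest posterior for that song.
predict_tip <- function(s) {
  post <- sapply(colnames(freq), function(t) posterior(s, t))
  names(which.max(post))
}

print(posterior("jazz", "yes"))
print(posterior("rock", "yes"))
print(predict_tip("country"))
```

With this data, `posterior("jazz", "yes")` evaluates to 3/5 = 0.6 and `posterior("rock", "yes")` to 1, because every rock song in the training set earned a tip. With a single feature the posterior is simply the row proportion of the frequency table, but writing out the Bayes rule keeps the sketch aligned with the steps in the text.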
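In our case, the claim is that the player will get a tip if he plays jazz. Is this statement correct? We can check it using the above method of posterior probability:

\[ P(\text{yes} \mid \text{jazz}) = \frac{P(\text{jazz} \mid \text{yes})\,P(\text{yes})}{P(\text{jazz})} = \frac{(3/9)(9/14)}{5/14} = \frac{3}{5} = 0.60, \]

which is a relatively high probability. On the other hand, if he plays rock we have

\[ P(\text{yes} \mid \text{rock}) = \frac{P(\text{rock} \mid \text{yes})\,P(\text{yes})}{P(\text{rock})} = \frac{(4/9)(9/14)}{4/14} = 1. \]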
R Code for Naive Bayes
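    library(e1071)  # provides the naiveBayes() classifier

    # Choose the training and test CSV files interactively.
    # stringsAsFactors = TRUE keeps text columns as factors (not the default since R 4.0),
    # which levels() and naiveBayes() need for a categorical target.
    Train <- read.csv(file.choose(), stringsAsFactors = TRUE)
    Test  <- read.csv(file.choose(), stringsAsFactors = TRUE)

    # Make sure the target variable has exactly two classes
    levels(Train$Item_Fat_Content)

    # Fit the model with Item_Fat_Content as the target and all other columns as predictors
    model <- naiveBayes(Item_Fat_Content ~ ., data = Train)
    class(model)

    # Predict classes for the test set and tabulate the predicted labels
    pred <- predict(model, Test)
    table(pred)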
Nearest neighbor algorithms: These are among the “simplest” supervised ML algorithms and have been well studied in the field of pattern recognition over the last century. They might not be as popular as they once were, but they are still widely used in practice, and we recommend that the reader at least consider the k‐nearest neighbor algorithm in classification projects as a predictive performance benchmark when trying to develop more sophisticated models. In this section, we will primarily talk about two different algorithms, the nearest neighbor (NN) algorithm and the k‐nearest neighbor (kNN) algorithm. NN is just a special case of kNN, where k = 1. To avoid making this text unnecessarily convoluted, we will only use the abbreviation NN if we talk about concepts that do not apply to kNN in general. Otherwise, we will use kNN to refer to NN algorithms in general, regardless of the value of k.
kNN is an algorithm for supervised learning that simply stores the labeled training examples during the training phase and postpones all processing of them until the prediction phase. In other words, training consists literally of just storing the training data.
Then, to make a prediction (class label or continuous target), the kNN algorithm finds the k nearest neighbors of a query point and computes the class label (classification) or continuous target (regression) from these k nearest (most “similar”) points. The overall idea is that instead of approximating the target function f(x) = y globally, kNN approximates it locally during each prediction. In practice, it is easier to learn to approximate a function locally than globally.
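The store-then-search behavior described above can be sketched in base R. This is a minimal illustration under stated assumptions, not a reference implementation: the function names `knn_fit` and `knn_predict` are hypothetical, the toy data is made up, Euclidean distance is assumed as the similarity measure, and the class label is chosen by majority vote.

```r
# "Training" only stores the labeled examples; nothing is fitted.
knn_fit <- function(X, y) {
  list(X = as.matrix(X), y = y)
}

# Prediction does all the work: find the k nearest stored points and vote.
knn_predict <- function(model, query, k = 3) {
  # Euclidean distance from the query point to every stored training point.
  diffs <- sweep(model$X, 2, query, "-")
  dists <- sqrt(rowSums(diffs^2))
  # Indices of the k closest training points.
  nearest <- order(dists)[seq_len(k)]
  # Majority vote among the k neighbors (ties go to the first label in the table).
  votes <- table(model$y[nearest])
  names(votes)[which.max(votes)]
}

# Toy data, made up for illustration: two well-separated clusters.
X <- rbind(c(1.0, 1.0), c(1.2, 0.8), c(0.9, 1.1),
           c(4.0, 4.2), c(4.1, 3.9), c(3.8, 4.0))
y <- c("a", "a", "a", "b", "b", "b")

model <- knn_fit(X, y)
pred_low  <- knn_predict(model, c(1.1, 1.0), k = 3)  # query near the first cluster
pred_high <- knn_predict(model, c(3.9, 4.1), k = 1)  # k = 1 is the plain NN rule
print(c(pred_low, pred_high))
```

For regression, the same neighbor search would apply, with the vote replaced by, for example, the mean of the neighbors' target values.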