Machine Learning Approach for Cloud Data Analytics in IoT - Group of authors - Page 85

3.3 Predictive Data Analytics in Retail

Every retailer aims to devise attractive and efficient business strategies to win the largest possible share of customers. For the past few years, retailers have relied on historical data to frame these strategies [18]. However, focusing on historical transaction data alone fails to give promising results in a rapidly evolving, competitive business world awash in data [19]. Predictive data analytics addresses this shortcoming: it is an efficient approach that uses big data to predict the activities, behavior, and future trends of an enterprise. The exponential rise in data and cut-throat competition make it all the more necessary. Predictive data analytics also helps retailers obtain a thorough understanding of customers, budget, and stock. As a result, it has gained wide acceptance and attracted the attention of many researchers and academicians. Simple regression-type methods, however, are not suited to this multidimensional environment, so predictive data analytics cannot achieve the desired results with them alone. Hence, it employs ML to enhance its capability [20]. The following are the most prevalent models for predictive data analytics [14]:

Classification Model

Clustering Model

Outliers Model

Time Series Model

Readers may refer to [14] for an explanation of these models. All of these models use common predictive algorithms, which can be broadly categorized into two groups, viz., ML and deep learning. ML works primarily on tabular data, whose relationships may be linear or nonlinear. Deep learning is itself a subset of ML, but it is better suited to audio, text, and image data. ML-based predictive modeling uses various algorithms; some common ones are discussed briefly below [21].

Random Forest: It is one of the most popular classification and regression algorithms in ML and is capable of handling huge volumes of data. Random forest implements bagging, where each decision tree is trained on a random subset of the training data. Because the trees are independent of one another, they can be trained in parallel, and their predictions are combined to obtain a strong learner.
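
The bagging idea can be illustrated with a minimal sketch in plain Python. It is not a full random forest: each "tree" is a single-split stump, and the customer data, the feature meanings (monthly visits, average basket value), and the function names `fit_stump`, `fit_forest`, and `predict` are assumptions made only for this illustration.

```python
import random
from collections import Counter

def fit_stump(rows, labels, feature_ids):
    """Fit a one-split tree: pick the (feature, threshold) with the fewest errors."""
    majority = Counter(labels).most_common(1)[0][0]
    # Fallback when no valid split exists: always predict the majority label.
    best = (len(labels) + 1, feature_ids[0], float("inf"), majority, majority)
    for f in feature_ids:
        for t in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            if not right:
                continue
            l_lab = Counter(left).most_common(1)[0][0]
            r_lab = Counter(right).most_common(1)[0][0]
            errors = sum(y != l_lab for y in left) + sum(y != r_lab for y in right)
            if errors < best[0]:
                best = (errors, f, t, l_lab, r_lab)
    return best[1:]  # (feature, threshold, left_label, right_label)

def fit_forest(rows, labels, n_trees=25, seed=0):
    """Bagging: train each stump on a bootstrap sample and a random feature subset."""
    rng = random.Random(seed)
    n, n_features = len(rows), len(rows[0])
    k = max(1, int(n_features ** 0.5))  # features considered per tree
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        feats = rng.sample(range(n_features), k)
        forest.append(fit_stump([rows[i] for i in idx],
                                [labels[i] for i in idx], feats))
    return forest

def predict(forest, row):
    """Combine the weak learners by majority vote."""
    votes = Counter(l if row[f] <= t else r for f, t, l, r in forest)
    return votes.most_common(1)[0][0]

# Hypothetical customers: (monthly visits, average basket value).
rows = [(1, 10), (2, 15), (2, 12), (3, 18), (8, 60), (9, 55), (10, 70), (12, 65)]
labels = ["casual"] * 4 + ["loyal"] * 4

forest = fit_forest(rows, labels)
print(predict(forest, (1, 9)), predict(forest, (12, 70)))
```

Since no tree depends on another, the loop in `fit_forest` could be distributed across workers, which is what makes parallel training possible.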

Generalized Linear Model (GLM): This model extends the general linear model by relating the response to the predictors through a link function, which allows it to handle responses that are not normally distributed. In practice, it is often used to narrow down the list of relevant variables, and it can be trained quickly. The limitation of this model is that it requires relatively large training data sets.
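
As a small illustration, logistic regression is a GLM with a logit link and a binary response. The sketch below fits it by plain batch gradient descent; the discount-versus-purchase data, the learning rate, and the function names `fit_logistic` and `predict_proba` are assumptions chosen for this example, not part of the source.

```python
import math

def sigmoid(z):
    """Inverse of the logit link: maps a linear score to a probability."""
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(xs, ys, lr=0.1, epochs=2000):
    """Fit weights w and intercept b by batch gradient descent on the mean log-loss."""
    w = [0.0] * len(xs[0])
    b = 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_w = [0.0] * len(w)
        grad_b = 0.0
        for x, y in zip(xs, ys):
            p = sigmoid(b + sum(wi * xi for wi, xi in zip(w, x)))
            err = p - y  # gradient of the log-loss with respect to the linear score
            for j, xj in enumerate(x):
                grad_w[j] += err * xj
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def predict_proba(w, b, x):
    """Predicted probability of the positive class for one observation."""
    return sigmoid(b + sum(wi * xi for wi, xi in zip(w, x)))

# Hypothetical data: discount offered (in tens of percent) and whether the
# customer purchased (1) or not (0).
xs = [(0.0,), (0.5,), (1.0,), (1.5,), (2.0,), (2.5,), (3.0,), (3.5,)]
ys = [0, 0, 0, 1, 0, 1, 1, 1]

w, b = fit_logistic(xs, ys)
print(predict_proba(w, b, (0.0,)), predict_proba(w, b, (3.5,)))
```

A positive fitted weight here would indicate that larger discounts raise the estimated purchase probability.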

Gradient Boosted Model (GBM): It generates a model that uses an ensemble of decision trees for classification. In this approach, each tree corrects the errors of the previously trained trees. Because it builds one tree at a time, training takes longer, but the model generalizes better. Hence, it is used in ML-based ranking at Yahoo, among others.
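
The sequential error correction can be sketched as follows. For simplicity, the sketch uses regression with squared error instead of classification, and single-split stumps instead of full trees; the weekly sales figures, the learning rate, and the names `fit_stump`, `fit_gbm`, `predict`, and `mse` are illustrative assumptions.

```python
def fit_stump(xs, residuals):
    """Best single-split regressor on one feature, chosen by squared error."""
    best = None
    for t in sorted(set(xs))[:-1]:  # skip the maximum so the right side is non-empty
        left = [r for x, r in zip(xs, residuals) if x <= t]
        right = [r for x, r in zip(xs, residuals) if x > t]
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        sse = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
        if best is None or sse < best[0]:
            best = (sse, t, lm, rm)
    return best[1:]  # (threshold, left_value, right_value)

def fit_gbm(xs, ys, n_trees=50, learning_rate=0.3):
    """Build stumps one at a time, each fitted to the current residuals."""
    base = sum(ys) / len(ys)  # start from the mean prediction
    preds = [base] * len(ys)
    trees = []
    for _ in range(n_trees):
        residuals = [y - p for y, p in zip(ys, preds)]  # errors made so far
        t, lm, rm = fit_stump(xs, residuals)
        trees.append((t, lm, rm))
        preds = [p + learning_rate * (lm if x <= t else rm)
                 for x, p in zip(xs, preds)]
    return base, learning_rate, trees

def predict(model, x):
    """Sum the base prediction and the shrunken contribution of every stump."""
    base, learning_rate, trees = model
    return base + learning_rate * sum(lm if x <= t else rm for t, lm, rm in trees)

def mse(model, xs, ys):
    """Mean squared error of the model on the given data."""
    return sum((y - predict(model, x)) ** 2 for x, y in zip(xs, ys)) / len(ys)

# Hypothetical weekly sales (in thousands) for weeks 1 to 8.
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [12, 15, 14, 20, 26, 25, 31, 34]

model = fit_gbm(xs, ys)
print(mse(fit_gbm(xs, ys, n_trees=1), xs, ys), mse(model, xs, ys))
```

Because each stump is fitted to what the earlier stumps got wrong, the training error shrinks as trees are added; the loop cannot be parallelized across trees, which is why training takes longer than bagging.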

K-Means: It is a popular and fast algorithm that groups data points into clusters so that all points in the same cluster are highly similar. The aim of this clustering is to maximize intragroup similarity and minimize intergroup similarity.
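
A minimal sketch of the standard iterative procedure (assign each point to its nearest centroid, then recompute the centroids) is given below. The customer points, the choice of two groups, and the function name `kmeans` are assumptions made for illustration.

```python
import math
import random

def kmeans(points, k, n_iter=100, seed=0):
    """Alternate assignment and centroid update until the centroids stop moving."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k randomly chosen points
    labels = []
    for _ in range(n_iter):
        # Assignment step: each point joins its nearest centroid.
        labels = [min(range(k), key=lambda c: math.dist(p, centroids[c]))
                  for p in points]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                new_centroids.append(
                    tuple(sum(v) / len(members) for v in zip(*members)))
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster in place
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids, labels

# Hypothetical customers: (monthly visits, monthly spend in hundreds).
points = [(1.0, 1.0), (1.2, 1.3), (1.5, 1.6), (0.8, 0.7),
          (8.0, 8.2), (8.5, 8.6), (7.8, 7.9), (9.0, 9.1)]

centroids, labels = kmeans(points, k=2)
print(centroids, labels)
```

Note that the number of groups `k` must be chosen in advance, and that the result can depend on the initial centroids when the groups are less clearly separated than in this toy data.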

Owing to the abovementioned algorithms, ML has been widely accepted and recognized as an efficient choice for handling huge volumes of data in the retail industry. It enables sophisticated algorithms for understanding customers and thus provides a customer-oriented shopping experience. The subsequent subsection discusses the use of ML for predictive data analytics in the retail industry.
