Computational Statistics in Data Science

2 Machine Learning: An Overview

2.1 Introduction

Machine learning is a field that focuses on the design and analysis of algorithms that can learn from data [3]. The field originated in artificial intelligence research in the late 1950s and developed independently of statistics. By the early 1990s, however, machine learning researchers had realized that many statistical methods could be applied to the problems they were trying to solve. Modern machine learning is an interdisciplinary field that encompasses theory and methodology from both statistics and computer science.

Machine learning methods are grouped into two main categories, based on what they aim to achieve. The first category is known as supervised learning. In supervised learning, each observation in a dataset comes with a label. The label, similar to a response variable, may represent a particular class the observation belongs to (categorical response) or an output value (real‐valued response). In either case, the ultimate goal is to make inferences about possibly unlabeled observations outside the given dataset. Prediction and classification are both problems that fall into the supervised learning category. The second category is known as unsupervised learning. In unsupervised learning, the data come without labels, and the goal is to find a pattern within the data at hand. Unsupervised learning encompasses the problems of clustering, density estimation, and dimension reduction.
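The distinction between the two categories can be illustrated with a minimal sketch. It is not taken from the text: the data are made up, and the two methods (a one-nearest-neighbor classifier for the supervised case and a one-dimensional two-means clustering for the unsupervised case) are chosen only because they are short. The function names `nearest_neighbor_predict` and `two_means` are hypothetical.

```python
# Supervised learning: every training observation has a label, and the
# goal is to label an observation outside the training set.
def nearest_neighbor_predict(train_x, train_y, x):
    """Return the label of the training observation closest to x."""
    i = min(range(len(train_x)), key=lambda j: abs(train_x[j] - x))
    return train_y[i]


# Unsupervised learning: no labels are available, and the goal is to find
# structure in the data (here, two cluster centers).
def two_means(xs, iters=10):
    """1-D k-means with k = 2, initialized at the minimum and maximum.

    Assumes xs contains at least two distinct values.
    """
    c0, c1 = min(xs), max(xs)
    for _ in range(iters):
        # Assign each point to its nearer center, then recompute the centers.
        g0 = [x for x in xs if abs(x - c0) <= abs(x - c1)]
        g1 = [x for x in xs if abs(x - c0) > abs(x - c1)]
        c0, c1 = sum(g0) / len(g0), sum(g1) / len(g1)
    return c0, c1


# Made-up data, for illustration only.
heights = [150.0, 152.0, 155.0, 180.0, 183.0, 185.0]
labels = ["short", "short", "short", "tall", "tall", "tall"]

predicted = nearest_neighbor_predict(heights, labels, 178.0)  # uses labels
centers = two_means(heights)                                  # ignores labels
```

The supervised function needs `labels` in order to answer anything, whereas the clustering function sees only `heights` and still recovers two groups. That is the sense in which unsupervised learning looks for a pattern within the data at hand.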

