Data Science in Theory and Practice - Maria Cristina Mariani
1.4 Big Data
Big data is a term applied to ways to analyze, systematically extract information from, or otherwise deal with data sets that are too large or complex to be handled by classical data-processing tools. In particular, it refers to data sets whose size or type is beyond the ability of traditional relational databases to capture, manage, and process with low latency. Sources of big data include sensors, the stock market, devices, video/audio, networks, log files, transactional applications, the web, and social media. Much of this data is generated in real time and at a very large scale.
In recent times, the term "big data" (both stored and real-time) tends to refer to the use of user behavior analytics (UBA), predictive analytics, or certain other advanced data analytics methods that extract value from data. UBA solutions look at patterns of human behavior and then apply algorithms and statistical analysis to detect meaningful anomalies in those patterns, that is, anomalies that indicate potential threats. Examples include the detection of hackers, insider threats, targeted attacks, financial fraud, and several others.
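As a rough illustration of the "pattern, then anomaly" idea behind UBA, the sketch below flags observations that lie far from a user's historical baseline using a simple z-score. This is only one elementary statistical technique, not a method the text prescribes; the function name `flag_anomalies`, the threshold of 3 standard deviations, and the login counts are all hypothetical and chosen for illustration.

```python
from statistics import mean, stdev


def flag_anomalies(baseline, observed, threshold=3.0):
    """Return (index, value, z_score) for observed values far from the baseline.

    baseline:  past measurements that describe "normal" behavior
    observed:  new measurements to be checked against that baseline
    threshold: assumed cutoff in standard deviations (illustrative choice)
    """
    mu = mean(baseline)
    sigma = stdev(baseline)  # sample standard deviation
    if sigma == 0:
        raise ValueError("baseline has no variation; z-scores are undefined")

    flagged = []
    for i, value in enumerate(observed):
        z = (value - mu) / sigma
        if abs(z) > threshold:
            flagged.append((i, value, z))
    return flagged


# Hypothetical data: daily login counts for one user account.
baseline_logins = [12, 15, 11, 14, 13, 12, 16, 14, 13, 15]
new_logins = [14, 13, 95, 12]

anomalies = flag_anomalies(baseline_logins, new_logins)
for index, value, z in anomalies:
    print(f"day {index}: {value} logins (z = {z:.1f})")
```

In practice a UBA system would track many behavioral features at once and would likely use more robust methods, but the structure is the same: learn what is typical, then score how far new behavior departs from it.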
Predictive analytics is the process of extracting information from existing data sets in order to determine patterns and predict future outcomes and trends. Predictive analytics does not tell you what will happen in the future; rather, it forecasts what might happen with some degree of certainty. Predictive analytics goes hand in hand with big data: businesses and organizations collect large amounts of real-time customer data, and predictive analytics uses this historical data, combined with customer insight, to forecast future events. Predictive analytics thus helps organizations use big data to move from a historical view to a forward-looking perspective of the customer. In this book, we will discuss several methods for analyzing big data.
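To make "forecasting with some degree of certainty" concrete, here is a minimal sketch that fits a straight-line trend to historical data by ordinary least squares and reports a forecast together with a rough uncertainty band. The helper names `fit_line` and `forecast_with_band`, the band width of two residual standard errors, and the monthly sales figures are assumptions made for this example. The band is a crude approximation, not a formal prediction interval.

```python
from math import sqrt


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


def forecast_with_band(xs, ys, x_new, width=2.0):
    """Return (low, point, high) for the forecast at x_new.

    The band is point +/- width * s, where s is the residual standard
    error of the fit. This is a rough indication of uncertainty only.
    Requires at least three data points.
    """
    slope, intercept = fit_line(xs, ys)
    residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
    # Residual standard error with n - 2 degrees of freedom.
    s = sqrt(sum(r * r for r in residuals) / (len(xs) - 2))
    point = intercept + slope * x_new
    return point - width * s, point, point + width * s


# Hypothetical historical data: monthly sales for six months.
months = [1, 2, 3, 4, 5, 6]
sales = [102, 108, 115, 119, 127, 131]

low, point, high = forecast_with_band(months, sales, x_new=7)
print(f"month 7 forecast: {point:.1f} (rough band {low:.1f} to {high:.1f})")
```

The point estimate alone would suggest a certainty the data does not support; reporting a band alongside it reflects the idea that a forecast describes what might happen rather than what will happen.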