Handbook on Intelligent Healthcare Analytics

3.2 Overview of Big Data

3.2.1 Big Data: Definition

Every day, organizations produce enormous quantities of data. These huge volumes of data constitute “big data”. Big data is complex in nature and requires powerful technologies and advanced algorithms to process.

A formal definition of big data was given in [1]: “Big data is the information asset characterized by such a high volume, velocity, and variety to require specific technology and analytical methods for its transformation into value.”

Over the last decade, data volumes have grown constantly and unpredictably owing to digitalization and advances in technology. In general, big data is generated from the following sources [14]:

Social data: Social data refers to data generated on social media platforms such as YouTube, Facebook, and Twitter. It is mainly used in market analysis. For example, analyzing likes and comments on Facebook and tweets on Twitter provides details about consumer behavior.

Machine data: Machine data refers to data generated by machines such as wearable devices, sensors, and satellites, and also includes web logs.

Transactional data: Transactional data is generated as a result of transactions, which can be online or offline. Examples include delivery receipts, orders, and invoices.

Human-generated data: Human-generated data is extracted from emails, electronic medical reports, messages, and similar sources.

Search engine data: Search engine data is generated from browsers.

All of the data described above comes in diverse formats, such as comments, videos, emails, and sensor readings, and most of it is unstructured. Big data refers to data sets of huge size that grow exponentially over time. Examples of big data include the Amazon product list, YouTube videos, Google search engine data, and jet engine data. Storing and processing such data is not practical with conventional databases, which are typically designed for data on the order of gigabytes, whereas big data can amount to several petabytes. Big data solutions address this problem with distributed storage and processing systems.
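The idea of distributed storage and processing can be illustrated with a minimal sketch. It is not taken from the handbook and does not represent any particular big data product. It assumes a toy setting in which a list of text records is split into partitions (stand-ins for storage nodes), each partition is processed independently, and the partial results are merged. The function names `partition`, `count_words`, `merge`, and `distributed_word_count` are hypothetical, and threads on one machine stand in for separate nodes.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from functools import reduce


def partition(records, n_parts):
    """Split records into n_parts chunks; each chunk stands in for one storage node."""
    return [records[i::n_parts] for i in range(n_parts)]


def count_words(chunk):
    """Map step: a 'node' counts words using only its own chunk."""
    counts = Counter()
    for text in chunk:
        counts.update(text.lower().split())
    return counts


def merge(a, b):
    """Reduce step: combine the partial counts from two nodes."""
    return a + b


def distributed_word_count(records, n_parts=3):
    """Process all chunks in parallel, then merge the partial results."""
    chunks = partition(records, n_parts)
    # Threads are an assumption of this sketch; a real system would use separate machines.
    with ThreadPoolExecutor(max_workers=n_parts) as pool:
        partials = list(pool.map(count_words, chunks))
    return reduce(merge, partials, Counter())


if __name__ == "__main__":
    sample = ["big data", "Big storage", "data data"]
    print(distributed_word_count(sample, n_parts=2))
```

Because each chunk is counted independently, the result does not depend on how many partitions are used. That independence is what lets such a computation be spread across many machines when the data no longer fits on one.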
