Big Data - Seifedine Kadry
1.8.3.1 Data Integration
Data integration involves combining data from different sources to give end users a unified view of the data. Integration faces several challenges. For example, when data is extracted from a person's profile, the order of the first name and family name may be reversed in some cultures, so the two fields can be integrated incorrectly. Redundant data also often arises when data from multiple sources is combined. Figure 1.11 illustrates how diverse sources such as organizations, smartphones, personal computers, satellites, and sensors generate disparate data such as e‐mails, employee details, WhatsApp chat messages, social media posts, online transactions, satellite images, and sensor data. These structured, unstructured, and semi‐structured data have to be integrated and presented as unified data for data cleansing, data modeling, data warehousing, and extract, transform, and load (ETL) processing.
Figure 1.11 Data integration.
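The name-order and redundancy problems described above can be made concrete with a small sketch. It is only an illustration under stated assumptions, not a method prescribed by the text: the two sources, their field names (`name`, `email`, `full_name`, `mail`), the sample records, and the choice of the e-mail address as the matching key are all hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Person:
    """Unified view of a person, independent of the source layout."""
    first_name: str
    family_name: str
    email: str


def from_source_a(record: dict) -> Person:
    # Assumption: hypothetical source A stores names as "first family".
    first, family = record["name"].split(" ", 1)
    return Person(first.strip(), family.strip(), record["email"].strip().lower())


def from_source_b(record: dict) -> Person:
    # Assumption: hypothetical source B stores names as "family first".
    family, first = record["full_name"].split(" ", 1)
    return Person(first.strip(), family.strip(), record["mail"].strip().lower())


def integrate(source_a: list[dict], source_b: list[dict]) -> list[Person]:
    """Map both sources onto one schema and drop redundant records.

    Records are treated as duplicates when their normalized e-mail
    addresses match (an assumption made for this sketch); the first
    occurrence is kept.
    """
    unified: dict[str, Person] = {}
    candidates = [from_source_a(r) for r in source_a]
    candidates += [from_source_b(r) for r in source_b]
    for person in candidates:
        unified.setdefault(person.email, person)
    return list(unified.values())


# Hypothetical sample data: the same person appears in both sources,
# with the name order reversed and the e-mail in a different case.
crm_rows = [{"name": "Mei Tanaka", "email": "mei@example.com"}]
hr_rows = [
    {"full_name": "Tanaka Mei", "mail": "MEI@example.com"},
    {"full_name": "Kim Jisoo", "mail": "jisoo@example.com"},
]

people = integrate(crm_rows, hr_rows)
for p in people:
    print(p.first_name, p.family_name, p.email)
```

If the name order of source B were not handled explicitly, "Tanaka" would be stored as a first name, which is the kind of incorrect integration the text warns about. Real integration pipelines usually need more robust matching than a single key, since e-mail addresses can be missing or shared.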