Official Google Cloud Certified Professional Data Engineer Study Guide - Dan Sullivan - Page 43

Unstructured Data

The distinguishing characteristic of unstructured data is that it does not have a defined schema or data model. Structured data, such as a relational database table, has a fixed data model that is defined before data is added to the table. Semi-structured data carries its schema with each row or document in the database. Examples of unstructured data include the following:

 Text files of natural language content

 Audio files

 Video files

 Binary large objects (BLOBs)
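
The contrast among the three categories can be illustrated with a minimal Python sketch. It assumes an in-memory SQLite table as a stand-in for structured storage, JSON strings as semi-structured documents, and a raw byte string as unstructured content. The table name, field names, and sample values are hypothetical and chosen only for illustration.

```python
import json
import sqlite3

# Structured: the schema is declared before any row is inserted.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute("INSERT INTO customers (id, name) VALUES (?, ?)", (1, "Alice"))
# PRAGMA table_info returns one row per column; index 1 is the column name.
structured_columns = [row[1] for row in conn.execute("PRAGMA table_info(customers)")]

# Semi-structured: each document carries its own field names, and
# documents in the same collection may have different fields.
documents = [
    json.dumps({"id": 1, "name": "Alice"}),
    json.dumps({"id": 2, "name": "Bob", "email": "bob@example.com"}),
]
document_fields = [sorted(json.loads(doc).keys()) for doc in documents]

# Unstructured: the storage layer sees only a sequence of bytes.
unstructured = "Customers reported that the update was slow.".encode("utf-8")

print(structured_columns)
print(document_fields)
print(type(unstructured).__name__, len(unstructured))
```

In this sketch the table's columns are known without reading any row, the document fields are known only by reading each document, and the byte string exposes no fields at all.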

Data is considered unstructured if it does not have a schema that influences how the data is stored or accessed. Unstructured data may still have an internal structure that is irrelevant to the way it is stored. For example, natural language is highly structured according to the syntax rules of a language, and audio and video files may have an internal format that includes metadata as well as content. In both cases there is structure within the file, but storage systems do not use it, which is why this kind of data is classified as unstructured.
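
The point about internal structure can be sketched in Python. The example below assumes a plain dictionary as a stand-in for an object store; `put_object`, `get_object`, and the object name are hypothetical helpers, not the API of any real storage service. The store handles a WAV file as opaque bytes, and only application code that chooses to parse those bytes sees the audio metadata.

```python
import io
import wave


def make_wav_bytes(num_frames=800, sample_rate=8000):
    """Build a short, silent, mono WAV file in memory and return its bytes."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav_file:
        wav_file.setnchannels(1)        # mono
        wav_file.setsampwidth(2)        # 16-bit samples
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(b"\x00\x00" * num_frames)
    return buffer.getvalue()


# Hypothetical object store: maps an object name to opaque bytes.
object_store = {}


def put_object(name, data):
    """Store the bytes as-is; the store never inspects the content."""
    object_store[name] = data


def get_object(name):
    """Return the stored bytes unchanged."""
    return object_store[name]


put_object("recordings/sample.wav", make_wav_bytes())

# The store only knows the name and the raw bytes.
blob = get_object("recordings/sample.wav")
print(len(blob), blob[:4])

# The application, not the store, interprets the file's internal format.
with wave.open(io.BytesIO(blob), "rb") as wav_file:
    channels = wav_file.getnchannels()
    sample_rate = wav_file.getframerate()
    frames = wav_file.getnframes()

print(channels, sample_rate, frames)
```

The storage functions would work identically for a text file or a video file, which is what makes the data unstructured from the storage system's point of view.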
