Data Science For Dummies, by Lillian Pierson (page 41)
Introducing NoSQL databases
A traditional RDBMS isn’t equipped to handle big data demands. That’s because it’s designed to handle only relational datasets: data that’s stored in clean rows and columns and can therefore be queried via SQL. RDBMSs aren’t designed to handle unstructured and semistructured data. Moreover, they lack the processing and handling capabilities needed to meet big data volume and velocity requirements.
This is where NoSQL comes in. NoSQL databases are nonrelational, distributed database systems that were designed to rise to the challenges involved in storing and processing big data. They can be run on-premises or in a cloud environment. NoSQL databases step out past the traditional relational database architecture and offer a more scalable solution for these workloads. NoSQL systems facilitate non-SQL querying of nonrelational or schema-free, semistructured and unstructured data. In this way, NoSQL databases are able to handle the structured, semistructured, and unstructured data sources that are common in big data systems.
A key-value pair is a pair of data items, represented by a key and a value. The key is a data item that acts as the record identifier and the value is the data that’s identified (and retrieved) by its respective key.
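To make the idea of a key-value pair concrete, here is a minimal sketch that uses a plain Python dictionary as a stand-in for a key-value store. The `put` and `get` helpers and the sample keys are hypothetical names chosen for illustration; they are not the API of any particular NoSQL product.

```python
# A plain dict standing in for a key-value store (illustration only,
# not the API of any real NoSQL database).
store = {}

def put(key, value):
    """Save a value under its key, replacing any earlier value."""
    store[key] = value

def get(key, default=None):
    """Retrieve the value identified by key, or default if the key is absent."""
    return store.get(key, default)

# The key acts as the record identifier; the value is the data it identifies.
put("user:1001", {"name": "Ada", "plan": "pro"})
put("session:9f3c", "user:1001")

print(get("user:1001"))
print(get("missing-key"))
```

The store never inspects the value: it can be a string, a number, or a nested structure, and retrieval always goes through the key.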
NoSQL offers four categories of nonrelational databases: graph databases, document databases, key-value stores, and column family stores. Because each category natively supports its own type of data structure, NoSQL offers efficient storage and retrieval for most types of nonrelational data. This adaptability and efficiency make NoSQL an increasingly popular choice for handling big data and for overcoming the processing challenges that come along with it.
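The four categories can be sketched with ordinary Python data structures. This is only a rough picture of how each model shapes its data, not how any real database stores it, and all of the sample records and the `neighbors` helper are invented for illustration.

```python
# Key-value store: a value looked up by its key.
key_value = {"user:1001": "Ada Lovelace"}

# Document store: each record is a self-describing, possibly nested document.
documents = [
    {"_id": 1001, "name": "Ada Lovelace", "languages": ["English", "French"]},
]

# Column family store: row key -> column family -> column -> value.
column_family = {
    "user:1001": {
        "profile": {"name": "Ada Lovelace"},
        "activity": {"last_login": "2024-01-15"},
    }
}

# Graph database: nodes plus labeled edges that connect them.
nodes = {1001: {"name": "Ada Lovelace"}, 1002: {"name": "Charles Babbage"}}
edges = [(1001, "COLLABORATED_WITH", 1002)]

def neighbors(node_id, label):
    """Return the ids of nodes reached from node_id over edges with this label."""
    return [dst for src, lbl, dst in edges if src == node_id and lbl == label]

print(key_value["user:1001"])
print(documents[0]["languages"])
print(column_family["user:1001"]["activity"]["last_login"])
print([nodes[n]["name"] for n in neighbors(1001, "COLLABORATED_WITH")])
```

Each structure favors a different access pattern: lookup by key, retrieval of whole documents, reads of selected columns within a row, or traversal of relationships.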
NoSQL applications like Apache Cassandra and MongoDB are used for data storage and real-time processing. Apache Cassandra is a popular column family store that is sometimes also classified as a key-value store, and MongoDB is the most popular document-oriented NoSQL database. MongoDB uses dynamic schemas and stores JSON-like documents.
A document-oriented database is a NoSQL database that houses, retrieves, and manages the JSON files and XML files that you heard about back in Chapter 1, in the definition of semistructured data. A document-oriented database is otherwise known as a document store.
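As a rough illustration of how a document store behaves, the sketch below keeps a small collection of JSON documents in memory and filters them by field value. The sample documents and the `find` helper are hypothetical; this is not MongoDB's API or that of any other product, only a picture of the idea using Python's standard `json` module.

```python
import json

# Hypothetical collection: the documents do not have to share the same fields.
raw_documents = [
    '{"_id": 1, "title": "Intro to NoSQL", "tags": ["databases", "big data"]}',
    '{"_id": 2, "title": "Graph basics", "author": {"name": "Sam"}}',
]
collection = [json.loads(doc) for doc in raw_documents]

def find(collection, **criteria):
    """Return documents whose top-level fields equal every given criterion."""
    return [
        doc for doc in collection
        if all(doc.get(field) == value for field, value in criteria.items())
    ]

# Query by field value, with no table schema declared up front.
print(find(collection, title="Graph basics"))

# Documents that lack a field are simply skipped rather than causing an error.
print([doc["_id"] for doc in collection if "tags" in doc])
```

Because each document carries its own structure, a new field can appear in one document without any change to the others.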
Some people argue that the term NoSQL stands for Not Only SQL, and others argue that it represents non-SQL databases. The argument is rather complex and has no cut-and-dried answer. To keep things simple, just think of NoSQL as a class of nonrelational systems that don’t fall within the spectrum of RDBMSs that are queried using SQL.