What is a NoSQL database?

A NoSQL database (or non-relational database) stores data in a non-tabular manner, using a storage model geared toward the type of data being stored, such as key-value, document, time-series, or graph.

From SQL to NoSQL

Structured Query Language (SQL), the standard language for relational database management systems, is known for its reliability. Since IBM developed it in the 1970s, this stalwart has let computers process large and complex data sets faster and more effectively. But the need for faster and more adaptive databases has grown, which is why NoSQL systems were developed more than a decade ago. Forrester Research calls NoSQL databases “critical for all businesses to support modern business applications” and notes that half of global data and analytics technology decision-makers either have implemented or are implementing NoSQL platforms.

What is a NoSQL database?

NoSQL databases, also known as non-relational databases, can store and manage data both quickly and flexibly. Google, Amazon, Yahoo, and Facebook have all had a hand in developing these databases as they sought to store content and process data for their websites. Traditional SQL databases are difficult to scale horizontally across thousands of servers, whereas NoSQL databases are designed to do so.

Amazon says that NoSQL databases “are a great fit for many modern applications such as mobile, web and gaming that requires flexible, scalable, high-performance, and highly functional databases to provide great user experience.”

Choosing the right database system

Still, NoSQL isn’t perfect for all situations. For one thing, SQL is seen as having greater data consistency than NoSQL (only one or two NoSQL systems offer strong consistency). For another, some applications, particularly financial ones, demand the kind of safeguards and consistency that are the hallmarks of SQL. Other applications can also benefit from strong consistency in NoSQL: it simplifies programming, because developers no longer have to account for inconsistent data scenarios. Given that only a minority of NoSQL systems offer the best of both worlds, it’s often up to organizations to determine the right database for the right application.

“The availability of choice in NoSQL databases, is both good and bad at the same time,” says Pramod Sadalage, director of ThoughtWorks Inc. “Good because now we have choice to design the system according to the requirements. Bad because now you have a choice and we have to make a good choice based on requirements and there is a chance where the same database product may be used properly or not used properly.”

Database options

As more companies embrace data and all it has to offer, they may be faced with the dilemma of how to choose the right software solution that not only fits their needs, but also supports their business strategy. One of the first steps to making the right choice is understanding the options. Among them:

Distributed database management systems (DDBMS)

This is a type of database management system (DBMS) that manages a number of databases that are hosted at various locations and tied together through a computer network. It supplies mechanisms so that users see a single database despite the distribution.

Relational Database Management Systems (RDBMS)

This is a database program and the basis for SQL. A relational database stores data in a structured way using rows and columns, which makes it easy to locate and access specific values within the database. The “relational” aspect refers to the fact that the values within each table are related to one another, and that tables can be related to other tables. This enables users to run queries across several tables at one time. A drawback is the need for table “joins,” which can slow analytics considerably when a query has to combine many pieces of data across tables. According to the American National Standards Institute (ANSI), the standard language for relational database management systems is SQL.
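To make the idea of a join concrete, here is a minimal sketch using Python's built-in `sqlite3` module. The `customers` and `orders` tables and their contents are hypothetical, chosen only for illustration; they do not come from any particular product.

```python
import sqlite3

# Hypothetical two-table schema, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(id),
        total REAL NOT NULL
    );
    INSERT INTO customers VALUES (1, 'Ada'), (2, 'Grace');
    INSERT INTO orders VALUES (10, 1, 25.0), (11, 1, 40.0), (12, 2, 15.0);
""")

# The join combines rows from both tables on the shared customer id,
# then sums the order totals per customer.
rows = conn.execute("""
    SELECT c.name, SUM(o.total)
    FROM customers AS c
    JOIN orders AS o ON o.customer_id = c.id
    GROUP BY c.id
    ORDER BY c.name
""").fetchall()

print(rows)  # [('Ada', 65.0), ('Grace', 15.0)]
conn.close()
```

With two small tables the join is effectively free; the cost mentioned above shows up when many large tables have to be combined in a single query.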

Non-relational database

Just as its name implies, a non-relational database is one that doesn’t use the “relational” rows and columns. Instead, it employs a storage model that is set up for the specific requirements of the type of data that needs to be stored. In this way, it is more specific in the type of data it supports and how the data can be queried. Non-relational databases historically haven’t been considered a good fit for transactional data, which is used for things like purchases, payments or subscriptions. While in practice NoSQL means “non-relational database,” many of these databases do support queries that are compatible with SQL.

Embracing NoSQL

When organizations are considering NoSQL, it’s often because they fear failure without it. But organizations need to do their homework to ensure they make the right decision. Many organizations will use both SQL and NoSQL databases because there are instances where organizations need the fixed schema, vertical scalability, and predictability of SQL. Then, there are other situations that call for NoSQL with its flexibility and horizontal scalability. The key is defining the requirements needed for a database and then choosing the one that will provide the best support for the project. Here are some common questions about NoSQL:

The simple answer is that SQL and NoSQL databases store data differently. Specifically, SQL uses a “schema,” which defines how data must be structured before it is put into the database. A SQL schema is more rigid than the more flexible NoSQL schema.

With NoSQL’s more free-form style, any data can be stored in any record. This gives an organization faster access to data, an important consideration if you need speed and simple accessibility more than reliability or consistency. Another advantage: if you don’t want to be tied to a specific schema because you may want to make changes later, NoSQL allows you to do so, even with large amounts of data. NoSQL also requires less management, with automatic repair, easier data distribution and simpler data models.
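The contrast between a fixed schema and free-form records can be sketched in a few lines of Python. This is only an illustration: the `users` table is hypothetical, and a plain dictionary stands in for a schema-less store rather than any real NoSQL client.

```python
import sqlite3

# Fixed schema: the table defines exactly which columns a row may have.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute("INSERT INTO users (id, name) VALUES (1, 'Ada')")

# A column the schema does not define is rejected until the schema is altered.
try:
    conn.execute(
        "INSERT INTO users (id, name, nickname) VALUES (2, 'Grace', 'Amazing')"
    )
    rejected = False
except sqlite3.OperationalError:
    rejected = True

# Schema-less stand-in (an assumption for illustration): each record is a
# free-form dict, so two records can carry different fields.
documents = {}
documents["user:1"] = {"name": "Ada"}
documents["user:2"] = {"name": "Grace", "nickname": "Amazing", "tags": ["navy"]}

print(rejected)                      # True
print(sorted(documents["user:2"]))   # ['name', 'nickname', 'tags']
conn.close()
```

The flexibility cuts both ways: nothing stops a record from omitting a field the application expects, so validation moves from the database into application code.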

Not every application will succeed under such flexibility. SQL provides more safeguards and greater consistency than NoSQL databases. At the same time, NoSQL is still the new kid on the block and there are potential problems that go along with being untested in many situations.

There are four basic types of NoSQL database: key-value store, document-based store, column-based store and graph-based store (though you can build a graph on top of key-value). A key-value store is designed for storing, retrieving and managing associative arrays, a data structure commonly implemented as a hash table. Key-value databases are a good fit for storing session information, user profiles, preferences and shopping cart data. A document-based store holds documents composed of tagged elements, and may be the most useful for content management systems, blogging platforms and web analytics. With a column store, data is stored in cells grouped in columns rather than in rows; it is also considered a good fit for content management, as well as for data with expiring usage. A graph-based store has a flexible graph representation. It is often ideal where scalability is a concern and where the problem space involves connected data, such as social networks.
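As a rough illustration of how the four models shape data differently, the sketch below represents each one with plain Python structures. All keys, names and values are hypothetical, and real databases add persistence, indexing and distribution on top of these shapes.

```python
# Key-value: an opaque value looked up by a single key (e.g. session data).
kv_store = {"session:42": b"opaque-session-blob"}

# Document: a nested, self-describing record with tagged fields.
doc_store = {
    "user:1": {"name": "Ada", "address": {"city": "London"}, "tags": ["math"]},
}

# Column-oriented: values are grouped by column rather than by row,
# so reading one column does not touch the others.
column_store = {
    "name": {1: "Ada", 2: "Grace"},
    "city": {1: "London", 2: "New York"},
}

# Graph: nodes plus the connections between them (adjacency sets here).
graph_store = {
    "Ada": {"Grace"},
    "Grace": {"Ada", "Alan"},
    "Alan": {"Grace"},
}


def friends_of_friends(graph, person):
    """Return people two hops away who are not already direct friends."""
    direct = graph.get(person, set())
    result = set()
    for friend in direct:
        result |= graph.get(friend, set())
    return result - direct - {person}


cities = list(column_store["city"].values())
print(cities)                                  # ['London', 'New York']
print(friends_of_friends(graph_store, "Ada"))  # {'Alan'}
```

The connected-data query at the end is the kind of traversal that graph stores are built for, and that would take repeated joins in a relational design.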

The two most common consistency models are ACID (atomicity, consistency, isolation, durability) and BASE (basically available, soft state, eventual consistency). With ACID, once data is written, you can read it back and get a consistent view of it. With BASE, once data is written, it will eventually become visible to readers, which is why it is seen as having looser consistency. The majority of NoSQL databases don’t provide ACID guarantees, which may not be a big deal when it means waiting a few moments to see a Facebook post. But if an organization is dealing with billion-dollar financial transactions, inconsistent reads may open the door to fraud. (Note that only a couple of NoSQL systems offer some form of strong consistency.) The consistency model needs to be chosen on a case-by-case basis.
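The “eventually” in eventual consistency can be illustrated with a toy simulation. The class below is a hypothetical sketch, not how any particular database replicates data: writes land on a primary copy immediately and reach a replica only when `replicate()` runs.

```python
class EventuallyConsistentStore:
    """Toy model: one primary copy, one replica that lags behind it."""

    def __init__(self):
        self.primary = {}
        self.replica = {}
        self.pending = []  # writes accepted but not yet applied to the replica

    def write(self, key, value):
        # The write is acknowledged as soon as the primary has it.
        self.primary[key] = value
        self.pending.append((key, value))

    def read_replica(self, key):
        # A read served by the replica may return stale or missing data.
        return self.replica.get(key)

    def replicate(self):
        # Apply queued writes in order; afterwards the copies agree.
        for key, value in self.pending:
            self.replica[key] = value
        self.pending.clear()


store = EventuallyConsistentStore()
store.write("post:1", "hello")
before = store.read_replica("post:1")  # replica has not caught up yet
store.replicate()
after = store.read_replica("post:1")
print(before, after)  # None hello
```

The window between `write` and `replicate` is where an application has to tolerate stale reads. A strongly consistent system closes that window, typically at some cost in latency or availability.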

NoSQL databases use distributed clusters of hardware to scale out, and are considered to be cheaper and more scalable than SQL databases. The flexible, schema-less model can store, process and access various types of business data, while providing greater control over data storage and processing. Some enterprises are using relational databases along with NoSQL, while others are scrapping their relational databases in certain scenarios because they get better performance and scale at a lower cost with NoSQL.
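One common way to spread data across a cluster is to hash each key and use the hash to pick a server. The sketch below shows the idea under simplifying assumptions: the node names are hypothetical, and plain modulo placement is used only because it is the shortest thing to read.

```python
import hashlib


def node_for(key, nodes):
    """Pick a node for a key by hashing the key (simple modulo placement)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    # Use the first 8 bytes of the digest as an unsigned integer.
    bucket = int.from_bytes(digest[:8], "big")
    return nodes[bucket % len(nodes)]


nodes = ["node-a", "node-b", "node-c"]  # hypothetical cluster members
placement = {
    key: node_for(key, nodes)
    for key in ["user:1", "user:2", "user:3", "user:4"]
}
print(placement)
```

The same key always maps to the same node, so any client can find a record without a central lookup. With plain modulo placement, adding or removing a node moves most keys, which is why production systems generally use consistent hashing or a fixed partition map instead.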

The bottom line is that NoSQL databases are generally inexpensive, and many are open source. NoSQL typically uses inexpensive servers to manage fast-growing amounts of data and transactions. This can be an important consideration if an organization can’t afford costly RDBMS databases that use big servers and storage systems. Since there are options in how NoSQL systems store data, they can use fewer or more servers, but typically to the detriment of some other performance characteristic.

There are a few storage mechanisms, but the most common is in-memory (DRAM). In-memory storage is fast, but since DRAM is expensive, it can be costly to scale out. In addition, multiple copies of data need to be stored for redundancy, as in-memory data isn’t persistent.

The complementary storage mechanism is on-disk, which needs fewer servers, but performance degrades significantly with typical NoSQL systems. Aerospike combines the best of both worlds with its Hybrid Memory Architecture™, which stores only indexes in DRAM and persists data on disk while delivering the same performance levels as in-memory. Aerospike also has All Flash, which preserves 95% of the performance but keeps all the data on disk, condensing the server footprint. Lastly, Intel has its Optane DC Persistent Memory, which approaches the performance of DRAM with the persistence of disk, and also allows much higher data densities.
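The general idea of keeping only an index in memory while the records themselves live on disk can be sketched as follows. This is a deliberately simplified illustration and not Aerospike's actual implementation: the `IndexedLog` class, its methods and the file name are all hypothetical.

```python
import os
import tempfile


class IndexedLog:
    """Toy store: values are appended to a file, the index stays in memory."""

    def __init__(self, path):
        self.index = {}  # key -> (offset, length); small, held in memory
        self.file = open(path, "w+b")  # record data; held on disk

    def put(self, key, value):
        # Append the value at the end of the file and remember where it is.
        self.file.seek(0, os.SEEK_END)
        offset = self.file.tell()
        self.file.write(value)
        self.file.flush()
        self.index[key] = (offset, len(value))

    def get(self, key):
        # One in-memory lookup, then one positioned read from disk.
        offset, length = self.index[key]
        self.file.seek(offset)
        return self.file.read(length)

    def close(self):
        self.file.close()


with tempfile.TemporaryDirectory() as tmp:
    store = IndexedLog(os.path.join(tmp, "data.bin"))
    store.put("a", b"hello")
    store.put("b", b"world!")
    first = store.get("a")
    second = store.get("b")
    store.close()

print(first, second)  # b'hello' b'world!'
```

Memory use here grows with the number of keys rather than with the size of the values, which is the trade-off such designs aim for. This sketch loses its index on restart; a real system would need to rebuild or persist it.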

If your organization wants to provide a personalized experience, then it’s going to require a lot of data, from demographic to behavioral, and the flexibility of NoSQL is just the ticket. Other top uses include profile management, real-time big data, content management, catalogs, customer 360-degree view, mobile applications, the internet of things (IoT), digital communications and fraud detection (which also depends on behavioral analysis). Organizations that need low latency, high scalability and speed for large amounts of data find that NoSQL gives them the responsiveness they need.

There are a number of innovative features for NoSQL that are being developed, according to Forrester. Among them: greater automation that helps speed up NoSQL deployments and support more complex applications with little effort. In addition, open-source NoSQL solutions “are stable and ready for primetime,” Forrester reports.