BLOG@CACM
Computing Applications

Understanding NoSQL Database Types: Key Value   

Key-value stores work very differently from traditional relational schemas, organizing data into collections of key-value pairs instead of tables. Many detailed posts have already been written on this topic, so this article is intended as a very brief introduction; I've focused on the bedrock aspects needed to get a feel for what this database type entails.

The wider "umbrella" of NoSQL databases continues to power some of the most in-demand services and platforms of today, where content must be delivered with very low latency. NoSQL may also be well suited to the newly emerging decentralized finance system.

Of all non-relational models, the key-value store is among the most popular, largely because of its extreme simplicity. I have covered other schemas and aspects of non-relational databases in earlier posts on CACM.

Overview: Key-Value Databases

Key-value stores (or key-value databases) use a set of unique identifiers to store data. Each identifier has an associated value, and together the two form what we call a "key-value pair." The unique identifier acts as a "key" representing a data item, and the value is either the item's location or the data itself (e.g., key: CITY, value: LONDON).

The form a key can take is flexible and depends on what the chosen software allows, but every key must be unique so that a query retrieves exactly one value. The value itself is unrestricted and could be a list or even another set of key-value pairs. The same design is used in most programming languages as a map, dictionary or associative array; here it is simply applied to persistent storage in a database system.
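As a rough illustration of the pairing described above, the sketch below uses a plain Python dictionary as a stand-in for a key-value store. The key names other than CITY (such as `user:1001`) and the sample values are hypothetical, and a real store would persist the pairs rather than hold them in process memory.

```python
# A plain dict standing in for a key-value store (in-memory, illustrative only).
store = {}

# The simplest case from the text: one key, one scalar value.
store["CITY"] = "LONDON"

# Values are unrestricted: a list, or a nested set of key-value pairs.
store["user:1001:visited"] = ["LONDON", "PARIS"]
store["user:1001"] = {"name": "Ada", "home": {"CITY": "LONDON"}}

# Keys are unique, so writing to an existing key replaces its value.
store["CITY"] = "PARIS"
```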

This data model is popular because it can store information as opaque blobs with no predefined structure, rather than requiring the discrete, typed fields that relational models impose through their preset tables of rows and columns. As a result, there is little need for indexes beyond the key itself, which keeps the structure lightweight and fast. There is also no standard query language; most operations rely on simple put, get and delete commands.
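The put, get and delete commands mentioned above can be sketched as a very small interface. This is a minimal in-memory illustration, not the API of any particular product; the class name `KeyValueStore` and the sample key `session:abc` are assumptions made for the example.

```python
from typing import Any, Dict, Optional


class KeyValueStore:
    """Minimal in-memory store exposing only put, get and delete (hypothetical)."""

    def __init__(self) -> None:
        self._data: Dict[str, Any] = {}

    def put(self, key: str, value: Any) -> None:
        # The value is stored as-is; no schema is declared or checked.
        self._data[key] = value

    def get(self, key: str, default: Optional[Any] = None) -> Any:
        # A single lookup by key; no query language is involved.
        return self._data.get(key, default)

    def delete(self, key: str) -> bool:
        # Returns True if the key existed and was removed, False otherwise.
        if key in self._data:
            del self._data[key]
            return True
        return False


kv = KeyValueStore()
kv.put("session:abc", {"user": "ada", "cart": ["sku-1"]})
session = kv.get("session:abc")
```

Everything else a relational engine would do, such as selecting fields or joining records, is left to the application.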

The obvious tradeoff for this simplicity and scalability is that the data a query returns is unfiltered: the store hands back the whole value for a key and cannot select or filter on its contents. Since the 2020 pandemic, with many families opting for remote work to attain a better work-life balance, there is more need for relational and non-relational systems to grow together. Some use cases become impossible without rigid control of the data, but in others the speed and reliability of key-value stores make up for the drawbacks. Programmers can often work around NoSQL's control and filtering problems ad hoc. DevOps teams should probably adopt the same philosophy for SQL too, choosing elements according to situational need rather than digging black-and-white trench lines between SQL and NoSQL.
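To make the filtering tradeoff concrete, here is a small sketch of two ad hoc workarounds an application might use: scanning every value and filtering in code, or maintaining extra keys that act as a hand-built index. The data, the `user:<id>` and `city:<name>` key formats, and both function names are hypothetical.

```python
# Hypothetical data: user profiles stored as opaque values under "user:<id>" keys.
store = {
    "user:1": {"name": "Ada", "city": "LONDON"},
    "user:2": {"name": "Lin", "city": "PARIS"},
    "user:3": {"name": "Sam", "city": "LONDON"},
}


def find_by_city(store, city):
    """Workaround 1: read every value and filter in application code.

    The store cannot evaluate the condition, so the cost grows with the
    number of keys.
    """
    return sorted(k for k, profile in store.items() if profile.get("city") == city)


def build_city_index(store):
    """Workaround 2: derive "city:<name>" keys that list the matching user keys."""
    index = {}
    for key, profile in store.items():
        index.setdefault("city:" + profile["city"], []).append(key)
    return index


londoners = find_by_city(store, "LONDON")
city_index = build_city_index(store)
```

The second approach turns the filter back into a single key lookup, at the price of keeping the index keys up to date whenever a profile changes.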

In traditional relational designs, data is stored in a tabular structure of rows and columns. Because attributes must be declared upfront, and every row carries each column even when a new attribute is added later, the database can enforce integrity and apply optimizations such as compression and faster data access and aggregation. That said, while brilliant for applications such as paperless personal accounting, this rigidity makes the model less flexible.

Traditional key-value stores, by comparison, naturally have a lot more flexibility and can perform very quick reads and writes, partly because the database only needs to search for the single key then bring up the associated value instead of doing complicated aggregations.
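The contrast between the two models can be sketched with Python's standard library, using `sqlite3` for the relational side and a dictionary as a stand-in for the key-value side. The table, column names and sample rows are made up for illustration.

```python
import sqlite3

# Relational side: columns are declared before any row is stored.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO users (id, name, city) VALUES (?, ?, ?)",
    [(1, "Ada", "LONDON"), (2, "Lin", "PARIS"), (3, "Sam", "LONDON")],
)

# Because every row has the same shape, the engine can aggregate across rows.
per_city = dict(conn.execute("SELECT city, COUNT(*) FROM users GROUP BY city"))
conn.close()

# Key-value side: one lookup by key returns the whole value, whatever its shape.
kv = {
    "user:1": {"name": "Ada", "city": "LONDON"},
    "user:2": {"name": "Lin", "languages": ["fr", "en"]},  # different fields, no schema change
}
profile = kv["user:2"]
```

The aggregation is something the relational engine does for you; with the key-value stand-in, counting users per city would have to be written in application code.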

Why Use Key-Value Stores?

This is such a popular form of NoSQL due to the many perks that become available when your database uses it:

Scalability — As with NoSQL in general, key-value stores offer strong horizontal scalability. Vertical expansion is limited (even relational databases can only scale up so far on a single machine), whereas key-value stores let you build very large databases holding complex data in various states of structure. This is achieved through partitioning and replication, often with relaxed ACID guarantees.

Querying — Querying is much simpler, or not needed at all. Queries are limited to looking up keys, and some stores do not even allow you to list or search keys. As a result, workloads such as shopping carts, user profiles and sessions are cheaper to handle, because each one needs only a single request to read and another to write (owing to the blob-like nature of key-value storage). Concurrency problems are also simpler to manage, since only one key needs to be resolved.

Migration — Another perk of not depending on a structured query language is that data in a key-value design is highly portable. You can often transfer it to another data system without changing application code or introducing new architecture. For instance, migrating to a new operating system would not create the sort of large-scale disruption that you could expect with a relational database.
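The partitioning and replication mentioned under Scalability can be sketched as follows. This is a simplified, single-process illustration under stated assumptions: the names `shard_for` and `PartitionedStore` are hypothetical, each shard is just a dictionary, and placement uses a plain hash-modulo scheme rather than whatever a given product implements.

```python
import hashlib


def shard_for(key: str, num_shards: int) -> int:
    """Map a key to a shard with a stable hash.

    Python's built-in hash() for strings is randomized per process,
    so a digest is used instead.
    """
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


class PartitionedStore:
    """Each key lives on `replicas` consecutive shards (hypothetical layout)."""

    def __init__(self, num_shards: int = 4, replicas: int = 2) -> None:
        if not 1 <= replicas <= num_shards:
            raise ValueError("replicas must be between 1 and num_shards")
        self.num_shards = num_shards
        self.replicas = replicas
        self.shards = [dict() for _ in range(num_shards)]

    def _owners(self, key: str):
        first = shard_for(key, self.num_shards)
        return [(first + i) % self.num_shards for i in range(self.replicas)]

    def put(self, key: str, value) -> None:
        # Write the same pair to every replica that owns the key.
        for idx in self._owners(key):
            self.shards[idx][key] = value

    def get(self, key: str, default=None):
        # Read from the first owner that still holds the key.
        for idx in self._owners(key):
            if key in self.shards[idx]:
                return self.shards[idx][key]
        return default
```

Because every operation touches a single key, the owning shards can be computed from the key alone, which is what makes this kind of horizontal split straightforward. With modulo placement, changing the number of shards remaps most keys; consistent hashing is a common way to reduce that movement.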

Popular Use Cases of Key-Value Stores

The simplicity and horizontal scalability of key-value databases lend themselves to many high-value use cases:

  • Web apps — Today, web applications need to store an immense amount of data on user sessions and preferences. Key-value stores lend themselves to fast reads and writes, with diverse information being rapidly accessible (and introducible into the database) via simple user keys. The recent U.S. antitrust bills may increase the diversity of apps on marketplaces.

  • Personalizations — An extension of the above: key-value stores can hold many distinct pieces of data per user that are combined in real time for recommendations and advertising. These stores can instantly access and load new ads or recommendations during a visitor's journey through a website.

  • In-memory caching — Key-value stores are often used to boost application speed by making retrievals faster. Whereas traditional relational databases are not well suited to intense read/write loads, key-value stores are a natural fit. This makes them a good choice for applications dealing with enormous numbers of simultaneous users in real time. Replication also makes them well equipped to guard against data loss, and they work well as a cache for data that is not regularly updated.
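The in-memory caching use case in the list above can be sketched as a small cache that sits in front of a slower data source and expires entries after a fixed time. The class name `TTLCache`, the `get_or_load` method and the injectable `clock` parameter are assumptions made for this example, not the interface of any particular caching product.

```python
import time
from typing import Any, Callable, Dict, Tuple


class TTLCache:
    """In-memory key-value cache whose entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.ttl = ttl_seconds
        self.clock = clock  # injectable so expiry can be exercised in tests
        self._entries: Dict[str, Tuple[float, Any]] = {}

    def get_or_load(self, key: str, loader: Callable[[str], Any]) -> Any:
        now = self.clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # cache hit: the slow source is not touched
        value = loader(key)  # miss or expired: fall back to the slow source
        self._entries[key] = (now + self.ttl, value)
        return value
```

A caller would pass in whatever function performs the expensive lookup, for example a database query, and repeated reads of the same key within the expiry window are then served from memory.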

Key-value stores are overall strongly suited to managing user accounts, massive user sessions (from MMO games with more surface area than planet Earth to in-demand web apps), eCommerce and product recommendations, customized advertisement deliveries (personalized to each user), and general big data situations where scalability is desirable.

A little-known but major use of key-value stores is for one-time, temporary or seasonal spending trends on platforms. For instance, national holidays, Christmas, Black Friday, and so on. Rather than dedicating an expensive infrastructure that will sit live all year-round, the simplicity and performance of key-value allows providers to buy temporary shards specifically to help with processing during these seasonal rushes. 

Conclusion

Every database balances its trade-offs against its strengths. While SQL may have the edge in security, websites and applications using RDBMSs like MySQL still face serious vulnerabilities. For instance, SQL injections are the second-largest threat vector for web applications and have been on the rise since about 2018.

Where key-value stores really shine is in keeping things simple. On one hand, this can be a serious limitation to our pipeline, for example in the most demanding use cases like financial transactions, which require rugged reliability of records and querying. On the other hand, it manages to shore up several major weaknesses found in relational databases (some architectures are best for disk, while others, like Memcached, work in memory).

By using both systems, we can make our operations maximally efficient — be it for managing users, data analytics or scaling quickly without the complexity and cost.

Alex Williams is a full-stack developer with over 15 years of experience, and the owner of Hosting Data UK.
