Historically, Online Transaction Processing (OLTP) was performed by customers submitting traditional transactions (order something, withdraw money, cash a check, etc.) to a relational DBMS. Large enterprises might have dozens to hundreds of these systems. Invariably, enterprises wanted to consolidate the information in these OLTP systems for business analysis, cross selling, or some other purpose. Hence, Extract-Transform-and-Load (ETL) products were used to convert OLTP data to a common format and load it into a data warehouse. Data warehouse activity rarely shared machine resources with OLTP because of lock contention in the DBMS and because business intelligence (BI) queries were so resource-heavy that they got in the way of timely responses to transactions.
This combination of a collection of OLTP systems, connected to ETL, and connected to one or more data warehouses is the gold standard in enterprise computing. I will term it “Old OLTP.” By and large, this activity was supported by the traditional RDBMS vendors. In the past I have affectionately called them “the elephants”; in this posting I refer to them as “Old SQL.”
As noted by most pundits, “the Web changes everything,” and I have noticed a very different collection of OLTP requirements that are emerging for Web properties, which I will term “New OLTP.” These sites seem to be driven by two customer requirements:
The need for far more OLTP throughput. Consider new Web-based applications such as multi-player games, social networking sites, and online gambling networks. The aggregate number of interactions per second is skyrocketing for the successful Web properties in this category. In addition, the explosive growth of smartphones has created a market for applications that use the phone as a geographic sensor and provide location-based services. Again, successful applications are seeing explosive growth in transaction requirements. Hence, the Web and smartphones are driving the volume of interactions with a DBMS through the roof, and New OLTP developers need vastly better DBMS performance and enhanced scalability.
The need for real-time analytics. Intermixed with a tidal wave of updates is the need for a query capability. For example, a Web property wants to know the number of current users playing its game, or a smartphone user wants to know “What is around me?” These are not the typical BI requests to consolidated data, but rather real-time inquiries to current data. Hence, New OLTP requires a real-time query capability.
In my opinion, these two characteristics are shared by quite a number of enterprise non-Web applications. For example, electronic trading firms often trade securities in several locations around the world. The enterprise wants to keep track of the global position (short or long and by how much) for each security. To do so, all trading actions must be recorded, creating a fire hose of updates. Furthermore, there are occasional real-time queries. Some of these are triggered by risk exposure—i.e., alert the CEO if the aggregate risk for or against a particular security exceeds a certain monetary threshold. Others come from humans, e.g., “What is the current position of the firm with respect to security X?”
Hence, we expect New OLTP to be a substantial application area, driven by Web applications as the early adopters but followed by enterprise systems. Let’s look at the deployment options.
1) Traditional OLTP. This architecture is not ideal for New OLTP for two reasons. First, the OLTP workload experienced by New OLTP may exceed the capabilities of Old SQL solutions. In addition, data warehouses are typically stale by tens of minutes to hours. Hence, this technology is incapable of providing real-time analytics.
2) NoSQL. There have been a variety of startups in the past few years that call themselves NoSQL vendors. Most claim extreme scalability and high performance, achieved through relaxing or eliminating transaction support and moving back to a low-level DBMS interface, thereby eliminating SQL.
In my opinion, these vendors have a couple of issues when presented with New OLTP. First, most New OLTP applications want real ACID. Replacing real ACID with either no ACID or “ACID lite” just pushes consistency problems into the applications where they are far harder to solve. Second, the absence of SQL makes queries a lot of work. In summary, NoSQL will translate into “lots of work for the application”—i.e., this will be the full employment act for programmers for the indefinite future.
3) New SQL. Systems are starting to appear that preserve SQL and offer high performance and scalability, while preserving the traditional ACID notion for transactions. To distinguish these solutions from the traditional vendors, we term this class of systems “New SQL.” Such systems should be equally capable of high throughput as the NoSQL solutions, without the need for application-level consistency code. Moreover, they preserve the high-level language query capabilities of SQL. Such systems include Clustrix, NimbusDB, and VoltDB. (Disclosure: I am a founder of VoltDB.)
Hence, New SQL should be considered as an alternative to NoSQL or Old SQL for New OLTP applications. If New OLTP is as big a market as I foresee, I expect we will see many more New SQL engines employing a variety of architectures in the near future.
Disclosure: Michael Stonebraker is associated with four startups that are either producers or consumers of database technology.
VMWare recently announced a high velocity distributed DB with SQL and all the scaling characteristics touted by the NoSQL plays.
Wonder if Mr. stonebraker has a opinion on them.
Displaying comments 11 - 11 of 11 in total