Communications of the ACM

BLOG@CACM

The "NoSQL" Discussion has Nothing to Do With SQL


MIT Adjunct Professor Michael Stonebraker

Recently, there has been a lot of buzz about "No SQL" databases. This blog post considers the performance argument about No SQL databases; a subsequent posting will address the flexibility argument.


Comments


CACM Administrator

It appears that Racine is proposing a return to a hierarchical data model (popularized by IBM's IMS in the 1960s). In such a system, joins are avoided by being "prebuilt" into the definition of the hierarchy. Hence, the line items of a purchase order would be clustered together and stored with their parent purchase order. A hierarchical data model will work well for certain kinds of data (for example, documents); however, history has shown it to be unsuitable as a general-purpose data model. Instead of writing a long-winded response, I will just make two quick points:

Codd's 1970 CACM paper presents a very clear exposition of why hierarchies don't work in general. In fact, Codd's work was mainly motivated by the problems that IMS customers were having "shoe-horning" their data into a hierarchical model.

IMS was extended in the 1970s with so-called logical data bases, which extended IMS to support non-hierarchical data. Hence, the limitations of hierarchies were well understood long ago.
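As a rough illustration of the trade-off described in this comment, the sketch below stores the same purchase orders two ways: nested, with line items clustered inside their parent order, and flat, as two relations linked by an order key. All names and data here (`orders_hier`, `orders`, `line_items`, the part names) are hypothetical and chosen only for illustration; this is not how IMS or any particular DBMS stores data.

```python
from collections import defaultdict

# Hierarchical layout (hypothetical data): each order carries its own
# line items, so the order-to-items "join" is prebuilt into the structure.
orders_hier = {
    "PO-1": {"customer": "acme",
             "items": [{"part": "bolt", "qty": 10}, {"part": "nut", "qty": 10}]},
    "PO-2": {"customer": "zenith",
             "items": [{"part": "bolt", "qty": 5}]},
}

# Relational layout of the same data: two flat relations sharing a key.
orders = [("PO-1", "acme"), ("PO-2", "zenith")]
line_items = [("PO-1", "bolt", 10), ("PO-1", "nut", 10), ("PO-2", "bolt", 5)]


def items_of_order_hier(po):
    """Parent-to-child access: one lookup, no join needed."""
    return orders_hier[po]["items"]


def orders_with_part_hier(part):
    """Child-to-parent access: the hierarchy offers no direct path,
    so every parent order has to be visited."""
    return sorted(
        po for po, order in orders_hier.items()
        if any(item["part"] == part for item in order["items"])
    )


def join_orders_items(order_rows, item_rows):
    """Relational access: rebuild the order/item pairing with a join on the key."""
    customer_of = dict(order_rows)
    return [(po, customer_of[po], part, qty) for po, part, qty in item_rows]


def total_qty_by_part(item_rows):
    """A part-centric question answered directly from the flat relation."""
    totals = defaultdict(int)
    for _po, part, qty in item_rows:
        totals[part] += qty
    return dict(totals)
```

In this sketch the nested form answers "what is in order PO-1?" with a single lookup, while a question that starts from the part rather than the order has to scan every parent. The flat form pays for a join on the first question but treats both directions the same way.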

In summary, there have been about 40 years of DBMS research and experience on data models. Much of this is well recorded in the literature (especially in the 1970s) and is discussed in all major DBMS textbooks. I would refer hierarchical enthusiasts to these sources for a comparison of data models.

—Michael Stonebraker, July 6, 2010


Anonymous

Ahhh, this is a very clarifying article on the alleged differences between SQL and NoSQL.

The way NoSQL is hyped, it's as though it has found a way to operate without the "impedance mismatch" between the constraints imposed by hardware and the need for speed. I always wondered what that way might have been.

Data has to be transmitted, written, stored, and retrieved. Just how much performance can be squeezed out of any approach, given those requirements and the requirement that a solution actually manifest itself in the real world?

Thanks so much for providing this non-specialist with a solid foundation on which to evaluate competing claims.

