Communications of the ACM

ACM TechNews

Computer Scientists at Yale Develop New Hybrid Database System


Yale computer scientists recently demonstrated HadoopDB, their new open source system for managing huge amounts of data, at the VLDB conference in Lyon, France. They used the gathering to present the results of a performance analysis and to give an overview of the system's characteristics, including run-time performance, loading time, fault tolerance, and scalability.

HadoopDB combines parallel database management system (DBMS) technology with the MapReduce software framework to handle petabytes of data. DBMS technology is good at managing structured data, such as tables with trillions of rows, while MapReduce, which Google uses to search data on the Web, allows for greater control in retrieving data. "We get the performance of parallel database systems with the scalability and ease of use of MapReduce," says Yale professor Daniel Abadi.

Some tasks, such as finding patterns in the stock market, earthquakes, consumer behavior, and outbreaks, will now take only hours rather than days. "People have all this data, but they're not using it in the most efficient or useful way," Abadi says.

From Yale University

Abstracts Copyright © 2009 Information Inc., Bethesda, Maryland, USA


 
