News
Computing Applications

Software Enables Pcs to Solve Problems That Once Required the Cloud

Posted
Mini Mac
Using GraphChi, a Mini Mac PC can solve certain problems faster than a cloud-based system.

Remarkably, a new software program running on a single laptop or PC can, in some cases, solve bigger problems in less time than other available distributed frameworks.

For example, the program–dubbed GraphChi–can do the same graph computation in 59 minutes on one Mac Mini PC that would require 423 minutes on 1,536 machines "in the cloud" using Apache Hadoop, says Carlos Guestrin, an associate professor in the Machine Learning Department of Carnegie Mellon University (CMU).

"It’s pretty impressive," says Guestrin, who is the coauthor of GraphChi along with his Ph.D. student Aapo Kyrola. "We’re able to solve problems on a personal computer that people have required the cloud to solve before."

Guy Blelloch, a computer science professor at CMU, also collaborated on the project.

GraphChi, an open source software system that enables large-scale graph computations, took a year to build and was recently made available for free download.

It is a spinoff from GraphLab, an earlier CMU project that Guestrin’s group began in 2009.

"Distributed machine learning algorithms were in their infancy at the time," Guestrin recalls, "and programming parallel machines was extremely difficult. So we ended up building a system for us to be able to run our parallel algorithms. When we realized that other people needed a similar technology, we came up with GraphLab."

But Kyrola questioned why the same sort of technology that allowed GraphLab to perform graph computations in the cloud couldn’t be applied to a single PC if the machine’s hard disk was utilized. Typically a hard drive can hold much more information compared to a machine’s temporary memory. The challenge, however, was to overcome the extremely slow speed of the drive’s random accesses.

"The guts of GraphChi is a new technique we call the parallel sliding windows method," says Guestrin, "which exploits the structure of the graph and the type of computations that are possible to do with GraphLab, but in a way that you use the disk with a minimum number of random accesses."

While GraphChi will work on any PC or laptop, the size of the hard drive influences how much data can be run. There is no formula yet for determining how long a computation will take.

Because GraphChi is so new, there are no deployed commercial applications for it yet. But Guestrin reports a considerable number of downloads and expects to see the program finding uses similar to GraphLab’s in, for example, collaborative filtering problems.

"GraphChi would be ideal for the kind of recommender systems that look at user activity and try to make recommendations," he says. "In other words, the way Netflix recommends movies or Amazon.com recommends purchases."

Geoffrey J. Gordon, who is an associate research professor in the same Machine Learning Department as Guestrin, anticipates many commercial uses for GraphChi. They include:

*Mobile phones. Learning the user’s navigation preferences, or learning to assist with common tasks. "Think Siri on steroids," says Gordon.

*First-person vision. For example, a pair of glasses with cameras embedded in them, automatically creating a map of the building the user is in and then helping her or him navigate back to any place they have seen.

*Lightweight robotics. A robot has to lug around any computer it wants to use along with the batteries to power it, and so it benefits from being able to make do with lighter, more energy-efficient hardware.

"We want machine learning everywhere," says Gordon, "and to achieve that, we have to be able to do it without sucking down a server room’s worth of power. GraphChi is perfect for that."
 

Paul Hyman is a science and technology writer based in Great Neck, NY. 

Join the Discussion (0)

Become a Member or Sign In to Post a Comment

The Latest from CACM

Shape the Future of Computing

ACM encourages its members to take a direct hand in shaping the future of the association. There are more ways than ever to get involved.

Get Involved

Communications of the ACM (CACM) is now a fully Open Access publication.

By opening CACM to the world, we hope to increase engagement among the broader computer science community and encourage non-members to discover the rich resources ACM has to offer.

Learn More