Home/Magazine Archive/October 2008 (Vol. 51, No. 10)/Technical Perspective: New Developments in Graph Partitioning/Full Text

Research highlights
# Technical Perspective: New Developments in Graph Partitioning

Arora, Rao, and Vazirani discuss the most important developments in approximation algorithms over the last two decades. Graph partitioning has played a central role in algorithms research, both as an important problem with many applications and as a fertile breeding ground for interesting and deep algorithmic techniques. The paper reviews several related papers, starting with the breakthrough of an improved *O(log n)* approximation algorithm based on semidefinite programming and also including a faster combinatorial version using expander flows.

Graph partitioning is a natural algorithmic primitive in many settings, including data clustering, divide-and-conquer approaches, parallel computing architectures and algorithms, and packet routing in distributed networks. A well-understood graph cut problem is to partition the vertices into two sets such that there are few edges crossing between the two sets. A smallest cut can be found in polynomial time via a number of different methods, including traditional network flows. While minimal cuts have many applications, they are not useful primitives for the divide-and-conquer type applications; such cuts often result in extremely unbalanced partitions. Graph partitioning refers to a broad class of partitioning problems that seek cuts with few crossing edges, where the two sides of the partition are more balanced.

Finding an optimal balanced partition is intractable (NP-hard), so for the last 40 years attention has been focused on finding nearly optimal partitions; this has resulted in the development of deep techniques in approximation algorithms. The first major breakthrough result appeared in a 1988 paper by Leighton and Rao: an efficient algorithm that finds almost-balanced partitions is an *O(log n)* approximation algorithm; that is, where the number of edges that cross the cut is no more than *O(log n)* times more than the optimal, where *n* is the number of vertices of the graph. The authors used a linear programming relaxation of the balanced partitioning problems. Solving the linear program results in *a fractional cut*, where each edge is assigned a fractional extent to which it belongs to the cut. The algorithm provides an interesting technique that "rounds" the fractional values to 0 or 1 and hence turns the fractional cut into a real cut. Since that paper, the linear programming relaxations and rounding paradigm proved a powerful tool in approximation algorithms with a wide range of applications.

The *O*(log *n*) approximation algorithm presented here is a long-awaited improvement in the graph partitioning guarantee. The paper builds on the improved understanding of the Leighton and Rao technique gained over the years. In 1994, Linial, London, and Rabinovich brought to light an intriguing connection between graph partitioning problems and embeddings into high-dimensional metric spaces. A cut can be thought of as a simple embedding into a line: vertices on one side of the cut are mapped to 0, and those on the other side to 1, and any mapping of the points to the one-dimensional metric can be thought of as combinations of cuts (where the two sides of the cut are the nodes mapped to points a and the nodes mapped to points a for some value a). Leighton and Rao's linear program is a relaxation of this one-dimensional metric space to an arbitrary metric. While finding the best cut and the best embedding into the one-dimensional metric are NP-complete, finding the best embedding to an arbitrary metric space can be done in polynomial time. One can derive the *O(logn)* approximation by mapping the metric space obtained by the linear program into the L_{1} metric space so that the distances are approximately preserved, and then decomposing the *L _{1}* metric space as a product of one-dimensional metric spaces. The new paper uses Euclidian metrics and build on a semidefinite programming technique introduced by Goemans and Williamson that allows them to optimally embed the nodes of a graph into a high-dimensional Euclidian metric.

A well-known disadvantage of using general-purpose linear programming solvers and semidefinite programs in approximation algorithms is that they are slow. The authors here review the faster *O(log n)* approximation algorithm of Arora, Hazan, and Kale based on flows and the multiplicative update method. The connection of cut problems to certain multicommodity flow problems pioneered by Shahrokhi and Matula, who note that the edges congested by the flow are the ones good to cut. There have since been great developments in solving this class of linear programs efficiently by using variants of the multiplicative update method. The flow problem naturally related to balanced cut is that of sending one unit of flow between every pair of nodes in the graph. Edges in a small balanced cut will have to be congested as *0(n ^{2}*) flow wants to cross the cut. Unfortunately, the best routing of flow can have

To speed the computation, Leighton and Rao replace this all-pairs flow with a sparse random pairing, only sending flow between *O(n)* randomly selected source-sink pairs. If the graph of the source-sink pairs is an expander, then this smaller flow problem works just as well in the algorithm. The authors achieve their improved approximation by showing an expander exists whose flow can be routed with *O(log n)* less congestion than random expanders.

Arora, Rao, and Vazirani review an impressive collection of results, bringing to bear a wide array of tools on an important problem. Combining tools from high-dimensional metric spaces with ideas from expander flows, spar-sification, and multiplicative update methods results in a simple and fast method that achieves the improved *O(log n)* approximation guarantee.

**©2008 ACM 0001-0782/08/1000 $5.00**

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.

The Digital Library is published by the Association for Computing Machinery. Copyright © 2008 ACM, Inc.

No entries found