
New Ranking Algorithm Separates Digital Wheat from Chaff

Ching Man Au Yeung and Michael G. Noll
Documents cited by recognized experts are given extra weight by the SPEAR algorithm developed by Ching Man Au Yeung of the University of Southampton and Michael G. Noll of Germany's Hasso Plattner Institute.

As social bookmarking sites like delicious.com grow in popularity, it becomes harder to separate knowledgeable contributors and good information from the rest, especially since large audiences tend to attract spammers. A new algorithm presented at SIGIR 2009 tackles this problem by identifying expert contributors and the content they recommend.

The algorithm, called SPEAR (SPamming-resistant Expertise Analysis and Ranking), is the brainchild of two PhD candidates, Michael G. Noll of Germany’s Hasso Plattner Institute and Ching Man Au Yeung of the University of Southampton.

SPEAR rests on two design ideas. The first is mutual reinforcement: users who recommend good documents are labeled experts in a particular topic, and documents recommended by many experts are, in turn, labeled good. Google’s PageRank and Jon Kleinberg’s HITS algorithm do the same with Web pages, but both of these iterative algorithms have a major drawback: they’re vulnerable to Sybil attacks and other spamming tactics. In one scheme, spammers boost their expertise scores by recommending pages that are already popular, and then parlay their cheaply gained expert status to promote spam.
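The mutual reinforcement idea, and the weakness it carries, can be illustrated with a short sketch. This is a plain HITS-style iteration written for illustration, not the authors' code; the function name `rank_by_mutual_reinforcement`, the user names, and the page names are all hypothetical.

```python
import math


def _normalize(scores):
    """Scale scores to unit Euclidean length so they stay bounded."""
    norm = math.sqrt(sum(v * v for v in scores.values()))
    return {k: (v / norm if norm else 0.0) for k, v in scores.items()}


def rank_by_mutual_reinforcement(bookmarks, iterations=50):
    """Score users and documents from (user, document) bookmark pairs.

    Illustrative assumption: every bookmark counts equally, whenever it
    was made. This is the timing-blind baseline, not SPEAR itself.
    """
    pairs = set(bookmarks)
    users = sorted({user for user, _ in pairs})
    docs = sorted({doc for _, doc in pairs})
    expertise = dict.fromkeys(users, 1.0)
    quality = dict.fromkeys(docs, 1.0)
    for _ in range(iterations):
        # A document is good if experts bookmarked it.
        quality = dict.fromkeys(docs, 0.0)
        for user, doc in pairs:
            quality[doc] += expertise[user]
        # A user is an expert if they bookmarked good documents.
        updated = dict.fromkeys(users, 0.0)
        for user, doc in pairs:
            updated[user] += quality[doc]
        expertise = _normalize(updated)
        quality = _normalize(quality)
    return expertise, quality


# Hypothetical data: the spammer copies two popular pages, then adds spam.
bookmarks = [
    ("alice", "popular-1"), ("alice", "popular-2"),
    ("bob", "popular-1"), ("bob", "popular-2"),
    ("carol", "popular-1"), ("carol", "popular-2"),
    ("spammer", "popular-1"), ("spammer", "popular-2"),
    ("spammer", "spam-page"),
]
expertise, quality = rank_by_mutual_reinforcement(bookmarks)
print(sorted(expertise, key=expertise.get, reverse=True))
```

Because the iteration ignores when each bookmark was made, the hypothetical spammer who simply copies popular pages scores at least as well as the users who found them, which is the loophole described above.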

SPEAR, which is based on HITS, overcomes this weakness through its second design idea: the use of time stamps to distinguish between "discoverers" (the people who were among the first to bookmark a page) and mere "followers." Not all followers are spammers free-riding on the experts’ coattails; many followers mean well and find good sites on their own. But if they’re late to the party, SPEAR assumes they have less expertise in the topic and therefore gives them less credit. Time of discovery isn’t just a reasonable measure of expertise; it’s also a signal no one can game. As Noll puts it, "Spammers have lots of things they can twiddle with to trick you, but this time information on delicious is something they cannot fake."
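One way to picture the discoverer-versus-follower distinction is to weight each bookmark by how early it was made and feed those weights into a HITS-style iteration. The sketch below does that under stated assumptions: the article does not give SPEAR's actual credit formula, so the square-root weighting, the function names `discoverer_credit` and `rank_with_time`, and the sample data are illustrative choices, not the authors' implementation.

```python
import math
from collections import defaultdict


def _normalize(scores):
    """Scale scores to unit Euclidean length so they stay bounded."""
    norm = math.sqrt(sum(v * v for v in scores.values()))
    return {k: (v / norm if norm else 0.0) for k, v in scores.items()}


def discoverer_credit(bookmarks):
    """Map (user, doc) to a credit that is larger for earlier bookmarks.

    Assumes each user bookmarks a document at most once and that the
    timestamps come from the site itself, so users cannot forge them.
    """
    by_doc = defaultdict(list)
    for user, doc, timestamp in bookmarks:
        by_doc[doc].append((timestamp, user))
    credit = {}
    for doc, entries in by_doc.items():
        for timestamp, user in entries:
            # Count bookmarks made at or after this one (includes itself).
            at_or_after = sum(1 for other, _ in entries if other >= timestamp)
            # Assumed dampening: square root, so credit grows slowly.
            credit[(user, doc)] = math.sqrt(at_or_after)
    return credit


def rank_with_time(bookmarks, iterations=50):
    """HITS-style iteration in which each bookmark carries its time credit."""
    credit = discoverer_credit(bookmarks)
    users = sorted({user for user, _ in credit})
    docs = sorted({doc for _, doc in credit})
    expertise = dict.fromkeys(users, 1.0)
    quality = dict.fromkeys(docs, 1.0)
    for _ in range(iterations):
        quality = dict.fromkeys(docs, 0.0)
        for (user, doc), weight in credit.items():
            quality[doc] += weight * expertise[user]
        updated = dict.fromkeys(users, 0.0)
        for (user, doc), weight in credit.items():
            updated[user] += weight * quality[doc]
        expertise = _normalize(updated)
        quality = _normalize(quality)
    return expertise, quality


# Hypothetical data: (user, document, timestamp); the spammer arrives last.
bookmarks = [
    ("alice", "page-1", 1), ("alice", "page-2", 1),
    ("bob", "page-1", 2), ("bob", "page-2", 2),
    ("spammer", "page-1", 3), ("spammer", "page-2", 3),
    ("spammer", "spam-page", 4),
]
expertise, quality = rank_with_time(bookmarks)
print(sorted(expertise, key=expertise.get, reverse=True))
```

In this toy data the earliest bookmarker, alice, outranks both the later follower and the spammer, even though the spammer bookmarked more pages. The square root is only one possible dampening choice; any function that rewards earlier bookmarks would illustrate the same point.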

Although Noll and Au Yeung tested their algorithm on delicious.com data, SPEAR holds far broader promise. "It can be used in lots of different scenarios," says Vik Singh, a Yahoo! Search architect who’s excited about SPEAR’s potential uses for delicious and beyond. Social networking sites, for instance, might be able to suggest users with specialized knowledge, and shopping sites could use it to recommend products rated highly by experts in a given category. "To find a way to transform social signals into existing and elegant models like HITS and PageRank is a really great contribution," Singh says.

 
