
Communications of the ACM

ACM News

Yahoo! Labs Competition Yields Better Ranking Model



When Yahoo! Labs started its three-month-long Learning to Rank competition in March 2010, it knew it would attract interest—if not for the lure of cash prizes totaling $60,000, then for the unprecedented size of the data sets it was releasing. Though learning to rank has been a hot topic in machine learning for years, academic researchers hadn't had much real data to work with, says the challenge's organizer, Yahoo! Labs senior research scientist Olivier Chapelle. Even so, the response was overwhelming, with 1,055 teams participating, 75 percent of them from academic institutions.

Like other machine-learning competitions, such as the Netflix Prize, the LTR Challenge gave contestants a set of training data—in this case, information about the relevance of particular Web pages for a given query. With this data, the contestants' ranking models had to sort, in order of predicted relevance, a new set of pages with respect to a new query. By submitting multiple entries and seeing near-instant Leaderboard feedback on their performance against a validation set, contestants could tune their ranking models until they were ready to submit their official entry.
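The basic task can be illustrated with a minimal pointwise sketch: train a scorer on (feature vector, relevance grade) pairs, then sort a new query's documents by predicted score. The linear model and toy data below are purely illustrative assumptions—the actual challenge systems were far more sophisticated.

```python
def train_linear_scorer(rows, lr=0.01, epochs=200):
    """Least-squares fit of relevance grades via stochastic gradient descent."""
    n_features = len(rows[0][0])
    w = [0.0] * n_features
    for _ in range(epochs):
        for x, y in rows:
            pred = sum(wi * xi for wi, xi in zip(w, x))
            err = pred - y
            w = [wi - lr * err * xi for wi, xi in zip(w, x)]
    return w

def rank(w, docs):
    """Sort a query's documents by predicted relevance, best first."""
    score = lambda x: sum(wi * xi for wi, xi in zip(w, x))
    return sorted(docs, key=score, reverse=True)

# Toy training data: feature vectors with graded relevance labels (0-4).
train = [([1.0, 0.2], 4), ([0.3, 0.9], 1), ([0.6, 0.5], 2), ([0.1, 0.1], 0)]
w = train_linear_scorer(train)

# Documents for a new query, to be sorted by predicted relevance.
ranked = rank(w, [[0.2, 0.8], [0.9, 0.3], [0.5, 0.5]])
```

In the competition this train-predict-sort loop was repeated against the hidden validation set, with the Leaderboard reporting how well the induced ordering matched the true relevance grades.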

But to get around copyright and privacy constraints, Yahoo! couldn't release the queries and page URLs themselves. Instead, the organizers released anonymized feature vectors along with their relevance scores. "You don't know what those numbers mean, but the machine learning algorithm has to figure out the predictive value of the numbers," explains Chapelle.
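Data of this kind is commonly distributed in an SVMlight-style row format—a relevance grade, a query identifier, and numbered feature values, with no query text or URL. The exact layout of the challenge release is an assumption here; the parser below is only a sketch of the convention.

```python
# Parse one anonymized learning-to-rank row in the common SVMlight-style
# convention (assumed format, for illustration): "<grade> qid:<id> <f>:<v> ..."

def parse_row(line):
    parts = line.split()
    relevance = int(parts[0])                      # graded relevance label
    qid = parts[1].split(":")[1]                   # opaque query identifier
    features = {int(k): float(v)                   # numbered, unnamed features
                for k, v in (p.split(":") for p in parts[2:])}
    return relevance, qid, features

rel, qid, feats = parse_row("3 qid:1017 1:0.72 5:0.03 38:0.41")
```

The point Chapelle makes is visible in the parsed result: the learner sees only feature indices and numbers, never what the features measure.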

Because the contestants had less information than the Yahoo! researchers, it's hard to compare the performance of the winning entries with Yahoo!'s performance. Tellingly, however, the teams that performed well had several techniques in common both with each other and with the algorithms used by Yahoo!—including such existing learning-to-rank methods as bagging, boosting, ensembles of decision trees, and random forests. "The fact that 200 other teams think that the approach we use in practice is the right one—this is a valuable insight," says Yahoo! Fellow Andrei Broder, Chief Scientist for Search and Advertising.
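Bagging, one of the ensemble techniques named above, is simple to sketch: train each base model on a bootstrap resample of the data and average their predictions. Decision stumps stand in here for the full regression trees real ranking systems use; everything else is an illustrative assumption.

```python
import random

def fit_stump(rows):
    """Pick the (feature, threshold) split minimizing squared error;
    predict the mean relevance of each side."""
    best = None
    for f in range(len(rows[0][0])):
        for x, _ in rows:
            t = x[f]
            left = [y for xs, y in rows if xs[f] <= t]
            right = [y for xs, y in rows if xs[f] > t]
            if not left or not right:
                continue
            ml, mr = sum(left) / len(left), sum(right) / len(right)
            err = (sum((y - ml) ** 2 for y in left)
                   + sum((y - mr) ** 2 for y in right))
            if best is None or err < best[0]:
                best = (err, f, t, ml, mr)
    if best is None:  # degenerate bootstrap sample: fall back to the mean
        m = sum(y for _, y in rows) / len(rows)
        return lambda x: m
    _, f, t, ml, mr = best
    return lambda x: ml if x[f] <= t else mr

def bagged_ensemble(rows, n_models=25, seed=0):
    """Bagging: fit each stump on a bootstrap resample, average predictions."""
    rng = random.Random(seed)
    models = [fit_stump([rng.choice(rows) for _ in rows])
              for _ in range(n_models)]
    return lambda x: sum(m(x) for m in models) / len(models)

data = [([0.1], 0.0), ([0.4], 0.0), ([0.6], 1.0), ([0.9], 1.0)]
predict = bagged_ensemble(data)
```

Averaging over resamples reduces the variance of the individual trees, which is one reason these ensemble methods recurred across the top entries.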

Though Broder sees that as a sign that Yahoo! is at the forefront of learning to rank, it's worth noting that nobody from Google, as far as the organizers could tell, entered the challenge. Rather, the top finisher in the main competition was a team from Microsoft Research. (The top prize in a second track, which focused on transferring knowledge from a U.S. data set to make inferences about data in a different market, went to the Russian search engine company Yandex.)

Microsoft's winning entry narrowly edged out the nearest competitor's in part because it was optimized for Expected Reciprocal Rank (ERR), the particular performance measure used in the LTR Challenge. "We trained a bunch of models to directly learn ERR," says Microsoft team leader Chris Burges, "but our claim is that our approach would also work well if they had a different measure."
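ERR itself has a compact definition, from Chapelle and colleagues' cascade model: a user scans results top-down and stops at position r with a probability determined by that result's relevance grade, having been left unsatisfied by everything above it. A direct sketch of the metric, assuming the usual 0-4 grading scale:

```python
def err(grades, max_grade=4):
    """Expected Reciprocal Rank of a ranked list of relevance grades.

    grades: relevance labels (0..max_grade) in ranked order, best-guess first.
    """
    p_continue = 1.0  # probability the user reaches this position
    score = 0.0
    for r, g in enumerate(grades, start=1):
        r_stop = (2 ** g - 1) / 2 ** max_grade  # satisfaction probability
        score += p_continue * r_stop / r        # stopped here: reward 1/r
        p_continue *= 1 - r_stop                # unsatisfied: keep scanning
    return score
```

Because the 1/r factor decays quickly, ERR rewards placing a highly relevant result at the very top—which is why a model trained directly on ERR, as Microsoft's was, gains an edge on this particular measure.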

Though Yahoo! recently announced that it is outsourcing its Web search to Microsoft, the lessons from the LTR Challenge are not moot for Yahoo! Labs. As Microsoft's Burges puts it, "Ranking can appear in all sorts of guises." For example, Burges and his colleagues have used machine learning to teach appropriate responses to user statements in a chat session. And at Yahoo!, learning-to-rank algorithms help solve such seemingly varied problems as displaying news stories that will be interesting to users, identifying and preventing spam, and matching ads to Web-page content. "Machine learning is fantastically important for everything that happens on the Web," says Broder.

