Researchers at Yahoo! Labs in Barcelona, Spain, have developed a distributed search approach that spreads the search index and additional data across a larger number of smaller data centers rather than relying on a centralized model. Each smaller data center would hold locally relevant information plus a small portion of globally replicated data. Most search queries common to a particular area could be answered from the local data center, while other, more generic queries could be forwarded to other data centers.
The concept of distributed search engines is not new, but until recently such systems were considered too expensive and too slow, and there was concern that searches would return results favoring locally stored information.
To create a viable distributed system, Yahoo!'s Ricardo Baeza-Yates and colleagues designed it so that statistical information on page rankings could be shared between data centers. Each data center can then run an algorithm that compares its own results with those available from other data centers, ensuring the best result is returned to the user. Duke University professor Bruce Maggs calls it a valid approach and notes that it "also saves considerably on everything else in the same proportion, such as capital costs and real estate."
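The idea of answering locally and forwarding only when shared ranking statistics suggest another site could do better can be illustrated with a minimal Python sketch. It is not the researchers' actual algorithm: the `DataCenter` class, the `answer` function, the per-term maximum score used as the shared statistic, and all sample data are hypothetical, and the sketch assumes each document is stored at only one site.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class DataCenter:
    """Hypothetical data center holding a small local index."""
    name: str
    # Local index: term -> {document id: relevance score}.
    index: dict[str, dict[str, float]] = field(default_factory=dict)

    def search(self, term: str, k: int = 3) -> list[tuple[str, float]]:
        """Return the k best (document, score) pairs from the local index."""
        postings = self.index.get(term, {})
        return sorted(postings.items(), key=lambda item: item[1], reverse=True)[:k]

    def max_score(self, term: str) -> float:
        """Assumed shared statistic: the best score this site holds for a term.

        In a real deployment this value would be replicated to the other
        sites ahead of time, so reading it would not need a network call.
        """
        return max(self.index.get(term, {}).values(), default=0.0)


def answer(term: str, local: DataCenter, remotes: list[DataCenter], k: int = 3):
    """Answer locally, forwarding only to sites that could improve the top k."""
    results = local.search(term, k)
    # Score a remote document must beat to enter the local top k.
    threshold = results[-1][1] if len(results) == k else 0.0
    contacted = []
    for remote in remotes:
        if remote.max_score(term) > threshold:
            results.extend(remote.search(term, k))
            contacted.append(remote.name)
    results.sort(key=lambda item: item[1], reverse=True)
    return results[:k], contacted


# Made-up sample data for two sites.
barcelona = DataCenter("barcelona", {
    "tapas": {"bcn-1": 0.9, "bcn-2": 0.8, "bcn-3": 0.7},
    "python": {"bcn-4": 0.2},
})
hong_kong = DataCenter("hong_kong", {
    "tapas": {"hk-1": 0.3},
    "python": {"hk-2": 0.95, "hk-3": 0.6},
})

local_hits, local_contacted = answer("tapas", barcelona, [hong_kong])
generic_hits, generic_contacted = answer("python", barcelona, [hong_kong])
```

In this toy setup the "tapas" query is answered entirely in Barcelona because the shared statistic shows Hong Kong cannot beat the weakest local result, while the "python" query is forwarded and the merged ranking is returned.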
The results of a feasibility study on the researchers' distributed search approach were presented at ACM's recent Conference on Information and Knowledge Management in Hong Kong.
From Technology Review
Abstracts Copyright © 2009 Information Inc., Bethesda, Maryland, USA