Researchers at Princeton and Carnegie Mellon universities are working to harness the explosion of Web data by developing a model that enables computers to group information on the Internet by topic.
The model, developed by Princeton's David Blei and Carnegie Mellon's John Lafferty, lets users decide how narrow each topic will be. The computer creates a virtual bin for each topic and then analyzes the chosen documents. The model determines which words appear together more often than chance would predict, and sorts those connected words into topic-specific bins.
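The idea of words occurring together "more often than chance" can be illustrated with a small Python sketch. This is not the researchers' model, which the article does not specify in detail; it is a simple co-occurrence ratio (often called lift) used here only as an assumption-laden illustration. The function name `cooccurrence_lift` and the sample documents are hypothetical.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_lift(docs):
    """Return {(word_a, word_b): lift} for word pairs that share a document.

    lift = P(a and b in the same document) / (P(a) * P(b)).
    A value above 1 means the pair co-occurs more often than it would
    if the two words were independent. Illustrative only; this is not
    the topic model described in the article.
    """
    n = len(docs)
    word_sets = [set(doc.lower().split()) for doc in docs]
    word_df = Counter()  # number of documents containing each word
    pair_df = Counter()  # number of documents containing each word pair
    for words in word_sets:
        word_df.update(words)
        pair_df.update(combinations(sorted(words), 2))
    return {
        (a, b): (count / n) / ((word_df[a] / n) * (word_df[b] / n))
        for (a, b), count in pair_df.items()
    }


# Hypothetical toy corpus: two "genetics" documents, two "astronomy" documents.
docs = [
    "study gene dna sequence",
    "study gene dna protein",
    "study galaxy star telescope",
    "study galaxy star orbit",
]
lift = cooccurrence_lift(docs)
print(lift[("dna", "gene")])    # 2.0: co-occur twice as often as chance
print(lift[("gene", "study")])  # 1.0: "study" is everywhere, so no association
```

Pairs with a lift well above 1 are candidates for the same bin, while a word that appears in every document carries no topical signal. A real topic model goes further by inferring the bins probabilistically rather than scoring pairs one at a time.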
The researchers say the model offers a way to handle information overload and improves search by making documents easier to tag. Blei also adapted the model to track how topics evolve by examining how the patterns in each topic bin change from year to year, which he says could lead to insights into the scientific method.