Communications of the ACM

Last byte

Q&A: Finding Themes

ACM-Infosys Foundation Award recipient David Blei

Credit: Denise Applewhite / Princeton University

In announcing David Blei as the latest recipient of the ACM-Infosys Foundation Award in the Computing Sciences, ACM president Vint Cerf said Blei's contributions provided a basic framework "for an entire generation of researchers." Blei's seminal 2003 paper on latent Dirichlet allocation (LDA), co-authored with Andrew Ng and Michael Jordan while Blei was still a graduate student at the University of California, Berkeley, presented a way to uncover document topics within large bodies of data. It has become one of the most influential papers in all of computer science, with more Google Scholar citations than, for example, the earlier PageRank paper that launched Google. Blei's approach and its extensions have found wide-ranging uses in fields as varied as e-commerce, legal pretrial discovery, literary studies, and history. At Princeton since 2006, Blei begins this fall as a professor at Columbia University, with joint appointments in the two departments his work bridges, computer science and statistics.

How did the idea for LDA come about?

The idea behind topic modeling is that we can take a big collection of documents and learn that there are topics inside that collection, like sports or health or business, and that some documents exhibit a pattern of words around that topic and others don't. That idea really began in the late '80s and early '90s with latent semantic analysis (LSA), from people like Susan Dumais at Microsoft Research. Then Thomas Hofmann developed probabilistic latent semantic analysis (PLSA), taking the ideas of LSA and embedding them in a probability model.

