BLOG@CACM

The Netflix Prize, Computer Science Outreach, and Japanese Mobile Phones

The Communications Web site, http://cacm.acm.org, features more than a dozen bloggers in the BLOG@CACM community. In each issue of Communications, we'll publish excerpts from selected posts.

Greg Linden writes about machine learning and the Netflix Prize, Judy Robertson offers suggestions about getting teenagers interested in computer science, and Michael Conover discusses mobile phone usage and quick response codes in Japan.

From Greg Linden’s "The Biggest Gains Come From Knowing Your Data"

Machine learning is hard. It can be awfully tempting to try to skip the work. Can’t we just download a machine learning package? Do we really need to understand what we are doing?

It is true that off-the-shelf algorithms are a fast way to get going and experiment. Just plug in your data and go.

The trouble comes when development stops there. By understanding the peculiarities of your data and what people want and need on your site, and by experimenting and learning, you can likely outperform a generic system.

A great example of how understanding the peculiarities of your data can help came out of the Netflix Prize. Progress on the $1 million prize largely stalled until Gavin Potter discovered peculiarities in the data, including that people interpret the rating scale differently.
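
To make the rating-scale point concrete, here is a minimal sketch of one common way to account for it: subtract each person's average rating before comparing people. The data and function names are hypothetical, and this is not a description of Potter's actual method, only an illustration of the general idea.

```python
from collections import defaultdict

# Hypothetical (user, movie, stars) triples; not Netflix Prize data.
ratings = [
    ("ann", "m1", 5), ("ann", "m2", 4), ("ann", "m3", 5),
    ("bob", "m1", 3), ("bob", "m2", 1), ("bob", "m3", 2),
]

def user_means(rows):
    """Average star rating given by each user."""
    totals = defaultdict(lambda: [0.0, 0])  # user -> [sum of stars, count]
    for user, _, stars in rows:
        totals[user][0] += stars
        totals[user][1] += 1
    return {user: total / count for user, (total, count) in totals.items()}

def center_by_user(rows):
    """Re-express each rating as an offset from that user's own average."""
    means = user_means(rows)
    return [(user, movie, stars - means[user]) for user, movie, stars in rows]

centered = center_by_user(ratings)
for user, movie, offset in centered:
    print(f"{user} {movie} {offset:+.2f}")
```

In this toy data, ann's 4 stars and bob's 1 star for the same movie both come out below their own averages, so a model working on the offsets sees two people who agree rather than two people three stars apart.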

More recently, Yehuda Koren found additional gains by supplementing the models to allow for temporal effects, such as that people tend to rate older movies higher, that movies rated together in a short time window tend to be more related, and that people over time might start rating all the movies they see higher or lower.

In both cases, looking closely at the data, better understanding how people behave, and then adapting the models yielded substantial gains. Combined with other work, that was enough to win the million-dollar prize.

The Netflix Prize followed a pattern you often see when people try to implement a feature that requires machine learning. Most of the early attempts threw off-the-shelf algorithms at the data, yielding something that works, but not with particularly impressive results.

Without a clear metric for success and a way to test against that metric, development often stops there. But, much as Google and Amazon do with their ubiquitous A/B testing, the Netflix Prize had a clear metric for success and a way to test against that metric.
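
The Netflix Prize scored entries by root mean squared error (RMSE) on ratings the contestants could not see. The sketch below shows how small such a test harness can be; the held-out ratings and the two sets of predictions are made-up numbers chosen only to illustrate the comparison.

```python
import math

def rmse(predicted, actual):
    """Root mean squared error between two equal-length sequences."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("need two non-empty sequences of equal length")
    squared = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared) / len(squared))

# Hypothetical held-out ratings and two candidate models' predictions.
held_out = [4, 3, 5, 2, 1]
model_a = [3.5, 3.5, 3.5, 3.5, 3.5]  # generic: always predicts one constant
model_b = [4.2, 2.9, 4.6, 2.5, 1.8]  # stand-in for a model tuned to the data

score_a = rmse(model_a, held_out)
score_b = rmse(model_b, held_out)
improvement = (score_a - score_b) / score_a  # relative gain of B over A

print(f"A: {score_a:.3f}  B: {score_b:.3f}  improvement: {improvement:.1%}")
```

Once a number like this exists, every new idea can be judged the same way: change the model, rerun the comparison, and keep the change only if the score improves.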

There are a lot of lessons that can be taken from the Netflix contest, but a big one should be the importance of constant experimentation and learning. By competing algorithms against each other, by looking carefully at the data, by thinking about what people want and why they do what they do, and by continuous testing and experimentation, you can reap big gains.

From Judy Robertson’s "Computer Science Outreach: Meeting the Kids Halfway"

I just spent the afternoon working with teenagers at some of our summer school workshops. As luck would have it, we had two different sessions running on the same afternoon, and while galloping between labs, it occurred to me that some interesting things were going on. First, a bit about the workshops: the summer schools were both for 17- and 18-year-olds, both were set up to encourage young people to study computer science, and both involved building virtual worlds. One of the workshops, on making computer games using the Neverwinter Nights 2 toolset, lasted for just two hours, and the other was the final presentation session of an eight-week project on Second Life programming. Both went very well from the point of view of introducing young people to the fun aspects of computer science. Whether they pay off in terms of recruiting people to study our degree courses in CS remains to be seen. But you have to start somewhere, right? Here are some things I noticed that might be useful to others who are interested in schools’ outreach and recruitment.

  • A relaxed atmosphere prevailed. The young people were joking around and enjoying themselves. Importantly, they were laughing with the staff rather than at them. Having some handpicked students who I knew to be friendly and approachable really helped with this.
  • The young people were doing stuff rather than listening to me drone on. The games workshop kids spent most of their time exploring the software with minimal time spent in demos. The Second Life project groups were presenting their projects and giving demos while their classmates assessed their work. They seemed to be taking the assessment task seriously and responsibly. And I’ll tell you what: It really makes them ask sensible questions at the end of each presentation. This is a contrast to the usual setup in class where students sit like turnips when you ask, "Are there any questions?"
  • Both the workshops involved creative tasks where the teenagers chose for themselves what to build. This does have the drawback of revealing my ignorance of the latest pop culture fads, but at least I do know what South Park is. Seriously, though, this is very important. If you want people to take pride in their work, they need to take some ownership of it. For that to happen, they need to have the choice to work on personally meaningful projects and this often means embracing popular culture in a way which we, as grown-up computer scientists, might find baffling or intensely irritating.

Rather than pushing our agenda of what we think is important and berating young people that they ought to find it interesting, we need to meet them halfway. We need to start from their interests, and then help them to see how computer science knowledge can help them achieve something that appeals to them. As in "You’re interested in alcohol and The Simpsons. Ideal. How about you make a 3D Homer Simpson whose arm can move up and down to drink beer?" At that point you can start explaining the necessary programming and math concepts to do the rotation in 3D space. Or even just admire what they have figured out by themselves. Once you have them hooked on programming or signed up on your degree program, you can build on it. I’m not saying we don’t need to teach sober, serious, and worthy aspects of computer science. Of course we do. I’m just saying we don’t need to push it immediately. It’s kind of like when you have a new boyfriend and you know you have to introduce him to your weird family. Do you take him to meet the mad uncle with the scary eyebrows straight off? No, you introduce him to a friendly cousin who will make him feel at home and has something in common with him.

What I’m suggesting is not new; there are pockets of excellent outreach work with kids in various parts of the world. I think it’s time we tried more of it, even though it is time consuming. After all, we know we can recruit hardcore computer scientists to our degree programs with our current tactics (you know, the people who are born with silver Linux kernels in their mouths). But given there aren’t that many of them, it’s well worth the effort to reach out to the normal population. Unleash the inner computer scientist in everyone!

From Michael Conover’s "Advertainment"

Mobile phones are a way of life in Japan, and this aspect of the culture manifests itself in many ways. Among the more remarkable are the ubiquitous quick response (QR) codes that adorn a sizable percentage of billboards, magazines, and other printed media. In brief, these two-dimensional bar codes offer camera phones with the appropriate software an opportunity to connect with Web-based resources relating to the product or service featured in an advertisement. Encoding a maximum of 4,296 alphanumeric characters, or 1,817 kanji, QR codes are a forerunner of ubiquitous computing technology and portend great things to come.
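
Those two capacity figures can be checked with a little arithmetic. The sketch below assumes they refer to the largest QR symbol (version 40) at the lowest error-correction level (L). The constants are taken from the QR code specification (ISO/IEC 18004) rather than from this post, and the function names are my own.

```python
# Assumed constants for a version 40 symbol at error-correction level L,
# taken from the QR code specification (ISO/IEC 18004), not from the post.
DATA_BITS = 2956 * 8          # 2,956 data codewords of 8 bits each
MODE_INDICATOR_BITS = 4       # every segment starts with a 4-bit mode marker

# Per mode: (bits in the character-count field for versions 27-40,
#            characters per group, bits per full group,
#            bits used by a partial final group, keyed by leftover characters)
MODES = {
    "numeric":      (14, 3, 10, {0: 0, 1: 4, 2: 7}),
    "alphanumeric": (13, 2, 11, {0: 0, 1: 6}),
    "byte":         (16, 1, 8,  {0: 0}),
    "kanji":        (12, 1, 13, {0: 0}),
}

def bits_needed(mode, n_chars):
    """Bits one segment of n_chars characters occupies in the given mode."""
    count_bits, group, group_bits, leftover_bits = MODES[mode]
    full, rest = divmod(n_chars, group)
    return (MODE_INDICATOR_BITS + count_bits
            + full * group_bits + leftover_bits[rest])

def capacity(mode):
    """Largest character count whose segment still fits in DATA_BITS."""
    n = 0
    while bits_needed(mode, n + 1) <= DATA_BITS:
        n += 1
    return n

for mode in MODES:
    print(mode, capacity(mode))
```

Under these assumptions the alphanumeric and kanji results match the 4,296 and 1,817 quoted above. The gap between the two comes from packing: two alphanumeric characters share 11 bits, while each kanji takes 13 bits on its own.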

What’s remarkable to me is, for all our similarities, how widely divergent American and Japanese urban cultures can be. The market penetration numbers aren’t that strikingly different; a March 2008 study showed that more than 84% of Americans own a cell phone, while a Wolfram Alpha query shows that 83% of Japanese own one. The differences in practice, however, could not be more pronounced. In terms of mobile phone use, walking the streets of Japan is like being on a college campus all the time. It’s not unreasonable to estimate that every fifth person is interacting in some way with a mobile device, and here’s the rub: Americans make calls on their phones; the Japanese interact.

Ubiquitous Web access and widespread support for the mobile platform, in addition to the vastly increased data-transfer capabilities, mean Japan is a society in which cell phones are a practical mobile computing platform. QR codes have blossomed in this culture not only because they’re immensely useful to both organizations and consumers, but because the cultural soil is ripe for their adoption. QR codes have met with a lukewarm response in the U.S., and I fear they may be yet another mobile technology to which we get hip three to five years behind the curve.

Irrespective of this, the applications of QR codes in Japan are at times astounding. For many high-dollar corporations, such as Louis Vuitton and Coca-Cola, the QR code is the ad (art?) itself. Often the code itself is the content, whether built out of something unexpected or used as a medium for digital activism. Because the format is so robust, creative marketers have a lot of wiggle room when it comes to creating eye-catching, market-driven applications of this technology. As with ubiquitous translation technology, it’s the widespread use of Internet-enabled phones that underlies this technological paradigm shift.
