A View to the Future

Takeo Kanade has spent his career trying to help computers understand what is around them. The robotics pioneer joined the faculty of Carnegie Mellon University (CMU) in 1980, and his computer vision research underpins everything from medical robotics technology to self-driving cars. This year, to his long list of accolades, he adds the Kyoto Prize.

You’ve been working in computer vision since you were a graduate student at Kyoto University. What drew you to the field?

I did my undergraduate work in electrical engineering in the late 1960s, which was the dawn of computer science in Japan. There was no computer science department at that time, but I was impressed with a talk I heard from a professor who was using computers for what he called non-numerical computation. Nowadays, we might call it artificial intelligence—it was very pioneering work in speech recognition, image processing, and translation.

For your Ph.D. thesis, you developed the world’s first complete face recognition system.

You said the first "complete." Indeed, that is the word—it was a real system, including everything from the very first step digitizer to an image-processing program, all the way to a display device that I made from a Sony TV. It was written in assembler language. You know, in baseball, a complete game is when you pitch from the first to the ninth inning. In a perfect game, you don’t give out any hits. My Ph.D. thesis program was complete, but not perfect; that’s why I’m still working on it.

To create this system, you drew inspiration from origami and geometry, rather than from human vision.

I was not against learning from humans. But I didn’t feel that the human body is an optimum system. Bodies have to deal with everything from seeing to eating to surviving. But when you isolate a particular capability, there should be a better way to do it. Moreover, at that time, we didn’t know much about how the body works. I’m a practical person, so I thought that by understanding the underlying physical, geometrical, and statistical principles, we should be able to design a good system. These days, I see the advantage of learning from biological systems, now that we understand them a little better and have the tools to recreate them.

In 1980, after several years of teaching at Kyoto University, you moved to the U.S. to become a professor at Carnegie Mellon University. Why?

I felt it was a kind of intellectual adventure. At that time, computer science was very different in the U.S. than in Japan: the environment, the way people talked … I visited CMU in 1977, and I still remember seeing Nobel Prize winner Herbert Simon in the terminal room at midnight, talking with students while he waited for his results from the printer. It was beyond anything I had imagined.

In the mid-1980s, you began your pioneering research into the field of autonomous vehicles. What sparked your interest?

It was something of a historical accident. We had a fairly big contract from DARPA (the U.S. Defense Advanced Research Projects Agency) to develop vision capabilities for Autonomous Land Vehicles, or ALVs. But the arrangement didn’t work very well, because we didn’t have a real car that gave us real input in real time. Instead, we were given video, and asked to analyze whether we could recognize this or that. But recognition is relative to what the car does. Recognition and action and the way that input changes make a loop, and you have to understand that whole loop yourself.

So you began building your own ALVs, and later, cars.

In the beginning, the work was on natural terrain, but that’s harder to do in the city. So we began to use a pathway in Schenley Park, and our interests shifted to path and road following.

At one point, I understand that a tree presented something of an obstacle.

I often tell this story. One of the trees in the park accidentally aligned with a path when seen from a particular direction. Of course, in a three-dimensional world, the tree edge is a vertical edge and the road edge is a horizontal edge. But as a two-dimensional picture, they appeared aligned, as if they were a single object. At that time, we didn’t have a good understanding of how to deal with appearance vs. reality, so we just tried to see whether a pair of edges created the appearance of a road boundary, like parallel lines.

And the edges of this particular tree trunk fooled the system.

Yes, whenever the car went the wrong direction, we had to push the "kill" button. We called it the killer tree. We once asked students how we could avoid this problem, and the best answer was, "Call up the Pittsburgh Parks Department and ask them to cut the tree."
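
The ambiguity behind the killer tree can be illustrated with a toy pinhole-camera model. This is only a sketch under stated assumptions, not the code used on the CMU vehicles: the camera height, the distances, and the function names (project, collinear, edges_seen_from) are all made up for illustration. It shows that a horizontal path edge and a vertical trunk edge, which are perpendicular in three dimensions, can fall on one straight line in the image from one viewpoint and separate again when the viewpoint shifts.

```python
def project(point, focal_length=1.0):
    """Pinhole projection of a 3-D point (x, y, z) onto the image plane."""
    x, y, z = point
    return (focal_length * x / z, focal_length * y / z)


def collinear(points_2d, tol=1e-9):
    """True if every 2-D point lies on the line through the first and last."""
    (x0, y0), (x1, y1) = points_2d[0], points_2d[-1]
    return all(
        abs((x1 - x0) * (y - y0) - (y1 - y0) * (x - x0)) <= tol
        for x, y in points_2d
    )


def edges_seen_from(camera_x):
    """Image points of both edges for a camera shifted sideways by camera_x.

    Assumed scene (hypothetical numbers): the camera looks down +z, y is up,
    and the ground lies 1.5 m below the camera.
    """
    # Horizontal path edge on the ground, receding from the camera.
    path_edge = [(0.0, -1.5, z) for z in (5.0, 10.0, 15.0, 20.0)]
    # Vertical trunk edge standing where the path edge ends.
    trunk_edge = [(0.0, y, 20.0) for y in (-1.5, 0.0, 1.5, 3.0)]
    shifted = [(x - camera_x, y, z) for x, y, z in path_edge + trunk_edge]
    return [project(p) for p in shifted]


# From directly above the path edge, both edges land on one image line.
aligned_head_on = collinear(edges_seen_from(0.0))
# One meter to the side, the two edges no longer line up.
aligned_from_side = collinear(edges_seen_from(1.0))
print(aligned_head_on, aligned_from_side)
```

A test that looks only at image edges cannot tell these two cases apart from the aligned viewpoint, which is the appearance-versus-reality problem described in the interview.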

What are you working on now?

One project I’m working on with Professor Srinivasa Narasimhan is smart headlights. When you’re driving at night or on a rainy or snowy day, the raindrops or snowflakes appear as bright dots in your visible scene, because they are very reflective. But if you replace the headlight with a projector, you can turn the individual rays on and off. Then, by recognizing where the raindrops are, you can turn off just the rays that will hit the raindrops so that you would not be able to see them, and could drive as if it’s a fine day. I had this idea a long time ago, but could not do it. Professor Narasimhan’s team made this magic real.

That sounds amazing.

The real use of the raindrop control may still be a little way off. But our system can recognize oncoming cars and turn off just the rays that would hit the driver’s eyes, so that you don’t have to switch between your high and low beams. This is what I call augmented reality. What people call "augmented reality" is often an application where a smartphone or a head-mounted display appends additional information to the object that you’re viewing. But that’s not augmented reality. That’s the augmented display of reality. With smart headlights, reality is indeed augmented, because to your naked eye, it appears to be more informative.
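
The selective-illumination idea can be sketched in a few lines. This is an illustration only, assuming the projector is modeled as a grid of individually switchable rays and that some detector has already reported which grid cells contain a raindrop or an oncoming driver's eyes; the function name headlight_mask and the grid size are hypothetical, and nothing here reflects the actual CMU system.

```python
def headlight_mask(rows, cols, blocked_cells):
    """Return a rows x cols grid of booleans: True = ray on, False = ray off.

    blocked_cells is an iterable of (row, col) positions that should not be
    illuminated. In a real system these would come from a detector; here
    they are simply given as input.
    """
    blocked = set(blocked_cells)
    return [
        [(r, c) not in blocked for c in range(cols)]
        for r in range(rows)
    ]


# Hypothetical frame: two detected raindrops in a 3 x 4 grid of rays.
mask = headlight_mask(3, 4, [(0, 1), (2, 3)])
rays_on = sum(cell for row in mask for cell in row)
print(rays_on)  # 10 of the 12 rays stay on
```

The hard parts in practice are detecting the drops and reacting before they move, which this sketch deliberately leaves out.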

You’re also involved with a number of Quality of Life Technology initiatives that are focused on developing systems that enable people with disabilities to live more independently.

The work has given me a different perspective on robotics. In autonomous systems, the implicit goal is to reduce or even eliminate human involvement. That’s natural for applications like space and defense, where it’s either difficult, undesirable, or impossible for humans to go. In Quality of Life robotics, people are a part of the system, and people want to be helped only when they need it; otherwise, they want to do things themselves. That is probably the most important aspect that we have to respect.

So your goal is to figure out how to increase human involvement.

In order for the system to work, we have to understand each component. I often say that people are the weakest link, because they are the least understood. In cars, for example, we have worked so hard to understand mechanical factors like tire traction, air pressure and resistance, and so on. But we don’t understand the driver, and that’s why I think we still have a lot of accidents.

December 2016 Issue

Communications of the ACM (CACM) is now a fully Open Access publication.
