An Interview with Ed Feigenbaum

ACM Fellow and A.M. Turing Award recipient Edward A. Feigenbaum, a pioneer in the field of expert systems, reflects on his career.

The Computer History Museum has an active program to gather videotaped histories from people who have done pioneering work in this first century of the information age. These tapes are a rich aggregation of stories that are preserved in the collection, transcribed, and made available on the Web to researchers, students, and anyone curious about how invention happens. The oral histories are conversations about people’s lives. We want to know about their upbringing, their families, their education, and their jobs. But above all, we want to know how they came to the passion and creativity that leads to innovation.

Presented here are excerpts from four interviews with Edward A. Feigenbaum, the Kumagai Professor of Computer Science, Emeritus, at Stanford University and a pioneering researcher in artificial intelligence. The interviews were conducted in 2007 separately by Donald Knuth and Nils Nilsson, both professors of computer science at Stanford University.
        —Len Shustek

What was your family background?

I was born in New Jersey in 1936 to a culturally Jewish family. That Jewish culture thinks of itself as the people of the book, and so there’s a tremendous focus on learning, and books, and reading. I learned to read very early.

What got you interested in science and engineering?

My stepfather was the only one in the family who had any college education. Once a month he would take me to the Hayden Planetarium of the American Museum of Natural History. I got really interested in science, mostly through astronomy, at about 10 years old.

My stepfather worked as an accountant and had a Monroe calculator. I was absolutely fascinated by these calculators and learned to use them with great facility. That was one of my great skills—in contrast to other friends of mine whose great skills were things like being on the tennis team.

I was a science kid. I would read Scientific American every month—if I could get it at the library. One book that really sucked me into science was Microbe Hunters. We need more books like Microbe Hunters to bring a lot more young people into science now.

Why did you study electrical engineering?

I got As in everything, but I really enjoyed most the math and physics and chemistry. So why electrical engineering, as opposed to going into physics? Around my family, no one had ever heard of a thing called a physicist. In this middle-class to lower-middle-class culture people were focused on getting a job that would make money, and engineers could get jobs and make money.

I happened to see an advertisement for scholarships being offered by an engineering school in Pittsburgh called Carnegie Institute of Technology. I got a scholarship, so that’s what I did. Life is an interesting set of choices, and the decision to go to Carnegie Tech (now Carnegie Mellon University) was a fantastically good decision.

Something else there got you excited.

I had a nagging feeling that there was something missing in my courses. There’s got to be more to a university education! In the catalog I found a really interesting listing called "Ideas and Social Change," taught by a young new instructor, James March. The first thing he did was to expose us to Von Neumann’s and Morgenstern’s "Theory of Games and Economic Behavior." Wow! This is mind-blowing! My first published paper was with March in social psychology, on decision-making in small groups.

March introduced me to a more senior and famous professor, Herbert Simon. That led to my taking a course from Simon called "Mathematical Models in the Social Sciences." I got to know Herb, and got to realize that this was a totally extraordinary person.

In January 1956 Herb walked into our seminar of six people and said these famous words: "Over Christmas Allen Newell and I invented a thinking machine." Well, that just blew our minds. He and Newell had formulated the Logic Theorist on December 15th, 1955. They put together a paper program that got implemented in the language called IPL-1, which was not a language that ran on any computer. It was the first list processing language, but it ran in their heads.

That led to your first exposure to computers.

When we asked Herb in that class, "What do you mean by a machine?" he handed us an IBM 701 manual, an early IBM vacuum tube computer. That was a born-again experience! Taking that manual home, reading it all night long—by the dawn, I was hooked on computers. I knew what I was going to do: stay with Simon and do more of this. But Carnegie Tech did not have any computers at that time, so I got a job at IBM for the summer of 1956 in New York.

What did you learn at IBM?

First, plug board programming, which was a phenomenally interesting thing for a geeky kid. Second, the IBM 650, because by that time it became known that Carnegie Tech would be getting a 650. Third, the IBM 704, which was a successor machine to the 701.

When I got back to Carnegie Tech in September 1956 and began my graduate work, there was Alan Perlis, a wonderful computer genius, and later the first Turing Award winner. Perlis was finishing up an amazing program called a compiler. That was "IT," Internal Translator, and it occupied 1,998 words of the 2,000-word IBM 650 drum.

I had known about the idea of algebraic languages because in the summer at IBM someone had come down from the fourth floor to talk to the graduate students and tell them about a new thing that had just hit the scene. You didn’t have to write "CLA" for "clear and add," and you didn’t have to write "005" for "add." You could write a formula, and a program would translate that formula into machine language. FORTRAN. The guy was John Backus, who had come downstairs to talk to us. IT’s introduction actually preceded FORTRAN’s by about nine months.

What was it like to use a computer then?

There was no staff between you and the computer. You could book time on the computer, then you went and did your thing. A personal computer! I loved it. I loved the lights, I loved pressing the switches. This idea has been very important for my career: the hands-on, experimental approach to computer science as opposed to the theoretical approach. Experiment turns out to be absolutely vital.

I was able to write a rather complicated—for that time—simulation of two companies engaged in a duopolistic decision-making duel about pricing of tin cans in the can industry, the second such simulation of economic behavior ever written. It led to my first conference paper, in December 1958, at the American Economics Association annual meeting.

What did you do for your dissertation?

A model called EPAM, Elementary Perceiver and Memorizer, a computer simulation model of human learning and memory of nonsense syllables.

I invented a data structure called a Discrimination Net—a memory structure that started out as nothing when the learner starts. List structures had just been invented, but no one had tried to grow trees. I had to, because I would start with two nonsense syllables in the Net, and then the next pair would come in and they’d have to "grow into" the net somewhere. These were the first adaptively growing trees. Now here’s an amazing and kind of stupid thing that shows what it means to focus your attention on x rather than y. We were focused on psychology. We were not focused on what is now called computer science. So we never published anything about those adaptively growing trees, except as they related to the psychological model. But other people did see trees as a thing to write papers about in the IT literature. So I missed that one!
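The idea of a net that starts empty and grows a new branch only when an incoming item collides with a stored one can be illustrated with a minimal Python sketch. This is not EPAM's actual implementation: the names `Node`, `learn`, and `recognize`, the choice to test one letter position per node, and the assumption that all syllables have the same length are simplifications made for this example.

```python
class Node:
    """One node of a toy discrimination net (hypothetical layout, not EPAM's)."""

    def __init__(self):
        self.test_pos = None  # letter position tested here; None means leaf
        self.children = {}    # letter at test_pos -> child Node
        self.image = None     # syllable stored at a leaf, or None if empty


def learn(root, syllable):
    """Store `syllable`, growing the net only where it collides with another."""
    node = root
    # Sort the syllable down through the existing tests.
    while node.test_pos is not None:
        letter = syllable[node.test_pos]
        if letter not in node.children:
            node.children[letter] = Node()  # new empty branch for this letter
        node = node.children[letter]
    if node.image is None or node.image == syllable:
        node.image = syllable
        return
    # Collision: turn the leaf into a test on the first differing position.
    # Assumes equal-length syllables that differ in at least one letter.
    old = node.image
    pos = next(i for i, (a, b) in enumerate(zip(old, syllable)) if a != b)
    node.image = None
    node.test_pos = pos
    for s in (old, syllable):
        child = Node()
        child.image = s
        node.children[s[pos]] = child


def recognize(root, syllable):
    """Return the image at the leaf the syllable sorts to, or None."""
    node = root
    while node.test_pos is not None:
        node = node.children.get(syllable[node.test_pos])
        if node is None:
            return None
    return node.image
```

Because each node tests only as many letters as were needed to separate the items seen so far, a new syllable that matches a stored one on every tested position retrieves the stored one. After learning DAX, JIR, and DAK in that order, `recognize(root, "DOK")` returns `"DAK"`, since only the first and third letters are ever tested on that path.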

Where was your first academic job?

I had wanted to come to the West Coast, and the University of California at Berkeley was excited about getting me. There I taught two things: organization theory à la March and Simon, and the new discipline called Artificial Intelligence.

There were no books on the subject of AI, but there were some excellent papers that Julian Feldman and I photocopied. We decided that we needed to do an edited collection, so we took the papers we had collected, plus a few more that we asked people to write, and put together an anthology called Computers and Thought that was published in 1963.

The two sections mirrored two groups of researchers. There were people who were behaving like psychologists and thinking of their work as computer models of cognitive processes, using simulation as a technique. And there were other people who were interested in the problem of making smart machines, whether or not the processes were like what people were doing.

How did choosing one of those lead you to Stanford?

The choice was: do I want to be a psychologist for the rest of my life, or do I want to be a computer scientist? I looked inside myself, and I knew that I was a techno-geek. I loved computers, I loved gadgets, and I loved programming. The dominant thread for me was not going to be what humans do, it was going to be what can I make computers do.

I had tenure at Berkeley, but the business school faculty couldn’t figure out what to make of a guy who is publishing papers in computer journals, artificial intelligence, and psychology. That was the push away from Berkeley. The pull to Stanford was John McCarthy.

How did you decide on your research program?

Looking back in time, for reasons that are not totally clear to me, I really, really wanted smart machines. Or I should put the "really" in another place: I really wanted really smart machines.

I wasn’t going to get there by walking down the EPAM road, which models verbal learning, or working on puzzle-solving deductive tasks. I wanted to model the thinking processes of scientists. I was interested in problems of induction. Not problems of puzzle solving or theorem proving, but inductive hypothesis formation and theory formation.

I had written some paragraphs at the end of the introduction to Computers and Thought about induction and why I thought that was the way forward into the future. That’s a good strategic plan, but it wasn’t a tactical plan. I needed a "task environment"—a sandbox in which to specifically work out ideas in detail.

I think it’s very important to emphasize, to this generation and every generation of AI researchers, how important experimental AI is. AI is not much of a theoretical discipline. It needs to work in specific task environments. I’m much better at discovering than inventing. If you’re in an experimental environment, you put yourself in the situation where you can discover things about AI, and you don’t have to create them.

Talk about DENDRAL.

One of the people at Stanford interested in computer-based models of mind was Joshua Lederberg, the 1958 Nobel Prize winner in genetics. When I told him I wanted an induction "sandbox", he said, "I have just the one for you." His lab was doing mass spectrometry of amino acids. The question was: how do you go from looking at a spectrum of an amino acid to the chemical structure of the amino acid? That’s how we started the DENDRAL Project: I was good at heuristic search methods, and he had an algorithm which was good at generating the chemical problem space.

We did not have a grandiose vision. We worked bottom up. Our chemist was Carl Djerassi, inventor of the chemical behind the birth control pill, and also one of the world’s most respected mass spectrometrists. Carl and his postdocs were world-class experts in mass spectrometry. We began to add in their knowledge, inventing knowledge engineering as we were going along. These experiments amounted to titrating into DENDRAL more and more knowledge. The more you did that, the smarter the program became. We had very good results.

The generalization was: in the knowledge lies the power. That was the big idea. In my career that is the huge, "Ah ha!," and it wasn’t the way AI was being done previously. Sounds simple, but it’s probably AI’s most powerful generalization.

Meta-DENDRAL was the culmination of my dream of the early to mid-1960s having to do with theory formation. The conception was that you had a problem solver like DENDRAL that took some inputs and produced an output. In doing so, it used layers of knowledge to steer and prune the search. That knowledge got in there because we interviewed people. But how did the people get the knowledge? By looking at thousands of spectra. So we wanted a program that would look at thousands of spectra and infer the knowledge of mass spectrometry that DENDRAL could use to solve individual hypothesis formation problems.

We did it. We were even able to publish new knowledge of mass spectrometry in the Journal of the American Chemical Society, giving credit only in a footnote that a program, Meta-DENDRAL, actually did it. We were able to do something that had been a dream: to have a computer program come up with a new and publishable piece of science.

What then?

We needed to play in other playpens. I believe that AI is mostly a qualitative science, not a quantitative science. You are looking for places where heuristics and inexact knowledge can come into play. The term I coined for my lab was "Heuristic Programming Project" because heuristic programming is what we did.

For example, MYCIN was the Ph.D. thesis project of Ted Shortliffe, which turned out to be a very powerful knowledge-based system for diagnosing blood infections and recommending their antibiotic therapies. Lab members extracted the core of MYCIN and called it EMYCIN, for Essential MYCIN or Empty MYCIN. That rule-based software shell was widely distributed.

What is the meaning of all those experiments that we did from 1965 to 1968? The Knowledge-Is-Power Hypothesis, later called the Knowledge Principle, which was tested with dozens of projects. We came to the conclusion that for the "reasoning engine" of a problem solving program, we didn’t need much more than what Aristotle knew. You didn’t need a big logic machine. You need modus ponens, backward and forward chaining, and not much else in the way of inference. Knowing a lot is what counts. So we changed the name of our laboratory to the "Knowledge System Lab," where we did experiments in many fields.

What other AI models did you use?

AI people use a variety of underlying problem-solving frameworks, and combine a lot of knowledge about the domain with one of these frameworks. These can either be forward-chaining—sometimes called generate and test—or they could be backward-chaining, which say, for example, "here’s the theorem I want to prove, and here’s how I have to break it down into pieces in order to prove it."
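The two chaining styles mentioned here can be sketched in a few lines of Python. This is an illustration only, not code from MYCIN, EMYCIN, or any Stanford system: the rule format (a set of premises plus one conclusion), the function names, and the toy bird rules are all invented for the example.

```python
# Each rule is (set of premises, conclusion). The rules are made up for this sketch.
RULES = [
    ({"has_feathers"}, "is_bird"),
    ({"is_bird", "swims", "cannot_fly"}, "is_penguin"),
]


def forward_chain(facts, rules):
    """Data-driven: apply modus ponens until no rule adds a new fact."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if conclusion not in known and premises <= known:
                known.add(conclusion)
                changed = True
    return known


def backward_chain(goal, facts, rules, seen=frozenset()):
    """Goal-driven: prove `goal` by breaking it into the premises of a rule."""
    if goal in facts:
        return True
    if goal in seen:  # already trying to prove this goal; avoid looping
        return False
    seen = seen | {goal}
    return any(
        all(backward_chain(p, facts, rules, seen) for p in premises)
        for premises, conclusion in rules
        if conclusion == goal
    )
```

Given the facts `{"has_feathers", "swims", "cannot_fly"}`, `forward_chain` derives `is_bird` and then `is_penguin`, while `backward_chain("is_penguin", ...)` starts from the goal and works back to the same facts. The inference machinery stays the same whatever the domain; only the contents of `RULES` change.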

I began classified research on detecting quiet submarines in the ocean by their sound spectrum. The problem was that the enemy submarines were very quiet, and the ocean is a very noisy place. I tried the same hypothesis formation framework that had worked for DENDRAL, and it didn’t even come close to working on this problem.

Fortunately Carnegie Mellon people—Reddy, Erman, Lesser and Hayes-Roth—had invented another framework they were using for understanding speech, the Blackboard Framework. It did not work well for them, but I picked it up and adapted it for our project. It worked beautifully. It used a great deal of knowledge at different "levels of abstraction." It allowed flexible combination of top-down and bottom-up reasoning from data to be merged at those different levels. In Defense Department tests, the program did better than people.
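A rough Python sketch can show the general shape of a blackboard arrangement: a shared store of hypotheses organized by level, plus independent knowledge sources that read it and post to it. It is not the classified system or the Carnegie Mellon speech code. The level names, the harmonic-frequency rules, and every function name below are assumptions chosen only to make the example small.

```python
from collections import defaultdict


class Blackboard:
    """Shared store of hypotheses, grouped by level of abstraction."""

    def __init__(self):
        self.levels = defaultdict(set)  # level name -> set of hypotheses

    def post(self, level, item):
        """Add a hypothesis; return True only if it was not already there."""
        if item in self.levels[level]:
            return False
        self.levels[level].add(item)
        return True


def find_harmonics(bb):
    """Bottom-up: peaks at f and 2f suggest a harmonic set with fundamental f."""
    peaks = bb.levels["signal"]
    return [("feature", ("harmonic_set", f)) for f in peaks if 2 * f in peaks]


def propose_source(bb):
    """Bottom-up: each harmonic set suggests a (hypothetical) rotating source."""
    return [("source", ("rotating_source", f)) for _, f in bb.levels["feature"]]


def expect_third_harmonic(bb):
    """Top-down: a source hypothesis predicts where to look for more evidence."""
    return [("expected", ("third_harmonic", 3 * f)) for _, f in bb.levels["source"]]


def run(bb, knowledge_sources):
    """Let every knowledge source contribute until nothing new is posted."""
    progress = True
    while progress:
        progress = False
        for ks in knowledge_sources:
            for level, item in ks(bb):
                if bb.post(level, item):
                    progress = True
```

Posting peaks at 50, 100, and 200 on the `signal` level and calling `run` with the three knowledge sources fills in the `feature`, `source`, and `expected` levels. No knowledge source calls another; each reacts only to what is currently on the blackboard, which is what lets bottom-up and top-down reasoning mix freely.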

But that research was classified as "secret." How could ideas be published from a military classified project? The Navy didn’t care about the blackboard framework; that was computer science. So we published the ideas in a paper on a kind of hypothetical: "how to find a koala in eucalyptus trees," which was a non-classified problem drawn from my personal experience in an Australian forest!

Talk about being an entrepreneur as well as an academic.

There was a very large demand for the software generalization of the MYCIN medical diagnosis expert system "shell," called EMYCIN. So a software company was born called Teknowledge, whose goal was to migrate EMYCIN into the commercial domain, make it industrial strength, sell it, and apply it. Teknowledge is still in existence.

Our Stanford MOLGEN project was the first project in which computer science methods were applied to what is now called computational molecular biology. Some MOLGEN software turned out to have a very broad applicability and so was the basis of the very first company in computational molecular biology, called Intelligenetics, later Intellicorp. They had lots of very sophisticated applications. During the dot-com bust they went bust, but they lasted, roughly speaking, 20 years.

In the 1980s you studied the Japanese government’s major effort in AI.

The Japanese plan was very ambitious. They organized a project to essentially do knowledge-based AI, but in a style different from the style we were accustomed to in this country. For one thing, they wanted to do it in the "I-am-not-LISP style," because the Japanese had been faulted in the past for being imitators. So they chose Prolog and tried formal methods. And they included parallel computing in their initiative.

They made a big mistake in their project of not paying enough attention to the application space at the beginning. They didn’t really know what applications they were aiming at until halfway through; they were flying blind for five years. Then they tried to catch up and do it all in five more years, and didn’t succeed. [See the book The Fifth Generation, written with Pamela McCorduck.]

How did you come to work for the U.S. government?

In 1994 an amazing thing happened. The phone rings and it is Professor Sheila Widnall of the Department of Aeronautics and Astronautics of MIT. She said, "Do you know anyone who wants to be Chief Scientist of the Air Force? And by the way, if you are interested let me know." She had been chosen to be Secretary of the Air Force, and she was looking for her Chief Scientist. I thought about it briefly, told her yes, and stayed for three years.

My job was to be a window on science for the Chief of Staff of the Air Force. I was the first person to be asked to be Chief Scientist who was not an Aero-Astro person, a weapons person, or from the physical sciences. There had not been any computer scientists before me.

I did two big things. One was consciousness-raising in the Air Force about software. The one big report I wrote, at the end of my term, was a report called, It’s a Software-First World. The Air Force had not realized that. They probably still do not think that. They think it is an airframe-based world.

The other was on software development. The military up to that point believed in, and could only imagine, a structured-programming top-down world. You set up requirements, you get a contractor to break down the requirements into blocks, another contractor breaks them down into mini-blocks, and down at the bottom there are some people writing the code. It takes years to do. When it all comes back up to the top, (a) it’s not right, and (b) it’s not what you want anymore. They just didn’t know how to contract for cyclical development. Well, I think we were able to help them figure out how to do that.

What happened after your "tour of duty" in Washington?

It was a rather unsettling experience to come back to Stanford. After playing a role on a big stage, all of a sudden you come back and your colleagues ask, "What are you going to teach next year? Intro to AI?"

So at the beginning of 2000, I retired. Since then I have been leading a wonderful life doing whatever I please. Now that I have a lot more time than I had before, I’m getting geekier and geekier. It feels like I’m 10 years old again, getting back involved with details of computing.

The great thing about being retired is not that you work less hard, but that what you do is inner-directed. The world has so many things you want to know before you’re out of here that you have a lot to do.

Why is history important?

When I was younger, I was too busy for history and not cognizant of the importance of it. As I got older and began to see my own career unfolding, I began to realize the impact of the ideas of others on my ideas. I became more and more of a history buff.

That convinced me to get very serious about archives, including my own. If you’re interested in discoveries and the history of ideas, and how to manufacture ideas by computer, you’ve got to treat this historical material as fundamental data. How did people think? What alternatives were being considered? Why was the movement from one idea to another preposterous at one time and then accepted?

You are a big fan of using heuristics not only for AI, but also for life. What are some of your life heuristics?

  • Pay a lot of attention to empirical data, because in empirical data one can discover regularities about the world.
  • Meet a wonderful collaborator (for me it was Joshua Lederberg) and work with that collaborator on meaningful problems.
  • It takes a while to become really, really good at something. Stick with it. Focus. Persistence, not just on problems but on a whole research track, is really worth it. Switching in the middle, flitting around from problem to problem, isn’t such a great idea.
  • Life includes a lot of stuff you have to do that isn’t all that much fun, but you just have to do it.
  • You have to have a global vision of where you’re going and what you’re doing, so that life doesn’t appear to be just Brownian motion where you are being bumped around from one little thing to another thing.

How far have we come in your quest to have computers think inductively?

Our group, the Heuristic Programming Project, did path-breaking work in the large, unexplored wilderness of all the great scientific theories we could possibly have. But most of that beautiful wilderness today remains largely unexplored. Am I happy with where we have gotten in induction research? Absolutely not, although I am proud of the few key steps we took that people will remember.

Is general pattern recognition the answer?

I don’t believe there is a general pattern recognition problem. I believe that pattern recognition, like most of human reasoning, is domain specific. Cognitive acts are surrounded by knowledge of the domain, and that includes acts of inductive behavior. So I don’t really put much hope in "general anything" for AI. In that sense I have been very much aligned with Marvin Minsky’s view of a "society of mind." I’m very much oriented toward a knowledge-based model of mind.

How should we give computers knowledge?

I think the only way is the way human culture has gotten there. We transmit our knowledge via cultural artifacts called texts. It used to be manuscripts, then it was printed text, now it’s electronic text. We put our young people through a lot of reading to absorb the knowledge of our culture. You don’t go out and experience chemistry, you study chemistry.

We need to have a way for computers to read books on chemistry and learn chemistry. Or read books on physics and learn physics. Or biology. Or whatever. We just don’t do that today. Our AI programs are handcrafted and knowledge engineered. We will be forever doing that unless we can find out how to build programs that read text, understand text, and learn from text.

Reading from text in general is a hard problem, because it involves all of common sense knowledge. But reading from text in structured domains I don’t think is as hard. It is a critical problem that needs to be solved.

Why is AI important?

There are certain major mysteries that are magnificent open questions of the greatest import. Some of the things computer scientists study are not. If you’re studying the structure of databases, well, sorry to say, that’s not one of the big magnificent questions.

I’m talking about mysteries like the initiation and development of life. Equally mysterious is the emergence of intelligence. Stephen Hawking once asked, "Why does the universe even bother to exist?" You can ask the same question about intelligence. Why does intelligence even bother to exist?

We should keep our "eye on the prize." Actually, two related prizes. One is that when we finish our job, whether it is 100 years from now or 200 years from now, we will have invented the ultra-intelligent computer. The other is that we will have a very complete model of how the human mind works. I don’t mean the human brain, I mean the mind: the symbolic processing system.

In my view the science that we call AI, maybe better called computational intelligence, is the manifest destiny of computer science.

For the people who will be out there years from now, the question will be: will we have fully explicated the theory of thinking in your lifetime? It would be very interesting to see what you people of a hundred years from now know about all of this.

It will indeed. Stay tuned.
