The Real Problem With AI

Some of the current debates around artificial intelligence (AI) are baffling. I am not sure what philosophers have to contribute, and the fears of "machines taking over humans" sound like stuff for science-fiction novels or late-night party musings rather than anything that threatens us now. Especially since real threats do exist, more concrete and more scary. And more mundane.

The problem with AI is not the metaphysical risk that machines will deprive us of our free will. The problem with AI is the practical risk that even with our free will we start trusting our decisions to algorithms, and they make the wrong decisions.

Serious discussion is hard because we are often in the domain of magical thinking rather than reason. People seem to understand "AI" as an abracadabra that will miraculously get the right answers. Sorry to disappoint (in fact, not sorry at all; that is our role): AI is algorithms. Or, to paraphrase a famous saying, "It’s algorithms, stupid!", not to be confused with "it is stupid algorithms." The algorithms can be very smart, but they are still just algorithms conceived and implemented by humans. As always, the algorithms have limitations, and the implementations often have bugs. Also, they need data, which can be erroneous, too (as anyone knows who has ever used a GPS navigator and been told "turn left" into a closed street or worse).

The current renewed excitement about artificial intelligence, coming after the early hype in the Seventies, the disappointment and inevitable backlash that followed, and several decades of "AI winter," is due to the emergence of powerful new algorithms taking advantage of machine learning from the immense amounts of data that modern computing hardware enables us to process, at levels heretofore unimaginable. But they are still algorithms!

Learning-based approaches have produced remarkable results, but their results are only as good as the learning data. At the time of the Volkswagen Dieselgate, someone pointed out [1] that the cheating could also have arisen innocently (in some sense) as a result of a deep-learning algorithm detecting that under some conditions (testing), disabling the emissions led to better results. Learning algorithms are neutral; they have no notion of ethics, or of good engineering practices.

In addition, statistically based learning techniques can only, by nature, achieve good enough results. A telling example is machine translation. After decades of trying structural methods based on linguistic knowledge, the field made a big stride by turning to statistical techniques. Tools such as Google Translate are the best-known example and perform a useful role, particularly for languages you do not know at all. But they can also be very wrong. Translating the Russian "Aleksey liubit Natashu" (in Cyrillic letters) correctly yields "Alex loves Natasha." But Russian, unlike English or French, mostly uses word form, rather than word position, to determine the function of each word. "Aleksey," a nominative, is the subject, while "Natashu," the accusative of "Natasha," denotes the object of the love. Now try "Alekseia liubit Natasha": Google still produces the exact same translation as with the first form. Here, however, the meaning is the other way around: "Alekseia" is the accusative, so the sentence means that Natasha loves Alex. It has a slightly different nuance, as in "Natasha is the one who loves Alex," but it is still indubitable who loves whom. Google’s statistical techniques rely on word order as in English, and get the opposite of the meaning. (By the way, translation through Yandex, the main Russian search engine, does no better [2].)
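
To make the difference concrete, here is a toy sketch in Python. It is an illustration only and says nothing about how Google Translate or any other real system is built: the four-entry table of name forms, the one-entry verb table, and both function names are invented for this example, using the transliteration above. One function reads the sentence by word position, as an English speaker would; the other reads it by case.

```python
# Toy illustration only: a hand-written table for the two names in the
# example above. All names here are hypothetical, not any translator's API.
CASE_FORMS = {
    "Aleksey": ("Alex", "nominative"),
    "Alekseia": ("Alex", "accusative"),
    "Natasha": ("Natasha", "nominative"),
    "Natashu": ("Natasha", "accusative"),
}
VERBS = {"liubit": "loves"}


def translate_by_position(sentence):
    """Naive reading: the first noun is the subject, as in English."""
    first, verb, last = sentence.split()
    return f"{CASE_FORMS[first][0]} {VERBS[verb]} {CASE_FORMS[last][0]}."


def translate_by_case(sentence):
    """Case-aware reading: the nominative is the subject, wherever it stands."""
    first, verb, last = sentence.split()
    by_case = {CASE_FORMS[w][1]: CASE_FORMS[w][0] for w in (first, last)}
    return f"{by_case['nominative']} {VERBS[verb]} {by_case['accusative']}."


for s in ("Aleksey liubit Natashu", "Alekseia liubit Natasha"):
    print(s, "->", translate_by_position(s), "|", translate_by_case(s))
# Aleksey liubit Natashu -> Alex loves Natasha. | Alex loves Natasha.
# Alekseia liubit Natasha -> Alex loves Natasha. | Natasha loves Alex.
```

The two readings agree on the first sentence and disagree on the second, where only the case-aware one gets who loves whom right.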

For love, such a confusion is bad enough, but imagine an AI program in charge of decoding diplomatic messages, like "Country A just attacked Country B." Who attacked whom is not a matter of style.

We should take such examples seriously. Programs beating chess or Go masters are great, but human intelligence manifests itself daily in matters such as language; and machine translation is the flagship application of the new wondrous learning-based approaches to AI. Yet it stumbles on such elementary cases, while any native-speaking five-year-old, with no idea of what an accusative is, instantly parses who loves and who is loved. Artificial intelligence still has lots of catching-up to do with the natural kind, and there is no guarantee that it will succeed. More precisely, it has become good at finding solutions that are often right. This is good enough to get the gist of a foreign web page. But from there to trust AI-based programs with decisions?

That such deficiencies exist can surprise no one in our field. We all know that errors are an inescapable component of computing.

Throwing in "AI" as a magic spell does nothing to advance discussions. There is not even a firm definition of what constitutes artificial intelligence. I remember John McCarthy saying that "as soon as it works, it is no longer AI." That pithy observation may be exaggerated, but the boundary continuously shifts. At one time, garbage collection (automatic memory management techniques, now standard in the implementation of many mainstream programming languages) was considered AI. So was list processing. The new algorithms have pushed the boundary again, but they are still just algorithms.

It is part of the role of the technical community to explain these circumstances to the non-technical world and to stem the propagation of magical views of AI, which distract from the real problems.

A surprising example is the petition against "Lethal Autonomous Weapons" which, probably like many others in the field, I was recently invited to sign. I am indeed ready to endorse it; no good, and a lot of harm, can come from developing such weapons. The technical community should oppose the idea. But the accompanying text, in listing the risks, makes no mention of a particularly scary one: that they might just go awry! It is true that "by removing the risk, attributability, and difficulty of taking human lives, lethal autonomous weapons could become powerful instruments of violence and oppression," but they can do the same even without anyone’s bad intent, simply as a result of a programmer mistake or badly interpreted data (as in the case, paradoxically recalled on the same Web page, that brought the world to within a hair’s breadth of nuclear war 55 years ago).

I am flabbergasted that the list of original signatories includes several professors of computer science, and that the text still does not mention this critical and very realistic risk.

We must consider it our duty to warn the public [3] of the real issues. It is not a matter of fighting "hype." Hype is an inevitable phenomenon in the computing world and has good aspects as well as bad. Our role is to explain the distinction between reality and promises, and to refocus debates, particularly debates about risks, on what really matters.

In artificial intelligence today, the most menacing risk is stuff that does not work.

Notes

[1] I think the observation appeared on the ACM Risks forum but cannot find it; if someone has the reference I will be happy to add proper credit.

[2] The new translation site deepl.com does an amazing job, leaving Google Translate and such in the dust. I assume it combines statistical and structural techniques. It only supports a limited set of languages so far. One of them is Polish, and I would be interested to learn from someone who knows that language (which is structurally in the same class as Russian) how it performs on examples such as the one cited.

[3] As David Parnas did 30 years ago when he publicly stepped out of the Strategic Defense Initiative; not out of any political disagreement, but because he knew, as a scientist, that the advertised goals could not be met.