A Case Against Mission-Critical Applications of Machine Learning

In their column "Learning Machine Learning" (Dec. 2018), Ted G. Lewis and Peter J. Denning raised a crucial question about machine learning systems: "These [neural] networks are now used for critical functions such as medical diagnosis … fire-control systems. How can we trust the networks?" They answered: "We know that a network is quite reliable when its inputs come from its training set. But these critical systems will have inputs corresponding to new, often unanticipated situations. There are numerous examples where a network gives poor responses for untrained inputs."

David Lorge Parnas followed up on this discussion in his Letter to the Editor (Feb. 2019), highlighting that "the trained network may fail unexpectedly when it encounters data radically different from its training set."

We wish to point out that machine learning-based systems, including commercial ones performing safety-critical tasks, can fail not only under "unanticipated situations" (noted by Lewis and Denning) or "when it encounters data radically different from its training set" (noted by Parnas), but also under normal situations, even on data that is extremely similar to their training sets.

In our article "Metamorphic Testing of Driverless Cars" (Mar. 2019), we tested the real-life LiDAR obstacle perception system of Baidu’s Apollo self-driving software and reported the surprising finding that as few as 10 purely random points scattered outside the driving area could cause an autonomous vehicle to fail to detect an obstacle on the roadway, with 2.7% probability: of the 1,000 tests, 27 failed (see Figure 3a in our article). The Apollo self-driving team confirmed "it might happen" because the system was "deep learning trained."
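The kind of test described above can be illustrated with a minimal sketch. It is not the authors' test harness and does not use Apollo's API: the point format `(x, y, z)`, the `HALF_WIDTH` driving-area boundary, and both toy detectors are assumptions made up for illustration. The relation being checked is that adding points outside the driving area should not make a previously detected obstacle disappear.

```python
import random

HALF_WIDTH = 5.0  # assumed half-width of the driving area, in metres


def add_points_outside(frame, n_points, rng):
    """Follow-up input: the original frame plus n_points random points
    whose x coordinate lies outside the assumed driving area."""
    extra = []
    for _ in range(n_points):
        side = rng.choice((-1.0, 1.0))
        x = side * rng.uniform(HALF_WIDTH + 1.0, HALF_WIDTH + 50.0)
        extra.append((x, rng.uniform(0.0, 100.0), rng.uniform(0.0, 3.0)))
    return frame + extra


def violates_relation(detect, frame, follow_up):
    """True if an obstacle detected in the original frame is missing
    from the follow-up frame."""
    return not detect(frame) <= detect(follow_up)


def failure_rate(detect, frame, n_points, trials, seed=0):
    """Fraction of random follow-up frames that violate the relation."""
    rng = random.Random(seed)
    failures = sum(
        violates_relation(detect, frame, add_points_outside(frame, n_points, rng))
        for _ in range(trials)
    )
    return failures / trials


def robust_detect(frame):
    """Toy detector (hypothetical): ignores points outside the driving area
    and reports each 1 m cell on the roadway holding at least 3 points."""
    counts = {}
    for x, y, _z in frame:
        if abs(x) <= HALF_WIDTH:
            cell = (int(x // 1), int(y // 1))
            counts[cell] = counts.get(cell, 0) + 1
    return {cell for cell, n in counts.items() if n >= 3}


def fragile_detect(frame):
    """Toy detector with a contrived flaw: any point far from the road
    makes it discard the whole frame."""
    if any(abs(x) > 30.0 for x, _y, _z in frame):
        return set()
    return robust_detect(frame)


# One small obstacle on the roadway, made of three nearby points.
frame = [(0.2, 10.1, 0.5), (0.4, 10.3, 0.6), (0.6, 10.7, 0.4)]
print(failure_rate(robust_detect, frame, n_points=10, trials=1000))
print(failure_rate(fragile_detect, frame, n_points=10, trials=1000))
```

The test needs no labeled ground truth: it only compares the detector's output on the original frame with its output on the perturbed frame, which is what makes the approach practical for systems whose correct output is hard to specify.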

Now, after a further investigation, we have found that in 24 of these 27 failed tests, the 10 random points can actually be reduced to just one single point, on which the system still fails. For the remaining three failed tests, the 10 random points can be reduced to two points on which the system still fails. Such a random point can represent a tiny particle in the air or sheer noise commonly found in real-life data from sensors. In all our studies, the original data (before adding the random points) was training data downloaded from Apollo’s official website, where each data frame normally contained more than 100,000 data points (therefore, adding one or two additional points to the frame is trivial).
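The letter does not say how the 10 added points were reduced. One simple way to do it, sketched here purely as an assumption, is to try every subset of the added points in order of increasing size and stop at the first one that still triggers the failure. The `still_fails` callback is hypothetical: in practice it would rerun the perception system on the original frame plus the given subset.

```python
from itertools import combinations


def smallest_failing_subset(added_points, still_fails):
    """Return a smallest subset of added_points for which still_fails
    returns True, or None if no non-empty subset triggers the failure."""
    for size in range(1, len(added_points) + 1):
        for subset in combinations(added_points, size):
            if still_fails(list(subset)):
                return list(subset)
    return None


# Hypothetical oracle for illustration: the failure is triggered by any
# added point whose x coordinate exceeds 40.
points = [(6.0, 1.0, 0.0), (45.0, 2.0, 0.0), (8.0, 3.0, 0.0)]
print(smallest_failing_subset(points, lambda pts: any(p[0] > 40.0 for p in pts)))
```

For 10 added points this exhaustive search needs at most 1,023 reruns, which is affordable here; for much larger perturbations a delta-debugging style search would be a more economical choice.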

These findings mean that just one tiny particle in the air outside the driving area can cause the LiDAR system of a real-life deep-neural-network-driven autonomous vehicle to fail to detect an obstacle on the roadway. This result reveals a much more serious problem than those pointed out by Lewis and Denning, and by Parnas, and hence provides a case against mission-critical applications of machine learning techniques.

Zhi Quan Zhou and Liqun Sun, Wollongong, Australia

Authors’ Response

We agree completely. Examples of fragility of trained neural networks keep popping up and casting doubt on whether deep learning networks can be trusted in safety-critical applications. We saw a recent demonstration in which a network trained to read road signs correctly identified a stop sign when the image was clean, and incorrectly called it a speed limit sign when just a few pixels of the image were altered. This is one of the reasons Dave Parnas called for skilled programmers, instead of neural networks, to find solutions to problems: their programs could be verified and their bugs fixed.

Ted G. Lewis and Peter J. Denning

The relevance of Zhou and Sun’s important point is not limited to neural networks or machine learning technology. They illustrate dangers that can exist whenever a program’s precise behavior is not known to its developers.

I have heard neural network researchers say, with apparent pride, that devices they have built sometimes surprise them. A good engineer would feel shame, not pride. In safety-critical applications, it is the obligation of the developers to know exactly what their product will do in all possible circumstances. Sadly, we build systems so complex and badly structured that this is rarely the case.

David Lorge Parnas

Never Too Late to Share Computational Thinking

I commend Judy Robertson’s wonderful blog post "What Children Want to Know About Computers" (Oct. 19, 2018) for illustrating the challenges children face in understanding computers, and the challenges CS educators face in helping them. It reminded me of a moment in 2018 when I was explaining programming to my daughter, a recent college graduate who majored in the humanities and was never very interested in computers. She asked, "How does the computer actually work? How does it add two numbers?" It turned out to be the perfect opportunity to whip out my copy of Digital Equipment Corporation’s 1981 VAX Architecture Handbook. She found the details of the instruction sets, opcodes, registers, and memory fascinating, and they helped her begin to understand what computers and programs actually do behind the scenes.

She has since been studying computers and software development online, as she considers a career in technology. As Robertson said, we can do a much better job teaching our children how computers work. I think it is important to add that young adults—everyone, really—can benefit from a greater understanding of computers and computational thinking.

Geoffrey A. Lowney, Issaquah, WA, USA

Footnotes

Communications welcomes your opinion. To submit a Letter to the Editor, please limit your contribution to 500 words or less, and send to letters@cacm.acm.org.

August 2019 Issue
