Deep learning has transformed numerous fields. In tackling complex tasks such as speech recognition, computer vision, predictive analytics, and even medical diagnostics, these systems consistently achieve—and even exceed—human-level performance. Yet deep learning, an umbrella term for machine learning systems based primarily on artificial neural networks, is not without its limitations. As data becomes non-planar and more complex, the ability of the machine to identify patterns declines markedly.
At the heart of the issue are the basic mechanics of deep learning frameworks. "With just two layers, a simple perceptron-type network can approximate any smooth function to any desired accuracy, a property called 'universal approximation'," points out Michael Bronstein, a professor in the Department of Computing at Imperial College London in the U.K. "Yet, multilayer perceptrons display very weak inductive bias, in the sense that they assume very little about the structure of the problem at hand and fail miserably if applied to high-dimensional data."
Simply put, these systems can approximate complex functions, but they do not generalize well to previously unseen data and unfamiliar examples. Thus, when the technology is applied to sophisticated computer vision and image recognition problems, simple neural networks typically require colossal training sets. Although today's Convolutional Neural Networks (CNNs) provide a stronger inductive bias by processing images using small local filters, they are designed to operate on one-dimensional (1D) and two-dimensional (2D) data, such as an audio file or a photograph. Designing neural networks that can cope with more complex entities such as molecules, data trees, networks, and manifolds pushes the task into a non-Euclidean world.
That is where a concept called geometric deep learning enters the picture. It relies on a broad class of approaches that use "geometric" inductive biases and concepts to make sense of non-Euclidean structures, such as graphs and manifolds. "When you go to 3D (three-dimensional) deep learning, you greatly increase the possibilities within a convolutional network," explained Max Welling, professor and Research Chair at the University of Amsterdam, The Netherlands, and vice president of technologies for Qualcomm. "There are many exciting applications for the technology."
Geometric deep learning aims to expand data science in much the same way that a 3D image offers more insight and perspective than a 2D photo. "There's a natural connection to physics, in the sense that geometrical properties are typically expressed through symmetries," said Joan Bruna Estrach, assistant professor of computer science, data science, and mathematics at the Courant Institute and the Center for Data Science at New York University. This includes signals that arise in climate science, molecular biology, and many other areas in the physical sciences.
Geometric deep learning builds upon a rich history of machine learning. The first artificial neural network, called the "perceptron," was invented by Frank Rosenblatt in the 1950s. Early "deep" neural networks were trained by Soviet mathematician Alexey Ivakhnenko in the 1960s. A major advance took place in 1989, when a group of researchers, including New York University professor (and ACM A.M. Turing Award recipient) Yann LeCun, designed the now-classical Convolutional Neural Network (CNN). The group used CNNs to solve computer vision problems that were considered incredibly difficult at that time, including handwritten digit recognition.
What imbues a neural network with its expressive power is a "modular design based on connecting neurons into multiple layers that can spot highly complex problems." As data passes through the different layers of the CNN, each layer relies on the previous layer to extract more detailed information. For example, in the case of a photo of a butterfly, the initial layer may identify the basic shape from the pixel patterns, a second neural layer may detect features such as antennae and wings, and another layer may detect colors and other features. An algorithm can determine that an object is either a butterfly, or not. The use of convolutional filters endows CNNs with an important property called shift equivariance, which means they can identify objects no matter where they are located within an image.
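The shift equivariance described above can be checked on a toy example. The following is a minimal sketch, assuming a one-dimensional signal with wrap-around edges rather than a real image; the function names, the signal, and the filter values are hypothetical and chosen only for illustration.

```python
def circular_conv1d(signal, kernel):
    """Slide a small filter over a 1D signal, wrapping around at the edges."""
    n = len(signal)
    return [
        sum(kernel[k] * signal[(i + k) % n] for k in range(len(kernel)))
        for i in range(n)
    ]


def shift(signal, steps):
    """Circularly shift a signal to the right by `steps` positions."""
    n = len(signal)
    return [signal[(i - steps) % n] for i in range(n)]


signal = [0, 0, 1, 3, 1, 0, 0, 0]  # a small "bump" standing in for an object
kernel = [1, -2, 1]                # hypothetical local filter

shifted_then_filtered = circular_conv1d(shift(signal, 3), kernel)
filtered_then_shifted = shift(circular_conv1d(signal, kernel), 3)

# Shift equivariance: moving the input moves the output by the same amount.
assert shifted_then_filtered == filtered_then_shifted
```

Because the same filter is applied at every position, shifting the input and then filtering gives the same result as filtering and then shifting. That is why the response to an object does not depend on where the object sits.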
However, there's a catch. Many objects, from molecules and scans of human organs to the streets on which autonomous vehicles must drive, are 3D and far more complex than a flat photo of a butterfly, zebra, or human face. These 3D objects have many more degrees of freedom, and the shortest distance between two points isn't necessarily how it appears in a 2D image or photo. Thus, the CNN struggles to tackle the volume and complexity of this data. Metaphorically speaking, CNNs lack the capability to see beyond the flat earth of Euclidean geometry. As a result, researchers in fields such as biology, chemistry, physics, network science, computer graphics, and social media have found they are somewhat limited in their ability to explore important data science problems.
In 2015, Bronstein introduced the term "geometric deep learning" to describe neural network architectures with geometric inductive biases that can be applied to data structured as surfaces (or "manifolds" in geometric jargon) and graphs. These graphs, which are mathematical abstractions of networks, are especially useful in a broad range of applications involving systems of relations and interactions. By analyzing an object in a non-Euclidean way, including examining the edge of pixels and changing the way the convolutional neural network filters data, the system learns much more about the relationship between and among pixels.
Indeed, deep learning on graphs, which also goes by the name of "graph representation learning" or "relational inductive biases," bears many similarities to classical CNNs, but at the same time it is very different. "Similar to convolutional neural networks, graph neural networks perform local operations with shared parameters, implemented in the form of 'message passing' between every node and its neighbors," Bronstein said. However, unlike convolution operations used on grid-structured data, graph operations are permutation-invariant, which means their results do not depend on the order in which the nodes are listed.
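A single round of message passing can be sketched in a few lines. This is an illustrative toy, assuming one scalar feature per node, a sum over neighbors, and two fixed shared weights; the function names, the weights, and the four-node graph are hypothetical and do not come from any particular graph neural network library.

```python
def message_passing_step(features, neighbors, w_self=0.5, w_neigh=0.25):
    """One round of message passing with parameters shared by every node.

    features:  dict mapping node -> numeric feature
    neighbors: dict mapping node -> list of neighboring nodes
    """
    return {
        node: w_self * features[node]
        + w_neigh * sum(features[nbr] for nbr in neighbors[node])
        for node in features
    }


def readout(features):
    """Graph-level summary that ignores node order (sum pooling)."""
    return sum(features.values())


def relabel(features, neighbors, mapping):
    """Rename the nodes and reverse every ordering in which they are stored."""
    new_features = {mapping[n]: features[n] for n in reversed(list(features))}
    new_neighbors = {
        mapping[n]: [mapping[m] for m in reversed(neighbors[n])]
        for n in reversed(list(neighbors))
    }
    return new_features, new_neighbors


# Hypothetical path graph a - b - c - d with one number per node.
features = {"a": 1, "b": 2, "c": 4, "d": 8}
neighbors = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
mapping = {"a": "z", "b": "y", "c": "x", "d": "w"}

updated = message_passing_step(features, neighbors)
updated_relabeled = message_passing_step(*relabel(features, neighbors, mapping))

# Each node receives the same update regardless of naming or storage order...
assert all(updated_relabeled[mapping[n]] == updated[n] for n in features)
# ...and the graph-level summary is unchanged.
assert readout(updated_relabeled) == readout(updated)
```

The sum over neighbors is what makes the operation insensitive to ordering: a sum gives the same value no matter which neighbor is visited first. That differs from a grid filter, which assigns a different weight to each relative position.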
Geometric deep learning is not a complete break from classical deep learning. In fact, "If you look at the algorithms and the architectures that researchers are mostly dealing with, there's a huge overlap," Bruna pointed out. "In reality, deep learning represents a continuum of increasingly structured architectures that reflect inductive biases of the physical world." Bruna said CNNs serve as a "canonical instance" of a more basic translation symmetry. "Geometric deep learning provides a toolkit to express symmetries and [processes] that work best for a specific task or type of computational problem," he said.
The technique is opening up new vistas for understanding data. A team of researchers at the Netherlands' University of Amsterdam, including Taco Cohen, a machine learning researcher and Ph.D. candidate, advanced the field in 2018 when they figured out a way to encode basic assumptions about images and models into geometric deep learning algorithms. By scanning a plane of pixels for an entire volume, creating a 3D map and using the artificial neural net, they were able to leapfrog conventional CNN methods when studying lung cancer computed tomography (CT) scans. The approach produced results on par with conventional CNNs using only about a tenth of the data. "Whereas classical convolutional networks need to learn the appearance of lung nodules in every orientation, our network can automatically recognize nodules no matter their orientation, due to its rotation equivariance property," Cohen explained.
As the team continued to study various models, they confirmed their approach could address equivariance issues, also known as covariance in physics. In other words, the same data presented in different ways or collected by different systems produced the same results. Then, when they analyzed climate data, they found that conventionally trained CNNs resulted in 74% accuracy in identifying extreme weather patterns, such as cyclones. The same data run through a geometric learning gauge CNN they built detected storms with nearly 98% accuracy.
As researchers attempt to develop models that detect and predict events in biology, chemistry, and physics, the ramifications are clear. "There are a huge number of remarkable insights to be gained by applying ideas used in physics and mathematics to produce new deep learning models," Welling explained. Although the technology is still in the nascent stages, it already is showing remarkable potential. Bronstein said the approach could revolutionize everything from materials science to medicine, and even social media. It could help scientists discover new combinations of compounds that lead to new types of antibiotics and more effective cancer drugs.
The advantages don't stop there, however. Geometric deep learning can disregard nuisance variations that cause conventional CNNs to go completely haywire. "A standard convolutional neural network can recognize visual patterns regardless of how they are shifted in the image plane, but can easily get confused by rotated patterns," Cohen said.
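The contrast Cohen describes can be reproduced on a toy grid. The sketch below is a deliberately simplified stand-in, not the gauge-equivariant or steerable networks built by the Amsterdam team: it assumes a square image with wrap-around edges and only 90-degree rotations, and it gains rotation robustness by pooling one filter's response over its four rotated copies. All names and values are hypothetical.

```python
def rot90(grid):
    """Rotate a 2D grid by 90 degrees counterclockwise."""
    return [list(row) for row in zip(*grid)][::-1]


def best_response(image, kernel):
    """Strongest filter response over all positions (wrap-around edges)."""
    n, m = len(image), len(kernel)
    return max(
        sum(kernel[a][b] * image[(y + a) % n][(x + b) % n]
            for a in range(m) for b in range(m))
        for y in range(n) for x in range(n)
    )


def rotation_pooled_response(image, kernel):
    """Best response over the filter and its three 90-degree rotations."""
    responses = []
    for _ in range(4):
        responses.append(best_response(image, kernel))
        kernel = rot90(kernel)
    return max(responses)


# Hypothetical 6x6 image containing a short horizontal bar.
image = [[0] * 6 for _ in range(6)]
for x in (1, 2, 3):
    image[2][x] = 1
rotated_image = rot90(image)

# Hypothetical horizontal-line detector.
kernel = [[-1, -1, -1],
          [2, 2, 2],
          [-1, -1, -1]]

plain = (best_response(image, kernel), best_response(rotated_image, kernel))
pooled = (rotation_pooled_response(image, kernel),
          rotation_pooled_response(rotated_image, kernel))

assert plain[0] != plain[1]    # plain filter: response changes under rotation
assert pooled[0] == pooled[1]  # pooled over rotations: response is unchanged
```

The plain filter responds strongly to the bar in one orientation and weakly in the other, so a conventional network would need to see both orientations during training. Pooling over the rotated filter copies gives the same response either way, with no extra training data.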
Not surprisingly, challenges remain in developing geometric deep learning systems that are fully equipped to solve real-world problems. Bronstein said that for now, scalability is a key factor limiting industrial applications. "Real-life applications often have to deal with very large graphs with hundreds of millions of nodes and billions of edges, such as Twitter and Facebook social graphs. So far, the focus of academic research in geometric deep learning has been primarily on developing new models, and these important aspects have until recently been almost completely ignored. As a result, many graph neural network models are completely inadequate for large-scale settings."
Another crucial factor limiting geometric deep learning is that real systems are not static; they evolve in time, and hence require methods capable of dealing with dynamic graphs. "This topic has also been only scarcely addressed in the literature," Bronstein said.
Still another obstacle is developing chips and hardware specifically designed to tackle geometric deep learning. Today's systems use graphics processing units (GPUs) and central processing units (CPUs), which are ideal for conventional CNNs operating on a stream of pixels. However, they are not necessarily the best fit for graph-structured data, where data can come in random order. "In the long run, we might need specialized hardware for graphs," Bronstein said.
Nevertheless, the field continues to gain traction. Scientists are turning to geometric deep learning to explore complex problems that require highly precise results. Among those particularly interested in the field are physicists and chemists who work with large and wildly disparate data sets based on foundational data structures that are known in advance. Geometric deep learning greatly increases their ability to understand molecular structures, cosmological maps, and Feynman diagrams with pictorial representations of extraordinarily complex 3D subatomic particles.
Concludes Welling, "Geometric deep learning and gauge-equivariant CNNs are likely to emerge as standard tools in the data science toolkit. They are advancing rapidly because there's a growing recognition they can tackle new and entirely different sets of problems."
©2021 ACM 0001-0782/21/1