Technical Perspective: When the Adversary Is Your Friend

Most fundamental ideas in neural networks (rebranded in the 2010s as deep learning) are actually several decades old. It just took a while for the hardware, the data, and the research community to catch up. But if one asks what the most important new idea of the last decade is, the answer is, without a doubt, Generative Adversarial Networks (GANs). Like most good papers, the GAN paper certainly had some precursors, yet when it came out in 2014, there was a palpable sense that something new and exciting was afoot. After all, the paper was easy to like as it had all the right ingredients: a clever idea, nice math, an intriguing connection to evolution. And if the original paper didn’t dazzle with the visual quality of its results, the long string of follow-up works has shown the impressive power of the method, one that may have considerable impact beyond computing.

Most of the recent successes in machine learning have come from so-called discriminative models: given some input data, such as an image, these models look for the relevant bits and pieces of information to decide what it is. For example, the presence of stripes might suggest that an image contains a zebra. The alternative is generative models, which aim to approximate the process that generates the data. While a discriminative model would only tell you that something is a zebra, a generative model could actually paint you one.

However, generative models have not been very successful for real-world imagery, largely because it is difficult to automatically evaluate the generator. If we had a way to measure how good a model’s output is—known as an objective function or a “loss function”—we could optimize our generative model according to this metric. But how do you quantify whether a model does a good job at generating realistic new images that no one has ever seen before? The key insight of the following GAN paper is to learn the loss function at the same time as learning the generative model. This idea of simultaneously learning a generator and a discriminator in an adversarial manner has turned out to be extremely powerful. The model leads to vivid anthropomorphic analogies: some researchers explain GANs as a competition between two actors, like an artist and a critic, a student and a teacher, or a forger and a detective.
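To make the idea of a learned loss concrete, here is a minimal sketch of the adversarial objective as it is commonly written, V(D, G) = E[log D(x)] + E[log(1 - D(G(z)))], where D(x) is the discriminator's estimated probability that x is real. The discriminator tries to maximize V and the generator tries to minimize it. The function names and the sample discriminator outputs below are hypothetical, chosen only for illustration; a real system would compute these quantities over neural network outputs and update both networks by gradient steps.

```python
import math

def value_function(d_real, d_fake):
    """Sample estimate of V(D, G) = E[log D(x)] + E[log(1 - D(G(z)))].

    d_real: discriminator outputs in (0, 1) on real examples.
    d_fake: discriminator outputs in (0, 1) on generated examples.
    """
    real_term = sum(math.log(p) for p in d_real) / len(d_real)
    fake_term = sum(math.log(1.0 - p) for p in d_fake) / len(d_fake)
    return real_term + fake_term

def discriminator_loss(d_real, d_fake):
    # The discriminator (the "critic") maximizes V, so its loss is -V.
    return -value_function(d_real, d_fake)

def generator_loss(d_fake):
    # The generator (the "artist") minimizes V; only the second term
    # depends on the generated samples.
    return sum(math.log(1.0 - p) for p in d_fake) / len(d_fake)

# Hypothetical discriminator outputs, for illustration only.
# A critic that separates real from fake well has a low loss...
confident = discriminator_loss([0.9, 0.8], [0.1, 0.2])
# ...while a critic that cannot tell them apart outputs 0.5 everywhere.
fooled = discriminator_loss([0.5, 0.5], [0.5, 0.5])
```

In this sketch, the generator's loss falls as the discriminator assigns higher "real" probabilities to generated samples, which is the sense in which the discriminator acts as a loss function that is learned alongside the generator.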

Figure. Mario Klingemann, Do Not Kill the Messenger (2017); https://bit.ly/3iYhvxU

Upon initial publication, this paper led to dizzyingly fast advances in the quality and generality of GAN models; within a few years, researchers demonstrated the ability to generate seemingly infinite sets of new images that were virtually indistinguishable from the real thing. Moreover, learned adversarial losses turned out to be very useful in many other contexts, for example, providing “training wheels” for image editing that keep images realistic during the editing process.

GAN-based models could soon have considerable cultural and political impact on society, both positive and negative. Many notable artists, including Sofia Crespo, Scott Eaton, Mario Klingemann, Trevor Paglen, Jason Salavon, and Helena Sarin, have used GANs, and GAN art has appeared in several major galleries, festivals, and auction houses.^1,2 In fact, some of the power of GANs as artistic tools can be experienced using Joel Simon’s Artbreeder.com website. Many movie studios and startups are currently exploring technologies using GAN losses to create virtual characters, avatars, and sets, to provide new artistic tools for storytelling and communication. GANs could help us take better pictures and capture memories of the world in 3D, and perhaps someday our video teleconferencing will be improved by GANs that render us as realistic or as fanciful avatars in shared virtual spaces. At the same time, GAN-based techniques pose major concerns around misinformation and various malicious uses of DeepFakes, as well as various data biases in image synthesis algorithms and how they are used. In addition to being an important fundamental contribution to computing, GANs are at the vanguard of some of our hopes and fears for how imaging algorithms can transform society.