Artificial Intelligence and Machine Learning News

AI Rewrites Coding

Even as software grows increasingly complex, artificial intelligence helps to simplify and automate coding tasks.


Computer code intersects with almost every aspect of modern life. It runs factories, controls transportation networks, and defines the way we interact with personal devices. It is estimated that somewhere in the neighborhood of 2.8 trillion lines of code have been written over the last two decades alone.a

Yet it is easy to overlook a basic fact: people have to write software, and that is often a long, tedious, and error-prone process. Although low-code and no-code environments have simplified things, even allowing people who are not developers or data scientists to build software through drag-and-drop interfaces, they still require considerable time and effort.

Enter artificial intelligence (AI). Over the last several years, various systems and frameworks have appeared that can automate code generation. For example, Amazon has developed CodeWhisperer, a coding assistant tool that automates coding in Python, Java, and JavaScript. GitHub’s Copilot autogenerates code through natural language, and IBM’s Project Wisdom is focused on building a framework that allows computers to program computers.

“As software becomes more complex and moves into the realm of non-developers and non-data scientists, there’s a need for systems that can simplify and automate coding tasks,” says Ruben Martins, an assistant research professor at Carnegie Mellon University. Adds Abraham Parangi, co-founder and CEO of Akkio, a firm that offers AI-assisted coding tools, “People have been working on these tools for many years. Suddenly, the trajectory is going vertical.”

Although it is unlikely AI will eliminate jobs for developers anytime soon, it is poised to revolutionize the way software is created. For instance, OpenAI has introduced DALL-E 2, a tool that generates photorealistic images and art through natural language. In addition, the OpenAI Codex builds software in more than a dozen programming languages, including Python, Perl, Ruby, and PHP.

Observes Ruchir Puri, chief scientist for IBM Research, “The ability for computers to write code—and even program other computers—has the potential to fundamentally reshape the way we work and live.”


Abstracting the Code

The idea of automating coding tasks is not new or particularly revolutionary. From punch cards to today’s vast open source code libraries, the need to construct software from scratch has steadily declined. In recent years, low-code and no-code environments—which typically allow a person to drag-and-drop elements that represent pre-established tasks or functions—have greatly simplified software development, while expanding who can produce software.

Yet the emerging crop of AI tools turbocharges the concept. In some cases, these platforms anticipate tasks and suggest blocks of code—similar to the way applications now autopredict words and phrases in email and other documents. In other cases, they actually generate images, functions, and entire websites based on natural language input, or they suggest coding actions based on what the AI believes should happen next.

For example, Akkio’s platform allows humans to build machine learning and other AI models for things like forecasting, text classification, and lead scoring, without ever interacting with code. It is a simple drag-and-drop proposition en route to a tool or app. “This makes it possible for people who have no knowledge of coding to accomplish all sorts of reasonably complicated tasks—and produce code without the formidable barriers of the past,” Parangi explains.

Amazon’s CodeWhisperer also is designed to serve as a machine learning-powered coding companion for software developers. It analyzes existing code structure, as well as standard written comments residing in an integrated development environment (IDE), and generates up to 15 lines of code at one time. These include entire functions and logical blocks.
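In practice, this comment-driven workflow looks something like the following sketch: a developer writes a plain-English comment describing intent, and the assistant proposes a short, self-contained block. The prompt and the generated function below are illustrative of the pattern, not actual CodeWhisperer output.

```python
# Prompt (written by the developer as an ordinary code comment):
# "parse a log line like '2023-01-15 ERROR disk full' into its parts"

# A suggestion of roughly the size such tools emit (a dozen lines or so):
def parse_log_line(line: str) -> dict:
    """Split a log line into date, level, and message fields."""
    date, level, message = line.split(" ", 2)
    return {"date": date, "level": level, "message": message}

print(parse_log_line("2023-01-15 ERROR disk full"))
# {'date': '2023-01-15', 'level': 'ERROR', 'message': 'disk full'}
```

The developer accepts, edits, or rejects the suggestion inside the IDE, just as with word-level autocomplete.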

GitHub’s Copilot automates code generation through IDE platforms such as Visual Studio Code, Visual Studio, Neovim, and JetBrains. It makes suggestions across dozens of programming languages.

Then there is IBM’s Project Wisdom, which aims to leap beyond 10 or 15 lines of autogenerated code and build an AI framework that can produce entire components, services, and applications within hours, rather than days or weeks. It automatically generates YAML code on the Red Hat Ansible platform through a natural-language interface. The goal, Puri says, is to “move beyond structured rules and mechanical processes and develop an AI framework that’s capable of understanding machine language.”
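To make the Ansible case concrete, a natural-language request such as "install nginx and make sure it is running" could plausibly map to a playbook like the one below. The task names, host group, and module choices are illustrative assumptions, not actual Project Wisdom output.

```yaml
# Illustrative Ansible YAML for the request
# "install nginx and make sure it is running"
- name: Install and start nginx
  hosts: webservers
  become: true
  tasks:
    - name: Install nginx
      ansible.builtin.package:
        name: nginx
        state: present
    - name: Ensure nginx is running
      ansible.builtin.service:
        name: nginx
        state: started
        enabled: true
```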


Maintaining an Image

The splashiest examples of AI-generated code come from OpenAI, however. Type a few words into the DALL-E application—anything from “photorealistic chateau next to a river in France” to “woman sitting at a beach Picasso style painting”—and the application spits out images that look as though they were produced by a talented human artist or photographer. The platform deftly combines concepts, attributes, and styles to create the images.

The OpenAI Codex also is pushing the boundaries of conventional software development. It can build simple apps from natural language commands. For example, a software designer might instruct the system in plain English to write an app that can process a product return and issue a label. After providing basic parameters and a description of the graphics, menus, and buttons needed, the OpenAI Codex can generate the code. Although the system is not perfect, a human can typically review the results and apply minor fixes where needed.
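As a sketch of the product-return example above, the function below shows the kind of starting point such a prompt might yield before human review. The function name, fields, and the label scheme are hypothetical, chosen only to illustrate the review-and-fix workflow.

```python
# A hypothetical sketch of generated code for the plain-English prompt
# "write an app that can process a product return and issue a label."
def process_return(order_id: str, reason: str) -> dict:
    """Record a product return and issue a return-label identifier."""
    if not order_id:
        raise ValueError("order_id is required")
    label = f"RMA-{order_id}"  # simplistic label scheme a human might refine
    return {"order_id": order_id, "reason": reason, "label": label}

print(process_return("A1001", "wrong size"))
# {'order_id': 'A1001', 'reason': 'wrong size', 'label': 'RMA-A1001'}
```

A human reviewer would then tighten validation, wire in real shipping APIs, and correct any logic the generator got wrong.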


Meanwhile, various other companies, including the likes of Microsoft, Diffblue, and DeepMind (acquired by Google in 2014), continue to explore and develop AI coding as part of their platforms or as discrete AI engines. In Microsoft’s case, it has its own tool called Power Platform, but it also has helped fund GitHub’s Copilot, along with the private initiative OpenAI. “The enormous amount of data that we can feed into neural nets has changed the playing field. We’re seeing platforms that can generate code from scratch—through natural language models,” Martins explains.

Unsupervised deep learning frameworks fuel this revolution. For example, Amazon trained the large language model (LLM) behind CodeWhisperer on both open source and Amazon-supplied data. The OpenAI Codex is derived from more than 700 gigabytes of data collected from the Web and other sources. IBM initially plugged in about a third of a gigabyte of Ansible information technology-centric data, as well as its own GitHub and Yammer data; the total size of the data sources is approximately 10 gigabytes.

Still, AI coding is not without its challenges and controversies. While no one is particularly concerned that developers, website designers, and others will join the unemployment line anytime soon—these platforms serve as assistive tools for now—it is not outside the realm of possibility that within a decade or two, they will begin to replace humans at many tasks. Machines are gaining an increasingly impressive grasp of natural language, and the semantics and logical sequences that produce software.


AI Coding Is Not Picture Perfect

A bigger concern revolves around the quality of the code these systems generate—and an inability to peer into many of these AI models and fully understand how they are built. It is one thing to use predictive tools to suggest code to developers; it is entirely another thing to fully hand off the task to a computer. In many situations, such as software that controls an airplane, medical device, or an autonomous vehicle, coding errors could prove dangerous or even deadly. In banking or healthcare systems, automated coding errors could result in major disruptions.

Yet, because AI does not truly grasp the meaning of things—it can generate natural language text and even computer code that appears to make sense, but contains logic, syntax, and other errors—the lack of close human review ratchets up the risks. As Stuart M. Shieber, James O. Welch, Jr. and Virginia B. Welch Professor of Computer Science at Harvard University, puts it, “There is no consciousness or sentience. It’s simply a machine making highly probable predictions.” Indeed, AI can lead to mistakes in binding, operational variables, and consistency. What’s more, there’s no certainty that a program will produce the same code over time—even with the same input or request. As a result, for the foreseeable future, many of these systems are not completely trustworthy, if not risky, to deploy in critical situations.
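A constructed example of the failure mode described above: the snippet below reads plausibly and runs without a syntax error, yet its logic is wrong. This is illustrative of the class of error, not output from any particular system.

```python
# Plausible-looking but wrong: the intent is an average over all values,
# yet the divisor is the last loop index, i.e. len(values) - 1.
def average_buggy(values):
    total = 0
    for i, v in enumerate(values):
        total += v
    return total / i  # bug: should be len(values)

# The corrected version a human reviewer would substitute.
def average(values):
    return sum(values) / len(values)

print(average_buggy([2, 4, 6]))  # 6.0 — wrong
print(average([2, 4, 6]))        # 4.0 — right
```

Nothing about the buggy version looks alarming at a glance, which is precisely why unreviewed generated code is risky in critical systems.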

In fact, enthusiasm about the technology notwithstanding, “AI coding still has a lot of limitations and a long ways to go,” Martins says. “It still makes many mistakes—including critical errors.” Among the trickiest are syntactic anomalies that humans do not easily detect. Just as a reader might pass over “threw” for “through” or “witch” for “which,” an AI system can incorporate lookalike errors into code, with sometimes serious consequences.
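A classic lookalike of this kind in Python is typing `=+` where `+=` was meant: both parse cleanly, but the first silently discards the accumulator. The example is a generic illustration of the homophone analogy, not a documented failure of any specific tool.

```python
# "=+" is valid syntax (assign the unary-plus of x), so this passes
# every syntax check while quietly computing the wrong thing.
def total_spent(amounts):
    total = 0
    for x in amounts:
        total =+ x  # looks like "+=", actually means "total = +x"
    return total

def total_spent_fixed(amounts):
    total = 0
    for x in amounts:
        total += x  # the intended accumulation
    return total

print(total_spent([10, 20, 30]))        # 30 — only the last value survives
print(total_spent_fixed([10, 20, 30]))  # 60
```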

As a result, Martins has focused his research on developing a framework that can spot and fix problems through a technique called program synthesis. It relies on advanced validation tools that incorporate structural elements and semantic patterns.b “The validation process occurs using formal methods and mathematics,” Martins explains. Although the method works well on both human and AI-generated code—and it shows a great deal of promise for building a repair framework—it is currently limited to reviewing fewer than 100 lines of code at a time.
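The core loop of program synthesis can be illustrated with a toy sketch: enumerate candidate programs and keep the first one that satisfies every input/output example. This brute-force version is only a teaching device; real frameworks like the one described above rely on formal methods and far smarter search, not exhaustive enumeration.

```python
# A toy synthesize-then-verify loop: candidate expressions are checked
# against input/output examples, and the first consistent one is returned.
CANDIDATES = {
    "x + 1": lambda x: x + 1,
    "x * 2": lambda x: x * 2,
    "x * x": lambda x: x * x,
    "x - 1": lambda x: x - 1,
}

def synthesize(examples):
    """Return the first candidate consistent with all (input, output) pairs."""
    for name, fn in CANDIDATES.items():
        if all(fn(inp) == out for inp, out in examples):
            return name
    return None  # no candidate in the search space fits

print(synthesize([(2, 4), (3, 6)]))  # "x * 2"
print(synthesize([(2, 4), (3, 9)]))  # "x * x"
```

The same verify step can be run against human- or AI-written code, which is what makes the approach attractive as a repair framework.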


Another issue is the blurry legal and ethical line that AI coding systems introduce. Because these deep learning systems often are built from numerous sources of data containing both publicly available and copyrighted content, problems can ensue. For instance, the DALL-E system’s ability to generate artificial images in the style of the creator has attracted the wrath of contemporary artists who say their works and portfolios—which can take years to create—are diminished in an instant as a result of copycat images.c


Blurred Lines

Things become even more complicated when copyrighted material—including publicly available source code scraped from the Internet—is fed into a deep learning model that builds actual software applications. For instance, in October 2022, software developer and designer Matthew Butterick accused GitHub of improperly training Copilot with his intellectual property and other people’s copyrighted works. He subsequently threatened to file a lawsuit.d The accusation also ensnared Microsoft, which partly funded the project and promotes the tool.e

To be sure, many questions remain unanswered, including whether it is possible to reach a point where programming languages simply are not necessary. In the meantime, it is clear automated AI coding will play an increasingly prominent role in the way developers, data scientists, and ordinary people create software and other content. “AI coding is destined to become even more powerful—and influential,” says Akkio’s Parangi. “We’re already building very powerful and capable AI code-generation models that are making a clear impact.”

Further Reading

Sarkar, A., Gordon, A.D., Negreanu, C., Poelitz, C., Ragavan, S.S., and Zorn, B.
What is it like to program with artificial intelligence? Proceedings of the 33rd Annual Conference of the Psychology of Programming Interest Group (PPIG 2022), August 12, 2022.

Chen, M. et al.
Evaluating Large Language Models Trained on Code, July 7, 2021. Cornell University.

Li, Y. et al.
Competition-Level Code Generation with AlphaCode. March 16, 2022. DeepMind.

Feng, Y., Martins, R., Bastani, O., and Dillig, I.
Program synthesis using conflict-driven learning, ACM SIGPLAN Notices, Vol. 53, Issue 4, April 2018, pp 420–435.

Vasconcelos, H., Bansal, G., Fourney, A., Liao, Q.V., and Vaughan, J.W.
Generation Probabilities are Not Enough: Improving Error Highlighting for AI Code Suggestions, 2022.

