Computer code now touches almost every aspect of our lives. Worldwide, 27 million developers churn out billions of lines of code every day. Yet, despite an abundance of open source libraries and increasingly sophisticated development tools, the task is time-consuming and prone to errors.
As a result, researchers are studying ways to introduce artificial intelligence (AI) into coding processes. While much of the effort centers on automating coding tasks, spotting bugs, fixing vulnerabilities, and producing more elegant code, there's also an emerging effort to tap AI to write code based on short text descriptions of what the code should do.
"There's interest in improving current coding practices and generating code through machine learning and AI models," says Brendan Dolan-Gavitt, an assistant professor of computer science and engineering at New York University (NYU) Tandon School.
Adds Furkan Bektes, founder and CEO of SourceAI, which has developed a tool to write code based on short natural language input, "The use of AI will allow developers to code faster, and allow non-developers to pursue their ideas."
The complexities of today's coding processes are not lost on anyone. A Boeing 787 Dreamliner has approximately 20 million lines of code. Major software programs and games have upwards of 50 million lines of code. Somewhere between functionality and chaos lies the real-world task of producing code rapidly and as bug-free as possible.
AI is taking direct aim at the challenge. "Ideally, AI could intervene, examine patterns and provide feedback about coding errors," says Shashank Srikant, a Ph.D. student in the Department of Electrical Engineering and Computer Science at the Massachusetts Institute of Technology (MIT). "This could help coders avoid traps that others have fallen into."
Google, Microsoft, and others have already begun experimenting with AI for assisted coding. For instance, in 2018, Microsoft introduced AI-assisted coding for Java, Python, C++, and other languages through Visual Studio IntelliCode. It offers developers relevant coding suggestions based on thousands of the most popular open source projects on GitHub. Through machine learning, it analyzes common usage patterns and practices, and delivers suggestions tailored to a specific project.
However, in May 2020, the field took a giant leap forward. OpenAI introduced GPT-3, a next-generation neural-network language model, which has already been used to build apps, including their buttons, colors, and input fields. When researchers and coding experts began testing the model, they realized that it could also write code.
Bektes, among the first to gain access to the platform, fed high-quality code samples into GPT-3 and built an application that generates code in any programming language, using input in English and most other major languages. For example, a user might say, "Calculate factorial of number given by user," and SourceAI spits out the code.
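To make the factorial prompt concrete, here is a minimal sketch in Python of the kind of program such a request might plausibly yield. It is a hand-written illustration, not actual SourceAI or GPT-3 output, and the function names `factorial` and `factorial_from_user_text` are hypothetical.

```python
def factorial(n: int) -> int:
    """Return n! for a non-negative integer n."""
    if n < 0:
        raise ValueError("factorial is not defined for negative numbers")
    result = 1
    for k in range(2, n + 1):
        result *= k
    return result


def factorial_from_user_text(text: str) -> int:
    """Parse text typed by a user and return the factorial of that number.

    In an interactive script, `text` would typically come from input().
    """
    return factorial(int(text.strip()))


print(factorial_from_user_text("5"))  # prints 120
```

Even a prompt this short leaves decisions open, such as how to treat negative or non-numeric input; the sketch rejects both with a `ValueError`, which is one reasonable choice among several.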
Other tools incorporating AI are also popping up. For instance, Tabnine can autofill lines and functions as developers type.
Machine-generated coding could be a game changer, although it's unlikely to supplant the need for developers anytime soon. "It will open new horizons," Bektes explains. "There are many non-developers who have ideas but don't know how to code, and there are also developers who are experts in one language but not in others. AI can help them learn to code in other languages."
Despite optimism about AI coding, there are concerns about it being used for nefarious purposes, including producing new types of malware and other malicious code. Bektes says SourceAI is committed to preventing AI from being used for hacking and malware.
Of course, it is only one company. "There is potential for abuse," says Dolan-Gavitt, who creates AI-generated code that produces bugs for testing security software. For example, "It's conceivable that malware authors could generate numerous variants of the same malware to avoid detection."
There also are potential problems related to AI errors and biases that would likely show up in software. "Because AI models are sensitive to input data, it's very easy to wind up with bad results and even gibberish," Srikant points out. "There's a fundamental problem that these models haven't been designed to understand with a human capacity yet, so they may miss obvious problems." In fact, his group's research at MIT found that minor variations made to the code fed into these models can reduce their accuracy by 50%.
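The article does not say what form these minor variations take. As an illustration only, the sketch below assumes one plausible kind: a rename that changes how a program reads without changing what it does. It uses Python's standard `ast` module (`ast.unparse` requires Python 3.9 or later); the names `RenameVariable` and `rename_variable` are hypothetical and are not drawn from the MIT group's work.

```python
import ast


class RenameVariable(ast.NodeTransformer):
    """Rename every use of one identifier; program behavior is unchanged."""

    def __init__(self, old: str, new: str) -> None:
        self.old = old
        self.new = new

    def visit_Name(self, node: ast.Name) -> ast.Name:
        # Covers reads and writes of ordinary variables.
        if node.id == self.old:
            node.id = self.new
        return node

    def visit_arg(self, node: ast.arg) -> ast.arg:
        # Covers function parameters.
        if node.arg == self.old:
            node.arg = self.new
        return node


def rename_variable(source: str, old: str, new: str) -> str:
    """Return `source` with identifier `old` replaced by `new`."""
    tree = RenameVariable(old, new).visit(ast.parse(source))
    return ast.unparse(tree)


ORIGINAL = (
    "def total(values):\n"
    "    result = 0\n"
    "    for v in values:\n"
    "        result += v\n"
    "    return result\n"
)

# Same behavior, different surface text.
print(rename_variable(ORIGINAL, "result", "acc"))
```

To a human reader the two versions are plainly the same function. A model that leans heavily on surface tokens such as variable names has no such guarantee, which is the kind of sensitivity Srikant describes.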
Nevertheless, AI-assisted coding almost certainly will rewrite development practices over the next few years—particularly as researchers refine the algorithms that make AI-assisted coding possible. Bektes says AI language models are currently growing by 100x every two years, and it's only a matter of time until the technology goes mainstream.
Concludes Dolan-Gavitt, "Hurdles remain, but the idea of using AI in coding is becoming feasible."
Samuel Greengard is an author and journalist based in West Linn, OR, USA.