Who Owns AI’s Output?

In November 2022, ChatGPT took the world by storm, demonstrating that a chatbot could be sufficiently refined to be practical and useful (unlike earlier attempts like Microsoft’s disastrous Tay chatbot back in 2016).

Now, after just two short years, generative AI technology has advanced by leaps and bounds, progressing quickly from simple text and image generation tools to advanced multimodal models that produce highly competent outputs across text, images, video, audio, and code.

Generative AI is not just generating content and art and code, however. It also is generating tons of problems for business and society, especially when it comes to figuring out who owns the outputs produced by these systems.

After all, individuals and companies are increasingly using generative AI to generate sophisticated, commercially valuable outputs. And that includes outputs that are valuable enough to potentially copyright (in the case of a produced work) or patent (in the case of inventions).

But consider: can you protect an AI-generated work or invention? And how are existing works protected from being used by AI to produce its outputs?

These are seemingly simple questions, but not questions with easy answers.

That is because countries are scrambling to effectively legislate and regulate the ownership and usage of AI-produced works and ideas—and keep up with rapidly changing AI capabilities and systems. Currently, countries take very different approaches to whether or not individuals and companies can copyright or patent AI outputs. And, in the process, they’re sometimes raising more questions than they answer, especially about deeper issues like human creativity and ingenuity. The result is that the landscape for AI copyright and patent law looks a lot like the overall landscape for AI technology: confusing, chaotic, and changing at lightspeed.

Here’s where that legal landscape stands today.

Different countries, different rules

For the birthplace of ChatGPT and many other core generative AI technologies, the U.S. today has a fairly strict interpretation of AI-generated copyright and patent protection.

Currently, the U.S. does not allow for solely AI-generated outputs to be copyrighted or patent-protected, according to Ryan Abbott, an AI law researcher at the U.K.’s University of Surrey. A copyright, a protection conferred on original creative and artistic works, requires human authorship in the U.S., according to the U.S. Copyright Office (USCO). A patent, which protects inventions and discoveries, requires that a human make a “significant contribution” to the invention in order for it to be patentable.

“This has been interpreted to mean that works or inventions created by AI are not protectable at this time in the United States, even though it has not been codified in the law,” said Elizabeth Rothman, an attorney who specializes in technology law. She noted the perspectives of U.S. regulators are largely rooted in the idea that both the copyright and patent systems were developed to incentivize and protect human creativity and innovation.

Though these rules are not yet codified into law, they do appear to be driving decision-making from regulators and courts. In one high-profile example, Stephen Thaler challenged both the USCO and the U.S. Patent and Trademark Office (USPTO) in court when his applications for copyrights and patents on AI-generated outputs were rejected. In both cases, a court ruled in favor of government regulators, and both offices have denied applications that claim a work or invention was solely created by AI, said James Grimmelmann, a professor of digital and information law at Cornell University.

“That seems very unlikely to change in the short term,” he said. “However, the U.S. position is less hardline than many people seem to think. In a case where it’s clear the AI provided substantial assistance to a human creator, and the human creator is willing to acknowledge their own role, I think both a copyright and a patent could be obtained based on the human contributions.”

This mirrors several Chinese court rulings on generative AI and copyright that take a similarly flexible approach, according to Grimmelmann. It also raises an even more important point: Current U.S. regulations are not yet set in stone, and they differ significantly from how things are done elsewhere in the world.

Many other countries take very different approaches to copyright and patent rules than the U.S. does. The U.K., Ireland, South Africa, India, New Zealand, and Ukraine, for instance, all have laws that allow the copyright protection of AI-generated works to some degree, said Abbott.

The U.K. has allowed copyright protection of computer-generated works, which include AI-generated works, since 1988, with the copyright going to the person who arranged for the work to be created, said Rothman. However, she noted, there is ongoing debate about whether these regulations should continue to protect computer-generated works today.

For patents, most countries around the world have agreed that AI cannot be listed as the inventor on a patent application, said Rothman. However, it is possible in some jurisdictions that AI-generated inventions may be patentable if there is a human or entity named as the inventor. (For instance, Germany and South Africa do allow patents on AI-generated inventions.)

Copyright and AI training data

Open questions surround not only how AI outputs can or can’t be protected, but also how AI systems were trained using protected works.

All generative AI technologies require vast amounts of data to train on in order to generate their outputs. In the case of a large language model of the type that powers ChatGPT, that system is trained on large quantities of text and material from the Internet. In turn, generative AI systems that produce images, video, and code are trained on large quantities of those materials.

In many, perhaps even most, cases, AI technologies were trained on materials without the express permission of the creator, ingesting large swaths of data from the public Internet.

As a result, lawsuit after lawsuit has been filed against companies like OpenAI (maker of ChatGPT) for using copyrighted work in AI training without permission. The highest-profile suit comes from The New York Times, which alleges the company stole its material to train ChatGPT. There also have been many instances where companies have been shown to have used data without permission, even if they are not yet being sued for it. (The most recent high-profile example involves companies like Apple, Runway, and Anthropic being caught scraping YouTube videos for training, a clear violation of the site’s policies.)

This uncomfortable fact of AI training has countries scrambling to get ahead of regulating how AI companies can and cannot use protected works.

One of the stricter regimes is taking shape in the European Union, which passed its long-awaited AI Act in 2024. The AI Act is sweeping, comprehensive legislation that establishes rules and regulations about how individuals and organizations within the bloc approach AI. Among other regulations, the AI Act requires global companies to respect users’ decisions to opt out of having their material or data used to train AI, said Philipp Hacker, a digital law professor and researcher at European University Viadrina in Frankfurt (Oder).

Yet, that’s easier said than done, Hacker said. There’s no real standardization yet around what opting out really looks like, which makes it difficult for AI trainers to ascertain whether or not an individual has opted out of training. Already, litigation is underway because opt-outs have been overlooked. This measure also conflicts with some core tenets of international copyright law, according to Hacker.

“If the EU now wants everyone to respect EU copyright law, other countries could make similar moves, so that European providers would simultaneously have to respect Chinese, U.S., and other copyright laws if they want to offer their models in those jurisdictions,” he said. “That would wreak absolute havoc internationally.”

Japan seems to be trying to avoid that fate by taking a very different approach. The country has established what are widely considered some of the more permissive rules in the world around the issue. Japan has a broad text and data mining exception to its copyright laws, said Abbott. That means Japan allows AI companies to ingest copyright-protected materials in order to train their systems, even in a commercial context.

Despite the differences in these approaches, one common thread runs through how every country is attempting to regulate AI copyright and patent considerations:

Everyone is trying their best.

The very recent advancements in generative AI have been rapid and exponential. That presents unique challenges to regulators—and the companies adhering to regulations.

“AI is rapidly changing, and the issues that seemed most significant even a year or two ago are no longer the ones that present the greatest challenges,” said Grimmelmann. “The danger of laying down legislation or regulation is that it could create a rule that works for some types of AI but not for others, or that applies to an older generation of AI technologies.”

Rothman agreed. “There are a lot of ongoing and evolving legal issues in this area,” she said. “That includes the need to establish clear guidelines to define the line between human and machine-generated content, addressing ethical concerns of using training data, international harmonization of rules and laws, establishing transparency standards, and realistically considering the use and economic impact of the technology.”

January 2025 Issue

Communications of the ACM (CACM) is now a fully Open Access publication.
