Authors' lawsuit against OpenAI could 'fundamentally reshape' artificial intelligence, according to experts

Posted By: Ashley Smith
September 25, 2023 @ 7:05 pm
(NEW YORK) — A group of prominent authors joined a proposed class action lawsuit filed against OpenAI over allegations that products like ChatGPT make illegal use of their copyrighted work, setting off a high-profile legal clash.

While the lawsuit follows a series of similar legal challenges, it features a roster of well-known plaintiffs including authors George R.R. Martin and Jodi Picoult. The case targets a company at the center of a wave of artificial intelligence-driven programs that can instantaneously suggest recipes, compose poems and muse over existentialism.

“At the heart of these algorithms is systemic theft on a massive scale,” the lawsuit claims.

The case could fundamentally shape the direction and capabilities of generative AI, either imposing a new set of limits on a mechanism at the core of the technology or cementing an expansive approach to online material that has fueled the rise of products currently offered, legal analysts told ABC News.

“If anyone is going to win on the straight-up copyright infringement claims against OpenAI, this is probably the lawsuit that has the best chance of it,” James Grimmelmann, professor of digital and information law at Cornell University Law School, told ABC News.

Grimmelmann described the legal filing as a “well-drafted complaint” that presents compelling arguments over copyright infringement while avoiding murkier concerns over trademark issues or privacy.

In a statement to ABC News, an OpenAI spokesperson said the company has held constructive discussions in general with creators and remains confident its technology will prove beneficial to them.

“Creative professionals around the world use ChatGPT as a part of their creative process. We respect the rights of writers and authors, and believe they should benefit from AI technology,” the spokesperson said.

“We’re having productive conversations with many creators around the world, including the Authors Guild, and have been working cooperatively to understand and discuss their concerns about AI. We’re optimistic we will continue to find mutually beneficial ways to work together to help people utilize new technology in a rich content ecosystem,” the spokesperson said.

Here’s what to know about the class action lawsuit brought by authors against OpenAI, and what it may mean for the future of artificial intelligence.

What are the authors claiming in the lawsuit?

Generative AI programs, such as ChatGPT, respond to user prompts through an algorithm that selects words based on lessons learned from scanning billions of pieces of text across the internet.

The primary argument made in the lawsuit brought by the authors, in turn, centers on the alleged illegal use of copyrighted material for the training of the AI models, Pamela Samuelson, a professor at the University of California, Berkeley Law School who specializes in the overlap between technology and copyright, told ABC News.

“The big claim is that the ingestion of works of authorship as training data is itself a reproduction of the works,” Samuelson said.

Lacking permission to use the copyrighted work, OpenAI scans and makes use of the writing, which helps foster work that publishers would otherwise pay authors to create, the lawsuit alleges.

Questions remain over the exact set of data that OpenAI uses to train its products, including whether and to what extent the company draws on copyrighted material, Brian Buckmire, an ABC News legal contributor and former public defender with the Legal Aid Society, told ABC News.

“We know how copyright infringements operate but we don’t know how these data sets work. We don’t even have the ability to look under the hood to see what type of information they are and are not using,” Buckmire said. “This lawsuit could open the Pandora’s box, so to speak, to give light to what’s going on.”

OpenAI did not respond to ABC News’ request for comment about the datasets.

A similar lawsuit brought against OpenAI in July by comedian and actress Sarah Silverman and other authors alleged that the company scanned her 2010 memoir “The Bedwetter” without her permission. Silverman also filed a suit over an AI product released by Meta, the parent company of Facebook.

In response to the claim alleging the illegal use of copyrighted material, OpenAI may argue that any alleged copying of protected works falls within an exception to copyright protection known as “fair use,” which allows for the limited reproduction of text for uses like commentary or criticism, Grimmelmann said.

In this vein, Grimmelmann added, OpenAI may defend its alleged use of authors’ work as part of an effort to create separate, original writing rather than to regurgitate identical text.

“Fair use is famously open-ended,” Grimmelmann said.

Last week, Meta and OpenAI filed separate motions to dismiss the cases brought by Silverman. Both filings cited “fair use” in defense of company conduct.

Attorneys for Meta argued that “fair use” protections apply to the company’s use of material for the training of its AI product, Llama.

“Copyright law does not protect facts or the syntactical, structural, and linguistic information that may have been extracted from books like Plaintiffs’ during training,” the attorneys said. “Use of texts to train Llama to statistically model language and generate original expression is transformative by nature and quintessential fair use.”

Similarly, attorneys arguing on behalf of OpenAI said that AI-driven chatbots such as ChatGPT, also known as large language models, amount to a novel technological use of copyrighted material that does not violate the law.

“At the heart of Plaintiffs’ Complaints are copyright claims,” attorneys for OpenAI said. “Those claims, however, misconceive the scope of copyright, failing to take into account the limitations and exceptions (including fair use) that properly leave room for innovations like the large language models now at the forefront of artificial intelligence.”

What are the potential implications of the lawsuit?

The implications of the lawsuit will depend on how broadly the court chooses to interpret the challenge brought by the authors, as well as the outcomes of other similar cases, Samuelson and Grimmelmann said.

However, the case could also hold profound implications for the language-training mechanism on which text bots across the industry rely, they added.

“If the plaintiffs’ claims and their arguments get upheld in full generality then it really does fundamentally reshape the industry,” Grimmelmann said. “If the plaintiffs in this case are right and they get everything they want, then you can’t just scrape the entire web, use all of the existing big data sets and train a model.”

The decision could force AI companies to gain permission from authors and publishers for the use of their work, giving way to potential negotiations over licensing deals between the two sides, Grimmelmann said.

If OpenAI prevails, on the other hand, it could pave the way for private individuals or firms to widely scan the internet and establish AI models based on the results, Grimmelmann added.

“If the AI companies win really broadly and all of the claims get dismissed, it basically means anybody can create an AI model by training it on almost any data they can find,” Grimmelmann said.

The decision could shape the information marketplace, Grimmelmann added.

“This is the biggest challenge to the assumptions that the copyright system makes since the rise of the internet or maybe the rise of mass media,” he said.