The rise of artificial intelligence tools that draw on a wealth of content from the Internet has begun to test the limits of copyright law.
Authors and a leading photo agency have filed lawsuits over the past year, claiming that their intellectual property was used illegally to train AI systems, which can produce human-like prose and power applications such as chatbots.
Today, the news industry is joining them in the spotlight. The New York Times filed a lawsuit Wednesday accusing OpenAI and Microsoft of copyright infringement, the first such challenge by a major U.S. news organization over the use of artificial intelligence.
The lawsuit claims that OpenAI’s ChatGPT and Microsoft’s Bing Chat can produce content nearly identical to Times articles, allowing the companies to “profit for free from the Times’ massive investment in its journalism by using it to create substitute news products without authorization or payment.”
OpenAI and Microsoft have not yet had the opportunity to respond in court. But after the lawsuit was filed, the companies indicated that they were in discussions with a number of news organizations about the use of their content – and, in the case of OpenAI, had begun signing agreements.
Without such agreements, limits could be set in court, with significant repercussions. Data is essential to the development of generative AI technologies – capable of generating text, images and other media on their own – and to the business models of the companies that do this work.
“Copyright will be one of the key points that shapes the generative AI industry,” said Fred Havemeyer, an analyst at financial research firm Macquarie.
A central consideration is the doctrine of “fair use” in intellectual property law, which allows creators to build on copyrighted works. Among other factors, defendants in copyright cases must prove that they have transformed the content in a substantial way and that they are not competing in the same market as a replacement for the original creator’s work.
A review citing passages from a book, for example, could be considered fair use because it draws on that content to create a new and unique work. On the other hand, selling lengthy extracts from the book may violate the doctrine.
Courts have not ruled on how these standards apply to AI tools.
“There is no clear answer as to whether or not this is copyright infringement in the United States or whether it is fair use,” said Ryan Abbott, an attorney at Brown Neri Smith & Khan who handles intellectual property matters. “Meanwhile, numerous lawsuits are underway, with potentially billions of dollars at stake.”
It may be some time before the industry gets definitive answers.
The lawsuits asking these questions are in the early stages of litigation. If they don’t reach settlements (as most litigation does), it could be years before a federal district court rules on the issue. These decisions would likely be appealed, and appeal decisions could vary by circuit, potentially raising the issue before the U.S. Supreme Court.
Getting there could take about a decade, Mr. Abbott said. “A decade is an eternity in the market we’re in now,” he said.
The Times said in its complaint that it had been in talks with Microsoft and OpenAI about terms to resolve the dispute, possibly including a license. The Associated Press and Axel Springer, the German owner of media outlets like Politico and Business Insider, recently reached data licensing agreements with OpenAI.
Bringing cases to court could answer vital questions about what copyrighted data AI developers can use and how. But it could also simply serve as leverage for a plaintiff to obtain a more favorable licensing deal through a settlement.
“Ultimately, whether or not this lawsuit ends up shaping copyright law will be determined by whether the lawsuit is truly about the future of fair use and copyright law, or whether it is a salvo in a negotiation,” Jane Ginsburg, a professor at Columbia Law School, said of the Times lawsuit.
The evolving legal landscape could shape the nascent but heavily capitalized AI industry.
Some AI companies have been flooded with venture capital over the past year, after the public release of ChatGPT went viral. A share sale under consideration could value OpenAI at more than $80 billion; Microsoft has invested $13 billion in the company and integrated its technology into its own products. But questions about using intellectual property to train models are a focus for investors, Mr. Havemeyer said.
Competition in AI can come down to who has data and who doesn’t.
Companies that own rights to large amounts of data, like Adobe and Bloomberg – or that have accumulated their own data, like Meta and Google – have started developing their own AI tools. Mr. Havemeyer pointed out that an established company like Microsoft was well-equipped to enter into data licensing deals and navigate legal challenges. But startups with less capital might have a harder time getting the data they need to compete.
“Generative AI begins and ends with data,” Mr. Havemeyer said.
Benjamin Mullin contributed reporting.