The Center for Investigative Reporting (CIR) has filed a lawsuit against OpenAI and Microsoft, alleging unauthorized use of its content to train AI models, sparking a wider debate on copyright and ethics in AI technology.
Short Summary:
- The CIR is suing OpenAI and Microsoft for allegedly using its content without permission or compensation.
- CIR claims this violates the Copyright Act and the Digital Millennium Copyright Act.
- This legal challenge reflects broader concerns about AI’s impact on journalism and intellectual property rights.
The Center for Investigative Reporting, the United States’ oldest nonprofit newsroom, has filed a lawsuit in federal court in Manhattan against technology giants OpenAI and Microsoft. The CIR claims these companies have appropriated and used its content to train AI models without proper authorization or compensation. Founded in 1977 in San Francisco, CIR merged with Mother Jones in February and remains a leading force in investigative journalism across multiple platforms, including its flagship program, Reveal.
“OpenAI and Microsoft started vacuuming up our stories to make their product more powerful, but they never asked for permission or offered compensation, unlike other organizations that license our material,” said Monika Bauerlein, CEO of CIR, in a statement. “This free rider behavior is not only unfair, it is a violation of copyright. The work of journalists, at CIR and everywhere, is valuable, and OpenAI and Microsoft know it.”
The lawsuit accuses OpenAI and Microsoft of repeated violations of the Copyright Act and the Digital Millennium Copyright Act. CIR asserts that the unauthorized use of its journalistic work not only infringes copyright but also threatens the public’s access to accurate information in an already shrinking news environment. Bauerlein particularly decried how generative AI companies treat independent publishers’ output as free raw material for their products.
It is a pivotal moment for journalism, as organizations like CIR reckon with the rapid rise of generative AI. CIR joins other prominent publishers, including The New York Times and The Intercept, in suing OpenAI on similar grounds. Other news publishers, by contrast, have opted to sign licensing agreements with OpenAI. These deals allow OpenAI to use the publishers’ archives and new content to train its AI models and to reference that material in responses from ChatGPT, its well-known chatbot.
This lawsuit is not an isolated complaint against OpenAI and Microsoft. Other publishers, as well as bestselling authors including John Grisham, Jodi Picoult, and George R.R. Martin, have sued the companies on similar grounds, and a separate case brought by authors including comedian Sarah Silverman is currently active in federal court in San Francisco.
Bauerlein voiced her concerns to The Associated Press, stating, “Our existence relies on users finding our work valuable and deciding to support it.” She emphasized that when people form relationships with AI tools rather than with the work of journalists themselves, the sustainability and integrity of independent newsrooms are put at risk, warning that reliance on AI-generated content could undermine the foundations of independent journalism and jeopardize the future of other news organizations.
OpenAI and Microsoft have generally not disclosed the sources of their training data, arguing instead that using publicly accessible online text, images, and other media constitutes “fair use” under U.S. copyright law. The CIR lawsuit challenges that assertion, noting that a dataset used by OpenAI included links to Mother Jones’ website without adequate credit to the authors or copyright notices.
This rift between tech companies and news organizations has not gone unnoticed. Last summer, more than 4,000 writers signed a letter to the CEOs of tech companies, including OpenAI, accusing them of exploiting journalistic content to build their chatbots. Bauerlein captured the sentiment succinctly: “They pay for office space, they pay for electricity, they pay salaries for their workers. Why would the content that they ingest be the only thing that they don’t pay for?”
While some in the news industry, including major outlets like The Wall Street Journal, The Atlantic, and the Financial Times, have struck deals with OpenAI that compensate them for the use of their content, CIR has taken a stand against what it sees as exploitative practices. Both Mother Jones and CIR date back to the 1970s and have long opposed unauthorized use of their work.
In its press release, CIR said that top technology companies have used its work without authorization to improve their artificial intelligence systems, a practice Bauerlein called “immensely dangerous.” CIR stressed the need for clear boundaries and fair compensation frameworks when copyrighted material is incorporated into AI systems. The lawsuit contends that AI-generated summaries can limit public access to verified information, thereby jeopardizing publishers’ interests.
“Developers like OpenAI have garnered billions in investment and revenue because of AI products fundamentally created with and trained on copyright-protected material,” said Matt Topic, the lawyer representing the news organizations in the case. “Publishers are rightfully concerned about AI summaries damaging the market for their journalism, and we intend for this lawsuit to stop this latest example of ‘move fast and break things’ tech companies doing what they want without regard for the rights of others and without compensation.”
As the legal proceedings gain momentum, the stakes are exceptionally high for all parties involved. The outcome of this lawsuit could set a precedent, shaping how AI developers acquire and use copyrighted material in the future. News organizations, tech companies, and legal experts will be watching closely.
In its defense, OpenAI has said it is working collaboratively with the news industry, forming partnerships with global news publishers to display their content within ChatGPT. The company says the arrangement is meant to drive traffic back to original articles and help sustain the journalism industry. Microsoft has yet to issue a formal response to the lawsuit.
The CIR argues that future collaborations should not be formed at the expense of independent journalism’s viability. As the legal proceedings unfold, it remains to be seen how they will impact existing practices and regulatory frameworks regarding the use of copyrighted material in AI systems.
Interestingly, amid this legal battle, OpenAI announced a new partnership with Time magazine that grants it access to Time’s “extensive archives from the last 101 years.” The deal underscores the complex, multifaceted relationship between AI companies and traditional media, which must navigate the fine line between beneficial collaboration and potential copyright violations.
From a broader perspective, this lawsuit signals a critical moment in the AI development community, as it wrestles with the ethical dimensions of data collection, usage, and compensation. As more creative, journalistic, and intellectual outputs make their way into AI models, the clarity and enforcement of copyright laws have never been more crucial.
Stay tuned for more updates on this landmark case and its implications for Artificial Intelligence for Writing. To explore the ethical facets of AI development and its future impact, visit our sections on AI Ethics and the Future of AI Writing. Let’s navigate this evolving landscape together.