By: Alec Winshel
Last month, Mona Awad and Paul Tremblay filed a lawsuit against OpenAI for infringement of their works. The complaint is another in a series of cases filed by Matthew Butterick and the Joseph Saveri Law Firm that mount legal challenges against companies developing AI-powered large language models. These models, often referred to as LLMs, are algorithms that utilize artificial intelligence and massive datasets to generate natural-sounding text in response to prompts. ChatGPT, owned by defendant OpenAI, is the most popular LLM and now tallies more than 100 million users. OpenAI feeds on enormous datasets to develop its capabilities. Awad and Tremblay don’t want to be part of the feeding frenzy.
Each plaintiff is a best-selling author of multiple novels. Mona Awad is the author of Bunny and 13 Ways of Looking at a Fat Girl. Paul Tremblay authored The Cabin at the End of the World, which was recently adapted by M. Night Shyamalan into the 2023 film Knock at the Cabin.
Tremblay and Awad’s complaint accuses OpenAI of ingesting their novels into its product, ChatGPT, without permission. The complaint alleges that ChatGPT is capable of generating summaries of their copyrighted works, which suggests that the software’s training data included their books. Plaintiffs allege that the training occurred “without consent, without credit, and without compensation.” The complaint was filed as a class action on behalf of the authors and states that “there are at least thousands of members in the Class” who have suffered similar injuries in the United States. Claims include direct and vicarious copyright infringement, violations of the Digital Millennium Copyright Act, unfair competition under California law, and negligence.
This lawsuit is one of many recent legal actions challenging the expansion of AI. On July 7, the same legal team filed a second class-action lawsuit on behalf of three more authors, including comedian Sarah Silverman. Both complaints make similar allegations. In January, the team also filed a complaint against a different generative AI company, Stability AI, for the alleged use of copyrighted images in the training dataset for its image generator, Stable Diffusion. The Clarkson Law Firm, a self-described public interest firm, recently filed a broader complaint against OpenAI that lists anonymous plaintiffs and seeks a halt on development of ChatGPT until the dangers of AI are more readily understood.
Legal battles are poised to dictate the future of generative AI. ChatGPT and similar products can only provide their users with book summaries, how-to guides, and personalized playlists by scouring the Internet and its treasure trove of existing data. Companies like OpenAI are likely to lean into the Internet’s longtime ethos: all information is for everyone. Safeguarding works only deprives people of access. But authors adopt a different stance. Copyright protects creative expression by ensuring that authors profit from their endeavors. Will fewer consumers purchase Awad’s novels because they can access a free summary on ChatGPT? Does Tremblay deserve payment when ChatGPT’s naturalistic style of ‘speaking’ is marginally improved by scanning his prose?
More lawsuits will come. These cases represent an early reckoning between LLMs and traditional authors, one that is playing out across industries. As the cases slowly make their way through the court system, the software will continue to develop as more and more expressive works are subsumed into training datasets. There is no easy resolution in sight.