OpenAI’s Unauthorized Regurgitation of Copyright-Protected Works of Journalism

13 Aug 2024

The Center for Investigative Reporting Inc. v. OpenAI Court Filing, retrieved on June 27, 2024, is part of HackerNoon’s Legal PDF Series. This part is 6 of 18.

DEFENDANTS’ UNAUTHORIZED REGURGITATION OF COPYRIGHT-PROTECTED WORKS OF JOURNALISM

76. ChatGPT and Copilot provide responses to questions or other prompts. Their ability to provide these responses is the key value proposition of Defendants’ products, which they are able to sell to their customers for enormous sums of money, soon likely to be in the billions of dollars.

77. To train ChatGPT, the OpenAI Defendants retain users’ chat histories with ChatGPT unless the user takes the affirmative step of disabling that feature.[12] Thus, the OpenAI Defendants possess a repository of every regurgitation of Plaintiff’s works apart from those whose storage users have affirmatively disabled.

78. At least some of the time, ChatGPT and Copilot provide or have provided responses to users that regurgitate verbatim or nearly verbatim copyright-protected works of journalism without providing author, title, copyright, or terms of use information contained in those works. Examples of such regurgitations are included in Exhibit J to the Complaint in Daily News, LP v. Microsoft Corporation, No. 24-cv-03285 (S.D.N.Y. Apr. 30, 2024).

79. At least some of the time, ChatGPT and Copilot provide or have provided responses to users that mimic significant amounts of material from copyright-protected works of journalism without providing any author, title, copyright, or terms of use information contained in those works. For example, if a user asks ChatGPT or Copilot about a current event or the results of a work of investigative journalism, ChatGPT or Copilot will provide responses that mimic copyright-protected works of journalism that covered those events, not responses that are based on any journalism efforts by Defendants.

80. At least some of the time, ChatGPT memorizes and regurgitates material. The OpenAI Defendants have publicly admitted their knowledge of this fact.[13] The OpenAI Defendants have also effectively admitted that regurgitation of copyrighted works is infringement: when Plaintiff attempted to obtain the same regurgitations set forth in the Daily News case using the same methodology, Plaintiff received in one instance a message stating, “I’m sorry, but I can’t generate the original ending for the article or any copyrighted content.” Thus, upon information and belief, the OpenAI Defendants have recently changed ChatGPT to reduce regurgitations for copyright reasons.

81. Nonetheless, ChatGPT has produced regurgitations of Plaintiff’s copyright-protected works. Examples of three such regurgitations, along with the prompts that generated them, are attached as Exhibit 7.

82. Such memorization and regurgitation constitute unauthorized copies or derivative works of the Plaintiff’s work. Defendants directly engage in the unauthorized public display of CIR’s articles as part of generative output provided by their products built on the GPT models.


About HackerNoon Legal PDF Series: We bring you the most important technical and insightful public domain court case filings.

This court case, retrieved on June 27, 2024, from motherjones.com, is part of the public domain. The court-created documents are works of the federal government and, under copyright law, are automatically placed in the public domain and may be shared without legal restriction.

[12] New ways to manage your data in ChatGPT (Apr. 25, 2023), https://openai.com/index/new-ways-to-manage-your-data-in-chatgpt/.

[13] OpenAI and journalism (Jan. 8, 2024), https://openai.com/index/openai-and-journalism/.