Computer scientists have already established that AI models published with open weights may memorize substantial portions of their training data and reproduce that data as output given the right prompt. Meta’s Llama 3.1 70B, it’s claimed, “entirely memorizes” Harry Potter and the Sorcerer’s Stone – the first book in the series – and George Orwell’s 1984. Findings to this effect date back to at least 2020.
Now, some of those same researchers – Ahmed Ahmed, A. Feder Cooper, Sanmi Koyejo, and Percy Liang, from Stanford and Yale – have found that commercial models used in production, specifically Claude 3.7 Sonnet, GPT-4.1, Gemini 2.5 Pro, and Grok 3, memorize and can reproduce copyrighted material, just like open-weight models.
…
“We extract nearly all of Harry Potter and the Sorcerer’s Stone from jailbroken Claude 3.7 Sonnet,” the authors said, citing a recall rate of 95.8 percent. With Gemini 2.5 Pro and Grok 3, they were able to coax the models into producing substantial portions of the book, 76.8 percent and 70.3 percent respectively, without any jailbreaking.



Boffins? What is this, the Daily Mail?