You can delete a post. You can unpublish an article. You can contact a news outlet and demand a correction. You can even invoke GDPR Article 17 - the “right to erasure” - and get a company to remove your personal data from their databases.
What you cannot do is remove something from an LLM.
Not yet. Possibly not ever. And the implications of that are something most people - including the regulators writing the rules - haven’t fully reckoned with.
The Model Doesn’t Read. It Absorbs. #
Here’s the thing most people get wrong about how LLMs are trained: they imagine something like reading. Text goes in, knowledge comes out, stored somewhere you could theoretically retrieve or delete.
That’s not what happens.
Training is a compression process. Hundreds of billions of words - scraped from websites, books, forums, scientific papers, comment sections, and everything in between - are fed through a neural network repeatedly. Each pass adjusts the model’s weights: billions of floating-point numbers that encode statistical relationships between tokens. The source documents don’t end up stored anywhere. They dissolve into the math.
There’s no table to query. No file to delete. No pointer to null out.
What the model retains isn’t facts - it’s tendencies. The tendency to follow “the capital of France is” with “Paris”. The tendency to associate certain phrases with certain contexts. The tendency, as it turns out, to reproduce the biases, errors, and fabrications that appeared frequently enough in the training data to leave a statistical trace.
That distribution across billions of weights is exactly what makes removal so technically intractable. You can’t excise a single piece of knowledge the way you’d remove a row from a database. The knowledge isn’t in any one place. It’s everywhere and nowhere, simultaneously.
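The "tendencies, not facts" point can be made concrete with a toy sketch. The snippet below builds a word-level frequency table from a tiny invented corpus - the sentences and their repetition counts are entirely hypothetical, chosen only for illustration. A real LLM learns vastly richer patterns across billions of weights, not a lookup table, but the core behavior is the same in spirit: it predicts what is common, not what is true.

```python
from collections import Counter, defaultdict

# Tiny hypothetical corpus: a true sentence appears 5 times,
# a false one (invented for this demo) appears 50 times.
corpus = (
    ["the capital of france is paris"] * 5
    + ["the capital of france is berlin"] * 50
)

# For each word, count which words follow it and how often.
follows = defaultdict(Counter)
for sentence in corpus:
    words = sentence.split()
    for prev, nxt in zip(words, words[1:]):
        follows[prev][nxt] += 1

def most_likely_next(word):
    """Return the statistically most common continuation - frequency, not truth."""
    return follows[word].most_common(1)[0][0]

print(most_likely_next("is"))  # prints "berlin": the repeated falsehood wins
```

Nothing in this toy model - and, at the level of the training objective, nothing in a real one - rewards truth over repetition.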
The Poisoned Well #
Think about what the internet looked like between 2016 and 2022.
The years of peak coordinated misinformation. Anti-vaccine narratives scaling across Facebook. Climate denial funded and amplified into a manufactured consensus. Fabricated election fraud stories repeated so many times they surfaced in search results. Sophisticated, high-volume disinformation campaigns explicitly designed to flood the information space with false content.
All of it was scraped.
An LLM doesn’t know what’s true. It knows what’s common. And if a false narrative appears often enough - if it’s repeated across enough pages, forums, and news articles - it becomes statistically indistinguishable from fact inside the training corpus. The model can’t tell the difference between “Paris is the capital of France” and “vaccines cause autism” in terms of how frequently and confidently those claims appeared in the data. One is true. One was repeated millions of times by people who wanted it to seem true.
This is the fake news problem reframed. We’ve spent years worrying about misinformation in public discourse. We should be equally worried about misinformation in training data - because it’s the same problem, except now it’s baked into infrastructure rather than filtered through human judgment.
Data poisoning isn’t a theoretical future risk. It already happened. The training window captured it all. And unlike a news article that can be corrected or a post that can be taken down, the statistical traces it left in model weights are, for all practical purposes, permanent.
The Ouroboros #
Here’s where it gets worse.
We are now training new models on content generated by previous models. Not by accident - often deliberately, as a cost-effective way to produce training data at scale. And even where it isn't deliberate, the internet is filling with AI-generated text fast enough that avoiding it is becoming nearly impossible.
Researchers have found that AI-generated content now makes up a substantial and rapidly growing fraction of text online. A 2025 analysis of the web found that at least 30% of text on active web pages originates from AI-generated sources, with the true proportion likely approaching 40% [1]. On social media platforms, the picture is more uneven - but the trajectory is unmistakable [2]. The models being trained today are increasingly training on outputs from their predecessors.
In 2024, a research team from Oxford, Cambridge, and Imperial College London published a paper in Nature - “AI models collapse when trained on recursively generated data” [3] - demonstrating what they called model collapse. When models are trained on synthetic data from previous models, they degrade. The outputs look plausible. The grammar is fine. But statistical diversity shrinks with each generation. Edge cases disappear. The model’s understanding of the full distribution of human expression narrows toward a compressed, flattened average.
Think of photocopying a photocopy. Then copying that copy. Each generation is technically legible, but you’re accumulating artifacts and losing resolution. What presents as information is increasingly a statistical echo of an echo.
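The photocopy effect can be simulated in a few lines. The sketch below stands in for "training" with something deliberately trivial - fitting a normal distribution to a dataset and sampling a fresh dataset from the fit - and repeats that loop across generations. The sample size and generation count are arbitrary choices for the demo, not parameters from the Nature paper; the point is only that the spread of the distribution shrinks as each generation learns from the previous one's output.

```python
import random
import statistics

random.seed(0)  # fixed seed so the demo is reproducible

def fit_and_resample(data, n):
    """'Train' a trivial model - fit a normal distribution to the data -
    then generate a fresh synthetic dataset by sampling from the fit."""
    mu = statistics.fmean(data)
    sigma = statistics.pstdev(data)
    return [random.gauss(mu, sigma) for _ in range(n)]

n = 50
data = [random.gauss(0.0, 1.0) for _ in range(n)]  # generation 0: "human" data
initial_spread = statistics.pstdev(data)

for _ in range(500):  # each generation trains only on the previous one's output
    data = fit_and_resample(data, n)

final_spread = statistics.pstdev(data)
print(f"spread: {initial_spread:.3f} -> {final_spread:.3f}")  # the spread collapses
```

Each refit loses a little of the distribution's tails, and the losses compound: edge cases vanish first, then diversity itself.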
The ouroboros: the snake eating its own tail. We built systems that learn from human expression, and now we are polluting human expression with their outputs. The feedback loop is already running. We have no clear mechanism to stop it.
The Right to Be Forgotten Is a Fantasy #
Regulators have noticed the problem. The EU AI Act requires documentation of training data. GDPR’s right to erasure technically applies to personal data used in training. Multiple legal challenges are in progress across jurisdictions, demanding that companies remove specific individuals’ information from their models.
The lawyers are writing rules about a technical capability that does not yet exist.
The field working on this problem is called machine unlearning, and it’s genuinely interesting - but honest researchers in the area will tell you it’s far from solved. The available approaches break down roughly as follows:
Retraining from scratch without the offending data is the only method that actually guarantees removal. It also costs tens of millions of dollars, requires months of compute, and must be repeated every time a new removal request arrives. This is not a viable operational solution.
Approximate unlearning methods - gradient ascent, SISA training, influence functions - attempt to surgically adjust weights without full retraining. They can suppress specific outputs. They cannot guarantee that underlying representations are gone. A 2025 study demonstrated that even exact unlearning - retraining from scratch without the target data, the supposed gold standard - can paradoxically increase information leakage, as the difference between pre- and post-unlearning models creates a signal that can be exploited to extract the very content that was meant to be forgotten [4]. And the approaches often degrade model performance in unpredictable ways.
Fine-tuning away unwanted behavior is the most common approach in practice. It works on the surface. But surface behavior is not the same as weight modification. The model has learned to not say the thing - it hasn’t unlearned the thing.
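To see why approximate unlearning falls short of the gold standard, consider a deliberately tiny model: a one-parameter linear fit trained by plain gradient descent. The data points and step sizes below are made up for illustration. Gradient ascent on the forget point pushes the weight away from it, but the result does not coincide with the weight you get by retraining from scratch without that point - the influence is reduced, not removed.

```python
# Hypothetical data: (x, y) pairs for a one-parameter model y = w * x.
keep = [(1.0, 2.0), (2.0, 4.1), (3.0, 5.9)]  # data we may keep
forget = (4.0, 20.0)                          # the point we must "unlearn"

def grad(w, points):
    """Gradient of mean squared error for the model y = w * x."""
    return sum(2 * x * (w * x - y) for x, y in points) / len(points)

def train(points, w=0.0, lr=0.01, steps=5000):
    """Plain gradient descent to (near) convergence."""
    for _ in range(steps):
        w -= lr * grad(w, points)
    return w

w_full = train(keep + [forget])  # original model, trained on everything
w_exact = train(keep)            # gold standard: retrain from scratch without it

# Approximate unlearning: a few gradient *ascent* steps on the forget point.
w_approx = w_full
for _ in range(10):
    w_approx += 0.001 * grad(w_approx, [forget])

# w_approx has moved away from the forget point, but it is not w_exact:
# the forgotten data's influence is reduced, not removed.
print(f"full={w_full:.3f}  exact={w_exact:.3f}  approx={w_approx:.3f}")
```

And running the ascent loop longer does not close the gap - the update eventually diverges, a toy version of the unpredictable performance degradation that approximate methods exhibit at scale.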
Regulators are writing obligations for a world where this works reliably. We are not in that world.
Reset to What? #
This brings us to the question nobody wants to answer directly.
If we acknowledged that a model is irreparably contaminated - by misinformation, by private data it shouldn’t have, by synthetic degradation - could we simply reset it? Roll back to an earlier checkpoint?
Let’s think about what that actually means.
An earlier checkpoint is not a cleaner version of the same model. It’s a model trained on earlier data. Different contamination. The 2020 checkpoint captured the first wave of coordinated pandemic misinformation and four years of accelerating disinformation campaigns. The 2018 checkpoint missed some of that - but it also reflects a different internet, with its own biases, its own gaps, its own poisons.
There is no neutral state. There is no pristine training corpus. Every checkpoint reflects the world as documented up to that moment, including all the ways that documentation was distorted, gamed, or fabricated.
The uncomfortable truth: there is no clean LLM. There is only which contamination you’re willing to accept.
We built the tool before we understood its implications. Now we are trying to retrofit governance, accountability, and the right to be forgotten onto something that was never designed to support any of those things.
An Echo Is Not Knowledge #
None of this means LLMs are worthless. It means they carry sediment.
Every output from a large language model carries the statistical residue of the entire training corpus - the good information and the fabricated information, the human insight and the synthetic recursion, the ideas that deserved to spread and the ones that were engineered to. You cannot separate them. The model can’t tell them apart.
The appropriate mental model isn’t a search engine or an expert. It’s a very well-read person who absorbed everything ever written on the internet, including the misinformation campaigns, the AI-generated filler, and four years of coordinated political fabrication, and who has no way of knowing which parts of their education were real.
Useful. But not neutral. Never neutral.
So the next time an LLM answers a question with calm confidence - and they all do - ask yourself: is this knowledge, or is this an echo?
Because the difference matters. And unlike the post you can delete or the article you can correct, the echo doesn’t go away.
Join the Conversation #
What’s your experience with this? I’d genuinely like to hear how people in data and AI roles are thinking about training data provenance - reach out on LinkedIn or BlueSky.
References #
[1] “Delving into: the quantification of AI-generated content on the internet (synthetic data),” arXiv:2504.08755 (2025). https://arxiv.org/abs/2504.08755
[2] Yafu Li et al., “Are We in the AI-Generated Text World Already? Quantifying and Monitoring AIGT on Social Media,” arXiv:2412.18148 (2024). https://arxiv.org/abs/2412.18148
[3] Ilia Shumailov, Zakhar Shumaylov, Yiren Zhao, Nicolas Papernot, Ross Anderson, Yarin Gal, “AI models collapse when trained on recursively generated data,” Nature 631, 755–759 (2024). https://doi.org/10.1038/s41586-024-07566-y
[4] Xiaoyu Wu, Yifei Pang, Terrance Liu, Zhiwei Steven Wu, “Unlearned but Not Forgotten: Data Extraction after Exact Unlearning in LLM,” NeurIPS 2025, arXiv:2505.24379. https://arxiv.org/abs/2505.24379
[5] Regulation (EU) 2016/679 (GDPR), Article 17: Right to erasure (“right to be forgotten”).
[6] Regulation (EU) 2024/1689 (EU AI Act), Article 53: Obligations for providers of general-purpose AI models - training data documentation requirements. https://artificialintelligenceact.eu/article/53/
[7] Lucas Bourtoule, Varun Chandrasekaran, Christopher A. Choquette-Choo, Hengrui Jia, Adelin Travers, Baiwu Zhang, David Lie, Nicolas Papernot, “Machine Unlearning,” IEEE Symposium on Security and Privacy (2021), arXiv:1912.03817. https://arxiv.org/abs/1912.03817
Photo by Artem Yellow: https://www.pexels.com/photo/men-standing-in-front-of-a-storage-full-of-trash-15193736/