Recursive Pollution and Model Collapse Are Not the Same

Forever ago in 2020, we identified “looping” as one of the “raw data in the world” risks. See An Architectural Risk Analysis of Machine Learning Systems (January 20, 2020), where we said, “If we have learned only one thing about ML security over the last few months, it is that data play just as important a role in ML system security as the learning algorithm and any technical deployment details. In fact, we’ll go out on a limb and state for the record that we believe data make up the most important aspects of a system to consider when it comes to securing an ML system.”

Here is how we presented the original risk back in 2020. Remember, this was well before ChatGPT changed everything.

[raw:8:looping]

Model confounded by subtle feedback loops. If data output from the model are later used as input back into the same model, what happens? Note that this is rumored to have happened to Google Translate in the early days, when translations of pages made by the machine were used to train the machine itself. Hilarity ensued. To this day, Google restricts some translated search results through its own policies.

We were all excited when Ross Anderson and Ilia Shumailov “did the math” on the looping thing and wrote it up in this paper three years later:

Shumailov, Ilia, Zakhar Shumaylov, Yiren Zhao, Yarin Gal, Nicolas Papernot, and Ross Anderson. “The Curse of Recursion: Training on Generated Data Makes Models Forget.” arXiv preprint arXiv:2305.17493 (2023). (Their paper was later published in Nature.)

In our BIML Bibliography entry, we call it, “a very easy to grasp discourse covering the math of eating your own tail. This is directly relevant to LLMs and the pollution of large datasets. We pointed out this risk in 2020. This is the math. Finally published in Nature vol 631.” In 2026, we still believe it is one of the top five papers in the field of machine learning security.

In the science world, this problem came to be known as “model collapse.” Honestly, we don’t care what it’s called as long as ML users are aware of the risk. See our original blog entry about all this here.

Four years later, we need to revisit our position. The problem is this: discussion of model collapse focuses on END STATE conditions to the detriment of any focus on the pollution itself. Your model does not have to completely collapse to become an unusable disaster. It can become enough of a disaster through recursive pollution well before it ever collapses. This is especially worrisome in light of Carlini’s recent work, Poisoning Attacks on LLMs Require a Near-constant Number of Poison Samples, which is also in our top five.

We’ve been digging into the model collapse literature this January (2026), and we think it is time to clarify our view: recursive pollution is NOT model collapse…though it can LEAD to model collapse in the worst case.
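To make the distinction concrete, here is a minimal toy sketch of our own (it is not the experimental setup from the Shumailov paper, and the sample size, seed, and number of generations are arbitrary illustrative choices): fit a simple Gaussian model to data, sample from the fit, and then retrain only on those samples, generation after generation.

```python
import numpy as np

# Toy illustration of recursive pollution (not the Shumailov et al. setup):
# each "generation" of the model is fit only to samples generated by the
# previous generation. Estimation error compounds, so the learned
# distribution drifts and narrows a little more every round.

rng = np.random.default_rng(0)
n_samples = 1_000                       # illustrative sample size per generation
data = rng.normal(0.0, 1.0, n_samples)  # generation 0 trains on "real" data

for gen in range(1, 11):
    mu, sigma = data.mean(), data.std()      # "train" the model (fit a Gaussian)
    data = rng.normal(mu, sigma, n_samples)  # next generation eats this model's output
    print(f"gen {gen:2d}: mean={mu:+.3f}  std={sigma:.3f}")
```

Run it and the point shows up quickly: the learned distribution drifts and shrinks a little more each generation. The damage accrues long before anything you would call full collapse, which is exactly why we keep recursive pollution and model collapse separate.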

Here’s our definition from the 2024 paper An Architectural Risk Analysis of Large Language Models (January 24, 2024), where we identified recursive pollution as the number one LLM risk. We have not changed our mind.

We have identified what we believe are the top ten LLM security risks. These risks come in two relatively distinct but equally significant flavors: some are risks associated with the intentional actions of an attacker; others are risks associated with an intrinsic design flaw. Intrinsic design flaws emerge when engineers with good intentions screw things up. Of course, attackers can also go after intrinsic design flaws, complicating the situation.

[LLMtop10:1:recursive pollution]

LLMs can sometimes be spectacularly wrong, and confidently so. If and when LLM output is pumped back into the training data ocean (by being put on the Internet, for example), a future LLM may end up being trained on these very same polluted data. This is one kind of “feedback loop” problem we identified and discussed in 2020. See, in particular, [BIML78 raw:8:looping], [BIML78 input:4:looped input], and [BIML78 output:7:looped output]. Shumailov et al. subsequently wrote an excellent paper on this phenomenon. Also see Alemohammad. Recursive pollution is a serious threat to LLM integrity. ML systems should not eat their own output, just as mammals should not consume the brains of their own species. See [raw:1:recursive pollution] and [output:8:looped output].

And just for completeness, here are the other two risk entries:

[raw:1:recursive pollution]

The number one risk in LLMs today is recursive pollution. This happens when an LLM is trained on the open Internet (including errors and misinformation), creates content that is wrong, and then later eats that content when it (or another generation of models) is trained up again on a data ocean that includes its own pollution. Wrongness grows just like guitar feedback through an amp does. BIML identified this problem in 2020. See [BIML78 raw:8:looping], [LLMtop10:1:recursive pollution], and Shumailov.


[output:8:looped output]

See [BIML78 input:4:looped input]. If system output feeds back into the real world, there is some risk that it may find its way back into input, causing a feedback loop. This has come to be known as recursive pollution. See [LLMtop10:1:recursive pollution].

Anyway, expect to hear more from us on the recursive pollution front as we try to dig through the science and make sense of it all so you don’t have to.

And watch out for recursive pollution. It’s bad.

The Anthropic Copyright Settlement is Telling

I have written 12 books (not counting translations of particularly popular works), so I expected to find some works of mine on the Anthropic settlement website. Though I have known about this settlement action for a while now, I put off thinking about it until I got an official email from a law office just last week. That made me bite the bullet and go digging through the data pile.

You probably already know BIML’s distinction between HOW machines (normal computer programs) and WHAT machines (machines built by ML over an often immense WHAT pile). We talk all about this in our LLM risks report. Lots of risks are tied up in the very nature of the WHAT pile. Poison in the WHAT pile is bad. Racism, xenophobia, and sexism in the WHAT pile are bad. It turns out that one excellent approach to ML risk management is to compile a nice clean WHAT pile to train on.

Back to our Anthropic story. Much to my surprise, the Anthropic WHAT pile only had seven of my authored books on its list of stolen things—and it didn’t have the most famous of my works on it at all. Weird.

Is that good? Well, not really. You see, I would prefer that AI/ML systems encapsulated in LLMs understand and incorporate the concepts in my very best work, Software Security. Apparently they don’t. I am torn about this as an author. I don’t want to see my stuff outright stolen and my copyrights infringed. But I also don’t want LLMs to be wrong and under-informed about software security—a field I helped to define from the very beginning.

It’s complicated, huh?

Gula Does BIML

Ron Gula, serial security entrepreneur, interviews vRon (an AI Agent) about BIML and BIML risks. This is fun.

Houston, we have a problem: Anthropic Rides an Artificial Wave

I’ll tip my hat to the new Constitution
Take a bow for the new revolution
Smile and grin at the change all around
Pick up my guitar and play
Just like yesterday
Then I’ll get on my knees and pray
We don’t get fooled again

Out there in the smoking rubble of the fourth estate, it is hard enough to cover cyber cyber. Imagine, then, piling on the AI bullshit. Can anybody cut through the haze? Apparently for the WSJ and the NY Times, the answer is no.

Yeah, it’s Anthropic again. This time writing a blog-post-level document titled “Disrupting the first reported AI-orchestrated cyber espionage campaign” and getting the major tech press all wound around the axle about it.

The root of the problem here is that expertise in cyber cyber is rare AND expertise in AI/ML is rare…but expertise in both fields? Not only is it rare, but like hydrogen-7, which has a half-life of about 10^-24 seconds, it disappears pretty fast as both fields progress. Even superstar tech reporters can’t keep everything straight.

Let’s start with the end. What question should the press have asked Anthropic about their latest security story? How about, “Which parts of these attacks could ONLY be accomplished with agentic AI?” From our little perch at BIML, it looks like the answer is a resounding none.

Now that we know the ending, let’s look at both sides of the beginning. Security first. Unfortunately, brute-force, cloud-scale, turnkey software exploitation is what has been driving the ransomware cybercrime wave for at least a decade now. All of the offensive security tool technology used by the attackers Anthropic describes is available as open source frameworks, leading experts like Kevin Beaumont to label the whole thing “vibe usage of open source attack frameworks.” Would existing controls work against this? Apparently not for “a handful” of the thirty companies Anthropic claims were successfully attacked. LOL.

By now, those of us old enough to know better than to call ourselves security experts have learned how to approach claims like the ones Anthropic is making skeptically. “Show me the logs,” we yell as we shake our canes in the air. Seriously. Where is the actual evidence? Who has seen it? Do we credulously repeat whatever security vendors tell us as if it is the gods’ honest truth? No, we do not. Who was successfully attacked? Did the reporters chase them down? Who was on the list of 30?

AI second. It is all too easy to exaggerate claims in today’s superheated AI universe. One of the most trivial (and intellectually lazy) ways to do this is to use anthropomorphic language when we are describing what LLMs do. LLMs don’t “think” or “believe” or “have intentionality” like humans do. (FWIW, Anthropic is very much guilty of this and they are not getting any better.) LLMs do do a great job of role playing though. So dressing one up as a black hat nation state hacker and sending it lumbering off into the klieg lights is easy.

So who did it? How do we prove that beyond a reasonable doubt? Hilariously, the real attacks here appear to be asking an LLM to pretend to be a white hat red team member dressed in a Where’s Waldo shirt and wielding an SSRF attack. Wake me up when it’s over.

Ultimately, is this really the “first documented case of a cyberattack largely executed without human intervention at scale”? No, that was the script kiddies in the ’90s.

Let’s be extremely clear here. Machine Learning Security is absolutely critical. We have lots of work to do. So let’s ground ourselves in reality and get to it.

BIML granted official non-profit status

After an extensive year-long process, the Berryville Institute of Machine Learning has been granted 501(c)(3) status by the United States Internal Revenue Service. BIML is located at the foot of the Blue Ridge Mountains on the banks of the Shenandoah River in Berryville, Virginia.

We are proud of the impact our work has made since we were founded in 2019, and we look forward to the wider engagement that non-profit status will allow us.

BIML in Brazil: Mind the Sec keynote

This Mind the Sec keynote was delivered on September 18th in São Paulo, Brazil, to an audience of several thousand attendees. The stage was set “in the round,” which made delivery interesting. Mind the Sec is the largest information security conference in Latin America, with an overall attendance of 16,000.

BIML in São Paulo

In addition to keynoting Mind the Sec, Dr. McGraw spoke at the University of São Paulo.

You can watch the talk (delivered to 180 USP graduate students) here.

The in-person portion of the audience…

Legit Webinar: swsec/appsec Meets AI/Development

Has application development changed because of AI? Yes it has. Fundamentally. What does this mean for software security? Liav Caspi, Legit CTO, and BIML’s Gary McGraw discuss this important topic. Have a watch.