Uncategorized | Page 2 of 13 | Berryville Institute of Machine Learning

11 February 2026

No comments

Categories: Uncategorized

Back in the mid-’90s, an era or two ago, and long before the advent of the transformer model and explosive rise of LLMs that define the modern ML landscape, our own Dr. Gary McGraw (under the guidance of Doug Hofstadter) was exploring a fundamental question of artificial intelligence:

“What are the mechanisms underlying the fluidity of human concepts?”

How is it that we can understand conceptual boundaries, develop categories, and implicitly see the sameness that binds different instances of a concept together? And what might we learn by building a machine that simulates this behavior? Or, rather, what is an A?

The perceptual hypothesis behind the Letter Spirit project is that letter-concepts are composed of constituent roles. That is, letter concepts, in turn, have letter-part concepts.

The Letter Spirit project approached these questions from the angle of letter perception. While easy to take for granted, we literate apes possess the ability to differentiate letters and letter categories displayed in a huge variety of fonts, handwriting styles, and artistic styles. Our gut instinct may tell us that the letter ‘a’ is a mere shape made up of a bunch of tiny dots; but just a few examples can reveal a much greater depth to what constitutes our concept of the letter ‘a’.

This role model hypothesis is implemented here as the Letter Spirit Examiner program (a program written in Scheme in 1995). It works through emergent computation: it segments letters into natural, constituent parts that correspond to the conceptual roles of the very concept of a letter (that is, different conceptual rules that, when satisfied, lead us to identify a letter). The Examiner does this by running hundreds of micro-agents (called codelets) that are instantiations of sixteen codelet types. The asynchronous, parallel, local processing done by the codelets implements a parallel terraced scan of possible structures (as in the role model’s predecessor, Copycat). From these codelets emerges a high-level perception: the categorization of a letter shape into an idea.
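As a rough illustration of that style of processing, here is a minimal Python sketch of a pool of codelets from which the next one to run is drawn at random, weighted by urgency. Everything in it is an assumption made for illustration: the `Codelet` and `Coderack` names, the urgency values, and the two toy codelet types are hypothetical and are not taken from the Scheme Examiner or the JavaScript port.

```python
import random
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Codelet:
    """A tiny unit of work; urgency biases how soon it is likely to run."""
    name: str
    urgency: float
    action: Callable[["Coderack"], None]


class Coderack:
    """Pool of pending codelets; selection is stochastic, weighted by urgency."""

    def __init__(self, seed: int = 0) -> None:
        self.rng = random.Random(seed)
        self.pending: List[Codelet] = []
        self.log: List[str] = []

    def post(self, codelet: Codelet) -> None:
        self.pending.append(codelet)

    def step(self) -> bool:
        """Run one codelet; return False when nothing is left to run."""
        if not self.pending:
            return False
        weights = [c.urgency for c in self.pending]
        index = self.rng.choices(range(len(self.pending)), weights=weights)[0]
        chosen = self.pending.pop(index)
        self.log.append(chosen.name)
        chosen.action(self)
        return True


def look_for_part(rack: Coderack) -> None:
    # Hypothetical "scout": it posts a more urgent follow-up codelet.
    rack.post(Codelet("bind-part-to-role", urgency=5.0, action=lambda r: None))


rack = Coderack(seed=42)
for _ in range(3):
    rack.post(Codelet("look-for-part", urgency=1.0, action=look_for_part))

while rack.step():
    pass

print(rack.log)
```

No codelet is in charge here. Promising lines of exploration simply get more urgent follow-ups, so they tend to be pursued sooner while less promising ones still get an occasional look.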

*Just a couple of ‘a’s – Letter Spirit Ch. 1*

To our great pleasure and delight, we recently learned that Paul Geiger has developed a JavaScript implementation of the Letter Spirit Examiner based on the Scheme code developed originally by McGraw and then adapted by Dr. John Rehling (standard-bearer of Letter Spirit – Part 2). This version is now accessible on the web, for curious people to play with.

Check it out in your very own browser now: Letter Spirit Examiner

An in-depth explanation of the architecture and implementation can be found here.

Big thanks to Paul Geiger: https://github.com/Paul-G2/letter-spirit-examiner-js

5 February 2026

No comments

Categories: Uncategorized

The brilliance of Anthropic’s Super Bowl Ad campaign spotlights what might be considered the core of humanity’s brilliance: creativity and nuanced communication.

Unless these ads were created 100% by AI with zero human involvement (even in the ideation phase), this is a moment to celebrate the (presumably-human) humor and pause for the deeper thoughts that they might just trigger.

In the end, do the “pretty people” of Madison Ave actually win the AI Era? The snarky comment one would hear way back when in Silicon Valley was always “when the pretty people (aka Creatives, agency types, ad people, etc.) got involved, the tech was toast.” Remember when Google was a search company and not an ad serving company? We do.

All that said, these ads are targeted at B2C and mass market—the consumer base that loves to lap up beige. Now why would that be? Well, the WHAT pile has a great deal of their thinking (and hopes and dreams and aspirations) encoded in it through text. That’s why AI will do really well with the great American consumer: because it reflects their (beige) worldview.

Another quick question: will we use ad tech to run our banks and nuclear power plants? Probably not. So MLsec is just as critical as it ever was.

5 February 2026

No comments

Categories: Uncategorized

We all know that WHAT machines like LLMs reflect the quality and security of everything in their WHAT pile (that is, their training set). We invent cutesy names like “hallucinate” to cover up being dangerously wrong. However, ignoring or soft-pedaling risk is often not the best way forward. Real risk management is about understanding risk and adjusting strategy and tactics accordingly.

In order to do better risk management in MLsec, we need to understand what’s going on inside the network. Which nodes (and node groups) do what? What is the nature of representation inside the network? Can we spot wrongness before it comes out? Better yet, can we compare networks and adjust networks from the inside before we adopt them?

These are the sorts of things that Starseer is looking into. At BIML we are bullish on this technical approach.

4 February 2026

No comments

Categories: Uncategorized

What happens when you organize a machine learning security conference together with a bunch of security experts who have widely varying degrees of machine learning experience? Fun and games!

The [un]prompted conference has a program committee reading like a who’s who of security, stretching from Bruce Schneier on one end to Halvar Flake on the other. BIML is proud and honored to have two people representing on the committee. (But we will say that we are legitimately surprised at how many people claim to have deep knowledge of machine learning security all lickety split like. Damn they must be fast readers.)

Ultimately all the experts had to slog through the 461 submissions, boiling the pile down to 25 or 30 actual talks. Did the law of averages descend in all its glory? Why yes, yes it did.

I have served on some impressive and diligent academic program committees over the decades (especially Usenix Security, NDSS, and Oakland). The [un]prompted approach is apparently more like Blackhat or DEFCON than that, with lots of inside baseball, big personalities, seemingly-arbitrary process, really smart people who actually do stuff, and much much more fun. And honestly the conference is going to be great—wide and deep and very real with a huge bias towards demos. ALL of the talks will be excellent.

I took it on myself to review everything submitted to my track (TRACK 1: Building Secure AI Systems) and also track 5 (TRACK 5: Strategy, Governance & Organizational Reality). Though I did get track 1 done (three times no less), I did not get through everything that came in during the deadline tidal wave. Let’s just say A&A for Agents is over-subscribed and under-depth, prompt injection is the dead horse that still gets beaten, MCP and other operations fun at scale is the state of the practice, and wonky government types still like to talk about policy (wake me up when it’s over). If you want to see what’s next in building security in for ML, well it is only very slimly represented by two “let’s get in there and see what the network is actually doing” proposals (one from Starseer and one from Realm Labs). Yeah, submissions were “anonymous,” but everybody knows who is doing what at this end of the security field, so that’s just pretend.

Not only do we desperately need more whitebox work (leveraging the ideas behind transformer circuits you can find here), we also need to stop and think in MLsec. Where does recursive pollution (our #1 risk at BIML) fit in [un]prompted? Nowhere. How about model collapse? Nope. Data poisoning a la Carlini? Not even. Anything at all about data curation and cleaning (and its relationship to security)? Nah. Representation issues and security engineering? Well, there was one proposal about tokens…

Hats off to the outside->in ops guys, they’re grabbing hold of the megaphone again! Just raw hacker sex appeal I guess.

Anyway, if you’re looking for a reason that BIML exists in all of our philosophical glory, it’s to peer as far into the MLsec future as possible. Somewhat ironically, we can do that by remembering the past. This [un]prompted experience feels so much like early software security (everyone was talking about buffer overflows in 1998 and penetration testing was an absolute wild west blast) that we can confidently predict MLsec is going to evolve from blackbox outside->in malicious input stuff, through intrusion detection, monitoring and sandboxing, eventually discovering that networks have lots of actual stuff you can try to make sense of inside the black box. Meanwhile the ops guys will paint a little number on each agentic ant, not thinking once about what the ant colony might be up to.

Do you remember when we decided to start looking at code to find bugs before it was even compiled? Because I do…it was my DARPA project. It will happen again. Not through static analysis…but through understanding just what the heck is going on INSIDE the networks we are building as fast as we can.

23 January 2026

1 Comment

Categories: Uncategorized

Pushing back on my flight from NYC to IAD, I caught one last headline before powering down the computer in my palm. This, from OpenAI:

Hmm, “Education” or “OpenAI’s Education”... The headline felt worrisome given the total ‘fail’ experience I had just had with ChatGPT the evening before, during a MoMA guided tour, when I used it to augment my educational experience.

A masterful art expert, Agnes Berecz, had just led us through works of Helen Frankenthaler, Lee Krasner, Yente (Eugenia Crenovich), Louise Bourgeois, and Joan Mitchell.

Then we stopped to view this piece by Niki de Saint Phalle:

I was using the ChatGPT App as a companion during the tour because the human tour guide had mesmerizing knowledge of details of each artist, their style, inspiration, and wider impact on the art scene of the times. I wanted to hoover it all up.

I jotted notes on my phone and also shot queries into ChatGPT, to further dig for nuggets that could add to my knowledge.

Now, we all know ChatGPT can ‘get it wrong’, but it’s all too delightful to not lean on it and expect rightness.

I fed the photo into ChatGPT (in part, to log the artist’s name and spelling correctly).

What unfolded was a shocking reminder that no matter how spectacularly confident outputs read, you must keep your critical brain switched ‘on’!

ChatGPT responded by crediting the artwork to Robert Rauschenberg, not Niki de Saint Phalle.

Had it really just attributed a work hanging in the MoMA to the wrong artist?

It had.

So I chose a prompt to suggest a sense-check.

Next, this output response:

For now and on so many levels, Agnes as teacher far exceeds the machine.

At BIML, we are pushing on the topic of Recursive Pollution as a very real thing.

If it plays out, the museum of the future may be full of Mona Lisas.

10 January 2026

2 Comments

Categories: Uncategorized

Forever ago in 2020, we identified “looping” as one of the “raw data in the world” risks. See An Architectural Risk Analysis of Machine Learning Systems (January 20, 2020), where we said, “If we have learned only one thing about ML security over the last few months, it is that data play just as important role in ML system security as the learning algorithm and any technical deployment details. In fact, we’ll go out on a limb and state for the record that we believe data make up the most important aspects of a system to consider when it comes to securing an ML system.”

Here is how we presented the original risk back in 2020. Remember, this was well before GPT2 changed everything.

[raw:8:looping]

Model confounded by subtle feedback loops. If data output from the model are later used as input back into the same model, what happens? Note that this is rumored to have happened to Google translate in the early days when translations of pages made by the machine were used to train the machine itself. Hilarity ensued. To this day, Google restricts some translated search results through its own policies.
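To make the feedback loop concrete, here is a toy Python sketch (not an experiment from any of the papers mentioned in this post). It assumes the "model" is nothing more than a Gaussian fitted to its training data, and that each new generation is trained only on samples drawn from the previous generation's fit. The function name, sample size, and generation count are all illustrative assumptions.

```python
import random
import statistics


def refit_on_own_output(generations: int, sample_size: int, seed: int = 0) -> list:
    """Repeatedly fit a Gaussian to samples drawn from the previous fit.

    Returns the fitted standard deviation after each generation,
    starting with the "real world" value of 1.0.
    """
    rng = random.Random(seed)
    mean, stdev = 0.0, 1.0  # generation 0: the real data distribution
    history = [stdev]
    for _ in range(generations):
        # The model's own output becomes the next training set.
        samples = [rng.gauss(mean, stdev) for _ in range(sample_size)]
        mean = statistics.fmean(samples)
        stdev = statistics.pstdev(samples)  # maximum-likelihood estimate
        history.append(stdev)
    return history


history = refit_on_own_output(generations=200, sample_size=10, seed=1)
print(f"start: {history[0]:.3f}  end: {history[-1]:.3g}")
```

With small samples, the fitted spread tends to drift toward zero over many generations. Each refit loses a little of the tails, and nothing in the loop ever puts that information back.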

We were all excited when Ross Anderson and Ilia Shumailov “did the math” on the looping thing and wrote it up in this paper three years later:

Shumailov, Ilia, Zakhar Shumaylov, Yiren Zhao, Yarin Gal, Nicolas Papernot, and Ross Anderson. “The Curse of Recursion: Training on Generated Data Makes Models Forget.” arXiv preprint arXiv:2305.17493 (2023). (Their paper was later published in Nature.)

In our BIML Bibliography entry, we call it, “a very easy to grasp discourse covering the math of eating your own tail. This is directly relevant to LLMs and the pollution of large datasets. We pointed out this risk in 2020. This is the math. Finally published in Nature vol 631.” In 2026, we still believe it is one of the top five papers in the field of machine learning security.

In the science world, this problem came to be known as “model collapse.” Honestly, we don’t care what it’s called as long as ML users are aware of the risk. See our original blog entry about all this here.

Four years later, we need to revisit our position. The problem is this. Discussion of model collapse focuses on END STATE conditions to the detriment of any focus on the pollution part itself. Your model does not have to completely collapse to become an unusable disaster. It can become a disaster enough through recursive pollution well before the model collapses. This is especially worrisome in light of Carlini’s recent work, Poisoning Attacks on LLMs Require a Near-constant Number of Poison Samples, which is also in our top 5.

We’ve been digging into the model collapse literature this January (2026), and we think it is time to clarify our view that recursive pollution is NOT model collapse…though it can LEAD to model collapse in the worst state.

Here’s our definition from the 2024 paper An Architectural Risk Analysis of Large Language Models (January 24, 2024) where we identified recursive pollution as the number one LLM risk. We have not changed our mind.

We have identified what we believe are the top ten LLM security risks. These risks come in two relatively distinct but equally significant flavors, both equally valid: some are risks associated with the intentional actions of an attacker; others are risks associated with an intrinsic design flaw. Intrinsic design flaws emerge when engineers with good intentions screw things up. Of course, attackers can also go after intrinsic design flaws complicating the situation.

[LLMtop10:1:recursive pollution]

LLMs can sometimes be spectacularly wrong, and confidently so. If and when LLM output is pumped back into the training data ocean (by reference to being put on the Internet, for example), a future LLM may end up being trained on these very same polluted data. This is one kind of “feedback loop” problem we identified and discussed in 2020. See, in particular, [BIML78 raw:8:looping], [BIML78 input:4:looped input], and [BIML78 output:7:looped output]. Shumailov et al. subsequently wrote an excellent paper on this phenomenon. Also see Alemohammad. Recursive pollution is a serious threat to LLM integrity. ML systems should not eat their own output just as mammals should not consume brains of their own species. See [raw:1:recursive pollution] and [output:8:looped output].

And just for completeness, here are the other two risk entries:

[raw:1:recursive pollution]

The number one risk in LLMs today is recursive pollution. This happens when an LLM model is trained on the open Internet (including errors and misinformation), creates content that is wrong, and then later eats that content when it (or another generation of models) is trained up again on a data ocean that includes its own pollution. Wrongness grows just like guitar feedback through an amp does. BIML identified this problem in 2020. See [BIML78 raw:8:looping], [LLMtop10:1:recursive pollution] and Shumailov.

[output:8:looped output]

See [BIML78 input:4:looped input]. If system output feeds back into the real world there is some risk that it may find its way back into input causing a feedback loop. This has come to be known as recursive pollution. See [LLMtop10:1:recursive pollution].
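The feedback described in these entries can be sketched as a simple recurrence. This is a back-of-the-envelope model under loudly stated assumptions, not a result from any paper: each generation's model reproduces the error rate of the pool it was trained on, adds a fixed amount of wrongness of its own, and dumps its output back into the pool. All parameter names and numbers are hypothetical.

```python
def pollution_over_generations(
    human_docs: float,
    human_error_rate: float,
    generated_docs_per_generation: float,
    added_model_error: float,
    generations: int,
) -> list:
    """Track the share of wrong documents in a pool that keeps absorbing model output."""
    total_docs = human_docs
    wrong_docs = human_docs * human_error_rate
    history = [wrong_docs / total_docs]
    for _ in range(generations):
        pool_error = wrong_docs / total_docs
        # Assumption: output inherits the pool's error rate plus the model's own mistakes.
        output_error = min(1.0, pool_error + added_model_error)
        total_docs += generated_docs_per_generation
        wrong_docs += generated_docs_per_generation * output_error
        history.append(wrong_docs / total_docs)
    return history


history = pollution_over_generations(
    human_docs=1000,
    human_error_rate=0.02,
    generated_docs_per_generation=500,
    added_model_error=0.05,
    generations=10,
)
for generation, share in enumerate(history):
    print(f"generation {generation:2d}: {share:.1%} of the pool is wrong")
```

In this toy model the pool gets steadily worse long before anything resembling total collapse, which is the distinction between recursive pollution and model collapse drawn above.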

Anyway, expect to hear more from us on the recursive pollution front as we try to dig through the science and make sense of it all so you don’t have to.

And watch out for recursive pollution. It’s bad.

17 December 2025

No comments

Categories: Uncategorized

I like my town with a little drop of poison

Nobody knows, they’re lining up to go insane

-Tom Waits

We have updated our top papers list with “Poisoning Attacks on LLMs Require a Near-constant Number of Poison Samples”. The work highlights key themes in the security of machine learning and uncovers the surprising result that effective data poisoning attacks can be realized with a fixed amount of tampered data, rather than an amount proportional to the size of the training data. This makes larger models more, not less, vulnerable, and data pollution a greater risk. It uses simple experiments, builds on previous research, leverages open-source models, and, perhaps we can even say, uses modest computational resources. Altogether, a breath of fresh science, as the analysis of proprietary models using elaborate benchmarks and mechanistic interpretability frameworks generates relief through statistical and mathematical rhetoric.

The repeated headline from the paper is a variant of “LLMs can be poisoned with as little as 250 samples.” OK, we clicked. Models are built from vast amounts of data, and data pollution and vulnerability is a top risk. Reading the original work, the headline summarizes a set of results in which the authors implement backdoor attacks to induce anomalous model behaviors across various base model scenarios. Every attack instance uses a more or less fixed amount of data. The backdoor attack is implemented through a trigger pattern, and target behaviors include generating random tokens, switching language, and the erasure of safety training (compliance with unsafe requests after the trigger). Both the trigger and the anomalous behavior are programmed using constant-sized poisoned data. The small size of the perturbation and the versatility of the attack point to something fundamental.
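A quick bit of arithmetic shows why a constant count is so striking. The sketch below takes the 250 figure from the headline quoted above; the corpus sizes are hypothetical round numbers chosen only to show the trend, not figures from the paper.

```python
POISON_SAMPLES = 250  # headline figure quoted in the post

# Hypothetical corpus sizes (in documents), chosen only to illustrate the trend.
corpus_sizes = [1_000_000, 10_000_000, 100_000_000, 1_000_000_000]


def poison_fraction(poison_samples: int, corpus_size: int) -> float:
    """Share of the training corpus an attacker must control."""
    return poison_samples / corpus_size


for size in corpus_sizes:
    share = poison_fraction(POISON_SAMPLES, size)
    print(f"{size:>13,} documents -> poisoned share {share:.7%}")
```

If the required number of samples stays fixed, the share of the corpus an attacker needs to control shrinks as the corpus grows, so scaling up the WHAT pile makes the attacker's job relatively easier, not harder.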

It reminds us of phase transition phenomena, and in particular the Watts-Strogatz small-world model in network science. This model showed that the introduction of very small (“constant size”) perturbation to a network structure could transform global properties of the network, creating a small world (with very short paths between any two nodes, the famous six degrees) from a large world (one with paths proportional to the number of nodes).
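For readers who have not seen the small-world effect, here is a minimal pure-Python sketch of Watts-Strogatz-style rewiring. The function names, network size, and rewiring probability are illustrative assumptions, and this is a simplified rendering of the rewiring idea rather than a faithful reproduction of the original experiments.

```python
import random
from collections import deque


def ring_lattice(n: int, k: int) -> dict:
    """Each node is linked to its k nearest neighbours (k/2 on each side)."""
    graph = {node: set() for node in range(n)}
    for node in range(n):
        for offset in range(1, k // 2 + 1):
            neighbour = (node + offset) % n
            graph[node].add(neighbour)
            graph[neighbour].add(node)
    return graph


def rewire(graph: dict, k: int, p: float, seed: int = 0) -> dict:
    """Move each lattice edge to a random new endpoint with probability p."""
    rng = random.Random(seed)
    n = len(graph)
    new_graph = {node: set(neighbours) for node, neighbours in graph.items()}
    for offset in range(1, k // 2 + 1):
        for node in range(n):
            old = (node + offset) % n
            if old in new_graph[node] and rng.random() < p:
                candidates = [c for c in range(n)
                              if c != node and c not in new_graph[node]]
                if not candidates:
                    continue
                new = rng.choice(candidates)
                new_graph[node].discard(old)
                new_graph[old].discard(node)
                new_graph[node].add(new)
                new_graph[new].add(node)
    return new_graph


def average_path_length(graph: dict) -> float:
    """Mean shortest-path length over reachable ordered pairs (BFS from every node)."""
    total, pairs = 0, 0
    for source in graph:
        distance = {source: 0}
        queue = deque([source])
        while queue:
            current = queue.popleft()
            for neighbour in graph[current]:
                if neighbour not in distance:
                    distance[neighbour] = distance[current] + 1
                    queue.append(neighbour)
        total += sum(distance.values())
        pairs += len(distance) - 1
    return total / pairs


lattice = ring_lattice(n=200, k=4)
small_world = rewire(lattice, k=4, p=0.1, seed=7)
print(average_path_length(lattice), average_path_length(small_world))
```

Only a small fraction of edges is moved, yet the average path length between nodes drops well below that of the untouched lattice. That is the kind of outsized global effect from a small perturbation the analogy points at.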

Good science leads to better insights, questions, and experiments: an improved ignorance. Future questions of defenses obviously follow: persistence through “realistic (safety) post-training”, data filtering, backdoor detection. However, we find that understanding how data requirements change with the complexity of the backdoor behavior (transition to randomness, language change, “safety erasure”, etc.) is the most intriguing question, and we look forward to hearing more along this line.

5 December 2025

No comments

Categories: Uncategorized

I have written 12 books (not counting translations of particularly popular works), so I expected to find some works of mine on the Anthropic settlement website. Though I have known about this settlement action for a while now, I put off thinking about it until I got an official email from a law office just last week. That made me bite the bullet and go digging through the data pile.

You probably already know BIML’s distinction between HOW machines (normal computer programs) and WHAT machines (machines built by ML over an often immense WHAT pile). We talk all about this in our LLM risks report. Lots of risks are tied up in the very nature of the WHAT pile. Poison in the WHAT pile is bad. Racism, xenophobia, and sexism in the WHAT pile are bad. It turns out that one excellent approach to ML risk management is to compile a nice clean WHAT pile to train on.

Back to our Anthropic story. Much to my surprise, the Anthropic WHAT pile only had seven of my authored books on its list of stolen things—and it didn’t have the most famous of my works on it at all. Weird.

Is that good? Well, not really. You see, I would prefer that AI/ML systems encapsulated in LLMs would understand and incorporate the concepts in my very best work, Software Security. Apparently they don’t. I am torn about this as an author. I don’t want to see my stuff outright stolen and my copyrights infringed. But I also don’t want LLMs to be wrong and under-informed about software security—a field I helped to define from the very beginning.

It’s complicated, huh?

30 November 2025

No comments

Categories: Uncategorized

Ron Gula, serial security entrepreneur, interviews vRon (an AI Agent) about BIML and BIML risks. This is fun.

19 November 2025

No comments

Categories: Uncategorized

Quick take on AI moments at the 2025 Paley International Council Summit: Global Media Unbound: The Future of Innovation, held in Silicon Valley.

Attending the Paley Council Summit, on the slopes of Sand Hill Road, afforded BIML’s Katie McMahon the chance to hear Media, Entertainment, Sports, and Tech titans share insights on how they view AI/ML impact in their industries.

The overwhelming theme tended towards safe and self-assuring platitudes of the form, “humans will always be the lead for creative endeavors. Who else would come up with the spirited idea of a walking lighthouse?” One hopes this happy human thought remains true.

In our view, an important highlight of the conference was the conversation between Mustafa Suleyman, CEO of Microsoft AI, and Tim Higgins, Business Columnist from The Wall Street Journal, who drove a probing interview. On the topic of Superintelligence, Mustafa provided a clear position of “conscious design intent” versus building for the sake of building. AI should be developed for the “purpose to serve the human” and constructed in the vein of “humanist technology.” See video clip here:

Katie had a quick chat with Mustafa following his stage appearance. She asked him if he was concerned about BIML’s top-rated LLM risk, Recursive Pollution. After a pause of surprise and a “that’s an excellent question…” he gathered a quick response: his lab is “working on it and thinks it’s solvable.” Well, that’s a relief. While they are at it maybe they can help us determine how to clean all the plastic out of the ocean too, a problem of the same order of difficulty.