BIML is proud to host Patrick McDaniel, an OG of machine learning security (prominently featured in the BIML TOP 5) and a Dean of Research at Wisconsin, for a visit to the BIML Barn. Patrick arrived in Berryville late on Thursday and was greeted with a Liberal or two on the porch. We stayed up way too late talking about AI and security.
In the morning after breakfast, we spent much of the Friday research discussion going over our soon-to-be-released paper No Security Meter for AI. Patrick has been thinking about measuring ML behavior for a long time and was an early proponent of a whitebox approach. He had lots of very useful feedback for us.
Does science really get done around the kitchen table? Why yes. Yes it does. (And technical talks really get delivered in the BIML Barn.)
We ventured into greater metropolitan Berryville for lunch and coffee.
And then Patrick delivered a new talk as a BIML in the Barn feature to be released on May 13th. Patrick's talk really surprised us, and in very important philosophical ways.
After the talk we shared a cocktail on the patio. Maybelline is an honorary BIML dog.
Gary McGraw, cofounder of the Berryville Institute of Machine Learning, pointed to a core gap: Today’s benchmarks tend to measure how well AI systems can perform security tasks—not how secure the systems themselves are. Companies need to keep that distinction in mind when evaluating their tools and defenses.
McGraw warned as far back as 2019 that securing machine learning systems would be “one of the defining cybersecurity struggles of the next decade.” That moment has now arrived.
“These meetings are a way to remind ourselves of the fundamentals,” he said, “as we try to define what machine learning security actually is.”
What was to be a fairly standard version of the BIML risk talk was instead transformed into a debut of BIML's forthcoming paper No Security Meter for AI (expected mid-May) for an audience of NIST computer scientists.
It’s always fun to debut a talk for an audience that is engaged and knowledgeable.
While we were inside the very industrial Chemistry building for a talk that was 80% Zoom, it rained outside.
Booting MOSAIC: Multi-Organization Security and AI Coalition
Well, maybe. (McGraw proposed the name, which is still being vetted.) We all got together in Arlington on 4.21.26 to discuss policy and AI. It was a good meeting set up by OWASP and SANS and run very professionally by Rob van der Veer.
The cool thing? BIML’s work was not only cited, but included.
The meeting setting was gorgeous.
As usual, the hallway track was the best part of the entire day…especially when the hallway was moved across the street to the bar.
Sounil Yu from Knostic and his son (a security analyst at Salesforce). Sounil discussed BIML’s measurement paper with McGraw.
Apparently, AI was used to steal original art from an aspiring musician. This is something that eminent philosopher Dan Dennett warned us about in 2022 (see the video link at the bottom of the BIML Bibliography). Though we mostly focus on technical risks built into ML, the risk of deep fakes is a very real one that deserves more attention.
Too Dangerous to Release (Again): Software Security and AI
Have you heard? The Mythos model from Anthropic is so dangerously good at finding software vulnerabilities that its release must be initially limited to companies participating in the Glasswing software security project! (Oh my. Also lions and tigers and bears!)
Does that sound like a marketing ploy to you? Because it does to most of the expert bug finders I know. In fact, the software exploit community (some of whom make a very good living selling bugs to the very companies that produced them…LOL) is pretty evenly split on this issue. So what is a grownup to think?
Those of us who have been around the block a few times in AI-land remember way back when GPT-2 was too dangerous to release too (because it could generate fake news even faster than a political PR flack). That garnered some press and helped with the launch for sure. Well, it's happening again…just look at the tech headlines! Go, Anthropic, go!
Fortunately, there is some balanced coverage out there adopting a thoughtful approach (thanks, Cade). Here’s what we think:
We still have a very real software security problem, so ANYTHING that helps people find AND FIX bugs in code is good. Everyone who is serious about software vulnerability has been using Agentic AI to do this better. You should too. Want to get started using AI to find bugs? Hold your nose (because LinkedIn) and check out this link. But please also figure out how to FIX the bugs you find. And don’t expect to be paid for slop.
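If you want a feel for what getting started actually looks like, here is a minimal sketch of asking an LLM to review a snippet for vulnerabilities. To be clear about assumptions: this uses the OpenAI Python client, the model name, prompt, and toy snippet are our own illustrative choices, and none of this is a recommendation of any particular vendor or tool.

```python
# Minimal sketch: asking an LLM to review a code snippet for security bugs.
# Assumes the OpenAI Python client is installed and OPENAI_API_KEY is set.
# The model name and prompt wording are illustrative choices, nothing more.
from openai import OpenAI

SNIPPET = '''
def lookup(db, user_id):
    # classic SQL injection: user input concatenated into the query
    return db.execute("SELECT * FROM users WHERE id = " + user_id)
'''

client = OpenAI()
response = client.chat.completions.create(
    model="gpt-4o",  # hypothetical choice; use whatever model you have access to
    messages=[
        {"role": "system",
         "content": "You are a code reviewer. Report likely security bugs, "
                    "one per line, each with a suggested fix."},
        {"role": "user", "content": SNIPPET},
    ],
)
print(response.choices[0].message.content)
```

Note that the prompt asks for a suggested fix with every finding. Treat the output as a lead to verify and patch, not as a finished result, which is exactly the FIX discipline we are on about above.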
LLMs really are good at helping find easy vulnerabilities, but expert mode requires human experience and expertise. Will you become Halvar Flake by strapping on Mythos? No, you will not.
Building exploits that really work is much harder than just finding bugs. In fact, I wrote a whole book about this in 2004, 22 years ago, and it is still true. Patching is also harder than finding vulnerabilities. Hopefully AI will help with both of these software security activities.
AI tools are all helpful in different ways. Use them all. Use the ones that are already released. (We hear tell that a well-prompted Opus-4.6 (82%) does nearly as well as Mythos (84%) on CRSBench…which calls into question just what the hell these benchmarks measure, a topic we have been thinking about a bunch.)
As a last thought, we're going to appeal to the four I's that excellent human designers are familiar with: Intuition, Insight, and Inspiration (the fourth one is the "self" kind of I). AI is great and we love it. But we are really going to need lots more software architects, information architects, designers, actual building architects, and humans who know what they are doing. If you know what you're doing, you'll be fine. If you are simply a bullshitter, you're toast.
Imagine that you are trying to practice good security engineering at the system level when one of your essential components is an unpredictable black box that sometimes does the wrong thing. How do you ensure or even measure the trustworthiness of that system? That seems to be the current situation we are in with LLMs and Agentic AI.
One of the levers we are exploring is observability INSIDE the black box. So, in the case of an LLM, that would be trying to figure out what is going on inside the Transformer. Are there circuits in the trained model that correlate with and define certain behaviors? Are there concepts in there? Can we make use of various activation patterns (and weights) or otherwise guide them from inside the network? Are there indicators of bad behavior? Can we see the "guidelines" imposed by alignment training? Are they robust? Etc.
This is what we call (for the moment anyway) “Whitebox Interpositioning” at BIML. It’s like watching your brain (and interposing inside it) while you are acting as part of a system. Maybe we can build an “Intention-ometer” or maybe not. But we are certainly moving toward “WHYness” in a WHAT machine.
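As a concrete (and heavily simplified) illustration of what looking inside the box means in practice, here is a sketch that registers forward hooks on a small public transformer and records per-layer activations. Circuit analysis, concept probes, and steering all build on exactly this kind of access. Assumptions: torch and transformers are installed, and "gpt2" is just a convenient stand-in model, not one of the systems we actually study.

```python
# Sketch: recording per-layer activations from a small transformer.
# "gpt2" is a stand-in; any Hugging Face causal LM with accessible
# blocks would work the same way.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

activations = {}

def make_hook(name):
    def hook(module, inputs, output):
        # GPT-2 blocks return a tuple; the hidden states come first
        hidden = output[0] if isinstance(output, tuple) else output
        activations[name] = hidden.detach()
    return hook

# Register a forward hook on every transformer block.
for i, block in enumerate(model.transformer.h):
    block.register_forward_hook(make_hook(f"block_{i}"))

with torch.no_grad():
    model(**tok("The quick brown fox", return_tensors="pt"))

for name, act in activations.items():
    print(name, tuple(act.shape))  # e.g. block_0 (1, 4, 768)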
This all reminds us of what happened in software security when we moved from black box monitoring and sandboxing to whitebox code analysis (static and dynamic both). Thing is, we never really got a handle on architecture, especially when it came to security…
Plenty of work to do on the raw science front…and something we want to create a coalition to approach. Toward that end, BIML recently hosted a whitebox summit with Realm Labs and Starseer. We were joined by Paul Kocher. Expect something to come of this.
Let's face it, beige has a bad name. Maybe it was the omnipresent Dockers khakis of middle management 20 years ago, or maybe it was that particular shade of beige approved by the HOA; perhaps it was that "non-Presidential" tan suit that made President Obama look so dapper, or maybe beige is just the vanilla of colors. Then again, according to cosmologists the color of the universe itself is beige.
So when it comes to AI what, exactly, is “beigification,” and is it good or bad? Like most things, it depends on who is doing the asking.
We use the term “beigification” at BIML to signify what happens when all of the textual knowledge that humans have managed to write down and digitize gets turned into an enormous training set for LLMs. Wait. Isn’t it good to have all of the stuff wired up in one place with a nice language-based interface to chat with? Well kinda. The world training set is chock full of pollution, poison, and lots of terrible ideas…just like humanity. That is, nobody went through and cleaned out the bad stuff (not that that is even possible). So we have that to deal with. There is also lots of clueless wrongness in there, leading some people to claim that LLMs provide “mansplaining as a service.” But on average, the training set is filled with lots and lots and lots of boring everyday stuff. In fact, it’s kind of beige.
The problem is this. Your average human probably resonates with the middle of the Bell curve, because that's precisely where most humans exist (by definition). But scientists, experts, and academics specializing in pinhead angel counting all exist at the edge of the Bell curve. Just for the record, crackpots, conspiracy theorists, and political morons exist outside the middle too, just at the other end.
So, will LLMs fail to work economically? Will the AI bubble burst like so much Middle Eastern oil? We don't think so. There are too many humans in the middle who love how they sound to themselves. LLMs are here to stay in all of their boring beige glory.
Anyway, that’s what “beigification” means to us. Feel free to steal our work.
P.S. Also see Don’t Call It ‘Intelligence’ which seems to express a similar idea, only viewed from one of the edges.
I hosted the Silver Bullet Security Podcast for 13.5 years from 2006 to 2018. For each of the 153 episodes that meant: choosing the guest, getting help from research assistants (at IEEE S&P magazine) to gather background, digesting the background, writing a script (of 9 or so questions), recording the podcast in our studio at Cigital, and finally helping with “launch.” Of all of these activities, the interview itself was by far the easiest.
Know why Silver Bullet was so good with such in-depth questions? Because the script writing took 4-5 hours per episode (not counting the background research…which was often much more involved than just googling the person). All this for a 20-minute show.
We are rebooting Silver Bullet after a few years off with a new focus on Machine Learning security. Our first guest will be Gadi Evron. We've redesigned the logo, built an initial distribution list, created a landing zone with proper feeds to the usual channels, and yes…written a script. But this time I decided to use Gemini as my research assistant. TL;DR: it was great.
I started with a bunch of ideas in an amorphous blob. This got me thinking about show story arc, coverage of various aspects of MLsec, etc. Here is what my notes looked like.
Then it was time to invoke Gemini. Fortunately, Gemini knows lots about me and about Silver Bullet. Eerily so. It knew where the archive was and was able to glean a meta-pattern for the show with some insight into its philosophy. Was it absolutely spot on? Nope. Was it sycophantic and overly agreeable? Yes. But hey, the show's creator is here driving the laser pointer (which, like a good cat, Gemini was happy to chase).
I worked through the script in order with Gemini for about an hour, during which I was impressed with its up-to-date (like yesterday) access to things happening in the world…like on this very website. For example, Gemini knew that Gadi had just visited BIML and that [un]prompted was something we had worked on together. It was very helpful, sometimes wrong, often using the wrong words…but, question by question, the show arc emerged. It kept track of where we were, sometimes suggesting new directions (which I rejected every time), but always keeping its place in the work. After the session, I asked it to dump the script to one place for copy/paste and then did a fine-tuning edit pass (including real fact checking on a couple of things).
All told, my bet is that Gemini cut the usual prep work by a factor of three or four. Will the show be just as good? Obviously, the proof is in the pudding. We will be launching the first episode on March 2nd.
Here’s how it will all start…
Silver Bullet Intro (BIML Focused)
[MUSIC: Classic Silver Bullet Theme – Up and Under]
gem: Welcome to the Silver Bullet Security Podcast episode 154. I’m your host, Gary McGraw, coming to you from the Berryville Institute of Machine Learning where we are defining the future of machine learning security.
From 2006-2018, Silver Bullet explored the nascent field of software security through the lens of building security in. But today, the frontier has moved. As we integrate machine learning into the fabric of our essential systems, we find ourselves facing a new set of architectural flaws and security challenges that traditional software security can’t touch.
On Silver Bullet, we’re shifting our focus to the security of machine learning—bringing the same deep-dive, “no-silver-bullet” philosophy to the world of AI.
To help me kick off this new era, I’m joined by my new friend Gadi Evron. Gadi is a veteran of the botnet wars, a community builder, and the chair of the new [un]prompted conference. Gadi, welcome to the show.
[MUSIC: Swells briefly then fades out]
1. The [un]prompted Vision
Gadi, you’re chairing the [un]prompted conference, and I’m really pleased to be working on the committee with you. We’ve both seen the security conference circuit evolve over the decades, but [un]prompted feels like it’s trying to capture lightning in a bottle for the ML security space. What was it about the current state of AI security that made you feel we needed a dedicated, practitioner-first venue—something beyond just another “AI track” at a traditional security show?