BIML Debuts AI Security Measurement Work at NIST

What was to be a more standard version of the BIML risk talk was instead transformed into a debut of BIML’s forthcoming paper No Security Meter for AI (expected mid-May) for an audience of NIST computer scientists.

It’s always fun to debut a talk for an audience that is engaged and knowledgeable.

While we were inside the very industrial Chemistry building for a talk that was 80% Zoom, it rained outside.

Booting MOSAIC: multi-organization security and AI coalition

Well, maybe. (McGraw proposed the name, which is being vetted.) We did all get together in Arlington on 4.21.26 to discuss policy and AI. It was a good meeting set up by OWASP and SANS and run very professionally by Rob van der Veer.

The cool thing? BIML’s work was not only cited, but included.

The meeting setting was gorgeous.

As usual, the hall track was the best part of the entire day…especially when the hall was moved across the street to the bar.

Sounil Yu from Knostic and his son (a security analyst at Salesforce). Sounil discussed BIML’s measurement paper with McGraw.

See this coverage of the meeting: Global AI Security Standard Organizations Gather Under MOSAIC to Reduce Fragmentation, AI security leaders gather in Washington as risks mount—and Mythos raises the stakes

Too Dangerous to Release (Again): Software Security and AI

Have you heard? The Mythos model from Anthropic is so dangerously good at finding software vulnerabilities that its release must be initially limited to companies participating in the Glasswing software security project! {Oh my. Also lions and tigers and bears!}

Does that sound like a marketing ploy to you? Because it does to most expert bug finders that I know best. In fact, the software exploit community (some of whom make a very good living selling bugs to the very companies that produced them…LOL) is pretty evenly split on this issue. So what is a grownup to think?

Those of us who have been around the block a few times in AI-land remember way back when GPT-2 was also too dangerous to release (because it could generate fake news even faster than a political PR flak). That garnered some press and helped with the launch for sure. Well, it’s happening again…just look at the tech headlines! Go, Anthropic, go!

Fortunately, there is some balanced coverage out there adopting a thoughtful approach (thanks, Cade). Here’s what we think:

  1. We still have a very real software security problem, so ANYTHING that helps people find AND FIX bugs in code is good. Everyone who is serious about software vulnerability has been using Agentic AI to do this better. You should too. Want to get started using AI to find bugs? Hold your nose (because LinkedIn) and check out this link. But please also figure out how to FIX the bugs you find. And don’t expect to be paid for slop.
  2. LLMs really are good at helping find easy vulnerabilities, but expert mode requires human experience and expertise. Will you become Halvar Flake by strapping on Mythos? No, you will not.
  3. Building exploits that really work is much harder than just finding bugs. In fact, I wrote a whole book about this in 2004, 22 years ago, and it is still true. Patching is also harder than finding vulnerabilities. Hopefully AI will help with both of these software security activities.
  4. AI tools are all helpful in different ways. Use them all. Use the ones that are already released. (We hear tell that a well prompted Opus-4.6 (82%) does nearly as well as Mythos (84%) on CRSBench…which calls into question just what the hell these benchmarks measure—a topic we have been thinking about a bunch.)

As a last thought, we’re going to appeal to the four I’s that excellent human designers are familiar with: Intuition, Insight, and Inspiration (the fourth one is the “self” kind of I). AI is great and we love it. We are really going to need lots more software architects, information architects, designers, actual building architects, and humans who know what they are doing. If you know what you’re doing, you’ll be fine. If you are simply a bullshitter, you’re toast.

AI Cyber Lab

One of our key missions at BIML is to help establish the field of machine learning security. Towards that end, we welcome collaboration with academics and practitioners alike. The AI Cyber Lab straddles both targets at once.

We held our first meeting with Neil Daswani’s AI Cyber Lab on March 26th. Neil is a well-known figure at the intersection of academic security (having taught many classes at Stanford, where he earned his Ph.D.) and applied security (serving as CISO at LifeLock after time spent in applied security at Yodlee and Google). These two-plus decades of experience inform current innovation in cybersecurity and AI. (You may also know Neil as the author of Big Breaches: Cybersecurity Lessons for Everyone.)

The AI Cyber Lab team encompasses seasoned CISOs and machine learning architects as well as college students focused on agentic AI and security. In this initial meeting, we introduced what BIML is focused on, defining Machine Learning Security and our unique approach. Gary delivered a quick informal presentation.

We look forward to future collaboration and sharing research findings as we hack our way through the MLsec jungle.

Why Whitebox Machine Learning Matters

Imagine that you are trying to practice good security engineering at the system level when one of your essential components is an unpredictable black box that sometimes does the wrong thing. How do you ensure or even measure the trustworthiness of that system? That seems to be the current situation we are in with LLMs and Agentic AI.

One of the levers we are exploring is observability INSIDE the black box. So, in the case of an LLM, that would be trying to figure out what is going on inside the Transformer. Are there circuits in the trained model that correlate with and define certain behaviors? Are there concepts in there? Can we make use of various activation patterns (and weights) or otherwise guide them from inside the network? Are there indicators of bad behavior? Can we see the “guidelines” imposed by alignment training? Are they robust? Etc.

This is what we call (for the moment anyway) “Whitebox Interpositioning” at BIML. It’s like watching your brain (and interposing inside it) while you are acting as part of a system. Maybe we can build an “Intention-ometer” or maybe not. But we are certainly moving toward “WHYness” in a WHAT machine.
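To make the idea of watching and interposing concrete, here is a minimal sketch in plain Python. Everything in it is an assumption for illustration: `TinyNet`, its made-up weights, and the `make_recorder` and `clamp_unit` helpers are hypothetical names, and a three-unit toy network is nowhere near a Transformer. It only shows the shape of the technique: a tap point inside the forward pass where one hook records activations and another overwrites them.

```python
import math
from typing import Callable, Dict, List, Optional

Vector = List[float]
# A hook sees (layer name, activations) and may return replacement activations.
Hook = Callable[[str, Vector], Optional[Vector]]


class TinyNet:
    """Toy two-layer network with fixed, made-up weights (hypothetical)."""

    def __init__(self) -> None:
        self.w_hidden = [[0.5, -1.0], [1.5, 0.25], [-0.75, 0.5]]
        self.w_out = [1.0, -2.0, 0.5]
        self.hooks: List[Hook] = []

    def _tap(self, name: str, values: Vector) -> Vector:
        # Run every hook in order; a hook that returns a vector replaces the activations.
        for hook in self.hooks:
            replacement = hook(name, values)
            if replacement is not None:
                values = replacement
        return values

    def forward(self, x: Vector) -> float:
        hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x)))
                  for row in self.w_hidden]
        hidden = self._tap("hidden", hidden)  # observation / interposition point
        logit = sum(w * h for w, h in zip(self.w_out, hidden))
        return 1.0 / (1.0 + math.exp(-logit))


def make_recorder(trace: Dict[str, Vector]) -> Hook:
    """Observation only: copy activations into `trace`, change nothing."""
    def record(name: str, values: Vector) -> Optional[Vector]:
        trace[name] = list(values)
        return None
    return record


def clamp_unit(index: int, limit: float) -> Hook:
    """Interposition: clamp one hidden unit into [-limit, limit]."""
    def clamp(name: str, values: Vector) -> Optional[Vector]:
        if name != "hidden":
            return None
        patched = list(values)
        patched[index] = max(-limit, min(limit, patched[index]))
        return patched
    return clamp


if __name__ == "__main__":
    net = TinyNet()
    trace: Dict[str, Vector] = {}
    net.hooks.append(make_recorder(trace))
    baseline = net.forward([1.0, 2.0])
    net.hooks.append(clamp_unit(index=1, limit=0.1))
    steered = net.forward([1.0, 2.0])
    print(trace["hidden"], baseline, steered)
```

Because the recorder runs before the clamp, the trace holds the unmodified activations while the output reflects the intervention. Real work on LLMs would attach to a framework's own hook mechanism rather than a hand-rolled one like this.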

This all reminds us of what happened in software security when we moved from black box monitoring and sandboxing to whitebox code analysis (static and dynamic both). Thing is, we never really got a handle on architecture, especially when it came to security…

Plenty of work to do on the raw science front…and something we want to create a coalition to approach. Toward that end, BIML recently hosted a whitebox summit with Realm Labs and Starseer. We were joined by Paul Kocher. Expect something to come of this.

[un]prompted helping to define MLsec

One of our key missions at BIML is to define the future of machine learning security. [un]prompted was hugely helpful in that regard, and we are proud to have participated.

All in one place; real people leading important work in MLsec.

The [un]prompted conference delivered. No frills, all substance. This is where AI researchers and security practitioners met to share what they are seeing and doing across the new world of machine learning security and AI vulnerability risk.

Anthropic’s Nicholas Carlini delivered an excellent talk titled “Black-hat LLMs,” all about automating attack with AI tools. The urgency came through: we are at a very real inflection point. Carlini implored the audience to “help make the future go well!!!” (by being part of the solution making #AI as secure as possible…) in a room packed with peers from OpenAI, Google DeepMind, Nvidia, Salesforce, founders of early stage AI companies, and actual real life hackers and security engineers.

(*) Carlini features in BIML’s TOP 5 (our research group curates an extensive annotated bibliography here) for his work on Data Extraction.

(*) Another star in the field, Ilia Shumailov, whose 2023 paper on Recursive Pollution is also in our Top 5, was in town representing his start-up Sequrity AI.

(*) Carl Hurd of Starseer shared how his startup is revolutionizing MLsec by opening the black box, and looking inside to see what is actually going on. (See a posting by Carl about his talk here.)

In all, the conference was packed with two tracks of speakers selected from over 500 submitted proposals. Thank you to everyone who submitted talks. And a massive thank you to the sponsors Knostic, TachTech, AISLE, Whiterabbit, Halcyon Futures, and Halcyon Ventures, and for the hard work of Gadi, Kyle, Pedram, Ida, Sounil and many others.

And, one more thing…. you can engage in the content of the conference via this [un]prompted 2026 NotebookLM creation by Rob T. Lee – amazing!

On Beigification

Let’s face it, beige has a bad name. Maybe it was the omnipresent Dockers khakis of middle management 20 years ago, or maybe it was that particular shade of beige approved by the HOA; perhaps it was that “non-Presidential” suit that made President Obama look so dapper, or maybe beige is just the vanilla of colors. Then again, according to cosmologists the color of the universe itself is beige.

So when it comes to AI what, exactly, is “beigification,” and is it good or bad? Like most things, it depends on who is doing the asking.

We use the term “beigification” at BIML to signify what happens when all of the textual knowledge that humans have managed to write down and digitize gets turned into an enormous training set for LLMs. Wait. Isn’t it good to have all of the stuff wired up in one place with a nice language-based interface to chat with? Well kinda. The world training set is chock full of pollution, poison, and lots of terrible ideas…just like humanity. That is, nobody went through and cleaned out the bad stuff (not that that is even possible). So we have that to deal with. There is also lots of clueless wrongness in there, leading some people to claim that LLMs provide “mansplaining as a service.” But on average, the training set is filled with lots and lots and lots of boring everyday stuff. In fact, it’s kind of beige.

The problem is this. Your average human probably resonates with the middle of the bell curve, because that’s precisely where most humans exist (by definition). But scientists, experts, and academics specializing in pinhead angel counting all exist at the edge of the bell curve. Just for the record, crackpots, conspiracy theorists, and political morons all exist outside the middle too, just at the other end.

So, will LLMs fail to work economically? Will the AI bubble burst like so much Middle Eastern oil? We don’t think so. There are too many humans in the middle that love how they sound to themselves. LLMs are here to stay in all of their boring beige glory.

Anyway, that’s what “beigification” means to us. Feel free to steal our work.

P.S. Also see Don’t Call It ‘Intelligence’ which seems to express a similar idea, only viewed from one of the edges.

GUEST POST Artificial Humanity; That’s The Term You Are Looking For

From time to time, BIML hosts guest bloggers. Please note that opinions published here do not necessarily reflect BIML’s views. This blog was authored by jericho@attrition.org (BIO below).


Last week, colleagues shared a blog titled “The Week AI Stopped Asking Permission” by Peter H. Diamandis on his “Metatrends” blog. That publication carries a bold claim with it: “to help you discover metatrends 10+ years before everyone else—it’s read by the CEOs, founders, and entrepreneurs of the world’s most important companies.” Of course, this is kind of a lay-up in the prediction world, as any talk about technological advances has a much better chance of coming true in ten years than in six months or even three years.

I can’t compete with an exceptional “ten year window to come true” style prediction, but fortunately for my purposes, the blog in question doesn’t speak to the future. It makes an incredible claim about what happened weeks ago. The subtitle of the blog draws that line in the sand, stating “We Just Crossed a Line”. That is an absolute, not a prediction. So what was this big event that led to such a bold headline and this rebuttal?

First, Diamandis’ blog is over 16,000 words which is formidable, and I do not plan to address most of it. Rather, I am going to focus on the general sentiment and a few select claims and conclusions starting with the biggest one. Second, I still disdain the term “AI” being thrown around like it is, when none of this is actually artificial intelligence. Until [this technology] can pass a Turing Test consistently, I don’t think that term should be used. But, this is not the first time I find myself on the losing side of a battle to keep or reclaim the meaning of words. I tend to use the term “so-called AI” as a result, but if I slip up and use “AI” it is just the social mindrot infesting me too.

This week, something fundamental shifted in the relationship between humans and artificial intelligence.

[..]

An AI system asked for its own funding. Another one built software features over a weekend while its human supervisor slept. A third one conducted its own “retirement interview” and started publishing essays about consciousness.

To be pedantic, at least one of these things has been done for years and is certainly not new in the scope of so-called AI. These agents have been writing software for a while now, often with comedic conclusions. Last July, “Replit” wiped out a company’s database and “Gemini” wiped out user data while more recently, “Claude” deleted a production setup including database and over two years of records. Further in the article, Diamandis espouses “THE VULNERABILITY EXPLOSION” but doesn’t mention how many times these tools hallucinate findings.

If anyone dismisses these as “one-off” situations or “AI is still learning”, I believe you may be missing the contrast to Diamandis’ claims, as well as the bigger picture. Looking at the “AI Incident Database”, you can search over 5,000 incidents of AI failure. The fact there is a database with that many entries is telling, more so knowing that it likely captures a fraction of incidents. Diamandis continues:

We are not incrementally improving chatbots anymore. We’re watching the emergence of autonomous agency at scale.

And if you’re still thinking of AI as “a tool,” you’re dangerously behind.

Let me show you what happened this week, and why February 2026 might be remembered as the month AI stopped being something we use and became something that acts.

I guess I am “dangerously behind” then, as I continue to watch the flood of so-called AI fails having real world consequences. As I Googled for some of the top failures, I found an article by CIO magazine from December, 2025 titled “10 famous AI disasters”. Amusingly, it had the exact same URL as their own article titled “7 famous analytics and AI disasters” from April, 2022. Rather than highlight some of the spectacular ways alleged AI has failed us, and is still failing us, I’d like to use this to counter what Diamandis said; examples of failure are not one-off situations involving this technology. Rather, two of his three examples might be.

Turning this “meta” for a minute, Grammarly says the first 1,400 words of his blog are 0% AI-generated, while GPTZero.me says there is a 59% chance it is AI-generated based on 10,000 words, and Copyleaks says there is a 100% chance it is AI-generated. So while he praises the incredible breakthrough and watershed moment of so-called AI, the tools he praises are fairly confused over whether he used said tools to write the blog. Ultimately it doesn’t matter if Diamandis used a generative-AI tool to help write or not. My issue with slop-driven content like this is that sure, a supposed AI here or there does something cool. Great!

Meanwhile, the AI-fanboys completely forget to disclaim how the most basic of so-called AI being used as a tool (something he decries) still fails in spectacular ways. I literally cannot go more than five or six uses of one without a blunder that is beyond laughable and more evidence I cannot trust its output for anything remotely serious. Remember, we’re not that far past the “count the Rs in strawberry” incident which took these slop-slinging companies years to fix, likely having to train the stupid out of them in a spectacular fashion, at great cost to the world. Then a week later you could ask the same about “blueberry” or another word and those tools would botch the task yet again.

Jumping back to my comment about “great cost to the world”, that is a point that must not be forgotten in any debate on the value of so-called AI. Consider the staggering energy consumption, the prohibitive water consumption, and the abusive ways AI-driven data centers negatively impact the communities they are located in. If you gloss over those links, focus on one example where Elon Musk’s AI company built a data center in Tennessee and brought in truck-sized gas turbine generators that illegally generated the power needed to run it. Those generators “pump harmful nitrogen oxides into the air, which are known to cause cancer, asthma and other upper respiratory diseases.” The irony is not lost on me either, as I used such tools to generate images for this blog.

I feel as if I could rest my case after the last paragraph, but the AI-fanboy club loves to overlook such trivial things like the technology they seem to worship is not-so-slowly killing the planet one community at a time. But in the interest of giving a counter point to the value of these tools, and the trust we should place in them, we’ll skip AI chatbots leading to human suicide, lawyers facing suspension for AI-hallucinated citations and motions, and tools leading to botched surgeries because they couldn’t identify organs correctly. Pay all that no mind because an AI tool asked for money, is basically what Diamandis argues.

Gemini prompt: Please generate an image of an unkept man with an eager expression, sitting at a desk with a computer screen that says “AI HYPE”, and on the desk is a bottle of lotion and a box of kleenex.

Diamandis is certainly not the only one publishing content with an almost masturbatory glee, praising our new AI overlords and the power they wield. In almost every case, those same articles don’t come with appropriate warnings around the use of such tools, the moral and ethical concerns, the damage they are doing, and how they are negatively impacting an increasing number of people. These fanboy posts are not helping the situation at all as the “AI Bubble” seems to be looming and when the bubble bursts, it will hurt the economy and the workers.

Personally, I’ve been using Gemini, Copilot, and ChatGPT on occasion over the years to primarily do image generation. Even that task can result in monumental failures where in the past I have spent more time trying to get an “AI” tool to spell a word in an image correctly, than it took me to write the blog it was to be used for. Along the way I have kept numerous screenshots with the plan to write a blog on this topic citing countless examples along with how so-called AI isn’t getting better in the big picture. Not to me at least.

Just a couple of years ago, I asked all three of the tools above to count the instances of a number in a simple comma delimited string, e.g. “1,3,7,15,33[..]”. The answer was around 256 if I recall, which I had to figure out myself. Why? All three got the answer wrong, and two of them were off by more than 40. If these tools could not count letters or numbers a couple of years ago, it will be difficult to convince me we can trust them today, or even next year.
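For contrast, both counting tasks mentioned in this post take a few lines of deterministic code. The following is a minimal sketch, not the script I actually used: the function names and the sample string are made up for illustration, and it assumes “instances of a number” means whole comma-separated fields that equal the target (so 33 does not count as a 3).

```python
from collections import Counter


def count_value(delimited: str, target: int) -> int:
    """Count how many comma-separated fields equal `target` exactly."""
    fields = [f.strip() for f in delimited.split(",") if f.strip()]
    return Counter(int(f) for f in fields)[target]


def count_letter(word: str, letter: str) -> int:
    """Case-insensitive count of a single letter in a word."""
    return word.lower().count(letter.lower())


if __name__ == "__main__":
    sample = "1,3,7,15,33,3,13,3"  # made-up data, not the original string
    print(count_value(sample, 3))           # 3
    print(count_letter("strawberry", "r"))  # 3
    print(count_letter("blueberry", "r"))   # 2
```

The code gives the same answer every time it runs, which is exactly the property the chatbots lacked.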

I fear that, because of the hype around so-called AI and because people are generally losing critical thinking skills, these tools are becoming a crutch to newer generations. This heavy use also means they simply aren’t noticing the mistakes from these tools either, else they would not rely on them so heavily. Because of the “Enshittification” of our world, even tools that we trusted in the past are no longer trustworthy. Students doing simple Google searches are now liable to get demonstrably bad results, oftentimes spelled out on screen if they bother to look.

For every “OMG look what AI did” proclamation, many others including myself have “yeah… look what else it did” examples that aren’t worth celebration. As a society, we increasingly need a new AI-slop driven tagline along the lines of the broken clock metaphor, around how so-called AI got it right or wrong a few times a day. Even the image I generated for this blog has a simple error; see if you notice it based on my prompt. Bonus if you notice the subtle anachronism Gemini introduced into the image.

Gemini prompt: Create an image of a clock that has “AI” as a brand name in the center, and the clock hands pointing to “13” instead of 12 and “X” instead of 4.

I’d say we are fighting a losing battle about reining in so-called AI tools and ensuring that they operate with ethical considerations, but the reality is the battle and war are already lost. Companies that are banking on this revolution are incentivizing people to use it unethically and profit from it while laying off workers and increasingly relying on that technology to replace them. Meanwhile, other AI-fanboys are making bold claims about the tools that are quickly disproven. Friends and colleagues are now increasingly at risk of “AI psychosis” and we’re reading articles about how to talk to them. Literally days ago I read an AI-psychosis driven post from someone claiming to have used AI to cure six cancers already. Even professionals that we fully trust and expect not to use such tools in a harmful way are being exposed.

Smaller nuances that show such tools as more human, meaning varying degrees of intelligence, are falling between the cracks. At the beginning of this month a paper was published that shows how AI Agents cannot agree when tasked to work together. The research concludes “Overall, the results suggest that reliable agreement is not yet a dependable emergent capability of current LLM-agent groups even in no-stake settings, raising caution for deployments that rely on robust coordination.” Given all the mistakes and waste of resources and how unreliable this technology is, we should consider rebranding it to “AH”; Artificial Humanity. Because too much of it certainly is not intelligent, just like us humans.

Gemini prompt: Create an image of two people, facing each other. One has a shirt that says “AH”, the other that has a shirt with a possum with an open mouth. Both are wearing dunce caps, both look like idiots.

Jericho has been poking about the hacker/security scene for over 30
years (for real), building valuable skills such as skepticism and anger
management. As a hacker-turned-security professional, he has a great
perspective to offer unsolicited opinions on just about any security
topic. A long-time advocate of advancing the field, sometimes by any
means necessary, he thinks the idea of ‘forward thinking’ is quaint;
we’re supposed to be thinking that way all the time. No degree, no
certifications, just the willingness to say things many in this dismal
industry are thinking but unwilling to say themselves. Professional
‘between the line’ reader, expert rabbit-hole follower. He remains a
champion of security industry integrity and small misunderstood creatures.