All Your LLM Are Belong to Us

We didn’t want to rain on the Davos parade, so we waited until this week to release our latest piece of work. Our paper “An Architectural Risk Analysis of Large Language Models: Applied Machine Learning Security,” spotlights what we view as major concerns with foundation model LLMs as well as their adaptations and applications.

We are fans of ML and “AI” (which the whole world tilted towards in 2023, fawning over the latest models with both awe and apprehension). We’re calling out the inherent risks. Not hand wavy stuff—we’ve spent the past year reading science publications, dissecting the research ideas, understanding the math, testing models, parsing through the noise, and ultimately analyzing LLMs through the lens of security design and architecture. We took the tool we invented for ML security risk analysis in 2020 (see our earlier paper, “Architectural Risk Analysis of Machine Learning Systems: Toward More Secure Machine Learning”) and applied it to LLMs specifically.

We found 81 risks overall, distilled a Top Ten (Risks) list, and shined a spotlight on 23 critical risks inherent in the black box LLM foundation models.

And now 2024 is off and running. It will be the year of “AI Governance” in name and (optimistic) intent. In practice, however, it’s on pace to be a shitshow for democracy as regulators run like hell just to get to the starting line.

The Slovak parliamentary election deepfake debacle, is the tip of the iceberg. OpenAI tried to get ahead of concerns that their technology may be used to influence the US Presidential Election in nefarious ways by posting its plans to deter misinformation. The irony is that OpenAI trained its models on a corpus so large that it holds vast globs of crazy rhetoric, conspiracy theories, fake news, and other pollution which its stochastic models will draw upon and (predictably) spit out…that will, in turn, add to the ever amassing pile of garbage-strewn data in the world, which future LLM foundation models will ingest, … See the problem here? That’s recursive pollution.

It’s the Data, stupid. (We sure wish it were that simple, anyway.)

See our official Press Release here.

Another Round of “Adversarial Machine Learning” from NIST

The National Institute of Standards and Technology (aka NIST) recently released a paper enumerating many attacks relevant to AI system developers. With the seemingly-unending rise in costs incurred by cybercrime, it’s sensible to think through the means and motives behind these attacks. NIST provides good explanations of the history and context for a variety of AI attacks in a partially-organized laundry list. That’s a good thing. However, in our view, NIST’s taxonomy lacks a useful structure for thinking about and categorizing systemic AI risks. We released a simple (and hopefully more effective) taxonomy of ML attacks in 2019 that divides attacks into two types—extraction and manipulation—and further divides these types into the three most common attack surfaces found in all ML systems—the model, the (training) data, and the (runtime) inputs. That move yields a six category taxonomy.

But wait, there’s more… Attacks represent only a small portion of security risks present in AI systems. NIST’s attack taxonomy doesn’t have any room for serious (non-attack-related) concerns such as recursive pollution or improper use of AI technology for tasks it wasn’t designed for. Far too much of NIST’s evaluation of generative AI is dedicated to prompt injection attacks, where an attacker manipulates the prompt provided to the LLM at runtime, producing undesirable results. LLM developers certainly need to consider the potential for malicious prompts (or malicious input as computer security people have always called it), but this downplays a much more important risk—stochastic behavior from LLM foundation models can be wrong and bad all by itself without any clever prompting!

At BIML, we are chiefly concerned with building security in to ML systems—a fancy way of saying security engineering. By contrast, NIST’s approach encourages “red-teaming”, using teams of ethical hackers (or just people off the street and pizza delivery guys) to try to penetration test LLM systems based on chugging down a checklist of known problems. Adopting this “outside–>in” paradigm of build (a broken thing)-break-fix will inevitably overlook huge security chasms inside the system—holes that are ripe for attackers to exploit. Instead of trying to test your way toward security one little prompt at a time (which turns out to be insanely expensive), why not build systems properly in the first place through a comprehensive overview of systemic risks?!

In any case, we would like to see appropriate regulatory action to ensure that proper security engineering takes place (including, say, documenting exactly where those training data came from and what they contain). We don’t think enlisting an army of pizza guys providing prompts is the answer,In the meantime, AI systems are already being made available to the public, and they are already wreaking havoc. Consider, for example, the recent misuse of AI to suppress voter turnout in the New Hampshire presidential primary! This kind of thing should shock the conscience of any who believe AI security can be tested in as an afterthought. So we have a call to action for you. It is imperative that AI architects and thought leaders adopt a risk-driven approach to engineering secure systems before releasing them to the public.

Bottom line on the NIST attack list? Mostly harmless.

Our Secret BIML Strategy

Dang. Darkreading went and published our world domination plan for machine learning securiy

To properly secure machine learning, the enterprise needs to be able to do three things: find where machine learning is being used, threat model the risk based on what was found, and put in controls to manage those risks.

‘We need to find machine learning [and] do a threat model based on what you found,’ McGraw says. ‘You found some stuff, and now your threat model needs to be adjusted. Once you do your threat model and you’ve identified some risks and threats, you need to put in some controls right across all those problems.’

There is no one tool or platform that can handle all three things, but McGraw happens to be on the advisory boards for three companies corresponding to each of the areas. Legit Security finds everything, IriusRisk helps with threat modeling, and Calypso AI puts controls in place.

‘I can see all the parts moving,’ McGraw says. ‘All the pieces are coming together.'”

Ah the Trinity of MLsec explained! Read the article here.

First Things First: Find AI

Apparently there are many CISOs out there who believe that their enterprise policies prohibit the use of ML, LLMS, and AI in their organization. Little do they know what’s actually happening.

This excellent article by darkreading discusses the first thing a CISO should do to secure AI: Find AI. The system described here is implemented by Legit Security.

MLsec in Rio

BIML provided a preview of our upcoming LLM Risk Analysis work (including the top ten LLM risks) at a Philosophy of Mind workshop in Rio de Janeiro January 5th. The workshop was organized by David Chalmers (NYU) and Laurie Paul (Yale).

McGraw in Rio

A tongue-in-cheek posting about the meeting can be found here.

Threat Modeling for ML

Once you learn that many of your new applications have ML built into them (often regardless of policy), what’s the next step? Threat modeling, of course. Irius Risk, the worldwide leader for threat modeling automation, announced a threat modeling library covering ML risks identified by BIML on October 26, 2023.

This is the first tool in the world to include ML risk as part of threat modeling automation. Now we’re getting somewhere.

Darkreading was the first publication to cover the news, and remains the best source for cutting edge stories about machine learning security. Read the original story here.

https://www.darkreading.com/cyber-risk/iriusrisk-brings-threat-modeling-to-machine-learning

BIML Coins a Term: Data Feudalism

Decipher covers the White House AI Executive Order, with the last word to BIML. Read the article from October 31, 2023 here.

https://duo.com/decipher/white-house-ai-executive-order-puts-focus-on-cybersecurity

Much of what the executive order is trying to accomplish are things that the software and security communities have been working on for decades, with limited success.

“We already tried this in security and it didn’t work. It feels like we already learned this lesson. It’s too late. The only way to understand these systems is to understand the data from which they’re built. We’re behind the eight ball on this,” said Gary McGraw, CEO of the Berryville Institute of Machine Learning, who has been studying software security for more than 25 years and is now focused on AI and machine learning security.

“The big data sets are already being walled off and new systems can’t be trained on them. Google, Meta, Apple, those companies have them and they’re not sharing. The worst future is that we have data feudalism.”

Another challenge in the effort to build safer and less biased models is the quality of the data on which those systems are being trained. Inaccurate, biased, or incomplete data going in will lead to poor results coming out.

“We’re building this recursive data pollution problem and we don’t know how to address it. Anything trained on a huge pile of data is going to reflect the data that it ate,” McGraw said. “These models are going out and grabbing all of these bad inputs that in a lot of cases were outputs from the models themselves.”

“It’s good that people are thinking about this problem. I just wish the answer from the government wasn’t red teaming. You can’t test your way out of this problem.”

BIML Presents at NBIM 10.18.23

NBIM is the world’s largest sovereign wealth fund

BIML was invited to Oslo to present its views on Machine Learning Security in two presentations at NBIM in October.

The first was delivered to 250+ technologists on staff (plus 25 or so invited guests from all around Norway). During the talk, BIML revealed its “Top Ten LLM Risks” data for the first time (pre-publication).

BIML presented two talks at NBIM

The second session was a fireside chat for 19 senior executives.

BIML on the AP Wire: why red teaming is feeble

The idea that machine learning security is exclusively about “hackers,” “attacks,” or some other kinds of “adversary,” is misguided. This is the same sort of philosophy that misled software security into a myopic overfocus on penetration testing way back in the mid ’90s. Not that pen testing and red teaming are useless, mind you, but there is way more to security engineering that penetrate and patch. It took us forever (well, a decade or more) to get past the pen test puppy love and start building real tools to find actual security bugs in code.

That’s why the focus on Red Teaming AI coming out of the White House this summer was so distressing. On the one hand…OK, the White House said AI and Security in the same sentence; but on the other hand, hackers gonna hack us outta this problem…not so much.

This red teaming nonsense is worse than just a philosophy problem, it’s a technical issue too.  Just take a look at this ridiculous piece of work from Anthropic.

Red Teaming Language Models to Reduce Harms:
Methods, Scaling Behaviors, and Lessons Learned

Red teaming sounds high tech, mysterious and steeped in hacker mystique, but today’s ML systems won’t benefit much from post facto pen testing. We must build security into AI systems from the very beginning (by paying way more attention to the enormous swaths of data used to train them and the risks these data carry). We can’t security test our way out of this corner, especially when it comes to the current generation of LLMs.

It’s tempting to pretend we can sprinkle some magic security dust on these systems after they are built, patch them into submission, or bolt special security apparatus on the side. Unfortunately the world well knows what happens when we pretend to be hard at work on security yet what we’re actually doing is more akin to squeezing our eyes shut and claiming to be invisible. Just ask yourself one simple question, who benefits from a security circus in this case?

AP reporter Frank Bajak covered BIML’s angle in this worldwide story August 13, 2023.

https://apnews.com/article/ai-cybersecurity-malware-microsoft-google-openai-redteaming-1f4c8d874195c9ffcc2cdffa71e4f44b