Have a listen as Paul Roberts digs deep into BIML’s work on machine learning security. What exactly is data feudalism? Why does it matter? What are the biggest risks associated with LLMs?
Air Canada is learning the hard way that when YOUR chatbot on YOUR website is wrong, YOU pay the price. This is as it should be. This story from CTV News is a great development.
[LLMtop10:9:model trustworthiness] Generative models, including LLMs, include output sampling algorithms by their very design. Both input (in the form of slippery natural language prompts) and generated output (also in the form of natural language) are wildly unstructured (and are subject to the ELIZA effect). But mostly, LLMs are auto-associative predictive generators with no understanding or reasoning going on inside. Should LLMs be trusted? Good question.
[inference:3:wrongness] LLMs have a propensity to be just plain wrong. Plan for that. (Using anthropomorphic terminology for error-making, such as the term “hallucinate” is not at all helpful.)
[output:2:wrongness] Prompt manipulation can lead to fallacious output (see [input:2:prompt injection]), but fallacious output can occur spontaneously as well. LLMs are notorious BS-ers that can make stuff up to justify their wrongness. If that output escapes into the world undetected, bad things can happen. If such output is later consumed by an LLM during training, recursive pollution is in effect.
Do you trust that black box foundation model you built your LLM application on? Why?
The More Things Change, the More They Stay The Same: Defending Against Vulnerabilities you Create
Like any tool that humans have created, LLMs can be repurposed to do bad things. The biggest danger that LLMs pose in security is that they can leverage the ELIZA effect to convince gullible people into believing they are thinking and understanding things. This makes them particularly interesting in attacks that involve what security people call “spoofing.” Spoofing is important enough as an attack category that Microsoft included it in it’s STRIDE system as the very first attack to worry about. There is no doubt that LLMs make spoofing much more powerful as an attack. This includes creating and using “deep fakes” FWIW. Phishing attacks? Spoofing. Confidence flim-flams? Spoofing. Ransomware negotiations? Spoofing will help. Credit card fraud? Spoofing used all the time.
Twenty years ago the security community found it pretty brazen that Microsoft was thinking about selling defensive security tools at all since many of the attacks and exploits in the wild were successfully targeting their broken software. “Why don’t they just fix the broken software instead of monetizing their own bugs?” we asked. We might ask the same thing today. Why not create more secure black box LLM foundation models instead of selling defensive tools for a problem they are helping to create?
Here is an excellent piece from Dennis Fisher (currently writing for decipher) covers our new LLM Architectural Risk Analysis. Dennis always produces accurate and tightly-written work.
This article includes an important section on data feudalism, a term that BIML coined in an earlier Decipher article:
“Massive private data sets are now the norm and the companies that own them and use them to train their own LLMs are not much in the mood for sharing anymore. This creates a new type of inequality in which those who own the data sets control how and why they’re used, and by whom.
‘The people who built the original LLM used the whole ocean of data, but then they started dividing [the ocean] up, [leading] to data feudalism. Which means you can’t build your own model because you don’t have access to [enough] data,’ McGraw said.”
The Register has a great interview with Ilia Shumailov on the number one risk of LLMs. He calls it “model collapse” but we like the term “recursive pollution” better because we find it more descriptive. Have a look at the article.
Our work at BIML has been deeply influenced by Shumailov’s work. In fact, he currently has two articles in our Annotated Bibliography TOP 5.
[LLMtop10:1:recursive pollution] LLMs can sometimes be spectacularly wrong, and confidently so. If and when LLM output is pumped back into the training data ocean (by reference to being put on the Internet, for example), a future LLM may end up being trained on these very same polluted data. This is one kind of “feedback loop” problem we identified and discussed in 2020. See, in particular, [BIML78 raw:8:looping], [BIML78 input:4:looped input], and [BIML78 output:7:looped output]. Shumilov et al, subsequently wrote an excellent paper on this phenomenon. Also see Alemohammad. Recursive pollution is a serious threat to LLM integrity. ML systems should not eat their own output just as mammals should not consume brains of their own species. See [raw:1:recursive pollution] and [output:8:looped output].
Another excellent piece, this time in the politics, policy, business and international relations press is written by Peter Levin. See The real issue with artificial intelligence: The misalignment problem in The Hill. We like the idea of a “mix master of ideas” but we think it is more of a “mix master of auto-associative predictive text.” LLMs do not have “ideas.”
What’s the difference (philosophically) between Adversarial AI and Machine Learning Security? Once again, Rob Lemos cuts to the quick with his analysis of MLsec happenings. It helps that Rob has actual experience in ML/AI (unlike, say, most reporters on the planet). That helps Rob get things right.
We were proud to have our first coverage come from Rob in darkreading.
My favorite quote: “Those things that are in the black box are the risk decisions that are being made by Google and Open AI and Microsoft and Meta on your behalf without you even knowing what the risks are,” McGraw says. “We think that it would be very helpful to open up the black box and answer some questions.”
“To properly secure machine learning, the enterprise needs to be able to do three things: find where machine learning is being used, threat model the risk based on what was found, and put in controls to manage those risks.
‘We need to find machine learning [and] do a threat model based on what you found,’ McGraw says. ‘You found some stuff, and now your threat model needs to be adjusted. Once you do your threat model and you’ve identified some risks and threats, you need to put in some controls right across all those problems.’
There is no one tool or platform that can handle all three things, but McGraw happens to be on the advisory boards for three companies corresponding to each of the areas. Legit Security finds everything, IriusRisk helps with threat modeling, and Calypso AI puts controls in place.
‘I can see all the parts moving,’ McGraw says. ‘All the pieces are coming together.'”