gem | Page 4 of 9 | Berryville Institute of Machine Learning

Decypher Podcast Features BIML LLM Work

6 February 2024

gem

No comments

Categories: Uncategorized

The February 6th episode of Dennis Fisher’s Decypher podcast does an excellent job unpacking BIML’s latest work on LLMs. Have a listen: https://duo.com/decipher/decipher-podcast-gary-mcgraw-on-ai-security

Podcast Episode

The Silver Bullet podcast archive (all 153 episodes) can be found here.

Dennis Fisher Covers BIML and Data Feudalism

30 January 2024

gem

No comments

Categories: Uncategorized

Here is an excellent piece from Dennis Fisher (currently writing for decipher) covers our new LLM Architectural Risk Analysis. Dennis always produces accurate and tightly-written work.

https://duo.com/decipher/for-ai-risk-the-real-answer-has-to-be-regulation

This article includes an important section on data feudalism, a term that BIML coined in an earlier Decipher article:

“Massive private data sets are now the norm and the companies that own them and use them to train their own LLMs are not much in the mood for sharing anymore. This creates a new type of inequality in which those who own the data sets control how and why they’re used, and by whom.

‘The people who built the original LLM used the whole ocean of data, but then they started dividing [the ocean] up, [leading] to data feudalism. Which means you can’t build your own model because you don’t have access to [enough] data,’ McGraw said.”

Two interesting reads on LLM security

29 January 2024

gem

No comments

Categories: Uncategorized

The Register has a great interview with Ilia Shumailov on the number one risk of LLMs. He calls it “model collapse” but we like the term “recursive pollution” better because we find it more descriptive. Have a look at the article.

Our work at BIML has been deeply influenced by Shumailov’s work. In fact, he currently has two articles in our Annotated Bibliography TOP 5.

https://www.theregister.com/2024/01/26/what_is_model_collapse/

Here is what we have to say about recursive pollution in our work — An Architectural Risk Analysis of Large Language Models:

[LLMtop10:1:recursive pollution] LLMs can sometimes be spectacularly wrong, and confidently so. If and when LLM output is pumped back into the training data ocean (by reference to being put on the Internet, for example), a future LLM may end up being trained on these very same polluted data. This is one kind of “feedback loop” problem we identified and discussed in 2020. See, in particular, [BIML78 raw:8:looping], [BIML78 input:4:looped input], and [BIML78 output:7:looped output]. Shumilov et al, subsequently wrote an excellent paper on this phenomenon. Also see Alemohammad. Recursive pollution is a serious threat to LLM integrity. ML systems should not eat their own output just as mammals should not consume brains of their own species. See [raw:1:recursive pollution] and [output:8:looped output].

Another excellent piece, this time in the politics, policy, business and international relations press is written by Peter Levin. See The real issue with artificial intelligence: The misalignment problem in The Hill. We like the idea of a “mix master of ideas” but we think it is more of a “mix master of auto-associative predictive text.” LLMs do not have “ideas.”

https://thehill.com/opinion/4427702-the-real-issue-with-artificial-intelligence-the-misalignment-problem/

Lemos on the BIML LLM Risk Analysis

28 January 2024

gem

No comments

Categories: Uncategorized

What’s the difference (philosophically) between Adversarial AI and Machine Learning Security? Once again, Rob Lemos cuts to the quick with his analysis of MLsec happenings. It helps that Rob has actual experience in ML/AI (unlike, say, most reporters on the planet). That helps Rob get things right.

READ THE ARTICLE:https://www.darkreading.com/cyber-risk/researchers-map-ai-threat-landscape-risks

We were proud to have our first coverage come from Rob in darkreading.

My favorite quote: “Those things that are in the black box are the risk decisions that are being made by Google and Open AI and Microsoft and Meta on your behalf without you even knowing what the risks are,” McGraw says. “We think that it would be very helpful to open up the black box and answer some questions.”

Read BIML’s An Architectural Risk Analysis of Large Language Models (January 24, 2024)

Google Cloud Security Podcast Features BIML

25 January 2024

gem

No comments

Categories: Uncategorized

Have a listen to Google’s cloud security podcast EP150 Taming the AI Beast: Threat Modeling for Modern AI Systems with Gary McGraw, the episode is tight, fast, and filled with good information.

Google Cloud Security Podcast: Taming the AI Beast with Gary McGraw

Gary, you’ve been doing software security for many decades, so tell us: are we really behind on securing ML and AI systems?
If not SBOM for data or “DBOM”, then what? Can data supply chain tools or just better data governance practices help?
How would you threat model a system with ML in it or a new ML system you are building?
What are the key differences and similarities between securing AI and securing a traditional, complex enterprise system?
What are the key differences between securing the AI you built and AI you buy or subscribe to?
Which security tools and frameworks will solve all of these problems for us?

Our Secret BIML Strategy

22 January 2024

gem

No comments

Categories: Uncategorized

Dang. Darkreading went and published our world domination plan for machine learning securiy

“To properly secure machine learning, the enterprise needs to be able to do three things: find where machine learning is being used, threat model the risk based on what was found, and put in controls to manage those risks.

‘We need to find machine learning [and] do a threat model based on what you found,’ McGraw says. ‘You found some stuff, and now your threat model needs to be adjusted. Once you do your threat model and you’ve identified some risks and threats, you need to put in some controls right across all those problems.’

There is no one tool or platform that can handle all three things, but McGraw happens to be on the advisory boards for three companies corresponding to each of the areas. Legit Security finds everything, IriusRisk helps with threat modeling, and Calypso AI puts controls in place.

‘I can see all the parts moving,’ McGraw says. ‘All the pieces are coming together.'”

Ah the Trinity of MLsec explained! Read the article here.

First Things First: Find AI

22 January 2024

gem

No comments

Categories: Uncategorized

Apparently there are many CISOs out there who believe that their enterprise policies prohibit the use of ML, LLMS, and AI in their organization. Little do they know what’s actually happening.

This excellent article by darkreading discusses the first thing a CISO should do to secure AI: Find AI. The system described here is implemented by Legit Security.

MLsec in Rio

22 January 2024

gem

No comments

Categories: Uncategorized

BIML provided a preview of our upcoming LLM Risk Analysis work (including the top ten LLM risks) at a Philosophy of Mind workshop in Rio de Janeiro January 5th. The workshop was organized by David Chalmers (NYU) and Laurie Paul (Yale).

A tongue-in-cheek posting about the meeting can be found here.

Threat Modeling for ML

19 December 2023

gem

No comments

Categories: Uncategorized

Once you learn that many of your new applications have ML built into them (often regardless of policy), what’s the next step? Threat modeling, of course. Irius Risk, the worldwide leader for threat modeling automation, announced a threat modeling library covering ML risks identified by BIML on October 26, 2023.

This is the first tool in the world to include ML risk as part of threat modeling automation. Now we’re getting somewhere.

Darkreading was the first publication to cover the news, and remains the best source for cutting edge stories about machine learning security. Read the original story here.

https://www.darkreading.com/cyber-risk/iriusrisk-brings-threat-modeling-to-machine-learning

BIML Coins a Term: Data Feudalism

19 December 2023

gem

No comments

Categories: Uncategorized

Decipher covers the White House AI Executive Order, with the last word to BIML. Read the article from October 31, 2023 here.

https://duo.com/decipher/white-house-ai-executive-order-puts-focus-on-cybersecurity

Much of what the executive order is trying to accomplish are things that the software and security communities have been working on for decades, with limited success.

“We already tried this in security and it didn’t work. It feels like we already learned this lesson. It’s too late. The only way to understand these systems is to understand the data from which they’re built. We’re behind the eight ball on this,” said Gary McGraw, CEO of the Berryville Institute of Machine Learning, who has been studying software security for more than 25 years and is now focused on AI and machine learning security.

“The big data sets are already being walled off and new systems can’t be trained on them. Google, Meta, Apple, those companies have them and they’re not sharing. The worst future is that we have data feudalism.”

Another challenge in the effort to build safer and less biased models is the quality of the data on which those systems are being trained. Inaccurate, biased, or incomplete data going in will lead to poor results coming out.

“We’re building this recursive data pollution problem and we don’t know how to address it. Anything trained on a huge pile of data is going to reflect the data that it ate,” McGraw said. “These models are going out and grabbing all of these bad inputs that in a lot of cases were outputs from the models themselves.”

“It’s good that people are thinking about this problem. I just wish the answer from the government wasn’t red teaming. You can’t test your way out of this problem.”