Video Interview: A Deep Dive into Generative AI and Cybersecurity

CalypsoAI produced a video interview in which I hosted Jim Routh and Neil Serebryany. We talked all about AI/ML security at the enterprise level. The conversation is great. Have a listen.

I am proud to be an Advisor to CalypsoAI; have a look at their website.

BIML: The Scandinavian Tour

Dr. McGraw recently visited Stockholm, Oslo, and Bergen, hosting events in all three cities.

In Stockholm, a video interview was recorded in addition to a live breakfast presentation. Here are some pictures from the presenter’s view of the video shoot.

Reactions were scary!

The talk in Oslo was packed, with lots of BIML friends in the audience.

Bergen had a great turnout too, with a very interactive audience including academics from the university.

Here’s the best slide from the Bergen talk.

If your organization would like to host a BIML talk, please get in touch.

Indiana University SPICE Talk

BIML’s work was featured in an April 5th talk at the Luddy Center for Artificial Intelligence, part of Indiana University.

Here is the talk abstract. If you or your organization are interested in hosting this talk, please let us know.

10, 23, 81 — Stacking up the LLM Risks: Applied Machine Learning Security

I present the results of an architectural risk analysis (ARA) of large language models (LLMs), guided by an understanding of standard machine learning (ML) risks previously identified by BIML in 2020. After a brief level-set, I cover the top 10 LLM risks, then detail 23 black box LLM foundation model risks screaming out for regulation, and finally provide a bird’s eye view of all 81 LLM risks BIML identified. BIML’s first work, published in January 2020, presented an in-depth ARA of a generic machine learning process model, identifying 78 risks. In this talk, I consider a more specific type of machine learning use case—large language models—and report the results of a detailed ARA of LLMs. This ARA serves two purposes: 1) it shows how our original BIML-78 can be adapted to a more particular ML use case, and 2) it provides a detailed accounting of LLM risks. At BIML, we are interested in “building security in” to ML systems from a security engineering perspective. Securing a modern LLM system (even if what’s under scrutiny is only an application involving LLM technology) must involve diving into the engineering and design of the specific LLM system itself. This ARA is intended to make that kind of detailed work easier and more consistent by providing a baseline and a set of risks to consider.

Tech Target Podcast: BIML Discusses 23 Black Box LLM Foundation Model Risks

A recently released podcast features an in-depth discussion of BIML’s recent LLM Risk Analysis, defining terms in an easy-to-understand fashion. We cover what exactly a RISK IS, whether open source LLMs make any sense, how big BIG DATA really is, and more.

Have a listen here: https://targetingai.podbean.com/e/security-bias-risks-are-inherent-in-genai-black-box-models/

BIML LLM Risk Analysis Debuted at NDSS’24

BIML’s LLM work was presented in San Diego on February 26th as an invited talk given simultaneously to three workshops co-located with NDSS. All NDSS ’24 workshops: https://www.ndss-symposium.org/ndss2024/co-located-events/

  1. SDIoTSec: https://www.ndss-symposium.org/ndss2024/co-located-events/sdiotsec/
  2. USEC: https://www.ndss-symposium.org/ndss2024/co-located-events/usec/
  3. AISCC: https://www.ndss-symposium.org/ndss2024/co-located-events/aiscc/

This was the first public presentation of the BIML LLM Top Ten Risks list since its publication.

When ML goes wrong, who pays the price?

Air Canada is learning the hard way that when YOUR chatbot on YOUR website is wrong, YOU pay the price. This is as it should be. This CTV News story covers a great development.

BIML warned about this in our LLM Risk Analysis report, published January 24, 2024. In particular, see:

[LLMtop10:9:model trustworthiness] Generative models, including LLMs, incorporate output sampling algorithms by their very design. Both input (in the form of slippery natural language prompts) and generated output (also in the form of natural language) are wildly unstructured (and are subject to the ELIZA effect). But mostly, LLMs are auto-associative predictive generators with no understanding or reasoning going on inside. Should LLMs be trusted? Good question.

[inference:3:wrongness] LLMs have a propensity to be just plain wrong. Plan for that. (Using anthropomorphic terminology for error-making, such as the term “hallucinate,” is not at all helpful.)

[output:2:wrongness] Prompt manipulation can lead to fallacious output (see [input:2:prompt injection]), but fallacious output can occur spontaneously as well. LLMs are notorious BS-ers that can make stuff up to justify their wrongness. If that output escapes into the world undetected, bad things can happen. If such output is later consumed by an LLM during training, recursive pollution is in effect.
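
To make the “escapes into the world undetected” failure mode concrete, here is a minimal Python sketch (not from the BIML report) of an output guard that holds back a generated reply when it cites a policy the operator cannot verify. The generate_reply stub, the DOCUMENTED_POLICIES allow-list, and the regex are hypothetical placeholders for illustration only; a real deployment would need far richer validation under your own control.

import re

# Hypothetical allow-list of documented policy IDs a reply is permitted to cite.
DOCUMENTED_POLICIES = {"REFUND-BEREAVEMENT-2023", "REFUND-24HR-2022"}

def generate_reply(prompt: str) -> str:
    """Stand-in for an LLM call; a real system would invoke a model here."""
    return "You can apply retroactively under policy REFUND-RETROACTIVE-2024."

def guard_output(reply: str) -> tuple[bool, str]:
    """Return (ok, text). Hold replies that cite policies we cannot verify."""
    cited = set(re.findall(r"REFUND-[A-Z0-9-]+", reply))
    if cited - DOCUMENTED_POLICIES:
        # Fallacious or unverifiable claim: do not let it escape undetected.
        return False, "Let me connect you with an agent who can confirm our refund policy."
    return True, reply

ok, text = guard_output(generate_reply("Can I get a refund after my flight?"))
print("released" if ok else "held for human review", "->", text)

The specific check matters less than the architectural point from the report: generated output needs a validation step you control before it reaches users, and certainly before it can be recycled into anyone’s training data.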

Do you trust that black box foundation model you built your LLM application on? Why?