We present a basic architectural risk analysis (ARA) of large language models (LLMs), guided by an understanding of standard machine learning (ML) risks as previously identified by BIML.
At BIML, we are interested in “building security in” to ML systems from a security engineering perspective. Our first work, published in January 2020, presented an in-depth ARA of a generic machine learning process model.
In that work, we identified 78 risks, referred to as the BIML-78. In this work, we consider a more specific type of machine learning use case—large language models—and report the results of a detailed ARA of LLMs. This ARA serves two purposes: 1) it shows how our original BIML-78 can be adapted to a more particular ML use case, and 2) it provides a detailed accounting of LLM risks. This work identifies and discusses 81 LLM risks and identifies ten of those risks as most important. Securing a modern LLM system (even if what’s under scrutiny is only an application involving LLM technology) must involve diving into the engineering and design of the specific LLM system itself. This ARA is intended to make that kind of detailed work easier and more consistent by providing a baseline and a set of risks to consider.
Download the full document here
At BIML, we are interested in “building security in” to machine learning (ML) systems from a security engineering perspective. This means understanding how ML systems are designed for security, teasing out possible security engineering risks, and making such risks explicit. We are also interested in the impact of including an ML system as part of a larger design. Our basic motivating question is: how do we secure ML systems proactively while we are designing and building them? This architectural risk analysis (ARA) is an important first step in our mission to help engineers and researchers secure ML systems.
We present a basic ARA of a generic ML system, guided by an understanding of standard ML system components and their interactions. This groundbreaking work in the field was published January 20, 2020.
Download the full document here
IEEE Computer Article (August 2019)
An introduction to BIML, “Security Engineering for Machine Learning,” was published in IEEE Computer, volume 52, number 8, pages 54-57 (August 2019).
Our taxonomy, published in May 2019, considers attacks on ML algorithms, as opposed to peripheral attacks or attacks on ML infrastructure (i.e., software frameworks or hardware accelerators). See the taxonomy.
Our work began with a review of existing scientific literature in MLsec. Each of our meetings includes discussion of a few new papers. After we read a paper, we add it to our annotated bibliography, which also includes a curated and up-to-date top 5 list. Have a look.
Towards an Architectural Risk Analysis of ML
We are interested in “building security in” to ML systems from a security engineering perspective. This means understanding how ML systems are designed for security (including what representations they use), teasing out possible engineering tradeoffs, and making such tradeoffs explicit. We are also interested in the impact of including an ML system as a component in a larger design. Our basic motivating question is: how do we secure ML systems proactively while we are designing and building them?
Early work in security and privacy in ML has taken an “operations security” tack focused on securing an existing ML system and maintaining its data integrity. For example, Nicolas Papernot uses Saltzer and Schroeder’s famous security principles to provide an operational perspective on ML security. In our view, this work does not go far enough into ML design to satisfy our goals.
A key goal of our work is to develop a basic architectural risk analysis (sometimes called a threat model) of a typical ML system. Our analysis will take into account common design flaws such as those described by the IEEE Center for Secure Design. A complete architectural risk analysis for a generic ML system was published January 20, 2020.