On recent Microsoft and NIST ML security documents

Several guides to security in machine learning have been published recently. In October 2019, NIST published a draft called “A Taxonomy and Terminology of Adversarial Machine Learning”. Then in November, Microsoft published several interrelated webpages laying out a threat model for AI/ML systems and tying it to Microsoft’s existing Security Development Lifecycle (SDL). We took a look at these documents to find out what they are trying to do, what they do well, and what they lack.

The NIST document is a tool for navigating MLsec literature, somewhat in the vein of an academic survey paper but accessible to those outside the field. The focus is explicitly “adversarial ML”, i.e. the failures a motivated attacker can induce in an ML system through input. The authors present a taxonomy of concepts in the literature rather than covering specific attacks or risks. Also included is a glossary of technical terms with definitions, synonyms and references to the originating papers. The taxonomy at first appeared conceptually bizarre to us, but we came to see it as a powerful tool for a particular task: working backward from an unfamiliar technical term to its root concept and related ideas. In this way the NIST document may be very helpful to security practitioners without ML expertise who are attempting to wrangle the ML security literature.

The Microsoft effort is a three-headed beast:

  • “Failure Modes in Machine Learning”, a brief taxonomy of 16 intentional and unintentional failures. It supposedly meets “the need to equip software developers, security incident responders, lawyers, and policy makers with a common vernacular to talk about this problem”. To this end the authors avoid technical language where possible. Each threat is classified using the somewhat dated and quaint Confidentiality/Integrity/Availability security model. This is easy enough to understand, though we find the distinction between Integrity and Availability attacks unclear for most ML scenarios. The unintentional failures are oddly fixated on reinforcement learning, and several seem to boil down to the same thing. For example, #16 “Common Corruption” appears to be a subcategory of #14 “Distributional Shifts.”
  • “AI/ML Pivots to the Security Development Lifecycle Bug Bar”, similar to the above but aimed at a different audience, “as a reference for the triage of AI/ML-related security issues”. This section presents materials for use while applying some of the standard Microsoft SDL processes. Of interest is the fact that threat modeling is emphasized in its own section. We approve of that move.
  • “Threat Modeling AI/ML Systems and Dependencies” is the most detailed component, containing the meat of the Microsoft MLsec effort. Here you can find security review checklists and a survey paper-style elaboration of each major risk with an emphasis on mitigations. The same eleven categories of “intentional failures” are used as in the other documents. However, at the time of writing, the unintentional failures are left out. We found the highlighting of risk #6 “Neural Net Reprogramming” particularly interesting, as the attack had been unknown to us before. The research behind it shows how adversarial examples can be used to do a kind of arbitrage where a service provided at cost (say, automatically tagging photos in a cloud storage account) can be repurposed for a similar task like breaking CAPTCHAs.

The Microsoft documents function as practical tools for securing software, including checklists for a security review and lists of potential mitigations. However, we find their categorizations confusing or redundant in places. Laudably, they move beyond adversarial ML to the concept of “unintentional failures”. But unfortunately, these failure modes remain mostly unelaborated in the more detailed documents.

Adversarial/intentional failures are important, but we shouldn’t neglect the unintentional ones. Faulty evaluation, unexpected distributional shifts, mis-specified models, and unintentional reproduction of data biases can all threaten the efficacy, safety and fairness of any ML system. Both the Microsoft and NIST documents are tools for an organization seeking to secure itself against external threats. Securing against the misuse of AI/ML is equally important.


  1. Hi Viktor — Thank you so much for your analysis! I would love to follow up with you to make the taxonomy better. My email address is in the comment field!

    Couple of quick notes:
    1) I would love to understand why the C-I-A mapping is antiquated. Organizations we spoke to continue to use it, and wanted a mental model to map back to the adversarial ML paradigm. In fact, the USCTO, when publishing their AI Principles, specifically calls out the confidentiality, integrity and availability of ML models: https://www.commerce.senate.gov/services/files/B7184908-E657-441C-967A-871D8A80B0F0

    2) I also think common corruption is very different from distributional shifts, and can be divorced from reinforcement learning. Common corruption from “noise, blur, weather” can cause ML systems to fail. See: https://arxiv.org/pdf/1903.12261.pdf
