Fail Securely [Principle 3]
Even under ideal conditions, complex systems are bound to fail eventually. Failure is an unavoidable state that should always be planned for. From a security perspective, failure itself isn’t the problem so much as the tendency for many systems to exhibit insecure behavior when they fail.
The best real-world example we know is one that bridges the real world and the electronic world—credit card authentication. Big credit card companies such as Visa and MasterCard spend lots of money on authentication technologies to prevent credit card fraud. Most notably, whenever you go into a store and make a purchase, the vendor swipes your card through a device that calls up the credit card company. The credit card company checks to see if the card is known to be stolen. More amazingly, the credit card company analyzes the requested purchase in context of your recent purchases and compares the patterns to the overall spending trends. If their engine senses anything suspicious, the transaction is denied. (Sometimes the trend analysis is performed off-line and the owner of a suspect card gets a call later.)
This scheme appears to be remarkably impressive from a security point of view; that is, until you note what happens when something goes wrong. What happens if the line to the credit card company is down? Is the vendor required to say, “I’m sorry, our phone line is down”? No. The credit card company still gives out manual machines that take an imprint of your card, which the vendor can send to the credit card company for reimbursement later. An attacker need only cut the phone line before ducking into a 7-11.
ML systems are particularly complicated (what with all that dependence on data) and are prone to fail in new and spectacular ways. Consider a system that is meant to classify/categorize its input. Reporting an inaccurate classification seems like not such a bad thing to do. But in some cases, simply reporting what an ML system got wrongcan lead to a security vulnerability. As it turns out, attackers can exploit misclassification to create adversarial examples[i], or use a collection of errors en masse to ferret out sensitive and/or confidential information used to train the model.i In general, ML systems would do well toavoid transmitting low-confidence classification results to untrusted users in order to defend against these attacks, but of course that seriously constrains the usual engineering approach.
Classification results should only be provided when the system is confident they are correct. In the case of either a failure or a low confidence result, care must be taken that feedback from the model to a malicious user can’t be exploited. Note that many ML models are capable of providing confidence levels along with their output to address some of these risks. That certainly helps when it comes to understanding the classifier itself, but it doesn’t really address information exploit or leakage (both of which are more challenging problems). ML system engineers should carefully consider the sensitivity of their systems’ predictions and take into account the amount of trust they afford the user when deciding what to report.
If your ML system has to fail, make sure that it fails securely.
[i]Gilmer, Justi, Ryan P. Adams, Ian Goodfellow, David Andersen, and George E. Dahl. “Motivating the Rules of the Game for Adversarial Example Research.” arXiv preprint 1807.06732 (2018)