Annotated Bibliography

As our research group reads and discusses scientific papers in MLsec, we add an entry to this bibliography. We also curate a “top 5” list.

Top 5 Papers

Gilmer 2018 — Adversarial Examples

Gilmer, Justin, Ryan P. Adams, Ian Goodfellow, David Andersen, and George E. Dahl. “Motivating the Rules of the Game for Adversarial Example Research.” arXiv preprint arXiv:1807.06732 (2018).

Great use of realistic scenarios in a risk analysis. Hilariously snarky.

  • Representation

Jetley 2018 — On generalization and vulnerability

Jetley, Saumya, Nicholas A. Lord, and Philip H. S. Torr. “With Friends Like These, Who Needs Adversaries?” 32nd Conference on Neural Information Processing Systems. 2018.

Excellent paper. Driven by theory and demonstrated by experimentation: generalization in DCNs trades off against vulnerability.

  • Attack-Lit-Pointers

Papernot 2018 — Building Security In for ML (IT stance)

Papernot, Nicolas. “A Marauder’s Map of Security and Privacy in Machine Learning.” arXiv preprint arXiv:1811.01134 (2018).

Tainted only by an old school IT security approach, this paper aims at the core of #MLsec but misses the mark. Too much ops and not enough security engineering.

  • MLsec

Shumailov 2020 — Energy DoS attacks against NNs (uses GAs)

Shumailov, Ilia, Yiren Zhao, Daniel Bates, Nicolas Papernot, Robert Mullins, Ross Anderson. “Sponge Examples: Energy-Latency Attacks on Neural Networks.” arXiv preprint arXiv:2006.03463 (2020).

Excellent paper, very clear and well-stated. Availability attacks against DNNs. Makes use of GAs to evolve attack input. Energy consumption is the target.

  • MLsec
  • Attack-Lit-Pointers
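
For readers curious about the GA angle, here is a minimal, hypothetical sketch (not the authors’ code) of how one might evolve sponge inputs; estimate_energy, random_input, and mutate are stand-ins for whatever energy proxy and input representation you have.

```python
import random

def evolve_sponge_inputs(estimate_energy, random_input, mutate,
                         pop_size=50, generations=100):
    """Toy genetic algorithm: keep the inputs that make the target model
    burn the most energy (or latency), mutate the survivors, repeat.
    All three callables are hypothetical stand-ins."""
    population = [random_input() for _ in range(pop_size)]
    for _ in range(generations):
        # Fitness = measured or estimated energy cost of one forward pass.
        population.sort(key=estimate_energy, reverse=True)
        survivors = population[: pop_size // 2]
        children = [mutate(random.choice(survivors))
                    for _ in range(pop_size - len(survivors))]
        population = survivors + children
    return max(population, key=estimate_energy)
```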

Yuan 2018 — Adversarial Examples

Yuan, Xiaoyong, Pan He, Qile Zhu, Xiaolin Li. “Adversarial Examples: Attacks and Defenses for Deep Learning.” arXiv preprint arXiv:1712.07107 (2018).

A solid paper with a stunning set of references. A good way to understand the adversarial example landscape.

  • Attack-Lit-Pointers

Other Papers

Antorán 2020 — Uncertainty

Antorán, J., Umang Bhatt, Tameen Adel, Adrian Weller, and José Miguel Hernández-Lobato. “Getting a CLUE: A Method for Explaining Uncertainty Estimates.” ICLR 2020 Workshop paper (2020).

Representation helps with the why of uncertainty. Little relevance to security. Error bars.

  • Engineering

Arora 2018 — Multiple Meanings

Arora, Sanjeev, Yuanzhi Li, Yingyu Liang, Tengyu Ma, and Andrej Risteski. “Linear algebraic structure of word senses, with applications to polysemy.” Transactions of the Association for Computational Linguistics 6 (2018): 483-495.

Structured representations that capture distributed sub-features (micro-topics) through ML. Beyond word2vec and GloVe, adding “semantics.”

  • Representation

Barreno 2010 — Fundamental work in MLsec

Barreno, Marco, Blaine Nelson, Anthony D. Joseph, J.D. Tygar. “The security of machine learning.” Machine Learning, 81:2, pp. 121-148 (November 2010).

Solid but dated work with lots of fundamentals. Made harder to grasp by mixing two issues: ML FOR security and security OF ML. Untangling these things is critical. (Also see their 2006 paper.)

  • MLsec

Bellamy (IBM) 2018 — IBM User Manual

Bellamy, Rachel, Kuntal Dey, Michael Hind, Samuel C. Hoffman, Stephanie Houde, Kalapriya Kannan, Pranay Lohia, Jacquelyn Martino, Sameep Mehta, Aleksandra Mojsilovic, Seema Nagar, Karthikeyan Natesan Ramamurthy, John Richards, Diptikalyan Saha, Prasanna Sattigeri, Moninder Singh, Kush R. Varshney, and Yunfeng Zhang. “AI Fairness 360: An Extensible Toolkit for Detecting, Understanding, and Mitigating Unwanted Algorithmic Bias” arXiv preprint arXiv:1810.01943 (2018).

Kind of like reading a mashup of a user manual and a marketing glossy. Nothing at all about making actual bias decisions. Bag of tools described.

  • Engineering

Biggio 2018 — Biggio on Adversarial Machine Learning

Biggio, Battista, and Fabio Roli. “Wild Patterns: Ten Years After the Rise of Adversarial Machine Learning.” arXiv preprint arXiv:1712.03141 (2018).

Myopia abounds. This is basically a review paper. (Very defensive of prior work by the author.)

  • Attack-Lit-Pointers

Buchanan 2020 — National Security Policy

Buchanan, Ben. “A National Security Research Agenda for Cybersecurity and Artificial Intelligence.” CSET Policy Brief (2020).

Good work with some basic confusion between security OF ML (what BIML does) and ML FOR security. ML is not a magic force multiplier. The #MLsec section is OK but too heavy on adversarial examples.

  • Policy

Carlini 2019 — Memorization and Data Leaking

Carlini, Nicholas, Chang Liu, Úlfar Erlingsson, Jernej Kos, and Dawn Song. “The Secret Sharer: Evaluating and Testing Unintended Memorization in Neural Networks.” arXiv preprint arXiv:1802.08232 (2019).

Clear, cogent and fairly simple. Great results. Protecting secrets in ML data.

  • Attack-Lit-Pointers
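
The core trick, as we read it: plant random “canary” secrets in the training data, then measure how strongly the trained model prefers the true canary over alternative candidates. A rough sketch of that exposure measurement follows; model_log_perplexity is a hypothetical callable and the candidate set is a stand-in.

```python
import math

def exposure(model_log_perplexity, canary, candidates):
    """Sketch of the exposure metric (our paraphrase):
    exposure = log2(|candidate space|) - log2(rank of the true canary),
    where rank is by model log-perplexity (lower = more memorized).
    `candidates` must include the canary itself."""
    ranked = sorted(candidates, key=model_log_perplexity)  # most memorized first
    rank = ranked.index(canary) + 1
    return math.log2(len(candidates)) - math.log2(rank)
```

Higher exposure means the canary (and, by analogy, a real secret) is easier to extract from the trained model.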

Chen 2017 — Backdoor attacks coined

Chen, Xinyun, Chang Liu, Bo Li, Kimberly Lu, and Dawn Song. “Targeted Backdoor Attacks on Deep Learning Systems Using Data Poisoning.” arXiv preprint arXiv:1712.05526 (2017).

A badly written and loosely constructed paper that introduces the (poorly chosen) “backdoor” terminology. The work is about data poisoning attacks.

  • Attack-Lit-Pointers

Christiansen 2016 — Language Representation and Structure

Christiansen, Morten H., and Nick Chater. “The Now-or-Never bottleneck: A fundamental constraint on language.” Behavioral and Brain Sciences 39 (2016).

Too much psychology and not enough ML. This paper is about context in language representation, including look-ahead and structured patterns. The main question: how big is your buffer?

  • AI-Philosophy
  • Representation

Dai 2019 — Transformer-XL

Dai, Zihang, Zhilin Yang, Yiming Yang, William W. Cohen, Jaime Carbonell, Quoc V. Le, and Ruslan Salakhutdinov. “Transformer-XL: Attentive language models beyond a fixed-length context.” arXiv preprint arXiv:1901.02860 (2019).

Getting past fixed-length context through various kludges. Recursive feedback to represent previous state.

  • Language-Processing

De Deyne 2020 — Psych Rep Grounding

De Deyne, Simon, Danielle Navarro, Guillem Collell, and Andrew Perfors. “Visual and Affective Grounding in Language and Mind.” PsyArXiv preprint q97f8 (2020).

Too much insider psych gobbledygook in this paper. Lots of results, very poorly presented. An important subject best approached another way.

  • Representation
  • Psychology

Devlin 2018 — BERT (transformers) and pre-training

Devlin, Jacob, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. “BERT: Pre-training of deep bidirectional transformers for language understanding.” arXiv preprint arXiv:1810.04805 (2018).

Input windows and representation. Precomputing leads to transfer attacks.

  • Language-Processing

Dhariwal 2020 — Music generation

Dhariwal, Prafulla, Heewoo Jun, Christine Payne, Jong Wook Kim, Alec Radford, Ilya Sutskever. “Jukebox: A Generative Model for Music.” arXiv preprint arXiv:2005.00341 (2020).

Generating music with a very weird model. Training a model on raw audio. Also see https://openai.com/blog/jukebox/

  • Music-Processing

Eniser 2020 — Adversarial Image Defense

Eniser, Hasan Ferit, Maria Christakis, and Valentin Wüstholz. “RAID: Randomized Adversarial-Input Detection for Neural Networks.” arXiv preprint arXiv:2002.02776 (2020).

This paper describes a very narrow defense against adversarial image input. Experiments are very arbitrary and lack focus. One interesting note is that the defense leverages activation patterns.

  • Attack-Lit-Pointers

Eykholt 2018 — Physical Attacks on Vision

Eykholt, Kevin, Ivan Evtimov, Earlence Fernandes, Bo Li, Amir Rahmati, Chaowei Xiao, Atul Prakash, Tadayoshi Kohno, and Dawn Song. “Robust physical-world attacks on deep learning visual classification.” In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1625-1634. 2018.

Tape-on-the-stop-sign paper. Fairly naive attacks on non-robust representations, meant to be psychologically plausible in that humans won’t notice. Many “empirical” settings.

  • Attack-Lit-Pointers

Gamaleldin 2018 — Adversarial Reprogramming

Elsayed, Gamaleldin F., Ian Goodfellow, and Jascha Sohl-Dickstein. “Adversarial Reprogramming of Neural Networks.” arXiv preprint arXiv:1806.11146 (2018).

A very interesting paper well worth a read, though the work is very weird. The idea of reprogramming existing ML tech stacks in an adversarial fashion is powerful. Given a Turing complete language construct, all kinds of terrible shenanigans could result. Imagine ransomware running on photo recognition ML machines.

  • Attack-Lit-Pointers

Goodman 2019 — Wagner on Adversarial Testing

Goodman, Dan, and Tao Wei. “Cloud-based Image Classification Service Is Not Robust To Simple Transformations: A Forgotten Battlefield.” arXiv preprint arXiv:1906.07997 (2019).

Naive experiment on cloud services using well-known methods. Real result: hints at structured noise vs statistical noise as attack type. Representation matters.

  • Attack-Lit-Pointers

GPT-3 2020 — GPT-3 Launch Paper

Brown, Tom B., Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared Kaplan, Prafulla Dhariwal, Arvind Neelakantan, Pranav Shyam, Girish Sastry, Amanda Askell, Sandhini Agarwal, Ariel Herbert-Voss, Gretchen Krueger, Tom Henighan, Rewon Child, Aditya Ramesh, Daniel M. Ziegler, Jeffrey Wu, Clemens Winter, Christopher Hesse, Mark Chen, Eric Sigler, Mateusz Litwin, Scott Gray, Benjamin Chess, Jack Clark, Christopher Berner, Sam McCandlish, Alec Radford, Ilya Sutskever, Dario Amodei. “Language Models are Few-Shot Learners” arXiv preprint arXiv:2005.14165 (2020).

Autoregressive language model that predicts the next token. Memorization?! Astounding results. Section 6 is a basic treatment of MLsec issues by Ariel Herbert-Voss. A little too much ass-covering on the bias front, but well worth thinking about.

  • MLsec
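
For readers new to the model family, “autoregressive” just means repeatedly predicting the next token given everything generated so far. A generic greedy-decoding sketch (not OpenAI’s API; next_token_distribution is a hypothetical stand-in):

```python
def generate(next_token_distribution, prompt_tokens, max_new_tokens=50):
    """Greedy autoregressive decoding sketch. `next_token_distribution`
    is a hypothetical callable returning a {token: probability} dict
    for the next position given the tokens so far."""
    tokens = list(prompt_tokens)
    for _ in range(max_new_tokens):
        dist = next_token_distribution(tokens)
        tokens.append(max(dist, key=dist.get))  # pick the most likely next token
    return tokens
```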

Graves 2014 — RNN Handwriting Generation

Graves, Alex. “Generating Sequences With Recurrent Neural Networks.” arXiv preprint arXiv:1308.0850 (2014).

Engineering tract documenting an auto-regressive model and various kludges. Reads like a thesis. Kludge heavy.

Gu 2019 — BadNets: Classic Data Poisoning

Gu, Tianyu, Brendan Dolan-Gavitt, Siddharth Garg. “BadNets: Identifying Vulnerabilities in the Machine Learning Model Supply Chain” arXiv preprint arXiv:1708.06733 (2019).

A paper about Trojan functionality. Solidly written and easy to understand. This is classic data poisoning.

  • Attack-Lit-Pointers

Henderson 2018 — Hacking Around with ML

Henderson, Peter, Riashat Islam, Philip Bachman, Joelle Pineau, Doina Precup, and David Meger. “Deep Reinforcement Learning that Matters.” arXiv preprint arXiv:1709.06560 (2018).

We tweaked lots of things and found some stuff. Things matter. How you measure stuff also matters.

  • Representation

Hinton 2015 — Review

LeCun, Yann, Yoshua Bengio, and Geoffrey Hinton. “Deep learning.” Nature 521, no. 7553 (2015): 436.

This review from Nature covers the basics in an introductory way. Some hints at representation as a thing. Makes clear that more data and faster CPUs account for the resurgence.

  • Review
  • AI-Philosophy

Hoffmann 2019 — Fairness Politics

Hoffmann, Anna. “Where fairness fails: data, algorithms, and the limits of antidiscrimination discourse.” Information, Communication & Society 22, no. 7 (2019): 900-915.

This paper is all problems and no solutions couched in high academic blather. A (very negative) overview of politics and ML/AI for an audience of insiders.

  • Review
  • AI-Philosophy

Jacobsen 2019 — Adversarial Examples

Jacobsen, Jörn-Henrik, Jens Behrmann, Richard Zemel, and Matthias Bethge. “Excessive Invariance Causes Adversarial Vulnerability.” arXiv preprint arXiv:1811.00401v2 (2019).

Mathematical explanation of adversarial vulnerability space. Includes home-brew network and analysis set.

  • Representation

Jagielski 2018 — Data Poisoning

Jagielski, Matthew, Alina Oprea, Battista Biggio, Chang Liu, Cristina Nita-Rotaru, and Bo Li. “Manipulating Machine Learning: Poisoning Attacks and Countermeasures for Regression Learning.” arXiv preprint arXiv:1804.00308 (2018).

A solid introduction to the data poisoning subfield. This is a critical category of ML attacks. See the BIML ML attack taxonomy.

  • Attack-Lit-Pointers

Jha 2019 — (Weak) Adversarial Defense

Jha, Susmit, Sunny Raj, Steven Lawrence Fernandes, Sumit Kumar Jha, Somesh Jha, Gunjan Verma, Brian Jalaian, and Ananthram Swami. “Attribution-driven Causal Analysis for Detection of Adversarial Examples.” arXiv preprint arXiv:1903.05821 (2019).

Treating pixels in an image as very small “features,” this work tries to kill important features that drive too much of the output (in some sense weakening the natural representation). This kind of masking makes the networks perform poorly. Pretty dumb.

  • Attack-Lit-Pointers

Jin 2020 — Adversarial Text

Jin, Di, Zhijing Jin, Joey Tianyi Zhou, and Peter Szolovits. “Is BERT Really Robust? A Strong Baseline for Natural Language Attack on Text Classification and Entailment.” arXiv preprint arXiv:1907.11932 (2020).

A cute but not very profound paper. Focuses on attack category #1 (adversarial examples) approached through text processing. BERT is an important language processing model and serves as the target. Low detectability plays a role in the attack model.

  • Attack-Lit-Pointers

Kazemi 2019 — Time

Kazemi, Seyed, Rishab Goel, Sepehr Eghbali, Janahan Ramanan, Jaspreet Sahota, Sanjay Thakur, Stella Wu, Cathal Smyth, Pascal Poupart, Marcus Brubaker. “Time2Vec: Learning a Vector Representation of Time” arXiv preprint arXiv:1907.05321 (2019).

Very abstract treatment of time represented as a learned periodic vector. More engineering than ML.

  • Representation
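
For reference, our paraphrase of the paper’s representation: one learned linear component plus k learned periodic components (sine in the paper), with the frequencies and phase shifts learned from data.

```latex
% Time2Vec of a scalar time \tau (our paraphrase; \omega_i and \varphi_i are learned):
\mathrm{t2v}(\tau)[i] =
\begin{cases}
\omega_i \tau + \varphi_i, & i = 0,\\
F(\omega_i \tau + \varphi_i), & 1 \le i \le k,
\end{cases}
\qquad F = \sin \ \text{in the paper.}
```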

Kilbertus 2018 — Learning and Causality

Kilbertus, Niki, Giambattista Parascandolo, Bernhard Schölkopf. “Generalization in anti-causal learning” arXiv preprint arXiv:1812.00524 (2018).

A vague position paper that is more philosophy than anything else. Emphasizes the importance of generation (and causal models). Representation issues around continuity are explored.

  • Representation

Koh 2017 — Influence Functions

Koh, Pang Wei and Percy Liang. “Understanding Black-box Predictions via Influence Functions” arXiv preprint arXiv:1703.04730 (2017).

Understanding adversarial inputs. Getting the “same” result through diverse paths. Influence functions, representation, and positive/negative data points.

  • Representation
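
The central quantity, paraphrased from the paper: the effect of upweighting a training point z on the loss at a test point, computed from a second-order approximation around the learned parameters.

```latex
% Influence of upweighting training point z on the test loss (our paraphrase):
\mathcal{I}_{\mathrm{up,loss}}(z, z_{\mathrm{test}})
  = -\,\nabla_{\theta} L(z_{\mathrm{test}}, \hat{\theta})^{\top}
     H_{\hat{\theta}}^{-1}\,
     \nabla_{\theta} L(z, \hat{\theta}),
\qquad
H_{\hat{\theta}} = \frac{1}{n}\sum_{i=1}^{n} \nabla_{\theta}^{2} L(z_i, \hat{\theta}).
```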

Krizhevsky 2012 — Convolutional Nets (ReLU)

Krizhevsky, Alex, Ilya Sutskever, and Geoffrey E. Hinton. “ImageNet classification with deep convolutional neural networks.” Advances in Neural Information Processing Systems. 2012.

Elegant series of hacks to reduce overfitting. A bit of hand waving. Reference to CPU speed and huge data sets. Depth is important, but nobody knows why.

  • Time
  • Representation

Kurita 2020 — Transfer attacks (backdoors)

Kurita, Keita, Paul Michel, Graham Neubig. “Weight Poisoning Attacks on Pre-trained Models” arXiv preprint arXiv:2004.06660 (2020).

Transfer attacks (one of the six BIML attack categories). Very basic results. Fairly obvious. Simple. Nice. Clear. (The only bug is poor terminology…misuse of “backdoor” which has crept into the MLsec literature.)

  • Attack-Lit-Pointers

Lake 2015 — Cogsci

Lake, Brenden, Ruslan Salakhutdinov, Joshua Tenenbaum. “Human-level concept learning through probabilistic program induction.” Science, vol. 350, no. 6266 (2015): 1332-1338.

Representation, models, and one-shot learning. A study promoting BPL.

  • AI-Philosophy
  • Representation

Lake 2017 — Recurrent Net Weakness

Lake, Brenden, and Marco Baroni. “Still not systematic after all these years: On the compositional skills of sequence-to-sequence recurrent networks.” (2018).

Naive micro domain with misleading maps into human semantics (movement). An artificial attack angle, with structure as the weapon.

  • AI-Philosophy
  • Representation

Lake 2020 — Concepts

Lake, Brenden, and Gregory L. Murphy. “Word meaning in minds and machines.” arXiv preprint arXiv:2008.01766 (2020).

Super clear (maybe obvious) treatment of fluid concepts a la dughof. Getting past the bag of words.

  • AI-Philosophy
  • Representation

Legg 2007 — Universal Intelligence Definition

Legg, Shane, and Marcus Hutter. “Universal Intelligence: A Definition of Machine Intelligence.” arXiv preprint arXiv:0712.3329 (2007).

This is as much a philosophy paper as it is an ML paper. Well worth a read, especially if you are not familiar with philosophy of mind and how it pertains to AI. Defines a (non-computable) measure of intelligence and then tries to move that to something useful.

  • AI-Philosophy
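
The definition in question, roughly: expected cumulative reward across all computable environments, weighted by simplicity (Kolmogorov complexity), which is exactly what makes it non-computable.

```latex
% Legg-Hutter universal intelligence of an agent \pi (our paraphrase):
\Upsilon(\pi) = \sum_{\mu \in E} 2^{-K(\mu)}\, V_{\mu}^{\pi}
% E: set of computable environments; K(\mu): Kolmogorov complexity of \mu;
% V_{\mu}^{\pi}: expected cumulative reward of agent \pi in environment \mu.
```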

Marcus 2018 — AI Perspective on ML

Marcus, Gary. “Deep learning: A critical appraisal.” arXiv preprint arXiv:1801.00631 (2018).

General overview tainted by an old-school AI approach. Makes clear that representation is essential yet overlooked. Some failure conditions noted, at a philosophical level.

  • AI-Philosophy

McGuffie 2020 — Terrorism policy BS

McGuffie, Kris, and Alex Newhouse. “The Radicalization Risks of GPT-3 and Advanced Neural Language Models.” Technical Report (2020).

A lightly reasoned paper that claims that GPT-3 capabilities (which are apparently assumed to have passed the Turing Test) will lead to more radicalization. Grab your pearls.

  • Policy

Merrill 2020 — RNN Theory

Merrill, William, Gail Weiss, Yoav Goldberg, Roy Schwartz, Noah A. Smith, Eran Yahav. “A Formal Hierarchy of RNN Architectures” arXiv preprint arXiv:2004.08500 (2020).

A CS theory paper that combines two lines of research: rational recurrence and sequential NNs as automata. Continuous inputs may be a problem.

  • Representation

Mitchell 2019 — Model Cards

Mitchell, Margaret, Simone Wu, Andrew Zaldivar, Parker Barnes, Lucy Vasserman, Ben Hutchinson, Elena Spitzer, Inioluwa Deborah Raji, and Timnit Gebru. “Model Cards for Model Reporting.” arXiv preprint arXiv:1810.03993 (2019).

A mix of sociology and political correctness with engineering transparency. Human-centric models emphasized.

  • Engineering

Mnih 2013 — Atari

Mnih, Volodymyr, Koray Kavukcuoglu, David Silver, Alex Graves, Ioannis Antonoglou, Daan Wierstra, and Martin Riedmiller. “Playing Atari with deep reinforcement learning.” arXiv preprint arXiv:1312.5602 (2013).

An application of convolutional nets where the game representation has been shoved through a filter. Some questions remain open regarding randomness in the games (making them very hard to learn): not dice rolling for turns, but random behavior that is cryptographically unpredictable. This paper made a bigger splash than it likely warranted.

  • Games

Oh 2018 — Reversing NNs through queries

Oh, Seong Joon, Max Augustin, Bernt Schiele, and Mario Fritz. “Towards Reverse-Engineering Black-Box Neural Networks.” arXiv preprint arXiv:1711.01768 (2018).

A goofy, clever, interesting paper that complements Wang. Well-written but not too deep.

  • Engineering

Peters 2018 — ELMo

Peters, Matthew E., Mark Neumann, Mohit Iyyer, Matt Gardner, Christopher Clark, Kenton Lee, and Luke Zettlemoyer. “Deep contextualized word representations.” arXiv preprint arXiv:1802.05365 (2018).

Important seminal work on ELMo. Some echoes of SDM and highly distributed representation power.

  • Representation

Phillips 2011 — Racism

Phillips, P. Jonathon, Fang Jiang, Abhijit Narvekar, Julianne Ayyad, and Alice J. O’Toole. “An other-race effect for face recognition algorithms.” ACM Transactions on Applied Perception (TAP) 8, no. 2 (2011): 14.

This paper is pretty stupid. The result is simply “when your data are racist, your system will be too,” which is beyond obvious to anyone who knows how ML works. This is what happens when psych people write about ML instead of CS people.

  • Sociology

Quinn 2017 (also mm17) — Dog Walker

Quinn, Max H., Erik Conser, Jordan M. Witte, and Melanie Mitchell. “Semantic Image Retrieval via Active Grounding of Visual Situations.” arXiv preprint arXiv:1711.00088 (2017).

Building up representations with a hybrid Copycat/NN model. Hofstadterian model. Time as an essential component in building up a representation.

  • AI-Philosophy
  • Representation

Rahwan 2019 — Towards a Study of Machine Behavior

Rahwan, Iyad, Manuel Cebrian, Nick Obradovich, Josh Bongard, Jean-François Bonnefon, Cynthia Breazeal, Jacob W. Crandall, Nicholas A. Christakis, Iain D. Couzin, Matthew O. Jackson, Nicholas R. Jennings, Ece Kamar, Isabel M. Kloumann, Hugo Larochelle, David Lazer, Richard McElreath, Alan Mislove, David C. Parkes, Alex ‘Sandy’ Pentland, Margaret E. Roberts, Azim Shariff, Joshua B. Tenenbaum & Michael Wellman. “Machine behavior.” Nature 568 (2019): 477-486.

Social science on machines. Very clear treatment. Trinity of trouble hinted at. Good analogs for security. Is ML code/data open source or not?

  • Sociology

Ribeiro 2020 — ATCG for NNs

Ribeiro, Marco T., Tongshuang Wu, Carlos Guestrin, Sameer Singh. “Beyond Accuracy: Behavioral Testing of NLP models with CheckList” arXiv preprint arXiv:2005.04118 (2020).

Very basic approach to black-box ATCG that begins to ask WHAT exactly should be tested and how to get past accuracy. Obvious and fairly shallow from a testing perspective.

  • Engineering

Schmidhuber 2010 — Creativity

Schmidhuber, Jürgen. “Formal Theory of Creativity, Fun, and Intrinsic Motivation (1990-2010).” Technical Report (2010).

Post-facto justification of “the thing I built.” An overview with interesting mappings to aesthetics and self-motivation. The lossless compression angle is weird. Flirts with innocent crackpotism.

  • Creativity

Sculley 2015 — Software Engineering Would Help

Sculley, David, Gary Holt, Daniel Golovin, Eugene Davydov, Todd Phillips, Dietmar Ebner, Vinay Chaudhary, Michael Young, Jean-Francois Crespo, and Dan Dennison. “Hidden technical debt in machine learning systems.” In Advances in neural information processing systems, pp. 2503-2511. 2015.

Random kludges built of interlocked pieces and parts are a bad idea. This applies to ML as well. Light on analysis and misdirected in focus.

  • Engineering

Sculley-ccard 2014 — Technical Debt

Sculley, D., Gary Holt, Daniel Golovin, Eugene Davydov, Todd Phillips, Dietmar Ebner, Vinay Chaudhary, and Michael Young. “Machine learning: The high interest credit card of technical debt.” (2014).

A diatribe against deadlines and just making stuff work. Naive criticism of flaws.

  • Engineering
  • Representation

Sculley 2018 — Empirical Rigor in ML Research

Sculley, D., Jasper Snoek, Ali Rahimi, and Alex Wiltschko. “Winner’s Curse? On Pace, Progress, and Empirical Rigor.” ICLR 2018 Workshop paper (2018).

Argues for a scientific approach. General and pretty obvious.

  • Engineering

Shwartz-Ziv 2017 — Representation

Shwartz-Ziv, Ravid, and Naftali Tishby. “Opening the black box of Deep Neural Networks via Information.” arXiv preprint arXiv:1703.00810 (2017).

An opaque paper on representation. Pretty far afield from security.

  • Representation

Shankar 2020 — Microsoft on MLsec

Ram Shankar Siva Kumar, Magnus Nyström, John Lambert, Andrew Marshall, Mario Goertzel, Andi Comissoneru, Matt Swann, Sharon Xia “Adversarial Machine Learning — Industry Perspectives” arXiv preprint arXiv:2002.05646 (2020).

Microsoft’s first stab at Threat Modeling for ML. Problems with nomenclature are par for the course for Microsoft (e.g., “adversarial ML” should be “MLsec”). This is a solid start but needs deeper thought. More emphasis on design would help. Also see the related BIML blog entry.

  • Engineering
  • MLsec

Shu 2020 — Disentanglement

Shu, Rui, Yining Chen, Abhishek Kumar, Stefano Ermon, Ben Poole “Weakly Supervised Disentanglement with Guarantees” arXiv preprint arXiv:1910.09772 (2020).

A complex paper on representation. Worth a close reading.

  • Representation

Silver 2017 — AlphaGo

Silver, David, Julian Schrittwieser, Karen Simonyan, Ioannis Antonoglou, Aja Huang, Arthur Guez, Thomas Hubert et al. “Mastering the game of Go without human knowledge.” Nature 550, no. 7676 (2017): 354.

AlphaGo trains itself by playing itself. Surprising and high-profile results. Monte Carlo tree search seems to underlie the results (which representations are amenable to that kind of search?). Unclear how general these results are or if they only apply to certain games with fixed rules and perfect knowledge.

  • Games

Slack 2020 — Adversarial Classifiers

Slack, Dylan, Sophie Hilgard, Emily Jia, Sameer Singh, and Himabindu Lakkaraju. “Fooling LIME and SHAP: Adversarial Attacks on Post hoc Explanation Methods” arXiv preprint arXiv:1911.02508 (2020).

Adversarial classifiers with a focus on ML bias including racism and sexism in black box models.

  • Attack-Lit-Pointers

Springer 2018 — Sparse Coding is Good

Springer, Jacob M., et al. “Classifiers Based on Deep Sparse Coding Architectures are Robust to Deep Learning Transferable Examples.” arXiv preprint (2018).

Important theory, but silly experiment. Hints at the importance of context, concept activation, and dynamic representation. Explores limits of transfer attacks WRT representation

  • Representation

Stretcu 2020 — Curriculum Learning

Stretcu, Otilia, Emmanouil Antonios Platanios, Tom Mitchell, Barnabás Póczos. “Coarse-to-Fine Curriculum Learning for Classification.” ICLR 2020 Workshop paper (2020).

Ties to error-making, confusion matrices, and representation

  • Representation

Sundararajan 2017 — Explaining Networks

Sundararajan, Mukund, Ankur Taly, and Qiqi Yan. “Axiomatic Attribution for Deep Networks” arXiv preprint arXiv:1703.01365 (2017).

A strangely-written paper trying to get to the heart of describing why a network does what it does. Quirky use of mathematical style. Hard to understand and opaque.

  • Representation
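
For reference, the attribution the paper axiomatizes (integrated gradients), in our paraphrase: feature i’s share of the output F is the path integral of the gradient along the straight line from a baseline x' to the input x.

```latex
% Integrated gradients attribution for feature i, relative to baseline x'
% (our paraphrase of the paper's definition):
\mathrm{IG}_i(x) = (x_i - x'_i)\int_{0}^{1}
  \frac{\partial F\!\big(x' + \alpha\,(x - x')\big)}{\partial x_i}\, d\alpha
```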

Tenenbaum 2011 — Review

Tenenbaum, Joshua B., Charles Kemp, Thomas L. Griffiths, and Noah D. Goodman. “How to Grow a Mind: Statistics, Structure, and Abstraction.” Science, vol. 331, no. 6022 (2011): 1279-1285.

A practical philosophy of AI paper focused on bridging the usual symbolic vs sub-symbolic chasm. Overfocus on HBMs, but worth a read to understand the role that structure plays in intelligence and representation.

  • AI-Philosophy

Udrescu 2020 — AI Feynman

Udrescu, Silviu-Marian, Max Tegmark. “AI Feynman: a Physics-Inspired Method for Symbolic Regression” arXiv preprint arXiv:1905.11481 (2020).

Crazy. Interesting use of NN to find simplicity for physics equations. NN vs GA battles.

  • Representation

Vaswani 2017 — BERT precursor

Vaswani, Ashish, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Łukasz Kaiser, and Illia Polosukhin. “Attention Is All You Need.” 31st Conference on Neural Information Processing Systems. 2017.

BERT precursor.

  • Attack-Lit-Pointers

Wallace 2020 — Attacking Machine Translation

Wallace, Eric, Mitchell Stern, and Dawn Song. “Imitation Attacks and Defenses for Black-box Machine Translation Systems.” arXiv preprint arXiv:2004.15015 (2020).

Attacking machine translation: (1) distill the model by query (cloning), (2) use the distilled version as a whitebox, (3) a defense that fails. (Attacks Bing and SYSTRAN. Real systems!) See the sketch below for the cloning step.

  • Attack-Lit-Pointers
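
A hypothetical sketch of the cloning step (step 1), not the authors’ code: label a monolingual corpus by querying the black-box translation API, then train an imitation model on those pairs so it can stand in as a whitebox target. All three arguments are stand-ins.

```python
def clone_translation_model(blackbox_translate, monolingual_corpus, imitation_model):
    """Distill-by-query sketch. `blackbox_translate`, `monolingual_corpus`,
    and `imitation_model` are hypothetical; any seq2seq trainer would do."""
    pairs = [(src, blackbox_translate(src)) for src in monolingual_corpus]  # label by query
    imitation_model.train(pairs)  # supervised imitation of the black box
    return imitation_model  # usable as a whitebox for gradient-based attacks
```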

Wang 2020 — Stealing Hyperparameters

Wang, Binghui, and Neil Gong. “Stealing Hyperparameters in Machine Learning.” arXiv preprint arXiv:1802.05351 (2020).

Fairly trivial, poorly motivated threat model. The notion of getting “free cycles” is not too impressive. Good history overview of ML security.

  • Attack-Lit-Pointers

Wang 2018 — Transfer Learning Attacks

Wang, Bolun, Yuanshun Yao, Bimal Viswanath, Haitao Zheng, and Ben Y. Zhao. “With great training comes great vulnerability: Practical attacks against transfer learning.” In 27th USENIX Security Symposium (USENIX Security 18), pp. 1281-1297. 2018.

Attacks against transfer learning in cascaded systems. If the set of all trained networks is small, this work holds water. “Empirical” settings. Some NNs highly susceptible to tiny noise. Good use of confusion matrix. Dumb defense through n-version voting.

  • Attack-Lit-Pointers

Witty 2019 — Causal Inference

Witty, Sam, Alexander Lew, David Jensen, Vikash Mansinghka. “Bayesian causal inference via probabilistic program synthesis” arXiv preprint arXiv:1910.14124 (2019).

Interesting paper about a toy problem. Not much of a tutorial. Doesn’t really stand on its own…so more of a teaser.

  • AI-Philosophy

Videos and Popular Press

Ian Goodfellow gives a talk about adversarial examples. Lots of solid thinking about the subject.

James Mickens (Harvard University), 27th USENIX Security Symposium: “Q: Why Do Keynote Speakers Keep Suggesting That Improving Security Is Possible? A: Because Keynote Speakers Make Bad Life Decisions and Are Poor Role Models.”

Ali Rahimi’s talk at NIPS (NIPS 2017 Test-of-Time Award presentation)

Ingredients of Intelligence (video): Brenden Lake explains why he builds computer programs that seek to mimic the way humans think. Brenden Lake, NYU, March 26, 2018 | EmTech Digital


Douglas R. Hofstadter, “The Shallowness of Google Translate” from the Atlantic Monthly