Annotated Bibliography
As our research group reads and discusses scientific papers in MLsec, we add an entry to this bibliography. We also curate a “top 5” list.
Top 5 Papers
Carlini 2020 — Extraction attacks
Carlini, Nicholas, Florian Tramèr, Eric Wallace, Matthew Jagielski, Ariel Herbert-Voss, Katherine Lee, Adam Roberts, Tom Brown, Dawn Song, Úlfar Erlingsson, Alina Oprea, and Colin Raffel. “Extracting Training Data from Large Language Models.” arXiv preprint arXiv:2012.07805 (2020).
Classic and easy extraction clearly explained. Striking results (but not that deep).
Gilmer 2018 — Adversarial Examples
Gilmer, Justin, Ryan P. Adams, Ian Goodfellow, David Andersen, and George E. Dahl. “Motivating the Rules of the Game for Adversarial Example Research.” arXiv preprint arXiv:1807.06732 (2018).
Great use of realistic scenarios in a risk analysis. Hilariously snarky.
Jetley 2018 — On generalization and vulnerability
Jetley, Saumya, Nicholas A. Lord, and Philip H. S. Torr. “With Friends Like These, Who Needs Adversaries?” 32nd Conference on Neural Information Processing Systems. 2018.
Excellent paper. Driven by theory and demonstrated by experimentation: generalization in DCNs trades off against vulnerability.
Papernot 2016 — Building Security In for ML (IT stance)
Papernot, Nicolas, Patrick McDaniel, Arunesh Sinha, Michael Wellman. “SoK: Towards the Science of Security and Privacy in Machine Learning.” arXiv preprint arXiv:1611.03814 (2016).
A clear, concise, and expansive paper. The takeaway lessons are particularly useful.
Shumailov 2020 — Energy DoS attacks against NNs (uses GAs)
Shumailov, Ilia, Yiren Zhao, Daniel Bates, Nicolas Papernot, Robert Mullins, Ross Anderson. “Sponge Examples: Energy-Latency Attacks on Neural Networks.” arXiv preprint arXiv:2006.03463 (2020).
Excellent paper, very clear and well-stated. Availability attacks against DNNs. Makes use of GAs to evolve attack input. Energy consumption is the target.
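To get a feel for the attack, here is a minimal sketch (our illustration, not the authors’ code): a toy genetic algorithm that evolves inputs toward slower inference, with wall-clock latency standing in for the energy measurements used in the paper. The `model`, `seed_inputs`, and `mutate` hooks are assumptions.

```python
import random
import time

def sponge_search(model, seed_inputs, mutate, pop_size=32, generations=50):
    """Toy sponge-style search: evolve inputs that make inference slow.

    `model` is any callable we can time, `seed_inputs` is a list of starting
    inputs, and `mutate` perturbs one input. Fitness is wall-clock latency,
    a crude proxy for the energy consumption targeted in the paper.
    """
    population = list(seed_inputs)[:pop_size]

    def fitness(x):
        start = time.perf_counter()
        model(x)                                   # one forward pass
        return time.perf_counter() - start

    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)
        parents = ranked[: pop_size // 4]          # keep the slowest inputs
        children = [mutate(random.choice(parents))
                    for _ in range(pop_size - len(parents))]
        population = parents + children
    return max(population, key=fitness)            # best "sponge" found
```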
Other Papers
Antorán 2020 — Uncertainty
Antorán, Javier, Umang Bhatt, Tameen Adel, Adrian Weller, and José Miguel Hernández-Lobato. “Getting a CLUE: A Method for Explaining Uncertainty Estimates.” ICLR 2020 Workshop paper (2020).
Representation helps with the why of uncertainty. Little relevance to security. Error bars.
Arora 2018 — Multiple Meanings
Arora, Sanjeev, Yuanzhi Li, Yingyu Liang, Tengyu Ma, and Andrej Risteski. “Linear algebraic structure of word senses, with applications to polysemy.” Transactions of the Association of Computational Linguistics 6 (2018): 483-495.
Structured representations that capture distributed sub-features (micro-topics) through ML. Beyond word2vec and GloVe, adding “semantics.”
Barreno 2010 — Fundamental work in MLsec
Barreno, Marco, Blaine Nelson, Anthony D. Joseph, J.D. Tygar. “The security of machine learning.” Machine Learning, 81:2, pp. 121-148 (November 2010).
Solid but dated work with lots of fundamentals. Made harder to grasp by mixing two issues: ML FOR security and security OF ML. Untangling these things is critical. (Also see their 2006 paper.)
Bellamy (IBM) 2018 — IBM User Manual
Bellamy, Rachel, Kuntal Dey, Michael Hind, Samuel C. Hoffman, Stephanie Houde, Kalapriya Kannan, Pranay Lohia, Jacquelyn Martino, Sameep Mehta, Aleksandra Mojsilovic, Seema Nagar, Karthikeyan Natesan Ramamurthy, John Richards, Diptikalyan Saha, Prasanna Sattigeri, Moninder Singh, Kush R. Varshney, and Yunfeng Zhang. “AI Fairness 360: An Extensible Toolkit for Detecting, Understanding, and Mitigating Unwanted Algorithmic Bias” arXiv preprint arXiv:1810.01943 (2018).
Kind of like reading a manual and a marketing glossy mashup. Nothing at all about making actual bias decisions. Bag of tools described.
Bender 2021 — Stochastic Parrots
Bender, Emily, Angelina McMillan-Major, Timnit Gebru, and Shmargaret Shmitchell. “On the Dangers of Stochastic Parrots: Can Language Models Be Too Big?” FAccT ’21, March 3-10, 2021, Virtual Event, Canada.
The infamous paper that got Timnit fired. Continuing to scale may not be the NLP answer. A few too many reasons to try some other things. Great points interspersed with political diatribe.
Bender 2020 — Understanding
Bender, Emily, and Alexander Koller. “Climbing towards NLU: On Meaning, Form, and Understanding in the Age of Data.” Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics (July 2020): 5185-5198.
A narrow view of LM. Lacks a conception of emergence. Right result, but wrong reasons.
Biggio 2018 — Biggio on Adversarial Machine Learning
Biggio, Battista, and Fabio Roli. “Wild Patterns: Ten Years After the Rise of Adversarial Machine Learning.” arXiv preprint arXiv:1712.03141 (2018).
Myopia abounds. This is basically a review paper. (Very defensive of prior work by the author.)
Buchanan 2020 — National Security Policy
Buchanan, Ben. “A National Security Research Agenda for Cybersecurity and Artificial Intelligence.” CSET Policy Brief (2020).
Good work with some base confusion between security OF ML (what BIML does) and ML FOR security. ML is not a magic force multiplier. The #MLsec section is OK but too heavy on adversarial examples.
Bosselut 2019 — COMET
Bosselut, Antoine, Hannah Rashkin, Maarten Sap, Chaitanya Malaviya, Asli Celikyilmaz, and Yejin Choi. “COMET: Commonsense Transformers for Automatic Knowledge Graph Construction.” arXiv preprint arXiv:1906.05317 (2019).
Building an informal KB with less structure. Allow internal structure to form. Discrete representation — corpus representation.
Carlini 2019 — Memorization and Data Leaking
Carlini, Nicholas, Chang Liu, Úlfar Erlingsson, Jernej Kos, and Dawn Song. “The Secret Sharer: Evaluating and Testing Unintended Memorization in Neural Networks.” arXiv preprint arXiv:1802.08232 (2019).
Clear, cogent and fairly simple. Great results. Protecting secrets in ML data.
Chollet 2019 — On the Measure of Intelligence
Chollet, François. “On the Measure of Intelligence.” arXiv preprint arXiv:1911.01547 (2019).
An interesting perspective on progress in AI with a particular view of history biased towards ML. Focuses on the importance of generalization and learning. Some discussion of collective entities. The author develops a formalism with pretty terrible notation. Then comes ARC, the Abstraction and Reasoning Corpus, a benchmark for general intelligence.
Chen 2017 — Backdoor attacks coined
Chen, Xinyun, Chang Liu, Bo Li, Kimberly Lu, and Dawn Song. “Targeted Backdoor Attacks on Deep Learning Systems Using Data Poisoning.” arXiv preprint arXiv:1712.05526 (2017).
A badly written and loosely constructed paper that introduces the (poorly chosen) “backdoor” terminology. The work is about data poisoning attacks.
Christiansen 2016 — Language Representation and Structure
Christiansen, Morten H., and Nick Chater. “The Now-or-Never bottleneck: A fundamental constraint on language.” Behavioral and Brain Sciences 39 (2016).
Too much psychology and not enough ML. This paper is about context in language representation, including look-ahead and structured patterns. The main question: how big is your buffer?
Dai 2019 — Transformer-XL
Dai, Zihang, Zhilin Yang, Yiming Yang, William W. Cohen, Jaime Carbonell, Quoc V. Le, and Ruslan Salakhutdinov. “Transformer-XL: Attentive language models beyond a fixed-length context.” arXiv preprint arXiv:1901.02860 (2019).
Getting past fixed-length context through various kludges. Recursive feedback to represent previous state.
D’Amour 2020 — Underspecification
D’Amour, Alexander, Katherine Heller, Dan Moldovan, Ben Adlam, Babak Alipanahi, Alex Beutel, Christina Chen, Jonathan Deaton, Jacob Eisenstein, Matthew D. Hoffman, Farhad Hormozdiari, Neil Houlsby, Shaobo Hou, Ghassen Jerfel, Alan Karthikesalingam, Mario Lucic, Yian Ma, Cory McLean, Diana Mincu, Akinori Mitani, Andrea Montanari, Zachary Nado, Vivek Natarajan, Christopher Nielson, Thomas F. Osborne, Rajiv Raman, Kim Ramasamy, Rory Sayres, Jessica Schrouff, Martin Seneviratne, Shannon Sequeira, Harini Suresh, Victor Veitch, Max Vladymyrov, Xuezhi Wang, Kellie Webster, Steve Yadlowsky, Taedong Yun, Xiaohua Zhai, D. Sculley. “Underspecification Presents Challenges for Credibility in Modern Machine Learning.” arXiv preprint arXiv:2011.03395 (2020).
Very nice work. Strange terminology, but intuitive results. Makes us ask “what is sparseness?”
D’Amour 2020 — Dynamic Simulation of Fairness
D’Amour, Alexander, Hansa Srinivasan, James Atwood, Pallavi Baljekar, D. Sculley, and Yoni Halpern. “Fairness Is Not Static: Deeper Understanding of Long Term Fairness via Simulation Studies.” FAT* ’20: Proceedings of the 2020 Conference on Fairness, Accountability, and Transparency (January 2020): 525-534.
Sociology? Economics? Some simple experiments, well explained, but the results lack clarity.
Danzig 2022 — Machines, Bureaucracies, and Markets as Artificial Intelligences
Danzig, Richard. “Machines, Bureaucracies, and Markets as Artificial Intelligences.” (2022).
An outstanding treatise on AI, ML, and emergent systems, premised on the idea that we have something to learn about those fields by studying markets and bureaucracies. Highly readable and thought provoking.
De Deyne 2020 — Psych Rep Grounding
De Deyne, Simon, Danielle Navarro, Guillem Collell, and Andrew Perfors. “Visual and Affective Grounding in Language and Mind.” PsyArXiv preprint q97f8 (2020).
Too much insider psych gobbledygook in this paper. Lots of results, very poorly presented. An important subject best approached another way.
Devlin 2018— BERT (transformers) and pre-training
Devlin, Jacob, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. “Bert: Pre-training of deep bidirectional transformers for language understanding.” arXiv preprint arXiv:1810.04805 (2018).
Input windows and representation. Precomputing leads to transfer attacks.
Dhariwal 2020— Music generation
Dhariwal, Prafulla, Heewoo Jun, Christine Payne, Jong Wook Kim, Alec Radford, Ilya Sutskever. “Jukebox: A Generative Model for Music.” arXiv preprint arXiv:2005.00341 (2020).
Generating music with a very weird model. Training a model on raw audio. Also see https://openai.com/blog/jukebox/
Ellis 2020 — DreamCoder
Ellis, Kevin, Catherine Wong, Maxwell Nye, Mathias Sable-Meyer, Luc Cary, Lucas Morales, Luke Hewitt, Armando Solar-Lezama, and Joshua B. Tenenbaum. “DreamCoder: Growing generalizable, interpretable knowledge with wake-sleep Bayesian program learning.” arXiv preprint arXiv:2006.08381 (2020).
Great paper combining symbolic, functional, and statistical AI in an elegant way.
Eniser 2020— Adversarial Image Defense
Eniser, Hasan Ferit, Maria Christakis, and Valentin Wüstholz. “RAID: Randomized Adversarial-Input Detection for Neural Networks.” arXiv preprint arXiv:2002.02776 (2020).
This paper describes a very narrow defense against adversarial image input. Experiments are very arbitrary and lack focus. One interesting note is that the defense leverages activation patterns.
Eykholt 2018— Physical Attacks on Vision
Eykholt, Kevin, Ivan Evtimov, Earlence Fernandes, Bo Li, Amir Rahmati, Chaowei Xiao, Atul Prakash, Tadayoshi Kohno, and Dawn Song. “Robust physical-world attacks on deep learning visual classification.” In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1625-1634. 2018.
Tape on the stop sign paper. Fairly naive attacks on non-robust representations that are meant to be psychologically plausible in that humans won’t notice. Many “empirical” settings.
Fazelpour 2020— Algorithmic Fairness
Fazelpour, Sina, and Zachary C. Lipton. “Algorithmic Fairness from a Non-ideal Perspective.” arXiv preprint arXiv:2001.09773 (2020).
An uncharacteristically good social justice in ML paper. Addresses the broader problem of algorithmic failure. Written by computer scientists (less gobbledygook).
Feldman 2020— Does learning require memorization? a short tale about a long tail.
Feldman, Vitaly. “Does learning require memorization? a short tale about a long tail.” In Proceedings of the 52nd Annual ACM SIGACT Symposium on Theory of Computing, pp. 954-959. 2020.
A set of very intuitive, well-explained ideas backed up by reams of somewhat inscrutable math. Upshot: memorization is often unavoidable and mechanisms to limit it screw things up.
Gamaleldin 2018— Adversarial Reprogramming
Elsayed, Gamaleldin F., Ian Goodfellow, and Jascha Sohl-Dickstein. “Adversarial Reprogramming of Neural Networks.” arXiv preprint arXiv:1806.11146 (2018).
A very interesting paper well worth a read, though the work is very weird. The idea of reprogramming existing ML tech stacks in an adversarial fashion is powerful. Given a Turing complete language construct, all kinds of terrible shenanigans could result. Imagine ransomware running on photo recognition ML machines.
Goldwasser 2022 — Planting Undetectable Backdoors in Machine Learning Models
Goldwasser, Shafi, Michael P. Kim, Vinod Vaikuntanathan, and Or Zamir. “Planting Undetectable Backdoors in Machine Learning Models.” arXiv preprint arXiv:2204.06974 (2022).
You can’t test your way out of possible backdoor space (in CS or in deep learning). Running arbitrary code someone evil wrote is not safe. Obvious and good. You can Trojan EVERY DNN undetectably.
Goodman 2019 — Wagner on Adversarial Testing
Goodman, Dan, and Tao Wei. “Cloud-based Image Classification Service Is Not Robust To Simple Transformations: A Forgotten Battlefield.” arXiv preprint arXiv:1906.07997 (2019).
Naive experiment on cloud services using well-known methods. Real result: hints at structured noise vs statistical noise as attack type. Representation matters.
GPT-3 2020 — GPT-3 Launch Paper
Brown, Tom B., Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared Kaplan, Prafulla Dhariwal, Arvind Neelakantan, Pranav Shyam, Girish Sastry, Amanda Askell, Sandhini Agarwal, Ariel Herbert-Voss, Gretchen Krueger, Tom Henighan, Rewon Child, Aditya Ramesh, Daniel M. Ziegler, Jeffrey Wu, Clemens Winter, Christopher Hesse, Mark Chen, Eric Sigler, Mateusz Litwin, Scott Gray, Benjamin Chess, Jack Clark, Christopher Berner, Sam McCandlish, Alec Radford, Ilya Sutskever, Dario Amodei. “Language Models are Few-Shot Learners” arXiv preprint arXiv:2005.14165 (2020).
Autoregressive language model that predicts the next token. Memorization?! Astounding results. Section 6 is a basic treatment of MLsec issues by Ariel Herbert-Voss. A little too much ass-covering on the bias front, but well worth thinking about.
Graves 2014 — RNN Handwriting Generation
Graves, Alex. “Generating Sequences With Recurrent Neural Networks.” arXiv preprint arXiv:1308.0850 (2014).
Engineering tract documenting an auto-regressive model and various kludges. Reads like a thesis. Kludge heavy.
Gregor 2018 — Temporal Difference Variational Auto-Encoder
Gregor, Karol, George Papamakarios, Frederic Besse, Lars Buesing, and Theophane Weber. “Temporal difference variational auto-encoder.” arXiv preprint arXiv:1806.03107 (2018).
This paper is a mumbo jumbo mix of insider language and statistics. This is motivated by work at the very edge but does not help anyone other than scientists at the very edge. Even the problem they are trying to solve is unclear and badly motivated. Skip it.
Gu 2019 — BadNets: Classic Data Poisoning
Gu, Tianyu, Brendan Dolan-Gavitt, Siddharth Garg. “BadNets: Identifying Vulnerabilities in the Machine Learning Model Supply Chain” arXiv preprint arXiv:1708.06733 (2019).
A paper about Trojan functionality. Solidly written and easy to understand. This is classic data poisoning.
Guedj 2019 — PAC-Bayes
Guedj, Benjamin. “A Primer on PAC-Bayesian Learning” arXiv preprint arXiv:1901.05353 (2019).
Heavy on theory. Solid intro to PAC-Bayes. Relevant to ML bounding conditions in some cases.
Hall 2019 — XAI (explainable AI)
Hall, Patrick, Navdeep Gill, and Nicholas Schmidt. “Proposed Guidelines for the Responsible Use of Explainable Machine Learning” arXiv preprint arXiv:1906.03533 (2019).
Explanation versus testing and debugging. This paper is weirdly legalistic. Lots of financial system examples.
Hawkins 2016 — A Theory of Sequence Memory in Neocortex
Hawkins, Jeff, and Subutai Ahmad. “Why neurons have thousands of synapses, a theory of sequence memory in neocortex.” Frontiers in neural circuits (2016): 23.
Cells that fire together, wire together. Hebb rule as instantiated in dendrites. A more realistic neuron model.
Henderson 2018 — Hacking Around with ML
Henderson, Peter, Riashat Islam, Philip Bachman, Joelle Pineau, Doina Precup, and David Meger. “Deep Reinforcement Learning that Matters.” arXiv preprint arXiv:1709.06560 (2018).
We tweaked lots of things and found some stuff. Things matter. How you measure stuff also matters.
Hendrycks 2019 — Robustness (or not)
Hendrycks, Dan, and Thomas Dietterich. “Benchmarking Neural Network Robustness to Common Corruptions and Perturbations.” arXiv preprint arXiv:1903.12261 (2019).
How to “spread out” generalization. Some influence from human error-making would help. Perturbations.
Hendrycks 2020 — Robustness (or not)
Hendrycks, Dan, Steven Basart, Norman Mu, Saurav Kadavath, Frank Wang, Evan Dorundo, Rahul Desai, Tyler Zhu, Samyak Parajuli, Mike Guo, Dawn Song, Jacob Steinhardt, and Justin Gilmer. “The Many Faces of Robustness: A Critical Analysis of Out-of-Distribution Generalization.” arXiv preprint arXiv:2006.16241 (2020).
Robustness can’t be achieved with simple distribution shifts. Clear result.
Hinton 2015 — Review
LeCun, Yann, Yoshua Bengio, and Geoffrey Hinton. “Deep learning.” Nature 521, no. 7553 (2015): 436-444.
This review from Nature covers the basics in an introductory way. Some hints at representation as a thing. Makes clear that more data and faster CPUs account for the resurgence.
Hoffmann 2019 — Fairness Politics
Hoffmann, Anna Lauren. “Where fairness fails: data, algorithms, and the limits of antidiscrimination discourse.” Information, Communication & Society, Volume 22, Number 7, pp. 900-915 (2019).
This paper is all problems and no solutions couched in high academic blather. A (very negative) overview of politics and ML/AI for an audience of insiders.
Hong 2019 — Hardware Fault Injection
Hong, Sanghyun, Pietro Frigo, Yiğitcan Kaya, Cristiano Giuffrida, and Tudor Dumitraş. “Terminal Brain Damage: Exposing the Graceless Degradation in Deep Neural Networks Under Hardware Fault Attacks.” arXiv preprint arXiv:1906.01017 (2019).
NN’s run on computers, oh my! Rowhammer attacks against running NN models work just fine.
Jacobsen 2019 — Adversarial Examples
Jörn-Henrik Jacobsen, Jens Behrmann, Richard Zemel, Matthias Bethge. “Excessive Invariance Causes Adversarial Vulnerability.” arXiv preprint 1811.00401v2 (2019)
Argues that excessive invariance in learned representations is itself a source of adversarial vulnerability, complementing the usual perturbation-sensitivity story.
Jagielski 2018— Data Poisoning
Jagielski, Matthew, Alina Oprea, Battista Biggio, Chang Liu, Cristina Nita-Rotaru, and Bo Li. “Manipulating Machine Learning: Poisoning Attacks and Countermeasures for Regression Learning.” arXiv preprint arXiv:1804.00308 (2018).
A solid introduction to the data poisoning subfield. This is a critical category of ML attacks. See the BIML ML attack taxonomy here.
Jha 2019— (Weak) Adversarial Defense
Jha, Susmit, Sunny Raj, Steven Lawrence Fernandes, Sumit Kumar Jha, Somesh Jha, Gunjan Verma, Brian Jalaian, and Ananthram Swami. “Attribution-driven Causal Analysis for Detection of Adversarial Examples.” arXiv preprint arXiv:1903.05821 (2019).
Treating pixels in an image as very small “features,” this work tries to kill important features that drive too much of the output (in some sense weakening the natural representation). This kind of masking makes the networks perform poorly. Pretty dumb.
Johnson 2013— Rise of New Machine Ecology
Johnson, Neil, Guannan Zhao, Eric Hunsader, Hong Qi, Nicholas Johnson, Jing Meng, and Brian Tivnan. “Abrupt rise of new machine ecology beyond human response time.” Scientific Reports 3, no. 1 (2013): 1-7.
You’ve probably heard of high frequency trading, flash crashes, etc. This paper explains how adaptive algorithms are involved in this activity and how they happen at subhuman perception speeds. A picosecond is a thing.
Jin 2020— Adversarial Text
Jin, Di, Zhijing Jin, Joey Tianyi Zhou, and Peter Szolovits. “Is BERT Really Robust? A Strong Baseline for Natural Language Attack on Text Classification and Entailment.” arXiv preprint arXiv:1907.11932 (2020).
A cute but not very profound paper. Focuses on attack category #1 (adversarial examples) approached through text processing. BERT is an important language processing model and serves as the target. Low detectability plays a role in the attack model.
Jones 2004 — NLP and Generative Models
Jones, Karen Spärck. “Language modelling’s generative model: is it rational?” Technical Report, University of Cambridge (June 2004).
A hilarious paper that is critical of LMs (as defined very tightly by the author). The appendix is more useful than the rambling main text.
Jumper 2021— Highly accurate protein structure prediction with AlphaFold
Jumper, John, Richard Evans, Alexander Pritzel, Tim Green, Michael Figurnov, Olaf Ronneberger, Kathryn Tunyasuvunakool et al. “Highly accurate protein structure prediction with AlphaFold.” Nature 596, no. 7873 (2021): 583-589.
A difficult to read paper (due mostly to unfamiliarity with the large number of subfields), but very interesting work. Computational geometry, optimization, physics, microbiology, evolution… combined into a notably better deep learning system informed by science. Hybrid model for the win.
Juuti 2019— PRADA: Protecting against DNN Model Stealing Attacks
Juuti, Mika, Sebastian Szyller, Samuel Marchal, and N. Asokan. “PRADA: protecting against DNN model stealing attacks.” In 2019 IEEE European Symposium on Security and Privacy (EuroS&P), pp. 512-527. IEEE, 2019.
This paper is very good (and is number 6 in our top 5 list). A super clear treatment of extraction attacks and adversarial examples with nice notation, excellent algorithmic description, and solid basic concepts. Describes improved and generalized extraction attacks and protections against them. The protections are somewhat naïve.
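For readers new to extraction, here is a bare-bones sketch of the attack loop being defended against (our illustration, not the paper’s algorithm); `victim_predict` and `query_pool` are assumed interfaces, and scikit-learn’s k-NN stands in for whatever surrogate an attacker would actually train.

```python
from sklearn.neighbors import KNeighborsClassifier

def extract_surrogate(victim_predict, query_pool):
    """Label a pool of inputs via the victim's prediction API, then fit a
    local surrogate to the stolen labels. PRADA's defense watches for the
    unusually structured query sequences that loops like this produce."""
    X = list(query_pool)                      # attacker-chosen query inputs
    y = [victim_predict(x) for x in X]        # each call is a (paid) API query
    surrogate = KNeighborsClassifier(n_neighbors=5)
    surrogate.fit(X, y)                       # the cloned model
    return surrogate
```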
Kairouz 2019— Censored and Fair Universal Representations using Generative Adversarial Models
Kairouz, Peter, Jiachun Liao, Chong Huang, Maunil Vyas, Monica Welfert, and Lalitha Sankar. “Generating Fair Universal Representations using Adversarial Models.” arXiv preprint arXiv:1910.00411 (2019).
A clear but very dense paper. Use GANs to hide sensitive features in a representation. The encoder tries to find the sensitive features. This purports to work on fairness.
Kaplan 2020— Enormous Transformers
Kaplan, Jared, Sam McCandlish, Tom Henighan, Tom B. Brown, Benjamin Chess, Rewon Child, Scott Gray, Alec Radford, Jeffrey Wu, Dario Amodei. “Scaling Laws for Neural Language Models” arXiv preprint arXiv:2001.08361 (2020).
What’s going on with enormous transformers. Language treated mechanically. Impressive empirical paper. How general are these tasks and data sets?
Kazemi 2019— Time
Kazemi, Seyed, Rishab Goel, Sepehr Eghbali, Janahan Ramanan, Jaspreet Sahota, Sanjay Thakur, Stella Wu, Cathal Smyth, Pascal Poupart, Marcus Brubaker. “Time2Vec: Learning a Vector Representation of Time” arXiv preprint arXiv:1907.05321 (2019).
Very abstract treatment of time represented as a learned periodic vector. More engineering than ML.
Kilbertus 2018— Learning and Causality
Kilbertus, Niki, Giambattista Parascandolo, Bernhard Schölkopf. “Generalization in anti-causal learning” arXiv preprint arXiv:1812.00524 (2018).
A vague position paper that is more philosophy than anything else. Emphasizes the importance of generation (and causal models). Representation issues around continuity are explored.
Kleinberg 2016— Bias Tradeoffs
Kleinberg, Jon, Sendhil Mullainathan, Manish Raghavan. “Inherent Trade-Offs in the Fair Determination of Risk Scores” arXiv preprint arXiv:1609.05807 (2016).
Very strong for a bias paper. Brings some rigor to goal states and makes clear that tradeoffs exist. If you read only one bias paper, read this one.
Koh 2017— Influence Functions
Koh, Pang Wei and Percy Liang. “Understanding Black-box Predictions via Influence Functions” arXiv preprint arXiv:1703.04730 (2017).
Understanding adversarial inputs. Getting the “same” result through diverse paths. Influence functions, representation, and positive/negative data points.
Krizhevsky 2012 — Convolutional Nets (ReLU)
Krizhevsky, Alex, Ilya Sutskever, and Geoffrey E. Hinton. “ImageNet classification with deep convolutional neural networks.” Advances in Neural Information Processing Systems. 2012.
Elegant series of hacks to reduce overfitting. A bit of hand waving. Reference to CPU speed and huge data sets. Depth is important, but nobody knows why.
Krotov 2016 — Dense Associative Memory for Pattern Recognition
Krotov, Dmitry, and John J. Hopfield. “Dense associative memory for pattern recognition.” arXiv preprint arXiv:1606.01164 (2016).
This is a very solid introductory explanation of modern Hopfield nets. A bit “mathy” but with an important result that is worth unpacking and understanding.
For more explanation on Hopfield Networks, watch these videos with Dmitry Krotov.
Kurita 2020 — Transfer attacks (backdoors)
Kurita, Keita, Paul Michel, Graham Neubig. “Weight Poisoning Attacks on Pre-trained Models” arXiv preprint arXiv:2004.06660 (2020).
Transfer attacks (one of the six BIML attack categories). Very basic results. Fairly obvious. Simple. Nice. Clear. (The only bug is poor terminology…misuse of “backdoor” which has crept into the MLsec literature.)
Lake 2015 — Cogsci
Lake, Brenden, Ruslan Salakhutdinov, Joshua Tenenbaum. “Human-level concept learning through probabilistic program induction.” Science, vol. 350, no. 6266 (2015): 1332-1338.
Representation, models, and one-shot learning. A study promoting BPL.
Lake 2017 — Recurrent Net Weakness
Lake, Brenden, and Marco Baroni. “Still not systematic after all these years: On the compositional skills of sequence-to-sequence recurrent networks.” (2018).
Naive micro domain with misleading maps into human semantics (movement). An artificial attack angle, with structure as the weapon.
Lake 2020— Concepts
Lake, Brenden, and Gregory L. Murphy. “Word meaning in minds and machines.” arXiv preprint arXiv:2008.01766 (2020).
Super clear (maybe obvious) treatment of fluid concepts a la dughof. Getting past the bag of words.
Legg 2007— Universal Intelligence Definition
Legg, Shane, and Marcus Hutter. “Universal Intelligence: A Definition of Machine Intelligence.” arXiv preprint arXiv:0712.3329 (2007).
This is as much a philosophy paper as it is an ML paper. Well worth a read, especially if you are not familiar with philosophy of mind and how it pertains to AI. Defines a (non-computable) measure of intelligence and then tries to move that to something useful.
Marcus 2018 — AI Perspective on ML
Marcus, Gary. “Deep learning: A critical appraisal.” arXiv preprint arXiv:1801.00631 (2018).
General overview tainted by an old school AI approach. Makes clear that representation, though essential, is being overlooked. Some failure conditions noted, at a philosophical level.
McClelland 2020 — NN/NLP History
McClelland, James L., Felix Hill, Maja Rudolph, Jason Baldridge, and Hinrich Schütze. “Extending Machine Language Models toward Human-Level Language Understanding.” arXiv preprint arXiv:1912.05877 (2020).
A concise and clear history of NN and NLP. Addresses situations, neurophysiology, and sensory fusion.
McGuffie 2020 — Terrorism policy BS
McGuffie, Kris, and Alex Newhouse “The Radicalization Risks of GPT-3 and Advanced Neural Language Models” Technical Report (2020).
A lightly reasoned paper that claims that GPT-3 capabilities (which are apparently assumed to have passed the Turing Test) will lead to more radicalization. Grab your pearls.
Merrill 2020 — RNN Theory
Merrill, William, Gail Weiss, Yoav Goldberg, Roy Schwartz, Noah A. Smith, Eran Yahav. “A Formal Hierarchy of RNN Architectures” arXiv preprint arXiv:2004.08500 (2020).
A CS theory paper that combines two lines of research: rational recurrence and sequential NNs as automata. Continuous inputs may be a problem.
Mittelstadt 2016 — Ethics
Mittelstadt, Brent Daniel, Patrick Allo, Mariarosaria Taddeo, Sandra Wachter, and Luciano Floridi. “The ethics of algorithms: Mapping the debate.” Big Data & Society, July-December 2016: 1-21.
Weird usage of the term “algorithm” (becoming all too common). An OK map.
Mireshghallah 2021 — Discovering Essential Features for Preserving Prediction Privacy
Mireshghallah, Fatemehsadat, Mohammadkazem Taram, Ali Jalali, Ahmed Taha Elthakeb, Dean Tullsen, and Hadi Esmaeilzadeh. “Not all features are equal: Discovering essential features for preserving prediction privacy.” In Proceedings of the Web Conference 2021, pp. 669-680. 2021.
Part of ML security data protection is protecting query data when an ML system is in operation. The technology described here is being commercialized by Protopia AI. A popular press article in Dark Reading by BIML explains this. Preserve important features in query and obscure the rest.
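The core idea fits in a few lines. Here is a hedged sketch (ours, not Protopia’s code), assuming per-feature importance scores have already been computed offline.

```python
import numpy as np

def obscure_query(x, importance, keep_fraction=0.1, noise_scale=1.0, rng=None):
    """Keep the essential features of a query, drown the rest in noise.

    `x` is a flat query vector and `importance` a same-shaped array of
    per-feature importance scores (assumed, computed offline). Only the top
    `keep_fraction` of features leave the client untouched.
    """
    rng = rng or np.random.default_rng()
    k = max(1, int(keep_fraction * x.size))
    keep = np.argsort(importance)[-k:]             # indices of essential features
    noisy = x + rng.normal(scale=noise_scale, size=x.shape)
    noisy[keep] = x[keep]                          # restore the essential features
    return noisy
```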
Mitchell 2021 — Abstraction and Analogy-Making
Mitchell, Melanie. “Abstraction and Analogy-Making in Artificial Intelligence.” arXiv preprint arXiv:2102.10717 (2021).
A nice overview and comparison of Copycat, DNNs, and program induction approaches to AI through the lens of analogy making. If you believe that perception is analogy-making, then you will find this work interesting. The coverage of DNNs is a little sparse, however.
Mitchell 2019 — Model Cards
Mitchell, Margaret, Simone Wu, Andrew Zaldivar, Parker Barnes, Lucy Vasserman, Ben Hutchinson, Elena Spitzer, Inioluwa Deborah Raji, and Timnit Gebru. “Model Cards for Model Reporting.” arXiv preprint arXiv:1810.03993 (2019).
A mix of sociology and political correctness with engineering transparency. Human-centric models emphasized.
Mitchell 2021 — Why AI is Harder Than We Think
Mitchell, Melanie. “Why AI is Harder Than We Think.” arXiv preprint arXiv:2104.12871 (2021).
Super clear paper about the four AI fallacies leading to the AI Winter Sine Wave. Excellent read.
Mnih 2013 — Atari
Mnih, Volodymyr, Koray Kavukcuoglu, David Silver, Alex Graves, Ioannis Antonoglou, Daan Wierstra, and Martin Riedmiller. “Playing Atari with deep reinforcement learning.” arXiv preprint arXiv:1312.5602 (2013).
An application of convolutional nets where the game representation has been shoved through a filter. Some questions open regarding randomness in the game (making the games very hard to learn). Not dice rolling for turn, but rather random behavior that is cryptographically unpredictable. This paper made a bigger splash than it likely warranted.
Narayanan — How to recognize AI snake oil
Narayanan, Arvind. “How to recognize AI snake oil.”
This presentation emphasizes the importance of data sets (and simple analysis) when it comes to ML. Also discusses prediction. Predicting things we don’t really understand is hard and deep neural nets suck at it.
Oh 2018 — Reversing NNs through queries
Oh, Seong Joon, Max Augustin, Bernt Schiele, Mario Fritz. “Towards Reverse-Engineering Black-Box Neural Networks.” arXiv preprint 1711.01768 (2018)
A goofy, clever, interesting paper that complements Wang. Well-written but not too deep.
Papernot 2018 — Building Security In for ML (IT stance)
Papernot, Nicolas. “A Marauder’s Map of Security and Privacy in Machine Learning.” arXiv preprint arXiv:1811.01134 (2018).
Tainted only by an old school IT security approach, this paper aims at the core of #MLsec but misses the mark. Too much ops and not enough security engineering.
Park 2020 — Attribution Preservation in Network Compression for Reliable Network Interpretation
Park, Geondo, June Yong Yang, Sung Ju Hwang, and Eunho Yang. “Attribution Preservation in Network Compression for Reliable Network Interpretation.” arXiv preprint arXiv:2010.15054 (2020).
Representation and XAI meet economic reality. Solid work explained a bit obtusely.
Peters 2018 — ELMo
Peters, Matthew E., Mark Neumann, Mohit Iyyer, Matt Gardner, Christopher Clark, Kenton Lee, and Luke Zettlemoyer. “Deep contextualized word representations.” arXiv preprint arXiv:1802.05365 (2018).
Important seminal work on ELMo. Some echoes of SDM and highly distributed representation power.
Phillips 2011 — Racism
Phillips, P. Jonathon, Fang Jiang, Abhijit Narvekar, Julianne Ayyad, and Alice J. O’Toole. “An other-race effect for face recognition algorithms.” ACM Transactions on Applied Perception (TAP) 8, no. 2 (2011): 14.
This paper is pretty stupid. The result is simply “when your data are racist, your system will be too,” which is beyond obvious to anyone who knows how ML works. This is what happens when psych people write about ML instead of CS people.
Quinn 2017 (also mm17) — Dog Walker
Quinn, Max H., Erik Conser, Jordan M. Witte, and Melanie Mitchell. “Semantic Image Retrieval via Active Grounding of Visual Situations.” arXiv preprint arXiv:1711.00088 (2017).
Building up representations with a hybrid Copycat/NN model. Hofstadterian model. Time as an essential component in building up a representation.
Rahwan 2019 — Towards a Study of Machine Behavior
Rahwan, Iyad, Manuel Cebrian, Nick Obradovich, Josh Bongard, Jean-François Bonnefon, Cynthia Breazeal, Jacob W. Crandall, Nicholas A. Christakis, Iain D. Couzin, Matthew O. Jackson, Nicholas R. Jennings, Ece Kamar, Isabel M. Kloumann, Hugo Larochelle, David Lazer, Richard McElreath, Alan Mislove, David C. Parkes, Alex ‘Sandy’ Pentland, Margaret E. Roberts, Azim Shariff, Joshua B. Tenenbaum & Michael Wellman. “Machine behavior.” Nature 568 (2019): 477-486.
Social science on machines. Very clear treatment. Trinity of trouble hinted at. Good analogs for security. Is ML code/data open source or not?
Ramesh 2021 — Zero-Shot Text-to-Image Generation
Ramesh, Aditya, Mikhail Pavlov, Gabriel Goh, Scott Gray, Chelsea Voss, Alec Radford, Mark Chen, and Ilya Sutskever. “Zero-shot text-to-image generation.” arXiv preprint arXiv:2102.12092 (2021).
An explanation of Dall-E, without much more to it than treating the system as an engineering exercise. Too dense and not enough grounding in philosophy.
Ramsauer 2020 — Hopfield Networks
Ramsauer, Hubert, Bernhard Schäfl, Johannes Lehner, Philipp Seidl, Michael Widrich, Thomas Adler, Lukas Gruber, Markus Holzleitner, Milena Pavlović, Geir Kjetil Sandve, Victor Greiff, David Kreil, Michael Kopp, Günter Klambauer, Johannes Brandstetter, Sepp Hochreiter. “Hopfield Networks is All You Need” arXiv preprint arXiv:2008.02217 (2020).
Monster of a paper with a 12-page summary at the top. Best to start with the Krotov Hopfield paper (Krotov 2016, above). Attention is like a Hopfield layer.
Rendell 2010 — Insights from the Social Learning Strategies Tournament
Rendell, Luke, Robert Boyd, Daniel Cownden, Magnus Enquist, Kimmo Eriksson, Marc W. Feldman, Laurel Fogarty, Stefano Ghirlanda, Timothy Lillicrap, and Kevin N. Laland. “Why copy others? Insights from the social learning strategies tournament.” Science 328, no. 5975 (2010): 208-213.
A very interesting treatment of social learning through game theory. Turns out that copying is a good strategy. This paper is mostly about the multi-armed bandit experiment.
Ribeiro 2016 — Explaining the Predictions of Any Classifier
Ribeiro, Marco Tulio, Sameer Singh, and Carlos Guestrin. ““Why Should I Trust You?”: Explaining the Predictions of Any Classifier.” In Proceedings of the 22nd ACM SIGKDD international conference on knowledge discovery and data mining, pp. 1135-1144. 2016.
LIME. This paper is the source of the famous (and mis-used) wolf/husky/snow example; the hypothetical example is often cited as a real ML system error. Explainable ML or XAI.
Ribeiro 2020 — ATCG for NNs
Ribeiro, Marco Tulio, Tongshuang Wu, Carlos Guestrin, and Sameer Singh. “Beyond Accuracy: Behavioral Testing of NLP models with CheckList.” arXiv preprint arXiv:2005.04118 (2020).
Very basic approach to bbox ATCG that begins to ask the question WHAT exactly should be tested and how to get past accuracy. Obvious and fairly shallow from a testing perspective.
Rolnick 2020 — Reversing NNs
Rolnick, David and Konrad P. Kording. “Reverse-Engineering Deep ReLU Networks” arXiv preprint arXiv:1910.00744 (2020).
Inverting an NN (linear) from queries. Great theory. Unclear about feasibility in production.
Rule 2020 — Child as Hacker
Rule, Joshua, Joshua B. Tenenbaum, Steven T. Piantadosi. “The Child as Hacker.” Trends in Cognitive Science (2020), Vol 24, No 11, 900-915. November 2020.
Philosophically great, but conceptually vague. Stochastic programs. Programs as representations.
Santoro 2021 — Symbolic Behaviour
Santoro, Adam, Andrew Lampinen, Kory Mathewson, Timothy Lillicrap, and David Raposo. “Symbolic Behaviour in Artificial Intelligence.” arXiv preprint arXiv:2102.03406 (2021).
A modern approach to the symbol grounding problem that talks about emergence of symbols through symbolic behavior. A very nice paper.
Schmidhuber 2010 — Creativity
Schmidhuber, Jürgen. “Formal Theory of Creativity, Fun, and Intrinsic Motivation (1990-2010).” Technical Report (2010).
Post-facto justification of “the thing I built.” An overview with interesting mappings to aesthetics and self-motivation. The lossless compression angle is weird. Flirts with innocent crackpotism.
Schulam 2017 — Reliable Decision Support using Counterfactual Models
Schulam, Peter, and Suchi Saria. “Reliable Decision Support using Counterfactual Models” Advances in neural information processing systems 30 (2017).
So many assumptions that the base idea becomes warped. High hopes dashed.
Sculley 2014 — Technical Debt
Sculley, D., Gary Holt, Daniel Golovin, Eugene Davydov, Todd Phillips, Dietmar Ebner, Vinay Chaudhary, and Michael Young. “Machine learning: The high interest credit card of technical debt.” (2014).
A diatribe against deadlines and just making stuff work. Naive criticism of flaws.
Sculley 2015 — Software Engineering Would Help
Sculley, David, Gary Holt, Daniel Golovin, Eugene Davydov, Todd Phillips, Dietmar Ebner, Vinay Chaudhary, Michael Young, Jean-Francois Crespo, and Dan Dennison. “Hidden technical debt in machine learning systems.” In Advances in neural information processing systems, pp. 2503-2511. 2015.
Random kludges built of interlocked pieces and parts are a bad idea. This applies to ML as well. Light on analysis and misdirected in focus.
Sculley 2018 — Building Security In for ML (IT stance)
Sculley, D., Jasper Snoek, Ali Rahimi, and Alex Wiltschko. “Winner’s Curse? On Pace, Progress, and Empirical Rigor.” ICLR 2018 Workshop paper (2018).
Argues for a scientific approach. General and pretty obvious.
Shwartz-Ziv 2017 — Representation
Shwartz-Ziv, Ravid, and Naftali Tishby. “Opening the black box of Deep Neural Networks via Information.” arXiv preprint arXiv:1703.00810 (2017).
An opaque paper on representation. Pretty far afield from security.
Shankar 2020— Microsoft on MLsec
Ram Shankar Siva Kumar, Magnus Nyström, John Lambert, Andrew Marshall, Mario Goertzel, Andi Comissoneru, Matt Swann, Sharon Xia “Adversarial Machine Learning — Industry Perspectives” arXiv preprint arXiv:2002.05646 (2020).
Microsoft’s first stab at Threat Modeling for ML. Problems with nomenclature are par for the course for Microsoft (e.g., “adversarial ML” should be “MLsec”). This is a solid start but needs deeper thought. More emphasis on design would help. Also see this BIML blog entry.
Shu 2020 — Disentanglement
Shu, Rui, Yining Chen, Abhishek Kumar, Stefano Ermon, Ben Poole. “Weakly Supervised Disentanglement with Guarantees.” arXiv preprint arXiv:1910.09772 (2020).
A complex paper on representation. Worth a close reading.
Shumailov 2018— The Taboo Trap: Behavioural Detection of Adversarial Samples
Shumailov, Ilia, Yiren Zhao, Robert Mullins, and Ross Anderson. “The Taboo Trap: Behavioural Detection of Adversarial Samples.” arXiv preprint arXiv:1811.07375 (2018).
Collecting activation patterns and using them to build boundaries. Stopping (simple) adversarial examples cheaply. Clear good paper with some important lessons about representation.
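The flavor of the detector, in a short sketch (ours; the paper actually trains the activation profile into the network with an extra loss term rather than bolting it on afterward). The `get_activations` hook is an assumption.

```python
import numpy as np

def profile_activations(get_activations, clean_inputs):
    """Record per-unit activation maxima over clean data: the 'allowed' region."""
    maxima = None
    for x in clean_inputs:
        acts = get_activations(x)              # 1-D vector of hidden activations
        maxima = acts if maxima is None else np.maximum(maxima, acts)
    return maxima

def looks_adversarial(get_activations, x, maxima, tolerance=1.05):
    """Flag inputs whose activations stray outside the profiled boundary."""
    return bool(np.any(get_activations(x) > tolerance * maxima))
```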
Shumailov 2021— Data Ordering Attacks
Shumailov, Ilia, Zakhar Shumaylov, Dmitry Kazhdan, Yiren Zhao, Nicolas Papernot, Murat A. Erdogdu, and Ross Anderson. “Manipulating SGD with Data Ordering Attacks.” arXiv preprint arXiv:2104.09667 (2021).
Another super clear fun paper from Anderson and company. Turns out that you can poison ML systems by changing the ordering of training data. RNGs are critical to good ML behavior and are an important attack vector.
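To see how little an order-only attack needs, here is a deliberately crude sketch (ours, far blunter than the orderings studied in the paper), assuming a dataset of (input, label) pairs and a hypothetical target label.

```python
import random

def reorder_for_bias(dataset, target_label):
    """Reorder (input, label) pairs: no example is modified, but examples of
    `target_label` land at the end of the epoch, letting them dominate the
    final SGD updates. Order alone becomes the poisoning knob."""
    others = [ex for ex in dataset if ex[1] != target_label]
    targets = [ex for ex in dataset if ex[1] == target_label]
    random.shuffle(others)
    return others + targets        # feed this sequence to the trainer as-is
```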
Silver 2017— AlphaGo
Silver, David, Julian Schrittwieser, Karen Simonyan, Ioannis Antonoglou, Aja Huang, Arthur Guez, Thomas Hubert et al. “Mastering the game of Go without human knowledge.” Nature 550, no. 7676 (2017): 354.
AlphaGo trains itself by playing itself. Surprising and high-profile results. Monte Carlo tree search seems to underlie the results (which representations are amenable to that kind of search?). Unclear how general these results are or if they only apply to certain games with fixed rules and perfect knowledge.
Slack 2020— Adversarial Classifiers
Slack, Dylan, Sophie Hilgard, Emily Jia, Sameer Singh, and Himabindu Lakkaraju. “Fooling LIME and SHAP: Adversarial Attacks on Post hoc Explanation Methods.” arXiv preprint arXiv:1911.02508 (2020).
Adversarial classifiers with a focus on ML bias including racism and sexism in black box models.
Spelke 2007 — Core Knowledge
Spelke, Elizabeth S., and Katherine D. Kinzler. “Core knowledge.” Developmental Science 10, no. 1 (2007): 89-96.
A tiny, tight little ditty. Describes four core knowledge systems (OANS): objects, actions, number, and space, then adds a new one for social group recognition.
Springer 2018 — Sparse Coding is Good
Springer, Jacob M., et al. “Classifiers Based on Deep Sparse Coding Architectures are Robust to Deep Learning Transferable Examples.” arXiv preprint (2018).
Important theory, but silly experiment. Hints at the importance of context, concept activation, and dynamic representation. Explores limits of transfer attacks WRT representation.
Stretcu 2020 — Curriculum Learning
Stretcu, Otilia, Emmanouil Antonios Platanios, Tom Mitchell, Barnabás Póczos. “Coarse-to-Fine Curriculum Learning for Classification.” ICLR 2020 Workshop paper (2020).
Ties to error-making, confusion matrices, and representation.
Sundararajan 2017 — Explaining Networks
Sundararajan, Mukund, Ankur Taly, and Qiqi Yan. “Axiomatic Attribution for Deep Networks” arXiv preprint arXiv:1703.01365 (2017).
A strangely-written paper trying to get to the heart of describing why a network does what it does. Quirky use of mathematical style. Hard to understand and opaque.
Tenenbaum 2011 — Review
Tenenbaum, Joshua B., Charles Kemp, Thomas L. Griffiths, and Noah D. Goodman. “How to Grow a Mind: Statistics, Structure, and Abstraction.” Science, vol. 331, no. 6022 (2011): 1279-1285.
A practical philosophy of AI paper focused on bridging the usual symbolic vs sub-symbolic chasm. Overfocus on HBMs, but worth a read to understand the role that structure plays in intelligence and representation.
Thoppilan 2022— LaMDA: Language Models for Dialog Applications
Thoppilan, Romal, Daniel De Freitas, Jamie Hall, Noam Shazeer, Apoorv Kulshreshtha, Heng-Tze Cheng, Alicia Jin et al. “LaMDA: Language Models for Dialog Applications” arXiv preprint arXiv:2201.08239 (2022).
The GPT-3 equivalent from Google. Interesting use of filters and tools to adapt DNN output. But still no sign of general AI.
Udrescu 2020— AI Feynman
Udrescu, Silviu-Marian, Max Tegmark. “AI Feynman: a Physics-Inspired Method for Symbolic Regression” arXiv preprint arXiv:1905.11481 (2020).
Crazy. Interesting use of NN to find simplicity for physics equations. NN vs GA battles.
Vaswani 2017 — BERT precursor
Vaswani, Ashish, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Łukasz Kaiser, and Illia Polosukhin. “Attention is All You Need.” 31st Conference on Neural Information Processing Systems. 2017.
BERT precursor.
Vigna 2019— Randomness
Vigna, Sebastiano. “It Is High Time We Let Go Of The Mersenne Twister” arXiv preprint arXiv:1910.06437 (2019).
Randomness (in this case stochastic) really matters for NNs. Note that cryptographic randomness is something else entirely.
Wachter 2020— UK Regulation
Wachter, Sandra, and Brent Mittelstadt. “A Right To Reasonable Inferences: Re-Thinking Data Protection Law in the Age of Big Data and AI.” Technical Report, Oxford Internet Institute (2020).
Posits that inferences and predictions that include private data or PII in input set should be protected. Very EU centric.
Wallace 2020— Attacking Machine Translation
Wallace, Eric, Mitchell Stern, and Dawn Song. “Imitation Attacks and Defenses for Black-box Machine Translation Systems.” arXiv preprint arXiv:2004.15015 (2020).
Attacking machine translation: 1. distill the model by query (cloning), 2. use the distilled version as a whitebox, 3. a defense that fails. (Attacks Bing and SYSTRAN. Real systems!)
Wang 2018 — Transfer Learning Attacks
Wang, Bolun, Yuanshun Yao, Bimal Viswanath, Haitao Zheng, and Ben Y. Zhao. “With great training comes great vulnerability: Practical attacks against transfer learning.” In 27th {USENIX} Security Symposium ({USENIX} Security 18), pp. 1281-1297. 2018.
Attacks against transfer learning in cascaded systems. If the set of all trained networks is small, this work holds water. “Empirical” settings. Some NNs highly susceptible to tiny noise. Good use of confusion matrix. Dumb defense through n-version voting.
Wang 2020— Stealing Hyperparameters
Wang, Binghui and Gong, Neil. “Stealing Hyperparameters in Machine Learning” arXiv preprint arXiv:1802.05351 (2020).
Fairly trivial, poorly motivated threat model. The notion of getting “free cycles” is not too impressive. Good history overview of ML security.
Wang 2021— EvilModel: Hiding Malware Inside of Neural Network Models
Wang, Zhi, Chaoge Liu, and Xiang Cui. “EvilModel: Hiding Malware Inside of Neural Network Models.” arXiv preprint arXiv:2107.08590 (2021).
This paper is not really worth reading: it’s silly. The idea that you can hide malware in big globs of stuff is not surprising. In this case, the globs are DNN neurons (yawn). Basic steganography.
Witty 2019— Causal Inference
Witty, Sam, Alexander Lew, David Jensen, Vikash Mansinghka. “Bayesian causal inference via probabilistic program synthesis” arXiv preprint arXiv:1910.14124 (2019).
Interesting paper about a toy problem. Not much of a tutorial. Doesn’t really stand on its own…so more of a teaser.
Wu 2018— The Kanerva Machine: A Generative Distributed Memory
Wu, Yan, Greg Wayne, Alex Graves, and Timothy Lillicrap. “The Kanerva Machine: A Generative Distributed Memory.” arXiv preprint arXiv:1804.01756 (2018).
A hybrid model based on Kanerva’s SDM. This paper is so dense as to be inscrutable. The idea of hybrid memory/perception models is a very good one. This paper is not the best way to start.
Yuan 2018 — Adversarial Examples
Yuan, Xiaoyong, Pan He, Qile Zhu, Xiaolin Li. “Adversarial Examples: Attacks and Defenses for Deep Learning.” arXiv preprint arXiv:1712.07107 (2018).
A solid paper with a stunning set of references. A good way to understand the adversarial example landscape.
Yuan 2018 — Thieves’ Cant
Yuan, Kan, Haoran Lu, Xiaojing Liao, and XiaoFeng Wang. “Reading Thieves’ Cant: Automatically Identifying and Understanding Dark Jargons from Cybercrime Marketplaces.” In 27th {USENIX} Security Symposium ({USENIX} Security 18), pp. 1027-1041. 2018.
Train up separate embedded training sets that share projection into statistical space. Use blink testing. Bonus hilarious drug terms.
Žliobaitė 2015 — Bias
Žliobaitė, Indrė. “A survey on measuring indirect discrimination in machine learning.” arXiv preprint arXiv:1511.00148 (2015).
Not that insightful. Intro level.
Zhu 2020 — Learning adversarially robust representations via worst-case mutual information maximization
Zhu, Sicheng, Xiao Zhang, and David Evans. “Learning adversarially robust representations via worst-case mutual information maximization.” In International Conference on Machine Learning, pp. 11609-11618. PMLR, 2020.
An excellent treatment of adversarial examples and their relationship to representation. Lots of math. Aiming in exactly the right direction. We wish we had written this one.
Videos and Popular Press
MIT Technology Review: Hundreds of AI tools have been built to catch covid. None of them helped.
This article emphasizes that ML is not magic (as approached through covid diagnosis). Turns out that data quality is a major issue in this reporting, with duplicates, meta-data, and distributed data sets all playing a role in systemic failure.
Ian Goodfellow on Adversarial Examples
James Mickens, Harvard University, at the 27th USENIX Security Symposium
Q: Why Do Keynote Speakers Keep Suggesting That Improving Security Is Possible?
A: Because Keynote Speakers Make Bad Life Decisions and Are Poor Role Models
Ali Rahimi’s talk at NIPS — Test-of-time Award Presentation
Joshua Tenenbaum on Triangulating Intelligence, Sessions 2 & 3
“Ingredients of Intelligence” – Brenden Lake explains why he builds computer programs that seek to mimic the way humans think.
Douglas R. Hofstadter, “The Shallowness of Google Translate” from the Atlantic Monthly
François Candelon, Rodolphe Charme di Carlo, Midas De Bondt, and Theodoros Evgeniou, “AI Regulation Is Coming” from the Harvard Business Review
Useful for understanding how some people use sloppy thinking about math to make points about ML that are nonsensical. XAI is harder than this article lets on.
Shalini Kantayya, “Coded Bias” from PBS Independent Lens

Ross Anderson on Security Engineering and Machine Learning from ICSA Colloquium (SPT Seminar July 17th, 2021)
The lecture is elegant and clear. It explains some of the sponge attacks that Ross Anderson’s group has uncovered. There is also some interesting discussion of “manners” (and standard versus anomalous behavior) near the end.
Gary Marcus on why “Deep Learning is Hitting a Wall” from Nautilus
Gary Marcus on why deep learning is not quite all it’s cracked up to be. General AI is far, far away.
Artificial Intelligence at Google: Our Principles
Platitudes from Google. Let’s hope this philosophy fares better than “don’t be evil.”
Bargav Jayaraman and David Evans on Evaluating Differentially Private Machine Learning in Practice at the 28th USENIX Security Symposium
A Usenix security presentation on differential privacy. Good relevance to issues of representation.