Workshop on New Directions in Optimization, Statistics and Machine Learning

Interpretability for Everyone

Been Kim
April 16, 2020
In this talk, I would like to share some of my reflections on the progress made in the field of interpretable machine learning. We will reflect on where we are going as a field and what we need to be aware of to make progress. With that perspective, I will then discuss some of my work on 1) sanity checking popular methods and 2) developing more layperson-friendly interpretability methods. I will also share some open theoretical questions that may help us move forward.

Steps towards more human-like learning in machines

Josh Tenenbaum
April 16, 2020
There are several broad insights we can draw from computational models of human cognition in order to build more human-like forms of machine learning. (1) The brain has a great deal of built-in structure, yet still tremendous need and potential for learning. Instead of seeing built-in structure and learning as in tension, we should be thinking about how to learn effectively with more and richer forms of structure. (2) The most powerful forms of human knowledge are symbolic and often causal and probabilistic.

Tradeoffs between Robustness and Accuracy

Percy Liang
April 16, 2020
Standard machine learning produces models that are highly accurate on average but that degrade dramatically when the test distribution deviates from the training distribution. While one can train robust models, this often comes at the expense of standard accuracy (on the training distribution). We study this tradeoff in two settings, adversarial examples and minority groups, creating simple examples which highlight generalization issues as a major source of this tradeoff.

Modularity, Attention and Credit Assignment: Efficient information dispatching in neural computations

Anirudh Goyal
April 16, 2020
Physical processes in the world often have a modular structure, with complexity emerging through combinations of simpler subsystems. Machine learning seeks to uncover and use regularities in the physical world. Although these regularities manifest themselves as statistical dependencies, they are ultimately due to dynamic processes governed by physics. These processes are often independent and only interact sparsely. Despite this, most machine learning models employ the opposite inductive bias, i.e., that all processes interact.

The Peculiar Optimization and Regularization Challenges in Multi-Task Learning and Meta-Learning

Chelsea Finn
April 16, 2020
Much of the success of deep learning has come in settings where the goal is to learn a single-purpose function from data. However, in many contexts, we hope to optimize neural networks for multiple, distinct tasks (i.e., multi-task learning), and to optimize them so that what is learned from these tasks transfers to the acquisition of new tasks (e.g., as in meta-learning).

Deep equilibrium models via monotone operators

Zico Kolter
April 16, 2020
In this talk, I will first introduce our recent work on the Deep Equilibrium Model (DEQ). Instead of stacking nonlinear layers, as is common in deep learning, this approach finds the equilibrium point of the repeated iteration of a single nonlinear layer, then backpropagates through the layer directly using the implicit function theorem. The resulting method matches state-of-the-art performance in many domains (while consuming much less memory), and can theoretically express any "traditional" deep network with just a single layer.
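
The equilibrium-and-implicit-differentiation idea described above can be sketched on a toy problem. This is only an illustrative sketch, not the speaker's implementation: the layer form tanh(Wz + Ux + b), the names `layer`, `solve_equilibrium` and `implicit_jacobian`, and the choice to rescale W so that plain fixed-point iteration converges are all assumptions made here. In particular, it uses simple iteration rather than the monotone operator machinery named in the title.

```python
import numpy as np

def layer(z, x, W, U, b):
    # Hypothetical single nonlinear layer: z_next = tanh(W z + U x + b).
    return np.tanh(W @ z + U @ x + b)

def solve_equilibrium(x, W, U, b, tol=1e-12, max_iter=10_000):
    # Repeat the same layer until z stops changing, i.e. z* = layer(z*, x).
    z = np.zeros(W.shape[0])
    for _ in range(max_iter):
        z_next = layer(z, x, W, U, b)
        if np.linalg.norm(z_next - z) < tol:
            return z_next
        z = z_next
    raise RuntimeError("fixed-point iteration did not converge")

def implicit_jacobian(z_star, W, U):
    # Implicit function theorem at the equilibrium:
    #   dz*/dx = (I - df/dz)^{-1} df/dx,
    # where tanh'(pre-activation) = 1 - z*^2 because z* = tanh(pre-activation).
    d = 1.0 - z_star**2
    J_z = d[:, None] * W
    J_x = d[:, None] * U
    return np.linalg.solve(np.eye(len(z_star)) - J_z, J_x)

rng = np.random.default_rng(0)
n, m = 4, 3
W = rng.standard_normal((n, n))
W *= 0.5 / np.linalg.norm(W, 2)  # assumption: spectral norm 0.5 makes the map a contraction
U = rng.standard_normal((n, m))
b = rng.standard_normal(n)
x = rng.standard_normal(m)

z_star = solve_equilibrium(x, W, U, b)
dz_dx = implicit_jacobian(z_star, W, U)

print("equilibrium residual:", np.linalg.norm(layer(z_star, x, W, U, b) - z_star))
print("dz*/dx shape:", dz_dx.shape)
```

The Jacobian is obtained from the equilibrium point alone, without storing the intermediate iterates, which is one way to read the memory saving mentioned in the abstract.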

Do Simpler Models Exist and How Can We Find Them?

Cynthia Rudin
April 16, 2020
While the trend in machine learning has tended towards more complex hypothesis spaces, it is not clear that this extra complexity is always necessary or helpful for many domains. In particular, models and their predictions are often made easier to understand by adding interpretability constraints. These constraints shrink the hypothesis space; that is, they make the model simpler. Statistical learning theory suggests that generalization may be improved as a result as well. However, adding extra constraints can make optimization (exponentially) harder.

Evaluating Lossy Compression Rates of Deep Generative Models

Roger Grosse
April 15, 2020
Implicit generative models such as GANs have achieved remarkable progress at generating convincing fake images, but how well do they really match the distribution? Log-likelihood has been used extensively to evaluate generative models whenever it’s convenient to do so, but measuring log-likelihoods for implicit generative models presents computational challenges. Furthermore, in order to obtain a density, one needs to smooth the distribution using a noisy model (typically Gaussian), and this choice is hard to motivate.

Generative Modeling by Estimating Gradients of the Data Distribution

Stefano Ermon
April 15, 2020
Existing generative models are typically based on explicit representations of probability distributions (e.g., autoregressive models or VAEs) or implicit sampling procedures (e.g., GANs). We propose an alternative approach based on directly modeling the vector field of gradients of the data distribution (scores). Our framework allows flexible energy-based model architectures and requires neither sampling during training nor adversarial training methods.
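
As a brief illustration of what "score" means here, the standard definition can be written out. The symbols s, E and Z are notation introduced for this sketch and do not come from the abstract: s is the score, E is an energy function, and Z is the normalizing constant of an energy-based density.

```latex
\[
s(x) = \nabla_x \log p(x), \qquad
p(x) = \frac{e^{-E(x)}}{Z}
\;\Longrightarrow\;
s(x) = -\nabla_x E(x).
\]
```

Because \(\log Z\) does not depend on \(x\), it drops out of the gradient, which is one way to see why modeling the score is compatible with flexible energy-based architectures whose normalizing constant is unknown.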