The Peculiar Optimization and Regularization Challenges in Multi-Task Learning and Meta-Learning

Chelsea Finn
April 16, 2020
Much of the success of deep learning has come in settings where the goal is to learn one, single-purpose function from data. However, in many contexts, we hope to optimize neural networks for multiple, distinct tasks (i.e. multi-task learning), and to optimize so that what is learned from these tasks is transferable to the acquisition of new tasks (e.g. as in meta-learning).

Modularity, Attention and Credit Assignment: Efficient information dispatching in neural computations

Anirudh Goyal
April 16, 2020
Physical processes in the world often have a modular structure, with complexity emerging through combinations of simpler subsystems. Machine learning seeks to uncover and use regularities in the physical world. Although these regularities manifest themselves as statistical dependencies, they are ultimately due to dynamic processes governed by physics. These processes are often independent and only interact sparsely. Despite this, most machine learning models employ the opposite inductive bias, i.e., that all processes interact.

Tradeoffs between Robustness and Accuracy

Percy Liang
April 16, 2020
Standard machine learning produces models that are highly accurate on average but that degrade dramatically when the test distribution deviates from the training distribution. While one can train robust models, this often comes at the expense of standard accuracy (on the training distribution). We study this tradeoff in two settings, adversarial examples and minority groups, creating simple examples which highlight generalization issues as a major source of this tradeoff.

Steps towards more human-like learning in machines

Josh Tenenbaum
April 16, 2020
There are several broad insights we can draw from computational models of human cognition in order to build more human-like forms of machine learning. (1) The brain has a great deal of built-in structure, yet still tremendous need and potential for learning. Instead of seeing built-in structure and learning as in tension, we should be thinking about how to learn effectively with more and richer forms of structure. (2) The most powerful forms of human knowledge are symbolic and often causal and probabilistic.

Local-global compatibility in the crystalline case

Ana Caraiani
Imperial College
April 16, 2020
Let F be a CM field. Scholze constructed Galois representations associated to classes in the cohomology of locally symmetric spaces for GL_n/F with p-torsion coefficients. These Galois representations are expected to satisfy local-global compatibility at primes above p. Even the precise formulation of this property is subtle in general, and uses Kisin’s potentially semistable deformation rings. However, this property is crucial for proving modularity lifting theorems. I will discuss joint work with J.

Interpretability for Everyone

Been Kim
April 16, 2020
In this talk, I would like to share some of my reflections on the progress made in the field of interpretable machine learning. We will reflect on where we are going as a field, and what we need to be aware of to make progress. With that perspective, I will then discuss some of my work on 1) sanity checking popular methods and 2) developing more layperson-friendly interpretability methods. I will also share some open theoretical questions that may help us move forward.

Iterative Random Forests (iRF) with applications to genomics and precision medicine

Bin Yu
April 15, 2020
Genomics has revolutionized biology, enabling the interrogation of whole transcriptomes, genome-wide binding sites for proteins, and many other molecular processes. However, individual genomic assays measure elements that interact in vivo as components of larger molecular machines. Understanding how these high-order interactions drive gene expression presents a substantial statistical challenge.

Generative Modeling by Estimating Gradients of the Data Distribution

Stefano Ermon
April 15, 2020
Existing generative models are typically based on explicit representations of probability distributions (e.g., autoregressive models or VAEs) or implicit sampling procedures (e.g., GANs). We propose an alternative approach based on directly modeling the vector field of gradients of the data distribution (scores). Our framework allows flexible energy-based model architectures and requires neither sampling during training nor adversarial training methods.

Evaluating Lossy Compression Rates of Deep Generative Models

Roger Grosse
April 15, 2020
Implicit generative models such as GANs have achieved remarkable progress at generating convincing fake images, but how well do they really match the distribution? Log-likelihood has been used extensively to evaluate generative models whenever it’s convenient to do so, but measuring log-likelihoods for implicit generative models presents computational challenges. Furthermore, in order to obtain a density, one needs to smooth the distribution using a noisy model (typically Gaussian), and this choice is hard to motivate.