Seminar on Theoretical Machine Learning

Instance-Hiding Schemes for Private Distributed Learning

Sanjeev Arora
Princeton University; Distinguished Visiting Professor, School of Mathematics
June 25, 2020
An important problem today is how to allow multiple distributed entities to train a shared neural network on their private data while protecting data privacy. Federated learning is a standard framework for distributed deep learning, and one would like to assure full privacy in that framework. The methods proposed so far, such as homomorphic encryption and differential privacy, come with drawbacks such as large computational overhead or a large drop in accuracy.
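
For orientation, here is a minimal sketch (not from the talk) of the federated-averaging loop whose local updates such privacy techniques aim to protect; the linear model, synthetic client data, local step count, and learning rate are illustrative assumptions.

```python
# Minimal federated-averaging sketch; model, data, and hyperparameters are assumed for illustration.
import numpy as np

rng = np.random.default_rng(0)
clients = [(rng.normal(size=(50, 5)), rng.normal(size=50)) for _ in range(4)]  # private (X, y) per client
w = np.zeros(5)  # shared model parameters held by the server

for rnd in range(20):             # communication rounds
    updates = []
    for X, y in clients:          # each client trains locally on its private data
        w_local = w.copy()
        for _ in range(5):        # a few local gradient steps on a least-squares loss
            grad = X.T @ (X @ w_local - y) / len(y)
            w_local -= 0.1 * grad
        updates.append(w_local)
    w = np.mean(updates, axis=0)  # server averages local models; only parameters leave the clients
```

The privacy question in the talk concerns exactly these exchanged parameters, which can still leak information about the clients' data.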

The challenges of model-based reinforcement learning and how to overcome them

Csaba Szepesvári
University of Alberta
June 18, 2020
Some believe that truly effective and efficient reinforcement learning algorithms must explicitly construct and explicitly reason with models that capture the causal structure of the world. In short, model-based reinforcement learning is not optional. As this is not a new belief, it may be surprising that, empirically, at least as far as the current state of the art is concerned, the majority of the top-performing algorithms are model-free.

On learning in the presence of biased data and strategic behavior

Avrim Blum
Toyota Technological Institute at Chicago
June 16, 2020
In this talk I will discuss two lines of work involving learning in the presence of biased data and strategic behavior. In the first, we ask whether fairness constraints on learning algorithms can actually improve the accuracy of the classifier produced, when training data is unrepresentative or corrupted due to bias. Typically, fairness constraints are analyzed as a tradeoff with classical objectives such as accuracy. Our results here show there are natural scenarios where they can be a win-win, helping to improve overall accuracy.

On Langevin Dynamics in Machine Learning

Michael I. Jordan
University of California, Berkeley
June 11, 2020
Langevin diffusions are continuous-time stochastic processes that are based on the gradient of a potential function. As such they have many connections---some known and many still to be explored---to gradient-based machine learning. I'll discuss several recent results in this vein: (1) the use of Langevin-based algorithms in bandit problems; (2) the acceleration of Langevin diffusions; (3) how to use Langevin Monte Carlo without making smoothness assumptions.
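
For illustration, a minimal sketch of the unadjusted Langevin algorithm, the simplest discretization of such a diffusion: a gradient step on the potential plus injected Gaussian noise. The standard-Gaussian potential is an assumed toy target, not an example from the talk.

```python
# Unadjusted Langevin algorithm (ULA) sketch; the standard-Gaussian target is an assumed toy example.
import numpy as np

def grad_potential(x):
    # Potential U(x) = ||x||^2 / 2, so the target density is proportional to exp(-U(x)): a standard Gaussian.
    return x

rng = np.random.default_rng(0)
x = np.zeros(2)
step = 0.01
samples = []
for _ in range(10_000):
    # Discretized Langevin diffusion: gradient step on U plus Gaussian noise of variance 2*step.
    x = x - step * grad_potential(x) + np.sqrt(2 * step) * rng.normal(size=x.shape)
    samples.append(x.copy())
print(np.cov(np.array(samples).T))  # should be roughly the identity matrix
```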

What Do Our Models Learn?

Aleksander Mądry
Massachusetts Institute of Technology
June 9, 2020
Large-scale vision benchmarks have driven---and often even defined---progress in machine learning. However, these benchmarks are merely proxies for the real-world tasks we actually care about. How well do our benchmarks capture such tasks?

Forecasting Epidemics and Pandemics

Roni Rosenfeld
Carnegie Mellon University
May 21, 2020
Epidemiological forecasting is critically needed for decision making by national and local governments, public health officials, healthcare institutions and the general public. The Delphi group at Carnegie Mellon University was founded in 2012 to advance the theory and technological capability of epidemiological forecasting, and to promote its role in decision making, both public and private. Our long-term vision is to make epidemiological forecasting as useful and universally accepted as weather forecasting is today.

Neural SDEs: Deep Generative Models in the Diffusion Limit

Maxim Raginsky
University of Illinois Urbana-Champaign
May 19, 2020
In deep generative models, the latent variable is generated by a time-inhomogeneous Markov chain, where at each time step we pass the current state through a parametric nonlinear map, such as a feedforward neural net, and add a small independent Gaussian perturbation. In this talk, based on joint work with Belinda Tzen, I will discuss the diffusion limit of such models, where we increase the number of layers while sending the step size and the noise variance to zero.
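
A schematic rendering (under assumed notation and a toy drift network, not the talk's construction): each layer applies a drift step through a parametric map and adds Gaussian noise whose variance scales with the step size, so letting the number of layers grow gives an Euler-Maruyama-style approximation to an SDE.

```python
# Sketch of the time-inhomogeneous Markov chain behind deep latent Gaussian models,
# written as an Euler-Maruyama-style recursion; the drift network f is an assumed toy map.
import numpy as np

rng = np.random.default_rng(0)

def f(x, t, W1, W2):
    # Parametric nonlinear map (one hidden layer); the layer index enters through the time t.
    h = np.tanh(W1 @ np.append(x, t))
    return W2 @ h

d, hidden, K = 2, 16, 1000        # latent dimension, hidden width, number of layers
W1 = rng.normal(size=(hidden, d + 1)) / np.sqrt(d + 1)
W2 = rng.normal(size=(d, hidden)) / np.sqrt(hidden)
dt = 1.0 / K                      # step size shrinks as the number of layers grows

x = rng.normal(size=d)            # initial latent variable
for k in range(K):
    t = k * dt
    # drift step through the network plus a small independent Gaussian perturbation of variance O(dt)
    x = x + dt * f(x, t, W1, W2) + np.sqrt(dt) * rng.normal(size=d)
# as K grows, this recursion approximates the solution of a neural SDE dX = f(X, t) dt + dB
```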

MathZero, The Classification Problem, and Set-Theoretic Type Theory

David McAllester
Toyota Technological Institute at Chicago
May 14, 2020
AlphaZero learns to play Go, chess, and shogi at a superhuman level through self-play, given only the rules of the game. This raises the question of whether a similar thing could be done for mathematics --- a MathZero. MathZero would require a formal foundation and an objective. We propose the foundation of set-theoretic dependent type theory and an objective defined in terms of the classification problem --- the problem of classifying concept instances up to isomorphism. Isomorphism is central to the structure of mathematics.
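
For concreteness, a textbook instance of classification up to isomorphism (graphs, chosen here purely for illustration and not taken from the talk):

```latex
% Illustration only: the classification problem for graphs.
% Two graphs G = (V_G, E_G) and H = (V_H, E_H) are isomorphic when a bijection on
% vertices preserves adjacency:
\[
  G \cong H \;\iff\; \exists\, \varphi : V_G \to V_H \text{ bijective such that }
  \{u,v\} \in E_G \Leftrightarrow \{\varphi(u),\varphi(v)\} \in E_H .
\]
% The classification problem asks for a description of the equivalence classes under \cong.
```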

Generative Modeling by Estimating Gradients of the Data Distribution

Stefano Ermon
Stanford University
May 12, 2020
Existing generative models are typically based on explicit representations of probability distributions (e.g., autoregressive models or VAEs) or implicit sampling procedures (e.g., GANs). We propose an alternative approach based on directly modeling the vector field of gradients of the data distribution (scores). Our framework allows flexible energy-based model architectures and requires neither sampling during training nor the use of adversarial training methods.
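
A rough sketch of the idea (not the talk's implementation): once a score function approximating the gradient of the log data density is available, samples can be drawn by Langevin dynamics that follow the score field. The analytic two-component Gaussian-mixture score below is an assumed stand-in for a trained score network.

```python
# Sampling by Langevin dynamics driven by a score function grad_x log p(x).
# In practice the score is a learned network; the mixture score here is an assumed stand-in.
import numpy as np

MEANS = np.array([[-2.0, 0.0], [2.0, 0.0]])

def score(x):
    # grad_x log p(x) for an equal-weight mixture of two unit-variance Gaussians.
    diffs = MEANS - x                               # directions toward each mode
    logw = -0.5 * np.sum((x - MEANS) ** 2, axis=1)  # unnormalized log responsibilities
    w = np.exp(logw - logw.max())
    w /= w.sum()
    return w @ diffs

rng = np.random.default_rng(0)
x = rng.normal(size=2)
step = 0.05
for _ in range(2000):
    # Langevin update: follow the score field plus injected Gaussian noise.
    x = x + step * score(x) + np.sqrt(2 * step) * rng.normal(size=2)
print(x)  # should land near one of the two modes at (-2, 0) and (2, 0)
```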

Learning probability distributions: what can, what can't be done

Shai Ben-David
University of Waterloo
May 7, 2020
A possible high-level description of statistical learning is that it aims to learn about some unknown probability distribution (the "environment") from samples that it generates (the "training data"). In its most general form, assuming no prior knowledge and asking for accurate approximations of the data-generating distribution, the problem admits no success guarantee. In this talk I will discuss two major directions for relaxing that overly hard problem.
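
One standard way to make the general problem precise, given here only as an illustrative formalization (total variation is an assumed choice of accuracy measure, not necessarily the talk's):

```latex
% Illustrative formalization: distribution learning with accuracy measured in
% total variation distance, d_TV(P, Q) = sup_A |P(A) - Q(A)|.
% Given m i.i.d. samples S ~ P^m from an unknown P, the learner outputs \hat{P}(S);
% success with accuracy epsilon and confidence 1 - delta means
\[
  \Pr_{S \sim P^{m}} \Bigl[ d_{\mathrm{TV}}\bigl(P, \hat{P}(S)\bigr) \le \varepsilon \Bigr] \ge 1 - \delta .
\]
% With no prior restriction on P, no sample size m achieves this for every P,
% which is why the general problem must be relaxed.
```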