Meta-Learning: Why It’s Hard and What We Can Do

Ke Li
Member, School of Mathematics
April 9, 2020
Meta-learning (or learning to learn) studies how to use machine learning to design machine learning methods themselves. We consider an optimization-based formulation of meta-learning that learns to design an optimization algorithm automatically, which we call Learning to Optimize. Surprisingly, it turns out that the most straightforward approach to learning such an algorithm, namely backpropagation, does not work. We explore the underlying reason for this failure, devise a solution based on reinforcement learning, and discuss the open challenges in meta-learning.

On the Kudla-Rapoport conjecture

Chao Li
Columbia University
April 9, 2020
The Kudla-Rapoport conjecture predicts a precise identity between the arithmetic intersection number of special cycles on unitary Rapoport-Zink spaces and the derivative of local representation densities of hermitian forms. It is a key local ingredient to establish the arithmetic Siegel-Weil formula and the arithmetic Rallis inner product formula, relating the height of special cycles on Shimura varieties to the derivative of Siegel Eisenstein series and L-functions. We will motivate this conjecture, explain a proof and discuss global applications.

Primality testing

Andrey Kupavskii
Member, School of Mathematics
April 7, 2020
In the talk, I will explain the algorithm (and its analysis) for testing whether a number is a prime, invented by Agrawal, Kayal, and Saxena.

Interpolation in learning: steps towards understanding when overparameterization is harmless, when it helps, and when it causes harm

Anant Sahai
University of California, Berkeley
April 7, 2020
A continuing mystery in understanding the empirical success of deep neural networks has been their ability to achieve zero training error and yet generalize well, even when the training data is noisy and there are many more parameters than data points. Following the information-theoretic tradition of seeking understanding, this talk will share our four-part approach to shedding some light on this phenomenon.

Borrowing memory that's being used: catalytic approaches to the Tree Evaluation Problem

James Cook
University of Toronto
April 6, 2020
I'll be presenting some joint work with Ian Mertz scheduled to appear at STOC 2020. The study of the Tree Evaluation Problem (TEP), introduced by S. Cook et al. (TOCT 2012), is a promising approach to separating L from P. Given a label in [k] at each leaf of a complete binary tree and an explicit function [k]^2 -> [k] for recursively computing the value of each internal node from its children, the problem is to compute the value at the root node.
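To make the problem statement concrete, here is a minimal sketch of the naive recursive evaluation, not the catalytic method from the talk. It assumes, purely for illustration, that labels are the integers 0 to k-1, that the tree is stored in heap order (root at index 0, children of node i at 2i+1 and 2i+2), and that each internal node carries its own k-by-k lookup table. The names `evaluate_tree`, `add`, `mul`, and `mx` are hypothetical.

```python
from typing import List

# table[a][b] is a node's output when its children evaluate to a and b.
Table = List[List[int]]


def evaluate_tree(leaves: List[int], tables: List[Table]) -> int:
    """Naively evaluate a complete binary tree stored in heap order.

    tables[i] is the function at internal node i; leaves holds the labels
    of the bottom level, left to right.
    """
    n_internal = len(tables)
    assert len(leaves) == n_internal + 1, "complete binary tree expected"

    def value(i: int) -> int:
        # Indices past the internal nodes refer to leaves.
        if i >= n_internal:
            return leaves[i - n_internal]
        left = value(2 * i + 1)
        right = value(2 * i + 2)
        return tables[i][left][right]

    return value(0)


# Illustrative instance with k = 3 and a tree of three internal nodes.
k = 3
add = [[(a + b) % k for b in range(k)] for a in range(k)]  # root
mul = [[(a * b) % k for b in range(k)] for a in range(k)]  # left child
mx = [[max(a, b) for b in range(k)] for a in range(k)]     # right child

root_value = evaluate_tree([2, 2, 1, 0], [add, mul, mx])
print(root_value)
```

This recursion keeps one pending value per level of the tree while it descends, which is the kind of memory use the problem asks one to reason about.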

The Palais-Smale Theorem and the Solution of Hilbert’s 23rd Problem

Karen Uhlenbeck
The University of Texas at Austin; Distinguished Visiting Professor, School of Mathematics
April 6, 2020
Hilbert’s 23rd Problem is the last in his famous list of problems and is of a different character than the others. The description is several pages, and basically says that the calculus of variations is a subject which needs development. We will look in retrospect at one of the critical events in the calculus of variations: the point at which the critical role of dimension was understood, and the role that the Palais-Smale condition (1963) played in this understanding. I apologize that in its present state, the talk consists mostly of my reminiscences and lacks references.

The Simplicity Conjecture

Dan Cristofaro-Gardiner
Member, School of Mathematics
April 3, 2020
I will explain recent joint work proving that the group of compactly supported area-preserving homeomorphisms of the two-disc is not a simple group; this answers the “Simplicity Conjecture” in the affirmative. Our proof uses new spectral invariants, defined via periodic Floer homology, that I will introduce: these recover the Calabi invariant of monotone twists.

Learning Controllable Representations

Richard Zemel
University of Toronto; Member, School of Mathematics
April 2, 2020
As deep learning systems become more prevalent in real-world applications, it is essential to allow users to exert more control over the system. Imposing some structure on the learned representations enables users to manipulate, interpret, and even obfuscate the representations, and may also improve out-of-distribution generalization. In this talk I will discuss recent work that makes some steps towards these goals, aiming to represent the input in a factorized form, with dimensions of the latent space partitioned into task-dependent and task-independent components.