Interpolation in learning: steps towards understanding when overparameterization is harmless, when it helps, and when it causes harm

Anant Sahai
University of California, Berkeley
April 7, 2020
A continuing mystery in understanding the empirical success of deep neural networks has been their ability to achieve zero training error and yet generalize well, even when the training data is noisy and there are many more parameters than data points. Following the information-theoretic tradition of seeking understanding, this talk will share our four-part approach to shedding some light on this phenomenon.
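As a toy illustration of the interpolation phenomenon described above (a sketch of my own, not material from the talk), the following Python snippet fits a minimum-norm linear interpolator on top of a random-features map with many more features than data points; the feature map, dimensions, and noise level are arbitrary illustrative choices.

import numpy as np

rng = np.random.default_rng(0)

# Toy data: n noisy samples of a 1-d signal, lifted to d >> n random cosine features.
n, d = 20, 500
x = np.linspace(-1, 1, n)
y = np.sin(3 * x) + 0.3 * rng.standard_normal(n)    # noisy training labels

omega = 5.0 * rng.standard_normal(d)                 # random frequencies (illustrative choice)
def features(t):
    return np.cos(np.outer(t, omega)) / np.sqrt(d)

Phi = features(x)                                    # n x d design matrix, heavily overparameterized

# Minimum-norm solution that interpolates the noisy labels exactly.
w = np.linalg.pinv(Phi) @ y
print("train error:", np.mean((Phi @ w - y) ** 2))   # essentially zero

# Behaviour away from the training points: compare against the noiseless signal.
x_test = np.linspace(-1, 1, 200)
print("test error:", np.mean((features(x_test) @ w - np.sin(3 * x_test)) ** 2))

Whether the resulting test error turns out to be benign or harmful depends on the geometry of the features, which is exactly the kind of question the talk addresses.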

Borrowing memory that's being used: catalytic approaches to the Tree Evaluation Problem

James Cook
University of Toronto
April 6, 2020
I'll be presenting some joint work with Ian Mertz, scheduled to appear at STOC 2020. The study of the Tree Evaluation Problem (TEP), introduced by S. Cook et al. (TOCT 2012), is a promising approach to separating L from P. Given a label in [k] at each leaf of a complete binary tree and, at each internal node, an explicit function [k]^2 -> [k] for computing its value from the values of its children, the problem is to compute the value at the root node.
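As a plain restatement of the task (the encoding below is an illustrative choice of mine, not the paper's), a direct recursive evaluation of a TEP instance looks as follows; the question driving the catalytic approach is how little clean working memory such an evaluation really needs.

# Direct recursive evaluation of a Tree Evaluation instance (illustrative encoding).
# A leaf is an int in range(k); an internal node is a triple (f, left, right),
# where f is a dict mapping pairs in range(k) x range(k) to range(k).

def evaluate(node):
    if isinstance(node, int):      # leaf: its label is its value
        return node
    f, left, right = node          # internal node: combine the children's values
    return f[(evaluate(left), evaluate(right))]

# Tiny example with k = 2: every internal node applies XOR to its children.
xor = {(a, b): a ^ b for a in range(2) for b in range(2)}
tree = (xor, 1, (xor, 0, 1))
print(evaluate(tree))              # 1 ^ (0 ^ 1) = 0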

The Palais-Smale Theorem and the Solution of Hilbert’s 23rd Problem

Karen Uhlenbeck
The University of Texas at Austin; Distinguished Visiting Professor, School of Mathematics
April 6, 2020
Hilbert’s 23rd Problem is the last in his famous list of problems and is of a different character than the others. The description runs to several pages, and basically says that the calculus of variations is a subject which needs development. We will look in retrospect at one of the critical events in the calculus of variations: the point at which the critical role of dimension was understood, and the role that the Palais-Smale condition (1963) played in this understanding. I apologize that, in its present state, the talk consists mostly of my reminiscences and lacks references.

The Simplicity Conjecture

Dan Cristofaro-Gardiner
Member, School of Mathematics
April 3, 2020
I will explain recent joint work proving that the group of compactly supported area-preserving homeomorphisms of the two-disc is not a simple group; this answers the "Simplicity Conjecture" in the affirmative. Our proof uses new spectral invariants, defined via periodic Floer homology, that I will introduce: these recover the Calabi invariant of monotone twists.

Learning Controllable Representations

Richard Zemel
University of Toronto; Member, School of Mathematics
April 2, 2020
As deep learning systems become more prevalent in real-world applications, it is essential to allow users to exert more control over the system. Imposing structure on the learned representations enables users to manipulate, interpret, and even obfuscate them, and may also improve out-of-distribution generalization. In this talk I will discuss recent work that takes some steps towards these goals, aiming to represent the input in a factorized form, with the dimensions of the latent space partitioned into task-dependent and task-independent components.
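As a minimal sketch of what a factorized latent space can look like in code (my own illustrative PyTorch example; the architecture, dimensions, and names are not from the talk), an encoder can simply emit two separate blocks of latent dimensions, one intended to be task-dependent and user-controllable and one task-independent:

import torch
import torch.nn as nn

class FactorizedEncoder(nn.Module):
    """Encoder whose latent code is split into a task-dependent part z_task
    and a task-independent part z_rest (all sizes are arbitrary here)."""
    def __init__(self, in_dim=784, task_dim=8, rest_dim=24):
        super().__init__()
        self.body = nn.Sequential(nn.Linear(in_dim, 256), nn.ReLU())
        self.task_head = nn.Linear(256, task_dim)   # dimensions a user may inspect or manipulate
        self.rest_head = nn.Linear(256, rest_dim)   # remaining factors of variation

    def forward(self, x):
        h = self.body(x)
        return self.task_head(h), self.rest_head(h)

enc = FactorizedEncoder()
z_task, z_rest = enc(torch.randn(4, 784))
print(z_task.shape, z_rest.shape)   # torch.Size([4, 8]) torch.Size([4, 24])

How the two blocks are actually kept disentangled (e.g. via auxiliary or adversarial objectives) is the substantive part of the work and is not captured by this sketch.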

Density conjecture for horizontal families of lattices in SL(2)

Mikolaj Fraczyk
Member, School of Mathematics
April 2, 2020
Let G be a real semi-simple Lie group with an irreducible unitary representation \pi. The non-temperedness of \pi is measured by the parameter p(\pi), defined as the infimum of p \geq 2 such that \pi has matrix coefficients in L^p(G). Sarnak and Xue conjectured that for any arithmetic lattice \Gamma \subset G and principal congruence subgroup \Gamma(q) \subset \Gamma, the multiplicity of \pi in L^2(G/\Gamma(q)) is at most O(V(q)^{2/p(\pi)+\epsilon}), where V(q) is the covolume of \Gamma(q).
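Written as displayed formulas (just restating the definitions and bound from the paragraph above), the setup reads:

% p(\pi) measures the failure of temperedness.
p(\pi) \;=\; \inf\bigl\{\, p \ge 2 \;:\; \text{the matrix coefficients of } \pi \text{ lie in } L^{p}(G) \,\bigr\}

% Sarnak--Xue conjecture: multiplicity bound in terms of the covolume V(q) of \Gamma(q).
\mathrm{mult}\bigl(\pi,\, L^{2}(G/\Gamma(q))\bigr) \;=\; O_{\epsilon}\!\bigl( V(q)^{\,2/p(\pi) + \epsilon} \bigr)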

Some Recent Insights on Transfer Learning

Samory Kpotufe
Columbia University; Member, School of Mathematics
March 31, 2020
A common situation in machine learning is one where the training data is not fully representative of a target population, due to bias in the sampling mechanism or high costs in sampling the target population; in such situations, we aim to 'transfer' relevant information from the training data (a.k.a. source data) to the target application. How much information is in the source data? How much target data should we collect, if any? These are all practical questions that depend crucially on 'how far' the source domain is from the target.

CSPs with Global Modular Constraints: Algorithms and Hardness via Polynomial Representations

Sivakanth Gopi
Microsoft Research
March 30, 2020
A theorist's dream is to show that hard instances/obstructions for an (optimal) algorithm can be used as gadgets to prove tight hardness reductions (which in turn prove the optimality of the algorithm). An example of such a result is that of Prasad Raghavendra, who showed that for any constraint satisfaction problem (CSP), there is an SDP which achieves the best possible approximation factor assuming the Unique Games Conjecture (UGC). We show that a similar phenomenon occurs in CSPs with global modular constraints.

Geometry and 5d N=1 QFTs

Lakshya Bhardwaj
Harvard University
March 30, 2020
I will explain how a geometric theory built upon the theory of complex surfaces can be used to understand a wide variety of phenomena in five-dimensional supersymmetric theories, including the following:
Classification of 5d superconformal field theories (SCFTs)
Enhanced flavor symmetries of 5d SCFTs
5d N=1 gauge theory descriptions of 5d and 6d SCFTs
Dualities between 5d N=1 gauge theories
T-dualities between 6d N=(1,0) little string theories