Seminar on Theoretical Machine Learning

Multi-Output Prediction: Theory and Practice

Inderjit Dhillon
University of Texas, Austin
August 27, 2020
Many challenging problems in modern applications amount to finding relevant results from an enormous output space of potential candidates, for example, finding the best matching product from a large catalog or suggesting related search phrases on a search engine. The size of the output space for these problems can be in the millions to billions. Moreover, observational or training data is often limited for many of the so-called “long-tail” items in the output space.

Learning-Based Sketching Algorithms

Piotr Indyk
Massachusetts Institute of Technology
August 25, 2020
Classical algorithms typically provide "one size fits all" performance, and do not leverage properties or patterns in their inputs. A recent line of work aims to address this issue by developing algorithms that use machine learning predictions to improve their performance. In this talk I will present two examples of this type, in the context of streaming and sketching algorithms.
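
One concrete instance of this idea is a Count-Min sketch paired with a learned heavy-hitter oracle: items the model predicts to be frequent get exact dedicated counters, so they stop inflating the shared hash buckets used by everything else. The Python sketch below illustrates the mechanism only; the oracle predict_heavy is an assumed, separately trained classifier, and the names and parameters are illustrative rather than the talk's exact construction.

    import random

    class LearnedCountMin:
        """Count-Min sketch with a learned heavy-hitter oracle.

        Items the oracle flags as heavy get exact dedicated counters;
        the rest share the hashed counter array, so predicted-heavy
        items no longer inflate the estimates of the light ones.
        """

        def __init__(self, width, depth, predict_heavy):
            self.width, self.depth = width, depth
            self.predict_heavy = predict_heavy  # assumed oracle: item -> bool
            self.exact = {}                     # exact counts for predicted-heavy items
            self.table = [[0] * width for _ in range(depth)]
            self.seeds = [random.randrange(1 << 30) for _ in range(depth)]

        def _bucket(self, item, row):
            return hash((self.seeds[row], item)) % self.width

        def update(self, item, count=1):
            if self.predict_heavy(item):
                self.exact[item] = self.exact.get(item, 0) + count
            else:
                for r in range(self.depth):
                    self.table[r][self._bucket(item, r)] += count

        def estimate(self, item):
            if item in self.exact:
                return self.exact[item]
            # standard Count-Min estimate: minimum over rows (an overestimate)
            return min(self.table[r][self._bucket(item, r)]
                       for r in range(self.depth))

Note that the learning here only routes items: if the oracle errs, an item simply lands in the ordinary sketch, whose worst-case guarantees still apply, which is the graceful-degradation flavour this line of work aims for.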

Event Sequence Modeling with the Neural Hawkes Process

Jason Eisner
Johns Hopkins University
August 20, 2020
Suppose you are monitoring discrete events in real time. Can you predict what events will happen in the future, and when? Can you fill in past events that you may have missed? A probability model that supports such reasoning is the neural Hawkes process (NHP), in which the Poisson intensities of K event types at time t depend on the history of past events. This autoregressive architecture can capture complex dependencies. It resembles an LSTM language model over K word types, but allows the LSTM state to evolve in continuous time.
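
Concretely, in notation close to the NHP paper (take the symbols as a paraphrase rather than the talk's exact slides), the intensity of event type k is a softplus-transformed projection of a hidden state that decays in continuous time between events:

    \lambda_k(t) = f_k\big(\mathbf{w}_k^\top \mathbf{h}(t)\big),
    \qquad f_k(x) = s_k \log\big(1 + e^{x/s_k}\big),

    \mathbf{c}(t) = \bar{\mathbf{c}}_i + (\mathbf{c}_i - \bar{\mathbf{c}}_i)\, e^{-\boldsymbol{\delta}_i (t - t_{i-1})},
    \qquad \mathbf{h}(t) = \mathbf{o}_i \odot \tanh\big(\mathbf{c}(t)\big),

where, on the interval following the (i-1)-st event, the LSTM cell c(t) relaxes from c_i toward a target value at rate delta_i, and the softplus f_k keeps each intensity positive while allowing it to approach zero.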

From Speech AI to Finance AI and Back

Li Deng
Citadel
August 18, 2020
A brief review will be provided first on how deep learning has disrupted the speech recognition and language processing industries since 2009. Then connections will be drawn between the techniques (deep learning or otherwise) for modeling speech and language and those for financial markets. Similarities and differences between these two fields will be explored. In particular, three unique technical challenges to financial investment are addressed: extremely low signal-to-noise ratio, extremely strong nonstationarity (with adversarial nature), and heterogeneous big data.

Latent State Recovery in Reinforcement Learning

John Langford
Microsoft Research
August 13, 2020
There are three core, orthogonal problems in reinforcement learning: (1) crediting actions, (2) generalizing across rich observations, and (3) exploring to discover the information necessary for learning. Good solutions to pairs of these problems are fairly well known at this point, but solutions for all three are just now being discovered. I’ll discuss several such results and dive into details on a few of them.

Statistical Learning Theory for Modern Machine Learning

John Shawe-Taylor
University College London
August 11, 2020
Probably Approximately Correct (PAC) learning has attempted to analyse the generalisation of learning systems within the statistical learning framework. It has been referred to as a ‘worst-case’ analysis, but the tools have been extended to analyse cases where benign distributions mean we can still generalise even if worst-case bounds suggest we cannot. The talk will cover the PAC-Bayes approach to analysing generalisation that is inspired by Bayesian inference, but leads to a different role for the prior and posterior distributions.
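
For orientation, one canonical PAC-Bayes bound (the Seeger/Maurer form; the talk may use a different variant) reads: for any prior P fixed before seeing the n-sample and any delta in (0,1), with probability at least 1 - delta, simultaneously for all posteriors Q,

    \mathrm{kl}\big(\hat{R}(Q) \,\|\, R(Q)\big)
    \;\le\; \frac{\mathrm{KL}(Q \,\|\, P) + \ln\frac{2\sqrt{n}}{\delta}}{n},

where R-hat(Q) and R(Q) are the empirical and true risks of the Q-weighted (Gibbs) classifier and kl is the binary relative entropy. The prior here need not be ‘true’ in any Bayesian sense; it only pays a KL(Q||P) price, which is the different role alluded to above.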

A Blueprint of Standardized and Composable Machine Learning

Eric Xing
Carnegie Mellon University
August 6, 2020
In handling a wide range of experiences, from data instances, knowledge, and constraints to rewards, adversaries, and lifelong interplay across an ever-growing spectrum of tasks, contemporary ML/AI research has produced thousands of models, learning paradigms, and optimization algorithms, not to mention countless approximation heuristics, tuning tricks, and black-box oracles, plus combinations of all of the above.

Nonlinear Independent Component Analysis

Aapo Hyvärinen
University of Helsinki
August 4, 2020
Unsupervised learning, in particular learning general nonlinear representations, is one of the deepest problems in machine learning. Estimating latent quantities in a generative model provides a principled framework, and has been used successfully in the linear case, e.g. with independent component analysis (ICA) and sparse coding. However, extending ICA to the nonlinear case has proven extremely difficult: a straightforward extension is unidentifiable, i.e. it is not possible to recover those latent components that actually generated the data.
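
To fix notation (a standard formulation of the problem, not necessarily the talk's exact setup): the data are modelled as

    \mathbf{x} = \mathbf{f}(\mathbf{s}),
    \qquad p(\mathbf{s}) = \prod_{i=1}^{n} p_i(s_i),

with f an invertible mixing function and the s_i independent sources. When f is linear, independence identifies the sources up to permutation and scaling; when f is an arbitrary nonlinearity, a Darmois-style construction can manufacture independent components that have nothing to do with the true sources, which is the unidentifiability referred to above.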

Efficient Robot Skill Learning via Grounded Simulation Learning, Imitation Learning from Observation, and Off-Policy Reinforcement Learning

Peter Stone
University of Texas at Austin
July 30, 2020
For autonomous robots to operate in the open, dynamically changing world, they will need to be able to learn a robust set of skills from relatively little experience. This talk begins by introducing Grounded Simulation Learning as a way to bridge the so-called reality gap between simulators and the real world in order to enable transfer learning from simulation to a real robot.

Generalized Energy-Based Models

Arthur Gretton
University College London
July 28, 2020
I will introduce Generalized Energy-Based Models (GEBMs) for generative modelling. These models combine two trained components: a base distribution (generally an implicit model), which can learn the support of data with low intrinsic dimension in a high-dimensional space; and an energy function, to refine the probability mass on the learned support. Both the energy function and the base jointly constitute the final model, unlike GANs, which retain only the base distribution (the "generator").
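
Schematically (up to the sign convention on the energy and omitting the normalization details of the paper), the combined model assigns density

    p(\mathbf{x}) \;\propto\; b(\mathbf{x})\, e^{-E(\mathbf{x})},

so the base b fixes the (possibly low-dimensional) support while the energy E reweights probability mass on that support; a GAN, by contrast, would discard E after training.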