School of Mathematics

Multi-Output Prediction: Theory and Practice

Inderjit Dhillon
University of Texas, Austin
August 27, 2020
Many challenging problems in modern applications amount to finding relevant results from an enormous output space of potential candidates, for example, finding the best matching product from a large catalog or suggesting related search phrases on a search engine. The size of the output space for these problems can be in the millions to billions. Moreover, observational or training data is often limited for many of the so-called “long-tail” items in the output space.

Learning-Based Sketching Algorithms

Piotr Indyk
Massachusetts Institute of Technology
August 25, 2020
Classical algorithms typically provide "one size fits all" performance, and do not leverage properties or patterns in their inputs. A recent line of work aims to address this issue by developing algorithms that use machine learning predictions to improve their performance. In this talk I will present two examples of this type, in the context of streaming and sketching algorithms.

Event Sequence Modeling with the Neural Hawkes Process

Jason Eisner
Johns Hopkins University
August 20, 2020
Suppose you are monitoring discrete events in real time. Can you predict what events will happen in the future, and when? Can you fill in past events that you may have missed? A probability model that supports such reasoning is the neural Hawkes process (NHP), in which the Poisson intensities of K event types at time t depend on the history of past events. This autoregressive architecture can capture complex dependencies. It resembles an LSTM language model over K word types, but allows the LSTM state to evolve in continuous time.
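
As a rough illustration of how intensities can depend on the history of past events, here is a minimal sketch of a classical (non-neural) multivariate Hawkes process with an exponential decay kernel. It is not the neural Hawkes process from the talk, which replaces this fixed parametric form with a continuous-time LSTM state. The function name hawkes_intensities and the parameters mu, alpha, and delta are assumptions chosen for this example.

```python
import math

def hawkes_intensities(t, history, mu, alpha, delta):
    """Intensity of each of K event types at time t (classical Hawkes sketch).

    history: list of (time, type) pairs for past events
    mu[k]: base rate of event type k
    alpha[j][k]: how strongly a past event of type j excites type k
    delta: exponential decay rate of that excitation (assumed > 0)
    """
    lam = list(mu)  # start from the base rates
    for s, j in history:
        if s < t:  # only events strictly before t contribute
            decay = math.exp(-delta * (t - s))
            for k in range(len(mu)):
                lam[k] += alpha[j][k] * decay
    return lam

# Hypothetical parameters for K = 2 event types.
mu = [0.5, 0.2]
alpha = [[0.8, 0.0], [0.3, 0.4]]
history = [(1.0, 0), (2.0, 1)]
lam = hawkes_intensities(3.0, history, mu, alpha, delta=1.0)
```

With an empty history the function returns the base rates unchanged, and each past event adds a bump that fades as t moves away from it.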

From Speech AI to Finance AI and Back

Li Deng
August 18, 2020
A brief review will be provided first on how deep learning has disrupted speech recognition and language processing industries since 2009. Then connections will be drawn between the techniques (deep learning or otherwise) for modeling speech and language and those for financial markets. Similarities and differences of these two fields will be explored. In particular, three unique technical challenges to financial investment are addressed: extremely low signal-to-noise ratio, extremely strong nonstationarity (with adversarial nature), and heterogeneous big data.

Latent State Recovery in Reinforcement Learning

John Langford
Microsoft Research
August 13, 2020
There are three core orthogonal problems in reinforcement learning: (1) crediting actions, (2) generalizing across rich observations, and (3) exploring to discover the information necessary for learning. Good solutions to pairs of these problems are fairly well known at this point, but solutions for all three are just now being discovered. I’ll discuss several such results and dive into details on a few of them.

Statistical Learning Theory for Modern Machine Learning

John Shawe-Taylor
University College London
August 11, 2020
Probably Approximately Correct (PAC) learning has attempted to analyse the generalisation of learning systems within the statistical learning framework. It has been referred to as a ‘worst case’ analysis, but the tools have been extended to analyse cases where benign distributions mean we can still generalise even if worst case bounds suggest we cannot. The talk will cover the PAC-Bayes approach to analysing generalisation that is inspired by Bayesian inference, but leads to a different role for the prior and posterior distributions.

A Blueprint of Standardized and Composable Machine Learning

Eric Xing
Carnegie Mellon University
August 6, 2020
In handling a wide range of experiences, from data instances, knowledge, and constraints to rewards, adversaries, and lifelong interplay in an ever-growing spectrum of tasks, contemporary ML/AI research has resulted in thousands of models, learning paradigms, and optimization algorithms, not to mention countless approximation heuristics, tuning tricks, and black-box oracles, plus combinations of all of the above.

Nonlinear Independent Component Analysis

Aapo Hyvärinen
University of Helsinki
August 4, 2020
Unsupervised learning, in particular learning general nonlinear representations, is one of the deepest problems in machine learning. Estimating latent quantities in a generative model provides a principled framework, and has been successfully used in the linear case, e.g. with independent component analysis (ICA) and sparse coding. However, extending ICA to the nonlinear case has proven to be extremely difficult: a straightforward extension is unidentifiable, i.e. it is not possible to recover those latent components that actually generated the data.