John Langford

Microsoft Research

August 13, 2020

There are three core orthogonal problems in reinforcement learning: (1) crediting actions, (2) generalizing across rich observations, and (3) exploring to discover the information necessary for learning. Good solutions to pairs of these problems are fairly well known at this point, but solutions that address all three together are just now being discovered. I’ll discuss several such results and dive into details on a few of them.