Reinforcement Learning, Deep Learning, and the Role of Policy Gradient Methods - Sham Kakade