Reinforcement Learning, Deep Learning,
and the Role of Policy Gradient Methods