Self-tuning Gradient Estimators through Higher-order Automatic Differentiation in Julia

Gradient-based optimization is the main trick of deep learning and deep reinforcement learning. However, it’s hard to estimate gradients in the most interesting settings: when the mechanism being optimized is unknown (as in reinforcement learning) or involves discrete operations (as in optimizing programs). I’ll give a quick overview of the tricks of the trade:

The REINFORCE estimator, which can be summarized as “if it works, do it more”.
The reparameterization trick, which factors out randomness to expose the deterministic relationship between inputs and outputs.
Control variates, which reduce variance by introducing a predictable baseline.

I’ll also talk about LAX [1], a recent family of self-tuning gradient estimators that combines all of these. This involves not just automatic differentiation, but also differentiating through automatic differentiation itself. I’ll talk about the subtleties of doing this correctly, and how to approach these problems in Julia.
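
To make the three tricks concrete, here is a minimal sketch in Python that compares them on a toy problem. Everything in it is an assumption chosen for illustration, not material from the talk: the objective f(x) = (x - 3)^2, the unit-variance Gaussian sampling distribution, the fixed baseline f(mu), and the function names. The quantity estimated is the gradient of E[f(x)] with respect to mu for x ~ N(mu, 1), whose exact value is 2 * (mu - 3).

```python
import random
import statistics


def f(x):
    # Toy objective (an assumption for this sketch).
    return (x - 3.0) ** 2


def estimate_gradients(mu, n_samples, seed=0):
    """Return {name: (mean, variance)} of per-sample gradient estimates
    of d/dmu E[f(x)] for x ~ N(mu, 1)."""
    rng = random.Random(seed)
    baseline = f(mu)  # fixed baseline; it does not depend on the sample
    reinforce, reinforce_cv, reparam = [], [], []
    for _ in range(n_samples):
        eps = rng.gauss(0.0, 1.0)
        x = mu + eps          # reparameterized sample: x = mu + eps
        score = x - mu        # d/dmu log N(x; mu, 1)

        # REINFORCE: weight the score by the observed outcome.
        reinforce.append(f(x) * score)
        # REINFORCE with a control variate: subtract a baseline first.
        reinforce_cv.append((f(x) - baseline) * score)
        # Reparameterization: differentiate f(mu + eps) directly,
        # which needs the derivative f'(x) = 2 * (x - 3).
        reparam.append(2.0 * (x - 3.0))

    samples = {
        "reinforce": reinforce,
        "reinforce_cv": reinforce_cv,
        "reparam": reparam,
    }
    return {
        name: (statistics.fmean(vals), statistics.pvariance(vals))
        for name, vals in samples.items()
    }


results = estimate_gradients(mu=1.0, n_samples=50_000)
for name, (mean, var) in results.items():
    print(f"{name:>12}: mean={mean:+.3f}  variance={var:.2f}")
```

All three estimators target the same gradient (here 2 * (1 - 3) = -4) and differ only in variance. In this sketch the baseline is a fixed constant; a self-tuning estimator would instead adjust the control variate to reduce the variance of the gradient estimate, which is where differentiating through the gradient computation itself comes in.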

Speaker's bio

Jesse Bettencourt is a graduate student in the Machine Learning group at the University of Toronto and the Vector Institute. He is supervised by David Duvenaud and Roger Grosse and teaches the undergraduate/graduate course on probabilistic models and machine learning. He is very excited to use Julia in his ML research and possibly in future course offerings.