Uber AI, Engineering

Q-Prop: Sample-Efficient Policy Gradient with An Off-Policy Critic

November 1, 2016 / Global