Shixiang Shane Gu (顾世翔)
Senior Research Scientist, Google Brain
Verified email at google.com
Title
Cited by
Year
Categorical reparameterization with Gumbel-Softmax
E Jang, S Gu, B Poole
arXiv preprint arXiv:1611.01144, 2016
Cited by: 2017; Year: 2016
Deep reinforcement learning for robotic manipulation with asynchronous off-policy updates
S Gu, E Holly, T Lillicrap, S Levine
2017 IEEE International Conference on Robotics and Automation (ICRA), 3389-3396, 2017
Cited by: 913; Year: 2017
Continuous deep Q-learning with model-based acceleration
S Gu, T Lillicrap, I Sutskever, S Levine
International Conference on Machine Learning, 2829-2838, 2016
Cited by: 762; Year: 2016
Towards deep neural network architectures robust to adversarial examples
S Gu, L Rigazio
arXiv preprint arXiv:1412.5068, 2014
Cited by: 578; Year: 2014
Data-efficient hierarchical reinforcement learning
O Nachum, S Gu, H Lee, S Levine
arXiv preprint arXiv:1805.08296, 2018
Cited by: 287; Year: 2018
Q-Prop: Sample-efficient policy gradient with an off-policy critic
S Gu, T Lillicrap, Z Ghahramani, RE Turner, S Levine
arXiv preprint arXiv:1611.02247, 2016
Cited by: 265; Year: 2016
Sequence tutor: Conservative fine-tuning of sequence generation models with KL-control
N Jaques, S Gu, D Bahdanau, JM Hernández-Lobato, RE Turner, D Eck
International Conference on Machine Learning, 1645-1654, 2017
Cited by: 177*; Year: 2017
Temporal difference models: Model-free deep RL for model-based control
V Pong, S Gu, M Dalal, S Levine
arXiv preprint arXiv:1802.09081, 2018
Cited by: 131; Year: 2018
Interpolated policy gradient: Merging on-policy and off-policy gradient estimation for deep reinforcement learning
S Gu, T Lillicrap, Z Ghahramani, RE Turner, B Schölkopf, S Levine
arXiv preprint arXiv:1706.00387, 2017
Cited by: 120; Year: 2017
MuProp: Unbiased backpropagation for stochastic neural networks
S Gu, S Levine, I Sutskever, A Mnih
arXiv preprint arXiv:1511.05176, 2015
Cited by: 120; Year: 2015
Neural adaptive sequential Monte Carlo
S Gu, Z Ghahramani, RE Turner
arXiv preprint arXiv:1506.03338, 2015
Cited by: 119; Year: 2015
Human-centric dialog training via offline reinforcement learning
N Jaques, JH Shen, A Ghandeharioun, C Ferguson, A Lapedriza, ...
arXiv preprint arXiv:2010.05848, 2020
Cited by: 79*; Year: 2020
Near-optimal representation learning for hierarchical reinforcement learning
O Nachum, S Gu, H Lee, S Levine
arXiv preprint arXiv:1810.01257, 2018
Cited by: 76; Year: 2018
Dynamics-aware unsupervised discovery of skills
A Sharma, S Gu, S Levine, V Kumar, K Hausman
arXiv preprint arXiv:1907.01657, 2019
Cited by: 74*; Year: 2019
The mirage of action-dependent baselines in reinforcement learning
G Tucker, S Bhupatiraju, S Gu, R Turner, Z Ghahramani, S Levine
International Conference on Machine Learning, 5015-5024, 2018
Cited by: 69; Year: 2018
Leave no trace: Learning to reset for safe and autonomous reinforcement learning
B Eysenbach, S Gu, J Ibarz, S Levine
arXiv preprint arXiv:1711.06782, 2017
Cited by: 61; Year: 2017
A divergence minimization perspective on imitation learning methods
SKS Ghasemipour, R Zemel, S Gu
Conference on Robot Learning, 1259-1277, 2020
Cited by: 59*; Year: 2020
Doubly reparameterized gradient estimators for Monte Carlo objectives
G Tucker, D Lawson, S Gu, CJ Maddison
arXiv preprint arXiv:1810.04152, 2018
Cited by: 53; Year: 2018