A. Rupam Mahmood
Title
Cited by
Year
An Emphatic Approach to the Problem of Off-policy Temporal-Difference Learning
RS Sutton, AR Mahmood, M White
Journal of Machine Learning Research 17, 2016
Cited by: 131, Year: 2016
Weighted importance sampling for off-policy learning with linear function approximation
AR Mahmood, H van Hasselt, RS Sutton
Advances in Neural Information Processing Systems 27, 2014
Cited by: 91, Year: 2014
True Online Temporal-Difference Learning
H van Seijen, AR Mahmood, PM Pilarski, MC Machado, RS Sutton
Journal of Machine Learning Research 17, 2016
Cited by: 70, Year: 2016
Benchmarking Reinforcement Learning Algorithms on Real-World Robots
AR Mahmood, D Korenkevych, G Vasan, W Ma, J Bergstra
Proceedings of the 2nd Annual Conference on Robot Learning (CoRL), 2018
Cited by: 49, Year: 2018
Tuning-free step-size adaptation
AR Mahmood, RS Sutton, T Degris, PM Pilarski
Acoustics, Speech and Signal Processing (ICASSP), 2012 IEEE International …, 2012
Cited by: 41, Year: 2012
A new Q(λ) with interim forward view and Monte Carlo equivalence
RS Sutton, AR Mahmood, D Precup, H van Hasselt
Cited by: 35, Year: 2014
Off-policy TD(λ) with a true online equivalence
H van Hasselt, AR Mahmood, RS Sutton
Proceedings of the 30th Conference on Uncertainty in Artificial Intelligence …, 2014
Cited by: 33, Year: 2014
Multi-step Off-policy Learning Without Importance Sampling Ratios
AR Mahmood, H Yu, RS Sutton
arXiv preprint arXiv:1702.03006, 2017
Cited by: 28, Year: 2017
Off-policy learning based on weighted importance sampling with linear computational complexity
AR Mahmood, RS Sutton
Proceedings of the 31st Conference on Uncertainty in Artificial Intelligence …, 2015
Cited by: 25, Year: 2015
Setting up a reinforcement learning task with a real-world robot
AR Mahmood, D Korenkevych, BJ Komer, J Bergstra
2018 IEEE/RSJ International Conference on Intelligent Robots and Systems …, 2018
Cited by: 24, Year: 2018
Emphatic temporal-difference learning
AR Mahmood, H Yu, M White, RS Sutton
arXiv preprint arXiv:1507.01569, 2015
Cited by: 22, Year: 2015
Representation Search through Generate and Test
AR Mahmood, RS Sutton
Workshops at the Twenty-Seventh AAAI Conference on Artificial Intelligence, 2013
Cited by: 22, Year: 2013
On generalized Bellman equations and temporal-difference learning
H Yu, AR Mahmood, RS Sutton
The Journal of Machine Learning Research 19 (1), 1864-1912, 2018
Cited by: 12, Year: 2018
Incremental Off-policy Reinforcement Learning Algorithms
A Mahmood
University of Alberta, 2017
Cited by: 9, Year: 2017
Autoregressive Policies for Continuous Control Deep Reinforcement Learning
D Korenkevych, AR Mahmood, G Vasan, J Bergstra
Proceedings of the 28th International Joint Conference on Artificial …, 2019
Cited by: 7, Year: 2019
Structure Learning of Causal Bayesian Networks: A Survey
A Mahmood
Department of Computing Science, University of Alberta, Edmonton, Canada …, 2011
Cited by: 7, Year: 2011
Automatic step-size adaptation in incremental supervised learning
A Mahmood
University of Alberta, 2010
Cited by: 7, Year: 2010
Heteroscedastic Uncertainty for Robust Generative Latent Dynamics
O Limoyo, B Chan, F Marić, B Wagstaff, AR Mahmood, J Kelly
IEEE Robotics and Automation Letters 5 (4), 6654-6661, 2020
Cited by: 1, Year: 2020
An Empirical Evaluation of True Online TD(λ)
H van Seijen, AR Mahmood, PM Pilarski, RS Sutton
arXiv preprint arXiv:1507.00353, 2015
Cited by: 1, Year: 2015
Making hyper-parameters of proximal policy optimization robust to time discretization
H Farrahi, AR Mahmood
3rd Robot Learning Workshop at NeurIPS, 2020
Year: 2020