Mohammad Ghavamzadeh
Facebook AI Research (FAIR)
Verified email at inria.fr
Title, Cited by, Year
Natural actor–critic algorithms
S Bhatnagar, RS Sutton, M Ghavamzadeh, M Lee
Automatica 45 (11), 2471-2482, 2009
Cited by 351, 2009
Regularized policy iteration
AM Farahmand, M Ghavamzadeh, S Mannor, C Szepesvári
Advances in Neural Information Processing Systems, 441-448, 2009
Cited by 140, 2009
Best arm identification: A unified approach to fixed budget and fixed confidence
V Gabillon, M Ghavamzadeh, A Lazaric
Advances in Neural Information Processing Systems, 3212-3220, 2012
Cited by 135, 2012
Hierarchical multi-agent reinforcement learning
R Makar, S Mahadevan, M Ghavamzadeh
Proceedings of the fifth international conference on Autonomous agents, 246-253, 2001
Cited by 132, 2001
Incremental natural actor-critic algorithms
S Bhatnagar, M Ghavamzadeh, M Lee, RS Sutton
Advances in Neural Information Processing Systems, 105-112, 2008
Cited by 128, 2008
Hierarchical multi-agent reinforcement learning
M Ghavamzadeh, S Mahadevan, R Makar
Autonomous Agents and Multi-Agent Systems 13 (2), 197-229, 2006
Cited by 127, 2006
Bayesian reinforcement learning: A survey
M Ghavamzadeh, S Mannor, J Pineau, A Tamar
Foundations and Trends® in Machine Learning 8 (5-6), 359-483, 2015
Cited by 121, 2015
Supervised actor-critic reinforcement learning
M Barto, MT Rosenstein
Handbook of learning and approximate dynamic programming 2, 359, 2004
Cited by 121, 2004
High-confidence off-policy evaluation
PS Thomas, G Theocharous, M Ghavamzadeh
Twenty-Ninth AAAI Conference on Artificial Intelligence, 2015
Cited by 89, 2015
Bayesian policy gradient algorithms
M Ghavamzadeh, Y Engel
Advances in Neural Information Processing Systems, 457-464, 2007
Cited by 77, 2007
Learning to communicate and act using hierarchical reinforcement learning
M Ghavamzadeh, S Mahadevan
Proceedings of the Third International Joint Conference on Autonomous Agents …, 2004
Cited by 74, 2004
Multi-bandit best arm identification
V Gabillon, M Ghavamzadeh, A Lazaric, S Bubeck
Advances in Neural Information Processing Systems, 2222-2230, 2011
Cited by 72, 2011
Bayesian multi-task reinforcement learning
A Lazaric, M Ghavamzadeh
Cited by 71, 2010
Analysis of a classification-based policy iteration algorithm
A Lazaric, M Ghavamzadeh, R Munos
Cited by 71, 2010
Regularized fitted Q-iteration for planning in continuous-space Markovian decision problems
AM Farahmand, M Ghavamzadeh, C Szepesvári, S Mannor
2009 American Control Conference, 725-730, 2009
Cited by 70, 2009
Finite-sample analysis of proximal gradient TD algorithms
B Liu, J Liu, M Ghavamzadeh, S Mahadevan, M Petrik
UAI, 504-513, 2015
Cited by 66, 2015
High confidence policy improvement
P Thomas, G Theocharous, M Ghavamzadeh
International Conference on Machine Learning, 2380-2388, 2015
Cited by 64, 2015
Finite-sample analysis of least-squares policy iteration
A Lazaric, M Ghavamzadeh, R Munos
Journal of Machine Learning Research 13 (Oct), 3041-3074, 2012
Cited by 63, 2012
Finite-sample analysis of LSTD
A Lazaric, M Ghavamzadeh, R Munos
Cited by 61, 2010
Speedy Q-learning
MG Azar, R Munos, M Ghavamzadeh, HJ Kappen
Spain, Granada: NIPS, 2011
Cited by 58, 2011