Szymon Sidor
OpenAI
Verified email at openai.com
Title
Cited by
Year
Dota 2 with large scale deep reinforcement learning
C Berner, G Brockman, B Chan, V Cheung, P Dębiak, C Dennison, ...
arXiv preprint arXiv:1912.06680, 2019
Cited by: 1645, Year: 2019
Evolution strategies as a scalable alternative to reinforcement learning
T Salimans, J Ho, X Chen, S Sidor, I Sutskever
arXiv preprint arXiv:1703.03864, 2017
Cited by: 1637, Year: 2017
Learning dexterous in-hand manipulation
OpenAI, M Andrychowicz, B Baker, M Chociej, R Jozefowicz, B McGrew, ...
The International Journal of Robotics Research 39 (1), 3-20, 2020
Cited by: 1574, Year: 2020
OpenAI baselines
P Dhariwal, C Hesse, O Klimov, A Nichol, M Plappert, A Radford, ...
Cited by: 989, Year: 2017
Stable baselines
A Hill, A Raffin, M Ernestus, A Gleave, A Kanervisto, R Traore, P Dhariwal, ...
Cited by: 855, Year: 2018
Parameter space noise for exploration
M Plappert, R Houthooft, P Dhariwal, S Sidor, RY Chen, X Chen, T Asfour, ...
arXiv preprint arXiv:1706.01905, 2017
Cited by: 698, Year: 2017
GPT-4 technical report
J Achiam, S Adler, S Agarwal, L Ahmad, I Akkaya, FL Aleman, D Almeida, ...
arXiv preprint arXiv:2303.08774, 2023
Cited by: 510, Year: 2023
Emergent complexity via multi-agent competition
T Bansal, J Pachocki, S Sidor, I Sutskever, I Mordatch
arXiv preprint arXiv:1710.03748, 2017
Cited by: 449, Year: 2017
Schema networks: Zero-shot transfer with a generative causal model of intuitive physics
K Kansky, T Silver, DA Mély, M Eldawy, M Lázaro-Gredilla, X Lou, ...
International conference on machine learning, 1809-1818, 2017
Cited by: 271, Year: 2017
UCB exploration via Q-ensembles
RY Chen, S Sidor, P Abbeel, J Schulman
arXiv preprint arXiv:1706.01502, 2017
Cited by: 126, Year: 2017
Dota 2 with large scale deep reinforcement learning
OpenAI, C Berner, G Brockman, B Chan, V Cheung, P Dębiak, C Dennison, ...
arXiv preprint arXiv:1912.06680, 2019
Cited by: 102, Year: 2019
Tensor Programs V: Tuning large neural networks via zero-shot hyperparameter transfer
G Yang, EJ Hu, I Babuschkin, S Sidor, X Liu, D Farhi, N Ryder, J Pachocki, ...
arXiv preprint arXiv:2203.03466, 2022
Cited by: 66, Year: 2022
Evolution strategies as a scalable alternative to reinforcement learning
T Salimans, J Ho, X Chen, S Sidor, I Sutskever
arXiv preprint arXiv:1703.03864, 2017
Cited by: 60, Year: 2017
Tuning large neural networks via zero-shot hyperparameter transfer
G Yang, E Hu, I Babuschkin, S Sidor, X Liu, D Farhi, N Ryder, J Pachocki, ...
Advances in Neural Information Processing Systems 34, 17084-17097, 2021
Cited by: 59, Year: 2021
OpenAI baselines (2017)
P Dhariwal, C Hesse, O Klimov, A Nichol, M Plappert, A Radford, ...
URL https://github.com/openai/baselines, 2016
Cited by: 57, Year: 2016
UCB and InfoGain exploration via Q-ensembles
RY Chen, J Schulman, P Abbeel, S Sidor
arXiv preprint arXiv:1706.01502, 2017
Cited by: 29, Year: 2017
OpenAI baselines
C Hesse, M Plappert, A Radford, J Schulman, S Sidor, Y Wu
Cited by: 19, Year: 2017
Reinforcement learning with natural language signals
S Sidor
Massachusetts Institute of Technology, 2016
Cited by: 6, Year: 2016
Time resource networks
S Sidor, P Yu, C Fang, B Williams
arXiv preprint arXiv:1602.03203, 2016
Cited by: 2, Year: 2016
Occam's gates
J Raiman, S Sidor
arXiv preprint arXiv:1506.08251, 2015
Cited by: 1, Year: 2015