https://github.com/manfreddiaz/rl-benchmarks
A Historic Recollection of Reinforcement Learning Benchmarks
- Host: GitHub
- URL: https://github.com/manfreddiaz/rl-benchmarks
- Owner: manfreddiaz
- License: MIT
- Created: 2021-10-02T17:52:20.000Z (over 3 years ago)
- Default Branch: main
- Last Pushed: 2021-10-03T15:47:35.000Z (over 3 years ago)
- Last Synced: 2024-12-27T01:09:32.792Z (5 months ago)
- Language: TeX
- Size: 59.6 KB
- Stars: 2
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
# RL Benchmarks: A Historic Recollection of Reinforcement Learning Benchmarks
## Pre-Deep Learning Era (2009-2014)
1. Tanner, Brian, and Adam White. 2009. ___“RL-Glue: Language-Independent Software for Reinforcement-Learning Experiments.”___ Journal of Machine Learning Research: JMLR 10 (74): 2133–36. [[paper]](http://jmlr.org/papers/v10/tanner09a.html)
1. Laird, John E., and Robert E. Wray III. 2010. ___“Cognitive Architecture Requirements for Achieving AGI.”___ In Proceedings of the Third Conference on Artificial General Intelligence (AGI-10). Paris, France: Atlantis Press. [[paper]](https://doi.org/10.2991/agi.2010.2)
1. Whiteson, Shimon, Brian Tanner, and Adam White. 2010. ___“The Reinforcement Learning Competitions.”___ AI Magazine. [[paper]](http://citeseerx.ist.psu.edu/viewdoc/citations;jsessionid=B401441AFD41A5C593498EFD3ACB31EF?doi=10.1.1.634.5825)
1. Whiteson, Shimon, Brian Tanner, Matthew E. Taylor, and Peter Stone. 2011. ___“Protecting against Evaluation Overfitting in Empirical Reinforcement Learning.”___ In 2011 IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL). IEEE. [[paper]](https://doi.org/10.1109/adprl.2011.5967363)
1. Schaul, Tom, Julian Togelius, and Jürgen Schmidhuber. 2011. ___“Measuring Intelligence through Games.”___ arXiv. [[paper]](http://arxiv.org/abs/1109.1314)
1. Bellemare, Marc G., Yavar Naddaf, Joel Veness, and Michael Bowling. 2012. ___“The Arcade Learning Environment: An Evaluation Platform for General Agents.”___ arXiv [cs.AI]. arXiv. [[paper]](http://arxiv.org/abs/1207.4708)
1. Adams, Sam, Itmar Arel, Joscha Bach, Robert Coop, Rod Furlan, Ben Goertzel, J. Storrs Hall, et al. 2012. ___“Mapping the Landscape of Human-Level Artificial General Intelligence.”___ AI Magazine 33 (1): 25–42. [[paper]](https://doi.org/10.1609/aimag.v33i1.2322)
1. Riedmiller, Martin, Manuel Blum, Thomas Lampe, Roland Hafner, Sascha Lange, and Stephan Timmer. 2013. ___CLSquare: Closed Loop Simulation System.___ [[paper]](https://ml.informatik.uni-freiburg.de/former/research/clsquare.html)
1. Schaul, Tom. 2013. ___“A Video Game Description Language for Model-Based or Interactive Learning.”___ In 2013 IEEE Conference on Computational Intelligence in Games (CIG). IEEE. [[paper]](https://doi.org/10.1109/cig.2013.6633610)
1. Coleman, Oliver J., Alan D. Blair, and Jeff Clune. 2014. ___“Automated Generation of Environments to Test the General Learning Capabilities of AI Agents.”___ In Proceedings of the 2014 Annual Conference on Genetic and Evolutionary Computation, 161–68. GECCO ’14. New York, NY, USA: Association for Computing Machinery. [[paper]](https://doi.org/10.1145/2576768.2598257)
## The Deep Reinforcement Learning Era (2015-*)
1. Agarwal, Rishabh, Dale Schuurmans, and Mohammad Norouzi. 2019. ___“An Optimistic Perspective on Offline Reinforcement Learning.”___ arXiv [cs.LG]. arXiv. [[paper]](http://arxiv.org/abs/1907.04543)
1. Ahn, Michael, Henry Zhu, Kristian Hartikainen, Hugo Ponte, Abhishek Gupta, Sergey Levine, and Vikash Kumar. 2019. ___“ROBEL: Robotics Benchmarks for Learning with Low-Cost Robots.”___ arXiv [cs.RO]. arXiv. [[paper]](http://arxiv.org/abs/1909.11639).
1. Beattie, Charles, Joel Z. Leibo, Denis Teplyashin, Tom Ward, Marcus Wainwright, Heinrich Küttler, Andrew Lefrancq, et al. 2016. ___“DeepMind Lab.”___ arXiv [cs.AI]. arXiv. [[paper]](http://arxiv.org/abs/1612.03801).
1. Brockman, Greg, Vicki Cheung, Ludwig Pettersson, Jonas Schneider, John Schulman, Jie Tang, and Wojciech Zaremba. 2016. ___“OpenAI Gym.”___ arXiv [cs.LG]. arXiv. [[paper]](http://arxiv.org/abs/1606.01540).
1. Chollet, François. 2019. ___“On the Measure of Intelligence.”___ arXiv [cs.AI]. arXiv. [[paper]](http://arxiv.org/abs/1911.01547).
1. Cobbe, Karl, Christopher Hesse, Jacob Hilton, and John Schulman. 2019. ___“Leveraging Procedural Generation to Benchmark Reinforcement Learning.”___ arXiv [cs.LG]. arXiv. [[paper]](http://arxiv.org/abs/1912.01588).
1. Cobbe, Karl, Oleg Klimov, Chris Hesse, Taehoon Kim, and John Schulman. 2018. ___“Quantifying Generalization in Reinforcement Learning.”___ arXiv [cs.LG]. arXiv. [[paper]](http://arxiv.org/abs/1812.02341).
1. Collins, Jack, Jessie McVicar, David Wedlock, Ross Brown, David Howard, and Jürgen Leitner. 2019. ___“Benchmarking Simulated Robotic Manipulation through a Real World Dataset.”___ arXiv [cs.RO]. arXiv. [[paper]](http://arxiv.org/abs/1911.01557).
1. Côté, Marc-Alexandre, Ákos Kádár, Xingdi Yuan, Ben Kybartas, Tavian Barnes, Emery Fine, James Moore, et al. 2019. ___“TextWorld: A Learning Environment for Text-Based Games.”___ In Computer Games, 41–75. Springer International Publishing. [[paper]](https://doi.org/10.1007/978-3-030-24337-1_3).
1. Crosby, Matthew, Benjamin Beyret, Murray Shanahan, José Hernández-Orallo, Lucy Cheke, and Marta Halina. 2020. ___“The Animal-AI Testbed and Competition.”___ In Proceedings of the NeurIPS 2019 Competition and Demonstration Track, edited by Hugo Jair Escalante and Raia Hadsell, 123:164–76. Proceedings of Machine Learning Research. PMLR. [[paper]](https://proceedings.mlr.press/v123/crosby20a.html).
1. Freeman, C. Daniel, Erik Frey, Anton Raichuk, Sertan Girgin, Igor Mordatch, and Olivier Bachem. 2021. ___“Brax -- A Differentiable Physics Engine for Large Scale Rigid Body Simulation.”___ arXiv [cs.RO]. arXiv. [[paper]](http://arxiv.org/abs/2106.13281).
1. Duan, Yan, Xi Chen, Rein Houthooft, John Schulman, and Pieter Abbeel. 2016. ___“Benchmarking Deep Reinforcement Learning for Continuous Control.”___ arXiv [cs.LG]. arXiv. [[paper]](http://arxiv.org/abs/1604.06778).
1. Dulac-Arnold, Gabriel, Nir Levine, Daniel J. Mankowitz, Jerry Li, Cosmin Paduraru, Sven Gowal, and Todd Hester. 2020. ___“An Empirical Investigation of the Challenges of Real-World Reinforcement Learning.”___ arXiv [cs.LG]. arXiv. [[paper]](http://arxiv.org/abs/2003.11881).
1. Fan, Linxi, and Yuke Zhu. 2018. ___“SURREAL: Open-Source Reinforcement Learning Framework and Robot Manipulation Benchmark.”___ In Proceedings of the Conference on Robot Learning (CoRL 2018). [[paper]](https://surreal.stanford.edu/img/surreal-corl2018.pdf).
1. Fortunato, Meire, Melissa Tan, Ryan Faulkner, Steven Hansen, Adrià Puigdomènech Badia, Gavin Buttimore, Charles Deck, Joel Z. Leibo, and Charles Blundell. 2019. ___“Generalization of Reinforcement Learners with Working and Episodic Memory.”___ In Advances in Neural Information Processing Systems, 12448–57.
1. Fu, Justin, Aviral Kumar, Ofir Nachum, George Tucker, and Sergey Levine. 2020. ___“D4RL: Datasets for Deep Data-Driven Reinforcement Learning.”___ arXiv [cs.LG]. arXiv. [[paper]](http://arxiv.org/abs/2004.07219).
1. Fujimoto, Scott, Edoardo Conti, Mohammad Ghavamzadeh, and Joelle Pineau. 2019. ___“Benchmarking Batch Deep Reinforcement Learning Algorithms.”___ arXiv [cs.LG]. arXiv. [[paper]](http://arxiv.org/abs/1910.01708).
1. Gulcehre, Caglar, Ziyu Wang, Alexander Novikov, Tom Le Paine, Sergio Gomez Colmenarejo, Konrad Zolna, Rishabh Agarwal, et al. 2020. ___“RL Unplugged: A Suite of Benchmarks for Offline Reinforcement Learning.”___ arXiv [cs.LG]. arXiv. [[paper]](https://proceedings.neurips.cc/paper/2020/file/51200d29d1fc15f5a71c1dab4bb54f7c-Paper.pdf).
1. Guss, William H., Cayden Codel, Katja Hofmann, Brandon Houghton, Noboru Kuno, Stephanie Milani, Sharada Mohanty, et al. 2019. ___“The MineRL 2019 Competition on Sample Efficient Reinforcement Learning Using Human Priors.”___ arXiv [cs.LG]. arXiv. [[paper]](http://arxiv.org/abs/1904.10079).
1. Guss, William H., Brandon Houghton, Nicholay Topin, Phillip Wang, Cayden Codel, Manuela Veloso, and Ruslan Salakhutdinov. 2019. ___“MineRL: A Large-Scale Dataset of Minecraft Demonstrations.”___ arXiv [cs.LG]. arXiv. [[paper]](http://arxiv.org/abs/1907.13440).
1. James, Stephen, Zicong Ma, David Rovick Arrojo, and Andrew J. Davison. 2019. ___“RLBench: The Robot Learning Benchmark & Learning Environment.”___ arXiv [cs.RO]. arXiv. [[paper]](http://arxiv.org/abs/1909.12271).
1. Johnson, Matthew, Katja Hofmann, Tim Hutton, and David Bignell. 2016. ___“The Malmo Platform for Artificial Intelligence Experimentation.”___ In Proceedings of the Twenty-Fifth International Joint Conference on Artificial Intelligence, 4246–47. IJCAI’16. AAAI Press. [[paper]](https://dl.acm.org/doi/10.5555/3061053.3061259).
1. Juliani, Arthur, Vincent-Pierre Berges, Ervin Teng, Andrew Cohen, Jonathan Harper, Chris Elion, Chris Goy, et al. 2018. ___“Unity: A General Platform for Intelligent Agents.”___ arXiv [cs.LG]. arXiv. [[paper]](http://arxiv.org/abs/1809.02627).
1. Juliani, Arthur, Ahmed Khalifa, Vincent-Pierre Berges, Jonathan Harper, Ervin Teng, Hunter Henry, Adam Crespi, Julian Togelius, and Danny Lange. 2019. ___“Obstacle Tower: A Generalization Challenge in Vision, Control, and Planning.”___ arXiv [cs.AI]. arXiv. [[paper]](http://arxiv.org/abs/1902.01378).
1. Kannan, Harini, Danijar Hafner, Chelsea Finn, and Dumitru Erhan. 2021. ___“RoboDesk Environment v0.”___ [[paper]](https://github.com/google-research/robodesk).
1. Kempka, Michał, Marek Wydmuch, Grzegorz Runc, Jakub Toczek, and Wojciech Jaśkowski. 2016. ___“ViZDoom: A Doom-Based AI Research Platform for Visual Reinforcement Learning.”___ arXiv [cs.LG]. arXiv. [[paper]](http://arxiv.org/abs/1605.02097)
1. Kurach, Karol, Anton Raichuk, Piotr Stańczyk, Michał Zając, Olivier Bachem, Lasse Espeholt, Carlos Riquelme, et al. 2019. ___“Google Research Football: A Novel Reinforcement Learning Environment.”___ arXiv [cs.LG]. arXiv. [[paper]](http://arxiv.org/abs/1907.11180).
1. Küttler, Heinrich, Nantas Nardelli, Alexander H. Miller, Roberta Raileanu, Marco Selvatici, Edward Grefenstette, and Tim Rocktäschel. 2020. ___“The NetHack Learning Environment.”___ arXiv [cs.LG]. arXiv. [[paper]](http://arxiv.org/abs/2006.13760).
1. Le Paine, Tom, Caglar Gulcehre, Bobak Shahriari, Misha Denil, Matt Hoffman, Hubert Soyer, Richard Tanburn, et al. 2019. ___“Making Efficient Use of Demonstrations to Solve Hard Exploration Problems.”___ arXiv [cs.LG]. arXiv. [[paper]](http://arxiv.org/abs/1909.01387).
1. Lee, Youngwoon, Edward S. Hu, Zhengyu Yang, Alex Yin, and Joseph J. Lim. 2019. ___“IKEA Furniture Assembly Environment for Long-Horizon Complex Manipulation Tasks.”___ arXiv [cs.RO]. arXiv. [[paper]](http://arxiv.org/abs/1911.07246).
1. Leibo, Joel Z., Cyprien de Masson d’Autume, Daniel Zoran, David Amos, Charles Beattie, Keith Anderson, Antonio García Castañeda, et al. 2018. ___“Psychlab: A Psychology Laboratory for Deep Reinforcement Learning Agents.”___ arXiv [cs.AI]. arXiv. [[paper]](http://arxiv.org/abs/1801.08116).
1. Machado, Marlos C., Marc G. Bellemare, Erik Talvitie, Joel Veness, Matthew Hausknecht, and Michael Bowling. 2018. ___“Revisiting the Arcade Learning Environment: Evaluation Protocols and Open Problems for General Agents.”___ The Journal of Artificial Intelligence Research 61 (1): 523–62. [[paper]](https://dl.acm.org/doi/abs/10.5555/3241691.3241702).
1. Mbuwir, Brida V., Carlo Manna, Fred Spiessens, and Geert Deconinck. 2020. ___“Benchmarking Reinforcement Learning Algorithms for Demand Response Applications.”___ In 2020 IEEE PES Innovative Smart Grid Technologies Europe (ISGT-Europe), 289–93. [[paper]](https://doi.org/10.1109/ISGT-Europe47291.2020.9248800).
1. Nichol, Alex, Vicki Pfau, Christopher Hesse, Oleg Klimov, and John Schulman. 2018. ___“Gotta Learn Fast: A New Benchmark for Generalization in RL.”___ arXiv [cs.LG]. arXiv. [[paper]](http://arxiv.org/abs/1804.03720).
1. Osband, Ian, Yotam Doron, Matteo Hessel, John Aslanides, Eren Sezener, Andre Saraiva, Katrina McKinney, et al. 2019. ___“Behaviour Suite for Reinforcement Learning.”___ arXiv [cs.LG]. arXiv. [[paper]](http://arxiv.org/abs/1908.03568).
1. Perez-Liebana, Diego, Spyridon Samothrakis, Julian Togelius, Tom Schaul, Simon M. Lucas, Adrien Couëtoux, Jerry Lee, Chong-U Lim, and Tommy Thompson. 2016. ___“The 2014 General Video Game Playing Competition.”___ IEEE Transactions on Computational Intelligence in AI and Games 8 (3): 229–43. [[paper]](https://doi.org/10.1109/TCIAIG.2015.2402393).
1. Platanios, Emmanouil Antonios, Abulhair Saparov, and Tom Mitchell. 2020. ___“Jelly Bean World: A Testbed for Never-Ending Learning.”___ [[paper]](https://www.semanticscholar.org/paper/665fbb2645d1213e7eb95d870acd2ed75c74d1a5).
1. Rajeswaran, Aravind, Vikash Kumar, Abhishek Gupta, Giulia Vezzani, John Schulman, Emanuel Todorov, and Sergey Levine. 2017. ___“Learning Complex Dexterous Manipulation with Deep Reinforcement Learning and Demonstrations.”___ arXiv [cs.LG]. arXiv. [[paper]](http://arxiv.org/abs/1709.10087).
1. Ray, Alex, Joshua Achiam, and Dario Amodei. 2019. ___“Benchmarking Safe Exploration in Deep Reinforcement Learning.”___ [[paper]](https://cdn.openai.com/safexp-short.pdf).
1. Samvelyan, Mikayel, Robert Kirk, Vitaly Kurin, Jack Parker-Holder, Minqi Jiang, Eric Hambro, Fabio Petroni, Heinrich Küttler, Edward Grefenstette, and Tim Rocktäschel. 2021. ___“MiniHack the Planet: A Sandbox for Open-Ended Reinforcement Learning Research.”___ [[paper]](https://openreview.net/pdf?id=skFwlyefkWJ).
1. Savva, Manolis, Abhishek Kadian, Oleksandr Maksymets, Yili Zhao, Erik Wijmans, Bhavana Jain, Julian Straub, et al. 2019. ___“Habitat: A Platform for Embodied AI Research.”___ arXiv [cs.CV]. arXiv. [[paper]](http://arxiv.org/abs/1904.01201).
1. Szot, Andrew, Alex Clegg, Eric Undersander, Erik Wijmans, Yili Zhao, John Turner, Noah Maestre, et al. 2021. ___“Habitat 2.0: Training Home Assistants to Rearrange Their Habitat.”___ arXiv [cs.LG]. arXiv. [[paper]](http://arxiv.org/abs/2106.14405).
1. Tassa, Yuval, Saran Tunyasuvunakool, Alistair Muldal, Yotam Doron, Piotr Trochim, Siqi Liu, Steven Bohez, et al. 2020. ___“Dm_control: Software and Tasks for Continuous Control.”___ arXiv [cs.RO]. arXiv. [[paper]](http://arxiv.org/abs/2006.12983).
1. Wang, Jane X., Michael King, Nicolas Porcel, Zeb Kurth-Nelson, Tina Zhu, Charlie Deck, Peter Choy, et al. 2021. ___“Alchemy: A Structured Task Distribution for Meta-Reinforcement Learning.”___ arXiv [cs.LG]. arXiv. [[paper]](http://arxiv.org/abs/2102.02926).
1. Wang, Tingwu, Xuchan Bao, Ignasi Clavera, Jerrick Hoang, Yeming Wen, Eric Langlois, Shunshi Zhang, Guodong Zhang, Pieter Abbeel, and Jimmy Ba. 2019. ___“Benchmarking Model-Based Reinforcement Learning.”___ arXiv [cs.LG]. arXiv. [[paper]](http://arxiv.org/abs/1907.02057).
1. Yu, Tianhe, Deirdre Quillen, Zhanpeng He, Ryan Julian, Avnish Narayan, Hayden Shively, Adithya Bellathur, Karol Hausman, Chelsea Finn, and Sergey Levine. 2019. ___“Meta-World: A Benchmark and Evaluation for Multi-Task and Meta Reinforcement Learning.”___ arXiv [cs.LG]. arXiv. [[paper]](http://arxiv.org/abs/1910.10897).
1. Zhang, Amy, Yuxin Wu, and Joelle Pineau. 2018. ___“Natural Environment Benchmarks for Reinforcement Learning.”___ arXiv [cs.LG]. arXiv. [[paper]](http://arxiv.org/abs/1811.06032).
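Nearly all of the platforms in this era converge on the agent-environment interface popularized by OpenAI Gym (Brockman et al., 2016): an environment exposes `reset()` and `step(action)`, and `step` returns an observation, a reward, a termination flag, and an info dict. A minimal sketch of that interface, using a hypothetical two-armed bandit (`ToyBanditEnv` and its `arm_means` are illustrative, not from any of the cited libraries):

```python
import random


class ToyBanditEnv:
    """Hypothetical two-armed bandit illustrating the Gym-style
    reset/step interface shared by most benchmarks above."""

    def __init__(self, seed=0):
        self.rng = random.Random(seed)
        self.arm_means = [0.3, 0.7]  # expected reward per arm

    def reset(self):
        # Single-state problem: the observation carries no information.
        return 0

    def step(self, action):
        # Bernoulli reward; each episode is a single decision.
        reward = 1.0 if self.rng.random() < self.arm_means[action] else 0.0
        done = True
        info = {}
        return 0, reward, done, info


# Standard interaction loop: reset, act, collect reward.
env = ToyBanditEnv(seed=42)
returns = []
for episode in range(1000):
    obs = env.reset()
    obs, reward, done, info = env.step(1)  # always pull the better arm
    returns.append(reward)
avg = sum(returns) / len(returns)  # should be near arm_means[1] = 0.7
```

Standardizing on this loop is what allowed a single agent implementation to be evaluated across ALE, dm_control, Procgen, and the rest with little glue code.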