{"id":16403895,"url":"https://github.com/marcometer/nerorl","last_synced_at":"2025-05-09T01:47:48.318Z","repository":{"id":42992970,"uuid":"266057803","full_name":"MarcoMeter/neroRL","owner":"MarcoMeter","description":"Deep Reinforcement Learning Framework done with PyTorch","archived":false,"fork":false,"pushed_at":"2025-03-12T14:05:59.000Z","size":88928,"stargazers_count":36,"open_issues_count":1,"forks_count":8,"subscribers_count":5,"default_branch":"master","last_synced_at":"2025-05-09T01:47:41.262Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/MarcoMeter.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2020-05-22T08:23:17.000Z","updated_at":"2025-05-05T06:19:35.000Z","dependencies_parsed_at":"2024-05-28T08:29:36.587Z","dependency_job_id":"581b23e8-9f8e-4205-ace5-e969a34db85c","html_url":"https://github.com/MarcoMeter/neroRL","commit_stats":{"total_commits":110,"total_committers":1,"mean_commits":110.0,"dds":0.0,"last_synced_commit":"f16c100b9029818e4322f7c1c88dbcc5bbade34b"},"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/MarcoMeter%2FneroRL","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/MarcoMeter%2FneroRL/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/MarcoMeter%2FneroRL/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/MarcoMeter%2FneroRL/man
ifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/MarcoMeter","download_url":"https://codeload.github.com/MarcoMeter/neroRL/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":253176440,"owners_count":21866142,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-10-11T05:50:41.692Z","updated_at":"2025-05-09T01:47:48.296Z","avatar_url":"https://github.com/MarcoMeter.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# neroRL\n\nneroRL is a PyTorch-based research framework for Deep Reinforcement Learning specializing in Transformer and Recurrent Agents based on Proximal Policy Optimization.\nIts focus is set on environments that are procedurally generated, while providing some useful tools for experimenting with and analyzing a trained behavior.\nThis is a research framework.\n\n# Features\n- Environments:\n  - [Memory Gym](https://github.com/MarcoMeter/drl-memory-gym)\n  - [Obstacle Tower](https://github.com/Unity-Technologies/obstacle-tower-env)\n  - [Unity ML-Agents](https://github.com/Unity-Technologies/ml-agents)\n  - [Procgen](https://github.com/openai/procgen)\n  - [Minigrid](https://github.com/Farama-Foundation/Minigrid) (Vector (one-hot) or Visual Observations (84x84x3))\n  - [Gym CartPole](https://github.com/openai/gym) using masked velocity\n  - [DM Ballet](https://github.com/deepmind/deepmind-research/tree/master/hierarchical_transformer_memory/pycolab_ballet)\n  - [RandomMaze](https://github.com/zuoxingdong/mazelab)\n- Proximal 
Policy Optimization (PPO)\n  - Discrete and Multi-Discrete Action Spaces\n  - Vector and Visual Observation Spaces (either alone or simultaneously)\n  - [Recurrent Policies using Truncated Backpropagation Through Time](https://github.com/MarcoMeter/recurrent-ppo-truncated-bptt)\n  - [Episodic Memory based on Transformer-XL](https://github.com/MarcoMeter/episodic-transformer-memory-ppo)\n\n# Memory Gym ICLR 2023 Paper\n\nThis repository was used to produce the results on the Memory Gym environments reported in the following paper:\n\n```bibtex\n@inproceedings{pleines2023memory,\ntitle={Memory Gym: Partially Observable Challenges to Memory-Based Agents},\nauthor={Marco Pleines and Matthias Pallasch and Frank Zimmer and Mike Preuss},\nbooktitle={International Conference on Learning Representations},\nyear={2023},\nurl={https://openreview.net/forum?id=jHc8dCx6DDr}\n}\n```\n\n# Obstacle Tower Challenge\nOriginally, this work started out by achieving 7th place in the [Obstacle Tower Challenge](https://blogs.unity3d.com/2019/08/07/announcing-the-obstacle-tower-challenge-winners-and-open-source-release/) using a relatively simple FFCNN. This [video](https://www.youtube.com/watch?v=P2rBDHBHxcM) presents some footage of the approach and the trained behavior:\n\n\u003cp align=\"center\"\u003e\u003ca href=\"http://www.youtube.com/watch?feature=player_embedded\u0026v=P2rBDHBHxcM\n\" target=\"_blank\"\u003e\u003cimg src=\"http://img.youtube.com/vi/P2rBDHBHxcM/0.jpg\" \nalt=\"Rising to the Obstacle Tower Challenge\" width=\"240\" height=\"180\" border=\"10\" /\u003e\u003c/a\u003e\u003c/p\u003e\n\nRecently, we published a [paper](https://arxiv.org/abs/2004.00567) at CoG 2020 (best paper candidate) that analyzes the taken approach. Additionally, the model was trained on 3 level designs and was evaluated on the two left-out ones. 
The results can be reproduced using the [obstacle-tower-challenge](https://github.com/MarcoMeter/neroRL/tree/obstacle-tower-challenge) branch.\n\n```bibtex\n@inproceedings{pleines2020otc,\nauthor    = {Marco Pleines and Jenia Jitsev and Mike Preuss and Frank Zimmer},\ntitle     = {Obstacle Tower Without Human Demonstrations: How Far a Deep Feed-Forward Network Goes with Reinforcement Learning},\nbooktitle = {{IEEE} Conference on Games, CoG 2020, Osaka, Japan, August 24-27, 2020},\npages     = {447--454},\npublisher = {{IEEE}},\nyear      = {2020},\nurl       = {https://doi.org/10.1109/CoG47356.2020.9231802},\ndoi       = {10.1109/CoG47356.2020.9231802},\n}\n```\n\n# Getting Started\n\nTo get started check out the [docs](/docs/)!\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmarcometer%2Fnerorl","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fmarcometer%2Fnerorl","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmarcometer%2Fnerorl/lists"}