{"id":17369560,"url":"https://github.com/eloialonso/diamond","last_synced_at":"2025-02-26T20:30:43.837Z","repository":{"id":240615563,"uuid":"803024793","full_name":"eloialonso/diamond","owner":"eloialonso","description":"DIAMOND (DIffusion As a Model Of eNvironment Dreams) is a reinforcement learning agent trained in a diffusion world model. NeurIPS 2024 Spotlight.","archived":false,"fork":false,"pushed_at":"2024-12-06T16:45:28.000Z","size":48,"stargazers_count":1606,"open_issues_count":2,"forks_count":106,"subscribers_count":20,"default_branch":"main","last_synced_at":"2024-12-06T17:28:13.358Z","etag":null,"topics":["artificial-intelligence","atari","deep-learning","diffusion-models","machine-learning","reinforcement-learning","research","world-models"],"latest_commit_sha":null,"homepage":"https://diamond-wm.github.io","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/eloialonso.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-05-19T22:31:40.000Z","updated_at":"2024-12-06T16:17:49.000Z","dependencies_parsed_at":"2024-08-21T18:20:20.358Z","dependency_job_id":"bd0c853a-f7f0-4212-be47-6fb5e862fa5b","html_url":"https://github.com/eloialonso/diamond","commit_stats":null,"previous_names":["eloialonso/diamond"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/eloialonso%2Fdiamond","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/eloialonso%2Fdiamond/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts
/GitHub/repositories/eloialonso%2Fdiamond/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/eloialonso%2Fdiamond/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/eloialonso","download_url":"https://codeload.github.com/eloialonso/diamond/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":240930301,"owners_count":19880447,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["artificial-intelligence","atari","deep-learning","diffusion-models","machine-learning","reinforcement-learning","research","world-models"],"created_at":"2024-10-16T00:01:19.845Z","updated_at":"2025-02-26T20:30:43.831Z","avatar_url":"https://github.com/eloialonso.png","language":"Python","funding_links":[],"categories":["Repos","Resource","Python"],"sub_categories":["[2024]"],"readme":"# Diffusion for World Modeling: Visual Details Matter in Atari (NeurIPS 2024 Spotlight)\n\n[**TL;DR**] 💎 DIAMOND (DIffusion As a Model Of eNvironment Dreams) is a reinforcement learning agent trained entirely in a diffusion world model.\n\n🌍 [Project Page](https://diamond-wm.github.io) • 🤓 [Paper](https://arxiv.org/pdf/2405.12399) • 𝕏 [Atari thread](https://x.com/EloiAlonso1/status/1793916382779982120) • 𝕏 [CSGO thread](https://x.com/EloiAlonso1/status/1844803606064611771) • 💬 [Discord](https://discord.gg/74vha5RWPg)\n\n\u003cdiv align='center'\u003e\n  RL agent playing in autoregressive imagination of Atari world models\n  \u003cbr\u003e\n  \u003cimg alt=\"DIAMOND agent in WM\" 
src=\"https://github.com/user-attachments/assets/eb6b72eb-73df-4178-8a3d-cdad80ff9152\"\u003e\n\n\u003c/div\u003e\n\n\u003cdiv align='center'\u003e\n  Human player in CSGO world model (full quality video \u003ca href=\"https://diamond-wm.github.io/static/videos/grid.mp4\"\u003ehere\u003c/a\u003e)\n  \u003cbr\u003e\n  \u003cimg alt=\"DIAMOND agent in WM\" src=\"https://github.com/user-attachments/assets/dcbdd523-ca22-46a9-bb7d-bcc52080fe00\"\u003e\n\u003c/div\u003e\n\nQuick install to try our [pretrained world models](#try) using [miniconda](https://docs.anaconda.com/free/miniconda/miniconda-install/):\n\n\u003e```bash\n\u003egit clone https://github.com/eloialonso/diamond.git\n\u003ecd diamond\n\u003econda create -n diamond python=3.10\n\u003econda activate diamond\n\u003epip install -r requirements.txt\n\u003e```\n\nFor Atari (world model + RL agent)\n\n\u003e```bash\n\u003epython src/play.py --pretrained\n\u003e```\n\nFor CSGO (world model only)\n\n\u003e```bash\n\u003egit checkout csgo\n\u003epython src/play.py\n\u003e```\n\nAnd press `m` to take control (the policy is playing by default)!\n\n**Warning**: Atari ROMs will be downloaded with the dependencies, which means that you acknowledge that you have the license to use them.\n\n## CSGO\n\n\n**Edit**: Check out the [csgo branch](https://github.com/eloialonso/diamond/tree/csgo) to try DIAMOND's world model trained on *Counter-Strike: Global Offensive*!\n\n```bash\ngit checkout csgo\npython src/play.py\n```\n\u003e Note: on Apple Silicon, you must enable CPU fallback for the MPS backend with\n\u003e PYTORCH_ENABLE_MPS_FALLBACK=1 python src/play.py\n\n\n\u003ca name=\"quick_links\"\u003e\u003c/a\u003e\n## Quick Links\n\n- [Try our playable diffusion world models](#try)\n- [Launch a training run](#launch)\n- [Configuration](#configuration)\n- [Visualization](#visualization)\n  - [Play mode (default)](#play_mode)\n  - [Dataset mode (add `-d`)](#dataset_mode)\n  - [Other options, common to play/dataset 
modes](#other_options)\n- [Run folder structure](#structure)\n- [Results](#results)\n- [Citation](#citation)\n- [Credits](#credits)\n\n\u003ca name=\"try\"\u003e\u003c/a\u003e\n## [⬆️](#quick_links) Try our playable diffusion world models\n\n```bash\npython src/play.py --pretrained\n```\n\nThen select a game, and a world model and policy pretrained on Atari 100k will be downloaded from our [repository on Hugging Face Hub 🤗](https://huggingface.co/eloialonso/diamond) and cached on your machine.\n\nSome things you might want to try:\n- Press `m` to switch the controller between the agent and a human (the policy is playing by default).\n- Press `↑/↓` to change the imagination horizon (default is 50 for playing).\n\nTo adjust the sampling parameters (number of denoising steps, stochasticity, order, etc.) of the trained diffusion world model, for instance to trade off sampling speed and quality, edit the section `world_model_env.diffusion_sampler` in the file `config/trainer.yaml`.\n\nSee [Visualization](#visualization) for more details about the available commands and options.\n\n\u003ca name=\"launch\"\u003e\u003c/a\u003e\n## [⬆️](#quick_links) Launch a training run\n\nTo train with the hyperparameters used in the paper on cuda:0, launch:\n```bash\npython src/main.py env.train.id=BreakoutNoFrameskip-v4 common.devices=0\n```\n\nThis creates a new folder for your run, located in `outputs/YYYY-MM-DD/hh-mm-ss/`.\n\nTo resume a run that crashed, navigate to the run folder and launch:\n\n```bash\n./scripts/resume.sh\n```\n\n\u003ca name=\"configuration\"\u003e\u003c/a\u003e\n## [⬆️](#quick_links) Configuration\n\nWe use [Hydra](https://github.com/facebookresearch/hydra) for configuration management.\n\nAll configuration files are located in the `config` folder:\n\n- `config/trainer.yaml`: main configuration file.\n- `config/agent/default.yaml`: architecture hyperparameters.\n- `config/env/atari.yaml`: environment hyperparameters.\n\nYou can turn on logging to [weights \u0026 
biases](https://wandb.ai) in the `wandb` section of `config/trainer.yaml`.\n\nSet `training.model_free=true` in the file `config/trainer.yaml` to \"unplug\" the world model and perform standard model-free reinforcement learning.\n\n\u003ca name=\"visualization\"\u003e\u003c/a\u003e\n## [⬆️](#quick_links) Visualization\n\n\u003ca name=\"play_mode\"\u003e\u003c/a\u003e\n### [⬆️](#quick_links) Play mode (default)\n\nTo visualize your last checkpoint, launch **from the run folder**:\n\n```bash\npython src/play.py\n```\n\nBy default, you visualize the policy playing in the world model. To play yourself, or switch to the real environment, use the controls described below.\n\n```txt\nControls (play mode)\n\n(Game-specific commands will be printed on start up)\n\n⏎   : reset environment\n\nm   : switch controller (policy/human)\n↑/↓ : imagination horizon (+1/-1)\n←/→ : next environment [world model ←→ real env (test) ←→ real env (train)]\n\n.   : pause/unpause\ne   : step-by-step (when paused)\n```\n\nAdd `-r` to toggle \"recording mode\" (works only in play mode). Every completed episode will be saved in `dataset/rec_\u003cenv_name\u003e_\u003ccontroller\u003e`. 
For instance:\n\n- `dataset/rec_wm_π`: Policy playing in world model.\n- `dataset/rec_wm_H`: Human playing in world model.\n- `dataset/rec_test_H`: Human playing in test real environment.\n\nYou can then use the \"dataset mode\" described in the next section to replay the stored episodes.\n\n\u003ca name=\"dataset_mode\"\u003e\u003c/a\u003e\n### [⬆️](#quick_links) Dataset mode (add `-d`)\n\n**In the run folder**, to visualize the datasets contained in the `dataset` subfolder, add `-d` to switch to \"dataset mode\":\n\n```bash\npython src/play.py -d\n```\n\nYou can use the controls described below to navigate the datasets and episodes.\n\n```txt\nControls (dataset mode)\n\nm   : next dataset (if multiple datasets, like recordings, etc)\n↑/↓ : next/previous episode\n←/→ : next/previous timestep in episodes\nPgUp: +10 timesteps\nPgDn: -10 timesteps\n⏎   : back to first timestep\n```\n\n\u003ca name=\"other_options\"\u003e\u003c/a\u003e\n### [⬆️](#quick_links) Other options, common to play/dataset modes\n\n```txt\n--fps FPS             Target frame rate (default 15).\n--size SIZE           Window size (default 800).\n--no-header           Remove header.\n```\n\n\u003ca name=\"structure\"\u003e\u003c/a\u003e\n## [⬆️](#quick_links) Run folder structure\n\nEach new run is located at `outputs/YYYY-MM-DD/hh-mm-ss/`. 
This folder is structured as follows:\n\n```txt\noutputs/YYYY-MM-DD/hh-mm-ss/\n│\n└─── checkpoints\n│   │   state.pt  # full training state\n│   │\n│   └─── agent_versions\n│       │   ...\n│       │   agent_epoch_00999.pt\n│       │   agent_epoch_01000.pt  # agent weights only\n│\n└─── config\n│   |   trainer.yaml\n|\n└─── dataset\n│   │\n│   └─── train\n│   |   │   info.pt\n│   |   │   ...\n|   |\n│   └─── test\n│       │   info.pt\n│       │   ...\n│\n└─── scripts\n│   │   resume.sh\n|   |   ...\n|\n└─── src\n|   |   main.py\n|   |   ...\n|\n└─── wandb\n    |   ...\n```\n\n\u003ca name=\"results\"\u003e\u003c/a\u003e\n## [⬆️](#quick_links) Results\n\nThe file [results/data/DIAMOND.json](results/data/DIAMOND.json) contains the results for each game and seed used in the paper.\n\nThe DDPM code used for Section 5.1 of the paper can be found on the [ddpm](https://github.com/eloialonso/diamond/tree/ddpm) branch.\n\n\u003ca name=\"citation\"\u003e\u003c/a\u003e\n## [⬆️](#quick_links) Citation\n\n```text\n@inproceedings{alonso2024diffusionworldmodelingvisual,\n      title={Diffusion for World Modeling: Visual Details Matter in Atari},\n      author={Eloi Alonso and Adam Jelley and Vincent Micheli and Anssi Kanervisto and Amos Storkey and Tim Pearce and François Fleuret},\n      booktitle={Thirty-eighth Conference on Neural Information Processing Systems},\n      year={2024},\n      url={https://arxiv.org/abs/2405.12399},\n}\n```\n\n\u003ca name=\"credits\"\u003e\u003c/a\u003e\n## [⬆️](#quick_links) Credits\n\n- [https://github.com/crowsonkb/k-diffusion/](https://github.com/crowsonkb/k-diffusion/)\n- [https://github.com/huggingface/huggingface_hub](https://github.com/huggingface/huggingface_hub)\n- [https://github.com/google-research/rliable](https://github.com/google-research/rliable)\n- 
[https://github.com/pytorch/pytorch](https://github.com/pytorch/pytorch)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Feloialonso%2Fdiamond","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Feloialonso%2Fdiamond","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Feloialonso%2Fdiamond/lists"}