{"id":20000523,"url":"https://github.com/nikhilbarhate99/min-decision-transformer","last_synced_at":"2025-04-13T09:37:18.642Z","repository":{"id":40683117,"uuid":"458511277","full_name":"nikhilbarhate99/min-decision-transformer","owner":"nikhilbarhate99","description":"Minimal implementation of Decision Transformer: Reinforcement Learning via Sequence Modeling  in PyTorch for mujoco control tasks in OpenAI gym","archived":false,"fork":false,"pushed_at":"2022-06-10T06:23:50.000Z","size":22079,"stargazers_count":263,"open_issues_count":3,"forks_count":26,"subscribers_count":2,"default_branch":"master","last_synced_at":"2025-03-17T12:09:41.252Z","etag":null,"topics":["deep-learning","deep-reinforcement-learning","machine-learning","mujoco","offline-reinforcement-learning","openai-gym","pytorch","pytorch-transformers","reinforcement-learning","robotics","transformer"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/nikhilbarhate99.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2022-02-12T12:09:34.000Z","updated_at":"2025-03-15T23:16:47.000Z","dependencies_parsed_at":"2022-08-25T05:21:02.755Z","dependency_job_id":null,"html_url":"https://github.com/nikhilbarhate99/min-decision-transformer","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/nikhilbarhate99%2Fmin-decision-transformer","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/nikhilbarhate99%2Fmin-decision-transformer/tags","releases_url":"https://repos.ecosyste.ms/api/v
1/hosts/GitHub/repositories/nikhilbarhate99%2Fmin-decision-transformer/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/nikhilbarhate99%2Fmin-decision-transformer/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/nikhilbarhate99","download_url":"https://codeload.github.com/nikhilbarhate99/min-decision-transformer/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":245761295,"owners_count":20667895,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["deep-learning","deep-reinforcement-learning","machine-learning","mujoco","offline-reinforcement-learning","openai-gym","pytorch","pytorch-transformers","reinforcement-learning","robotics","transformer"],"created_at":"2024-11-13T05:14:58.042Z","updated_at":"2025-03-27T01:09:54.376Z","avatar_url":"https://github.com/nikhilbarhate99.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Decision Transformer\n\n\n## Overview\n\nMinimal code for [Decision Transformer: Reinforcement Learning via Sequence Modeling](https://arxiv.org/abs/2106.01345) for MuJoCo control tasks in OpenAI Gym.\nNotable differences from the official implementation are:\n\n- Simple GPT implementation (causal transformer)\n- Uses PyTorch's `Dataset` and `DataLoader` classes and removes redundant computations of rewards-to-go and state normalization for more efficient training\n- Can be trained, with results visualized and rendered, on Google Colab using the provided notebook\n\n#### [Open 
`min_decision_transformer.ipynb` in Google Colab](https://colab.research.google.com/github/nikhilbarhate99/min-decision-transformer/blob/master/min_decision_transformer.ipynb) [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/nikhilbarhate99/min-decision-transformer/blob/master/min_decision_transformer.ipynb)\n\n\n\n## Results\n\n**Note:** these results are the mean and variance over 3 random seeds, obtained after 20k updates (due to time limits on GPU resources on Colab), while the official results were obtained after 100k updates. So these numbers are not directly comparable, but they can be used as rough reference points, along with their corresponding plots, to measure the learning progress of the model. The variance in returns and scores should decrease as training reaches saturation.\n\n\n| Dataset | Environment | DT (this repo) 20k updates | DT (official) 100k updates|\n| :---: | :---: | :---: | :---: |\n| Medium | HalfCheetah | 42.18 ± 00.59 | 42.60 ± 00.10 |\n| Medium | Hopper | 69.43 ± 27.34 | 67.60 ± 01.00 |\n| Medium | Walker | 75.47 ± 31.08 | 74.00 ± 01.40 |\n\n\n| ![](https://github.com/nikhilbarhate99/min-decision-transformer/blob/master/media/halfcheetah-medium-v2.png)  | ![](https://github.com/nikhilbarhate99/min-decision-transformer/blob/master/media/halfcheetah-medium-v2.gif)  |\n| :---:|:---: |\n\n\n| ![](https://github.com/nikhilbarhate99/min-decision-transformer/blob/master/media/hopper-medium-v2.png)  | ![](https://github.com/nikhilbarhate99/min-decision-transformer/blob/master/media/hopper-medium-v2.gif)  |\n| :---:|:---: |\n\n\n| ![](https://github.com/nikhilbarhate99/min-decision-transformer/blob/master/media/walker2d-medium-v2.png)  | ![](https://github.com/nikhilbarhate99/min-decision-transformer/blob/master/media/walker2d-medium-v2.gif)  |\n| :---:|:---: |\n\n\n\n## Instructions\n\n### Mujoco-py\n\nInstall the `mujoco-py` library by following the instructions in the [mujoco-py 
repo](https://github.com/openai/mujoco-py)\n\n\n### D4RL Data\n\nDatasets are expected to be stored in the `data` directory. Install the [D4RL repo](https://github.com/rail-berkeley/d4rl). Then save formatted data in the `data` directory by running the following script:\n```\npython3 data/download_d4rl_datasets.py\n```\n\n\n### Running experiments\n\n- Example command for training:\n```\npython3 scripts/train.py --env halfcheetah --dataset medium --device cuda\n```\n\n\n- Example command for testing with a pretrained model:\n```\npython3 scripts/test.py --env halfcheetah --dataset medium --device cpu --num_eval_ep 1 --chk_pt_name dt_halfcheetah-medium-v2_model_22-02-13-09-03-10_best.pt\n```\nThe `dataset` needs to be specified for testing, to load the same state normalization statistics (mean and variance) that were used for training.\nAn additional `--render` flag can be passed to the script to render the test episode.\n\n\n- Example command for plotting graphs using logged data from the csv files:\n```\npython3 scripts/plot.py --env_d4rl_name halfcheetah-medium-v2 --smoothing_window 5\n```\nAdditionally, the `--plot_avg` and `--save_fig` flags can be passed to the script to average all values in one plot and to save the figure.\n\n\n### Note:\n1. If you find it difficult to install `mujoco-py` and `d4rl`, you can refer to their installation steps in the Colab notebook.\n2. Once the dataset is formatted and saved with `download_d4rl_datasets.py`, the `d4rl` library is no longer required for training.\n3. 
The evaluation is done on `v3` control environments in `mujoco-py` so that the results are consistent with the decision transformer paper.\n\n\n## Citing\n\nPlease use this bibtex if you want to cite this repository in your publications:\n\n    @misc{minimal_decision_transformer,\n        author = {Barhate, Nikhil},\n        title = {Minimal Implementation of Decision Transformer},\n        year = {2022},\n        publisher = {GitHub},\n        journal = {GitHub repository},\n        howpublished = {\\url{https://github.com/nikhilbarhate99/min-decision-transformer}},\n    }\n\n\n\n## References\n\n- Official [code](https://github.com/kzl/decision-transformer) and [paper](https://arxiv.org/abs/2106.01345)\n- Minimal GPT (causal transformer) [tweet](https://twitter.com/MishaLaskin/status/1481767788775628801?cxt=HHwWgoCzmYD9pZApAAAA) and [colab notebook](https://colab.research.google.com/drive/1NUBqyboDcGte5qAJKOl8gaJC28V_73Iv?usp=sharing)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fnikhilbarhate99%2Fmin-decision-transformer","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fnikhilbarhate99%2Fmin-decision-transformer","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fnikhilbarhate99%2Fmin-decision-transformer/lists"}