{"id":13932915,"url":"https://github.com/google-research/planet","last_synced_at":"2025-09-28T14:30:57.308Z","repository":{"id":34160776,"uuid":"170441842","full_name":"google-research/planet","owner":"google-research","description":"Learning Latent Dynamics for Planning from Pixels","archived":true,"fork":false,"pushed_at":"2023-03-24T21:57:54.000Z","size":157,"stargazers_count":1170,"open_issues_count":4,"forks_count":203,"subscribers_count":45,"default_branch":"master","last_synced_at":"2024-09-27T02:03:11.064Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"https://danijar.com/planet","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/google-research.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null}},"created_at":"2019-02-13T04:45:16.000Z","updated_at":"2024-09-16T12:27:49.000Z","dependencies_parsed_at":"2023-01-15T05:00:17.917Z","dependency_job_id":null,"html_url":"https://github.com/google-research/planet","commit_stats":{"total_commits":10,"total_committers":2,"mean_commits":5.0,"dds":"0.19999999999999996","last_synced_commit":"c04226b6db136f5269625378cd6a0aa875a92842"},"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/google-research%2Fplanet","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/google-research%2Fplanet/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/google-research%2Fplanet/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/google-research%2Fplanet/manifests"
,"owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/google-research","download_url":"https://codeload.github.com/google-research/planet/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":234525622,"owners_count":18846935,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-08-07T21:01:21.518Z","updated_at":"2025-09-28T14:30:51.949Z","avatar_url":"https://github.com/google-research.png","language":"Python","funding_links":[],"categories":["论文","Python"],"sub_categories":["Implementation of Algorithms"],"readme":"# Deep Planning Network\n\nDanijar Hafner, Timothy Lillicrap, Ian Fischer, Ruben Villegas, David Ha, Honglak Lee, James Davidson\n\n![PlaNet policies and predictions](https://imgur.com/UeeQIfo.gif)\n\nThis project provides the open source implementation of the PlaNet agent\nintroduced in [Learning Latent Dynamics for Planning from Pixels][paper].\nPlaNet is a purely model-based reinforcement learning algorithm that solves\ncontrol tasks from images by efficient planning in a learned latent space.\nPlaNet competes with top model-free methods in terms of final performance and\ntraining time while using substantially less interaction with the environment.\n\nIf you find this open source release useful, please reference in your paper:\n\n```\n@inproceedings{hafner2019planet,\n  title={Learning Latent Dynamics for Planning from Pixels},\n  author={Hafner, Danijar and Lillicrap, Timothy and Fischer, Ian and Villegas, Ruben and Ha, David and Lee, Honglak and Davidson, James},\n  
booktitle={International Conference on Machine Learning},\n  pages={2555--2565},\n  year={2019}\n}\n```\n\n## Method\n\n![PlaNet model diagram](https://i.imgur.com/fpvrAqw.png)\n\nPlaNet models the world as a compact sequence of hidden states. For planning,\nwe first encode the history of past images into the current state. From there,\nwe efficiently predict future rewards for multiple action sequences in latent\nspace. We execute the first action of the best sequence found and replan after\nobserving the next image.\n\nFind more information:\n\n- [Google AI Blog post][blog]\n- [Project website][website]\n- [PDF paper][paper]\n\n[blog]: https://ai.googleblog.com/2019/02/introducing-planet-deep-planning.html\n[website]: https://danijar.com/project/planet/\n[paper]: https://arxiv.org/pdf/1811.04551.pdf\n\n## Instructions\n\nTo train an agent, install the dependencies and then run:\n\n```sh\npython3 -m planet.scripts.train --logdir /path/to/logdir --params '{tasks: [cheetah_run]}'\n```\n\nThe code prints `nan` as the score for iterations during which no summaries\nwere computed.\n\nThe available tasks are listed in `scripts/tasks.py`. The default parameters\ncan be found in `scripts/configs.py`. To run the experiments from our\npaper, pass the following parameters to `--params {...}` in addition to the\nlist of tasks:\n\n| Experiment | Parameters |\n| :--------- | :--------- |\n| PlaNet | No additional parameters. |\n| Random data collection | `planner_iterations: 0, train_action_noise: 1.0` |\n| Purely deterministic | `mean_only: True, divergence_scale: 0.0` |\n| Purely stochastic | `model: ssm` |\n| One agent all tasks | `collect_every: 30000` |\n\nPlease note that the agent has seen some improvements so the results may be a\nbit different now.\n\n## Modifications\n\nThese are good places to start when modifying the code:\n\n| Directory | Description |\n| :-------- | :---------- |\n| `scripts/configs.py` | Add new parameters or change defaults. 
|\n| `scripts/tasks.py` | Add or modify environments. |\n| `models` | Add or modify latent transition models. |\n| `networks` | Add or modify encoder and decoder networks. |\n\nTips for development:\n\n- You can set `--config debug` to reduce the episode length and batch size and\n  to collect data more frequently. This helps to quickly reach all parts of the\n  code.\n- You can use `--num_runs 1000 --resume_runs False` to automatically start new\n  runs in subdirectories of the logdir every time you execute the script.\n- Environments live in separate processes by default. Some environments work\n  better when separated into threads instead by specifying `--params\n  '{isolate_envs: thread}'`.\n\n## Dependencies\n\nThe code was tested under Ubuntu 18 and uses these packages:\n\n- tensorflow-gpu==1.13.1\n- tensorflow_probability==0.6.0\n- dm_control (`egl` [rendering option][dmc-rendering] recommended)\n- gym\n- scikit-image\n- scipy\n- ruamel.yaml\n- matplotlib\n\n[dmc-rendering]: https://github.com/deepmind/dm_control#rendering\n\nDisclaimer: This is not an official Google product.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgoogle-research%2Fplanet","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fgoogle-research%2Fplanet","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgoogle-research%2Fplanet/lists"}