{"id":21098503,"url":"https://github.com/godka/pensieve-ppo","last_synced_at":"2025-04-07T06:11:24.086Z","repository":{"id":36680580,"uuid":"200053383","full_name":"godka/Pensieve-PPO","owner":"godka","description":"The simplest implementation of Pensieve (SIGCOMM' 17) via state-of-the-art RL algorithms, including PPO, DQN, SAC, and support for both TensorFlow and PyTorch.","archived":false,"fork":false,"pushed_at":"2025-01-18T08:56:11.000Z","size":16522,"stargazers_count":73,"open_issues_count":3,"forks_count":35,"subscribers_count":6,"default_branch":"torch","last_synced_at":"2025-03-31T05:09:02.577Z","etag":null,"topics":["a2c","deep-learning","dqn","pensieve","ppo","pytorch","reinforcement-learning","tensorflow"],"latest_commit_sha":null,"homepage":"https://godka.github.io/Pensieve-PPO/","language":"DIGITAL Command Language","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"bsd-2-clause","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/godka.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2019-08-01T13:12:34.000Z","updated_at":"2025-03-24T08:19:21.000Z","dependencies_parsed_at":"2025-02-28T02:25:26.153Z","dependency_job_id":"8892c1c0-c05f-40d5-808b-48b49fa960d5","html_url":"https://github.com/godka/Pensieve-PPO","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/godka%2FPensieve-PPO","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/godka%2FPensieve-PPO/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/godka%2
FPensieve-PPO/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/godka%2FPensieve-PPO/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/godka","download_url":"https://codeload.github.com/godka/Pensieve-PPO/tar.gz/refs/heads/torch","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247601448,"owners_count":20964864,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["a2c","deep-learning","dqn","pensieve","ppo","pytorch","reinforcement-learning","tensorflow"],"created_at":"2024-11-19T22:55:29.011Z","updated_at":"2025-04-07T06:11:24.055Z","avatar_url":"https://github.com/godka.png","language":"DIGITAL Command Language","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Pensieve PPO\n\n### Updates\n\n**Jan. 18, 2025:** We removed the rate-based method and added NetLLM [4].\n\n**May 4, 2024:** We removed Elastic, revised BOLA, and added two new baselines, Comyco [3] and Genet [2].\n\n**Jan. 26, 2024:** We are excited to announce significant updates to Pensieve-PPO! We have replaced TensorFlow with PyTorch, achieving similar training speed and comparable model performance.\n\n*For the TensorFlow version, please check [Pensieve-PPO TF Branch](https://github.com/godka/Pensieve-PPO/tree/master).*\n\n**Dec. 
28, 2021:** In a previous update, we enhanced Pensieve-PPO with several state-of-the-art techniques, including Dual-Clip PPO and adaptive entropy decay.\n\n## About Pensieve-PPO\n\nPensieve-PPO is a user-friendly PyTorch implementation of Pensieve [1], a neural adaptive video streaming system. Unlike the original Pensieve, which is trained with A3C, this implementation uses the Proximal Policy Optimization (PPO) algorithm for training.\n\nThis stable version of Pensieve-PPO includes both the training and test datasets.\n\nTo start training, run the following command:\n\n```\npython train.py\n```\n\nThe model is evaluated on the test set (from HSDPA) every 300 epochs.\n\n## TensorBoard Integration\n\nTo monitor the training process in real time, you can use TensorBoard. Run the following command:\n\n```\ntensorboard --logdir=./\n```\n\n## Pretrained Model\n\nWe have also added a pretrained model, which can be found at [this link](https://github.com/godka/Pensieve-PPO/tree/torch/src/pretrain). This model improves average Quality of Experience (QoE) by 7.03% (from 0.924 to 0.989) over the original Pensieve model [1]. 
For a more detailed performance analysis, refer to the figures below:\n\n\u003cp align=\"center\"\u003e\n    \u003cimg src=\"src/baselines-br.png\" width=\"50%\"\u003e\u003cimg src=\"src/baselines-bs.png\" width=\"50%\"\u003e\n\u003c/p\u003e\n\u003cp align=\"center\"\u003e\n    \u003cimg src=\"src/baselines-qoe.png\" width=\"100%\"\u003e\n\u003c/p\u003e\nIf you have any questions or require further assistance, please don't hesitate to reach out.\n\n## Additional Reinforcement Learning Algorithms\n\nFor more implementations of reinforcement learning algorithms, please visit the following branches:\n\n- DQN: [Pensieve-PPO DQN Branch](https://github.com/godka/Pensieve-PPO/tree/dqn)\n- SAC: [Pensieve-PPO SAC Branch](https://github.com/godka/Pensieve-PPO/tree/SAC) or [Pensieve-SAC Repository](https://github.com/godka/Pensieve-SAC)\n\n[1] Mao, Hongzi, Ravi Netravali, and Mohammad Alizadeh. \"Neural adaptive video streaming with Pensieve.\" Proceedings of the Conference of the ACM Special Interest Group on Data Communication. 2017, pp. 197-210.\n\n[2] Xia, Zhengxu, et al. \"Genet: Automatic curriculum generation for learning adaptation in networking.\" Proceedings of the ACM SIGCOMM 2022 Conference. 2022.\n\n[3] Huang, Tianchi, et al. \"Comyco: Quality-aware adaptive video streaming via imitation learning.\" Proceedings of the 27th ACM International Conference on Multimedia. 2019.\n\n[4] Wu, Duo, et al. \"NetLLM: Adapting large language models for networking.\" Proceedings of the ACM SIGCOMM 2024 Conference. 
2024.\n\n* We use the following command to test on *all traces* in the dataset.\n\n```\npython run_plm.py --test --plm-type llama --plm-size base --rank 128 --device cuda:0 --trace-num -1 --model-dir data/ft_plms/try_llama2_7b\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgodka%2Fpensieve-ppo","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fgodka%2Fpensieve-ppo","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgodka%2Fpensieve-ppo/lists"}