Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/godka/pensieve-ppo
The simplest implementation of Pensieve (SIGCOMM' 17) via state-of-the-art RL algorithms, including PPO, DQN, SAC, and support for both TensorFlow and PyTorch.
a2c deep-learning dqn pensieve ppo pytorch reinforcement-learning tensorflow
Last synced: about 10 hours ago
- Host: GitHub
- URL: https://github.com/godka/pensieve-ppo
- Owner: godka
- License: bsd-2-clause
- Created: 2019-08-01T13:12:34.000Z (over 5 years ago)
- Default Branch: torch
- Last Pushed: 2024-05-04T16:39:55.000Z (7 months ago)
- Last Synced: 2024-05-05T07:28:54.006Z (7 months ago)
- Topics: a2c, deep-learning, dqn, pensieve, ppo, pytorch, reinforcement-learning, tensorflow
- Language: DIGITAL Command Language
- Homepage: https://godka.github.io/Pensieve-PPO/
- Size: 15.3 MB
- Stars: 58
- Watchers: 7
- Forks: 32
- Open Issues: 3
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# Pensieve PPO
### Updates
**May 4, 2024:** We removed Elastic, revised BOLA, and added new baselines Comyco [3] and Genet [2].
**Jan. 26, 2024:** We are excited to announce significant updates to Pensieve-PPO! We have replaced TensorFlow with PyTorch, achieving a similar training speed while producing models of comparable performance.
*For the TensorFlow version, please check [Pensieve-PPO TF Branch](https://github.com/godka/Pensieve-PPO/tree/master).*
**Dec. 28, 2021:** In a previous update, we enhanced Pensieve-PPO with several state-of-the-art technologies, including Dual-Clip PPO and adaptive entropy decay.
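For readers unfamiliar with Dual-Clip PPO, below is a minimal PyTorch sketch of the surrogate loss it adds on top of the standard PPO clipping (an illustration of the general technique, not this repository's exact code; the constants and function name are assumptions):

```python
import torch

def dual_clip_ppo_loss(ratio, advantage, clip_eps=0.2, dual_clip=3.0):
    """Illustrative Dual-Clip PPO surrogate loss (hypothetical sketch).

    ratio:     pi_new(a|s) / pi_old(a|s) for each sampled action, shape [batch]
    advantage: advantage estimates, shape [batch]
    """
    # Standard PPO clipped objective (to be maximized).
    surr1 = ratio * advantage
    surr2 = torch.clamp(ratio, 1.0 - clip_eps, 1.0 + clip_eps) * advantage
    clipped = torch.min(surr1, surr2)

    # Dual clip: for negative advantages, bound the objective from below by
    # dual_clip * advantage so a highly off-policy sample cannot dominate the update.
    objective = torch.where(advantage < 0,
                            torch.max(clipped, dual_clip * advantage),
                            clipped)

    # Negate to obtain a loss suitable for gradient descent.
    return -objective.mean()
```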
## About Pensieve-PPO
Pensieve-PPO is a user-friendly PyTorch implementation of Pensieve [1], a neural adaptive video streaming system. Unlike the original implementation, which trains with A3C, we use the Proximal Policy Optimization (PPO) algorithm.
This stable version of Pensieve-PPO includes both the training and test datasets.
You can start training by running the following command:
```
python train.py
```
The results will be evaluated on the test set (from HSDPA) every 300 epochs.
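For context, the per-chunk reward that Pensieve-style training maximizes is typically the linear QoE metric from [1]: chunk bitrate minus a rebuffering penalty minus a smoothness penalty. A minimal sketch, using the penalty weights from the paper (this repository's constants may differ):

```python
# Linear QoE reward from Pensieve [1]; weights follow the paper, not necessarily this repo.
REBUF_PENALTY = 4.3   # QoE penalty per second of rebuffering
SMOOTH_PENALTY = 1.0  # QoE penalty per Mbps of bitrate change between chunks

def qoe_reward(bitrate_mbps, rebuffer_sec, last_bitrate_mbps):
    """Reward for one downloaded chunk: quality - rebuffering - smoothness."""
    return (bitrate_mbps
            - REBUF_PENALTY * rebuffer_sec
            - SMOOTH_PENALTY * abs(bitrate_mbps - last_bitrate_mbps))
```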
## Tensorboard Integration
To monitor the training process in real time, you can leverage Tensorboard. Simply run the following command:
```
tensorboard --logdir=./
```
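As an illustration, training metrics show up in TensorBoard when they are written as scalars under the log directory; a minimal sketch with `torch.utils.tensorboard` (the tag names and helper are assumptions, not the actual logging code in `train.py`):

```python
from torch.utils.tensorboard import SummaryWriter

# Hypothetical logging sketch; the log directory and tags used by train.py may differ.
writer = SummaryWriter(log_dir="./")

def log_epoch(epoch, avg_reward, entropy_weight):
    # One scalar per metric per epoch; TensorBoard renders each tag as a curve.
    writer.add_scalar("train/avg_qoe_reward", avg_reward, epoch)
    writer.add_scalar("train/entropy_weight", entropy_weight, epoch)
    writer.flush()
```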
## Pretrained Model
We have also added a pretrained model, which can be found at [this link](https://github.com/godka/Pensieve-PPO/tree/torch/src/pretrain). This model demonstrates a substantial improvement of 7.03% (from 0.924 to 0.989) in average Quality of Experience (QoE) compared to the original Pensieve model [1]. For a more detailed performance analysis, refer to the performance figures in the repository.
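As a side note, a minimal sketch of how one could inspect the pretrained checkpoint with PyTorch (the file name under `src/pretrain/` is an assumption):

```python
import torch

# Hypothetical inspection sketch; the actual checkpoint file name may differ.
state_dict = torch.load("src/pretrain/actor.pth", map_location="cpu")
for name, tensor in state_dict.items():
    print(name, tuple(tensor.shape))
```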
If you have any questions or require further assistance, please don't hesitate to reach out.

## Additional Reinforcement Learning Algorithms
For more implementations of reinforcement learning algorithms, please visit the following branches:
- DQN: [Pensieve-PPO DQN Branch](https://github.com/godka/Pensieve-PPO/tree/dqn)
- SAC: [Pensieve-PPO SAC Branch](https://github.com/godka/Pensieve-PPO/tree/SAC) or [Pensieve-SAC Repository](https://github.com/godka/Pensieve-SAC)

## References

[1] Mao, Hongzi, et al. "Neural adaptive video streaming with Pensieve." Proceedings of the Conference of the ACM Special Interest Group on Data Communication. 2017.
[2] Xia, Zhengxu, et al. "Genet: automatic curriculum generation for learning adaptation in networking." Proceedings of the ACM SIGCOMM 2022 Conference. 2022.
[3] Huang, Tianchi, et al. "Comyco: Quality-aware adaptive video streaming via imitation learning." Proceedings of the 27th ACM International Conference on Multimedia. 2019.