https://github.com/eliashornberg/cartpole-a2c-reinforcement-learning
This repository contains the implementation of the K-workers, n-step Advantage Actor-Critic (A2C) algorithm applied to the CartPole environment, as part of a reinforcement learning project for the EPFL Spring Semester 2024 course on Artificial Neural Networks and Reinforcement Learning.
- Host: GitHub
- URL: https://github.com/eliashornberg/cartpole-a2c-reinforcement-learning
- Owner: eliashornberg
- Created: 2024-09-15T16:44:58.000Z (about 1 year ago)
- Default Branch: main
- Last Pushed: 2024-09-15T16:47:16.000Z (about 1 year ago)
- Last Synced: 2025-04-21T08:44:55.280Z (6 months ago)
- Topics: advantage-actor-critic, artificial-intelligence, cartpole, pytorch, reinforcement-learning
- Language: Jupyter Notebook
- Size: 15 MB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
- Metadata Files:
  - Readme: README.md
README
# n-step Advantage Actor-Critic (A2C) for CartPole
This repository contains an implementation of the n-step Advantage Actor-Critic (A2C) algorithm for the CartPole environment. The project explores both discrete and continuous action spaces, and investigates the effects of various hyperparameters on the learning process.
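For orientation before reading the report: below is a minimal PyTorch sketch of the n-step target computation at the heart of A2C. This is illustrative, not the repository's actual code; the function and variable names are made up here, and the losses mentioned in the closing comments follow the standard A2C formulation.

```python
import torch

def n_step_targets(rewards, values, bootstrap_value, gamma=0.99):
    """n-step returns G_t = r_t + gamma*r_{t+1} + ... + gamma^n * V(s_{t+n})
    for one worker's rollout of length n, plus advantages A_t = G_t - V(s_t).

    rewards:         (n,) tensor of r_t ... r_{t+n-1}
    values:          (n,) tensor of critic estimates V(s_t) ... V(s_{t+n-1})
    bootstrap_value: scalar tensor V(s_{t+n}); use 0 if the episode terminated
    """
    returns = torch.zeros_like(rewards)
    g = bootstrap_value
    for t in reversed(range(len(rewards))):
        g = rewards[t] + gamma * g   # accumulate discounted return backwards
        returns[t] = g
    advantages = returns - values
    return returns, advantages

# Example: a 3-step rollout with reward 1 per step, bootstrapped from V(s_3).
returns, adv = n_step_targets(
    torch.tensor([1.0, 1.0, 1.0]),
    torch.tensor([10.0, 9.5, 9.0]),
    torch.tensor(8.5),
)
# The actor is then updated with -(log_prob * adv.detach()).mean() and the
# critic with adv.pow(2).mean(), as in standard A2C.
```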
## Project Structure
- `K=1-n=1-disc/`: Contains an animation of the CartPole agent trained with K=1 worker and n=1 step returns in the discrete environment.
- `imgs/`: Contains plots used in the report.
- `lists/`: Contains data for each agent to reproduce plots without retraining.
- `CS_456_MP2_A2C.pdf`: The project report detailing methodology and results.
- `MP2_A2C.pdf`: The project handout with specifications.
- `train.py`: Implementation of the A2C algorithm and supporting functions.
- `Solution.ipynb`: Jupyter notebook to run the A2C algorithm and generate plots.

## Features
- Implementation of n-step A2C for both discrete and continuous action spaces
- Support for multiple workers (K) and n-step returns (see the sketch after this list)
- Evaluation and logging functionalities
- Visualization of training progress, value functions, and agent performance
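A minimal sketch of what K parallel workers look like, assuming Gymnasium's vectorized-environment API (the repository itself may use a different environment API; the names here are illustrative):

```python
import gymnasium as gym
import numpy as np

K = 4  # number of parallel workers (illustrative)

# K independent CartPole instances, stepped in lockstep.
envs = gym.vector.SyncVectorEnv(
    [lambda: gym.make("CartPole-v1") for _ in range(K)]
)

obs, _ = envs.reset(seed=0)                    # obs has shape (K, 4)
for _ in range(5):                             # e.g. n = 5 steps per update
    actions = np.random.randint(0, 2, size=K)  # stand-in for policy sampling
    obs, rewards, terminated, truncated, _ = envs.step(actions)
    # rewards / terminated / truncated all have shape (K,)
envs.close()
```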
## Getting Started

1. Clone this repository.
2. Install the required dependencies listed in `requirements.txt`.
3. Run cells in `Solution.ipynb` to train the agent or reproduce plots using pre-saved data.

## Results
The project explores various configurations of the A2C algorithm, including:
- Basic A2C version in CartPole
- Stochastic rewards
- Multiple workers (K-workers)
- n-step returns
- K × n batch learning (see the sketch below)

Detailed results and analysis can be found in `CS_456_MP2_A2C.pdf`.
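A shape-only sketch of the K × n flattening, assuming rollouts are stored as PyTorch tensors (illustrative, not necessarily the repository's storage layout):

```python
import torch

K, n, obs_dim = 4, 5, 4              # workers, steps, CartPole observation size

# Rollout storage: one row per worker, one column per step.
obs_buf = torch.zeros(K, n, obs_dim)
rew_buf = torch.zeros(K, n)

# After collection, merge the worker and time dimensions into a single
# batch of K * n transitions for one actor-critic update.
batch_obs = obs_buf.reshape(K * n, obs_dim)   # (20, 4)
batch_rew = rew_buf.reshape(K * n)            # (20,)
```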
## Usage
To train a new agent or reproduce results:
1. Open `Solution.ipynb`
2. Adjust hyperparameters as needed (an illustrative set is sketched after this list)
3. Run the cells to train the agent or generate plots from pre-saved data
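As a purely hypothetical illustration of step 2: the real hyperparameter names live in `Solution.ipynb` and `train.py` and may differ from these.

```python
# Hypothetical hyperparameter block -- names are illustrative, not the
# repository's actual API; check Solution.ipynb / train.py for the real ones.
config = {
    "K": 4,            # number of parallel workers
    "n": 5,            # steps per n-step return
    "gamma": 0.99,     # discount factor
    "actor_lr": 3e-4,  # policy learning rate
    "critic_lr": 1e-3, # value-function learning rate
    "max_steps": 500_000,
}
```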
## Acknowledgments

This project was completed as part of the EPFL Artificial Neural Networks and Reinforcement Learning course, in collaboration with [@alibakly](https://github.com/AliBakly).