
# n-step Advantage Actor-Critic (A2C) for CartPole

This repository contains an implementation of the n-step Advantage Actor-Critic (A2C) algorithm for the CartPole environment. The project explores both discrete and continuous action spaces, and investigates the effects of various hyperparameters on the learning process.
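As a quick orientation, below is a minimal sketch of how an n-step return and advantage can be computed from a single rollout. The function and variable names are illustrative and do not necessarily match those in `train.py`:

```python
import torch

def n_step_advantages(rewards, values, bootstrap_value, gamma=0.99):
    """Compute n-step returns and advantages for one rollout.

    rewards:         tensor of shape (n,) - rewards r_t .. r_{t+n-1}
    values:          tensor of shape (n,) - critic estimates V(s_t) .. V(s_{t+n-1})
    bootstrap_value: scalar tensor        - V(s_{t+n}), or 0 if the episode ended
    """
    n = rewards.shape[0]
    returns = torch.empty(n)
    running = bootstrap_value
    # Work backwards: G_t = r_t + gamma * G_{t+1}, seeded with V(s_{t+n}).
    for t in reversed(range(n)):
        running = rewards[t] + gamma * running
        returns[t] = running
    # Advantage A_t = G_t - V(s_t); detach the values so the critic
    # target stays fixed when the actor loss is backpropagated.
    advantages = returns - values.detach()
    return returns, advantages
```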

## Project Structure

- `K=1-n=1-disc/`: Contains an animation of the trained CartPole agent for K=1, n=1 in the discrete environment.
- `imgs/`: Contains plots used in the report.
- `lists/`: Contains data for each agent to reproduce plots without retraining.
- `CS_456_MP2_A2C.pdf`: The project report detailing methodology and results.
- `MP2_A2C.pdf`: The project handout with specifications.
- `train.py`: Implementation of the A2C algorithm and supporting functions.
- `Solution.ipynb`: Jupyter notebook to run the A2C algorithm and generate plots.

## Features

- Implementation of n-step A2C for both discrete and continuous action spaces
- Support for multiple workers (K) and n-step returns (see the sketch after this list)
- Evaluation and logging functionalities
- Visualization of training progress, value functions, and agent performance
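The K-workers idea is to run K copies of the environment in parallel and batch their transitions into a single gradient step. The following is a hedged sketch of that data-collection loop, assuming the Gymnasium API and a `policy` callable that samples one action per worker; it is illustrative and not a copy of the repository's training loop:

```python
import gymnasium as gym
import torch

K, n = 4, 5  # illustrative values for workers and rollout length
envs = [gym.make("CartPole-v1") for _ in range(K)]
states = torch.tensor([env.reset()[0] for env in envs], dtype=torch.float32)  # (K, 4)

def collect_rollout(policy, states):
    """Step all K workers for n steps, returning batched tensors."""
    batch_states, batch_actions, batch_rewards = [], [], []
    for _ in range(n):
        actions = policy(states)  # assumed to return a tensor of shape (K,)
        results = [env.step(a.item()) for env, a in zip(envs, actions)]
        next_states = torch.tensor([r[0] for r in results], dtype=torch.float32)
        rewards = torch.tensor([r[1] for r in results], dtype=torch.float32)
        batch_states.append(states)
        batch_actions.append(actions)
        batch_rewards.append(rewards)
        states = next_states
        # NOTE: resets on episode termination are omitted for brevity.
    # Stack into (n, K, ...) tensors; a K x n update then flattens all
    # n steps from all K workers into one batch for the gradient step.
    return (torch.stack(batch_states), torch.stack(batch_actions),
            torch.stack(batch_rewards), states)
```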

## Getting Started

1. Clone this repository.
2. Install the required dependencies listed in `requirements.txt`.
3. Run the cells in `Solution.ipynb` to train the agent or reproduce the plots from pre-saved data.

## Results

The project explores various configurations of the A2C algorithm, including:
- Basic A2C version in CartPole
- Stochastic rewards
- Multiple workers (K-workers)
- n-step returns
- K × n batch learning

Detailed results and analysis can be found in `CS_456_MP2_A2C.pdf`.

## Usage

To train a new agent or reproduce results:

1. Open `Solution.ipynb`.
2. Adjust hyperparameters as needed (see the example below).
3. Run the cells to train the agent or generate plots from pre-saved data.
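For reference, the main hyperparameters of an A2C run typically look like the following. The exact names and values used in `Solution.ipynb` may differ, so treat these purely as placeholders:

```python
# Illustrative A2C hyperparameters; not necessarily the names or values
# used in Solution.ipynb.
config = {
    "K": 4,              # number of parallel workers
    "n": 5,              # n-step return horizon
    "gamma": 0.99,       # discount factor
    "actor_lr": 3e-4,    # actor learning rate
    "critic_lr": 1e-3,   # critic learning rate
    "max_steps": 500_000,
}
```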

## Acknowledgments

This project was completed as part of the EPFL Artificial Neural Networks and Reinforcement Learning course, in collaboration with [@alibakly](https://github.com/AliBakly).