https://github.com/bryanoliveira/spgym-experiments
- Host: GitHub
- URL: https://github.com/bryanoliveira/spgym-experiments
- Owner: bryanoliveira
- Created: 2024-09-20T18:21:22.000Z (about 1 year ago)
- Default Branch: main
- Last Pushed: 2025-02-13T18:59:56.000Z (8 months ago)
- Last Synced: 2025-02-22T09:48:03.568Z (8 months ago)
- Language: Jupyter Notebook
- Size: 24.5 MB
- Stars: 1
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
# Sliding Puzzles Gym: A Scalable Benchmark for State Representation in Visual Reinforcement Learning
*A PPO agent solving the environment.*
This repository contains the official implementation of experiments for the paper [Sliding Puzzles Gym: A Scalable Benchmark for State Representation in Visual Reinforcement Learning](https://arxiv.org/abs/2410.14038). This includes PPO, SAC, and DreamerV3 implementations along with scripts to reproduce all results.
The code for PPO and SAC was adapted from the [CleanRL implementation](https://github.com/vwxyzjn/cleanrl), and the code for DreamerV3 was adapted from the [official implementation](https://github.com/danijar/dreamerv3).
The benchmark environment itself can be found [here](https://github.com/bryanoliveira/sliding-puzzles-gym).

## Requirements

1. Set up the benchmark
To install the Sliding Puzzles Gym (SPGym), run the following commands:
```bash
git clone https://github.com/bryanoliveira/sliding-puzzles-gym
cd sliding-puzzles-gym
pip install -e .
```

To use the `imagenet-1k` dataset, you will also need to download it from https://huggingface.co/datasets/ILSVRC/imagenet-1k/blob/main/data/val_images.tar.gz and extract it to `/imgs/imagenet-1k`. You can do this automatically by running the following command:
```bash
sliding-puzzles setup imagenet
```

You can also use the `diffusiondb` dataset by running the following command:
```bash
sliding-puzzles setup diffusiondb
```

2. Prepare Python environments
Create separate Python environments for DreamerV3 and PPO/SAC. A `requirements.txt` can be found inside each corresponding folder, and Dockerfiles are also available. See `dreamerv3/README.md` and `ppo_sac/README.md` for more details.
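For intuition about the task the agents face: SPGym renders a sliding-tile puzzle as image patches, but the underlying dynamics are the classic k-puzzle. The toy sketch below shows only that symbolic core; the class and method names are illustrative and are **not** SPGym's actual API.

```python
# Toy sliding-tile dynamics for intuition only -- NOT SPGym's implementation.
# SPGym additionally renders tiles as image patches from a configurable image
# pool; this sketch keeps just the symbolic 3x3 grid (move the blank tile).
import random


class TinySlidingPuzzle:
    MOVES = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}

    def __init__(self, size=3, seed=None):
        self.size = size
        self.rng = random.Random(seed)
        self.goal = list(range(size * size))  # 0 denotes the blank tile
        self.reset()

    def reset(self, shuffle_moves=50):
        # Shuffling with legal moves guarantees the puzzle stays solvable.
        self.board = self.goal[:]
        for _ in range(shuffle_moves):
            self.step(self.rng.choice(list(self.MOVES)))
        return tuple(self.board)

    def step(self, action):
        blank = self.board.index(0)
        r, c = divmod(blank, self.size)
        dr, dc = self.MOVES[action]
        nr, nc = r + dr, c + dc
        if 0 <= nr < self.size and 0 <= nc < self.size:  # ignore off-grid moves
            j = nr * self.size + nc
            self.board[blank], self.board[j] = self.board[j], self.board[blank]
        done = self.board == self.goal
        reward = 1.0 if done else 0.0
        return tuple(self.board), reward, done


env = TinySlidingPuzzle(seed=0)
obs = env.reset()
obs, reward, done = env.step("up")
print(len(obs))  # 9 cells in a 3x3 board
```

The benchmark's difficulty knob is orthogonal to these dynamics: the state space stays fixed while the visual observation distribution grows with the image-pool size.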
## Training
Scripts for reproducing the paper's results can be found in `dreamerv3/scripts` and `ppo_sac/scripts`. Seeds are generated automatically by the algorithms if not specified. For example, from inside the `dreamerv3` folder, run:
```bash
bash scripts/run.sh
```

## Results
These are the expected results:


Learning curves for DreamerV3, PPO, and SAC variants. Shaded regions represent 95% confidence intervals across 5 independent runs.

Success rate as a function of environment steps. The gradual increase in representation complexity affects the sample efficiency of standard PPO, SAC, and DreamerV3 agents at different rates. Each line represents a different pool size, from 1 to 100 images. Shaded regions represent 95% confidence intervals across 5 independent seeds.