https://github.com/giansimone/dqn-ale-spaceinvaders
A Deep Q-Network (DQN) implementation for Atari Space Invaders using Gymnasium and PyTorch.
ale arcade-learning-environment atari deep-q-learning-network deep-reinforcement-learning dqn gymnasium python pytorch reinforcement-learning space-invaders torch
- Host: GitHub
- URL: https://github.com/giansimone/dqn-ale-spaceinvaders
- Owner: giansimone
- License: mit
- Created: 2025-10-13T14:23:37.000Z (8 months ago)
- Default Branch: main
- Last Pushed: 2025-11-25T20:15:13.000Z (6 months ago)
- Last Synced: 2025-11-28T07:16:35.431Z (6 months ago)
- Topics: ale, arcade-learning-environment, atari, deep-q-learning-network, deep-reinforcement-learning, dqn, gymnasium, python, pytorch, reinforcement-learning, space-invaders, torch
- Language: Python
- Homepage:
- Size: 286 KB
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# Deep Q-Network (DQN) for Atari Space Invaders
A PyTorch implementation of a Deep Q-Network (DQN) trained to play Atari Space Invaders using the Arcade Learning Environment (ALE).
## Features
- **Vanilla DQN** and **Dueling DQN** architectures.
- **Double DQN** support for improved stability.
- **Replay Buffer** for experience replay.
- **Epsilon-greedy exploration** with annealing (i.e., linear decay).
- **TensorBoard integration** for training visualisation.
- **Hugging Face Hub integration** for model sharing.
- **Video recording** of agent gameplay.
## Installation
Install the package from PyPI, or clone the repository and install it from source with Poetry or pip. This project requires **Python 3.13**.
### PyPI
```bash
pip install dqn-ale-spaceinvaders
```
### Source
#### Using Poetry (Recommended)
```bash
# 1. Clone the repository
git clone https://github.com/giansimone/dqn-ale-spaceinvaders.git
cd dqn-ale-spaceinvaders
# 2. Initialize environment and install dependencies
poetry env use python3.13
poetry install
# 3. Activate the virtual environment
eval $(poetry env activate)
```
#### Using pip
```bash
# 1. Clone the repository
git clone https://github.com/giansimone/dqn-ale-spaceinvaders.git
cd dqn-ale-spaceinvaders
# 2. Create and activate a virtual environment
python3.13 -m venv venv
source venv/bin/activate
# 3. Install package in editable mode
pip install -e .
```
## Project Structure
```
dqn-ale-spaceinvaders/
├── dqn_ale_spaceinvaders/
│ ├── agent.py # DQN agent implementation
│ ├── buffer.py # Experience replay buffer
│ ├── config.yaml # Agent configuration
│ ├── environment.py # Environment setup and wrappers
│ ├── model.py # Deep learning architectures
│ ├── train.py # Training script
│ ├── enjoy.py # Play with trained agent
│ ├── export.py # Export model to Hugging Face Hub
│ └── utils.py # Utility functions
├── .gitignore
├── LICENSE
├── README.md
└── pyproject.toml
```
## Usage
### Training
Train a DQN agent with the default configuration.
```bash
python -m dqn_ale_spaceinvaders.train
```
The training script will:
- Create a timestamped run directory in `runs/`.
- Save the configuration, checkpoints, and TensorBoard logs.
- Periodically evaluate the agent and save the best model.
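As a rough illustration of the timestamped run directory, the sketch below builds a path that follows the `dqn_YYYY-MM-DD_HHhMMmSSs` pattern used in the commands in this README. The helper name `make_run_dir` and its parameters are hypothetical and are not taken from the repository's `train.py`.

```python
from datetime import datetime
from pathlib import Path
from typing import Optional


def make_run_dir(root: str = "runs", now: Optional[datetime] = None) -> Path:
    """Return a run directory path such as runs/dqn_2025-01-02_03h04m05s.

    Hypothetical helper: only the naming pattern comes from this README.
    """
    now = now or datetime.now()
    # strftime leaves the literal "h", "m" and "s" characters untouched.
    return Path(root) / now.strftime("dqn_%Y-%m-%d_%Hh%Mm%Ss")


if __name__ == "__main__":
    run_dir = make_run_dir()
    print(run_dir)  # Call run_dir.mkdir(parents=True) to create it on disk.
```

The function only computes the path, so it can be called without touching the file system.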
### Configuration
Edit `config.yaml` to customize training parameters.
```yaml
# Environment
env_id: ALE/SpaceInvaders-v5
frame_skip: 5
frame_stack: 4
resized_frame: 84
# Training
training_steps: 10000000
n_eval_episodes: 10
# Exploration
warmup_steps: 100000
epsilon_start: 1.0
epsilon_end: 0.1
anneal_steps: 1000000
# Replay Buffer
buffer_size: 200000
batch_size: 32
# Learning
gamma: 0.99
lr: 0.00025
update_every: 25000
target_update_every: 10000
# DQN Variants
double_dqn: False # Enable Double DQN
dueling: False # Enable Dueling DQN
clip_rewards: False # Clip rewards to [-1, 1]
```
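To make the exploration settings concrete, here is a minimal sketch of a linear epsilon schedule driven by `warmup_steps`, `epsilon_start`, `epsilon_end`, and `anneal_steps`. It assumes that epsilon stays at `epsilon_start` during warmup and that annealing is counted from the end of warmup; the repository may count steps differently, and the function name `linear_epsilon` is hypothetical.

```python
def linear_epsilon(
    step: int,
    warmup_steps: int = 100_000,
    epsilon_start: float = 1.0,
    epsilon_end: float = 0.1,
    anneal_steps: int = 1_000_000,
) -> float:
    """Linearly interpolate epsilon from epsilon_start to epsilon_end.

    Assumption: annealing begins once the warmup phase is over.
    """
    # Fraction of the annealing phase completed, clamped to [0, 1].
    progress = (step - warmup_steps) / anneal_steps
    progress = min(1.0, max(0.0, progress))
    return epsilon_start + progress * (epsilon_end - epsilon_start)


if __name__ == "__main__":
    for step in (0, 100_000, 600_000, 1_100_000, 5_000_000):
        print(step, round(linear_epsilon(step), 3))
```

With the default values, epsilon reaches `epsilon_end` after `warmup_steps + anneal_steps` environment steps and stays there for the rest of training.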
### Monitoring Training
View training progress with TensorBoard:
```bash
tensorboard --logdir runs/dqn_YYYY-MM-DD_HHhMMmSSs/
```
### Testing a Trained Agent
Watch your trained agent play:
```bash
python -m dqn_ale_spaceinvaders.enjoy --artifact runs/dqn_YYYY-MM-DD_HHhMMmSSs/final_model.pt --num-episodes 5
```
### Exporting to Hugging Face Hub
Share your trained model:
```bash
python -m dqn_ale_spaceinvaders.export \
--username YOUR_HF_USERNAME \
--repo-name dqn-spaceinvaders \
--artifact-path runs/dqn_YYYY-MM-DD_HHhMMmSSs/final_model.pt \
--movie-fps 12
```
This will:
- Create a repository on Hugging Face Hub.
- Upload the model weights, configuration, and evaluation results.
- Generate and upload a replay movie.
- Create a model card with usage instructions.
## Algorithm Details
### DQN Architecture
The network consists of:
- 3 convolutional layers for feature extraction.
- 2 fully connected layers for Q-value estimation.
- Input: 4 stacked 84×84 grayscale frames.
- Output: Q-values for each action.
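The README does not list kernel sizes or strides, so the sketch below assumes the layer settings of the original Nature DQN network (8×8 stride 4, 4×4 stride 2, 3×3 stride 1, no padding) purely for illustration. It works out how an 84×84 input shrinks through three convolutions and how many features reach the first fully connected layer. The names `conv_out`, `ASSUMED_CONV_LAYERS`, and `flattened_features` are hypothetical; check `model.py` for the actual values.

```python
def conv_out(size: int, kernel: int, stride: int) -> int:
    """Spatial output size of a convolution without padding."""
    return (size - kernel) // stride + 1


# Assumed (out_channels, kernel, stride) per layer, following the Nature DQN
# layout. These are NOT confirmed by this README.
ASSUMED_CONV_LAYERS = [(32, 8, 4), (64, 4, 2), (64, 3, 1)]


def flattened_features(frame_size: int = 84) -> int:
    """Number of features after the last convolution, flattened."""
    size = frame_size
    channels = 0
    for channels, kernel, stride in ASSUMED_CONV_LAYERS:
        size = conv_out(size, kernel, stride)
    return channels * size * size


if __name__ == "__main__":
    print(flattened_features(84))  # 64 channels * 7 * 7 = 3136
```

Under these assumptions the spatial size goes 84 → 20 → 9 → 7, so the first fully connected layer would receive 3136 inputs.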
### Dueling DQN Architecture
Separates state value and action advantages:
- Shared convolutional backbone.
- Value stream: estimates state value V(s).
- Advantage stream: estimates action advantages A(s,a).
- Q(s,a) = V(s) + (A(s,a) - mean over a' of A(s,a')), where the mean is taken over all actions.
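The aggregation step can be checked with plain numbers. The sketch below applies the formula above to one state, using Python lists in place of tensors; the function name `dueling_q_values` is hypothetical and the repository's implementation operates on batched PyTorch tensors instead.

```python
def dueling_q_values(value: float, advantages: list[float]) -> list[float]:
    """Combine V(s) and A(s, a) into Q(s, a) with mean-centred advantages."""
    mean_advantage = sum(advantages) / len(advantages)
    return [value + (a - mean_advantage) for a in advantages]


if __name__ == "__main__":
    # V(s) = 2.0 and three actions with advantages 1.0, 2.0, 3.0.
    print(dueling_q_values(2.0, [1.0, 2.0, 3.0]))  # [1.0, 2.0, 3.0]
```

Subtracting the mean makes the decomposition identifiable: adding a constant to every advantage leaves the Q-values unchanged, and the average Q-value over actions equals V(s).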
### Training Process
1. **Warmup**: Random exploration for initial experiences.
2. **Epsilon Annealing**: Gradual reduction from exploration to exploitation.
3. **Experience Replay**: Sample random mini-batches from replay buffer.
4. **Target Network**: Separate network updated periodically for stability.
5. **Double DQN** (optional): Reduces overestimation by decoupling action selection and evaluation.
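The difference between the standard target and the Double DQN target in step 5 can be sketched for a single transition as follows. This is an illustrative, list-based version under the usual textbook definitions, not the repository's batched PyTorch code; the function name `td_target` and its parameters are hypothetical.

```python
def td_target(
    reward: float,
    done: bool,
    next_q_online: list[float],
    next_q_target: list[float],
    gamma: float = 0.99,
    double_dqn: bool = False,
) -> float:
    """Bootstrapped target for one transition.

    next_q_online: Q-values of the next state from the online network.
    next_q_target: Q-values of the next state from the target network.
    """
    if done:
        # Terminal transitions do not bootstrap from the next state.
        return reward
    if double_dqn:
        # Select the action with the online network, evaluate it with the
        # target network.
        best_action = max(range(len(next_q_online)), key=lambda a: next_q_online[a])
        bootstrap = next_q_target[best_action]
    else:
        # Vanilla DQN: the target network both selects and evaluates.
        bootstrap = max(next_q_target)
    return reward + gamma * bootstrap


if __name__ == "__main__":
    online, target = [1.0, 3.0, 2.0], [5.0, 2.0, 4.0]
    print(td_target(1.0, False, online, target))
    print(td_target(1.0, False, online, target, double_dqn=True))
```

In this example the vanilla target bootstraps from the largest target-network value (5.0), while the Double DQN target uses the target network's value for the action preferred by the online network (2.0), which is how the overestimation is reduced.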
## License
This project is available under the MIT License.