https://github.com/lijian736/pong
A reinforcement learning pong game
- Host: GitHub
- URL: https://github.com/lijian736/pong
- Owner: lijian736
- License: mit
- Created: 2025-01-21T10:11:01.000Z (9 months ago)
- Default Branch: main
- Last Pushed: 2025-02-16T11:09:11.000Z (8 months ago)
- Last Synced: 2025-02-16T12:18:56.446Z (8 months ago)
- Topics: onnxruntime-web, phaser3, pong-game, pytorch, reinforce, reinforcement-learning
- Language: JavaScript
- Homepage:
- Size: 2.55 MB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
# The Pong game with Reinforcement Learning AI Agent

## Getting started
A reinforcement learning `Pong` game demo written in JavaScript that runs in a web browser.
## Description
Two AI agents play the game. You can only watch as a spectator.

If you are a programmer:
1. Install VS Code and the `Live Server` extension.
2. Open the `index.html` file with the `Live Server` extension.

If you are not a programmer:
Deploy the project as an app on any HTTP server.

## Actions
Pong has an action space of size 2. The table below lists the meaning of each action.
| Value | Meaning |
| --------- | --------- |
| 0 | move up |
| 1 | move down |

## States
Pong's state is a tuple of 5 items. The table below lists the meaning of each item.
| Index | Meaning | Min value | Max value |
| --------- | --------------------- | --------- | --------- |
| 0 | the ball x coordinate | 0.0 | 1.0 |
| 1 | the ball y coordinate | 0.0 | 1.0 |
| 2 | the ball x velocity | 0.5 | 0.1 |
| 3 | the ball y velocity | -0.2 | 0.2 |
| 4 | the paddle y position | 0.0 | 1.0 |

The positive x direction points to the right, and the positive y direction points up.

## Rewards
The agent receives a reward when the ball passes the paddle or collides with it.
```
reward = math.log(abs(paddle_pos - ball_position.y) / area_height + 0.000001)
```
- `paddle_pos` is the paddle center y position
- `ball_position.y` is the ball center y position
- `area_height` is the game area height

## How to train the model
Please refer to the training [`README.md`](./docs/README.md) for details.
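
As a rough illustration, the action, state, and reward definitions above can be tied together in a short Python sketch. The names `PongState`, `compute_reward`, `MOVE_UP`, and `MOVE_DOWN`, as well as the sample values, are assumptions made for this example and are not taken from the project's code.

```python
import math
from typing import NamedTuple

# Action values from the Actions table (constant names are hypothetical).
MOVE_UP = 0
MOVE_DOWN = 1


class PongState(NamedTuple):
    """The 5-item state tuple, in the index order of the States table.

    Field names are hypothetical; only the order and meaning come from the table.
    """
    ball_x: float    # index 0: the ball x coordinate
    ball_y: float    # index 1: the ball y coordinate
    ball_vx: float   # index 2: the ball x velocity
    ball_vy: float   # index 3: the ball y velocity
    paddle_y: float  # index 4: the paddle y position


def compute_reward(paddle_pos: float, ball_y: float, area_height: float) -> float:
    """Evaluate the reward formula from the Rewards section.

    paddle_pos and ball_y are the center y positions of the paddle and the ball;
    area_height is the game area height, in the same units.
    """
    return math.log(abs(paddle_pos - ball_y) / area_height + 0.000001)


# Sample values, chosen only for illustration.
state = PongState(ball_x=0.5, ball_y=0.5, ball_vx=0.1, ball_vy=-0.2, paddle_y=0.5)
reward_aligned = compute_reward(paddle_pos=300.0, ball_y=300.0, area_height=600.0)
reward_far = compute_reward(paddle_pos=0.0, ball_y=600.0, area_height=600.0)
print(state, reward_aligned, reward_far)
```

With the formula as written, the reward is about -13.8 when the paddle and ball centers coincide, and it approaches 0 when they are a full area height apart.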
## Screenshots
1. The training screenshot
2. The game screenshot
