https://github.com/smallgig/pickomino
Reinforcement Learning Environment with Gymnasium API
- Host: GitHub
- URL: https://github.com/smallgig/pickomino
- Owner: smallgig
- License: mit
- Created: 2025-08-18T10:06:32.000Z (5 months ago)
- Default Branch: main
- Last Pushed: 2026-01-14T11:26:22.000Z (13 days ago)
- Last Synced: 2026-01-14T12:19:48.072Z (13 days ago)
- Topics: api, ci, coverage-testing, gymnasium, mypy, numpy, pip, pre-commit, pycharm, pygame, pylint, pyright, pytest, pytest-cov, python, reinforcement-learning-environments, ruff, toml, xenon
- Language: Python
- Homepage:
- Size: 1.04 MB
- Stars: 1
- Watchers: 0
- Forks: 1
- Open Issues: 45
Metadata Files:
- Readme: README.md
- License: LICENSE
- Code of conduct: CODE_OF_CONDUCT.md
README
# Pickomino-Env
## Description
An environment conforming to the **Gymnasium** API for the dice game **Pickomino (Heckmeck am Bratwurmeck)**.
Goal: train a Reinforcement Learning agent for optimal play, that is, to decide which face of the dice to collect,
when to roll and when to stop.
## Action Space
The action is a tuple of two integers, `(int, int)`:
Action = [dice_face (0-5), action_type (0=roll, 1=stop)].
- 0-5: Face of the dice that you want to take, where:
  - 0 -> face 1
  - 1 -> face 2
  - 2 -> face 3
  - 3 -> face 4
  - 4 -> face 5
  - 5 -> face worm
- 0-1: Roll (0) or stop (1).
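The mapping above can be sketched as a small decoder. This is only an illustration: `FACE_LABELS` and `describe_action` are hypothetical names and not part of the package.

```python
# Hypothetical helper, not part of pickomino-env: turns an action tuple
# into a readable string using the mapping listed above.
FACE_LABELS = ("1", "2", "3", "4", "5", "worm")  # index 0-5 -> die face


def describe_action(action):
    """Return a readable description of an action (dice_face, action_type)."""
    dice_face, action_type = action
    if dice_face not in range(6) or action_type not in (0, 1):
        raise ValueError(f"action out of range: {action!r}")
    verb = "roll" if action_type == 0 else "stop"
    return f"take face {FACE_LABELS[dice_face]}, then {verb}"


print(describe_action((5, 0)))  # take face worm, then roll
print(describe_action((2, 1)))  # take face 3, then stop
```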
## Observation Space
The observation is a `dict` with four entries that describe the dice, the table and the players.
| Observation | Min | Max | Shape |
|----------------|-----|-----|-------------------|
| dice_collected | 0 | 8 | (6,) |
| dice_rolled | 0 | 8 | (6,) |
| tiles_table | 0 | 1 | (16,) |
| tile_players | 0 | 36 | number_of_players |
**Note:** There are eight dice to roll and collect. Each die has six sides: the numbers of eyes one through
five, and a worm instead of a six.
The values correspond to the number of eyes; the worm also has the value five (and not six!).
The 16 tiles are numbered 21 to 36 and have worm values from one to four, spread over four groups.
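As a rough illustration of the note above, the following sketch computes the value of a set of collected dice and the worm value of a tile. It assumes that the dice vectors hold one count per face (as the starting state suggests) and that the four tile groups are 21-24, 25-28, 29-32 and 33-36, as in the standard game; check the rulebook to confirm. `FACE_VALUES`, `dice_score` and `tile_worms` are hypothetical names, not part of the package.

```python
# Hypothetical helpers, not part of pickomino-env.
FACE_VALUES = (1, 2, 3, 4, 5, 5)  # faces 1-5 score their eyes; the worm scores 5


def dice_score(dice_collected):
    """Sum of eyes for a length-6 vector of per-face dice counts (assumed layout)."""
    return sum(count * value for count, value in zip(dice_collected, FACE_VALUES))


def tile_worms(tile_number):
    """Worm value of a tile, assuming groups 21-24, 25-28, 29-32, 33-36."""
    if not 21 <= tile_number <= 36:
        raise ValueError(f"no such tile: {tile_number}")
    return (tile_number - 21) // 4 + 1


collected = [0, 0, 1, 2, 0, 3]  # one 3, two 4s, three worms
print(dice_score(collected))    # 3 + 8 + 15 = 26
print(tile_worms(26))           # 2
```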
The game is for two to seven players. Here your Reinforcement Learning Agent is the first player, and the
other players are computer bots.
The bots play according to a heuristic. When you create the environment,
you have to define the number of bots.
For a more detailed description of the rules, see the file pickomino-rulebook.pdf.
You can play the game online here: https://www.maartenpoirot.com/pickomino/.
The heuristic used by the bots is described here: https://frozenfractal.com/blog/2015/5/3/how-to-win-at-pickomino/.
## Rewards
The goal is to collect tiles in a stack. The winner is the player who, at the end of the game, has the most worms
on her tiles. For the Reinforcement Learning Agent, a reward equal to the value
(worms) of a tile is given when the tile is picked. For a failed attempt
(see rulebook), a corresponding negative reward is given. When a bot steals your
tile, no negative reward is given. Hence, the total reward at the end of the game
can be greater than the score.
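A small worked example may make the difference between total reward and score clearer. The event log below is invented for illustration and does not come from the environment. It covers only picked and stolen tiles, not failed attempts.

```python
# Hypothetical event log for one game: (event, worms on the tile involved).
events = [
    ("picked", 2),  # agent takes a 2-worm tile: reward +2
    ("picked", 3),  # agent takes a 3-worm tile: reward +3
    ("stolen", 3),  # a bot steals the 3-worm tile: no negative reward
]

total_reward = 0
score = 0
for event, worms in events:
    if event == "picked":
        total_reward += worms
        score += worms
    elif event == "stolen":
        score -= worms  # the tile leaves the stack, but the reward is untouched

print(total_reward, score)  # 5 2
```

Here the agent ends with a total reward of 5 but a score of only 2, because the stolen tile lowered the score without lowering the reward.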
## Starting State
* `dice_collected` = [0, 0, 0, 0, 0, 0].
* `dice_rolled` = [3, 0, 1, 2, 0, 2] (example of a random roll; the counts sum to 8).
* `tiles_table` = [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1].
* `tile_players` = [0, 0, 0] (with number_of_bots = 2).
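The starting state can be checked against the bounds in the Observation Space table with a sketch like the one below. It assumes the observation `dict` uses the names from the table as keys. `observation_bounds` and `check_observation` are hypothetical helpers, not part of the package.

```python
# Hypothetical helpers, not part of pickomino-env.
def observation_bounds(number_of_players):
    """Bounds from the Observation Space table: name -> (min, max, length)."""
    return {
        "dice_collected": (0, 8, 6),
        "dice_rolled": (0, 8, 6),
        "tiles_table": (0, 1, 16),
        "tile_players": (0, 36, number_of_players),
    }


def check_observation(obs, number_of_players):
    """Return a list of problems found; an empty list means obs looks valid."""
    problems = []
    for name, (low, high, length) in observation_bounds(number_of_players).items():
        values = list(obs[name])
        if len(values) != length:
            problems.append(f"{name}: expected length {length}, got {len(values)}")
        if any(not low <= v <= high for v in values):
            problems.append(f"{name}: value outside [{low}, {high}]")
    return problems


start = {
    "dice_collected": [0, 0, 0, 0, 0, 0],
    "dice_rolled": [3, 0, 1, 2, 0, 2],
    "tiles_table": [1] * 16,
    "tile_players": [0, 0, 0],  # agent plus number_of_bots = 2
}
print(check_observation(start, number_of_players=3))  # []
print(sum(start["dice_rolled"]))                      # 8
```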
## Episode End
The episode ends if one of the following occurs:
1. Termination: There are no more tiles to take on the table (game over).
2. Termination: The action is outside the allowed range [0-5, 0-1].
### Truncation
Truncation: An attempt to break the rules. The game continues, and you have to give a new valid action.
### Failed Attempt
A Failed Attempt means: if a tile is present, it is put back on the table and you get a negative reward.
However, the game continues, so the episode does not end.
## Arguments
`number_of_bots` has no default and must be specified; `render_mode` is optional.
| Parameter        | Type        | Default | Description                                                                        |
|------------------|-------------|---------|------------------------------------------------------------------------------------|
| `number_of_bots` | int         | --      | Number of bot opponents (1-6) you want to play against                             |
| `render_mode`    | str or None | None    | Visualization mode: None (training), "human" (display), or "rgb_array" (recording) |
## Setup
`pip install pickomino-env`
## Usage example
```python
import gymnasium as gym
# Create environment
env = gym.make("Pickomino-v0", render_mode="human", number_of_bots=2)
# Reset and get initial observation
obs, info = env.reset(seed=42)
# Run one episode
terminated = False
truncated = False
total_reward = 0
while not terminated and not truncated:
    # Agent selects action: (dice_face, roll_choice)
    action = env.action_space.sample()  # Random action for demo
    # Step environment
    obs, reward, terminated, truncated, info = env.step(action)
    total_reward += reward
    if truncated:
        print(f"Invalid action: {info['explanation']}")
        break
print(f"Episode finished. Total reward: {total_reward}")
env.close()
```
## Resources
- **Game Rules:** [Pickomino Rulebook](https://github.com/smallgig/Pickomino/blob/main/pickomino-rulebook.pdf)
- **Play Online:** [Maarten Poirot's Pickomino](https://www.maartenpoirot.com/pickomino/)
- **Bot Strategy:** [How to Win at Pickomino](https://frozenfractal.com/blog/2015/5/3/how-to-win-at-pickomino/)
- **Repository:** [smallgig/Pickomino](https://github.com/smallgig/Pickomino)
- **Gymnasium:** [https://gymnasium.farama.org/](https://gymnasium.farama.org/)
## License
MIT License. See [LICENSE](LICENSE) for details.