https://github.com/thomashirtz/gym-battleship
Battleship environment for reinforcement learning tasks
https://github.com/thomashirtz/gym-battleship
adversarial adversarial-machine-learning battleship battleship-environment environment gym-battleship openai openai-gym reinforcement-learning reinforcement-learning-environment
Last synced: 19 days ago
JSON representation
Battleship environment for reinforcement learning tasks
- Host: GitHub
- URL: https://github.com/thomashirtz/gym-battleship
- Owner: thomashirtz
- License: mit
- Created: 2020-10-23T16:14:02.000Z (over 4 years ago)
- Default Branch: main
- Last Pushed: 2023-04-29T15:14:25.000Z (almost 2 years ago)
- Last Synced: 2025-03-24T04:08:43.767Z (about 1 month ago)
- Topics: adversarial, adversarial-machine-learning, battleship, battleship-environment, environment, gym-battleship, openai, openai-gym, reinforcement-learning, reinforcement-learning-environment
- Language: Python
- Homepage:
- Size: 314 KB
- Stars: 13
- Watchers: 2
- Forks: 2
- Open Issues: 1
-
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
README
# gym-battleship
Battleship environment using the OpenAI environment toolkit.![]()
## Battleship Environment
### BasicsMake and initialize an environment:
```python
import gym
import gym_battleship
env = gym.make('Battleship-v0')
env.reset()
```Get the action space and the observation space:
```python
ACTION_SPACE = env.action_space.n
OBSERVATION_SPACE = env.observation_space.shape[0]
```Run a random agent:
```python
for i in range(10):
env.step(env.action_space.sample())
```Observe the hidden game state:
```python
print(env.board_generated)
```
### Valid actionsThere is two way to input the action.
The first way is to input the tuple as it is:
```python
env = gym.make('battleship-v0')
env.reset()
action = (0, 0)
env.step(action)
```The second way is to input an encoded action:
```python
env = gym.make('battleship-v0')
env.reset()
action = 10
env.step(action)
```### Customize environments
The original battleship game is played on a 10x10 grid and the fleet is composed 5 ships: Carrier (occupies 5 spaces), Battleship (4), Cruiser (3), Submarine (3), and Destroyer (2).
Therefore, the default parameters are :
```python
ship_sizes = {5: 1, 4: 1, 3: 2, 2: 1}
board_size = (10, 10)
```It is possible to change the parameters when making the environment:
```python
import gym
import gym_battleshipenv = gym.make('battleship-v0', ship_sizes={4: 2, 3: 1}, board_size=(5, 5))
```The other possible parameters are the different rewards as well as the maximum step:
```python
import gym
import gym_battleshipenv = gym.make('battleship-v0', episode_steps=10, reward_dictionary={'win': 200})
```The default reward keys and values are the following:
```python
reward_dictionary = {
'win': 100,
'missed': 0,
'touched': 1,
'repeat_missed': -1,
'repeat_touched': -0.5
}
```
It is only necessary to pass to the environment the rewards that you want to edit.### Render
Two functions exist to render the environment:
```python
env.render_board_generated()
```
```python
env.render()
```
Examples of renders using on an Ipython notebook (`env.render()` on the left and `env.render_board_generated()` on the right)
Code snippet for rendering
import gym
import gym_battleshipenv = gym.make('battleship-v0')
env.reset()for i in range(10):
env.step(env.action_space.sample())
env.render()env.render_board_generated()
Unfortunately, the pretty print of dataframe in IDE or console will not be as nice as a dataframe displayed in a notebook.
## Adversarial Battleship Environment
### About
Adversarial battleship environment is an environment that aim to make a battle between two reinforcement learning agent.
This environment is inspired by the Generative Adversarial Algorithms (GANs).
One agent is the defender and has the goal to strategically place his ships, while the other has the role of the attacker,
like in the `Battleship-v0` environment.### Basics
The following script shows how to train the two agents:
```python
import gym
import gym_battleship
from collections import namedtupleenv = gym.make('AdversarialBattleship-v0')
step = namedtuple('step', ['state', 'reward', 'done', 'info'])attacker_agent = env.attacker_action_space.sample
defender_agent = env.defender_action_space.samplenum_episodes = 1
for episode in range(num_episodes):defender_old_state = env.defender_reset()
while True:
defender_action = defender_agent()
return_value = env.defender_step(*defender_action)
if return_value:
defender_step = step(*return_value)
# defender_agent.learn(defender_old_state, defender_step.reward, defender_step.state, defender_step.done)
defender_old_state = defender_step.state
else:
breakattacker_step = step(*env.attacker_initialization(), False, {})
env.render_board_generated()while not attacker_step.done:
attacker_action = attacker_agent()
attacker_step = step(*env.attacker_step(attacker_action))
# defender_agent.learn(attacker_old_state, attacker_step.reward, attacker_step.state, attacker_step.done)
attacker_old_state = attacker_step.statedefender_step = step(*env.defender_step(*defender_action))
# defender_agent.learn(defender_old_state, defender_step.reward, defender_step.state, defender_step.done)
```First the defender, place his ships, then he received a `None` value to indicate that he needs to wait the action of the
attackers. Then the attackers plays like in `Battleship-v0`. Finally, the defender get his reward and can do its last
learning step.The render and customization is identical as in the `Battleship-v0` environment.
## Todo
- Write docstring
- Create methods to let agents choose the position of the ships## Installation
The command to install the repository via pip is:
```bash
pip install git+https://github.com/thomashirtz/gym-battleship#egg=gym-battleship
```