Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
⛰️ Gym environment solving the MountainCar problem using Continuous Q-Learning.
- Host: GitHub
- URL: https://github.com/jlenon7/mountaincar_gym_env
- Owner: jlenon7
- License: MIT
- Created: 2024-06-13T11:46:53.000Z (6 months ago)
- Default Branch: main
- Last Pushed: 2024-06-23T08:53:25.000Z (6 months ago)
- Last Synced: 2024-12-06T07:46:56.195Z (17 days ago)
- Language: Python
- Homepage:
- Size: 785 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# OpenAI Mountain Car Gym Environment ⛰️
> Gym environment solving the MountainCar problem using Continuous Q-Learning.
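The algorithm itself isn't shown inline here. As a minimal sketch of the usual formulation, Q-learning over MountainCar's continuous (position, velocity) observations discretized into a table, assuming the Gymnasium API (the bin count and hyperparameters are illustrative, not taken from this repo):

```python
import numpy as np
import gymnasium as gym  # the repo may use classic `gym`; the API is similar

env = gym.make("MountainCar-v0")

# Discretize the continuous (position, velocity) observation into a grid so
# a tabular Q-function can be used. The bin count is illustrative.
BINS = 20
obs_low, obs_high = env.observation_space.low, env.observation_space.high
bin_width = (obs_high - obs_low) / BINS

def discretize(obs):
    idx = ((obs - obs_low) / bin_width).astype(int)
    return tuple(np.clip(idx, 0, BINS - 1))

q_table = np.random.uniform(-1, 0, size=(BINS, BINS, env.action_space.n))
alpha, gamma, epsilon = 0.1, 0.95, 0.1  # illustrative hyperparameters

for episode in range(5000):
    obs, _ = env.reset()
    state = discretize(obs)
    done = False
    while not done:
        # Epsilon-greedy action selection over the table row for this state.
        if np.random.random() < epsilon:
            action = env.action_space.sample()
        else:
            action = int(np.argmax(q_table[state]))
        obs, reward, terminated, truncated, _ = env.step(action)
        next_state = discretize(obs)
        # Standard Q-learning update; no bootstrap past a terminal state.
        target = reward if terminated else reward + gamma * np.max(q_table[next_state])
        q_table[state + (action,)] += alpha * (target - q_table[state + (action,)])
        state = next_state
        done = terminated or truncated
```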
## Results
### Simple agent results
### Continuous Q-Learning agent results
### Epoch points log tracker
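The result plots themselves aren't reproduced in this indexed view. As a purely hypothetical illustration of what a per-epoch points tracker might record (none of these names come from the repo):

```python
# Hypothetical per-epoch reward tracking; the repo's actual logger may differ.
import numpy as np

epoch_points = []  # total reward ("points") collected in each epoch

def track(epoch, total_reward, window=100):
    epoch_points.append(total_reward)
    # Every `window` epochs, summarize the recent scores.
    if epoch > 0 and epoch % window == 0:
        recent = epoch_points[-window:]
        print(f"epoch {epoch}: min={np.min(recent):.1f} "
              f"avg={np.mean(recent):.1f} max={np.max(recent):.1f}")
```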
## TODOs
- [x] Create simple agent
- [x] Train the agent to land exactly on the flag
- [ ] Save the model to be reused with a library like [stable-baselines](https://stable-baselines3.readthedocs.io/en/master/) (sketched below)
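For that open item, saving and reloading a model with stable-baselines3 would look roughly like this (DQN is just one algorithm that fits MountainCar's discrete actions, not necessarily what the project will use; the file name is hypothetical):

```python
from stable_baselines3 import DQN

# Train a DQN agent on MountainCar and persist it to disk.
model = DQN("MlpPolicy", "MountainCar-v0", verbose=1)
model.learn(total_timesteps=100_000)
model.save("mountaincar_dqn")  # hypothetical file name

# Later: reload the saved model without retraining.
model = DQN.load("mountaincar_dqn")
```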
## Running

To run the gym environment, first create a new Python environment and activate it. I'm using [Anaconda](https://www.anaconda.com/) to set the Python version that pipenv should use to set up the environment. The command below will automatically set up the environment with conda and pipenv:
```shell
make env
```

Now install all the project dependencies:
```shell
make install-all
```

To play the game as a human, run:
```shell
make play
```
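What `make play` wires up isn't shown here; a minimal sketch of keyboard play, assuming Gymnasium's `play` utility and an illustrative `a`/`d` key mapping:

```python
import gymnasium as gym
from gymnasium.utils.play import play

# Map keys to MountainCar actions: 0 = push left, 2 = push right.
# Unmapped keys fall back to the no-op action (1).
play(
    gym.make("MountainCar-v0", render_mode="rgb_array"),
    keys_to_action={"a": 0, "d": 2},
    noop=1,
)
```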
To run the game with random actions, run:

```shell
make sample
```
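A random-action rollout is the standard environment loop. A minimal sketch, assuming the Gymnasium API (the repo's `make sample` target may differ):

```python
import gymnasium as gym

env = gym.make("MountainCar-v0", render_mode="human")
obs, info = env.reset()
done = False
while not done:
    action = env.action_space.sample()  # uniformly random action
    obs, reward, terminated, truncated, info = env.step(action)
    done = terminated or truncated
env.close()
```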
To run the agent to complete the game task, run:

```shell
make agent
```

To run the simple agent, run:
```shell
make simple-agent
```

> [!WARNING]
> Keep in mind that the simple agent does not use rewards to choose its
> actions. Instead, it just observes the car's position and velocity and
> uses gravity to accomplish the task of climbing the hill. With this
> approach, we are not completing the secondary task of the problem,
> which is landing exactly at the flag.
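A minimal sketch of that kind of heuristic, assuming the Gymnasium API (not the repo's actual code): push in the direction the car is already moving, so gravity amplifies each swing. In this minimal form only the velocity is needed.

```python
import gymnasium as gym

env = gym.make("MountainCar-v0", render_mode="human")
obs, _ = env.reset()
done = False
while not done:
    position, velocity = obs
    # Push in the direction of the current velocity to build momentum:
    # 0 = push left, 2 = push right. No reward signal is consulted.
    action = 2 if velocity > 0 else 0
    obs, reward, terminated, truncated, _ = env.step(action)
    done = terminated or truncated
env.close()
```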