https://github.com/yuulis/el_school

Learning how to escape from school as fast as possible.

csharp machine-learning ml mlagents python reinforcement-learning reinforcement-learning-agent reinforcement-learning-environments rl unity


Host: GitHub
URL: https://github.com/yuulis/el_school
Owner: Yuulis
License: mit
Created: 2021-09-25T04:32:49.000Z (almost 5 years ago)
Default Branch: main
Last Pushed: 2022-01-22T14:51:46.000Z (over 4 years ago)
Last Synced: 2025-06-08T09:45:26.628Z (about 1 year ago)
Topics: csharp, machine-learning, ml, mlagents, python, reinforcement-learning, reinforcement-learning-agent, reinforcement-learning-environments, rl, unity
Language: C#
Homepage:
Size: 28.1 MB
Stars: 0
Watchers: 1
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE


# EL_School

**Goal**: To learn how to escape from school as fast as possible.  

Built with [ML-Agents](https://github.com/Unity-Technologies/ml-agents), [Release 17](https://github.com/Unity-Technologies/ml-agents/tree/release_17).

## Environments

1. **School_Only1F** (Curriculum Learning)

1. **School_From2F** (Curriculum Learning), training in progress

---

## Env-1. School_Only1F

### Image

![Stage](https://user-images.githubusercontent.com/79734873/150643360-931afeef-303c-4b48-a7ca-15c2fe9c80f8.png)

### Environment

* **Exit1 ~ 3** : When the Agent touches one of them, the episode ends in success.

* **Obstacles** : When the Agent touches one of them, the episode ends in failure.

### Agent

* It spawns at a random place in the stage.

[!] A spawnable area, defined by a collider, sits above the floor. If the Agent lands on the floor without touching it, the Agent respawns.

* It uses different brains depending on where it spawns.

* It can move forward and backward and turn left and right (Discrete Action).

* It observes its surroundings with a ray sensor whose rays cover 360 degrees.
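
As a rough illustration of the discrete action space described above, the sketch below decodes a pair of discrete branch indices into movement and turn signals. The two-branch layout, the index order, and the names `MOVE`, `TURN`, and `decode_action` are all assumptions made for illustration; the project's actual Agent code is written in C# and may be organized differently.

```python
# Hypothetical layout (not taken from the project): two discrete branches,
# each with three choices.
MOVE = {0: 0.0, 1: 1.0, 2: -1.0}  # 0: stay, 1: forward, 2: backward
TURN = {0: 0.0, 1: 1.0, 2: -1.0}  # 0: none, 1: turn right, 2: turn left


def decode_action(branches):
    """Map (move_index, turn_index) to (move_signal, turn_signal)."""
    move_index, turn_index = branches
    return MOVE[move_index], TURN[turn_index]


# Example: move forward while turning left.
move_signal, turn_signal = decode_action((1, 2))
```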

### How to train

* Assign one of three different brains to the Agent depending on where it spawns.


There are two curriculum parameters: ``SpawnableAreaNum`` and ``StepReward``.

cf. ``\config\AgentManagerCurriculum.yaml``

The curriculum settings are as follows:

| SpawnableAreaNum | StepReward | Behavior used  | Threshold |
|:----------------:|:----------:|:--------------:|:---------:|
| 0.0 (B_StairSide)| -0.0002    | EL_B_StairSide | 0.6       |
| 1.0 (A_StairSide)| -0.0002    | EL_A_StairSide | 0.5       |
| 2.0 (C_StairSide)| -0.0002    | EL_C_StairSide | 0.5       |
| 3.0 (All)        | -0.00025   | One of three   | -         |

Training starts from ``SpawnableAreaNum = 0``.
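
The curriculum can be pictured as a list of lessons that advances once a tracked measure passes the lesson's threshold. The sketch below mirrors the values in the curriculum settings, but the `Lesson` class, the `next_lesson_index` helper, and the strict `>` comparison are assumptions for illustration. Which measure the thresholds refer to (for example mean reward or training progress) is set in the YAML file and is not stated here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Lesson:
    spawnable_area_num: float
    step_reward: float
    behavior: str
    threshold: Optional[float]  # None marks the final lesson


# Values copied from the curriculum settings.
LESSONS = [
    Lesson(0.0, -0.0002, "EL_B_StairSide", 0.6),
    Lesson(1.0, -0.0002, "EL_A_StairSide", 0.5),
    Lesson(2.0, -0.0002, "EL_C_StairSide", 0.5),
    Lesson(3.0, -0.00025, "One of three", None),
]


def next_lesson_index(current: int, measure: float) -> int:
    """Return the lesson index to use after observing `measure`.

    Assumption: a lesson is completed when the measure exceeds its threshold.
    """
    lesson = LESSONS[current]
    if lesson.threshold is not None and measure > lesson.threshold:
        return current + 1
    return current


# Training starts at lesson 0 (SpawnableAreaNum = 0).
index = next_lesson_index(0, measure=0.7)  # passes the 0.6 threshold
```

The final lesson has no threshold, so the helper simply stays on it.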

The max step for each Behavior is as follows:

| Behavior Name | Max Step   |
|:-------------:|:----------:|
|EL_B_StairSide | 1,000,000  |
|EL_A_StairSide | 10,000,000 |
|EL_C_StairSide | 10,000,000 |

### Rewards

* At every step, the Agent receives the ``StepReward`` set by the curriculum.

* When the Agent touches an Obstacle, it receives ``-1.0``.

* If the Agent reaches the Exit closest to where it spawned, it receives ``1.5``. If it reaches any other Exit, it receives ``0.75``.
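
To see how these terms combine, the sketch below adds up the return of a single episode. The function name `episode_return`, the outcome labels, and the assumption that the step reward is applied on every step including the last one are illustrative and not taken from the project's C# code.

```python
# Terminal rewards from the list above; the dictionary keys are hypothetical labels.
TERMINAL_REWARD = {
    "closest_exit": 1.5,
    "other_exit": 0.75,
    "obstacle": -1.0,
}


def episode_return(steps: int, step_reward: float, outcome: str) -> float:
    """Sum the per-step penalty and the terminal reward for one episode."""
    return steps * step_reward + TERMINAL_REWARD[outcome]


# 1,000 steps at StepReward = -0.0002, ending at the closest Exit:
# 1000 * -0.0002 + 1.5, which is about 1.3.
fast_escape = episode_return(1000, -0.0002, "closest_exit")

# The same episode length ending at an Obstacle gives about -1.2.
crash = episode_return(1000, -0.0002, "obstacle")
```

With a step reward this small, the terminal reward dominates, and the per-step penalty mainly separates fast escapes from slow ones.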

### Result

Here is the result video. (The video is slow because of the specs of my PC :( )

![result video](https://user-images.githubusercontent.com/79734873/147444470-8b665edb-289f-4361-b69f-fd716cac849f.mp4)  

Watch it on [My Twitter](https://twitter.com/Yuulis04/status/1475024424101621761).

Here are the graphs:

![Reward graph](/images/graph_reward.png)

![Episode length graph](/images/graph_episode-length.png)

Here is the scatter plot. Compare it with the environment map above.

![scatter plot](/images/Only1F_result_2021-12-26-16-19-30.png)

Finally, here are the results for each value.

![result values](/images/Only1F_result_value_2021-12-26-16-19-30.png)

**The Agent has a 90% chance of evacuating from 1F of this school!**
