Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/matamegger/reinforced-pid-parameter
- Host: GitHub
- URL: https://github.com/matamegger/reinforced-pid-parameter
- Owner: matamegger
- License: mit
- Created: 2021-11-29T18:28:34.000Z (about 3 years ago)
- Default Branch: master
- Last Pushed: 2021-12-10T16:12:02.000Z (about 3 years ago)
- Last Synced: 2023-08-08T22:25:38.880Z (over 1 year ago)
- Language: Python
- Size: 1.14 MB
- Stars: 0
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# Reinforcement Learning to Find PID Parameters
This project is a basic showcase of using Python bindings for Simulink model binaries in computationally demanding tasks.
A Simulink model of a Spring-Mass-Damper system controlled by a PID controller forms the basis of the showcase. Using [SliM-PyB](https://github.com/matamegger/slim-pyb), the model is converted to native binaries and the corresponding Python bindings.
The model is then applied in a reinforcement learning environment to find the optimal controller parameters for specific scenarios.
## Reinforcement Learning Environment
The setup of the ML environment should not be taken as a best-practice guide. It was created with limited knowledge and no prior experience.
In the environment, the agent takes only a single action per episode (i.e. static PID parameters) before the simulation runs to completion. The action consists of the three control gains of the PID controller (Kp, Ki, Kd).
Using the provided parameters, a 50-second simulation of the controlled Spring-Mass-Damper system is executed.
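The actual dynamics live in the compiled Simulink model, but the control loop it implements can be sketched in plain Python. All parameter values, the integration scheme, and the function name below are illustrative assumptions, not the model's actual ones:

```python
def simulate_smd_pid(kp, ki, kd, reference, dt=0.01, duration=50.0,
                     m=1.0, c=0.5, k=2.0):
    """Euler-integrate a spring-mass-damper driven by a PID controller.

    reference: callable t -> desired position.
    Returns the per-step tracking errors of the episode.
    """
    x, v = 0.0, 0.0                  # mass position and velocity
    integral, prev_err = 0.0, 0.0    # PID state
    errors = []
    for n in range(int(duration / dt)):
        t = n * dt
        err = reference(t) - x
        integral += err * dt
        deriv = (err - prev_err) / dt
        prev_err = err
        # PID control force from the three gains the agent chooses
        force = kp * err + ki * integral + kd * deriv
        # spring-mass-damper dynamics: m*a = F - c*v - k*x
        a = (force - c * v - k * x) / m
        v += a * dt
        x += v * dt
        errors.append(err)
    return errors
```

With reasonable gains the tracking error of a step reference decays toward zero over the 50-second episode, which is exactly the behavior the reward below scores.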
After the simulation, the inverse of the normalized squared error is used as the reward function of the agent.

![](pictures/reward_function.png)

_(Where `e` is the error [the difference between input and output signal] in simulation step `i`, and `T` is the total number of steps)_

The inverse was needed because otherwise the reward would sum to very large numbers, which the RL library could not handle.
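A minimal sketch of such a reward, assuming the formula in `pictures/reward_function.png` is the reciprocal of the mean squared error (the function name and the zero-error guard are illustrative):

```python
def episode_reward(errors):
    """Reciprocal of the mean squared tracking error over one episode.

    Taking the inverse keeps the reward bounded for large errors,
    instead of letting the summed squares grow without limit.
    """
    mse = sum(e * e for e in errors) / len(errors)
    # Guard against a (practically impossible) perfect, zero-error run.
    return 1.0 / mse if mse > 0 else float("inf")
```

Smaller tracking errors thus yield a larger reward, so the agent is pushed toward gains that keep the system close to the reference signal.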
## Trained Model
The model in the project is trained for 150k timesteps on mixed control environments (a `step` and a `sinus` input signal).
## Prerequisites
Python 3 as well as [`pipenv`](https://pypi.org/project/pipenv/) must be installed. The remaining dependencies are handled automatically via the `Pipfile`s.
## Usage
_The pre-trained model `PID-Parameter-Model` will always be loaded if it exists at the file path. Otherwise a new model is created._

```
python3 main.py
```
Uses the model to get the PID parameters and plots a step response of the system.

Providing `sin` as an argument (`python3 main.py sin`) will use a sine input function instead of the step.
To also train the model before showing a system response, `-t` must be provided as a command-line argument.
Training always runs for 25k timesteps and will overwrite the model on disk.