https://github.com/lmarzocchetti/inventorysystem

Inventory System with simpy, with a Reinforcement Learning Agent

python pytorch reinforcement-learning simpy simulation


Host: GitHub
URL: https://github.com/lmarzocchetti/inventorysystem
Owner: lmarzocchetti
License: mit
Created: 2024-10-02T15:19:14.000Z (over 1 year ago)
Default Branch: main
Last Pushed: 2025-05-27T11:23:40.000Z (about 1 year ago)
Last Synced: 2025-05-29T00:42:47.006Z (about 1 year ago)
Topics: python, pytorch, reinforcement-learning, simpy, simulation
Language: Jupyter Notebook
Homepage:
Size: 121 KB
Stars: 0
Watchers: 1
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE


# InventorySystem
Inventory System with simpy, with Reinforcement Learning Agent (DQN and REINFORCE)

## Problem Description
A warehouse wants to stock and sell two different products.
The simulation's parameters are:
- demand_inter_arrival_mean_time = 0.1,
- order_setup_cost = 10,
- order_incremental_cost = 3,
- holding_cost = 1,
- shortage_cost = 7
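
As a rough illustration of how these cost parameters might combine, the sketch below assumes the textbook inventory cost model: an order of `quantity > 0` units costs the setup cost plus the incremental cost per unit, holding cost is charged per unit in stock, and shortage cost is charged per unit of backlog. That cost structure and the function names are assumptions for illustration; the repository may compute costs differently.

```python
# Values taken from the parameter list above.
ORDER_SETUP_COST = 10
ORDER_INCREMENTAL_COST = 3
HOLDING_COST = 1
SHORTAGE_COST = 7


def ordering_cost(quantity: int) -> float:
    """Cost of placing one order (assumed form: setup + incremental * quantity)."""
    if quantity <= 0:
        return 0.0  # no order placed, so no setup cost either
    return float(ORDER_SETUP_COST + ORDER_INCREMENTAL_COST * quantity)


def inventory_cost_rate(level: int) -> float:
    """Cost per unit of time for a given inventory level.

    A positive level is stock on hand (holding cost); a negative level is
    unmet demand (shortage cost). This split is an assumption.
    """
    return float(HOLDING_COST * max(level, 0) + SHORTAGE_COST * max(-level, 0))
```

Under these assumptions, ordering 5 units costs 10 + 3 * 5 = 25, and a backlog of 2 units costs 14 per unit of time.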

First product's parameters are:
- name = "First Product",
- demand_distribution = ([1, 2, 3, 4], [1/6, 1/3, 1/3, 1/6]),
- lead_time_min = 0.5,
- lead_time_max = 1.0

Second product's parameters are:
- name = "Second Product",
- demand_distribution = ([5, 4, 3, 2], [1/8, 1/2, 1/4, 1/8]),
- lead_time_min = 0.2,
- lead_time_max = 0.7
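
The product parameters above could be represented and sampled roughly as follows. This is a minimal sketch using only the standard library: the `Product` class and its methods are hypothetical, and it assumes that lead times are uniform between the minimum and maximum and that demand inter-arrival times are exponential with the given mean. Neither distribution is stated in this README.

```python
import random
from dataclasses import dataclass

DEMAND_INTER_ARRIVAL_MEAN_TIME = 0.1


@dataclass(frozen=True)
class Product:
    """Hypothetical container for one product's parameters."""
    name: str
    demand_sizes: tuple
    demand_probs: tuple
    lead_time_min: float
    lead_time_max: float

    def sample_demand(self, rng: random.Random) -> int:
        # Draw one demand size according to the discrete distribution.
        return rng.choices(self.demand_sizes, weights=self.demand_probs, k=1)[0]

    def sample_lead_time(self, rng: random.Random) -> float:
        # Assumption: delivery lead time is uniform on [min, max].
        return rng.uniform(self.lead_time_min, self.lead_time_max)

    def expected_demand(self) -> float:
        return sum(s * p for s, p in zip(self.demand_sizes, self.demand_probs))


def sample_inter_arrival(rng: random.Random) -> float:
    # Assumption: exponential inter-arrival times with the given mean.
    return rng.expovariate(1.0 / DEMAND_INTER_ARRIVAL_MEAN_TIME)


FIRST = Product("First Product", (1, 2, 3, 4), (1/6, 1/3, 1/3, 1/6), 0.5, 1.0)
SECOND = Product("Second Product", (5, 4, 3, 2), (1/8, 1/2, 1/4, 1/8), 0.2, 0.7)
```

With these distributions, the mean demand size is 2.5 units for the first product and 3.625 units for the second.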

## Choosing actions
To control the simulation, the warehouse must decide every day whether to place
an order and, if so, how many units of each product to order.

### S-min S-max Policy
A simple policy: if the stocked quantity of a product is less than S-min,
order enough units to bring it back up to S-max.
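
The rule can be sketched as a small function. The name `smin_smax_order` and the threshold values in the usage note are hypothetical and are not taken from the repository.

```python
def smin_smax_order(stock_level: int, s_min: int, s_max: int) -> int:
    """Return the quantity to order for one product under the S-min S-max rule."""
    if stock_level < s_min:
        # Order up to S-max; a negative stock level (backlog) increases the order.
        return s_max - stock_level
    return 0  # enough stock, no order today
```

For example, with the illustrative thresholds `s_min=20` and `s_max=60`, a stock level of 10 triggers an order of 50 units, while a stock level of 20 triggers no order.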

### Reinforcement Learning
I have implemented from scratch these two methods:

#### DQN
Classic Deep Q-Network: a value-based approach trained with temporal-difference learning and a replay buffer.
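
The two ingredients named here, a replay buffer and a temporal-difference target, can be sketched in plain Python without a neural network. This is a generic illustration of the technique, not the repository's implementation. The class and function names, the default `gamma`, and the idea of using the negative cost as the reward are all assumptions.

```python
import random
from collections import deque, namedtuple

Transition = namedtuple("Transition", "state action reward next_state done")


class ReplayBuffer:
    """Fixed-size buffer; the oldest transitions are dropped when it is full."""

    def __init__(self, capacity: int):
        self.memory = deque(maxlen=capacity)

    def push(self, state, action, reward, next_state, done):
        self.memory.append(Transition(state, action, reward, next_state, done))

    def sample(self, batch_size: int):
        # Uniform random minibatch, which breaks the time correlation of samples.
        return random.sample(list(self.memory), batch_size)

    def __len__(self):
        return len(self.memory)


def td_target(reward: float, next_q_values, done: bool, gamma: float = 0.99) -> float:
    """One-step TD target: r + gamma * max_a' Q(s', a'), or just r at episode end."""
    if done:
        return reward
    return reward + gamma * max(next_q_values)
```

In a full DQN, the network's prediction `Q(s, a)` would be regressed toward `td_target(...)` for each sampled transition. Since the goal is to minimize cost, one natural choice (an assumption here) is to use the negative of the cost incurred in a step as the reward.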

#### REINFORCE
Another classic algorithm: a policy-based approach that uses Monte Carlo returns.
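
The Monte Carlo part means that the policy is updated only after a full episode, using the discounted return observed from each step onward. The sketch below shows that computation and the resulting loss in plain Python. It is a generic illustration, not the repository's code: the function names and the default `gamma` are assumptions, and a real implementation would use tensors so that the loss can be differentiated.

```python
def discounted_returns(rewards, gamma: float = 0.99):
    """Return G_t = r_t + gamma * r_{t+1} + gamma^2 * r_{t+2} + ... for every step t."""
    returns = []
    running = 0.0
    # Walk the episode backwards so each return reuses the one after it.
    for reward in reversed(rewards):
        running = reward + gamma * running
        returns.append(running)
    returns.reverse()
    return returns


def reinforce_loss(log_probs, returns) -> float:
    """REINFORCE objective as a loss: -sum_t log pi(a_t | s_t) * G_t."""
    return -sum(lp * g for lp, g in zip(log_probs, returns))
```

Minimizing this loss by gradient descent raises the probability of actions that were followed by high returns and lowers the probability of actions that were followed by low returns.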

## Performance
I tuned each method empirically to perform as well as it could. Then I tested the methods
to see which one is best, that is, which one minimizes the total_cost after one simulated year (lower is better):
1. DQN: 248427
2. REINFORCE: 364943
3. S-min S-max: 703995

## Usage
I used Python 3.13.3, but Python 3.11 and 3.12 should also work.
```
$ pip install -r requirements.txt
```

### Train
```
$ python main.py train {dqn,reinforce} --save_path {folder}
```

### Test
For smin_smax you don't need to specify the load path.
```
$ python main.py test {dqn,reinforce,smin_smax} [--load_path {path_to_pt_file}]
```

### Changing Hyper Parameters
For now, hyperparameters must be modified directly in the code. I may add more CLI flags to set them in the future.
