https://github.com/kemingy/batching
Dynamic Batching for Deep Learning Serving
- Host: GitHub
- URL: https://github.com/kemingy/batching
- Owner: kemingy
- License: apache-2.0
- Archived: true
- Created: 2020-04-21T02:08:59.000Z (about 6 years ago)
- Default Branch: master
- Last Pushed: 2021-10-06T12:48:51.000Z (over 4 years ago)
- Last Synced: 2025-11-22T13:05:46.844Z (7 months ago)
- Topics: batch-queue-manager, deep-learning, deep-learning-serving, dynamic-batch, job-queue
- Language: Go
- Homepage: https://pkg.go.dev/github.com/kemingy/batching
- Size: 52.7 KB
- Stars: 9
- Watchers: 1
- Forks: 3
- Open Issues: 1
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# Dynamic Batching for Deep Learning Serving
[Ventu](https://github.com/kemingy/ventu) already implements this protocol, so it can be used as the worker for deep learning inference.
## Attention
**This project is only a proof of concept. See [MOSEC](https://github.com/mosecorg/mosec) for production use.**
## Features
* dynamic batching bounded by a maximum batch size and a maximum latency
* an invalid request won't affect others in the same batch
* communicates with workers through a Unix domain socket or TCP
* load balancing
If you are interested in the design, check my blog [Deep Learning Serving Framework](https://kemingy.github.io/blogs/deep-learning-serving/).
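The first feature, batching limited by both a batch size and a latency, can be sketched in Go as follows. This is a minimal illustration and not the project's actual code: the `Job` type and the `collectBatch` function are hypothetical names, and the sketch assumes the latency window starts when the first job of a batch arrives.

```go
package main

import (
	"fmt"
	"time"
)

// Job is a hypothetical unit of work; the real project's job type may differ.
type Job struct {
	ID int
}

// collectBatch blocks until one job arrives, then keeps reading from jobs
// until maxBatch jobs are gathered or maxLatency has elapsed since the
// first job was received. It returns nil once the queue is closed and empty.
func collectBatch(jobs <-chan Job, maxBatch int, maxLatency time.Duration) []Job {
	first, ok := <-jobs
	if !ok {
		return nil
	}
	batch := []Job{first}

	// Assumption: the latency budget starts with the first job of the batch.
	timer := time.NewTimer(maxLatency)
	defer timer.Stop()

	for len(batch) < maxBatch {
		select {
		case job, ok := <-jobs:
			if !ok {
				return batch // queue closed: flush what we have
			}
			batch = append(batch, job)
		case <-timer.C:
			return batch // latency budget spent: send a partial batch
		}
	}
	return batch // batch is full
}

func main() {
	jobs := make(chan Job, 1024) // same size as the default -capacity
	for i := 0; i < 40; i++ {
		jobs <- Job{ID: i}
	}
	close(jobs)

	for {
		batch := collectBatch(jobs, 32, 10*time.Millisecond)
		if batch == nil {
			break
		}
		fmt.Println("batch size:", len(batch))
	}
}
```

With a small latency the service sends partial batches quickly under light load, while under heavy load the batch size limit is reached first, so the two limits trade throughput against response time.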
## Configs
```shell script
go run service/app.go --help
```
```
Usage app:
  -address string
        socket file or host:port (default "batch.socket")
  -batch int
        max batch size (default 32)
  -capacity int
        max jobs in the queue (default 1024)
  -host string
        host address (default "0.0.0.0")
  -latency int
        max latency (millisecond) (default 10)
  -port int
        service port (default 8080)
  -protocol string
        unix or tcp (default "unix")
  -timeout int
        timeout for a job (millisecond) (default 5000)
```
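For readers who want to see how these options fit together, here is a rough sketch of parsing the same flags with Go's standard `flag` package. The flag names, defaults, and descriptions are copied from the help text above. The `Config` struct, the `parseConfig` function, and the check on the protocol value are assumptions made for illustration and may not match `service/app.go`.

```go
package main

import (
	"flag"
	"fmt"
)

// Config holds the options from the help text. The struct and its field
// names are hypothetical, not taken from the project's source.
type Config struct {
	Address  string
	Batch    int
	Capacity int
	Host     string
	Latency  int // milliseconds
	Port     int
	Protocol string
	Timeout  int // milliseconds
}

// parseConfig parses command-line style arguments into a Config.
func parseConfig(args []string) (Config, error) {
	var c Config
	fs := flag.NewFlagSet("app", flag.ContinueOnError)
	fs.StringVar(&c.Address, "address", "batch.socket", "socket file or host:port")
	fs.IntVar(&c.Batch, "batch", 32, "max batch size")
	fs.IntVar(&c.Capacity, "capacity", 1024, "max jobs in the queue")
	fs.StringVar(&c.Host, "host", "0.0.0.0", "host address")
	fs.IntVar(&c.Latency, "latency", 10, "max latency (millisecond)")
	fs.IntVar(&c.Port, "port", 8080, "service port")
	fs.StringVar(&c.Protocol, "protocol", "unix", "unix or tcp")
	fs.IntVar(&c.Timeout, "timeout", 5000, "timeout for a job (millisecond)")
	if err := fs.Parse(args); err != nil {
		return c, err
	}
	// Assumption: only the two documented protocol values are accepted.
	if c.Protocol != "unix" && c.Protocol != "tcp" {
		return c, fmt.Errorf("protocol must be unix or tcp, got %q", c.Protocol)
	}
	return c, nil
}

func main() {
	// Example: talk to workers over TCP with a smaller batch size.
	c, err := parseConfig([]string{"-protocol", "tcp", "-address", "127.0.0.1:9000", "-batch", "16"})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("workers: %s://%s, batch=%d, latency=%dms, timeout=%dms\n",
		c.Protocol, c.Address, c.Batch, c.Latency, c.Timeout)
}
```

Go's `flag` package accepts both `-name` and `--name`, which is why `--help` works in the command above.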
## Demo
```shell script
go run service/app.go
python examples/app.py
python examples/client.py
```