https://github.com/feifeibear/colobloom
- Host: GitHub
- URL: https://github.com/feifeibear/colobloom
- Owner: feifeibear
- License: apache-2.0
- Created: 2022-10-28T08:37:09.000Z (over 3 years ago)
- Default Branch: main
- Last Pushed: 2022-12-12T02:55:45.000Z (over 3 years ago)
- Last Synced: 2023-02-26T03:52:37.052Z (over 3 years ago)
- Language: Python
- Size: 153 KB
- Stars: 4
- Watchers: 2
- Forks: 2
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# ColossalAI Implementation for BLOOM Inference
Under development.
This repo aims to support BLOOM inference with optimizations such as tensor parallelism and int8 quantization, built on ColossalAI.
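For readers new to tensor parallelism, here is a minimal sketch of the idea for a single linear layer. It is not code from this repo or from ColossalAI: the helper names (`matmul`, `split_columns`, `column_parallel_linear`) are hypothetical, and the "ranks" are simulated with plain Python lists rather than real devices. The weight matrix is split column-wise, each shard computes its slice of the output, and the slices are concatenated, which is the role an all-gather would play across GPUs.

```python
# Illustrative sketch only: NOT this repo's or ColossalAI's implementation.
# Column-wise tensor parallelism for y = x @ W, simulated on one process.

def matmul(x, w):
    """Multiply a (rows x inner) matrix by an (inner x cols) matrix."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*w)] for row in x]

def split_columns(w, num_shards):
    """Split weight matrix w into num_shards equal column blocks."""
    cols = len(w[0])
    assert cols % num_shards == 0, "columns must divide evenly across shards"
    step = cols // num_shards
    return [[row[i:i + step] for row in w] for i in range(0, cols, step)]

def column_parallel_linear(x, w, num_shards):
    """Each simulated rank computes x @ W_shard; results are concatenated per row."""
    partials = [matmul(x, shard) for shard in split_columns(w, num_shards)]
    return [sum((p[r] for p in partials), []) for r in range(len(x))]

x = [[1.0, 2.0], [3.0, 4.0]]                      # batch of 2 inputs, hidden size 2
w = [[1.0, 0.0, 2.0, 1.0], [0.0, 1.0, 1.0, 3.0]]  # 2 x 4 weight matrix

full = matmul(x, w)                                # single-device reference
sharded = column_parallel_linear(x, w, num_shards=2)
print(sharded == full)
```

The sharded result matches the single-device result because each output column depends on only one column of `W`, so the shards need no communication until the final concatenation.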
# Fast Inference Solutions for BLOOM
This repo provides demos and packages for fast BLOOM inference. Some solutions live in their own repos, in which case a link to the corresponding repo is provided instead.
Some solutions offer both half-precision and int8-quantized variants.
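As a rough illustration of what int8 quantization does to weights, the sketch below uses symmetric absmax scaling in plain Python. This is an assumption chosen for clarity, not necessarily the scheme used by any of the solutions listed here, and the function names `quantize_absmax` and `dequantize` are hypothetical.

```python
# Illustrative sketch only: symmetric absmax int8 quantization.
# NOT taken from this repo or from any linked solution.

def quantize_absmax(values):
    """Map floats to integers in [-127, 127] using one scale for the whole list."""
    absmax = max(abs(v) for v in values)
    scale = absmax / 127 if absmax > 0 else 1.0
    q = [max(-127, min(127, round(v / scale))) for v in values]
    return q, scale

def dequantize(q, scale):
    """Recover approximate float values from int8 codes and their scale."""
    return [v * scale for v in q]

weights = [0.5, -1.0, 0.25, 0.0]
q, scale = quantize_absmax(weights)
restored = dequantize(q, scale)

# Rounding to the nearest code bounds the error by about half a quantization step.
max_error = max(abs(a - b) for a, b in zip(weights, restored))
print(q, max_error <= scale / 2 + 1e-9)
```

Each weight is stored in one byte instead of two (half precision), at the cost of a small rounding error per value.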
## Client-side solutions
Solutions developed to perform large batch inference locally:
PyTorch:
* [Accelerate, DeepSpeed-Inference and DeepSpeed-ZeRO](./bloom-inference-scripts)
* [Custom HF Code](https://github.com/huggingface/transformers_bloom_parallel/)
JAX:
* [BLOOM Inference in JAX](https://github.com/huggingface/bloom-jax-inference)
## Server solutions
Solutions developed to be used in a server mode (i.e. varied batch size, varied request rate):
PyTorch:
* [Accelerate and DeepSpeed-Inference based solutions](./bloom-inference-server)
Rust:
* [Bloom-server](https://github.com/Narsil/bloomserver)