https://github.com/llm-db/fineinfer
Deferred Continuous Batching in Resource-Efficient Large Language Model Serving (EuroMLSys 2024)
- Host: GitHub
- URL: https://github.com/llm-db/fineinfer
- Owner: llm-db
- License: mit
- Created: 2024-02-27T11:31:37.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2024-05-28T20:47:42.000Z (about 1 year ago)
- Last Synced: 2024-06-07T23:22:49.226Z (about 1 year ago)
- Topics: fine-tuning, inference, llm, lora, peft, pytorch
- Language: Python
- Homepage: https://dl.acm.org/doi/10.1145/3642970.3655835
- Size: 53.7 KB
- Stars: 9
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
FineInfer
[Paper](https://dl.acm.org/doi/10.1145/3642970.3655835)

FineInfer is a research prototype for fine-tuning and serving large language models.
FineInfer supports concurrent parameter-efficient fine-tuning and inference through the following features:
* Deferred continuous batching
* Hybrid system architecture
* Heterogeneous batching

## Get Started
[Installation and examples](https://github.com/llm-db/FineInfer/tree/main/benchmarks/fineinfer)

The current version removes some earlier features and functionality. If you need them, please download a [previous version](https://github.com/llm-db/FineInfer/releases).
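The repository documents deferred continuous batching only by name, so as a rough illustration (not FineInfer's actual implementation), the deferral idea can be sketched as a scheduler that interleaves fine-tuning iterations with inference: it runs a fine-tuning step whenever the most urgent pending inference request still has enough SLO slack, and otherwise serves a continuous batch of requests. All names here (`Request`, `DeferredScheduler`, `ft_iter_cost`) are hypothetical.

```python
# Illustrative sketch only; not taken from the FineInfer codebase.
from collections import deque
from dataclasses import dataclass


@dataclass
class Request:
    arrival: float   # arrival time (seconds)
    deadline: float  # absolute time by which the request must be served (SLO)


class DeferredScheduler:
    """Decide, each iteration, whether to run a deferred fine-tuning step
    or to serve a batch of pending inference requests."""

    def __init__(self, ft_iter_cost: float, max_batch: int = 8):
        self.queue = deque()              # pending inference requests
        self.ft_iter_cost = ft_iter_cost  # time one fine-tuning iteration takes
        self.max_batch = max_batch        # continuous-batching batch size cap

    def submit(self, req: Request) -> None:
        self.queue.append(req)

    def next_action(self, now: float):
        """Return ("finetune", []) or ("infer", batch_of_requests)."""
        if self.queue:
            # Slack of the most urgent pending request.
            slack = min(r.deadline for r in self.queue) - now
            if slack < self.ft_iter_cost:
                # Not enough slack to defer: serve a batch now.
                n = min(self.max_batch, len(self.queue))
                batch = [self.queue.popleft() for _ in range(n)]
                return ("infer", batch)
        # Queue empty, or every request can tolerate one more fine-tuning step.
        return ("finetune", [])
```

The key design point the paper's title suggests is captured by the slack check: fine-tuning work proceeds between inference batches only while no request's latency SLO is at risk.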
## Citation
```
@inproceedings{FineInfer,
  author    = {He, Yongjun and Lu, Yao and Alonso, Gustavo},
  title     = {Deferred Continuous Batching in Resource-Efficient Large Language Model Serving},
  year      = {2024},
  booktitle = {Proceedings of the 4th Workshop on Machine Learning and Systems},
  pages     = {98--106},
  series    = {EuroMLSys '24}
}
```