https://github.com/genji970/llm_pipeline_from_training_to_deploying
The pipeline consists of two stages: first, PDF data collection using Ray plus a fine-tuning (LLM training) pipeline; second, deployment on AWS behind a REST API. Open for open-source contributions.
- Host: GitHub
- URL: https://github.com/genji970/llm_pipeline_from_training_to_deploying
- Owner: genji970
- License: other
- Created: 2025-02-13T16:04:05.000Z (4 months ago)
- Default Branch: master
- Last Pushed: 2025-02-13T19:48:10.000Z (4 months ago)
- Last Synced: 2025-02-13T20:33:37.878Z (4 months ago)
- Topics: api, contributions-welcome, finetuning, good-first-issue, llm
- Language: Python
- Homepage:
- Size: 65.4 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 1
Metadata Files:
- Readme: README.md
- License: LICENSE
README
## how to run ##
1) `git clone` the repo (you need the `llm_project_train` folder and the Dockerfile)
2) `pip install -r llm_project_train/master/requirements.txt`
3) `python -m llm_project_train.master.main True/False`
4) `git clone` the `service` folder
5) Move the saved model from `llm_project_train` into the `service` folder
6) `pip install -r api_for_service/requirements.txt`
7) `python -m main`
8) The API is now running
9) Copy the URL and append `/docs`; you can then test the chat system at `http://127.0.0.1:8000/docs`. Or, more simply:
1) `docker pull ghcr.io/genji970/api:latest`
2) `docker run -d -p 8000:8000 --name api_container ghcr.io/genji970/api_image:latest`
3) Open `http://:8000` (or use the Python snippet below for a quick programmatic check)
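Once the service is running (locally or in the container), you can also poke it from Python. A minimal sketch, assuming the service exposes a POST chat endpoint; the `/chat` route and the `prompt` field are placeholders, so check `api_for_service` for the actual route and request schema:

```python
# Minimal smoke test for the running API (sketch, not the repo's code).
# Assumption: a POST chat endpoint exists; "/chat" and "prompt" are placeholders.
import requests

BASE_URL = "http://127.0.0.1:8000"

def ask(prompt: str) -> str:
    """Send a prompt to the (assumed) chat endpoint and return the raw response body."""
    resp = requests.post(f"{BASE_URL}/chat", json={"prompt": prompt}, timeout=30)
    resp.raise_for_status()
    return resp.text

if __name__ == "__main__":
    print(ask("Hello, what can you do?"))
```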
## Detail ##

This repo consists of two parts: the `llm_project_train` folder plus a Dockerfile, and the `api_for_service` folder. (I merged two different projects into one.)
The training pipeline flows `Data_generating` -> `model_build` -> `master`.
In the `Data_generating` folder, the train dataset is built and saved as CSV.
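The rough idea of this stage, combined with the Ray-based PDF collection mentioned in the project description, can be sketched as follows. This is only an illustration, not the repo's actual code; the `data/` paths, the `pypdf` parser, and the single `text` column are assumptions:

```python
# Sketch of the Data_generating stage: extract text from PDFs in parallel
# with Ray and save the result as a CSV train dataset.
# Assumptions: pypdf for parsing, a local data/pdfs directory, a "text" column.
import glob

import pandas as pd
import ray
from pypdf import PdfReader

@ray.remote
def extract_text(pdf_path: str) -> str:
    """Read one PDF and return its concatenated page text."""
    reader = PdfReader(pdf_path)
    return "\n".join(page.extract_text() or "" for page in reader.pages)

if __name__ == "__main__":
    ray.init()
    pdf_paths = glob.glob("data/pdfs/*.pdf")
    texts = ray.get([extract_text.remote(p) for p in pdf_paths])
    pd.DataFrame({"text": texts}).to_csv("data/train_dataset.csv", index=False)
```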
In the `model_build` folder, GPT-2 is loaded from Hugging Face and fine-tuned. After fine-tuning, the weights and the whole model structure are written out as a saved model folder, which is sketched below. You have to move this saved model folder into the service project (the `service` folder).
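A rough illustration of what this stage does, assuming a standard `transformers` fine-tuning loop; the CSV path, the `text` column, and the `saved_model` directory name are assumptions, not the repo's exact code:

```python
# Sketch of the model_build stage: load GPT-2 from Hugging Face, fine-tune it
# on the CSV dataset, then save weights + config as the "saved model" folder.
from datasets import load_dataset
from transformers import (
    DataCollatorForLanguageModeling,
    GPT2LMHeadModel,
    GPT2TokenizerFast,
    Trainer,
    TrainingArguments,
)

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token  # GPT-2 has no pad token by default
model = GPT2LMHeadModel.from_pretrained("gpt2")

# Assumed dataset location and column name.
dataset = load_dataset("csv", data_files="data/train_dataset.csv")["train"]
dataset = dataset.map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=512),
    batched=True,
    remove_columns=dataset.column_names,
)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="checkpoints", num_train_epochs=1),
    train_dataset=dataset,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()

# This folder is what gets moved into the service project.
model.save_pretrained("saved_model")
tokenizer.save_pretrained("saved_model")
```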
When you run the service project, the REST API starts.
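In spirit, the service looks something like the sketch below. This is an illustration assuming a `saved_model` directory and a `/chat` route, not the actual `api_for_service` code:

```python
# Sketch of the service part: load the saved model and expose it over a REST
# API with FastAPI. The "/chat" route and request schema are placeholders.
import uvicorn
from fastapi import FastAPI
from pydantic import BaseModel
from transformers import pipeline

app = FastAPI()
# Assumes the fine-tuned model folder was moved next to this script.
generator = pipeline("text-generation", model="saved_model", tokenizer="saved_model")

class ChatRequest(BaseModel):
    prompt: str
    max_new_tokens: int = 64

@app.post("/chat")
def chat(req: ChatRequest) -> dict:
    out = generator(req.prompt, max_new_tokens=req.max_new_tokens)
    return {"response": out[0]["generated_text"]}

if __name__ == "__main__":
    uvicorn.run(app, host="0.0.0.0", port=8000)
```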
## link ##
The contents of this project come from `https://github.com/genji970/llm_api_service_deploying_in_AWS` and `https://github.com/genji970/llm-service-deployment-for-fine-tuned-gpt2-pipeline-using-rest-api`.

## used ##
python==3.10.12, torch, ray, huggingface, langchain (not yet), docker, csv, FastAPI, AWS EC2, etc.

## contributions ##
Contributions are welcome! If you want to contribute, please submit a PR.