{"id":14964738,"url":"https://github.com/bastienpo/unsloth_finetuning","last_synced_at":"2025-10-05T14:32:08.583Z","repository":{"id":253768321,"uuid":"844469770","full_name":"bastienpo/unsloth_finetuning","owner":"bastienpo","description":"Finetuning of Gemma-2 2B for structured output","archived":false,"fork":false,"pushed_at":"2024-08-19T20:00:52.000Z","size":539,"stargazers_count":2,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-01-16T08:09:16.067Z","etag":null,"topics":["ai","fine-tuning","gemma2","llamacpp","python","unsloth"],"latest_commit_sha":null,"homepage":"","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/bastienpo.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-08-19T10:25:40.000Z","updated_at":"2024-10-02T04:45:38.000Z","dependencies_parsed_at":"2024-08-19T20:51:47.398Z","dependency_job_id":null,"html_url":"https://github.com/bastienpo/unsloth_finetuning","commit_stats":null,"previous_names":["bastienpo/llm-finetune","bastienpo/unsloth_finetuning"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bastienpo%2Funsloth_finetuning","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bastienpo%2Funsloth_finetuning/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bastienpo%2Funsloth_finetuning/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bastienpo%2Funslo
th_finetuning/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/bastienpo","download_url":"https://codeload.github.com/bastienpo/unsloth_finetuning/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":235217927,"owners_count":18954520,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["ai","fine-tuning","gemma2","llamacpp","python","unsloth"],"created_at":"2024-09-24T13:33:42.670Z","updated_at":"2025-10-05T14:32:03.246Z","avatar_url":"https://github.com/bastienpo.png","language":"Jupyter Notebook","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Gemma-2 2B fine-tuned for Structured Data Extraction\n\nThis project is a collection of notebooks and a simple Flask web server to serve\n**Gemma-2** using **llama-cpp**.\n\nThe goal of this project is to fine-tune a model to get better results on the task of\nextracting data into a structured format (JSON).\n\nYou will need to provide the **output schema** in OpenAPI format and the **text** (context).\n\n## ⛩️ Project Architecture\n\nThe project is split between notebooks for fine-tuning, quantization, and evaluation, and Python files.\n\n| **Source**                       | **Description**                                                                                                       |\n|----------------------------------|-----------------------------------------------------------------------------------------------------------------------|\n| [➡️ Gemma-2 
Finetuning](https://github.com/bastienpo/llm-finetune/blob/main/src/notebook/gemma_2_finetuning.ipynb)           | A notebook that shows how to fine-tune and quantize gemma2-2b-it using the unsloth and Hugging Face libraries.        |\n| [➡️ Server](https://github.com/bastienpo/llm-finetune/blob/main/src/web/app.py)                       | A simple Flask REST server using llama-cpp with a 4-bit quantized model.                                              |\n| [➡️ CI/CD](https://github.com/bastienpo/llm-finetune/blob/main/.github/workflows/cicd.yml)                        | A GitHub Action consisting of a formatting/linting step with ruff, testing with pytest, and building the Docker image. |\n| [➡️ Dockerfile](https://github.com/bastienpo/llm-finetune/blob/main/Dockerfile)                   | A multistage Dockerfile to build the server with gunicorn.                                                            |\n\n\n## 📊 Details about the Dataset\n\nThe fine-tuned models are available in safetensors and GGUF formats (4-bit, 8-bit) on the Hugging Face Hub at [bastienp/Gemma-2-2B-it-JSON-data-extration](https://huggingface.co/bastienp/Gemma-2-2B-it-JSON-data-extration).\n\n**Note**: That page also gives more details on how to use the models with **llama-cpp** or **unsloth**.\n\n## 💻 Installation\n\n### Dev setup\n\nRecommended: Use **uv**, the fast Python package installer and resolver from Astral.\n\nAlternatively, you can replace these commands with *pip*. You can find the documentation\nfor installing uv [here](https://github.com/astral-sh/uv?tab=readme-ov-file#getting-started).\n\n1. Sync the dependencies with uv\n\n```bash\nuv venv .venv\n```\n\n```bash\nsource .venv/bin/activate\n```\n\n```bash\nuv sync --all-extras --dev # also installs pytest and ruff\n```\n\n2. Launch a Flask dev server\n\n```bash\nflask --app src.web.app run --debug\n```\n\nTo reproduce the fine-tuning, the easiest way is to use Google Colab (the free version is sufficient).\n\n3. 
Run the tests (API testing)\n\n```bash\npytest\n```\n**Note**: An example of how to call the API and the prompt format can be found in `examplesexample_api_call.py`.\n\n## 👥 Deployment setup \n\nThe easiest way to deploy the model is to use the provided Docker image.\n\n1. Pull the image from GitHub (built by the CI):\n\n```bash\ndocker pull ghcr.io/bastienpo/unsloth_finetuning:main \n```\n\n\n**Note**: Alternatively, you can build the image yourself:\n```bash\ndocker build --tag unsloth_finetuning:0.0.1 .\n```\n\n\n2. Run the Docker image\n\n```bash\ndocker run -p 8000:8000 -d ghcr.io/bastienpo/unsloth_finetuning:main # or unsloth_finetuning:0.0.1 if built locally\n```\n\n3. Make a POST request\n\n```bash\ncurl -i -H \"Content-Type: application/json\" -X POST -d '{\"query\": \"How are you ?\"}' http://localhost:8000/api/v1/chat/completions\n```","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbastienpo%2Funsloth_finetuning","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fbastienpo%2Funsloth_finetuning","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbastienpo%2Funsloth_finetuning/lists"}