{"id":29714250,"url":"https://github.com/mxagar/llm_peft_fine_tuning_example","last_synced_at":"2025-07-24T03:10:07.181Z","repository":{"id":299839044,"uuid":"1004373399","full_name":"mxagar/llm_peft_fine_tuning_example","owner":"mxagar","description":"Example project in which a Large Language Model is fine-tuned using PEFT.","archived":false,"fork":false,"pushed_at":"2025-07-15T15:27:35.000Z","size":4172,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2025-07-16T09:58:24.467Z","etag":null,"topics":["fine-tuning","huggingface","llm","lora","machine-learning","nlp","peft","text-classification","transformers"],"latest_commit_sha":null,"homepage":"","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/mxagar.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2025-06-18T14:21:50.000Z","updated_at":"2025-07-15T15:27:38.000Z","dependencies_parsed_at":"2025-07-15T18:34:28.407Z","dependency_job_id":"83edbe37-649e-4fc6-b2a1-f7650ae6364d","html_url":"https://github.com/mxagar/llm_peft_fine_tuning_example","commit_stats":null,"previous_names":["mxagar/llm_peft_fine_tuning_example"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/mxagar/llm_peft_fine_tuning_example","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mxagar%2Fllm_peft_fine_tuning_example","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mxagar%2Fllm_peft_fine_tuning_example/tags","releases_url":"htt
ps://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mxagar%2Fllm_peft_fine_tuning_example/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mxagar%2Fllm_peft_fine_tuning_example/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/mxagar","download_url":"https://codeload.github.com/mxagar/llm_peft_fine_tuning_example/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mxagar%2Fllm_peft_fine_tuning_example/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":266786798,"owners_count":23983871,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-07-24T02:00:09.469Z","response_time":99,"last_error":null,"robots_txt_status":null,"robots_txt_updated_at":null,"robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["fine-tuning","huggingface","llm","lora","machine-learning","nlp","peft","text-classification","transformers"],"created_at":"2025-07-24T03:10:06.119Z","updated_at":"2025-07-24T03:10:07.148Z","avatar_url":"https://github.com/mxagar.png","language":"Jupyter Notebook","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Applying Parameter-Efficient Fine-Tuning (PEFT) to a Large Language Model (LLM)\n\nThis example project shows how to fine-tune a Large Language Model using the PEFT library from HuggingFace.\n\nThe HuggingFace library [`transformers`](https://huggingface.co/docs/transformers/en/index) in combination with 
[`peft`](https://github.com/huggingface/peft) makes it very easy to fine-tune Large Language Models (LLMs) for our specific tasks.\nThis small project shows how to use those libraries end-to-end to perform a text classification task.\n\nSpecifically:\n\n- We use the [`ag_news`](https://huggingface.co/datasets/fancyzhx/ag_news) dataset, which consists of 120k news texts, each labeled with one of four topics: `'World', 'Sports', 'Business', 'Sci/Tech'`.\n- The [DistilBERT](https://huggingface.co/docs/transformers/en/model_doc/distilbert) model is fine-tuned for the news classification task. In the process, [Low-Rank Adaptation (LoRA)](https://arxiv.org/abs/2106.09685) is applied through the [`peft`](https://github.com/huggingface/peft) library to accelerate fine-tuning.\n\nThe underlying LLM is abstracted and easily handled thanks to the [`transformers`](https://huggingface.co/docs/transformers/en/index) library; the user only needs to understand basic concepts such as:\n\n- Tokenization of text sequences\n- Embedding vectors of tokens and associated dimensions\n- The motivation and usage of the encoder \u0026 decoder modules in LLMs\n- Task-specific heads, such as classification\n\n![LLM Architecture Simplified](./assets/llm_simplified.png)\n\nFor a primer on those topics, see:\n\n- [mxagar/generative_ai_udacity/01_Fundamentals_GenAI](https://github.com/mxagar/generative_ai_udacity/tree/main/01_Fundamentals_GenAI)\n- [mxagar/nlp_with_transformers_nbs](https://github.com/mxagar/nlp_with_transformers_nbs)\n\n## Setup\n\nA recipe to set up a [conda](https://docs.conda.io/en/latest/) environment with the required dependencies:\n\n```bash\n# Create the necessary Python environment\nconda env create -f conda.yaml\nconda activate peft\n\n# Compile and install all dependencies\npip-compile requirements.in\npip-sync requirements.txt\n\n# If we need a new dependency,\n# add it to requirements.in\n# and then:\npip-compile requirements.in\npip-sync requirements.txt\n```\n\n## Notebook\n\nThe notebook [`llm_peft.ipynb`](./llm_peft.ipynb) contains all the code and explanations needed to perform the fine-tuning described above.\n\n## Interesting Links\n\n- My personal notes on the O'Reilly book [Generative Deep Learning, 2nd Edition, by David Foster](https://github.com/mxagar/generative_ai_book)\n- My personal notes on the O'Reilly book [Natural Language Processing with Transformers, by Lewis Tunstall, Leandro von Werra and Thomas Wolf](https://github.com/mxagar/nlp_with_transformers_nbs)\n- My personal notes and guide for the [Generative AI Nanodegree from Udacity](https://github.com/mxagar/generative_ai_udacity/)\n- [HuggingFace Guide: `mxagar/tool_guides/hugging_face`](https://github.com/mxagar/tool_guides/tree/master/hugging_face)\n- [LangChain Guide: `mxagar/tool_guides/langchain`](https://github.com/mxagar/tool_guides/tree/master/langchain)\n- [LLM Tools: `mxagar/tool_guides/llms`](https://github.com/mxagar/tool_guides/tree/master/llms)\n- [NLP Guide: `mxagar/nlp_guide`](https://github.com/mxagar/nlp_guide)\n- [Deep Learning Methods for CV and NLP: `mxagar/computer_vision_udacity/CVND_Advanced_CV_and_DL.md`](https://github.com/mxagar/computer_vision_udacity/blob/main/03_Advanced_CV_and_DL/CVND_Advanced_CV_and_DL.md)\n- [Deep Learning Methods for NLP: `mxagar/deep_learning_udacity/DLND_RNNs.md`](https://github.com/mxagar/deep_learning_udacity/blob/main/04_RNN/DLND_RNNs.md)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmxagar%2Fllm_peft_fine_tuning_example","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fmxagar%2Fllm_peft_fine_tuning_example","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmxagar%2Fllm_peft_fine_tuning_example/lists"}