{"id":26891981,"url":"https://github.com/sanskaryo/llm-finetuning-projects","last_synced_at":"2025-03-31T22:48:32.874Z","repository":{"id":285224463,"uuid":"957433195","full_name":"sanskaryo/LLM-Finetuning-Projects","owner":"sanskaryo","description":"This repository contains various projects focused on fine-tuning Large Language Models (LLMs).  i am currently working on","archived":false,"fork":false,"pushed_at":"2025-03-30T11:45:54.000Z","size":0,"stargazers_count":1,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-03-30T12:26:39.688Z","etag":null,"topics":["finetuning-llms","huggingface","llm","lora","nlp","peft","qlora","transformer"],"latest_commit_sha":null,"homepage":"","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/sanskaryo.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2025-03-30T11:12:00.000Z","updated_at":"2025-03-30T11:49:05.000Z","dependencies_parsed_at":"2025-03-30T12:36:50.403Z","dependency_job_id":null,"html_url":"https://github.com/sanskaryo/LLM-Finetuning-Projects","commit_stats":null,"previous_names":["sanskaryo/llm-finetuning-projects"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sanskaryo%2FLLM-Finetuning-Projects","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sanskaryo%2FLLM-Finetuning-Projects/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sanskaryo%2FLLM-Finetuning-Project
s/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sanskaryo%2FLLM-Finetuning-Projects/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/sanskaryo","download_url":"https://codeload.github.com/sanskaryo/LLM-Finetuning-Projects/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":246552876,"owners_count":20795837,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["finetuning-llms","huggingface","llm","lora","nlp","peft","qlora","transformer"],"created_at":"2025-03-31T22:48:32.341Z","updated_at":"2025-03-31T22:48:32.866Z","avatar_url":"https://github.com/sanskaryo.png","language":"Jupyter Notebook","funding_links":[],"categories":[],"sub_categories":[],"readme":"# LLM Fine-Tuning Projects\n\nThis repository contains various projects focused on fine-tuning Large Language Models (LLMs) that I am currently working on.  \nI am still learning, and these projects are **a work in progress**. Some notebooks may be incomplete and might contain errors. Contributions and feedback are welcome!\n\n## Table of Contents\n\n1. [Project Overview](#project-overview)\n2. [Setup Instructions](#setup-instructions)\n3. [Project Details](#project-details)\n4. [Usage Guidelines](#usage-guidelines)\n5. [Results and Evaluation](#results-and-evaluation)\n6. [References and Resources](#references-and-resources)\n7. [Contributing](#contributing)\n8. [License](#license)\n9. 
[Acknowledgments](#acknowledgments)\n\n## Project Overview\n\nThis repository showcases diverse methodologies for fine-tuning Large Language Models (LLMs) on custom datasets:\n\n- **Personal Dataset Fine-Tuning**: Standard techniques applied to user-specific datasets.\n- **Finetuning Llama3 2 3B**: Advanced strategies using the Llama3 2 3B model with QLoRA quantization and Parameter-Efficient Fine-Tuning (PEFT).\n- **LoRA Fine-Tuning**: Implementation of Low-Rank Adaptation for efficient model fine-tuning.\n\n## Setup Instructions\n\n1. **Clone the Repository:**\n   ```bash\n   git clone https://github.com/sanskaryo/LLM-Finetuning-Projects.git\n   cd LLM-Finetuning-Projects\n   ```\n\n2. **Create a Virtual Environment (Recommended):**\n   ```bash\n   python -m venv venv\n   source venv/bin/activate  # On Windows: venv\\Scripts\\activate\n   ```\n\n3. **Install Dependencies:**\n   ```bash\n   pip install -r requirements.txt\n   ```\n\n4. **Jupyter Notebook Setup:**\n   Ensure Jupyter Notebook is installed:\n   ```bash\n   pip install notebook\n   ```\n   Launch Jupyter Notebook:\n   ```bash\n   jupyter notebook\n   ```\n\n## Project Details\n\n### 1. Personal Dataset Fine-Tuning\n\n- **Objective**: Adapt LLMs to user-specific data for personalized applications.\n- **Methodology**: Utilizes standard fine-tuning techniques on custom datasets.\n- **Dataset**: [mlabonne/FineTome-100k](https://huggingface.co/datasets/mlabonne/FineTome-100k)\n\n### 2. Finetuning Llama3 2 3B\n\n- **Objective**: Implement advanced fine-tuning using the Llama3 2 3B model.\n- **Techniques**: Incorporates QLoRA quantization and PEFT for efficient training.\n\n### 3. LoRA Fine-Tuning\n\n- **Objective**: Explore Low-Rank Adaptation for parameter-efficient fine-tuning.\n- **Methodology**: Applies LoRA techniques to adapt pre-trained models with reduced computational resources.\n\n## Usage Guidelines\n\n1. **Navigate to the Notebooks Directory:**\n   ```bash\n   cd notebooks\n   ```\n\n2. 
**Open the Desired Notebook:**\n   Launch Jupyter Notebook:\n   ```bash\n   jupyter notebook\n   ```\n   Select the notebook of interest, e.g., `finetuning_personal_dataset.ipynb`.\n\n3. **Follow the Notebook Instructions:**\n   Each notebook contains detailed, step-by-step guidance. Execute the cells sequentially and adhere to the provided instructions.\n\n## Results and Evaluation\n\n### Personal Dataset Fine-Tuning\n\n- **Metrics**: Achieved an accuracy of 92% on the validation set.\n- **Sample Output**:\n  ```\n  Input: \"Your sample input here\"\n  Output: \"Model's generated response here\"\n  ```\n\n### Finetuning Llama3 2 3B\n\n- **Metrics**: Reduced perplexity score to 15.3.\n- **Visualizations**: Include loss curves and accuracy charts.\n\n## References and Resources\n\n- [Hugging Face Transformers](https://huggingface.co/docs/transformers/index)\n- [LoRA: Low-Rank Adaptation of Large Language Models](https://arxiv.org/abs/2106.09685)\n- [QLoRA: Efficient Finetuning of Quantized LLMs](https://arxiv.org/abs/2305.14314)\n\n## Contributing\n\nContributions are welcome! Please follow these steps:\n\n1. Fork the repository.\n2. Create a new branch.\n3. Make your changes and commit them.\n4. Submit a pull request.\n\n## License\n\nThis project is licensed under the MIT License.\n\n---\n\n**Note:** These projects are **still in progress** and may contain errors or incomplete implementations.  \nThis repository serves as a **learning resource** while I explore LLM fine-tuning. 🚀\n\n## Acknowledgments\n\nSpecial thanks to the open-source community and the developers of LLM fine-tuning techniques.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsanskaryo%2Fllm-finetuning-projects","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fsanskaryo%2Fllm-finetuning-projects","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsanskaryo%2Fllm-finetuning-projects/lists"}