https://github.com/nisaaragharia/indian-lawyergpt
Fine-Tuning Falcon-7B, LLAMA 2 with QLoRA to create an advanced AI model with a profound understanding of the Indian legal context.
falcon fine-tuning gpt huggingface-transformers large-language-models llama llama2 llms peft qlora
- Host: GitHub
- URL: https://github.com/nisaaragharia/indian-lawyergpt
- Owner: NisaarAgharia
- Created: 2023-06-08T14:52:17.000Z (almost 2 years ago)
- Default Branch: main
- Last Pushed: 2024-02-04T10:39:37.000Z (about 1 year ago)
- Last Synced: 2024-10-18T21:59:29.376Z (6 months ago)
- Topics: falcon, fine-tuning, gpt, huggingface-transformers, large-language-models, llama, llama2, llms, peft, qlora
- Language: Jupyter Notebook
- Homepage: https://huggingface.co/nisaar
- Size: 3.54 MB
- Stars: 68
- Watchers: 4
- Forks: 27
- Open Issues: 3
Metadata Files:
- Readme: README.md
README
# Indian Law AI: Fine-Tuning Falcon-7B & LLAMA 2 Language Models
Welcome to our exciting project where we are adapting two cutting-edge language models, Falcon-7B & LLAMA 2, to become proficient in Indian law.
## Overview
Our adventure began with a modest 150 Q&As on Indian law. Now, we're charging ahead with an impressive dataset of 3300 instructions! This AI legal project combines:
- **Falcon-7B & LLAMA 2**: State-of-the-art language models, prepped and ready for legal training.
- **PEFT & QLoRA**: The dream duo for memory-efficient and high-performance model fine-tuning.
- **[Our Dataset](https://huggingface.co/datasets/nisaar/Articles_Constitution_3300_Instruction_Set)**: Comprehensive Indian law knowledge, spanning constitutional law, civil rights, and more!
## Research Paper
## Dataset Creation
## Dive into our Dataset
Every record in our dataset has four fields: `instruction`, `input`, `output`, and `prompt`. Together they are crafted to shape our models into AI law experts!
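To make the four fields concrete, here is a minimal sketch of how a `prompt` string could be assembled from `instruction`, `input`, and `output`. The Alpaca-style section markers, the `build_prompt` helper, and the example record are all assumptions made for illustration; they are not taken from the dataset, whose actual `prompt` column may use a different template.

```python
def build_prompt(record: dict) -> str:
    """Assemble an Alpaca-style prompt from one record (hypothetical template)."""
    parts = [f"### Instruction:\n{record['instruction'].strip()}"]
    # `input` is treated as optional context: skip the section when it is empty.
    context = (record.get("input") or "").strip()
    if context:
        parts.append(f"### Input:\n{context}")
    parts.append(f"### Response:\n{record['output'].strip()}")
    return "\n\n".join(parts)


# Made-up record for illustration only; it is not copied from the dataset.
example = {
    "instruction": "Explain the following article of the Constitution of India.",
    "input": "Article 21",
    "output": (
        "Article 21 protects life and personal liberty: no person shall be "
        "deprived of either except according to procedure established by law."
    ),
}

example["prompt"] = build_prompt(example)
print(example["prompt"])
```

Keeping the template in a single function means the same layout can be reused at inference time, with the response section left empty for the model to complete.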
Datasets on Hugging Face:
- https://huggingface.co/datasets/nisaar/Constitution_Of_India_Instruction_Set
- https://huggingface.co/datasets/nisaar/Articles_Constitution_3300_Instruction_Set
- https://huggingface.co/datasets/nisaar/LLAMA2_Legal_Dataset_4.4k_Instructions
## Fine Tuning Process
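As a rough illustration of why QLoRA is memory-efficient, the back-of-the-envelope sketch below compares the weight memory of a 7-billion-parameter model at 16-bit and 4-bit precision, and counts the extra parameters LoRA adapters would train. Every number in it is an assumption chosen for illustration (hidden size 4096, 32 layers, two adapted projection matrices per layer, rank 16), not this project's actual training configuration. It also ignores activations, optimizer state, and quantization constants.

```python
def weight_memory_gb(num_params: float, bits_per_param: int) -> float:
    """Memory needed to store the weights alone, in decimal gigabytes."""
    return num_params * bits_per_param / 8 / 1e9


def lora_trainable_params(d_in: int, d_out: int, rank: int, num_matrices: int) -> int:
    """LoRA adds two low-rank factors (d_in x rank and rank x d_out) per adapted matrix."""
    return rank * (d_in + d_out) * num_matrices


# Illustrative assumptions, not values taken from this repository.
NUM_PARAMS = 7e9        # roughly a 7B-parameter base model
HIDDEN_SIZE = 4096      # assumed width of each adapted projection matrix
NUM_LAYERS = 32         # assumed number of transformer layers
ADAPTED_PER_LAYER = 2   # e.g. two attention projections per layer
RANK = 16               # assumed LoRA rank

fp16_gb = weight_memory_gb(NUM_PARAMS, 16)
four_bit_gb = weight_memory_gb(NUM_PARAMS, 4)
trainable = lora_trainable_params(
    HIDDEN_SIZE, HIDDEN_SIZE, RANK, NUM_LAYERS * ADAPTED_PER_LAYER
)

print(f"16-bit weights: {fp16_gb:.1f} GB")
print(f"4-bit weights:  {four_bit_gb:.1f} GB")
print(f"LoRA trainable parameters: {trainable:,} "
      f"({trainable / NUM_PARAMS:.2%} of the base model)")
```

Under these assumptions the frozen base weights shrink to a quarter of their 16-bit size, and only a small fraction of a percent of the parameters receive gradients.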
## Track the Progress
Get a front-row seat to the training progress with TensorBoard. Kickstart it, navigate to the provided localhost link, and witness the models learn: