https://github.com/sreeeswaran/train-your-llm
This repository contains code and resources for training, fine-tuning, and deploying large language models using Hugging Face's Transformers library.
- Host: GitHub
- URL: https://github.com/sreeeswaran/train-your-llm
- Owner: SreeEswaran
- Created: 2024-06-12T18:23:01.000Z (12 months ago)
- Default Branch: main
- Last Pushed: 2024-08-20T17:13:19.000Z (10 months ago)
- Last Synced: 2024-08-20T21:26:34.403Z (10 months ago)
- Topics: artificial-intelligence, deep-learning, language-model, large-language-model, large-language-models, llm, llm-training, llms, machine-learning, model-training, nlp, pretrained-language-model, pretrained-models, training
- Language: Python
- Homepage:
- Size: 31.3 KB
- Stars: 2
- Watchers: 2
- Forks: 1
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# Train-your-LLM
This repository contains code and resources for training, fine-tuning, and deploying large language models using Hugging Face's Transformers library. The aim of this repo is to provide a comprehensive guide and codebase for training large language models from scratch or fine-tuning pre-trained models for specific tasks.
For a detailed, step-by-step explanation of how this project was built, including code walkthroughs and in-depth analysis, see my Medium blog post, ["Step-by-Step Guide to Train a Large Language Model (LLM) with code"](https://blog.gopenai.com/step-by-step-guide-to-train-a-large-language-model-llm-with-code-1f536c34694e).
## Features
- Data preprocessing scripts to clean and tokenize text data.
- Model training scripts with customizable hyperparameters.
- Notebooks for interactive model exploration and experimentation.
- API for deploying trained models and generating text.
## Getting Started
1. Clone this repository:
```bash
git clone https://github.com/SreeEswaran/Train-your-LLM.git
cd Train-your-LLM
```
2. Install dependencies:
```bash
pip install -r requirements.txt
```
3. Follow the instructions in each directory to preprocess data, train models, and deploy APIs.
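To give a rough idea of what the preprocessing step involves before you open the scripts, here is a minimal, standard-library-only sketch of cleaning text, turning it into token IDs, and grouping those IDs into fixed-size training blocks. It is not the code from this repository: the function names (`clean_text`, `build_vocab`, `encode`, `chunk_tokens`) and the whitespace tokenizer are assumptions made for illustration. A real pipeline would normally use a pretrained Hugging Face tokenizer instead of the toy vocabulary shown here.

```python
from __future__ import annotations

import re
import unicodedata


def clean_text(text: str) -> str:
    """Normalize Unicode, drop control characters, and collapse whitespace."""
    text = unicodedata.normalize("NFKC", text)
    # Keep ordinary whitespace; drop every other character in a "C" (control/format) category.
    text = "".join(
        ch for ch in text
        if ch in "\n\t " or unicodedata.category(ch)[0] != "C"
    )
    return re.sub(r"\s+", " ", text).strip()


def build_vocab(texts: list[str]) -> dict[str, int]:
    """Toy whitespace vocabulary (hypothetical stand-in for a real tokenizer)."""
    vocab = {"<unk>": 0}
    for text in texts:
        for word in text.split():
            vocab.setdefault(word, len(vocab))
    return vocab


def encode(text: str, vocab: dict[str, int]) -> list[int]:
    """Map each whitespace-separated word to its ID, using 0 for unknown words."""
    return [vocab.get(word, 0) for word in text.split()]


def chunk_tokens(tokens: list[int], block_size: int) -> list[list[int]]:
    """Split a token stream into fixed-size blocks, dropping the short remainder."""
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    usable = (len(tokens) // block_size) * block_size
    return [tokens[i:i + block_size] for i in range(0, usable, block_size)]


if __name__ == "__main__":
    raw = ["  Hello\u00a0 world\x00 \n again ", "hello   again and again"]
    cleaned = [clean_text(t) for t in raw]
    vocab = build_vocab(cleaned)
    stream = [tok for t in cleaned for tok in encode(t, vocab)]
    print(chunk_tokens(stream, block_size=3))
```

Dropping the trailing remainder keeps every block the same length, which simplifies batching; padding the last block is an equally reasonable choice if you cannot afford to lose data.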
## Contributing
Contributions are welcome! Feel free to submit bug reports, feature requests, or pull requests. For major changes, please open an issue first to discuss potential changes.
## Contact
If you have any questions, suggestions, or just want to connect, feel free to reach out through the following platforms:
- Mentoring/Coaching/Guidance: https://topmate.io/SreeEswaran
- LinkedIn: https://www.linkedin.com/in/sree-deekshitha-yerra
- Medium Blogs: https://www.medium.com/@SreeEswaran