https://github.com/sreeeswaran/train-your-llm

This repository contains code and resources for training, fine-tuning, and deploying large language models using Hugging Face's Transformers library.



Host: GitHub
URL: https://github.com/sreeeswaran/train-your-llm
Owner: SreeEswaran
Created: 2024-06-12T18:23:01.000Z (about 1 year ago)
Default Branch: main
Last Pushed: 2024-08-20T17:13:19.000Z (10 months ago)
Last Synced: 2024-08-20T21:26:34.403Z (10 months ago)
Topics: artificial-intelligence, deep-learning, language-model, large-language-model, large-language-models, llm, llm-training, llms, machine-learning, model-training, nlp, pretrained-language-model, pretrained-models, training
Language: Python
Homepage:
Size: 31.3 KB
Stars: 2
Watchers: 2
Forks: 1
Open Issues: 0
Metadata Files:
- Readme: README.md

README

# Train-your-LLM

This repository contains code and resources for training, fine-tuning, and deploying large language models using Hugging Face's Transformers library. It aims to provide a comprehensive guide and codebase for training large language models from scratch or fine-tuning pre-trained models for specific tasks.

For a detailed, step-by-step explanation of how this project was built, including code walkthroughs and in-depth analysis, see my Medium blog post, ["Step-by-Step Guide to Train a Large Language Model (LLM) with code"](https://blog.gopenai.com/step-by-step-guide-to-train-a-large-language-model-llm-with-code-1f536c34694e).

## Features

- Data preprocessing scripts to clean and tokenize text data.
- Model training scripts with customizable hyperparameters.
- Notebooks for interactive model exploration and experimentation.
- API for deploying trained models and generating text.
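
As a rough illustration of the "clean" half of the preprocessing feature, here is a minimal sketch that uses only the Python standard library. The function names `clean_text` and `clean_corpus` and the specific cleaning rules are assumptions made for this example; they are not taken from the scripts in this repository.

```python
import html
import re
import unicodedata

# Hypothetical helpers for illustration; not taken from this repository.
_TAG_RE = re.compile(r"<[^>]+>")
_WS_RE = re.compile(r"\s+")


def clean_text(raw: str) -> str:
    """Normalize one raw document before it is passed to a tokenizer."""
    text = html.unescape(raw)                   # decode entities, e.g. "&amp;" -> "&"
    text = _TAG_RE.sub(" ", text)               # replace HTML-like tags with a space
    text = unicodedata.normalize("NFKC", text)  # unify compatibility forms (ligatures, etc.)
    return _WS_RE.sub(" ", text).strip()        # collapse runs of whitespace


def clean_corpus(lines):
    """Clean every line, dropping empty results and exact duplicates."""
    seen = set()
    cleaned = []
    for line in lines:
        text = clean_text(line)
        if text and text not in seen:
            seen.add(text)
            cleaned.append(text)
    return cleaned


if __name__ == "__main__":
    sample = ["<p>Hello &amp; welcome!</p>", "  Hello &amp;   welcome! ", "", "Second\tline"]
    print(clean_corpus(sample))  # ['Hello & welcome!', 'Second line']
```

The cleaned strings would then typically go to a tokenizer, for example one loaded with `AutoTokenizer.from_pretrained` from Transformers.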

## Getting Started

1. Clone this repository:

```bash
git clone https://github.com/SreeEswaran/Train-your-LLM.git
cd Train-your-LLM
```

2. Install dependencies:

```bash
pip install -r requirements.txt
```

3. Follow the instructions in each directory to preprocess data, train models, and deploy APIs.
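
For the training step, one common way to make hyperparameters customizable is to keep them in a dataclass and expose each field as a command-line flag. The sketch below is an assumption about how that could look: the `TrainConfig` class, its field names, its default values, and the resulting flags are hypothetical and do not describe this repository's actual training scripts.

```python
import argparse
from dataclasses import dataclass, fields


@dataclass
class TrainConfig:
    # Hypothetical fields and defaults for illustration; not this repository's values.
    model_name: str = "gpt2"
    learning_rate: float = 5e-5
    batch_size: int = 8
    num_epochs: int = 3
    max_length: int = 512


def parse_config(argv=None) -> TrainConfig:
    """Build a TrainConfig, letting any field be overridden with --field-name."""
    parser = argparse.ArgumentParser(description="Training hyperparameters (sketch)")
    for f in fields(TrainConfig):
        parser.add_argument(
            "--" + f.name.replace("_", "-"),  # model_name -> --model-name
            type=f.type,                      # str, float, or int from the annotation
            default=f.default,
        )
    args = parser.parse_args(argv)
    return TrainConfig(**vars(args))


if __name__ == "__main__":
    # An explicit argument list keeps the demo independent of the real command line.
    print(parse_config(["--learning-rate", "1e-4", "--batch-size", "16"]))
```

In a Transformers-based script, values like these would usually be forwarded to `transformers.TrainingArguments` (for instance `learning_rate`, `per_device_train_batch_size`, and `num_train_epochs`).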

## Contributing

Contributions are welcome! Feel free to submit bug reports, feature requests, or pull requests. For major changes, please open an issue first to discuss potential changes.

## Contact

If you have any questions, suggestions, or just want to connect, feel free to reach out through the following platforms:

For Mentoring/Coaching/Guidance: https://topmate.io/SreeEswaran

LinkedIn: https://www.linkedin.com/in/sree-deekshitha-yerra

Medium Blogs: https://www.medium.com/@SreeEswaran
