https://github.com/ksm26/pretraining-llms
Master the essential steps of pretraining large language models (LLMs). Learn to create high-quality datasets, configure model architectures, execute training runs, and assess model performance for efficient and effective LLM pretraining.
- Host: GitHub
- URL: https://github.com/ksm26/pretraining-llms
- Owner: ksm26
- Created: 2024-07-29T12:27:40.000Z (about 1 year ago)
- Default Branch: main
- Last Pushed: 2024-08-07T12:51:30.000Z (about 1 year ago)
- Last Synced: 2024-08-07T17:26:53.108Z (about 1 year ago)
- Topics: ai-training, cost-effective-pretraining, data-preparation, depth-upscaling, developer-advocacy, high-quality-datasets, hugging-face, large-language-models, llm-evaluation, machine-learning, meta-llama, model-configuration, model-initialization, performance-assessment, pretraining-llms, text-generation, training-runs
- Language: Jupyter Notebook
- Homepage: https://www.deeplearning.ai/short-courses/pretraining-llms/
- Size: 29.3 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
# [Pretraining LLMs](https://www.deeplearning.ai/short-courses/pretraining-llms/)
Welcome to the "Pretraining LLMs" course! The course dives into the essential steps of pretraining large language models (LLMs).
## Course Summary
In this course, you'll explore pretraining, the foundational step in training LLMs, which involves teaching an LLM to predict the next token using vast text datasets. You'll learn the essential steps to pretrain an LLM, understand the associated costs, and discover cost-effective methods by leveraging smaller, existing open-source models.
**Detailed Learning Outcomes:**
1. **Pretraining Basics**: Understand the scenarios where pretraining is the optimal choice for model performance. Compare text generation across different versions of the same model to grasp the performance differences between base, fine-tuned, and specialized pretrained models (a generation-comparison sketch follows this list).
2. **Creating High-Quality Datasets**: Learn how to create and clean a high-quality training dataset using web text and existing datasets, and how to package this data for use with the Hugging Face library (see the dataset sketch below).
3. **Model Configuration**: Explore ways to configure and initialize a model for training, including modifying Meta's Llama models and initializing weights either randomly or from other models (see the configuration sketch below).
4. **Executing Training Runs**: Learn how to configure and execute a training run to train your own model effectively (see the training-run sketch below).
5. **Performance Assessment**: Assess your trained model's performance and explore common evaluation strategies for LLMs, including benchmark tasks used to compare different models' performance (see the evaluation sketch below).
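The sketches below are not taken from the course notebooks; they are minimal, illustrative examples of each step using the Hugging Face libraries, with placeholder model IDs and file names that you would replace with your own. First, comparing text generation between a base checkpoint and a fine-tuned counterpart:

```python
# Hedged sketch: run the same prompt through a base model and an
# instruction-tuned variant and compare the outputs. The model IDs are
# placeholders, not the checkpoints used in the course.
from transformers import AutoModelForCausalLM, AutoTokenizer

prompt = "The key steps in pretraining a large language model are"

for model_id in ["<base-model-id>", "<fine-tuned-model-id>"]:  # hypothetical IDs
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id)
    inputs = tokenizer(prompt, return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=64, do_sample=False)
    print(model_id, "->", tokenizer.decode(outputs[0], skip_special_tokens=True))
```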
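Next, a minimal sketch of dataset creation and packaging with the `datasets` library; the file name, quality filter, and sequence length are illustrative assumptions rather than the course's choices:

```python
# Hedged sketch: load raw web text, apply a crude quality filter, tokenize,
# and save the result to disk so a training run can consume it.
from datasets import load_dataset
from transformers import AutoTokenizer

raw = load_dataset("text", data_files={"train": "web_text.txt"})["train"]

# Drop very short lines as a simple (and deliberately naive) quality filter.
raw = raw.filter(lambda example: len(example["text"].split()) > 20)

tokenizer = AutoTokenizer.from_pretrained("<base-model-id>")  # placeholder
tokenized = raw.map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=2048),
    batched=True,
    remove_columns=["text"],
)
tokenized.save_to_disk("packaged_pretraining_data")
```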
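A sketch of the two initialization paths from step 3, assuming the `transformers` Llama classes; the architecture hyperparameters are illustrative, not the course's configuration:

```python
from transformers import AutoModelForCausalLM, LlamaConfig, LlamaForCausalLM

# Option 1: define a small Llama-style architecture and initialize it with
# random weights for pretraining from scratch.
config = LlamaConfig(
    hidden_size=1024,
    intermediate_size=4096,
    num_hidden_layers=12,
    num_attention_heads=16,
    vocab_size=32000,
)
model_from_scratch = LlamaForCausalLM(config)

# Option 2: start from an existing open-source checkpoint and continue
# pretraining it on your own data.
model_from_checkpoint = AutoModelForCausalLM.from_pretrained("<existing-model-id>")  # placeholder
```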
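A sketch of a training run built around the Hugging Face `Trainer`, reusing the dataset saved above; every argument value here is an illustrative default rather than the course's setting:

```python
from datasets import load_from_disk
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

model_id = "<base-model-id>"  # placeholder
tokenizer = AutoTokenizer.from_pretrained(model_id)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token  # many base models ship without a pad token
model = AutoModelForCausalLM.from_pretrained(model_id)

dataset = load_from_disk("packaged_pretraining_data")

args = TrainingArguments(
    output_dir="pretraining-run",
    per_device_train_batch_size=4,
    gradient_accumulation_steps=8,
    learning_rate=3e-4,
    num_train_epochs=1,
    logging_steps=50,
    save_steps=500,
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=dataset,
    # mlm=False gives standard next-token (causal LM) labels.
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```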
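Finally, a sketch of one common evaluation route, perplexity on held-out text; benchmark-suite evaluation is the other route the course covers and is not shown here:

```python
import math

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "<trained-model-id>"  # placeholder for your trained checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)
model.eval()

text = "Pretraining teaches a language model to predict the next token."
inputs = tokenizer(text, return_tensors="pt")

with torch.no_grad():
    # For a causal LM, passing the input IDs as labels yields the average
    # next-token cross-entropy loss; exponentiating it gives perplexity.
    loss = model(**inputs, labels=inputs["input_ids"]).loss

print("perplexity:", math.exp(loss.item()))
```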
## Key Points
- **Pretraining Process**: Gain in-depth knowledge of the steps to pretrain an LLM, from data preparation to model configuration and performance assessment.
- **Model Architecture Configuration**: Explore various options for configuring your model's architecture, including modifying Meta's Llama models and innovative pretraining techniques like Depth Upscaling, which can reduce training costs by up to 70% (a sketch of the idea follows this list).
- **Practical Implementation**: Learn how to pretrain a model from scratch and how to continue pretraining an existing model on your own data.
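The sketch below illustrates the depth-upscaling idea in simplified form: grow a deeper model by reusing and repeating the transformer layers of an existing smaller model rather than training the added depth from scratch. It assumes a Llama-architecture checkpoint, uses a placeholder model ID and an arbitrary four extra layers, and is not the course's implementation:

```python
import copy

from transformers import LlamaForCausalLM

small = LlamaForCausalLM.from_pretrained("<small-llama-model-id>")  # placeholder

# Reuse the small model's configuration, but declare extra transformer layers.
config = copy.deepcopy(small.config)
config.num_hidden_layers = small.config.num_hidden_layers + 4  # illustrative depth

upscaled = LlamaForCausalLM(config)

# Copy the existing layers, then fill the added depth with copies of the top
# layers (a simplification of how depth upscaling duplicates layer blocks).
source_layers = list(small.model.layers) + list(small.model.layers)[-4:]
for target, source in zip(upscaled.model.layers, source_layers):
    target.load_state_dict(source.state_dict())

# Embeddings, final norm, and LM head are reused from the small model as well.
upscaled.model.embed_tokens.load_state_dict(small.model.embed_tokens.state_dict())
upscaled.model.norm.load_state_dict(small.model.norm.state_dict())
upscaled.lm_head.load_state_dict(small.lm_head.state_dict())
```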
## About the Instructors
- **Sung Kim**: CEO of Upstage, bringing extensive expertise in LLM pretraining and optimization.
- **Lucy Park**: Chief Scientific Officer of Upstage, with a deep background in scientific research and LLM development.

To enroll in the course or for further information, visit [deeplearning.ai](https://www.deeplearning.ai/short-courses/).