
# 🧠 [Pretraining LLMs](https://www.deeplearning.ai/short-courses/pretraining-llms/)

Welcome to the "Pretraining LLMs" course! 🧑‍🏫 The course dives into the essential steps of pretraining large language models (LLMs).

## 📘 Course Summary
In this course, you’ll explore pretraining, the foundational step in training LLMs, which involves teaching an LLM to predict the next token using vast text datasets.
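
To make the objective concrete, here is a minimal sketch of the next-token setup: each position's training target is simply the token that follows it. The GPT-2 tokenizer is just an illustration, not necessarily what the course uses.

```python
# Minimal sketch of the next-token prediction objective; the GPT-2
# tokenizer is illustrative, not necessarily what the course uses.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
ids = tokenizer("LLMs learn by predicting the next token")["input_ids"]

# Targets are the inputs shifted left by one position.
inputs, targets = ids[:-1], ids[1:]
for i, t in zip(inputs, targets):
    print(f"{tokenizer.decode([i])!r:>14} -> {tokenizer.decode([t])!r}")
```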

🧠 You'll learn the essential steps to pretrain an LLM, understand the associated costs, and discover cost-effective methods by leveraging smaller, existing open-source models.

**Detailed Learning Outcomes:**
1. 🧠 **Pretraining Basics**: Understand the scenarios where pretraining is the optimal choice for model performance. Compare text generation across different versions of the same model to grasp the differences between base, fine-tuned, and specialized pretrained models (see the first sketch after this list).
2. 🗃️ **Creating High-Quality Datasets**: Learn how to create and clean a high-quality training dataset using web text and existing datasets, and how to package this data for use with the Hugging Face library (second sketch below).
3. 🔧 **Model Configuration**: Explore ways to configure and initialize a model for training, including modifying Meta’s Llama models and initializing weights either randomly or from other models (third sketch below).
4. 🚀 **Executing Training Runs**: Learn how to configure and execute a training run to train your own model effectively (fourth sketch below).
5. 📊 **Performance Assessment**: Assess your trained model’s performance and explore common evaluation strategies for LLMs, including benchmark tasks used to compare different models’ performance (fifth sketch below).
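
For outcome 1, a hedged sketch that compares generations from a base checkpoint and a chat-tuned variant; the TinyLlama model names are examples, not the course's models.

```python
# Hedged sketch: compare text generation from a base model and its
# chat-tuned variant. Model names are placeholders.
from transformers import pipeline

prompt = "The three primary colors are"
for name in ["TinyLlama/TinyLlama-1.1B-intermediate-step-1431k-3T",
             "TinyLlama/TinyLlama-1.1B-Chat-v1.0"]:
    generator = pipeline("text-generation", model=name)
    out = generator(prompt, max_new_tokens=30, do_sample=False)
    print(f"--- {name} ---\n{out[0]['generated_text']}\n")
```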
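For outcome 2, a sketch of packaging cleaned text with the Hugging Face `datasets` library; the file name and the toy quality filter are placeholders.

```python
# Hedged sketch of packaging cleaned text with the Hugging Face
# `datasets` library; the file name is a placeholder.
from datasets import load_dataset

raw = load_dataset("text", data_files={"train": "cleaned_web_text.txt"})

def keep_substantial(example):
    # Toy quality filter: drop very short lines.
    return len(example["text"].split()) > 10

train = raw["train"].filter(keep_substantial)
train.save_to_disk("pretraining_dataset")  # or train.push_to_hub(...)
```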
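For outcome 3, an illustrative way to configure a small Llama-style architecture and initialize it with random weights; every size below is a placeholder, not a course-prescribed value.

```python
# Illustrative sketch: configure a small Llama-style model and
# initialize its weights randomly; all sizes are placeholders.
from transformers import LlamaConfig, LlamaForCausalLM

config = LlamaConfig(
    vocab_size=32000,
    hidden_size=1024,        # much smaller than production Llama models
    intermediate_size=4096,
    num_hidden_layers=12,
    num_attention_heads=16,
)
model = LlamaForCausalLM(config)   # random initialization, no checkpoint
print(f"parameters: {model.num_parameters():,}")
```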
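For outcome 4, a hedged training-run sketch built on the Hugging Face `Trainer`, reusing `model` and `train` from the sketches above; the tokenizer checkpoint and all hyperparameters are assumptions.

```python
# Hedged training-run sketch with the Hugging Face Trainer, reusing
# `model` and `train` from the sketches above; hyperparameters are
# illustrative.
from transformers import (AutoTokenizer, DataCollatorForLanguageModeling,
                          Trainer, TrainingArguments)

# A tokenizer whose 32,000-token vocabulary matches the config above;
# the checkpoint name is a placeholder.
tokenizer = AutoTokenizer.from_pretrained("TinyLlama/TinyLlama-1.1B-Chat-v1.0")
tokenizer.pad_token = tokenizer.eos_token  # Llama tokenizers lack a pad token

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=512)

tokenized = train.map(tokenize, batched=True, remove_columns=["text"])

args = TrainingArguments(
    output_dir="checkpoints",
    per_device_train_batch_size=8,
    learning_rate=3e-4,
    max_steps=1000,
    logging_steps=50,
)
trainer = Trainer(
    model=model,
    args=args,
    train_dataset=tokenized,
    # mlm=False selects the causal (next-token) objective.
    data_collator=DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm=False),
)
trainer.train()
```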
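For outcome 5, a quick evaluation sketch: perplexity, the exponential of the mean next-token loss on held-out text, reusing `model` and `tokenizer` from above. Benchmark suites for downstream tasks go further, but perplexity is the standard sanity check during pretraining.

```python
# Evaluation sketch: perplexity on held-out text. The validation
# string is a stand-in for a real held-out set.
import math
import torch

model.eval()
enc = tokenizer("Some held-out validation text.", return_tensors="pt")
with torch.no_grad():
    loss = model(**enc, labels=enc["input_ids"]).loss
print(f"perplexity: {math.exp(loss.item()):.2f}")
```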

## 🔑 Key Points
- 🧩 **Pretraining Process**: Gain in-depth knowledge of the steps to pretrain an LLM, from data preparation to model configuration and performance assessment.
- 🏗️ **Model Architecture Configuration**: Explore various options for configuring your model’s architecture, including modifying Meta’s Llama models and innovative pretraining techniques like Depth Upscaling, which can reduce training costs by up to 70% (see the sketch after this list).
- 🛠️ **Practical Implementation**: Learn how to pretrain a model from scratch and continue pretraining on your own data from existing pretrained models.
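
The Depth Upscaling sketch referenced above: a hedged illustration of deepening a pretrained model by stacking overlapping copies of its decoder layers; the model name and split ratio are assumptions, not the course's exact recipe.

```python
# Hedged sketch of depth upscaling: build a deeper model by stacking
# overlapping copies of a pretrained model's decoder layers, then
# continue pretraining. Model name and split are illustrative.
import copy
import torch
from transformers import AutoModelForCausalLM

base = AutoModelForCausalLM.from_pretrained("TinyLlama/TinyLlama-1.1B-Chat-v1.0")
layers = base.model.layers          # a ModuleList of decoder layers
n = len(layers)

# Keep the bottom 3/4 and the top 3/4 of the stack, overlapping in the
# middle: e.g. a 32-layer model becomes 48 layers (SOLAR-style upscaling).
upscaled = [copy.deepcopy(l) for l in list(layers[: 3 * n // 4]) + list(layers[n // 4 :])]
base.model.layers = torch.nn.ModuleList(upscaled)
base.config.num_hidden_layers = len(upscaled)
print(f"{n} layers -> {len(upscaled)} layers")
```

The upscaled model is then pretrained further on new data; the cost savings come from starting with already-trained layers instead of training a model of the same depth from scratch.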

## 👩‍🏫 About the Instructors
- 👨‍🏫 **Sung Kim**: CEO of Upstage, bringing extensive expertise in LLM pretraining and optimization.
- 👩‍🔬 **Lucy Park**: Chief Scientific Officer of Upstage, with a deep background in scientific research and LLM development.

🔗 To enroll in the course or for further information, visit 📚 [deeplearning.ai](https://www.deeplearning.ai/short-courses/).