https://github.com/arjuntheprogrammer/llm_with_rlhf
- Host: GitHub
- URL: https://github.com/arjuntheprogrammer/llm_with_rlhf
- Owner: arjuntheprogrammer
- Created: 2024-05-17T08:23:16.000Z (almost 2 years ago)
- Default Branch: main
- Last Pushed: 2024-05-17T10:25:14.000Z (almost 2 years ago)
- Last Synced: 2025-01-26T20:29:25.391Z (about 1 year ago)
- Language: Jupyter Notebook
- Size: 997 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# REINFORCEMENT LEARNING FROM HUMAN FEEDBACK
A conceptual and hands-on introduction to tuning and evaluating large language models (LLMs) using Reinforcement Learning from Human Feedback.
- Get a conceptual understanding of Reinforcement Learning from Human Feedback (RLHF), as well as the datasets needed for this technique
- Fine-tune the Llama 2 model using RLHF with the open-source Google Cloud Pipeline Components Library
- Evaluate the tuned model's performance against the base model
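
RLHF tuning typically relies on two datasets: a preference dataset, in which a human labeler picks the better of two completions for a prompt, and a prompt dataset, which holds only the prompts sampled during the reinforcement learning loop. The sketch below shows what such records might look like when stored as JSON Lines. The field names (`input_text`, `candidate_0`, `candidate_1`, `choice`), the sample texts, and the helpers `is_valid_preference` and `to_jsonl` are assumptions made for illustration, not taken from this repository.

```python
import json

# Assumed schema for one preference record (illustrative field names).
PREFERENCE_KEYS = {"input_text", "candidate_0", "candidate_1", "choice"}


def is_valid_preference(record):
    """Return True if the record has exactly the assumed keys and a choice of 0 or 1."""
    return set(record) == PREFERENCE_KEYS and record["choice"] in (0, 1)


def to_jsonl(records):
    """Serialize records as JSON Lines: one JSON object per line."""
    return "\n".join(json.dumps(r) for r in records)


# Preference dataset: a prompt, two candidate completions, and the index
# of the completion the human labeler preferred.
preference_data = [
    {
        "input_text": "Summarize: The council met on Tuesday and approved the budget.",
        "candidate_0": "The council approved the budget on Tuesday.",
        "candidate_1": "A meeting happened.",
        "choice": 0,
    },
]

# Prompt dataset: prompts only, with no completions or labels.
prompt_data = [
    {"input_text": "Summarize: The library will extend its opening hours in June."},
]

if __name__ == "__main__":
    assert all(is_valid_preference(r) for r in preference_data)
    print(to_jsonl(preference_data))
    print(to_jsonl(prompt_data))
```

The preference records would be used to train a reward model, and the prompt records would then drive the policy tuning step; check the exact schema expected by the pipeline you use before preparing real data.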
---
## INDEX
1. How RLHF Works
2. Datasets for RL Training
3. Tune an LLM with RLHF
4. Evaluate the tuned model
---
## COURSE LINK
https://learn.deeplearning.ai/courses/reinforcement-learning-from-human-feedback
---