Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/andy-yangz/Awesome-RLHF
Awesome Reinforcement Learning from Human Feedback, the secret behind ChatGPT XD
List: Awesome-RLHF
Last synced: 2 months ago
- Host: GitHub
- URL: https://github.com/andy-yangz/Awesome-RLHF
- Owner: andy-yangz
- License: MIT
- Created: 2022-12-12T16:01:03.000Z (about 2 years ago)
- Default Branch: main
- Last Pushed: 2022-12-13T01:17:30.000Z (about 2 years ago)
- Last Synced: 2024-05-22T04:03:39.811Z (8 months ago)
- Homepage:
- Size: 3.91 KB
- Stars: 23
- Watchers: 2
- Forks: 1
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
- awesome-human-in-the-loop - GitHub - andy-yangz/Awesome-RLHF
README
# Awesome Reinforcement Learning from Human Feedback
![GitHub stars](https://img.shields.io/github/stars/andy-yangz/Awesome-RLHF.svg?color=red&style=for-the-badge)
![GitHub forks](https://img.shields.io/github/forks/andy-yangz/Awesome-RLHF.svg?style=for-the-badge)
![GitHub activity](https://img.shields.io/github/last-commit/andy-yangz/Awesome-RLHF?color=yellow&style=for-the-badge)
A collection of resources on Reinforcement Learning from Human Feedback (RLHF), mainly focused on pretrained models.
- [Awesome Reinforcement Learning from Human Feedback](#awesome-reinforcement-learning-from-human-feedback)
  - [📜 Papers \& Blog](#-papers--blog)
    - [Survey](#survey)
    - [Pre-LM RLHF](#pre-lm-rlhf)
    - [LM RLHF](#lm-rlhf)
  - [Repos](#repos)
  - [Datasets](#datasets)
  - [Videos \& Lectures](#videos--lectures)
  - [TODO](#todo)
  - [📧 Contact Me](#contact-me)

## 📜 Papers & Blog
### Survey
- [Illustrating Reinforcement Learning from Human Feedback (RLHF)](https://huggingface.co/blog/rlhf): The blog post that largely inspired this repo.
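The blog post above breaks RLHF into three stages: supervised fine-tuning, training a reward model on pairwise human preferences, and RL fine-tuning against that reward model. The reward-model stage is commonly trained with a Bradley-Terry style pairwise loss; a minimal sketch in plain Python (toy scalar rewards, no real model):

```python
import math


def pairwise_rm_loss(r_chosen: float, r_rejected: float) -> float:
    """Bradley-Terry pairwise loss: -log sigmoid(r_chosen - r_rejected).

    The loss shrinks as the reward model scores the human-preferred
    completion higher than the rejected one.
    """
    return -math.log(1.0 / (1.0 + math.exp(-(r_chosen - r_rejected))))


# A correctly ranked pair incurs a smaller loss than a mis-ranked one.
assert pairwise_rm_loss(2.0, 0.5) < pairwise_rm_loss(0.5, 2.0)
```

With equal rewards the loss is exactly `log 2`, i.e. the model is maximally uncertain about which completion the human preferred.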
### Pre-LM RLHF
- [TAMER: Training an Agent Manually via Evaluative Reinforcement](https://www.cs.utexas.edu/~pstone/Papers/bib2html-links/ICDL08-knox.pdf)
- [Interactive Learning from Policy-Dependent Human Feedback](http://proceedings.mlr.press/v70/macglashan17a/macglashan17a.pdf)
- [Deep Reinforcement Learning from Human Preferences](https://proceedings.neurips.cc/paper/2017/hash/d5e2c0adad503c91f91df240d0cd4e49-Abstract.html) [[Blog](https://www.deepmind.com/blog/learning-through-human-feedback)]
- [Deep TAMER: Interactive Agent Shaping in High-Dimensional State Spaces](https://ojs.aaai.org/index.php/AAAI/article/view/11485)

### LM RLHF
- [Fine-Tuning Language Models from Human Preferences](https://arxiv.org/abs/1909.08593) [[Code (TensorFlow)](https://github.com/openai/lm-human-preferences)]
- [Learning to summarize with human feedback](https://proceedings.neurips.cc/paper/2020/hash/1f89885d556929e98d3ef9b86448f951-Abstract.html) [[Video](https://www.youtube.com/watch?v=vLTmnaMpQCs)]
- [Recursively Summarizing Books with Human Feedback](https://arxiv.org/abs/2109.10862)
- [WebGPT: Browser-assisted question-answering with human feedback](https://arxiv.org/abs/2112.09332)
- [Training language models to follow instructions with human feedback](https://arxiv.org/abs/2203.02155)
- [Teaching language models to support answers with verified quotes](https://www.deepmind.com/publications/gophercite-teaching-language-models-to-support-answers-with-verified-quotes)
- [Improving alignment of dialogue agents via targeted human judgements](https://arxiv.org/abs/2209.14375)
- [ChatGPT: Optimizing Language Models for Dialogue](https://openai.com/blog/chatgpt/)
- [Scaling Laws for Reward Model Overoptimization](https://arxiv.org/abs/2210.10760)
- [Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback](https://arxiv.org/abs/2204.05862)
- [Red Teaming Language Models to Reduce Harms: Methods, Scaling Behaviors, and Lessons Learned](https://arxiv.org/abs/2209.07858)
- [Dynamic Planning in Open-Ended Dialogue using Reinforcement Learning](https://arxiv.org/abs/2208.02294)
- [Is Reinforcement Learning (Not) for Natural Language Processing?: Benchmarks, Baselines, and Building Blocks for Natural Language Policy Optimization](https://arxiv.org/abs/2210.01241) [[Code](https://github.com/allenai/RL4LMs)]
- [Offline RL for Natural Language Generation with Implicit Language Q Learning](https://arxiv.org/abs/2206.11871) [[Code](https://github.com/Sea-Snell/Implicit-Language-Q-Learning)]

## Repos
- [Transformer Reinforcement Learning (TRL)](https://github.com/lvwerra/trl): Trains GPT-style transformer models with ***Proximal Policy Optimization*** (**PPO**).
- [Transformer Reinforcement Learning X (TRLX)](https://github.com/CarperAI/trlx): An enhanced fork of TRL that adds ***Implicit Language Q-Learning*** (**ILQL**).
- [RL4LMs (A modular RL library to fine-tune language models to human preferences)](https://github.com/allenai/RL4LMs) [[Site](https://rl4lms.apps.allenai.org/)]: Thoroughly tested and benchmarked with over **2000 experiments** on language-generation tasks, covering a variety of metrics and several RL algorithms. Also supports seq2seq models (e.g. T5, BART).

## Datasets
- [HH-RLHF](https://github.com/anthropics/hh-rlhf) [[HF Hub](https://huggingface.co/datasets/Anthropic/hh-rlhf)]: A human-preference dataset of chosen/rejected assistant responses, released by Anthropic.
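Each HH-RLHF record pairs a human-preferred transcript with a rejected one under `chosen`/`rejected` keys. Loading the real data (e.g. via `datasets.load_dataset("Anthropic/hh-rlhf")`) requires network access, so the record below is a made-up stand-in with the same shape:

```python
# Made-up record mimicking the HH-RLHF schema: each entry holds a full
# Human/Assistant transcript for the preferred and the rejected response.
record = {
    "chosen": "\n\nHuman: How do I bake bread?\n\nAssistant: Mix flour, water, salt, and yeast, then let the dough rise.",
    "rejected": "\n\nHuman: How do I bake bread?\n\nAssistant: I don't know.",
}


def last_assistant_turn(transcript: str) -> str:
    """Return the final Assistant reply from an HH-RLHF style transcript."""
    return transcript.split("\n\nAssistant:")[-1].strip()


# Reward-model training consumes such (preferred, rejected) pairs.
preferred = last_assistant_turn(record["chosen"])
rejected = last_assistant_turn(record["rejected"])
assert preferred != rejected
```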
## Videos & Lectures
- [Learning Task Specifications for Reinforcement Learning from Human Feedback](https://www.youtube.com/watch?v=vebzz6EKD2w)
- [Reinforcement Learning from Human Feedback: From Zero to chatGPT](https://www.youtube.com/watch?v=2MBJOuVq380)

## TODO
- [ ] Add more descriptions
## 📧 Contact Me
If you have any questions, please feel free to contact me (📧: [email protected]).