awesome-RLHF
A curated list of reinforcement learning with human feedback resources (continually updated)
https://github.com/opendilab/awesome-RLHF
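Most entries below revolve around the same preference-modeling core: fit a reward model to pairwise human comparisons, then optimize a policy against it. As a quick orientation (not drawn from any particular entry in this list), here is a minimal sketch of the pairwise Bradley-Terry reward-model loss; the toy `reward_model` network, feature shapes, and names are illustrative assumptions only.

```python
# Minimal illustrative sketch (not from any listed paper) of the pairwise
# Bradley-Terry reward-model loss at the heart of standard RLHF pipelines.
# The toy `reward_model` and the random feature tensors are assumptions
# used only to make the example runnable.
import torch
import torch.nn as nn
import torch.nn.functional as F

# Stand-in for a real preference reward model (normally an LLM with a scalar head).
reward_model = nn.Sequential(nn.Linear(16, 64), nn.Tanh(), nn.Linear(64, 1))

def preference_loss(chosen_feats: torch.Tensor, rejected_feats: torch.Tensor) -> torch.Tensor:
    """-log sigmoid(r(chosen) - r(rejected)), averaged over the batch."""
    r_chosen = reward_model(chosen_feats).squeeze(-1)      # (batch,)
    r_rejected = reward_model(rejected_feats).squeeze(-1)  # (batch,)
    return -F.logsigmoid(r_chosen - r_rejected).mean()

# Toy usage: random "features" standing in for encoded (prompt, response) pairs.
loss = preference_loss(torch.randn(8, 16), torch.randn(8, 16))
loss.backward()
```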
Papers
2024
- On Diversified Preferences of Large Language Model Alignment
- A Dense Reward View on Aligning Text-to-Image Diffusion with Preference (official code)
- Beyond One-Preference-Fits-All Alignment: Multi-Objective Direct Preference Optimization
- Emulated Disalignment: Safety Alignment for Large Language Models May Backfire!
- Confronting Reward Overoptimization for Diffusion Models: A Perspective of Inductive and Primacy Biases
- RLHF Deciphered: A Critical Analysis of Reinforcement Learning from Human Feedback for LLMs
- Training Diffusion Models with Reinforcement Learning (official code)
- AlignDiff: Aligning Diverse Human Preferences via Behavior-Customisable Diffusion Model (official code)
- Dense Reward for Free in Reinforcement Learning from Human Feedback
- Transforming and Combining Rewards for Aligning Large Language Models
- HybridFlow: A Flexible and Efficient RLHF Framework (official code)
- A Theoretical Analysis of Nash Learning from Human Feedback under General KL-Regularized Preference
- Mitigating the Alignment Tax of RLHF (official code)
- MaxMin-RLHF: Towards equitable alignment of large language models with diverse human preferences
- Dataset Reset Policy Optimization for RLHF (official code)
- Aligning Crowd Feedback via Distributional Preference Reward Modeling
- Self-Play Fine-Tuning Converts Weak Language Models to Strong Language Models
- Zeroth-Order Policy Gradient for Reinforcement Learning from Human Feedback without Reward Inference
- Learning Reward and Policy Jointly from Demonstration and Preference Improves Alignment
- ALaRM: Align Language Models via Hierarchical Rewards Modeling (official code)
- TLCR: Token-Level Continuous Reward for Fine-grained Reinforcement Learning from Human Feedback
- Aligning Large Multimodal Models with Factually Augmented RLHF (official code)
- Direct Large Language Model Alignment Through Self-Rewarding Contrastive Prompt Distillation (official code)
- Arithmetic Control of LLMs for Diverse User Preferences: Directional Preference Alignment with Multi-Objective Rewards (official code)
- Back to Basics: Revisiting REINFORCE Style Optimization for Learning from Human Feedback in LLMs
- Improving Large Language Models via Fine-grained Reinforcement Learning with Minimum Editing Constraint (official code)
- RLAIF vs. RLHF: Scaling Reinforcement Learning from Human Feedback with AI Feedback
- Principled Penalty-based Methods for Bilevel Reinforcement Learning and RLHF (official code)
- A Minimaximalist Approach to Reinforcement Learning from Human Feedback (official code)
- RLHF Workflow: From Reward Modeling to Online RLHF (official code)
- Align Anything: Training All-Modality Models to Follow Instructions with Language Feedback (official code)
- REvolve: Reward Evolution with Large Language Models using Human Feedback
- MA-RLHF: Reinforcement Learning from Human Feedback with Macro Actions (official code)
- Reward Modeling with Ordinal Feedback: Wisdom of the Crowd (official code)
- Aligning Few-Step Diffusion Models with Dense Reward Difference Learning (official code)
- RLHF-V: Towards Trustworthy MLLMs via Behavior Alignment from Fine-grained Correctional Human Feedback
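Several of the 2024 entries above (multi-objective DPO, dense rewards for DPO, directional preference alignment) build on Direct Preference Optimization. For context, a minimal sketch of the basic DPO objective follows; the per-sequence log-probabilities, the `beta` value, and the toy tensors are illustrative assumptions rather than any specific paper's implementation.

```python
# Minimal illustrative sketch of the standard DPO loss:
# -log sigmoid(beta * [(log pi - log pi_ref)(chosen) - (log pi - log pi_ref)(rejected)]).
# Inputs are per-sequence log-probabilities (summed over response tokens),
# assumed to be precomputed; the values below are made up for the toy usage.
import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logps: torch.Tensor,
             policy_rejected_logps: torch.Tensor,
             ref_chosen_logps: torch.Tensor,
             ref_rejected_logps: torch.Tensor,
             beta: float = 0.1) -> torch.Tensor:
    chosen_logratio = policy_chosen_logps - ref_chosen_logps
    rejected_logratio = policy_rejected_logps - ref_rejected_logps
    return -F.logsigmoid(beta * (chosen_logratio - rejected_logratio)).mean()

# Toy usage with made-up log-probabilities for two preference pairs.
loss = dpo_loss(torch.tensor([-12.0, -9.5]), torch.tensor([-14.0, -11.0]),
                torch.tensor([-12.5, -10.0]), torch.tensor([-13.5, -10.5]))
```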
2023
- Adversarial Preference Optimization
- Sample Efficient Reinforcement Learning from Human Feedback via Active Exploration
- Reinforcement Learning from Statistical Feedback: the Journey from AB Testing to ANT Testing
- A Baseline Analysis of Reward Models' Ability To Accurately Analyze Foundation Models Under Distribution Shift
- Data-Efficient Alignment of Large Language Models with Human Feedback Through Natural Language
- Let's Reinforce Step by Step
- Direct Preference-based Policy Optimization without Reward Modeling
- AlignDiff: Aligning Diverse Human Preferences via Behavior-Customisable Diffusion Model
- Eureka: Human-Level Reward Design via Coding Large Language Models
- Safe RLHF: Safe Reinforcement Learning from Human Feedback
- Quality Diversity through Human Feedback
- ReMax: A Simple, Effective, and Efficient Reinforcement Learning Method for Aligning Large Language Models
- Tuning computer vision models with task rewards
- The Wisdom of Hindsight Makes Language Models Better Instruction Followers
- Language Instructed Reinforcement Learning for Human-AI Coordination
- Aligning Language Models with Offline Reinforcement Learning from Human Feedback
- Preference Ranking Optimization for Human Alignment (official code)
- Bridging the Gap: A Survey on Integrating (Human) Feedback for Natural Language Generation
- GPT-4 Technical Report
- RAFT: Reward rAnked FineTuning for Generative Foundation Model Alignment
- RRHF: Rank Responses to Align Language Models with Human Feedback without tears
- Few-shot Preference Learning for Human-in-the-Loop RL
- Better Aligning Text-to-Image Models with Human Preference
- ImageReward: Learning and Evaluating Human Preferences for Text-to-Image Generation (dataset: COCO)
- Aligning Text-to-Image Models using Human Feedback
- Visual ChatGPT: Talking, Drawing and Editing with Visual Foundation Models
- Pretraining Language Models with Human Preferences
- Aligning Language Models with Preferences through f-divergence Minimization (f-DPG)
- Principled Reinforcement Learning with Human Feedback from Pairwise or K-wise Comparisons
- The Capacity for Moral Self-Correction in Large Language Models
- Iterative Preference Learning from Human Feedback: Bridging Theory and Practice for RLHF under KL-Constraint
- Inverse Preference Learning: Preference-based RL without a Reward Function
- AlpacaFarm: A Simulation Framework for Methods that Learn from Human Feedback
- Preference-grounded Token-level Guidance for Language Model Fine-tuning
- Fantastic Rewards and How to Tame Them: A Case Study on Reward Learning for Task-oriented Dialogue Systems
2022
- Is Reinforcement Learning (Not) for Natural Language Processing?: Benchmarks, Baselines, and Building Blocks for Natural Language Policy Optimization
- Scaling Laws for Reward Model Overoptimization
- Improving alignment of dialogue agents via targeted human judgements
- Red Teaming Language Models to Reduce Harms: Methods, Scaling Behaviors, and Lessons Learned
- Dynamic Planning in Open-Ended Dialogue using Reinforcement Learning
- Quark: Controllable Text Generation with Reinforced Unlearning
- Datasets: WRITINGPROMPTS, [SST-2](https://huggingface.co/distilbert-base-uncased-finetuned-sst-2-english), [WIKITEXT-103](https://blog.salesforceairesearch.com/the-wikitext-long-term-dependency-language-modeling-dataset/)
- Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback
- Datasets: TriviaQA, OpenBookQA, [LAMBADA](https://zenodo.org/record/2630551#.Y_KLJ-yZNhF), [HumanEval](https://github.com/openai/human-eval), [MMLU](https://github.com/hendrycks/test), [TruthfulQA](https://github.com/sylinrl/TruthfulQA)
- Teaching language models to support answers with verified quotes
- Training language models to follow instructions with human feedback
- Constitutional AI: Harmlessness from AI Feedback
- Discovering Language Model Behaviors with Model-Written Evaluations
- Non-Markovian Reward Modelling from Trajectory Labels via Interpretable Multiple Instance Learning
2021
2020 and before
- Learning to summarize from human feedback
- Fine-Tuning Language Models from Human Preferences
- Scalable agent alignment via reward modeling: a research direction
- Reward learning from human preferences and demonstrations in Atari
- Deep TAMER: Interactive Agent Shaping in High-Dimensional State Spaces
- Deep reinforcement learning from human preferences
- Interactive Learning from Policy-Dependent Human Feedback
Detailed Explanation
- Visit this link for an enhanced paper reading experience.
2025
Blogs
2020 and before
Dataset
Codebases
2020 and before
- enwik8
- DeepSpeed-Chat
- FG-RLHF
- Datasets: IMDB, CNN/DailyMail, [ToTTo](https://github.com/google-research-datasets/ToTTo), [WMT-16 (en-de)](https://www.statmt.org/wmt16/it-translation-task.html), [NarrativeQA](https://github.com/deepmind/narrativeqa), [DailyDialog](http://yanran.li/dailydialog)
- Datasets: TL;DR, CNN/DailyMail
Keywords
rlhf (5), large-language-models (3), multimodal (2), alignment (2), dense-reward-for-direct-preference-optimization (1), preference-alignment (1), text-to-image-generation (1), deep-learning (1), fine-tuning (1), self-play (1), diffusion-models (1), human-feedback (1), reinforcement-learning (1), stable-diffusion (1), text-to-image (1), scalable-oversight (1), ai-alignment (1), chatbot (1), gpt-4 (1), llama (1), multi-modality (1), rlhf-v (1), visual-language-learning (1), llama3 (1), llm (1), chameleon (1), dpo (1), vision-language-model (1)