Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/Fhujinwu/Human-Feedback-awesome
Human-Feedback, RLHF.
- Host: GitHub
- URL: https://github.com/Fhujinwu/Human-Feedback-awesome
- Owner: Fhujinwu
- License: bsd-3-clause
- Created: 2023-04-15T06:53:16.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2023-08-14T13:57:20.000Z (over 1 year ago)
- Last Synced: 2024-05-21T23:00:32.952Z (7 months ago)
- Homepage:
- Size: 50.8 KB
- Stars: 5
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
- ultimate-awesome - Human-Feedback-awesome - Human-Feedback, RLHF. (Other Lists / Monkey C Lists)
README
# Human-Feedback-For-AI-awesome
We would like to maintain an up-to-date list of progress (papers, blogs, code, etc.) made in **Human Feedback For AI** (LLMs, text-to-image, and other tasks), and provide a guide to some of the papers that have received wide interest.
Please feel free to [open an issue](https://github.com/Fhujinwu/Human-Feedback-awesome/issues) to add papers.

- Human Feedback for LLM
- Human Feedback for Text-Image
- Human Feedback for Robot Control
- About Reinforcement Learning

## Human Feedback for LLM
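As background for the papers in this section, here is a minimal sketch of two objectives that recur throughout the list: the pairwise (Bradley-Terry) reward-model loss used in RLHF pipelines such as InstructGPT, and the DPO loss from the Direct Preference Optimization paper below. Function and variable names (and the `beta` default) are illustrative assumptions, not taken from any listed codebase.

```python
import torch
import torch.nn.functional as F


def reward_model_loss(reward_chosen: torch.Tensor,
                      reward_rejected: torch.Tensor) -> torch.Tensor:
    """Pairwise (Bradley-Terry) reward-model loss on human preference data:
    -log sigmoid(r(x, y_chosen) - r(x, y_rejected)), averaged over the batch."""
    return -F.logsigmoid(reward_chosen - reward_rejected).mean()


def dpo_loss(policy_chosen_logps: torch.Tensor,
             policy_rejected_logps: torch.Tensor,
             ref_chosen_logps: torch.Tensor,
             ref_rejected_logps: torch.Tensor,
             beta: float = 0.1) -> torch.Tensor:
    """Direct Preference Optimization loss. Inputs are summed log-probabilities
    of complete responses under the trained policy and a frozen reference model,
    shape (batch,); beta scales the implicit KL penalty toward the reference."""
    chosen_logratio = policy_chosen_logps - ref_chosen_logps
    rejected_logratio = policy_rejected_logps - ref_rejected_logps
    return -F.logsigmoid(beta * (chosen_logratio - rejected_logratio)).mean()
```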
* Deep reinforcement learning from human preferences, nips'17. [[paper]](https://proceedings.neurips.cc/paper_files/paper/2017/file/d5e2c0adad503c91f91df240d0cd4e49-Paper.pdf)
* Recursively Summarizing Books with Human Feedback, arxiv'22. [[paper]](https://arxiv.org/pdf/2109.10862.pdf)
* InstructGPT: Training Language Models to Follow Instructions With Human Feedback, nips'22. [[paper]](https://proceedings.neurips.cc/paper_files/paper/2022/file/b1efde53be364a73914f58805a001731-Paper-Conference.pdf) [[video]](https://www.bilibili.com/video/BV1hd4y187CR/)
* Fine-tuning language models to find agreement among humans with diverse preferences, nips'22. [[paper]](https://proceedings.neurips.cc/paper_files/paper/2022/file/f978c8f3b5f399cae464e85f72e28503-Paper-Conference.pdf)
* Constitutional AI: Harmlessness from AI Feedback, arxiv'22. [[paper]](https://arxiv.org/pdf/2212.08073.pdf)
* Training a helpful and harmless assistant with reinforcement learning from human feedback, arxiv'22. [[paper]](https://arxiv.org/pdf/2204.05862.pdf)
* Direct Preference Optimization: Your Language Model is Secretly a Reward Model, arxiv'23. [[paper]](https://arxiv.org/pdf/2305.18290.pdf) [[code]](https://github.com/eric-mitchell/direct-preference-optimization) [[blogs]](https://zhuanlan.zhihu.com/p/634705904)
* RRHF: Rank responses to align language models with human feedback without tears, arxiv'23. [[paper]](https://arxiv.org/pdf/2304.05302.pdf) [[code]](https://github.com/GanjinZero/RRHF) [[blogs]](https://mp.weixin.qq.com/s/MiToPmFuNXY9wJcKH7pZPw)
* RAFT: Reward rAnked FineTuning for Generative Foundation Model Alignment, arxiv'23. [[paper]](https://arxiv.org/pdf/2304.06767.pdf) [[code]](https://github.com/OptimalScale/LMFlow) [[blogs]](https://mp.weixin.qq.com/s/rhO0bE8CCQsQzsH3kdTbCA)
* Fine-Grained Human Feedback Gives Better Rewards for Language Model Training, arxiv'23. [[paper]](https://arxiv.org/pdf/2306.01693.pdf) [[code]](https://github.com/allenai/FineGrainedRLHF) [[blogs]](https://mp.weixin.qq.com/s/iqf6Tw2iyYNAUoAj3f1MNw)
* Fine-Tuning Language Models with Advantage-Induced Policy Alignment, arxiv'23. [[paper]](https://arxiv.org/pdf/2306.02231.pdf)
* Scaling Laws for Reward Model Overoptimization, icml'23. [[paper]](https://proceedings.mlr.press/v202/gao23h/gao23h.pdf)
* Reward Collapse in Aligning Large Language Models, arxiv'23. [[paper]](https://arxiv.org/pdf/2305.17608.pdf) [[blogs]](https://mp.weixin.qq.com/s/REqLcA9CMEM8M7DYZpuC-Q)
* Chain of Hindsight Aligns Language Models with Feedback, arxiv'23. [[paper]](https://arxiv.org/pdf/2302.02676.pdf)
* Principled Reinforcement Learning with Human Feedback from Pairwise or K-wise Comparisons, arxiv'23. [[paper]](https://arxiv.org/pdf/2301.11270.pdf)
* Reinforcement Learning from Diverse Human Preferences, arxiv'23. [[paper]](https://arxiv.org/pdf/2301.11774.pdf)
* Personalisation within bounds: A risk taxonomy and policy framework for the alignment of large language models with personalised feedback, arxiv'23. [[paper]](https://arxiv.org/pdf/2303.05453.pdf)
* Is Reinforcement Learning (Not) for Natural Language Processing: Benchmarks, Baselines, and Building Blocks for Natural Language Policy Optimization, iclr'23. [[paper]](https://arxiv.org/pdf/2210.01241.pdf) [[code]](https://github.com/allenai/RL4LMs)
* How to Query Human Feedback Efficiently in RL? arxiv'23. [[paper]](https://arxiv.org/pdf/2305.18505.pdf)
* Pretraining Language Models with Human Preferences, icml'23. [[paper]](https://proceedings.mlr.press/v202/korbak23a/korbak23a.pdf)
* Open Problems and Fundamental Limitations of Reinforcement Learning from Human Feedback, arxiv'23. [[paper]](https://arxiv.org/pdf/2307.15217.pdf)

## Human Feedback for Text-Image
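Several of the text-to-image papers below (e.g. "Aligning text-to-image models using human feedback") fine-tune the generator on its own samples weighted by a reward model trained from human preferences. A minimal sketch of that reward-weighted likelihood idea, assuming generic per-sample `log_probs` and `rewards` tensors (names are ours, not an API from any listed repository):

```python
import torch


def reward_weighted_nll(log_probs: torch.Tensor,
                        rewards: torch.Tensor) -> torch.Tensor:
    """Reward-weighted negative log-likelihood: samples that the preference-trained
    reward model scores higher contribute more strongly to the model update.

    log_probs: log-likelihood of each generated image under the model, shape (batch,)
    rewards:   reward-model score for each image, shape (batch,)
    """
    # Rewards act as fixed weights; gradients flow only through log_probs.
    return -(rewards.detach() * log_probs).mean()
```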
* Aligning text-to-image models using human feedback, arxiv'23. [[paper]](https://arxiv.org/pdf/2302.12192.pdf) [[blogs]](https://mp.weixin.qq.com/s/FrqpybryiJ-ikO4ZVeISIg)
* Better aligning text-to-image models with human preference, arxiv'23. [[paper]](https://arxiv.org/pdf/2303.14420.pdf) [[code]](https://github.com/tgxs002/align_sd)
* DPOK: Reinforcement Learning for Fine-tuning Text-to-Image Diffusion Models, arxiv'23. [[paper]](https://arxiv.org/pdf/2305.16381.pdf)
* ImageReward: Learning and Evaluating Human Preferences for Text-to-Image Generation, arxiv'23. [[paper]](https://arxiv.org/pdf/2304.05977.pdf) [[code]](https://github.com/THUDM/ImageReward)
* AGIQA-3K: An Open Database for AI-Generated Image Quality Assessment, arxiv'23. [[paper]](https://arxiv.org/pdf/2306.04717.pdf) [[code]](https://github.com/lcysyzxdxc/AGIQA-3k-Database)
* AIGCIQA2023: A Large-scale Image Quality Assessment Database for AI Generated Images: from the Perspectives of Quality, Authenticity and Correspondence, arxiv'23. [[paper]](https://arxiv.org/pdf/2307.00211.pdf) [[code]](https://github.com/wangjiarui153/AIGCIQA2023)
* Pick-a-Pic: An Open Dataset of User Preferences for Text-to-Image Generation, arxiv'23. [[paper]](https://arxiv.org/pdf/2305.01569.pdf)
* Human Preference Score v2: A Solid Benchmark for Evaluating Human Preferences of Text-to-Image Synthesis, arxiv'23. [[paper]](https://arxiv.org/pdf/2306.09341.pdf) [[code]](https://github.com/tgxs002/HPSv2)

## Human Feedback for Robot Control
* Aligning human preferences with baseline objectives in reinforcement learning, icra'23. [[paper]](https://www.diva-portal.org/smash/get/diva2:1744884/FULLTEXT01.pdf)
* Feedback-efficient interactive reinforcement learning via relabeling experience and unsupervised pre-training, icml'21. [[paper]](https://proceedings.mlr.press/v139/lee21i.html)
## About Reinforcement Learning
* Augmented Proximal Policy Optimization for Safe Reinforcement Learning, aaai'23. [[paper]](https://ojs.aaai.org/index.php/AAAI/article/view/25888)