# Human-Feedback-For-AI-awesome
We aim to maintain an up-to-date list of progress (papers, blog posts, code, etc.) on **Human Feedback For AI** (LLMs, text-to-image generation, robot control, and other tasks), and to provide a guide to the papers that have received wide interest.
Please feel free to [open an issue](https://github.com/Fhujinwu/Human-Feedback-awesome/issues) to add papers.

## Table of Contents

- [Human Feedback for LLM](#human-feedback-for-llm)
- [Human Feedback for Text-Image](#human-feedback-for-text-image)
- [Human Feedback for Robot Control](#human-feedback-for-robot-control)
- [About Reinforcement Learning](#about-reinforcement-learning)

## Human Feedback for LLM
* Deep reinforcement learning from human preferences, nips'17. [[paper]](https://proceedings.neurips.cc/paper_files/paper/2017/file/d5e2c0adad503c91f91df240d0cd4e49-Paper.pdf)
* Recursively Summarizing Books with Human Feedback, arxiv'22. [[paper]](https://arxiv.org/pdf/2109.10862.pdf)
* InstructGPT: Training Language Models to Follow Instructions With Human Feedback, nips'22. [[paper]](https://proceedings.neurips.cc/paper_files/paper/2022/file/b1efde53be364a73914f58805a001731-Paper-Conference.pdf) [[video]](https://www.bilibili.com/video/BV1hd4y187CR/)
* Fine-tuning language models to find agreement among humans with diverse preferences, nips'22. [[paper]](https://proceedings.neurips.cc/paper_files/paper/2022/file/f978c8f3b5f399cae464e85f72e28503-Paper-Conference.pdf)
* Constitutional AI: Harmlessness from AI Feedback, arxiv'22. [[paper]](https://arxiv.org/pdf/2212.08073.pdf)
* Training a helpful and harmless assistant with reinforcement learning from human feedback, arxiv'22. [[paper]](https://arxiv.org/pdf/2204.05862.pdf)
* Direct Preference Optimization: Your Language Model is Secretly a Reward Model, arxiv'23. [[paper]](https://arxiv.org/pdf/2305.18290.pdf) [[code]](https://github.com/eric-mitchell/direct-preference-optimization) [[blogs]](https://zhuanlan.zhihu.com/p/634705904) (a minimal loss sketch follows this list)
* RRHF: Rank responses to align language models with human feedback without tears, arxiv'23. [[paper]](https://arxiv.org/pdf/2304.05302.pdf) [[code]](https://github.com/GanjinZero/RRHF) [[blogs]](https://mp.weixin.qq.com/s/MiToPmFuNXY9wJcKH7pZPw)
* RAFT: Reward rAnked FineTuning for Generative Foundation Model Alignment, arxiv'23. [[paper]](https://arxiv.org/pdf/2304.06767.pdf) [[code]](https://github.com/OptimalScale/LMFlow) [[blogs]](https://mp.weixin.qq.com/s/rhO0bE8CCQsQzsH3kdTbCA)
* Fine-Grained Human Feedback Gives Better Rewards for Language Model Training, arxiv'23. [[paper]](https://arxiv.org/pdf/2306.01693.pdf) [[code]](https://github.com/allenai/FineGrainedRLHF) [[blogs]](https://mp.weixin.qq.com/s/iqf6Tw2iyYNAUoAj3f1MNw)
* Fine-Tuning Language Models with Advantage-Induced Policy Alignment, arxiv'23. [[paper]](https://arxiv.org/pdf/2306.02231.pdf)
* Scaling Laws for Reward Model Overoptimization, icml'23. [[paper]](https://proceedings.mlr.press/v202/gao23h/gao23h.pdf)
* Reward Collapse in Aligning Large Language Models, arxiv'23. [[paper]](https://arxiv.org/pdf/2305.17608.pdf) [[blogs]](https://mp.weixin.qq.com/s/REqLcA9CMEM8M7DYZpuC-Q)
* Chain of Hindsight Aligns Language Models with Feedback, arxiv'23. [[paper]](https://arxiv.org/pdf/2302.02676.pdf)
* Principled Reinforcement Learning with Human Feedback from Pairwise or K-wise Comparisons, arxiv'23. [[paper]](https://arxiv.org/pdf/2301.11270.pdf)
* Reinforcement Learning from Diverse Human Preferences, arxiv'23. [[paper]](https://arxiv.org/pdf/2301.11774.pdf)
* Personalisation within bounds: A risk taxonomy and policy framework for the alignment of large language models with personalised feedback, arxiv'23. [[paper]](https://arxiv.org/pdf/2303.05453.pdf)
* Is Reinforcement Learning (Not) for Natural Language Processing: Benchmarks, Baselines, and Building Blocks for Natural Language Policy Optimization, iclr'23. [[paper]](https://arxiv.org/pdf/2210.01241.pdf) [[code]](https://github.com/allenai/RL4LMs)
* How to Query Human Feedback Efficiently in RL? arxiv'23. [[paper]](https://arxiv.org/pdf/2305.18505.pdf)
* Pretraining Language Models with Human Preferences, icml'23. [[paper]](https://proceedings.mlr.press/v202/korbak23a/korbak23a.pdf)
* Open Problems and Fundamental Limitations of Reinforcement Learning from Human Feedback, arxiv'23. [[paper]](https://arxiv.org/pdf/2307.15217.pdf)
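
The DPO entry above reframes RLHF as a single classification-style loss over preference pairs, so it is easy to illustrate. Below is a minimal sketch of that objective, assuming the summed per-response log-probabilities are already computed; the function name, argument names, and the `beta` default are illustrative assumptions, not the reference implementation linked above.

```python
# A minimal sketch of the DPO objective (Rafailov et al., 2023).
# This is an illustration, not the authors' reference code.
import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logps, policy_rejected_logps,
             ref_chosen_logps, ref_rejected_logps, beta=0.1):
    """DPO loss over a batch of preference pairs.

    Each argument is a 1-D tensor of summed token log-probabilities of the
    chosen / rejected responses under the trainable policy or the frozen
    reference model; beta scales the implicit KL penalty.
    """
    # Implicit rewards are the policy-vs-reference log-ratios.
    chosen_rewards = beta * (policy_chosen_logps - ref_chosen_logps)
    rejected_rewards = beta * (policy_rejected_logps - ref_rejected_logps)
    # Logistic loss that pushes the chosen response above the rejected one.
    return -F.logsigmoid(chosen_rewards - rejected_rewards).mean()

if __name__ == "__main__":
    # Toy usage with random log-probabilities for a batch of 4 pairs.
    b = 4
    loss = dpo_loss(torch.randn(b), torch.randn(b), torch.randn(b), torch.randn(b))
    print(float(loss))
```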

## Human Feedback for Text-Image
* Aligning text-to-image models using human feedback, arxiv'23. [[paper]](https://arxiv.org/pdf/2302.12192.pdf) [[blogs]](https://mp.weixin.qq.com/s/FrqpybryiJ-ikO4ZVeISIg)
* Better aligning text-to-image models with human preference, arxiv'23. [[paper]](https://arxiv.org/pdf/2303.14420.pdf) [[code]](https://github.com/tgxs002/align_sd)
* DPOK: Reinforcement Learning for Fine-tuning Text-to-Image Diffusion Models, arxiv'23. [[paper]](https://arxiv.org/pdf/2305.16381.pdf)
* ImageReward: Learning and Evaluating Human Preferences for Text-to-Image Generation, arxiv'23. [[paper]](https://arxiv.org/pdf/2304.05977.pdf) [[code]](https://github.com/THUDM/ImageReward) (a pairwise-loss sketch follows this list)
* AGIQA-3K: An Open Database for AI-Generated Image Quality Assessment, arxiv'23. [[paper]](https://arxiv.org/pdf/2306.04717.pdf) [[code]](https://github.com/lcysyzxdxc/AGIQA-3k-Database)
* AIGCIQA2023: A Large-scale Image Quality Assessment Database for AI Generated Images: from the Perspectives of Quality, Authenticity and Correspondence, arxiv'23. [[paper]](https://arxiv.org/pdf/2307.00211.pdf) [[code]](https://github.com/wangjiarui153/AIGCIQA2023)
* Pick-a-Pic: An Open Dataset of User Preferences for Text-to-Image Generation, arxiv'23. [[paper]](https://arxiv.org/pdf/2305.01569.pdf)
* Human Preference Score v2: A Solid Benchmark for Evaluating Human Preferences of Text-to-Image Synthesis, arxiv'23. [[paper]](https://arxiv.org/pdf/2306.09341.pdf) [[code]](https://github.com/tgxs002/HPSv2)
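
ImageReward, Pick-a-Pic, and HPS v2 above all train reward models from pairwise human preferences over generated images. The sketch below shows a Bradley-Terry-style pairwise loss on top of precomputed image and prompt features; the `PreferenceRewardHead` module, the feature dimension, and all shapes are hypothetical placeholders rather than any of those projects' actual architectures.

```python
# A minimal sketch of pairwise preference learning for a text-to-image reward
# model. The MLP head and the random "CLIP-like" features are placeholders.
import torch
import torch.nn as nn
import torch.nn.functional as F

class PreferenceRewardHead(nn.Module):
    """Maps concatenated image + prompt features to a scalar preference score."""
    def __init__(self, feat_dim=512):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(2 * feat_dim, 256), nn.ReLU(), nn.Linear(256, 1))

    def forward(self, img_feat, txt_feat):
        return self.mlp(torch.cat([img_feat, txt_feat], dim=-1)).squeeze(-1)

def pairwise_preference_loss(score_preferred, score_rejected):
    # Bradley-Terry: maximize the log-probability that the preferred image wins.
    return -F.logsigmoid(score_preferred - score_rejected).mean()

if __name__ == "__main__":
    head = PreferenceRewardHead()
    txt = torch.randn(8, 512)  # prompt features (placeholder)
    img_win, img_lose = torch.randn(8, 512), torch.randn(8, 512)
    loss = pairwise_preference_loss(head(img_win, txt), head(img_lose, txt))
    loss.backward()
    print(float(loss))
```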

## Human Feedback for Robot Control
* Aligning human preferences with baseline objectives in reinforcement learning, icra'23. [[paper]](https://www.diva-portal.org/smash/get/diva2:1744884/FULLTEXT01.pdf)
* Feedback-efficient interactive reinforcement learning via relabeling experience and unsupervised pre-training, icml'21. [[paper]](https://proceedings.mlr.press/v139/lee21i.html) (a relabeling sketch follows this list)
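
Preference-based RL for control, as in the feedback-efficient approach above, fits a reward model to human preferences over trajectory segments and then relabels previously stored transitions with it so the agent can keep learning off-policy. Below is a rough sketch of the relabeling step only, under assumed shapes and a hypothetical `RewardNet`; it is not the paper's implementation.

```python
# A rough sketch of reward relabeling in preference-based RL for control.
# The reward network, buffer layout, and dimensions are illustrative assumptions.
import torch
import torch.nn as nn

class RewardNet(nn.Module):
    """Learned reward model r(s, a) fit to human segment preferences."""
    def __init__(self, obs_dim, act_dim):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim + act_dim, 64), nn.ReLU(), nn.Linear(64, 1))

    def forward(self, obs, act):
        return self.net(torch.cat([obs, act], dim=-1)).squeeze(-1)

def relabel_replay_buffer(reward_net, observations, actions):
    """Replace stored environment rewards with the learned preference reward."""
    with torch.no_grad():
        return reward_net(observations, actions)

if __name__ == "__main__":
    obs_dim, act_dim, n = 11, 3, 1000
    new_rewards = relabel_replay_buffer(RewardNet(obs_dim, act_dim),
                                        torch.randn(n, obs_dim),
                                        torch.randn(n, act_dim))
    print(new_rewards.shape)  # torch.Size([1000])
```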

## About Reinforcement Learning
* Augmented Proximal Policy Optimization for Safe Reinforcement Learning, aaai'23. [[paper]](https://ojs.aaai.org/index.php/AAAI/article/view/25888) (a clipped-surrogate sketch follows below)
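
The entry above builds on Proximal Policy Optimization, which is also the base policy-gradient algorithm behind most of the RLHF work listed earlier. For reference, here is a minimal sketch of PPO's standard clipped surrogate loss; the function name and the `clip_eps` default are illustrative.

```python
# A minimal sketch of PPO's clipped surrogate objective (Schulman et al., 2017).
import torch

def ppo_clip_loss(log_probs_new, log_probs_old, advantages, clip_eps=0.2):
    # Probability ratio between the current and the data-collecting policy.
    ratio = torch.exp(log_probs_new - log_probs_old)
    unclipped = ratio * advantages
    clipped = torch.clamp(ratio, 1 - clip_eps, 1 + clip_eps) * advantages
    # PPO maximizes the element-wise minimum; return its negated mean as a loss.
    return -torch.min(unclipped, clipped).mean()

if __name__ == "__main__":
    n = 16
    print(float(ppo_clip_loss(torch.randn(n), torch.randn(n), torch.randn(n))))
```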