Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
awesome-RLAIF
A continually updated list of literature on Reinforcement Learning from AI Feedback (RLAIF)
https://github.com/mengdi-li/awesome-RLAIF
Papers
2024
- Datasets and models
- RLAIF-V: Aligning MLLMs through Open-Source AI Feedback for Super GPT-4V Trustworthiness
- Enhancing Robotic Manipulation with AI Feedback from Multimodal Large Language Models
- Self-Rewarding Language Models
- Math-Shepherd: Verify and Reinforce LLMs Step-by-step without Human Annotations
- Project website
- RLAIF vs. RLHF: Scaling Reinforcement Learning from Human Feedback with AI Feedback
- CriticGPT: Multimodal LLM as a Critic for Robot Manipulation
- A Critical Evaluation of AI Feedback for Aligning Large Language Models
- Enhancing Zero-Shot Chain-of-Thought Reasoning in Large Language Models through Logic
- Language Model Self-improvement by Reinforcement Learning Contemplation
- Project website - VLM-F)
2023
- Datasets and models
- Code
- Code & Model Weights & Dataset
- Code
- Code
- Code & Prompts
- Beyond Human Data: Scaling Self-Training for Problem-Solving with Language Models
- Reinforced Self-Training (ReST) for Language Modeling
- Eureka: Human-Level Reward Design via Coding Large Language Models
- Project website - research/Eureka)
- Accelerating Reinforcement Learning of Robotic Manipulations via Feedback from Large Language Models
- RAIN: Your Language Models Can Align Themselves without Finetuning
- Principle-Driven Self-Alignment of Language Models from Scratch with Minimal Human Supervision
- Motif: Intrinsic Motivation from Artificial Intelligence Feedback
- Language Model Self-improvement by Reinforcement Learning Contemplation
- Language to Rewards for Robotic Skill Synthesis
- Project website - deepmind/language_to_reward_2023)
- Language Instructed Reinforcement Learning for Human-AI Coordination
- Guiding Pretraining in Reinforcement Learning with Large Language Models
- Reward Design with Language Models
- UltraFeedback: Boosting Language Models with High-quality Feedback
2022
Related Awesome Repos
Related Blogs