Projects in Awesome Lists by RLHFlow
A curated list of projects in awesome lists by RLHFlow.
https://github.com/RLHFlow/RLHF-Reward-Modeling
Recipes to train reward models for RLHF.
Last synced: 07 May 2025
https://github.com/RLHFlow/Online-RLHF
A recipe for online RLHF and online iterative DPO.
Last synced: 24 Feb 2025
https://github.com/rlhflow/online-rlhf
A recipe for online RLHF and online iterative DPO.
Last synced: 08 Apr 2025
https://github.com/rlhflow/online-dpo-r1
Codebase for Iterative DPO Using Rule-based Rewards
Last synced: 19 Jun 2025
https://github.com/rlhflow/rlhf-reward-modeling
A recipe to train reward models for RLHF.
Last synced: 21 Aug 2025
https://github.com/rlhflow/directional-preference-alignment
Directional Preference Alignment
ai-alignment large-language-models rlhf
Last synced: 09 Mar 2026
https://github.com/RLHFlow/Directional-Preference-Alignment
Directional Preference Alignment
ai-alignment large-language-models rlhf
Last synced: 24 Feb 2025
https://github.com/rlhflow/self-rewarding-reasoning-llm
Recipes to train self-rewarding reasoning LLMs.
Last synced: 04 Oct 2025
https://github.com/rlhflow/reinforce-ada
An adaptive sampling framework for Reinforce-style LLM post-training.
Last synced: 11 Oct 2025