Projects in Awesome Lists tagged with reward-models
A curated list of projects in awesome lists tagged with reward-models.
https://github.com/RLHFlow/RLHF-Reward-Modeling
Recipes to train reward models for RLHF.
Last synced: 07 May 2025
https://github.com/jackaduma/vicuna-lora-rlhf-pytorch
A full pipeline to fine-tune the Vicuna LLM with LoRA and RLHF on consumer hardware. Implementation of RLHF (Reinforcement Learning from Human Feedback) on top of the Vicuna architecture. Basically ChatGPT, but with Vicuna.
chatgpt finetune gpt llama llm lora peft ppo pytorch reward-models rlhf vicuna vicuna-7b
Last synced: 13 Apr 2025
https://github.com/jackaduma/chatglm-lora-rlhf-pytorch
A full pipeline to fine-tune the ChatGLM LLM with LoRA and RLHF on consumer hardware. Implementation of RLHF (Reinforcement Learning from Human Feedback) on top of the ChatGLM architecture. Basically ChatGPT, but with ChatGLM.
chatglm chatglm-6b chatgpt deepspeed finetune gpt llama llm lora peft ppo pytorch reward-models rlhf
Last synced: 27 Apr 2025
https://github.com/jackaduma/alpaca-lora-rlhf-pytorch
A full pipeline to fine-tune the Alpaca LLM with LoRA and RLHF on consumer hardware. Implementation of RLHF (Reinforcement Learning from Human Feedback) on top of the Alpaca architecture. Basically ChatGPT, but with Alpaca.
alpaca chatgpt deepspeed finetune gpt llama llm lora peft ppo pytorch reward-models rlhf
Last synced: 27 Apr 2025
https://github.com/vicgalle/zero-shot-reward-models
ZYN: Zero-Shot Reward Models with Yes-No Questions
llm reinforcement-learning reward-models rlaif rlhf trlx zero-shot
Last synced: 05 Mar 2025