An open API service indexing awesome lists of open source software.

Projects in Awesome Lists tagged with reward-models

A curated list of projects in awesome lists tagged with reward-models.

https://github.com/RLHFlow/RLHF-Reward-Modeling

Recipes for training reward models for RLHF.

llama3 llm reward-models rlhf

Last synced: 07 May 2025

https://github.com/jackaduma/vicuna-lora-rlhf-pytorch

A full pipeline for fine-tuning the Vicuna LLM with LoRA and RLHF on consumer hardware. An implementation of RLHF (Reinforcement Learning from Human Feedback) on top of the Vicuna architecture. Basically ChatGPT, but with Vicuna.

chatgpt finetune gpt llama llm lora peft ppo pytorch reward-models rlhf vicuna vicuna-7b

Last synced: 13 Apr 2025

https://github.com/jackaduma/chatglm-lora-rlhf-pytorch

A full pipeline for fine-tuning the ChatGLM LLM with LoRA and RLHF on consumer hardware. An implementation of RLHF (Reinforcement Learning from Human Feedback) on top of the ChatGLM architecture. Basically ChatGPT, but with ChatGLM.

chatglm chatglm-6b chatgpt deepspeed finetune gpt llama llm lora peft ppo pytorch reward-models rlhf

Last synced: 27 Apr 2025

https://github.com/jackaduma/alpaca-lora-rlhf-pytorch

A full pipeline for fine-tuning the Alpaca LLM with LoRA and RLHF on consumer hardware. An implementation of RLHF (Reinforcement Learning from Human Feedback) on top of the Alpaca architecture. Basically ChatGPT, but with Alpaca.

alpaca chatgpt deepspeed finetune gpt llama llm lora peft ppo pytorch reward-models rlhf

Last synced: 27 Apr 2025

https://github.com/vicgalle/zero-shot-reward-models

ZYN: Zero-Shot Reward Models with Yes-No Questions.

llm reinforcement-learning reward-models rlaif rlhf trlx zero-shot

Last synced: 05 Mar 2025