Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

Awesome Lists | Featured Topics | Projects

Projects in Awesome Lists tagged with reward-models

A curated list of projects in awesome lists tagged with reward-models .

https://github.com/jackaduma/vicuna-lora-rlhf-pytorch

A full pipeline to finetune Vicuna LLM with LoRA and RLHF on consumer hardware. Implementation of RLHF (Reinforcement Learning with Human Feedback) on top of the Vicuna architecture. Basically ChatGPT but with Vicuna

chatgpt finetune gpt llama llm lora peft ppo pytorch reward-models rlhf vicuna vicuna-7b

Last synced: 11 Nov 2024

https://github.com/jackaduma/chatglm-lora-rlhf-pytorch

A full pipeline to finetune ChatGLM LLM with LoRA and RLHF on consumer hardware. Implementation of RLHF (Reinforcement Learning with Human Feedback) on top of the ChatGLM architecture. Basically ChatGPT but with ChatGLM

chatglm chatglm-6b chatgpt deepspeed finetune gpt llama llm lora peft ppo pytorch reward-models rlhf

Last synced: 11 Nov 2024

https://github.com/jackaduma/alpaca-lora-rlhf-pytorch

A full pipeline to finetune Alpaca LLM with LoRA and RLHF on consumer hardware. Implementation of RLHF (Reinforcement Learning with Human Feedback) on top of the Alpaca architecture. Basically ChatGPT but with Alpaca

alpaca chatgpt deepspeed finetune gpt llama llm lora peft ppo pytorch reward-models rlhf

Last synced: 11 Nov 2024