Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
Projects in Awesome Lists tagged with reward-models
A curated list of projects in awesome lists tagged with reward-models.
https://github.com/jackaduma/vicuna-lora-rlhf-pytorch
A full pipeline to fine-tune the Vicuna LLM with LoRA and RLHF on consumer hardware. Implementation of RLHF (Reinforcement Learning from Human Feedback) on top of the Vicuna architecture. Basically ChatGPT, but with Vicuna.
chatgpt finetune gpt llama llm lora peft ppo pytorch reward-models rlhf vicuna vicuna-7b
Last synced: 11 Nov 2024
https://github.com/jackaduma/chatglm-lora-rlhf-pytorch
A full pipeline to fine-tune the ChatGLM LLM with LoRA and RLHF on consumer hardware. Implementation of RLHF (Reinforcement Learning from Human Feedback) on top of the ChatGLM architecture. Basically ChatGPT, but with ChatGLM.
chatglm chatglm-6b chatgpt deepspeed finetune gpt llama llm lora peft ppo pytorch reward-models rlhf
Last synced: 11 Nov 2024
https://github.com/jackaduma/alpaca-lora-rlhf-pytorch
A full pipeline to fine-tune the Alpaca LLM with LoRA and RLHF on consumer hardware. Implementation of RLHF (Reinforcement Learning from Human Feedback) on top of the Alpaca architecture. Basically ChatGPT, but with Alpaca.
alpaca chatgpt deepspeed finetune gpt llama llm lora peft ppo pytorch reward-models rlhf
Last synced: 11 Nov 2024