https://github.com/l294265421/alpaca-rlhf

Fine-tuning LLaMA with RLHF (Reinforcement Learning from Human Feedback), based on DeepSpeed Chat

Host: GitHub
URL: https://github.com/l294265421/alpaca-rlhf
Owner: l294265421
License: MIT
Created: 2023-04-12T08:19:46.000Z (about 2 years ago)
Default Branch: main
Last Pushed: 2023-06-05T00:47:02.000Z (about 2 years ago)
Last Synced: 2024-10-18T23:12:41.974Z (8 months ago)
Topics: alpaca, chatgpt, language-model, large-language-models, llama, llm, reinforcement-learning, rlhf
Language: Python
Homepage: https://88aeeb3aef5040507e.gradio.live/
Size: 97.9 MB
Stars: 106
Watchers: 3
Forks: 13
Open Issues: 9
Metadata Files:
- Readme: README.md
- License: LICENSE