Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
awesome-human-in-the-loop
Awesome List of Human in the Loop resources and references for retraining models.
https://github.com/HumanSignal/awesome-human-in-the-loop
Last synced: 5 days ago
Awesome RLHF
Tools and Resources
- Secrets of RLHF in Large Language Models
- Github - lucidrains/PaLM-rlhf-pytorch
- Github - anthropics/hh-rlhf
- Github - conceptofmind/LaMDA-rlhf-pytorch - Open-source pre-training implementation of Google's LaMDA in PyTorch. Adding RLHF similar to ChatGPT.
- Github - opendilab/awesome-RLHF
- Github - CarperAI/trlx
- Github - sunzeyeah/RLHF
- Github - LAION-AI/Open-Assistant - A chat-based assistant that understands tasks, can interact with third-party systems, and retrieves information dynamically to do so.
- Github - xrsrke/instructGOOSE
- Github - arunprsh/ChatGPT-Decoded-GPT2-FAQ-Bot-RLHF-PPO - GPT-2 on AWS
- Github - voidful/TextRL - 176B/bloom/gpt/bart/T5/MetaICL)
- Github - cogment/Cogment-verse
- Github - s-JoL/Open-Llama - Open-source high-performance Llama model, including the full process from pre-training to RLHF.
- Github - jianzhnie/open-chatgpt
- Github - andy-yangz/Awesome-RLHF
- Github - jordimas/awesome-RLHF-language-models
- Github - RUCAIBox/LLMSurvey
- Github - mfarisadip/T5-rlhf-pytorch
- Github - CarperAI/Polygraph
- Github - ayulockin/T5-RLHF-TF
- Github - ckkissane/rlhf-shakespeare - Fine-tuned to generate positive sentiment samples using RLHF
- Github - G-U-N/T2I-HumanFeedback
- Github - nazneenrajani/rlhf_langchain
- Github - uSaiPrashanth/raithubot-training - transformer architecture to answer farmers' queries
- Github - l294265421/alpaca-rlhf
- Github - DaehanKim/EasyRLHF - Off-the-shelf solutions and datasets
- Github - jeremy-collins/robot-rlhf
- Github - Sugoto/GPT-Model-with-RLHF
- Github - vincentmin/transformer_rlhf_eli5
- Scale - RLHF for Large Language Models
Demos and Tutorials
Blog Posts + Academic Papers
- OpenAI - Aligning language models to follow instructions - to-use
- Cornell University - Scaling Language Models: Methods, Analysis & Insights from Training Gopher
- Hugging Face - Illustrating Reinforcement Learning from Human Feedback (RLHF)
- LessWrong - RLHF
- Unite.ai - What is Reinforcement Learning From Human Feedback (RLHF)
- Surge.ai - Introduction to Reinforcement Learning with Human Feedback
Programming Languages
Categories
Keywords
- rlhf: 12
- reinforcement-learning: 10
- chatgpt: 9
- llm: 4
- language-model: 4
- deep-learning: 4
- human-feedback: 4
- machine-learning: 4
- pytorch: 3
- large-language-models: 3
- transformers: 3
- artificial-intelligence: 3
- gpt-2: 2
- nlp: 2
- instruction-tuning: 2
- ai: 2
- llama: 2
- python: 2
- gpt2: 1
- question-answering: 1
- chatbot: 1
- aws: 1
- instructgpt: 1
- sagemaker: 1
- nextjs: 1
- discord-bot: 1
- assistant: 1
- pangu: 1
- glm: 1
- deepspeed: 1
- deep-reinforcement-learning: 1
- attention-mechanism: 1
- openai-api: 1
- openai: 1
- ml: 1
- large-language-model: 1
- chatgpt3: 1
- chatgpt-api: 1
- robotics: 1
- alignment: 1
- sft: 1
- rrhf: 1
- ipo: 1
- dpo: 1
- alpaca: 1
- pre-training: 1
- pre-trained-language-models: 1
- natural-language-processing: 1
- llms: 1
- in-context-learning: 1