https://github.com/HumanSignal/awesome-human-in-the-loop

Awesome List of Human in the Loop resources and references for retraining models.

# awesome-human-in-the-loop
> An awesome list of tools and resources to get started with Human in the Loop or RLHF.

[![Awesome](https://awesome.re/badge-flat2.svg)](https://awesome.re)

## Awesome RLHF

### Blog Posts + Academic Papers

* **[OpenAI - Aligning language models to follow instructions](https://openai.com/research/instruction-following)** | Blog post, how-to
* **[DeepMind (arXiv) - Scaling Language Models: Methods, Analysis & Insights from Training Gopher](https://arxiv.org/abs/2112.11446)** | Academic paper
* **[Hugging Face - Illustrating Reinforcement Learning from Human Feedback (RLHF)](https://huggingface.co/blog/rlhf)** | Definition, blog post
* **[LessWrong - RLHF](https://www.lesswrong.com/posts/rQH4gRmPMJyjtMpTn/rlhf)** | Blog post
* **[Unite.ai - What is Reinforcement Learning From Human Feedback (RLHF)](https://www.unite.ai/what-is-reinforcement-learning-from-human-feedback-rlhf/)** | Blog post
* **[Surge.ai - Introduction to Reinforcement Learning with Human Feedback](https://www.surgehq.ai/blog/introduction-to-reinforcement-learning-with-human-feedback-rlhf-series-part-1)** | Blog post

### Tools and Resources
* **[Secrets of RLHF in Large Language Models](https://github.com/OpenLMLab/MOSS-RLHF)** | Code and tutorials for RLHF in a nutshell
* **[Scale - RLHF for Large Language Models](https://scale.com/rlhf)** | Landing page, tool
* **[GitHub - lucidrains/PaLM-rlhf-pytorch](https://github.com/lucidrains/PaLM-rlhf-pytorch)** | Implementation of RLHF (Reinforcement Learning with Human Feedback) on top of the PaLM architecture. Basically ChatGPT but with PaLM
* **[GitHub - anthropics/hh-rlhf](https://github.com/anthropics/hh-rlhf)** | Human preference data for "Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback"
* **[GitHub - conceptofmind/LaMDA-rlhf-pytorch](https://github.com/conceptofmind/LaMDA-rlhf-pytorch)** | Open-source pre-training implementation of Google's LaMDA in PyTorch. Adding RLHF similar to ChatGPT.
* **[GitHub - opendilab/awesome-RLHF](https://github.com/opendilab/awesome-RLHF)** | A curated list of reinforcement learning with human feedback resources (continually updated)
* **[GitHub - CarperAI/trlx](https://github.com/CarperAI/trlx)** | A repo for distributed training of language models with Reinforcement Learning via Human Feedback (RLHF)
* **[GitHub - sunzeyeah/RLHF](https://github.com/sunzeyeah/RLHF)** | Implementation of Chinese ChatGPT
* **[GitHub - LAION-AI/Open-Assistant](https://github.com/LAION-AI/Open-Assistant)** | OpenAssistant is a chat-based assistant that understands tasks, can interact with third-party systems, and retrieve information dynamically to do so.
* **[GitHub - xrsrke/instructGOOSE](https://github.com/xrsrke/instructGOOSE)** | Implementation of Reinforcement Learning from Human Feedback (RLHF)
* **[GitHub - arunprsh/ChatGPT-Decoded-GPT2-FAQ-Bot-RLHF-PPO](https://github.com/arunprsh/ChatGPT-Decoded-GPT2-FAQ-Bot-RLHF-PPO)** | A Practical Guide to Developing a Reliable FAQ Chatbot with Reinforcement Learning and Human Feedback using GPT-2 on AWS
* **[GitHub - voidful/TextRL](https://github.com/voidful/TextRL)** | Implementation of ChatGPT RLHF (Reinforcement Learning with Human Feedback) on any generation model in Hugging Face's transformers (bloomz-176B/bloom/gpt/bart/T5/MetaICL)
* **[GitHub - cogment/Cogment-verse](https://github.com/cogment/cogment-verse)** | Library of Environments, Human Actor UIs and Agent implementation for Human In the Loop Learning & Reinforcement Learning
* **[GitHub - s-JoL/Open-Llama](https://github.com/s-JoL/Open-Llama)** | The complete training code of the open-source high-performance Llama model, including the full process from pre-training to RLHF.
* **[GitHub - jianzhnie/open-chatgpt](https://github.com/jianzhnie/open-chatgpt)** | The open source implementation of ChatGPT and RLHF. Implementing a ChatGPT from scratch.
* **[GitHub - andy-yangz/Awesome-RLHF](https://github.com/andy-yangz/Awesome-RLHF)** | Awesome Reinforcement Learning from Human Feedback, the secret behind ChatGPT XD
* **[GitHub - jordimas/awesome-RLHF-language-models](https://github.com/jordimas/awesome-RLHF-language-models)** | Curated list of resources for Reinforcement Learning from Human Feedback and Language Models
* **[GitHub - RUCAIBox/LLMSurvey](https://github.com/RUCAIBox/LLMSurvey)** | A collection of papers and resources related to Large Language Models.
* **[GitHub - mfarisadip/T5-rlhf-pytorch](https://github.com/mfarisadip/T5-rlhf-pytorch)** | Implementation of RLHF (Reinforcement Learning with Human Feedback) and GAN (Generative Adversarial Network) on top of the T5 architecture.
* **[GitHub - CarperAI/Polygraph](https://github.com/CarperAI/Polygraph)** | RLHF Mechanistic Interpretability and Deception
* **[GitHub - ayulockin/T5-RLHF-TF](https://github.com/ayulockin/T5-RLHF-TF)** | Implementation of Reinforcement Learning from Human Feedback for Summarization Task in TensorFlow
* **[GitHub - ckkissane/rlhf-shakespeare](https://github.com/ckkissane/rlhf-shakespeare)** | Shakespeare transformer fine-tuned to generate positive sentiment samples using RLHF
* **[GitHub - G-U-N/T2I-HumanFeedback](https://github.com/G-U-N/T2I-HumanFeedback)** | Implementations of Baseline Methods for Aligning Text2Img Diffusion Models with Human Feedback
* **[GitHub - nazneenrajani/rlhf_langchain](https://github.com/nazneenrajani/rlhf_langchain)** | Langchain for RLHF
* **[GitHub - uSaiPrashanth/raithubot-training](https://github.com/uSaiPrashanth/raithubot-training)** | Training an RLHF-transformer architecture to answer farmers' queries
* **[GitHub - l294265421/alpaca-rlhf](https://github.com/l294265421/alpaca-rlhf)** | Finetuning alpaca with RLHF (Reinforcement Learning with Human Feedback)
* **[GitHub - DaehanKim/EasyRLHF](https://github.com/DaehanKim/EasyRLHF)** | EasyRLHF aims to provide an easy and minimal interface to train RLHF LMs, using off-the-shelf solutions and datasets
* **[GitHub - jeremy-collins/robot-rlhf](https://github.com/jeremy-collins/robot-rlhf)** | Robot Learning through Human Feedback. Inspired by advancements in NLP, we train a robot policy via reinforcement learning using a reward function learned exclusively from human preferences.
* **[GitHub - Sugoto/GPT-Model-with-RLHF](https://github.com/Sugoto/GPT-Model-with-RLHF)** | This is a GPT 📜 model built from scratch that uses Reinforcement Learning with Human Feedback (RLHF) 🤖 to generate positive 👍 or negative 👎 recreations of Shakespeare's writing style 🎭.
* **[GitHub - vincentmin/transformer_rlhf_eli5](https://github.com/vincentmin/transformer_rlhf_eli5)** | We train a transformer model using Reinforcement Learning from Human Feedback on the Reddit ELI5 dataset

### Demos and Tutorials

* **[GitHub - ojus1/MyMusicTransformer](https://github.com/ojus1/MyMusicTransformer)** | RLHF + MusicTransformer = Generate the music YOU love
* **[GitHub - AmirMotefaker/Create-your-own-ChatGPT](https://github.com/AmirMotefaker/Create-your-own-ChatGPT)** | Create your own ChatGPT with Python