https://github.com/HumanSignal/awesome-human-in-the-loop

Awesome List of Human in the Loop resources and references for retraining models.

List: awesome-human-in-the-loop


Host: GitHub
URL: https://github.com/HumanSignal/awesome-human-in-the-loop
Owner: HumanSignal
License: cc0-1.0
Created: 2023-03-16T19:14:15.000Z (over 2 years ago)
Default Branch: master
Last Pushed: 2023-07-17T09:37:57.000Z (almost 2 years ago)
Last Synced: 2024-05-21T04:00:56.096Z (about 1 year ago)
Size: 15.6 KB
Stars: 17
Watchers: 6
Forks: 1
Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE

Awesome Lists containing this project

ultimate-awesome - awesome-human-in-the-loop - Awesome List of Human in the Loop resources and references for retraining models. (Other Lists / Julia Lists)

README

# awesome-human-in-the-loop

> An awesome list of tools and resources to get started with Human in the Loop or RLHF.

[![Awesome](https://awesome.re/badge-flat2.svg)](https://awesome.re)

## Awesome RLHF

### Blog Posts + Academic Papers

* **[OpenAI - Aligning language models to follow instructions](https://openai.com/research/instruction-following)** | Internal blog post, how-to-use

* **[DeepMind (arXiv) - Scaling Language Models: Methods, Analysis & Insights from Training Gopher](https://arxiv.org/abs/2112.11446)** | Academic paper

* **[Hugging Face - Illustrating Reinforcement Learning from Human Feedback (RLHF)](https://huggingface.co/blog/rlhf)** | Definition, blog post

* **[LessWrong - RLHF](https://www.lesswrong.com/posts/rQH4gRmPMJyjtMpTn/rlhf)** | Blog post

* **[Unite.ai - What is Reinforcement Learning From Human Feedback (RLHF)](https://www.unite.ai/what-is-reinforcement-learning-from-human-feedback-rlhf/)** | Blog post

* **[Surge.ai - Introduction to Reinforcement Learning with Human Feedback](https://www.surgehq.ai/blog/introduction-to-reinforcement-learning-with-human-feedback-rlhf-series-part-1)** | Blog post

### Tools and Resources

* **[Secrets of RLHF in Large Language Models](https://github.com/OpenLMLab/MOSS-RLHF)** | Code and tutorials for RLHF in a nutshell

* **[Scale - RLHF for Large Language Models](https://scale.com/rlhf)** | Landing page, tool

* **[GitHub - lucidrains/PaLM-rlhf-pytorch](https://github.com/lucidrains/PaLM-rlhf-pytorch)** | Implementation of RLHF (Reinforcement Learning with Human Feedback) on top of the PaLM architecture. Basically ChatGPT but with PaLM

* **[GitHub - anthropics/hh-rlhf](https://github.com/anthropics/hh-rlhf)** | Human preference data for "Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback"

* **[GitHub - conceptofmind/LaMDA-rlhf-pytorch](https://github.com/conceptofmind/LaMDA-rlhf-pytorch)** | Open-source pre-training implementation of Google's LaMDA in PyTorch. Adding RLHF similar to ChatGPT.

* **[GitHub - opendilab/awesome-RLHF](https://github.com/opendilab/awesome-RLHF)** | A curated list of reinforcement learning with human feedback resources (continually updated)

* **[GitHub - CarperAI/trlx](https://github.com/CarperAI/trlx)** | A repo for distributed training of language models with Reinforcement Learning via Human Feedback (RLHF)

* **[GitHub - sunzeyeah/RLHF](https://github.com/sunzeyeah/RLHF)** | Implementation of Chinese ChatGPT

* **[GitHub - LAION-AI/Open-Assistant](https://github.com/LAION-AI/Open-Assistant)** | OpenAssistant is a chat-based assistant that understands tasks, can interact with third-party systems, and retrieve information dynamically to do so.

* **[GitHub - xrsrke/instructGOOSE](https://github.com/xrsrke/instructGOOSE)** | Implementation of Reinforcement Learning from Human Feedback (RLHF)

* **[GitHub - arunprsh/ChatGPT-Decoded-GPT2-FAQ-Bot-RLHF-PPO](https://github.com/arunprsh/ChatGPT-Decoded-GPT2-FAQ-Bot-RLHF-PPO)** | A Practical Guide to Developing a Reliable FAQ Chatbot with Reinforcement Learning and Human Feedback using GPT-2 on AWS

* **[GitHub - voidful/TextRL](https://github.com/voidful/TextRL)** | Implementation of ChatGPT RLHF (Reinforcement Learning with Human Feedback) on any generation model in Hugging Face's transformers (bloomz-176B/bloom/gpt/bart/T5/MetaICL)

* **[GitHub - cogment/Cogment-verse](https://github.com/cogment/cogment-verse)** | Library of Environments, Human Actor UIs and Agent implementation for Human In the Loop Learning & Reinforcement Learning

* **[GitHub - s-JoL/Open-Llama](https://github.com/s-JoL/Open-Llama)** | The complete training code of the open-source high-performance Llama model, including the full process from pre-training to RLHF.

* **[GitHub - jianzhnie/open-chatgpt](https://github.com/jianzhnie/open-chatgpt)** | The open-source implementation of ChatGPT and RLHF. Implementing a ChatGPT from scratch.

* **[GitHub - andy-yangz/Awesome-RLHF](https://github.com/andy-yangz/Awesome-RLHF)** | Awesome Reinforcement Learning from Human Feedback, the secret behind ChatGPT XD

* **[GitHub - jordimas/awesome-RLHF-language-models](https://github.com/jordimas/awesome-RLHF-language-models)** | Curated list of resources for Reinforcement Learning from Human Feedback and Language Models

* **[GitHub - RUCAIBox/LLMSurvey](https://github.com/RUCAIBox/LLMSurvey)** | A collection of papers and resources related to Large Language Models.

* **[GitHub - mfarisadip/T5-rlhf-pytorch](https://github.com/mfarisadip/T5-rlhf-pytorch)** | Implementation of RLHF (Reinforcement Learning with Human Feedback) and GAN (Generative Adversarial Network) on top of the T5 architecture.

* **[GitHub - CarperAI/Polygraph](https://github.com/CarperAI/Polygraph)** | RLHF Mechanistic Interpretability and Deception

* **[GitHub - ayulockin/T5-RLHF-TF](https://github.com/ayulockin/T5-RLHF-TF)** | Implementation of Reinforcement Learning from Human Feedback for Summarization Task in TensorFlow

* **[GitHub - ckkissane/rlhf-shakespeare](https://github.com/ckkissane/rlhf-shakespeare)** | Shakespeare transformer fine-tuned to generate positive sentiment samples using RLHF

* **[GitHub - G-U-N/T2I-HumanFeedback](https://github.com/G-U-N/T2I-HumanFeedback)** | Implementations of Baseline Methods for Aligning Text2Img Diffusion Models with Human Feedback

* **[GitHub - nazneenrajani/rlhf_langchain](https://github.com/nazneenrajani/rlhf_langchain)** | Langchain for RLHF

* **[GitHub - uSaiPrashanth/raithubot-training](https://github.com/uSaiPrashanth/raithubot-training)** | Training an RLHF-transformer architecture to answer farmers' queries

* **[GitHub - l294265421/alpaca-rlhf](https://github.com/l294265421/alpaca-rlhf)** | Finetuning alpaca with RLHF (Reinforcement Learning with Human Feedback)

* **[GitHub - DaehanKim/EasyRLHF](https://github.com/DaehanKim/EasyRLHF)** | EasyRLHF aims to provide an easy and minimal interface to train RLHF LMs, using off-the-shelf solutions and datasets

* **[GitHub - jeremy-collins/robot-rlhf](https://github.com/jeremy-collins/robot-rlhf)** | Robot Learning through Human Feedback. Inspired by advancements in NLP, we train a robot policy via reinforcement learning using a reward function learned exclusively from human preferences.

* **[GitHub - Sugoto/GPT-Model-with-RLHF](https://github.com/Sugoto/GPT-Model-with-RLHF)** | This is a GPT 📜 model built from scratch that uses Reinforcement Learning with Human Feedback (RLHF) 🤖 to generate positive 👍 or negative 👎 recreations of Shakespeare's writing style 🎭.

* **[GitHub - vincentmin/transformer_rlhf_eli5](https://github.com/vincentmin/transformer_rlhf_eli5)** | We train a transformer model using Reinforcement Learning from Human Feedback on the Reddit ELI5 dataset

### Demos and Tutorials

* **[GitHub - ojus1/MyMusicTransformer](https://github.com/ojus1/MyMusicTransformer)** | RLHF + MusicTransformer = Generate the music YOU love

* **[GitHub - AmirMotefaker/Create-your-own-ChatGPT](https://github.com/AmirMotefaker/Create-your-own-ChatGPT)** | Create your own ChatGPT with Python
