Projects in Awesome Lists tagged with human-feedback
A curated list of projects in awesome lists tagged with human-feedback.
https://github.com/lucidrains/palm-rlhf-pytorch
Implementation of RLHF (Reinforcement Learning with Human Feedback) on top of the PaLM architecture. Basically ChatGPT but with PaLM
artificial-intelligence attention-mechanisms deep-learning human-feedback reinforcement-learning transformers
Last synced: 14 May 2025
https://github.com/conceptofmind/lamda-rlhf-pytorch
Open-source pre-training implementation of Google's LaMDA in PyTorch. Adding RLHF similar to ChatGPT.
artificial-intelligence attention-mechanism deep-learning human-feedback machine-learning reinforcement-learning transformers
Last synced: 21 Apr 2025
https://github.com/huggingface/data-is-better-together
Let's build better datasets, together!
community datasets human-feedback machine-learning
Last synced: 14 Oct 2025
https://github.com/wxjiao/ParroT
The ParroT framework to enhance and regulate the Translation Abilities during Chat based on open-sourced LLMs (e.g., LLaMA-7b, Bloomz-7b1-mt) and human-written translation and evaluation data.
bloomz chatgpt contrastive error-guided gpt-4 human-feedback instruction-tuning llama lora machine-translation
Last synced: 17 Apr 2025
https://github.com/xrsrke/instructgoose
Implementation of Reinforcement Learning from Human Feedback (RLHF)
chatgpt human-feedback instructgpt reinforcement-learning rlhf
Last synced: 09 Apr 2025
https://github.com/trubrics/trubrics-python
Product analytics for AI Assistants
human-feedback llm llmops machine-learning ml-monitoring mlops model-feedback streamlit
Last synced: 04 Apr 2025
https://github.com/pku-alignment/beavertails
BeaverTails is a collection of datasets designed to facilitate research on safety alignment in large language models (LLMs).
ai-safety beaver datasets gpt human-feedback human-feedback-data language-model large-language-model llama llm llms rlhf safe-rlhf safety
Last synced: 09 Aug 2025
https://github.com/davidberenstein1957/dataset-viber
Dataset Viber is your chill repo for data collection, annotation and vibe checks.
data-collection data-quality evaluation human-feedback
Last synced: 06 Mar 2025
https://github.com/ZiyiZhang27/tdpo
[ICML 2024] Code for the paper "Confronting Reward Overoptimization for Diffusion Models: A Perspective of Inductive and Primacy Biases"
alignment diffusion-models human-feedback reinforcement-learning rlhf stable-diffusion text-to-image
Last synced: 19 Apr 2025
https://github.com/cluebbers/dpo-rlhf-paraphrase-types
Enhancing paraphrase-type generation using Direct Preference Optimization (DPO) and Reinforcement Learning from Human Feedback (RLHF), with large-scale HPC support. This project aligns model outputs to human-ranked data for robust, safety-focused NLP.
alignment deep-learning direct-preference-optimization human-feedback paraphrase-generation paraphrase-type-generation reinforcement-learning transformers
Last synced: 29 Apr 2026
https://github.com/auraoneai/open
Open tools for the human-judgment layer of AI evaluation: EvalKit (Python package + CLI), Robotics ReviewKit, and the Buying Toolkit.
ai-safety auraone evals evaluation human-feedback lerobot llm openx rlds robotics rubrics teleoperation
Last synced: 28 May 2026
https://github.com/sunwang-ai-linguist/bilingual-rlhf-semantic-repair-corpus
Daily Mandarin-English semantic alignment corpus for RLHF training, tone repair, AI metaphor translation, and OpenAI contributor tracking. #SamPickMe #RLHF #TSMC
bilingual bilingual-corpora chatgpt crosslingual crosslingual-transfer gpt-training human-feedback openai rlhf sam-altman semantic-alignment tone-correction
Last synced: 28 Apr 2026