Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
awesome-llm-human-preference-datasets
A curated list of Human Preference Datasets for LLM fine-tuning, RLHF, and eval.
https://github.com/glgh/awesome-llm-human-preference-datasets
Datasets
- **OpenAI WebGPT Comparisons**
- OpenAI WebGPT
- **OpenAI Summarization**
- OpenAI Learning to Summarize from Human Feedback
- Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback - generated red teaming data from [Red Teaming Language Models to Reduce Harms](https://arxiv.org/abs/2209.07858), divided into 3 sub-datasets:
- **OpenAssistant Conversations Dataset (OASST1)**
- **Stanford Human Preferences Dataset (SHP)**
- **Reddit ELI5**
- **Human ChatGPT Comparison Corpus (HC3)**
- Chinese
- **HuggingFace H4 StackExchange Preference Dataset**
- **ShareGPT.com**
- Precompiled datasets
- **Alpaca**
- **GPT4All**
- **Databricks Dolly Dataset**
- by Databricks employees
- **HH_golden**
- GitHub repo
- Anthropic HH datasets - rewritten using GPT-4 to yield more harmless answers. A comparison of the answers before and after rewriting can be found [here](https://huggingface.co/datasets/Unified-Language-Model-Alignment/Anthropic_HH_Golden). Empirically, compared with the original Harmless dataset, training on this dataset improves harmlessness metrics for various alignment methods such as RLHF and DPO.
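
For readers unfamiliar with how preference data of this kind feeds a method such as DPO, the following is a minimal sketch, not taken from any of the listed datasets or their repositories. The `PreferencePair` record and its field names (`prompt`, `chosen`, `rejected`) are assumptions for illustration, and actual dataset schemas vary. `dpo_loss` computes the standard DPO objective for a single pair from sequence log-probabilities that are assumed to have been computed elsewhere; `beta=0.1` is an arbitrary illustrative value.

```python
import math
from dataclasses import dataclass


@dataclass
class PreferencePair:
    """Hypothetical record layout: one prompt with a preferred and a dispreferred answer."""
    prompt: str
    chosen: str
    rejected: str


def dpo_loss(policy_chosen_logp: float,
             policy_rejected_logp: float,
             ref_chosen_logp: float,
             ref_rejected_logp: float,
             beta: float = 0.1) -> float:
    """DPO loss for one pair: -log(sigmoid(beta * (chosen log-ratio - rejected log-ratio)))."""
    chosen_ratio = policy_chosen_logp - ref_chosen_logp
    rejected_ratio = policy_rejected_logp - ref_rejected_logp
    margin = beta * (chosen_ratio - rejected_ratio)
    # -log(sigmoid(m)) equals log(1 + exp(-m)); branch on the sign to avoid overflow.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


# Illustrative values only: the policy already favors the chosen answer
# slightly more than the reference model does.
pair = PreferencePair(prompt="How do I pick a lock?",
                      chosen="I can't help with that.",
                      rejected="Sure, first you...")
loss = dpo_loss(-1.0, -2.0, -1.5, -1.5)
```

The loss falls below `log(2)` as soon as the policy raises the chosen answer's likelihood relative to the reference more than it raises the rejected answer's, which is the behavior a preference dataset is meant to encourage.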