https://github.com/ruvenguna94/dialogue-summary-remove-toxic-text-ppo
Fine-tuning FLAN-T5 with PPO and PEFT to generate less toxic dialogue summaries. This notebook uses Meta AI's hate speech model as the reward model in an RLHF loop.
detoxification dialogue-summarization generative-ai hate-speech-detection nlp ppo-pytorch reward-model toxic-comment-classification toxicity-analysis
- Host: GitHub
- URL: https://github.com/ruvenguna94/dialogue-summary-remove-toxic-text-ppo
- Owner: RuvenGuna94
- Created: 2025-01-02T07:59:11.000Z (5 months ago)
- Default Branch: main
- Last Pushed: 2025-01-04T12:58:05.000Z (4 months ago)
- Last Synced: 2025-02-13T03:54:44.072Z (3 months ago)
- Topics: detoxification, dialogue-summarization, generative-ai, hate-speech-detection, nlp, ppo-pytorch, reward-model, toxic-comment-classification, toxicity-analysis
- Language: Jupyter Notebook
- Homepage:
- Size: 12.9 MB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files: