https://github.com/Corleno/KEPO
KEPO: Knowledge-Enhanced Preference Optimization for Reinforcement Learning with Reasoning
Last synced: 15 days ago
- Host: GitHub
- URL: https://github.com/Corleno/KEPO
- Owner: Corleno
- Created: 2025-11-17T03:50:42.000Z (8 months ago)
- Default Branch: main
- Last Pushed: 2026-03-26T08:32:54.000Z (3 months ago)
- Last Synced: 2026-03-27T01:32:28.514Z (3 months ago)
- Language: Jupyter Notebook
- Homepage:
- Size: 33 MB
- Stars: 2
- Watchers: 0
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
Awesome Lists containing this project
- awesomeopd - KEPO | 2026.01 | Industrial | [arXiv 2602.00400](https://arxiv.org/abs/2602.00400) | KEPO | (🤝 OPD-RL Hybrids — Inside-RL OPD / 🔁 Iterative Self-Bootstrapping)
README
KEPO: Knowledge-Enhanced Preference Optimization for Reinforcement Learning with Reasoning
This code is built upon the **Med-R1** repo. It supports the GKD, GRPO, and KEPO algorithms on state-of-the-art models such as Qwen-3VL. The dataset mainly focuses on multimodal medical data.