https://github.com/Corleno/KEPO

KEPO: Knowledge-Enhanced Preference Optimization for Reinforcement Learning with Reasoning

Host: GitHub
URL: https://github.com/Corleno/KEPO
Owner: Corleno
Created: 2025-11-17T03:50:42.000Z (8 months ago)
Default Branch: main
Last Pushed: 2026-03-26T08:32:54.000Z (3 months ago)
Last Synced: 2026-03-27T01:32:28.514Z (3 months ago)
Language: Jupyter Notebook
Homepage:
Size: 33 MB
Stars: 2
Watchers: 0
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md

Awesome Lists containing this project

awesomeopd - KEPO | 2026.01 | Industrial | [arXiv 2602.00400](https://arxiv.org/abs/2602.00400) | KEPO | (🤝 OPD-RL Hybrids: Inside-RL OPD / 🔁 Iterative Self-Bootstrapping)

README

KEPO: Knowledge-Enhanced Preference Optimization for Reinforcement Learning with Reasoning

This code is built on the **Med-R1** repo. It supports the GKD, GRPO, and KEPO algorithms on state-of-the-art models such as Qwen-3VL. The dataset focuses mainly on multimodal medical data.
