https://github.com/nevercase/qwen-grpo
A reproduction of DeepSeek-style thinking, built on Qwen and a GRPO trainer.
- Host: GitHub
- URL: https://github.com/nevercase/qwen-grpo
- Owner: neverCase
- Created: 2025-05-03T09:58:55.000Z (about 1 year ago)
- Default Branch: main
- Last Pushed: 2025-05-06T09:53:49.000Z (about 1 year ago)
- Last Synced: 2025-05-19T03:12:28.518Z (about 1 year ago)
- Language: Python
- Size: 1000 Bytes
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# qwen-grpo
A reproduction of DeepSeek-style thinking, built on Qwen and a GRPO trainer.
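
The README does not show any code, so the following is only a minimal sketch of the idea behind GRPO, not this repository's implementation. It assumes the common formulation in which several completions are sampled for one prompt and each reward is normalized against the mean and standard deviation of its own group. The function name `group_relative_advantages`, the `eps` parameter, the choice of population standard deviation, and the reward values are all hypothetical and chosen for illustration.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize each reward against the mean and std of its own group.

    Assumption: population standard deviation is used; real trainers
    may use the sample standard deviation instead.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    # eps avoids division by zero when every reward in the group is equal.
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical rewards for four completions sampled from a single prompt.
rewards = [1.0, 0.0, 0.0, 1.0]
advantages = group_relative_advantages(rewards)
print(advantages)
```

Completions that score above the group mean receive a positive advantage and those below receive a negative one, so no separate value model is needed to form the baseline.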