https://github.com/CJReinforce/PURE

Official code for the paper, "Stop Summation: Min-Form Credit Assignment Is All Process Reward Model Needs for Reasoning"

llm mathematics o1 r1 reasoning reinforcement-finetuning reinforcement-learning rl

