https://github.com/wnjxyk/rpc
Official Repository for NeurIPS 2025 Paper: "A Theoretical Study on Bridging Internal Probability and Self-Consistency for LLM Reasoning"
- Host: GitHub
- URL: https://github.com/wnjxyk/rpc
- Owner: WNJXYK
- License: MIT
- Created: 2025-09-30T07:26:05.000Z (9 months ago)
- Default Branch: main
- Last Pushed: 2025-10-07T08:52:43.000Z (9 months ago)
- Last Synced: 2025-10-07T10:29:41.014Z (9 months ago)
- Size: 2.02 MB
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 1
Metadata Files:
- Readme: README.md
- License: LICENSE
# [NeurIPS 2025] A Theoretical Study on Bridging Internal Probability and Self-Consistency for LLM Reasoning
Official Repository for NeurIPS 2025 Paper: "A Theoretical Study on Bridging Internal Probability and Self-Consistency for LLM Reasoning"
## 🛠️ 1. Environment Setup
There are two ways to create the Python environment for this repository. Choose one of the following:
### 1.1. Using a Python virtual environment
```bash
python -m venv rpc
source rpc/bin/activate
pip install -r requirements.txt
```
### 1.2. Using a Conda environment
```bash
conda create -n rpc python=3.9
conda activate rpc
pip install -r requirements.txt
```
## 🚀 2. Reproducing Experiments
### 2.1. Single Experiment
Run evaluation with specific parameters:
```bash
python main.py --dataset MathOdyssey --model InternLM2-Math-Plus-7B --method RPC --K 128
```
**Parameters:**
- `--dataset`: Choose from `MATH`, `MathOdyssey`, `AIME`, `OlympiadBench`
- `--model`: Choose from `Deepseek-Math-RL-7B`, `InternLM2-Math-Plus-1.8B`, `InternLM2-Math-Plus-7B`
- `--method`: Choose from `PPL` (Perplexity), `SC` (Self-Consistency), `RPC` (our method)
- `--K`: Number of reasoning paths to sample (`128` for `MathOdyssey`, `AIME`, and `OlympiadBench`; `64` for `MATH`)
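For orientation, the two baseline options for `--method` can be sketched as follows. This is a minimal illustration under stated assumptions, not the repository's implementation: `perplexity`, `select_sc`, and `select_ppl` are hypothetical names, each sampled path is assumed to be an `(answer, token_logprobs)` pair, and RPC itself is not reproduced here.

```python
import math
from collections import Counter

# Assumed (hypothetical) data layout: each sampled reasoning path is a pair
# (final_answer, token_logprobs), where token_logprobs are natural-log values.

def perplexity(token_logprobs):
    """Perplexity = exp of the negative mean token log-probability."""
    return math.exp(-sum(token_logprobs) / len(token_logprobs))

def select_sc(paths):
    """Self-Consistency: return the most frequent final answer."""
    counts = Counter(answer for answer, _ in paths)
    return counts.most_common(1)[0][0]

def select_ppl(paths):
    """Perplexity baseline: return the answer of the lowest-perplexity path."""
    answer, _ = min(paths, key=lambda p: perplexity(p[1]))
    return answer

# Toy example with K = 3 sampled paths.
paths = [
    ("42", [-0.9, -1.1, -1.0]),
    ("42", [-0.8, -1.2, -1.0]),
    ("7", [-0.1, -0.2, -0.1]),
]
print(select_sc(paths))   # majority answer across paths
print(select_ppl(paths))  # answer from the single most confident path
```

In the toy input the two baselines disagree: majority voting follows the repeated answer, while the perplexity baseline follows the one path the model scored as most likely.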
### 2.2. Batch Experiments
Run comprehensive evaluation across multiple settings:
```bash
bash all_exps.sh
```
This will evaluate all method-dataset-model combinations and save results to `results.txt`.
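The contents of `all_exps.sh` are not shown here, so the following is only a sketch of how such a sweep could be enumerated, assuming it covers every dataset, model, and method listed under Parameters and uses the recommended `--K` per dataset. `sample_count` and `build_commands` are hypothetical helper names.

```python
import itertools
import shlex

DATASETS = ["MATH", "MathOdyssey", "AIME", "OlympiadBench"]
MODELS = ["Deepseek-Math-RL-7B", "InternLM2-Math-Plus-1.8B", "InternLM2-Math-Plus-7B"]
METHODS = ["PPL", "SC", "RPC"]

def sample_count(dataset):
    """Recommended K from the parameter list: 64 for MATH, 128 otherwise."""
    return 64 if dataset == "MATH" else 128

def build_commands():
    """Build one main.py invocation per dataset-model-method combination."""
    commands = []
    for dataset, model, method in itertools.product(DATASETS, MODELS, METHODS):
        commands.append([
            "python", "main.py",
            "--dataset", dataset,
            "--model", model,
            "--method", method,
            "--K", str(sample_count(dataset)),
        ])
    return commands

# Print the commands; pass each list to subprocess.run(...) to execute instead.
for cmd in build_commands():
    print(shlex.join(cmd))
```

With four datasets, three models, and three methods, this enumerates 36 runs.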
### 2.3. Hints
1. If you cannot download data from Hugging Face directly, use the [HF-Mirror](https://hf-mirror.com/) mirror instead.
2. The first run on each dataset may take some time, because it generates the cache used for checking answer equality.
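One way to apply the first hint is the `HF_ENDPOINT` environment variable, which `huggingface_hub` reads to decide which server to contact. The sketch below assumes the repository downloads its data through the standard Hugging Face libraries; `mirror_env` and `run_with_mirror` are hypothetical helper names.

```python
import os
import subprocess

def mirror_env(endpoint="https://hf-mirror.com"):
    """Copy the current environment and point Hugging Face downloads at a mirror."""
    env = dict(os.environ)
    env["HF_ENDPOINT"] = endpoint  # assumption: downloads go through huggingface_hub
    return env

def run_with_mirror(args):
    """Run a command, such as a main.py invocation, with the mirror endpoint set."""
    return subprocess.run(args, env=mirror_env(), check=True)
```

For example, `run_with_mirror(["python", "main.py", "--dataset", "AIME", "--model", "InternLM2-Math-Plus-7B", "--method", "RPC", "--K", "128"])` would launch a single experiment against the mirror.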
## 📚 3. BibTeX
```bibtex
@inproceedings{zhou24theoretical,
author = {Zhou, Zhi and Tan, Yuhao and Li, Zenan and Yao, Yuan and Guo, Lan-Zhe and Li, Yu-Feng and Ma, Xiaoxing},
title = {A Theoretical Study on Bridging Internal Probability and Self-Consistency for LLM Reasoning},
booktitle = {Advances in Neural Information Processing Systems},
year = {2025},
}
```