https://github.com/naidezhujimo/three-search-algorithms-based-on-prm
- Host: GitHub
- URL: https://github.com/naidezhujimo/three-search-algorithms-based-on-prm
- Owner: naidezhujimo
- Created: 2025-04-06T02:12:09.000Z (about 2 months ago)
- Default Branch: main
- Last Pushed: 2025-04-06T02:19:19.000Z (about 2 months ago)
- Last Synced: 2025-04-06T03:20:12.259Z (about 2 months ago)
- Language: Python
- Size: 215 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
# Text Generation with MiniLM and Search Algorithms
This repository demonstrates text generation using a simple language model (MiniLM) combined with different search algorithms (Best-of-N, Beam Search, Lookahead Search) and a reward model (PRM). The implementation includes visualization of the search processes.
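To make the three search strategies concrete, here is a minimal, framework-free sketch. It is not the repository's code: `toy_lm_probs` and `toy_reward` are hypothetical stand-ins for the MiniLM and PRM models, the vocabulary is five integer token ids, and the lookahead variant shown (greedy rollouts scored by the reward function) is only one plausible reading of "simulating future steps".

```python
import random

VOCAB = list(range(5))  # toy vocabulary of 5 integer token ids (assumption)


def toy_lm_probs(seq):
    """Hypothetical stand-in for a language model: next-token distribution."""
    last = seq[-1] if seq else 0
    weights = [1.0 + ((last + t) % 3) for t in VOCAB]
    total = sum(weights)
    return [w / total for w in weights]


def toy_reward(seq):
    """Hypothetical stand-in for a reward model: fraction of even tokens."""
    return sum(1.0 for t in seq if t % 2 == 0) / max(len(seq), 1)


def sample_sequence(prompt, length, rng):
    """Sample `length` tokens from the toy language model."""
    seq = list(prompt)
    for _ in range(length):
        seq.append(rng.choices(VOCAB, weights=toy_lm_probs(seq))[0])
    return seq


def best_of_n(prompt, n, length, rng):
    """Draw n full samples and keep the one with the highest reward."""
    candidates = [sample_sequence(prompt, length, rng) for _ in range(n)]
    return max(candidates, key=toy_reward)


def beam_search(prompt, width, length):
    """Keep the `width` highest-reward partial sequences at every step."""
    beams = [list(prompt)]
    for _ in range(length):
        expanded = [beam + [t] for beam in beams for t in VOCAB]
        expanded.sort(key=toy_reward, reverse=True)
        beams = expanded[:width]
    return beams[0]


def rollout(seq, horizon):
    """Extend seq by `horizon` tokens, always taking the most likely one."""
    seq = list(seq)
    for _ in range(horizon):
        probs = toy_lm_probs(seq)
        seq.append(max(VOCAB, key=lambda t: probs[t]))
    return seq


def lookahead_search(prompt, length, horizon):
    """Choose each token by the reward of a short simulated continuation."""
    seq = list(prompt)
    for _ in range(length):
        seq.append(
            max(VOCAB, key=lambda t: toy_reward(rollout(seq + [t], horizon)))
        )
    return seq


rng = random.Random(0)
print("best-of-n:", best_of_n([1], n=8, length=3, rng=rng))
print("beam:     ", beam_search([1], width=2, length=3))
print("lookahead:", lookahead_search([1], length=3, horizon=2))
```

In this sketch the beam is ranked purely by reward; a real implementation might instead combine reward with the language model's log-probabilities, which is a design choice the sketch does not try to settle.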
The associated mathematical formula can be found here:
## Features
- **MiniLM**: A lightweight language model with LSTM layers for sequence generation.
- **PRM (Proximal Reward Model)**: A neural network that scores generated sequences based on learned rewards.
- **Search Algorithms**:
  - **Best-of-N Sampling**: Selects the best candidate from N generated samples.
  - **Beam Search**: Maintains multiple top candidates during generation.
  - **Lookahead Search**: Simulates future steps to optimize token selection.
- **Visualization**: Generates PNG images showing search decision processes.
## Requirements
- Python 3.7+
- PyTorch
- NumPy
- Matplotlib
- NetworkX

Install dependencies:
```bash
pip install torch numpy matplotlib networkx
```
## Code Structure
- `MiniLM` class: Implements the language model.
- `PRM` class: Implements the reward model.
- Test functions:
  - `test_prm_training()`: Trains the PRM model.
  - `test_best_of_n()`: Demonstrates Best-of-N sampling.
  - `beam_search()`: Visualizes beam search.
  - `lookahead_search()`: Demonstrates multi-step lookahead.
## Example Outputs

*Comparison of candidate tokens using PRM scores*
*Beam search visualization (width=2)*
*Decision points in lookahead search (horizon=2)*
## Contributing
Contributions are welcome! Please open an issue or submit a pull request.
## Blog
https://blog.csdn.net/2303_79071981/article/details/147016903?spm=1001.2014.3001.5502