https://github.com/weiminxiong/mpo
MPO: Boosting LLM Agents with Meta Plan Optimization
- Host: GitHub
- URL: https://github.com/weiminxiong/mpo
- Owner: WeiminXiong
- License: apache-2.0
- Created: 2025-02-17T06:16:41.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2025-03-06T15:31:13.000Z (over 1 year ago)
- Last Synced: 2025-07-15T22:44:27.120Z (12 months ago)
- Topics: agent, deep-learning, large-language-models, llm, llms, natural-language-processing
- Language: Python
- Homepage:
- Size: 617 KB
- Stars: 62
- Watchers: 1
- Forks: 4
- Open Issues: 6
Metadata Files:
- Readme: README.md
- License: LICENSE
MPO: Boosting LLM Agents with Meta Plan Optimization



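[📜 Arxiv](https://arxiv.org/abs/2503.02682) • [🤗 Dataset](https://huggingface.co/datasets/xwm/Meta_Plan_Optimization) • 🤗 Models ([ALFWorld-MPO](https://huggingface.co/xwm/ALFWorld-MPO), [SciWorld-MPO](https://huggingface.co/xwm/SciWorld-MPO)) • [🐱 GitHub](https://github.com/WeiminXiong/MPO)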
This repository contains the code for the paper "MPO: Boosting LLM Agents with Meta Plan Optimization".
In this work, we introduce the **Meta Plan Optimization (MPO)** framework, designed to enhance agent planning capabilities by directly integrating explicit guidance. Unlike previous methods that depend on complex knowledge—often requiring extensive human effort or lacking quality assurance—MPO leverages high-level general guidance through meta plans. This approach not only assists agents in planning but also enables continuous optimization of meta plans based on feedback from the agent's task execution.
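As a rough illustration of what integrating explicit guidance can look like, the sketch below prepends a meta plan to an agent's task prompt. The function name `build_agent_prompt`, the prompt wording, and the example task are hypothetical and are not taken from this repository; the prompt templates actually used are in the `prompt` directory.

```python
def build_agent_prompt(task: str, meta_plan: list[str]) -> str:
    """Combine a task description with a high-level meta plan (hypothetical format)."""
    # Number the meta plan steps so the agent can follow them in order.
    steps = "\n".join(f"{i}. {step}" for i, step in enumerate(meta_plan, start=1))
    return (
        f"Task: {task}\n"
        "Meta plan (high-level guidance, not exact actions):\n"
        f"{steps}\n"
        "Now decide your next action."
    )


# Illustrative task and plan only; not drawn from the released dataset.
example_prompt = build_agent_prompt(
    "Put a clean mug on the coffee machine.",
    ["Find a mug.", "Clean the mug at the sink.", "Place the mug on the coffee machine."],
)
print(example_prompt)
```

Because the meta plan is plain text attached to the prompt, it can be swapped for a better one without changing the agent itself.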
## 🔥 News
- [2025/03/05] 🔥🔥🔥 MPO-optimized meta planner released at 🤗 HuggingFace!
  - Llama-3.1-70B-Instruct, enhanced with the MPO-optimized meta planner ([ALFWorld-MPO](https://huggingface.co/xwm/ALFWorld-MPO) and [SciWorld-MPO](https://huggingface.co/xwm/SciWorld-MPO)), achieved an average accuracy of 83.1 on ALFWorld and SciWorld, setting a new state-of-the-art (SOTA) performance.
  - Llama-3.1-8B-Instruct + MPO achieved an average performance of 53.6, outperforming GPT-4o-mini by a significant margin with a 30.1% improvement.
- [2025/03/05] 🔥🔥🔥 The [dataset](https://huggingface.co/datasets/xwm/Meta_Plan_Optimization) for MPO released at 🤗 HuggingFace!
- [2025/03/04] MPO paper and repo released.
## 🛠️ Setup
```bash
git clone https://github.com/WeiminXiong/MPO.git
cd MPO
conda create -n mpo python=3.10
conda activate mpo
pip install -r requirements.txt
bash download_data.sh
```
## 🚀 Quick Start
To evaluate the effectiveness of MPO-optimized meta plans on baseline models, directly run the following bash script:
```bash
bash run_experiment.sh
```
The script performs the following steps:
1. configure the experiment parameters in `run_experiment.sh`
2. launch the model server
3. run the experiment
## 🎮 Dataset Construction
To generate training data for the DPO optimization phase of the meta planner, run the following bash script.
```bash
bash scripts/mc_sample.sh
```
The script performs the following steps:
1. configure the experiment parameters in `scripts/mc_sample.sh`
2. sample meta plans from the SFT-initialized meta plan generator
3. let the explorer agent evaluate the quality of the sampled meta plans
4. generate training data for the DPO optimization phase of the meta planner
For more details about the dataset construction, please refer to the `scripts` directory.
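The sampling, evaluation, and data-generation steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the repository's implementation. It assumes that each sampled meta plan is scored by the explorer agent's mean success over a few rollouts (the script name `mc_sample.sh` suggests Monte Carlo sampling), and that the highest- and lowest-scoring plans for a task form the chosen/rejected pair. The names `sample_meta_plans`, `run_explorer`, and `build_dpo_pair`, as well as the `prompt`/`chosen`/`rejected` record layout, are hypothetical, and the generator and explorer are random stubs.

```python
import random
from statistics import mean


def sample_meta_plans(task: str, n: int) -> list[str]:
    # Stub standing in for the SFT-initialized meta plan generator (hypothetical).
    return [f"plan-{i} for {task}" for i in range(n)]


def run_explorer(task: str, meta_plan: str, rng: random.Random) -> float:
    # Stub for one explorer-agent rollout: 1.0 on success, 0.0 on failure.
    return float(rng.random() < 0.5)


def build_dpo_pair(task: str, n_plans: int = 4, n_rollouts: int = 3, seed: int = 0):
    rng = random.Random(seed)
    plans = sample_meta_plans(task, n_plans)
    # Estimate each plan's quality as its mean success over several rollouts.
    scores = {
        plan: mean(run_explorer(task, plan, rng) for _ in range(n_rollouts))
        for plan in plans
    }
    best = max(plans, key=scores.get)
    worst = min(plans, key=scores.get)
    if scores[best] == scores[worst]:
        return None  # all plans scored equally, so there is no preference signal
    return {"prompt": task, "chosen": best, "rejected": worst}


pair = build_dpo_pair("heat an egg and put it on the table")
print(pair)
```

Tasks for which every sampled plan receives the same score are skipped in this sketch, since a contrastive pair needs a quality gap between the two plans.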
## 🧩 Structure of This Project
There are eight main folders in this project: `agents`, `configs`, `data`, `envs`, `prompt`, `scripts`, `tasks`, `utils`.
- `agents`: code for the agents
- `configs`: configuration files for the experiments
- `data`: data for the experiments
- `envs`: code for the environments
- `prompt`: prompt templates
- `scripts`: scripts for dataset construction and meta plan generation
- `tasks`: code for the tasks
- `utils`: utility functions
## 📖 Citation
If you find this repo helpful, please cite our paper:
```
@misc{xiong2025mpoboostingllmagents,
      title={MPO: Boosting LLM Agents with Meta Plan Optimization},
      author={Weimin Xiong and Yifan Song and Qingxiu Dong and Bingchan Zhao and Feifan Song and Xun Wang and Sujian Li},
      year={2025},
      eprint={2503.02682},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2503.02682},
}
```