{"id":26172267,"url":"https://github.com/weiminxiong/mpo","last_synced_at":"2025-08-02T09:12:13.343Z","repository":{"id":280653416,"uuid":"934018240","full_name":"WeiminXiong/MPO","owner":"WeiminXiong","description":"MPO: Boosting LLM Agents with Meta Plan Optimization","archived":false,"fork":false,"pushed_at":"2025-03-06T15:31:13.000Z","size":632,"stargazers_count":62,"open_issues_count":6,"forks_count":4,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-07-15T22:44:27.120Z","etag":null,"topics":["agent","deep-learning","large-language-models","llm","llms","natural-language-processing"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/WeiminXiong.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2025-02-17T06:16:41.000Z","updated_at":"2025-07-09T06:49:01.000Z","dependencies_parsed_at":null,"dependency_job_id":"d9ed3476-bd51-4407-adb9-b758507ecd78","html_url":"https://github.com/WeiminXiong/MPO","commit_stats":null,"previous_names":["weiminxiong/mpo"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/WeiminXiong/MPO","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/WeiminXiong%2FMPO","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/WeiminXiong%2FMPO/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/WeiminXiong%2FMPO/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/WeiminXiong%2FMPO/manifests","
owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/WeiminXiong","download_url":"https://codeload.github.com/WeiminXiong/MPO/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/WeiminXiong%2FMPO/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":268361745,"owners_count":24238530,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-08-02T02:00:12.353Z","response_time":74,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["agent","deep-learning","large-language-models","llm","llms","natural-language-processing"],"created_at":"2025-03-11T19:55:17.554Z","updated_at":"2025-08-02T09:12:13.334Z","avatar_url":"https://github.com/WeiminXiong.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"\u003ch1 align=\"center\"\u003e\n\u003cimg src=\"assets/planner.png\" width=\"100\" alt=\"rho-logo\" /\u003e\n\u003cbr\u003e\nMPO: Boosting LLM Agents with Meta Plan Optimization\n\u003c/h1\u003e\n\n\u003cdiv align=\"center\"\u003e\n\n![](https://img.shields.io/badge/Paper-arXiv-red)\n![](https://img.shields.io/badge/Model-Released-blue)\n![](https://img.shields.io/badge/Code%20License-Apache%202.0-green)\n\n\u003c/div\u003e\n\n\u003cp align=\"center\"\u003e\n  \u003ca href=\"https://arxiv.org/pdf/2503.02682\"\u003e\u003cb\u003e[📜 Arxiv]\u003c/b\u003e\u003c/a\u003e •\n  \u003ca 
href=\"https://huggingface.co/datasets/xwm/Meta_Plan_Optimization\"\u003e\u003cb\u003e[🤗 Dataset]\u003c/b\u003e\u003c/a\u003e •\n  \u003ca href=\"https://huggingface.co/xwm/ALFWorld-MPO\"\u003e\u003cb\u003e[🤗 Models]\u003c/b\u003e\u003c/a\u003e •\n  \u003ca href=\"https://github.com/WeiminXiong/MPO\"\u003e\u003cb\u003e[🐱 GitHub]\u003c/b\u003e\u003c/a\u003e\n\u003c/p\u003e\n\nThis repository contains the code for the paper \"MPO: Boosting LLM Agents with Meta Plan Optimization\"\n\n\u003cp align=\"center\"\u003e\n\u003cimg src=assets/main.png width=700/\u003e\n\u003c/p\u003e\n\nIn this work, we introduce the **Meta Plan Optimization (MPO)** framework, designed to enhance agent planning capabilities by directly integrating explicit guidance. Unlike previous methods that depend on complex knowledge—often requiring extensive human effort or lacking quality assurance—MPO leverages high-level general guidance through meta plans. This approach not only assists agents in planning but also enables continuous optimization of meta plans based on feedback from the agent's task execution.  \n\n\n## 🔥 News\n\n- [2025/03/05] 🔥🔥🔥 MPO-optimized meta planner released at 🤗 HuggingFace! \n    - Llama-3.1-70B-Instruct, enhanced with the MPO-optimized meta planner ([ALFWorld-MPO](https://huggingface.co/xwm/ALFWorld-MPO) and [SciWorld-MPO](https://huggingface.co/xwm/SciWorld-MPO)), achieved an average accuracy of 83.1 on ALFWorld and SciWorld, setting a new state-of-the-art (SOTA) performance.\n    - Llama-3.1-8B-Instruct + MPO achieved an average performance of 53.6, outperforming GPT-4o-mini by a significant margin with a 30.1% improvement.\n- [2025/03/05] 🔥🔥🔥 The [dataset](https://huggingface.co/datasets/xwm/Meta_Plan_Optimization) for MPO released at 🤗 HuggingFace! 
\n- [2025/03/04] MPO paper and repo released.\n\n\n## 🛠️ Setup\n\n```bash\ngit clone https://github.com/WeiminXiong/MPO.git\ncd MPO\nconda create -n mpo python=3.10\nconda activate mpo\npip install -r requirements.txt\nbash download_data.sh\n```\n\n## 🚀 Quick Start\nTo evaluate the effectiveness of MPO-optimized meta plans on baseline models, run the following bash script:\n```bash\nbash run_experiment.sh\n```\nThe script performs the following steps:\n\n1. configure the experiment parameters in `run_experiment.sh`\n2. launch the model server\n3. run the experiment\n\n## 🎮 Dataset Construction\nTo generate training data for the DPO optimization phase of the meta planner, run the following bash script:\n```bash\nbash scripts/mc_sample.sh\n```\nThe script performs the following steps:\n1. configure the experiment parameters in `scripts/mc_sample.sh`\n2. sample meta plans from the SFT-initialized meta plan generator\n3. have the explorer agent evaluate the quality of the sampled meta plans\n4. 
generate training data for the DPO optimization phase of the meta planner\n\nFor more details about dataset construction, please refer to the `scripts` directory.\n\n## 🧩 Structure of This Project\nThere are eight main folders in this project: `agents`, `configs`, `data`, `envs`, `prompt`, `scripts`, `tasks`, `utils`.\n\n`agents`: code for the agents\n\n`configs`: configuration files for the experiments\n\n`data`: data for the experiments\n\n`envs`: code for the environments\n\n`prompt`: prompt templates\n\n`scripts`: scripts for dataset construction and meta plan generation\n\n`tasks`: code for the tasks\n\n`utils`: utility functions\n\n## 📖 Citation\n\nIf you find this repo helpful, please cite our paper:\n\n```\n@misc{xiong2025mpoboostingllmagents,\n      title={MPO: Boosting LLM Agents with Meta Plan Optimization}, \n      author={Weimin Xiong and Yifan Song and Qingxiu Dong and Bingchan Zhao and Feifan Song and Xun Wang and Sujian Li},\n      year={2025},\n      eprint={2503.02682},\n      archivePrefix={arXiv},\n      primaryClass={cs.CL},\n      url={https://arxiv.org/abs/2503.02682}, \n}\n```","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fweiminxiong%2Fmpo","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fweiminxiong%2Fmpo","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fweiminxiong%2Fmpo/lists"}