{"id":28676511,"url":"https://github.com/zjunlp/autoact","last_synced_at":"2025-06-13T23:04:59.152Z","repository":{"id":216523379,"uuid":"733574085","full_name":"zjunlp/AutoAct","owner":"zjunlp","description":"[ACL 2024] AutoAct: Automatic Agent Learning from Scratch for QA via Self-Planning","archived":false,"fork":false,"pushed_at":"2025-01-13T12:50:08.000Z","size":6639,"stargazers_count":221,"open_issues_count":0,"forks_count":13,"subscribers_count":17,"default_branch":"main","last_synced_at":"2025-05-07T20:34:15.801Z","etag":null,"topics":["agent","agent-learning","agents","autoact","large-language-models","natural-language-processing","nlp","self-planning"],"latest_commit_sha":null,"homepage":"https://zjunlp.github.io/project/AutoAct","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/zjunlp.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null}},"created_at":"2023-12-19T16:25:40.000Z","updated_at":"2025-04-26T09:10:28.000Z","dependencies_parsed_at":"2024-04-01T08:15:14.357Z","dependency_job_id":null,"html_url":"https://github.com/zjunlp/AutoAct","commit_stats":null,"previous_names":["zjunlp/autoact"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/zjunlp/AutoAct","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/zjunlp%2FAutoAct","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/zjunlp%2FAutoAct/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/zjunlp%2FAutoAct/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/zjunl
p%2FAutoAct/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/zjunlp","download_url":"https://codeload.github.com/zjunlp/AutoAct/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/zjunlp%2FAutoAct/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":259732772,"owners_count":22903087,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["agent","agent-learning","agents","autoact","large-language-models","natural-language-processing","nlp","self-planning"],"created_at":"2025-06-13T23:04:58.414Z","updated_at":"2025-06-13T23:04:59.133Z","avatar_url":"https://github.com/zjunlp.png","language":"Python","readme":"\u003ch1 align=\"center\"\u003e AutoAct \u003c/h1\u003e\n\u003ch3 align=\"center\"\u003e Automatic Agent Learning from Scratch for QA via Self-Planning \u003c/h3\u003e\n\n\u003cp align=\"center\"\u003e\n  \u003ca href=\"https://arxiv.org/abs/2401.05268\"\u003e📄arXiv\u003c/a\u003e •\n  \u003ca href=\"https://huggingface.co/papers/2401.05268\"\u003e🤗HFPaper\u003c/a\u003e •\n  \u003ca href=\"https://zjunlp.github.io/project/AutoAct/\"\u003e🌐Web\u003c/a\u003e\n\u003c/p\u003e\n\n[![Awesome](https://awesome.re/badge.svg)](https://github.com/zjunlp/AutoAct) \n[![License: Apache 2.0](https://img.shields.io/badge/License-Apache_2.0-green.svg)](https://opensource.org/licenses/Apache-2.0)\n![](https://img.shields.io/github/last-commit/zjunlp/AutoAct?color=green) \n\n## Table of Contents\n\n- 🌻[Acknowledgement](#🌻acknowledgement)\n- 🌟[Overview](#🌟overview)\n- 🔧[Installation](#🔧installation)\n- 
✏️[Self-Instruct](#✏️Self-Instruct)\n- 📝[Self-Planning](#📝Self-Planning)\n  - [Automatic Tool Selection](#Automatic-Tool-Selection)\n  - [Trajectories Synthesis](#Trajectories-Synthesis)\n  - [Self-Differentiation](#Self-Differentiation)\n  - [Group Planning](#Group-Planning)\n- 🚩[Citation](#🚩Citation)\n\n---\n\n\n\n## 🌻Acknowledgement\n\nThe code of our training module is adapted from [FastChat](https://github.com/lm-sys/FastChat), while the inference module is implemented based on [BOLAA](https://github.com/salesforce/BOLAA). The baselines are built on [ReAct](https://github.com/ysymyth/ReAct), [Reflexion](https://github.com/noahshinn/reflexion), [BOLAA](https://github.com/salesforce/BOLAA), [Chameleon](https://github.com/lupantech/chameleon-llm), [ReWOO](https://github.com/billxbf/ReWOO), and [FireAct](https://github.com/anchen1011/FireAct), respectively. We use LangChain with open models via [FastChat](https://github.com/lm-sys/FastChat/blob/main/docs/langchain_integration.md). Thanks for their great contributions!\n\n\n\n## 🌟Overview\n\nLanguage agents have achieved considerable performance on various complex tasks. Despite the incessant exploration in this field, existing language agent systems still struggle with costly, non-reproducible data reliance and face the challenge of compelling a single model to serve multiple functions. To this end, we introduce **AutoAct**, an automatic agent learning framework that does not rely on large-scale annotated data or synthetic trajectories from closed-source models (e.g., GPT-4). Given limited data and a tool library, **AutoAct** first automatically synthesizes planning trajectories without any assistance from humans or strong closed-source models. Then, **AutoAct** leverages a *division-of-labor* strategy to automatically differentiate based on the target task information and synthesized trajectories, producing a sub-agent group to complete the task. 
We conduct comprehensive experiments with different LLMs, which demonstrate that **AutoAct** achieves performance better than or comparable to various strong baselines.\n\n\u003cimg src=\"./method.gif\" alt=\"method\" style=\"zoom: 50%;\" /\u003e\n\n\n\n## 🔧Installation\n\n```bash\ngit clone https://github.com/zjunlp/AutoAct\ncd AutoAct\npip install -r requirements.txt\n```\n\nBefore the experiments, you need to apply for a Bing Search key [here](https://www.microsoft.com/en-us/bing/apis/bing-web-search-api) (not free).\n\n## ✏️Self-Instruct\n\nWe apply self-instruct to the Meta-Agent to acquire a sufficient amount of task data and provide an ample training resource.\n\n```bash\npython Self_Instruct/data_generation.py \\\n    --source_data Self_Instruct/Meta_sample/Meta_Hotpotqa.json \\\n    --target_data Self_Instruct/hotpotqa_metaqa.json \\\n    --dataset_name hotpotqa \\\n    --generate_all_num 800 \\\n    --generate_per_round_num 10 \\\n    --model_name llama-2-13b-chat\n```\n\nThe `source_data` file contains data examples carrying the target task information. The `target_data` consists of data generated through self-instruct. The variable `generate_all_num` represents the total number of generated data instances. 
In order to improve generation efficiency and avoid duplication, we generate `generate_per_round_num` data instances per round.\n\n\n\n## 📝Self-Planning\n\n### Automatic Tool Selection\n\nWith the tool library at hand, we ask the Meta-Agent to select applicable tools for each task automatically.\n\n```bash\npython Self_Planning/Tool_Selection/tool_selected.py \\\n    --model_name llama-2-13b-chat \\\n    --task_name ScienceQA \\\n    --top_k 40 \\\n    --top_p 0.75 \\\n    --max_tokens 1024 \\\n    --tool_save_path Self_Planning/Tool_Selection/{task_name}_Tools.json\n```\n\nInformation about the selected tools will be stored in `tool_save_path`.\n\n\n\n### Trajectories Synthesis\n\n```bash\npython Self_Plan/Traj_Syn/run_task.py \\\n    --agent_name ZeroshotThink_HotPotQA_run_Agent \\\n    --llm_name llama-2-13b-chat \\\n    --max_context_len 4096 \\\n    --task Hotpotqa \\\n    --task_path Self_Instruct/hotpotqa_metaqa.json \\\n    --save_path Self_Plan/Traj_Syn/output/hotpotqa_train_data.jsonl\n```\n\nIn order to obtain high-quality synthesized trajectories, we filter out all the trajectories with $\\texttt{reward}\u003c1$ and collect trajectories with exactly correct answers ($\\texttt{reward}=1$) as the training source for self-differentiation. 
We release the trajectories synthesized by Llama-{13,70}b-chat after filtering in [Google Drive](https://drive.google.com/drive/folders/1Sh6Ksj8T0fT23ePWRf_dDcOTmpZlulr2?usp=sharing) (note that you still need to run `filter_data.py` for trajectory differentiation).\n\n```bash\npython Scripts/filter_data.py \\\n    --source_path Self_Plan/Traj_Syn/output/hotpotqa_train_data.jsonl \\\n    --save_path Self_Plan/Traj_Syn/output \\\n    --task_name HotpotQA \\\n    --filter_num 200\n```\n\n\n\n### Self-Differentiation\n\nIn order to establish a clear *division-of-labor*, we leverage the synthesized planning trajectories to differentiate the Meta-Agent into three sub-agents with distinct functionalities:\n\n- **Plan-Agent** undertakes task decomposition and determines which tool to invoke in each planning loop.\n- **Tool-Agent** is responsible for tool invocation, deciding the parameters with which each tool is called.\n- **Reflect-Agent** engages in reflection by reviewing all the historical trajectories and providing a reflection result.\n\nAgent training:\n\n```bash\nfor agent in plan tool reflect\ndo\necho \"####################\"\necho $agent\necho \"####################\"\nCUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 deepspeed Self_Plan/Train/train_lora.py \\\n    --model_name_or_path llama-2-13b-chat \\\n    --lora_r 8 \\\n    --lora_alpha 16 \\\n    --lora_dropout 0.05 \\\n    --data_path Self_Plan/Traj_Syn/output/data_$agent.json \\\n    --output_dir Self_Plan/Train/lora/HotpotQA/13b-$agent-5-epoch \\\n    --num_train_epochs 5 \\\n    --per_device_train_batch_size 2 \\\n    --per_device_eval_batch_size 1 \\\n    --gradient_accumulation_steps 1 \\\n    --evaluation_strategy \"no\" \\\n    --save_strategy \"steps\" \\\n    --save_steps 10000 \\\n    --save_total_limit 1 \\\n    --learning_rate 1e-4 \\\n    --weight_decay 0. 
\\\n    --warmup_ratio 0.03 \\\n    --lr_scheduler_type \"cosine\" \\\n    --logging_steps 1 \\\n    --fp16 True \\\n    --model_max_length 4096 \\\n    --gradient_checkpointing True \\\n    --q_lora False \\\n    --deepspeed Self_Plan/Train/deepspeed_config_s3.json \\\n    --resume_from_checkpoint False \ndone\n```\n\n\n\n### Group Planning\n\nAfter obtaining the task-specific sub-agents, any new question is processed through group planning among the sub-agents to achieve the desired outcome.\n\n```bash\npython Self_Planning/Group_Planning/run_eval.py \\\n    --agent_name ZeroshotThink_HotPotQA_run_Agent \\\n    --plan_agent plan \\\n    --tool_agent tool \\\n    --reflect_agent reflect \\\n    --max_context_len 4096 \\\n    --task HotpotQA \\\n    --task_path Self_Planning/Group_Planning/benchmark_run/data/hotpotqa \\\n    --save_path Self_Planning/Group_Planning/output/13b\n```\n\nWe release the trajectories on the test sets generated by Llama-{7,13,70}b-chat in [Google Drive](https://drive.google.com/drive/folders/1Sh6Ksj8T0fT23ePWRf_dDcOTmpZlulr2?usp=sharing).\n\nThe prompts used in our experiments are in the [Prompts](https://github.com/zjunlp/AutoAct/tree/main/Prompts) directory.\n\n## 🚩Citation\n\nPlease cite our repository if you use AutoAct in your work. 
Thanks!\n\n```bibtex\n@article{DBLP:journals/corr/abs-2401-05268,\n  author       = {Shuofei Qiao and\n                  Ningyu Zhang and\n                  Runnan Fang and\n                  Yujie Luo and\n                  Wangchunshu Zhou and\n                  Yuchen Eleanor Jiang and\n                  Chengfei Lv and\n                  Huajun Chen},\n  title        = {{AUTOACT:} Automatic Agent Learning from Scratch via Self-Planning},\n  journal      = {CoRR},\n  volume       = {abs/2401.05268},\n  year         = {2024},\n  url          = {https://doi.org/10.48550/arXiv.2401.05268},\n  doi          = {10.48550/ARXIV.2401.05268},\n  eprinttype   = {arXiv},\n  eprint       = {2401.05268},\n  timestamp    = {Thu, 25 Jan 2024 15:41:08 +0100},\n  biburl       = {https://dblp.org/rec/journals/corr/abs-2401-05268.bib},\n  bibsource    = {dblp computer science bibliography, https://dblp.org}\n}\n```\n\n\n\n## 🎉Contributors\n\n\u003ca href=\"https://github.com/zjunlp/AutoAct/graphs/contributors\"\u003e\n  \u003cimg src=\"https://contrib.rocks/image?repo=zjunlp/AutoAct\" /\u003e\u003c/a\u003e\n\nWe will provide long-term maintenance to fix bugs and resolve issues, so if you run into any problems, please open an issue.\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fzjunlp%2Fautoact","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fzjunlp%2Fautoact","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fzjunlp%2Fautoact/lists"}