{"id":23792102,"url":"https://github.com/zou-group/avatar","last_synced_at":"2025-05-13T15:39:37.402Z","repository":{"id":244880363,"uuid":"815040970","full_name":"zou-group/avatar","owner":"zou-group","description":"AvaTaR: Optimizing LLM Agents for Tool Usage via Contrastive Reasoning (NeurIPS 2024)","archived":false,"fork":false,"pushed_at":"2025-03-04T07:58:54.000Z","size":14086,"stargazers_count":188,"open_issues_count":2,"forks_count":17,"subscribers_count":11,"default_branch":"main","last_synced_at":"2025-03-28T13:05:32.461Z","etag":null,"topics":["agents","knowledge-base","llms","retrieval","tool-use"],"latest_commit_sha":null,"homepage":"https://arxiv.org/abs/2406.11200","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/zou-group.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-06-14T08:18:17.000Z","updated_at":"2025-03-22T11:43:06.000Z","dependencies_parsed_at":"2025-01-29T22:21:13.198Z","dependency_job_id":null,"html_url":"https://github.com/zou-group/avatar","commit_stats":null,"previous_names":["zou-group/avatar"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/zou-group%2Favatar","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/zou-group%2Favatar/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/zou-group%2Favatar/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/zou-group%2Favatar/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/zou-group","download_url":"https://codeload.github.com/zou-group/avatar/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247186301,"owners_count":20898133,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["agents","knowledge-base","llms","retrieval","tool-use"],"created_at":"2025-01-01T18:27:24.834Z","updated_at":"2025-05-13T15:39:37.384Z","avatar_url":"https://github.com/zou-group.png","language":"Python","readme":"# AvaTaR: Optimizing LLM Agents for Tool Usage via Contrastive Reasoning (NeurIPS 2024)\n\n[NeurIPS paper](https://arxiv.org/abs/2406.11200) | [DSPy Implementation](https://github.com/stanfordnlp/dspy/blob/main/examples/outdated_v2.4_examples/agents/avatar_langchain_tools.ipynb)\n\nAvaTaR is a novel and automatic framework that optimizes an LLM agent to effectively use the provided tools and improve its performance on a given task/domain. 
Here is an example:

```python
from dspy.predict.avatar import Tool, Avatar
from langchain_community.utilities import GoogleSerperAPIWrapper, ArxivAPIWrapper

tools = [
    Tool(
        tool=GoogleSerperAPIWrapper(),
        name="WEB_SEARCH",
        desc="If you have a question, you can use this tool to search the web for the answer."
    ),
]

agent = Avatar(
    tools=tools,
    signature="question->answer",
    verbose=True,
)
```

You can execute it like any other DSPy module by passing the inputs you specified in your task signature:

```python
answer = agent(question)
```

You can optimize the actor for better tool usage with `AvatarOptimizer`, which applies the comparator module during optimization:

```python
from dspy.teleprompt import AvatarOptimizer

def metric(example, prediction, trace=None):
    ...

teleprompter = AvatarOptimizer(
    metric=metric,
    max_iters=10,
    max_negative_inputs=10,
    max_positive_inputs=10,
)

optimized_arxiv_agent = teleprompter.compile(
    student=agent,
    trainset=trainset
)
```
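The `metric` above is left for you to define; it scores an agent prediction against a training example. As a minimal sketch, assuming your trainset examples and the agent's predictions both expose an `answer` field (adapt the field names and scoring to your own signature and task), it could be:

```python
# Minimal metric sketch (assumption: `example` and `prediction` both carry an
# `answer` string field). Replace the containment check with whatever scoring
# your task actually needs, e.g. exact match or F1.
def metric(example, prediction, trace=None):
    gold = str(example.answer).strip().lower()
    pred = str(prediction.answer).strip().lower()
    return gold in pred
```

The optimizer uses this score to separate positive from negative examples for the comparator, as described above.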
For a detailed walkthrough, you can refer to the [notebook](https://github.com/stanfordnlp/dspy/blob/avatar-optimization-integration/examples/agents/avatar_langchain_tools.ipynb) in the DSPy repo.

## 2. (To reproduce the results) Run AvaTaR on STaRK and Flickr30k Entities

### Installation

```
conda create -n avatar python=3.11
pip install stark-qa typeguard
```

### Preparation
- Specify API keys on the command line:
    ```bash
    export ANTHROPIC_API_KEY=YOUR_API_KEY
    ```
    ```bash
    export OPENAI_API_KEY=YOUR_API_KEY
    export OPENAI_ORG=YOUR_ORGANIZATION
    ```
- Embeddings: Download all embeddings by running the following script:
  ```bash
  sh scripts/emb_download_all.sh
  ```
- Raw data:
  STaRK data will be downloaded automatically when running the code.
  For Flickr30k Entities, submit the form at [Flickr 30k & Denotation Graph data](https://forms.illinois.edu/sec/229675) to request access, then organize the data as follows:
  ```
  data
  ├── flickr30k_entities
  │   ├── raw
  │   │   ├── Annotations
  │   │   │   ├── 36979.xml
  │   │   │   ├── ...
  │   │   ├── flickr30k-images
  │   │       ├── 36979.jpg
  │   │       ├── ...
  │   ├── split
  │   │   ├── test.index
  │   │   ├── train.index
  │   │   ├── val.index
  │   ├── qa.csv
  ├── ...
  ```

### Run Agents
We already include the VSS results locally under `output/eval` and the grouping (for STaRK only) under `output/agent`. With these files, you should be able to optimize actor actions directly following the AvaTaR pipeline.

- Optimization: Following the default settings in `config/default_args.json`, run one of the following commands to optimize the actor actions for a group of queries (you can specify the dataset name and group inside each script):
  ```bash
  sh scripts/run_avatar_stark.sh
  ```
  or
  ```bash
  sh scripts/run_avatar_flickr30k_entities.sh
  ```
- Evaluation: Run one of the following commands to evaluate the optimized actor actions:
  ```bash
  sh scripts/eval_avatar_stark.sh
  ```
  or
  ```bash
  sh scripts/eval_avatar_flickr30k_entities.sh
  ```

### Run ReAct baseline
We provide an implementation of the ReAct baseline on STaRK and Flickr30k Entities. The function lists provided to ReAct are under `avatar/tools/react`.
- Evaluation: Run one of the following commands to evaluate ReAct:
  ```bash
  sh scripts/eval_react_stark.sh
  ```
  or
  ```bash
  sh scripts/eval_react_flickr30k_entities.sh
  ```
By default, the logs of the ReAct reasoning and acting process are stored under `logs/`.

## Reference

```
@inproceedings{wu24avatar,
    title     = {AvaTaR: Optimizing LLM Agents for Tool Usage via Contrastive Reasoning},
    author    = {
        Shirley Wu and Shiyu Zhao and
        Qian Huang and Kexin Huang and
        Michihiro Yasunaga and Kaidi Cao and
        Vassilis N. Ioannidis and Karthik Subbian and
        Jure Leskovec and James Zou
    },
    booktitle = {NeurIPS},
    year      = {2024}
}
```