{"id":30181898,"url":"https://github.com/snap-stanford/optimas","last_synced_at":"2026-03-11T11:34:44.980Z","repository":{"id":303508760,"uuid":"1015487065","full_name":"snap-stanford/optimas","owner":"snap-stanford","description":"Optimize Any User-defined Compound AI Systems","archived":false,"fork":false,"pushed_at":"2025-08-18T23:47:03.000Z","size":1549,"stargazers_count":59,"open_issues_count":3,"forks_count":6,"subscribers_count":2,"default_branch":"main","last_synced_at":"2025-10-27T02:47:07.980Z","etag":null,"topics":["compound-ai-systems","multiagent-systems","optimization","reward-learning"],"latest_commit_sha":null,"homepage":"https://optimas.stanford.edu/","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/snap-stanford.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2025-07-07T15:17:37.000Z","updated_at":"2025-10-26T15:46:50.000Z","dependencies_parsed_at":"2025-07-08T03:02:59.479Z","dependency_job_id":"e982534f-4e4d-4cfd-b1db-d952200b1b1e","html_url":"https://github.com/snap-stanford/optimas","commit_stats":null,"previous_names":["snap-stanford/optimas"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/snap-stanford/optimas","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/snap-stanford%2Foptimas","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/snap-stanford%2Foptimas/tags","releases_url":"https://repos.ecosyste.ms/api/
v1/hosts/GitHub/repositories/snap-stanford%2Foptimas/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/snap-stanford%2Foptimas/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/snap-stanford","download_url":"https://codeload.github.com/snap-stanford/optimas/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/snap-stanford%2Foptimas/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":30379901,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-03-11T06:09:32.197Z","status":"ssl_error","status_checked_at":"2026-03-11T06:09:17.086Z","response_time":84,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["compound-ai-systems","multiagent-systems","optimization","reward-learning"],"created_at":"2025-08-12T09:21:38.966Z","updated_at":"2026-03-11T11:34:44.976Z","avatar_url":"https://github.com/snap-stanford.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"\n\u003cdiv align=\"center\"\u003e\n\u003cfigure class=\"center-figure\"\u003e \u003cimg src=\"media/compound-ai-system.jpg\" width=\"85%\"\u003e\u003c/figure\u003e\n\u003c/div\u003e\n\n\u003ch1 align=\"left\"\u003e\n    Optimas: Optimizing Compound AI Systems with Globally Aligned Local Rewards\n\u003c/h1\u003e\n\n\u003cdiv 
align=\"left\"\u003e\n\n[![](https://img.shields.io/badge/website-Optimas-purple?style=plastic\u0026logo=Google%20chrome)](https://optimas.stanford.edu/)\n[![](https://img.shields.io/badge/Arxiv-paper-red?style=plastic\u0026logo=arxiv)](https://arxiv.org/abs/2507.03041)\n[![](https://img.shields.io/badge/pip-optimas--ai-brightgreen?style=plastic\u0026logo=Python)](https://pypi.org/project/optimas-ai/)\n[![](https://img.shields.io/badge/doc-online-blue?style=plastic\u0026logo=Read%20the%20Docs)](https://optimas.stanford.edu/docs/getting-started/introduction)\n[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)\n\u003c/div\u003e\n\n## NEWS\n- **[Jul 2025]** We released Optimas!\n\n## What is Optimas?\nOptimas is a unified framework for end-to-end optimization of compound AI systems. While traditional optimization methods focus on a single configuration type (such as prompts or hyperparameters), modern compound AI systems require coordinated optimization across multiple heterogeneous configuration types that work well together.\n\nOptimas addresses this challenge through its core innovation: **Globally Aligned Local Reward Functions (LRFs)** that align each component's optimization with global system performance. This enables efficient, decentralized optimization while ensuring that local improvements contribute to global rewards, backed by formal theoretical guarantees.\n\n🔥 Check out our [website](https://optimas.stanford.edu/) for an overview!\n\n## 0. Set up API keys\n\n```\nexport OPENAI_API_KEY=[YOUR KEY]\nexport ANTHROPIC_API_KEY=[YOUR KEY]\n```\n\n## 1. Generate Preference Data (used for reward model and optimization)\n\n`python -m scripts.generate_reward_dataset scripts/configs/generate/{dataset}.yaml`\n\nThis runs reward data generation for a given dataset and system.\n\nOutput: a HuggingFace-style reward dataset saved locally.\n\n## 2. 
Train Initial Reward Model (Local Reward Functions)\n\n`CUDA_VISIBLE_DEVICES=2,3,4,5 torchrun --master_port=56781 --nnodes=1 --nproc_per_node=4 -m scripts.train_reward_model scripts/configs/train/{dataset}.yaml`\n\nHere `nnodes` is the number of nodes, and `nproc_per_node` is the number of GPUs per node.\n\nThis trains a reward model on the preference data. You need to include `WANDB_ENTITY` and `WANDB_PROJECT` in the `.env` file or export them in your shell:\n```\nexport WANDB_ENTITY=your_wandb_entity\nexport WANDB_PROJECT=your_wandb_project\n```\n\n## 3. Run Optimization (Prompts, PPO LoRA, Hyperparameters)\n\n`CUDA_VISIBLE_DEVICES=6 torchrun --master_port=56790 --nnodes=1 --nproc_per_node=1 -m scripts.optimize_system scripts/configs/optimize/{dataset}.yaml`\n\nUses Globally Aligned Local Reward Functions (LRFs) to optimize component variables.\n\nSupports:\n\n- prompt tuning (opro, mipro, copro)\n- hyperparameter search\n- PPO for local models via LoRA (works with vLLM + OpenAI API)\n\nEach component can be optimized independently or jointly.\n\nRemember to include `WANDB_ENTITY` and `WANDB_PROJECT` in the `.env` file or export them in your shell.\n\n## 4. Evaluate Final System\n\n`python scripts/eval_system.py scripts/configs/eval/{dataset}.yaml`\n\nEvaluates a saved system state dict on the val/test sets.\nSupports repeated test runs for systems with randomized components.\n\n## Component Types Supported\n\n- Prompt templates (as strings)\n- Model config (e.g., model name, temperature)\n- Hyperparameters (grid search)\n- Local LLM weights (LoRA + PPO finetuning)\n\nEach component declares:\n\n- `input_fields`\n- `output_fields`\n- `variable` (what to optimize)\n- `variable_search_space` (optional)\n\n## Adding Your Own System\n\n1. Define your pipeline in `examples/systems/\u003cyour_system\u003e.py` as `system_engine()`\n2. Register it in `examples/systems/__init__.py`\n3. 
Add your dataset to `examples/datasets/`\n\nExample:\n```python\ndef system_engine():\n    return CompoundAISystem(\n        components={...},\n        final_output_fields=[...],\n        ground_fields=[...],\n        eval_func=...\n    )\n```\n\n## Reference\n\n```\n@inproceedings{optimas,\n    title        = {Optimas: Optimizing Compound AI Systems with Globally Aligned Local Rewards},\n    author       = {\n        Shirley Wu and Parth Sarthi and Shiyu Zhao and\n        Aaron Lee and Herumb Shandilya and\n        Adrian Mladenic Grobelnik and Nurendra Choudhary and\n        Eddie Huang and Karthik Subbian and\n        Linjun Zhang and Diyi Yang and\n        James Zou and Jure Leskovec\n    },\n    year        = {2026},\n    booktitle   = {ICLR},\n}\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsnap-stanford%2Foptimas","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fsnap-stanford%2Foptimas","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsnap-stanford%2Foptimas/lists"}