{"id":34482887,"url":"https://github.com/future-agi/simulate-sdk","last_synced_at":"2026-01-07T11:59:12.319Z","repository":{"id":328323036,"uuid":"1070689810","full_name":"future-agi/simulate-sdk","owner":"future-agi","description":"Enterprise Grade, Voice AI simulation SDK for testing your AI Agents","archived":false,"fork":false,"pushed_at":"2025-12-19T11:28:11.000Z","size":427,"stargazers_count":37,"open_issues_count":0,"forks_count":3,"subscribers_count":0,"default_branch":"main","last_synced_at":"2025-12-22T03:44:59.088Z","etag":null,"topics":["agentic-ai","ai-agents","evaluation","simulate","testing-tools","voice-ai"],"latest_commit_sha":null,"homepage":"https://app.futureagi.com","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/future-agi.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2025-10-06T09:59:23.000Z","updated_at":"2025-11-26T19:07:24.000Z","dependencies_parsed_at":"2025-12-14T00:01:40.985Z","dependency_job_id":null,"html_url":"https://github.com/future-agi/simulate-sdk","commit_stats":null,"previous_names":["future-agi/simulate-sdk"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/future-agi/simulate-sdk","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/future-agi%2Fsimulate-sdk","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/future-agi%2Fsimulate-sdk/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/future-agi%2Fsimulate-sdk/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/future-agi%2Fsimulate-sdk/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/future-agi","download_url":"https://codeload.github.com/future-agi/simulate-sdk/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/future-agi%2Fsimulate-sdk/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":28101403,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-12-28T02:00:05.685Z","response_time":62,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["agentic-ai","ai-agents","evaluation","simulate","testing-tools","voice-ai"],"created_at":"2025-12-23T15:00:33.276Z","updated_at":"2025-12-28T16:04:11.185Z","avatar_url":"https://github.com/future-agi.png","language":"Python","readme":"\u003cdiv 
align=\"center\"\u003e\n\n# 🧪 Agent Simulate SDK\n\n**A Python SDK for testing conversational voice AI agents through realistic customer simulations.**  \nBuilt by [Future AGI](https://futureagi.com) | [Docs](https://docs.futureagi.com) | [Platform](https://app.futureagi.com)\n\n\u003c/div\u003e\n\n---\n\n## 🚀 Overview\n\n**Agent Simulate** provides a powerful framework for testing deployed voice AI agents. It automates realistic customer conversations, records the full interaction, and integrates with evaluation pipelines to help you ship high-quality, reliable agents.\n\n- 📞 **Test Deployed Agents**: Connect directly to your agent in a LiveKit room.\n- 🎭 **Persona-driven Scenarios**: Define customer personas, situations, and goals.\n- 🎙️ **Full Audio \u0026 Transcripts**: Capture complete conversation audio and text.\n- 📊 **Integrate Evaluations**: Use `ai-evaluation` to score agent performance.\n---\n\n## Key Features\n| Feature                       | Description                                                                              |\n| ----------------------------- | ---------------------------------------------------------------------------------------- |\n| **Agent Definition**          | Configure the connection to your deployed agent, including room, prompts, and credentials. |\n| **Scenario Creation**         | Programmatically define test cases with unique customer personas, situations, and desired outcomes. |\n| **Automated Test Runner**     | Orchestrates the simulation, connects the persona to the agent, and manages the conversation flow. |\n| **Audio/Transcript Capture**  | Automatically records individual and combined audio tracks, plus a full text transcript. |\n| **Evaluation Integration**    | Seamlessly pass test results (audio, transcripts) to the `ai-evaluation` library for scoring. |\n| **Extensible \u0026 Customizable** | Customize STT, TTS, and LLM providers for the simulated customer. |\n---\n\n## 🔧 Installation\n\n\n\n```bash\n# Install through pip \npip install agent-simulate\n```\n\nIf you want to fork the project and install it in editable mode, you can use the following command:\n```bash\ngit clone https://github.com/future-agi/agent-simulate.git\ncd agent-simulate\npip install -e . # poetry install if you want to use poetry\n```\nThe project uses Poetry for dependency management. \n\n### Download VAD Model Weights\n\nThe SDK uses Silero VAD for voice activity detection. Download the required model weights by running this script once:\n\n```python\nfrom livekit.plugins import silero\n\nif __name__ == \"__main__\":\n    print(\"Downloading Silero VAD model...\")\n    silero.VAD.load()\n    print(\"Download complete.\")\n```\n---\n\n## 🧑‍💻 Quickstart\n\n### 1. 🔐 Set Environment Variables\n\nCreate a `.env` file with your credentials:\n\n```bash\n# LiveKit Server Details\nLIVEKIT_URL=\"wss://your-livekit-server.com\"\nLIVEKIT_API_KEY=\"your-api-key\"\nLIVEKIT_API_SECRET=\"your-api-secret\"\n\n# OpenAI API Key (for the default simulated customer)\nOPENAI_API_KEY=\"your-openai-key\"\n\n# Future AGI Evaluation Keys (for running evaluations)\nFI_API_KEY=\"your-fi-api-key\"\nFI_SECRET_KEY=\"your-fi-secret-key\"\n```\n\n### 2. 
### 2. ✅ Run a Simulation

This example connects a simulated customer ("Alice") to your deployed agent.

```python
import asyncio
import os

from dotenv import load_dotenv
from fi.simulate import AgentDefinition, Scenario, Persona, TestRunner
from fi.simulate.evaluation import evaluate_report

load_dotenv()

async def main():
    # 1. Define your deployed agent
    agent_definition = AgentDefinition(
        name="my-support-agent",
        url=os.environ["LIVEKIT_URL"],
        room_name="support-room",  # The room where your agent is waiting
        system_prompt="Helpful support agent",  # System prompt that defines the agent's behavior
    )

    # 2. Create a test scenario
    scenario = Scenario(
        name="Customer Support Test",
        dataset=[
            Persona(
                persona={"name": "Alice", "mood": "frustrated"},
                situation="She cannot log into her account.",
                outcome="The agent should guide her through password reset.",
            ),
        ],
    )

    # 3. Run the test
    runner = TestRunner()
    report = await runner.run_test(
        agent_definition,
        scenario,
        record_audio=True,  # Capture WAV files
    )

    # 4. View results
    for result in report.results:
        print(f"Transcript: {result.transcript}")
        print(f"Combined Audio Path: {result.audio_combined_path}")

    # 5. Evaluate the report
    # This helper runs evaluations for each test case in the report.
    # Map report fields (e.g., 'transcript') to the inputs required by the eval template.
    evaluated_report = evaluate_report(
        report,
        eval_specs=[
            {
                "eval_templates": ["task_completion"],
                "template_inputs": {"transcript": "transcript"},
            },
            {
                "eval_templates": ["audio_quality"],
                "template_inputs": {"audio": "audio_combined_path"},
            },
        ],
    )

    # View evaluation results
    for result in evaluated_report.results:
        for eval_result in result.evaluation_results:
            print(f"Evaluation for {eval_result.eval_template_name}:")
            print(f"  Score: {eval_result.score}")
            print(f"  Output: {eval_result.output}")


if __name__ == "__main__":
    asyncio.run(main())
```
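Since `Scenario.dataset` is a list, a single run can presumably cover several customer types. A short sketch using only the constructors shown in the example above (the second persona is illustrative):

```python
# Sketch: one scenario covering several customer types in a single run,
# reusing only the Scenario/Persona fields shown in the quickstart above.
from fi.simulate import Scenario, Persona

scenario = Scenario(
    name="Login Issues - Multiple Moods",
    dataset=[
        Persona(
            persona={"name": "Alice", "mood": "frustrated"},
            situation="She cannot log into her account.",
            outcome="The agent should guide her through password reset.",
        ),
        Persona(
            persona={"name": "Bob", "mood": "confused"},  # illustrative second test case
            situation="He keeps getting a two-factor authentication error.",
            outcome="The agent should explain how to re-enroll his 2FA device.",
        ),
    ],
)
```

Each persona should then produce its own entry in `report.results`, so the result and evaluation loops above apply unchanged.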
---

## 🚀 LLM Evaluation with Future AGI Platform

Future AGI delivers a **complete, iterative evaluation lifecycle** so you can move from prototype to production with confidence:

| Stage                             | What you can do                                                                                                                 |
| --------------------------------- | ------------------------------------------------------------------------------------------------------------------------------- |
| **1. Curate & Annotate Datasets** | Build, import, label, and enrich evaluation datasets in‑cloud. Synthetic‑data generation and Hugging Face imports are built in. |
| **2. Benchmark & Compare**        | Run prompt / model experiments on those datasets, track scores, and pick the best variant in Prompt Workbench or via the SDK.   |
| **3. Fine‑Tune Metrics**          | Create fully custom eval templates with your own rules, scoring logic, and models to match domain needs.                        |
| **4. Debug with Traces**          | Inspect every failing datapoint through rich traces: latency, cost, spans, and evaluation scores side‑by‑side.                  |
| **5. Monitor in Production**      | Schedule Eval Tasks to score live or historical traffic, set sampling rates, and surface alerts right in the Observe dashboard. |
| **6. Close the Loop**             | Promote real‑world failures back into your dataset, retrain / re‑prompt, and rerun the cycle until performance meets spec.      |

> Everything you need, including SDK guides, UI walkthroughs, and API references, is in the [Future AGI docs](https://docs.futureagi.com).

<img width="2880" height="2048" alt="image" src="https://github.com/user-attachments/assets/e3ab2b32-6b44-49f5-aa66-0a3d65ba176e" />

---

## 🗺️ Roadmap

* [x] **Core Simulation Engine**
* [x] **Persona-driven Scenarios**
* [x] **Audio & Transcript Recording**
* [x] **`ai-evaluation` Integration Helper**
* [ ] **Advanced Scenarios (Conversation Graphs)**
* [ ] **Deeper Performance Metrics (Latency, Interruption Rates)**

---

## 🤝 Contributing

We welcome contributions! To report issues, suggest templates, or contribute improvements, please open a GitHub issue or PR.