{"id":29875824,"url":"https://github.com/videosdk-live/agents","last_synced_at":"2026-02-23T15:01:00.054Z","repository":{"id":301735729,"uuid":"976469474","full_name":"videosdk-live/agents","owner":"videosdk-live","description":"Open-source framework for developing real-time multimodal conversational AI agents.","archived":false,"fork":false,"pushed_at":"2026-02-17T10:23:31.000Z","size":8504,"stargazers_count":592,"open_issues_count":6,"forks_count":82,"subscribers_count":9,"default_branch":"main","last_synced_at":"2026-02-17T11:35:47.298Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"https://docs.videosdk.live/ai_agents/introduction","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/videosdk-live.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE.txt","code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":"NOTICE.txt","maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2025-05-02T06:49:27.000Z","updated_at":"2026-02-17T06:02:51.000Z","dependencies_parsed_at":"2025-06-28T14:34:29.373Z","dependency_job_id":"81f715dc-16a5-4af6-b557-1e1746b4c8f7","html_url":"https://github.com/videosdk-live/agents","commit_stats":null,"previous_names":["videosdk-live/agents"],"tags_count":57,"template":false,"template_full_name":null,"purl":"pkg:github/videosdk-live/agents","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/videosdk-live%2Fagents","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/videosdk-live%2Fagents/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/videosdk-live%2Fagents/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/videosdk-live%2Fagents/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/videosdk-live","download_url":"https://codeload.github.com/videosdk-live/agents/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/videosdk-live%2Fagents/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":29746499,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-02-23T07:44:07.782Z","status":"ssl_error","status_checked_at":"2026-02-23T07:44:07.432Z","response_time":90,"last_error":"SSL_read: unexpected eof while 
reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2025-07-31T02:44:02.486Z","updated_at":"2026-02-23T15:01:00.009Z","avatar_url":"https://github.com/videosdk-live.png","language":"Python","readme":"\u003c!--BEGIN_BANNER_IMAGE--\u003e\n\u003cp align=\"center\"\u003e\n  \u003cimg src=\"https://assets.videosdk.live/images/github-banner.png\" alt=\"VideoSDK AI Agents Banner\" style=\"width:100%;\"\u003e\n\u003c/p\u003e\n\u003c!--END_BANNER_IMAGE--\u003e\n\n# VideoSDK AI Agents\nOpen-source framework for building real-time multimodal conversational AI agents.\n\n![PyPI - Version](https://img.shields.io/pypi/v/videosdk-agents)\n[![PyPI Downloads](https://static.pepy.tech/badge/videosdk-agents/month)](https://pepy.tech/projects/videosdk-agents)\n[![Twitter Follow](https://img.shields.io/twitter/follow/video_sdk)](https://x.com/video_sdk)\n[![YouTube](https://img.shields.io/badge/YouTube-VideoSDK-red)](https://www.youtube.com/c/VideoSDK)\n[![LinkedIn](https://img.shields.io/badge/LinkedIn-VideoSDK-blue)](https://www.linkedin.com/company/video-sdk/)\n[![Discord](https://img.shields.io/badge/Discord-Join%20Us-7289DA)](https://discord.com/invite/f2WsNDN9S5)\n[![Ask DeepWiki](https://deepwiki.com/badge.svg)](https://deepwiki.com/videosdk-live/agents)\n\n\nThe **VideoSDK AI Agents framework** connects your infrastructure, agent worker, VideoSDK room, and user devices, enabling **real-time, natural voice and multimodal interactions** between users and intelligent agents.\n\n\u003c!-- ![VideoSDK AI Agents High Level Architecture](https://strapi.videosdk.live/uploads/Group_15_1_5610ce9c7e.png) --\u003e\n\u003c!-- ![VideoSDK AI Agents High Level Architecture](https://cdn.videosdk.live/website-resources/docs-resources/voice_agent_intro.png) --\u003e\n\n![VideoSDK AI Agents High Level Architecture](https://assets.videosdk.live/images/agent-architecture.png)\n\n\n## Overview\n\nThe AI Agent SDK is a Python framework built on top of the VideoSDK Python SDK that enables AI-powered agents to join VideoSDK rooms as participants. 
<table width="100%">
  <tr>
    <td width="50%" valign="top" style="padding-left: 20px;">
      <h3>🎙️ <a href="examples/test_cascading_pipeline.py" target="_blank">Agent with Cascading Pipeline</a></h3>
      <p>Test an AI Voice Agent that uses a Cascading Pipeline for STT → LLM → TTS.</p>
    </td>
    <td width="50%" valign="top" style="padding-left: 20px;">
      <h3>📞 <a href="examples/sip_agent_example.py" target="_blank">AI Telephony Agent</a></h3>
      <p>Test an AI Agent that answers and interacts over phone calls using SIP.</p>
    </td>
  </tr>
  <tr>
    <td width="50%" valign="top" style="padding-left: 20px;">
      <h3>💻 <a href="https://docs.videosdk.live/ai_agents/introduction" target="_blank">Agent Documentation</a></h3>
      <p>The official VideoSDK Agents documentation.</p>
    </td>
    <td width="50%" valign="top" style="padding-left: 20px;">
      <h3>📚 <a href="https://docs.videosdk.live/agent-sdk-reference/agents/" target="_blank">SDK Reference</a></h3>
      <p>Reference docs for the Agents framework.</p>
    </td>
  </tr>
</table>

| #  | Feature                              | Description                                                                 |
|----|--------------------------------------|-----------------------------------------------------------------------------|
| 1  | **🎤 Real-time Communication (Audio/Video)** | Agents can listen, speak, and interact live in meetings.            |
| 2  | **📞 SIP & Telephony Integration**   | Seamlessly connect agents to phone systems via SIP for call handling, routing, and PSTN access. |
| 3  | **🧍 Virtual Avatars**               | Add lifelike avatars to enhance interaction and presence using Simli.       |
| 4  | **🤖 Multi-Model Support**           | Integrate with OpenAI, Gemini, AWS Nova Sonic, and more.                    |
| 5  | **🧩 Cascading Pipeline**            | Integrates with different providers of STT, LLM, and TTS seamlessly.        |
| 6  | **⚡ Realtime Pipeline**             | Use unified realtime models (OpenAI Realtime, AWS Nova, Gemini Live) for the lowest latency. |
| 7  | **🧠 Conversational Flow**           | Manages turn detection and VAD for smooth interactions.                     |
| 8  | **🛠️ Function Tools**               | Extend agent capabilities with event scheduling, expense tracking, and more. |
| 9  | **🌐 MCP Integration**               | Connect agents to external data sources and tools using the Model Context Protocol. |
| 10 | **🔗 A2A Protocol**                  | Enable agent-to-agent interactions for complex workflows.                   |
| 11 | **📊 Observability**                 | Built-in OpenTelemetry tracing and metrics collection.                      |
| 12 | **🚀 CLI Tool**                      | Run and test agents locally with the `videosdk` CLI.                        |

> [!IMPORTANT]
>
> **Star VideoSDK Repositories** ⭐️
>
> Get instant notifications for new releases and updates. Your support helps us grow and improve VideoSDK!

## Pre-requisites

Before you begin, ensure you have:

- A VideoSDK authentication token (generate one from [app.videosdk.live](https://app.videosdk.live))
- A VideoSDK meeting ID (generate one using the [Create Room API](https://docs.videosdk.live/api-reference/realtime-communication/create-room) or through the VideoSDK dashboard)
- Python 3.12 or higher
- Third-party API keys for the services you intend to use (e.g., OpenAI for LLM/STT/TTS, ElevenLabs for TTS, Google for Gemini, etc.)

## Installation

- Create and activate a virtual environment with Python 3.12 or higher.
    <details>
    <summary><strong> macOS / Linux</strong></summary>

    ```bash
    python3 -m venv venv
    source venv/bin/activate
    ```
    </details>
    <details>
    <summary><strong> Windows</strong></summary>

    ```bash
    python -m venv venv
    venv\Scripts\activate
    ```
    </details>

- Install the core VideoSDK AI Agent package:
  ```bash
  pip install videosdk-agents
  ```
- Install optional plugins. Plugins integrate different providers for Realtime, STT, LLM, TTS, and more; install what your use case needs (see the sketch after this list for the plugin used in the quickstart):
  ```bash
  # Example: Install the Turn Detector plugin
  pip install videosdk-plugins-turn-detector
  ```
  👉 Supported plugins (Realtime, LLM, STT, TTS, VAD, Avatar, SIP) are listed in the [Supported Libraries](#supported-libraries-and-plugins) section below.
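For the Gemini realtime quickstart in the next section you will also want the Google plugin. Plugin packages follow the `videosdk-plugins-<provider>` pattern shown above; the package name below is an assumption based on that convention, so confirm it on the Google plugin pages linked under [Supported Libraries](#supported-libraries-and-plugins):

```bash
# Assumed package name, following the videosdk-plugins-<provider> convention;
# check the Google plugin docs for the exact name.
pip install videosdk-plugins-google
```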
## Generating a VideoSDK Meeting ID

Before your AI agent can join a meeting, you'll need to create a meeting ID. You can generate one using the VideoSDK Create Room API:

### Using cURL

```bash
curl -X POST https://api.videosdk.live/v2/rooms \
  -H "Authorization: YOUR_JWT_TOKEN_HERE" \
  -H "Content-Type: application/json"
```

For more details on the Create Room API, refer to the [VideoSDK documentation](https://docs.videosdk.live/api-reference/realtime-communication/create-room).
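### Using Python

If you prefer to create the room from Python, the same request can be made with `aiohttp` (the library already used for the weather tool in Step 2). This is a minimal sketch; the exact response fields are documented in the Create Room API reference:

```python
import asyncio
import aiohttp

async def create_meeting(token: str) -> str:
    """Create a VideoSDK room and return its ID (sketch; see the Create Room API docs)."""
    async with aiohttp.ClientSession() as session:
        async with session.post(
            "https://api.videosdk.live/v2/rooms",
            headers={"Authorization": token, "Content-Type": "application/json"},
        ) as response:
            response.raise_for_status()
            data = await response.json()
            # The new room's ID is returned in the response body (commonly as "roomId");
            # confirm the field name against the Create Room API reference.
            return data["roomId"]

if __name__ == "__main__":
    print(asyncio.run(create_meeting("YOUR_JWT_TOKEN_HERE")))
```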
## Getting Started: Your First Agent

### Quick Start

Now that you've installed the necessary packages, you're ready to build!

### Step 1: Creating a Custom Agent

First, let's create a custom voice agent by inheriting from the base `Agent` class:

```python title="main.py"
from videosdk.agents import Agent, function_tool

# External tool (defined in Step 2):
# async def get_weather(latitude: str, longitude: str): ...

class VoiceAgent(Agent):
    def __init__(self):
        super().__init__(
            instructions="You are a helpful voice assistant that can answer questions and help with tasks.",
            tools=[get_weather]  # You can register any external tool defined outside of this class
        )

    async def on_enter(self) -> None:
        """Called when the agent first joins the meeting"""
        await self.session.say("Hi there! How can I help you today?")

    async def on_exit(self) -> None:
        """Called when the agent exits the meeting"""
        await self.session.say("Goodbye!")
```

This code defines a basic voice agent with:

- Custom instructions that define the agent's personality and capabilities
- An entry message when joining a meeting
- `on_enter`/`on_exit` hooks that run when the agent joins or leaves the meeting

### Step 2: Implementing Function Tools

Function tools allow your agent to perform actions beyond conversation. There are two ways to define tools:

- **External Tools:** Defined as standalone functions outside the agent class and registered via the `tools` argument in the agent's constructor.
- **Internal Tools:** Defined as methods inside the agent class and decorated with `@function_tool`.

Below is an example of both:

```python
import aiohttp

# External function tool
@function_tool
async def get_weather(latitude: str, longitude: str):
    print(f"Getting weather for {latitude}, {longitude}")
    url = f"https://api.open-meteo.com/v1/forecast?latitude={latitude}&longitude={longitude}&current=temperature_2m"
    async with aiohttp.ClientSession() as session:
        async with session.get(url) as response:
            if response.status == 200:
                data = await response.json()
                return {
                    "temperature": data["current"]["temperature_2m"],
                    "temperature_unit": "Celsius",
                }
            else:
                raise Exception(
                    f"Failed to get weather data, status code: {response.status}"
                )

class VoiceAgent(Agent):
    # ... previous code ...

    # Internal function tool
    @function_tool
    async def get_horoscope(self, sign: str) -> dict:
        horoscopes = {
            "Aries": "Today is your lucky day!",
            "Taurus": "Focus on your goals today.",
            "Gemini": "Communication will be important today.",
        }
        return {
            "sign": sign,
            "horoscope": horoscopes.get(sign, "The stars are aligned for you today!"),
        }
```

- Use external tools for reusable, standalone functions (registered via `tools=[...]`).
- Use internal tools for agent-specific logic as class methods.
- Both must be decorated with `@function_tool` for the agent to recognize and use them.

### Step 3: Setting Up the Pipeline

The pipeline connects your agent to an AI model. Here, we are using Google's Gemini for a [Real-time Pipeline](https://docs.videosdk.live/ai_agents/core-components/realtime-pipeline). You could also use a [Cascading Pipeline](https://docs.videosdk.live/ai_agents/core-components/cascading-pipeline), as sketched after this example.

```python
from videosdk.plugins.google import GeminiRealtime, GeminiLiveConfig
from videosdk.agents import RealTimePipeline, JobContext

async def start_session(context: JobContext):
    # Initialize the AI model
    model = GeminiRealtime(
        model="gemini-2.5-flash-native-audio-preview-12-2025",
        # When GOOGLE_API_KEY is set in .env, DON'T pass the api_key parameter
        api_key="AKZSXXXXXXXXXXXXXXXXXXXX",
        config=GeminiLiveConfig(
            voice="Leda",  # Puck, Charon, Kore, Fenrir, Aoede, Leda, Orus, and Zephyr
            response_modalities=["AUDIO"]
        )
    )

    pipeline = RealTimePipeline(model=model)

    # Continue to the next steps...
```
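If you would rather compose the pipeline from separate STT, LLM, and TTS providers, you assemble a Cascading Pipeline instead. The snippet below is an illustrative sketch, not the definitive API: it assumes a `CascadingPipeline` class in `videosdk.agents` and the plugin class names shown (`DeepgramSTT`, `OpenAILLM`, `ElevenLabsTTS`, `SileroVAD`). Install the matching `videosdk-plugins-*` packages and check the [Cascading Pipeline docs](https://docs.videosdk.live/ai_agents/core-components/cascading-pipeline) and plugin pages for the exact constructor arguments:

```python
from videosdk.agents import CascadingPipeline, JobContext  # assumed import path
from videosdk.plugins.deepgram import DeepgramSTT          # assumed plugin class names
from videosdk.plugins.openai import OpenAILLM
from videosdk.plugins.elevenlabs import ElevenLabsTTS
from videosdk.plugins.silero import SileroVAD

async def start_session(context: JobContext):
    # Each stage typically reads its API key from the environment
    # (e.g. DEEPGRAM_API_KEY, OPENAI_API_KEY, ELEVENLABS_API_KEY) unless passed explicitly.
    pipeline = CascadingPipeline(
        stt=DeepgramSTT(),    # speech-to-text
        llm=OpenAILLM(),      # language model
        tts=ElevenLabsTTS(),  # text-to-speech
        vad=SileroVAD(),      # voice activity detection
    )

    # The rest of the session setup (AgentSession, context.connect, session.start)
    # is the same as in Step 4 below.
```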
### Step 4: Assembling and Starting the Agent Session

Now, let's put everything together and start the agent session:

```python
import asyncio
from videosdk.agents import AgentSession, WorkerJob, RoomOptions, JobContext

async def start_session(context: JobContext):
    # ... previous setup code ...

    # Create the agent session
    session = AgentSession(
        agent=VoiceAgent(),
        pipeline=pipeline
    )

    try:
        await context.connect()
        # Start the session
        await session.start()
        # Keep the session running until manually terminated
        await asyncio.Event().wait()
    finally:
        # Clean up resources when done
        await session.close()
        await context.shutdown()

def make_context() -> JobContext:
    room_options = RoomOptions(
        room_id="<meeting_id>",  # Replace with your actual meeting ID
        auth_token="<VIDEOSDK_AUTH_TOKEN>",  # When VIDEOSDK_AUTH_TOKEN is set in .env, DON'T pass auth_token
        name="Test Agent",
        playground=True,
        # vision=True  # Only available when using the Google Gemini Live API
    )

    return JobContext(room_options=room_options)

if __name__ == "__main__":
    job = WorkerJob(entrypoint=start_session, jobctx=make_context)
    job.start()
```

### Step 5: Connecting with VideoSDK Client Applications

After setting up your AI Agent, you'll need a client application to connect with it. You can use any of the VideoSDK quickstart examples to create a client that joins the same meeting:

- [JavaScript](https://github.com/videosdk-live/quickstart/tree/main/js-rtc)
- [React](https://github.com/videosdk-live/quickstart/tree/main/react-rtc)
- [React Native](https://github.com/videosdk-live/quickstart/tree/main/react-native)
- [Android](https://github.com/videosdk-live/quickstart/tree/main/android-rtc)
- [Flutter](https://github.com/videosdk-live/quickstart/tree/main/flutter-rtc)
- [iOS](https://github.com/videosdk-live/quickstart/tree/main/ios-rtc)
- [Unity](http://github.com/videosdk-live/videosdk-rtc-unity-sdk-example)
- [IoT](https://github.com/videosdk-live/videosdk-rtc-iot-sdk-example)

When setting up your client application, make sure to use the same meeting ID that your AI Agent is using.

### Step 6: Running the Project

Once you have completed the setup, you can run your AI Voice Agent project with Python. Make sure your `.env` file is properly configured and all dependencies are installed.
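For the Gemini quickstart above, a minimal `.env` might look like the sketch below; these two variable names appear in the code comments in Steps 3 and 4, and you should add keys for any other providers you use:

```env
VIDEOSDK_AUTH_TOKEN=your_videosdk_auth_token
GOOGLE_API_KEY=your_google_api_key
```

With the environment in place, start the agent: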
```bash
python main.py
```

> [!TIP]
>
> **Test Your Agent Instantly with the CLI Tool**
>
> Run your agent locally using:
>
> ```bash
> python main.py console
> ```
>
> Experience real-time interactions right from your terminal - no meeting room required!
> Speak and listen through your system's mic and speakers for quick testing and rapid development.

### Step 7: Deployment

For deployment options and guides, check out the official documentation: [Deployment](https://docs.videosdk.live/ai_agents/deployments/introduction)

---

## Supported Libraries and Plugins

The framework supports integration with various AI models and tools across multiple categories:

| Category                 | Services |
|--------------------------|----------|
| **Real-time Models**     | [OpenAI](https://docs.videosdk.live/ai_agents/plugins/realtime/openai) &#124; [Gemini](https://docs.videosdk.live/ai_agents/plugins/realtime/google-live-api) &#124; [AWS Nova Sonic](https://docs.videosdk.live/ai_agents/plugins/realtime/aws-nova-sonic) &#124; [Azure Voice Live](https://docs.videosdk.live/ai_agents/plugins/realtime/azure-voice-live) |
| **Speech-to-Text (STT)** | [OpenAI](https://docs.videosdk.live/ai_agents/plugins/stt/openai) &#124; [Google](https://docs.videosdk.live/ai_agents/plugins/stt/google) &#124; [Azure AI Speech](https://docs.videosdk.live/ai_agents/plugins/stt/azure-ai-stt) &#124; [Azure OpenAI](https://docs.videosdk.live/ai_agents/plugins/stt/azureopenai) &#124; [Sarvam AI](https://docs.videosdk.live/ai_agents/plugins/stt/sarvam-ai) &#124; [Deepgram](https://docs.videosdk.live/ai_agents/plugins/stt/deepgram) &#124; [Cartesia](https://docs.videosdk.live/ai_agents/plugins/stt/cartesia-stt) &#124; [AssemblyAI](https://docs.videosdk.live/ai_agents/plugins/stt/assemblyai) &#124; [Navana](https://docs.videosdk.live/ai_agents/plugins/stt/navana) |
| **Language Models (LLM)**| [OpenAI](https://docs.videosdk.live/ai_agents/plugins/llm/openai) &#124; [Azure OpenAI](https://docs.videosdk.live/ai_agents/plugins/llm/azureopenai) &#124; [Google](https://docs.videosdk.live/ai_agents/plugins/llm/google-llm) &#124; [Sarvam AI](https://docs.videosdk.live/ai_agents/plugins/llm/sarvam-ai-llm) &#124; [Anthropic](https://docs.videosdk.live/ai_agents/plugins/llm/anthropic-llm) &#124; [Cerebras](https://docs.videosdk.live/ai_agents/plugins/llm/Cerebras-llm) |
| **Text-to-Speech (TTS)** | [OpenAI](https://docs.videosdk.live/ai_agents/plugins/tts/openai) &#124; [Google](https://docs.videosdk.live/ai_agents/plugins/tts/google-tts) &#124; [AWS Polly](https://docs.videosdk.live/ai_agents/plugins/tts/aws-polly-tts) &#124; [Azure AI Speech](https://docs.videosdk.live/ai_agents/plugins/tts/azure-ai-tts) &#124; [Azure OpenAI](https://docs.videosdk.live/ai_agents/plugins/tts/azureopenai) &#124; [Deepgram](https://docs.videosdk.live/ai_agents/plugins/tts/deepgram) &#124; [Sarvam AI](https://docs.videosdk.live/ai_agents/plugins/tts/sarvam-ai-tts) &#124; [ElevenLabs](https://docs.videosdk.live/ai_agents/plugins/tts/eleven-labs) &#124; [Cartesia](https://docs.videosdk.live/ai_agents/plugins/tts/cartesia-tts) &#124; [Resemble AI](https://docs.videosdk.live/ai_agents/plugins/tts/resemble-ai-tts) &#124; [Smallest AI](https://docs.videosdk.live/ai_agents/plugins/tts/smallestai-tts) &#124; [Speechify](https://docs.videosdk.live/ai_agents/plugins/tts/speechify-tts) &#124; [InWorld](https://docs.videosdk.live/ai_agents/plugins/tts/inworld-ai-tts) &#124; [Neuphonic](https://docs.videosdk.live/ai_agents/plugins/tts/neuphonic-tts) &#124; [Rime AI](https://docs.videosdk.live/ai_agents/plugins/tts/rime-ai-tts) &#124; [Hume AI](https://docs.videosdk.live/ai_agents/plugins/tts/hume-ai-tts) &#124; [Groq](https://docs.videosdk.live/ai_agents/plugins/tts/groq-ai-tts) &#124; [LMNT AI](https://docs.videosdk.live/ai_agents/plugins/tts/lmnt-ai-tts) &#124; [Papla Media](https://docs.videosdk.live/ai_agents/plugins/tts/papla-media) |
| **Voice Activity Detection (VAD)** | [SileroVAD](https://docs.videosdk.live/ai_agents/plugins/silero-vad) |
| **Turn Detection Model** | [Namo Turn Detector](https://docs.videosdk.live/ai_agents/plugins/namo-turn-detector) |
| **Virtual Avatar** | [Simli](https://docs.videosdk.live/ai_agents/core-components/avatar) |
| **Denoise** | [RNNoise](https://docs.videosdk.live/ai_agents/core-components/de-noise) |

> [!TIP]
> **Installation Examples**
>
> ```bash
> # Install with specific plugins
> pip install videosdk-agents[openai,elevenlabs,silero]
>
> # Install individual plugins
> pip install videosdk-plugins-anthropic
> pip install videosdk-plugins-deepgram
> ```
## Examples

Explore the following examples to see the framework in action:

<h2>🤖 AI Voice Agent Use Cases</h2>

<table width="100%">
  <tr>
    <td width="50%" valign="top" style="padding-left: 20px;">
      <h3>📞 <a href="https://github.com/videosdk-community/ai-telephony-demo" target="_blank">AI Telephony Agent Quickstart</a></h3>
      <p>Use case: Hospital appointment booking via a voice-enabled agent.</p>
    </td>
    <td width="50%" valign="top" style="padding-left: 20px;">
      <h3>✈️ <a href="https://github.com/videosdk-community/videosdk-whatsapp-ai-calling-agent" target="_blank">AI WhatsApp Agent Quickstart</a></h3>
      <p>Use case: Ask about available hotel rooms and book on the go.</p>
    </td>
  </tr>
  <tr>
    <td width="50%" valign="top" style="padding-left: 20px;">
      <h3>👨‍🏫 <a href="https://github.com/videosdk-live/agents-quickstart/tree/main/A2A" target="_blank">Multi Agent System</a></h3>
      <p>Use case: A customer care agent that transfers loan-related queries to a Loan Specialist Agent.</p>
    </td>
    <td width="50%" valign="top" style="padding-left: 20px;">
      <h3>🛒 <a href="https://github.com/videosdk-live/agents-quickstart/tree/main/RAG" target="_blank">Agent with Knowledge (RAG)</a></h3>
      <p>Use case: An agent that answers questions based on documentation knowledge.</p>
    </td>
  </tr>
  <tr>
    <td width="50%" valign="top" style="padding-left: 20px;">
      <h3>👨‍🏫 <a href="https://github.com/videosdk-live/agents/tree/main/examples/mcp_server_examples" target="_blank">Agent with MCP Server</a></h3>
      <p>Use case: Stock Market Analyst Agent with real-time market data access.</p>
    </td>
    <td width="50%" valign="top" style="padding-left: 20px;">
      <h3>🛒 <a href="https://github.com/videosdk-live/agents-quickstart/tree/main/Virtual%20Avatar" target="_blank">Virtual Avatar Agent</a></h3>
      <p>Use case: A Virtual Avatar Agent that presents the weather forecast.</p>
    </td>
  </tr>
</table>
## Documentation

For comprehensive guides and API references:

<table width="100%">
  <tr>
    <td width="33%" valign="top" style="padding-left: 20px;">
      <h3>📄 <a href="https://docs.videosdk.live/ai_agents/introduction" target="_blank">Official Documentation</a></h3>
      <p>Complete framework documentation</p>
    </td>
    <td width="33%" valign="top" style="padding-left: 20px;">
      <h3>📝 <a href="https://docs.videosdk.live/agent-sdk-reference/agents/" target="_blank">API Reference</a></h3>
      <p>Detailed API documentation</p>
    </td>
    <td width="33%" valign="top" style="padding-left: 20px;">
      <h3>📂 <a href="examples/" target="_blank">Examples Directory</a></h3>
      <p>Additional code examples</p>
    </td>
  </tr>
</table>
## Contributing

We welcome contributions! Here's how you can help:

<table width="100%">
  <tr>
    <td width="25%" valign="top" style="padding-left: 20px;">
      <h3>🐞 <a href="https://github.com/videosdk-live/agents/issues" target="_blank">Report Issues</a></h3>
      <p>Open an issue for bugs or feature requests</p>
    </td>
    <td width="25%" valign="top" style="padding-left: 20px;">
      <h3>🔀 <a href="https://github.com/videosdk-live/agents/pulls" target="_blank">Submit PRs</a></h3>
      <p>Create a pull request with improvements</p>
    </td>
    <td width="25%" valign="top" style="padding-left: 20px;">
      <h3>🛠️ <a href="BUILD_YOUR_OWN_PLUGIN.md" target="_blank">Build Plugins</a></h3>
      <p>Follow our plugin development guide</p>
    </td>
    <td width="25%" valign="top" style="padding-left: 20px;">
      <h3>💬 <a href="https://discord.com/invite/Gpmj6eCq5u" target="_blank">Join Community</a></h3>
      <p>Connect with us on Discord</p>
    </td>
  </tr>
</table>

The framework is under active development, so contributions in the form of new plugins, features, bug fixes, or documentation improvements are highly appreciated.

### 🛠️ Building Custom Plugins

Want to integrate a new AI provider? Check out **[BUILD YOUR OWN PLUGIN](BUILD_YOUR_OWN_PLUGIN.md)** for:

- Step-by-step plugin creation guide
- Directory structure and file requirements
- Implementation examples for STT, LLM, and TTS
- Testing and submission guidelines

## Community & Support

Stay connected with VideoSDK:

<table width="100%">
  <tr>
    <td width="25%" valign="top" style="padding-left: 20px;">
      <h3>💬 <a href="https://discord.com/invite/Gpmj6eCq5u" target="_blank">Discord</a></h3>
      <p>Join our community</p>
    </td>
    <td width="25%" valign="top" style="padding-left: 20px;">
      <h3>🐦 <a href="https://x.com/video_sdk" target="_blank">Twitter</a></h3>
      <p>@video_sdk</p>
    </td>
    <td width="25%" valign="top" style="padding-left: 20px;">
      <h3>▶️ <a href="https://www.youtube.com/c/VideoSDK" target="_blank">YouTube</a></h3>
      <p>VideoSDK Channel</p>
    </td>
    <td width="25%" valign="top" style="padding-left: 20px;">
      <h3>🔗 <a href="https://www.linkedin.com/company/video-sdk/" target="_blank">LinkedIn</a></h3>
      <p>VideoSDK Company</p>
    </td>
  </tr>
</table>

> [!TIP]
>
> **Support the Project!** ⭐️
> Star the repository, join the community, and help us improve VideoSDK by providing feedback, reporting bugs, or contributing plugins.

---
<a href="https://github.com/videosdk-live/agents/graphs/contributors">
  <img src="https://contrib.rocks/image?repo=videosdk-live/agents" />
</a>

**<center>Made with ❤️ by The VideoSDK Team</center>**