{"id":29031656,"url":"https://github.com/danieladdisonorg/livekit-voice-agent","last_synced_at":"2025-08-13T07:13:39.667Z","repository":{"id":300255574,"uuid":"1005533998","full_name":"danieladdisonorg/livekit-voice-agent","owner":"danieladdisonorg","description":"A production-ready voice agent implementation using LiveKit and Python, featuring advanced conversational AI capabilities and optional telephony integration. It provides intelligent turn detection, function calling, comprehensive logging, telephony integration, and audio enhancement.","archived":false,"fork":false,"pushed_at":"2025-06-20T16:53:08.000Z","size":12,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2025-06-20T17:46:04.835Z","etag":null,"topics":["ai-agent","deepgram","elevenlabs","livekit","openai","pytho","twillio"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/danieladdisonorg.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2025-06-20T11:30:52.000Z","updated_at":"2025-06-20T16:51:28.000Z","dependencies_parsed_at":"2025-06-20T17:46:13.054Z","dependency_job_id":"d6754fab-dffb-41ea-ae9e-903dc339b354","html_url":"https://github.com/danieladdisonorg/livekit-voice-agent","commit_stats":null,"previous_names":["danieladdisonorg/livekit-voice-agent"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/danieladdisonorg/livekit-voice-agent","repository_url":"https://repos.ecosyste.ms/api/v1/
hosts/GitHub/repositories/danieladdisonorg%2Flivekit-voice-agent","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/danieladdisonorg%2Flivekit-voice-agent/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/danieladdisonorg%2Flivekit-voice-agent/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/danieladdisonorg%2Flivekit-voice-agent/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/danieladdisonorg","download_url":"https://codeload.github.com/danieladdisonorg/livekit-voice-agent/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/danieladdisonorg%2Flivekit-voice-agent/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":270201459,"owners_count":24544137,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-08-13T02:00:09.904Z","response_time":66,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["ai-agent","deepgram","elevenlabs","livekit","openai","pytho","twillio"],"created_at":"2025-06-26T10:04:53.265Z","updated_at":"2025-08-13T07:13:39.595Z","avatar_url":"https://github.com/danieladdisonorg.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# LiveKit Voice Agent\n\nA production-ready voice agent implementation using LiveKit and Python, featuring advanced conversational AI 
capabilities and optional telephony integration.\n\n## Architecture Overview\n\n```\n┌─────────────────────────────────────────────────────────────────────────────────┐\n│                           LiveKit Voice Agent Architecture                      │\n└─────────────────────────────────────────────────────────────────────────────────┘\n\n┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐\n│   Web Client    │    │  Phone System   │    │  Mobile App     │\n│   (Next.js)     │    │   (Twilio)      │    │   (React)       │\n└─────────┬───────┘    └─────────┬───────┘    └─────────┬───────┘\n          │                      │                      │\n          └──────────────────────┼──────────────────────┘\n                                 │\n                    ┌────────────▼────────────┐\n                    │     LiveKit Server      │\n                    │   (WebRTC Gateway)      │\n                    └────────────┬────────────┘\n                                 │\n                    ┌────────────▼────────────┐\n                    │   Voice Pipeline Agent  │\n                    │                         │\n                    │   ┌─────────────────┐   │\n                    │   │ Turn Detection  │   │\n                    │   │   (Silero)      │   │\n                    │   └─────────────────┘   │\n                    │                         │\n                    │   ┌─────────────────┐   │\n                    │   │ Audio Pipeline  │   │\n                    │   │ ┌─────────────┐ │   │\n                    │   │ │   Krisp     │ │   │\n                    │   │ │ (Noise      │ │   │\n                    │   │ │ Cancel)     │ │   │\n                    │   │ └─────────────┘ │   │\n                    │   └─────────────────┘   │\n                    └────────────┬────────────┘\n                                 │\n        ┌────────────────────────┼────────────────────────┐\n        │                        │                        │\n        ▼    
                    ▼                        ▼\n┌──────────────┐        ┌──────────────┐        ┌──────────────┐\n│ Speech-to-   │        │ Language     │        │ Text-to-     │\n│ Text (STT)   │        │ Model (LLM)  │        │ Speech (TTS) │\n│              │        │              │        │              │\n│ Deepgram API │        │  OpenAI API  │        │ ElevenLabs   │\n│              │        │              │        │ API          │\n└──────┬───────┘        └──────┬───────┘        └──────┬───────┘\n       │                       │                       │\n       │              ┌─────────▼────────┐             │\n       │              │ Function Calling │             │\n       │              │                  │             │\n       │              │ ┌──────────────┐ │             │\n       │              │ │   Weather    │ │             │\n       │              │ │   Service    │ │             │\n       │              │ └──────────────┘ │             │\n       │              │                  │             │\n       │              │ ┌──────────────┐ │             │\n       │              │ │   Clock      │ │             │\n       │              │ │   Service    │ │             │\n       │              │ └──────────────┘ │             │\n       │              │                  │             │\n       │              │ ┌──────────────┐ │             │\n       │              │ │   Custom     │ │             │\n       │              │ │   Tools      │ │             │\n       │              │ └──────────────┘ │             │\n       │              └──────────────────┘             │\n       │                                               │\n       └───────────────────┐     ┌─────────────────────┘\n                           │     │\n                    ┌──────▼─────▼──────┐\n                    │   Logging \u0026       │\n                    │   Analytics       │\n                    │                   │\n                    │ • Usage Metrics   │\n                
    │ • Conversation    │\n                    │   Summaries       │\n                    │ • Performance     │\n                    │   Monitoring      │\n                    └───────────────────┘\n\n┌─────────────────────────────────────────────────────────────────────────────────┐\n│                              Data Flow Process                                  │\n└─────────────────────────────────────────────────────────────────────────────────┘\n\n1. Audio Input → 2. Noise Cancellation → 3. Speech Detection → 4. STT Processing\n                                                    ↓\n8. Audio Output ← 7. TTS Generation ← 6. Response Generation ← 5. LLM Processing\n                                                    ↓\n                                            Function Execution\n                                          (Weather, Clock, etc.)\n\n┌─────────────────────────────────────────────────────────────────────────────────┐\n│                            Telephony Integration                                │\n└─────────────────────────────────────────────────────────────────────────────────┘\n\nPhone Call → Twilio SIP → LiveKit SIP Gateway → Voice Agent → Response Pipeline\n     ↑                                                              ↓\n     └──────────────── Audio Response ←─────────────────────────────┘\n\nRegional SIP Configuration:\n• US East: 54.172.60.0, 54.244.51.0\n• US West: 54.171.127.192, 35.156.191.128  \n• Europe: 54.171.127.200, 35.156.191.140\n• Asia Pacific: 54.169.127.128, 52.65.191.64\n```\n\n## How It Works\n\n### 1. **Connection Establishment**\n- Users connect via web browsers, mobile apps, or phone calls\n- LiveKit server handles WebRTC connections and SIP integration\n- Agent automatically detects connection type and optimizes accordingly\n\n### 2. 
**Audio Processing Pipeline**\n- **Input**: Raw audio from user's microphone or phone\n- **Noise Cancellation**: Krisp AI removes background noise\n- **Turn Detection**: Silero VAD detects when user starts/stops speaking\n- **Speech-to-Text**: Deepgram converts speech to text in real-time\n\n### 3. **Intelligent Processing**\n- **Language Understanding**: OpenAI processes user intent\n- **Function Calling**: Agent can execute tools (weather, time, custom functions)\n- **Context Management**: Maintains conversation history and state\n\n### 4. **Response Generation**\n- **Text Generation**: LLM creates appropriate responses\n- **Text-to-Speech**: ElevenLabs converts text to natural speech\n- **Audio Delivery**: Processed audio sent back to user\n\n### 5. **Monitoring \u0026 Analytics**\n- Real-time performance metrics\n- Conversation logging and summaries\n- Usage analytics and optimization insights\n\n## Features\n\n- **Intelligent Turn Detection** - Natural conversation flow with automatic speech detection\n- **Function Calling** - Extensible tool integration including:\n  - Weather information retrieval\n  - Real-time clock functionality\n- **Comprehensive Logging** - Usage analytics and conversation summaries\n- **Telephony Integration** - Inbound call support via Twilio SIP trunking\n- **Audio Enhancement** - Krisp noise cancellation for crystal-clear communication\n- **Optimized Models** - Automatic model switching for telephony vs. web-based interactions\n\n## Prerequisites\n\n- Python 3.8 or higher\n- LiveKit Cloud account or self-hosted LiveKit server\n- API keys for required services (OpenAI, ElevenLabs, Deepgram)\n- Optional: Twilio account for telephony features\n\n## Installation\n\n### Quick Start\n\n1. **Clone and navigate to the repository:**\n\n```bash\ngit clone https://github.com/danieladdisonorg/livekit-voice-agent.git\ncd livekit-voice-agent\n```\n\n2. 
**Set up Python environment:**\n\n**Linux/macOS:**\n\n```bash\npython3 -m venv venv\nsource venv/bin/activate\npip install -r requirements.txt\npython3 agent.py download-files\n```\n\n**Windows:**\n\n```bash\npython -m venv venv\nvenv\\Scripts\\activate\npip install -r requirements.txt\npython agent.py download-files\n```\n\n### Configuration\n\n1. **Environment Setup:**\n   Copy the example environment file and configure your API credentials:\n\n```bash\ncp .env.example .env.local\n```\n\n2. **Required Environment Variables:**\n   ```\n   LIVEKIT_URL=your_livekit_server_url\n   LIVEKIT_API_KEY=your_api_key\n   LIVEKIT_API_SECRET=your_api_secret\n   OPENAI_API_KEY=your_openai_key\n   ELEVEN_API_KEY=your_elevenlabs_key\n   DEEPGRAM_API_KEY=your_deepgram_key\n   ```\n\n3. **Automated Configuration (Optional):**\n   If using LiveKit Cloud, you can auto-configure using the CLI:\n\n```bash\nlk app env\n```\n\n## Usage\n\n### Development Mode\n\nStart the agent in development mode:\n\n```bash\npython3 agent.py dev\n```\n\n### Frontend Integration\n\nThis agent requires a compatible frontend application. We recommend using the [LiveKit Next.js Voice Agent Interface](https://github.com/kylecampbell/livekit-nextjs-voice-agent-interface) for a complete solution.\n\n## Telephony Integration (Optional)\n\nEnable inbound phone calls through Twilio SIP integration.\n\n### Prerequisites\n\n- LiveKit CLI installed and authenticated\n- Twilio account with phone number\n- SIP trunk configuration\n\n### Installation Steps\n\n1. **Install LiveKit CLI (macOS):**\n\n```bash\nbrew update \u0026\u0026 brew install livekit-cli\n```\n\n2. **Authenticate with LiveKit Cloud:**\n\n```bash\nlk cloud auth\n```\n\n### Twilio Configuration\n\n1. **Create Twilio Resources:**\n   - Sign up for a Twilio account\n   - Purchase a phone number\n   - Create a new SIP trunk in the Twilio Console\n\n2. 
**Configure SIP Trunk:**\n   - Navigate to: Elastic SIP Trunking → SIP Trunks → Create\n   - Add Origination URI: `\u003cYOUR_LIVEKIT_SIP_URI\u003e;transport=tcp`\n   - Associate your phone number with priority 1, weight 1\n\n3. **Deploy LiveKit SIP Configuration:**\n\n   **Create Inbound Trunk:**\n\n```bash\nlk sip inbound create inbound-trunk.json\n```\n\n   **Create Dispatch Rule:**\n\n```bash\nlk sip dispatch create dispatch-rule.json\n```\n\n### Regional Configuration\n\nUpdate `inbound-trunk.json` with appropriate Twilio SIP signaling IP addresses for your region. The default configuration includes US IP addresses.\n\n## Architecture\n\n- **Agent Core** - Main conversation logic and state management\n- **Function Registry** - Extensible tool calling system\n- **Audio Pipeline** - Real-time audio processing with noise cancellation\n- **SIP Integration** - Telephony gateway for inbound calls\n- **Logging System** - Comprehensive usage and performance analytics\n\n## Support\n\nFor issues and questions:\n- Check the [LiveKit Documentation](https://docs.livekit.io/)\n- Review existing GitHub issues\n- Contact support through your LiveKit Cloud dashboard\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdanieladdisonorg%2Flivekit-voice-agent","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fdanieladdisonorg%2Flivekit-voice-agent","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdanieladdisonorg%2Flivekit-voice-agent/lists"}