{"id":28406198,"url":"https://github.com/t-rav/ai-operator","last_synced_at":"2025-06-29T08:31:57.579Z","repository":{"id":284246752,"uuid":"954292442","full_name":"Kode-Rex/ai-operator","owner":"Kode-Rex","description":"A simple voice bot that was vibe coded","archived":false,"fork":false,"pushed_at":"2025-05-29T20:42:12.000Z","size":227,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-06-02T07:16:28.920Z","etag":null,"topics":["real-time","vibe-coding","voice-assistant"],"latest_commit_sha":null,"homepage":"","language":"JavaScript","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/Kode-Rex.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2025-03-24T21:24:10.000Z","updated_at":"2025-05-25T21:59:23.000Z","dependencies_parsed_at":"2025-03-24T23:26:31.248Z","dependency_job_id":"4d3446ba-b19c-4786-a127-2560d2663dcb","html_url":"https://github.com/Kode-Rex/ai-operator","commit_stats":null,"previous_names":["t-rav/ai-operator"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/Kode-Rex/ai-operator","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Kode-Rex%2Fai-operator","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Kode-Rex%2Fai-operator/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Kode-Rex%2Fai-operator/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Kode-Rex%2Fai-operator/manifests","owner_url"
:"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/Kode-Rex","download_url":"https://codeload.github.com/Kode-Rex/ai-operator/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Kode-Rex%2Fai-operator/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":262564249,"owners_count":23329466,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["real-time","vibe-coding","voice-assistant"],"created_at":"2025-06-01T22:10:56.811Z","updated_at":"2025-06-29T08:31:57.574Z","avatar_url":"https://github.com/Kode-Rex.png","language":"JavaScript","funding_links":[],"categories":[],"sub_categories":[],"readme":"# AI Operator - Real-time Voice Conversation System\n\nThis project implements a low-latency, real-time voice conversation system with a web client. 
It combines specialized services to create a responsive AI assistant that can understand speech, respond intelligently, and be interrupted naturally during conversation.\n\nSee it in action: https://www.youtube.com/watch?v=iPqDASo2gsQ\n\n[![Backend Tests](https://img.shields.io/badge/backend_tests-pytest-green.svg)](https://docs.pytest.org/en/stable/) [![Frontend Tests](https://img.shields.io/badge/frontend_tests-jest-red.svg)](https://jestjs.io/)\n\n## Key Features\n\n- **Real-time voice conversations** with GPT-4o\n- **Low-latency responses** through WebSocket streaming\n- **Natural interruption handling** - speak while the AI is talking to interrupt it\n- **Multi-service architecture** optimizing each part of the conversation pipeline:\n  - Deepgram for speech-to-text\n  - OpenAI GPT-4o for language processing\n  - Cartesia TTS for high-quality voice output\n\n## Advantages Over Other Systems\n\n- **Speed**: Optimized for reduced latency compared to single-provider solutions\n- **Voice Quality**: Uses Cartesia's \"British Reading Lady\" voice for natural speech\n- **Interruption**: Supports natural conversation flow with immediate response to interruptions\n- **Customizable**: Each component can be swapped with alternatives\n\n## Getting Started\n\n### Setup\n\n```bash\npython3 -m venv venv\nsource venv/bin/activate\npip install -r requirements.txt\ncp env.example .env # and add your API credentials\n```\n\n### Required API Keys\n\nAdd the following to your `.env` file:\n- `OPENAI_API_KEY` - For the GPT-4o language model\n- `DEEPGRAM_API_KEY` - For speech recognition\n- `CARTESIA_API_KEY` - For text-to-speech\n\n### Run the Bot Server\n\n```bash\npython bot.py\n```\n\n### Run the Web Client\n\n```bash\npython -m http.server\n```\n\nThen, visit `http://localhost:8000` in your browser to start a conversation.\n\n### Run the Tests\n\n#### Python Backend Tests\n\n```bash\n# Run all Python tests\npytest\n\n# Run with coverage report\npytest --cov=. 
--cov-report=html\n\n# Run only unit tests\npytest tests/unit/\n\n# Run only integration tests\npytest tests/integration/\n```\n\n#### JavaScript Frontend Tests\n\n```bash\n# Run all JavaScript tests\nnpm test\n\n# Run with coverage\nnpm run test:coverage\n\n# Run tests in watch mode (for development)\nnpm run test:watch\n```\n\n#### Run All Tests\n\nFor convenience, you can run all tests (both backend and frontend) with:\n\n```bash\n./run_tests.sh\n```\n\n## Technical Architecture\n\nThe system uses a pipeline architecture:\n1. The web client captures audio and streams it to the server via WebSockets\n2. Speech is converted to text using Deepgram\n3. Text is processed by GPT-4o\n4. Responses are converted to speech using Cartesia TTS\n5. Audio is streamed back to the client for playback\n\nVoice detection monitors audio levels and triggers interruption handling when the user starts speaking during AI responses.\n\n## Testing\n\nThe project has comprehensive test coverage for both backend and frontend components.\n\n### Backend Testing\n\nThe Python backend uses pytest for testing. Tests are organized into:\n\n- **Unit Tests**: Test individual components in isolation\n- **Integration Tests**: Test interactions between components\n\nThe backend test suite includes:\n- Bot initialization and configuration\n- Pipeline setup and component connections\n- Text processing and transformation\n- Session timeout handling\n- Event handling\n\nTo write new Python tests, add them to the appropriate directory under `tests/`.\n\n### Frontend Testing\n\nThe JavaScript frontend uses Jest for testing. 
Tests are organized by component:\n\n- **Unit Tests**: Test individual JS modules\n- **UI Tests**: Test DOM interactions and UI updates\n\nThe frontend test suite includes:\n- Configuration validation\n- UI state management\n- Audio processing\n- WebSocket communication\n- Event handling\n\nTo write new JavaScript tests, add them to the `js/__tests__/` directory.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ft-rav%2Fai-operator","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ft-rav%2Fai-operator","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ft-rav%2Fai-operator/lists"}