{"id":29604282,"url":"https://github.com/psychip/berlin-hackathon","last_synced_at":"2025-07-20T15:05:21.326Z","repository":{"id":305308678,"uuid":"1022524796","full_name":"PsyChip/berlin-hackathon","owner":"PsyChip","description":"11labs powered conversational ai agent","archived":false,"fork":false,"pushed_at":"2025-07-19T13:01:17.000Z","size":70,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2025-07-19T14:57:53.921Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"JavaScript","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/PsyChip.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2025-07-19T09:02:10.000Z","updated_at":"2025-07-19T13:01:21.000Z","dependencies_parsed_at":"2025-07-19T14:57:56.755Z","dependency_job_id":"e10cffe1-0c76-4f5b-b4c2-125b580312f1","html_url":"https://github.com/PsyChip/berlin-hackathon","commit_stats":null,"previous_names":["psychip/berlin-hackathon"],"tags_count":null,"template":false,"template_full_name":null,"purl":"pkg:github/PsyChip/berlin-hackathon","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/PsyChip%2Fberlin-hackathon","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/PsyChip%2Fberlin-hackathon/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/PsyChip%2Fberlin-hackathon/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/PsyChip%2Fberlin-hackathon/manifests","owner_url":"ht
tps://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/PsyChip","download_url":"https://codeload.github.com/PsyChip/berlin-hackathon/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/PsyChip%2Fberlin-hackathon/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":266144002,"owners_count":23883084,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2025-07-20T15:05:19.117Z","updated_at":"2025-07-20T15:05:21.321Z","avatar_url":"https://github.com/PsyChip.png","language":"JavaScript","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Conversational Voice Agent with Tool Support\n\nA conversational AI agent powered by ElevenLabs conversational agents, featuring real-time audio visualization, geographic location awareness, and integrated tool capabilities including weather, directions, and search functionality.\n\n## Team Members\n- Alec Fritsch (@flokzybtw)\n- Mehmet Ali Dolgun (@psychip_)\n\n## Live Demo\n[vox.psychip.net](https://vox.psychip.net)\n\n## Project Overview\n\nThis application demonstrates an advanced conversational AI interface with:\n- **Real-time voice conversation** using ElevenLabs Conversational AI\n- **Dynamic audio visualization** with speech activity detection\n- **Geographic awareness** with IP-based location detection\n- **Integrated tools** for weather, directions, and search\n- **Responsive web interface** with mobile optimization\n\n### Core Technologies\n- **Node.js** with Express.js server\n- **Webpack** for module bundling and 
development\n- **Web Audio API** for real-time audio processing\n- **Canvas API** for audio visualization\n- **MaxMind GeoIP2** for location detection\n\n### APIs \u0026 Services\n- **ElevenLabs API** - Voice synthesis and conversation management\n- **Google Routes API** - Driving directions (11labs tool)\n- **OpenWeather API** - Weather information (11labs tool)\n- **Google Custom Search API** - Web search capabilities (11labs tool)\n- **MaxMind GeoLite2** - Local IP geolocation databases\n\n### Frontend Libraries\n- **Sound.js** - Sound effects and noise generation\n- **Web Audio API** - Real-time audio analysis and effects\n\n## Prerequisites\n- **Node.js** (v16 or higher)\n- **npm** package manager\n- **ElevenLabs account** with API access\n- **Google Cloud Platform** account (for Routes and Search APIs)\n- **SerpAPI** for local news\n- **OpenWeather** account for weather data\n\n## Installation \u0026 Setup\n\n### 1. Clone the Repository\n```bash\ngit clone https://github.com/psychip/berlin-hackathon\ncd berlin-hackathon\n```\n\n### 2. Install Dependencies\n```bash\nnpm install\n```\n\n### 3. Environment Configuration\nCreate a `.env` file in the root directory:\n```env\n# ElevenLabs Configuration\nXI_API_KEY=your_elevenlabs_api_key\nAGENT_ID=your_elevenlabs_agent_id\n\n# Server Configuration\nPORT=3388\n```\n\nNote: the Google Cloud and SerpAPI keys are hardcoded into the 11labs tool calls, so they are not read from `.env`.\n\n### 4. ElevenLabs Agent Setup\n1. Create an account at [ElevenLabs](https://elevenlabs.io)\n2. Navigate to the Conversational AI section\n3. Create a new agent with the following configuration:\n   - **Voice**: Choose your preferred voice model\n   - **Tools**: Enable the following tools:\n\nSee the screenshots in the `./doc` folder for the detailed tool setup.\n\n4. Copy the Agent ID to your `.env` file\n\n### 5. 
Database Setup\nThe application includes MaxMind GeoLite2 databases for IP geolocation:\n- `db/GeoLite2-City.mmdb` - City-level geolocation\n- `db/GeoLite2-ASN.mmdb` - ISP/Organization data\n\nThese are included in the repository for development purposes.\n\n## Running the Application\n\n```bash\nnpm run build\nnode server.js\n```\n\n## Project Structure\n\n```\nberlin-hackathon/\n├── src/                    # Frontend source files\n│   ├── app.js             # Main application logic\n│   ├── index.html         # HTML template\n│   ├── styles.css         # Stylesheets\n├── dist/                  # Built/compiled files\n│   ├── bundle.js          # Webpack compiled bundle\n│   ├── index.html         # Production HTML\n│   └── static/            # Static assets\n├── db/                    # MaxMind GeoIP databases\n│   ├── GeoLite2-City.mmdb\n│   └── GeoLite2-ASN.mmdb\n├── server.js              # Express.js backend server\n├── system_prompt.txt      # AI agent system prompt\n├── webpack.config.js      # Webpack configuration\n├── package.json           # Project dependencies\n└── README.md             # This documentation\n```\n\n## Configuration Details\n\n### Audio Processing\n- **FFT Size**: 256 (standard), 64 (low-end devices)\n- **Smoothing**: 0.6 (standard), 0.25 (low-end)\n- **Speech Detection Threshold**: 15 (adjustable)\n- **Silence Detection**: 800ms pause for sentence end\n\n### Visualization Settings\n- **Circle Radius**: 80px\n- **Audio Multiplier**: 40 (standard), 15 (low-end)\n- **Color Speed**: 10\n- **Glow Effect**: 8 (disabled on low-end devices)\n\n### Performance Optimization\nThe application automatically detects device capabilities:\n- **Mobile devices** or devices with \u003c8GB RAM use optimized settings\n- **Manual override** available via URL parameter: `?lowperf=true/false`\n\n## API Integrations\n\n### ElevenLabs Conversational AI\n- Real-time voice synthesis and recognition\n- Custom system prompts with location awareness\n- Tool 
integration for external API calls\n- WebSocket-based communication\n\n### Location Services\n- IP-based geolocation using MaxMind GeoLite2\n- Automatic timezone and location detection\n- Privacy-focused (no external API calls for basic geolocation)\n\n## Features\n\n### Audio Visualization\n- Real-time FFT analysis\n- Circular spectrum display with rotation\n- Speech activity detection with visual feedback\n- Agent/user state differentiation\n- Performance-adaptive rendering\n\n### Conversation Management\n- Automatic greeting based on time of day\n- Subtitle display\n- List formatting for structured responses\n- Connection status monitoring\n- Error handling with audio feedback\n\n### Common Issues\n\n**Agent Not Connecting**\n- Verify ElevenLabs API key and Agent ID\n- Check network connectivity\n- Confirm agent configuration matches requirements\n\n**Performance Issues**\n- Try low performance mode: `?lowperf=true`\n- Close other audio applications\n- Use supported browsers (Chrome, Firefox, Safari)\n\nThis project was developed in 48 hours for the {Tech:Europe} Berlin Hackathon competition (19/07/2025). For evaluation purposes, please review:\n1. Code architecture and organization\n2. API integration implementations\n3. Real-time audio processing\n4. User experience design\n5. Error handling and performance optimization\n\n## 📄 License\n\nThis project is developed for educational and demonstration purposes as part of a hackathon competition.\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpsychip%2Fberlin-hackathon","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fpsychip%2Fberlin-hackathon","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpsychip%2Fberlin-hackathon/lists"}