https://github.com/arshcypherz/dialmate-backend
Backend for DialMate project.
https://github.com/arshcypherz/dialmate-backend
Last synced: 7 months ago
JSON representation
Backend for DialMate project.
- Host: GitHub
- URL: https://github.com/arshcypherz/dialmate-backend
- Owner: ArshCypherZ
- License: mit
- Created: 2025-01-14T17:16:02.000Z (9 months ago)
- Default Branch: dev
- Last Pushed: 2025-01-14T18:14:52.000Z (9 months ago)
- Last Synced: 2025-01-14T19:06:40.246Z (9 months ago)
- Language: Python
- Size: 59.6 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
-
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
README
# Hackathon Submission: WebRTC Connection with Phone AI Agent
## WebRTC Client Code
Below is the core WebRTC connection code. It sets up an RTVIClient instance with configurable parameters for audio playback and event handling:```typescript
const rtviClient = new RTVIClient({
transport,
params: {
baseUrl: "http://localhost:7860/",
},
enableMic: true,
enableCam: false,
timeout: 30 * 1000,
});
```This snippet includes event handlers for:
- Setting up audio playback
- Handling user interface interactions---
## Environment Setup
### Step 1: Configure Environment Variables
Rename the sample environment file and populate the required variables:
```bash
mv sample.env .env
```
Fill in the following variables:
- `NGROK_URL`: Generated using `ngrok http 7860`
- `DAILY_API_KEY`: Obtain from [daily.co](https://www.daily.co)
- `OPEN_API_KEY`: Obtain from [OpenAI](https://platform.openai.com)
- `GEMINI_API_KEY`: Obtain from [Google Console](https://console.cloud.google.com)
- Twilio variables: Obtain from [twilio.com](https://www.twilio.com)### Step 2: Run Gemini Backend Server
Navigate to the server directory, install dependencies, and start the server:
```bash
cd server
pip3 install -r requirements.txt
python server.py
```---
## Web Application Setup
### Step 1: Install and Run the Web App
Install dependencies and start the development server:
```bash
npm i
npm run dev
```Or it is better to use some deployment sites like vercel for automation. Else there will be CORS issue if deployed on a non SSL server.
**Note:** The web app cannot be run directly in the browser without a customer ID. Use the hosted version at: [DialMateAI](https://dialmateai.vercel.app)
---
## Phone AI Agent
### Step 1: Make a Test Call
To test the Phone AI Agent, use the following cURL command:
```bash
curl -X POST https:///make-call \
-H "Content-Type: application/json" \
-d '{"to_phone_number": "+9111111"}'
```
Replace `+9111111` with your phone number. **Note:** Only verified numbers from the Twilio console can be called.### Step 2: Test the phone ai agent
For your number to be verified, contact us directly at arsh0javed@gmail.comTherefore, use our web app to test which is [DialMateAI](https://dialmateai.vercel.app)
---## Additional Notes
- This submission showcases a seamless integration of WebRTC, Twilio, and AI capabilities.
- The hosted web app allows users to experience the application without complex setup.### Architecture for Gemini + WebRTC
```
┌─────────────────────────────────────────┐
│ │
│ Server │
│ │
│ │
│ ┌────────────────────┐ │
│ │ │ │
│ │ Pipecat │ │
│ │ Pipeline │ │
│ │ │ │
│ │ │ │
┌──────────────────────────┐ │ │ Audio Processing │ │
│ │ │ │ ▼ │ │
│ Pipecat Client │ │ ┌─────────────│ Gemini Flash ─┼──┼────►
│ ┌───────────────┐ │ │ │ │ Transcription ◄┼──┼─────
│ │ WebRTC (Daily)│ ────┼────────►│WebRTC (Daily) ▼ │ │
│ │ Transport │ ◄───┼─────────│ Transport │ Gemini Multimodal─┼──┼────►
│ └───────────────┘ │ │ │ │ Live API ◄┼──┼─────
│ │ │ └─────────────│ ▼ │ │
└──────────────────────────┘ │ │ Gemini Flash ─┼──┼────►
│ │ Transcription ◄┼──┼─────
│ │ ▼ │ │
│ │ Conversation │ │
│ │ Context │ │
│ │ Management │ │
│ │ ▼ │ │
│ │ RTVI Events │ │
│ │ │ │
│ └────────────────────┘ │
│ │
└─────────────────────────────────────────┘
```### Architecture for OpenAI + Twilio

### Features
- Function (Tool) Calling
- Low latency with cloud WebRTC
- Integration with Twilio for high quality phone calls
- OpenAI and Gemini Realtime Support
- Voice Activity Detection (VAD)
- Configurable Natural Sounding Voices
- Highly intelligent sales agent**Note**:
Since you cannot run the phone ai agent without getting verified, here is an example demo:
https://youtube.com/shorts/r8-pi9Vtf7w?si=5_790MHq0RiQ6PRPThank You.