https://github.com/maximbilan/habla-core
FastAPI backend for real-time phone-call translation and AI agent mode using Amazon Nova 2 Sonic and Twilio
https://github.com/maximbilan/habla-core
amazon-nova fastapi python realtime-audio speech-translation twilio voice-ai websocket
Last synced: about 9 hours ago
JSON representation
FastAPI backend for real-time phone-call translation and AI agent mode using Amazon Nova 2 Sonic and Twilio
- Host: GitHub
- URL: https://github.com/maximbilan/habla-core
- Owner: maximbilan
- Created: 2026-02-16T09:39:47.000Z (4 months ago)
- Default Branch: main
- Last Pushed: 2026-05-08T11:11:12.000Z (26 days ago)
- Last Synced: 2026-05-08T12:17:54.151Z (26 days ago)
- Topics: amazon-nova, fastapi, python, realtime-audio, speech-translation, twilio, voice-ai, websocket
- Language: Python
- Homepage:
- Size: 184 KB
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 0
-
Metadata Files:
- Readme: README.md
Awesome Lists containing this project
README
# Habla Core
AWS/Nova backend for Habla app.
This service supports both iOS modes:
- **Live Call Mode**: low-latency bidirectional phone-call translation
- **Agent Mode**: autonomous caller agent with transcript + verified-facts signals
System architecture and sequence diagrams: [`architecture.md`](architecture.md)
Direct Agent Mode flow diagram: [`architecture.md#61-agent-mode-runtime-sequence`](architecture.md#61-agent-mode-runtime-sequence).
## Current Implementation Summary
### Live Call Mode (fast audio path)
- Uses two Amazon Nova 2 Sonic sessions per call:
- iOS -> callee language (to Twilio/PSTN)
- callee -> iOS language (back to app)
- Streams audio in both directions continuously
- Optimized for latency: translation call mode focuses on audio forwarding
### Agent Mode
- Twilio call orchestration with model-driven agent conversation.
- WebSocket events for:
- call status
- agent status (`listening/thinking/speaking`)
- transcript and transcript updates
- critical confirmations
- verified facts summar.
### Caller ID Isolation
- Caller ID verification/list/delete endpoints are provided
- Ownership is enforced per device via `X-Habla-Device-ID`
- Shared ownership state is delegated to `habla-accounts` (`HABLA_ACCOUNTS_*`)
## API Surface
### Translation
- `GET /`
- `GET /translation/languages`
- `POST /call`
- `POST /call/{sid}/end`
- `GET /call/{sid}/status`
- `POST /twilio/webhook` (compatibility TwiML endpoint)
- `WS /ws/{call_sid}`
- `WS /twilio/media-stream`
### Agent
- `POST /agent/call`
- `POST /agent/call/{call_sid}/end`
- `GET /agent/call/{call_sid}/status`
- `POST /agent/twilio/webhook/{call_sid}`
- `WS /agent/ws/{call_sid}`
- `WS /agent/twilio/media-stream/{call_sid}`
### Caller ID
- `POST /caller-id/verify/start`
- `GET /caller-id/verify/status/{phone_number}`
- `GET /caller-id/list`
- `DELETE /caller-id/{sid}`
## Request Authentication
If `HABLA_SECRET` is set, iOS-facing REST + WS routes require:
- `Authorization: HMAC_SHA256(HABLA_SECRET, HABLA_APP_BUNDLE_ID)`
Caller ID ownership-sensitive routes also require:
- `X-Habla-Device-ID`
## Supported Languages
- `en-US`, `en-GB`, `en-AU`, `en-IN`
- `es-US`, `fr-FR`, `de-DE`, `it-IT`, `pt-BR`, `hi-IN`
## Local Development
### 1) Install
```bash
python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
# test-only deps
pip install pytest httpx
```
### 2) Configure
```bash
cp .env.example .env
```
Required groups:
- AWS credentials + `AWS_REGION`
- Twilio credentials + `TWILIO_FROM_NUMBER`
- `PUBLIC_URL` reachable by Twilio
Optional/conditional:
- `HABLA_SECRET`, `HABLA_APP_BUNDLE_ID`
- `HABLA_ACCOUNTS_BASE_URL`, `HABLA_ACCOUNTS_SERVICE_TOKEN`, `HABLA_ACCOUNTS_TIMEOUT_SECONDS`
### 3) Run
```bash
uvicorn app.main:app --host 0.0.0.0 --port 8000 --reload
```
## Tests
```bash
python -m compileall app tests
PYTHONPATH=. pytest -q
```
## Deployment (EC2)
Main branch deploy is handled by `.github/workflows/deploy-ec2.yml`:
- runs syntax validation
- rsyncs source to EC2
- installs dependencies in server venv
- restarts `habla-core` systemd service
- runs local health check (`http://127.0.0.1:8000/`)
Required GitHub variables/secrets are defined in the workflow (`EC2_*`, `EC2_SSH_PRIVATE_KEY`).
## Repository Layout
```text
app/
main.py
config.py
models.py
call_manager.py
translation_bridge.py
nova_sonic.py
audio_utils.py
request_auth.py
twilio_handler.py
language_support.py
caller_id/
agent/
```