https://github.com/maximbilan/habla-ios
SwiftUI iOS app for real-time translated phone calls and AI agent mode powered by habla-core
- Host: GitHub
- URL: https://github.com/maximbilan/habla-ios
- Owner: maximbilan
- Created: 2026-02-16T16:28:53.000Z (4 months ago)
- Default Branch: main
- Last Pushed: 2026-03-08T09:52:35.000Z (3 months ago)
- Last Synced: 2026-03-08T13:59:17.288Z (3 months ago)
- Topics: amazon-nova, ios, realtime-audio, speech-translation, swift, swiftui, twilio, voice-ai, websocket
- Language: Swift
- Homepage:
- Size: 1.69 MB
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
# Habla iOS
iOS client for Habla with two modes:
- **Live Call Mode**: real-time phone-call translation.
- **Agent Mode**: an AI phone agent that calls on behalf of the user.
For the detailed runtime and system design, see [architecture.md](./architecture.md).
## Current Feature Set
- Live call translation over PSTN (Twilio-backed backends).
- Active call UX improvements:
  - short neutral pre-answer tone while waiting for remote presence,
  - clear runtime phase status (`Listening -> Translating -> Speaking`),
  - local mic and remote activity indicators.
- Agent mode with:
  - live transcript,
  - mid-call instruction injection,
  - critical confirmation prompts,
  - verified facts summary.
- Post-call summary screen with verified facts and conversation timeline.
- Call history list with persisted summaries/conversations.
- Caller memory preferences per phone number (local, consent-based).
- Caller ID management (verify/list/delete/select outbound caller ID).
- Device-level caller ID isolation via `X-Habla-Device-ID` handled by backend + `habla-accounts`.
- Backend selection (Nova/Gemini), source/target language selection, and voice gender selection.
## Architecture
Redux-like unidirectional data flow with SwiftUI:
- `AppState`: single source of truth
- `AppAction`: all state transitions
- `appReducer`: pure state reducer
- `Store`: dispatches actions through middleware, then reduces state
- `Middleware`: side effects (network, WebSocket, audio, persistence)
Core middlewares:
- `NetworkMiddleware`, `WebSocketMiddleware`, `AudioMiddleware`
- `AgentNetworkMiddleware`, `AgentWebSocketMiddleware`
- `CallHistoryMiddleware`, `CallerMemoryMiddleware`, `CallerIdMiddleware`
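The data flow described above can be sketched in a few lines. This is a minimal, synchronous illustration, not the app's actual code: the `CallPhase` cases `idle` and `connecting`, the shape of `AppAction`, the `Middleware` signature, and `connectMiddleware` are all assumptions made for the example, and SwiftUI observation is left out.

```swift
// Hypothetical, simplified types; the app's real AppState and AppAction are richer.
enum CallPhase: Equatable {
    case idle, connecting, listening, translating, speaking
}

struct AppState: Equatable {
    var phase: CallPhase = .idle
}

enum AppAction: Equatable {
    case startCall(number: String)
    case phaseChanged(CallPhase)
    case callEnded
}

// Pure reducer: only state transitions, no side effects.
func appReducer(state: inout AppState, action: AppAction) {
    switch action {
    case .startCall:
        state.phase = .connecting
    case .phaseChanged(let phase):
        state.phase = phase
    case .callEnded:
        state.phase = .idle
    }
}

// Assumed middleware shape: it sees the current state and the action,
// and may emit follow-up actions through the dispatch closure.
typealias Middleware = (AppState, AppAction, (AppAction) -> Void) -> Void

// Stand-in for a network/WebSocket side effect that reports the call as live.
func connectMiddleware(state: AppState, action: AppAction, dispatch: (AppAction) -> Void) {
    if case .startCall = action {
        dispatch(.phaseChanged(.listening))
    }
}

final class Store {
    private(set) var state = AppState()
    private let middlewares: [Middleware]

    init(middlewares: [Middleware] = []) {
        self.middlewares = middlewares
    }

    func dispatch(_ action: AppAction) {
        // Run middlewares first and collect any follow-up actions they emit.
        var followUps: [AppAction] = []
        for middleware in middlewares {
            middleware(state, action) { followUps.append($0) }
        }
        // Then reduce the original action, and finally process the follow-ups.
        appReducer(state: &state, action: action)
        for next in followUps { dispatch(next) }
    }
}

let store = Store(middlewares: [connectMiddleware])
store.dispatch(.startCall(number: "+15550100"))
```

Queuing follow-up actions until after the reducer runs keeps the ordering predictable: the original action is always applied before anything a middleware emits in response to it.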
## Requirements
- iOS 17.0+
- Xcode 16.2+ (Swift 6 toolchain)
- Swift 6
- One running backend (`habla-core` or `habla-core-gemini`)
## Setup
1. Configure backend URLs:

   ```bash
   cp .env.example .env
   ```

   Set at least one of:
   - `HABLA_BACKEND_URL_NOVA`
   - `HABLA_BACKEND_URL_GEMINI`
   - Legacy fallback also supported: `HABLA_BACKEND_URL`

   Optional:
   - `HABLA_BACKEND_URL_DEFAULT`
   - `HABLA_SECRET` and `HABLA_APP_BUNDLE_ID` (if backend request auth is enabled)
   - `.env` is ignored by git; do not commit real secrets.
   - `Sources/Config/Config.swift` is generated and may contain auth material; review before committing.
2. Generate `Sources/Config/Config.swift`:
   - Local/manual: run `ci_scripts/ci_post_clone.sh` after setting env vars (it reads shell env first, then `.env`).
   - Xcode Cloud: set the same env vars in workflow settings.
3. Open `habla-ios.xcodeproj` and run.

A real device is recommended for full audio/call testing.
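The lookup order mentioned in step 2 (shell environment first, then `.env`) can be illustrated as follows. The real logic lives in the shell script `ci_scripts/ci_post_clone.sh`; this Swift sketch only mirrors the precedence rule, and `parseDotEnv`, `resolve`, and the simple `KEY=value` parsing are assumptions made for the example.

```swift
import Foundation

// Parses simple KEY=value lines; blank lines and lines starting with "#" are skipped.
// Quoting and escaping rules are deliberately not handled in this sketch.
func parseDotEnv(_ text: String) -> [String: String] {
    var values: [String: String] = [:]
    for rawLine in text.split(whereSeparator: \.isNewline) {
        let line = rawLine.trimmingCharacters(in: .whitespaces)
        guard !line.isEmpty, !line.hasPrefix("#"),
              let separator = line.firstIndex(of: "=") else { continue }
        let key = line[..<separator].trimmingCharacters(in: .whitespaces)
        let value = line[line.index(after: separator)...].trimmingCharacters(in: .whitespaces)
        values[key] = value
    }
    return values
}

// A non-empty shell environment value wins; otherwise fall back to the .env value.
func resolve(_ key: String, environment: [String: String], dotEnv: [String: String]) -> String? {
    if let value = environment[key], !value.isEmpty { return value }
    return dotEnv[key]
}

let dotEnv = parseDotEnv("""
# example values only
HABLA_BACKEND_URL_NOVA=https://nova.example
HABLA_BACKEND_URL_GEMINI = https://gemini.example
""")
let shellEnv = ["HABLA_BACKEND_URL_NOVA": "https://override.example"]
let novaURL = resolve("HABLA_BACKEND_URL_NOVA", environment: shellEnv, dotEnv: dotEnv)
let geminiURL = resolve("HABLA_BACKEND_URL_GEMINI", environment: shellEnv, dotEnv: dotEnv)
```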
## Backend Contracts Used By iOS
### Live Call Mode
- `POST /call`
- `POST /call/{sid}/end`
- `WS /ws/{call_sid}`
Note: live call status updates are handled via WebSocket `status` events.
Binary audio format over WS:
- iOS -> backend: PCM16, mono, 16 kHz
- backend -> iOS: PCM16, mono, 16 kHz
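To make the wire format concrete, here is a small sketch that packs normalized float samples into PCM16 bytes. Little-endian byte order and the 20 ms frame size are assumptions for illustration; the README does not state either, and `pcm16Data` and `duration(ofPCM16:sampleRate:)` are hypothetical helper names.

```swift
import Foundation

// Converts normalized Float samples (-1.0...1.0) into 16-bit signed PCM bytes.
// Little-endian order is an assumption, not something the backend contract states.
func pcm16Data(from samples: [Float]) -> Data {
    var data = Data(capacity: samples.count * MemoryLayout<Int16>.size)
    for sample in samples {
        // Clamp first so out-of-range input cannot overflow Int16.
        let clamped = max(-1.0, min(1.0, sample))
        let value = Int16((clamped * Float(Int16.max)).rounded())
        withUnsafeBytes(of: value.littleEndian) { data.append(contentsOf: $0) }
    }
    return data
}

// Duration in seconds of a mono PCM16 buffer at the given sample rate.
func duration(ofPCM16 data: Data, sampleRate: Double = 16_000) -> Double {
    Double(data.count / MemoryLayout<Int16>.size) / sampleRate
}

// A hypothetical 20 ms frame at 16 kHz mono: 320 samples, 2 bytes each.
let frame = pcm16Data(from: [Float](repeating: 0, count: 320))
```

At 16 kHz mono with 2 bytes per sample, one second of audio is 32,000 bytes in each direction.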
### Agent Mode
- `POST /agent/call`
- `POST /agent/call/{sid}/end`
- `WS /agent/ws/{call_sid}`
### Caller ID
- `POST /caller-id/verify/start`
- `GET /caller-id/verify/status/{phone_number}`
- `GET /caller-id/list`
- `DELETE /caller-id/{sid}`
### Headers
When enabled on backend:
- `Authorization: ` (generated by `ci_scripts/ci_post_clone.sh`)
For caller-id ownership isolation:
- `X-Habla-Device-ID` (generated by app and persisted on device)
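A request builder for a few of the REST endpoints above might look like the sketch below. `HablaEndpoint` and `makeRequest` are hypothetical names, the `Authorization` value is treated as an opaque string supplied by the caller, and attaching `X-Habla-Device-ID` to every request (rather than only caller ID requests) is an assumption.

```swift
import Foundation
#if canImport(FoundationNetworking)
import FoundationNetworking
#endif

// A subset of the documented endpoints, modeled as a hypothetical enum.
enum HablaEndpoint {
    case startCall
    case endCall(sid: String)
    case listCallerIDs
    case deleteCallerID(sid: String)

    var method: String {
        switch self {
        case .startCall, .endCall: return "POST"
        case .listCallerIDs: return "GET"
        case .deleteCallerID: return "DELETE"
        }
    }

    var pathComponents: [String] {
        switch self {
        case .startCall: return ["call"]
        case .endCall(let sid): return ["call", sid, "end"]
        case .listCallerIDs: return ["caller-id", "list"]
        case .deleteCallerID(let sid): return ["caller-id", sid]
        }
    }
}

func makeRequest(_ endpoint: HablaEndpoint,
                 baseURL: URL,
                 deviceID: String,
                 authorization: String? = nil) -> URLRequest {
    // Append components one at a time so each is treated as a single path segment.
    let url = endpoint.pathComponents.reduce(baseURL) { $0.appendingPathComponent($1) }
    var request = URLRequest(url: url)
    request.httpMethod = endpoint.method
    request.setValue(deviceID, forHTTPHeaderField: "X-Habla-Device-ID")
    // Only sent when backend request auth is enabled; the value format is not assumed here.
    if let authorization {
        request.setValue(authorization, forHTTPHeaderField: "Authorization")
    }
    return request
}

let request = makeRequest(.endCall(sid: "CA123"),
                          baseURL: URL(string: "https://backend.example")!,
                          deviceID: "device-1")
```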
## Supported Translation Languages
- `en-US`, `en-GB`, `en-AU`, `en-IN`
- `es-US`, `fr-FR`, `de-DE`, `it-IT`, `pt-BR`, `hi-IN`
## Persistence
- Call metadata/history: SwiftData (`CallRecordModel`)
- Conversation archives: app support files via `CallHistoryMiddleware`
- Caller memory: local JSON cache
- Device identity: UserDefaults cache with Keychain-backed persistence
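As an illustration of the local JSON cache for caller memory, the sketch below stores per-number preferences with `Codable` and round-trips them through a file. The field names, the `consentGiven` flag, and the file name are assumptions for the example, not the app's actual schema.

```swift
import Foundation

// Hypothetical per-caller preferences.
struct CallerMemory: Codable, Equatable {
    var preferredLanguage: String?
    var notes: [String] = []
}

// Hypothetical cache keyed by phone number and persisted as a single JSON file.
struct CallerMemoryCache: Codable, Equatable {
    private(set) var entries: [String: CallerMemory] = [:]

    // Nothing is stored unless the user has consented.
    mutating func remember(_ memory: CallerMemory, for phoneNumber: String, consentGiven: Bool) {
        guard consentGiven else { return }
        entries[phoneNumber] = memory
    }

    func memory(for phoneNumber: String) -> CallerMemory? {
        entries[phoneNumber]
    }

    func save(to url: URL) throws {
        try JSONEncoder().encode(self).write(to: url, options: .atomic)
    }

    static func load(from url: URL) throws -> CallerMemoryCache {
        try JSONDecoder().decode(CallerMemoryCache.self, from: Data(contentsOf: url))
    }
}

var cache = CallerMemoryCache()
cache.remember(CallerMemory(preferredLanguage: "es-US", notes: ["Prefers morning calls"]),
               for: "+15550100", consentGiven: true)
cache.remember(CallerMemory(preferredLanguage: "fr-FR"),
               for: "+15550101", consentGiven: false)

let fileURL = FileManager.default.temporaryDirectory.appendingPathComponent("caller-memory.json")
try cache.save(to: fileURL)
let restored = try CallerMemoryCache.load(from: fileURL)
```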
## Project Structure
```text
Sources/
  App/
  Actions/
  Core/
  Models/
  Middlewares/
  UI/
  Extensions/
  Config/
```
## Tech Stack
- Swift 6 + SwiftUI
- AVAudioEngine
- URLSession REST + WebSocket
- SwiftData