https://github.com/maximbilan/habla-ios

SwiftUI iOS app for real-time translated phone calls and AI agent mode powered by habla-core
amazon-nova ios realtime-audio speech-translation swift swiftui twilio voice-ai websocket

# Habla iOS

iOS client for Habla with two modes:

- **Live Call Mode**: real-time phone-call translation.
- **Agent Mode**: an AI phone agent that calls on behalf of the user.

For the detailed runtime and system design, see [architecture.md](./architecture.md).

## Current Feature Set

- Live call translation over PSTN (Twilio-backed backends).
- Active call UX improvements:
  - short neutral pre-answer tone while waiting for remote presence,
  - clear runtime phase status (`Listening -> Translating -> Speaking`),
  - local mic and remote activity indicators.
- Agent mode with:
  - live transcript,
  - mid-call instruction injection,
  - critical confirmation prompts,
  - verified facts summary.
- Post-call summary screen with verified facts and conversation timeline.
- Call history list with persisted summaries/conversations.
- Caller memory preferences per phone number (local, consent-based).
- Caller ID management (verify/list/delete/select outbound caller ID).
- Device-level caller ID isolation via `X-Habla-Device-ID` handled by backend + `habla-accounts`.
- Backend selection (Nova/Gemini), source/target language selection, and voice gender selection.

## Architecture

Redux-like unidirectional data flow with SwiftUI:

- `AppState`: single source of truth
- `AppAction`: all state transitions
- `appReducer`: pure state reducer
- `Store`: dispatches actions through middleware, then reduces state
- `Middleware`: side effects (network, WebSocket, audio, persistence)
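
The flow described above can be sketched in a few lines. This is a minimal, synchronous illustration, not the app's actual code: the fields of `AppState`, the cases of `AppAction`, the `Middleware` signature, and `fakeNetworkMiddleware` are all assumptions made for the example, and real middlewares would perform their side effects asynchronously.

```swift
import Foundation

// Hypothetical, trimmed-down state and actions (the real types are larger).
struct AppState: Equatable {
    var isCalling = false
    var log: [String] = []
}

enum AppAction: Equatable {
    case startCall(number: String)
    case callStarted
    case callEnded
}

// Pure reducer: computes the next state and performs no side effects.
func appReducer(_ state: AppState, _ action: AppAction) -> AppState {
    var next = state
    switch action {
    case .startCall(let number):
        next.log.append("dialing \(number)")
    case .callStarted:
        next.isCalling = true
    case .callEnded:
        next.isCalling = false
    }
    return next
}

// Assumed middleware shape: sees the pre-reduction state and the action,
// and may emit follow-up actions through the `dispatch` callback.
typealias Middleware = (AppState, AppAction, (AppAction) -> Void) -> Void

// Stand-in for a network middleware: pretends the backend accepted the call.
func fakeNetworkMiddleware(state: AppState, action: AppAction, dispatch: (AppAction) -> Void) {
    if case .startCall = action {
        dispatch(.callStarted)
    }
}

final class Store {
    private(set) var state: AppState
    private let middlewares: [Middleware]

    init(initial: AppState, middlewares: [Middleware]) {
        self.state = initial
        self.middlewares = middlewares
    }

    func dispatch(_ action: AppAction) {
        // 1. Run the action through every middleware, queueing follow-ups.
        var followUps: [AppAction] = []
        for middleware in middlewares {
            middleware(state, action) { followUps.append($0) }
        }
        // 2. Reduce the original action.
        state = appReducer(state, action)
        // 3. Process queued follow-up actions in order.
        for next in followUps {
            dispatch(next)
        }
    }
}
```

Queueing follow-up actions until after the reduction keeps the order of state transitions predictable; dispatching `.startCall` here first appends a log entry and then flips `isCalling` via `.callStarted`.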

Core middlewares:

- `NetworkMiddleware`, `WebSocketMiddleware`, `AudioMiddleware`
- `AgentNetworkMiddleware`, `AgentWebSocketMiddleware`
- `CallHistoryMiddleware`, `CallerMemoryMiddleware`, `CallerIdMiddleware`

## Requirements

- iOS 17.0+
- Xcode 16.2+ (Swift 6 toolchain)
- Swift 6
- One running backend (`habla-core` or `habla-core-gemini`)

## Setup

1. Configure backend URLs:

```bash
cp .env.example .env
```

Set at least one of:

- `HABLA_BACKEND_URL_NOVA`
- `HABLA_BACKEND_URL_GEMINI`
- Legacy fallback also supported: `HABLA_BACKEND_URL`

Optional:

- `HABLA_BACKEND_URL_DEFAULT`
- `HABLA_SECRET` and `HABLA_APP_BUNDLE_ID` (if backend request auth is enabled)

Notes:

- `.env` is ignored by git; do not commit real secrets.
- `Sources/Config/Config.swift` is generated and may contain auth material; review it before committing.

2. Generate `Sources/Config/Config.swift`:

- Local/manual: run `ci_scripts/ci_post_clone.sh` after setting env vars (it reads shell env first, then `.env`).
- Xcode Cloud: set the same env vars in workflow settings.

3. Open `habla-ios.xcodeproj` and run.

A real device is recommended for full audio/call testing.
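
As a rough illustration of how the backend URL variables from step 1 might be resolved, the sketch below prefers the backend-specific variable and falls back to the legacy `HABLA_BACKEND_URL`. The `Backend` enum, the `resolveBackendURL` function, and the precedence order are assumptions; the generated `Config.swift` may work differently, and `HABLA_BACKEND_URL_DEFAULT` is left out because its semantics are not described here.

```swift
import Foundation

// Hypothetical enum mirroring the two selectable backends.
enum Backend: String {
    case nova = "NOVA"
    case gemini = "GEMINI"
}

// Assumed precedence: backend-specific variable first, then the legacy fallback.
func resolveBackendURL(for backend: Backend, env: [String: String]) -> URL? {
    let candidateKeys = ["HABLA_BACKEND_URL_\(backend.rawValue)", "HABLA_BACKEND_URL"]
    for key in candidateKeys {
        guard let raw = env[key]?.trimmingCharacters(in: .whitespacesAndNewlines),
              !raw.isEmpty,
              let url = URL(string: raw) else {
            continue
        }
        return url
    }
    return nil // No usable URL configured for this backend.
}
```

In a command-line context the `env` argument could be `ProcessInfo.processInfo.environment`; inside the app the values would come from the generated configuration instead.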

## Backend Contracts Used By iOS

### Live Call Mode

- `POST /call`
- `POST /call/{sid}/end`
- `WS /ws/{call_sid}`

Note: live call status updates are handled via WebSocket `status` events.

Binary audio format over WS:

- iOS -> backend: PCM16, mono, 16 kHz
- backend -> iOS: PCM16, mono, 16 kHz
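
To make the wire format concrete, here is a small sketch that sizes a frame and converts floating-point samples (such as those produced by an audio tap) into PCM16 bytes. The `AudioWireFormat` namespace and both function names are illustrative, and little-endian byte order is an assumption, since the format above does not state endianness.

```swift
import Foundation

// Constants taken from the format listed above: PCM16, mono, 16 kHz.
enum AudioWireFormat {
    static let sampleRate = 16_000
    static let bytesPerSample = 2
    static let channels = 1
}

// Number of bytes needed to carry `ms` milliseconds of audio.
func byteCount(forMilliseconds ms: Int) -> Int {
    let samples = AudioWireFormat.sampleRate * ms / 1000
    return samples * AudioWireFormat.bytesPerSample * AudioWireFormat.channels
}

// Converts samples in the range -1.0...1.0 to 16-bit signed integers.
// Assumption: little-endian byte order on the wire.
func pcm16Data(from samples: [Float]) -> Data {
    var data = Data(capacity: samples.count * AudioWireFormat.bytesPerSample)
    for sample in samples {
        let safe: Float = sample.isNaN ? 0 : sample      // treat NaN as silence
        let clamped = max(-1.0, min(1.0, safe))          // avoid integer overflow
        let value = Int16(clamped * Float(Int16.max))    // scale, truncating toward zero
        withUnsafeBytes(of: value.littleEndian) { data.append(contentsOf: $0) }
    }
    return data
}
```

At 16 kHz mono PCM16, one second of audio is 32,000 bytes, so a 20 ms frame is 640 bytes.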

### Agent Mode

- `POST /agent/call`
- `POST /agent/call/{sid}/end`
- `WS /agent/ws/{call_sid}`
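
The Live Call and Agent endpoints follow the same shape, differing only by the `/agent` prefix. One possible way to model that in the client is sketched below; `CallMode`, `CallEndpoint`, and `url(for:baseURL:)` are hypothetical names, and mapping `https` to `wss` (anything else to `ws`) is an assumption about how the socket URL is derived from the backend URL.

```swift
import Foundation

enum CallMode {
    case live
    case agent
}

// Hypothetical model of the call endpoints listed above.
enum CallEndpoint {
    case start(CallMode)
    case end(CallMode, sid: String)
    case socket(CallMode, callSid: String)

    private var mode: CallMode {
        switch self {
        case .start(let mode), .end(let mode, _), .socket(let mode, _):
            return mode
        }
    }

    // HTTP method for REST endpoints; nil for the WebSocket endpoint.
    var httpMethod: String? {
        switch self {
        case .start, .end: return "POST"
        case .socket: return nil
        }
    }

    var path: String {
        let prefix = (mode == .agent) ? "/agent" : ""
        switch self {
        case .start:
            return "\(prefix)/call"
        case .end(_, let sid):
            return "\(prefix)/call/\(sid)/end"
        case .socket(_, let callSid):
            return "\(prefix)/ws/\(callSid)"
        }
    }
}

// Builds the full URL, switching to a WebSocket scheme for socket endpoints.
func url(for endpoint: CallEndpoint, baseURL: URL) -> URL? {
    guard var components = URLComponents(url: baseURL, resolvingAgainstBaseURL: false) else {
        return nil
    }
    // Drop a trailing slash on the base path so paths are not doubled up.
    let basePath = components.path.hasSuffix("/")
        ? String(components.path.dropLast())
        : components.path
    components.path = basePath + endpoint.path
    if endpoint.httpMethod == nil {
        components.scheme = (components.scheme == "https") ? "wss" : "ws"
    }
    return components.url
}
```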

### Caller ID

- `POST /caller-id/verify/start`
- `GET /caller-id/verify/status/{phone_number}`
- `GET /caller-id/list`
- `DELETE /caller-id/{sid}`

### Headers

When enabled on backend:

- `Authorization` (value generated by `ci_scripts/ci_post_clone.sh`)

For caller-id ownership isolation:

- `X-Habla-Device-ID` (generated by app and persisted on device)
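
A request carrying both headers could be assembled roughly as follows. The function name and parameters are hypothetical; the `Authorization` value is passed in as an opaque string because its format is not specified here, and it is omitted entirely when backend auth is disabled.

```swift
import Foundation
#if canImport(FoundationNetworking)
import FoundationNetworking // URLRequest lives here on non-Apple platforms
#endif

// Sketch: builds `GET /caller-id/list` with the headers described above.
func callerIdListRequest(baseURL: URL, deviceID: String, authorization: String?) -> URLRequest {
    var request = URLRequest(url: baseURL.appendingPathComponent("caller-id/list"))
    request.httpMethod = "GET"
    // Always sent so the backend can isolate caller IDs per device.
    request.setValue(deviceID, forHTTPHeaderField: "X-Habla-Device-ID")
    // Only sent when backend request auth is enabled.
    if let authorization {
        request.setValue(authorization, forHTTPHeaderField: "Authorization")
    }
    return request
}
```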

## Supported Translation Languages

- `en-US`, `en-GB`, `en-AU`, `en-IN`
- `es-US`, `fr-FR`, `de-DE`, `it-IT`, `pt-BR`, `hi-IN`

## Persistence

- Call metadata/history: SwiftData (`CallRecordModel`)
- Conversation archives: app support files via `CallHistoryMiddleware`
- Caller memory: local JSON cache
- Device identity: UserDefaults cache with Keychain-backed persistence
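
The device identity entry describes a two-layer lookup: a fast cache backed by a more durable store. The sketch below illustrates that pattern under stated assumptions: `DurableStore`, `InMemoryStore`, `DeviceIdentity`, and the `habla.deviceID` key are invented for the example, and the Keychain is represented by a protocol rather than real Security framework calls.

```swift
import Foundation

// Stand-in for Keychain-backed storage; a real implementation would wrap
// the Security framework instead.
protocol DurableStore: AnyObject {
    func read(_ key: String) -> String?
    func write(_ value: String, for key: String)
}

// In-memory implementation, useful for previews and tests.
final class InMemoryStore: DurableStore {
    private var values: [String: String] = [:]
    func read(_ key: String) -> String? { values[key] }
    func write(_ value: String, for key: String) { values[key] = value }
}

struct DeviceIdentity {
    let cache: UserDefaults
    let durable: DurableStore
    let key = "habla.deviceID" // hypothetical storage key

    // Returns the persisted device ID, creating one on first use.
    func current() -> String {
        // 1. Fast path: the UserDefaults cache.
        if let cached = cache.string(forKey: key) {
            return cached
        }
        // 2. Cache miss: restore from the durable store and refill the cache.
        if let stored = durable.read(key) {
            cache.set(stored, forKey: key)
            return stored
        }
        // 3. First launch: generate, then persist in both layers.
        let fresh = UUID().uuidString
        durable.write(fresh, for: key)
        cache.set(fresh, forKey: key)
        return fresh
    }
}
```

With this layering, clearing the cache alone does not change the ID, because the next lookup restores it from the durable store.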

## Project Structure

```text
Sources/
  App/
  Actions/
  Core/
  Models/
  Middlewares/
  UI/
  Extensions/
  Config/
```

## Tech Stack

- Swift 6 + SwiftUI
- AVAudioEngine
- URLSession REST + WebSocket
- SwiftData