{"id":50416848,"url":"https://github.com/srthorat/aura-sde-interview-agent","last_synced_at":"2026-05-31T06:10:03.115Z","repository":{"id":349150245,"uuid":"1201246032","full_name":"srthorat/aura-sde-interview-agent","owner":"srthorat","description":"Real-time voice AI agent for Google SDE mock interviews. Built with Google ADK + Gemini Live native audio on Vertex AI. Full interview loop (Behavioural, Coding, System Design, Debugging, Debrief) — live grading, spoken scores, cross-session memory. No text box. Just an interview.","archived":false,"fork":false,"pushed_at":"2026-04-13T05:19:27.000Z","size":5826,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-04-13T06:22:12.384Z","etag":null,"topics":["cloud-run","fastapi","gemini-live","google-adk-framework","interview-coach","livekit","python","vertex-ai","voice-ai","voice-ai-agents","webrtc"],"latest_commit_sha":null,"homepage":"https://aura-sde-interview-agent-ivhauk7c7a-uc.a.run.app/","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/srthorat.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-04-04T12:13:59.000Z","updated_at":"2026-04-13T05:34:38.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/srthorat/aura-sde-interview-agent","commit_stats":null,"previous_names":["srthorat/aura-sde-interview-agent"],"tags_count":0,"template":false,"template_full_
name":null,"purl":"pkg:github/srthorat/aura-sde-interview-agent","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/srthorat%2Faura-sde-interview-agent","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/srthorat%2Faura-sde-interview-agent/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/srthorat%2Faura-sde-interview-agent/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/srthorat%2Faura-sde-interview-agent/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/srthorat","download_url":"https://codeload.github.com/srthorat/aura-sde-interview-agent/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/srthorat%2Faura-sde-interview-agent/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":33721097,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T15:22:16.424Z","status":"online","status_checked_at":"2026-05-31T02:00:06.040Z","response_time":95,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["cloud-run","fastapi","gemini-live","google-adk-framework","interview-coach","livekit","python","vertex-ai","voice-ai","voice-ai-agents","webrtc"],"created_at":"2026-05-31T06:10:02.338Z","updated_at":"2026-05-31T06:10:03.103Z","avatar_url":"https://github.com/srthorat.png","language":"Python","funding_links":[],"catego
ries":[],"sub_categories":[],"readme":"# Aura — Google SDE Interview Coach\n\nAura is a real-time AI-powered Google SDE interview coach built on the **Google ADK** + **Gemini Live** native audio stack with **LiveKit** WebRTC transport and **Vertex AI** persistent session memory.\n\nThis repo is now in a hackathon-ready state: the core demo flow is implemented, locally testable, deploys on Google Cloud Run, and includes the final spoken scoring and memory continuity features that matter in a live judging session.\n\n## Core Features\n\n1. Real-time voice interview agent with interruption handling and Gemini server-side VAD.\n2. Four-round Google-style interview flow covering behavioural, coding, system design, and debrief.\n3. Live spoken per-round scoring on a clear 1-4 scale, plus post-call rubric grading, answer feedback, and exportable session summary.\n4. Candidate progress view for named users, showing recent sessions, average rubric score, and question volume.\n5. Named-user interview state persistence across restarts, carrying forward asked questions, rubric grades, and answer notes.\n6. Self-healing Vertex AI session-service recovery for repeated session persistence failures.\n7. Google Cloud deployment on Cloud Run with Artifact Registry, Secret Manager, Vertex AI, Terraform, and Cloud Build CI/CD.\n\n## Demo Value\n\nWhy this is a strong live demo:\n1. It is visibly multimodal: the candidate speaks naturally, Aura responds with low-latency voice, and the UI shows live transcript and session state.\n2. It demonstrates real continuity: named users come back to preserved notes, grades, and prior round context instead of starting from zero.\n3. It closes the feedback loop during the session: Aura can now speak a clear round score out of 4 before wrap-up, instead of hiding evaluation until the end.\n4. 
It is actually deployable: the same repo includes Cloud Run, Cloud Build, Artifact Registry, Secret Manager, and Terraform infrastructure.\n\n| Layer | Technology |\n|---|---|\n| Agent framework | Google ADK (`google-adk`) — `LlmAgent`, `VertexAiSessionService` |\n| Voice model | `gemini-live-2.5-flash-native-audio` via Vertex AI bidi stream |\n| WebRTC transport | LiveKit (`livekit`, `livekit-api`) |\n| Session memory | Vertex AI Agent Engine (`VertexAiSessionService`) with named-user state snapshots plus automatic recovery after repeated session-service failures |\n| Infra | Google Cloud only (Cloud Run, Artifact Registry, Secret Manager, Vertex AI) |\n| Backend | FastAPI + Python 3.11, containerised on Cloud Run |\n| Frontend | Vite + React + LiveKit JS SDK |\n| IaC | Terraform + Cloud Build CI/CD |\n\n---\n\n## Architecture\n\n![Aura Architecture Diagram](docs/Aura-Architecture.png)\n\n\u003cdetails\u003e\n\u003csummary\u003eASCII overview\u003c/summary\u003e\n\n```\nBrowser mic → LiveKit WebRTC → PCM16 @ 16 kHz\n                                    ↓\n           Silero VAD (UI-only hints)     ← bot/audio/silero_vad.py\n                                    ↓\n                         Gemini Live (Vertex AI)      ← server-side VAD\n                         gemini-live-2.5-flash-native-audio\n                                    ↓\nBrowser speaker ← LiveKit WebRTC ← PCM16 @ 24 kHz ← Gemini audio response\n```\n\n\u003c/details\u003e\n\n### VAD strategy — Gemini server-side turn detection + local UI hints\n\nAura uses **Gemini's built-in server-side VAD** for actual turn detection and interruption semantics. 
A lightweight local **Silero VAD** is still used only for fast UI speaking indicators and STS timing, not for deciding when Gemini should end a turn.\n\nServer VAD is configured via `RealtimeInputConfig` → `AutomaticActivityDetection`:\n\n| Parameter | Value | Effect |\n|---|---|---|\n| `silence_duration_ms` | `600` | 600 ms of silence ends a turn — balanced: fast response without premature cuts |\n| `prefix_padding_ms` | `200` | Includes 200 ms of audio before speech start — captures leading consonants |\n\n### Latency analysis\n\n| Approach | End-of-speech → first audio byte | Notes |\n|---|---|---|\n| **Server VAD (current)** | **~150–300 ms** | VAD runs inside Gemini's inference pipeline; no round-trip penalty |\n| Pure silence threshold (no VAD) | ~300–500 ms | Simple but clips fast speakers; misses barge-in |\n\n**Server VAD wins on latency** because:\n- VAD judgment happens inside the same process as the LLM — no extra network hop\n- Barge-in (interruption) is handled natively without any muting gate on the send path\n- `silence_duration_ms=600` is the only mandatory wait; the 150 ms figure applies when the model starts generating immediately after speech ends\n\nFull diagram and design decisions: see [Architecture](#architecture) section above.\n\n---\n\n## Local Development\n\n### Prerequisites\n\n- Python 3.11+, [`uv`](https://github.com/astral-sh/uv)\n- Node.js 20+\n- A [LiveKit Cloud](https://livekit.io) project (free tier works)\n- A Google Cloud project with Vertex AI enabled **or** a Google AI Studio API key\n\n### 1. Clone and install\n\n```bash\ngit clone https://github.com/srthorat/aura-sde-interview-agent.git\ncd aura-sde-interview-agent\ncp .env.example .env\n# Fill in .env — see section below\nuv sync\n```\n\n### 2. 
Configure `.env`\n\nMinimum required values:\n\n```bash\n# LiveKit\nLIVEKIT_URL=wss://your-project.livekit.cloud\nLIVEKIT_API_KEY=your_livekit_api_key\nLIVEKIT_API_SECRET=your_livekit_api_secret\n\n# Google Cloud (Vertex AI path — recommended)\nGOOGLE_CLOUD_PROJECT_ID=your-gcp-project-id\nGOOGLE_CLOUD_LOCATION=us-central1\n# Authenticate with: gcloud auth application-default login\n\n# --- OR --- Google AI Studio (preview path, no GCP project needed)\n# GEMINI_MODEL=preview\n# GOOGLE_API_KEY=your_ai_studio_api_key\n```\n\n### 3. Run the backend\n\n```bash\n# Optional: use a non-default port for local work\n# export PORT=7863\n\nuv run python -m bot.bot\n# Listening on http://localhost:${PORT:-7862}\n```\n\n### 4. Run the frontend (separate terminal)\n\n```bash\ncd frontend\nnvm use   # picks Node 20 from .nvmrc automatically\nnpm install\nnpm run dev\n# http://localhost:3000\n```\n\nOpen `http://localhost:3000`, enter a User ID (1–10), click **Connect**, and speak.\n\n### 5. Build the frontend for the backend-served app\n\nThe FastAPI app serves static files from `frontend/dist`. For the integrated app on the backend port, build the frontend first:\n\n```bash\ncd frontend\nnvm use\nnpm install\nnpm run build\n```\n\nThen run the backend and open `http://localhost:${PORT:-7862}`.\n\n### 6. 
Single-container run (Docker)\n\n```bash\ncp .env.example .env   # fill in values\n\n# Optional local override\n# echo 'PORT=7863' \u003e\u003e .env\n\ndocker compose up --build -d\ndocker compose ps\ncurl http://localhost:${PORT:-7862}/health\n```\n\nOpen `http://localhost:${PORT:-7862}`.\n\n### Local setup used in this repo right now\n\nIf you want the same flow we are using during development:\n\n```bash\ncp .env.example .env\n\n# Edit .env and set your real LiveKit / Google values.\n# If you want the backend on 7863 instead of 7862:\n# PORT=7863\n\ncd frontend\nnvm use\nnpm install\nnpm run build\n\ncd ..\nuv sync\nuv run python -m bot.bot\n```\n\nOpen `http://localhost:7863` if `PORT=7863`, otherwise `http://localhost:7862`.\n\n---\n\n## Session Persistence\n\nAura remembers each candidate's interview history (questions asked, scores, round progress) across reconnects. There are three tiers — use the one that fits your deployment:\n\n| Tier | Env var | Survives | Best for |\n|---|---|---|---|\n| **Vertex AI Agent Engine** | `VERTEX_AI_REASONING_ENGINE_ID` | Container restarts, Cloud Run scale-out, server replacement | Production / Cloud Run |\n| **File-based** | `SESSION_PERSIST_DIR` | Process restarts (with Docker volume mount) | Single-VM / local dev |\n| **In-memory** | _(neither set)_ | Current process only | Quick testing |\n\n---\n\n### Tier 1 — Vertex AI Agent Engine (recommended for Cloud Run)\n\nThis uses Google-managed storage so sessions survive any restart, upgrade, or scale event.\n\n#### Step 1 — Enable the API\n\n```bash\ngcloud services enable aiplatform.googleapis.com \\\n  --project=YOUR_PROJECT_ID\n```\n\n#### Step 2 — Create the Reasoning Engine resource\n\nThe Reasoning Engine is just a named container for sessions — no code is deployed to it.\n\n```bash\ngcloud ai reasoning-engines create \\\n  --display-name=\"aura-sessions\" \\\n  --region=us-central1 \\\n  --project=YOUR_PROJECT_ID\n```\n\nThe output will look like:\n```\nCreated 
reasoning engine: projects/YOUR_PROJECT_ID/locations/us-central1/reasoningEngines/1234567890123456789\n```\n\nNote the **numeric ID** at the end (e.g. `1234567890123456789`).\n\n#### Step 3 — Grant the service account access\n\nThe identity running the bot needs `aiplatform.user` on the project:\n\n```bash\n# For local dev — your personal account\ngcloud projects add-iam-policy-binding YOUR_PROJECT_ID \\\n  --member=\"user:YOUR_EMAIL\" \\\n  --role=\"roles/aiplatform.user\"\n\n# For Cloud Run — the Cloud Run service account\ngcloud projects add-iam-policy-binding YOUR_PROJECT_ID \\\n  --member=\"serviceAccount:YOUR_SA@YOUR_PROJECT_ID.iam.gserviceaccount.com\" \\\n  --role=\"roles/aiplatform.user\"\n```\n\n#### Step 4 — Set the env var\n\nIn `.env` (local) or Cloud Run environment:\n\n```bash\nVERTEX_AI_REASONING_ENGINE_ID=1234567890123456789\n```\n\nComment out or remove `SESSION_PERSIST_DIR` — when `VERTEX_AI_REASONING_ENGINE_ID` is set it takes priority.\n\n#### Step 5 — Restart and verify\n\n```bash\n# Local\nuv run python -m bot.bot\n# Logs should show:\n# [adk] Using VertexAiSessionService (engine: 1234567890123456789)\n\n# Docker\ndocker compose up -d\ndocker compose logs | grep \"VertexAi\\|session\"\n```\n\n---\n\n### Tier 2 — File-based persistence (single VM / local dev)\n\nSessions are written as JSON files to a local directory. Works without any GCP setup.\n\nAdd to `.env`:\n```bash\nSESSION_PERSIST_DIR=/data/aura-sessions\n```\n\nFor Docker, mount a host volume so data survives container replacement:\n\n```yaml\n# docker-compose.yml volumes section:\nvolumes:\n  - /data/aura-sessions:/data/aura-sessions\n```\n\nLogs will show:\n```\n[adk] Using FileSessionService — sessions persist to /data/aura-sessions\n```\n\n---\n\n## Cloud Deployment (Google Cloud Run)\n\n### One-click deploy\n\n```bash\ncd infra\ncp terraform.tfvars.example terraform.tfvars   # fill in project_id, livekit_url, etc.\n./deploy.sh\n```\n\nThe script:\n1. 
Stores LiveKit credentials in Secret Manager\n2. Provisions Cloud Run, Artifact Registry, IAM, and Cloud Build trigger via Terraform\n3. Builds and pushes the Docker image to Artifact Registry\n4. Deploys to Cloud Run and prints the live URL\n\n### Manual Terraform\n\n```bash\ncd infra\ncp terraform.tfvars.example terraform.tfvars   # edit values\n\nterraform init\nterraform plan\nterraform apply\n```\n\n### CI/CD\n\n`cloudbuild.yaml` at the repo root is triggered automatically on every push to `main`. It:\n1. Builds the Docker image (with layer caching from the `:latest` tag)\n2. Pushes `:\u003ccommit-sha\u003e` and `:latest` to Artifact Registry\n3. Rolls out the new image to Cloud Run\n\nThe trigger is now configured and managed through Terraform in [infra/main.tf](infra/main.tf), using the `aura-deploy` service account.\n\n#### Live Deployment Proof\n\n![GCP Deployment Proof](docs/gcp-deployment-proof.gif)\n\nFor the full operational runbook, see [docs/OPERATION_MANAGE.md](docs/OPERATION_MANAGE.md).\n\n### Demo \u0026 Documentation\n\n**Demo video:**\n\n[![Aura Demo Video](https://vumbnail.com/1183350669.jpg)](https://vimeo.com/1183350669)\n\n\u003e [Watch the full demo on Vimeo](https://vimeo.com/1183350669) · [Download mp4](docs/GeminiLiveHackathon.mp4)\n\n**Pitch deck:** [docs/GeminiLiveHackathon.pptx](docs/GeminiLiveHackathon.pptx)\n\n**Architecture \u0026 submission:** [docs/Aura_Design_Doc.md](docs/Aura_Design_Doc.md) · [docs/SUBMISSION_DRAFT.md](docs/SUBMISSION_DRAFT.md)\n\n---\n\n## Environment Variables\n\nSee [`.env.example`](.env.example) for the full list with descriptions. 
Key variables:\n\n| Variable | Required | Description |\n|---|---|---|\n| `LIVEKIT_URL` | ✓ | LiveKit server WebSocket URL |\n| `LIVEKIT_API_KEY` | ✓ | LiveKit API key |\n| `LIVEKIT_API_SECRET` | ✓ | LiveKit API secret |\n| `GOOGLE_CLOUD_PROJECT_ID` | ✓* | GCP project for Vertex AI |\n| `GOOGLE_CLOUD_LOCATION` | — | GCP region (default: `us-central1`) |\n| `VERTEX_AI_REASONING_ENGINE_ID` | — | Numeric ID of Vertex AI Reasoning Engine for cross-deployment session persistence (see [Session Persistence](#session-persistence)) |\n| `SESSION_PERSIST_DIR` | — | Local directory path for file-based session persistence (fallback when Reasoning Engine ID is not set) |\n| `GOOGLE_API_KEY` | ✓* | Google AI Studio key (preview path only) |\n| `GEMINI_LIVE_MODEL` | — | Model name (default: `gemini-live-2.5-flash-native-audio`) |\n| `GEMINI_TEXT_MODEL` | — | Text-only grading/summary model (default: `gemini-2.5-flash`) |\n| `GEMINI_VOICE` | — | Voice name (default: `Aoede`) |\n| `USER_IDLE_TIMEOUT_SECS` | — | Idle silence timeout (default: `120`) |\n| `MAX_CALL_DURATION_SECS` | — | Hard call limit in seconds (default: `840`) |\n\n\\* One of `GOOGLE_CLOUD_PROJECT_ID` (Vertex AI) or `GOOGLE_API_KEY` (preview) is required.\n\n---\n\n## Project Structure\n\n```\n├── bot/\n│   ├── agent.py              # ADK LlmAgent, tool registry, LIVE_TOOL_DECLARATIONS\n│   ├── bot.py                # FastAPI app — /livekit/session, /health\n│   ├── audio/\n│   │   ├── silero_vad.py     # Local UI-only speaking detector\n│   │   ├── silero_vad.onnx   # Bundled Silero model artifact required at runtime\n│   │   └── smart_turn.py     # Deprecated placeholder kept for historical context\n│   ├── pipelines/\n│   │   └── voice.py          # AuraVoiceSession — LiveKit ↔ Gemini Live bridge\n│   ├── processors/\n│   │   └── session_timer.py  # Call duration utility\n│   └── prompts/\n│       ├── grading_rubric.md             # Rubric used by post-call grading\n│       ├── system_prompt.md              # 
Base persona (fallback)\n│       ├── system_prompt_anon.md         # Anonymous user variant\n│       ├── system_prompt_named_fast.md   # Named user — fast-start variant\n│       ├── prompt_greeting_anon.md       # Opening greeting for anon users\n│       ├── prompt_greeting_named.md      # Opening greeting for named users\n│       └── prompt_round_*.md             # Per-round instruction overlays (8 files)\n├── frontend/\n│   ├── public/demo.html      # Voice UI (real-time transcript, metrics, summary, progress)\n│   └── src/App.tsx           # Vite/React wrapper\n├── infra/\n│   ├── main.tf               # Cloud Run, Artifact Registry, IAM, Secret Manager\n│   ├── variables.tf / outputs.tf / versions.tf\n│   ├── terraform.tfvars.example\n│   └── deploy.sh             # One-click bootstrap script\n├── cloudbuild.yaml           # Cloud Build CI/CD pipeline\n├── Dockerfile                # Multi-stage: Node.js build + Python 3.11 slim\n├── docker-compose.yml        # Local single-container run\n├── pyproject.toml            # Python dependencies (uv)\n└── .env.example              # All environment variables documented\n```\n\n---\n\n## Tech Stack — Components, Roles \u0026 Why We Chose Them\n\n### LiveKit — WebRTC Transport Layer\n\n**What it does:** LiveKit is a self-hostable WebRTC SFU (Selective Forwarding Unit) that manages real-time audio/video rooms. 
In Aura it provides the bi-directional audio channel between the browser microphone and the Cloud Run backend, plus a server-side data channel for structured events (transcript, grades, call summary).\n\n**Why LiveKit over raw WebRTC or other transports:**\n\n| Concern | Raw WebRTC | LiveKit |\n|---|---|---|\n| NAT traversal / TURN | Manual STUN/TURN setup | Built-in, fully managed |\n| Server-side audio access | Impossible without media server | `rtc.AudioStream` gives PCM frames directly in Python |\n| Participant signalling | Custom protocol required | SDK handles join/leave/reconnect |\n| Data channel | Unreliable ordering without SCTP work | `room.local_participant.publish_data()` — reliable ordered delivery |\n| Scalability | Point-to-point only | SFU routes audio; scales to many participants |\n| Python SDK | None official | `livekit` + `livekit-api` packages — first-class async support |\n\n**Key integration points in Aura:**\n- `rtc.AudioStream(track)` — pulls incoming browser mic PCM16 @ 16 kHz frame-by-frame\n- `rtc.AudioSource(GEMINI_OUTPUT_SAMPLE_RATE, GEMINI_OUTPUT_CHANNELS)` — pushes Gemini audio output back to browser\n- `room.local_participant.publish_data(payload, reliable=True)` — sends transcript lines, grade chips, and call summary JSON to the UI without a separate WebSocket\n- `rtc.RoomEvent` — drives session lifecycle (participant join/leave triggers session start/teardown)\n- `livekit-api` + `api.AccessToken` — backend mints short-lived JWT tokens per session; no credentials exposed to browser\n\n---\n\n### Google ADK (`google-adk`) — Agent Orchestration\n\n**What it does:** ADK (`LlmAgent`, `Runner`, `VertexAiSessionService`) wraps the Gemini Live bidi stream, handles tool dispatch, and manages session lifecycle. 
It eliminates writing a custom WebSocket loop for Gemini's `BidiGenerateContent` RPC.\n\n**Why ADK:**\n- Tool calls from Gemini Live are dispatched to Python functions automatically — no `if tool_name == \"...\"` switching\n- `VertexAiSessionService` persists the full conversation event history to Vertex AI Agent Engine natively\n- `RunConfig` gives a clean surface for configuring VAD, speech, modalities, and session resumption in one place\n- `LiveRequestQueue` decouples audio ingestion from the ADK event loop — audio can be enqueued from the LiveKit receive loop without blocking\n\n---\n\n### Gemini Live (`gemini-live-2.5-flash-native-audio`) — Voice Brain\n\n**What it does:** Native audio model that speaks, listens, detects turn ends, handles barge-in, and evaluates interview answers — all in one bidi stream.\n\n**Why native audio (not text-to-speech + text):**\n- Single round-trip: audio in → audio out in the same inference pass, no TTS post-step\n- Server-side VAD: turn detection runs inside the model pipeline, eliminating client-side buffering latency\n- `enable_affective_dialog=True`: Gemini modulates tone and pacing based on conversational context\n- STS latency 126–300 ms on conversational turns vs. 
900–1400 ms with client Silero + SmartTurn\n\n---\n\n### Silero VAD (`silero_vad.onnx` + `onnxruntime`) — Local UI Hints\n\n**What it does:** A lightweight ONNX speech detection model that runs locally in-process on each 32 ms audio frame (512 samples @ 16 kHz).\n\n**Why a separate local VAD in addition to Gemini's server-side VAD:**\n- Server VAD fires after Gemini processes the audio — there is a ~80–150 ms delay before the UI knows the user is speaking\n- Silero fires immediately on the local audio frame, enabling instant \"user speaking\" UI indicators and barge-in queue clear without waiting for a Gemini event\n- ONNX runtime is single-threaded, CPU-only, \u003c 5 MB — no GPU, no network, negligible latency\n\n---\n\n### FastAPI + Uvicorn — Backend HTTP/WebSocket Server\n\n**What it does:** Serves three surfaces:\n1. `POST /livekit/session` — mints LiveKit tokens, spawns the voice session task\n2. `GET /health` — Cloud Run health check\n3. Static file serving — `frontend/dist/` (built React app)\n\n**Why FastAPI:**\n- `async def` handlers integrate with the `asyncio` event loop used by LiveKit and ADK — no thread-pool overhead\n- Pydantic request/response models give automatic input validation and OpenAPI docs\n- Lifespan context manager handles pre-warm tasks cleanly at startup\n\n---\n\n### Vertex AI Agent Engine — Session Persistence\n\n**What it does:** Google-managed cloud storage for ADK session events. 
Each named candidate's history (questions asked, rubric grades, answer notes, round number) is persisted as ADK `Event` objects and survives container restarts, Cloud Run scale-out, and re-deployments.\n\n**Why Vertex AI Agent Engine over a database:**\n- Native ADK integration — `VertexAiSessionService` is a drop-in for `InMemorySessionService`\n- No schema migrations, no connection pool, no DB maintenance\n- Events are JSON-serialisable and readable without a custom SDK\n- Automatic IAM-gated access — no credential rotation needed\n\n---\n\n### Terraform + Cloud Build — Infrastructure as Code + CI/CD\n\n**What it does:** All GCP resources (Cloud Run service, Artifact Registry repo, Secret Manager secrets, IAM bindings, Cloud Build trigger, Vertex AI Reasoning Engine) are declared in `infra/main.tf`. Any push to `main` triggers `cloudbuild.yaml`: build → tag with commit SHA → push to Artifact Registry → roll out to Cloud Run.\n\n**Why Terraform over console clicks:**\n- Reproducible: `terraform apply` from a fresh project recreates the full stack in ~3 minutes\n- Auditable: every infra change is a Git commit\n- Cloud Build trigger is itself managed by Terraform — no manual console setup\n\n---\n\n## Adding Tools\n\nTools are defined in `bot/agent.py`. To add a new tool:\n\n1. Write a Python function and add it to `TOOL_REGISTRY`\n2. Add a matching entry to `LIVE_TOOL_DECLARATIONS` (used by the Gemini Live session)\n3. Add the function to the `tools=[]` list in `build_adk_agent()`\n\nThe same function is invoked whether the call comes from the ADK text runner or the live audio session.\n\n---\n\n## HTTPS Deployment Options\n\nBrowser microphone access requires HTTPS on any public URL. Two GCP paths:\n\n### Option A — Cloud Run (recommended, HTTPS built-in)\n\nCloud Run is already provisioned by Terraform (`infra/`). 
Google's load balancer terminates TLS automatically — **no Caddy or reverse proxy needed**.\n\n```bash\n# One-command deploy (provisions + builds + deploys)\ncd infra \u0026\u0026 ./deploy.sh\n```\n\nYour service is immediately available at `https://aura-voice-agent-\u003chash\u003e-uc.a.run.app`.\n\n---\n\n### Option B — Google Compute Engine VM + Caddy\n\nUse this if you need a persistent VM (e.g. for testing, or if Cloud Run's request-based scaling doesn't fit your workload).\n\n#### 1. Create a GCE VM\n\n```bash\ngcloud compute instances create aura-vm \\\n  --project=YOUR_PROJECT_ID \\\n  --zone=us-central1-a \\\n  --machine-type=e2-standard-2 \\\n  --image-family=debian-12 \\\n  --image-project=debian-cloud \\\n  --tags=http-server,https-server \\\n  --metadata=startup-script='#! /bin/bash\n    apt-get update\n    apt-get install -y docker.io\n    systemctl enable --now docker'\n```\n\n#### 2. Open firewall ports\n\n```bash\ngcloud compute firewall-rules create allow-http-https \\\n  --allow=tcp:80,tcp:443 \\\n  --target-tags=http-server,https-server \\\n  --project=YOUR_PROJECT_ID\n```\n\n#### 3. SSH in and start the container\n\n```bash\ngcloud compute ssh aura-vm --zone=us-central1-a\n\n# On the VM:\ngit clone https://github.com/your-org/aura-sde-interview-agent.git\ncd aura-sde-interview-agent\ncp .env.example .env   # fill in values\n\n# Optional: if you want Aura on 7863 behind Caddy instead of the default 7862\n# echo 'PORT=7863' \u003e\u003e .env\n\ndocker compose up --build -d\ncurl http://127.0.0.1:${PORT:-7862}/health   # verify\n```\n\n#### 4. Install Caddy and configure HTTPS\n\n```bash\n# On the VM:\napt-get install -y caddy\ncp deploy/caddy/Caddyfile.example /etc/caddy/Caddyfile\n# Edit /etc/caddy/Caddyfile:\n# - replace aura.example.com with your real domain\n# - if PORT is not 7862, also change 127.0.0.1:7862 to your chosen port\nsystemctl enable --now caddy\n```\n\n5. 
**Point DNS** — create an `A` record for your domain pointing to the VM's external IP:\n   ```bash\n   gcloud compute instances describe aura-vm --zone=us-central1-a \\\n     --format='value(networkInterfaces[0].accessConfigs[0].natIP)'\n   ```\n\nKey points:\n- VM deployment is Docker plus Caddy: Docker runs the Aura container, Caddy terminates TLS on `443` and proxies to the local Aura port\n- Aura listens on port `7862` by default; if you set `PORT=7863`, update both `docker compose` and the Caddy upstream to `127.0.0.1:7863`\n- The Caddyfile includes a `keepalive` transport directive to keep LiveKit WebSocket signalling alive\n- Caddy auto-renews Let's Encrypt certificates — no manual cert management needed\n\n---\n\n## Health Check\n\n```bash\ncurl https://your-cloud-run-url/health\n# {\"status\":\"ok\",\"bot\":\"Aura\",\"transport\":\"livekit\",\"model\":\"gemini-live-2.5-flash-native-audio\",\"active_rooms\":0}\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsrthorat%2Faura-sde-interview-agent","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fsrthorat%2Faura-sde-interview-agent","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsrthorat%2Faura-sde-interview-agent/lists"}