https://github.com/mahimairaja/livekit-starter
Full-stack LiveKit voice agent starter: a Python voice worker, FastAPI token server, and React frontend. Run, deploy, and swap each piece on its own.
- Host: GitHub
- URL: https://github.com/mahimairaja/livekit-starter
- Owner: mahimairaja
- License: mit
- Created: 2026-06-15T10:16:34.000Z (17 days ago)
- Default Branch: main
- Last Pushed: 2026-06-16T21:16:06.000Z (16 days ago)
- Last Synced: 2026-06-16T22:24:10.639Z (16 days ago)
- Topics: ai-agents, cartesia, conversational-ai, deepgram, fastapi, livekit, llm, openai, python, react, realtime, sip, speech-to-text, starter-template, text-to-speech, typescript, voice-agent, voice-ai, voice-assistant, webrtc
- Language: Python
- Homepage: https://playground.mahimai.ca
- Size: 847 KB
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- Funding: .github/FUNDING.yml
- License: LICENSE
README

# LiveKit Voice AI Starter
Talk to an AI agent in your browser, in minutes.
A full-stack, production-minded starter for real-time voice agents: a LiveKit
voice worker, a FastAPI token server, and a React frontend built on LiveKit's
Agents UI, wired together and ready to extend.
---
Most voice-AI demos are a single script. This is the whole loop, structured the
way you'd actually ship it, and split into three pieces you can run, deploy, and
swap independently.
## What's inside
| Package | What it is |
| --------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| **`agent/`** | A LiveKit voice worker: Deepgram `nova-3` STT, OpenAI `gpt-4.1-mini`, Cartesia TTS, Silero VAD, and the LiveKit multilingual turn detector. Web and SIP, explicit dispatch. |
| **`backend/`**  | A FastAPI service that mints LiveKit room tokens (`POST /api/v1/token`), with a clean API → service → repository layout you can copy to add resources. |
| **`frontend/`** | React + Vite + Tailwind using LiveKit's Agents UI components: audio visualizer, live transcript, and text chat. |
Typed end to end, tested, linted, with CI and a pre-commit hook across all three.
## How it fits together
```
React app ──POST /api/v1/token──▶ Backend ──signs token──▶ React joins room
    │                                                              │
    └───────────────── connects to LiveKit room ◀──────────────────┘
                                    │
                          agent_name dispatches ──▶ Agent worker joins
                                    │
                          mic ▶ STT ▶ LLM ▶ TTS ▶ speaker (over WebRTC)
```
The backend and agent share the same LiveKit credentials, so a backend-minted
token is valid for the room the agent joins. Works against self-hosted LiveKit
or LiveKit Cloud.
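To see why shared credentials are enough, here is a minimal sketch of a LiveKit-style access token built with only the Python standard library. LiveKit tokens are JWTs signed with the API secret (HS256), and the claims used here (`iss`, `sub`, `nbf`, `exp`, `video`) follow that format. The function names `mint_token` and `verify_token`, the key values, and the room name are illustrative assumptions; the backend lists `livekit-api` in its stack, so read this as an explanation of the mechanism rather than the repo's code.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional


def _b64url(data: bytes) -> str:
    """Base64url-encode without padding, as JWTs require."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    """Restore the stripped padding, then decode."""
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def mint_token(api_key: str, api_secret: str, identity: str, room: str,
               ttl_s: int = 600) -> str:
    """Hypothetical helper: sign a room-join token with the API secret."""
    now = int(time.time())
    header = {"alg": "HS256", "typ": "JWT"}
    claims = {
        "iss": api_key,          # which API key issued the token
        "sub": identity,         # participant identity
        "nbf": now,
        "exp": now + ttl_s,
        "video": {"room": room, "roomJoin": True},
    }
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, claims)
    )
    signature = hmac.new(api_secret.encode("utf-8"),
                         signing_input.encode("ascii"),
                         hashlib.sha256).digest()
    return f"{signing_input}.{_b64url(signature)}"


def verify_token(token: str, api_secret: str) -> Optional[dict]:
    """Return the claims if the signature and expiry check out, else None."""
    try:
        header_b64, claims_b64, sig_b64 = token.split(".")
    except ValueError:
        return None
    expected = hmac.new(api_secret.encode("utf-8"),
                        f"{header_b64}.{claims_b64}".encode("ascii"),
                        hashlib.sha256).digest()
    if not hmac.compare_digest(expected, _b64url_decode(sig_b64)):
        return None
    claims = json.loads(_b64url_decode(claims_b64))
    if claims["exp"] <= time.time():
        return None
    return claims
```

A token minted with one secret verifies only against that same secret, which is why the backend and the agent must point at the same LiveKit project.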
## Run with Docker
The fastest path. One command brings up Postgres, the backend, the agent, and the
frontend together:
```bash
cp .env.example .env # fill in LIVEKIT_* + OPENAI/DEEPGRAM/CARTESIA
docker compose up --build
```
Open `http://localhost:5173` and click **Start conversation**. The stack uses an
external LiveKit project (a free LiveKit Cloud project works). The voice demo
itself needs no database; to use the auth/User endpoints, run the migrations once:
`docker compose exec backend alembic upgrade head`.
## Run manually
You'll need a LiveKit project (URL + API key/secret) and provider keys
(OpenAI, Deepgram, Cartesia). Run each in its own terminal:
```bash
# 1. Backend: token server (http://localhost:8000)
cd backend && cp .env.example .env # add LIVEKIT_* + JWT_SECRET_KEY
uv sync && uv run uvicorn src.main:app --reload
# 2. Agent: voice worker
cd agent && cp .env.example .env # add LIVEKIT_*, OPENAI/DEEPGRAM/CARTESIA keys
uv sync && uv run python main.py dev
# 3. Frontend: web client (http://localhost:5173)
cd frontend && cp .env.example .env # point VITE_TOKEN_ENDPOINT at the backend
pnpm install && pnpm dev
```
Open `http://localhost:5173`, click **Start conversation**, allow the mic, and talk.
> No frontend yet? Talk to the agent from your terminal with
> `cd agent && uv run python main.py console`.
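If a service fails on startup, a missing key is a common cause. The sketch below is a small preflight check you could adapt. The variable names in `AGENT_VARS` are assumptions based on the conventional LiveKit and provider names; the source of truth is each package's `.env.example`. The helper `missing_vars` is hypothetical and not part of this repo.

```python
import os
from typing import Iterable, List, Mapping

# Assumed names: conventional LiveKit and provider variables.
# Check agent/.env.example for the names this repo actually reads.
AGENT_VARS = (
    "LIVEKIT_URL",
    "LIVEKIT_API_KEY",
    "LIVEKIT_API_SECRET",
    "OPENAI_API_KEY",
    "DEEPGRAM_API_KEY",
    "CARTESIA_API_KEY",
)


def missing_vars(required: Iterable[str],
                 env: Mapping[str, str] = os.environ) -> List[str]:
    """Return the required names that are unset or blank in `env`."""
    return [name for name in required if not env.get(name, "").strip()]
```

Call `missing_vars(AGENT_VARS)` after exporting or loading your `.env`; an empty list means every name in the tuple has a non-blank value. It does not check that the keys are valid.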
## Stack
| Layer | Default |
| --------------- | ----------------------------------------------------------------------------- |
| STT / LLM / TTS | Deepgram `nova-3` · OpenAI `gpt-4.1-mini` · Cartesia (all swappable) |
| Realtime | LiveKit Agents (`livekit-agents`), WebRTC, Silero VAD, turn detector |
| Backend | FastAPI, async SQLModel/Postgres, dependency-injector, PyJWT, `livekit-api` |
| Frontend | React 19, Vite, TypeScript, Tailwind v4, shadcn + LiveKit Agents UI |
| Tooling | uv, ruff, mypy, pytest · ESLint, Vitest · pre-commit, GitHub Actions, Codecov |
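"Swappable" means each stage of the STT ▶ LLM ▶ TTS loop sits behind a narrow interface. The sketch below shows that idea in plain Python with toy providers. Everything in it (`VoicePipeline`, the three protocols, the stand-in classes) is hypothetical and deliberately simplified: the real agent is built on LiveKit Agents and streams audio rather than handling one blocking turn at a time.

```python
from dataclasses import dataclass
from typing import Protocol


class STT(Protocol):
    def transcribe(self, audio: bytes) -> str: ...


class LLM(Protocol):
    def reply(self, text: str) -> str: ...


class TTS(Protocol):
    def synthesize(self, text: str) -> bytes: ...


@dataclass
class VoicePipeline:
    """Hypothetical one-turn pipeline: audio in, audio out."""
    stt: STT
    llm: LLM
    tts: TTS

    def turn(self, audio: bytes) -> bytes:
        text = self.stt.transcribe(audio)
        answer = self.llm.reply(text)
        return self.tts.synthesize(answer)


# Toy stand-ins so the sketch runs without network access or API keys.
class EchoSTT:
    def transcribe(self, audio: bytes) -> str:
        return audio.decode("utf-8")


class UpperLLM:
    def reply(self, text: str) -> str:
        return text.upper()


class ReverseLLM:
    def reply(self, text: str) -> str:
        return text[::-1]


class BytesTTS:
    def synthesize(self, text: str) -> bytes:
        return text.encode("utf-8")
```

Swapping a provider is then a change to one constructor argument, for example `VoicePipeline(EchoSTT(), ReverseLLM(), BytesTTS())` in place of `UpperLLM()`, while the rest of the loop stays untouched.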
## Highlights
- **One command per service** to run locally; one `.env.example` each.
- **Web and telephony** (SIP) on the same agent, via a single participant branch.
- **Swappable providers** and self-hosted ↔ LiveKit Cloud with a one-line change.
- **Standard token endpoint** so LiveKit client SDKs connect with zero glue.
- **Copy-to-extend** patterns: a `User` slice in the backend, a bare `Assistant` in the agent.
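For clients that are not a LiveKit SDK, the token endpoint is an ordinary JSON-over-HTTP call. The sketch below builds the request and parses the reply with the standard library. The request fields (`room_name`, `participant_identity`) and response fields (`server_url`, `participant_token`) are assumptions based on LiveKit's standard token endpoint format, and the helper names are hypothetical; confirm the exact schema in `backend/README.md` before relying on it.

```python
import json
import urllib.request
from typing import Tuple


def build_token_request(endpoint: str, room_name: str,
                        identity: str) -> urllib.request.Request:
    """Build a POST request for a room token (assumed field names)."""
    body = {"room_name": room_name, "participant_identity": identity}
    return urllib.request.Request(
        endpoint,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def parse_token_response(raw: bytes) -> Tuple[str, str]:
    """Extract (server_url, participant_token) from the JSON reply."""
    payload = json.loads(raw)
    return payload["server_url"], payload["participant_token"]


def fetch_token(endpoint: str, room_name: str, identity: str) -> Tuple[str, str]:
    """Call the endpoint; needs the backend running to succeed."""
    request = build_token_request(endpoint, room_name, identity)
    with urllib.request.urlopen(request, timeout=10) as response:
        return parse_token_response(response.read())
```

With the backend running locally, `fetch_token("http://localhost:8000/api/v1/token", "demo-room", "alice")` would return the LiveKit URL and a token to connect with, provided the assumed schema matches.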
## Docs
Each package has its own README with details:
[`agent/`](agent/README.md) · [`backend/`](backend/README.md) · [`frontend/`](frontend/README.md)
## License
MIT. See [`LICENSE`](LICENSE).