https://github.com/mahimairaja/livekit-starter

Full-stack LiveKit voice agent starter: a Python voice worker, FastAPI token server, and React frontend. Run, deploy, and swap each piece on its own.

ai-agents cartesia conversational-ai deepgram fastapi livekit llm openai python react realtime sip speech-to-text starter-template text-to-speech typescript voice-agent voice-ai voice-assistant webrtc

# LiveKit Voice AI Starter


Talk to an AI agent in your browser, in minutes.

A full-stack, production-minded starter for real-time voice agents: a LiveKit
voice worker, a FastAPI token server, and a React frontend built on LiveKit's
Agents UI, wired together and ready to extend.




---

Most voice-AI demos are a single script. This is the whole loop, structured the
way you'd actually ship it, and split into three pieces you can run, deploy, and
swap independently.

## What's inside

| Package | What it is |
| --------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| **`agent/`** | A LiveKit voice worker: Deepgram `nova-3` STT, OpenAI `gpt-4.1-mini`, Cartesia TTS, Silero VAD, and the LiveKit multilingual turn detector. Web and SIP, explicit dispatch. |
| **`backend/`** | A FastAPI service that mints LiveKit room tokens (`POST /api/v1/token`), with a clean API → service → repository layout you copy to add resources. |
| **`frontend/`** | React + Vite + Tailwind using LiveKit's Agents UI components: audio visualizer, live transcript, and text chat. |

Typed end to end, tested, linted, with CI and a pre-commit hook across all three.

## How it fits together

```
React app ──POST /api/v1/token──▶ Backend ──signs token──▶ React joins room
    │                                                              │
    └───────────────── connects to LiveKit room ◀──────────────────┘

             agent_name dispatches ──▶ Agent worker joins

             mic ▶ STT ▶ LLM ▶ TTS ▶ speaker (over WebRTC)
```

The backend and agent share the same LiveKit credentials, so a backend-minted
token is valid for the room the agent joins. Works against self-hosted LiveKit
or LiveKit Cloud.
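
To make the shared-credentials point concrete, here is a minimal, stdlib-only sketch of how a LiveKit-style room token is signed and checked. LiveKit access tokens are HS256 JWTs whose `video` claim carries the room grant. The function names and defaults below are illustrative assumptions, not this repo's code; the backend itself relies on `livekit-api` for this.

```python
import base64
import hashlib
import hmac
import json
import time


def _b64url(data: bytes) -> str:
    """Base64url-encode without padding, as JWTs require."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    """Decode base64url text, restoring the stripped padding."""
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def _sign(signing_input: str, api_secret: str) -> str:
    digest = hmac.new(api_secret.encode(), signing_input.encode(), hashlib.sha256).digest()
    return _b64url(digest)


def mint_room_token(api_key, api_secret, identity, room, ttl_seconds=600, now=None):
    """Hypothetical helper: build an HS256 JWT with a LiveKit room-join grant."""
    issued = int(time.time()) if now is None else now
    header = {"alg": "HS256", "typ": "JWT"}
    claims = {
        "iss": api_key,          # LiveKit API key
        "sub": identity,         # participant identity
        "nbf": issued,
        "exp": issued + ttl_seconds,
        "video": {"room": room, "roomJoin": True},
    }
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode())
        for part in (header, claims)
    )
    return f"{signing_input}.{_sign(signing_input, api_secret)}"


def verify_room_token(token, api_secret, now=None):
    """Hypothetical helper: check signature and expiry, then return the claims."""
    signing_input, _, signature = token.rpartition(".")
    expected = _sign(signing_input, api_secret)
    if not hmac.compare_digest(expected.encode(), signature.encode()):
        raise ValueError("signature mismatch: token was signed with a different secret")
    claims = json.loads(_b64url_decode(signing_input.split(".")[1]))
    current = int(time.time()) if now is None else now
    if current >= claims["exp"]:
        raise ValueError("token expired")
    return claims
```

A token minted with one secret only verifies against that same secret, which is why the backend and the agent must point at the same LiveKit project.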

## Run with Docker

The fastest path. One command brings up Postgres, the backend, the agent, and the
frontend together:

```bash
cp .env.example .env # fill in LIVEKIT_* + OPENAI/DEEPGRAM/CARTESIA
docker compose up --build
```

Open `http://localhost:5173` and click **Start conversation**. The stack connects
to an external LiveKit project (a free LiveKit Cloud project works). The voice
demo needs no database; to use the auth/User endpoints, run the migrations once:
`docker compose exec backend alembic upgrade head`.
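
Before starting the stack, it can help to confirm that the credentials are actually set. The sketch below is one way to do that. The exact variable names are an assumption based on common LiveKit and provider conventions; the README only names the `LIVEKIT_*` prefix and the three providers, so check `.env.example` for the real keys.

```python
import os

# Assumed names; adjust to match the keys in .env.example.
REQUIRED_VARS = (
    "LIVEKIT_URL",
    "LIVEKIT_API_KEY",
    "LIVEKIT_API_SECRET",
    "OPENAI_API_KEY",
    "DEEPGRAM_API_KEY",
    "CARTESIA_API_KEY",
)


def missing_vars(env=None, required=REQUIRED_VARS):
    """Return the required variable names that are unset or blank."""
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name, "").strip()]


if __name__ == "__main__":
    missing = missing_vars()
    if missing:
        print("Missing environment variables: " + ", ".join(missing))
    else:
        print("All required environment variables are set.")
```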

## Run manually

You'll need a LiveKit project (URL + API key/secret) and provider keys
(OpenAI, Deepgram, Cartesia). Run each in its own terminal:

```bash
# 1. Backend: token server (http://localhost:8000)
cd backend && cp .env.example .env # add LIVEKIT_* + JWT_SECRET_KEY
uv sync && uv run uvicorn src.main:app --reload

# 2. Agent: voice worker
cd agent && cp .env.example .env # add LIVEKIT_*, OPENAI/DEEPGRAM/CARTESIA keys
uv sync && uv run python main.py dev

# 3. Frontend: web client (http://localhost:5173)
cd frontend && cp .env.example .env # point VITE_TOKEN_ENDPOINT at the backend
pnpm install && pnpm dev
```

Open `http://localhost:5173`, click **Start conversation**, allow the mic, and talk.

> No frontend yet? Talk to the agent from your terminal with
> `cd agent && uv run python main.py console`.
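
With the backend running, the token endpoint can also be exercised directly from Python. This is a rough sketch: the path `POST /api/v1/token` comes from this README, but the request fields (`room_name`, `participant_identity`) are assumed from LiveKit's standard token endpoint convention, and the endpoint may require authentication. See `backend/README.md` for the actual schema.

```python
import json
import urllib.request


def build_token_request(base_url, room_name, identity):
    """Build (but do not send) a POST request for the token endpoint.

    The JSON field names are assumptions, not confirmed by this README.
    """
    body = json.dumps(
        {"room_name": room_name, "participant_identity": identity}
    ).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/api/v1/token",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def fetch_token(base_url, room_name, identity, timeout=5.0):
    """Send the request and return the decoded JSON response."""
    request = build_token_request(base_url, room_name, identity)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

For example, `fetch_token("http://localhost:8000", "demo-room", "alice")` should return the JSON payload the frontend would otherwise receive.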

## Stack

| Layer | Default |
| --------------- | ----------------------------------------------------------------------------- |
| STT / LLM / TTS | Deepgram `nova-3` · OpenAI `gpt-4.1-mini` · Cartesia (all swappable) |
| Realtime | LiveKit Agents (`livekit-agents`), WebRTC, Silero VAD, turn detector |
| Backend | FastAPI, async SQLModel/Postgres, dependency-injector, PyJWT, `livekit-api` |
| Frontend | React 19, Vite, TypeScript, Tailwind v4, shadcn + LiveKit Agents UI |
| Tooling | uv, ruff, mypy, pytest · ESLint, Vitest · pre-commit, GitHub Actions, Codecov |

## Highlights

- **One command per service** to run locally; one `.env.example` each.
- **Web and telephony** (SIP) on the same agent, via a single participant branch.
- **Swappable providers** and self-hosted ↔ LiveKit Cloud with a one-line change.
- **Standard token endpoint** so LiveKit client SDKs connect with zero glue.
- **Copy-to-extend** patterns: a `User` slice in the backend, a bare `Assistant` in the agent.
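
The swappable-provider idea can be illustrated with a toy model of the STT ▶ LLM ▶ TTS loop. This is a conceptual sketch only: every class below is a hypothetical stand-in, not the `livekit-agents` API. In the real worker, the three slots are filled by LiveKit plugin instances.

```python
from dataclasses import dataclass
from typing import Protocol


class SpeechToText(Protocol):
    def transcribe(self, audio: bytes) -> str: ...


class LanguageModel(Protocol):
    def reply(self, text: str) -> str: ...


class TextToSpeech(Protocol):
    def synthesize(self, text: str) -> bytes: ...


@dataclass
class VoicePipeline:
    """One conversational turn: audio in, audio out."""

    stt: SpeechToText
    llm: LanguageModel
    tts: TextToSpeech

    def handle_turn(self, audio: bytes) -> bytes:
        transcript = self.stt.transcribe(audio)
        answer = self.llm.reply(transcript)
        return self.tts.synthesize(answer)


# Fake providers so the sketch runs without network access or API keys.
class FakeSTT:
    def transcribe(self, audio: bytes) -> str:
        return audio.decode("utf-8")


class EchoLLM:
    def reply(self, text: str) -> str:
        return f"You said: {text}"


class ShoutLLM:
    def reply(self, text: str) -> str:
        return text.upper()


class FakeTTS:
    def synthesize(self, text: str) -> bytes:
        return text.encode("utf-8")


pipeline = VoicePipeline(stt=FakeSTT(), llm=EchoLLM(), tts=FakeTTS())
swapped = VoicePipeline(stt=FakeSTT(), llm=ShoutLLM(), tts=FakeTTS())
```

Swapping a provider touches only the constructor argument; the turn-handling logic stays the same.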

## Docs

Each package has its own README with details:
[`agent/`](agent/README.md) · [`backend/`](backend/README.md) · [`frontend/`](frontend/README.md)

## License

MIT. See [`LICENSE`](LICENSE).