{"id":50837611,"url":"https://github.com/assemblyai/cli","last_synced_at":"2026-06-14T05:01:32.512Z","repository":{"id":364150713,"uuid":"1258346835","full_name":"AssemblyAI/cli","owner":"AssemblyAI","description":"AssemblyAI CLI","archived":false,"fork":false,"pushed_at":"2026-06-11T22:07:58.000Z","size":1425,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-06-11T22:10:30.938Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/AssemblyAI.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":"AGENTS.md","dco":null,"cla":null}},"created_at":"2026-06-03T13:48:34.000Z","updated_at":"2026-06-11T22:07:03.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/AssemblyAI/cli","commit_stats":null,"previous_names":["assemblyai/cli"],"tags_count":1,"template":false,"template_full_name":null,"purl":"pkg:github/AssemblyAI/cli","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/AssemblyAI%2Fcli","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/AssemblyAI%2Fcli/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/AssemblyAI%2Fcli/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/AssemblyAI%2Fcli/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/AssemblyAI","download_url":"https://codeload.github.com/AssemblyAI/cli/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/AssemblyAI%2Fcli/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":34309655,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T15:22:16.424Z","status":"online","status_checked_at":"2026-06-14T02:00:07.365Z","response_time":62,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2026-06-14T05:01:00.202Z","updated_at":"2026-06-14T05:01:32.507Z","avatar_url":"https://github.com/AssemblyAI.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# AssemblyAI CLI\n\n[![Python](https://img.shields.io/badge/python-3.12+-D6402E)](https://github.com/AssemblyAI/cli)\n[![License](https://img.shields.io/badge/license-MIT-D6402E)](https://github.com/AssemblyAI/cli/blob/main/LICENSE)\n[![Docs](https://img.shields.io/badge/docs-assemblyai-D6402E)](https://www.assemblyai.com/docs)\n\nThe AssemblyAI CLI (`assembly`) brings speech AI directly into your terminal: transcribe files, URLs, and YouTube/podcast pages, stream live audio, talk to a two-way voice agent, prompt the LLM Gateway, benchmark speech models, and scaffold ready-to-deploy starter apps.\n\n\u003cp align=\"center\"\u003e\n  \u003cimg src=\"assets/welcome.png\" alt=\"The assembly CLI welcome screen, listing command groups for transcription, streaming, voice agents, app scaffolding, and account management\" width=\"820\"\u003e\n\u003c/p\u003e\n\nLearn more about the platform in the [AssemblyAI docs](https://www.assemblyai.com/docs).\n\n## ⚡ Quickstart\n\nInstall on macOS or Linux with Homebrew:\n\n```sh\nbrew tap assemblyai/cli https://github.com/AssemblyAI/cli\nbrew trust assemblyai/cli   # only needed when HOMEBREW_REQUIRE_TAP_TRUST is set; harmless otherwise\nbrew install assembly\n```\n\nSign in (stores your API key in the OS keyring) and run your first transcription:\n\n```sh\nassembly login\nassembly transcribe --sample\n```\n\nThat's it. Run `assembly onboard` for a guided tour, or see [Installation](#-installation) for pipx/uv and other options.\n\n## 🚀 Why the AssemblyAI CLI?\n\n- **🎯 One command for everything**: transcription, real-time streaming, voice agents, LLM prompts, and WER benchmarking — no SDK boilerplate.\n- **🔌 Built for pipelines**: data goes to stdout, errors to stderr, `--json` gives stable machine-readable output, and `-` reads audio from stdin.\n- **🔐 Secure by default**: your API key lives in the OS keyring, never in a dotfile — and run commands have no `--api-key` flag, so keys can't leak into `ps` or shell history.\n- **🛠️ From demo to deployed app**: `assembly init` scaffolds a runnable FastAPI starter, `assembly dev` / `share` / `deploy` run, tunnel, and ship it, and `--show-code` prints the equivalent Python SDK script for any run command.\n- **🤖 Agent-ready**: `assembly setup install` wires your coding agent up with the AssemblyAI docs MCP server and skills.\n- **📖 Open source**: MIT licensed.\n\n## 📋 Features at a glance\n\n| Command | What it does |\n| :--- | :--- |\n| `assembly transcribe` | Transcribe files, URLs, YouTube/podcast pages, directories, globs, or bucket storage (`s3://`, `gs://`, `az://`) — with speaker labels, PII redaction, summarization, SRT/VTT captions, and resumable batch runs |\n| `assembly stream` | Real-time transcription from your microphone, a file, or a URL — on macOS it can capture system audio too |\n| `assembly dictate` | Push-to-talk dictation: press Enter to record, Enter again for instant text (Sync STT API, up to 120 s per utterance) |\n| `assembly agent` | Full-duplex spoken conversation with a voice agent, right in your terminal |\n| `assembly speak` | Synthesize text to speech over the streaming-TTS WebSocket (sandbox-only) |\n| `assembly llm` | Prompt the LLM Gateway over a transcript, stdin, or a live stream |\n| `assembly clip` | Cut audio/video with ffmpeg by diarized speaker, text match, LLM pick, or time range (`--video` keeps the picture for URL sources) — clip boundaries snap into nearby silence |\n| `assembly dub` | Re-voice an audio/video file or URL in another language: transcription, LLM translation, per-speaker TTS, ffmpeg track-swap (sandbox-only) |\n| `assembly caption` | Burn always-visible captions into a video: transcribe (or reuse a transcript), fetch SRT, ffmpeg burns it in — audio untouched |\n| `assembly eval` | Benchmark WER against Hugging Face datasets (built-in aliases: `librispeech`, `tedlium`, …) or local manifests |\n| `assembly webhooks listen` | Open a public dev URL that prints webhook deliveries and can forward them to your local app |\n| `assembly init` / `dev` / `share` / `deploy` | Scaffold a FastAPI + HTML starter app, run it locally, expose it on a public URL, ship it to Vercel / Railway / Fly.io |\n| `assembly setup` | Wire a coding agent up with the AssemblyAI docs MCP server and skills |\n| `assembly doctor` | Check your environment: API key, network, ffmpeg, microphone |\n| `assembly transcripts` / `sessions` | Browse and fetch past transcripts and streaming sessions |\n| `assembly keys` / `balance` / `usage` / `limits` / `audit` | Account self-service via browser login |\n\nAdd `--show-code` to `transcribe` / `stream` / `agent` to print the equivalent Python SDK script instead of running — the built-in path from CLI experiment to SDK code.\n\n## ✨ Things you can do with it\n\nA few one-liners that show what `assembly` can do. The everyday basics live under [Getting started](#-getting-started) below.\n\n\u003e [!NOTE]\n\u003e `speak` and `dub` are sandbox-only today — that's why the examples below pass `--sandbox`.\n\n**Recreate a scene with synthetic voices** — transcribe and diarize a YouTube clip, then pipe it straight into TTS with a different voice per speaker:\n\n```sh\nassembly transcribe \"https://www.youtube.com/watch?v=awmCtXzFsJo\" --speaker-labels \\\n  | assembly --sandbox speak --voice A=jane --voice B=mary --out scene.wav\n```\n\n`speak` auto-detects `Speaker A:` labels, merges each speaker's turns, and rotates voices.\n\n**Dub a video into another language** — the whole platform in one command: transcription with utterance timestamps, per-utterance LLM translation, TTS for each line (one voice per speaker), and ffmpeg laying the new track over the original video. A great demo is the first YouTube video ever, \"Me at the zoo\" — it's 19 seconds long, a single clear English speaker, and instantly recognizable, so the dub finishes fast and the before/after is obvious:\n\n```sh\nassembly --sandbox dub \"https://www.youtube.com/watch?v=jNQXAC9IVRw\" -l de --video\n```\n\nThe video stream is copied untouched; each dubbed line lands at its original start time.\n\n**Turn a podcast into audio** — Apple and Spotify podcast pages work too (yt-dlp ingestion):\n\n```sh\nassembly transcribe \"https://podcasts.apple.com/us/podcast/id1516093381\" --speaker-labels \\\n  | assembly --sandbox speak --out episode.wav\n```\n\n**Cut the highlight reel from a speech** — `clip` downloads the video (`--video`; omit it for audio-only clips), transcribes it, has an LLM pick the windows, and cuts each one into its own file with ffmpeg (here: Steve Jobs' Stanford commencement address):\n\n```sh\nassembly clip \"https://www.youtube.com/watch?v=UF8uR6Z6KLc\" --video \\\n  --llm \"the most quotable 20-40 seconds from each of the stories\" \\\n  --padding 0.5 --out-dir .\n```\n\n**Burn karaoke subtitles into a music video** — `caption` transcribes the video and burns the captions straight into the picture with ffmpeg; `--chars-per-caption` keeps the lines short so they flip with the vocals:\n\n```sh\nassembly caption video.mp4 --chars-per-caption 24 --font-size 28\n```\n\n**Keep a live to-do list from your mic** — `llm -f` re-runs the prompt over the growing transcript, updating in place:\n\n```sh\nassembly stream -o text | assembly llm -f \"summarize my to-dos as I talk\"\n```\n\n**Caption a meeting from system audio** (macOS) — captures app/system audio alongside your mic as separate diarized speakers:\n\n```sh\nassembly stream --system-audio --speaker-labels -o text\n```\n\n**Get pinged when your name comes up** in a live meeting:\n\n```sh\nassembly stream -o text | grep --line-buffered -i alex \\\n  | while read -r _; do afplay /System/Library/Sounds/Glass.aiff; done\n```\n\n**Chain LLM prompts over a transcript** — each prompt runs on the finished transcript:\n\n```sh\nassembly transcribe --sample --llm \"summarize\" --llm \"translate the summary to French\"\n```\n\n**Talk to a voice agent in your terminal** — full-duplex, around 20 voices:\n\n```sh\nassembly agent --voice ivy --system-prompt \"you're a helpful interviewer\"\n```\n\n**Graduate to the SDK** — `--show-code` prints the equivalent Python script for any `transcribe`/`stream`/`agent` run instead of executing it:\n\n```sh\nassembly agent --system-prompt \"you're a story generator\" --show-code \u003e story.py\n```\n\n**Scaffold and deploy a voice agent** — templates: `voice-agent`, `audio-transcription`, `live-captions`:\n\n```sh\nassembly init voice-agent \u0026\u0026 assembly deploy --prod\n```\n\n**Benchmark WER against public datasets** — built-in aliases for LibriSpeech, TEDLIUM, and more:\n\n```sh\nassembly eval librispeech --speech-model universal-3-pro --limit 50\n```\n\n## 📦 Installation\n\nRequires Python 3.12+ (Homebrew brings its own; for pipx/uv see the `--python` hint below).\n\n\u003e [!WARNING]\n\u003e The `assemblyai-cli` package on PyPI is **not** this project — install with one of the\n\u003e commands below, not `pip install assemblyai-cli`.\n\n### Homebrew (recommended — macOS / Linux)\n\n```sh\nbrew tap assemblyai/cli https://github.com/AssemblyAI/cli\nbrew trust assemblyai/cli   # only needed when HOMEBREW_REQUIRE_TAP_TRUST is set; harmless otherwise\nbrew install assembly\n```\n\nHomebrew pulls in `ffmpeg` and `portaudio`, so every command works out of the box.\n\n### pipx / uv\n\n```sh\npipx install \"git+https://github.com/AssemblyAI/cli.git\"\n# or\nuv tool install \"git+https://github.com/AssemblyAI/cli.git\"\n```\n\nIf your default interpreter is older than Python 3.12, add `--python python3.12` (pipx) or\n`--python 3.12` (uv) to the install command.\n\n\u003cdetails\u003e\n\u003csummary\u003eSystem dependencies for the live-audio commands (pipx/uv installs only)\u003c/summary\u003e\n\nOnly the live-audio commands need anything extra: `stream`, `dictate`, and `agent` use PortAudio for\nmicrophone capture and [`ffmpeg`](https://ffmpeg.org) on `PATH` to stream non-WAV audio.\nPlain `transcribe` uploads your file directly and needs neither.\n\n- Debian/Ubuntu: `sudo apt-get install libportaudio2 ffmpeg`\n- Fedora: `sudo dnf install portaudio ffmpeg`\n- macOS (Homebrew): `brew install portaudio ffmpeg`\n\n\u003c/details\u003e\n\n## 🔐 Authentication\n\nNew to AssemblyAI? Create a free account at\n[assemblyai.com/dashboard](https://www.assemblyai.com/dashboard) to get an API key.\n\n### Option 1: Browser login (recommended)\n\n**✨ Best for:** day-to-day use on your own machine.\n\nBrowser login stores your API key in the OS keyring (Keychain / Credential Manager / Secret Service) — nothing lands in a dotfile, and it unlocks the account commands (`keys`, `balance`, `usage`, `limits`, `sessions`, `audit`):\n\n```sh\nassembly login\n```\n\n### Option 2: API key environment variable\n\n**✨ Best for:** CI, containers, and anywhere a browser isn't an option.\n\nThe environment variable is checked before the keyring, and nothing is written to disk:\n\n```sh\nexport ASSEMBLYAI_API_KEY=\"YOUR_API_KEY\"\n```\n\n## 🚀 Getting started\n\n### Guided tour\n\nSign in, run a first transcription, start building:\n\n```sh\nassembly onboard\n```\n\n### Basic usage\n\n```sh\nassembly transcribe --sample   # transcribe the hosted sample file\nassembly transcribe call.mp3   # then your own audio\nassembly stream --sample       # live streaming, no microphone needed\nassembly stream                # stream your microphone (Ctrl-C to stop)\nassembly agent                 # talk to a voice agent (use headphones)\nassembly init                  # scaffold a starter app\n```\n\n### Quick examples\n\nPull exactly the output you need:\n\n```sh\nassembly transcribe call.mp3 -o text   # just the text\nassembly transcribe video.mp4 -o srt   # captions\nassembly transcribe call.mp3 --speaker-labels --summarization --json\n```\n\nTranscribe in batches — a directory, a glob, or a piped list, resumable on re-run:\n\n```sh\nassembly transcribe ./recordings\nassembly transcribe \"s3://bucket/calls/*.mp3\"   # needs: pip install s3fs\nfind . -name \"*.wav\" | assembly transcribe --from-stdin\n```\n\nCompose with other tools — audio in, text out:\n\n```sh\nffmpeg -i talk.mp4 -f wav - | assembly transcribe -\ngit log --oneline -30 | assembly llm \"write release notes grouped by feature/fix\"\n```\n\n## 📚 Documentation\n\n### In the terminal\n\n- Run `assembly --help` or `assembly \u003ccommand\u003e --help` for flags and examples.\n- Run `assembly doctor` to check your environment (API key, network, ffmpeg, microphone).\n\n### Resources\n\n- [**AssemblyAI docs**](https://www.assemblyai.com/docs) — guides for every model and feature.\n- [**API reference**](https://www.assemblyai.com/docs/api-reference) — the REST and streaming APIs the CLI drives.\n- [**Dashboard**](https://www.assemblyai.com/dashboard) — manage your account and API keys.\n\n## 🤝 Contributing\n\nThis project uses [uv](https://docs.astral.sh/uv/):\n\n```sh\nuv sync                  # create/refresh the venv\nuv run assembly --help   # run the CLI from the locked environment\n./scripts/check.sh       # the full gate CI runs\n```\n\nSee [AGENTS.md](AGENTS.md) for development conventions and architecture notes.\n\n## 📄 Legal\n\n- **License**: released under the [MIT license](LICENSE).\n- **Privacy**: [AssemblyAI privacy policy](https://www.assemblyai.com/legal/privacy-policy) — the CLI's anonymous usage telemetry is opt-out (`assembly telemetry disable`, `AAI_TELEMETRY_DISABLED=1`, or `DO_NOT_TRACK=1`).\n- **Terms**: [AssemblyAI terms of service](https://www.assemblyai.com/legal/terms-of-service).\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fassemblyai%2Fcli","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fassemblyai%2Fcli","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fassemblyai%2Fcli/lists"}