https://github.com/testy-cool/video-analyzer-ai

ai cli llm openrouter tiktok transcript video-analysis video-to-text youtube-transcript yt-dlp

Last synced: about 2 months ago
JSON representation

Host: GitHub
URL: https://github.com/testy-cool/video-analyzer-ai
Owner: testy-cool
Created: 2026-05-06T16:02:27.000Z (2 months ago)
Default Branch: main
Last Pushed: 2026-05-09T20:10:21.000Z (2 months ago)
Last Synced: 2026-05-09T22:07:09.368Z (2 months ago)
Topics: ai, cli, llm, openrouter, tiktok, transcript, video-analysis, video-to-text, youtube-transcript, yt-dlp
Language: Python
Size: 63.5 KB
Stars: 1
Watchers: 0
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md

Awesome Lists containing this project

README

          # video-analyzer-ai

Extract transcripts and analyze videos with AI. YouTube, Twitter/X, TikTok, Instagram, and [1000+ sites](https://github.com/yt-dlp/yt-dlp/blob/master/supportedsites.md) supported.

```bash

va "https://www.youtube.com/watch?v=dQw4w9WgXcQ"

va "https://x.com/user/status/123456" -a

va "https://www.tiktok.com/@user/video/123456" -a -v

```

## How it works

```

URL → YouTube captions (free) → transcript

        ↓ (no captions?)

      yt-dlp audio download → multimodal LLM transcription → transcript

        ↓ (--analyze?)

      LLM analysis via any OpenAI-compatible API → structured breakdown

        ↓ (--verify?)

      Second pass with prompt cache optimization → verified result

```

- **YouTube**: tries built-in captions first (free, instant), falls back to audio transcription

- **Everything else**: downloads audio via yt-dlp, transcribes it with a multimodal LLM (Gemini by default)

- **Analysis**: sends transcript to any OpenAI-compatible endpoint (OpenRouter, OpenAI, Gemini, local models)

- **Verification**: double-pass analysis where the second call reuses the prompt cache (~66% cheaper)

- **Caching**: transcripts, metadata, and analysis results cached locally with human-readable filenames

## Install

Requires Python 3.13+ and [yt-dlp](https://github.com/yt-dlp/yt-dlp).

```bash

# Install yt-dlp if you don't have it

pip install yt-dlp

# Install video-analyzer-ai

git clone https://github.com/testy-cool/video-analyzer-ai.git

cd video-analyzer-ai

cp .env.example .env  # add your API keys

pip install -e .

```

Or with [uv](https://docs.astral.sh/uv/):

```bash

uv tool install --editable .

```

The CLI command is `va`.

## Configuration

Copy `.env.example` to `.env` and fill in your keys:

```bash

# Analysis + transcription LLM — any OpenAI-compatible endpoint

ANALYSIS_BASE_URL=https://openrouter.ai/api/v1/chat/completions

ANALYSIS_API_KEY=sk-...

ANALYSIS_MODEL=google/gemini-3.1-flash-lite

# Transcription model — must accept audio input (multimodal)

TRANSCRIPTION_MODEL=google/gemini-3.1-flash-lite

# Optional: Langfuse tracing

LANGFUSE_SECRET_KEY=

LANGFUSE_PUBLIC_KEY=

LANGFUSE_BASE_URL=

# Optional: Evomi residential proxy (for rate-limited sources)

EVOMI_USER=

EVOMI_PASS=

```

**Zero-config mode**: YouTube captions work without any API keys. You only need `ANALYSIS_API_KEY` for the `-a` analysis flag and for audio transcription of videos without captions.

## Usage

```bash

# Transcript only (free for YouTube with captions)

va "https://www.youtube.com/watch?v=..."

# With timestamps

va "https://www.youtube.com/watch?v=..." -t

# AI analysis

va "https://www.youtube.com/watch?v=..." -a

# Analysis + verification (double-pass, uses prompt cache)

va "https://www.youtube.com/watch?v=..." -a -v

# Custom prompt

va "https://www.youtube.com/watch?v=..." -q "list every product mentioned"

# Custom prompt + verification

va "https://www.youtube.com/watch?v=..." -q "extract all claims" -v

# JSON output (pipe-friendly)

va "https://www.youtube.com/watch?v=..." -a -j | jq .analysis.summary

# Non-YouTube (requires ANALYSIS_API_KEY)

va "https://x.com/user/status/123456789"

va "https://www.tiktok.com/@user/video/123456789"

# Force audio transcription (skip captions)

va "https://www.youtube.com/watch?v=..." -f

# Different language

va "https://www.youtube.com/watch?v=..." -l ro

# Open cache folder

va -o

# Skip cache

va "https://www.youtube.com/watch?v=..." --no-cache

```

## Flags

| Flag | Short | Description |

|------|-------|-------------|

| `--analyze` | `-a` | Analyze transcript with LLM |

| `--verify` | `-v` | Double-pass verification (implies `-a`) |

| `--prompt` | `-q` | Custom analysis prompt (implies `-a`) |

| `--json-output` | `-j` | JSON output to stdout |

| `--timestamps` | `-t` | Show timestamps in text output |

| `--force-transcribe` | `-f` | Skip captions, use audio transcription |

| `--lang` | `-l` | Caption language codes (default: en) |

| `--no-metadata` | `-M` | Skip metadata fetching |

| `--proxy` | `-p` | Use Evomi residential proxy |

| `--model` | | Transcription model (default: `google/gemini-3.1-flash-lite`) |

| `--analysis-model` | | Analysis model (default: from env) |

| `--no-cache` | | Bypass all caching |

| `--open-cache` | `-o` | Open cache folder in file manager |

## Cache

Transcripts, metadata, and analysis results are cached at `~/.cache/video-analyzer/` with human-readable filenames:

```

AI-Engineer_MCP-UI-Extending-the-frontier_o-zkvb0iFDQ.transcript.json

AI-Engineer_MCP-UI-Extending-the-frontier_o-zkvb0iFDQ.analysis.gemini_gemini-3.1-flash-lite-preview.json

AI-Engineer_MCP-UI-Extending-the-frontier_o-zkvb0iFDQ.metadata.json

```

Analysis is cached per model — switching `--analysis-model` triggers a fresh analysis. Use `--no-cache` to bypass, or `va -o` to browse cached files.

## Cost

| Operation | Cost |

|-----------|------|

| YouTube captions | Free |

| Audio transcription (Gemini 3.1 Flash Lite) | ~$0.001-0.002/min |

| Analysis (depends on model) | ~$0.001-0.01/run |

| Verification pass | ~33% of analysis (prompt cache) |

| Cached results | Free |

## License

MIT

ecosyste.ms

Data

Tools

Indexes

Applications

Experiments

Awesome

https://github.com/testy-cool/video-analyzer-ai

Awesome Lists containing this project

README