https://github.com/jchirayath/mediahound

Turn photos of your movie & music collection (DVD/VHS/Blu-ray, CD/vinyl/cassette) into a sleek, searchable web catalog — offline-first, zero-key.
catalog cd collection dvd movies music musicbrainz ocr records self-hosted static-site vhs vinyl

Last synced: about 1 month ago

Host: GitHub
URL: https://github.com/jchirayath/mediahound
Owner: jchirayath
License: mit
Created: 2026-06-09T21:00:21.000Z (about 1 month ago)
Default Branch: main
Last Pushed: 2026-06-10T06:36:14.000Z (about 1 month ago)
Last Synced: 2026-06-10T07:25:16.373Z (about 1 month ago)
Topics: catalog, cd, collection, dvd, movies, music, musicbrainz, ocr, records, self-hosted, static-site, vhs, vinyl
Language: Python
Homepage: https://jchirayath.github.io/mediahound/
Size: 1.11 MB
Stars: 0
Watchers: 0
Forks: 0
Open Issues: 5
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md
- Contributing: CONTRIBUTING.md
- License: LICENSE
- Code of conduct: CODE_OF_CONDUCT.md
- Security: SECURITY.md

# 🎬🎵 MediaHound

[![CI](https://github.com/jchirayath/mediahound/actions/workflows/ci.yml/badge.svg)](https://github.com/jchirayath/mediahound/actions/workflows/ci.yml)
[![CodeQL](https://github.com/jchirayath/mediahound/actions/workflows/codeql.yml/badge.svg)](https://github.com/jchirayath/mediahound/actions/workflows/codeql.yml)
[![PyPI](https://img.shields.io/pypi/v/mediahound.svg)](https://pypi.org/project/mediahound/)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](LICENSE)
[![Python 3.11+](https://img.shields.io/badge/python-3.11%2B-blue.svg)](https://www.python.org/)
[![Live demo](https://img.shields.io/badge/live-demo-e97b0c.svg)](https://jchirayath.github.io/mediahound/)

**Turn photos of your movie *and* music collection into a sleek, searchable web catalog.**

Point MediaHound at a folder of cover photos — DVDs, VHS, Blu-ray, **CDs, vinyl, cassettes** — or
**import a CSV**. It identifies each item, pulls in cover art, genres, cast/artist, studio/label,
runtime/tracklist and ratings, writes a short enticing intro, estimates the used resale value, links
where to **watch** (movies), **listen** (music) or **find** (books), and generates a polished static
website you can search, filter by **🎬 Movies / 🎵 Music / 📚 Books**, sort, and curate — with a
password-protected admin mode.

Movies are identified/enriched via TMDB / OMDb / Wikidata + JustWatch; music via **MusicBrainz +
Cover Art Archive**; books via **Open Library** (open, zero-key) — scan a book's **ISBN** for an exact
match. Keyless Spotify / Apple Music / YouTube Music + Open Library / Goodreads links throughout.

**▶ [Live demo](https://jchirayath.github.io/mediahound/)** — explore a sample catalog in your browser (admin password: `changeme`).

**🖥️ Download the desktop app — no Python, no terminal:**
**[⬇ macOS](https://github.com/jchirayath/mediahound/releases/latest/download/MediaHound-macOS.zip)**
**[⬇ Windows](https://github.com/jchirayath/mediahound/releases/latest/download/MediaHound-Windows.zip)**
— unzip and open. *(Or `pip install mediahound` — see below.)*

[![MediaHound screenshot](docs/screenshot.jpg)](https://jchirayath.github.io/mediahound/)

> ℹ️ The demo shows **real movie posters and album covers** (hotlinked from IMDb/OMDb and Cover Art
> Archive / Apple) so you can see what a finished catalog looks like — no cover images are stored in
> this repo. Extra gallery photos are generated placeholders standing in for *your own* photos. Your
> real catalog pulls art from TMDB / OMDb / Wikidata (movies) and MusicBrainz / Cover Art Archive
> (music), or falls back to the photos you take.

- **Runs for anybody with zero API keys** — open-source OCR + open data by default.
- **Offline-first** — never contacts the internet unless you explicitly ask (`--online`).
- **Static output** — deploy anywhere (Netlify, GitHub Pages, S3, Vercel) or just open the HTML file.
- **No secrets in the repo** — keys live in a gitignored `.env`; your catalog is generated output.

> MIT-licensed. Your photos, keys and catalog never get committed to this tool's repo.

---

## See it in 30 seconds (no photos, no keys)

```bash
pip install mediahound

mediahound init demo
mediahound build --config demo/config.toml --mock # generates a sample catalog
cd demo && python3 -m http.server 8000 # open http://localhost:8000
```

That's the screenshot above. Click **🔒 Admin** and sign in with **`changeme`** to try the
read/write admin tools. Everything you see is generated by `--mock` — no internet, no API keys.

> Prefer the bleeding edge or want to hack on it? Install from source instead:
> `git clone https://github.com/jchirayath/mediahound && cd mediahound && pip install -e .`
> (Maintainers: see [RELEASING.md](RELEASING.md) for the publish flow.)

---

## Cataloguing your own collection

### The easy way — one command, no terminal after that

```bash
pip install mediahound
mediahound app # sets up a library and opens the editor in your browser
```

`mediahound app` creates a library folder (if needed), opens the catalog, and you click
**➕ Add photos** to **drag-and-drop** your cover pics — they're saved and identified automatically.
No config files, no separate build/serve commands. Everything stays on your computer.

**Snap covers on your phone:**

```bash
mediahound app --phone # serves the app on your Wi-Fi network and prints a QR code
```

Scan the QR with your phone (on the same Wi-Fi), tap **➕ Add photos → Take Photo**, and they upload
straight into your catalog. Uploads are **token-protected** (only the phone that scanned the code can
add photos) and nothing leaves your network — use it on a network you trust.

### Desktop app (no Python, no terminal)

Prefer to double-click an icon? A native app opens the editor in its own window:

- **Download:** **[⬇ macOS (.app)](https://github.com/jchirayath/mediahound/releases/latest/download/MediaHound-macOS.zip)**
· **[⬇ Windows (.exe)](https://github.com/jchirayath/mediahound/releases/latest/download/MediaHound-Windows.zip)**
— or pick a specific version on the [Releases](https://github.com/jchirayath/mediahound/releases)
page. Unzip and open. *(Unsigned builds show a Gatekeeper/SmartScreen prompt the first time:
right-click → **Open** on macOS, or **More info → Run anyway** on Windows.)* To ship builds that
open with no warning, see [SIGNING.md](SIGNING.md).
- **Or, if you've got Python:** `pip install "mediahound[desktop]"` then `mediahound gui` opens the
same native window. (Without the `[desktop]` extra it falls back to opening your browser.)
- **Build it yourself:** `bash packaging/build-desktop.sh` (uses PyInstaller; build on the OS you
want). CI builds the macOS + Windows apps automatically — see `.github/workflows/desktop.yml`.

The app keeps your library in **`~/MediaHound Library`** and works fully offline.

### Share it — one-click Publish

When your catalog's ready, click **🌐 Publish** in the admin console to deploy it to **Netlify** (free) and get a shareable link. Paste a Netlify access token once (stored in your OS keychain); only the finished site is uploaded — your source photos and config never leave your computer.

### The CLI way (more control)

```bash
pip install "mediahound[ocr]" # adds the default OCR identifier
# Install the Tesseract engine for OCR:
# macOS: brew install tesseract
# Debian: sudo apt-get install tesseract-ocr

mediahound init mysite # scaffolds mysite/ (RawImages/{video,audio}/, config.toml, web template)

# Sort your cover photos by media type:
cp ~/Pictures/dvd-covers/*.jpg mysite/RawImages/video/ # 🎬 movies (DVD/VHS/Blu-ray/LaserDisc)
cp ~/Pictures/album-covers/*.jpg mysite/RawImages/audio/ # 🎵 music (CD/vinyl/cassette)

mediahound build --config mysite/config.toml --online # identify + enrich (see Providers below)
cd mysite && python3 -m http.server 8000
```

#### Raw-image folder convention

Photos are sorted into **media-type subfolders** so MediaHound knows how to identify each item:

| Folder | Media type | Identified/enriched via |
|---|---|---|
| `RawImages/video/` | 🎬 movies | TMDB / OMDb / Wikidata + JustWatch |
| `RawImages/audio/` | 🎵 music | MusicBrainz + Cover Art Archive + listen links |
| `RawImages/` (root) | defaults to movies | — |

(`movies/` and `music/` are accepted aliases.) Add more photos anytime and re-run `build` — only the
**new** ones are processed (state is tracked by content hash in `data/manifest.json`).
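
Content-hash tracking of this kind can be sketched as follows. The manifest layout (`{"processed": [...]}`) and the function names are assumptions made for illustration; the real `data/manifest.json` schema may differ.

```python
import hashlib
import json
from pathlib import Path


def content_hash(path: Path) -> str:
    """SHA-256 of the file's bytes, so renaming a photo does not make it 'new'."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def load_seen(manifest_path: Path) -> set[str]:
    """Hypothetical manifest layout: {"processed": ["<sha256>", ...]}."""
    if not manifest_path.exists():
        return set()
    return set(json.loads(manifest_path.read_text())["processed"])


def new_photos(raw_dir: Path, manifest_path: Path) -> list[Path]:
    """Return only the photos whose content has not been processed before."""
    seen = load_seen(manifest_path)
    return [p for p in sorted(raw_dir.rglob("*.jpg")) if content_hash(p) not in seen]


def mark_processed(photos: list[Path], manifest_path: Path) -> None:
    """Add the given photos' hashes to the manifest on disk."""
    seen = load_seen(manifest_path) | {content_hash(p) for p in photos}
    manifest_path.parent.mkdir(parents=True, exist_ok=True)
    manifest_path.write_text(json.dumps({"processed": sorted(seen)}, indent=2))
```

Hashing the bytes rather than the filename means a renamed or re-copied photo is skipped, while an edited one is picked up again.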

### Or import from a CSV (no photos)

```bash
mediahound import catalog.csv --config mysite/config.toml # add rows offline
mediahound import catalog.csv --config mysite/config.toml --online # …and fetch cover art + metadata
mediahound export --config mysite/config.toml -o backup.csv # dump the whole catalog back to CSV
```

Columns (case-insensitive; extras ignored): `media_type, title, artist, director, year, format,
label, studio, genres, rating, barcode, cover_url, intro`. **Only `title` is required** — any missing
fields are left blank (or filled by `--online`); even a one-column list of titles works. `media_type`
is inferred (`music` if an `artist` is given, else `movie`). See
[`examples/sample-import.csv`](examples/sample-import.csv).
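
The column rules above can be sketched in a few lines of Python. This is an illustration of the documented behaviour, not the importer itself: `parse_catalog_csv` is a hypothetical name, and skipping rows with no title is an assumption.

```python
import csv
import io

KNOWN_COLUMNS = {
    "media_type", "title", "artist", "director", "year", "format", "label",
    "studio", "genres", "rating", "barcode", "cover_url", "intro",
}


def parse_catalog_csv(text: str) -> list[dict[str, str]]:
    """Normalise headers, drop unknown columns, and infer media_type."""
    rows = []
    for raw in csv.DictReader(io.StringIO(text)):
        row = {
            key.strip().lower(): (value or "").strip()
            for key, value in raw.items()
            if isinstance(key, str) and key.strip().lower() in KNOWN_COLUMNS
        }
        if not row.get("title"):
            continue  # assumption: a row without a title is skipped
        if not row.get("media_type"):
            row["media_type"] = "music" if row.get("artist") else "movie"
        rows.append(row)
    return rows


sample = "Title,Artist,Year,Shelf\nKind of Blue,Miles Davis,1959,A3\nCasablanca,,1942,B1\n"
for row in parse_catalog_csv(sample):
    print(row["title"], "->", row["media_type"])
# Kind of Blue -> music
# Casablanca -> movie
```

The `Shelf` column is ignored as an extra, and the mixed-case headers still match.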

Prefer a UI? Under **`mediahound serve --admin`** the admin screen has an **⬆ Import list** button —
paste or upload the same CSV, optionally tick *enrich online*, and the titles are added and the site
rebuilt in place.

---

## Features

### The catalog
- **Search** title / genre / cast / studio / intro, **sort** by title, year, recently-added, value or rating.
- **Filters**: format, genre, studio, **streaming service**, language, category, seen / unseen.
- **Dense, aligned cards** showing poster, title·year, ★rating · format · runtime · language,
genres, director + cast, studio, where-to-watch, intro hook, and estimated resale value.
- **Clickable everything**: a genre, person, or studio filters the grid to matching titles.
- **Adjustable density** — viewers pick how many movies per row; responsive on web & mobile.

### Photos
- **Multi-photo galleries** — flip through every photo of a title with ‹ › arrows.
- **Click-to-zoom** lightbox; set any photo as the default; rotate photos (baked in on rebuild).
- Auto-uprights sideways/landscape cover photos to portrait.

### Where to watch & resale
- **Where to watch** — is it on Netflix / Amazon Prime / Hulu? A clickable ▶ badge + pills link
straight to the title (via JustWatch, no key). A filter narrows to a specific service.
- **Resale value** — a heuristic estimate plus a live link to eBay sold/completed listings. For music
with a Discogs release id, a condition-based **Discogs price suggestion** (token-gated).

### 📷 Barcode scanning (exact, not fuzzy)
- Identify the **exact release** from the UPC/EAN barcode instead of fuzzy OCR. In the admin view,
**📷 Scan barcode** opens your camera (or type the digits); on `mediahound build --online` a barcode
found in a cover photo is preferred over OCR. Music → MusicBrainz/Discogs; movies → UPCItemDB →
the normal identify-by-title path. Local decode needs the optional extra: `pip install "mediahound[barcode]"`.
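
When digits are typed by hand, a checksum test catches most typos before any lookup is made. The sketch below implements the standard GS1 check-digit rule for UPC-A (12 digits) and EAN-13 (13 digits). Whether MediaHound validates barcodes this way is an assumption, and `is_valid_gtin` is a hypothetical helper name.

```python
def is_valid_gtin(code: str) -> bool:
    """Validate a UPC-A or EAN-13 barcode using the GS1 check digit."""
    if not (code.isascii() and code.isdigit()) or len(code) not in (12, 13):
        return False
    digits = [int(c) for c in code]
    body, check = digits[:-1], digits[-1]
    # Counting from the digit next to the check digit, weights alternate 3, 1, 3, ...
    total = sum(d * (3 if i % 2 == 0 else 1) for i, d in enumerate(reversed(body)))
    return (10 - total % 10) % 10 == check


print(is_valid_gtin("036000291452"))   # True  (UPC-A)
print(is_valid_gtin("4006381333931"))  # True  (EAN-13)
print(is_valid_gtin("4006381333932"))  # False (wrong check digit)
```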

### 💿 Discogs (records & CDs)
- **Import an existing Discogs collection** in one step: `mediahound import-discogs ` (or the
**💿 Discogs** admin button). Selectable as a music metadata provider (`[music.metadata] provider = "discogs"`).
Token stored in your keychain via **Settings → API keys**.

### ⭐ Your personal catalog (admin-only, never published)
- **Rate** (★1–10), **note**, and **tag/shelve** any item; track **lending** (loan out → badge →
returned), filter **On loan / Available**, and hit **🎲 Surprise me** to pick something for tonight.
All of this is **stripped from the published site** — it shows only in your local admin view.

### 🛟 Backup, export & feeds
- **`mediahound backup` / `restore`** zip up (and re-create) your whole library — `--no-photos` for a
quick curation-only backup; secrets are never included. A **⬇ Backup** button does the same from the app.
- **Export** to **Letterboxd** (movies) or **JSON**: `mediahound export --format letterboxd|json`
(or the **🎬 Letterboxd** button). The published site also emits **`feed.json` + `feed.xml`** of
recently-added items so anyone can subscribe.
- **📚 Library switcher** — keep separate catalogs (e.g. movies vs. music, or per-family-member) and
**open / create / switch between them from the admin UI** with no restart (a recents list lives in
`~/.config/mediahound/`). Each library's **data directory** is set by its own `config.toml` `[paths]`.
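
For reference, a `feed.json` of recently-added items could be produced along these lines, following the public JSON Feed 1.1 format. The catalog field names used here (`id`, `title`, `intro`, `added`) and the function name are assumptions, not MediaHound's actual schema.

```python
import json


def build_json_feed(items: list[dict], site_title: str, site_url: str, limit: int = 20) -> dict:
    """Build a JSON Feed 1.1 document from the most recently added items."""
    # Assumes "added" holds timestamps in one consistent RFC 3339 format,
    # so plain string sorting matches chronological order.
    recent = sorted(items, key=lambda it: it["added"], reverse=True)[:limit]
    return {
        "version": "https://jsonfeed.org/version/1.1",
        "title": site_title,
        "home_page_url": site_url,
        "items": [
            {
                "id": it["id"],
                "title": it["title"],
                "content_text": it.get("intro", ""),
                "date_published": it["added"],
            }
            for it in recent
        ],
    }


catalog = [
    {"id": "t1", "title": "Casablanca", "intro": "A classic.", "added": "2026-06-01T10:00:00Z"},
    {"id": "t2", "title": "Kind of Blue", "added": "2026-06-09T18:30:00Z"},
]
feed = build_json_feed(catalog, "My Shelf", "https://example.com/")
print(json.dumps(feed, indent=2))
```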

### Two views
- **Default view** — public, read-only.
- **Admin view** — password-protected, read/write. Edit a title's name, year, format, studio &
distributor; **move a title between 🎬 Movies and 🎵 Music**; mark seen; rotate / set-default /
delete a photo; delete a title; and configure the **library name, description, logo, which fields
are shown, and default columns**.

### Editing & persisting your changes

Your edits are recorded as small **corrections** (keyed by title id). There are two ways to make
them permanent so they **survive every future `mediahound build`** — pick one:

**A. Live admin server (recommended — zero manual steps)**

```bash
mediahound serve --admin # serves the site at http://127.0.0.1:8765
```

Open the site, unlock admin, and edit. Every change is written **straight into
`data/corrections.json`** (and `seen-overrides.json`) as you go — the badge shows
*“✓ Saved to disk.”* Click **↻ Rebuild** to re-bake the catalog and reload. Because the edit is
already in `data/`, the next `mediahound build` (and any re-query) keeps it — **edits never revert.**
The write API is **localhost-only** and refuses cross-origin requests; never expose it publicly.

**B. Static export (for read-only/CDN hosting like Netlify or GitHub Pages)**

When the site is served as plain files (no admin server), edits live in your browser. Click
**Export changes** / **Export seen** — the download is **merged with the site's existing
`data/corrections.json`** so nothing already saved is lost — then drop the file into `data/` and run
`mediahound build`.

> Either way the source of truth is `data/corrections.json`. A title you fix only in the browser
> (without server-admin or an export) shows locally but **reverts on the next rebuild**, because the
> build regenerates the catalog from `data/`.
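
The merge step in option B can be pictured as a per-title dictionary update. This is a sketch under an assumed file layout (`{title_id: {field: value}}`); the real `data/corrections.json` schema may differ, and `merge_corrections` is a hypothetical name.

```python
def merge_corrections(existing: dict[str, dict], exported: dict[str, dict]) -> dict[str, dict]:
    """Overlay newly exported edits on saved ones without losing earlier fixes."""
    merged = {title_id: dict(fields) for title_id, fields in existing.items()}
    for title_id, fields in exported.items():
        # Newer values win per field; untouched fields and titles are kept.
        merged.setdefault(title_id, {}).update(fields)
    return merged


saved = {"t1": {"title": "Alien", "year": 1979}}
from_browser = {"t1": {"format": "VHS"}, "t2": {"title": "Blade Runner"}}
print(merge_corrections(saved, from_browser))
# {'t1': {'title': 'Alien', 'year': 1979, 'format': 'VHS'}, 't2': {'title': 'Blade Runner'}}
```

Merging per field rather than per title is what keeps an earlier year fix alive when a later export only changes the format.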

### Manual identification
- Covers that couldn't be read are grouped on `identify.html`, where you **name** them (queued for
discovery on the next online build) or **discard** them (e.g. blank tapes).

---

## How it compares

Most movie-collection tools add items by **barcode scan or manual entry** and keep your catalog in
**their cloud** or a dated desktop app. MediaHound is the only one that identifies titles from
**photos of the covers** and generates a **modern static website you own and host for free** —
offline-first and open-source. It's also one of the few that handles **VHS** (VHS barcodes are
usually absent from the disc databases that other tools rely on).

| | MediaHound | CLZ / Libib | Tellico / GCstar | Plex / Jellyfin |
|---|---|---|---|---|
| Add by **photo of cover** | ✅ OCR/AI | ❌ barcode/manual | ❌ search/manual | ❌ scans video files |
| Modern **website you host free** | ✅ | ❌ their cloud | ◻︎ dated HTML export | ❌ private server |
| Open-source / offline / no account | ✅ | ❌ | ✅ desktop | ✅ (Jellyfin) |
| For a **physical** shelf | ✅ | ✅ | ✅ | ❌ digital files |

See **[COMPARISON.md](COMPARISON.md)** for the full, honest analysis — including when a barcode app
(CLZ/Libib), an OSS desktop cataloger (Tellico/Data Crow), or a media server (Plex/Jellyfin) is the
better choice.

---

## Providers (how titles get identified & enriched)

Both paths are first-class — pick them per-site in `config.toml`. The default needs **zero keys**.

| Concern | Default (no key) | Optional upgrade |
|---|---|---|
| **Identify** title from a cover | `tesseract` — open-source OCR | `claude` (Anthropic vision, also writes the intro) · `ollama` (local model) |
| **Movie** metadata + poster | `wikidata` — Wikidata + Wikipedia + Wikimedia | `tmdb` (free key) · `omdb` (free key) |
| **Music** metadata + cover art | `musicbrainz` — MusicBrainz + Cover Art Archive | `discogs` *(planned)* |
| **Where to watch / listen** | `justwatch` (movies) · keyless Spotify/Apple/YouTube search (music) | Spotify / Apple Music keys *(planned)* |
| **Resale** | eBay sold-listings link + estimate | Discogs price *(planned, music)* |

Switch to a premium provider in `config.toml`:

```toml
[identify]
provider = "claude" # needs ANTHROPIC_API_KEY
[metadata]
provider = "tmdb" # needs TMDB_API_KEY (or use "omdb" + OMDB_API_KEY)
```

…and create a **gitignored** `.env` next to `config.toml`:

```
ANTHROPIC_API_KEY=sk-ant-...
TMDB_API_KEY=...
```

Robustness built in: results are cached (`data/.metadata-cache.json`) so rebuilds never re-hit a
rate-limited free key, providers fail soft (a bad lookup never drops a title), and a fuzzy match
that returns the wrong film is rejected so it can't corrupt your names.
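
A rough sketch of those three behaviours (cache, fail soft, reject a poor fuzzy match) using only the standard library. The 0.8 threshold, the function names, and the shape of the `lookup` callable are assumptions for illustration, not MediaHound's implementation.

```python
import re
from difflib import SequenceMatcher
from typing import Callable, Optional


def normalize(title: str) -> str:
    """Lowercase and strip punctuation so 'The Matrix!' equals 'the matrix'."""
    return " ".join(re.sub(r"[^a-z0-9 ]", " ", title.lower()).split())


def is_plausible_match(query: str, candidate: str, threshold: float = 0.8) -> bool:
    """Reject a provider result whose title is too far from what was asked for."""
    return SequenceMatcher(None, normalize(query), normalize(candidate)).ratio() >= threshold


def enrich(title: str, lookup: Callable[[str], Optional[dict]], cache: dict) -> Optional[dict]:
    """Return metadata for a title, or None; never raise and never re-query a hit."""
    if title in cache:
        return cache[title]
    try:
        result = lookup(title)
    except Exception:
        return None  # fail soft: the title stays in the catalog, just unenriched
    if not result or not is_plausible_match(title, result.get("title", "")):
        return None  # wrong film: keep the name we already have
    cache[title] = result
    return result
```

Only accepted results are cached here, so a failed or rejected lookup is retried on the next online build.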

---

## Offline by default

`mediahound build` is **offline** — it regenerates the site from existing data and never contacts
the internet. Add `--online` to allow identification / metadata / where-to-watch lookups:

```bash
mediahound build --config mysite/config.toml # offline: just rebuild the site
mediahound build --config mysite/config.toml --online # online: identify + enrich new titles
mediahound build --config mysite/config.toml --online --refresh-streaming # also re-check where-to-watch
```

Useful flags: `--mock` (demo data), `--force` (reprocess everything), `--limit N`, `--reidentify `.

---

## Deploy

The generated site folder (`mysite/`) contains only static files (`index.html`, `identify.html`,
`assets/`, `data/`, `posters/`, `originals/`), so you can **host it free** on **GitHub Pages,
Cloudflare Pages, Netlify, Vercel, Render, or Surge.sh** with no server, database, or build step.
Quickest:

```bash
cd mysite && npx netlify deploy --prod # Netlify
cd mysite && npx wrangler pages deploy . # Cloudflare Pages
cd mysite && npx surge . # Surge.sh
```

See **[DEPLOYMENT.md](DEPLOYMENT.md)** for the full free-hosting comparison plus GitHub Pages, Vercel
and S3 instructions. The live demo above is itself hosted free on GitHub Pages via a workflow.

It even works by **double-clicking `index.html`** — the build embeds the catalog in `data/bundle.js`
so it loads without a web server.
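
The reason this helps: browsers generally block `fetch()` of local JSON from a `file://` page, but a plain `<script src="data/bundle.js">` still loads. A minimal sketch of writing such a bundle follows; the global name `MEDIAHOUND_CATALOG` and the function name are assumptions, not the names the build actually uses.

```python
import json
from pathlib import Path


def write_bundle(collection: list[dict], out_path: Path, global_name: str = "MEDIAHOUND_CATALOG") -> None:
    """Write the catalog as a JS file that assigns it to a global variable."""
    # json.dumps escapes non-ASCII by default, so the output is also valid JavaScript.
    payload = json.dumps(collection)
    out_path.parent.mkdir(parents=True, exist_ok=True)
    out_path.write_text(f"window.{global_name} = {payload};\n", encoding="utf-8")


write_bundle([{"id": "t1", "title": "Amélie"}], Path("demo-out/data/bundle.js"))
```

The page can then read `window.MEDIAHOUND_CATALOG` directly, with no network request.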

### 📱 Install it on your phone (PWA)

A published catalog is a **Progressive Web App**. Open it in your phone's browser and **Add to Home
Screen** (on iPhone: the **Share** button in Safari) — you get an app icon, a full-screen view, and
**offline** access (the catalog is cached). It **auto-updates** every time you republish. Browsing is
read-only; editing happens in the local app (`mediahound app`) and is then published.

---

## Architecture

See **[ARCHITECTURE.md](ARCHITECTURE.md)** for the full picture. In short:

```
RawImages/*.jpg ─▶ identify (OCR / vision) ─▶ enrich (poster, genres, cast, studio, rating)
│
confidence too low? ──────┴─▶ data/unidentified.json → identify.html
▼
+ intro + resale + where-to-watch ─▶ data/collection.json ─▶ static site (index.html)
```

Python CLI (`mediahound/`) builds the data; a dependency-free vanilla-JS frontend (`mediahound/web/`) renders it.

---

## Attribution & licensing

- Code: **MIT** (see [LICENSE](LICENSE)).
- Default data: **Wikidata** (CC0), **Wikipedia** text (CC BY-SA), images via **Wikimedia Commons**.
- If you enable **TMDB**: uses the TMDB API but is not endorsed or certified by TMDB.
- Where-to-watch data via **JustWatch**; resale links to **eBay** sold listings (estimates are heuristics).

## Security

MediaHound output is a **static site** — read-only to everyone, with no server to attack. The admin
password is a convenience gate, **not** an access-control boundary (the published catalog can't be
changed without rebuilding + redeploying). API keys stay in a gitignored `.env`; only the password
hash ships. All rendered data is HTML-escaped and links are scheme-restricted. See
**[SECURITY.md](SECURITY.md)** for the full threat model and reporting instructions.
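
To illustrate what "HTML-escaped and scheme-restricted" means in practice, here is a small Python sketch of the idea. The actual frontend is vanilla JS, so this is not the project's code; the helper names and the `http`/`https` allow-list are assumptions.

```python
from html import escape
from urllib.parse import urlsplit

ALLOWED_SCHEMES = {"http", "https"}  # assumed allow-list


def safe_href(url: str) -> str:
    """Return the URL if its scheme is allowed, otherwise an inert '#'."""
    url = url.strip()
    try:
        scheme = urlsplit(url).scheme.lower()
    except ValueError:
        return "#"
    return url if scheme in ALLOWED_SCHEMES else "#"


def render_link(label: str, url: str) -> str:
    """Render an anchor with both the label and the href escaped."""
    return f'<a href="{escape(safe_href(url), quote=True)}">{escape(label)}</a>'


print(render_link("<b>Trailer</b>", "javascript:alert(1)"))
# <a href="#">&lt;b&gt;Trailer&lt;/b&gt;</a>
```

Note that this strict version also rejects relative URLs, which have no scheme; a real renderer might allow those explicitly.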

**Privacy:** MediaHound is offline-first with **no account and no telemetry** — your photos, catalog,
and keys stay on your computer; data leaves only when you opt into online metadata, Publish, or phone
upload. See **[PRIVACY.md](PRIVACY.md)**.

**Roadmap:** what's next (barcode scanning, Discogs, backup/exports, personal catalog) is designed in
[docs/design/](docs/design/); the full backlog is in [docs/ROADMAP.md](docs/ROADMAP.md).

Contributions welcome — see [CONTRIBUTING.md](CONTRIBUTING.md).
