https://github.com/mikewaters/safari-raindrop-tabgroups
A tool to retrieve and analyze Safari and Raindrop.io Tab Groups/Collections. Uses the local MacOS Safari sqlite database and the Raindrop.io API.
https://github.com/mikewaters/safari-raindrop-tabgroups
raindrop safari tab-groups
Last synced: 3 months ago
JSON representation
A tool to retrieve and analyze Safari and Raindrop.io Tab Groups/Collections. Uses the local MacOS Safari sqlite database and the Raindrop.io API.
- Host: GitHub
- URL: https://github.com/mikewaters/safari-raindrop-tabgroups
- Owner: mikewaters
- Created: 2026-02-28T19:26:01.000Z (3 months ago)
- Default Branch: master
- Last Pushed: 2026-02-28T19:33:15.000Z (3 months ago)
- Last Synced: 2026-02-28T22:50:41.491Z (3 months ago)
- Topics: raindrop, safari, tab-groups
- Language: TypeScript
- Homepage:
- Size: 22.4 MB
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 0
-
Metadata Files:
- Readme: README.md
Awesome Lists containing this project
README
# safari-tabgroups
A CLI toolkit for extracting, listing, and describing browser tab groups from Safari and Raindrop.io. Built with [Bun](https://bun.sh).
## Prerequisites
- macOS (reads Safari's local SQLite database)
- [Bun](https://bun.sh) v1.3+
- An [OpenRouter](https://openrouter.ai) API key (for `fetch --prompt` and `describe` commands)
- A [Raindrop.io](https://raindrop.io) API token (for syncing Raindrop collections)
## Install
```bash
bun install
make install # builds + copies to /usr/local/bin
# or install elsewhere:
PREFIX=~/.local/bin make install
```
## Commands
1. `sync-tabgroups` - Sync our cache of sources
2. `safari-tabgroups` - Read safari data from cached source
3. `raindrop-tabgroups` - Read raindrop data from cached source
4. `list-tabgroups` - List combined cached source data
5. `describe-tabgroup` - Perform classification of bookmarks data
6. `bookmark-index` - Manages bookmarks data in a database and orchestrate
### sync-tabgroups
Populates the local cache for Safari and/or Raindrop.io. All other commands read from this cache — run sync first.
```bash
# Sync both sources
bun run sync
# Sync only Safari
bun run sync -- --safari
# Sync only Raindrop.io
bun run sync -- --raindrop
# Sync from Safari Technology Preview
bun run sync -- --safari --stp
```
**Flags:**
| Flag | Description |
|------|-------------|
| `--safari` | Only sync Safari tab groups |
| `--raindrop` | Only sync Raindrop.io collections |
| `--full-raindrop` | Force full Raindrop sync (skip delta mode) |
| `--stp` | Sync from Safari Technology Preview instead of Safari |
| `--verbose` | Print debug info to stderr |
| `--debug` | Implies `--verbose` |
Without `--safari` or `--raindrop`, syncs both sources.
**How it works:**
- **Safari:** Copies `SafariTabs.db` (plus WAL/SHM files) to `~/.cache/safari-tabgroups/`, skipping the copy if the cache is already fresh. Runs a WAL checkpoint to consolidate writes.
- **Raindrop:** Fetches collections on every run. If a prior cache exists, performs a delta sync using `lastUpdate:>fetchedAt` and merges changes by raindrop ID; use `--full-raindrop` to force a complete refresh (recommended periodically to reconcile deletions). Cache is written to `~/.cache/safari-tabgroups/raindrop-collections.json`.
When syncing both, sources are fetched in parallel.
---
### safari-tabgroups
Reads Safari tab groups from the local cache.
```bash
# Human-readable output
bun run safari
# JSON output
bun run safari -- --json
# Pipe JSON to jq
bun run safari -- --json | jq '.profiles[].tabGroups[].name'
```
**Flags:**
| Flag | Description |
|------|-------------|
| `--json` | Output as JSON (`{ profiles: [...] }`) |
| `--verbose` | Print debug info to stderr |
| `--debug` | Implies `--verbose` |
**Text output (default):** one line per tab as `Profile / Tab Group / Title (URL)`
```
Personal / Research / Some Page (https://example.com)
Personal / Work / Jira Board (https://jira.example.com)
```
**JSON output (`--json`):**
```json
{
"profiles": [
{
"name": "Personal",
"tabGroups": [
{
"name": "Research",
"tabs": [
{ "title": "Some Page", "url": "https://example.com" }
]
}
]
}
]
}
```
---
### raindrop-tabgroups
Reads Raindrop.io collections from the local cache, output in the same schema as `safari-tabgroups`.
```bash
# Human-readable output
bun run raindrop
# JSON output
bun run raindrop -- --json
# Pipe to jq
bun run raindrop -- --json | jq '.profiles[0].tabGroups[] | .name'
```
**Flags:**
| Flag | Description |
|------|-------------|
| `--json` | Output as JSON (`{ profiles: [...] }`) |
| `--verbose` | Print debug info to stderr |
| `--debug` | Implies `--verbose` |
**Mapping:** Collections map to tab groups; raindrops (bookmarks) map to tabs. Nested collections are flattened with `"Parent / Child"` naming. All collections are placed under a single profile named `"Raindrop.io"`.
---
### list-tabgroups
Lists all tab group names across both sources.
```bash
# Plain text listing
bun run list
# JSON output
bun run list -- --json
# Only Safari groups
bun run list -- --safari
# Only Raindrop groups
bun run list -- --raindrop
```
**Flags:**
| Flag | Description |
|------|-------------|
| `--json` | Output merged JSON (`{ profiles: [...] }`) |
| `--safari` | Only include Safari tab groups |
| `--raindrop` | Only include Raindrop.io collections |
| `--verbose` | Print debug info to stderr |
| `--debug` | Implies `--verbose` |
**Text output (default):**
```
Personal
Research (12 tabs)
Work (3 tabs)
Raindrop.io
AI Tools (8 tabs)
Reading List (4 tabs)
```
---
### describe-tabgroup
Derives structured metadata about tab groups using an LLM. Works with both Safari and Raindrop sources.
- **Tier 1 (default):** Sends only tab titles and URLs to the LLM. Fast and cheap.
- **Tier 2 (`--fetch`):** Also fetches and includes page content for the top N tabs, providing richer context.
```bash
# Describe a single tab group
bun run describe "My Research"
# Describe with page content (Tier 2)
bun run describe "My Research" -- --fetch
# Describe all tab groups
bun run describe -- --all
# Only describe Safari groups
bun run describe -- --all --safari
# Only describe Raindrop groups
bun run describe -- --all --raindrop
```
**Flags:**
| Flag | Description |
|------|-------------|
| `--all` | Describe all tab groups (outputs a JSON object keyed by group name) |
| `--fetch` | Tier 2: fetch markdown from the top N tabs and include in the prompt |
| `--safari` | Only include Safari tab groups |
| `--raindrop` | Only include Raindrop.io collections |
| `--verbose` | Print debug info to stderr, including the full assembled prompt |
| `--debug` | Implies `--verbose` |
**Output (single group):**
```json
{
"description": "Research into AI agent frameworks and their memory architectures.",
"category": "research",
"topics": ["ai-agents", "agent-memory", "open-source"],
"intent": "Evaluating agent frameworks to find or build a successor setup.",
"confidence": 0.95,
"page_snapshot": "## Agent Memory Docs\nMarkdown content from fetched pages..."
}
```
When `--fetch` is used, the fetched markdown content is included as `page_snapshot` in the output. This is stored alongside the classification in the database. Without `--fetch`, `page_snapshot` is omitted.
**Output (`--all`):** a JSON object where each key is a tab group name.
---
### bookmark-index
Maintains a unified SQLite index of Safari tab groups and Raindrop collections. Supports LLM-powered classification and URL matching against stored groups.
```bash
# Sync index from cached data
bun run index update
# List all indexed groups
bun run index list
# Show full detail for a group
bun run index show "My Research"
# Classify groups using LLM
bun run index classify --all
# Match a URL against classified groups
bun run index match "https://example.com"
```
**Flags:**
| Flag | Description |
|------|-------------|
| `--db ` | Override database location (default: `$XDG_DATA_HOME/safari-tabgroups/bookmarks.db`) |
| `--json` | Output as JSON |
| `--safari` | Only include Safari tab groups |
| `--raindrop` | Only include Raindrop.io collections |
| `--all` | Classify all groups |
| `--fetch` | Include page content when classifying and store page snapshot |
| `--unclassified` | Only classify groups without existing classification |
| `--top N` | Limit match results (default: 5) |
| `--strategy NAME` | Match strategy to use (default: `llm-fetch`) |
| `--verbose` | Print debug info to stderr |
Re-classifying an already-classified collection creates a new version — previous versions are preserved and can be managed with `bookmark-index version`.
The database defaults to `~/.local/share/safari-tabgroups/bookmarks.db` (following the XDG Base Directory spec). This can be configured in `fetch.config.toml` or overridden per-invocation with `--db`.
---
### fetch-tabgroup
Fetches a URL and converts it to markdown. Optionally sends the markdown to an LLM via OpenRouter.
```bash
# Fetch a URL and output markdown
bun run fetch "https://example.com"
# Fetch and ask an LLM about the content
bun run fetch "https://example.com" --prompt "Summarize this page in 3 bullet points"
```
**Flags:**
| Flag | Description |
|------|-------------|
| `--prompt ` | Send the fetched markdown to OpenRouter with this user prompt |
| `--verbose` | Print debug info to stderr |
| `--debug` | Implies `--verbose`. Saves raw markdown to `./debug-.md` |
---
## Configuration
All settings are stored in `fetch.config.toml` at the project root.
```toml
[database]
path = "$XDG_DATA_HOME/safari-tabgroups/bookmarks.db" # Supports $ENV_VAR syntax; falls back to ~/.local/share
[openrouter]
api_key = "$OPENROUTER_API_KEY" # Supports $ENV_VAR syntax
model = "google/gemini-2.5-flash"
system_prompt = "You are a helpful assistant that analyzes web page content."
max_content_bytes = 40_000 # Max markdown bytes sent to LLM (fetch command)
max_tokens = 1_000 # Max response tokens from the LLM
[raindrop]
api_key = "$RAINDROP_TOKEN" # Raindrop.io API token
[describe]
max_tabs_to_fetch = 5 # How many tabs to fetch in Tier 2
skip_domains = ["discord.com", "localhost"]
per_tab_max_bytes = 500 # Max bytes of markdown per tab in Tier 2
system_prompt = """
You are a research librarian cataloging a user's browser tab groups...
"""
```
### Environment variables
| Variable | Used by | Description |
|----------|---------|-------------|
| `OPENROUTER_API_KEY` | `fetch`, `describe`, `index` | OpenRouter API key |
| `RAINDROP_TOKEN` | `sync` | Raindrop.io API token |
| `XDG_DATA_HOME` | `index` | Base directory for persistent data (default: `~/.local/share`) |
---
## Building
Compiled binaries are produced under `dist/`. `make install` copies them from there to `PREFIX`.
```bash
make build # compile all standalone binaries into dist/
make install # build + install to /usr/local/bin
make install PREFIX=~/.local/bin # custom install path
make uninstall # remove installed binaries
make clean # remove dist/ and build artifacts
```
Compiled binaries:
| Binary | Source | Description |
|--------|--------|-------------|
| `sync-tabgroups` | `src/sync.ts` | Cache population for Safari and Raindrop |
| `safari-tabgroups` | `src/safari.ts` | Safari tab group reader (read-only) |
| `raindrop-tabgroups` | `src/raindrop.ts` | Raindrop.io collection reader (read-only) |
| `list-tabgroups` | `src/list.ts` | List all tab group names |
| `describe-tabgroup` | `src/describe.ts` | Tab group metadata derivation via LLM |
| `fetch-tabgroup` | `src/fetch.ts` | URL-to-markdown + optional LLM analysis |
| `bookmark-index` | `src/index.ts` | Unified index with classification and URL matching |
## Database
The index database (`bookmarks.db`) stores collections, their items, and LLM-generated classifications.
### Schema overview
| Table | Description |
|-------|-------------|
| `groups` | Collections from Safari and Raindrop with inline classification fields |
| `items` | Individual bookmarks/tabs within each group |
| `group_classifications` | Versioned Collection Cards with `page_snapshot` |
| `highlights` | Raindrop highlights linked to items |
| `match_log` / `match_cache` | URL match history and caching |
| `match_feedback` | User feedback on match quality |
| `meta` | Key-value metadata (last sync times) |
### Classification snapshots
When classifying with `--fetch`, the markdown fetched from tab pages is stored as `page_snapshot` in `group_classifications`. This preserves the source material that informed each classification.
- The field is nullable — classifications without `--fetch` or where all fetches failed have `NULL`
- When using `--import`, include an optional `page_snapshot` field in the JSON
- An ERROR is logged if `--fetch` is used but no snapshot is captured
### Stored metadata
The `metadata` JSON column on `items` and `groups` stores rich source data:
**Safari tabs** — All columns from Safari's bookmarks table except `server_id` and binary blobs (`icon`, `sync_data`, `extra_attributes`, `local_attributes`). Includes `order_index`, `subtype`, `last_modified`, `external_uuid`, `date_closed`, `read`, and more. Core Data timestamps are converted to ISO 8601.
**Raindrop items** — All fields from the Raindrop API except `cover`. Includes `type`, `excerpt`, `note`, `tags`, `domain`, `important`, `broken`, `media`, `cache`, `user`, `creatorRef`, `sort`, `removed`, and more.
**Raindrop collections** — All collection fields except `cover`, `_id`, `title`, and `parent` (stored as first-class columns). Includes `description`, `color`, `slug`, `access`, `author`, `count`, `creatorRef`, and more.
---
## Architecture
```
src/
sync.ts Cache population — copies Safari DB, fetches Raindrop API
safari.ts Safari SQLite reader — reads tab groups from cached SafariTabs.db
raindrop.ts Raindrop.io reader — reads collections from cached JSON
list.ts Lists tab group names from both sources
describe.ts Tab group metadata derivation via LLM (spawns safari.ts and raindrop.ts)
fetch.ts URL-to-markdown converter with optional LLM analysis
index.ts Unified bookmark index — stores groups, classifications, and matches
match/
types.ts MatchStrategy interface and strategy registry
llm-fetch.ts LLM-based match strategy (pre-score + OpenRouter)
plist.ts Apple plist parser for Safari timestamp extraction
fetch.config.toml Shared configuration (API keys, LLM settings, database path)
Makefile Build and install targets
```
`describe.ts` and `list.ts` spawn `safari.ts` and `raindrop.ts` as subprocesses to get tab group data. All reader commands are read-only — only `sync.ts` writes to the cache. `index.ts` maintains its own SQLite database at `$XDG_DATA_HOME/safari-tabgroups/bookmarks.db` (configurable).