{"id":30066343,"url":"https://github.com/ejfox/scrap-enlightener","last_synced_at":"2026-02-11T14:03:30.532Z","repository":{"id":299916751,"uuid":"1000938479","full_name":"ejfox/scrap-enlightener","owner":"ejfox","description":null,"archived":false,"fork":false,"pushed_at":"2025-08-21T22:04:10.000Z","size":17304,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2025-10-26T01:39:12.372Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"JavaScript","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/ejfox.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2025-06-12T14:52:31.000Z","updated_at":"2025-08-21T22:04:16.000Z","dependencies_parsed_at":"2025-06-19T00:41:23.140Z","dependency_job_id":"e7f10f3e-59bb-4675-90cb-a9e0fdb1a957","html_url":"https://github.com/ejfox/scrap-enlightener","commit_stats":null,"previous_names":["ejfox/scrap-enlightener"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/ejfox/scrap-enlightener","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ejfox%2Fscrap-enlightener","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ejfox%2Fscrap-enlightener/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ejfox%2Fscrap-enlightener/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ejfox%2Fscrap-enlightener/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owner
s/ejfox","download_url":"https://codeload.github.com/ejfox/scrap-enlightener/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ejfox%2Fscrap-enlightener/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":29333924,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-02-11T12:42:24.625Z","status":"ssl_error","status_checked_at":"2026-02-11T12:41:23.344Z","response_time":97,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.5:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2025-08-08T07:29:49.594Z","updated_at":"2026-02-11T14:03:30.528Z","avatar_url":"https://github.com/ejfox.png","language":"JavaScript","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Scrap Enlightener 🕵️\n\n**Transform your Pinboard bookmarks into a knowledge graph with AI-powered enrichment, semantic search, and automatic tagging.**\n\n## What This Does\n\nScrap Enlightener supercharges your Pinboard bookmarks by:\n- 🤖 **AI Enrichment**: Automatically extracts key information from bookmarked pages (quotes, questions, links, headers)\n- 🔗 **Connection Detection**: Discovers relationships between bookmarks through shared links and concepts\n- 🔍 **Semantic Search**: Find bookmarks by meaning, not just keywords (powered by OpenAI embeddings)\n- 🏷️ **Smart Tagging**: AI suggests relevant tags based on 
actual page content\n- 📸 **Visual Memory**: Captures screenshots of bookmarked pages\n- 💬 **Chat with Bookmarks**: Ask questions about your bookmark collection\n- 📊 **Social Discovery**: Find Hacker News and Reddit discussions about your bookmarks\n- 🕰️ **Time Machine**: Browse your bookmarks by date with timeline view\n- 📦 **Data Export**: Export your enriched bookmarks as JSON, CSV, or standalone HTML\n\n## How We Compare\n\n| Feature | Scrap Enlightener | [Hoarder](https://github.com/hoarder-app/hoarder) | [Linkding](https://github.com/sissbruecker/linkding) | [Raindrop.io](https://raindrop.io) | [Readwise Reader](https://readwise.io/reader) |\n|---------|------------------|---------|----------|------------|----------------|\n| **Pinboard Integration** | ✅ Native | ❌ | ✅ Import only | ✅ Import only | ❌ |\n| **AI Auto-Tagging** | ✅ | ✅ | ❌ | ❌ | ✅ |\n| **Semantic Search** | ✅ OpenAI | ❌ | ❌ | ✅ Paid tier | ✅ |\n| **Link Connections** | ✅ | ❌ | ❌ | ❌ | ✅ |\n| **Screenshots** | ✅ | ✅ | ✅ | ✅ | ✅ |\n| **Full-Text Archive** | ✅ | ✅ | ✅ | ✅ Paid | ✅ |\n| **HN/Reddit Discovery** | ✅ | ❌ | ❌ | ❌ | ❌ |\n| **Chat with Bookmarks** | ✅ | ✅ | ❌ | ❌ | ✅ |\n| **Self-Hosted** | ✅ | ✅ | ✅ | ❌ | ❌ |\n| **Offline Export** | ✅ HTML/JSON/CSV | ❌ | ✅ HTML | ✅ Limited | ✅ Limited |\n| **Pricing** | Free (BYO API keys) | Free | Free | $3-9/mo | $8-15/mo |\n| **Database** | Supabase/PostgreSQL | SQLite | SQLite/PostgreSQL | Proprietary | Proprietary |\n| **Primary Focus** | Pinboard enhancement | Everything saver | Simple bookmarks | Visual bookmarks | Reading \u0026 highlights |\n\n### Why Scrap Enlightener?\n- **Pinboard-first**: Built specifically for Pinboard users, not trying to replace it\n- **Bring Your Own Keys**: Use your own OpenAI API keys, pay only for what you use\n- **Knowledge Graph**: Unique focus on discovering connections between bookmarks\n- **Fully Exportable**: Your data is always yours, export to standalone HTML that works offline\n\n## Prerequisites\n\n- Node.js 
20+ and npm\n- A [Pinboard](https://pinboard.in) account with API token\n- A [Supabase](https://supabase.com) project (free tier works) *\n- An [OpenAI API key](https://platform.openai.com/api-keys) for AI features\n\n\\* *Supabase dependency is being removed - see [decoupling docs](docs/supabase-decoupling.md) for SQLite/PostgreSQL/JSON alternatives*\n\n## Quick Start\n\n### 1. Clone and Install\n\n```bash\ngit clone https://github.com/ejfox/scrap-enlightener.git\ncd scrap-enlightener\nnpm install\n```\n\n### 2. Choose Your Database\n\n#### Option A: SQLite (Easiest - No External Dependencies!)\n```bash\n# Just set this in your .env:\nDB_ADAPTER=sqlite\nSQLITE_PATH=./data/bookmarks.db\n# That's it! Database will be created automatically.\n```\n\n#### Option B: Supabase (Cloud-hosted)\n1. Create a new project at [supabase.com](https://supabase.com)\n2. Run the migrations in order from `supabase/migrations/`:\n   ```sql\n   -- Run each .sql file in your Supabase SQL editor\n   ```\n3. Copy your project URL and anon key from Project Settings \u003e API\n\n### 3. Configure Environment\n\n```bash\ncp .env.example .env\n```\n\nEdit `.env` with your keys:\n```env\n# From Supabase project settings\nSUPABASE_URL=https://your-project.supabase.co\nSUPABASE_ANON_KEY=your-anon-key-here\n\n# From OpenAI platform\nOPENAI_API_KEY=sk-your-openai-key-here\n\n# Optional: Direct Pinboard API access (for testing)\nPINBOARD_API_TOKEN=username:token\n```\n\n### 4. Start Development Server\n\n```bash\nnpm run dev\n```\n\nOpen http://localhost:3000\n\n### 5. Connect Your Pinboard\n\n1. Sign up or log in with Supabase Auth\n2. Go to the Settings page\n3. Enter your Pinboard API token (format: `username:XXXXXXXXXX`)\n4. 
Your token is securely stored in the database per-user\n\n## Core Features\n\n### ⚡ ENLIGHTEN.EXE (`/enlighten`)\n**Radically simplified single-screen bookmark processing theater**\n\nTransform your untagged Pinboard bookmarks into properly tagged knowledge:\n- **Real-time AI streaming** - Watch AI analyze content as it happens\n- **Single-word tag suggestions** - Only from your existing vocabulary \n- **Live processing display** - See content analysis and tag generation\n- **Automatic progression** - Accept tags and move to next bookmark seamlessly\n- **Rate-limited \u0026 respectful** - Proper API throttling for Pinboard\n\n**Simple workflow:**\n1. Enter your Pinboard token\n2. Click \"Load Bookmarks\" \n3. Watch AI analyze each bookmark in real-time\n4. Accept/reject suggested tags with one click\n5. Automatically moves to next untagged bookmark\n\n### 🔍 Semantic Search (`/search`)\nSearch bookmarks by meaning using AI embeddings:\n- Finds conceptually related bookmarks\n- Works even without exact keyword matches\n- Powered by OpenAI's text-embedding-3-small model\n\n### 🔗 Connection Discovery\nAutomatically finds relationships between bookmarks:\n- Detects when bookmarks link to each other\n- Creates a knowledge graph of your bookmarks\n- Surfaces forgotten related content\n\n### 📊 Analytics Dashboard (`/analytics`)\nTrack your bookmark habits:\n- Tagging velocity and streaks\n- Domain frequency analysis\n- Achievement system for gamification\n\n## API Endpoints\n\n### Enrichment APIs\n- `POST /api/enrichment/extract` - Extract content from a URL\n- `POST /api/enrichment/embed` - Generate embeddings for semantic search\n\n### Capture APIs\n- `POST /api/capture/screenshot` - Capture page screenshot\n- `POST /api/capture/archive` - Archive full page content\n\n### Discovery APIs\n- `POST /api/social/discover` - Find HN/Reddit discussions\n- `POST /api/chat/bookmarks` - Chat with your bookmarks\n- `POST /api/diff/check` - Check if a page has changed\n\n### Export 
APIs\n- `GET /api/export/json` - Export as JSON\n- `GET /api/export/csv` - Export as CSV  \n- `GET /api/export/static` - Export as standalone HTML\n\n### Time Machine\n- `GET /api/timemachine/snapshots` - Browse bookmarks by date\n\n### REST API\n- `GET /api/v1/bookmarks` - List bookmarks with filters\n- `POST /api/v1/bookmarks` - Add new bookmark\n- `DELETE /api/v1/bookmarks` - Delete bookmark\n\n## Architecture\n\n```\nFrontend (Nuxt 3 + Vue 3 + Tailwind)\n    ↓\nAPI Layer (Nitro server endpoints)\n    ↓\nServices\n├── Pinboard API (bookmark sync)\n├── OpenAI API (enrichment + embeddings)\n├── Supabase (auth + database + vector search)\n└── External APIs (HN, Reddit, web scraping)\n```\n\n## Database Schema\n\n- `users` - User accounts and settings\n- `bookmarks` - Synced Pinboard bookmarks\n- `enrichments` - Extracted content and metadata\n- `bookmark_embeddings` - Vector embeddings for semantic search\n- `connections` - Relationships between bookmarks\n- `achievements` - Gamification tracking\n\n## Common Issues\n\n### \"Cannot find package '@xenova/transformers'\"\n```bash\nnpm install @xenova/transformers\n```\n\n### \"Pinboard API token invalid\"\n- Format should be `username:TOKEN` (get from https://pinboard.in/settings/password)\n- Token is stored per-user in Settings, not in .env\n\n### \"OpenAI API key missing\"\n- Get key from https://platform.openai.com/api-keys\n- Add to .env as `OPENAI_API_KEY=sk-...`\n\n### \"Supabase connection failed\"\n- Check SUPABASE_URL and SUPABASE_ANON_KEY in .env\n- Ensure migrations are run in order\n\n## Development Commands\n\n```bash\nnpm run dev          # Start dev server\nnpm run build        # Build for production\nnpm run preview      # Preview production build\nnpm run lint         # Run linter\nnpm run typecheck    # Check TypeScript types\n```\n\n## Tech Stack\n\n- **Frontend**: Nuxt 3, Vue 3, Tailwind CSS, Headless UI\n- **Backend**: Nitro, Node.js\n- **Database**: Supabase (PostgreSQL + pgvector)\n- **AI/ML**: 
OpenAI API (GPT-3.5, embeddings)\n- **APIs**: Pinboard, Hacker News Algolia, Reddit\n- **Scraping**: Cheerio, Playwright (optional)\n\n## Contributing\n\n1. Fork the repository\n2. Create a feature branch (`git checkout -b feature/amazing-feature`)\n3. Commit changes (`git commit -m 'Add amazing feature'`)\n4. Push to branch (`git push origin feature/amazing-feature`)\n5. Open a Pull Request\n\n## License\n\nMIT - See [LICENSE](LICENSE) file\n\n## Acknowledgments\n\n- Inspired by tools like Hoarder, Karakeep, and Readwise\n- Built for the Pinboard community\n- Powered by OpenAI and Supabase","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fejfox%2Fscrap-enlightener","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fejfox%2Fscrap-enlightener","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fejfox%2Fscrap-enlightener/lists"}