{"id":50323825,"url":"https://github.com/queelius/bookmark-memex","last_synced_at":"2026-05-29T04:31:00.445Z","repository":{"id":273274656,"uuid":"918826345","full_name":"queelius/bookmark-memex","owner":"queelius","description":"Personal bookmark archive for the memex ecosystem: SQLite + FTS5 + MCP server, with content caching, soft delete, marginalia, and cross-archive URIs","archived":false,"fork":false,"pushed_at":"2026-04-23T23:00:51.000Z","size":8940,"stargazers_count":1,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"master","last_synced_at":"2026-04-24T01:10:10.144Z","etag":null,"topics":["bookmark-manager","bookmarks","cli","command-line-tool","database","fts5","full-text-search","mcp","memex","python","sqlite"],"latest_commit_sha":null,"homepage":"https://queelius.github.io/bookmark-memex/","language":"HTML","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/queelius.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":"CITATION.cff","codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2025-01-19T00:25:09.000Z","updated_at":"2026-04-23T23:00:56.000Z","dependencies_parsed_at":null,"dependency_job_id":"2f31185c-de9f-4573-8d99-9678f805d7f4","html_url":"https://github.com/queelius/bookmark-memex","commit_stats":null,"previous_names":["queelius/btk","queelius/bookmark-memex"],"tags_count":10,"template":false,"template_full_name":null,"purl":"pkg:github/queelius/bookmark-memex","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/queelius%2Fbookmark-memex","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/queelius%2Fbookmark-memex/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/queelius%2Fbookmark-memex/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/queelius%2Fbookmark-memex/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/queelius","download_url":"https://codeload.github.com/queelius/bookmark-memex/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/queelius%2Fbookmark-memex/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":33637485,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T15:22:16.424Z","status":"online","status_checked_at":"2026-05-29T02:00:06.066Z","response_time":107,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["bookmark-manager","bookmarks","cli","command-line-tool","database","fts5","full-text-search","mcp","memex","python","sqlite"],"created_at":"2026-05-29T04:30:59.977Z","updated_at":"2026-05-29T04:31:00.434Z","avatar_url":"https://github.com/queelius.png","language":"HTML","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Bookmark Toolkit (btk)\n\nA modern, database-first bookmark manager with powerful features for organizing, searching, and analyzing your bookmarks.\n\n## Features\n\n- 🗄️ **SQLite-based storage** - Fast, reliable, and portable\n- 📥 **Multi-format import** - HTML (Netscape), JSON, CSV, Markdown, plain text\n- 📤 **Multi-format export** - HTML (hierarchical folders), JSON, CSV, Markdown\n- 🔍 **Advanced search** - Full-text search including cached content\n- 🏷️ **Hierarchical tags** - Organize with nested tags (e.g., `programming/python`)\n- 🤖 **Auto-tagging** - NLP-powered automatic tag generation\n- 📄 **Content caching** - Stores compressed HTML and markdown for offline access\n- 📑 **PDF support** - Extracts and indexes text from PDF bookmarks\n- 🔌 **Plugin system** - Extensible architecture for custom features\n- 🌐 **Browser integration** - Import bookmarks and history from Chrome, Firefox, Safari\n- 📊 **Statistics \u0026 analytics** - Track usage, duplicates, health scores\n- ⚡ **Parallel processing** - Fast bulk operations with multi-threading\n\n## Installation\n\n```sh\npip install bookmark-tk\n```\n\n## Quick Start\n\n```sh\n# Start the interactive shell (recommended for exploration)\nbtk shell\n\n# Or use direct CLI commands\nbtk bookmark add https://example.com --title \"Example\" --tags tutorial,web\nbtk bookmark list\nbtk bookmark search \"python\"\n\n# Import and export\nbtk import html bookmarks.html\nbtk export bookmarks.html html --hierarchical\n\n# Tag management\nbtk tag add my-tag 42          # Add tag to bookmark #42\nbtk tag list                   # List all tags\nbtk tag tree                   # Show tag hierarchy\n```\n\n## Interactive Shell\n\nBTK includes a powerful interactive shell with a virtual filesystem interface:\n\n```sh\n$ btk shell\n\nbtk:/$ ls\nbookmarks  tags  starred  archived  recent  domains\n\nbtk:/$ cd tags\nbtk:/tags$ ls\nprogramming/  research/  tutorial/  web/\n\nbtk:/tags$ cd programming/python\nbtk:/tags/programming/python$ ls\n3298  4095  5124  5789  (bookmark IDs with this tag)\n\nbtk:/tags/programming/python$ cat 4095/title\nAdvanced Python Techniques\n\nbtk:/tags/programming/python$ star 4095\n★ Starred bookmark #4095\n\nbtk:/tags/programming/python$ recent\n# Shows recently visited bookmarks in this context\n\nbtk:/tags/programming/python$ cd /bookmarks/4095\nbtk:/bookmarks/4095$ pwd\n/bookmarks/4095\n\nbtk:/bookmarks/4095$ tag data-science machine-learning\n✓ Added tags to bookmark #4095\n```\n\n### Shell Features\n\n- **Virtual filesystem** - Navigate bookmarks like files and directories\n- **Hierarchical tags** - Tags like `programming/python/django` create navigable folders\n- **Context-aware commands** - Commands adapt based on your current location\n- **Unix-like interface** - Familiar `cd`, `ls`, `pwd`, `mv`, `cp` commands\n- **Tab completion** - (planned) Auto-complete for commands and paths\n- **Tag operations** - Rename tags with `mv old-tag new-tag`\n- **Bulk operations** - Copy tags to multiple bookmarks with `cp`\n\n## Database Management\n\nBTK uses a single SQLite database file (default: `btk.db`) instead of directory-based storage:\n\n```sh\n# Use default database (btk.db in current directory)\nbtk list\n\n# Specify a different database\nbtk --db ~/bookmarks.db list\n\n# Set default database in config\nbtk config set database.path ~/bookmarks.db\n\n# Database operations\nbtk db info              # Show database statistics\nbtk db vacuum            # Optimize database\nbtk db export backup.db  # Export to new database\n```\n\n## CLI Commands\n\nBTK organizes commands into logical groups. Use `btk \u003cgroup\u003e \u003ccommand\u003e` syntax:\n\n### Bookmark Operations\n\n```sh\n# Add bookmarks\nbtk bookmark add https://example.com --title \"Example\" --tags tutorial,reference\nbtk bookmark add https://paper.pdf --tags research,ml  # Auto-extracts PDF text\n\n# List and search\nbtk bookmark list                       # List all bookmarks\nbtk bookmark list --limit 10            # List first 10\nbtk bookmark search \"machine learning\"  # Search bookmarks\nbtk bookmark search \"python\" --in-content  # Search cached content\n\n# Get bookmark details\nbtk bookmark get 42                     # Simple view\nbtk bookmark get 42 --details           # Full details\nbtk bookmark get 42 --format json       # JSON output\n\n# Update bookmarks\nbtk bookmark update 42 --title \"New Title\" --tags python,tutorial --stars\nbtk bookmark update 42 --add-tags advanced --remove-tags beginner\n\n# Delete bookmarks\nbtk bookmark delete 42\nbtk bookmark delete --filter-tags old/  # Delete by tag prefix\n\n# Query with JMESPath\nbtk bookmark query \"[?stars == \\`true\\`].title\"  # Starred bookmarks\nbtk bookmark query \"[?visit_count \u003e \\`5\\`]\"      # Frequently visited\n```\n\n### Tag Management\n\n```sh\n# List tags\nbtk tag list                            # All tags\nbtk tag tree                            # Hierarchical tree view\nbtk tag stats                           # Usage statistics\n\n# Tag operations\nbtk tag add my-tag 42 43 44             # Add tag to bookmarks\nbtk tag remove old-tag 42               # Remove tag from bookmark\nbtk tag rename old-tag new-tag          # Rename tag everywhere\nbtk tag copy source-tag 42              # Copy tag to bookmark\nbtk tag filter programming/python       # Filter by tag prefix\n```\n\n### Import \u0026 Export\n\n```sh\n# Import from various formats\nbtk import html bookmarks.html          # Netscape HTML format\nbtk import json bookmarks.json          # JSON format\nbtk import csv bookmarks.csv            # CSV format\nbtk import markdown notes.md            # Extract links from markdown\nbtk import text urls.txt                # Plain text URLs\n\n# Import browser bookmarks\nbtk import chrome                       # Import from Chrome\nbtk import firefox --profile default    # Import from Firefox profile\n\n# Export to various formats\nbtk export output.html html --hierarchical  # HTML with folder structure\nbtk export output.json json                 # JSON format\nbtk export output.csv csv                   # CSV format\nbtk export output.md markdown               # Markdown with sections\n```\n\n### Content Operations\n\n```sh\n# Refresh cached content\nbtk content refresh --id 42             # Refresh specific bookmark\nbtk content refresh --all               # Refresh all bookmarks\nbtk content refresh --all --workers 50  # Use 50 parallel workers\n\n# View cached content\nbtk content view 42                     # View markdown in terminal\nbtk content view 42 --html              # Open HTML in browser\n\n# Auto-tag using content\nbtk content auto-tag --id 42            # Preview suggested tags\nbtk content auto-tag --id 42 --apply    # Apply suggested tags\nbtk content auto-tag --all --workers 100  # Tag all bookmarks\n```\n\n### Database Operations\n\n```sh\n# Database info\nbtk db info                             # Show statistics\nbtk db stats                            # Detailed stats\nbtk db vacuum                           # Optimize database\n\n# Deduplication\nbtk db dedupe --strategy merge          # Merge duplicate metadata\nbtk db dedupe --strategy keep_first     # Keep oldest bookmark\nbtk db dedupe --preview                 # Preview changes\n```\n\n### Configuration\n\n```sh\nbtk config show                         # Show current config\nbtk config set database.path ~/bookmarks.db\nbtk config set output.format json\n```\n\n### Shell\n\n```sh\nbtk shell                               # Start interactive shell\nbtk shell --db ~/bookmarks.db           # Use specific database\n```\n\n## Configuration\n\nBTK supports configuration files for persistent settings:\n\n```sh\n# Show configuration\nbtk config show\n\n# Set configuration values\nbtk config set database.path ~/bookmarks.db\nbtk config set output.format json\nbtk config set import.fetch_titles true\n\n# Configuration file location: ~/.config/btk/config.toml\n```\n\n## Advanced Features\n\n### PDF Support\n\nBTK automatically extracts text from PDF bookmarks for search and auto-tagging:\n\n```sh\nbtk add https://arxiv.org/pdf/2301.00001.pdf --tags research,ml\nbtk search \"neural network\" --in-content  # Searches PDF text\nbtk view 42                                # View extracted PDF text\n```\n\n### Hierarchical Tags \u0026 Export\n\nOrganize bookmarks with hierarchical tags and export to browser-compatible HTML:\n\n```sh\n# Add bookmarks with hierarchical tags\nbtk add https://docs.python.org --tags programming/python/docs\nbtk add https://flask.palletsprojects.com --tags programming/python/web\n\n# Export with folder structure\nbtk export bookmarks.html html --hierarchical\n\n# Result: Nested folders in browser\n# 📁 programming\n#   📁 python\n#     📁 docs\n#       🔖 Python Documentation\n#     📁 web\n#       🔖 Flask Documentation\n```\n\n### Content Caching\n\nBTK caches webpage content for offline access and full-text search:\n\n- Fetches HTML and converts to markdown\n- Compresses with zlib (70-80% compression ratio)\n- Extracts text from PDFs\n- Enables content-based search and auto-tagging\n\n```sh\n# Content is cached automatically when adding bookmarks\nbtk add https://example.com\n\n# Manually refresh content\nbtk refresh --all --workers 50\n\n# Search within cached content\nbtk search \"specific phrase\" --in-content\n```\n\n### Plugin System\n\nBTK has an extensible plugin architecture:\n\n```python\nfrom btk.plugins import Plugin, PluginMetadata, PluginPriority\n\nclass MyPlugin(Plugin):\n    def get_metadata(self) -\u003e PluginMetadata:\n        return PluginMetadata(\n            name=\"my-plugin\",\n            version=\"1.0.0\",\n            description=\"Custom functionality\",\n            priority=PluginPriority.NORMAL\n        )\n\n    def on_bookmark_added(self, bookmark):\n        # Custom logic when bookmark is added\n        pass\n```\n\n## Architecture\n\n### Modern Stack\n\n- **Database**: SQLAlchemy ORM with SQLite backend\n- **Models**: Bookmark, Tag, ContentCache, BookmarkHealth, Collection\n- **CLI**: Grouped argparse structure with Rich for beautiful terminal output\n- **Shell**: Interactive REPL with virtual filesystem and context-aware commands\n- **Testing**: pytest with 515 tests, \u003e80% coverage on core modules\n- **Content**: HTML/Markdown conversion, zlib compression, PDF extraction\n\n### Database Schema\n\n```\nbookmarks\n├── id (primary key)\n├── unique_id (hash)\n├── url\n├── title\n├── description\n├── added (timestamp)\n├── stars (boolean)\n├── visit_count\n├── last_visited\n└── reachable (boolean)\n\ntags\n├── id\n├── name (unique)\n├── description\n└── color\n\nbookmark_tags (many-to-many)\n├── bookmark_id\n└── tag_id\n\ncontent_cache\n├── id\n├── bookmark_id (foreign key)\n├── html_content (compressed)\n├── markdown_content\n├── content_hash\n├── fetched_at\n└── status_code\n```\n\n### Code Organization\n\n```\nbtk/\n├── cli.py              # Grouped command-line interface\n├── shell.py            # Interactive shell with virtual filesystem\n├── db.py               # Database operations\n├── models.py           # SQLAlchemy models\n├── graph.py            # Bookmark relationship graphs\n├── importers.py        # Import from various formats\n├── exporters.py        # Export to various formats\n├── content_fetcher.py  # Web content fetching\n├── content_cache.py    # Content cache management\n├── content_extractor.py # Content extraction \u0026 parsing\n├── auto_tag.py         # Auto-tagging with NLP/TF-IDF\n├── plugins.py          # Plugin system\n├── tag_utils.py        # Tag operations \u0026 hierarchies\n├── dedup.py            # Deduplication strategies\n├── archiver.py         # Web archive integration\n└── browser_import.py   # Browser bookmark import\n```\n\n## Development\n\n### Running Tests\n\n```sh\n# Run all tests\npytest\n\n# Run with coverage\npytest --cov=btk --cov-report=term-missing\n\n# Run specific test file\npytest tests/test_db.py -v\n```\n\n### Test Coverage\n\n- **Overall: 515 tests, all passing** ✅\n- Core modules: \u003e80% coverage\n  - graph.py: 97.28%\n  - models.py: 96.62%\n  - tag_utils.py: 95.67%\n  - content_extractor.py: 93.63%\n  - exporters.py: 92.45%\n  - plugins.py: 90.07%\n  - dedup.py: 88.24%\n  - utils.py: 88.57%\n  - db.py: 86.91%\n- Interface modules:\n  - shell.py: 53.12% (69 tests)\n  - cli.py: 23.11% (41 tests)\n  - Expected lower coverage for interactive/CLI code\n\n## Roadmap\n\n### Recently Completed ✅\n\n- **Smart Collections \u0026 Time-Based Recent** (v0.7.1)\n  - 5 auto-updating smart collections (`/unread`, `/popular`, `/broken`, `/untagged`, `/pdfs`)\n  - Time-based navigation with 6 periods × 3 activity types\n  - Enhanced `/recent` with hierarchical structure\n  - Collection counts in `ls` output\n- **Interactive Shell with Virtual Filesystem** (v0.7.0)\n  - Unix-like navigation (`cd`, `ls`, `pwd`)\n  - Hierarchical tag browsing\n  - Context-aware commands\n  - Tag operations (`mv`, `cp`)\n- **Grouped CLI Structure** - Organized commands by functionality\n- **Comprehensive Test Suite** - 515 tests with \u003e50% shell coverage\n- SQLAlchemy-based database architecture\n- Content caching with compression\n- PDF text extraction\n- Auto-tagging with NLP\n- Hierarchical tag export\n- Parallel processing for bulk operations\n- Browser bookmark import\n- Plugin system\n\n### In Progress 🚧\n\n- Enhanced search capabilities\n- Reading list management\n- Link rot detection with Wayback Machine\n\n### Planned Features 🎯\n\n- **Enhanced Domain Organization** - Improved domain-based browsing and filtering\n- **Bookmark Marginalia** - Rich text notes (marginalia) attached to bookmarks with orphan survival\n- **User-Defined Collections** - Custom smart collections via configuration\n- Browser extensions (Chrome, Firefox)\n- MCP integration for AI-powered queries\n- Static site generator for bookmark collections\n- Similarity detection and recommendations\n- Full-text search with ranking\n- Bookmark relationship graphs\n- Social features (shared collections)\n\n## Migration from Legacy JSON Format\n\nIf you're upgrading from an older JSON-based version of BTK:\n\n1. The new version uses SQLite databases instead of JSON files\n2. Use `btk import json old-bookmarks.json` to migrate your data\n3. Legacy commands and directory-based storage are no longer supported\n4. All functionality is now database-first with improved performance\n\n## Contributing\n\nContributions are welcome! Areas for contribution:\n\n- Adding new importers/exporters\n- Creating plugins for custom functionality\n- Improving test coverage\n- Documentation improvements\n- Performance optimizations\n\nSee the plugin system for the easiest way to extend BTK without modifying core code.\n\n## License\n\nMIT License - see LICENSE file for details.\n\n## Author\n\nDeveloped by [Alex Towell](https://github.com/queelius)\n\n## Links\n\n- GitHub: https://github.com/queelius/bookmark-tk\n- Issues: https://github.com/queelius/bookmark-tk/issues\n- PyPI: https://pypi.org/project/bookmark-tk/\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fqueelius%2Fbookmark-memex","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fqueelius%2Fbookmark-memex","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fqueelius%2Fbookmark-memex/lists"}