{"id":31733658,"url":"https://github.com/lancedb/lance-data-viewer","last_synced_at":"2025-10-09T08:53:19.866Z","repository":{"id":315650866,"uuid":"1060371840","full_name":"lancedb/lance-data-viewer","owner":"lancedb","description":"Browse Lance tables from your local machine in a simple web UI. No database to set up. Mount a folder and go.","archived":false,"fork":false,"pushed_at":"2025-09-28T21:26:44.000Z","size":466,"stargazers_count":6,"open_issues_count":3,"forks_count":1,"subscribers_count":0,"default_branch":"main","last_synced_at":"2025-10-04T10:35:13.392Z","etag":null,"topics":["fastapi","lance"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/lancedb.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2025-09-19T19:55:11.000Z","updated_at":"2025-10-03T18:36:15.000Z","dependencies_parsed_at":"2025-09-19T22:29:35.120Z","dependency_job_id":"6865b234-64cb-4011-894d-cffb85e6dccb","html_url":"https://github.com/lancedb/lance-data-viewer","commit_stats":null,"previous_names":["gordonmurray/lance-data-viewer"],"tags_count":1,"template":false,"template_full_name":null,"purl":"pkg:github/lancedb/lance-data-viewer","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lancedb%2Flance-data-viewer","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lancedb%2Flance-data-viewer/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lancedb%2Flance-data-viewer/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lancedb%2Flance-data-viewer/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/lancedb","download_url":"https://codeload.github.com/lancedb/lance-data-viewer/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lancedb%2Flance-data-viewer/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":279001055,"owners_count":26082991,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-10-09T02:00:07.460Z","response_time":59,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["fastapi","lance"],"created_at":"2025-10-09T08:53:16.765Z","updated_at":"2025-10-09T08:53:19.861Z","avatar_url":"https://github.com/lancedb.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"\n\n# Lance Data Viewer - A read-only web UI for Lance datasets\n\nBrowse Lance tables from your local machine in a simple web UI. No database to set up. Mount a folder and go.\n\n**✨ Multi-Version Support**: Built for different Lance versions to ensure compatibility with your data format.\n\n![Lance Data Viewer Screenshot](lance_data_viewer_screenshot.png)\n\n### Quick start (Docker)\n\n1. **Pull the recommended version**\n\n```bash\n# Modern stable version (recommended for new projects)\ndocker pull ghcr.io/gordonmurray/lance-data-viewer:lancedb-0.24.3\n```\n\n2. **Make your data readable (required)**\n\n```bash\n# Make your Lance data directory and all contents readable by the container\nchmod -R o+rx /path/to/your/lance\n```\n\n3. **Run (mount your data)**\n\n```bash\ndocker run --rm -p 8080:8080 \\\n    -v /path/to/your/lance:/data:ro \\\n    ghcr.io/gordonmurray/lance-data-viewer:lancedb-0.24.3\n```\n\n4. **Open the UI**\n\n```\nhttp://localhost:8080\n```\n\nThe UI will display the Lance version in the top-right corner for easy identification.\n\n### What counts as \"Lance data\" here?\n\nA folder containing Lance tables (as created by Lance/LanceDB). The app lists tables under `/data`.\n\n## Available Lance Versions\n\nChoose the container that matches your Lance data format:\n\n| Container Tag | Lance Version | PyArrow | Use Case |\n|--------------|---------------|---------|----------|\n| `lancedb-0.24.3` | 0.24.3 | 21.0.0 | **Recommended** - Modern stable version |\n| `lancedb-0.16.0` | 0.16.0 | 16.1.0 | Anchor stable for older datasets |\n| `lancedb-0.5` | 0.5.0 | 14.0.1 | Legacy support |\n| `lancedb-0.3.4` | 0.3.4 | 14.0.1 | Legacy support |\n| `lancedb-0.3.1` | 0.3.1 | 14.0.1 | Legacy support |\n\n### Viewing older Lance data\n\nIf you have datasets created with older Lance versions:\n\n```bash\n# For datasets created with Lance 0.16.x\ndocker run --rm -p 8080:8080 \\\n    -v /path/to/your/old/lance/data:/data:ro \\\n    ghcr.io/gordonmurray/lance-data-viewer:lancedb-0.16.0\n\n# For very old datasets (Lance 0.3.x era)\ndocker run --rm -p 8080:8080 \\\n    -v /path/to/your/legacy/data:/data:ro \\\n    ghcr.io/gordonmurray/lance-data-viewer:lancedb-0.3.4\n```\n\n**Tip**: If you're unsure which version to use, start with `lancedb-0.24.3` and if you get compatibility errors, try progressively older versions.\n\n### Features\n\n- **Read-only browsing** with organized left sidebar (Datasets → Columns → Schema)\n- **Advanced vector visualization** with CLIP embedding detection and sparkline charts\n- **Schema analysis** with vector column highlighting and type detection\n- **Server-side pagination** with inline controls and column filtering\n- **Robust error handling** - gracefully handles corrupted datasets\n- **Responsive layout** optimized for data viewing\n\n### Configuration (optional)\n\n- **Port:** change host port with `-p 9000:8080`.\n- **Read-only mount:** keep `:ro` to avoid accidental writes in future versions.\n\n### Images \u0026 registries\n\n- **GitHub Container Registry** (`ghcr.io/gordonmurray/lance-data-viewer:TAG`).\n\n### Build and test locally\n\n```bash\n# Build with specific Lance version (default: 0.3.4)\ndocker build -f docker/Dockerfile \\\n    --build-arg LANCEDB_VERSION=0.24.3 \\\n    -t lance-data-viewer:dev .\n\n# Build multiple versions for testing\ndocker build -f docker/Dockerfile --build-arg LANCEDB_VERSION=0.24.3 -t lance-data-viewer:lancedb-0.24.3 .\ndocker build -f docker/Dockerfile --build-arg LANCEDB_VERSION=0.16.0 -t lance-data-viewer:lancedb-0.16.0 .\ndocker build -f docker/Dockerfile --build-arg LANCEDB_VERSION=0.3.4 -t lance-data-viewer:lancedb-0.3.4 .\n\n# Make your Lance data readable (one-time setup)\nchmod -R o+rx data\n\n# Run with your data (replace 'data' with your lance folder path)\ndocker run --rm -p 8080:8080 -v $(pwd)/data:/data:ro lance-data-viewer:dev\n\n# Open the web interface\nopen http://localhost:8080\n\n# Test the API endpoints\ncurl http://localhost:8080/healthz\ncurl http://localhost:8080/datasets\ncurl \"http://localhost:8080/datasets/your-dataset/rows?limit=5\"\n```\n\n### Development workflow\n\n```bash\n# Stop any running containers\ndocker ps -q | xargs docker stop\n\n# Rebuild after code changes (with specific Lance version)\ndocker build -f docker/Dockerfile \\\n    --build-arg LANCEDB_VERSION=0.24.3 \\\n    -t lance-data-viewer:dev .\n\n# Run in background\ndocker run --rm -d -p 8080:8080 -v $(pwd)/data:/data:ro lance-data-viewer:dev\n\n# View logs\ndocker logs $(docker ps -q --filter ancestor=lance-data-viewer:dev)\n\n# Check version info\ncurl http://localhost:8080/healthz | jq '.lancedb_version'\n```\n\n## Supported Data Types\n\n### ✅ Fully Supported\n- **Standard types**: string, int, float, timestamp, boolean, null\n- **Modern vectors**: `Vector(dim)` fields (LanceDB 2024+ style)\n- **Fixed-size vectors**: `fixed_size_list\u003citem: float\u003e[N]` (e.g., CLIP-512)\n- **Structured data**: nested objects, metadata fields\n- **Indexed datasets**: properly created with IVF/HNSW indexes\n\n### ⚠️ Limited Support\n- **Legacy vectors**: `pa.list_(pa.float32(), dim)` - schema only, may show corruption warnings\n- **Large vectors**: \u003e2048 dimensions show preview only\n- **Corrupted data**: graceful degradation with informative error messages\n\n### ❌ Not Supported\n- Binary vectors (uint8 arrays)\n- Multi-vector columns\n- Custom user-defined types\n- Write operations (read-only viewer)\n\n## Vector Visualization Features\n\nThe viewer provides advanced visualization for vector embeddings:\n\n- **CLIP Detection**: Automatically identifies 512-dimensional CLIP embeddings\n- **Statistics**: Shows norm, sparsity, positive ratio, normalization status\n- **Sparkline Charts**: Interactive visual representation of vector values\n- **Detailed Tooltips**: Hover for comprehensive vector analysis\n- **Model Badges**: Visual indicators for recognized embedding types\n\n### Security Notes\n\n- Container runs as non-root\n- No authentication; bind to localhost during development and run behind a reverse proxy if exposing\n- Read-only access prevents accidental data modification","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Flancedb%2Flance-data-viewer","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Flancedb%2Flance-data-viewer","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Flancedb%2Flance-data-viewer/lists"}