https://github.com/lancedb/community-metrics
Dashboard for tracking downloads and stars for Lance format and LanceDB SDK adoption.
- Host: GitHub
- URL: https://github.com/lancedb/community-metrics
- Owner: lancedb
- Created: 2026-02-22T22:04:06.000Z (4 months ago)
- Default Branch: main
- Last Pushed: 2026-04-01T13:34:29.000Z (3 months ago)
- Last Synced: 2026-04-04T07:55:22.022Z (2 months ago)
- Language: Python
- Homepage: https://community-metrics-alpha.vercel.app
- Size: 278 KB
- Stars: 1
- Watchers: 0
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- Agents: AGENTS.md
README
# Community Metrics Dashboard
This repository tracks community metrics for Lance and LanceDB, stores them in LanceDB Enterprise, and renders a read-only dashboard frontend.
Architecture split:
- **Write path**: Python ingestion jobs run on a private host (for example EC2 + cron).
- **Read path**: Next.js dashboard app serves `/api/v1/dashboard/daily` and is deployed to Vercel.
## What This Tracks
- SDK downloads:
  - `pylance` (PyPI)
  - `lance` (crates.io)
  - `lancedb` (PyPI)
  - `@lancedb/lancedb` (npm)
  - `lancedb` (crates.io)
- GitHub stars:
  - `lance-format/lance`
  - `lancedb/lancedb`
  - `lance-format/lance-graph`
  - `lance-format/lance-context`
## Prerequisites
- Python managed with `uv`
- Frontend managed with `npm`
- A running **LanceDB Enterprise** cluster
## Environment
Create `.env` in the repo root (or update the existing one):
```bash
LANCEDB_API_KEY=...
LANCEDB_HOST_OVERRIDE=https://
LANCEDB_REGION=us-east-1
# Strongly recommended for scheduled ingestion:
GITHUB_TOKEN=...
```
`GITHUB_TOKEN` should stay configured on the machine running scheduled updates.
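Before scheduling a job, it can help to confirm that the variables above are actually present. The following is a minimal sketch, not part of the repository: the variable names come from the `.env` example, while `check_env` and the split into required and recommended variables are assumptions made for illustration.

```python
import os

# Names taken from the .env example above. Treating the three LANCEDB_*
# variables as required and GITHUB_TOKEN as recommended is an assumption.
REQUIRED_VARS = ("LANCEDB_API_KEY", "LANCEDB_HOST_OVERRIDE", "LANCEDB_REGION")
RECOMMENDED_VARS = ("GITHUB_TOKEN",)


def check_env(env=None):
    """Return (missing_required, missing_recommended) lists of variable names.

    `env` defaults to the process environment; pass a dict to check other
    sources. Empty values count as missing.
    """
    env = os.environ if env is None else env
    missing_required = [name for name in REQUIRED_VARS if not env.get(name)]
    missing_recommended = [name for name in RECOMMENDED_VARS if not env.get(name)]
    return missing_required, missing_recommended
```

Calling `check_env()` with no arguments inspects the current process environment, so the `.env` file must already have been loaded into it by whatever mechanism the host uses.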
## LanceDB Storage
Tables:
- `metrics`: metric definitions
- `stats`: daily observations keyed by `(metric_id, period_end)`
- `history`: ingestion run logs
Daily row semantics in `stats`:
- `period_start == period_end`
- routine provenance: `api_daily`
- recompute provenance: `recomputed`
- download `source_window`: `1d`
- star `source_window`: `cumulative_snapshot`
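The daily row semantics above can be sketched as a small in-memory model. This is illustrative only and does not reflect the actual table schema: `metric_id`, `period_start`, `period_end`, the provenance values, and the `source_window` values come from this README, while the `DailyStat` class, the `value` field, and the `kind` parameter are hypothetical.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class DailyStat:
    """Hypothetical in-memory shape of one daily `stats` row."""

    metric_id: str
    period_start: date
    period_end: date
    value: int  # assumed field name for the observed count
    provenance: str
    source_window: str


def make_daily_stat(metric_id, day, value, kind, recomputed=False):
    """Build a daily row that follows the semantics listed above.

    `kind` is "downloads" or "stars" (a hypothetical parameter used here to
    pick the source window).
    """
    if kind not in ("downloads", "stars"):
        raise ValueError(f"unknown kind: {kind!r}")
    return DailyStat(
        metric_id=metric_id,
        period_start=day,  # daily rows cover a single day,
        period_end=day,    # so start and end are equal
        value=value,
        provenance="recomputed" if recomputed else "api_daily",
        source_window="1d" if kind == "downloads" else "cumulative_snapshot",
    )
```

Because rows are keyed by `(metric_id, period_end)`, a recomputed row for the same metric and day would be expected to replace the routine one rather than sit alongside it.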
## Ingestion Jobs (EC2 / Private Host)
All writes happen directly through `LanceDBStore`.
No FastAPI/uvicorn runtime is required.
### Clean-Slate Bootstrap
```bash
uv run python -m community_metrics.jobs.bootstrap_tables
uv run python -m community_metrics.jobs.update_all --lookback-days 90
```
### Routine Refresh
```bash
uv run python -m community_metrics.jobs.daily_refresh
```
For ad-hoc correction windows:
```bash
uv run python -m community_metrics.jobs.daily_refresh --lookback-days 7
```
One-time star-history backfill for newly added GitHub repos:
```bash
uv run python -m community_metrics.jobs.update_daily_stars --lookback-days 180
```
### Suggested Cron (EC2)
Run daily at **09:00 UTC**:
```cron
0 9 * * * cd /path/to/community-metrics && /usr/bin/env -S bash -lc 'uv run python -m community_metrics.jobs.daily_refresh >> /var/log/community-metrics/daily_refresh.log 2>&1'
```
## Frontend (Next.js + Vercel)
The dashboard lives in `src/dashboard`. It fetches its data from:
- `GET /api/v1/dashboard/daily?days=180`

Access is protected by Google SSO (restricted to `@lancedb.com` accounts).
### Local frontend dev
```bash
cd src/dashboard
npm install
npm run dev
```
Set these frontend env vars in `src/dashboard/.env.local` (local) or Vercel project settings (deployment):
```bash
GOOGLE_CLIENT_ID=...
GOOGLE_CLIENT_SECRET=...
NEXTAUTH_SECRET=...
NEXTAUTH_URL=http://127.0.0.1:3000
```
Google OAuth app setup must include this callback URI:
```text
http://127.0.0.1:3000/api/auth/callback/google
```
### Vercel env vars
Set these in the Vercel project:
```bash
LANCEDB_API_KEY=...
LANCEDB_HOST_OVERRIDE=https://
LANCEDB_REGION=us-east-1
GOOGLE_CLIENT_ID=...
GOOGLE_CLIENT_SECRET=...
NEXTAUTH_SECRET=...
NEXTAUTH_URL=https://
```
The route's code path only reads from LanceDB, and it queries only bounded dashboard windows.
When one becomes available, use a dedicated read-scoped key for Vercel.
### Frontend metric semantics
- Download chart points are monthly totals.
- Download card headline values are the last full-month totals.
- Through `2025-11-30`, download points come from seeded discrete snapshots.
- From `2025-12-01` onward, monthly download points are aggregated from daily rows.
- Star charts remain daily cumulative series.
- Total stars combine all tracked GitHub star repos.
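The aggregation rules above can be sketched in Python, although the dashboard itself is a Next.js app and its real implementation may differ. The function names and the plain-dict input shapes below are assumptions chosen for illustration.

```python
from collections import defaultdict
from datetime import date


def monthly_totals(daily):
    """Sum daily download counts {date: count} into {(year, month): total}."""
    totals = defaultdict(int)
    for day, count in daily.items():
        totals[(day.year, day.month)] += count
    return dict(sorted(totals.items()))


def last_full_month_total(daily, today):
    """Return the total for the calendar month before the month of `today`."""
    if today.month > 1:
        key = (today.year, today.month - 1)
    else:
        key = (today.year - 1, 12)
    return monthly_totals(daily).get(key, 0)


def total_stars(per_repo):
    """Combine cumulative star series {repo: {date: stars}} into one series.

    Simplification: assumes every repo has a row for every day; a missing
    day would undercount the total for that day.
    """
    combined = defaultdict(int)
    for series in per_repo.values():
        for day, stars in series.items():
            combined[day] += stars
    return dict(sorted(combined.items()))
```

The current, still-incomplete month is deliberately excluded from the headline value, which is why `last_full_month_total` looks one month back from `today`.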
## Which Job To Run
| Job | Use this for | Command |
| --- | --- | --- |
| `daily_refresh` | Normal daily updates (scheduled) | `uv run python -m community_metrics.jobs.daily_refresh` |
| `update_all` | Recompute/backfill a full lookback window | `uv run python -m community_metrics.jobs.update_all --lookback-days 90` |
| `bootstrap_tables` | Destructive reset/recreate before rebuild | `uv run python -m community_metrics.jobs.bootstrap_tables` |
## Debug helper
`debug.py` reads LanceDB Enterprise tables directly (no REST API required):
```bash
uv run debug.py metrics
uv run debug.py stats --metric-id downloads:lance:python --days 30
uv run debug.py history --start-date 2026-01-01 --end-date 2026-12-31 --limit 200
uv run debug.py all
```
## Development
Format and lint Python:
```bash
uv run ruff format .
uv run ruff check --fix --select I .
```
Run tests:
```bash
uv run pytest -q
```