{"id":51225550,"url":"https://github.com/sayak-sarkar/contextlake","last_synced_at":"2026-06-28T11:00:32.702Z","repository":{"id":366466330,"uuid":"1275829763","full_name":"sayak-sarkar/contextlake","owner":"sayak-sarkar","description":"A local context layer for AI tools: mirror your repositories, index them into a knowledge graph, and serve it over MCP — so agents work from real source.","archived":false,"fork":false,"pushed_at":"2026-06-22T00:09:05.000Z","size":336,"stargazers_count":1,"open_issues_count":1,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-06-22T02:26:27.290Z","etag":null,"topics":["ai","cli","code-search","context","gitlab","knowledge-graph","mcp"],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/sayak-sarkar.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":"SECURITY.md","support":null,"governance":null,"roadmap":"ROADMAP.md","authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-06-21T07:26:00.000Z","updated_at":"2026-06-22T00:09:45.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/sayak-sarkar/contextlake","commit_stats":null,"previous_names":["sayak-sarkar/contextlake"],"tags_count":23,"template":false,"template_full_name":null,"purl":"pkg:github/sayak-sarkar/contextlake","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sayak-sarkar%2Fcontextlake","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/reposito
ries/sayak-sarkar%2Fcontextlake/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sayak-sarkar%2Fcontextlake/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sayak-sarkar%2Fcontextlake/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/sayak-sarkar","download_url":"https://codeload.github.com/sayak-sarkar/contextlake/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sayak-sarkar%2Fcontextlake/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":34885802,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T15:22:16.424Z","status":"online","status_checked_at":"2026-06-28T02:00:05.809Z","response_time":54,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["ai","cli","code-search","context","gitlab","knowledge-graph","mcp"],"created_at":"2026-06-28T11:00:29.489Z","updated_at":"2026-06-28T11:00:32.691Z","avatar_url":"https://github.com/sayak-sarkar.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"\u003cp align=\"center\"\u003e\n  \u003cimg src=\"https://raw.githubusercontent.com/sayak-sarkar/contextlake/main/docs/img/readme-banner.jpg\" alt=\"contextlake, all your real context in one local lake. 
Pebble the otter surfacing from a misty lake cradling a glowing pebble of context.\" width=\"820\"\u003e\n\u003c/p\u003e\n\u003ch1 align=\"center\"\u003econtextlake\u003c/h1\u003e\n\u003cp align=\"center\"\u003e\u003cstrong\u003eAll your real context, in one local lake.\u003c/strong\u003e\u003c/p\u003e\n\u003cp align=\"center\"\u003e\n  A local context layer for your AI tools: mirror your repositories, index them\u003cbr\u003e\n  into a knowledge graph, and serve it over MCP, so agents answer from \u003cem\u003ereal source\u003c/em\u003e instead of guessing.\n\u003c/p\u003e\n\n\u003cp align=\"center\"\u003e\n  \u003ca href=\"https://github.com/sayak-sarkar/contextlake/actions/workflows/ci.yml\"\u003e\u003cimg src=\"https://github.com/sayak-sarkar/contextlake/actions/workflows/ci.yml/badge.svg\" alt=\"CI\"\u003e\u003c/a\u003e\n  \u003ca href=\"https://pypi.org/project/contextlake/\"\u003e\u003cimg src=\"https://img.shields.io/pypi/v/contextlake?color=137A8B\" alt=\"PyPI\"\u003e\u003c/a\u003e\n  \u003cimg src=\"https://img.shields.io/badge/python-3.9%2B-blue\" alt=\"Python 3.9+\"\u003e\n  \u003cimg src=\"https://img.shields.io/badge/offline-first-2BB3A3\" alt=\"Offline-first\"\u003e\n  \u003cimg src=\"https://img.shields.io/badge/license-MIT-green\" alt=\"License: MIT\"\u003e\n\u003c/p\u003e\n\n---\n\n## Why contextlake\n\nYour AI assistant is only as good as what it can actually see. Point it at one file and\nit's sharp; ask it about *the system*, which service calls this API, who depends on that\npackage, where a symbol is really defined across dozens of repos, and it starts guessing.\n\n**contextlake gives your tools the real source to read.** It mirrors your repositories to\nyour machine, indexes them into a queryable knowledge graph, and serves that graph to your\neditor over [MCP](https://modelcontextprotocol.io). 
Everything runs locally and offline:\nno code leaves your machine, and it carries no credentials of its own.\n\n## How it works\n\ncontextlake is three layers you adopt one at a time. The mirror is useful on its own, and\neach layer above it is optional.\n\n\u003cp align=\"center\"\u003e\n  \u003cimg src=\"https://raw.githubusercontent.com/sayak-sarkar/contextlake/main/docs/img/architecture.png\" width=\"860\" alt=\"contextlake architecture. On the left, your repos: a GitLab group, plus optional Figma, Jira, and other MCP connectors. In the centre, contextlake indexes and mirrors them into a graph and embeddings, a wiki, and connectors. On the right, it serves the result over MCP to your AI tools: Claude Code, Windsurf, Kiro, Cursor, and Postman.\"\u003e\n\u003c/p\u003e\n\n1. **Mirror**: clone every repo you can reach in a GitLab group into a faithful copy of its\n   namespace tree, each on its most active branch, kept fresh with one command. *(The source\n   is GitLab today; the design is source-agnostic.)*\n2. **Knowledge layer** *(optional)*: parse the mirror into a code + dependency **graph**, add\n   **semantic search**, a council-verified **wiki**, and **connectors** to Atlassian / Figma / GitLab.\n3. **Serve**: expose it all over **MCP** and an offline interactive **graph visualizer**, so\n   agents can answer *\"where is `X` defined?\"* or *\"who calls `Y`?\"* instead of grepping.\n\nEach layer has its own guide: the mirror in **[Usage \u0026 config](docs/usage.md)**, the knowledge\nlayer and serving in **[Knowledge layer](docs/knowledge-layer.md)**, and the whole flow start to\nfinish in **[QUICKSTART](QUICKSTART.md)**.\n\n## Install\n\n```bash\npip install contextlake             # the mirroring CLI\npip install \"contextlake[kb]\"       # + the knowledge layer (graph, search, wiki, MCP server)\n```\n\nPrefer an isolated, zero-setup install? 
[`uv`](https://docs.astral.sh/uv/) fetches the right\nPython and an isolated environment for you:\n\n```bash\nuv tool install \"contextlake[kb]\"            # install the CLI on your PATH\nuvx --from \"contextlake[kb]\" contextlake --help   # …or run it once, without installing\n# pipx install \"contextlake[kb]\"             # pipx works too\n```\n\n\u003cdetails\u003e\n\u003csummary\u003eFrom source (for contributors)\u003c/summary\u003e\n\n```bash\ngit clone https://github.com/sayak-sarkar/contextlake \u0026\u0026 cd contextlake\npip install -e \".[kb]\"\n```\n\u003c/details\u003e\n\n**Prerequisites:** `git`, and, only for GitLab mirroring, an authenticated\n[`glab`](https://gitlab.com/gitlab-org/cli) (`glab auth login`). The knowledge layer needs\nneither. Once installed, `contextlake`, `python -m contextlake`, and `python3 contextlake.py`\nare equivalent.\n\n## Quickstart: one repo, no setup\n\nYou don't need GitLab or any config to try contextlake on a repo you already have.\nNo install? Run it once with [`uvx`](https://docs.astral.sh/uv/): prefix any command\nbelow with `uvx --from \"contextlake[kb]\"` (e.g. 
`uvx --from \"contextlake[kb]\" contextlake index --source .`).\n\n```bash\ncontextlake index                     # parse the current repo into a local knowledge graph\ncontextlake graph --overview --open   # open the interactive graph in your browser\ncontextlake serve                     # …or serve it to your AI IDE over MCP\n```\n\n**Wire it into your editor in one line**, no config file needed (it uses the local\n`~/.contextlake/kb` store you just built):\n\n```bash\nclaude mcp add contextlake -- contextlake serve      # Claude Code\n# zero-install variant: claude mcp add contextlake -- uvx --from \"contextlake[kb]\" contextlake serve\n```\n\n\u003cp align=\"center\"\u003e\n  \u003cimg src=\"https://raw.githubusercontent.com/sayak-sarkar/contextlake/main/docs/img/graph.jpg\" alt=\"The contextlake graph visualizer showing a repository's symbols as a navigable node graph, with a type-glyph legend, search, and a corner minimap\" width=\"840\"\u003e\n\u003c/p\u003e\n\u003cp align=\"center\"\u003e\u003cem\u003e\u003ccode\u003econtextlake graph\u003c/code\u003e, a whole codebase as one offline, navigable graph.\u003c/em\u003e\u003c/p\u003e\n\nEverything lands in a local store (`~/.contextlake/kb`), nothing leaves your machine. Index\nany path with `--source PATH`, or every git repo under a directory with `--workspace DIR`.\n\n\u003e **Want the full path**, mirror a GitLab fleet → graph → wired editor in a few minutes?\n\u003e [**QUICKSTART.md**](QUICKSTART.md) walks the whole flow.\n\n## Fleet mode: mirror a GitLab group\n\nWhere contextlake goes beyond single-repo tools is mirroring and cross-referencing a *whole\nGitLab fleet*. 
Copy the example config and set your group + workspace:\n\n```bash\ncp .contextlake.ini.example ~/.contextlake.ini\n```\n```ini\n[contextlake]\nwork_dir = ~/work\ngitlab_group = your-gitlab-group\n```\n\n```bash\ncontextlake status      # see where you stand (read-only)\ncontextlake sync        # fetch → clone → update → branches → verify → audit\n```\n\nIt carries no credentials of its own (auth rides on your existing `glab` login), so\n`.contextlake.ini` holds only non-secret settings and is gitignored by default. It runs\nacross hundreds of repos **concurrently**, with an adaptive worker pool, retries with\nbackoff, and **never stomps on the feature branch you're in the middle of**.\n\n\u003e **Behind a slow / TLS-inspecting corporate proxy** (e.g. Zscaler) where `glab`'s API calls\n\u003e time out? Set `GITLAB_TOKEN` (a `read_api` token) and contextlake enumerates projects via\n\u003e its own HTTP client, which tolerates the slow DNS where `glab`'s short dial timeout fails.\n\n## Commands at a glance\n\nRun any command as `contextlake \u003ccommand\u003e`. 
Full per-command docs: **[docs/usage.md](docs/usage.md)**.\n\n| Command | What it does |\n| --- | --- |\n| `status` | Show the workspace sync state vs GitLab (read-only) |\n| `sync` | The full pipeline: fetch → clone → update → branches → verify → audit |\n| `fetch` · `clone` · `update` | The sync steps, individually |\n| `branches` | Switch each repo to its most active branch |\n| `verify` · `audit` | Check the mirror vs GitLab; report repo health, age \u0026 drift (JSON + CSV) |\n| `bootstrap` | **Turnkey**: sync + index + connect + embed + wiki + steer |\n| `index` | Build the code/dependency graph (`--workspace`, incremental, `--watch`) |\n| `connect` | Link repos to Atlassian / Figma / GitLab items (`--watch` to keep refreshing) |\n| `embed` | Build semantic-search vectors (zero-config built-in CPU model, Ollama, or an API; incremental, `--watch`) |\n| `ingest` | Aggregate external docs into the graph + semantic store (built-in `files`/`web`/`api`/`mcp` sources, or plugins) |\n| `wiki` | LLM-synthesized, council-verified wiki pages |\n| `query` | Search the index (`--kind`, `--repo`, `--as-of \u003ccommit\u003e`) |\n| `owners` | Likely owners / SMEs for a repo (or `--path`), ranked from git history |\n| `impact` | Change-impact / blast radius: what depends on a symbol (`--hops`) |\n| `graph` | Visualize the graph, offline interactive HTML / DOT / Mermaid / JSON |\n| `serve` | Expose the graph over MCP (`--transport stdio`/`http`) |\n| `steer` | Write editor steering, `AGENTS.md`, `.mcp.json`, `.windsurfrules`, skills |\n| `lint` · `doctor` · `eval` | Graph health · environment check · retrieval-quality scoring |\n\nGlobal options apply to any command: `--dry-run` (preview without changing anything),\n`-v`/`-q` (verbosity), `--log-file PATH`, `--config PATH`, `--version`. 
Output is colorized on\na TTY and plain when piped; set `NO_COLOR` to force-disable.\n\n## Knowledge layer\n\nBeyond mirroring, the optional `contextlake.kb` layer turns your repos into a **knowledge\ngraph** and serves it to AI tools over **MCP**. It can link repos to their Atlassian / Figma /\nGitLab items, add **semantic search**, write a curated **wiki**, **visualize** the graph\n(offline interactive HTML, fleet overview, a symbol's neighbourhood, or a single repo), and\ngenerate per-tool **steering files** + a skills library. Most of it needs no model; the rest\nworks with a local Ollama or any OpenAI-compatible endpoint.\n\nOne command sets it all up:\n\n```bash\ncontextlake bootstrap --kb-config ~/.contextlake/kb.toml\n```\n\nFull guide: **[docs/knowledge-layer.md](docs/knowledge-layer.md)**.\n\n## Documentation\n\n- **[QUICKSTART.md](QUICKSTART.md)**, install → bootstrap → wire your editor, in minutes\n- **[docs/usage.md](docs/usage.md)**, every command, configuration, branch safety, scheduling\n- **[docs/knowledge-layer.md](docs/knowledge-layer.md)**, the graph, connectors, search, wiki, steering\n- **[docs/internals.md](docs/internals.md)**, architecture \u0026 internals\n- **[docs/releasing.md](docs/releasing.md)**, maintainer runbook: versioning, tagging, publishing\n- **[CHANGELOG.md](CHANGELOG.md)** · **[ROADMAP.md](ROADMAP.md)** · **[CONTRIBUTING.md](CONTRIBUTING.md)** · **[BRANDING.md](BRANDING.md)**\n\n## License\n\nMIT, see [LICENSE](LICENSE). Pebble the otter is the project mascot; *deep context, clear answers.*\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsayak-sarkar%2Fcontextlake","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fsayak-sarkar%2Fcontextlake","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsayak-sarkar%2Fcontextlake/lists"}