{"id":47314659,"url":"https://github.com/dvcdsys/code-index","last_synced_at":"2026-05-04T17:02:02.926Z","repository":{"id":345049666,"uuid":"1184185044","full_name":"dvcdsys/code-index","owner":"dvcdsys","description":"Semantic code search powered by embeddings. Search your codebase by meaning, not text. Self-hosted, works with any AI agent.","archived":false,"fork":false,"pushed_at":"2026-05-03T11:10:58.000Z","size":33877,"stargazers_count":7,"open_issues_count":2,"forks_count":1,"subscribers_count":1,"default_branch":"main","last_synced_at":"2026-05-03T13:14:29.589Z","etag":null,"topics":["ai-tools","chromadb","claude-code","cli","code-index","code-navigation","code-search","developer-tools","embeddings","fastapi","mcp","self-hosted","semantic-search","tree-sitter","vector-search"],"latest_commit_sha":null,"homepage":"https://hub.docker.com/r/dvcdsys/code-index","language":"Go","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/dvcdsys.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":"CITATION.cff","codeowners":null,"security":"SECURITY.md","support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-03-17T10:38:44.000Z","updated_at":"2026-05-03T12:31:17.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/dvcdsys/code-index","commit_stats":null,"previous_names":["dvcdsys/code-index"],"tags_count":16,"template":false,"template_full_name":null,"purl":"pkg:github/dvcdsys/code-index","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dvcdsys%2Fcode-index","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dvcdsys%2Fcode-index/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dvcdsys%2Fcode-index/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dvcdsys%2Fcode-index/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/dvcdsys","download_url":"https://codeload.github.com/dvcdsys/code-index/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dvcdsys%2Fcode-index/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":32616270,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-04T10:08:07.713Z","status":"ssl_error","status_checked_at":"2026-05-04T10:08:02.005Z","response_time":58,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["ai-tools","chromadb","claude-code","cli","code-index","code-navigation","code-search","developer-tools","embeddings","fastapi","mcp","self-hosted","semantic-search","tree-sitter","vector-search"],"created_at":"2026-03-17T14:07:38.014Z","updated_at":"2026-05-04T17:02:02.916Z","avatar_url":"https://github.com/dvcdsys.png","language":"Go","funding_links":[],"categories":[],"sub_categories":[],"readme":"[![CI: Server](https://github.com/dvcdsys/code-index/actions/workflows/ci-server.yml/badge.svg)](https://github.com/dvcdsys/code-index/actions/workflows/ci-server.yml)\n[![CI: CLI](https://github.com/dvcdsys/code-index/actions/workflows/ci-cli.yml/badge.svg)](https://github.com/dvcdsys/code-index/actions/workflows/ci-cli.yml)\n[![CodeQL](https://github.com/dvcdsys/code-index/actions/workflows/codeql.yml/badge.svg)](https://github.com/dvcdsys/code-index/actions/workflows/codeql.yml)\n[![Security](https://github.com/dvcdsys/code-index/actions/workflows/security.yml/badge.svg)](https://github.com/dvcdsys/code-index/actions/workflows/security.yml)\n\n```\n ██████╗██╗██╗  ██╗\n██╔════╝██║╚██╗██╔╝\n██║     ██║ ╚███╔╝\n██║     ██║ ██╔██╗\n╚██████╗██║██╔╝ ██╗\n ╚═════╝╚═╝╚═╝  ╚═╝  Code IndeX\n```\n\n[![Release: Server](https://github.com/dvcdsys/code-index/actions/workflows/release-server.yml/badge.svg)](https://github.com/dvcdsys/code-index/actions/workflows/release-server.yml)\n[![Release: CLI](https://github.com/dvcdsys/code-index/actions/workflows/release-cli.yml/badge.svg)](https://github.com/dvcdsys/code-index/actions/workflows/release-cli.yml)\n[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)\n[![Docker Hub](https://img.shields.io/docker/pulls/dvcdsys/code-index)](https://hub.docker.com/r/dvcdsys/code-index)\n\nSearch your codebase by meaning, not just text. Self-hosted, embeddings-based, works with any agent or terminal.\n\n```bash\ncix search \"authentication middleware\"\ncix search \"database retry logic\" --in ./api --lang go\ncix symbols \"UserService\" --kind class\n```\n\n---\n\n## Why\n\nGrep and fuzzy file search work fine for small projects. At scale they break down:\n\n- You have to know what a thing is called to find it\n- Results flood with noise from unrelated files\n- Agents waste tokens scanning files that aren't relevant\n\n`cix` indexes your code into a vector store using [CodeRankEmbed](https://huggingface.co/nomic-ai/CodeRankEmbed) — a model purpose-built for code retrieval. Search queries return ranked snippets with file paths and line numbers, not raw file lists.\n\n---\n\n## Architecture\n\n```\ncix CLI (Go)\n├── init      → register project + index + start file watcher\n├── search    → semantic search (embeddings)\n├── symbols   → symbol lookup by name (SQLite)\n├── files     → file path search\n├── summary   → project overview\n├── reindex   → manual reindex trigger\n└── watch     → fsnotify daemon → auto reindex on changes\n\ncix-server (Go) — server/\n├── llama-server (llama.cpp sidecar) → embeddings (CodeRankEmbed Q8_0 GGUF, 768d)\n├── chromem-go                       → vector store (cosine similarity)\n├── gotreesitter                     → AST chunking (200+ languages)\n└── modernc.org/sqlite               → project metadata, symbols, file hashes\n```\n\nThe server is a pure-Go static binary. The CLI is a thin Go binary that talks to it over HTTP.\nThe `llama-server` sidecar (from upstream [llama.cpp](https://github.com/ggml-org/llama.cpp)) handles embeddings — the Go process starts it as a child process and communicates via Unix socket.\n\n---\n\n## Quick Start\n\n### 1. Start the API Server\n\nThree deployment options:\n\n| Mode | Best for | GPU acceleration | Prerequisites |\n|------|----------|-----------------|---------------|\n| **Docker (CPU)** | any OS, development | none | Docker |\n| **Docker (CUDA)** | NVIDIA GPU servers | CUDA | Docker, NVIDIA Container Toolkit |\n| **Native (macOS)** | Apple Silicon — full Metal GPU | Metal | Go 1.24+, Xcode CLT |\n\n#### Docker (CPU)\n\n```bash\ngit clone https://github.com/dvcdsys/code-index \u0026\u0026 cd code-index\ncp .env.example .env\n# Edit .env — set CIX_API_KEY to a random string\ndocker compose up -d\n```\n\n```bash\ncurl http://localhost:21847/health   # → {\"status\": \"ok\"}\n```\n\n#### Docker (CUDA — NVIDIA GPU)\n\nSee [GPU Acceleration (CUDA)](#gpu-acceleration-cuda) section below.\n\n```bash\ndocker compose -f docker-compose.cuda.yml up -d\n```\n\n#### Native macOS (Apple Silicon — Metal GPU)\n\n\u003e **Why not Docker?** Docker Desktop on macOS runs containers inside a Linux VM — Metal GPU is **not accessible** from within a container. For full Apple Silicon GPU acceleration you must run the server natively.\n\n**Prerequisites:** Go 1.24+, Xcode Command Line Tools\n\n```bash\nxcode-select --install   # if not already installed\n```\n\n**Step 1 — Build binary + download Metal-enabled llama-server (once)**\n\n```bash\ncd server\nmake bundle\n# Outputs:\n#   dist/cix-darwin-arm64/cix-server\n#   dist/cix-darwin-arm64/llama/llama-server  (includes libggml-metal.dylib)\n```\n\n**Step 2 — Configure**\n\n```bash\ncp .env.example .env\n# Edit .env — set at minimum:\n#   CIX_API_KEY=cix_\u003cyour-random-key\u003e\n#   CIX_N_GPU_LAYERS=99      ← offload all layers to Metal\n```\n\n**Step 3 — Run**\n\n```bash\ncd server \u0026\u0026 make run\n# Reads .env from repo root, sets CIX_LLAMA_BIN_DIR automatically.\n```\n\n```bash\ncurl http://localhost:21847/health   # → {\"status\": \"ok\"}\n```\n\n| Variable | Recommended | Notes |\n|---|---|---|\n| `CIX_N_GPU_LAYERS` | `99` | Offload all layers to Metal; `0` = CPU only |\n| `CIX_LLAMA_BIN_DIR` | set by `make run` | Path to the `llama-server` binary dir |\n| `CIX_EMBEDDINGS_ENABLED` | `true` | Enable GPU embeddings (default) |\n\n\u003e [!TIP]\n\u003e `make run` always runs `make bundle` first (no-op if already built), so it's safe to use after any `git pull`.\n\n**Auto-start with launchd** (optional — run server in the background on login):\n\n```bash\ncat \u003e ~/Library/LaunchAgents/com.cix.server.plist \u003c\u003c 'EOF'\n\u003c?xml version=\"1.0\" encoding=\"UTF-8\"?\u003e\n\u003c!DOCTYPE plist PUBLIC \"-//Apple//DTD PLIST 1.0//EN\" \"http://www.apple.com/DTDs/PropertyList-1.0.dtd\"\u003e\n\u003cplist version=\"1.0\"\u003e\u003cdict\u003e\n  \u003ckey\u003eLabel\u003c/key\u003e\u003cstring\u003ecom.cix.server\u003c/string\u003e\n  \u003ckey\u003eProgramArguments\u003c/key\u003e\n  \u003carray\u003e\u003cstring\u003e/ABSOLUTE/PATH/TO/server/dist/cix-darwin-arm64/cix-server\u003c/string\u003e\u003c/array\u003e\n  \u003ckey\u003eEnvironmentVariables\u003c/key\u003e\n  \u003cdict\u003e\n    \u003ckey\u003eCIX_API_KEY\u003c/key\u003e\u003cstring\u003eYOUR_KEY\u003c/string\u003e\n    \u003ckey\u003eCIX_LLAMA_BIN_DIR\u003c/key\u003e\u003cstring\u003e/ABSOLUTE/PATH/TO/server/dist/cix-darwin-arm64/llama\u003c/string\u003e\n    \u003ckey\u003eCIX_N_GPU_LAYERS\u003c/key\u003e\u003cstring\u003e99\u003c/string\u003e\n    \u003ckey\u003eCIX_PORT\u003c/key\u003e\u003cstring\u003e21847\u003c/string\u003e\n    \u003ckey\u003eCIX_SQLITE_PATH\u003c/key\u003e\u003cstring\u003e/Users/YOUR_USER/.cix/data/sqlite/projects.db\u003c/string\u003e\n    \u003ckey\u003eCIX_CHROMA_PERSIST_DIR\u003c/key\u003e\u003cstring\u003e/Users/YOUR_USER/.cix/data/chroma\u003c/string\u003e\n    \u003ckey\u003eCIX_GGUF_CACHE_DIR\u003c/key\u003e\u003cstring\u003e/Users/YOUR_USER/.cix/data/models\u003c/string\u003e\n  \u003c/dict\u003e\n  \u003ckey\u003eRunAtLoad\u003c/key\u003e\u003ctrue/\u003e\n  \u003ckey\u003eKeepAlive\u003c/key\u003e\u003ctrue/\u003e\n  \u003ckey\u003eStandardOutPath\u003c/key\u003e\u003cstring\u003e/tmp/cix-server.log\u003c/string\u003e\n  \u003ckey\u003eStandardErrorPath\u003c/key\u003e\u003cstring\u003e/tmp/cix-server.err\u003c/string\u003e\n\u003c/dict\u003e\u003c/plist\u003e\nEOF\n# Replace /ABSOLUTE/PATH/TO and YOUR_USER/YOUR_KEY with real values, then:\nlaunchctl load ~/Library/LaunchAgents/com.cix.server.plist\nlaunchctl start com.cix.server\n```\n\n### 2. Install the CLI\n\n**Option A: one-line installer (macOS / Linux)**\n\n```bash\ncurl -fsSL https://raw.githubusercontent.com/dvcdsys/code-index/main/install.sh | bash\n```\n\n**Option B: from source**\n\n```bash\ncd cli\nmake build \u0026\u0026 make install   # → /usr/local/bin/cix\n```\n\nOr without Make:\n\n```bash\ncd cli \u0026\u0026 go build -o cix . \u0026\u0026 sudo mv cix /usr/local/bin/\n```\n\n### 3. Configure\n\n```bash\n# Point cix at your server (API key is in .env)\ncix config set api.url http://localhost:21847\ncix config set api.key $(grep CIX_API_KEY .env | cut -d= -f2)\n```\n\n### 4. Index a Project\n\n```bash\ncd /path/to/your/project\ncix init          # registers, indexes, starts file watcher daemon\ncix status        # wait until: Status: ✓ Indexed\n```\n\n### 5. Search\n\n```bash\ncix search \"authentication middleware\"\ncix search \"error handling\" --in ./api\ncix symbols \"handleRequest\" --kind function\ncix files \"config\"\ncix summary\n```\n\n---\n\n## CLI Reference\n\n### Project Management\n\n\n| Command | Description |\n|---------|-------------|\n| `cix init [path]` | Register + index + start file watcher |\n| `cix status` | Show indexing status and progress |\n| `cix list` | List all indexed projects |\n| `cix reindex [--full]` | Trigger manual reindex |\n| `cix summary` | Project overview: languages, directories, symbols |\n\n### Search\n\n```bash\n# Semantic search — natural language, finds by meaning\ncix search \u003cquery\u003e [flags]\n  --in \u003cpath\u003e          restrict to file or directory (repeatable)\n  --exclude \u003cpath\u003e     exclude file or directory (repeatable)\n  --lang \u003clanguage\u003e    filter by language (repeatable)\n  --limit, -l \u003cn\u003e      max results (default: 10)\n  --min-score \u003c0-1\u003e    minimum relevance score (default: 0.4)\n  -p \u003cpath\u003e            project path (default: cwd)\n\n# Symbol search — fast lookup by name\ncix symbols \u003cname\u003e [flags]\n  --kind \u003ctype\u003e        function | class | method | type (repeatable)\n  --limit, -l \u003cn\u003e      max results (default: 20)\n\n# File search\ncix files \u003cpattern\u003e [--limit \u003cn\u003e]\n```\n\n### File Watcher\n\n```bash\ncix watch [path]             # start background daemon\ncix watch --foreground       # run in terminal (Ctrl+C to stop)\ncix watch stop               # stop daemon\ncix watch status             # check if running\n```\n\nThe watcher monitors the project with `fsnotify`, debounces events (5s), and triggers incremental reindexing automatically. Logs: `~/.cix/logs/watcher.log`.\n\n### Configuration\n\n```bash\ncix config show              # print current config\ncix config set \u003ckey\u003e \u003cval\u003e   # set a value\ncix config path              # show config file location\n```\n\nConfig file: `~/.cix/config.yaml`\n\n| Key | Default | Description |\n|-----|---------|-------------|\n| `api.url` | `http://localhost:21847` | API server URL |\n| `api.key` | — | Bearer token for API auth (required) |\n| `watcher.debounce_ms` | `5000` | Delay in ms before reindex is triggered after a file change |\n| `indexing.batch_size` | `20` | Number of files sent to the server per indexing batch |\n\n---\n\n## Agent Integration\n\n`cix` is designed to be called by AI agents (Claude, GPT, Cursor, custom agents) as a shell tool. Agents run `cix search` instead of Grep/Glob — getting ranked, relevant snippets rather than raw file dumps.\n\n### Claude Code\n\nInstall the bundled skill so Claude knows to use `cix` automatically:\n\n```bash\ncp -r skills/cix ~/.claude/skills/cix\n```\n\nThen in any Claude Code session:\n\n```\n/cix\n```\n\nThis loads search guidance into context. Claude will use `cix search` instead of Grep.\n\nTo activate in every session without typing `/cix`, add to `~/.claude/CLAUDE.md`:\n\n```markdown\n## Code search\nUse `cix` for all code search instead of Grep/Glob:\n- `cix search \"query\"` — semantic search by meaning\n- `cix symbols \"Name\" --kind function` — find symbol definitions\n- `cix files \"pattern\"` — find files by path\n- `cix summary` — project overview\nRun `cix init` on first use in a project.\n```\n\n### Other Agents\n\nSame pattern — give the agent access to shell execution and describe the commands:\n\n```\nTool: shell\nUsage: cix search \"what you're looking for\" [--in ./subdir] [--lang python]\nReturns: ranked code snippets with file paths and line numbers\n```\n\n### Typical Agent Workflow\n\n```bash\n# First time in a project\ncix init /path/to/project\n\n# Explore\ncix summary\ncix search \"main entry point\"\n\n# Find specific code\ncix search \"JWT token validation\"\ncix symbols \"ValidateToken\" --kind function\n\n# Navigate\ncix search \"who calls ValidateToken\"\ncix search \"error handling in auth flow\" --in ./api\n```\n\n---\n\n## How Indexing Works\n\n**Chunking** — tree-sitter parses code into semantic chunks (functions, classes, methods). Unsupported languages fall back to a sliding window (2000 chars, 256 char overlap).\n\nSupported languages: Python, TypeScript, JavaScript, Go, Rust, Java (+ 40+ others via fallback).\n\n**Embeddings** — each chunk is encoded with a GGUF build of CodeRankEmbed (default: [awhiteside/CodeRankEmbed-Q8_0-GGUF](https://huggingface.co/awhiteside/CodeRankEmbed-Q8_0-GGUF); 768d, 8192 token context, ~145MB on disk) via the `llama-server` sidecar (llama.cpp). Queries get a `\"Represent this query for searching relevant code: \"` prefix for asymmetric retrieval.\n\n**Incremental reindex** — uses SHA256 file hashes. Only new or changed files are re-embedded. Deleted files are removed from the index.\n\n**Filtering** — respects `.gitignore` and `.cixignore`, skips common dirs (`node_modules`, `.git`, `.venv`, etc.), skips files \u003e512KB and empty files. Per-project configuration via `.cixconfig.yaml` (see below).\n\n---\n\n## Tuning Search Quality\n\n### `--min-score` threshold\n\n`cix` defaults to `--min-score 0.4`. This is calibrated for **CodeRankEmbed-Q8_0** with the path-aware embedding format (`CIX_EMBED_INCLUDE_PATH=true`, default).\n\nA typical score landscape on this codebase:\n\n| Match strength | Score range | Action |\n|---|---|---|\n| Exact symbol or filename match | 0.65 – 0.80 | rare; very high confidence |\n| Strong path-aware concept match | 0.50 – 0.65 | typical \"good\" match for `cix search \"cli watch daemon\"` |\n| Weaker concept / partial path overlap | 0.40 – 0.50 | typical for ambiguous or multi-token queries |\n| Likely unrelated noise | \u003c 0.40 | filtered out by default |\n\n**When to lower the threshold**:\n\n- The query returns `No results` but you know matching code exists — try `--min-score 0.25`\n- Your query is intentionally vague (exploring an unfamiliar codebase) — `--min-score 0.2`\n- Single-word identifier queries on rare names\n\n**When to raise the threshold**:\n\n- Agent context is filling up with weak matches — `--min-score 0.5`\n- You only want clear top hits — `--min-score 0.6`\n\n\u003e [!NOTE]\n\u003e CodeRankEmbed is **asymmetric**: queries get a `\"Represent this query for searching relevant code: \"` prefix, which puts query and passage vectors into separate regions of the embedding space. Cosine similarities are systematically lower than for symmetric models — a \"strong\" match here is 0.55, not 0.80. Don't compare these numbers to thresholds quoted for OpenAI / Voyage / generic sentence-transformers.\n\n\u003e [!TIP]\n\u003e If you switched embedding models or toggled `CIX_EMBED_INCLUDE_PATH`, rerun `cix reindex --full` and recalibrate. Old vectors and new vectors live in the same store but score differently.\n\n### `--exclude` for noisy directories\n\nRepos with vendored code, fixtures, or legacy migrations can pull unrelated paths into top results because path tokens contribute to scoring. Two options:\n\n```bash\n# One-off exclude for a single search\ncix search \"main entry point\" --exclude vendor --exclude bench/fixtures\n\n# Permanent exclude — add to .cixignore (skips indexing entirely)\necho \"vendor/\" \u003e\u003e .cixignore\necho \"bench/fixtures/\" \u003e\u003e .cixignore\ncix reindex --full\n```\n\n`.cixignore` is preferred for directories you never want in results — they don't take up index space. `--exclude` is a per-query escape hatch.\n\n---\n\n## Per-Project Configuration\n\n### `.cixignore` — exclude files from indexing\n\nWorks exactly like `.gitignore` (same syntax, same nesting rules). Place it in the project root or any subdirectory. Patterns from `.cixignore` are merged with `.gitignore` — you don't need to duplicate rules.\n\nUse `.cixignore` when you want to exclude files from the index that are **not** excluded by `.gitignore` (e.g., vendored code, generated files, large test fixtures).\n\n```gitignore\n# .cixignore\napi/smart-contracts/\ngenerated/\n*.pb.go\ntestdata/fixtures/\n```\n\nNested `.cixignore` files work like nested `.gitignore` — they apply to their directory and below, without affecting sibling directories.\n\nThe file watcher automatically triggers a full reindex when `.cixignore` is created, modified, or deleted.\n\n### `.cixconfig.yaml` — project-level settings\n\nPlace this file in the project root. Currently supports automatic git submodule exclusion.\n\n```yaml\n# .cixconfig.yaml\nignore:\n  submodules: true   # automatically exclude all git submodule paths\n```\n\nWhen `ignore.submodules` is `true`, cix reads `.gitmodules` and excludes all submodule paths from indexing. No git binary is required — the file is parsed directly.\n\nThis is useful for projects with Foundry/Forge dependencies, vendored submodules, or any repo where submodules contain thousands of files you don't want indexed.\n\n**Example:** a project with 228 own files and 3,400+ files in nested submodules — after adding `ignore.submodules: true`, only the 228 project files are indexed.\n\nThe file watcher triggers a full reindex when `.cixconfig.yaml` changes.\n\n---\n\n## Configuration Reference\n\n### Server Environment Variables (`.env`)\n\nSee `.env.example` for a complete template.\n\n| Variable | Default | Description |\n|----------|---------|-------------|\n| `CIX_API_KEY` | — | Bearer token for API auth |\n| `CIX_PORT` | `21847` | API server port |\n| `CIX_EMBEDDING_MODEL` | `awhiteside/CodeRankEmbed-Q8_0-GGUF` | HuggingFace GGUF repo |\n| `CIX_MAX_FILE_SIZE` | `524288` | Skip files larger than this (bytes) |\n| `CIX_EXCLUDED_DIRS` | `node_modules,.git,.venv,...` | Comma-separated dirs to skip |\n| `CIX_N_GPU_LAYERS` | auto | `99` offloads all layers to GPU; `0` forces CPU |\n| `CIX_GGUF_CACHE_DIR` | `/data/models` | Where the GGUF file is cached |\n| `CIX_LLAMA_BIN_DIR` | `/app` | Directory containing `llama-server` binary |\n| `CIX_LLAMA_STARTUP_TIMEOUT` | `60` | Seconds to wait for llama-server ready |\n| `CIX_EMBEDDINGS_ENABLED` | `true` | Set to `false` to skip embeddings (CPU-only mode) |\n| `CIX_CHROMA_PERSIST_DIR` | `/data/chroma` | Vector store path |\n| `CIX_SQLITE_PATH` | `/data/sqlite/projects.db` | SQLite database path |\n\nData is stored in `/data` inside the container — mount a volume to persist it.\n\n### Resource Usage\n\n| | Local (native) | Docker (CPU) | CUDA |\n|--|----------------|--------------|------|\n| Memory (idle) | ~1GB | ~1GB | ~1GB |\n| Memory (indexing) | up to 2GB | up to 2GB | up to 2GB |\n| CPU | no limit | `CPUS` env var (default: 2) | unlimited |\n| GPU | Metal (Apple Silicon) | none | NVIDIA CUDA |\n| Disk | `~/.cix/data/` (~50-200MB/project) | same | same |\n| Auto-restart | no (use launchd/systemd) | yes | yes |\n\n### Switching Embedding Models\n\nThe server ships with `awhiteside/CodeRankEmbed-Q8_0-GGUF` — a Q8-quantized build of CodeRankEmbed (137M params, 768 dims, ~145MB on disk, ~650MB idle VRAM/RAM). Inference runs via the `llama-server` sidecar (llama.cpp), so **only GGUF repositories are supported**. Plain PyTorch/`sentence-transformers` repos will not work.\n\nTo switch models:\n1. Stop the server (`make server-local-stop` or `make server-docker-stop`).\n2. Set `EMBEDDING_MODEL` in `.env` to a Hugging Face repo that contains a `.gguf` file, for example:\n   ```bash\n   # code-specialised (default)\n   EMBEDDING_MODEL=awhiteside/CodeRankEmbed-Q8_0-GGUF\n   # smaller general-purpose alternative\n   EMBEDDING_MODEL=nomic-ai/nomic-embed-text-v1.5-GGUF\n   ```\n3. *(Optional)* Pre-cache the new model into the Docker image:\n   `docker compose build --build-arg EMBEDDING_MODEL=\u003crepo\u003e`.\n4. Start the server and re-index your projects.\n\n\u003e [!NOTE]\n\u003e ChromaDB and SQLite paths are suffixed by a sanitised form of the model name (e.g. `projects.db_awhiteside_coderankembed_q8_0_gguf`). This isolates vector spaces per model, so switching back and forth keeps old indices intact and avoids dim-mismatch errors.\n\n\u003e [!TIP]\n\u003e **Apple Silicon:** Docker cannot access Metal GPU — run natively with `cd server \u0026\u0026 make run` (see [Native macOS (Apple Silicon — Metal GPU)](#native-macos-apple-silicon--metal-gpu) above). The bundled `llama-server` includes `libggml-metal.dylib`; set `CIX_N_GPU_LAYERS=99` for full Metal offload.\n\u003e **Linux NVIDIA:** use the CUDA image (`docker-compose.cuda.yml`). Force CPU with `CIX_N_GPU_LAYERS=0`.\n\n---\n\n## Server Management\n\n```bash\ndocker compose up -d                           # start (CPU)\ndocker compose -f docker-compose.cuda.yml up -d  # start (CUDA)\ndocker compose logs -f                         # tail logs\ndocker compose down                            # stop\n```\n\nDeveloper builds (from source):\n\n```bash\ncd server \u0026\u0026 make build        # build cix-server binary\ncd server \u0026\u0026 make bundle       # build + fetch llama-server\ncd server \u0026\u0026 make test-gate    # parity gate (requires GGUF)\nmake docker-build-cuda         # build + push CUDA image\n```\n\n---\n\n## Building and Publishing to Docker Hub\n\n```bash\ndocker login\nmake docker-build-cuda   # builds + pushes server/Dockerfile.cuda → dvcdsys/code-index:go-cu128\n```\n\nPre-built images on Docker Hub:\n\n| Tag | Architecture | Use case |\n|-----|-------------|----------|\n| `dvcdsys/code-index:latest` | linux/amd64 + linux/arm64 | CPU, `CIX_EMBEDDINGS_ENABLED=false` |\n| `dvcdsys/code-index:cu128` | linux/amd64 | NVIDIA GPU (CUDA 12.8), full embeddings |\n| `dvcdsys/code-index:0.2-python-legacy` | linux/amd64 | Frozen Python build, rollback only |\n\nSee `doc/DOCKER_TAGS.md` for the full tag lifecycle policy.\n\n---\n\n## REST API\n\nAll endpoints except `/health` require `Authorization: Bearer \u003capi_key\u003e`.\n\n```bash\nGET  /health                                    # liveness check\nGET  /api/v1/status                             # service status\n\nPOST /api/v1/projects                           # create project\nGET  /api/v1/projects                           # list projects\nGET  /api/v1/projects/{id}                      # project details\nDELETE /api/v1/projects/{id}                    # delete project + index\n\nPOST /api/v1/projects/{id}/index                # trigger indexing\nGET  /api/v1/projects/{id}/index/status         # indexing progress\nPOST /api/v1/projects/{id}/index/cancel         # cancel indexing\n\nPOST /api/v1/projects/{id}/search               # semantic search\nPOST /api/v1/projects/{id}/search/symbols       # symbol search\nPOST /api/v1/projects/{id}/search/files         # file path search\nGET  /api/v1/projects/{id}/summary              # project overview\n```\n\n---\n\n## Troubleshooting\n\n**`API key not set`**\n```bash\ncix config set api.key $(grep CIX_API_KEY /path/to/code-index/.env | cut -d= -f2)\n```\n\n**`connection refused`**\n```bash\ncurl http://localhost:21847/health              # check if server is up\ndocker compose up -d                           # start (CPU)\ndocker compose -f docker-compose.cuda.yml up -d  # start (CUDA)\n```\n\n**`project not found`**\n```bash\ncix init /path/to/project\n```\n\n**Watcher not triggering reindex**\n```bash\ncix watch status\ncat ~/.cix/logs/watcher.log\ncix watch stop \u0026\u0026 cix watch /path/to/project\n```\n\n**Search returns no results**\n- Check project is indexed: `cix status`\n- Lower the threshold: `cix search \"query\" --min-score 0.2` (default is `0.4`; see [Tuning Search Quality](#tuning-search-quality))\n- Docker mode: run `cix list` to verify the project is registered\n\n---\n\n## Releases\n\nCLI and server ship on independent tag streams:\n\n| Component | Tag pattern | Workflow | Artifact |\n|---|---|---|---|\n| CLI (`cix`) | `cli/v*` (e.g. `cli/v0.4.0`) | `release-cli.yml` | `cix-{darwin,linux}-{amd64,arm64}.tar.gz` on a GitHub Release |\n| Server (`cix-server`) | `server/v*` (e.g. `server/v0.3.0`) | `release-server.yml` | Docker images on Docker Hub (`:latest`, `:cu128`) |\n\nBare `v*` tags are the historical pre-split CLI line — the installer\nstill falls back to them when no `cli/v*` release exists, but no new\nbare-`v*` tags should be created.\n\n### Cutting a CLI release\n\n```bash\ngit tag cli/v0.4.0\ngit push origin cli/v0.4.0\n```\n\nGitHub Actions builds binaries for macOS + Linux (amd64 + arm64),\nuploads them to a release named `cli/v0.4.0`, and the installer\nautomatically picks them up on the next run.\n\n### Cutting a server release\n\nSee `doc/DOCKER_TAGS.md` and the T9 step in `.claude/CLAUDE.md`.\n\n### Local cross-build (no release)\n\n```bash\ncd cli\nmake release VERSION=v0.4.0\n```\n\nProduces archives in `cli/dist/` plus `checksums.txt`. Useful for\ntesting the artifact format before pushing a tag.\n\nSupported targets: `darwin-arm64`, `darwin-amd64`, `linux-arm64`, `linux-amd64`.\n\n---\n\n## GPU Acceleration (CUDA)\n\nA CUDA-enabled image is available for servers with NVIDIA GPUs. Inference runs on GPU automatically — no configuration needed.\n\n### VRAM Usage (CodeRankEmbed Q8_0 GGUF, RTX 3090)\n\nWith the GGUF backend the footprint is near-constant: weights (~200-250 MB) plus\nthe pre-allocated context (`n_ctx=8192`, ~200-400 MB) give a **~0.5-0.7 GB**\nidle draw. Embedding calls do not spike VRAM the way fp16 PyTorch attention\nused to — sequence length and batch size only change latency, not peak memory.\n\n`MAX_CHUNK_TOKENS` still caps the length of each code chunk (1 token ≈ 4 chars)\nand must stay ≤ `n_ctx` (8192). `MAX_EMBEDDING_CONCURRENCY` defaults to `5` —\nthe indexing queue ships chunks in parallel; the llama-server sidecar still\nserialises requests through one context, but pipelining host-side prep with\ndevice inference at this depth saturates the GPU without measurable latency\ncost. Drop to `1` only if you observe contention.\n\nSee [`doc/vram-profiling.md`](doc/vram-profiling.md) for methodology and numbers.\n\n**Docker Hub:** [`dvcdsys/code-index:cu128`](https://hub.docker.com/r/dvcdsys/code-index/tags)\n\nTags: `cu128` (stable) and `v\u003cversion\u003e-cu128` (pinned). Image size: ~1.66 GB\n(3-stage build: nvidia/cuda:12.8.1-base + libcublas + llama-server binaries + Go binary).\n\nSee `doc/DOCKER_TAGS.md` for the full tag lifecycle.\n\n**Host requirements:**\n\n- NVIDIA GPU with driver **\u003e= 520** (CUDA 12.x compatible)\n- [NVIDIA Container Toolkit](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/install-guide.html) installed on the host\n\n**Docker Compose:**\n\n```bash\ndocker compose -f docker-compose.cuda.yml up -d\n```\n\n**Portainer:** use `portainer-stack-cuda.yml` — deploy as a new stack with `API_KEY` env variable set.\n\n---\n\n## License\n\nMIT","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdvcdsys%2Fcode-index","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fdvcdsys%2Fcode-index","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdvcdsys%2Fcode-index/lists"}