{"id":49762605,"url":"https://github.com/julymetodiev/veles","last_synced_at":"2026-05-11T10:00:40.064Z","repository":{"id":356725073,"uuid":"1233793191","full_name":"julymetodiev/Veles","owner":"julymetodiev","description":"Fast hybrid (BM25 + semantic) local code search for AI agents — pure Rust, persistent index, MCP/gRPC servers, tree-sitter symbols","archived":false,"fork":false,"pushed_at":"2026-05-11T08:07:43.000Z","size":24617,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-05-11T10:00:22.913Z","etag":null,"topics":["ai-agents","bm25","cli","code-search","embeddings","grpc","hybrid-search","mcp","model-context-protocol","model2vec","rag","rust","semantic-search","tree-sitter"],"latest_commit_sha":null,"homepage":"https://369.is","language":"Rust","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/julymetodiev.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-05-09T11:14:52.000Z","updated_at":"2026-05-11T08:07:44.000Z","dependencies_parsed_at":null,"dependency_job_id":"ab47e978-8521-4dc9-a3ed-c2de3ac3244e","html_url":"https://github.com/julymetodiev/Veles","commit_stats":null,"previous_names":["julymetodiev/veles"],"tags_count":14,"template":false,"template_full_name":null,"purl":"pkg:github/julymetodiev/Veles","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/julymetodiev%2FVeles","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/julymetodiev%2FVeles/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/julymetodiev%2FVeles/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/julymetodiev%2FVeles/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/julymetodiev","download_url":"https://codeload.github.com/julymetodiev/Veles/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/julymetodiev%2FVeles/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":32889972,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-10T13:40:02.631Z","status":"online","status_checked_at":"2026-05-11T02:00:05.975Z","response_time":120,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["ai-agents","bm25","cli","code-search","embeddings","grpc","hybrid-search","mcp","model-context-protocol","model2vec","rag","rust","semantic-search","tree-sitter"],"created_at":"2026-05-11T10:00:20.598Z","updated_at":"2026-05-11T10:00:40.053Z","avatar_url":"https://github.com/julymetodiev.png","language":"Rust","funding_links":[],"categories":[],"sub_categories":[],"readme":"\u003cp align=\"center\"\u003e\n  \u003cimg src=\".assets/veles-banner.png\" alt=\"Veles\" width=\"640\"\u003e\n\u003c/p\u003e\n\n# Veles\n\n[![Crates.io](https://img.shields.io/crates/v/veles-cli.svg?label=veles-cli)](https://crates.io/crates/veles-cli)\n[![Crates.io](https://img.shields.io/crates/v/veles-core.svg?label=veles-core)](https://crates.io/crates/veles-core)\n[![docs.rs](https://docs.rs/veles-core/badge.svg)](https://docs.rs/veles-core)\n[![License: MIT](https://img.shields.io/badge/license-MIT-blue.svg)](https://opensource.org/licenses/MIT)\n\nFast, hybrid (BM25 + semantic) local code search for AI agents and humans, written in pure Rust.\n\nVeles runs entirely on CPU — no GPU, no transformer forward pass at query time. Queries return in tens of milliseconds against a persistent on-disk index, with tree-sitter-aware symbol lookups, pipe-friendly output formats, and built-in MCP / gRPC servers for integration with Claude, Cursor, or anything else that speaks JSON-RPC. Static embeddings come from the [potion](https://huggingface.co/minishlab) family via [model2vec-rs](https://github.com/MinishLab/model2vec-rs).\n\nOriginally inspired by [Semble](https://github.com/MinishLab/semble) — Veles started as a Rust port of the same hybrid retrieval recipe and has grown to add persistent + incremental indexing, tree-sitter `symbols` / `defs` / `refs`, six pipe-friendly output formats, glob/language filters, gRPC, and shell completions.\n\n\u003cp align=\"center\"\u003e\n  \u003cimg src=\".assets/tui-demo.gif\" alt=\"Veles TUI — live hybrid search with preview pane\" width=\"900\"\u003e\n\u003c/p\u003e\n\n\u003cp align=\"center\"\u003e\n  \u003csub\u003e\u003ccode\u003eveles tui\u003c/code\u003e — live hybrid search, ~10ms per keystroke.\u003c/sub\u003e\n\u003c/p\u003e\n\n## Interfaces\n\n- **CLI** — `veles search \"query\" ./my-repo`\n- **MCP server** — stdio JSON-RPC for AI agent integration (Claude, Cursor, etc.)\n- **gRPC** — tonic-based service with `Index`, `Search`, `FindRelated`, `GetStats` RPCs\n\n## Features\n\n- **Persistent index** under `\u003crepo\u003e/.veles/` — searches reuse the cache and finish in tens of milliseconds. Incremental `update` keeps embeddings of unchanged files.\n- **Hybrid search** with Reciprocal Rank Fusion (RRF) blending BM25 and semantic scores\n- **Tree-sitter symbol commands** — `symbols` / `defs` / `refs` for Rust, Python, JavaScript, TypeScript, Go\n- **Identifier-aware tokenizer** — splits camelCase, snake_case, and mixed-script names\n- **Query-type detection** — symbol queries lean BM25, natural language leans semantic\n- **Definition boosting** — promotes chunks that define the queried symbol\n- **Path penalties** — demotes test files, compat dirs, re-export files\n- **File saturation** — avoids stacking all results from one file\n- **Scope labels on every hit** — search/related/refs results carry a tree-sitter-derived ``defines `Foo` `` or ``in `bar` `` suffix so the result header alone tells you what each chunk is\n- **Multilingual model** option for Cyrillic, CJK, Arabic, etc.\n- **Pipe-friendly output** — `pretty`, `compact`, `ripgrep`, `paths`, `json`, `jsonl`\n- **Filter flags** — `--lang`, `--path` and `--exclude` glob patterns, `--min-score`\n- **Prebuilt binaries** for macOS (Intel/ARM), Linux x86_64/ARM64 (musl), Windows x86_64\n\n## Install\n\n```sh\n# Linux / macOS — prebuilt binary (one-liner)\ncurl --proto '=https' --tlsv1.2 -LsSf \\\n  https://github.com/julymetodiev/Veles/releases/latest/download/veles-cli-installer.sh | sh\n\n# Windows — PowerShell\nirm https://github.com/julymetodiev/Veles/releases/latest/download/veles-cli-installer.ps1 | iex\n\n# Homebrew (macOS / Linux)\nbrew install julymetodiev/tap/veles-cli\n\n# From crates.io (compiles locally; no protoc / extra deps needed)\ncargo install veles-cli\n\n# Manual download\ngh release download --repo julymetodiev/Veles --pattern '*linux-gnu*'   # or browse\n#   https://github.com/julymetodiev/Veles/releases/latest\n\n# Verify (optional)\nveles --version    # → veles 0.4.3\n```\n\nSee **[INSTALL.md](INSTALL.md)** for SHA-256 verification and other\ninstall paths.\n\n## Quickstart\n\n```sh\nveles index .                     # one-off, builds .veles/\nveles search \"parse config file\"  # auto-loads the cache\nveles update .                    # refresh after edits\n```\n\nThe first `search` downloads the embedding model from Hugging Face (~64 MB, cached at `~/.cache/huggingface/hub/`).\n\n## Most-used commands\n\n### Search\n\n```sh\nveles search \"rate limiting\"                          # hybrid (default)\nveles search \"rate limiting\" -t 10 -f compact         # 10 results, 1 line each\nveles search \"rate limiting\" -f rg                    # ripgrep-style path:line:content\nveles search \"rate limiting\" -f json | jq '.results'  # structured for scripting\nveles search \"rate limiting\" -f paths | xargs $EDITOR # open every matching file\n```\n\n```sh\nveles search \"TokenStream\" -m bm25                    # exact identifier\nveles search \"auth flow\"    -m semantic               # fuzzy concept\nveles search \"auth\"  -l rust,python                   # language filter\nveles search \"X\"     -g 'src/**/*.rs' -x 'src/legacy/**'   # glob include / exclude\nveles search \"BM25\"  --min-score 0.4                  # drop weak hits\n```\n\n### Symbols (tree-sitter)\n\n```sh\nveles symbols crates/veles-core/src/persist.rs        # outline a single file\nveles defs Manifest                                   # every definition named \"Manifest\"\nveles defs save -k function -l rust                   # filter by kind + language\nveles refs save_index -t 30                           # defs + BM25 references\n```\n\n### Related code\n\n```sh\nveles find-related src/main.rs 42                     # semantically similar chunks\nveles find-related src/main.rs 42  -l rust            # restrict to one language\nveles find-related src/main.rs 42  -g 'crates/foo/**' # restrict to a subtree\n```\n\n### Index lifecycle\n\n```sh\nveles index .              # bootstrap\nveles index . --force      # rebuild from scratch\nveles update .             # incremental refresh\nveles status .             # manifest + drift\nveles clean .              # remove .veles/\n```\n\n### Interactive TUI\n\n```sh\nveles tui                          # live hybrid search with preview pane\nveles tui ./my-repo                # against another repo\n```\n\nLoads the persistent index once, then debounces queries so each keystroke\nre-runs in tens of milliseconds. `↑↓` navigate, `Tab` cycles\nhybrid/bm25/semantic, `Ctrl-R` finds related code, `Ctrl-D` / `Ctrl-F`\nshow tree-sitter definitions / references for the typed identifier,\n`Enter` prints `path:line` to stdout (`$EDITOR $(veles tui)` works),\n`Ctrl-O` opens in `$EDITOR`, `?` shows the full keybinding overlay.\n\n### Servers\n\n```sh\nveles serve-mcp                                # MCP over stdio (default if no args)\nveles serve-grpc --addr \"[::1]:50051\"          # gRPC\n```\n\n### Shell integration\n\n```sh\nmkdir -p ~/.zfunc ~/.local/share/man/man1\nveles completions zsh \u003e ~/.zfunc/_veles\nveles man --out-dir ~/.local/share/man/man1\n```\n\n`veles man --out-dir DIR` writes one page per subcommand (`veles.1`,\n`veles-search.1`, `veles-defs.1`, …) so `man veles-search` works the\nsame way as `man git-commit`.\n\nThen once in `~/.zshrc`:\n\n```sh\nfpath=(~/.zfunc $fpath)\nautoload -Uz compinit \u0026\u0026 compinit\nexport MANPATH=\"$HOME/.local/share/man:$MANPATH\"\n```\n\n### Remote repos\n\n```sh\nveles search \"BM25 inverted index\" https://github.com/julymetodiev/Veles\n```\n\nSee **[USAGE.md](USAGE.md)** for the full reference, recipes (fzf, vim quickfix, jq), and troubleshooting.\n\n## MCP server\n\n```sh\nveles serve-mcp     # explicit\nveles               # equivalent — bare `veles` starts MCP when stdin is piped\n```\n\nExposed tools:\n\n| Tool           | Use it for                                                                            |\n|----------------|---------------------------------------------------------------------------------------|\n| `search`       | Hybrid / BM25 / semantic query, with optional `lang` / `path` / `exclude` / `min_score`. |\n| `defs`         | Tree-sitter definitions for an exact symbol name (Rust, Python, JS, TS, Go).          |\n| `refs`         | Definitions plus BM25 hits — \"where is X defined and where is X used\", in one call.   |\n| `find_related` | Semantically similar chunks for a `(file_path, line)` from an earlier `search`.       |\n| `list_symbols` | Every tree-sitter definition across the index, with `kind` / `lang` / `path` filters. |\n| `symbols`      | Outline of a single file — every definition it contains.                              |\n| `scope_at`     | Innermost tree-sitter symbol containing a given `file:line`.                          |\n| `files`        | Distinct file paths in the index, with `lang` / `path` / `exclude` filters.           |\n| `read`         | Line range from an indexed file (capped at 500 lines, repo-relative paths only).      |\n| `stats`        | File / chunk counts, model metadata, per-language chunk breakdown.                    |\n| `status`       | Non-mutating drift check vs. persisted manifest; distinguishes content edits from bare `touch`. |\n| `update`       | Incremental refresh of a local repo's `.veles/` index after edits (BLAKE3-aware).     |\n\n`search`, `find_related`, and `refs` accept a `format` argument:\n- `default` (default) — scored, fenced code blocks tagged with the enclosing scope.\n- `paths` — flat per-line list. `search` / `find_related` emit `path:start-end`; `refs` emits `path:line` per word-boundary occurrence of the symbol.\n- `unique_paths` — collapsed to one `path` line per file. For agent shortlist workflows that just want \"which files matter\".\n\n## Build from source\n\n```sh\ncargo build --release\n```\n\n`tonic-build` ships a vendored `protoc` via `protoc-bin-vendored`, so no system-wide protobuf compiler is required.\n\n## Embedding in your own Rust project\n\nThe workspace publishes four crates on crates.io — pick the layer you need:\n\n| Crate | Purpose |\n|-------|---------|\n| [`veles-core`](https://crates.io/crates/veles-core) | Indexing, chunking, BM25, dense search, hybrid ranking, persistence. |\n| [`veles-grpc`](https://crates.io/crates/veles-grpc) | tonic-based gRPC service wrapping `veles-core`. |\n| [`veles-mcp`](https://crates.io/crates/veles-mcp)   | MCP / JSON-RPC server (stdio) for AI-agent integration. |\n| [`veles-cli`](https://crates.io/crates/veles-cli)   | The `veles` binary. |\n\nFull API docs are on [docs.rs](https://docs.rs/veles-core).\n\n```toml\n[dependencies]\nveles-core = \"0.2\"\n```\n\n```rust\nuse std::path::Path;\nuse veles_core::{SearchMode, VelesIndex};\n\nlet index = VelesIndex::from_path(Path::new(\".\"), None, None, false)?;\nlet results = index.search(\"parse config\", 5, SearchMode::Hybrid, None, None, None);\nfor r in results {\n    println!(\"{} [{:.3}]\", r.chunk.location(), r.score);\n}\n# Ok::\u003c(), anyhow::Error\u003e(())\n```\n\n## Architecture\n\n```\nVeles/\n  crates/\n    veles-core/    indexing, chunking, BM25, dense search, ranking, symbols\n    veles-grpc/    gRPC service (tonic + prost)\n      proto/\n        veles.proto  gRPC schema\n    veles-mcp/     MCP server over stdio\n    veles-cli/     CLI binary\n```\n\nThe persistent index lives under `\u003crepo\u003e/.veles/`:\n\n```\n.veles/\n  manifest.json   # model, dim, per-file (size, mtime, chunk_count)\n  chunks.bin      # bincode Vec\u003cChunk\u003e\n  bm25.bin        # bincode BM25 inverted index\n  dense.bin       # bincode dense matrix\n  symbols.bin     # bincode tree-sitter symbols\n```\n\n`update` reuses embeddings of files whose `(size, mtime)` fingerprint hasn't changed, so refreshing after a small edit is near-instant on large repos.\n\n## License\n\nMIT\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjulymetodiev%2Fveles","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fjulymetodiev%2Fveles","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjulymetodiev%2Fveles/lists"}