{"id":50516393,"url":"https://github.com/mohamed-em2m/vector-search-benchmarks","last_synced_at":"2026-06-03T00:30:34.333Z","repository":{"id":360434992,"uuid":"1250041654","full_name":"mohamed-em2m/vector-search-benchmarks","owner":"mohamed-em2m","description":"this repo to share scripts to testing different vector search libraries","archived":false,"fork":false,"pushed_at":"2026-05-26T15:09:52.000Z","size":2880,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-05-26T15:12:17.388Z","etag":null,"topics":["agentic-ai","ai","rag","testing","vector-search"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/mohamed-em2m.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-05-26T08:51:04.000Z","updated_at":"2026-05-26T14:56:09.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/mohamed-em2m/vector-search-benchmarks","commit_stats":null,"previous_names":["mohamed-em2m/vector-search-tests","mohamed-em2m/vector-search-benchmarks"],"tags_count":null,"template":false,"template_full_name":null,"purl":"pkg:github/mohamed-em2m/vector-search-benchmarks","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mohamed-em2m%2Fvector-search-benchmarks","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mohamed-em2m%2Fvector-search-benchmarks/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mohamed-em2m%2Fvector-search-benchmarks/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mohamed-em2m%2Fvector-search-benchmarks/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/mohamed-em2m","download_url":"https://codeload.github.com/mohamed-em2m/vector-search-benchmarks/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mohamed-em2m%2Fvector-search-benchmarks/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":33843611,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T15:22:16.424Z","status":"online","status_checked_at":"2026-06-02T02:00:07.132Z","response_time":109,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["agentic-ai","ai","rag","testing","vector-search"],"created_at":"2026-06-03T00:30:30.819Z","updated_at":"2026-06-03T00:30:34.320Z","avatar_url":"https://github.com/mohamed-em2m.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Vector Search Benchmarks\n\nA multi-scale, modular benchmarking suite for evaluating different vector search stores and algorithms.\n\n\u003cimg src=\"./assets/ezgif-89023905375f8740.gif\" alt=\"Logo\" width=\"auto\"  \u003e\n\n## Overview\n\nThis project provides an orchestration framework to test and compare multiple vector databases and search libraries across different sample sizes (e.g., 500, 5k, 50k, 500k). It isolates runs in individual subprocesses and evaluates each store on:\n- **Speed**: Indexing time, documents/second, average latency, and P95 latency.\n- **Memory**: RSS usage delta, theoretical memory footprint, and compression ratios.\n- **Quality**: Recall@k and Precision@k.\n- **Agreement**: Overlap and Kendall rank correlation compared to an exact in-memory baseline.\n\n---\n\n## Codebase Architecture\n\nThe project has been refactored from a monolithic script into a clean, modular plug-and-play architecture:\n\n```\n├── core/\n│   ├── config.py        # YAML config loader, BenchmarkConfig \u0026 StoreVariant dataclasses\n│   ├── registry.py      # Decorator-based store registration\n│   ├── store.py         # Abstract base class for vector stores\n│   ├── metrics.py       # Pure functions for scoring and similarity evaluation\n│   └── types.py         # Frozen value objects\n├── reporting/\n│   └── tee.py           # Dual console/file logging wrapper (with cp1252 fallback)\n├── stores/\n│   ├── baseline.py      # In-memory baseline store (LangChain InMemoryVectorStore)\n│   ├── faiss_store.py   # FAISS store (FlatL2, FlatIP, HNSW, IVF+PQ, SQ)\n│   ├── qdrant_store.py  # Qdrant in-memory store (float32 + quantization variants)\n│   ├── scann_store.py   # ScaNN store (brute-force / tree-AH)\n│   ├── turbovec_store.py# Quantized N-bit store (2/3/4-bit)\n│   └── usearch_store.py # USearch HNSW store (cosine / L2 / IP)\n├── utils/\n│   ├── convert_json_to_markdown.py  # Convert aggregate JSON results to Markdown tables\n│   └── merge_test_results.py        # Merge and diff results across multiple runs\n├── data/\n│   └── test_cases.json  # Global query dataset (JSON form)\n├── benchmark_config.yaml  # YAML configuration file (stores, variants, paths, settings)\n├── run_benchmark.py       # Main runner for a single sample size (process-isolated)\n└── run_all.py             # Multi-scale orchestrator and comparison compiler\n```\n\n---\n\n## Adding a New Vector Store\n\nThe suite uses a **Registry Pattern**. Adding a new vector store is as simple as creating a single Python file under `stores/`.\n\n1. Create `stores/my_store.py`\n2. Subclass `AbstractVectorStore`\n3. Decorate your class with `@VectorStoreRegistry.register(\"my_store\", \"My Store (Display Name)\")`\n4. Implement the required abstract methods:\n\n```python\nfrom typing import List, Tuple, Any\nfrom langchain_core.documents import Document\nfrom core.store import AbstractVectorStore\nfrom core.registry import VectorStoreRegistry\n\n@VectorStoreRegistry.register(\"my_store\", \"My Store (Display Name)\")\nclass MyStore(AbstractVectorStore):\n    @classmethod\n    def is_available(cls) -\u003e bool:\n        # Check dependencies\n        return True\n\n    @classmethod\n    def build(cls, docs: List[Document], embeddings: Any, vecs: Any, texts: List[str], metadatas: List[dict], embed_dim: int, **kwargs) -\u003e \"MyStore\":\n        # Build index\n        instance = cls()\n        ...\n        return instance\n\n    def search(self, query: str, k: int) -\u003e List[Tuple[Document, float]]:\n        # Perform query search\n        return ...\n\n    @classmethod\n    def theoretical_bytes(cls, embed_dim: int, num_docs: int, **kwargs) -\u003e float:\n        # Calculate theoretical size in MB\n        return ...\n```\n\n5. Import your store module in `run_benchmark.py` and `run_all.py` (e.g., `import stores.my_store`).\n6. Optionally add a `stores` entry in `benchmark_config.yaml` to configure variants and parameters.\n\n---\n\n## Setup\n\nThis project uses `uv` for dependency management. To set up the environment, run:\n\n```bash\n# Install dependencies\nuv sync\n```\n\n### Optional dependencies\n\nTo use the detailed memory profiling feature with `memray` (Linux/macOS only):\n\n```bash\nuv sync --extra memray\n```\n\n---\n\n## Running the Benchmarks\n\n### Multi-scale orchestrator\n\nRuns `run_benchmark.py` for each configured sample size, then compiles a cross-sample comparison report:\n\n```bash\n# Using a dataset path\nuv run python run_all.py --dataset ./data/data.csv\n\n# Using a YAML config file (recommended)\nuv run python run_all.py --config benchmark_config.yaml\n```\n\n### Single sample size\n\n```bash\nuv run python run_benchmark.py --samples 500 --dataset ./data/data.csv\n\n# Or with a config file\nuv run python run_benchmark.py --config benchmark_config.yaml --samples 500\n```\n\n### CLI Options\n\n| Flag | Applies to | Description |\n|------|-----------|-------------|\n| `--dataset PATH` | both | Path to the input CSV dataset. |\n| `--config PATH` | both | Path to a YAML config file. YAML values take priority over CLI defaults. |\n| `--test-cases PATH` | both | Path to the JSON test queries file (default: `./data/test_cases.json`). |\n| `--output-dir PATH` | both | Output directory for results (default: `./results`). |\n| `--samples N` | `run_benchmark.py` | Number of rows to load from the CSV. |\n| `--store KEY` | `run_benchmark.py` | Run benchmark for a specific registered store only (e.g., `--store faiss`). |\n| `--memray` | both | Enable detailed per-allocation memory profiling via `memray` (Linux/macOS only). |\n\n---\n\n## YAML Configuration\n\nThe `benchmark_config.yaml` file provides full control over every aspect of a benchmark run.  \nWhen `--config` is specified, YAML values take priority over CLI defaults.\n\n```yaml\n# Paths\ndataset: ./data/data.csv\ntest_cases: ./data/test_cases.json\noutput_dir: ./results\n\n# Benchmark settings\nsample_sizes: [500, 5000, 50000, 500000]   # for run_all.py\ntop_k: 10\ntiming_repeats: 5\nembedding_model: sentence-transformers/all-MiniLM-L6-v2\n# memray: false  # enable on Linux/macOS only\n\n# Store variants\nstores:\n  faiss:\n    enabled: true\n    variants:\n      - name: \"FAISS (FlatL2)\"\n        params:\n          index_type: flat_l2\n      - name: \"FAISS (IVF+PQ)\"\n        params:\n          index_type: ivf_pq\n          nlist: 100\n          m: 8\n          nbits: 8\n\n  qdrant:\n    enabled: true\n    variants:\n      - name: \"Qdrant (float32)\"\n        params: {}\n      - name: \"Qdrant (Scalar INT8)\"\n        params:\n          quantization: scalar\n          scalar_type: int8\n\n  usearch:\n    enabled: true\n    variants:\n      - name: \"USearch (Cosine)\"\n        params: { metric: cos }\n      - name: \"USearch (L2)\"\n        params: { metric: l2 }\n\n  turbovec:\n    enabled: true\n    variants:\n      - name: \"TurboVec (3-bit)\"\n        params: { bit_width: 3 }\n\n  scann:\n    enabled: true      # Linux/macOS only\n```\n\nEach variant runs as its own row in the benchmark results and comparison tables.  \nSetting `enabled: false` skips a store entirely.\n\n---\n\n## Pipeline Phases (`run_all.py`)\n\nThe orchestrator runs in three phases:\n\n| Phase | Description |\n|-------|-------------|\n| **Phase 1 — Run benchmarks** | Spawns a subprocess for each sample size, producing `results_N.txt` and `summary_N.json`. |\n| **Phase 2 — Load summaries** | Reads all `summary_N.json` files, with corruption/missing-file guards. |\n| **Phase 3 — Build comparison** | Compiles a cross-sample, cross-store report with per-store scale tables, per-metric winner tables, an overall win-count tally, and a scale-effect latency trend summary. |\n\n### Output files\n\n```\nresults/\n├── results_500.txt          ← human-readable per-run output\n├── summary_500.json         ← machine-readable per-run metrics\n├── results_5000.txt\n├── summary_5000.json\n│   …\n├── aggregate_comparison.txt ← main cross-sample comparison report\n└── aggregate_comparison.json← machine-readable aggregate data\n```\n\n### Programmatic API\n\n`run_all.py` also exposes `run_benchmark_pipeline()` for use in scripts or notebooks:\n\n```python\nfrom run_all import run_benchmark_pipeline\n\nrun_benchmark_pipeline(\n    sample_sizes=[500, 5000],\n    dataset_path=\"./data/data.csv\",\n    output_dir=\"./results\",\n    config_path=\"benchmark_config.yaml\",\n    use_memray=False,\n)\n```\n\n---\n\n## Utility Scripts\n\n| Script | Description |\n|--------|-------------|\n| `utils/convert_json_to_markdown.py` | Converts `aggregate_comparison.json` into formatted Markdown tables for reports or GitHub. |\n| `utils/merge_test_results.py` | Merges and diffs results across multiple benchmark runs (e.g., comparing different model configs). |","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmohamed-em2m%2Fvector-search-benchmarks","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fmohamed-em2m%2Fvector-search-benchmarks","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmohamed-em2m%2Fvector-search-benchmarks/lists"}