{"id":21426492,"url":"https://github.com/matrixji/annb","last_synced_at":"2026-04-29T14:35:42.122Z","repository":{"id":173866428,"uuid":"649967916","full_name":"matrixji/annb","owner":"matrixji","description":"Approximate Nearest Neighbor Benchmark","archived":false,"fork":false,"pushed_at":"2023-12-14T12:05:58.000Z","size":116,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":2,"default_branch":"main","last_synced_at":"2024-03-14T13:43:30.098Z","etag":null,"topics":["anns","benchmarks","cuda","gpu"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/matrixji.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2023-06-06T03:27:27.000Z","updated_at":"2023-08-15T02:50:43.000Z","dependencies_parsed_at":null,"dependency_job_id":"6af68083-49c4-4060-8af0-2a2622d398f3","html_url":"https://github.com/matrixji/annb","commit_stats":null,"previous_names":["matrixji/annb"],"tags_count":1,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/matrixji%2Fannb","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/matrixji%2Fannb/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/matrixji%2Fannb/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/matrixji%2Fannb/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/matrixji","download_url":"https://codeload.github.com/matrixji/annb/tar.gz/refs/
heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":243933454,"owners_count":20370988,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["anns","benchmarks","cuda","gpu"],"created_at":"2024-11-22T21:42:21.013Z","updated_at":"2026-04-29T14:35:42.091Z","avatar_url":"https://github.com/matrixji.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# ANNB: Approximate Nearest Neighbor Benchmark\n\n[![PyPI Version](https://img.shields.io/pypi/v/annb.svg)](https://pypi.python.org/pypi/annb)\n\nNote: This is a work in progress. The API/CLI is not stable yet.\n\n## Installation\n\n```bash\npip install annb\n\n# install the vector search index/client you need for the benchmark\n# e.g. install faiss to run the faiss index benchmark\n```\n\n## Usage\n\n### CLI Usage\n\n#### Run Benchmark\n\n##### start first benchmark with a random dataset\n\nJust run `annb-test` to start your first benchmark with a random dataset.\n\n```bash\nannb-test\n```\n\nIt will produce a result like this:\n\n```plain\n❯ annb-test\n... 
some logs ...\n\nBenchmarkResult:\n  attributes:\n    query_args: [{'nprobe': 1}]\n    topk: 10\n    jobs: 1\n    loop: 5\n    step: 10\n    name: Test\n    dataset: .annb_random_d256_l2_1000.hdf5\n    index: Test\n    dim: 256\n    metric_type: MetricType.L2\n    index_args: {'index': 'ivfflat', 'nlist': 128}\n    started: 2023-08-14 13:03:40\n\n  durations:\n    training: 1 items, 1000 total, 1490.03266ms\n    insert: 1 items, 1000 total, 132.439627ms\n    query:\n      nprobe=1,recall=0.2173 -\u003e 1000 items, 18.615083ms, 53719.878659686874qps, latency=0.18615083ms, p95=0.31939ms, p99=0.41488ms\n```\n\nThis is a simple benchmark test using the default index (faiss) with a random L2 dataset.\nIf you want to generate more data or use different specifications for the dataset, see the options below:\n  - --index-dim         The dimension of the index, default is 256\n  - --index-metric-type   Index metric type, l2 or ip, default is l2\n  - --topk TOPK           topk used for query, default is 10\n  - --step STEP           the query step; by default annb queries 10 items per query. Set it to 0 to query all items in one query (similar to batch mode in ann-benchmarks)\n  - --batch               batch mode, alias for --step 0\n  - --count COUNT         the total number of items in the dataset, default is 1000\n\n##### run benchmark with a specific dataset\n\nYou can also run a benchmark with the [datasets](https://github.com/erikbern/ann-benchmarks#data-sets) from ann-benchmarks. Download them locally and pass the file with the `--dataset` option.\n\n```bash\nannb-test --dataset sift-128-euclidean.hdf5\n```\n\n##### run benchmark with query args\nYou may benchmark with different query args, e.g. different nprobe values for a faiss ivfflat index. 
You can pass them with the `--query-args` option.\n\n```bash\nannb-test --query-args nprobe=10 --query-args nprobe=20\n```\n\nThis will output a result like the following:\n\n```plain\ndurations:\n    training: 1 items, 1000 total, 1548.84968ms\n    insert: 1 items, 1000 total, 143.402532ms\n    query:\n      nprobe=1,recall=0.2173 -\u003e 1000 items, 20.074236ms, 49815.09632545916qps, latency=0.20074235999999998ms, p95=0.332276ms, p99=0.455525ms\n      nprobe=10,recall=0.5221 -\u003e 1000 items, 49.141931ms, 20349.2207092961qps, latency=0.49141931ms, p95=0.722628ms, p99=0.818012ms\n      nprobe=20,recall=0.6861 -\u003e 1000 items, 69.284072ms, 14433.331805324606qps, latency=0.69284072ms, p95=1.126946ms, p99=1.350359ms\n```\n\n##### run multiple benchmarks with config file\nYou may run multiple benchmarks with different indexes and datasets. Use `--run-file` to run benchmarks from a config file.\n\nBelow is an example config file:\n\nconfig.yaml\n\n```yaml\ndefault:\n  index_factory: annb.anns.faiss.indexes.index_under_test_factory\n  index_factory_args: {}\n  index_name: Test\n  dataset: gist-960-euclidean.hdf5\n  topk: 10\n  step: 10\n  jobs: 1\n  loop: 2\n  result: output.pth\n\nruns:\n  - name: faiss-gist960-gpu-ivfflat\n    index_args:\n      gpu: yes\n      index: ivfflat\n      nlist: 1024\n    query_args:\n      - nprobe: 1\n      - nprobe: 16\n      - nprobe: 256\n  - name: faiss-gist960-gpu-ivfpq8\n    index_args:\n      gpu: yes\n      index: ivfpq\n      nlist: 1024\n    query_args:\n      - nprobe: 1\n      - nprobe: 16\n      - nprobe: 256\n```\n\nExplanation of the above config file:\n- The default section is the default config for all benchmarks.\n- The config keys are generally the same as the options for the `annb-test` command, e.g. `index_factory` is the same as `--index-factory`.\n- You can define multiple benchmarks in the `runs` section, and each run config overrides the default config. 
In this example, the default section sets gist-960-euclidean.hdf5 as the dataset, so all benchmarks use it. Each run uses different index and query args: for index_args, we use ivfflat (nlist=1024) and ivfpq (nlist=1024) as two benchmark series, and for query_args, we use nprobe=1,16,256 in each series. That means 6 benchmarks run in total: each series runs 3 benchmarks with different nprobe values.\n- With the setting in the default section, results are saved to output.pth. In practice, each benchmark series is saved to a separate file, so in this example we get two files: `output-1.pth` and `output-2.pth`. You can use `annb-report` to view them.\n\n\n##### more options\n\nUse `annb-test --help` to see more options.\n\n```bash\n❯ annb-test --help\n```\n\n\n#### Check Benchmark Results\n\nThe `annb-report` command is used to view benchmark results as plain/csv text, or to export them as a chart image.\n\n```bash\nannb-report --help\n```\n\n##### examples for viewing/exporting benchmark results\n\nview benchmark results as plain text\n\n```bash\nannb-report output.pth\n```\n\nview benchmark results as csv text\n\n```bash\nannb-report output.pth --format csv\n```\n\nexport benchmark results to a chart image (multiple series)\n\n```bash\nannb-report output.pth --format png --output output.png output-1.pth output-2.pth\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmatrixji%2Fannb","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fmatrixji%2Fannb","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmatrixji%2Fannb/lists"}