https://github.com/codelibs/search-ann-benchmark
Evaluating and comparing ANN search algorithms across various platforms
- Host: GitHub
- URL: https://github.com/codelibs/search-ann-benchmark
- Owner: codelibs
- Created: 2024-02-29T12:40:51.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2024-04-15T11:48:31.000Z (about 1 year ago)
- Last Synced: 2024-04-16T18:17:31.251Z (about 1 year ago)
- Language: Jupyter Notebook
- Homepage:
- Size: 331 KB
- Stars: 1
- Watchers: 3
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
# Search ANN Benchmark
Benchmark the search performance of Approximate Nearest Neighbor (ANN) algorithms implemented in various systems.
This repository contains notebooks and scripts to evaluate and compare the efficiency and accuracy of ANN searches across different platforms.
## Introduction
Approximate Nearest Neighbor (ANN) search algorithms are essential for handling high-dimensional data spaces, enabling fast and resource-efficient retrieval of similar items from large datasets.
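
To make the trade-off concrete, here is a minimal sketch of the exact (brute-force) search that ANN indexes approximate. It is not code from this repository: the function name `exact_knn`, the choice of Euclidean distance, and the toy vectors are assumptions made for illustration. Exact search compares the query with every stored vector, so its cost grows linearly with the dataset size, which is the cost ANN algorithms are designed to avoid.

```python
import heapq
import math


def exact_knn(query, vectors, k):
    """Return the indices of the k vectors closest to `query` (Euclidean distance).

    Brute force: every stored vector is compared with the query, so the cost
    is O(N * d) per query. ANN indexes give up some accuracy to avoid this scan.
    """
    distances = ((math.dist(query, v), i) for i, v in enumerate(vectors))
    return [i for _, i in heapq.nsmallest(k, distances)]


# Toy data (hypothetical, for illustration only).
vectors = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0), (0.9, 1.2)]
print(exact_knn((1.0, 1.0), vectors, k=2))  # prints [1, 3]
```

The output of an exact search like this is typically what an ANN result is compared against when accuracy is measured.
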
This benchmarking suite aims to provide an empirical basis for comparing the performance of several popular ANN-enabled search systems.
## Prerequisites
Before running the benchmarks, ensure you have the following installed:
- Docker
- Python 3.10 or higher
## Setup Instructions
1. **Prepare the Environment:**
Create directories for datasets and output files, then download the necessary datasets using the provided script.
```bash
/bin/bash ./scripts/setup.sh
```
2. **Install Dependencies:**
Install all required Python libraries.
```bash
pip install -r requirements.txt
```
## Benchmark Notebooks
The repository includes the following Jupyter notebooks for conducting benchmarks:
| Notebook | GitHub Actions |
|----------|----------------|
| [Chroma](run-chroma.ipynb) | [run-chroma-linux.yml](https://github.com/marevol/search-ann-benchmark/actions/workflows/run-chroma-linux.yml) |
| [Elasticsearch](run-elasticsearch.ipynb) | [run-elasticsearch-linux.yml](https://github.com/marevol/search-ann-benchmark/actions/workflows/run-elasticsearch-linux.yml) |
| [Milvus](run-milvus.ipynb) | [run-milvus-linux.yml](https://github.com/marevol/search-ann-benchmark/actions/workflows/run-milvus-linux.yml) |
| [OpenSearch](run-opensearch.ipynb) | [run-opensearch-linux.yml](https://github.com/marevol/search-ann-benchmark/actions/workflows/run-opensearch-linux.yml) |
| [pgvector](run-pgvector.ipynb) | [run-pgvector-linux.yml](https://github.com/marevol/search-ann-benchmark/actions/workflows/run-pgvector-linux.yml) |
| [Qdrant](run-qdrant.ipynb) | [run-qdrant-linux.yml](https://github.com/marevol/search-ann-benchmark/actions/workflows/run-qdrant-linux.yml) |
| [Vespa](run-vespa.ipynb) | [run-vespa-linux.yml](https://github.com/marevol/search-ann-benchmark/actions/workflows/run-vespa-linux.yml) |
| [Weaviate](run-weaviate.ipynb) | [run-weaviate-linux.yml](https://github.com/marevol/search-ann-benchmark/actions/workflows/run-weaviate-linux.yml) |

Each notebook guides you through setting up the test environment, loading the dataset, executing the search queries, and analyzing the results.
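
The query-execution step can be pictured with a short sketch like the one below. It is an assumption about the general shape of such a loop, not the notebooks' actual code: `run_queries`, `search_fn`, and `fake_search` are hypothetical names, and `fake_search` stands in for a real engine client so the example runs without Docker.

```python
import time


def run_queries(search_fn, queries, k=10):
    """Run each query through `search_fn` and record per-query latency.

    `search_fn(query, k)` is a hypothetical callable that wraps the client of
    the engine under test and returns a list of document IDs.
    """
    results, latencies_ms = [], []
    for query in queries:
        start = time.perf_counter()
        ids = search_fn(query, k)
        latencies_ms.append((time.perf_counter() - start) * 1000.0)
        results.append(ids)
    return results, latencies_ms


def fake_search(query, k):
    """Stand-in search function so the sketch runs without a live engine."""
    return list(range(k))


results, latencies_ms = run_queries(fake_search, [[0.1, 0.2], [0.3, 0.4]], k=3)
print(results)
```

Passing the engine as a callable keeps the timing loop identical across systems, so only the client wrapper would differ from one notebook to the next.
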
## Benchmark Results
For a comparison of the results, including response times and precision metrics for the different ANN implementations, see the [Benchmark Results Page](https://codelibs.co/benchmark/ann-benchmark.html).
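
As a rough illustration of how such metrics might be computed, the sketch below shows one common definition of precision at k and a simple latency summary. The function names and the exact definitions are assumptions. The results page may calculate its figures differently.

```python
import statistics


def precision_at_k(retrieved, relevant, k):
    """Fraction of the top-k retrieved IDs that also appear in the exact top-k."""
    return len(set(retrieved[:k]) & set(relevant[:k])) / k


def summarize_latency(latencies_ms):
    """Return mean, 90th, and 99th percentile latency in milliseconds.

    Needs at least two measurements, because statistics.quantiles
    interpolates between data points.
    """
    cuts = statistics.quantiles(latencies_ms, n=100, method="inclusive")
    return {
        "mean": statistics.fmean(latencies_ms),
        "p90": cuts[89],
        "p99": cuts[98],
    }


# Hypothetical values, for illustration only.
print(precision_at_k([1, 3, 7], [1, 3, 2], k=3))
print(summarize_latency([1.0, 2.0, 3.0, 4.0, 5.0]))
```

Here `relevant` would be the exact nearest neighbors of the query, and `retrieved` the IDs returned by the engine under test.
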
## Contributing
We welcome contributions!
If you have suggestions for additional benchmarks, improvements to existing ones, or fixes for any issues, please feel free to open an issue or submit a pull request.
## License
This project is licensed under the Apache License 2.0.