https://github.com/currentslab/awesome-vector-search

A collection of vector search related libraries, services, and research papers

List: awesome-vector-search

awesome awesome-list knn-search machine-learning nearest-neighbor-search search-engine similarity-search vector vector-search vector-search-engine

## Awesome Vector Search Engine

> A curated list of awesome vector search frameworks/engines, libraries, cloud services, and research papers on vector similarity search

### Standalone Service
- [Apache Cassandra 5.0 – Vector search (cep-30), Strict Serialisable ACID (cep-15), horizontally scaling database](https://cassandra.apache.org)
- [Qdrant - Vector Similarity Search Engine with extended filtering support](https://github.com/qdrant/qdrant)
- [Vald - A Highly Scalable Distributed Vector Search Engine](https://github.com/vdaas/vald)
- [Milvus - A cloud-native vector database with high-performance and high scalability.](https://github.com/milvus-io/milvus)
- [Weaviate - A cloud-native, real-time vector search engine](https://github.com/semi-technologies/weaviate)
- [OpenDistro Elasticsearch KNN - A machine learning plugin which supports an approximate k-NN search algorithm for Open Distro for Elasticsearch](https://github.com/opendistro-for-elasticsearch/k-NN)
- [Elastiknn - Elasticsearch plugin for nearest neighbor search](https://github.com/alexklibisz/elastiknn)
- [Epsilla - A High Performance Vector Database Management System, Hippocampus For AI](https://github.com/epsilla-cloud/vectordb)
- [Vearch - A scalable distributed system for efficient similarity search of deep learning vectors](https://github.com/vearch/vearch)
- [pgANN - Fast Approximate Nearest Neighbor (ANN) searches with a PostgreSQL database](https://github.com/netrasys/pgANN)
- [Jina - Jina allows you to build deep learning-powered search-as-a-service.](https://github.com/jina-ai/jina)
- [Infinity - The AI-native database built for LLM applications, providing incredibly fast vector and full-text search](https://github.com/infiniflow/infinity)
- [Aquila DB - Distribution focused k-NN search algorithm](https://github.com/Aquila-Network/AquilaDB)
- [Redis HNSW - A redis module for similarity search based on HNSW](https://github.com/zhao-lang/redis_hnsw)
- [Solr - Apache Solr](https://github.com/apache/solr) - [has a Dense Vector Search feature as of Solr 9.0](https://solr.apache.org/guide/solr/latest/query-guide/dense-vector-search.html)
- [Marqo - A semantic search engine which supports tensor search (sequence of vectors)](https://github.com/marqo-ai/marqo)
- [txtai - Build semantic search applications and workflows](https://github.com/neuml/txtai)
- [Semantra - A multipurpose tool for semantically searching documents.](https://github.com/freedmand/semantra)
- [SuperDuperDB - Bring AI to your favorite database](https://github.com/SuperDuperDB/superduperdb)
- [TensorDB - High Performance Vector Database Supporting Heterogeneous Computing](https://www.actionsky.com/tensorDB)
- [JVector - a pure Java, zero dependency, embedded vector search engine, used by DataStax Astra DB and Apache Cassandra.](https://github.com/jbellis/jvector/)
- [VQLite - Simple and Lightweight Vector Search Engine](https://github.com/VQLite/VQLite)
- [Vexvault - 100% browser based, open source, scalable, simple, zero-cost vector search](https://github.com/Xyntopia/vexvault)
- [Vespa.ai - Text search engine and ... fast approximate vector search (ANN)](https://github.com/vespa-engine)
- [Vespa's large-scale ANN search using HNSW-IF indexes is described here](https://blog.vespa.ai/vespa-hybrid-billion-scale-vector-search/)

### Library
- [LangStream - LangStream is an open-source project that combines the best of event-based architectures with the latest Gen AI technologies.](https://langstream.ai)
- [CassIO - CassIO is the ultimate solution for seamlessly integrating Apache Cassandra® with generative artificial intelligence and other machine learning workloads](https://cassio.org)
- [JVector - A pure Java, zero dependency, embedded vector search engine used by some of the advanced distributed databases such as DataStax Astra DB & Apache Cassandra™](https://github.com/jbellis/jvector)
- [Faiss - A library for efficient similarity search and clustering of dense vectors](https://github.com/facebookresearch/faiss)
- [Distributed Faiss - Work with FAISS indexes which don't fit into a single server memory](https://github.com/facebookresearch/distributed-faiss)
- [Autofaiss - Automatically create Faiss knn indices](https://github.com/criteo/autofaiss)
- [ScaNN - A library for efficient vector similarity search at scale](https://github.com/google-research/google-research/tree/master/scann)
- [NMSLIB - Non-Metric Space Library, an efficient similarity search library for generic non-metric spaces](https://github.com/nmslib/nmslib)
- [Annoy - C++ library with Python bindings to search for points](https://github.com/spotify/annoy)
- [FLANN - Library written in C++ and contains bindings for the following languages: C, MATLAB, Python, and Ruby](http://www.cs.ubc.ca/research/flann/)
- [LLM App - Open-source Python library for real-time KNN (K-Nearest Neighbors) data indexing](https://github.com/pathwaycom/llm-app)
- [MRPT - Fast nearest neighbor search with random projection](https://github.com/teemupitkanen/mrpt)
- [RPForest - Python library for approximate nearest neighbours search](https://github.com/lyst/rpforest)
- [pgvector - Open-source vector similarity search extension for Postgres](https://github.com/pgvector/pgvector)
- [PASE - Ultra-High-Dimensional approximate nearest neighbor search extension for Postgres](https://github.com/alipay/PASE)
- [Pyserini - Toolkit for reproducible information retrieval research with sparse and dense representations](https://github.com/castorini/pyserini)
- [NGT - Provides commands and a library for performing high-speed approximate nearest neighbor searches](https://github.com/yahoojapan/NGT)
- [NearPy - Approximate search using different locality-sensitive hashing methods](http://pixelogik.github.io/NearPy/)
- [TOROS N2 - lightweight approximate Nearest Neighbor library](https://github.com/kakao/n2)
- [PUFFINN - Parameterless and Universal Fast FInding of Nearest Neighbors](https://github.com/puffinn/puffinn)
- [SPTAG - A distributed approximate nearest neighborhood search (ANN) library](https://github.com/microsoft/SPTAG)
- [PyNNDescent - A python nearest neighbor descent for approximate k nearest neighbors](https://github.com/lmcinnes/pynndescent)
- [TarsosLSH - A Java library implementing a practical nearest neighbour search algorithm for multidimensional vectors](https://github.com/JorenSix/TarsosLSH)
- [TorchPQ - Efficient implementations of Product Quantization and its variants using Pytorch and CUDA](https://github.com/DeMoriarty/TorchPQ)
- [Granne - Graph-based retrieval of approximate nearest neighbors written in Rust](https://github.com/granne/granne)
- [Embeddinghub - A database built for machine learning embeddings](https://github.com/featureform/embeddinghub)
- [Hora - Efficient approximate nearest neighbor search algorithm collections library written in Rust](https://github.com/hora-search/hora)
- [Voy - A WASM vector similarity search engine written in Rust](https://github.com/tantaraio/voy)
- [Chroma - The open-source embedding database for building LLM apps in Python or JavaScript with memory](https://github.com/chroma-core/chroma)
- [USearch - Smaller & Faster Vector Search Engine for C++, Python, JavaScript, Rust, Java, GoLang, Wolfram](https://github.com/unum-cloud/usearch)
- [Golang vector stores collection - Chroma, PGVector interfaces](https://github.com/urjitbhatia/vectorstores)
- [Scalable Vector Search (SVS) - A performance library for vector similarity search](https://github.com/IntelLabs/ScalableVectorSearch)

### Cloud Service

- [Epsilla Cloud - A fully managed serverless vector database, advertised as 10X faster, cheaper, and better](https://cloud.epsilla.com)
- [DataStax Astra Vector - Multi-cloud, serverless vector DBaaS](https://www.datastax.com/products/vector-search)
- [Relevance AI - Vector Platform From Experimentation To Deployment](https://relevance.ai/vectors/)
- [Pinecone - Managed vector search with filtering, live index updates, horizontal scaling, and a lot more](https://www.pinecone.io)
- [MyScale - A managed vector database based on ClickHouse](https://myscale.com)
- [Redis Cloud - Managed vector database in Redis](https://redis.com/cloud)
- [Zilliz Cloud - Cloud-native service for Milvus](https://zilliz.com/cloud)

### Research Papers

A list of papers on how approximate vector search algorithms can be implemented more efficiently.

- [SPANN: Highly-efficient Billion-scale Approximate Nearest Neighborhood Search - NEURIPS 2021](https://proceedings.neurips.cc/paper/2021/hash/299dc35e747eb77177d9cea10a802da2-Abstract.html)
- [Revisiting the Inverted Indices for Billion-Scale Approximate Nearest Neighbors - ECCV 2018](https://openaccess.thecvf.com/content_ECCV_2018/html/Dmitry_Baranchuk_Revisiting_the_Inverted_ECCV_2018_paper.html)
- [Accelerating Large-Scale Inference with Anisotropic Vector Quantization](https://arxiv.org/abs/1908.10396)
- [Billion-scale similarity search with GPUs](https://arxiv.org/abs/1702.08734)
- [Efficient and robust approximate nearest neighbor search using Hierarchical Navigable Small World graphs](https://arxiv.org/abs/1603.09320)
- [Optimization of Indexing Based on k-Nearest Neighbor Graph for Proximity Search in High-dimensional Data](https://arxiv.org/abs/1810.07355)
- [On Approximately Searching for Similar Word Embeddings - ACL 2016](https://www.aclweb.org/anthology/P16-1214.pdf)

[![CC0](https://i.creativecommons.org/p/zero/1.0/88x31.png)](https://creativecommons.org/publicdomain/zero/1.0/)