awesome-vector-database
A curated list of awesome works related to high dimensional structure/vector search & database
https://github.com/dangkhoasdc/awesome-vector-database
Others
- Foundations of Vector Retrieval
- SimSIMD
- BEIR
- MyScale's Vector Database Benchmark
- VectorHub - open-source learning website for people (software developers to senior ML architects) interested in adding vector retrieval to their ML stack.
- Distance Comparison Operators for Approximate Nearest Neighbor Search: Exploration and Benchmark
- Billion-scale ANNS Benchmarks
- VectorDBBench - A Vector Database Benchmark Tool
- Qdrant's Vector Database Benchmarks
- Foundations of Multidimensional and Metric Data Structures
- Introduction to Information Retrieval
- Deep Learning for Search
- VLDB
- Slides (tutorial-summary.pdf)
- Image Retrieval in the Wild (CVPR20)
- Neural Search In Action
- Effective and Efficient: Toward Open-world Instance Re-identification
- Search Optimization with Query Likelihood Boosting and Two-Level Approximate Search for Edge Devices
- 2021 Result
- Approximate Nearest Neighbor Search in Recommender Systems
- Accelerating vector search on the GPU with RAPIDS RAFT
- CANDY: A Benchmark for Continuous Approximate Nearest Neighbor Search with Dynamic Data Ingestion.
- RTNN: accelerating neighbor search using hardware ray tracing
- Physical vs. Logical Indexing with IDEA: Inverted Deduplication-Aware Index
- Improving approximate nearest neighbor search through learned adaptive early termination
- DEANN: Speeding up kernel-density estimation using approximate nearest neighbor search
- Approximate nearest neighbor search on high dimensional data - experiments, analyses, and improvement
- CAPS: A Practical Partition Index for Filtered Similarity Search
- Worst-case Performance of Popular Approximate Nearest Neighbor Search Implementations: Guarantees and Limitations
- SISAP Indexing Challenge
- 2023 Competition
- 2024 Competition
- Vector search with small radiuses
- Taking two Birds with one k-NN Cache
- Ascent Similarity Caching With Approximate Indexes
- LeanVec: Search your vectors faster by making them fit.
- Efficient Proximity Search in Time-accumulating High-dimensional Data using Multi-level Block Indexing
- High-Dimensional Approximate Nearest Neighbor Search: with Reliable and Efficient Distance Comparison Operations.
- Approximate Nearest Neighbour Search on Dynamic Datasets: An Investigation
- Characterizing the Dilemma of Performance and Index Size in Billion-Scale Vector Search and Breaking It with Second-Tier Memory
Uncategorized
- Weaviate
- txtai
- marqo
- Google Vector Search (Vertex AI)
- Pinecone
- Vespa
- Epsilla
- algolia
- nucliadb
- OpenSearch
- MyScale
- QdrantCloud
- zilliz
- MongoDB Atlas Vector Search
- OpenSearch's AlibabaCloud
- vectara
- SuperDuperDB
- KBD.AI
- Meilisearch
Multidimensional data / Vectors
- annoy
- NGT
- pgvector
- Chroma
- jvector
- RAFT
- Voyager
- tinyvector
- USearch
- MRPT
- infinity
- havenask
- chromem-go
- OasysDB
- bleve
- cuVS
- Vector DB Feature Matrix
- Faiss
- Typesense
- Qdrant
- Video tutorial
- Epsilla
- Vald
- vearch
- milvus
- hora
- MyScaleDB
- vsag
- sqlite-vec
- KGraph
- LlamaIndex
- Video tutorial
- Video tutorial
- Meilisearch - Search engine API for Semantic (vectors), full-text & hybrid search
- arroy
- NearestNeighbors.jl
- MuopDB
Texts
Courses
📰 Articles & Talks
- How to handle a Million Vector Embeddings in the RAG Applications
- What makes each one different?
- How to choose your vector database in 2023?
- Do we really need a specialized vector database?
- Vector Search RAG Tutorial – Combine Your Data with LLMs with Advanced Search
- Vector Databases: A Beginner’s Guide!
- What is a Vector Database?
- eBay’s Blazingly Fast Billion-Scale Vector Similarity Engine
- Computer Vision Meetup: Computer Vision Applications at Scale with Vector Databases
- Vector Database and Spring IA
- How Meilisearch Updates a Millions Vector Embeddings Database in Under a Minute
- Vector database is not a separate database category
- Vector Databases: A First-Principles Approach
- Efficient Vector Similarity Search in Recommender Workflows Using Milvus with NVIDIA Merlin
- Common Pitfalls To Avoid When Using Vector Databases
- Getting Started With Vector Databases
Related Lists
Survey
Quantization
- [Paper - dinosauria/Rayuela.jl), [nanopq](https://github.com/matsui528/nanopq)]
- [Paper - research/faiss-quickeradc)]
- [Homepage - us/research/wp-content/uploads/2013/11/pami13opq.pdf), [Code](https://kaiminghe.github.io/cvpr13/matlab_OPQ_release_v1.1.rar), [nanopq](https://github.com/matsui528/nanopq)]
- [Paper - research/google-research/tree/master/scann), [Julia Training/Inference](https://github.com/AxelvL/AHPQ.jl)]
- [Paper - Quantization)]
- Projective Clustering Product Quantization
- Product quantizer aware inverted index for scalable nearest neighbor search
- Efficient Multi-vector Dense Retrieval with Bit Vectors
- Similarity search in the blink of an eye with compressed indices.
- Differentiable product quantization for end-to-end embedding compression
- AdANNS: A framework for adaptive semantic search
- DeltaPQ: lossless product quantization code compression for high dimensional similarity search
- Generalized product quantization network for semi-supervised image retrieval
- Residual Quantization with Implicit Neural Codebooks
- RaBitQ: Quantizing High-Dimensional Vectors with a Theoretical Error Bound for Approximate Nearest Neighbor Search.
- Nearest neighbor search with compact codes: A decoder perspective
- Jointly optimizing query encoder and product quantization to improve retrieval performance
Hashing
- Awesome Papers on Learning to Hash
- [Paper - to-hash/blob/master/itq.py), [Matlab code](https://github.com/dangkhoasdc/sah/tree/master/itq)]
- Binary Embedding-based Retrieval at Tencent
- Fast Search on Binary Codes by Weighted Hamming Distance
- Binary code based hash embedding for web-scale applications
- Unsupervised Online Hashing with Multi-Bit Quantization
- Fast top-K cosine similarity search through XOR-friendly binary quantization on GPUs
- PM-LSH: A fast and accurate LSH framework for high-dimensional approximate NN search
- Scalable Nearest Neighbor Search with Compact Codes
- Locality-sensitive hashing scheme based on longest circular co-substring
- Point-to-hyperplane nearest neighbor search beyond the unit hypersphere
- DET-LSH: A Locality-Sensitive Hashing Scheme with Dynamic Encoding Tree for Approximate Nearest Neighbor Search
📈 Evaluation & Metrics
Graph-based Methods
- Theoretical and Empirical Analysis of Adaptive Entry Point Selection for Graph-based Approximate Nearest Neighbor Search.
- General and practical tuning method for off-the-shelf graph-based index: SISAP indexing challenge report by team UTokyo.
- Optimizing Graph-based Approximate Nearest Neighbor Search: Stronger and Smarter.
- Graph-based Approximate NN Search: A Revisit
- Speed-ANN: Low-Latency and High-Accuracy Nearest Neighbor Search via Intra-Query Parallelism
- HVS: hierarchical graph structure based on Voronoi diagrams for solving approximate nearest neighbor search [[Code](https://github.com/chuanxiao1983/HVS)]
- Revisiting k-Nearest Neighbor Graph Construction on High-Dimensional Data: Experiments and Analyses
- Graph- and Tree-based Indexes for High-dimensional Vector Similarity Search: Analyses, Comparisons, and Future Directions
- [Paper - cv/hnsw), [Go Version](https://github.com/coder/hnsw)]
- CAGRA: Highly parallel graph construction and approximate nearest neighbor search for GPUs.
- ACORN: Performant and Predicate-Agnostic Search Over Vector Embeddings and Structured Data
- An Efficient and Robust Framework for Approximate Nearest Neighbor Search with Attribute Constraint
- PECANN: Parallel efficient clustering with graph-based approximate nearest neighbor search
- ELPIS: Graph-Based Similarity Search for Scalable Data Science
- Worst-case performance of popular approximate nearest neighbor search implementations: Guarantees and limitations
- SeRF: Segment Graph for Range-Filtering Approximate Nearest Neighbor Search
- Graph based nearest neighbor search: Promises and failures
- Understanding and Generalizing Monotonic Proximity Graphs for Approximate Nearest Neighbor Search
- Unleashing Graph Partitioning for Large-Scale Nearest Neighbor Search
- FreshDiskANN: A fast and accurate graph-based ANN index for streaming similarity search
- Large-Scale Approximate k-NN Graph Construction on GPU
- Starling: An I/O-Efficient Disk-Resident Graph Index Framework for High-Dimensional Vector Similarity Search on Data Segment
- BANG: Billion-Scale Approximate Nearest Neighbor Search using a Single GPU.
- An Exploration Graph with Continuous Refinement for Efficient Multimedia Retrieval
- Enhancing HNSW Index for Real-Time Updates: Addressing Unreachable Points and Performance Degradation
- ParlayANN: Scalable and Deterministic Parallel Graph-Based Approximate Nearest Neighbor Search Algorithms
Comparisons
Other Approaches
- SPANN: Highly-efficient billion-scale approximate nearest neighbor search
- Index-based, high-dimensional, cosine threshold querying with optimality guarantees
- Semi-convex hull tree: Fast nearest neighbor queries for large scale data on GPUs
- iDEC: indexable distance estimating codes for approximate nearest neighbor search
- VHP: approximate nearest neighbor search via virtual hypersphere partitioning
- Practical near neighbor search via group testing [[Supplement](https://proceedings.neurips.cc/paper_files/paper/2021/file/5248e5118c84beea359b6ea385393661-Supplemental.pdf)]
🎄Tree-based Methods
- AiSAQ: All-in-Storage ANNS with Product Quantization for DRAM-free Information Retrieval
- DiskANN: Fast accurate billion-point nearest neighbor search on a single node.
- Approximate Nearest Neighbor Search with Window Filters
- GTS: GPU-based Tree Index for Fast Similarity Search
- ProMIPS: Efficient high-dimensional C-approximate maximum inner product search with a lightweight index
- Constructing Tree-based Index for Efficient and Effective Dense Retrieval.
Systems
Tree-based Methods