awesome-vector-databases
A curated list of vector database solutions, libraries, and resources for AI applications - https://vectordb.works
https://github.com/ever-works/awesome-vector-databases
-
2026 Trends & Startups
- VecDB@VLDB2026 - Academic workshop on vector databases at VLDB 2026, fostering discussions on topics from mathematical theories and ANN algorithms to implementation optimizations, database interactions, RAG, query languages, and embedding models. Provides a platform for researchers and companies to present technical details and exchange ideas. Scheduled for September 4, 2026, at The Westin Boston Seaport District, Boston, MA, USA. ([Read more](/details/vecdbvldb2026.md)) `workshop` `academic` `vldb` `2026` `trends` `startups` `2026 Trends` `benchmarks`
-
Acknowledgements
-
AI Agent Optimized VDBs
- Mem0 - Memory layer and knowledge engine for AI agents. Replaces complex RAG pipelines with serverless, single-file memory supporting instant retrieval and long-term memory. ([Read more](/details/mem0.md)) `Open Source` `Rag` `Ai Agents`
-
ANN Indexing Libraries
- brinicle - Brinicle is a lightweight C++ library for approximate nearest neighbor (ANN) vector search on embeddings, built for low-RAM environments rather than as a full vector database. It uses efficient HNSW-like graph-based indexing and supports quantization for further memory reduction. Suited to rapid ML prototyping and embedded applications; it is lighter and more memory-efficient than Milvus and is positioned as performing better than hnswlib on low-resource hardware. ([Read more](/details/brinicle.md)) `c++` `Low Ram` `Open Source` `ANN Library` `Embeddable`
- LEANN - LEANN is a lightweight RAG-focused library for vector search on embeddings, achieving 97% storage savings on personal devices via advanced compression and quantization techniques. Implemented in Rust/Python, it supports efficient ANN indexing without full DB overhead. Suited to embedded apps and private prototyping; it is far lighter than Milvus and more efficient on-device than hnswlib. ([Read more](/details/leann.md)) `Open Source` `Rag` `Private` `ANN Library` `Embeddable`
- ScaNN - ScaNN (Scalable Nearest Neighbors) is a pure ANN index library that uses anisotropic vector quantization and scorers for high-recall, high-throughput search at billion scale. Features CPU/GPU support, TensorFlow/NumPy bindings, and advanced quantization. Intended as a building block for custom vector engines in recommendation systems and benchmarks rather than a full database like Qdrant; positioned as offering better recall and throughput than Faiss. ([Read more](/details/scann.md)) `Pure ANN` `Index Only` `Benchmark Tool`
-
Benchmarks & Evaluation
- Qdrant's Vector Database Benchmarks - A set of benchmarks provided by Qdrant for evaluating vector databases, focusing on speed, scalability, and accuracy of vector search operations. ([Read more](/details/qdrants-vector-database-benchmarks.md)) `benchmark` `vector databases` `performance` `scalability`
- BEIR - BEIR (Benchmarking IR) is a benchmark suite for evaluating information retrieval and vector search systems across multiple tasks and datasets. Useful for comparing vector database performance. ([Read more](/details/beir.md)) `benchmark` `evaluation` `vector search` `datasets`
- Billion-scale ANNS Benchmarks - A benchmarking resource for evaluating approximate nearest neighbor search (ANNS) methods on billion-scale datasets, highly relevant for assessing the scalability of vector databases. ([Read more](/details/billion-scale-anns-benchmarks.md)) `benchmark` `ANNS` `scalability` `performance`
- Milvus Sizing Tool - Milvus Sizing Tool helps users estimate the hardware and resource requirements needed to deploy Milvus based on their anticipated data scale and workload. ([Read more](/details/milvus-sizing-tool.md)) `Milvus` `sizing` `performance` `resource estimation`
- MyScale's Vector Database Benchmark - Benchmark results and tools by MyScale aimed at measuring the performance of vector databases in various search and retrieval tasks. ([Read more](/details/myscales-vector-database-benchmark.md)) `benchmark` `vector databases` `performance` `retrieval`
- VectorDBBench - VectorDBBench is a benchmarking tool developed by ZillizTech for evaluating the performance of various vector databases, aiding users in selecting suitable vector database solutions for their needs. ([Read more](/details/vectordbbench.md)) `benchmark` `performance` `vector databases` `evaluation`
- Zeng, Xianzhi, et al. "CANDY: A Benchmark for Continuous Approximate Nearest Neighbor Search with Dynamic Data Ingestion." - A 2024 paper introducing CANDY, a benchmark for continuous ANN search with a focus on dynamic data ingestion, crucial for next-generation vector databases. ([Read more](/details/zeng-xianzhi-et-al-candy-a-benchmark-for-continuous-approximate-nearest-neighbor-search-with-dynamic-data-ingestion.md)) `benchmark` `ANN` `dynamic data` `vector search`
- SISAP Indexing Challenge - An annual competition focused on similarity search and indexing algorithms, including approximate nearest neighbor methods and high-dimensional vector indexing, providing benchmarks and results relevant to vector database research. ([Read more](/details/sisap-indexing-challenge.md)) `benchmark` `similarity search` `evaluation`
- VectorDBBench - The open-source repository containing the implementation, configuration, and scripts of VectorDBBench, enabling users to run standardized benchmarks across multiple vector database systems locally or in CI. ([Read more](/details/vectordbbench.md)) `benchmark` `evaluation` `vector databases`
- WEAVESS - WEAVESS is an open-source benchmarking and evaluation framework for graph-based approximate nearest neighbor (ANN) search methods, providing code and experiments for large-scale vector similarity search. It is useful for researchers and practitioners comparing vector indexing algorithms for vector databases and AI search applications. ([Read more](/details/weavess.md)) `ANN` `benchmark` `similarity search`
- ANN-Benchmarks - A comprehensive benchmarking project that evaluates and compares implementations of approximate nearest neighbor algorithms. Provides standardized datasets and metrics for comparing ANN libraries including FAISS, HNSW, Annoy, and ScaNN. ([Read more](/details/ann-benchmarks.md)) `Benchmark` `Ann` `Performance`
- Big-ANN Benchmarks - Billion-scale approximate nearest neighbor search benchmark competition. Features datasets like SIFT1B, Deep1B with standardized evaluation metrics for comparing vector search algorithms at scale. ([Read more](/details/big-ann-benchmarks.md)) `Benchmark` `Ann` `Competition`
- BigVectorBench - An innovative benchmark suite for thoroughly evaluating vector database performance on heterogeneous data embeddings and compound queries for real-world multimodal applications. ([Read more](/details/bigvectorbench.md)) `Benchmark` `Open Source` `Multimodal`
- Deep1B Dataset - Billion-scale benchmark dataset containing 96-dimensional deep learning image embeddings. Provides real-world proxy for testing distributed systems and GPU-accelerated vector search at scale. ([Read more](/details/deep1b-dataset.md)) `Benchmark` `Datasets` `Deep Learning`
- MMTEB - Massive Multilingual Text Embedding Benchmark covering over 500 quality-controlled evaluation tasks across 250+ languages, representing the largest multilingual collection of embedding model evaluation tasks. ([Read more](/details/mmteb.md)) `Benchmark` `Multilingual` `Evaluation`
- VIBE - Vector Index Benchmark for Embeddings - an extensible benchmarking suite for approximate nearest neighbor search methods using modern embedding datasets. VIBE addresses limitations of traditional ANN benchmarks by focusing on contemporary embedding models and datasets. ([Read more](/details/vibe.md)) `Benchmark` `Ann` `Embeddings`
- ViDoRe - Visual Document Retrieval Benchmark defining standard evaluation protocols for vision-centric document and video retrieval with 26,000 pages and 3,099 queries across 6 languages from 12,000 man-hours of annotations. ([Read more](/details/vidore.md)) `Benchmark` `Multimodal` `Rag`
- Confident AI - All-in-one AI quality platform with integrated evaluation, observability, and monitoring for LLM applications. Built by creators of DeepEval with 50+ research-backed metrics. ([Read more](/details/confident-ai.md)) `Llm Evaluation` `Observability` `Monitoring`
- LLM-as-Judge Evaluation - Using language models to automatically evaluate RAG system outputs, retrieval quality, and answer correctness. LLM-as-judge provides scalable, consistent evaluation of aspects like faithfulness, relevance, and coherence that are difficult to measure with traditional metrics, enabling rapid iteration on RAG systems. ([Read more](/details/llm-as-judge-evaluation.md)) `Evaluation` `LLM` `RAG`
- LongMemEval - Comprehensive benchmark for evaluating long-term memory in chat assistants with 500 manual questions testing information extraction, multi-session reasoning, and temporal reasoning across 115K-1.5M tokens. ([Read more](/details/longmemeval.md)) `Benchmark` `Agent Memory` `Evaluation`
- Vector Search Quality Metrics - Key metrics for evaluating vector search and retrieval systems including recall, precision, NDCG, MRR, and MAP. Understanding these metrics is essential for optimizing RAG systems, tuning vector indexes, and comparing embedding models for production deployments. ([Read more](/details/vector-search-quality-metrics.md)) `Metrics` `Evaluation` `quality`
- BenchmarkQED - Open benchmarking framework for Retrieval-Augmented Generation (RAG) systems designed to push the community toward fairer, comparably measured retrieval evaluation methods. ([Read more](/details/benchmarkqed.md)) `Rag` `Evaluation` `Benchmarking`
- GraphRAG-Bench - Rigorous evaluation framework that benchmarks GraphRAG against vanilla RAG across reasoning tasks, multi-hop queries, and domain challenges. Released May 2025, it provides standardized metrics for comparing graph-enhanced retrieval approaches with traditional vector-only retrieval. ([Read more](/details/graphrag-bench.md)) `Graphrag` `Evaluation` `Multi Hop`
- M3Retrieve - Benchmark dataset designed for evaluating multimodal retrieval systems in the medical domain. Tests retrieval performance on medical literature tasks involving both text and visual information, providing standardized evaluation for multimodal RAG systems. ([Read more](/details/m3retrieve.md)) `Multimodal` `Medical` `Retrieval Benchmark`
- ToolSearch Dataset - Benchmark dataset for evaluating tool retrieval systems in AI Agent applications. Provides test cases for assessing how well systems can select the most relevant tools from large tool repositories based on conversational context and task objectives. ([Read more](/details/toolsearch-dataset.md)) `Tool Retrieval` `Agent` `Benchmark`
- Qdrant ANN-Filtering-Benchmark-Datasets - Curated datasets for benchmarking filtered approximate nearest neighbor (ANN) search in vector databases. Enriched with payload metadata and pre-generated filtering requests, including synthetic and real-world data for keyword and geo-spatial queries. ([Read more](/details/qdrant-ann-filtering-benchmark-datasets.md)) `Open Source` `datasets` `Filtered Search` `Ann`
- Qdrant Vector Search Benchmarks - Open-source comparative benchmarks evaluating vector search performance of engines like Qdrant, Elasticsearch, Milvus, Redis, and Weaviate. Covers single-node upload/search, filtered search across various datasets and configurations, focusing on RPS, latency, precision, and indexing time using affordable hardware. ([Read more](/details/qdrant-vector-search-benchmarks.md)) `Open Source` `Performance` `Vector Search` `Filtered Search`
- Vector Bible - Vector Bible is a GitHub repository comparing popular vector databases across features, performance, and use cases in a structured table. It serves as a quick reference for selection by aggregating benchmarks and pricing information. It is a reference resource rather than a tool, and it complements ANN-Benchmarks for database evaluation and decision-making. ([Read more](/details/vector-bible.md)) `comparison-table` `db-evaluation` `resource`
- Vector Database Performance Benchmark 2026 - Comprehensive benchmark dataset comparing 10 vector databases across 19 fields including query latency (p50/p99), throughput, scalability limits, features like hybrid search and ACID compliance, SDK support, and managed pricing. Tested with 1M vectors at 1536 dimensions for RAG and AI search applications. Key highlights include Qdrant for lowest latency, Pinecone for managed scalability, and pgvector for ACID transactions. ([Read more](/details/vector-database-performance-benchmark-2026.md)) `benchmark` `Performance` `scalability` `2026`
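
Several of the metrics named in the Vector Search Quality Metrics entry (recall, precision, MRR, NDCG) can be computed in a few lines. The sketch below is illustrative only: it assumes binary relevance labels (a document is either relevant or not), and the function names and document IDs such as `d1` are hypothetical, not taken from any benchmark in this list.

```python
import math
from typing import Dict, List, Sequence, Set


def recall_at_k(retrieved: Sequence[str], relevant: Set[str], k: int) -> float:
    """Fraction of all relevant documents that appear in the top-k results."""
    if not relevant:
        return 0.0
    hits = sum(1 for doc_id in retrieved[:k] if doc_id in relevant)
    return hits / len(relevant)


def precision_at_k(retrieved: Sequence[str], relevant: Set[str], k: int) -> float:
    """Fraction of the top-k results that are relevant."""
    if k <= 0:
        return 0.0
    hits = sum(1 for doc_id in retrieved[:k] if doc_id in relevant)
    return hits / k


def reciprocal_rank(retrieved: Sequence[str], relevant: Set[str]) -> float:
    """1 / rank of the first relevant result, or 0.0 if none is retrieved."""
    for rank, doc_id in enumerate(retrieved, start=1):
        if doc_id in relevant:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(runs: Dict[str, List[str]], qrels: Dict[str, Set[str]]) -> float:
    """MRR: reciprocal rank averaged over all queries in `runs`."""
    if not runs:
        return 0.0
    total = sum(reciprocal_rank(docs, qrels.get(qid, set())) for qid, docs in runs.items())
    return total / len(runs)


def ndcg_at_k(retrieved: Sequence[str], relevant: Set[str], k: int) -> float:
    """NDCG@k with binary gains: DCG of the ranking divided by the ideal DCG."""
    dcg = sum(
        1.0 / math.log2(rank + 1)
        for rank, doc_id in enumerate(retrieved[:k], start=1)
        if doc_id in relevant
    )
    ideal_hits = min(len(relevant), k)
    idcg = sum(1.0 / math.log2(rank + 1) for rank in range(1, ideal_hits + 1))
    return dcg / idcg if idcg > 0 else 0.0


# Hypothetical single-query example: one ranked result list and its relevant set.
retrieved_ids = ["d3", "d1", "d7", "d2"]
relevant_ids = {"d1", "d2", "d9"}

print("recall@3   ", recall_at_k(retrieved_ids, relevant_ids, 3))
print("precision@3", precision_at_k(retrieved_ids, relevant_ids, 3))
print("RR         ", reciprocal_rank(retrieved_ids, relevant_ids))
print("NDCG@3     ", ndcg_at_k(retrieved_ids, relevant_ids, 3))
```

Benchmarks that use graded relevance judgments would replace the binary gain in `ndcg_at_k` with the graded label; the structure of the calculation stays the same.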
-
Cloud & Managed
- Amazon ElastiCache Vector Search - Vector search capabilities in Amazon ElastiCache enabling semantic caching and real-time vector similarity search with microsecond latencies. Supports billions of vectors with HNSW indexing and up to 99% recall. ([Read more](/details/aws-elasticache-vector-search.md)) `Aws` `Caching` `Cloud` `Managed`
- Amazon S3 Vector Search - Leveraging Amazon S3 as a storage layer for vector databases, enabling 70-95% cost reduction for certain use cases. S3's low storage costs make it attractive for large-scale vector datasets with appropriate access patterns. ([Read more](/details/s3-vector-search.md)) `storage` `Aws` `Cost Optimization` `Scalable`
- Azure Cache for Redis Vector Search - Vector search capabilities in Azure Cache for Redis enabling high-performance similarity search and semantic caching. Supports HNSW and FLAT indexes with integration into Azure AI ecosystem. ([Read more](/details/azure-cache-redis-vector.md)) `Azure` `Redis` `Cloud` `Managed`
-
Cloud-managed Vector Databases
- Snowflake - A cloud data platform that offers capabilities for storing and querying various data types, including vector embeddings, often used in conjunction with its data warehousing features. ([Read more](/details/snowflake.md)) `Cloud` `Data Warehousing` `Vector Embeddings`
-
Cloud Services
- Amazon Aurora Serverless v2 - An on-demand, auto-scaling configuration for Amazon Aurora DB instances that automatically adjusts compute and memory capacity based on load, integrated with Knowledge Bases for Amazon Bedrock to simplify vectorization and database capacity management. ([Read more](/details/amazon-aurora-serverless-v2.md)) `cloud-native` `serverless` `AWS`
- MotherDuck - A cloud data warehouse that can be leveraged to store vector embeddings as List data types, enabling semantic search capabilities through SQL-based similarity functions within an existing data pipeline. ([Read more](/details/motherduck.md)) `cloud` `data warehousing` `vector embeddings`
- Instaclustr Vector Database Management - A managed service and tooling offering from Instaclustr that helps teams operate and optimize vector databases for GenAI and Retrieval-Augmented Generation (RAG) workloads, providing expertise and infrastructure management for production deployments. ([Read more](/details/instaclustr-vector-database-management.md)) `managed service` `RAG` `vector databases`
- Nextbrick Managed Vector Database Service - A fully managed vector database infrastructure and operations service provided by Nextbrick. It focuses on deployment, configuration, tuning, scaling, security, and maintenance of vector databases for AI and similarity search workloads. The service handles sharding, replication, query optimization, backups, and disaster recovery so organizations can offload operational management and focus on building AI applications. ([Read more](/details/nextbrick-managed-vector-database-service.md)) `managed service` `vector database` `services`
- Qdrant Cloud Inference - Qdrant Cloud Inference is a managed inference service integrated with the Qdrant vector database, allowing users to generate embeddings and work with vector search pipelines directly in the cloud environment. ([Read more](/details/qdrant-cloud-inference.md)) `managed service` `embeddings` `vector search`
- Vertex AI Vector Search - Google Cloud's vector search engine (formerly Matching Engine) built on ScaNN algorithm. Version 2.0 features unified data model with hybrid search and auto-generated embeddings. This is a commercial managed service. ([Read more](/details/vertex-ai-vector-search.md)) `Commercial` `Google Cloud` `managed service`
- AWS OpenSearch k-NN - Managed OpenSearch service with k-nearest neighbor search capabilities. Uses HNSW, Faiss, and Lucene libraries for approximate nearest neighbor searches. This is a commercial managed service. ([Read more](/details/aws-opensearch-k-nn.md)) `Commercial` `Aws` `managed service`
- Azure Cosmos DB NoSQL Vector Search - Microsoft's globally distributed multi-model database with native vector search using DiskANN algorithm. Features <20ms query latency and 43x lower cost vs Pinecone. This is a commercial managed service. ([Read more](/details/azure-cosmos-db-nosql-vector-search.md)) `Commercial` `Azure` `Nosql`
- Turbopuffer - Serverless vector and full-text search database built on object storage with sub-10ms p50 latency. 10x cheaper than alternatives while hosting 2.5T+ documents and serving 10k+ queries per second. ([Read more](/details/turbopuffer.md)) `Serverless` `Object Storage` `Cost Effective`
- Snowflake Cortex Search - Hybrid search service within Snowflake that combines vector search, keyword search, and semantic reranking for retrieval tasks on data stored in Snowflake tables. ([Read more](/details/snowflake-cortex-search.md)) `Hybrid Search` `Managed Service` `Serverless` `Enterprise`
- Spanner Vector Search - High-performance vector search capability built into Google Cloud Spanner that enables semantic search and similarity matching on high-dimensional vector data within a transactional database, eliminating the need for separate vector databases. ([Read more](/details/spanner-vector-search.md)) `Google Cloud` `Transactional` `Hybrid Search`
- Coveo - Enterprise AI search and discovery platform combining keyword, semantic, and personalized retrieval with analytics, permissions, and governance. Abstracts vector database complexity while adding enterprise-ready features for large knowledge bases and commerce catalogs. ([Read more](/details/coveo.md)) `Enterprise` `Ai` `Search`
- Shaped - AI-native personalization and hybrid search platform that replaces the need for a standalone vector DB stack. Combines hybrid search (keyword + vector retrieval), multi-stage ranking, real-time session adaptation, and business-objective modeling with warehouse-native architecture. ([Read more](/details/shaped.md)) `Personalization` `Hybrid Search` `Recommendation`
- SkyPilot - Open-source framework for running AI and batch workloads on any cloud, enabling distributed GPU orchestration across cloud regions. Optimizes cost and throughput by accessing GPU resources across multiple regions and instance types. ([Read more](/details/skypilot.md)) `Distributed Computing` `Cloud Native` `Gpu Acceleration`
- Dynamic Yield - Dynamic Yield provides cloud-hosted vector-powered personalization and recommendations with auto-scaling, GPU-optimized inference, and seamless AWS/Azure integrations for real-time targeting. Enables enterprise RAG-like experiences and global e-commerce search without dedicated vector DBs. Simpler than Pinecone for non-technical teams; more experimentation-focused vs Zilliz Cloud. ([Read more](/details/dynamic-yield.md)) `Cloud Auto-Scale` `Multi-Cloud` `Pay-Per-Query`
-
Commerce
- Denser Retriever - Denser Retriever is a vector-based retrieval system designed for efficient similarity search and information access in AI and ML workloads. ([Read more](/details/denser-retriever.md)) `vector search` `similarity search` `AI` `commercial`
- LiquidMetal AI - LiquidMetal AI is a platform providing intelligent storage with built-in AI capabilities, including vector database features for building advanced AI applications. ([Read more](/details/liquidmetal-ai.md)) `AI` `vector databases` `commercial` `intelligent storage`
- Meilisearch Vector Search - Meilisearch offers vector search capabilities as part of its search engine, enabling hybrid and semantic search for AI applications. ([Read more](/details/meilisearch-vector-search.md)) `vector search` `semantic search` `hybrid search` `commercial` `AI`
- QdrantCloud - QdrantCloud is the managed cloud version of Qdrant, a vector database tailored for AI-powered similarity search and matching. ([Read more](/details/qdrantcloud.md)) `managed service` `vector database` `similarity search` `AI`
- Vectara - Vectara is a commercial vector database and search platform that enables semantic and hybrid AI-powered search using vector embeddings. ([Read more](/details/vectara.md)) `commercial` `vector search` `semantic search` `AI`
- vector-admin - A universal tool suite for managing vector databases such as Pinecone, Chroma, Qdrant, and Weaviate. Facilitates straightforward management and integration of multiple vector database systems. ([Read more](/details/vector-admin.md)) `management` `tools` `vector databases` `integration`
- Vectorflow - Vectorflow is a vector database optimized for real-time vector indexing and search in distributed environments, suitable for AI and machine learning use cases. ([Read more](/details/vectorflow.md)) `vector database` `real-time` `distributed` `AI`
- Qdrant Enterprise Solutions - Qdrant Enterprise Solutions provide enterprise-grade deployments and support for the Qdrant vector database, including advanced security, high availability, SLAs, and integration services for large-scale AI search and recommendation use cases. ([Read more](/details/qdrant-enterprise-solutions.md)) `enterprise` `vector database` `services`
-
Concepts & Definitions
- Ball-tree - Ball-tree is a binary tree data structure used for organizing points in a multi-dimensional space, particularly useful in vector databases for nearest neighbor search. It partitions data points into hyperspheres (balls), enabling efficient search and scalability in high-dimensional vector spaces. ([Read more](/details/ball-tree.md)) `data structure` `nearest neighbor` `vector search` `scalability`
- K-means Tree - K-means Tree is a clustering-based data structure that organizes high-dimensional vectors for fast similarity search and retrieval. It is used as an indexing method in some vector databases to optimize performance for vector search operations. ([Read more](/details/k-means-tree.md)) `clustering` `data structure` `similarity search` `high-dimensional`
- Locality-Sensitive Hashing - Locality-Sensitive Hashing (LSH) is an algorithmic technique for approximate nearest neighbor search in high-dimensional vector spaces, commonly used in vector databases to speed up similarity search while reducing memory footprint. ([Read more](/details/locality-sensitive-hashing.md)) `ANN` `similarity search` `high-dimensional` `optimization`
- M-tree - M-tree is a dynamic index structure for organizing and searching large data sets in metric spaces, enabling efficient nearest neighbor queries and dynamic updates, which are important features for vector databases handling high-dimensional vectors. ([Read more](/details/m-tree.md)) `data structure` `metric space` `nearest neighbor` `dynamic updates`
- Online Product Quantization (O-PQ) - Online Product Quantization (O-PQ) is a variant of product quantization designed to support dynamic or streaming data. It enables adaptive updating of quantization codebooks and codes in real-time, making it suitable for vector databases that handle evolving datasets. `ANN` `dynamic data` `vector search` `real-time`
- Optimized Product Quantization (OPQ) - Optimized Product Quantization (OPQ) enhances Product Quantization by optimizing space decomposition and codebooks, leading to lower quantization distortion and higher accuracy in vector search. OPQ is widely used in advanced vector databases for improving recall and search quality. `ANN` `optimization` `vector search` `accuracy`
- Product Quantization (PQ) - Product Quantization (PQ) is a technique for compressing high-dimensional vectors into compact codes, enabling efficient approximate nearest neighbor (ANN) search in vector databases. PQ reduces memory footprint and search time, making it a foundational algorithm for large-scale vector search systems. `ANN` `compression` `vector search` `scalability`
- R-tree - R-tree is a tree data structure widely used for indexing multi-dimensional information such as vectors, supporting efficient spatial queries like nearest neighbor and range queries, which are essential in vector databases. ([Read more](/details/r-tree.md)) `data structure` `spatial indexing` `vector search` `nearest neighbor`
- Spectral Hashing - Spectral Hashing is a method for approximate nearest neighbor search that uses spectral graph theory to generate compact binary codes, often applied in vector databases to enhance retrieval efficiency on large-scale, high-dimensional data. `ANN` `similarity search` `compression` `optimization`
- Vector Database - A vector database is a specialized database designed to store, index, and retrieve unstructured data represented as high-dimensional vectors, enabling efficient semantic and similarity search and powering applications such as LLM long-term memory and recommendation systems. ([Read more](/details/vector-database.md)) `vector databases` `definition` `semantic search` `similarity search`
- Deep Learning for Search - Applied book on using deep learning for search, including dense vector representations, semantic search, and neural ranking, all directly relevant to building applications on top of vector databases. ([Read more](/details/deep-learning-for-search.md)) `semantic search` `machine learning` `resources`
- Foundations of Multidimensional and Metric Data Structures - Technical book covering theory and practice of multidimensional and metric data structures for similarity search, forming a theoretical basis for index structures used in vector databases. ([Read more](/details/foundations-of-multidimensional-and-metric-data-structures.md)) `similarity search` `metric space` `data structure`
- Machine Learning Crash Course: Embeddings - Module of Google's Machine Learning Crash Course that explains word and text embeddings, how they are obtained, and the difference between static and contextual embeddings, giving essential background for using vector representations in vector databases and similarity search systems. ([Read more](/details/machine-learning-crash-course-embeddings.md)) `embedding` `machine learning` `learning`
- IVF (Inverted File Index) - IVF is an indexing technique widely used in vector databases where vectors are clustered into inverted lists (partitions), enabling efficient Approximate Nearest Neighbor search by probing only a subset of relevant partitions at query time. ([Read more](/details/ivf-inverted-file-index.md)) `ANN` `indexing` `vector search`
- PQ (Product Quantization) - Product Quantization is a compression and indexing technique for vector search that splits vectors into subspaces and quantizes each part separately, allowing vector databases to store large-scale embeddings compactly while supporting efficient ANN search. ([Read more](/details/pq-product-quantization.md)) `quantization` `ANN` `vector compression`
- Agentic RAG - An advanced RAG architecture where an AI agent autonomously decides which questions to ask, which tools to use, when to retrieve information, and how to aggregate results. Represents a major trend in 2026 for more intelligent and adaptive retrieval systems. ([Read more](/details/agentic-rag.md)) `Rag` `Ai Agents` `Llm`
- Cascading Retrieval - Advanced retrieval approach combining dense vectors, sparse vectors, and reranking in a multi-stage pipeline, achieving up to 48% better performance than single-method retrieval. ([Read more](/details/cascading-retrieval.md)) `Hybrid Search` `Rag` `Retrieval`
- Dense-Sparse Hybrid Embeddings - Combining dense vector embeddings with sparse representations in a single unified model. Captures both semantic meaning (dense) and exact term matching (sparse) for superior retrieval performance. ([Read more](/details/dense-sparse-hybrid-embeddings.md)) `Hybrid` `Embeddings` `Sparse`
- HNSW-IF - Hybrid billion-scale vector search method combining HNSW with inverted file indexes, enabling cost-efficient search by keeping centroids in memory while storing vectors on disk. ([Read more](/details/hnsw-if.md)) `Hnsw` `Disk Based` `Scalability`
- Hybrid Search - A search architecture that combines dense vector embeddings (semantic search) with sparse representations like BM25 (lexical search) to achieve better overall search quality. The industry standard approach for production RAG systems in 2026. ([Read more](/details/hybrid-search.md)) `Hybrid` `Search` `Best Practices`
- Matryoshka Embeddings - Representation learning approach encoding information at multiple granularities, allowing embeddings to be truncated while maintaining performance. Enables 14x smaller sizes and 5x faster search. ([Read more](/details/matryoshka-embeddings.md)) `Embeddings` `Optimization` `Research`
- Multimodal RAG - Retrieval-Augmented Generation extended to handle multiple modalities including text, images, video, and audio. Uses multimodal embeddings like Gemini Embedding 2 or CLIP to enable cross-modal search and generation. ([Read more](/details/multimodal-rag.md)) `Multimodal` `Rag` `Embeddings`
- RecursiveCharacterTextSplitter - LangChain's hierarchical text chunking strategy achieving 85-90% accuracy by recursively splitting using progressively finer separators to preserve semantic boundaries. ([Read more](/details/recursivecharactertextsplitter.md)) `Chunking` `Text Processing` `Rag`
- Vector Index Comparison Guide (Flat, HNSW, IVF) - Comprehensive comparison of vector indexing strategies including Flat, HNSW, and IVF approaches. Covers performance characteristics, memory requirements, and use case recommendations for 2026. ([Read more](/details/vector-index-comparison-guide-flat-hnsw-ivf.md)) `Indexing` `Comparison` `Best Practices`
- ACORN Algorithm - Performant and predicate-agnostic search algorithm for vector embeddings with structured data. Uses two-hop graph expansion to maintain high recall under selective filters in Weaviate. ([Read more](/details/acorn-algorithm.md)) `Ann` `Graph Based` `Filtering`
- Agentic Chunking - An advanced RAG chunking strategy that uses LLMs to dynamically determine optimal document splitting based on semantic meaning and content structure. Agentic chunking analyzes document characteristics and adapts the chunking approach per document for superior retrieval accuracy. ([Read more](/details/agentic-chunking-strategy.md)) `Chunking` `Llm` `Rag` `Text Processing`
- Approximate Nearest Neighbors (ANN) - Family of algorithms trading perfect accuracy for speed in high-dimensional similarity search. Enables sub-linear query time with 90%+ recall on billion-scale datasets. ([Read more](/details/approximate-nearest-neighbors-ann.md)) `Algorithm` `Ann` `Search`
- Asymmetric Search - A search paradigm where queries and documents are encoded differently, optimized for scenarios where queries are short and documents are long. Common in information retrieval and modern embedding models designed specifically for search. ([Read more](/details/asymmetric-search.md)) `Search` `Embeddings` `Retrieval`
- BBQ Binary Quantization - Elasticsearch and Lucene's implementation of the RaBitQ algorithm for 1-bit vector quantization, released under the name BBQ. Provides 32x compression with asymptotically optimal error bounds, enabling efficient vector search at massive scale with minimal accuracy loss. ([Read more](/details/bbq-binary-quantization.md)) `Quantization` `Compression` `Elasticsearch`
- Binary Quantization - Extreme vector compression technique converting each dimension to a single bit (0 or 1), achieving 32x memory reduction and enabling ultra-fast Hamming distance calculations with acceptable accuracy trade-offs. ([Read more](/details/binary-quantization.md)) `Quantization` `Compression` `Optimization`
- BM25 - Best Matching 25 ranking function for information retrieval that scores documents using query term frequency, inverse document frequency, and document length normalization. Core component of hybrid search RAG systems that combine keyword and semantic search. ([Read more](/details/bm25.md)) `Information Retrieval` `Ranking` `Keyword Search`
- BM42 - Experimental sparse embedding approach combining exact keyword search with transformer intelligence, integrating sparse and dense vector searches for improved RAG results, developed by Qdrant. ([Read more](/details/bm42.md)) `Sparse` `Hybrid Search` `Experimental`
- Chunk Overlap Strategy - Text chunking technique using 10-20% overlap between consecutive chunks to preserve context continuity and prevent information loss at chunk boundaries for improved retrieval. ([Read more](/details/chunk-overlap-strategy.md)) `Chunking` `Rag` `Text Processing`
- ColBERT and Late Interaction - Multi-vector retrieval architecture where queries and documents are represented by multiple vectors enabling fine-grained matching and improved retrieval quality through late interaction scoring. ([Read more](/details/colbert-and-late-interaction.md)) `Retrieval` `Multi Vector` `Research`
- Consistency Levels - Configuration options in distributed vector databases that trade off between data consistency, availability, and performance. Critical for understanding read/write behavior in production systems with replication. ([Read more](/details/consistency-levels.md)) `Distributed` `Performance` `Reliability`
- Context Precision - RAG evaluation metric assessing the retriever's ability to rank relevant chunks higher than irrelevant ones, measuring context relevance and ranking quality. ([Read more](/details/context-precision.md)) `Rag` `Evaluation` `Metrics`
- Context Recall - RAG evaluation metric measuring whether the retrieved context contains all the information required to produce the ideal output, assessing completeness and sufficiency of retrieval. ([Read more](/details/context-recall.md)) `Rag` `Evaluation` `Retrieval`
- Context Window - Maximum number of tokens an embedding model or LLM can process in a single input. Critical parameter for vector databases affecting chunk sizes, with modern models supporting 512 to 32,000+ tokens for long-document understanding. ([Read more](/details/context-window.md)) `Llm` `Embeddings` `Architecture`
- Context Window Management in RAG - Strategies for managing LLM context windows in RAG applications including chunk selection, context compression, and techniques for working within token limits while maintaining answer quality. ([Read more](/details/context-window-management-in-rag.md)) `context-window` `Rag` `Optimization`
- Cosine Similarity - Fundamental similarity metric for vector search measuring the cosine of the angle between vectors. Ranges from -1 to 1, with 1 indicating identical direction regardless of magnitude. ([Read more](/details/cosine-similarity.md)) `Similarity` `Distance Metric` `Vector Search`
- Cross-Encoder - Neural reranking architecture that examines full query-document pairs simultaneously for deeper semantic understanding, achieving higher accuracy than bi-encoders at the cost of computational efficiency. ([Read more](/details/cross-encoder.md)) `Reranking` `Neural Networks` `Nlp`
- Cursor-Based Pagination - A pagination technique for efficiently scrolling through large vector database result sets using cursors instead of offsets. Essential for retrieving all vectors in a collection or iterating through search results without performance degradation. ([Read more](/details/cursor-based-pagination.md)) `Pagination` `Performance` `Best Practices`
- Document Parsing for RAG - Critical preprocessing step for RAG systems involving extraction of text, tables, and images from various document formats (PDF, DOCX, HTML) using tools like Unstructured, LlamaParse, and PyPDF. ([Read more](/details/document-parsing-for-rag.md)) `Document Processing` `Rag` `Preprocessing`
- Dot Product - Vector similarity metric that reflects both the angle between vectors and their magnitudes. Used by many LLMs for training and equivalent to cosine similarity for normalized data. ([Read more](/details/dot-product.md)) `Similarity` `Distance Metric` `Llm`
- Dot Product (Inner Product) - Similarity metric computing sum of element-wise products between vectors. Efficient for normalized vectors, equivalent to cosine similarity when vectors are unit length. ([Read more](/details/dot-product-inner-product.md)) `Similarity` `Distance Metric` `Vector Search`
- Embedding Cache - Caching mechanism for storing and reusing previously computed embeddings to reduce API costs and latency. Essential optimization for production RAG systems processing repeated or similar content. ([Read more](/details/embedding-cache.md)) `Caching` `Optimization` `Cost Reduction`
- Embedding Dimension Selection - Guide to choosing optimal embedding dimensions balancing accuracy, storage costs, and computational requirements, covering Matryoshka embeddings and dimension reduction techniques. ([Read more](/details/embedding-dimension-selection.md)) `Embeddings` `Optimization` `Dimensions`
- Embedding Dimensionality - The size of vector embeddings, typically ranging from 384 to 4096 dimensions. Higher dimensions capture more information but increase storage, compute, and latency costs. ([Read more](/details/embedding-dimensionality.md)) `Embeddings` `Optimization` `Dimensions`
- Embedding Fine-Tuning - Process of adapting pre-trained embedding models to specific domains or tasks for improved performance. Techniques include supervised fine-tuning, contrastive learning, and domain adaptation to optimize embeddings for particular use cases. ([Read more](/details/embedding-fine-tuning.md)) `Embeddings` `Fine Tuning` `Machine Learning`
- Embedding Models Overview - Neural networks that convert text, images, or other data into dense vector representations. Enable semantic understanding by mapping similar concepts to nearby points in vector space. ([Read more](/details/embedding-models-overview.md)) `Embeddings` `Models` `Neural Networks`
- Euclidean Distance - Straight-line distance metric between vectors in multidimensional space, sensitive to both magnitude and direction, ideal when embedding magnitude carries important information. ([Read more](/details/euclidean-distance.md)) `Similarity Search` `Metrics` `Algorithm`
- Filtered Vector Search - Combining vector similarity search with metadata filtering. Enables queries like "find similar documents published after 2023 in category Technology". ([Read more](/details/filtered-vector-search.md)) `Filtering` `Metadata` `Hybrid Search`
- Graph RAG - RAG architecture that combines knowledge graphs with vector databases, enabling multi-hop reasoning, relationship traversal, and structured knowledge representation for more accurate and explainable AI responses. ([Read more](/details/graph-rag.md)) `Knowledge Graph` `Rag` `relationships`
- GraphRAG - Microsoft's approach to RAG that uses knowledge graphs to enhance retrieval. GraphRAG builds structured representations of documents enabling better context understanding and multi-hop reasoning for complex queries. ([Read more](/details/graphrag-microsoft.md)) `Graph` `Rag` `Knowledge Graph` `Microsoft`
- Hamming Distance - A distance metric that measures the number of positions at which corresponding elements in two vectors differ. Particularly useful for binary vectors and categorical data, commonly used with binary quantization in vector search. ([Read more](/details/hamming-distance.md)) `Distance Metric` `Binary` `Similarity`
- HCNNG - Hierarchical Clustering-based Nearest Neighbor Graph that uses minimum spanning trees (MST) to connect dataset points through multiple hierarchical clusters. Performs efficient guided search instead of traditional greedy routing. ([Read more](/details/hcnng.md)) `Ann` `Graph Based` `Clustering`
- Hybrid Search Best Practices - Comprehensive guide to combining BM25 keyword search with vector semantic search using reciprocal rank fusion and reranking. Essential pattern for production RAG systems in 2026. ([Read more](/details/hybrid-search-best-practices.md)) `Hybrid Search` `Rag` `Best Practices`
- Hybrid Search Techniques - Best practices for combining vector and keyword search using RRF and weighted fusion for improved retrieval accuracy in RAG systems. ([Read more](/details/hybrid-search-techniques.md)) `Hybrid Search` `Best Practices` `Rag`
- Hybrid Search with Reciprocal Rank Fusion - Search technique combining BM25 lexical search and semantic vector search using Reciprocal Rank Fusion (RRF) to merge results, balancing precision of keyword matching with contextual understanding of neural embeddings. ([Read more](/details/hybrid-search-with-reciprocal-rank-fusion.md)) `Hybrid Search` `Bm25` `Ranking`
- IVF-FLAT - Inverted File index with FLAT (uncompressed) vectors, partitioning the vector space into clusters with centroids, offering a balance between search speed and accuracy for approximate nearest neighbor search. ([Read more](/details/ivf-flat.md)) `Indexing` `Ivf` `Clustering`
- Late Chunking - Advanced chunking technique for long-context embeddings where documents are embedded first as a whole, then chunked, preserving contextual information and improving retrieval quality especially for technical documents. ([Read more](/details/late-chunking.md)) `Chunking` `Embeddings` `Rag`
- LLM Caching for Vector Search - Caching strategies for LLM and vector search systems including semantic caching, embedding caching, and response caching to reduce costs and improve latency in RAG applications. ([Read more](/details/llm-caching-for-vector-search.md)) `Caching` `Performance` `Cost Optimization`
- LLMOps - Operational practices and tooling for deploying, monitoring, and maintaining LLM applications in production, encompassing prompt management, model versioning, evaluation, and observability. ([Read more](/details/llmops.md)) `Operations` `MLOps` `production`
- Locality Sensitive Hashing (LSH) - Algorithmic technique for approximate nearest neighbor search in high-dimensional spaces using hash functions to map similar items to the same buckets with high probability. ([Read more](/details/locality-sensitive-hashing-lsh.md)) `Hashing` `Ann` `Algorithm`
- Locally-Adaptive Vector Quantization - Advanced quantization technique that applies per-vector normalization and scalar quantization, adapting the quantization bounds individually for each vector. Achieves four-fold reduction in vector size while maintaining search accuracy with 26-37% overall memory footprint reduction. ([Read more](/details/locally-adaptive-vector-quantization.md)) `Quantization` `Compression` `Optimization`
- Manhattan Distance - Vector distance metric calculating the sum of absolute differences between vector components. Measures grid-like distance, is less sensitive to outliers than Euclidean distance, and is often described as cheaper to compute in high-dimensional data. ([Read more](/details/manhattan-distance.md)) `Similarity` `Distance Metric` `High Dimensional`
- MaxSim - Maximum Similarity late interaction function introduced by ColBERT for ranking. Calculates cosine similarity between query and document token embeddings, keeps the maximum score per query token, and sums those maxima into the document score for highly effective long-document retrieval. ([Read more](/details/maxsim.md)) `Colbert` `Ranking` `Late Interaction`
- Multimodal Embeddings - Vector representations mapping different data types (text, images, audio, video) into a shared embedding space. Enables cross-modal search and understanding. ([Read more](/details/multimodal-embeddings.md)) `Multimodal` `Embeddings` `Cross Modal`
- RAG (Retrieval-Augmented Generation) - AI technique combining information retrieval with LLM generation. Retrieves relevant context from knowledge base before generating responses, reducing hallucinations and enabling grounded answers. ([Read more](/details/rag-retrieval-augmented-generation.md)) `Rag` `Llm` `Retrieval`
- Range Search - A vector search operation that retrieves all vectors within a specified distance threshold from the query vector, rather than a fixed number of nearest neighbors. Useful for finding all similar items above a quality threshold. ([Read more](/details/range-search.md)) `Search` `Similarity` `Threshold`
- Reciprocal Rank Fusion - Method for combining ranked lists from multiple retrieval systems in hybrid search. Standard technique in RAG pipelines for fusing BM25 and dense vector results before reranking, creating diverse high-confidence candidate sets. ([Read more](/details/reciprocal-rank-fusion.md)) `Hybrid Search` `Ranking` `Fusion`
- Reciprocal Rank Fusion (RRF) - Hybrid search algorithm combining results from multiple ranking systems by computing reciprocal ranks, commonly used to merge dense vector search with sparse keyword search for improved retrieval. ([Read more](/details/reciprocal-rank-fusion-rrf.md)) `Hybrid Search` `Ranking` `Fusion`
- Scalar Quantization - Vector compression technique reducing precision of each vector component from 32-bit floats to 8-bit integers, achieving 4x memory reduction with minimal accuracy loss for vector search. ([Read more](/details/scalar-quantization.md)) `Quantization` `Compression` `Optimization`
- Semantic Caching - A caching technique that uses vector embeddings to identify and reuse responses for semantically similar queries, reducing LLM costs and latency. Unlike traditional caches based on exact matches, semantic caching achieves cache hit ratios of up to 92% by matching queries based on semantic similarity. ([Read more](/details/semantic-caching-embeddings.md)) `Caching` `Embeddings` `Performance` `Cost Optimization`
- Semantic Search - Search technique understanding meaning and context rather than exact keyword matching. Uses vector embeddings to find semantically similar content even with different wording. ([Read more](/details/semantic-search.md)) `Search` `Embeddings` `Semantics`
- Sparse Vectors (SPLADE) - Learned sparse representation technique that creates interpretable, high-dimensional sparse vectors for text, combining benefits of traditional keyword search with neural approaches for improved retrieval. ([Read more](/details/sparse-vectors-splade.md)) `Sparse Vectors` `Neural Search` `Interpretable`
- Streaming Vector Indexing - Real-time indexing of vectors as they arrive in a stream, enabling immediate searchability without batch processing delays. Critical for applications requiring up-to-the-second freshness like social media, news, or real-time recommendations. ([Read more](/details/streaming-vector-indexing.md)) `Streaming` `Real Time` `Indexing`
- Text Chunking Strategies for RAG - Essential techniques for splitting documents into optimal-sized chunks for Retrieval-Augmented Generation, including fixed-size, recursive, semantic, and document-based chunking with overlap strategies to preserve context. ([Read more](/details/text-chunking-strategies-for-rag.md)) `Rag` `Text Processing` `Retrieval`
- Text-to-Cypher - Natural language to Cypher query generation for Neo4j graph databases. Enables users to query knowledge graphs using plain English, critical component of GraphRAG systems for generating graph traversal queries from natural language questions. ([Read more](/details/text-to-cypher.md)) `Graphrag` `Knowledge Graph` `Llm`
- TreeAH - Vector index type based on Google's ScaNN algorithm combining tree-like structure with Asymmetric Hashing quantization, optimized for batch queries with 10x faster index generation and smaller memory footprint. ([Read more](/details/treeah.md)) `Indexing` `Quantization` `Google`
- Vamana - Graph-based indexing algorithm powering Microsoft's DiskANN. Uses flat graph structure with minimized search diameter for efficient disk-based nearest neighbor search with 40x GPU speedup available via NVIDIA cuVS. ([Read more](/details/vamana.md)) `Ann` `Graph Based` `Algorithm`
- Vector Database Backup and Recovery - Best practices for backing up vector databases, disaster recovery planning, point-in-time recovery, and data migration strategies to prevent data loss and ensure business continuity. ([Read more](/details/vector-database-backup-and-recovery.md)) `Backup` `Disaster Recovery` `Operations`
- Vector Database Backup and Recovery Guide - Best practices for backup and disaster recovery in vector databases. Covers full/incremental backups, replication strategies, and cloud-native approaches for safeguarding high-dimensional embeddings. ([Read more](/details/vector-database-backup-and-recovery-guide.md)) `Backup` `Disaster Recovery` `Best Practices`
- Vector Database Cost Optimization - Comprehensive strategies for reducing vector database costs through embedding model selection, quantization, caching, and infrastructure choices. Critical for production deployments at scale. ([Read more](/details/vector-db-cost-optimization.md)) `Cost Optimization` `pricing` `Best Practices` `Scalability`
- Vector Database Cost Optimization Guide - Comprehensive strategies for reducing vector database costs including storage management, compute optimization, and monitoring. Covers cloud pricing trends and hidden costs in 2026. ([Read more](/details/vector-database-cost-optimization-guide.md)) `Cost Optimization` `Cloud` `Best Practices`
- Vector Database Deletion and Updates - Strategies for deleting and updating vectors in production systems including soft deletes, versioning, and rebuild patterns. Critical for maintaining data accuracy and handling GDPR/compliance requirements. ([Read more](/details/vector-database-deletion-and-updates.md)) `Operations` `Data Management` `Compliance`
- Vector Database Migration - Strategies and tools for migrating vector data between databases or upgrading versions. Includes export/import patterns, zero-downtime migrations, and validation techniques for production systems. ([Read more](/details/vector-database-migration.md)) `Migration` `Data Engineering` `Operations`
- Vector Database Migration Strategies - Guide to migrating vector databases including export/import procedures, zero-downtime migration patterns, data validation, and strategies for changing providers or versions. ([Read more](/details/vector-database-migration-strategies.md)) `Migration` `data-transfer` `Operations`
- Vector Database Monitoring - Observability practices for vector databases including query latency, recall metrics, storage utilization, and index health monitoring. ([Read more](/details/vector-database-monitoring.md)) `Monitoring` `Observability` `Operations`
- Vector Database Schema Design - Best practices for designing vector database schemas including vector dimensions, metadata structure, indexing strategies, and collection organization. Critical for performance, scalability, and maintainability. ([Read more](/details/vector-database-schema-design.md)) `Schema` `Design` `Best Practices`
- Vector Database Sharding - Distributing vector data across multiple nodes for horizontal scaling. Enables handling billions of vectors by partitioning data and parallelizing queries. ([Read more](/details/vector-database-sharding.md)) `Sharding` `Scalability` `Distributed`
- Vector Database Sharding Strategies - Approaches for distributing vectors across multiple nodes including horizontal sharding, data partitioning, and routing strategies for scaling vector search to billions of vectors. ([Read more](/details/vector-database-sharding-strategies.md)) `Scalability` `distributed-systems` `Architecture`
- Vector Database Use Cases - Applications of vector databases across industries including semantic search, RAG systems, recommendations, anomaly detection, and multimodal search. ([Read more](/details/vector-database-use-cases.md)) `Use Cases` `Applications` `Ai`
- Vector Index Build Strategies - Techniques for efficiently building vector indexes including batch construction, incremental updates, and online indexing. Critical for production systems that need to balance indexing speed, search performance, and resource utilization. ([Read more](/details/vector-index-build-strategies.md)) `Indexing` `Performance` `Operations`
- Vector Index Types - Overview of indexing structures for approximate nearest neighbor search including HNSW (graph-based), IVF (clustering), LSH (hashing), and tree-based approaches. ([Read more](/details/vector-index-types.md)) `Indexing` `Algorithms` `Ann`
- Zero-Shot Classification with Embeddings - Using vector embeddings to classify items into categories without training data for those specific categories. Leverages semantic similarity between text and category descriptions for instant classification. ([Read more](/details/zero-shot-classification-with-embeddings.md)) `Classification` `Zero Shot` `Embeddings`
- ASMR Technique - Agentic Search and Memory Retrieval technique by Supermemory using parallel reader agents and search agents that achieved ~99% accuracy on LongMemEval benchmark. ([Read more](/details/asmr-technique.md)) `Agent Memory` `Retrieval` `Multi Agent`
- ANN Algorithm Comparison - Placeholder entry for a comprehensive comparison of ANN algorithms in vector databases and RAG systems. ([Read more](/details/ann-algorithm-comparison.md)) `placeholder`
- Approximate Nearest Neighbors (ANN) - Algorithms and techniques for finding nearest neighbors in high-dimensional vector spaces with speed-accuracy trade-offs. ANN methods like HNSW, IVF, and DiskANN enable billion-scale vector search by sacrificing small amounts of recall for massive performance gains over exact search. ([Read more](/details/approximate-nearest-neighbors-ann.md)) `Algorithm` `approximate` `Scalability`
- Binary Quantization for Vector Search - Compression technique that converts full-precision vectors to binary representations, achieving 32x storage reduction while maintaining 90-95% recall for efficient large-scale vector search. ([Read more](/details/binary-quantization-for-vector-search.md)) `Quantization` `Compression` `Optimization` `Binary`
- Compression Ratio Optimization - Techniques for optimizing the trade-off between memory usage and accuracy in vector quantization, achieving 5-40x compression in systems like Mastra's Observational Memory. ([Read more](/details/compression-ratio-optimization.md)) `Compression` `Optimization` `Memory`
- Cross-Encoder Reranking - Two-stage retrieval where initial results from bi-encoder vector search are reranked using more expensive cross-encoder models for higher accuracy. Used in Hindsight and other systems. ([Read more](/details/cross-encoder-reranking.md)) `Reranking` `Retrieval` `Accuracy`
- Early Termination Strategy for HNSW - Optimization technique that allows HNSW vector searches to exit early when the candidate queue remains saturated, reducing latency and resource usage with minimal recall impact. ([Read more](/details/early-termination-strategy-for-hnsw.md)) `Optimization` `Hnsw` `Performance` `Algorithm`
- Event-Driven Agent Core - Agent architecture pattern in AG2 where agents respond to events rather than polling, enabling better async execution, scalability, and resource efficiency. ([Read more](/details/event-driven-agent-core.md)) `Event Driven` `Agents` `Architecture`
- GraphRAG - Retrieval-Augmented Generation approach that combines graph databases with vector search for enhanced context retrieval. Uses graph structures to capture relationships between entities while leveraging vector embeddings for semantic search. ([Read more](/details/graphrag.md)) `Rag` `Graph Database` `Hybrid Approach`
- HybridRAG - Next evolution in RAG systems that combines vector databases for semantic similarity with graph databases for relationship exploration and multi-hop reasoning. ([Read more](/details/hybridrag.md)) `Rag` `Hybrid Search` `Graph Vector`
- k-NN Search - k-Nearest Neighbors search finds the k closest vectors to a query vector in high-dimensional space. A fundamental operation in vector databases and machine learning, k-NN can be exact (brute force) or approximate (ANN) depending on performance requirements and dataset size. ([Read more](/details/k-nn-search.md)) `Algorithm` `Search` `fundamental`
- Lazy Loading Filesystem - Modal Labs' FUSE-based filesystem implementation that loads container images and dependencies on-demand, enabling sub-second container startup times for GPU workloads. ([Read more](/details/lazy-loading-filesystem.md)) `Optimization` `Containers` `Performance`
- LIRE Protocol - Lightweight incremental rebalancing protocol used in SPFresh for billion-scale vector updates with only 1% DRAM and <10% cores compared to global rebuild approaches. ([Read more](/details/lire-protocol.md)) `Indexing` `Incremental` `Algorithm`
- Multi-Vector Embeddings - Embedding approach where documents/images are represented by multiple vectors (one per token/patch) rather than a single vector, enabling fine-grained semantic matching. ([Read more](/details/multi-vector-embeddings.md)) `Embeddings` `Colbert` `Retrieval`
- Perpetual Sandbox - Sandbox architecture that maintains state indefinitely while scaling costs to zero during idle periods. Pioneered by Blaxel with sub-25ms resume times from standby mode. ([Read more](/details/perpetual-sandbox.md)) `Sandbox` `Architecture` `Cost Optimization`
- Progressive K-Annealing - Training technique in CSRv2 that stabilizes sparsity learning by gradually increasing sparsity constraints, reducing dead neurons from >80% to ~20%. ([Read more](/details/progressive-k-annealing.md)) `Training` `Sparse Embeddings` `Optimization`
- Semantic Search - A search approach that understands the meaning and intent of queries rather than just matching keywords. Using vector embeddings and similarity measures, semantic search finds conceptually relevant results even when exact terms don't match, enabling natural language queries and cross-lingual retrieval. ([Read more](/details/semantic-search.md)) `Search` `NLP` `Embeddings`
- Temporal Knowledge Graph - Knowledge graph architecture where facts have validity windows showing when they became true and when they were superseded. Core component of Zep AI's Graphiti and other agent memory systems. ([Read more](/details/temporal-knowledge-graph.md)) `Knowledge Graph` `Temporal` `Agent Memory`
- Vector Database Backup and Restore - Strategies for backing up vector databases and restoring from failures, including snapshots, incremental backups, and disaster recovery. Proper backup procedures are essential for production vector databases to prevent data loss and ensure business continuity in RAG and search systems. ([Read more](/details/vector-database-backup-and-restore.md)) `Backup` `Disaster Recovery` `Operations`
- Vector Dimensionality Reduction - Techniques for reducing embedding dimensions while preserving semantic information, including PCA, random projection, and learned compression methods like Matryoshka embeddings. Dimensionality reduction enables faster search, lower storage costs, and efficient deployment at scale. ([Read more](/details/vector-dimensionality-reduction.md)) `Optimization` `Compression` `Embeddings`
- Vector Index Types - Different indexing strategies for vector databases including HNSW, IVF, LSH, and flat indexes. Each type offers different trade-offs between query speed, build time, accuracy, and memory usage. Understanding index types is crucial for optimizing vector database performance at scale. ([Read more](/details/vector-index-types.md)) `Indexing` `Performance` `Algorithms`
- Vector Normalization - The process of scaling vectors to unit length (L2 normalization) or other standard forms. Normalized vectors enable cosine similarity computation via simple dot product and are essential for many embedding models and distance metrics used in vector databases. ([Read more](/details/vector-normalization.md)) `Preprocessing` `mathematics` `Embeddings`
- Context Engineering - Emerging discipline covering the systematic design, construction, and management of the entire information payload provided to an LLM at inference time. It moves beyond crafting single prompts to architecting the complete environment a model uses to reason and respond, treating instructions, retrieved knowledge, tools, memory, state, and the user query as structured components. ([Read more](/details/context-engineering.md)) `Llm Architecture` `Retrieval Augmented Generation` `System Design`
- L2 Normalization (Vector Normalization) - A preprocessing technique that scales vectors to unit length, ensuring all vectors lie on a hypersphere. Essential for making cosine similarity equivalent to inner product and improving embedding quality in many applications. ([Read more](/details/l2-normalization-vector-normalization.md)) `Normalization` `Preprocessing` `Embeddings`
- Multimodal RAG - Retrieval-Augmented Generation extended to handle multiple modalities including text, images, video, and audio. Uses multimodal embeddings like Gemini Embedding 2 or CLIP to enable cross-modal search and generation. ([Read more](/details/multimodal-rag.md)) `Multimodal` `Rag` `Embeddings`
- Cold Start Problem in Vector Search - Strategies for handling the cold start problem in vector databases and recommendation systems including hybrid approaches, popularity-based fallbacks, and collaborative filtering techniques. ([Read more](/details/cold-start-problem-in-vector-search.md)) `cold-start` `Recommendations` `bootstrapping`
- Contextual Compression - A RAG optimization technique that compresses retrieved documents by extracting only the most relevant portions relative to the query. Reduces token usage and improves LLM response quality by removing irrelevant context. ([Read more](/details/contextual-compression.md)) `Rag` `optimization` `compression`
- Contextual Retrieval - Anthropic's RAG technique that prepends chunk-specific explanatory context before embedding, reducing failed retrievals by 49% (67% with reranking). Uses Contextual Embeddings and Contextual BM25. ([Read more](/details/contextual-retrieval.md)) `Rag` `retrieval` `context`
- Cross-Encoder - Neural reranking architecture that examines full query-document pairs simultaneously for deeper semantic understanding, achieving higher accuracy than bi-encoders at the cost of computational efficiency. ([Read more](/details/cross-encoder.md)) `Reranking` `neural-networks` `nlp`
- Inverted File Index (IVF) - A vector indexing technique that partitions the vector space into clusters using k-means, then searches only the nearest clusters during queries. Foundation for efficient approximate nearest neighbor search, often combined with product quantization (IVF-PQ). ([Read more](/details/inverted-file-index-ivf.md)) `indexing` `ivf` `Clustering`
- IVF-PQ (Inverted File with Product Quantization) - Vector indexing method combining an inverted file index with product quantization for memory-efficient search. For a 128-dimensional float32 vector, reduces storage from 128×4 = 512 bytes to 32 one-byte codes (32 bytes, 1/16th of the original) while largely preserving search quality. ([Read more](/details/ivf-pq-inverted-file-with-product-quantization.md)) `Quantization` `indexing` `compression`
- MSTG (Multi-Stage Tree Graph) - Hierarchical vector index developed by MyScale to overcome IVF limitations. Where IVF keeps a single layer of cluster vectors, MSTG builds multiple layers for improved search performance. ([Read more](/details/mstg-multi-stage-tree-graph.md)) `indexing` `tree-based` `hierarchical`
- Multi-Tenancy Patterns - Architectural patterns for isolating data between different tenants (customers/organizations) in vector databases. Includes collection-per-tenant, partition-per-tenant, and filter-based approaches with different trade-offs. ([Read more](/details/multi-tenancy-patterns.md)) `Multi Tenant` `architecture` `security`
- Multimodal Embeddings - Vector representations mapping different data types (text, images, audio, video) into a shared embedding space. Enables cross-modal search and understanding. ([Read more](/details/multimodal-embeddings.md)) `Multimodal` `Embeddings` `cross-modal`
- Navigable Small World (NSW) - A graph-based approximate nearest neighbor search algorithm that uses both long-range and short-range links to achieve poly-logarithmic search complexity. Foundation for the more advanced HNSW algorithm. ([Read more](/details/navigable-small-world-nsw.md)) `graph-based` `Ann` `algorithm`
- Parent Document Retriever - A RAG technique that indexes small chunks for precise matching but retrieves larger parent documents for LLM context. Balances retrieval precision with comprehensive context by separating indexing granularity from context size. ([Read more](/details/parent-document-retriever.md)) `Rag` `retrieval` `chunking`
- Query Expansion for Vector Search - Techniques to improve retrieval by expanding user queries with synonyms, related terms, and reformulations including HyDE, query rewriting, and multi-query approaches. ([Read more](/details/query-expansion-for-vector-search.md)) `query-optimization` `retrieval` `Rag`
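The Inverted File Index (IVF) entry above describes partitioning vectors into k-means clusters and searching only the nearest clusters at query time. The following is a minimal pure-Python sketch of that idea, not the implementation used by any listed library; the function names (`build_ivf`, `search_ivf`) and the `nprobe` parameter name are illustrative assumptions.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(vectors, k, iters=10, seed=0):
    """Plain Lloyd's k-means; returns k centroids."""
    rng = random.Random(seed)
    centroids = [list(v) for v in rng.sample(vectors, k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for v in vectors:
            nearest = min(range(k), key=lambda c: sq_dist(v, centroids[c]))
            clusters[nearest].append(v)
        for c, members in enumerate(clusters):
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids

def build_ivf(vectors, k):
    """Assign every vector index to the inverted list of its nearest centroid."""
    centroids = kmeans(vectors, k)
    lists = [[] for _ in range(k)]
    for idx, v in enumerate(vectors):
        c = min(range(k), key=lambda j: sq_dist(v, centroids[j]))
        lists[c].append(idx)
    return centroids, lists

def search_ivf(query, vectors, centroids, lists, nprobe=2, top_k=3):
    """Scan only the nprobe clusters whose centroids are closest to the query."""
    probe = sorted(range(len(centroids)),
                   key=lambda c: sq_dist(query, centroids[c]))[:nprobe]
    candidates = [i for c in probe for i in lists[c]]
    return sorted(candidates, key=lambda i: sq_dist(query, vectors[i]))[:top_k]

# Hypothetical toy data: 200 random 2-D points.
rng = random.Random(42)
data = [[rng.random(), rng.random()] for _ in range(200)]
centroids, lists = build_ivf(data, k=8)
hits = search_ivf([0.5, 0.5], data, centroids, lists, nprobe=2)
```

Raising `nprobe` scans more clusters, trading speed for recall; probing every cluster reduces to an exact brute-force search.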
Categories
- Concepts & Definitions (144)
- Research Papers & Surveys (132)
- Vector Database Engines (87)
- Sdks & Libraries (77)
- Machine Learning Models (62)
- Curated Resource Lists (57)
- LLM Tools (47)
- SDKs & Libraries (41)
- Open Sources (32)
- Sdks Libraries (31)
- Benchmarks & Evaluation (29)
- Vector Database Extensions (26)
- LLM Frameworks (25)
- Managed Vector Databases (23)
- Data Integration & Migration (21)
- Core Vector Databases (18)
- Cloud Services (15)
- Multi Model & Hybrid Databases (14)
- Managed & Serverless Vector DBs (11)
- curated-resource-lists (11)
- Security & Governance (9)
- Llm Tools (9)
- Commerce (8)
- Vector Database (6)
- Relational Databases (5)
- Embedded Vector Databases (5)
- Vector DB Research & Surveys (4)
- Llm Frameworks (4)
- Data Processing (4)
- sdks-libraries (4)
- Graph Database (4)
- Rust-based Vector Databases (3)
- ANN Indexing Libraries (3)
- Search & Retrieval (3)
- Managed and Serverless Vector DBs (3)
- Cloud & Managed (3)
- Open Source Vector Databases (2)
- Multi-Model & Hybrid Databases (2)
- Edge Database (2)
- Star History (2)
- Integrations & Extensions (2)
- Acknowledgements (2)
- Multimodal Vector Databases (1)
- Tools (1)
- serverless-managed-vector-dbs (1)
- Evaluation & Observability (1)
- Rust-Based Vector DBs (1)
- AI Agent Optimized VDBs (1)
- Multimodal Vector DBs (1)
- Cloud-managed Vector Databases (1)
- Vector Indexing Libraries (1)
- research-papers-surveys (1)
- 2026 Trends & Startups (1)
- Experimental & Learning Vector DBs (1)
- Libraries (1)
- Embedded & Edge Vector Databases (1)
- Hybrid Vector Stores (1)
- GPU-Accelerated Vector DBs (1)
- Scalable Distributed Vector DBs (1)
- In-Memory Hybrid Vector Stores (1)
Keywords
- vector-database (33)
- vector-search (31)
- llm (29)
- rag (22)
- ai (20)
- similarity-search (17)
- nearest-neighbor-search (17)
- embeddings (16)
- search-engine (15)
- approximate-nearest-neighbor-search (15)
- machine-learning (14)
- information-retrieval (13)
- database (13)
- rust (11)
- search (11)
- python (10)
- retrieval-augmented-generation (10)
- semantic-search (9)
- openai (9)
- vector (8)
- vector-search-engine (8)
- chatgpt (8)
- milvus (7)
- gpt (7)
- llms (6)
- hnsw (6)
- faiss (6)
- benchmark (6)
- qdrant (5)
- knn-search (5)
- sql (5)
- artificial-intelligence (5)
- clustering (5)
- nearest-neighbors (5)
- hybrid-search (5)
- full-text-search (5)
- embedding (5)
- image-search (5)
- retrieval (4)
- webassembly (4)
- postgresql (4)
- vectors (4)
- gpt-4 (4)
- large-language-models (4)
- pinecone (4)
- ann (4)
- ml (4)
- agents (3)
- elasticsearch (3)
- genai (3)