awesome-vector-databases

A curated list of vector database solutions, libraries, and resources for AI applications - https://vectordb.works
https://github.com/ever-works/awesome-vector-databases

  • SDKs & Libraries

    • FlashRank - Ultra-lite and super-fast Python reranking library based on SoTA cross-encoders and LLMs, running on CPU with the tiniest reranking model in the world at ~4MB with no PyTorch dependency. ([Read more](/details/flashrank.md)) `Reranking` `Lightweight` `Open Source`
    • Graphiti - Open-source framework for building temporally-aware knowledge graphs that power AI agent memory. Graphiti tracks when facts were true and maintains historical context, combining semantic search with graph traversal. ([Read more](/details/graphiti.md)) `Open Source` `Knowledge Graph` `Temporal`
    • Hannoy - Graph-based approximate nearest neighbor search library built on LMDB key-value storage. The successor to Arroy, Hannoy combines graph-based ANN algorithms with production-ready persistent storage for vector databases. ([Read more](/details/hannoy.md)) `Graph Based` `Lmdb` `Rust`
    • hnswlib-node - Node.js bindings for HNSWlib implementing approximate nearest-neighbor search. Provides fast HNSW-based vector similarity search for JavaScript/TypeScript applications with file persistence support. ([Read more](/details/hnswlib-node.md)) `Nodejs` `Javascript` `Hnsw`
    • Infinity - High-throughput, low-latency serving engine for text embeddings, reranking models, CLIP, CLAP and ColPali with GPU acceleration support for local deployment and production use. ([Read more](/details/infinity-embedding-server.md)) `Embeddings` `Gpu Acceleration` `Open Source`
    • Instructor - Python library for extracting structured, type-safe data from Large Language Models with automatic validation, retries, and streaming support. Built on Pydantic with over 3 million monthly downloads. ([Read more](/details/instructor.md)) `Python` `structured-outputs` `validation`
    • MeMemo - A JavaScript library that brings vector search and RAG (Retrieval-Augmented Generation) to browser environments, enabling efficient searching through millions of vectors using HNSW algorithm with IndexedDB and Web Workers. ([Read more](/details/mememo.md)) `Javascript` `Browser` `Rag`
    • Milvus Client Libraries - Official SDK and client libraries for Milvus vector database supporting Python, Java, Go, Node.js, and other languages. Provides simple and intuitive APIs for vector operations, search, and data management across platforms. ([Read more](/details/milvus-client-libraries.md)) `Sdk` `Multi Language` `Milvus`
    • Ollama Embeddings - Local embedding generation through Ollama supporting models like nomic-embed-text and mxbai-embed-large. Enables completely offline embeddings with no subscription fees or API costs, ideal for privacy-focused RAG applications. ([Read more](/details/ollama-embeddings.md)) `Embeddings` `Local` `Privacy`
    • PaCMAP - Pairwise Controlled Manifold Approximation - a dimensionality reduction technique that preserves both local and global structure better than UMAP or t-SNE. Particularly effective for visualizing complex embedding spaces. ([Read more](/details/pacmap-dimensionality-reduction.md)) `Dimensionality Reduction` `Visualization` `Python` `Algorithms`
    • PQk-means - An efficient clustering method for billion-scale feature vectors that compresses input vectors into short product-quantized (PQ) codes to achieve fast and memory-efficient clustering. PQk-means can cluster one billion 128D SIFT features in 14 hours using just 32 GB of memory. ([Read more](/details/pqk-means.md)) `product quantization` `Clustering` `Compression` `Scalable` `Python`
    • PUFFINN - Parameterless and Universal Fast Finding of Nearest Neighbors - an LSH-based library for approximate nearest neighbor search with probabilistic guarantees. Features a parameterless design requiring only memory budget and result quality specifications. ([Read more](/details/puffinn.md)) `Lsh` `Ann` `Open Source`
    • Qdrant Client Libraries - Official SDKs for Qdrant vector database available in Python, Rust, Go, TypeScript, and other languages. Features OpenAPI v3 specs enabling easy client generation for virtually any programming framework. ([Read more](/details/qdrant-client-libraries.md)) `Sdk` `Multi Language` `Qdrant`
    • Redis LangCache - Semantic caching solution for LLM applications that reduces API calls and costs by recognizing semantically similar queries. Achieves up to 73% cost reduction in conversational workloads with sub-millisecond cache retrieval through vector similarity search. ([Read more](/details/redis-langcache.md)) `Caching` `Rag` `Optimization`
    • ScaNN Library - Scalable Nearest Neighbors library by Google Research that provides efficient vector similarity search at scale. Uses anisotropic vector quantization and advanced compression techniques to handle twice as many queries per second compared to alternatives. ([Read more](/details/scann-library.md)) `Ann` `Google` `Quantization`
    • Sentence Transformers (SBERT) - State-of-the-art Python framework for sentence, text, and image embeddings using siamese BERT networks, providing access to 15,000+ pre-trained models for semantic search, similarity comparison, and clustering. ([Read more](/details/sentence-transformers-sbert.md)) `Embedding` `Python` `Bert`
    • Transformers.js - JavaScript library from Hugging Face for running transformer models directly in the browser with no server required, providing embeddings, classification, and multimodal capabilities using ONNX Runtime. ([Read more](/details/transformersjs.md)) `Javascript` `Browser` `Embeddings`
    • Voy - A portable WebAssembly vector similarity search engine written in Rust with a tiny footprint (75KB gzipped). Designed for edge deployment, browsers, and IoT devices with support for k-d tree indexing and optimized for modern web applications. ([Read more](/details/voy.md)) `Wasm` `Rust` `Browser`
    • Weaviate Client Libraries - Official SDKs for Weaviate vector database in Python, TypeScript, JavaScript, Go, and Java. Provides both REST and GraphQL APIs with comprehensive support for vector search, hybrid queries, and generative search. ([Read more](/details/weaviate-client-libraries.md)) `Sdk` `Multi Language` `Weaviate`
    • vsag - Open-source library from Ant Group implementing efficient vector search algorithms, including approximate nearest neighbor search for high-dimensional vectors. ([Read more](/details/vsag.md)) `Ann` `High Dimensional` `Vector Search`
    • Autofaiss - Automatic index selection and tuning library for FAISS that selects optimal KNN index configurations to maximize recall given memory and query speed constraints, eliminating manual hyperparameter tuning. ([Read more](/details/autofaiss.md)) `Open Source` `Python` `Optimization`
    • Chroma-go - Go client library for Chroma vector database with support for in-process persistent storage, HNSW parameter configuration, and full compatibility with Chroma v1.x. ([Read more](/details/chroma-go.md)) `Go` `Client Library` `Chroma`
    • Chroma-hnswlib - Chroma's optimized fork of hnswlib for high-performance vector similarity search, providing the core indexing engine for ChromaDB with over 400K weekly downloads. ([Read more](/details/chroma-hnswlib.md)) `Python` `Hnsw` `Indexing`
    • FastPLAID - Optimized implementation of PLAID index for fast ColBERT retrieval, providing 10x storage compression and sub-200ms latency. Default index backend for PyLate library, enabling efficient multi-vector late interaction retrieval. ([Read more](/details/fastplaid.md)) `Colbert` `Index` `Multi Vector`
    • imvectordb - Super simple and easy-to-use in-memory vector database for Node.js. Perfect for quickly building prototypes or small-scale applications with a compressed file size of just 3KB. ([Read more](/details/imvectordb.md)) `Javascript` `In Memory` `Lightweight`
    • IVF-SQ8 Index - A quantization-based vector indexing algorithm that combines Inverted File Index (IVF) with 8-bit scalar quantization (SQ8). Designed to tackle large-scale similarity search challenges, achieving faster searches with a much smaller memory footprint compared to exhaustive search methods by using 8-bit integers instead of 32-bit floats. ([Read more](/details/ivf-sq8-index.md)) `Quantization` `Indexing` `memory-optimization`
    • nanoflann - C++11 header-only library for Nearest Neighbor (NN) search with KD-trees. Optimized for 2D or 3D point clouds with efficient nearest neighbor searches in spatial data structures, particularly suited for robotics and computer vision applications. ([Read more](/details/nanoflann.md)) `c++` `Kd Tree` `Header Only`
    • PyLate - Library built on Sentence Transformers for flexible training, inference, and retrieval with state-of-the-art ColBERT models. Features FastPLAID index for efficient multi-vector late interaction retrieval with 10x storage compression and sub-200ms latency. ([Read more](/details/pylate.md)) `Python` `Colbert` `Late Interaction`
    • Superlinked - Python framework for AI Engineers building high-performance search and recommendation applications that combine structured and unstructured data through vector compute. ([Read more](/details/superlinked.md)) `Vector Compute` `Multi Modal` `Python`
    • VectorDB.js - Simple in-memory vector database for Node.js that works 100% locally and in-memory by default. Uses hnswlib-node for simple vector search and Embeddings.js for simple text embeddings with support for OpenAI, Mistral and local embeddings. ([Read more](/details/vectordbjs.md)) `Javascript` `In Memory` `Local`
    • Vectra - Local vector database for Node.js with features similar to Pinecone but built using local files. Provides predictable local performance with full in-memory scans delivering sub-millisecond to low-millisecond latency for small/medium corpora. ([Read more](/details/vectra.md)) `Javascript` `Local` `File Based`
    • NLTK - The Natural Language Toolkit (NLTK) is a leading Python platform for building programs to work with human language data. It provides easy-to-use interfaces to lexical resources like WordNet, along with a suite of text processing libraries for classification, tokenization, stemming, tagging, parsing, and semantic reasoning. ([Read more](/details/nltk.md)) `Natural Language Processing` `Python` `Text Processing`
    • PyPDF2 - A pure Python PDF library for extracting text, metadata, and other content from PDF documents, commonly used in data preprocessing pipelines for vector database applications involving research papers and technical documentation. ([Read more](/details/pypdf2.md)) `Pdf` `Text Extraction` `Document Processing`
    • tiktoken - OpenAI's tokenizer library for encoding and decoding text into tokens, primarily used for calculating token counts with OpenAI's models and estimating chunk sizes for vector database document processing. ([Read more](/details/tiktoken.md)) `Tokenization` `Open Source` `Text Processing`
    • RankGPT - LLM-based document reranking approach that fine-tunes decoder-only models like LLaMA to calculate query-document relevance scores. Uses generative capabilities of large language models to improve retrieval ranking in search and RAG systems. ([Read more](/details/rankgpt.md)) `Llm Based` `Reranking` `Generative`
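
    The IVF-SQ8 entry above describes storing each vector component as an 8-bit integer instead of a 32-bit float. The following is a minimal sketch of that scalar quantization step only; it omits the inverted file (IVF) part entirely. The function names `sq8_encode` and `sq8_decode` are hypothetical, and the per-vector min/max range is a simplifying assumption: real index implementations usually learn the ranges from training data.

    ```python
    from typing import List, Tuple


    def sq8_encode(vec: List[float]) -> Tuple[bytes, float, float]:
        """Map each float to an integer code in 0..255 (one byte per dimension).

        Assumption: the range is taken from this single vector. Production
        indexes typically train the ranges on a sample of the dataset.
        """
        lo, hi = min(vec), max(vec)
        # Fall back to a scale of 1.0 when all components are equal.
        scale = (hi - lo) / 255 or 1.0
        codes = bytes(min(255, max(0, round((x - lo) / scale))) for x in vec)
        return codes, lo, scale


    def sq8_decode(codes: bytes, lo: float, scale: float) -> List[float]:
        """Approximately reconstruct the original floats from the byte codes."""
        return [lo + c * scale for c in codes]


    vec = [0.0, 0.5, 1.0, -1.0]
    codes, lo, scale = sq8_encode(vec)
    approx = sq8_decode(codes, lo, scale)
    # Each component now takes 1 byte instead of 4; the reconstruction
    # error per component is bounded by half a quantization step.
    ```

    The 4x memory saving comes at the cost of a small, bounded rounding error, which is why such indexes trade a little recall for footprint.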
  • Security & Governance

    • Cloaked AI - Protect sensitive AI data in vector databases by first encrypting it without losing functionality. ([Read more](/details/cloaked-ai.md)) `security` `encryption` `data privacy`
    • Privacera AI Governance (PAIG) - Privacera AI Governance (PAIG) is a solution designed to secure and govern AI data, including safeguarding vector databases and embeddings, ensuring data privacy and compliance for AI applications. ([Read more](/details/privacera-ai-governance-paig.md)) `data governance` `security` `compliance`
    • HONEYBEE RBAC Framework - Efficient role-based access control framework for vector databases using dynamic partitioning. Achieves up to 13.5X lower query latency than row-level security with only 1.24X memory overhead, while providing 90.4% memory reduction compared to dedicated per-role indexes. ([Read more](/details/honeybee-rbac-framework.md)) `Rbac` `Access Control` `Research`
    • Trilio for Kubernetes - An application-aware backup and disaster recovery solution for Kubernetes that provides specialized support for vector databases like Milvus, featuring immutable backups, 40x faster recovery with Continuous Restore, and CSI-native snapshots. ([Read more](/details/trilio-for-kubernetes.md)) `Backup` `Disaster Recovery` `Kubernetes`
    • Vector Database Security & Access Control - Security practices for protecting sensitive vector data including Role-Based Access Control (RBAC), encryption at rest and in transit, attribute-based policies, and protection against vector injection attacks and data reconstruction threats. ([Read more](/details/vector-database-security-access-control.md)) `Security` `Rbac` `Encryption`
    • Vector Database Security Best Practices - A guide covering security measures for vector databases, including RBAC, encryption, access control, and protection against vector-specific attacks. Essential for production deployments handling sensitive data. ([Read more](/details/vector-database-security-best-practices.md)) `Security` `Rbac` `Encryption` `Best Practices`
    • Vectorsight - The world's first purpose-built observability platform for vector databases, providing real-time monitoring, intelligent alerts, and performance optimization for AI applications using Pinecone, Qdrant, Milvus, Weaviate, and ChromaDB. ([Read more](/details/vectorsight.md)) `Observability` `Monitoring` `Analytics`
    • lakeFS - Data version control with immutable commits, audit trails for compliance (GDPR), and RBAC-compatible governance for vector data lakes. Supports enterprise data privacy through reproducible embeddings and instant rollback. Outperforms open-source Git with zero-copy branching and AI lifecycle management. ([Read more](/details/lakefs.md)) `Vector Data Privacy` `RBAC Access` `GDPR Compliant`
    • rvf-ebpf - eBPF-based kernel-level networking filters (XDP, TC) with encryption enforcement and access controls for secure vector data flows. Enables enterprise data privacy in RuVector deployments with compliance monitoring. More efficient than open-source eBPF tools with vector-optimized filtering. ([Read more](/details/rvf-ebpf.md)) `Vector Data Privacy` `RBAC Access` `GDPR Compliant`
  • Serverless Managed Vector Databases

    • Amazon S3 Vectors - Serverless object storage with native vector storage and query capabilities, supporting up to 2 billion vectors per index and 20 trillion per vector bucket. Optimized for production-scale AI workloads including RAG, semantic search, and conversational AI with sub-second query latencies. Integrates directly with Amazon Bedrock Knowledge Bases and Amazon OpenSearch Service. ([Read more](/details/amazon-s3-vectors.md)) `Serverless` `Aws` `s3` `object-storage`
  • Tools

    • RAGAS - Retrieval Augmented Generation Assessment framework for reference-free evaluation of RAG pipelines. RAGAS provides automated metrics for retrieval quality, context relevance, and generation faithfulness. ([Read more](/details/ragas-rag-evaluation.md)) `Rag` `Evaluation` `Testing` `Metrics`
  • Vector Database

    • AlayaDB - Integrated database-and-inference engine that processes document data through an LLM's forward pass into tensors stored in a dedicated KV Cache system. Advocated as an approach to optimize the retrieval-context pipeline by combining storage and inference into a unified architecture. ([Read more](/details/alayadb.md)) `Kv Cache` `Inference Integration` `Context Engineering`
    • Manu: A Cloud Native Vector Database Management System - VLDB 2022 paper introducing Manu, a cloud-native vector database management system designed for scalable similarity search in cloud environments with separated storage and compute architecture. ([Read more](/details/manu-a-cloud-native-vector-database-management-system.md)) `Cloud Native` `Distributed` `Billion Scale`
    • PostgreSQL (with pgvector) - Powerful open-source object-relational database system that, with the pgvector extension, serves as a capable vector database for AI applications. Widely used from small projects to large-scale enterprise systems, and offered as managed services by major cloud providers. ([Read more](/details/postgresql-pgvector.md)) `Open Source` `Relational` `Pgvector`
    • VAST AI OS - GPU-accelerated platform from VAST Data that includes a native vector database, designed for enterprise AI workloads including multi-agent systems, video-reasoning, and high-volume RAG. It combines vector embeddings with structured data and metadata in unified tables, enabling hybrid queries across modalities without orchestration layers or external indexes. ([Read more](/details/vast-ai-os.md)) `GPU-accelerated` `Enterprise` `Hybrid Search`
  • Vector Database Engines

    • Data Cloud Vector Database - Built into the Salesforce platform, Data Cloud Vector Database ingests various large datasets from customer interactions, classifies and organizes unstructured data, and merges it with structured data to enrich customer profiles and store as metadata in Data Cloud. It enhances generative AI by providing more relevant, accurate, and up-to-date responses through improved data retrieval and semantic search capabilities. ([Read more](/details/data-cloud-vector-database.md)) `Enterprise` `Cloud Native` `Vector Database`
    • vector engine for OpenSearch Serverless - An on-demand serverless configuration for OpenSearch Service that simplifies the operational complexities of managing OpenSearch domains, integrated with Knowledge Bases for Amazon Bedrock to support generative AI applications. ([Read more](/details/vector-engine-for-opensearch-serverless.md)) `Cloud Native` `Serverless` `opensearch`
    • Google Cloud Vertex AI Vector Search - Google Cloud Platform offers vector search as part of its Vertex AI suite, enabling scalable and integrated vector search capabilities for AI-driven applications. ([Read more](/details/google-cloud-vertex-ai-vector-search.md)) `Cloud Native` `Vector Search` `Ai` `Scalable`
    • HAKES - HAKES is a system designed for efficient data search using embedding vectors at scale, making it a relevant solution for vector database applications. ([Read more](/details/hakes.md)) `Vector Search` `Scalable` `Embeddings`
    • Qwak Vector Store - Qwak provides a vector store solution engineered for optimized storage and querying of vector embeddings, offering efficient search capabilities, high performance, scalability, and data retrieval by identifying similarities among data points. ([Read more](/details/qwak-vector-store.md)) `Vector Store` `Scalable` `Embeddings`
    • Amazon Web Services Vector Search - AWS has introduced vector search in several of its managed database services, including OpenSearch, Bedrock, MemoryDB, Neptune, and Amazon Q, making it a comprehensive platform for vector search solutions. ([Read more](/details/amazon-web-services-vector-search.md)) `cloud-native` `vector search` `managed service` `enterprise`
    • AstraDB - AstraDB (also known as Astra DB by DataStax) is a cloud-native vector database built on Apache Cassandra, supporting real-time AI applications with scalable vector search. It is designed for large-scale deployments and features a user-friendly Data API, robust vector capabilities, and automation for AI-powered applications. ([Read more](/details/astradb.md)) `cloud-native` `vector search` `scalable` `AI`
    • ChromaDB - ChromaDB (also known as Chroma or chroma-core) is an open-source vector database focused on LLM applications, emphasizing simplicity and in-memory HNSW-based dense vector search. It is suited for prototyping, metadata filtering, and offers a user-friendly interface for building and testing vector search applications, though it currently lacks hybrid and distributed features. ([Read more](/details/chromadb.md)) `open-source` `in-memory` `vector search` `LLM`
    • citrus - A distributed vector database designed for scalable and efficient vector similarity search. It is purpose-built for handling large-scale vector data and search workloads. ([Read more](/details/citrus.md)) `open-source` `distributed` `vector search` `scalable`
    • ClickHouse - ClickHouse is an open-source column-oriented database that supports vectorized computation and now offers vector search features. Its architecture enables efficient real-time analytics and vector operations, making it a relevant choice for vector database use cases. ([Read more](/details/clickhouse.md)) `open-source` `analytics` `vector search` `real-time`
    • Cottontail DB - Cottontail DB is an open-source vector database for storing and searching high-dimensional data, with features geared towards research and production environments. ([Read more](/details/cottontail-db.md)) `open-source` `vector databases` `high-dimensional` `vector search`
    • Datastax - Datastax offers a vector search solution integrated with its database platform, enabling approximate similarity search and hybrid queries for enterprise use cases. ([Read more](/details/datastax.md)) `enterprise` `vector search` `hybrid search` `similarity search`
    • Deep Lake - Deep Lake is a vector database designed as a data lake for AI, capable of storing and managing vector embeddings, text, images, and videos. It utilizes a tensor format for efficient querying and integration with AI algorithms, making it suitable for similarity search and machine learning workflows. It is open-source and tailored for handling unstructured and multimodal data, with seamless integration with frameworks like PyTorch and TensorFlow. ([Read more](/details/deep-lake.md)) `open-source` `vector search` `AI` `multimodal`
    • Google Vertex AI - Google Vertex AI offers managed vector search capabilities as part of its AI platform, supporting hybrid and semantic search for text, image, and other embeddings. ([Read more](/details/google-vertex-ai.md)) `managed service` `vector search` `hybrid search` `semantic search` `cloud-native`
    • Infinity - Infinity is an AI-native database built for LLM applications, offering fast hybrid search of dense vectors, sparse vectors, tensors, and full-text data. ([Read more](/details/infinity.md)) `AI` `LLM` `hybrid search` `vector search`
    • KDB.AI - KDB.AI is a proprietary, high-performance vector database and search engine from KX, designed for real-time analytics and AI applications that require rapid similarity search. It offers advanced vector search, integrates with popular ML tools, and supports temporal and semantic context for embeddings. ([Read more](/details/kdbai.md)) `proprietary` `real-time` `AI` `vector search`
    • LanceDB - LanceDB is a columnar vector database optimized for real-time AI use cases and analytics workloads, providing efficient vector storage and fast similarity search. ([Read more](/details/lancedb.md)) `vector search` `real-time` `analytics` `AI`
    • Manu - A cloud-native vector database management system designed for efficient storage and retrieval of vector embeddings. ([Read more](/details/manu.md)) `vector databases` `cloud-native` `vector search` `scalable`
    • Marqo - Marqo is an open-source neural search engine that leverages vector representations to enable semantic search over textual data. It abstracts vector database complexity and provides a high-level interface for building advanced search applications. ([Read more](/details/marqo.md)) `open-source` `semantic search` `vector search` `AI`
    • Microsoft Azure AI Search - Azure AI Search provides vector search capabilities as a managed service, supporting approximate KNN, hybrid search, and integration with other Azure AI tools. ([Read more](/details/microsoft-azure-ai-search.md)) `managed service` `vector search` `hybrid search` `cloud-native`
    • Microsoft Azure Vector Database - Microsoft Azure offers vector search support across multiple database services, enabling developers to leverage vector search in cloud-native and enterprise scenarios. ([Read more](/details/microsoft-azure-vector-database.md)) `cloud-native` `vector search` `enterprise` `scalable`
    • Milvus - Milvus is a mature, open-source vector database maintained by Zilliz, supporting large-scale similarity search with multiple indexing strategies and GPU acceleration. It includes variants such as Milvus Lite (lightweight version), Milvus Standalone (single-machine deployment), and Milvus Distributed (Kubernetes-based deployment for large scale). ([Read more](/details/milvus.md)) `open-source` `vector search` `scalable` `GPU acceleration`
    • MongoDB - MongoDB is a general-purpose database that now includes vector search capabilities, enabling light vector workloads alongside traditional database functionality. MongoDB Atlas, the managed cloud offering, includes vector search built on Lucene, supporting ANN queries and hybrid search. ([Read more](/details/mongodb.md)) `vector search` `hybrid search` `NoSQL` `managed service`
    • MongoDB Atlas Vector Search - An integrated feature of MongoDB Atlas that enables vector-based retrieval and similarity search over unstructured data within a document database, supporting up to 2,048 dimensions and hybrid search capabilities. ([Read more](/details/mongodb-atlas-vector-search.md)) `cloud-native` `vector search` `document database` `hybrid search`
    • MyScale - A relational database engine extended with native vector search capabilities, allowing for scalable and efficient similarity search in combination with SQL queries. ([Read more](/details/myscale.md)) `vector search` `SQL` `scalable` `hybrid search`
    • Neo4j - Neo4j is a graph database that has added vector search capabilities, providing unique and effective approaches for retrieval augmented generation (RAG) and other AI applications. ([Read more](/details/neo4j.md)) `graph database` `vector search` `RAG` `AI`
    • NucliaDB - NucliaDB is a commercial vector database that enables semantic and vector search across unstructured data, supporting advanced AI and ML-powered applications. ([Read more](/details/nucliadb.md)) `commercial` `vector search` `semantic search` `AI`
    • OpenSearch - OpenSearch is a fully open-source, community-driven search and analytics suite that supports vector search, providing a transparent and flexible alternative for organizations seeking advanced search features. ([Read more](/details/opensearch.md)) `open-source` `vector search` `analytics` `scalable`
    • Oracle Database Vector Search - Oracle's core database now includes vector search capabilities, enabling enterprises to run scalable vector queries natively as part of their data management workflows. It supports approximate KNN and hybrid search for enterprise-scale use cases. ([Read more](/details/oracle-database-vector-search.md)) `enterprise` `vector search` `hybrid search` `KNN`
    • orama - Orama is a lightweight search engine that supports vector and hybrid search functionalities, suitable for browser, server, or edge environments. ([Read more](/details/orama.md)) `open-source` `vector search` `hybrid search` `lightweight`
    • Pinecone - Pinecone is a managed vector database service optimized for handling vector embeddings at scale. It provides straightforward APIs and infrastructure management, making it popular for adding vector search capabilities to modern applications. ([Read more](/details/pinecone.md)) `managed service` `vector search` `scalable` `AI`
    • Qdrant - Qdrant is a dedicated vector database and similarity search engine supporting advanced filtering and efficient retrieval, suitable for faceted search and retrieval-augmented generation. It offers self-hosted and cloud deployment options, making it highly relevant for vector search applications. ([Read more](/details/qdrant.md)) `open-source` `vector search` `similarity search` `RAG`
    • Redis - Redis, while primarily an in-memory data store, offers vector search through its RediSearch module, which adds native vector indexing, similarity search, and hybrid queries, while the RedisAI module provides deep learning model management. This makes it suitable for real-time AI and semantic search applications, particularly for existing Redis users. ([Read more](/details/redis.md)) `vector search` `in-memory` `hybrid search` `real-time`
    • SingleStore - SingleStore (also known as SingleStoreDB) is a real-time relational data platform with built-in vector search, enabling high-performance similarity queries alongside structured SQL-based analytics for intelligent applications with high-throughput requirements. ([Read more](/details/singlestore.md)) `real-time` `vector search` `SQL` `analytics`
    • Solr - Solr is a mature open-source search engine that has incorporated vector search capabilities, making it relevant for enterprises looking to implement vector-based search alongside traditional keyword search. ([Read more](/details/solr.md)) `open-source` `vector search` `hybrid search` `enterprise`
    • Supabase Vector - Supabase Vector extends the Supabase platform by providing vector database functionalities, making it easy to add vector search capabilities to applications with PostgreSQL backend. ([Read more](/details/supabase-vector.md)) `vector search` `PostgreSQL` `open-source` `integration`
    • Transwarp Hippo - Transwarp Hippo is an enterprise-grade, cloud-native distributed vector database designed for scalable vector operations, including similarity search and clustering, targeting massive datasets and real-time recommendation systems. ([Read more](/details/transwarp-hippo.md)) `enterprise` `cloud-native` `distributed` `vector search`
    • Trieve - Trieve provides an all-in-one infrastructure for vector search, recommendations, retrieval-augmented generation (RAG), and analytics, accessible via API for seamless integration. ([Read more](/details/trieve.md)) `open-source` `vector search` `RAG` `analytics`
    • Typesense - Typesense is an open-source search engine that supports hybrid search, including vector search capabilities, providing an alternative to proprietary vector search solutions. ([Read more](/details/typesense.md)) `open-source` `hybrid search` `vector search` `full-text search`
    • Vald - Vald is an open-source, highly scalable distributed vector search engine known for its asynchronous auto-indexing and ability to efficiently handle large-scale vector data in real time, making it suitable for demanding vector search applications. ([Read more](/details/vald.md)) `open-source` `distributed` `scalable` `real-time`
    • vearch - Vearch is a distributed vector search engine designed for AI-native applications, enabling scalable and efficient similarity search across large datasets. ([Read more](/details/vearch.md)) `open-source` `distributed` `vector search` `AI`
    • Vector.ai - Vector.ai offers commercial vector database solutions for efficient high-dimensional similarity search and machine learning applications. ([Read more](/details/vectorai.md)) `commercial` `vector search` `machine learning` `similarity search`
    • Vespa.ai - Vespa.ai is a scalable open-source platform for real-time big data serving and vector search. It supports vector similarity search and is used for applications such as retrieval-augmented generation and e-commerce search. ([Read more](/details/vespaai.md)) `open-source` `vector search` `real-time` `scalable`
    • Weaviate - Weaviate is an open-source, cloud-native vector database that supports fast semantic search, modular extensions, and graph-like querying, making it an ideal solution for building scalable, modern vector search applications. ([Read more](/details/weaviate.md)) `open-source` `cloud-native` `semantic search` `scalable`
    • Zilliz Cloud - Zilliz Cloud is a fully managed vector database service powered by Milvus, offering hassle-free deployment, scalability, and high performance for vector search applications. ([Read more](/details/zilliz-cloud.md)) `cloud-native` `managed service` `vector search` `Milvus`
    • Data Cloud Vector Database - Built into the Salesforce platform, Data Cloud Vector Database ingests various large datasets from customer interactions, classifies and organizes unstructured data, and merges it with structured data to enrich customer profiles and store as metadata in Data Cloud. It enhances generative AI by providing more relevant, accurate, and up-to-date responses through improved data retrieval and semantic search capabilities. ([Read more](/details/data-cloud-vector-database.md)) `enterprise` `cloud-native` `vector database`
    • Instaclustr - Instaclustr offers comprehensive managed services for vector databases, handling deployment, configuration, ongoing maintenance, tuning, optimization, scalability, security, and data protection. This allows organizations to offload the complexities of managing their vector database infrastructure and focus on their core business objectives. ([Read more](/details/instaclustr.md)) `managed service` `cloud` `enterprise`
    • Qwak - A platform designed to simplify the building, management, and deployment of Large Language Model (LLM) applications, enabling rapid operationalization of context-aware LLMs and offering integration with its Vector Store. ([Read more](/details/qwak.md)) `MLOps` `LLM` `platform`
    • vector engine for OpenSearch Serverless - An on-demand serverless configuration for OpenSearch Service that simplifies the operational complexities of managing OpenSearch domains, integrated with Knowledge Bases for Amazon Bedrock to support generative AI applications. ([Read more](/details/vector-engine-for-opensearch-serverless.md)) `cloud-native` `serverless` `OpenSearch`
    • Aerospike - A multi-model AI database designed for high-throughput vector processing at scale, supporting real-time AI use cases with a patented Hybrid Memory Architecture and efficient infrastructure usage, capable of handling large volumes of data and concurrent users. ([Read more](/details/aerospike.md)) `multi-model` `real-time` `scalable`
    • AllegroGraph - A database that incorporates neuro-symbolic AI and offers a managed service (AllegroGraph Cloud) for neuro-symbolic AI knowledge graphs, indicating its relevance to advanced AI applications, likely including vector capabilities. ([Read more](/details/allegrograph.md)) `graph database` `AI` `knowledge graph`
    • Blaze - An emerging solution diversifying the options available to data engineers in the vector database landscape. ([Read more](/details/blaze.md)) `vector database` `emerging` `data engineering`
    • DataFusion - A general-purpose analytical engine with built-in vector processing capabilities, excelling at traditional analytical workloads and efficient handling of vector operations. It is an example of a vector engine. ([Read more](/details/datafusion.md)) `analytical engine` `vector processing` `open-source`
    • HAKES - HAKES is a system designed for efficient data search using embedding vectors at scale, making it a relevant solution for vector database applications. ([Read more](/details/hakes.md)) `vector search` `scalable` `embeddings`
    • JaguarDB - JaguarDB is a database solution classified as a vector database. ([Read more](/details/jaguardb.md)) `vector database` `commercial` `high-performance`
    • MTEB: Massive Text Embedding Benchmark - A massive text embedding benchmark for evaluating the quality of text embedding models, crucial for vector database applications. ([Read more](/details/mteb-massive-text-embedding-benchmark.md)) `embeddings` `evaluation` `benchmark`
    • ObjectBox - A high-performance embedded database for edge devices and mobile, offering vector search capabilities for AI applications. ([Read more](/details/objectbox.md)) `embedded` `edge` `mobile`
    • Photon Engine - A general-purpose analytical engine with built-in vector processing capabilities, excelling at traditional analytical workloads and efficient handling of vector operations. It is an example of a vector engine. ([Read more](/details/photon-engine.md)) `analytical engine` `vector processing` `performance`
    • Quokka - An emerging solution diversifying the options available to data engineers in the vector database landscape. ([Read more](/details/quokka.md)) `vector database` `emerging` `data engineering`
    • Qwak Vector Store - Qwak provides a vector store solution engineered for optimized storage and querying of vector embeddings, offering efficient search capabilities, high performance, scalability, and data retrieval by identifying similarities among data points. ([Read more](/details/qwak-vector-store.md)) `vector store` `scalable` `embeddings`
    • Vector Databases - A critical emerging technology focused on processing, storing, and retrieving vast amounts of high-dimensional vector data rapidly and efficiently. Unlike traditional databases, they offer unique advantages for use cases such as image and video recognition, natural language processing (NLP), and Retrieval-Augmented Generation (RAG). ([Read more](/details/vector-databases.md)) `vector databases` `AI` `RAG`
    • Google Cloud Vertex AI Vector Search - Google Cloud Platform offers vector search as part of its Vertex AI suite, enabling scalable and integrated vector search capabilities for AI-driven applications. ([Read more](/details/google-cloud-vertex-ai-vector-search.md)) `cloud-native` `vector search` `AI` `scalable`
    • Qdrant Vector Database - Qdrant is an open-source vector database designed for high-performance similarity search and AI applications such as RAG, recommendation systems, advanced semantic search, anomaly detection, and AI agents. It provides scalable storage and retrieval of vector embeddings with features like filtering, hybrid search, and production-grade APIs for integrating with machine learning workloads. ([Read more](/details/qdrant-vector-database.md)) `vector database` `open-source` `hybrid search`
    • Qdrant Edge - Qdrant Edge is a private beta offering of Qdrant optimized for edge and on-device deployments, enabling low-latency vector search and AI capabilities closer to where data is generated. ([Read more](/details/qdrant-edge.md)) `edge` `embedded` `vector search`
    • seekdb - seekdb is OceanBase’s experimental vector database component for high-performance nearest neighbor search over embedding vectors. ([Read more](/details/seekdb.md)) `ANN` `vector database` `high-performance`
    • tinyvector - tinyvector is a minimal vector database / ANN engine focused on simplicity and compact implementation for educational and small-scale similarity search uses. ([Read more](/details/tinyvector.md)) `ANN` `similarity search` `lightweight`
    • Deep Lake - Deep Lake is a vector database designed as a data lake for AI, capable of storing and managing vector embeddings, text, images, and videos. It utilizes a tensor format for efficient querying and integration with AI algorithms, making it suitable for similarity search and machine learning workflows. It is open-source and tailored for handling unstructured and multimodal data, with seamless integration with frameworks like PyTorch and TensorFlow. ([Read more](/details/deep-lake.md)) `open-source` `vector search` `AI` `multimodal`
    • Deep Lake 4.0 - AI data lake with index-on-the-lake technology enabling sub-second queries from S3. Claims 10x better cost efficiency than in-memory databases and 2x faster performance than alternatives. This is a commercial platform with OSS components. ([Read more](/details/deep-lake-40.md)) `Commercial` `Data Lake` `Multimodal`
    • Zvec - Open-source, in-process vector database by Alibaba, positioned as the SQLite of vector databases. Delivers production-grade, low-latency similarity search with minimal setup, achieving >8,000 QPS on VectorDBBench. Released under the Apache 2.0 license. ([Read more](/details/zvec.md)) `Open Source` `Embedded` `Lightweight`
    • Turso - SQLite-based database with native vector search capabilities built directly into the database without extensions. Based on the libSQL fork of SQLite, with support for the DiskANN algorithm for approximate nearest neighbor search. This is a commercial solution with a free tier available. ([Read more](/details/turso.md)) `Commercial` `Sqlite` `Edge`
    • YugabyteDB with pgvector - PostgreSQL-compatible distributed database with pgvector support and USearch integration, proven to handle billions of vectors with 96.56% recall and sub-second query latency. ([Read more](/details/yugabytedb-with-pgvector.md)) `Postgresql` `Distributed` `Open Source`
    • Actian VectorAI DB - Edge-native vector database designed for deployment at remote locations and edge devices with no cloud dependency. Supports real-time decision making with sub-15ms query latency and operates independently during network outages. ([Read more](/details/actian-vectorai-db.md)) `Edge` `On Premises` `Offline`
    • Couchbase Lite Vector - Developer-friendly, full-featured embedded NoSQL database with vector search for offline-first GenAI apps running on mobile, IoT devices, and web browsers with no internet dependencies. ([Read more](/details/couchbase-lite-vector.md)) `Embedded` `Offline` `Mobile`
    • Jina VectorDB - A Pythonic vector database offering comprehensive CRUD operations with robust scalability through sharding and replication. Built on DocArray for vector search and Jina for efficient index serving, deployable from local to cloud environments. ([Read more](/details/jina-vectordb.md)) `Python` `Docarray` `Open Source`
    • Quickwit - Cloud-native search engine for observability built on Tantivy, offering sub-second search on data stored in object storage as an open-source alternative to Datadog, Elasticsearch, Loki, and Tempo. ([Read more](/details/quickwit.md)) `Observability` `Open Source` `Cloud Native`
    • RUMMY - A GPU-accelerated vector query processing system that supports large vector datasets beyond GPU memory. RUMMY uses reordered pipelining to efficiently overlap data transmission and GPU computation, achieving up to 135× better performance than traditional GPU-based approaches. ([Read more](/details/rummy-gpu-vector-search.md)) `Gpu Acceleration` `High Performance` `Scalable`
    • ScyllaDB Vector Search - High-performance NoSQL database with vector search capabilities built on USearch library and shard-per-core architecture, storing vector embeddings alongside structured data in unified tables. ([Read more](/details/scylladb-vector-search.md)) `Nosql` `Distributed` `High Performance`
    • SemaDB - A vector database with multi-index hybrid keyword search capabilities, offering both pure vector search (v1) and hybrid keyword search (v2) implementations through a simple REST API with JSON or MessagePack support. ([Read more](/details/semadb.md)) `Hybrid Search` `Open Source` `Rest Api`
    • Swirl - Open-source federated AI search platform that simultaneously searches across 100+ enterprise data sources without requiring data migration, using AI to re-rank unified results. ([Read more](/details/swirl-metasearch.md)) `Federated Search` `Open Source` `Enterprise`
    • HollowDB Vector - Decentralized vector database built on Arweave network with HNSW index implementation, providing privacy-preserving vector search capabilities for Web3 and AI applications. ([Read more](/details/hollowdb-vector.md)) `Decentralized` `Web3` `Privacy` `Open Source`
    • BBANN - Track 2 winning entry for the NeurIPS'21 Billion-Scale ANNS Competition, developed by a team from Zilliz and Southern University of Science and Technology. Competed in the out-of-core indices track where search uses Azure Standard_L8s_v2 VMs with 8 vCPUs, 64GB RAM, and a 1TB local SSD. ([Read more](/details/bbann.md)) `Competition Winner` `Out Of Core` `Disk Based Index`
    • Blockify - AI search and vector database platform that provides unified vector search with semantic understanding, hybrid search capabilities, and developer-friendly APIs for building intelligent search applications. ([Read more](/details/blockify.md)) `Ai Search` `Hybrid Search` `Semantic Search` `Developer Api`
    • EmbeddixDB - High-performance vector database designed for LLM memory and RAG applications. Provides an MCP (Model Context Protocol) server for seamless integration with AI assistants like Claude, and a REST API for traditional applications. Supports HNSW and flat indexes with 256x memory compression via quantization, flexible storage backends, auto-embedding, and advanced analytics. ([Read more](/details/embeddixdb.md)) `Open Source` `Hnsw` `Rag` `Mcp`
    • Tribase — Vector Data Query Engine with Triangle Inequality Pruning - SIGMOD 2025 paper introducing Tribase, a vector data query engine that uses triangle inequalities for reliable and lossless pruning compression, achieving efficient similarity search without sacrificing accuracy. ([Read more](/details/tribase-vector-data-query-engine-with-triangle-inequality-pruning.md)) `Similarity Search` `Pruning` `High Performance`
    • Pixeltable - Pixeltable is an open-source database featuring automatic incremental embedding indexing for efficient vector search. It is released under the Apache License 2.0 and is designed for handling embeddings in AI applications. ([Read more](/details/pixeltable.md)) `Open Source` `incremental` `Embeddings`
    • YDB - YDB is an open-source distributed SQL database with vector search capabilities, released under the Apache License 2.0. It supports high-performance vector similarity search for AI and machine learning applications. ([Read more](/details/ydb.md)) `Open Source` `Distributed` `Sql`
    • ospipe - RuVector-enhanced personal AI memory for Screenpipe, replacing SQLite with semantic vector search, knowledge graphs, and attention reranking. ([Read more](/details/ospipe.md)) `Open Source` `Rust` `Memory` `Semantic Search`
    • RankT5 - Open-source reranking model that uses a T5 architecture (encoder-decoder or encoder-only variants), fine-tuned with pairwise or listwise ranking losses to output a relevance score for each query-document pair directly rather than generating classification tokens. ([Read more](/details/rankt5.md)) `Open Source` `encoder-decoder` `LLM-reranking`
    • RankZephyr - Open-source reranking model based on a fine-tuned decoder-only LLM (Zephyr-7B, a Mistral-7B derivative), designed for listwise document reranking in RAG pipelines. RankZephyr leverages supervised fine-tuning on ranking datasets to improve query-document relevance scoring beyond what zero-shot LLM prompts can achieve. ([Read more](/details/rankzephyr.md)) `Open Source` `LLM-reranking` `listwise-ranking`
    • rvf-runtime - Runtime engine for RVF including store API, copy-on-write, and compaction features. Powers persistent and efficient vector data management in RuVector applications. ([Read more](/details/rvf-runtime.md)) `Rust` `runtime` `cow` `compaction` `Open Source`
    • VAST CNode-X - GPU-accelerated server from VAST Data that combines the VAST AI OS with NVIDIA data-processing libraries and onboard NVIDIA RTX PRO 6000 Blackwell Server Edition GPUs. Designed for enterprise AI workloads requiring high-throughput vector search, data vectorization, and inference, it leverages the NVIDIA AI Data Platform reference design. ([Read more](/details/vast-cnode-x.md)) `GPU-accelerated` `server-hardware` `Enterprise` `Vector Database 2026` `Ann Benchmarks` `Rag Optimized`
    • Vexless — Serverless Vector Data Management - SIGMOD 2024 paper introducing Vexless, a serverless vector data management system built on cloud functions that decouples compute and storage for elastic, pay-per-use vector similarity search. ([Read more](/details/vexless-serverless-vector-data-management.md)) `Serverless` `Cloud Native` `Similarity Search` `Vector Database 2026` `Ann Benchmarks` `Rag Optimized`
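The Tribase entry above mentions triangle-inequality pruning. The following is a minimal sketch of the general idea only, not Tribase's actual algorithm: with distances from every stored vector to a single pivot precomputed, the bound |d(q, p) - d(p, x)| <= d(q, x) lets an exact nearest-neighbor scan skip candidates that cannot beat the current best. The function names, the single-pivot layout, and the sample data are all assumptions made for illustration.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def build_pivot_table(vectors, pivot):
    """Precompute d(pivot, x) for every stored vector x (done once, offline)."""
    return [euclidean(pivot, v) for v in vectors]


def nearest_with_pruning(query, vectors, pivot, pivot_dists):
    """Exact nearest neighbor; returns (index, distance, distances_computed)."""
    d_qp = euclidean(query, pivot)
    best_idx, best_dist, computed = -1, math.inf, 0
    for i, v in enumerate(vectors):
        # Triangle inequality: d(q, x) >= |d(q, p) - d(p, x)|.
        lower_bound = abs(d_qp - pivot_dists[i])
        if lower_bound >= best_dist:
            continue  # x cannot be strictly closer than the current best
        d = euclidean(query, v)
        computed += 1
        if d < best_dist:
            best_idx, best_dist = i, d
    return best_idx, best_dist, computed


# Hypothetical toy data: two points near the origin, three far away.
vectors = [(0.0, 0.0), (1.0, 0.0), (5.0, 5.0), (9.0, 9.0), (10.0, 10.0)]
pivot = vectors[0]
pivot_dists = build_pivot_table(vectors, pivot)

idx, dist, computed = nearest_with_pruning((0.9, 0.1), vectors, pivot, pivot_dists)
print(idx, round(dist, 4), computed)
```

Because the bound never overestimates the true distance, skipped candidates can never be the true nearest neighbor, so the result matches a full scan while computing fewer distances.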
  • Vector Database Extensions

    • k-NN plugin - An OpenSearch plugin that expands its capabilities with the custom `knn_vector` data type, enabling storage of embeddings and providing methods for k-NN similarity searches, including Approximate k-NN, Script Score k-NN, and Painless extensions. ([Read more](/details/k-nn-plugin.md)) `OpenSearch` `k-NN` `vector search`
    • HeatWave - A feature for MySQL that integrates vector store capabilities, allowing users to store and process vector embeddings for AI applications. ([Read more](/details/heatwave.md)) `MySQL` `vector store` `extension`
    • Lantern - Lantern is a PostgreSQL extension that enables efficient vector search capabilities, allowing users to perform similarity searches directly within their PostgreSQL databases. ([Read more](/details/lantern.md)) `PostgreSQL` `vector search` `extension`
    • MariaDB Vector - MariaDB Vector is an extension or feature of MariaDB, providing capabilities for handling and querying vector data within the MariaDB ecosystem. ([Read more](/details/mariadb-vector.md)) `relational database` `vector search` `extension`
    • faiss-quickeradc - faiss-quickeradc is an extension of FAISS that implements the Quicker ADC approach to accelerate product-quantization-based approximate nearest neighbor search using SIMD, improving performance in vector database retrieval. ([Read more](/details/faiss-quickeradc.md)) `ANN` `product quantization` `optimization`
    • OpenSearch Neural Search / Hybrid Search - Neural and hybrid search capability in OpenSearch that combines lexical queries with vector-based neural search using a pipeline of normalization and score combination techniques. It enables semantic (vector) search and hybrid search over indices such as `neural_search_pqa`, suitable for AI and vector database-style retrieval use cases. ([Read more](/details/opensearch-neural-search-hybrid-search.md)) `hybrid search` `semantic search` `vector search`
    • pgvecto.rs - PostgreSQL extension for scalable, low-latency vector search written in Rust. Features 20x faster HNSW than pgvector, with support for FP16, INT8, and binary vectors. This is an OSS extension. ([Read more](/details/pgvectors.md)) `Open Source` `Postgresql` `Rust`
    • pgvectorscale - Open-source PostgreSQL extension that builds on pgvector with higher-performance embedding search and cost-efficient storage. Features StreamingDiskANN index inspired by Microsoft's DiskANN algorithm. This is an OSS solution under PostgreSQL license. ([Read more](/details/pgvectorscale.md)) `Open Source` `Postgresql` `Diskann`
    • VectorChord - PostgreSQL extension for scalable, high-performance vector search, successor to pgvecto.rs. Features RaBitQ quantization enabling 6x cost savings vs Pinecone. Fully compatible with pgvector. This is an OSS extension. ([Read more](/details/vectorchord.md)) `Open Source` `Postgresql` `Quantization`
    • pgai - Open-source PostgreSQL extension and Python library that automates embedding generation and synchronization for RAG and semantic search applications. Features pgai Vectorizer for declarative embedding pipelines. This is an OSS solution. ([Read more](/details/pgai.md)) `Open Source` `Postgresql` `Embedding`
    • libSQL - Open-source, open-contribution fork of SQLite maintained by Turso that adds native vector search with DiskANN indexing, making it production-ready and fully backwards compatible with SQLite. ([Read more](/details/libsql.md)) `Sqlite` `Diskann` `Embedded`
    • pg_embedding - PostgreSQL extension enabling the Hierarchical Navigable Small World (HNSW) algorithm for vector similarity search. Developed by Neon, it delivers 5-30x faster performance compared to pgvector's IVFFlat indexing for approximate nearest neighbor search. ([Read more](/details/pg_embedding.md)) `Postgresql` `Hnsw` `Open Source`
    • PlanetScale Vectors - Vector search and storage for MySQL, now generally available. PlanetScale Vectors brings native vector capabilities to MySQL, allowing you to store and query vector embeddings alongside relational data without requiring a separate vector database. ([Read more](/details/planetscale-vectors.md)) `Mysql` `Cloud Native` `Vector Search`
    • Redis Vector Search - Native vector database capabilities in Redis combining ultra-low latency in-memory operations with vector similarity search. Redis 8.0 introduced vector sets as native data type for semantic search, RAG pipelines, and recommendations. ([Read more](/details/redis-vector-search.md)) `Redis` `In Memory` `Real Time`
    • Timescale Vector - PostgreSQL-based vector search solution built on Timescale Cloud with the pgvector, pgvectorscale, and pgai extensions. Features the StreamingDiskANN index for high-performance embedding search at scale. ([Read more](/details/timescale-vector.md)) `Postgresql` `Pgvector` `Time Series`
    • Neo4j Vector Search - An enhancement to the Neo4j graph database providing vector search capabilities through dedicated indexes. ([Read more](/details/neo4j-vector-search.md)) `Graph Database` `Vector Search` `extension`
    • Apache Solr Dense Vector Search - Vector search capabilities in Apache Solr with HNSW indexing, early termination optimization, and integrated text-to-vector capabilities for hybrid search applications. ([Read more](/details/apache-solr-dense-vector-search.md)) `Open Source` `Hybrid Search` `Java` `Search Engine`
    • DuckDB VSS Extension - Experimental extension for DuckDB that adds HNSW indexing support to accelerate vector similarity search queries using DuckDB's fixed-size ARRAY type. First custom index type provided through a DuckDB extension. ([Read more](/details/duckdb-vss-extension.md)) `Duckdb` `Hnsw` `Sql`
    • GridStore - Qdrant's custom-built storage engine written in Rust, replacing RocksDB with improved performance and lower latency for payload and sparse vector storage. ([Read more](/details/gridstore.md)) `Storage Engine` `Rust` `Performance`
    • ParadeDB - PostgreSQL extension enabling fast full-text, faceted, and hybrid search over Postgres tables using the BM25 algorithm. Built on Tantivy for production-ready search with ACID guarantees and transactional consistency. ([Read more](/details/paradedb.md)) `Postgresql` `Bm25` `Hybrid Search`
    • PGLite - Lightweight WASM Postgres build packaged into a TypeScript client library that enables running PostgreSQL in the browser, Node.js, Bun, and Deno with pgvector support. At only 3MB gzipped, it provides full Postgres functionality including vector search capabilities without requiring separate database installation. ([Read more](/details/pglite.md)) `WebAssembly` `PostgreSQL` `Lightweight`
    • Qdrant 1.5-bit Quantization - Middle-ground quantization introduced in Qdrant v1.15.0 that provides better precision than binary quantization while being more aggressive than scalar quantization. ([Read more](/details/qdrant-15-bit-quantization.md)) `Quantization` `Optimization` `Qdrant`
    • SQLite VSS - A SQLite extension for efficient vector similarity search based on FAISS, enabling semantic search, recommendations, and question-answering directly within SQLite databases. ([Read more](/details/sqlite-vss.md)) `Sqlite` `Faiss` `Extension`
    • QBit - ClickHouse's vector search extension that adds kNN similarity search capabilities to the ClickHouse columnar database, enabling hybrid analytical + vector queries at scale. ([Read more](/details/qbit.md)) `Clickhouse` `Columnar` `Knn` `Analytical` `Extension`
    • ruvector-postgres - PostgreSQL extension providing 230+ SQL functions as a pgvector replacement, enabling vector search, graph queries, and AI features directly in relational databases. ([Read more](/details/ruvector-postgres.md)) `Postgres` `Sql` `Extension` `Pgvector`
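Several entries above describe hybrid search as a pipeline of score normalization followed by score combination. The sketch below illustrates one common form of that idea, min-max normalization followed by a weighted arithmetic mean, in plain Python. It is not the API of OpenSearch or any other listed product; the function names, the `vector_weight` parameter, the choice to score a missing document as 0.0, and the sample scores are all assumptions.

```python
def min_max_normalize(scores):
    """Rescale a {doc_id: score} map to [0, 1]; a constant map becomes all 1.0."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc: 1.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}


def combine(lexical, vector, vector_weight=0.5):
    """Fuse two result sets with a weighted arithmetic mean of normalized scores.

    Assumption: a document missing from one result set contributes 0.0 there.
    """
    lex_n = min_max_normalize(lexical)
    vec_n = min_max_normalize(vector)
    fused = {
        doc: (1 - vector_weight) * lex_n.get(doc, 0.0)
        + vector_weight * vec_n.get(doc, 0.0)
        for doc in set(lex_n) | set(vec_n)
    }
    # Highest fused score first; ties broken by document id for determinism.
    return sorted(fused.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical raw scores: BM25-style lexical scores and cosine similarities.
lexical_scores = {"doc_a": 12.0, "doc_b": 7.0, "doc_c": 2.0}
vector_scores = {"doc_b": 0.92, "doc_c": 0.90, "doc_d": 0.40}

ranking = combine(lexical_scores, vector_scores, vector_weight=0.5)
for doc, score in ranking:
    print(doc, round(score, 4))
```

Normalizing first matters because lexical and vector scores live on different scales; without it, whichever scorer has the larger raw range would dominate the fused ranking.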
  • Vector DB Research & Surveys

    • Approximate Nearest Neighbour Search on Dynamic Datasets: An Investigation - arXiv 2024 research paper investigating ANN search performance on dynamic datasets with updates. Reviews benchmarks for vector indexing adaptability and efficiency. For academic/research use in dynamic vector DB scenarios; compares to prior static benchmarks. ([Read more](/details/approximate-nearest-neighbour-search-on-dynamic-datasets-an-investigation.md)) `research-paper` `ann-survey`
    • Learning Cluster Representatives for Approximate Nearest Neighbor Search - arXiv 2024 research paper proposing learned cluster representatives for efficient ANN search via vector quantization and clustering. Reviews benchmarks for scalability in similarity search. Academic/research use for advanced indexing techniques; contrasts with prior methods. ([Read more](/details/learning-cluster-representatives-for-approximate-nearest-neighbor-search.md)) `research-paper` `ann-survey`
    • Operational Advice for Dense and Sparse Retrievers: HNSW, Flat, or Inverted Indexes? - arXiv 2024 research paper providing practical guidance on HNSW, flat, and inverted indexes for dense/sparse retrieval in vector systems. Reviews performance benchmarks across retrievers. For research/academic optimization of AI retrieval; compares index choices. ([Read more](/details/operational-advice-for-dense-and-sparse-retrievers-hnsw-flat-or-inverted-indexes.md)) `research-paper` `ann-survey`
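The last entry above contrasts flat and inverted indexes. As a rough illustration of the difference (not taken from the paper), the sketch below scores dense vectors with an exhaustive flat scan and sparse vectors through an inverted index that only touches documents sharing a term with the query. All function names and sample data are hypothetical.

```python
from collections import defaultdict


def flat_search(query, docs, k):
    """Flat index: exhaustively score every dense vector by dot product."""
    scored = [
        (sum(q * d for q, d in zip(query, vec)), doc_id)
        for doc_id, vec in docs.items()
    ]
    scored.sort(key=lambda s: (-s[0], s[1]))
    return scored[:k]


def build_inverted_index(sparse_docs):
    """Map each term to a posting list of (doc_id, weight) pairs."""
    index = defaultdict(list)
    for doc_id, vec in sparse_docs.items():
        for term, weight in vec.items():
            index[term].append((doc_id, weight))
    return index


def inverted_search(query, index, k):
    """Accumulate dot products only over postings of the query's terms."""
    acc = defaultdict(float)
    for term, q_weight in query.items():
        for doc_id, d_weight in index.get(term, []):
            acc[doc_id] += q_weight * d_weight
    ranked = sorted(
        ((score, doc_id) for doc_id, score in acc.items()),
        key=lambda s: (-s[0], s[1]),
    )
    return ranked[:k]


# Hypothetical dense vectors (e.g. embeddings) searched with a flat scan.
dense_docs = {"d1": [1.0, 0.0], "d2": [0.6, 0.8], "d3": [0.0, 1.0]}
dense_top = flat_search([0.0, 1.0], dense_docs, k=2)

# Hypothetical sparse vectors (term -> weight) searched via an inverted index.
sparse_docs = {
    "d1": {"vector": 2.0, "index": 1.0},
    "d2": {"graph": 1.5},
    "d3": {"index": 3.0},
}
sparse_index = build_inverted_index(sparse_docs)
sparse_top = inverted_search({"index": 1.0}, sparse_index, k=2)

print(dense_top)
print(sparse_top)
```

The flat scan's cost grows with the number of stored vectors regardless of the query, while the inverted lookup's cost depends on the length of the posting lists for the query's terms, which is why it suits sparse representations.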
  • Vector Indexing Libraries

    • PISA - PISA is an inverted index library for semantic search, supporting sparse and dense vectors with advanced compression techniques. It offers multi-threaded querying and learned indexes, primarily oriented towards research applications in information retrieval. ([Read more](/details/pisa.md)) `inverted-index` `learned-compression` `research-lib`
  • Wasm/Edge Runtime VDBs

    • micro-hnsw-wasm - WASM library for brain-inspired neuromorphic HNSW vector search in 11.8KB. Optimized for edge devices with spiking neurons for energy-efficient similarity search. ([Read more](/details/micro-hnsw-wasm.md)) `Open Source` `Wasm` `Hnsw` `Neuromorphic`
Categories
Research Papers & Surveys (135), Concepts & Definitions (131), Vector Database Engines (91), Sdks & Libraries (77), Machine Learning Models (66), Curated Resource Lists (56), LLM Tools (50), SDKs & Libraries (42), Open Sources (32), Sdks Libraries (28), Benchmarks & Evaluation (26), Vector Database Extensions (26), LLM Frameworks (24), Managed Vector Databases (21), Data Integration & Migration (20), Core Vector Databases (18), Multi Model & Hybrid Databases (15), Cloud Services (13), Llm Tools (9), Security & Governance (9), Commerce (8), vector-database-engines (5), Relational Databases (5), Embedded Vector Databases (5), Vector Database (4), Data Processing (4), ANN Indexing Libraries (3), Embedded & Edge Vector Databases (3), Cloud & Managed (3), Vector DB Research & Surveys (3), Graph Database (3), Llm Frameworks (3), Integrations & Extensions (2), ⭐ Star History (2), curated-resource-lists (2), Open Source Vector Databases (2), Quantum-Safe Vector DBs (2), 🔥 Acknowledgements (2), Rust-based Vector Databases (2), Graph-Enhanced Vector DBs (2), Multi-Model & Hybrid Databases (2), Multimodal Vector DBs (1), Rust-Based Vector DBs (1), Search & Retrieval (1), Edge Database (1), AI Agent Optimized VDBs (1), Evaluation & Observability (1), Tools (1), serverless-managed-vector-dbs (1), Multimodal Vector Databases (1), Cloud-managed Vector Databases (1), llm-frameworks (1), Vector Indexing Libraries (1), research-papers-surveys (1), GPU-Accelerated Vector DBs (1), Scalable Distributed Vector DBs (1), In-Memory Hybrid Vector Stores (1), 2026 Trends & Startups (1), Managed and Serverless Vector DBs (1), Wasm/Edge Runtime VDBs (1), Managed & Serverless Vector DBs (1), Hybrid Vector Stores (1), Experimental & Learning Vector DBs (1), Developer Tools & Benchmarks (1), Libraries (1)
Sub Categories