{"id":44787,"url":"https://github.com/dangkhoasdc/awesome-vector-database","name":"awesome-vector-database","description":"A curated list of awesome works related to high dimensional structure/vector search \u0026 database","projects_count":297,"last_synced_at":"2026-06-08T03:00:19.420Z","repository":{"id":162489702,"uuid":"616800196","full_name":"dangkhoasdc/awesome-vector-database","owner":"dangkhoasdc","description":"A curated list of awesome works related to high dimensional structure/vector search \u0026 database","archived":false,"fork":false,"pushed_at":"2026-05-06T01:59:37.000Z","size":541,"stargazers_count":347,"open_issues_count":3,"forks_count":22,"subscribers_count":9,"default_branch":"main","last_synced_at":"2026-05-06T03:30:47.612Z","etag":null,"topics":["approximate-nearest-neighbor-search","embedding-similarity","embeddings-similarity","nearest-neighbor-search","search-engine","similarity-search","vector-database","vector-search","vector-search-engine"],"latest_commit_sha":null,"homepage":"","language":null,"has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"cc0-1.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/dangkhoasdc.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2023-03-21T05:22:35.000Z","updated_at":"2026-05-06T01:59:41.000Z","dependencies_parsed_at":"2025-12-10T06:05:37.030Z","dependency_job_id":null,"html_url":"https://github.com/dangkhoasdc/awesome-vector-database","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":nu
ll,"purl":"pkg:github/dangkhoasdc/awesome-vector-database","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dangkhoasdc%2Fawesome-vector-database","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dangkhoasdc%2Fawesome-vector-database/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dangkhoasdc%2Fawesome-vector-database/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dangkhoasdc%2Fawesome-vector-database/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/dangkhoasdc","download_url":"https://codeload.github.com/dangkhoasdc/awesome-vector-database/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dangkhoasdc%2Fawesome-vector-database/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":34046003,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T15:22:16.424Z","status":"online","status_checked_at":"2026-06-08T02:00:07.615Z","response_time":111,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"created_at":"2024-01-13T21:18:59.432Z","updated_at":"2026-06-08T03:00:19.421Z","primary_language":null,"list_of_lists":false,"displayable":true,"categories":["Uncategorized","Multidimensional data / Vectors","📰 Articles \u0026 Talks","Others","Quantization","Survey","Graph-based Methods","🎄Tree-based 
Methods","Hashing",":chart_with_upwards_trend: Evaluation \u0026 Metrics","Comparisons","Tree-based Methods","Systems","Other Approaches","Courses","Texts","Related Lists"],"sub_categories":["Uncategorized"],"readme":"# :mag: Awesome Vector Database [![Awesome](https://cdn.jsdelivr.net/gh/sindresorhus/awesome@d7305f38d29fed78fa85652e3a63e154dd8e8829/media/badge.svg)](https://github.com/sindresorhus/awesome)\nA curated list of awesome works related to high dimensional structure/vector search \u0026 database\n\n# Services\n- [Google Vector Search (Vertex AI)](https://cloud.google.com/vertex-ai/docs/vector-search/overview)\n- [Pinecone](https://www.pinecone.io/)\n- [Weaviate](https://github.com/weaviate/weaviate) [[Beginner Guide](https://towardsdatascience.com/getting-started-with-weaviate-a-beginners-guide-to-search-with-vector-databases-14bbb9285839)]\n- [Vespa](https://vespa.ai/)\n- [txtai](https://github.com/neuml/txtai)\n- [marqo](https://github.com/marqo-ai/marqo)\n- [vectara](https://vectara.com)\n- [Epsilla](https://epsilla.com/)\n- [algolia](https://www.algolia.com/)\n- [Meilisearch](https://www.meilisearch.com/solutions/vector-search)\n- [nucliadb](https://nuclia.com/vector-database/)\n- [OpenSearch](https://opensearch.org/)\n- [MyScale](https://myscale.com)\n- [QdrantCloud](https://cloud.qdrant.io/)\n- [zilliz](https://cloud.zilliz.com/signup)\n- [OpenSearch's AlibabaCloud](https://www.alibabacloud.com/product/opensearch)\n- [Typesense's Cloud](https://cloud.typesense.org)\n- [MongoDB Atlas Vector Search](https://www.mongodb.com/products/platform/atlas-vector-search)\n- [SuperDuperDB](https://github.com/SuperDuperDB/superduperdb)\n- [KDB.AI](https://kdb.ai/)\n- [Denser Retriever](https://denser.ai)\n- [Rivestack](https://rivestack.io) - Managed PostgreSQL with pgvector for AI workloads. 
HNSW indexing, built-in SQL editor with embedding search, free tier available.\n\n## Comparisons\n- [From Vespa](https://cloud.vespa.ai/feature-comparison.html)\n- [Vector DB Comparison by VectorHub](https://vdbs.superlinked.com/)\n- [MyScale Vector Database Benchmark 🚀](https://github.com/myscale/vector-db-benchmark)\n  \n# Libraries \u0026 Engines\n## Multidimensional data / Vectors\n\n- :star: 🥇 [Vector DB Feature Matrix](https://docs.google.com/spreadsheets/d/170HErOyOkLDjQfy3TJ6a3XXXM1rHvw_779Sit-KT7uc/edit#gid=0)\n- :star: [Faiss](https://faiss.ai/) [Paper](https://arxiv.org/pdf/2401.08281.pdf)\n- [ArcadeDB](https://arcadedb.com/) - open-source multi-model database with native vector embedding support alongside graph, document, key-value and time series models\n- [Typesense](https://typesense.org/)\n- [Qdrant](https://qdrant.tech/)\n  - [Video tutorial](https://youtu.be/LRcZ9pbGnno), [Notebook](https://github.com/qdrant/examples/blob/master/qdrant_101_getting_started/getting_started.ipynb)\n- [annoy](https://github.com/spotify/annoy)\n- [NGT](https://github.com/yahoojapan/NGT)\n- [pgvector](https://github.com/pgvector/pgvector)\n- [Chroma](https://github.com/chroma-core/chroma) - AI memory with semantic, full-text, \u0026 regex search\n- [LlamaIndex](https://github.com/jerryjliu/llama_index)\n- [Epsilla](https://github.com/epsilla-cloud/vectordb/tree/main)\n- [jvector](https://github.com/jbellis/jvector)\n- [RAFT](https://github.com/rapidsai/raft)\n- [Vald](https://vald.vdaas.org/)\n- [Voyager](https://github.com/spotify/voyager)\n- [tinyvector](https://github.com/0hq/tinyvector)\n- [USearch](https://github.com/unum-cloud/usearch)\n- [vearch](https://vearch.github.io/)\n- [MRPT](https://github.com/vioshyvo/mrpt)\n- [milvus](https://milvus.io/)\n- [infinity](https://github.com/infiniflow/infinity)\n- [havenask](https://github.com/alibaba/havenask)\n- [chromem-go](https://github.com/philippgille/chromem-go)\n- [OasysDB](https://github.com/oasysai/oasysdb) 
[[Notebook](https://colab.research.google.com/drive/15_1hH7jGKzMeQ6IfnScjsc-iJRL5XyL7?usp=sharing)]\n- [Meilisearch](https://github.com/meilisearch) - Search engine API for Semantic (vectors), full-text \u0026 hybrid search\n- [arroy](https://github.com/meilisearch/arroy) - Approximate Nearest Neighbors Rust library\n- [bleve](https://github.com/blevesearch/bleve)\n- [cuVS](https://github.com/rapidsai/cuvs)\n- [vsag](https://github.com/alipay/vsag)\n- [sqlite-vec](https://github.com/asg017/sqlite-vec)\n- [MyScaleDB](https://github.com/myscale/MyScaleDB)\n- [hora](https://github.com/hora-search/hora)\n- [KGraph](https://github.com/aaalgo/kgraph)\n- [NearestNeighbors.jl](https://github.com/KristofferC/NearestNeighbors.jl)\n- [MuopDB](https://github.com/hicder/muopdb)\n- [puck](https://github.com/baidu/puck)\n- [Denser Retriever](https://github.com/denser-org/denser-retriever)\n- [seekdb](https://github.com/oceanbase/seekdb)\n- [brinicle](https://github.com/bicardinal/brinicle) - Resource-efficient C++ vector index engine built for low-RAM production workloads\n- [VelesDB](https://github.com/cyberlife-coder/VelesDB) - Embedded vector + graph + columnar database. Rust core (~6MB), HNSW with 5 distance metrics, VelesQL (SQL + NEAR + MATCH). 
Python and Rust SDKs.\n\n## Texts\n\n- [PISA](https://github.com/pisa-engine/pisa)\n- [Tantivy](https://github.com/quickwit-oss/tantivy)\n- [sonic](https://github.com/valeriansaliou/sonic)\n\n## Others\n- [SimSIMD](https://github.com/ashvardanian/SimSIMD): Efficient Alternative to `scipy.spatial.distance` and `numpy.inner`\n- [vector-io](https://github.com/AI-Northstar-Tech/vector-io): Comprehensive Vector Data Tooling.\n- [VectorDBZ](https://github.com/vectordbz/vectordbz) — GUI desktop app for exploring and debugging vector databases\n\n# Benchmarks \u0026 Databases\n- [ANN Benchmarks](http://ann-benchmarks.com/) [[Paper](https://arxiv.org/pdf/1807.05614.pdf)].\n- [Billion-scale ANNS Benchmarks](https://big-ann-benchmarks.com)\n    - [2021 Result](https://proceedings.mlr.press/v176/simhadri22a/simhadri22a.pdf)\n    - Simhadri, Harsha Vardhan, et al. \"[Results of the Big ANN: NeurIPS'23 competition.](https://arxiv.org/pdf/2409.17424)\" arXiv preprint arXiv:2409.17424 (2024).\n\n- [BEIR](https://github.com/beir-cellar/beir)\n- [VectorDBBench - A Vector Database Benchmark Tool](https://zilliz.com/vector-database-benchmark-tool) , [[Github](https://github.com/zilliztech/VectorDBBench)]\n- [Qdrant's Vector Database Benchmarks](https://qdrant.tech/benchmarks/)\n- [MyScale's Vector Database Benchmark](https://myscale.github.io/benchmark/#/benchmark)\n- Li, Wen, et al. \"[Approximate nearest neighbor search on high dimensional data—experiments, analyses, and improvement](https://arxiv.org/pdf/1610.02455.pdf).\" IEEE Transactions on Knowledge and Data Engineering 32.8 (2019): 1475-1488.\n- Zeng, Xianzhi, et al. 
\"[CANDY: A Benchmark for Continuous Approximate Nearest Neighbor Search with Dynamic Data Ingestion.](https://arxiv.org/pdf/2406.19651)\" arXiv preprint arXiv:2406.19651 (2024).\n- [IntelLabs's Vector Search Datasets](https://github.com/IntelLabs/VectorSearchDatasets)\n\n# 📚 Books \n- [Foundations of Multidimensional and Metric Data Structures](https://www.amazon.com/Foundations-Multidimensional-Structures-Kaufmann-Computer/dp/0123694469/)\n- [Introduction to Information Retrieval](https://nlp.stanford.edu/IR-book/information-retrieval-book.html)\n- [Deep Learning for Search](https://www.manning.com/books/deep-learning-for-search)\n- [Foundations of Vector Retrieval](https://arxiv.org/abs/2401.09350)\n\n# Conferences \u0026 Workshops\n- :star: [VLDB](https://vldb.org)\n  - Tutorial:\n    - New Trends in High-D Vector Similarity Search [[slides](https://vldb.org/2021/files/slides/tutorial/tutorial5.pdf), [video](https://www.youtube.com/watch?v=TFsrFwF0bC4\u0026ab_channel=VLDB2021), [paper](https://echihabi.com/publications/tutorials/vldb2021-tutorial-summary.pdf)]\n- :star: [Image Retrieval in the Wild (CVPR20)](https://matsui528.github.io/cvpr2020_tutorial_retrieval/) [[Video](https://www.youtube.com/watch?v=SKrHs03i08Q)]\n- [Haystack](https://haystackconf.com)\n- [Neural Search In Action](https://matsui528.github.io/cvpr2023_tutorial_neural_search/)\n- ACM MM 2020: [Effective and Efficient: Toward Open-world Instance Re-identification](https://wangzwhu.github.io/home/acmmm2020_tutorial_reid.html)\n  - Billion-scale Approximate Nearest Neighbor Search: [[Slides](https://wangzwhu.github.io/home/file/acmmm-t-part3-ann.pdf), [Video](https://www.youtube.com/watch?v=iI8e3kU11eU)]\n  - Is instance search a solved problem? 
[[Slides](https://wangzwhu.github.io/home/file/acmmm-t-part4-ins.pdf), [Video](https://www.youtube.com/watch?v=cH256Zqt5Ms)]\n- Retrieval Augmented Generation and Vespa [[Slides](https://docs.google.com/presentation/d/1LRAQfdT4UH69pgojNi_EMspSgsHn9YJVac_bbnhy038/edit#slide=id.p1)]\n- [SISAP Indexing Challenge](https://sisap-challenges.github.io/#sisap_indexing_challenge)\n  - [2023 Competition](https://sisap-challenges.github.io/2023/)\n  - [2024 Competition](https://sisap-challenges.github.io/2024/)\n \n\n## Courses\n- Long Term Memory in AI - Vector Search and Databases (COS 495 - Princeton) [[Class Notes](https://github.com/edoliberty/vector-search-class-notes)]\n- Freiburg Information Retrieval WS 2022-2023 [[Website](https://ad-wiki.informatik.uni-freiburg.de/teaching/InformationRetrievalWS2223), [Video Lectures](https://www.youtube.com/playlist?list=PL682UO4IMem9rQdlqJZZGQYD_CTRpTJZU)]\n- Vector Similarity Search and Faiss Course [[Youtube Playlist](https://www.youtube.com/playlist?list=PLIUOU7oqGTLhlWpTz4NnuT3FekouIVlqc)]\n\n## Others\n- [VectorHub](https://github.com/superlinked/VectorHub): a free, open-source learning website for people (software developers to senior ML architects) interested in adding vector retrieval to their ML stack.\n- [Vector Database Group @ NTU](https://vectordb-ntu.github.io): Vector Database Research Group at Nanyang Technological University [[Github](https://github.com/VectorDB-NTU)]\n\n# Publications\n## Survey\n- :star: Pan, James Jie, Jianguo Wang, and Guoliang Li. \"Survey of Vector Database Management Systems.\" arXiv preprint arXiv:2310.14021 (2023). [[Paper](https://arxiv.org/abs/2310.14021)]\n- Aumüller, Martin, and Matteo Ceccarello. \"[Recent Approaches and Trends in Approximate Nearest Neighbor Search](http://sites.computer.org/debull/A23sept/p89.pdf).\" {IEEE} Data Engineering Bulletin (2023).\n- Nearest neighbor search: the old, the new, and the impossible. Andoni, Alexandr. 
[[Paper](https://dspace.mit.edu/bitstream/handle/1721.1/55090/587638612-MIT.pdf?sequence=2)]\n- Ganbarov, Ali, et al. \"[Experimental comparison of graph-based approximate nearest neighbor search algorithms on edge devices.](https://arxiv.org/pdf/2411.14006)\" arXiv preprint arXiv:2411.14006 (2024).\n- Li, Zhaoheng, et al. \"[Cloud-Native Vector Search: A Comprehensive Performance Analysis.](https://arxiv.org/pdf/2511.14748)\" arXiv preprint arXiv:2511.14748 (2025).\n- Song, Yitong, et al. \"[Disk-Resident Vector Similarity Search: A Survey.](https://hkbudb.github.io/disk-vss-survey/Disk-Resident%20Vector%20Similarity%20Search-%20A%20Survey.pdf)\"\n- Xie, Jiadong, Yingfan Liu, and Jeffrey Xu Yu. \"[A Survey on Query Processing in Vector Databases.](https://xiejiadong.github.io/files/paper/vector_survey.pdf)\" (2026).\n\n\n## Quantization\n![](https://raw.githubusercontent.com/wiki/facebookresearch/faiss/PQ_variants_Faiss_annotated.png)\nSource: A survey of product quantization.\n\n- :star: PQ: Product quantization for nearest neighbor search. Jégou, Hervé, Matthijs Douze, and Cordelia Schmid. [[Paper](https://lear.inrialpes.fr/pubs/2011/JDS11/jegou_searching_with_quantization.pdf), [Code](https://github.com/facebookresearch/faiss), [Julia Code](https://github.com/una-dinosauria/Rayuela.jl), [nanopq](https://github.com/matsui528/nanopq)]\n- :star: k-selection on GPU: Billion-scale similarity search with GPUs. Johnson, Jeff, Matthijs Douze, and Hervé Jégou [[Paper](https://arxiv.org/pdf/1702.08734.pdf), [Code](https://github.com/facebookresearch/faiss)]\n- :star: A survey of product quantization. Matsui, Yusuke, Yusuke Uchida, Hervé Jégou, and Shin'ichi Satoh [[Paper](https://www.jstage.jst.go.jp/article/mta/6/1/6_2/_pdf)]\n- OPQ: Optimized Product Quantization. 
Ge, Tiezheng, Kaiming He, Qifa Ke, and Jian Sun [[Homepage](https://kaiminghe.github.io/cvpr13/index.html), [Paper](https://www.microsoft.com/en-us/research/wp-content/uploads/2013/11/pami13opq.pdf), [Code](https://kaiminghe.github.io/cvpr13/matlab_OPQ_release_v1.1.rar), [nanopq](https://github.com/matsui528/nanopq)]\n- Quicker adc: Unlocking the hidden potential of product quantization with simd. André, Fabien, Anne-Marie Kermarrec, and Nicolas Le Scouarnec [[Paper](https://arxiv.org/pdf/1812.09162), [Code](https://github.com/technicolor-research/faiss-quickeradc)]\n  - Accelerated nearest neighbor search with quick adc. André, Fabien, Anne-Marie Kermarrec, and Nicolas Le Scouarnec [[Paper](https://arxiv.org/pdf/1704.07355.pdf)].\n  - Cache locality is not enough: High-performance nearest neighbor search with product quantization fast scan. Fabien André, Anne-Marie Kermarrec, Nicolas Le Scouarnec [[Paper](https://hal.inria.fr/hal-01239055/document)]\n- Wu, Xiang, et al. \"[Multiscale quantization for fast similarity search.](https://proceedings.neurips.cc/paper_files/paper/2017/file/b6617980ce90f637e68c3ebe8b9be745-Paper.pdf)\" Advances in neural information processing systems 30 (2017).\n- ScaNN: Accelerating Large-Scale Inference with Anisotropic Vector Quantization. Guo, Ruiqi, Philip Sun, Erik Lindgren, Quan Geng, David Simcha, Felix Chern, and Sanjiv Kumar [[Paper](http://proceedings.mlr.press/v119/guo20h/guo20h.pdf), [Python/C++ Inference](https://github.com/google-research/google-research/tree/master/scann), [Julia Training/Inference](https://github.com/AxelvL/AHPQ.jl)]\n- The inverted multi-index. Babenko, Artem, and Victor Lempitsky [[Paper](https://cmp.felk.cvut.cz/~toliageo/rg/papers/BabenkoLempitsky_PAMI2014_The%20Inverted%20Multi-Index.pdf), [Code](https://github.com/jatin7gupta/Product-Quantization)]\n- Are We There Yet? Product Quantization and its Hardware Acceleration. Fernandez-Marques, Javier, Ahmed F. AbouElhamayed, Nicholas D. 
Lane, and Mohamed S. Abdelfattah. [[Paper](https://arxiv.org/pdf/2305.18334.pdf)]\n- LibVQ: A Toolkit for Optimizing Vector Quantization and Efficient Neural Retrieval. Li, Chaofan, Zheng Liu, Shitao Xiao, Yingxia Shao, Defu Lian, and Zhao Cao. [[Paper](https://dl.acm.org/doi/10.1145/3539618.3591799), [Code](https://github.com/staoxiao/LibVQ/tree/demo)]\n- Matsui, Yusuke, Ryota Hinami, and Shin'ichi Satoh. \"Reconfigurable Inverted Index.\" Proceedings of the 26th ACM international conference on Multimedia. 2018. [[Paper](https://dl.acm.org/ft_gateway.cfm?id=3240630), [Project](https://yusukematsui.me/project/rii/), [Code](https://github.com/matsui528/rii)]\n- Aguerrebere, Cecilia, et al. \"[Similarity search in the blink of an eye with compressed indices.](https://arxiv.org/pdf/2304.04759.pdf)\" arXiv preprint arXiv:2304.04759 (2023).\n- Huijben, Iris, et al. \"[Residual Quantization with Implicit Neural Codebooks](https://arxiv.org/pdf/2401.14732.pdf).\" arXiv preprint arXiv:2401.14732 (2024). [[Code](https://github.com/facebookresearch/Qinco)]\n- Rege, Aniket, et al. \"[Adanns: A framework for adaptive semantic search](https://proceedings.neurips.cc/paper_files/paper/2023/file/f062da1973ac9ac61fc6d44dd7fa309f-Paper-Conference.pdf).\" Advances in Neural Information Processing Systems 36 (2024).\n- Amara, Kenza, et al. \"[Nearest neighbor search with compact codes: A decoder perspective](https://arxiv.org/pdf/2112.09568).\" Proceedings of the 2022 International Conference on Multimedia Retrieval. 2022.\n- Krishnan, Aditya, and Edo Liberty. \"[Projective Clustering Product Quantization](https://arxiv.org/pdf/2112.02179.pdf).\" arXiv preprint arXiv:2112.02179 (2021).\n- Noh, Haechan, Taeho Kim, and Jae-Pil Heo. 
\"[Product quantizer aware inverted index for scalable nearest neighbor search](https://openaccess.thecvf.com/content/ICCV2021/papers/Noh_Product_Quantizer_Aware_Inverted_Index_for_Scalable_Nearest_Neighbor_Search_ICCV_2021_paper.pdf).\" Proceedings of the IEEE/CVF International Conference on Computer Vision. 2021.\n- Zhan, Jingtao, et al. \"[Jointly optimizing query encoder and product quantization to improve retrieval performance](https://arxiv.org/pdf/2108.00644).\" Proceedings of the 30th ACM International Conference on Information \u0026 Knowledge Management. 2021.\n- Wang, Runhui, and Dong Deng. \"[DeltaPQ: lossless product quantization code compression for high dimensional similarity search](http://vldb.org/pvldb/vol13/p3603-wang.pdf).\" Proceedings of the VLDB Endowment 13.13 (2020): 3603-3616.\n- Jang, Young Kyun, and Nam Ik Cho. \"[Generalized product quantization network for semi-supervised image retrieval](https://openaccess.thecvf.com/content_CVPR_2020/papers/Jang_Generalized_Product_Quantization_Network_for_Semi-Supervised_Image_Retrieval_CVPR_2020_paper.pdf).\" Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2020.\n- Chen, Ting, Lala Li, and Yizhou Sun. \"[Differentiable product quantization for end-to-end embedding compression](http://proceedings.mlr.press/v119/chen20l/chen20l.pdf).\" International Conference on Machine Learning. PMLR, 2020.\n- Huang, Rong, et al. \"[Learning Discrete Document Representations in Web Search](https://dl.acm.org/doi/pdf/10.1145/3580305.3599854).\" Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining. 2023.\n- Nardini, Franco Maria, Cosimo Rulli, and Rossano Venturini. \"[Efficient Multi-vector Dense Retrieval with Bit Vectors](https://arxiv.org/pdf/2404.02805.pdf).\" European Conference on Information Retrieval. Cham: Springer Nature Switzerland, 2024. [[Code](https://github.com/CosimoRulli/emvb)]\n- Gao, Jianyang, and Cheng Long. 
\"[RaBitQ: Quantizing High-Dimensional Vectors with a Theoretical Error Bound for Approximate Nearest Neighbor Search.](https://arxiv.org/pdf/2405.12497)\" arXiv preprint arXiv:2405.12497 (2024). [[Code](https://github.com/gaoj0017/RaBitQ)]\n- Gao, Jianyang, et al. \"[Practical and Asymptotically Optimal Quantization of High-Dimensional Vectors in Euclidean Space for Approximate Nearest Neighbor Search.](https://arxiv.org/pdf/2409.09913)\" arXiv preprint arXiv:2409.09913 (2024).\n- Mohoney, Jason, et al. \"[Incremental IVF Index Maintenance for Streaming Vector Search.](https://arxiv.org/pdf/2411.00970)\" arXiv preprint arXiv:2411.00970 (2024).\n- Yang, Mingyu, Wentao Li, and Wei Wang. \"[Fast High-dimensional Approximate Nearest Neighbor Search with Efficient Index Time and Space.](https://arxiv.org/pdf/2411.06158)\" arXiv preprint arXiv:2411.06158 (2024).\n- Liu, Qiyu, et al. \"[Learned Data Compression: Challenges and Opportunities for the Future.](https://arxiv.org/pdf/2412.10770)\" arXiv preprint arXiv:2412.10770 (2024).\n- Vallaeys, Théophane, et al. \"[Qinco2: Vector Compression and Search with Improved Implicit Neural Codebooks.](https://arxiv.org/pdf/2501.03078)\" arXiv preprint arXiv:2501.03078 (2025).\n- Mageirakos, Vasilis, Bowen Wu, and Gustavo Alonso. \"[Cracking Vector Search Indexes](https://arxiv.org/pdf/2503.01823).\" arXiv preprint arXiv:2503.01823 (2025).\n- ZHANG, FANGYUAN, et al. \"[Efficient Dynamic Indexing for Range Filtered Approximate Nearest Neighbor Search](https://www1.se.cuhk.edu.hk/~swang/V3mod-152-RangePQ.pdf).\" (2025).\n- Huan, Chengying, et al. \"[OrchANN: A Unified I/O Orchestration Framework for Skewed Out-of-Core Vector Search.](https://arxiv.org/pdf/2512.22838)\" arXiv preprint arXiv:2512.22838 (2025).\n- Jin, Yicheng, et al. \"[Curator: Efficient Vector Search with Low-Selectivity Filters.](https://arxiv.org/pdf/2601.01291)\" arXiv preprint arXiv:2601.01291 (2026).\n- Wang, Yang, et al. 
\"[Pyramid Product Quantization for Approximate Nearest Neighbor Search.](https://www.mdpi.com/2076-3417/16/2/853)\" Applied Sciences 16.2 (2026): 853.\n- Yang, Mingyu, et al. \"[Quantization Meets Projection: A Happy Marriage for Approximate k-Nearest Neighbor Search.](https://www.researchgate.net/profile/Mingyu-Yang-26/publication/400339548_Quantization_Meets_Projection_A_Happy_Marriage_for_Approximate_k-Nearest_Neighbor_Search/links/69800d5542f94d1212a5bdd5/Quantization-Meets-Projection-A-Happy-Marriage-for-Approximate-k-Nearest-Neighbor-Search.pdf)\"\n- Yin, Ziqi, et al. \"[BBC: Improving Large-k Approximate Nearest Neighbor Search with a Bucket-based Result Collector.](https://arxiv.org/pdf/2604.01960)\" arXiv preprint arXiv:2604.01960 (2026).\n- Sun, Yiping, Yang Shi, and Jiaolong Du. \"[A real-time adaptive multi-stream gpu system for online approximate nearest neighborhood search.](https://arxiv.org/pdf/2408.02937)\" Proceedings of the 33rd ACM International Conference on Information and Knowledge Management. 2024.\n- Bahn, DongHa. \"[EIVF: Efficient IVFPQ Search for On-Device ARM Processors.](https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=\u0026arnumber=11463912)\" ICASSP 2026-2026 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2026.\n- Abraham, Ashley N., et al. \"[Large-Scale Data Parallelization of Product Quantization and Inverted Indexing Using Dask.](https://arxiv.org/pdf/2604.21645)\" arXiv preprint arXiv:2604.21645 (2026).\n- Ahmed, Ibrar. \"[TTVI: A Two-Tier Vector Index for Low-WAL Approximate Nearest Neighbor Search in Databases.](https://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=11501658)\" IEEE Access (2026).\n- Ma, Y. T., et al. \"[CS-PQ: Cache-Friendly SIMD Product Quantization for Large-Scale ANNS Index Construction.](https://arxiv.org/pdf/2605.25521)\" arXiv preprint arXiv:2605.25521 (2026).\n\n## Graph-based Methods\n\n- :star: Wang, Zeyu, et al. 
\"[Graph-and Tree-based Indexes for High-dimensional Vector Similarity Search: Analyses, Comparisons, and Future Directions](https://helios2.mi.parisdescartes.fr/~themisp/publications/bulletin23.pdf).\" Data Engineering (2023): 3-21.\n- :star: A comprehensive survey and experimental comparison of graph-based approximate nearest neighbor search. Wang, Mengzhao, Xiaoliang Xu, Qiang Yue, and Yuxiang Wang. [[Paper](https://arxiv.org/pdf/2101.12631.pdf), [Code](https://github.com/Lsyhprum/WEAVESS)]\n- Lin, Peng-Cheng, and Wan-Lei Zhao. \"[Graph based nearest neighbor search: Promises and failures](https://arxiv.org/pdf/1904.02077).\" arXiv preprint arXiv:1904.02077 (2019).\n- :star: HNSW: Efficient and robust approximate nearest neighbor search using hierarchical navigable small world graphs. Malkov, Yu A., and Dmitry A. Yashunin. [[Paper](https://arxiv.org/pdf/1603.09320.pdf), [Code](https://github.com/nmslib/hnswlib), [Rust Version](https://github.com/rust-cv/hnsw), [Go Version](https://github.com/coder/hnsw)]\n- Scaling Graph-Based ANNS Algorithms to Billion-Size Datasets: A Comparative Analysis. Dobson, Magdalen, Zheqi Shen, Guy E. Blelloch, Laxman Dhulipala, Yan Gu, Harsha Vardhan Simhadri, and Yihan Sun. [[Paper](https://arxiv.org/pdf/2305.04359.pdf)]\n- FINGER: Fast Inference for Graph-based Approximate Nearest Neighbor Search. Chen, Patrick, Wei-Cheng Chang, Jyun-Yu Jiang, Hsiang-Fu Yu, Inderjit Dhillon, and Cho-Jui Hsieh [[Paper](https://dl.acm.org/doi/pdf/10.1145/3543507.3583318), [Video](https://www.youtube.com/watch?v=OsxZG2XfcZA)]\n- NSG : Navigating Spread-out Graph For Approximate Nearest Neighbor Search. Fu, Cong, Chao Xiang, Changxu Wang, and Deng Cai. [[Paper](https://www.vldb.org/pvldb/vol12/p461-fu.pdf), [Code](https://github.com/ZJULearning/nsg)]\n- EFANNA : Extremely Fast Approximate Nearest Neighbor Search Algorithm Based on kNN Graph. Cong Fu, Deng Cai. 
[[Paper](https://arxiv.org/abs/1609.07228), [Code](https://github.com/ZJULearning/efanna/tree/master)]\n- Khan, Saim, et al. \"[BANG: Billion-Scale Approximate Nearest Neighbor Search using a Single GPU.](https://arxiv.org/pdf/2401.11324.pdf)\" arXiv preprint arXiv:2401.11324 (2024).\n- Ootomo, Hiroyuki, et al. \"[Cagra: Highly parallel graph construction and approximate nearest neighbor search for gpus.](https://arxiv.org/pdf/2308.15136.pdf)\" arXiv preprint arXiv:2308.15136 (2023).\n- Oguri, Yutaro, and Yusuke Matsui. \"[Theoretical and Empirical Analysis of Adaptive Entry Point Selection for Graph-based Approximate Nearest Neighbor Search.](https://arxiv.org/pdf/2402.04713.pdf)\" arXiv preprint arXiv:2402.04713 (2024).\n- Oguri, Yutaro, and Yusuke Matsui. \"[General and practical tuning method for off-the-shelf graph-based index: Sisap indexing challenge report by team utokyo.](https://arxiv.org/pdf/2309.00472.pdf)\" International Conference on Similarity Search and Applications. Cham: Springer Nature Switzerland, 2023.\n- Wang, Mengzhao, et al. \"[Starling: An I/O-Efficient Disk-Resident Graph Index Framework for High-Dimensional Vector Similarity Search on Data Segment](https://arxiv.org/pdf/2401.02116.pdf).\" arXiv preprint arXiv:2401.02116 (2024). [[Code](https://github.com/zilliztech/starling)]\n- Manohar, Magdalen Dobson, et al. \"[ParlayANN: Scalable and Deterministic Parallel Graph-Based Approximate Nearest Neighbor Search Algorithms](https://dl.acm.org/doi/pdf/10.1145/3627535.3638475).\" Proceedings of the 29th ACM SIGPLAN Annual Symposium on Principles and Practice of Parallel Programming. 2024. [[Code](https://github.com/cmuparlay/ParlayANN)]\n- Wang, Mengzhao, et al. 
\"[An Efficient and Robust Framework for Approximate Nearest Neighbor Search with Attribute Constraint](https://proceedings.neurips.cc/paper_files/paper/2023/file/32e41d6b0a51a63a9a90697da19d235d-Paper-Conference.pdf).\" Advances in Neural Information Processing Systems 36 (2024).\n- Yu, Shangdi, et al. \"[Pecann: Parallel efficient clustering with graph-based approximate nearest neighbor search](https://arxiv.org/pdf/2312.03940.pdf).\" arXiv preprint arXiv:2312.03940 (2023).\n- Azizi, Ilias, Karima Echihabi, and Themis Palpanas. \"[ELPIS: Graph-Based Similarity Search for Scalable Data Science](https://www.vldb.org/pvldb/vol16/p1548-azizi.pdf).\" Proceedings of the VLDB Endowment 16.6 (2023): 1548-1559.\n- Indyk, Piotr, and Haike Xu. \"[Worst-case performance of popular approximate nearest neighbor search implementations: Guarantees and limitations](https://proceedings.neurips.cc/paper_files/paper/2023/file/d0ac28b79816b51124fcc804b2496a36-Paper-Conference.pdf).\" Advances in Neural Information Processing Systems 36 (2024).\n- Liu, Jun, et al. \"[Optimizing Graph-based Approximate Nearest Neighbor Search: Stronger and Smarter.](https://nicsefc.ee.tsinghua.edu.cn/nics_file/pdf/dacb55cd-fe0a-4b00-9fa0-8f32e3243930.pdf)\" 2022 23rd IEEE International Conference on Mobile Data Management (MDM). IEEE, 2022.\n- Wang, Hui, Yong Wang, and Wan-Lei Zhao. \"[Graph-based Approximate NN Search: A Revisit](https://arxiv.org/pdf/2204.00824.pdf).\" arXiv preprint arXiv:2204.00824 (2022).\n- Peng, Zhen, et al. \"[Speed-ANN: Low-Latency and High-Accuracy Nearest Neighbor Search via Intra-Query Parallelism](https://arxiv.org/pdf/2201.13007.pdf).\" arXiv preprint arXiv:2201.13007 (2022).\n- Lu, Kejing, et al. \"[HVS: hierarchical graph structure based on voronoi diagrams for solving approximate nearest neighbor search](https://www.vldb.org/pvldb/vol15/p246-lu.pdf).\" Proceedings of the VLDB Endowment 15.2 (2021): 246-258. 
[[Code](https://github.com/chuanxiao1983/HVS)]\n- Yingfan, Liu, Cheng Hong, and Cui Jiangtao. \"[Revisiting $ k $-Nearest Neighbor Graph Construction on High-Dimensional Data: Experiments and Analyses](https://arxiv.org/pdf/2112.02234).\" arXiv preprint arXiv:2112.02234 (2021).\n- Zhu, Dantong, and Minjia Zhang. \"[Understanding and Generalizing Monotonic Proximity Graphs for Approximate Nearest Neighbor Search](https://arxiv.org/pdf/2107.13052).\" arXiv preprint arXiv:2107.13052 (2021).\n- Gottesbüren, Lars, et al. \"[Unleashing Graph Partitioning for Large-Scale Nearest Neighbor Search](https://arxiv.org/pdf/2403.01797v1.pdf).\" arXiv preprint arXiv:2403.01797 (2024).\n- Singh, Aditi, et al. \"[Freshdiskann: A fast and accurate graph-based ann index for streaming similarity search](https://arxiv.org/pdf/2105.09613.pdf).\" arXiv preprint arXiv:2105.09613 (2021).\n- Wang, Hui, Wan-Lei Zhao, and Xiangxiang Zeng. \"[Large-Scale Approximate k-NN Graph Construction on GPU](https://arxiv.org/pdf/2103.15386).\" arXiv preprint arXiv:2103.15386 (2021).\n- Patel, Liana, et al. \"[ACORN: Performant and Predicate-Agnostic Search Over Vector Embeddings and Structured Data](https://arxiv.org/pdf/2403.04871v1.pdf).\" arXiv preprint arXiv:2403.04871 (2024).\n- Zuo, Chaoji, et al. \"[SeRF: Segment Graph for Range-Filtering Approximate Nearest Neighbor Search](https://dl.acm.org/doi/pdf/10.1145/3639324).\" Proceedings of the ACM on Management of Data 2.1 (2024): 1-26.\n- Hezel, Nico, et al. \"[An Exploration Graph with Continuous Refinement for Efficient Multimedia Retrieval](https://dl.acm.org/doi/pdf/10.1145/3652583.3658117).\" Proceedings of the 2024 International Conference on Multimedia Retrieval. 2024.\n- Xiao, Wentao, et al. \"[Enhancing HNSW Index for Real-Time Updates: Addressing Unreachable Points and Performance Degradation](https://arxiv.org/pdf/2407.07871).\" arXiv preprint arXiv:2407.07871 (2024).\n- Yang, Shuo, et al. 
\"[Revisiting the Index Construction of Proximity Graph-Based Approximate Nearest Neighbor Search.](https://arxiv.org/pdf/2410.01231)\" arXiv preprint arXiv:2410.01231 (2024).\n- Gou, Yutong, et al. \"[SymphonyQG: Towards Symphonious Integration of Quantization and Graph for Approximate Nearest Neighbor Search.](https://arxiv.org/pdf/2411.12229)\" arXiv preprint arXiv:2411.12229 (2024). [[Code](https://github.com/gouyt13/SymphonyQG)]\n- Yang, Ming, Yuzheng Cai, and Weiguo Zheng. \"[CSPG: Crossing Sparse Proximity Graphs for Approximate Nearest Neighbor Search.](https://openreview.net/pdf?id=ohvXBIPV7e)\" The Thirty-eighth Annual Conference on Neural Information Processing Systems.\n- Liang, Anqi, et al. \"[UNIFY: Unified Index for Range Filtered Approximate Nearest Neighbors Search.](https://arxiv.org/pdf/2412.02448)\" arXiv preprint arXiv:2412.02448 (2024).\n- Gou, Yutong. \"[Efficient approximate nearest neighbor search on high-dimensional vectors by graph and quantization.](https://dr.ntu.edu.sg/bitstream/10356/181650/4/MENG_Thesis.pdf)\" (2024).\n- Xu, Yuexuan, et al. \"[iRangeGraph: Improvising Range-dedicated Graphs for Range-filtering Nearest Neighbor Search.](https://arxiv.org/pdf/2409.02571)\" Proceedings of the ACM on Management of Data 2.6 (2024): 1-26. [[Code](https://github.com/YuexuanXu7/iRangeGraph)]\n- Douze, Matthijs, Alexandre Sablayrolles, and Hervé Jégou. \"[Link and code: Fast indexing with graphs and compact regression codes.](https://openaccess.thecvf.com/content_cvpr_2018/papers/Douze_Link_and_Code_CVPR_2018_paper.pdf)\" Proceedings of the IEEE conference on computer vision and pattern recognition. 2018.\n- Yin, Ziqi, et al. \"[DEG: Efficient Hybrid Vector Search Using the Dynamic Edge Navigation Graph](https://dl.acm.org/doi/pdf/10.1145/3709679).\" Proceedings of the ACM on Management of Data 3.1 (2025): 1-28.\n- Shi, Yang, et al. 
\"[Scalable Overload-Aware Graph-Based Index Construction for 10-Billion-Scale Vector Similarity Search](https://arxiv.org/pdf/2502.20695).\" arXiv preprint arXiv:2502.20695 (2025).\n- Gui, Yuntao, et al. \"[PilotANN: Memory-Bounded GPU Acceleration for Vector Search.](https://arxiv.org/pdf/2503.21206)\" arXiv preprint arXiv:2503.21206 (2025) [[Code](https://github.com/ytgui/PilotANN)]\n- Chung, Jun Woo, Huawei Lin, and Weijie Zhao. \"[Locality-Sensitive Indexing for Graph-Based Approximate Nearest Neighbor Search](https://dl.acm.org/doi/pdf/10.1145/3726302.3730028).\" Proceedings of the 48th International ACM SIGIR Conference on Research and Development in Information Retrieval. 2025.\n- Xiao, Yang, et al. \"[Breaking the Storage-Compute Bottleneck in Billion-Scale ANNS: A GPU-Driven Asynchronous I/O Framework.](https://arxiv.org/pdf/2507.10070)\" arXiv preprint arXiv:2507.10070 (2025).\n- Yang, Ming, Yuzheng Cai, and Weiguo Zheng. \"[Hi-PNG: Efficient Interval-Filtering ANNS via Hierarchical Interval Partition Navigating Graph](https://dl.acm.org/doi/pdf/10.1145/3711896.3736997).\" Proceedings of the 31st ACM SIGKDD Conference on Knowledge Discovery and Data Mining V. 2. 2025.\n- Bosch, Yannick, and Sabine Storandt. \"[Instance-based Approximation Guarantees for Graph-based Nearest Neighbor Search.](https://ojs.aaai.org/index.php/ICAPS/article/download/36113/38267)\" Proceedings of the International Conference on Automated Planning and Scheduling. Vol. 35. No. 1. 2025.\n- Zhong, Xiaoyao, et al. \"[VSAG: An Optimized Search Framework for Graph-based Approximate Nearest Neighbor Search.](https://www.vldb.org/pvldb/vol18/p5017-cheng.pdf)\" arXiv preprint arXiv:2503.17911 (2025).\n- Zhao, Dongfang. \"[MCGI: Manifold-Consistent Graph Indexing for Billion-Scale Disk-Resident Vector Search.](https://arxiv.org/pdf/2601.01930)\" arXiv preprint arXiv:2601.01930 (2026).\n- Xu, Yuming, et al. 
\"[Scalable Distributed Vector Search via Accuracy Preserving Index Construction.](https://arxiv.org/pdf/2512.17264)\" arXiv preprint arXiv:2512.17264 (2025).\n- Xu, Haike, et al. \"[JAG: Joint Attribute Graphs for Filtered Nearest Neighbor Search.](https://arxiv.org/pdf/2602.10258)\" arXiv preprint arXiv:2602.10258 (2026).\n- Guo, Hao, and Youyou Lu. \"[Odinann: Direct insert for consistently stable performance in billion-scale graph-based vector search.](https://www.usenix.org/system/files/conf%C3%A9rence/fast26/fast26spring-prepub_guo.pdf)\" 24th USENIX Conference on File and Storage Technologies (FAST 26), Santa Clara, CA. 2026.\n- Chen, Yue, et al. \"[RED-ANNS: An RDMA-Enabled Distributed Framework for Graph-Based Approximate Nearest Neighbor Search.](https://kay21s.github.io/RED-ANNS-VLDB2026.pdf)\"\n- Rubel, Tobias, et al. \"[PiPNN: Ultra-Scalable Graph-Based Nearest Neighbor Indexing](https://arxiv.org/pdf/2602.21247).\" arXiv preprint arXiv:2602.21247 (2026).\n- Fang, Fei, Yi Liu, and Chen Qian. \"[d-HNSW: A High-performance Vector Search Engine on Disaggregated Memory.](https://arxiv.org/pdf/2603.13591)\" arXiv preprint arXiv:2603.13591 (2026).\n- Wu, Zekai, et al. \"[FGIM: a Fast Graph-based Indexes Merging Framework for Approximate Nearest Neighbor Search.](https://arxiv.org/pdf/2603.21710)\" arXiv preprint arXiv:2603.21710 (2026).\n- Xiao, Yang, et al. \"[FlashANNS: GPU-Driven Asynchronous I/O Pipelining for Eliminating Storage-Compute Bottlenecks in Billion-Scale Similarity Search.](https://dl.acm.org/doi/pdf/10.1145/3786652)\" Proceedings of the ACM on Management of Data 4.1 (SIGMOD) (2026): 1-27.\n\n## 🎄Tree-based Methods\n- Jayaram Subramanya, Suhas, et al. \"[Diskann: Fast accurate billion-point nearest neighbor search on a single node.](https://proceedings.neurips.cc/paper_files/paper/2019/file/09853c7fb1d3f8ee67a61b6bf4a7f8e6-Paper.pdf)\" Advances in Neural Information Processing Systems 32 (2019). 
[[Code](https://github.com/microsoft/DiskANN)]\n- Li, Haitao, et al. \"[Constructing Tree-based Index for Efficient and Effective Dense Retrieval.](https://arxiv.org/pdf/2304.11943.pdf)\" arXiv preprint arXiv:2304.11943 (2023).\n- Engels, Joshua, et al. \"[Approximate Nearest Neighbor Search with Window Filters](https://arxiv.org/html/2402.00943v1).\" arXiv preprint arXiv:2402.00943 (2024).\n- Song, Yang, et al. \"[ProMIPS: Efficient high-dimensional C-approximate maximum inner product search with a lightweight index](https://arxiv.org/pdf/2104.04406).\" 2021 IEEE 37th International Conference on Data Engineering (ICDE). IEEE, 2021.\n- Zhu, Yifan, et al. \"[GTS: GPU-based Tree Index for Fast Similarity Search](https://arxiv.org/html/2404.00966v1).\" arXiv preprint arXiv:2404.00966 (2024).\n- Tatsuno, Kento, et al. \"[AiSAQ: All-in-Storage ANNS with Product Quantization for DRAM-free Information Retrieval](https://arxiv.org/pdf/2404.06004.pdf).\" arXiv preprint arXiv:2404.06004 (2024).\n- Chen, Weijian, et al. \"[Efficient Index Layout and Search Strategy for Large-scale High-dimensional Vector Similarity Search.](https://dl.acm.org/doi/pdf/10.1145/3802045)\" Proceedings of the ACM on Management of Data 4.3 (SIGMOD) (2026): 1-27.\n\n## Hashing\n- :star: [Awesome Papers on Learning to Hash](https://learning2hash.github.io)\n- :star: A survey on learning to hash. Wang, Jingdong, Ting Zhang, Nicu Sebe, and Heng Tao Shen [[Paper](https://arxiv.org/pdf/1606.00185.pdf)]\n- :star: A survey on deep hashing methods. Luo, Xiao, Haixin Wang, Daqing Wu, Chong Chen, Minghua Deng, Jianqiang Huang, and Xian-Sheng Hua. [[Paper](https://dl.acm.org/doi/full/10.1145/3532624)]\n- :star: Moran, Sean. \"[Learning-Based Hashing for ANN Search: Foundations and Early Advances.](https://arxiv.org/pdf/2510.04127)\" arXiv preprint arXiv:2510.04127 (2025).\n- :star: Iterative quantization: A procrustean approach to learning binary codes for large-scale image retrieval. 
Gong, Yunchao, Svetlana Lazebnik, Albert Gordo, and Florent Perronnin [[Paper](https://slazebni.cs.illinois.edu/publications/ITQ.pdf), [Python code](https://github.com/twistedcubic/learn-to-hash/blob/master/itq.py), [Matlab code](https://github.com/dangkhoasdc/sah/tree/master/itq)]\n- Gan, Yukang, et al. \"[Binary Embedding-based Retrieval at Tencent](https://arxiv.org/pdf/2302.08714).\" arXiv preprint arXiv:2302.08714 (2023).\n- Yan, Bencheng, et al. \"[Binary code based hash embedding for web-scale applications](https://arxiv.org/pdf/2109.02471).\" Proceedings of the 30th ACM International Conference on Information \u0026 Knowledge Management. 2021.\n- Weng, Zhenyu, and Yuesheng Zhu. \"[Unsupervised Online Hashing with Multi-Bit Quantization](https://openaccess.thecvf.com/content/ACCV2022/papers/Weng_Unsupervised_Online_Hashing_with_Multi-Bit_Quantization_ACCV_2022_paper.pdf).\" Proceedings of the Asian Conference on Computer Vision. 2022.\n- Huang, Qiang, Yifan Lei, and Anthony KH Tung. \"[Point-to-hyperplane nearest neighbor search beyond the unit hypersphere](https://dl.acm.org/doi/pdf/10.1145/3448016.3457240).\" Proceedings of the 2021 International Conference on Management of Data. 2021.\n- Weng, Zhenyu, Yuesheng Zhu, and Ruixin Liu. \"[Fast Search on Binary Codes by Weighted Hamming Distance](https://arxiv.org/pdf/2009.08591).\" arXiv preprint arXiv:2009.08591 (2020).\n- Jian, Xiaozheng, et al. \"[Fast top-K cosine similarity search through XOR-friendly binary quantization on GPUs](https://arxiv.org/pdf/2008.02002).\" arXiv preprint arXiv:2008.02002 (2020).\n- Zheng, Bolong, et al. \"[PM-LSH: A fast and accurate LSH framework for high-dimensional approximate NN search](https://vbn.aau.dk/files/391642966/p643_zheng_1_.pdf).\" Proceedings of the VLDB Endowment 13.5 (2020): 643-655.\n- Eghbali, Sepehr. 
\"[Scalable Nearest Neighbor Search with Compact Codes](https://uwspace.uwaterloo.ca/bitstream/handle/10012/15355/Eghbali_Sepehr.pdf?sequence=3\u0026isAllowed=y).\" (2019).\n- Lei, Yifan, et al. \"[Locality-sensitive hashing scheme based on longest circular co-substring](https://arxiv.org/pdf/2004.05345).\" Proceedings of the 2020 ACM SIGMOD International Conference on Management of Data. 2020.\n- Wei, Jiuqi, et al. \"[DET-LSH: A Locality-Sensitive Hashing Scheme with Dynamic Encoding Tree for Approximate Nearest Neighbor Search](https://arxiv.org/pdf/2406.10938).\" arXiv preprint arXiv:2406.10938 (2024).\n- Meng, Jingfan. \"[Efficient locality sensitive hashing: solutions, primitives, and applications.](https://repository.gatech.edu/bitstreams/d4bcf2e3-0d86-44c5-8376-d67639a060b4/download)\" (2025).\n- Zhang, Zhibo, et al. \"[LLMs Meet Isolation Kernel: Lightweight, Learning-free Binary Embeddings for Fast Retrieval.](https://arxiv.org/pdf/2601.09159)\" arXiv preprint arXiv:2601.09159 (2026).\n\n## Other Approaches\n- Chen, Qi, et al. \"[Spann: Highly-efficient billion-scale approximate nearest neighbor search](https://papers.nips.cc/paper/2021/file/299dc35e747eb77177d9cea10a802da2-Paper.pdf).\" arXiv preprint arXiv:2111.08566 (2021). [[Code](https://github.com/microsoft/SPTAG)]\n- Li, Yuliang, et al. \"[Index-based, high-dimensional, cosine threshold querying with optimality guarantees](https://arxiv.org/pdf/1812.07695.pdf).\" Theory of Computing Systems 65 (2021): 42-83.\n- Chen, Yewang, et al. \"[Semi-convex hull tree: Fast nearest neighbor queries for large scale data on GPUs](https://www.researchgate.net/profile/Yewang-Chen/publication/330028721_Semi-Convex_Hull_Tree_Fast_Nearest_Neighbor_Queries_for_Large_Scale_Data_on_GPUs/links/5c316845299bf12be3b1ca36/Semi-Convex-Hull-Tree-Fast-Nearest-Neighbor-Queries-for-Large-Scale-Data-on-GPUs.pdf).\" 2018 IEEE International Conference on Data Mining (ICDM). 
IEEE, 2018.\n- Engels, Joshua, Benjamin Coleman, and Anshumali Shrivastava. \"[Practical near neighbor search via group testing](https://arxiv.org/pdf/2106.11565.pdf).\" Advances in Neural Information Processing Systems 34 (2021): 9950-9962. [[Supplement](https://proceedings.neurips.cc/paper_files/paper/2021/file/5248e5118c84beea359b6ea385393661-Supplemental.pdf)]\n- Gong, Long, et al. \"[iDEC: indexable distance estimating codes for approximate nearest neighbor search](https://par.nsf.gov/servlets/purl/10167953).\" Proceedings of the VLDB Endowment 13.9 (2020).\n- Lu, Kejing, et al. \"[VHP: approximate nearest neighbor search via virtual hypersphere partitioning](https://eprints.lib.hokudai.ac.jp/dspace/bitstream/2115/79717/1/3397230.3397240.pdf).\" Proceedings of the VLDB Endowment 13.9 (2020): 1443-1455.\n- Bing Tian, Haikun Liu, Yuhang Tang, Shihai Xiao, Zhuohui Duan, Xiaofei Liao, Xuecang Zhang, Junhua Zhu, Yu Zhang. \"[FusionANNS: An Efficient CPU/GPU Cooperative Processing Architecture for Billion-scale Approximate Nearest Neighbor Search.](https://arxiv.org/pdf/2409.16576)\" (2024).\n- Chen, Zhonghan, et al. \"[Exploring the Meaningfulness of Nearest Neighbor Search in High-Dimensional Space.](https://arxiv.org/pdf/2410.05752)\" arXiv preprint arXiv:2410.05752 (2024).\n- Tepper, Mariano, et al. \"[GleanVec: Accelerating vector search with minimalist nonlinear dimensionality reduction.](https://arxiv.org/pdf/2410.22347)\" arXiv preprint arXiv:2410.22347 (2024).\n- Li, Jingyu, et al. \"[PANTHER: Private Approximate Nearest Neighbor Search in the Single Server Setting.](https://eprint.iacr.org/2024/1774.pdf)\" Cryptology ePrint Archive (2024).\n- Wei, Jiuqi, et al. \"[Subspace Collision: An Efficient and Accurate Framework for High-dimensional Approximate Nearest Neighbor Search.](https://arxiv.org/pdf/2411.14754)\" arXiv preprint arXiv:2411.14754 (2024).\n- Severo, Daniel, et al. 
\"[Lossless Compression of Vector IDs for Approximate Nearest Neighbor Search](https://arxiv.org/pdf/2501.10479).\" arXiv preprint arXiv:2501.10479 (2025).\n- Scheerer, J. L., Zaharia, M., Potts, C., Alonso, G., \u0026 Khattab, O. (2025). WARP: An Efficient Engine for Multi-Vector Retrieval. arXiv [Cs.IR]. Retrieved from http://arxiv.org/abs/2501.17788\n- Kuffo, Leonardo, Elena Krippner, and Peter Boncz. \"[PDX: A Data Layout for Vector Similarity Search](https://arxiv.org/pdf/2503.04422).\" arXiv preprint arXiv:2503.04422 (2025).\n- Dang, Nam Anh, Ben Landrum, and Ken Birman. \"[Passing the Baton: High Throughput Distributed Disk-Based Vector Search with BatANN.](https://arxiv.org/pdf/2512.09331)\" arXiv preprint arXiv:2512.09331 (2025).\n- Chen, Yaoqi, et al. \"[RetroInfer: A Vector Storage Engine for Scalable Long-Context LLM Inference.](https://www.vldb.org/pvldb/vol19/p1016-lu.pdf)\"\n\n\n## Systems\n\n- Qin, An, et al. \"[Maze: A Cost-Efficient Video Deduplication System at Web-scale](https://dl.acm.org/doi/pdf/10.1145/3503161.3548145).\" Proceedings of the 30th ACM International Conference on Multimedia. 2022.\n- Doshi, Ishita, et al. \"[LANNS: a web-scale approximate nearest neighbor lookup system](https://arxiv.org/pdf/2010.09426.pdf).\" arXiv preprint arXiv:2010.09426 (2020).\n- Chen, Yaoqi, et al. \"[OneSparse: A Unified System for Multi-index Vector Search](https://dl.acm.org/doi/pdf/10.1145/3589335.3648338).\" Companion Proceedings of the ACM on Web Conference 2024. 2024.\n- Sun, Ji, et al. \"[GaussDB-Vector: A Large-Scale Persistent Real-Time Vector Database for LLM Applications](https://sunji.greatji.com/resource/p2114-li%20(1).pdf).\"\n- Zhi, Xiangyu, et al. \"[CoTra: Towards Efficient and Scalable Distributed Vector Search with RDMA.](https://dl.acm.org/doi/pdf/10.1145/3786634)\" Proceedings of the ACM on Management of Data 4.1 (SIGMOD) (2026): 1-27.\n- Wang, Yan, et al. 
\"[LindormVector: A Distributed Vector Engine on a Cloud-Native Multi-Model NoSQL Database.](https://dl.acm.org/doi/epdf/10.1145/3788853.3803088)\" Companion of the International Conference on Management of Data. 2026.\n\n## Others\n- [Search Optimization with Query Likelihood Boosting and Two-Level Approximate Search for Edge Devices](https://arxiv.org/abs/2312.07517)\n- Gao, Jianyang, and Cheng Long. \"[High-Dimensional Approximate Nearest Neighbor Search: with Reliable and Efficient Distance Comparison Operations.](https://dl.acm.org/doi/pdf/10.1145/3589282)\" Proceedings of the ACM on Management of Data 1.2 (2023): 1-27.\n- [Approximate Nearest Neighbor Search in Recommender Systems](https://big-ann-benchmarks.com/neurips23_slides/ANNS_for_recommendation_systems_Yury.pdf). Yury Malkov.\n- [Accelerating vector search on the GPU with RAPIDS RAFT](https://big-ann-benchmarks.com/neurips23_slides/NVIDIA_Corey.pdf). Corey Nolet\n- Gupta, Gaurav, et al. \"[CAPS: A Practical Partition Index for Filtered Similarity Search](https://arxiv.org/pdf/2308.15014.pdf).\" arXiv preprint arXiv:2308.15014 (2023).\n- Zhu, Yuhao. \"[RTNN: accelerating neighbor search using hardware ray tracing](https://arxiv.org/pdf/2201.01366.pdf).\" Proceedings of the 27th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming. 2022. [[Code](https://github.com/horizon-research/rtnn)]\n- Levi, Asaf, et al. \"[Physical vs. Logical Indexing with {IDEA}: Inverted {Deduplication-Aware} Index](https://www.usenix.org/system/files/fast24-levi.pdf).\" 22nd USENIX Conference on File and Storage Technologies (FAST 24). 2024. [[Code](https://github.com/asaflevi0812/IDEA)]\n- Carra, Damiano, and Giovanni Neglia. \"[Taking two Birds with one k-NN Cache](http://profs.sci.univr.it/~carra/downloads/Carra_Globecom_21.pdf).\" 2021 IEEE Global Communications Conference (GLOBECOM). IEEE, 2021.\n- Salem, Tareq Si, Giovanni Neglia, and Damiano Carra. 
\"[Ascent Similarity Caching With Approximate Indexes](https://arxiv.org/pdf/2107.00957.pdf).\" IEEE/ACM Transactions on Networking (2022).\n- Li, Conglong, et al. \"[Improving approximate nearest neighbor search through learned adaptive early termination](https://pdl.cmu.edu/PDL-FTP/BigLearning/mod0246-liA.pdf).\" Proceedings of the 2020 ACM SIGMOD International Conference on Management of Data. 2020.\n- Karppa, Matti, Martin Aumüller, and Rasmus Pagh. \"[Deann: Speeding up kernel-density estimation using approximate nearest neighbor search](https://proceedings.mlr.press/v151/karppa22a/karppa22a.pdf).\" International Conference on Artificial Intelligence and Statistics. PMLR, 2022.\n- Wang, Zeyu, et al. \"[Distance Comparison Operators for Approximate Nearest Neighbor Search: Exploration and Benchmark](https://arxiv.org/pdf/2403.13491.pdf).\" arXiv preprint arXiv:2403.13491 (2024).\n- Szilvasy, Gergely, Pierre-Emmanuel Mazaré, and Matthijs Douze. \"[Vector search with small radiuses](https://arxiv.org/pdf/2403.10746).\" arXiv preprint arXiv:2403.10746 (2024).\n- Han, Changhun, Suji Kim, and Ha-Myung Park. \"[Efficient Proximity Search in Time-accumulating High-dimensional Data using Multi-level Block Indexing](https://openproceedings.org/2024/conf/edbt/paper-154.pdf).\" (2024).\n- Tepper, Mariano, et al. \"[LeanVec: Search your vectors faster by making them fit.](https://arxiv.org/pdf/2312.16335v1)\" arXiv preprint arXiv:2312.16335 (2023).\n- Harwood, Ben, et al. \"[Approximate Nearest Neighbour Search on Dynamic Datasets: An Investigation](https://arxiv.org/pdf/2404.19284).\" arXiv preprint arXiv:2404.19284 (2024).\n- [Characterizing the Dilemma of Performance and Index Size in Billion-Scale Vector Search and Breaking It with Second-Tier Memory](https://arxiv.org/pdf/2405.03267)\n- Xu, Haike. 
[Worst-case Performance of Popular Approximate Nearest Neighbor Search Implementations: Guarantees and Limitations](https://dspace.mit.edu/bitstream/handle/1721.1/156284/xu-haikexu-sm-eecs-2024-thesis.pdf?sequence=1\u0026isAllowed=y). Diss. Massachusetts Institute of Technology, 2024.\n- Lin, Jimmy. \"[Operational Advice for Dense and Sparse Retrievers: HNSW, Flat, or Inverted Indexes?.](https://arxiv.org/pdf/2409.06464)\" arXiv preprint arXiv:2409.06464 (2024).\n- Zhou, Mingxun, Elaine Shi, and Giulia Fanti. \"[Pacmann: Efficient Private Approximate Nearest Neighbor Search.](https://eprint.iacr.org/2024/1600.pdf)\" Cryptology ePrint Archive (2024).\n- Vecchiato, Thomas. \"[Learning Cluster Representatives for Approximate Nearest Neighbor Search.](https://arxiv.org/pdf/2412.05921)\" arXiv preprint arXiv:2412.05921 (2024).\n- Special Issue on High-Dimensional Vector Similarity Search: The Role of Machine Learning, and Future Perspectives [[PDF](http://sites.computer.org/debull/A24sept/A24SEPT-CD.pdf#page=45)]\n\n\n## :chart_with_upwards_trend: Evaluation \u0026 Metrics\n- Which BM25 do you mean? A large-scale reproducibility study of scoring variants. Kamphuis, Chris, Arjen P. 
de Vries, Leonid Boytsov, and Jimmy Lin [[Paper](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7148026/)]\n\n## 📰 Articles \u0026 Talks\n- [What is a Vector Database?](https://www.pinecone.io/learn/vector-database/)\n- Vector databases (Part 1): [What makes each one different?](https://thedataquarry.com/posts/vector-db-1/)\n- [eBay’s Blazingly Fast Billion-Scale Vector Similarity Engine](https://tech.ebayinc.com/engineering/ebays-blazingly-fast-billion-scale-vector-similarity-engine/)\n- [Computer Vision Meetup: Computer Vision Applications at Scale with Vector Databases](https://www.youtube.com/watch?v=YTIDj7jeRbs)\n- [How to choose your vector database in 2023?](https://www.sicara.fr/blog-technique/how-to-choose-your-vector-database-in-2023)\n- [Do we really need a specialized vector database?](https://modelz.ai/blog/pgvector)\n- [Vector database is not a separate database category](https://nextword.substack.com/p/vector-database-is-not-a-separate)\n- [Vector Databases: A First-Principles Approach](https://docs.google.com/presentation/d/1qRv2nGVHjbFHXyUeUKK7bbvboj7Yal8UYcu_POEfWOQ/edit#slide=id.p)\n- [Vector Search RAG Tutorial – Combine Your Data with LLMs with Advanced Search](https://www.youtube.com/watch?v=JEBDfGqrAUA\u0026ab_channel=freeCodeCamp.org)\n- [Efficient Vector Similarity Search in Recommender Workflows Using Milvus with NVIDIA Merlin](https://milvus.io/blog/efficient-vector-similarity-search-recommender-workflows-using-milvus-nvidia-merlin.md)\n- [Vector Databases: A Beginner’s Guide!](https://medium.com/data-and-beyond/vector-databases-a-beginners-guide-b050cbbe9ca0)\n- [Vector Database and Spring IA](https://dev.to/lucasnscr/vector-database-and-spring-ia-4dll)\n- [How to handle a Million Vector Embeddings in the RAG Applications](https://medium.datadriveninvestor.com/how-to-handle-a-million-embedding-vectors-in-the-rag-application-d10b875a0218)\n- [How Meilisearch Updates a Millions Vector Embeddings Database in Under a 
Minute](https://blog.kerollmops.com/how-meilisearch-updates-a-millions-vector-embeddings-database-in-under-a-minute)\n- [Common Pitfalls To Avoid When Using Vector Databases](https://dagshub.com/blog/common-pitfalls-to-avoid-when-using-vector-databases/)\n- [Getting Started With Vector Databases](https://dzone.com/refcardz/getting-started-with-vector-databases)\n- [Choosing the best model for semantic search](https://blog.meilisearch.com/choosing-the-best-model-for-semantic-search/)\n\n## Related Lists\n\n- [Awesome Vector Search Engine](https://github.com/currentslab/awesome-vector-search)\n\n","projects_url":"https://awesome.ecosyste.ms/api/v1/lists/dangkhoasdc%2Fawesome-vector-database/projects"}