{"id":40871700,"url":"https://github.com/datawhalechina/easy-vecdb","last_synced_at":"2026-02-11T21:31:15.647Z","repository":{"id":293374437,"uuid":"983835336","full_name":"datawhalechina/easy-vecdb","owner":"datawhalechina","description":"📚 从零开始的向量数据库原理与实践教程，在线阅读地址：https://datawhalechina.github.io/easy-vecdb/","archived":false,"fork":false,"pushed_at":"2026-01-22T03:11:13.000Z","size":170916,"stargazers_count":184,"open_issues_count":0,"forks_count":39,"subscribers_count":3,"default_branch":"main","last_synced_at":"2026-01-22T12:24:04.188Z","etag":null,"topics":["ai-project","annoy","faiss","hnsw","ivfflat","lsh","milvus","rag","vector-database"],"latest_commit_sha":null,"homepage":"https://datawhalechina.github.io/easy-vecdb/","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/datawhalechina.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2025-05-15T01:54:51.000Z","updated_at":"2026-01-22T05:22:57.000Z","dependencies_parsed_at":"2025-05-15T02:41:33.919Z","dependency_job_id":"4f05d8e5-48df-4bb5-8207-470e7d13e89e","html_url":"https://github.com/datawhalechina/easy-vecdb","commit_stats":null,"previous_names":["muxiaoxiong/easy-vectordb","datawhalechina/easy-vectordb","datawhalechina/easy-vecdb"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/datawhalechina/easy-vecdb","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/datawhalechina%2Feasy-vecdb"
,"tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/datawhalechina%2Feasy-vecdb/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/datawhalechina%2Feasy-vecdb/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/datawhalechina%2Feasy-vecdb/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/datawhalechina","download_url":"https://codeload.github.com/datawhalechina/easy-vecdb/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/datawhalechina%2Feasy-vecdb/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":29345484,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-02-11T20:11:40.865Z","status":"ssl_error","status_checked_at":"2026-02-11T20:10:41.637Z","response_time":97,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.5:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["ai-project","annoy","faiss","hnsw","ivfflat","lsh","milvus","rag","vector-database"],"created_at":"2026-01-22T00:40:55.477Z","updated_at":"2026-02-11T21:31:15.617Z","avatar_url":"https://github.com/datawhalechina.png","language":"Jupyter Notebook","funding_links":[],"categories":[],"sub_categories":[],"readme":"\u003cdiv align='center'\u003e\r\n    \u003cimg src=\"./logo.png\" alt=\"alt text\" width=\"100%\"\u003e\r\n    
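\u003ch1\u003eEasy-vecDB (⚠️ Alpha Preview)\u003c/h1\u003e\r\n\u003c/div\u003e\r\n\r\n\u003e [!CAUTION]\r\n\u003e ⚠️ Alpha preview warning: this is an early internal build that is incomplete and may contain errors. Please open an Issue to report problems or share suggestions.\r\n\r\n\u003cdiv align=\"center\"\u003e\r\n\r\n[![GitHub stars](https://img.shields.io/github/stars/datawhalechina/easy-vecdb?style=flat-square)](https://github.com/datawhalechina/easy-vecdb/stargazers) [![GitHub forks](https://img.shields.io/github/forks/datawhalechina/easy-vecdb?style=flat-square)](https://github.com/datawhalechina/easy-vecdb/network/members) [![GitHub issues](https://img.shields.io/github/issues/datawhalechina/easy-vecdb?style=flat-square)](https://github.com/datawhalechina/easy-vecdb/issues) [![GitHub license](https://img.shields.io/github/license/datawhalechina/easy-vecdb?style=flat-square)](https://github.com/datawhalechina/easy-vecdb/blob/main/LICENSE)\r\n\r\n[中文](./README.md) | [English](./README_en.md)\r\n\r\n[📚 Read online](https://datawhalechina.github.io/easy-vecdb/)\r\n\r\n📚 A from-scratch tutorial on vector database principles and practice\r\n\r\n\u003c/div\u003e\r\n\r\n## 🧭 Project Overview\r\n\r\n**EasyVecDB** is a **systematic learning project on vector databases** for developers and researchers.  \r\nIt covers the full path from basic concepts and algorithm principles to production-grade deployment, with three areas of focus:\r\n\r\n- 🧩 **Theory primer**: understand the principles, architecture, and indexing mechanisms of vector databases  \r\n- ⚙️ **Hands-on tutorials**: learn to use and tune Milvus / Faiss / Annoy  \r\n- 💡 **Project case studies**: build complete projects from scratch, including RAG, embedding retrieval, and cluster visualization  \r\n\r\n---\r\n\r\n\r\n## 📖 Content Navigation\r\n\r\nThe content spans **fundamentals** and **hands-on practice**, organized into the five parts listed below, which match the navigation bar configuration:\r\n\r\n| Chapter | Key Content | Status |\r\n| --- | --- | --- |\r\n| \u003cstrong\u003ePart 1: Fundamentals (Base)\u003c/strong\u003e | Vector database principles, embedding and search basics |  |\r\n| [Chapter 1 Project Introduction](./docs/base/chapter1/项目介绍.md) | Project goals, overall learning path | ✅ |\r\n| [Chapter 2 Why We Need Vector Databases](./docs/base/chapter2/为什么需要向量数据库.md) | Retrieval bottlenecks, principles of similarity search | ✅ |\r\n| [Chapter 3 Fundamentals of Vector Embedding Algorithms](./docs/base/chapter3/向量嵌入算法基础.md) | Word2Vec, Transformer Embedding | ✅ |\r\n| [Chapter 4 Fundamentals of Vector Search Algorithms](./docs/base/chapter4/向量搜索算法基础.md) | Brute-force search, vector similarity | ✅ |\r\n| [Chapter 5 ANN Search Algorithms](./docs/base/chapter5/ANN搜索算法.md) | Principles and hands-on code for IVF, PQ, HNSW, LSH, and Annoy | ✅ |\r\n| [Chapter 6 Build Your Own Vector Database](./docs/base/chapter6/实现你自己的向量数据库.md) | A minimal vector database implementation | ✅ 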
|\r\n| \u003cstrong\u003ePart 2: Faiss Tutorial (Faiss)\u003c/strong\u003e | High-performance vector search engine in practice |  |\r\n| [Chapter 1 FAISS Introduction and Environment Setup](./docs/Faiss/chapter1/FAISS入门与环境搭建.md) | Installation and configuration, basic concepts | ✅ |\r\n| [Chapter 2 FAISS Core Indexes in Practice](./docs/Faiss/chapter2/FAISS数据结构与索引.md) | Flat, IVF, PQ, HNSW, and other indexes | ✅ |\r\n| [Chapter 3 FAISS Advanced Core Features](./docs/Faiss/chapter3/FAISS核心功能进阶.md) | Composite indexes, GPU, batch search | ✅ |\r\n| [Chapter 4 FAISS Performance Tuning and Evaluation](./docs/Faiss/chapter4/FAISS性能调优与评估.md) | Recall, latency, and memory tuning | ✅ |\r\n| [Chapter 5 FAISS Engineering in Practice](./docs/Faiss/chapter5/FAISS工程化落地实战.md) | Project structure, serving, case studies | ✅ |\r\n| \u003cstrong\u003ePart 3: Milvus Tutorial (Milvus)\u003c/strong\u003e | Distributed vector database and engineering practice |  |\r\n| [Chapter 1 Getting Started with Milvus: Concepts and Architecture](./docs/Milvus/chapter1/Milvus向量数据库入门.md) | Architecture design, core components | ✅ |\r\n| [Chapter 2 Milvus Core Concepts: Data Model and Index System](./docs/Milvus/chapter2/Milvus核心概念.md) | Collection, Partition, Index | ✅ |\r\n| [Chapter 3 Milvus Basic Operations: PyMilvus Core API in Practice](./docs/Milvus/chapter3/PyMilvus核心API实战.md) | Data insertion, querying, and index management in Milvus | ✅ |\r\n| [Chapter 4 AI Application Development with Milvus: Hands-On Hybrid Search with BM25](./docs/Milvus/chapter4/Milvus的AI应用开发.md) | RAG, hybrid vector retrieval applications | ✅ |\r\n| [Chapter 5 AI Application Development with Milvus: Image Retrieval in Practice](./docs/Milvus/docs/Milvus/chapter5/Milvus的AI应用开发.md) | Image retrieval application | ✅ |\r\n| [Chapter 6 Milvus Optional Topics](./docs/Milvus/chapter6/Milvus底层架构详解.md) | Milvus internal architecture in depth, Milvus reranker, Milvus Lite deployment and usage, MinerU deployment guide | ✅ |\r\n| \u003cstrong\u003ePart 4: AI Application Development with Vector Databases\u003c/strong\u003e |  |  |\r\n| [Project 1 RAG Project with FAISS](./docs/projects/project1/README.md) | RAG with FAISS | ✅ |\r\n| [Project 2 Agent Project with Milvus](./docs/projects/project2/README.md) | Agent with Milvus | ✅ |\r\n| [Project 3 RAG System with Milvus and ArangoDB](./docs/projects/project3/README.md) | RAG with Milvus \u0026 ArangoDB | ✅ |\r\n| \u003cstrong\u003ePart 5: Supplementary Material\u003c/strong\u003e | Topics related to vector databases |  |\r\n| [Vector Basics](./docs/more/chapter5/向量.md) | Basic vector concepts and the underlying math | ✅ |\r\n| [FusionANNS Architecture Design](./docs/more/chapter1/GPU加速检索-基于FusionANNS.md) | GPU-accelerated retrieval system architecture | ✅ |\r\n| [Meta-Chunking Strategy](./docs/more/chapter2/Meta-Chunking：一种新的文本切分策略.md) | 
Intelligent text chunking algorithm | ✅ |\r\n| [Theoretical Limits of Retrieval](./docs/more/chapter3/Limit基于嵌入检索的理论极限.md) | Analysis of the performance bounds of vector retrieval | ✅ |\r\n| [RaBitQ Indexing](./docs/more/chapter4/RabitQ：用于近似最近邻搜索的带理论误差界的高维向量量化.md) | High-dimensional vector quantization method | ✅ |\r\n| [Clustering Algorithms](./docs/more/chapter6/聚类算法介绍.md) | Introduction to clustering algorithms | ✅ |\r\n\r\n⏳ **Continuously updated...** \r\n\r\n\u003e 📘 This project aims to take you through the full path of **principles → practice → deployment**, so that you master both the core concepts and the practical skills of vector databases.\r\n\r\n\r\n## 🛠️ Project Directory Structure\r\n\r\n```\r\n.\r\n├── docs Vector database study guides and project documentation\r\n├── data Shared sample data\r\n├── src  Project source code\r\n└── tmp  Temporary files\r\n```\r\n\r\n## 📄 Additional Resources\r\n\r\n- 📚 [About the Datawhale Community](./docs/Datawhale%E7%A4%BE%E5%8C%BA%E4%BB%8B%E7%BB%8D.pdf)\r\n- 🌐 [Online documentation site](https://datawhalechina.github.io/easy-vecdb/)\r\n- 💻 [Project source code](https://github.com/datawhalechina/easy-vecdb/tree/main/src)\r\n\r\nRelated competitions:\r\n- 🚩[2025 National College Student Computer System Capability Competition: 2nd PolarDB Database Innovation Design Contest](https://tianchi.aliyun.com/competition/entrance/532409)\r\n\r\n## 🤝 Contributing\r\n\r\n- If you find a problem, open an Issue. If nobody responds, contact the [support team](https://github.com/datawhalechina/DOPMC/blob/main/OP.md) to follow up.\r\n- If you would like to contribute to this project, open a Pull Request. If nobody responds, contact the [support team](https://github.com/datawhalechina/DOPMC/blob/main/OP.md) to follow up.\r\n- If you are interested in Datawhale and want to start a new project, follow the [Datawhale Open Source Project Guide](https://github.com/datawhalechina/DOPMC/blob/main/GUIDE.md).\r\n\r\n### Core Contributors\r\n- [牧小熊 - Project Lead](https://github.com/muxiaoxiong) (Datawhale member)\r\n- [刘晓 - Contributor](https://github.com/Halukisan) (Datawhale elite teaching assistant)\r\n- [柯慕灵 - Contributor](https://github.com/1985312383) (Datawhale member)\r\n- [赵鑫龙 - Contributor](https://github.com/xiaoming910) (Datawhale elite teaching assistant)\r\n- [陈辅元 - Contributor](https://github.com/Fyuan0206) (Datawhale member)\r\n### Special Thanks\r\n\r\n- Thanks to [@Sm1les](https://github.com/Sm1les) for helping and supporting this project\r\n- Thanks to all the developers who have contributed to this project ❤️\r\n\r\n\u003cdiv align=\"left\"\u003e\r\n\r\n\u003ca href=\"https://github.com/datawhalechina/easy-vecdb/graphs/contributors\"\u003e\r\n  \u003cimg src=\"https://contrib.rocks/image?repo=datawhalechina/easy-vecdb\" /\u003e\r\n\u003c/a\u003e\r\n\r\n\u003c/div\u003e\r\n\r\n\r\n## 
Follow Us\r\n\r\n\u003cdiv align=center\u003e\r\n\u003cp\u003eScan the QR code below to follow the Datawhale WeChat official account\u003c/p\u003e\r\n\u003cimg src=\"https://raw.githubusercontent.com/datawhalechina/pumpkin-book/master/res/qrcode.jpeg\" width = \"180\" height = \"180\"\u003e\r\n\u003c/div\u003e\r\n\r\n\r\n## 📊 Star History\r\n\r\n\u003cdiv align=\"center\"\u003e\r\n\r\n\u003cpicture\u003e\r\n  \u003csource media=\"(prefers-color-scheme: dark)\" srcset=\"https://api.star-history.com/svg?repos=datawhalechina/easy-vecdb\u0026type=Date\u0026theme=dark\" /\u003e\r\n  \u003csource media=\"(prefers-color-scheme: light)\" srcset=\"https://api.star-history.com/svg?repos=datawhalechina/easy-vecdb\u0026type=Date\" /\u003e\r\n  \u003cimg alt=\"Star History Chart\" src=\"https://api.star-history.com/svg?repos=datawhalechina/easy-vecdb\u0026type=Date\" /\u003e\r\n\u003c/picture\u003e\r\n\r\n---\r\n\u003c/div\u003e\r\n\r\n## 📜 License\r\n\r\n\u003cdiv align=\"left\"\u003e\r\n\r\n\u003ca rel=\"license\" href=\"http://creativecommons.org/licenses/by-nc-sa/4.0/\"\u003e\r\n  \u003cimg alt=\"Creative Commons License\" style=\"border-width:0\" src=\"https://img.shields.io/badge/license-CC%20BY--NC--SA%204.0-lightgrey\" /\u003e\r\n\u003c/a\u003e\r\n\r\nThis work is licensed under a [Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License](http://creativecommons.org/licenses/by-nc-sa/4.0/).\r\n\r\n**Made with ❤️ by Datawhale**\r\n\r\n\u003c/div\u003e\r\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdatawhalechina%2Feasy-vecdb","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fdatawhalechina%2Feasy-vecdb","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdatawhalechina%2Feasy-vecdb/lists"}