{"id":13525839,"url":"https://github.com/arcmindai/arcmindvector","last_synced_at":"2025-04-01T05:32:32.769Z","repository":{"id":221805216,"uuid":"717296488","full_name":"arcmindai/arcmindvector","owner":"arcmindai","description":"ArcMind Vector DB","archived":false,"fork":false,"pushed_at":"2024-05-05T06:30:14.000Z","size":239,"stargazers_count":8,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2024-11-02T10:34:03.402Z","etag":null,"topics":["ai","approximate-nearest-neighbor-search","internetcomputer","nearest-neighbor-search","retrieval-augmented-generation","rust-lang","similarity-search","smart-contracts","vector-database"],"latest_commit_sha":null,"homepage":"https://arcmindai.app","language":"Rust","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/arcmindai.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2023-11-11T03:19:22.000Z","updated_at":"2024-10-26T13:28:24.000Z","dependencies_parsed_at":null,"dependency_job_id":"62ddb2d5-d43c-4516-865e-5ad0cb9ff7a4","html_url":"https://github.com/arcmindai/arcmindvector","commit_stats":null,"previous_names":["arcmindai/arcmindvector"],"tags_count":1,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/arcmindai%2Farcmindvector","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/arcmindai%2Farcmindvector/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/arcmindai%2Farcmindvector/releases","
manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/arcmindai%2Farcmindvector/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/arcmindai","download_url":"https://codeload.github.com/arcmindai/arcmindvector/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":246591752,"owners_count":20801984,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["ai","approximate-nearest-neighbor-search","internetcomputer","nearest-neighbor-search","retrieval-augmented-generation","rust-lang","similarity-search","smart-contracts","vector-database"],"created_at":"2024-08-01T06:01:22.739Z","updated_at":"2025-04-01T05:32:32.397Z","avatar_url":"https://github.com/arcmindai.png","language":"Rust","readme":"# Arcmind Vector DB\n\nArcmind Vector DB is a high-performance, flexible, and ergonomic vector similarity search database for the [Internet Computer](https://internetcomputer.org). 
It is designed to be a general-purpose vector similarity search database that can be used for a wide range of AI-powered applications, including recommendation systems, search engines, [Retrieval Augmented Generation](https://arxiv.org/abs/2005.11401) (RAG), and long-term memory for autonomous AI agents like [ArcMind AI](https://github.com/arcmindai/arcmindai).\n\n## Architecture\n\nSequence Flow Diagram\n![ArcMind Vector DB](/diagram/architecture.png)\n\n## Prerequisites\n\n- Install Rust Toolchain using Rustup  \n  Follow https://www.rust-lang.org/tools/install\n- Install cargo-audit\n\n```bash\ncargo install cargo-audit\n```\n\n- Install dfx sdk  \n  Follow https://github.com/dfinity/sdk\n\n## Quick Start\n\nIf you want to test your project locally, you can use the following commands:\n\n```bash\n# Starts the replica, running in the background\ndfx start --background\n\n# Deploys the arcmindvectordb canister to the local replica\n# Set the environment variable CONTROLLER_PRINCIPAL first, using the output of: dfx identity get-principal\n\n./scripts/provision.sh\n```\n\nThe provision script will deploy an `arcmindvectordb` canister.\n\n## API\n\nSee [Candid](/src/arcmindvectordb/arcmindvectordb.did) for the full API.\n\n## Interacting with the canisters\n\nSample shell scripts are provided to interact with the canisters in the [interact](/interact/) directory.\nSample embedding content and the corresponding embedding vectors are provided in the [embeddings](/embeddings/) directory.\n\n### Add a vector to the VectorStore\n\nOpen and edit:\n\n```bash\n./interact/add_vector.sh\n```\n\nTry adding multiple vectors of different topics to the VectorStore.\n\n### Search the VectorStore\n\nThen search for similar vectors by using one of the vectors you added as input.\nIt should return that same vector as the most similar result, followed by other similar vectors of the same topic.\nThis shows how the search captures the semantic meaning encoded in high-dimensional vectors.\n\nOpen and 
edit:\n\n```bash\n./interact/search_vector.sh\n```\n\nNote that the same embedding model must be used for adding and searching vectors.\nUse a single embedding model per VectorStore for consistent results.\n\nThe embeddings in /embeddings/ are generated using the [OpenAI text-embedding-ada-002](https://platform.openai.com/docs/guides/embeddings/embedding-models) model with its [Embedding API](https://platform.openai.com/docs/api-reference/embeddings).\n\n## Setting up GitHub Actions CI/CD\n\nGet each string using the commands below, then store it in GitHub Secrets.\nNote: Replace `default` with the name of the identity you need.\n\n### DFX_IDENTITY\n\n```bash\nawk 'NF {sub(/\\r/, \"\"); printf \"%s\\\\r\\\\n\",$0;}' ~/.config/dfx/identity/default/identity.pem\n```\n\n### DFX_WALLETS\n\n```bash\ncat ~/.config/dfx/identity/default/wallets.json\n```\n\n## Roadmap\n\n- [x] Backend - Research and implement primary canister as long-term VectorStore with Nearest Neighbours distance metric, embedding API and indexing\n- [x] Backend - Integrate with ArcMind AI Autonomous Agent for long-term memory\n- [ ] Doc - Add documentation for the VectorStore API\n- [ ] Backend - Self-hosted machine learning models for generating text (NLP), image and audio embeddings\n- [ ] Backend - Scalable storage buckets for large-scale vector data beyond the canister storage limit\n\n## License\n\nSee the [License](LICENSE) file for license rights and limitations (MIT).\n\n## Contributing\n\nSee [CONTRIBUTING.md](CONTRIBUTING.md) for details about how to contribute to this project.\n\n## Authors\n\nCode \u0026 Architecture: Henry Chan, [henry@arcmindai.app](mailto:henry@arcmindai.app), Twitter: [@kinwo](https://twitter.com/kinwo)\n\n## References\n\n- [Internet Computer](https://internetcomputer.org)\n- [Cloudflare - What is a Vector Database?](https://developers.cloudflare.com/vectorize/reference/what-is-a-vector-database/)\n- [RAG](https://arxiv.org/abs/2005.11401)\n- [Open-source 
vector similarity search for Postgres](https://github.com/pgvector/pgvector)\n- [Spotify Annoy Library - Approximate Nearest Neighbors in C++/Python](https://github.com/spotify/annoy)\n- [What is Similarity Search](https://www.pinecone.io/learn/what-is-similarity-search/)\n- [Semantic Search: Measuring Meaning From Jaccard to Bert](https://www.pinecone.io/learn/semantic-search/)\n- [A high-performance, flexible, ergonomic k-d tree Rust library](https://github.com/sdd/kiddo)\n- [K-d tree](https://en.wikipedia.org/wiki/K-d_tree)\n- [DeepLearning.AI course - Building Applications with Vector Databases](https://www.deeplearning.ai/short-courses/building-applications-vector-databases/)\n","funding_links":[],"categories":["Decentralized AI"],"sub_categories":["TON"],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Farcmindai%2Farcmindvector","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Farcmindai%2Farcmindvector","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Farcmindai%2Farcmindvector/lists"}