{"id":23988585,"url":"https://github.com/argmaxml/vecsim","last_synced_at":"2025-04-14T12:14:52.941Z","repository":{"id":63710428,"uuid":"570074028","full_name":"argmaxml/vecsim","owner":"argmaxml","description":null,"archived":false,"fork":false,"pushed_at":"2023-11-30T13:40:25.000Z","size":60,"stargazers_count":4,"open_issues_count":0,"forks_count":2,"subscribers_count":2,"default_branch":"master","last_synced_at":"2025-04-14T12:14:47.102Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/argmaxml.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null}},"created_at":"2022-11-24T09:26:18.000Z","updated_at":"2024-01-12T18:48:50.000Z","dependencies_parsed_at":"2023-11-30T11:28:48.144Z","dependency_job_id":"43f0f73e-e1dc-435d-9189-2f6e86d6096e","html_url":"https://github.com/argmaxml/vecsim","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/argmaxml%2Fvecsim","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/argmaxml%2Fvecsim/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/argmaxml%2Fvecsim/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/argmaxml%2Fvecsim/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/argmaxml","download_url":"https://codeload.github.com/argmaxml/vecsim/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248877961,"owners_count
":21176244,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2025-01-07T16:13:39.610Z","updated_at":"2025-04-14T12:14:52.918Z","avatar_url":"https://github.com/argmaxml.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# VecSim - A unified interface for similarity servers\r\nA standard, light-weight interface to all popular similarity servers.\r\n\r\n## The problems we are trying to solve:\r\n1. **Standard API** - Different vector similarity servers have different APIs - so switching is not trivial.\r\n1. **Identifiers** - Some vector similarity servers support string IDs, some do not - we keep track of the mapping.\r\n1. **Partitions** - In most cases, pre-filtering is needed prior to querying, we abstract this concept away.\r\n1. **Aggregations** - In some cases, one item is being indexed to multiple vectors.\r\n\r\n## Supported engines:\r\n1. Scikit-learn, via [NearestNeighbors](https://scikit-learn.org/stable/modules/generated/sklearn.neighbors.NearestNeighbors.html)\r\n1. [RediSearch](https://redis.io/docs/stack/search/reference/vectors/)\r\n1. [Faiss](https://github.com/facebookresearch/faiss)\r\n1. [ElasticSearch](https://www.elastic.co)\r\n1. 
[Pinecone](https://www.pinecone.io)\r\n\r\n\r\n## QuickStart example\r\n```python\r\nimport numpy as np\r\n# Import a similarity server of your choice:\r\n# scikit-learn (best for small datasets or testing)\r\nfrom vecsim import SciKitIndex\r\nsim = SciKitIndex(metric='cosine', dim=32)\r\n\r\n# Generate 100 random user vectors and 100 random item vectors (32 dimensions each)\r\nuser_ids = [\"user_\"+str(1+i) for i in range(100)]\r\nuser_data = np.random.random((100,32))\r\nitem_ids = [\"item_\"+str(101+i) for i in range(100)]\r\nitem_data = np.random.random((100,32))\r\n# Add each set of vectors to its own partition\r\nsim.add_items(user_data, user_ids, partition=\"users\")\r\nsim.add_items(item_data, item_ids, partition=\"items\")\r\n# Index the data\r\nsim.init()\r\n# Run nearest neighbor vector search\r\nquery = np.random.random(32)\r\ndists, items = sim.search(query, k=10) # searches all partitions: results may include users and items\r\ndists, items = sim.search(query, k=10, partition=\"users\") # restricted to the users partition: results are users only\r\n```\r\n\r\nFor more examples, please read our [documentation](https://vecsim.readthedocs.io/).","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fargmaxml%2Fvecsim","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fargmaxml%2Fvecsim","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fargmaxml%2Fvecsim/lists"}