{"id":29571844,"url":"https://github.com/oceanbase/pyobvector","last_synced_at":"2025-07-19T04:36:19.848Z","repository":{"id":260988161,"uuid":"862697041","full_name":"oceanbase/pyobvector","owner":"oceanbase","description":"pyobvector: A python SDK for OceanBase Multimodal Store (Vector Store / Full Text Search / JSON Table), based on SQLAlchemy, compatible with Milvus API.","archived":false,"fork":false,"pushed_at":"2025-06-09T13:11:32.000Z","size":162,"stargazers_count":11,"open_issues_count":3,"forks_count":3,"subscribers_count":5,"default_branch":"main","last_synced_at":"2025-06-09T13:32:42.180Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/oceanbase.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2024-09-25T03:26:25.000Z","updated_at":"2025-06-09T13:10:52.000Z","dependencies_parsed_at":"2024-12-20T10:38:06.133Z","dependency_job_id":"4ff6ab00-89d8-439c-b060-d8034ccbb91c","html_url":"https://github.com/oceanbase/pyobvector","commit_stats":null,"previous_names":["oceanbase/pyobvector"],"tags_count":34,"template":false,"template_full_name":null,"purl":"pkg:github/oceanbase/pyobvector","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/oceanbase%2Fpyobvector","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/oceanbase%2Fpyobvector/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/oceanbase%2Fpyobvector/releases","manifests_url":"https://repos.ecosys
te.ms/api/v1/hosts/GitHub/repositories/oceanbase%2Fpyobvector/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/oceanbase","download_url":"https://codeload.github.com/oceanbase/pyobvector/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/oceanbase%2Fpyobvector/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":265889155,"owners_count":23844539,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2025-07-19T04:36:19.327Z","updated_at":"2025-07-19T04:36:19.828Z","avatar_url":"https://github.com/oceanbase.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# pyobvector\n\nA Python SDK for OceanBase Multimodal Store (Vector Store / Full Text Search / JSON Table), based on SQLAlchemy, compatible with the Milvus API.\n\n[![Downloads](https://static.pepy.tech/badge/pyobvector)](https://pepy.tech/project/pyobvector)  [![Downloads](https://static.pepy.tech/badge/pyobvector/month)](https://pepy.tech/project/pyobvector)\n\n## Installation\n\n- clone this repo, then install with:\n\n```shell\npoetry install\n```\n\n- install with pip:\n\n```shell\npip install pyobvector==0.2.14\n```\n\n## Build Doc\n\nYou can build the documentation locally with `sphinx`:\n\n```shell\nmkdir build\nmake html\n```\n\n## Usage\n\n`pyobvector` supports two modes:\n\n- `Milvus compatible mode`: You can use the `MilvusLikeClient` class to use vector storage in a way similar to the Milvus API\n- `SQLAlchemy hybrid mode`: You can use the vector storage 
function provided by the `ObVecClient` class and execute relational database statements with the SQLAlchemy library. In this mode, you can regard `pyobvector` as an extension of SQLAlchemy.\n\n### Milvus compatible mode\n\nRefer to `tests/test_milvus_like_client.py` for more examples.\n\nA simple workflow to perform ANN search with OceanBase Vector Store:\n\n- set up a client:\n\n```python\nfrom pyobvector import *\n\nclient = MilvusLikeClient(uri=\"127.0.0.1:2881\", user=\"test@test\")\n```\n\n- create a collection with vector index:\n\n```python\ntest_collection_name = \"ann_test\"\n# define the schema of collection with optional partitions\nrange_part = ObRangePartition(False, range_part_infos = [\n    RangeListPartInfo('p0', 100),\n    RangeListPartInfo('p1', 'maxvalue'),\n], range_expr='id')\nschema = client.create_schema(partitions=range_part)\n# define field schema of collection\nschema.add_field(field_name=\"id\", datatype=DataType.INT64, is_primary=True)\nschema.add_field(field_name=\"embedding\", datatype=DataType.FLOAT_VECTOR, dim=3)\nschema.add_field(field_name=\"meta\", datatype=DataType.JSON, nullable=True)\n# define index parameters\nidx_params = client.prepare_index_params()\nidx_params.add_index(\n    field_name='embedding',\n    index_type=VecIndexType.HNSW,\n    index_name='vidx',\n    metric_type=\"L2\",\n    params={\"M\": 16, \"efConstruction\": 256},\n)\n# create collection\nclient.create_collection(\n    collection_name=test_collection_name,\n    schema=schema,\n    index_params=idx_params,\n)\n```\n\n- insert data into your collection:\n\n```python\n# prepare\nvector_value1 = [0.748479,0.276979,0.555195]\nvector_value2 = [0, 0, 0]\ndata1 = [{'id': i, 'embedding': vector_value1} for i in range(10)]\ndata1.extend([{'id': i, 'embedding': vector_value2} for i in range(10, 13)])\ndata1.extend([{'id': i, 'embedding': vector_value2} for i in range(111, 113)])\n# insert data\nclient.insert(collection_name=test_collection_name, 
data=data1)\n```\n\n- perform an ANN search:\n\n```python\nres = client.search(collection_name=test_collection_name, data=[0,0,0], anns_field='embedding', limit=5, output_fields=['id'])\n# For example, the result will be:\n# [{'id': 112}, {'id': 111}, {'id': 10}, {'id': 11}, {'id': 12}]\n```\n\n### SQLAlchemy hybrid mode\n\n- set up a client:\n\n```python\nfrom pyobvector import *\nfrom sqlalchemy import Column, Integer, JSON\nfrom sqlalchemy import func\n\nclient = ObVecClient(uri=\"127.0.0.1:2881\", user=\"test@test\")\n```\n\n- create a partitioned table with vector index:\n\n```python\n# table name (same value as in the Milvus compatible example)\ntest_collection_name = \"ann_test\"\n# create partitioned table\nrange_part = ObRangePartition(False, range_part_infos = [\n    RangeListPartInfo('p0', 100),\n    RangeListPartInfo('p1', 'maxvalue'),\n], range_expr='id')\n\ncols = [\n    Column('id', Integer, primary_key=True, autoincrement=False),\n    Column('embedding', VECTOR(3)),\n    Column('meta', JSON)\n]\nclient.create_table(test_collection_name, columns=cols, partitions=range_part)\n\n# create vector index\nclient.create_index(\n    test_collection_name, \n    is_vec_index=True, \n    index_name='vidx',\n    column_names=['embedding'],\n    vidx_params='distance=l2, type=hnsw, lib=vsag',\n)\n```\n\n- insert data into your collection:\n\n```python\n# insert data\nvector_value1 = [0.748479,0.276979,0.555195]\nvector_value2 = [0, 0, 0]\ndata1 = [{'id': i, 'embedding': vector_value1} for i in range(10)]\ndata1.extend([{'id': i, 'embedding': vector_value2} for i in range(10, 13)])\ndata1.extend([{'id': i, 'embedding': vector_value2} for i in range(111, 113)])\nclient.insert(test_collection_name, data=data1)\n```\n\n- perform an ANN search:\n\n```python\n# perform ann search\nres = client.ann_search(\n    test_collection_name, \n    vec_data=[0,0,0], \n    vec_column_name='embedding',\n    distance_func=l2_distance,\n    topk=5,\n    output_column_names=['id']\n)\n# For example, the result will be:\n# [(112,), (111,), (10,), (11,), (12,)]\n```\n\n- If you want to use pure 
`SQLAlchemy` API with the `OceanBase` dialect, you can get an `SQLAlchemy.engine` via `client.engine`. The engine can also be created as follows:\n\n```python\nimport pyobvector\nfrom sqlalchemy.dialects import registry\nfrom sqlalchemy import create_engine\n\nuri: str = \"127.0.0.1:2881\"\nuser: str = \"root@test\"\npassword: str = \"\"\ndb_name: str = \"test\"\nregistry.register(\"mysql.oceanbase\", \"pyobvector.schema.dialect\", \"OceanBaseDialect\")\nconnection_str = (\n    f\"mysql+oceanbase://{user}:{password}@{uri}/{db_name}?charset=utf8mb4\"\n)\n# extra SQLAlchemy engine options can be passed as keyword arguments\nengine = create_engine(connection_str)\n```\n\n- Async engine is also supported:\n\n```python\nimport pyobvector\nfrom sqlalchemy.dialects import registry\nfrom sqlalchemy.ext.asyncio import create_async_engine\n\nuri: str = \"127.0.0.1:2881\"\nuser: str = \"root@test\"\npassword: str = \"\"\ndb_name: str = \"test\"\nregistry.register(\"mysql.aoceanbase\", \"pyobvector\", \"AsyncOceanBaseDialect\")\nconnection_str = (\n    f\"mysql+aoceanbase://{user}:{password}@{uri}/{db_name}?charset=utf8mb4\"\n)\nengine = create_async_engine(connection_str)\n```\n\n- For further usage in pure `SQLAlchemy` mode, please refer to [SQLAlchemy](https://www.sqlalchemy.org/)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Foceanbase%2Fpyobvector","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Foceanbase%2Fpyobvector","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Foceanbase%2Fpyobvector/lists"}