{"id":26093655,"url":"https://github.com/AI-Northstar-Tech/vector-io","last_synced_at":"2025-03-09T12:01:36.143Z","repository":{"id":182304059,"uuid":"636107962","full_name":"AI-Northstar-Tech/vector-io","owner":"AI-Northstar-Tech","description":"Comprehensive Vector Data Tooling. The universal interface for all vector database, datasets and RAG platforms. Easily export, import, backup, re-embed (using any model) or access your vector data from any vector databases or repository.","archived":false,"fork":false,"pushed_at":"2025-02-24T18:32:02.000Z","size":4610,"stargazers_count":226,"open_issues_count":24,"forks_count":27,"subscribers_count":6,"default_branch":"main","last_synced_at":"2025-03-02T01:11:29.331Z","etag":null,"topics":["chromadb","data-backup","data-exploration-and-preprocessing","data-export","data-import","datastax","huggingface","huggingface-datasets","kdb","lancedb","milvus","parquet","pinecone","qdrant","turbopuffer","vector-database","vector-search-engine","visualization","zilliz"],"latest_commit_sha":null,"homepage":"https://vector-io.com","language":"Jupyter 
Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/AI-Northstar-Tech.png","metadata":{"files":{"readme":"README.html","changelog":null,"contributing":null,"funding":".github/FUNDING.yml","license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null},"funding":{"custom":"https://buy.stripe.com/aEU6p89JpefG8tW5km","github":[]}},"created_at":"2023-05-04T06:31:11.000Z","updated_at":"2025-02-27T05:06:11.000Z","dependencies_parsed_at":"2023-11-06T12:30:06.493Z","dependency_job_id":"2e161383-7a88-4c46-bc71-ce949f3d372c","html_url":"https://github.com/AI-Northstar-Tech/vector-io","commit_stats":null,"previous_names":["ai-northstar-tech/all-vectordb","ai-northstar-tech/vector-io"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/AI-Northstar-Tech%2Fvector-io","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/AI-Northstar-Tech%2Fvector-io/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/AI-Northstar-Tech%2Fvector-io/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/AI-Northstar-Tech%2Fvector-io/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/AI-Northstar-Tech","download_url":"https://codeload.github.com/AI-Northstar-Tech/vector-io/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":242685876,"owners_count":20169243,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/
hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["chromadb","data-backup","data-exploration-and-preprocessing","data-export","data-import","datastax","huggingface","huggingface-datasets","kdb","lancedb","milvus","parquet","pinecone","qdrant","turbopuffer","vector-database","vector-search-engine","visualization","zilliz"],"created_at":"2025-03-09T12:01:17.170Z","updated_at":"2025-03-09T12:01:36.077Z","avatar_url":"https://github.com/AI-Northstar-Tech.png","language":"Jupyter Notebook","funding_links":["https://buy.stripe.com/aEU6p89JpefG8tW5km"],"categories":["Curated Resource Lists","Others"],"sub_categories":[],"readme":"\u003ch1 id=\"vector-io\"\u003eVector IO\u003c/h1\u003e\n\u003cp\u003e\n    \u003ca href=\"https://pypi.org/project/vdf-io/\"\u003e\u003cimg alt=\"PyPI - Version\" src=\"https://img.shields.io/pypi/v/vdf-io\"\u003e\u003c/a\u003e\n    \u003ca href=\"https://pypi.org/project/vdf-io/\"\u003e\u003cimg alt=\"PyPI - Downloads\"\n            src=\"https://img.shields.io/pypi/dm/vdf-io?style=flat\u0026link=https%3A%2F%2Fpypi.org%2Fproject%2Fvdf-io%2F\"\u003e\u003c/a\u003e\n    \u003ca href=\"https://discord.gg/HGxDZxNt9G\"\u003e\u003cimg alt=\"Discord\"\n            src=\"https://img.shields.io/discord/1223707915827937321?style=flat\u0026logo=discord\u0026link=https%3A%2F%2Fdiscord.gg%2FHGxDZxNt9G\"\u003e\u003c/a\u003e\n\u003c/p\u003e\n\u003cp align=\"center\"\u003e\n    \u003c!-- include photo --\u003e\n    \u003cimg src=\"assets/vector-io-logo.png\" width=\"200\" /\u003e\n\u003c/p\u003e\n\u003cp\u003eThis library uses a universal format for vector datasets to easily\n    export and import data from all vector databases.\u003c/p\u003e\n\u003cp\u003eRequest support for a VectorDB by voting/commenting on \u003ca\n        
href=\"https://github.com/AI-Northstar-Tech/vector-io/discussions/38\"\u003ethis\n        poll\u003c/a\u003e\u003c/p\u003e\n\u003cp\u003eSee the \u003ca href=\"#contributing\"\u003eContributing\u003c/a\u003e section to add\n    support for your favorite vector database.\u003c/p\u003e\n\u003ch2 id=\"supported-vector-databases\"\u003eSupported Vector Databases\u003c/h2\u003e\n\u003cdetails open\u003e\n    \u003csummary\u003e\n        Fully Supported\n    \u003c/summary\u003e\n    \u003cp align=\"center\"\u003e\n        \u003c!-- include photo --\u003e\n        \u003cimg src=\"assets/vector-io-ecosystem-may3-2024.jpg\" width=\"800\" /\u003e\n    \u003c/p\u003e\n    \u003ctable\u003e\n        \u003cthead\u003e\n            \u003ctr class=\"header\"\u003e\n                \u003cth\u003eVector Database\u003c/th\u003e\n                \u003cth\u003eImport\u003c/th\u003e\n                \u003cth\u003eExport\u003c/th\u003e\n            \u003c/tr\u003e\n        \u003c/thead\u003e\n        \u003ctbody\u003e\n            \u003ctr class=\"odd\"\u003e\n                \u003ctd\u003ePinecone\u003c/td\u003e\n                \u003ctd\u003e✅\u003c/td\u003e\n                \u003ctd\u003e✅\u003c/td\u003e\n            \u003c/tr\u003e\n            \u003ctr class=\"even\"\u003e\n                \u003ctd\u003eQdrant\u003c/td\u003e\n                \u003ctd\u003e✅\u003c/td\u003e\n                \u003ctd\u003e✅\u003c/td\u003e\n            \u003c/tr\u003e\n            \u003ctr class=\"odd\"\u003e\n                \u003ctd\u003eMilvus\u003c/td\u003e\n                \u003ctd\u003e✅\u003c/td\u003e\n                \u003ctd\u003e✅\u003c/td\u003e\n            \u003c/tr\u003e\n            \u003ctr class=\"even\"\u003e\n                \u003ctd\u003eGCP Vertex AI Vector Search\u003c/td\u003e\n                \u003ctd\u003e✅\u003c/td\u003e\n                \u003ctd\u003e✅\u003c/td\u003e\n            \u003c/tr\u003e\n            \u003ctr class=\"odd\"\u003e\n                
\u003ctd\u003eKDB.AI\u003c/td\u003e\n                \u003ctd\u003e✅\u003c/td\u003e\n                \u003ctd\u003e✅\u003c/td\u003e\n            \u003c/tr\u003e\n            \u003ctr class=\"even\"\u003e\n                \u003ctd\u003eLanceDB\u003c/td\u003e\n                \u003ctd\u003e✅\u003c/td\u003e\n                \u003ctd\u003e✅\u003c/td\u003e\n            \u003c/tr\u003e\n            \u003ctr class=\"odd\"\u003e\n                \u003ctd\u003eDataStax Astra DB\u003c/td\u003e\n                \u003ctd\u003e✅\u003c/td\u003e\n                \u003ctd\u003e✅\u003c/td\u003e\n            \u003c/tr\u003e\n            \u003ctr class=\"even\"\u003e\n                \u003ctd\u003eChroma\u003c/td\u003e\n                \u003ctd\u003e✅\u003c/td\u003e\n                \u003ctd\u003e✅\u003c/td\u003e\n            \u003c/tr\u003e\n            \u003ctr class=\"odd\"\u003e\n                \u003ctd\u003eTurbopuffer\u003c/td\u003e\n                \u003ctd\u003e✅\u003c/td\u003e\n                \u003ctd\u003e✅\u003c/td\u003e\n            \u003c/tr\u003e\n        \u003c/tbody\u003e\n    \u003c/table\u003e\n\u003c/details\u003e\n\u003chr /\u003e\n\u003cdetails open\u003e\n    \u003csummary\u003e\n        Partial\n    \u003c/summary\u003e\n    \u003ctable\u003e\n        \u003cthead\u003e\n            \u003ctr class=\"header\"\u003e\n                \u003cth\u003eVector Database\u003c/th\u003e\n                \u003cth\u003eImport\u003c/th\u003e\n                \u003cth\u003eExport\u003c/th\u003e\n            \u003c/tr\u003e\n        \u003c/thead\u003e\n        \u003ctbody\u003e\n        \u003c/tbody\u003e\n    \u003c/table\u003e\n\u003c/details\u003e\n\u003c!-- line break --\u003e\n\u003chr /\u003e\n\u003cdetails\u003e\n    \u003csummary\u003e\n        In Progress\n    \u003c/summary\u003e\n    \u003ctable\u003e\n        \u003cthead\u003e\n            \u003ctr class=\"header\"\u003e\n                \u003cth\u003eVector Database\u003c/th\u003e\n                
\u003cth\u003eImport\u003c/th\u003e\n                \u003cth\u003eExport\u003c/th\u003e\n            \u003c/tr\u003e\n        \u003c/thead\u003e\n        \u003ctbody\u003e\n            \u003ctr class=\"odd\"\u003e\n                \u003ctd\u003epgvector\u003c/td\u003e\n                \u003ctd\u003e❌\u003c/td\u003e\n                \u003ctd\u003e❌\u003c/td\u003e\n            \u003c/tr\u003e\n            \u003ctr class=\"even\"\u003e\n                \u003ctd\u003eAzure AI Search\u003c/td\u003e\n                \u003ctd\u003e❌\u003c/td\u003e\n                \u003ctd\u003e❌\u003c/td\u003e\n            \u003c/tr\u003e\n            \u003ctr class=\"odd\"\u003e\n                \u003ctd\u003eWeaviate\u003c/td\u003e\n                \u003ctd\u003e❌\u003c/td\u003e\n                \u003ctd\u003e❌\u003c/td\u003e\n            \u003c/tr\u003e\n            \u003ctr class=\"even\"\u003e\n                \u003ctd\u003eMongoDB Atlas\u003c/td\u003e\n                \u003ctd\u003e❌\u003c/td\u003e\n                \u003ctd\u003e❌\u003c/td\u003e\n            \u003c/tr\u003e\n            \u003ctr class=\"odd\"\u003e\n                \u003ctd\u003eApache Cassandra\u003c/td\u003e\n                \u003ctd\u003e❌\u003c/td\u003e\n                \u003ctd\u003e❌\u003c/td\u003e\n            \u003c/tr\u003e\n            \u003ctr class=\"even\"\u003e\n                \u003ctd\u003etxtai\u003c/td\u003e\n                \u003ctd\u003e❌\u003c/td\u003e\n                \u003ctd\u003e❌\u003c/td\u003e\n            \u003c/tr\u003e\n            \u003ctr class=\"odd\"\u003e\n                \u003ctd\u003eSQLite-VSS\u003c/td\u003e\n                \u003ctd\u003e❌\u003c/td\u003e\n                \u003ctd\u003e❌\u003c/td\u003e\n            \u003c/tr\u003e\n        \u003c/tbody\u003e\n    \u003c/table\u003e\n\u003c/details\u003e\n\u003chr /\u003e\n\u003cdetails\u003e\n    \u003csummary\u003e\n        Not Supported\n    \u003c/summary\u003e\n    \u003ctable\u003e\n        \u003cthead\u003e\n            
\u003ctr class=\"header\"\u003e\n                \u003cth\u003eVector Database\u003c/th\u003e\n                \u003cth\u003eImport\u003c/th\u003e\n                \u003cth\u003eExport\u003c/th\u003e\n            \u003c/tr\u003e\n        \u003c/thead\u003e\n        \u003ctbody\u003e\n            \u003ctr class=\"odd\"\u003e\n                \u003ctd\u003eVespa\u003c/td\u003e\n                \u003ctd\u003e❌\u003c/td\u003e\n                \u003ctd\u003e❌\u003c/td\u003e\n            \u003c/tr\u003e\n            \u003ctr class=\"even\"\u003e\n                \u003ctd\u003eAWS Neptune\u003c/td\u003e\n                \u003ctd\u003e❌\u003c/td\u003e\n                \u003ctd\u003e❌\u003c/td\u003e\n            \u003c/tr\u003e\n            \u003ctr class=\"odd\"\u003e\n                \u003ctd\u003eNeo4j\u003c/td\u003e\n                \u003ctd\u003e❌\u003c/td\u003e\n                \u003ctd\u003e❌\u003c/td\u003e\n            \u003c/tr\u003e\n            \u003ctr class=\"even\"\u003e\n                \u003ctd\u003eMarqo\u003c/td\u003e\n                \u003ctd\u003e❌\u003c/td\u003e\n                \u003ctd\u003e❌\u003c/td\u003e\n            \u003c/tr\u003e\n            \u003ctr class=\"odd\"\u003e\n                \u003ctd\u003eOpenSearch\u003c/td\u003e\n                \u003ctd\u003e❌\u003c/td\u003e\n                \u003ctd\u003e❌\u003c/td\u003e\n            \u003c/tr\u003e\n            \u003ctr class=\"even\"\u003e\n                \u003ctd\u003eElasticsearch\u003c/td\u003e\n                \u003ctd\u003e❌\u003c/td\u003e\n                \u003ctd\u003e❌\u003c/td\u003e\n            \u003c/tr\u003e\n            \u003ctr class=\"odd\"\u003e\n                \u003ctd\u003eApache Solr\u003c/td\u003e\n                \u003ctd\u003e❌\u003c/td\u003e\n                \u003ctd\u003e❌\u003c/td\u003e\n            \u003c/tr\u003e\n            \u003ctr class=\"even\"\u003e\n                \u003ctd\u003eRedis Search\u003c/td\u003e\n                \u003ctd\u003e❌\u003c/td\u003e\n    
            \u003ctd\u003e❌\u003c/td\u003e\n            \u003c/tr\u003e\n            \u003ctr class=\"odd\"\u003e\n                \u003ctd\u003eClickHouse\u003c/td\u003e\n                \u003ctd\u003e❌\u003c/td\u003e\n                \u003ctd\u003e❌\u003c/td\u003e\n            \u003c/tr\u003e\n            \u003ctr class=\"even\"\u003e\n                \u003ctd\u003eUSearch\u003c/td\u003e\n                \u003ctd\u003e❌\u003c/td\u003e\n                \u003ctd\u003e❌\u003c/td\u003e\n            \u003c/tr\u003e\n            \u003ctr class=\"odd\"\u003e\n                \u003ctd\u003eRockset\u003c/td\u003e\n                \u003ctd\u003e❌\u003c/td\u003e\n                \u003ctd\u003e❌\u003c/td\u003e\n            \u003c/tr\u003e\n            \u003ctr class=\"even\"\u003e\n                \u003ctd\u003eEpsilla\u003c/td\u003e\n                \u003ctd\u003e❌\u003c/td\u003e\n                \u003ctd\u003e❌\u003c/td\u003e\n            \u003c/tr\u003e\n            \u003ctr class=\"odd\"\u003e\n                \u003ctd\u003eActiveloop Deep Lake\u003c/td\u003e\n                \u003ctd\u003e❌\u003c/td\u003e\n                \u003ctd\u003e❌\u003c/td\u003e\n            \u003c/tr\u003e\n            \u003ctr class=\"even\"\u003e\n                \u003ctd\u003eApertureDB\u003c/td\u003e\n                \u003ctd\u003e❌\u003c/td\u003e\n                \u003ctd\u003e❌\u003c/td\u003e\n            \u003c/tr\u003e\n            \u003ctr class=\"odd\"\u003e\n                \u003ctd\u003eCrateDB\u003c/td\u003e\n                \u003ctd\u003e❌\u003c/td\u003e\n                \u003ctd\u003e❌\u003c/td\u003e\n            \u003c/tr\u003e\n            \u003ctr class=\"even\"\u003e\n                \u003ctd\u003eMeilisearch\u003c/td\u003e\n                \u003ctd\u003e❌\u003c/td\u003e\n                \u003ctd\u003e❌\u003c/td\u003e\n            \u003c/tr\u003e\n            \u003ctr class=\"odd\"\u003e\n                \u003ctd\u003eMyScale\u003c/td\u003e\n                
\u003ctd\u003e❌\u003c/td\u003e\n                \u003ctd\u003e❌\u003c/td\u003e\n            \u003c/tr\u003e\n            \u003ctr class=\"even\"\u003e\n                \u003ctd\u003eNuclia DB\u003c/td\u003e\n                \u003ctd\u003e❌\u003c/td\u003e\n                \u003ctd\u003e❌\u003c/td\u003e\n            \u003c/tr\u003e\n            \u003ctr class=\"odd\"\u003e\n                \u003ctd\u003eOramaSearch\u003c/td\u003e\n                \u003ctd\u003e❌\u003c/td\u003e\n                \u003ctd\u003e❌\u003c/td\u003e\n            \u003c/tr\u003e\n            \u003ctr class=\"even\"\u003e\n                \u003ctd\u003eTypesense\u003c/td\u003e\n                \u003ctd\u003e❌\u003c/td\u003e\n                \u003ctd\u003e❌\u003c/td\u003e\n            \u003c/tr\u003e\n            \u003ctr class=\"odd\"\u003e\n                \u003ctd\u003eAnari AI\u003c/td\u003e\n                \u003ctd\u003e❌\u003c/td\u003e\n                \u003ctd\u003e❌\u003c/td\u003e\n            \u003c/tr\u003e\n            \u003ctr class=\"even\"\u003e\n                \u003ctd\u003eVald\u003c/td\u003e\n                \u003ctd\u003e❌\u003c/td\u003e\n                \u003ctd\u003e❌\u003c/td\u003e\n            \u003c/tr\u003e\n        \u003c/tbody\u003e\n    \u003c/table\u003e\n\u003c/details\u003e\n\u003ch2 id=\"installation\"\u003eInstallation\u003c/h2\u003e\n\u003ch3 id=\"using-pip\"\u003eUsing pip\u003c/h3\u003e\n\u003cdiv class=\"sourceCode\" id=\"cb1\"\u003e\n    \u003cpre\n        class=\"sourceCode bash\"\u003e\u003ccode class=\"sourceCode bash\"\u003e\u003cspan id=\"cb1-1\"\u003e\u003ca href=\"#cb1-1\" aria-hidden=\"true\" tabindex=\"-1\"\u003e\u003c/a\u003e\u003cspan class=\"ex\"\u003epip\u003c/span\u003e install vdf-io\u003c/span\u003e\u003c/code\u003e\u003c/pre\u003e\n\u003c/div\u003e\n\u003ch3 id=\"from-source\"\u003eFrom source\u003c/h3\u003e\n\u003cdiv class=\"sourceCode\" id=\"cb2\"\u003e\n    \u003cpre\n        class=\"sourceCode bash\"\u003e\u003ccode class=\"sourceCode 
bash\"\u003e\u003cspan id=\"cb2-1\"\u003e\u003ca href=\"#cb2-1\" aria-hidden=\"true\" tabindex=\"-1\"\u003e\u003c/a\u003e\u003cspan class=\"fu\"\u003egit\u003c/span\u003e clone https://github.com/AI-Northstar-Tech/vector-io.git\u003c/span\u003e\n\u003cspan id=\"cb2-2\"\u003e\u003ca href=\"#cb2-2\" aria-hidden=\"true\" tabindex=\"-1\"\u003e\u003c/a\u003e\u003cspan class=\"bu\"\u003ecd\u003c/span\u003e vector-io\u003c/span\u003e\n\u003cspan id=\"cb2-3\"\u003e\u003ca href=\"#cb2-3\" aria-hidden=\"true\" tabindex=\"-1\"\u003e\u003c/a\u003e\u003cspan class=\"ex\"\u003epip\u003c/span\u003e install \u003cspan class=\"at\"\u003e-r\u003c/span\u003e requirements.txt\u003c/span\u003e\u003c/code\u003e\u003c/pre\u003e\n\u003c/div\u003e\n\u003ch2 id=\"universal-vector-dataset-format-vdf-specification\"\u003eUniversal\n    Vector Dataset Format (VDF) specification\u003c/h2\u003e\n\u003col type=\"1\"\u003e\n    \u003cli\u003eVDF_META.json: It is a json file with the following schema VDFMeta\n        defined in \u003ca href=\"src/vdf_io/meta_types.py\"\u003esrc/vdf_io/meta_types.py\u003c/a\u003e:\u003c/li\u003e\n\u003c/ol\u003e\n\u003cdiv class=\"sourceCode\" id=\"cb3\"\u003e\n    \u003cpre\n        class=\"sourceCode python\"\u003e\u003ccode class=\"sourceCode python\"\u003e\u003cspan id=\"cb3-1\"\u003e\u003ca href=\"#cb3-1\" aria-hidden=\"true\" tabindex=\"-1\"\u003e\u003c/a\u003e\u003cspan class=\"kw\"\u003eclass\u003c/span\u003e NamespaceMeta(BaseModel):\u003c/span\u003e\n\u003cspan id=\"cb3-2\"\u003e\u003ca href=\"#cb3-2\" aria-hidden=\"true\" tabindex=\"-1\"\u003e\u003c/a\u003e    namespace: \u003cspan class=\"bu\"\u003estr\u003c/span\u003e\u003c/span\u003e\n\u003cspan id=\"cb3-3\"\u003e\u003ca href=\"#cb3-3\" aria-hidden=\"true\" tabindex=\"-1\"\u003e\u003c/a\u003e    index_name: \u003cspan class=\"bu\"\u003estr\u003c/span\u003e\u003c/span\u003e\n\u003cspan id=\"cb3-4\"\u003e\u003ca href=\"#cb3-4\" aria-hidden=\"true\" tabindex=\"-1\"\u003e\u003c/a\u003e    
total_vector_count: \u003cspan class=\"bu\"\u003eint\u003c/span\u003e\u003c/span\u003e\n\u003cspan id=\"cb3-5\"\u003e\u003ca href=\"#cb3-5\" aria-hidden=\"true\" tabindex=\"-1\"\u003e\u003c/a\u003e    exported_vector_count: \u003cspan class=\"bu\"\u003eint\u003c/span\u003e\u003c/span\u003e\n\u003cspan id=\"cb3-6\"\u003e\u003ca href=\"#cb3-6\" aria-hidden=\"true\" tabindex=\"-1\"\u003e\u003c/a\u003e    dimensions: \u003cspan class=\"bu\"\u003eint\u003c/span\u003e\u003c/span\u003e\n\u003cspan id=\"cb3-7\"\u003e\u003ca href=\"#cb3-7\" aria-hidden=\"true\" tabindex=\"-1\"\u003e\u003c/a\u003e    model_name: \u003cspan class=\"bu\"\u003estr\u003c/span\u003e \u003cspan class=\"op\"\u003e|\u003c/span\u003e \u003cspan class=\"va\"\u003eNone\u003c/span\u003e \u003cspan class=\"op\"\u003e=\u003c/span\u003e \u003cspan class=\"va\"\u003eNone\u003c/span\u003e\u003c/span\u003e\n\u003cspan id=\"cb3-8\"\u003e\u003ca href=\"#cb3-8\" aria-hidden=\"true\" tabindex=\"-1\"\u003e\u003c/a\u003e    vector_columns: List[\u003cspan class=\"bu\"\u003estr\u003c/span\u003e] \u003cspan class=\"op\"\u003e=\u003c/span\u003e [\u003cspan class=\"st\"\u003e\u0026quot;vector\u0026quot;\u003c/span\u003e]\u003c/span\u003e\n\u003cspan id=\"cb3-9\"\u003e\u003ca href=\"#cb3-9\" aria-hidden=\"true\" tabindex=\"-1\"\u003e\u003c/a\u003e    data_path: \u003cspan class=\"bu\"\u003estr\u003c/span\u003e\u003c/span\u003e\n\u003cspan id=\"cb3-10\"\u003e\u003ca href=\"#cb3-10\" aria-hidden=\"true\" tabindex=\"-1\"\u003e\u003c/a\u003e    metric: \u003cspan class=\"bu\"\u003estr\u003c/span\u003e \u003cspan class=\"op\"\u003e|\u003c/span\u003e \u003cspan class=\"va\"\u003eNone\u003c/span\u003e \u003cspan class=\"op\"\u003e=\u003c/span\u003e \u003cspan class=\"va\"\u003eNone\u003c/span\u003e\u003c/span\u003e\n\u003cspan id=\"cb3-11\"\u003e\u003ca href=\"#cb3-11\" aria-hidden=\"true\" tabindex=\"-1\"\u003e\u003c/a\u003e    index_config: Optional[Dict[Any, Any]] \u003cspan class=\"op\"\u003e=\u003c/span\u003e \u003cspan 
class=\"va\"\u003eNone\u003c/span\u003e\u003c/span\u003e\n\u003cspan id=\"cb3-12\"\u003e\u003ca href=\"#cb3-12\" aria-hidden=\"true\" tabindex=\"-1\"\u003e\u003c/a\u003e    schema_dict: Optional[Dict[\u003cspan class=\"bu\"\u003estr\u003c/span\u003e, Any]] \u003cspan class=\"op\"\u003e=\u003c/span\u003e \u003cspan class=\"va\"\u003eNone\u003c/span\u003e\u003c/span\u003e\n\u003cspan id=\"cb3-13\"\u003e\u003ca href=\"#cb3-13\" aria-hidden=\"true\" tabindex=\"-1\"\u003e\u003c/a\u003e\u003c/span\u003e\n\u003cspan id=\"cb3-14\"\u003e\u003ca href=\"#cb3-14\" aria-hidden=\"true\" tabindex=\"-1\"\u003e\u003c/a\u003e\u003c/span\u003e\n\u003cspan id=\"cb3-15\"\u003e\u003ca href=\"#cb3-15\" aria-hidden=\"true\" tabindex=\"-1\"\u003e\u003c/a\u003e\u003cspan class=\"kw\"\u003eclass\u003c/span\u003e VDFMeta(BaseModel):\u003c/span\u003e\n\u003cspan id=\"cb3-16\"\u003e\u003ca href=\"#cb3-16\" aria-hidden=\"true\" tabindex=\"-1\"\u003e\u003c/a\u003e    version: \u003cspan class=\"bu\"\u003estr\u003c/span\u003e\u003c/span\u003e\n\u003cspan id=\"cb3-17\"\u003e\u003ca href=\"#cb3-17\" aria-hidden=\"true\" tabindex=\"-1\"\u003e\u003c/a\u003e    file_structure: List[\u003cspan class=\"bu\"\u003estr\u003c/span\u003e]\u003c/span\u003e\n\u003cspan id=\"cb3-18\"\u003e\u003ca href=\"#cb3-18\" aria-hidden=\"true\" tabindex=\"-1\"\u003e\u003c/a\u003e    author: \u003cspan class=\"bu\"\u003estr\u003c/span\u003e\u003c/span\u003e\n\u003cspan id=\"cb3-19\"\u003e\u003ca href=\"#cb3-19\" aria-hidden=\"true\" tabindex=\"-1\"\u003e\u003c/a\u003e    exported_from: \u003cspan class=\"bu\"\u003estr\u003c/span\u003e\u003c/span\u003e\n\u003cspan id=\"cb3-20\"\u003e\u003ca href=\"#cb3-20\" aria-hidden=\"true\" tabindex=\"-1\"\u003e\u003c/a\u003e    indexes: Dict[\u003cspan class=\"bu\"\u003estr\u003c/span\u003e, List[NamespaceMeta]]\u003c/span\u003e\n\u003cspan id=\"cb3-21\"\u003e\u003ca href=\"#cb3-21\" aria-hidden=\"true\" tabindex=\"-1\"\u003e\u003c/a\u003e    exported_at: \u003cspan 
class=\"bu\"\u003estr\u003c/span\u003e\u003c/span\u003e\n\u003cspan id=\"cb3-22\"\u003e\u003ca href=\"#cb3-22\" aria-hidden=\"true\" tabindex=\"-1\"\u003e\u003c/a\u003e    id_column: Optional[\u003cspan class=\"bu\"\u003estr\u003c/span\u003e] \u003cspan class=\"op\"\u003e=\u003c/span\u003e \u003cspan class=\"va\"\u003eNone\u003c/span\u003e\u003c/span\u003e\u003c/code\u003e\u003c/pre\u003e\n\u003c/div\u003e\n\u003col start=\"2\" type=\"1\"\u003e\n    \u003cli\u003eParquet files/folders for metadata and vectors.\u003c/li\u003e\n\u003c/ol\u003e\n\u003ch2 id=\"export-script\"\u003eExport Script\u003c/h2\u003e\n\u003cdiv class=\"sourceCode\" id=\"cb4\"\u003e\n    \u003cpre\n        class=\"sourceCode bash\"\u003e\u003ccode class=\"sourceCode bash\"\u003e\u003cspan id=\"cb4-1\"\u003e\u003ca href=\"#cb4-1\" aria-hidden=\"true\" tabindex=\"-1\"\u003e\u003c/a\u003e\u003cspan class=\"ex\"\u003eexport_vdf\u003c/span\u003e \u003cspan class=\"at\"\u003e--help\u003c/span\u003e\u003c/span\u003e\n\u003cspan id=\"cb4-2\"\u003e\u003ca href=\"#cb4-2\" aria-hidden=\"true\" tabindex=\"-1\"\u003e\u003c/a\u003e\u003cspan class=\"ex\"\u003eusage:\u003c/span\u003e export_vdf \u003cspan class=\"pp\"\u003e[-\u003c/span\u003e\u003cspan class=\"ss\"\u003eh\u003c/span\u003e\u003cspan class=\"pp\"\u003e]\u003c/span\u003e [-m MODEL_NAME]\u003c/span\u003e\n\u003cspan id=\"cb4-3\"\u003e\u003ca href=\"#cb4-3\" aria-hidden=\"true\" tabindex=\"-1\"\u003e\u003c/a\u003e                  \u003cspan class=\"ex\"\u003e[--max_file_size\u003c/span\u003e MAX_FILE_SIZE]\u003c/span\u003e\n\u003cspan id=\"cb4-4\"\u003e\u003ca href=\"#cb4-4\" aria-hidden=\"true\" tabindex=\"-1\"\u003e\u003c/a\u003e                  \u003cspan class=\"ex\"\u003e[--push_to_hub\u003c/span\u003e \u003cspan class=\"kw\"\u003e|\u003c/span\u003e \u003cspan class=\"ex\"\u003e--no-push_to_hub]\u003c/span\u003e\u003c/span\u003e\n\u003cspan id=\"cb4-5\"\u003e\u003ca href=\"#cb4-5\" aria-hidden=\"true\" tabindex=\"-1\"\u003e\u003c/a\u003e    
              \u003cspan class=\"ex\"\u003e[--public\u003c/span\u003e \u003cspan class=\"kw\"\u003e|\u003c/span\u003e \u003cspan class=\"ex\"\u003e--no-public]\u003c/span\u003e\u003c/span\u003e\n\u003cspan id=\"cb4-6\"\u003e\u003ca href=\"#cb4-6\" aria-hidden=\"true\" tabindex=\"-1\"\u003e\u003c/a\u003e                  \u003cspan class=\"dt\"\u003e{pinecone\u003c/span\u003e\u003cspan class=\"op\"\u003e,\u003c/span\u003e\u003cspan class=\"dt\"\u003eqdrant\u003c/span\u003e\u003cspan class=\"op\"\u003e,\u003c/span\u003e\u003cspan class=\"dt\"\u003ekdbai\u003c/span\u003e\u003cspan class=\"op\"\u003e,\u003c/span\u003e\u003cspan class=\"dt\"\u003emilvus\u003c/span\u003e\u003cspan class=\"op\"\u003e,\u003c/span\u003e\u003cspan class=\"dt\"\u003evertexai_vectorsearch}\u003c/span\u003e\u003c/span\u003e\n\u003cspan id=\"cb4-7\"\u003e\u003ca href=\"#cb4-7\" aria-hidden=\"true\" tabindex=\"-1\"\u003e\u003c/a\u003e                  \u003cspan class=\"ex\"\u003e...\u003c/span\u003e\u003c/span\u003e\n\u003cspan id=\"cb4-8\"\u003e\u003ca href=\"#cb4-8\" aria-hidden=\"true\" tabindex=\"-1\"\u003e\u003c/a\u003e\u003c/span\u003e\n\u003cspan id=\"cb4-9\"\u003e\u003ca href=\"#cb4-9\" aria-hidden=\"true\" tabindex=\"-1\"\u003e\u003c/a\u003e\u003cspan class=\"ex\"\u003eExport\u003c/span\u003e data from various vector databases to the VDF format for vector datasets\u003c/span\u003e\n\u003cspan id=\"cb4-10\"\u003e\u003ca href=\"#cb4-10\" aria-hidden=\"true\" tabindex=\"-1\"\u003e\u003c/a\u003e\u003c/span\u003e\n\u003cspan id=\"cb4-11\"\u003e\u003ca href=\"#cb4-11\" aria-hidden=\"true\" tabindex=\"-1\"\u003e\u003c/a\u003e\u003cspan class=\"ex\"\u003eoptions:\u003c/span\u003e\u003c/span\u003e\n\u003cspan id=\"cb4-12\"\u003e\u003ca href=\"#cb4-12\" aria-hidden=\"true\" tabindex=\"-1\"\u003e\u003c/a\u003e  \u003cspan class=\"ex\"\u003e-h,\u003c/span\u003e \u003cspan class=\"at\"\u003e--help\u003c/span\u003e            show this help message and exit\u003c/span\u003e\n\u003cspan 
id=\"cb4-13\"\u003e\u003ca href=\"#cb4-13\" aria-hidden=\"true\" tabindex=\"-1\"\u003e\u003c/a\u003e  \u003cspan class=\"ex\"\u003e-m\u003c/span\u003e MODEL_NAME, \u003cspan class=\"at\"\u003e--model_name\u003c/span\u003e MODEL_NAME\u003c/span\u003e\n\u003cspan id=\"cb4-14\"\u003e\u003ca href=\"#cb4-14\" aria-hidden=\"true\" tabindex=\"-1\"\u003e\u003c/a\u003e                        \u003cspan class=\"ex\"\u003eName\u003c/span\u003e of model used\u003c/span\u003e\n\u003cspan id=\"cb4-15\"\u003e\u003ca href=\"#cb4-15\" aria-hidden=\"true\" tabindex=\"-1\"\u003e\u003c/a\u003e  \u003cspan class=\"ex\"\u003e--max_file_size\u003c/span\u003e MAX_FILE_SIZE\u003c/span\u003e\n\u003cspan id=\"cb4-16\"\u003e\u003ca href=\"#cb4-16\" aria-hidden=\"true\" tabindex=\"-1\"\u003e\u003c/a\u003e                        \u003cspan class=\"ex\"\u003eMaximum\u003c/span\u003e file size in MB \u003cspan class=\"er\"\u003e(\u003c/span\u003e\u003cspan class=\"ex\"\u003edefault:\u003c/span\u003e\u003c/span\u003e\n\u003cspan id=\"cb4-17\"\u003e\u003ca href=\"#cb4-17\" aria-hidden=\"true\" tabindex=\"-1\"\u003e\u003c/a\u003e                        \u003cspan class=\"ex\"\u003e1024\u003c/span\u003e\u003cspan class=\"kw\"\u003e)\u003c/span\u003e\u003c/span\u003e\n\u003cspan id=\"cb4-18\"\u003e\u003ca href=\"#cb4-18\" aria-hidden=\"true\" tabindex=\"-1\"\u003e\u003c/a\u003e  \u003cspan class=\"ex\"\u003e--push_to_hub,\u003c/span\u003e \u003cspan class=\"at\"\u003e--no-push_to_hub\u003c/span\u003e\u003c/span\u003e\n\u003cspan id=\"cb4-19\"\u003e\u003ca href=\"#cb4-19\" aria-hidden=\"true\" tabindex=\"-1\"\u003e\u003c/a\u003e                        \u003cspan class=\"ex\"\u003ePush\u003c/span\u003e to hub\u003c/span\u003e\n\u003cspan id=\"cb4-20\"\u003e\u003ca href=\"#cb4-20\" aria-hidden=\"true\" tabindex=\"-1\"\u003e\u003c/a\u003e  \u003cspan class=\"ex\"\u003e--public,\u003c/span\u003e \u003cspan class=\"at\"\u003e--no-public\u003c/span\u003e\u003c/span\u003e\n\u003cspan 
id=\"cb4-21\"\u003e\u003ca href=\"#cb4-21\" aria-hidden=\"true\" tabindex=\"-1\"\u003e\u003c/a\u003e                        \u003cspan class=\"ex\"\u003eMake\u003c/span\u003e dataset public \u003cspan class=\"er\"\u003e(\u003c/span\u003e\u003cspan class=\"ex\"\u003edefault:\u003c/span\u003e\u003c/span\u003e\n\u003cspan id=\"cb4-22\"\u003e\u003ca href=\"#cb4-22\" aria-hidden=\"true\" tabindex=\"-1\"\u003e\u003c/a\u003e                        \u003cspan class=\"ex\"\u003eFalse\u003c/span\u003e\u003cspan class=\"kw\"\u003e)\u003c/span\u003e\u003c/span\u003e\n\u003cspan id=\"cb4-23\"\u003e\u003ca href=\"#cb4-23\" aria-hidden=\"true\" tabindex=\"-1\"\u003e\u003c/a\u003e\u003c/span\u003e\n\u003cspan id=\"cb4-24\"\u003e\u003ca href=\"#cb4-24\" aria-hidden=\"true\" tabindex=\"-1\"\u003e\u003c/a\u003e\u003cspan class=\"ex\"\u003eVector\u003c/span\u003e Databases:\u003c/span\u003e\n\u003cspan id=\"cb4-25\"\u003e\u003ca href=\"#cb4-25\" aria-hidden=\"true\" tabindex=\"-1\"\u003e\u003c/a\u003e  \u003cspan class=\"ex\"\u003eChoose\u003c/span\u003e the vectors database to export data from\u003c/span\u003e\n\u003cspan id=\"cb4-26\"\u003e\u003ca href=\"#cb4-26\" aria-hidden=\"true\" tabindex=\"-1\"\u003e\u003c/a\u003e\u003c/span\u003e\n\u003cspan id=\"cb4-27\"\u003e\u003ca href=\"#cb4-27\" aria-hidden=\"true\" tabindex=\"-1\"\u003e\u003c/a\u003e  \u003cspan class=\"dt\"\u003e{pinecone\u003c/span\u003e\u003cspan class=\"op\"\u003e,\u003c/span\u003e\u003cspan class=\"dt\"\u003eqdrant\u003c/span\u003e\u003cspan class=\"op\"\u003e,\u003c/span\u003e\u003cspan class=\"dt\"\u003ekdbai\u003c/span\u003e\u003cspan class=\"op\"\u003e,\u003c/span\u003e\u003cspan class=\"dt\"\u003emilvus\u003c/span\u003e\u003cspan class=\"op\"\u003e,\u003c/span\u003e\u003cspan class=\"dt\"\u003evertexai_vectorsearch}\u003c/span\u003e\u003c/span\u003e\n\u003cspan id=\"cb4-28\"\u003e\u003ca href=\"#cb4-28\" aria-hidden=\"true\" tabindex=\"-1\"\u003e\u003c/a\u003e    \u003cspan 
class=\"ex\"\u003epinecone\u003c/span\u003e            Export data from Pinecone\u003c/span\u003e\n\u003cspan id=\"cb4-29\"\u003e\u003ca href=\"#cb4-29\" aria-hidden=\"true\" tabindex=\"-1\"\u003e\u003c/a\u003e    \u003cspan class=\"ex\"\u003eqdrant\u003c/span\u003e              Export data from Qdrant\u003c/span\u003e\n\u003cspan id=\"cb4-30\"\u003e\u003ca href=\"#cb4-30\" aria-hidden=\"true\" tabindex=\"-1\"\u003e\u003c/a\u003e    \u003cspan class=\"ex\"\u003ekdbai\u003c/span\u003e               Export data from KDB.AI\u003c/span\u003e\n\u003cspan id=\"cb4-31\"\u003e\u003ca href=\"#cb4-31\" aria-hidden=\"true\" tabindex=\"-1\"\u003e\u003c/a\u003e    \u003cspan class=\"ex\"\u003emilvus\u003c/span\u003e              Export data from Milvus\u003c/span\u003e\n\u003cspan id=\"cb4-32\"\u003e\u003ca href=\"#cb4-32\" aria-hidden=\"true\" tabindex=\"-1\"\u003e\u003c/a\u003e    \u003cspan class=\"ex\"\u003evertexai_vectorsearch\u003c/span\u003e\u003c/span\u003e\n\u003cspan id=\"cb4-33\"\u003e\u003ca href=\"#cb4-33\" aria-hidden=\"true\" tabindex=\"-1\"\u003e\u003c/a\u003e                        \u003cspan class=\"ex\"\u003eExport\u003c/span\u003e data from Vertex AI Vector\u003c/span\u003e\n\u003cspan id=\"cb4-34\"\u003e\u003ca href=\"#cb4-34\" aria-hidden=\"true\" tabindex=\"-1\"\u003e\u003c/a\u003e                        \u003cspan class=\"ex\"\u003eSearch\u003c/span\u003e\u003c/span\u003e\u003c/code\u003e\u003c/pre\u003e\n\u003c/div\u003e\n\u003ch2 id=\"import-script\"\u003eImport script\u003c/h2\u003e\n\u003cdiv class=\"sourceCode\" id=\"cb5\"\u003e\n    \u003cpre\n        class=\"sourceCode bash\"\u003e\u003ccode class=\"sourceCode bash\"\u003e\u003cspan id=\"cb5-1\"\u003e\u003ca href=\"#cb5-1\" aria-hidden=\"true\" tabindex=\"-1\"\u003e\u003c/a\u003e\u003cspan class=\"ex\"\u003eimport_vdf\u003c/span\u003e \u003cspan class=\"at\"\u003e--help\u003c/span\u003e\u003c/span\u003e\n\u003cspan id=\"cb5-2\"\u003e\u003ca href=\"#cb5-2\" aria-hidden=\"true\" 
tabindex=\"-1\"\u003e\u003c/a\u003e\u003cspan class=\"ex\"\u003eusage:\u003c/span\u003e import_vdf \u003cspan class=\"pp\"\u003e[-\u003c/span\u003e\u003cspan class=\"ss\"\u003eh\u003c/span\u003e\u003cspan class=\"pp\"\u003e]\u003c/span\u003e [-d DIR] [-s \u003cspan class=\"kw\"\u003e|\u003c/span\u003e \u003cspan class=\"ex\"\u003e--subset\u003c/span\u003e \u003cspan class=\"kw\"\u003e|\u003c/span\u003e \u003cspan class=\"ex\"\u003e--no-subset]\u003c/span\u003e\u003c/span\u003e\n\u003cspan id=\"cb5-3\"\u003e\u003ca href=\"#cb5-3\" aria-hidden=\"true\" tabindex=\"-1\"\u003e\u003c/a\u003e                  \u003cspan class=\"ex\"\u003e[--create_new\u003c/span\u003e \u003cspan class=\"kw\"\u003e|\u003c/span\u003e \u003cspan class=\"ex\"\u003e--no-create_new]\u003c/span\u003e\u003c/span\u003e\n\u003cspan id=\"cb5-4\"\u003e\u003ca href=\"#cb5-4\" aria-hidden=\"true\" tabindex=\"-1\"\u003e\u003c/a\u003e                  \u003cspan class=\"dt\"\u003e{milvus\u003c/span\u003e\u003cspan class=\"op\"\u003e,\u003c/span\u003e\u003cspan class=\"dt\"\u003epinecone\u003c/span\u003e\u003cspan class=\"op\"\u003e,\u003c/span\u003e\u003cspan class=\"dt\"\u003eqdrant\u003c/span\u003e\u003cspan class=\"op\"\u003e,\u003c/span\u003e\u003cspan class=\"dt\"\u003evertexai_vectorsearch\u003c/span\u003e\u003cspan class=\"op\"\u003e,\u003c/span\u003e\u003cspan class=\"dt\"\u003ekdbai}\u003c/span\u003e\u003c/span\u003e\n\u003cspan id=\"cb5-5\"\u003e\u003ca href=\"#cb5-5\" aria-hidden=\"true\" tabindex=\"-1\"\u003e\u003c/a\u003e                  \u003cspan class=\"ex\"\u003e...\u003c/span\u003e\u003c/span\u003e\n\u003cspan id=\"cb5-6\"\u003e\u003ca href=\"#cb5-6\" aria-hidden=\"true\" tabindex=\"-1\"\u003e\u003c/a\u003e\u003c/span\u003e\n\u003cspan id=\"cb5-7\"\u003e\u003ca href=\"#cb5-7\" aria-hidden=\"true\" tabindex=\"-1\"\u003e\u003c/a\u003e\u003cspan class=\"ex\"\u003eImport\u003c/span\u003e data from VDF to a vector database\u003c/span\u003e\n\u003cspan id=\"cb5-8\"\u003e\u003ca 
href=\"#cb5-8\" aria-hidden=\"true\" tabindex=\"-1\"\u003e\u003c/a\u003e\u003c/span\u003e\n\u003cspan id=\"cb5-9\"\u003e\u003ca href=\"#cb5-9\" aria-hidden=\"true\" tabindex=\"-1\"\u003e\u003c/a\u003e\u003cspan class=\"ex\"\u003eoptions:\u003c/span\u003e\u003c/span\u003e\n\u003cspan id=\"cb5-10\"\u003e\u003ca href=\"#cb5-10\" aria-hidden=\"true\" tabindex=\"-1\"\u003e\u003c/a\u003e  \u003cspan class=\"ex\"\u003e-h,\u003c/span\u003e \u003cspan class=\"at\"\u003e--help\u003c/span\u003e            show this help message and exit\u003c/span\u003e\n\u003cspan id=\"cb5-11\"\u003e\u003ca href=\"#cb5-11\" aria-hidden=\"true\" tabindex=\"-1\"\u003e\u003c/a\u003e  \u003cspan class=\"ex\"\u003e-d\u003c/span\u003e DIR, \u003cspan class=\"at\"\u003e--dir\u003c/span\u003e DIR     Directory to import\u003c/span\u003e\n\u003cspan id=\"cb5-12\"\u003e\u003ca href=\"#cb5-12\" aria-hidden=\"true\" tabindex=\"-1\"\u003e\u003c/a\u003e  \u003cspan class=\"ex\"\u003e-s,\u003c/span\u003e \u003cspan class=\"at\"\u003e--subset,\u003c/span\u003e \u003cspan class=\"at\"\u003e--no-subset\u003c/span\u003e\u003c/span\u003e\n\u003cspan id=\"cb5-13\"\u003e\u003ca href=\"#cb5-13\" aria-hidden=\"true\" tabindex=\"-1\"\u003e\u003c/a\u003e                        \u003cspan class=\"ex\"\u003eImport\u003c/span\u003e a subset of data \u003cspan class=\"er\"\u003e(\u003c/span\u003e\u003cspan class=\"ex\"\u003edefault:\u003c/span\u003e False\u003cspan class=\"kw\"\u003e)\u003c/span\u003e\u003c/span\u003e\n\u003cspan id=\"cb5-14\"\u003e\u003ca href=\"#cb5-14\" aria-hidden=\"true\" tabindex=\"-1\"\u003e\u003c/a\u003e  \u003cspan class=\"ex\"\u003e--create_new,\u003c/span\u003e \u003cspan class=\"at\"\u003e--no-create_new\u003c/span\u003e\u003c/span\u003e\n\u003cspan id=\"cb5-15\"\u003e\u003ca href=\"#cb5-15\" aria-hidden=\"true\" tabindex=\"-1\"\u003e\u003c/a\u003e                        \u003cspan class=\"ex\"\u003eCreate\u003c/span\u003e a new index \u003cspan class=\"er\"\u003e(\u003c/span\u003e\u003cspan 
class=\"ex\"\u003edefault:\u003c/span\u003e False\u003cspan class=\"kw\"\u003e)\u003c/span\u003e\u003c/span\u003e\n\u003cspan id=\"cb5-16\"\u003e\u003ca href=\"#cb5-16\" aria-hidden=\"true\" tabindex=\"-1\"\u003e\u003c/a\u003e\u003c/span\u003e\n\u003cspan id=\"cb5-17\"\u003e\u003ca href=\"#cb5-17\" aria-hidden=\"true\" tabindex=\"-1\"\u003e\u003c/a\u003e\u003cspan class=\"ex\"\u003eVector\u003c/span\u003e Databases:\u003c/span\u003e\n\u003cspan id=\"cb5-18\"\u003e\u003ca href=\"#cb5-18\" aria-hidden=\"true\" tabindex=\"-1\"\u003e\u003c/a\u003e  \u003cspan class=\"ex\"\u003eChoose\u003c/span\u003e the vectors database to export data from\u003c/span\u003e\n\u003cspan id=\"cb5-19\"\u003e\u003ca href=\"#cb5-19\" aria-hidden=\"true\" tabindex=\"-1\"\u003e\u003c/a\u003e\u003c/span\u003e\n\u003cspan id=\"cb5-20\"\u003e\u003ca href=\"#cb5-20\" aria-hidden=\"true\" tabindex=\"-1\"\u003e\u003c/a\u003e  \u003cspan class=\"dt\"\u003e{milvus\u003c/span\u003e\u003cspan class=\"op\"\u003e,\u003c/span\u003e\u003cspan class=\"dt\"\u003epinecone\u003c/span\u003e\u003cspan class=\"op\"\u003e,\u003c/span\u003e\u003cspan class=\"dt\"\u003eqdrant\u003c/span\u003e\u003cspan class=\"op\"\u003e,\u003c/span\u003e\u003cspan class=\"dt\"\u003evertexai_vectorsearch\u003c/span\u003e\u003cspan class=\"op\"\u003e,\u003c/span\u003e\u003cspan class=\"dt\"\u003ekdbai}\u003c/span\u003e\u003c/span\u003e\n\u003cspan id=\"cb5-21\"\u003e\u003ca href=\"#cb5-21\" aria-hidden=\"true\" tabindex=\"-1\"\u003e\u003c/a\u003e    \u003cspan class=\"ex\"\u003emilvus\u003c/span\u003e              Import data to Milvus\u003c/span\u003e\n\u003cspan id=\"cb5-22\"\u003e\u003ca href=\"#cb5-22\" aria-hidden=\"true\" tabindex=\"-1\"\u003e\u003c/a\u003e    \u003cspan class=\"ex\"\u003epinecone\u003c/span\u003e            Import data to Pinecone\u003c/span\u003e\n\u003cspan id=\"cb5-23\"\u003e\u003ca href=\"#cb5-23\" aria-hidden=\"true\" tabindex=\"-1\"\u003e\u003c/a\u003e    \u003cspan 
class=\"ex\"\u003eqdrant\u003c/span\u003e              Import data to Qdrant\u003c/span\u003e\n\u003cspan id=\"cb5-24\"\u003e\u003ca href=\"#cb5-24\" aria-hidden=\"true\" tabindex=\"-1\"\u003e\u003c/a\u003e    \u003cspan class=\"ex\"\u003evertexai_vectorsearch\u003c/span\u003e\u003c/span\u003e\n\u003cspan id=\"cb5-25\"\u003e\u003ca href=\"#cb5-25\" aria-hidden=\"true\" tabindex=\"-1\"\u003e\u003c/a\u003e                        \u003cspan class=\"ex\"\u003eImport\u003c/span\u003e data to Vertex AI Vector Search\u003c/span\u003e\n\u003cspan id=\"cb5-26\"\u003e\u003ca href=\"#cb5-26\" aria-hidden=\"true\" tabindex=\"-1\"\u003e\u003c/a\u003e    \u003cspan class=\"ex\"\u003ekdbai\u003c/span\u003e               Import data to KDB.AI\u003c/span\u003e\u003c/code\u003e\u003c/pre\u003e\n\u003c/div\u003e\n\u003ch2 id=\"re-embed-script\"\u003eRe-embed script\u003c/h2\u003e\n\u003cp\u003eThis Python script re-embeds a vector dataset using a new model. It takes\n    a directory containing the dataset in the VDF format as input.\n
The script also allows you to specify the name of the column\n    containing text to be embedded.\u003c/p\u003e\n\u003cdiv class=\"sourceCode\" id=\"cb6\"\u003e\n    \u003cpre\n        class=\"sourceCode bash\"\u003e\u003ccode class=\"sourceCode bash\"\u003e\u003cspan id=\"cb6-1\"\u003e\u003ca href=\"#cb6-1\" aria-hidden=\"true\" tabindex=\"-1\"\u003e\u003c/a\u003e\u003cspan class=\"ex\"\u003ereembed_vdf\u003c/span\u003e \u003cspan class=\"at\"\u003e--help\u003c/span\u003e\u003c/span\u003e\n\u003cspan id=\"cb6-2\"\u003e\u003ca href=\"#cb6-2\" aria-hidden=\"true\" tabindex=\"-1\"\u003e\u003c/a\u003e\u003cspan class=\"ex\"\u003eusage:\u003c/span\u003e reembed_vdf \u003cspan class=\"pp\"\u003e[-\u003c/span\u003e\u003cspan class=\"ss\"\u003eh\u003c/span\u003e\u003cspan class=\"pp\"\u003e]\u003c/span\u003e \u003cspan class=\"at\"\u003e-d\u003c/span\u003e DIR [-m NEW_MODEL_NAME]\u003c/span\u003e\n\u003cspan id=\"cb6-3\"\u003e\u003ca href=\"#cb6-3\" aria-hidden=\"true\" tabindex=\"-1\"\u003e\u003c/a\u003e                  \u003cspan class=\"ex\"\u003e[-t\u003c/span\u003e TEXT_COLUMN]\u003c/span\u003e\n\u003cspan id=\"cb6-4\"\u003e\u003ca href=\"#cb6-4\" aria-hidden=\"true\" tabindex=\"-1\"\u003e\u003c/a\u003e\u003c/span\u003e\n\u003cspan id=\"cb6-5\"\u003e\u003ca href=\"#cb6-5\" aria-hidden=\"true\" tabindex=\"-1\"\u003e\u003c/a\u003e\u003cspan class=\"ex\"\u003eReembed\u003c/span\u003e a vector dataset\u003c/span\u003e\n\u003cspan id=\"cb6-6\"\u003e\u003ca href=\"#cb6-6\" aria-hidden=\"true\" tabindex=\"-1\"\u003e\u003c/a\u003e\u003c/span\u003e\n\u003cspan id=\"cb6-7\"\u003e\u003ca href=\"#cb6-7\" aria-hidden=\"true\" tabindex=\"-1\"\u003e\u003c/a\u003e\u003cspan class=\"ex\"\u003eoptions:\u003c/span\u003e\u003c/span\u003e\n\u003cspan id=\"cb6-8\"\u003e\u003ca href=\"#cb6-8\" aria-hidden=\"true\" tabindex=\"-1\"\u003e\u003c/a\u003e  \u003cspan class=\"ex\"\u003e-h,\u003c/span\u003e \u003cspan class=\"at\"\u003e--help\u003c/span\u003e            show this help message and 
exit\u003c/span\u003e\n\u003cspan id=\"cb6-9\"\u003e\u003ca href=\"#cb6-9\" aria-hidden=\"true\" tabindex=\"-1\"\u003e\u003c/a\u003e  \u003cspan class=\"ex\"\u003e-d\u003c/span\u003e DIR, \u003cspan class=\"at\"\u003e--dir\u003c/span\u003e DIR     Directory of vector dataset in\u003c/span\u003e\n\u003cspan id=\"cb6-10\"\u003e\u003ca href=\"#cb6-10\" aria-hidden=\"true\" tabindex=\"-1\"\u003e\u003c/a\u003e                        \u003cspan class=\"ex\"\u003ethe\u003c/span\u003e VDF format\u003c/span\u003e\n\u003cspan id=\"cb6-11\"\u003e\u003ca href=\"#cb6-11\" aria-hidden=\"true\" tabindex=\"-1\"\u003e\u003c/a\u003e  \u003cspan class=\"ex\"\u003e-m\u003c/span\u003e NEW_MODEL_NAME, \u003cspan class=\"at\"\u003e--new_model_name\u003c/span\u003e NEW_MODEL_NAME\u003c/span\u003e\n\u003cspan id=\"cb6-12\"\u003e\u003ca href=\"#cb6-12\" aria-hidden=\"true\" tabindex=\"-1\"\u003e\u003c/a\u003e                        \u003cspan class=\"ex\"\u003eName\u003c/span\u003e of new model to be used\u003c/span\u003e\n\u003cspan id=\"cb6-13\"\u003e\u003ca href=\"#cb6-13\" aria-hidden=\"true\" tabindex=\"-1\"\u003e\u003c/a\u003e  \u003cspan class=\"ex\"\u003e-t\u003c/span\u003e TEXT_COLUMN, \u003cspan class=\"at\"\u003e--text_column\u003c/span\u003e TEXT_COLUMN\u003c/span\u003e\n\u003cspan id=\"cb6-14\"\u003e\u003ca href=\"#cb6-14\" aria-hidden=\"true\" tabindex=\"-1\"\u003e\u003c/a\u003e                        \u003cspan class=\"ex\"\u003eName\u003c/span\u003e of the column containing\u003c/span\u003e\n\u003cspan id=\"cb6-15\"\u003e\u003ca href=\"#cb6-15\" aria-hidden=\"true\" tabindex=\"-1\"\u003e\u003c/a\u003e                        \u003cspan class=\"ex\"\u003etext\u003c/span\u003e to be embedded\u003c/span\u003e\u003c/code\u003e\u003c/pre\u003e\n\u003c/div\u003e\n\u003ch2 id=\"examples\"\u003eExamples\u003c/h2\u003e\n\u003cdiv class=\"sourceCode\" id=\"cb7\"\u003e\n    \u003cpre\n        class=\"sourceCode bash\"\u003e\u003ccode class=\"sourceCode bash\"\u003e\u003cspan 
id=\"cb7-1\"\u003e\u003ca href=\"#cb7-1\" aria-hidden=\"true\" tabindex=\"-1\"\u003e\u003c/a\u003e\u003cspan class=\"ex\"\u003eexport_vdf\u003c/span\u003e \u003cspan class=\"at\"\u003e-m\u003c/span\u003e hkunlp/instructor-xl \u003cspan class=\"at\"\u003e--push_to_hub\u003c/span\u003e pinecone \u003cspan class=\"at\"\u003e--environment\u003c/span\u003e gcp-starter\u003c/span\u003e\n\u003cspan id=\"cb7-2\"\u003e\u003ca href=\"#cb7-2\" aria-hidden=\"true\" tabindex=\"-1\"\u003e\u003c/a\u003e\u003c/span\u003e\n\u003cspan id=\"cb7-3\"\u003e\u003ca href=\"#cb7-3\" aria-hidden=\"true\" tabindex=\"-1\"\u003e\u003c/a\u003e\u003cspan class=\"ex\"\u003eimport_vdf\u003c/span\u003e \u003cspan class=\"at\"\u003e-d\u003c/span\u003e /path/to/vdf/dataset milvus\u003c/span\u003e\n\u003cspan id=\"cb7-4\"\u003e\u003ca href=\"#cb7-4\" aria-hidden=\"true\" tabindex=\"-1\"\u003e\u003c/a\u003e\u003c/span\u003e\n\u003cspan id=\"cb7-5\"\u003e\u003ca href=\"#cb7-5\" aria-hidden=\"true\" tabindex=\"-1\"\u003e\u003c/a\u003e\u003cspan class=\"ex\"\u003ereembed_vdf\u003c/span\u003e \u003cspan class=\"at\"\u003e-d\u003c/span\u003e /path/to/vdf/dataset \u003cspan class=\"at\"\u003e-m\u003c/span\u003e sentence-transformers/all-MiniLM-L6-v2 \u003cspan class=\"at\"\u003e-t\u003c/span\u003e title\u003c/span\u003e\u003c/code\u003e\u003c/pre\u003e\n\u003c/div\u003e\n\u003cp\u003eFollow the prompt to select the index and id range to export.\u003c/p\u003e\n\u003ch2 id=\"contributing\"\u003eContributing\u003c/h2\u003e\n\u003ch3 id=\"adding-a-new-vector-database\"\u003eAdding a new vector database\u003c/h3\u003e\n\u003cp\u003eIf you wish to add an import/export implementation for a new vector\n    database, you must also implement the other side of the import/export\n    for the same database. 
Please fork the repo and send a PR for both the\n    import and export scripts.\u003c/p\u003e\n\u003cp\u003eSteps to add a new vector database (ABC):\u003c/p\u003e\n\u003col type=\"1\"\u003e\n    \u003cli\u003eAdd your database name in \u003ca href=\"src/vdf_io/names.py\"\u003esrc/vdf_io/names.py\u003c/a\u003e in the DBNames enum\n        class.\u003c/li\u003e\n    \u003cli\u003eCreate new files \u003ccode\u003esrc/vdf_io/export_vdf/export_abc.py\u003c/code\u003e\n        and \u003ccode\u003esrc/vdf_io/import_vdf/import_abc.py\u003c/code\u003e for the new\n        DB.\u003c/li\u003e\n\u003c/ol\u003e\n\u003cp\u003e\u003cstrong\u003eExport\u003c/strong\u003e:\u003c/p\u003e\n\u003col type=\"1\"\u003e\n    \u003cli\u003eIn your export file, define a class ExportABC which inherits from\n        ExportVDF.\u003c/li\u003e\n    \u003cli\u003eSpecify a DB_NAME_SLUG for the class\u003c/li\u003e\n    \u003cli\u003eThe class should implement:\n        \u003col type=\"1\"\u003e\n            \u003cli\u003emake_parser() function to add database specific arguments to the\n                export_vdf CLI\u003c/li\u003e\n            \u003cli\u003eexport_vdb() function to prompt user for info not provided in the\n                CLI. It should then call the get_data() function.\u003c/li\u003e\n            \u003cli\u003eget_data() function to download points (in a batched manner) with\n                all the metadata from the specified index of the vector database. This\n                data should be stored in a series of parquet files/folders. 
The metadata\n                should be stored in a json file with the \u003ca\n                    href=\"#universal-vector-dataset-format-vdf-specification\"\u003eschema\n                    above\u003c/a\u003e.\u003c/li\u003e\n        \u003c/ol\u003e\n    \u003c/li\u003e\n    \u003cli\u003eUse the script to export data from an example index of the vector\n        database and verify that the data is exported correctly.\u003c/li\u003e\n\u003c/ol\u003e\n\u003cp\u003e\u003cstrong\u003eImport\u003c/strong\u003e:\u003c/p\u003e\n\u003col type=\"1\"\u003e\n    \u003cli\u003eIn your import file, define a class ImportABC which inherits from\n        ImportVDF.\u003c/li\u003e\n    \u003cli\u003eSpecify a DB_NAME_SLUG for the class\u003c/li\u003e\n    \u003cli\u003eThe class should implement:\n        \u003col type=\"1\"\u003e\n            \u003cli\u003emake_parser() function to add database specific arguments to the\n                import_vdf CLI, such as the url of the database, any authentication\n                tokens, etc.\u003c/li\u003e\n            \u003cli\u003eimport_vdb() function to prompt user for info not provided in the\n                CLI. It should then call the upsert_data() function.\u003c/li\u003e\n            \u003cli\u003eupsert_data() function to upload points from a vdf dataset (in a\n                batched manner) with all the metadata to the specified index of the\n                vector database. 
All metadata about the dataset should be read from the\n                VDF_META.json file in the vdf folder.\u003c/li\u003e\n        \u003c/ol\u003e\n    \u003c/li\u003e\n    \u003cli\u003eUse the script to import data from the example vdf dataset exported\n        in the previous step and verify that the data is imported\n        correctly.\u003c/li\u003e\n\u003c/ol\u003e\n\u003ch3 id=\"changing-the-vdf-specification\"\u003eChanging the VDF\n    specification\u003c/h3\u003e\n\u003cp\u003eIf you wish to change the VDF specification, please open an issue to\n    discuss the change before sending a PR.\u003c/p\u003e\n\u003ch3 id=\"efficiency-improvements\"\u003eEfficiency improvements\u003c/h3\u003e\n\u003cp\u003eIf you wish to improve the efficiency of the import/export scripts,\n    please fork the repo and send a PR.\u003c/p\u003e\n\u003ch2 id=\"telemetry\"\u003eTelemetry\u003c/h2\u003e\n\u003cp\u003eRunning the scripts in the repo will send anonymous usage data to AI\n    Northstar Tech to help improve the library.\u003c/p\u003e\n\u003cp\u003eYou can opt out of this by setting the environment variable\n    \u003ccode\u003eDISABLE_TELEMETRY_VECTORIO\u003c/code\u003e to \u003ccode\u003e1\u003c/code\u003e.\n\u003c/p\u003e\n\u003ch2 id=\"questions\"\u003eQuestions\u003c/h2\u003e\n\u003cp\u003eIf you have any questions, please open an issue on the repo or\n    message Dhruv Anand on \u003ca href=\"https://www.linkedin.com/in/dhruv-anand-ainorthstartech/\"\u003eLinkedIn\u003c/a\u003e.\u003c/p\u003e","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FAI-Northstar-Tech%2Fvector-io","html_url":"https://awesome.ecosyste.ms/projects/github.com%2FAI-Northstar-Tech%2Fvector-io","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FAI-Northstar-Tech%2Fvector-io/lists"}