{"id":19411511,"url":"https://github.com/volcengine/veturboio","last_synced_at":"2025-04-24T10:33:48.773Z","repository":{"id":227000501,"uuid":"764575172","full_name":"volcengine/veTurboIO","owner":"volcengine","description":"A library developed by Volcano Engine for high-performance reading and writing of PyTorch model files.","archived":false,"fork":false,"pushed_at":"2024-06-07T10:22:32.000Z","size":5379,"stargazers_count":7,"open_issues_count":1,"forks_count":3,"subscribers_count":3,"default_branch":"main","last_synced_at":"2024-06-07T11:44:16.396Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/volcengine.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null}},"created_at":"2024-02-28T10:27:44.000Z","updated_at":"2024-06-07T10:22:38.000Z","dependencies_parsed_at":"2024-03-11T04:35:43.067Z","dependency_job_id":null,"html_url":"https://github.com/volcengine/veTurboIO","commit_stats":null,"previous_names":["volcengine/veturboio"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/volcengine%2FveTurboIO","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/volcengine%2FveTurboIO/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/volcengine%2FveTurboIO/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/volcengine%2FveTurboIO/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/volcengine","download
_url":"https://codeload.github.com/volcengine/veTurboIO/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":223950384,"owners_count":17230462,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-11-10T12:21:50.230Z","updated_at":"2025-04-24T10:33:48.767Z","avatar_url":"https://github.com/volcengine.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# veTurboIO\n\n\n[En](./README.md) | [中文](./README.zh.md)\n\n\nA Python library for high-performance reading and writing of PyTorch model files \ndeveloped by Volcano Engine. This library mainly implements based on the safetensors \nfile format to achieve efficient storage and reading of tensor data.\n\n## Install\n\nIt can be installed directly through the following way:\n```bash\ncd veturboio\npython setup.py get_libcfs\npython setup.py install\n```\n\nTips: This instruction will preferentially download the whl file that matches the \ncurrent Python and PyTorch versions. 
If no matching whl file is found, it will \nautomatically download the source code for compilation and installation.\n\n\nIf the installation fails, you can also download the source code \nand then compile and install it manually.\n\n```bash\n# CUDA ops, default\npython setup.py install --cuda_ext\n\n# NPU ops\npython setup.py install --npu_ext\n\n# CPU only\npython setup.py install --cpu_ext\n```\n\n\n## Quick Start\n\n### Read and write model files\n\n\n```python\nimport torch\nimport veturboio\n\ntensors = {\n   \"weight1\": torch.zeros((1024, 1024)),\n   \"weight2\": torch.zeros((1024, 1024))\n}\n\nveturboio.save_file(tensors, \"model.safetensors\")\n\nnew_tensors = veturboio.load(\"model.safetensors\")\n\n# check if the tensors are the same\nfor k, v in tensors.items():\n    assert torch.allclose(v, new_tensors[k])\n```\n\n### Convert existing PyTorch files\n\n```bash\npython -m veturboio.convert -i model.pt -o model.safetensors\n```\n\n## Performance test\n\nRun directly:\n```bash\nbash bench/io_bench.sh\n```\nThis produces results like the following:\n```\nfs_name    tensor_size     veturboio load_time(s)             torch load_time(s)\nshm        1073741824      0.08                               0.63\nshm        2147483648      0.19                               1.26\nshm        4294967296      0.36                               2.32\n```\n\nTo see more options, run:\n```bash\npython bench/io_bench.py -h\n```\n\n## Advanced Features\n\n### Using veMLP to accelerate reading and writing\nVolcano Engine Machine Learning Platform (veMLP) provides a distributed cache file system\nbased on the physical disks of the GPU cluster. 
\n\n\u003cp align=\"center\"\u003e\n    \u003cimg src=\"./docs/imgs/SFCS.png\" style=\"zoom:15%;\"\u003e\n\u003c/p\u003e\n\nWhen a cluster-level task needs to read \na model file, the caching system can efficiently distribute the model file between GPU \nmachines via RDMA transfer, thus avoiding network transfer bottlenecks. When using this \nsystem, veTurboIO can maximize its performance advantages.\n\n### Encrypt and decrypt model files\nveTurboIO supports encryption and decryption of model files. You can read the [tutorial](./docs/encrypt_model.md) \nto learn how to keep your model files secure. When you use GPU as target device, veTurboIO can decrypt the model file on the fly.\n\n\n## License\n\n[Apache License 2.0](./LICENSE)\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fvolcengine%2Fveturboio","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fvolcengine%2Fveturboio","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fvolcengine%2Fveturboio/lists"}