{"id":16976160,"url":"https://github.com/ashvardanian/tenpack","last_synced_at":"2025-03-22T14:31:48.648Z","repository":{"id":188633982,"uuid":"517316537","full_name":"ashvardanian/TenPack","owner":"ashvardanian","description":"Fast Tensors Packaging library for text, image, video, and audio data compatible with PyTorch, TensorFlow, \u0026 NumPy 🖼️🎵🎥 ➡️ 🧠","archived":false,"fork":false,"pushed_at":"2024-05-06T06:31:12.000Z","size":111,"stargazers_count":7,"open_issues_count":4,"forks_count":1,"subscribers_count":3,"default_branch":"main","last_synced_at":"2025-03-15T15:49:59.659Z","etag":null,"topics":["clip","laion","multi-modal","numpy","parser","pytorch","simd","tensor","tensorflow","transformer"],"latest_commit_sha":null,"homepage":"","language":"C++","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/ashvardanian.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2022-07-24T12:22:43.000Z","updated_at":"2024-10-23T07:35:21.000Z","dependencies_parsed_at":"2024-05-06T07:46:48.964Z","dependency_job_id":null,"html_url":"https://github.com/ashvardanian/TenPack","commit_stats":null,"previous_names":["ashvardanian/tenpack"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ashvardanian%2FTenPack","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ashvardanian%2FTenPack/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ashvardanian%2FTenPack/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/
repositories/ashvardanian%2FTenPack/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/ashvardanian","download_url":"https://codeload.github.com/ashvardanian/TenPack/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":244972270,"owners_count":20540951,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["clip","laion","multi-modal","numpy","parser","pytorch","simd","tensor","tensorflow","transformer"],"created_at":"2024-10-14T01:25:09.133Z","updated_at":"2025-03-22T14:31:48.385Z","avatar_url":"https://github.com/ashvardanian.png","language":"C++","funding_links":[],"categories":[],"sub_categories":[],"readme":"# TenPack\n\nThis library does three simple things for you:\n\n1. Guess the media type from raw bytes,\n2. Parse its dimensions, sizes, lengths, etc.,\n3. 
Unpack data into regular preallocated Tensors.\n\nWhere do we use it?\nTo connect the Data Storage layer of [UKV](https://github.com/unum-cloud/ukv) to High-Performance Computing libraries like [TensorFlow](https://tensorflow.org) and [PyTorch](https://pytorch.org).\n\n## How does it work?\n\nMost common file formats have \"signatures\" or \"magic numbers\" embedded into them,\noften as a prefix of the byte stream.\n\n* [List of file signatures](https://en.wikipedia.org/wiki/List_of_file_signatures)\n* [Magic numbers in programming](https://en.wikipedia.org/wiki/Magic_number_(programming)#Magic_numbers_in_files)\n\nLibraries covering the first step already exist for other languages:\n\n* [filetype](https://github.com/h2non/filetype) for Go\n* [filetype.py](https://github.com/h2non/filetype.py) for Python\n* [FileType](https://github.com/rzane/file_type) for Elixir\n* [FileSignatures](https://github.com/neilharvey/FileSignatures) for C#\n\n## Alternatives for Tensor Exports\n\n* [Pillow](https://pillow.readthedocs.io/en/stable/) and [Pillow-SIMD](https://github.com/uploadcare/pillow-simd) for [image formats](https://pillow.readthedocs.io/en/stable/handbook/image-file-formats.html).\n* [FFmpeg](https://ffmpeg.org/), for video formats.\n* [Nyquist](https://github.com/ddiakopoulos/libnyquist), for audio formats.\n\nIn fact, TenPack is just a CMake-friendly generalization of those libraries with a C interface and a focus on memory reuse.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fashvardanian%2Ftenpack","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fashvardanian%2Ftenpack","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fashvardanian%2Ftenpack/lists"}