https://github.com/brian14708/tsar

Archive file format for storing large tensors efficiently
archive compression onnx

# Tensor Archive

Archive file format for storing tensors, with optional lossy compression for better storage efficiency.

Features:
- floating-point compression ([zfp](https://github.com/LLNL/zfp))
- storing data in lower-precision formats (bfloat16, ...)
- compressing mantissas and exponents separately
- tools for building archives from ONNX models
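
To make two of the features above concrete, here is a minimal Python sketch of splitting float32 values into an exponent stream and a mantissa stream, and of reducing float32 to bfloat16 precision. It is an illustration only, not tsar's implementation: the function names are hypothetical, `zlib` stands in for whichever compressor the archive actually uses, and the bfloat16 step truncates instead of rounding.

```python
import random
import struct
import zlib


def float32_bits(values):
    """Return the IEEE 754 binary32 bit pattern of each value as an int."""
    return [struct.unpack("<I", struct.pack("<f", v))[0] for v in values]


def bits_to_floats(bits):
    """Inverse of float32_bits: turn 32-bit patterns back into floats."""
    return [struct.unpack("<f", struct.pack("<I", b))[0] for b in bits]


def split_streams(bits):
    """Separate sign+exponent (top 9 bits) from mantissa (low 23 bits)."""
    high = bytearray()
    low = bytearray()
    for b in bits:
        high += (b >> 23).to_bytes(2, "little")
        low += (b & 0x7FFFFF).to_bytes(3, "little")
    return bytes(high), bytes(low)


def merge_streams(high, low):
    """Rebuild the 32-bit patterns from the two streams."""
    count = len(high) // 2
    return [
        (int.from_bytes(high[2 * i:2 * i + 2], "little") << 23)
        | int.from_bytes(low[3 * i:3 * i + 3], "little")
        for i in range(count)
    ]


def truncate_to_bfloat16(bits):
    """Keep only the top 16 bits: sign, 8 exponent bits, 7 mantissa bits."""
    return [b & 0xFFFF0000 for b in bits]


# Synthetic "weights" (an assumption: small, roughly Gaussian values).
random.seed(0)
weights = [random.gauss(0.0, 0.05) for _ in range(10_000)]
bits = float32_bits(weights)

raw = b"".join(b.to_bytes(4, "little") for b in bits)
high, low = split_streams(bits)
print("interleaved:", len(zlib.compress(raw)), "bytes")
print("split:      ", len(zlib.compress(high)) + len(zlib.compress(low)), "bytes")
assert merge_streams(high, low) == bits  # the split itself is lossless
```

The idea behind the split is that exponents of similar-magnitude weights repeat often and compress well on their own, while mantissa bits look close to random. Truncating to bfloat16 keeps the full exponent range and bounds the relative error of normal values by 2^-7.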

## Quick Start

Install the `tsar-py` package from source (requires a Rust build environment):

```sh
pip install git+https://github.com/brian14708/tsar.git#subdirectory=tsar-py
```

### ONNX format

```sh
# lossless compression
tsar-pack -e 0 ".onnx" output.tsar
# lossy compression (with maximum error 1e-6)
tsar-pack -e 1e-6 ".onnx" output.tsar

# extract file to model/ directory
tsar-unpack output.tsar model/
```
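
The same commands can be driven from a Python script. The sketch below only builds and runs the command lines shown above; it does not use any tsar Python API. The helper names and the `model.onnx` file name are placeholders, and the value given to `-e` is passed through as a string because the README does not say whether the bound is absolute or relative.

```python
import shutil
import subprocess


def pack_command(onnx_path, archive_path, max_error="0"):
    """Build the tsar-pack argument list; max_error is passed to -e verbatim."""
    return ["tsar-pack", "-e", str(max_error), str(onnx_path), str(archive_path)]


def unpack_command(archive_path, output_dir):
    """Build the tsar-unpack argument list."""
    return ["tsar-unpack", str(archive_path), str(output_dir)]


def run(command):
    """Run a command, failing early if the executable is not on PATH."""
    if shutil.which(command[0]) is None:
        raise FileNotFoundError(f"{command[0]} not found; is tsar-py installed?")
    subprocess.run(command, check=True)


# Example (model.onnx is a placeholder file name):
# run(pack_command("model.onnx", "output.tsar", max_error="1e-6"))
# run(unpack_command("output.tsar", "model/"))
```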

### Python API

TODO

## Results

| Model | Compression | Size |
| ----------------------------------------------------------------------------------- | --------------- | --------- |
| [ResNet-152](https://github.com/onnx/models/tree/main/vision/classification/resnet) | none | 230.6 MiB |
| | gzip | 215.4 MiB |
| | tsar (lossless) | 197.4 MiB |
| | tsar (err=1e-6) | 129.8 MiB |
| | tsar (err=1e-5) | 108.7 MiB |
| | tsar (err=1e-4) | 87.8 MiB |
| | tsar (err=1e-3) | 60.6 MiB |
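
To read the table as relative savings, the sizes can be divided by the uncompressed size. The short Python snippet below does that arithmetic using only the numbers from the table; the variable names are illustrative.

```python
# Sizes in MiB, copied from the ResNet-152 rows of the table above.
sizes_mib = {
    "none": 230.6,
    "gzip": 215.4,
    "tsar (lossless)": 197.4,
    "tsar (err=1e-6)": 129.8,
    "tsar (err=1e-5)": 108.7,
    "tsar (err=1e-4)": 87.8,
    "tsar (err=1e-3)": 60.6,
}

baseline = sizes_mib["none"]
# Fraction of the original size that remains after compression.
ratios = {name: size / baseline for name, size in sizes_mib.items()}

for name, ratio in ratios.items():
    print(f"{name:<16} {sizes_mib[name]:>6.1f} MiB  {ratio:6.1%} of original")
```

For this model, lossless tsar output is about 86% of the original size, compared with about 93% for gzip, and an error bound of 1e-3 brings it down to about 26%.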