Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/bcrist/zig-rom-compress
https://github.com/bcrist/zig-rom-compress
Last synced: 21 days ago
JSON representation
- Host: GitHub
- URL: https://github.com/bcrist/zig-rom-compress
- Owner: bcrist
- License: mit
- Created: 2023-10-28T15:56:51.000Z (about 1 year ago)
- Default Branch: main
- Last Pushed: 2024-01-13T03:20:45.000Z (11 months ago)
- Last Synced: 2024-01-13T16:48:28.700Z (11 months ago)
- Language: Zig
- Size: 9.77 KB
- Stars: 0
- Watchers: 2
- Forks: 0
- Open Issues: 0
-
Metadata Files:
- Readme: readme.md
- License: license
Awesome Lists containing this project
README
# Zig-Rom-Compress
A simple algorithm for in-storage compression of potentially sparse data where the data will be fully uncompressed before usage.
The motivating use case is when using a microcontroller to bootstrap an SRAM at reset, which will thereafter function as a lookup table/ROM.
Often times such lookup tables may contain several megabytes of highly patterned data, but you'd like to initialize it with just a small/cheap microcontroller, and typically these will have much less than 1MB of flash memory.Note for most data, an LZ-based compressor like DEFLATE will achieve a better compression ratio, but doesn't support sparse data without adding additional metadata.
The decompressor for this scheme is also very simple and requires only a few bytes of working RAM.## Algorithm
The key insight that we make use of is that many data words are likely to be repeated many times,
and there may be many "don't care" addresses that we need not initialize at all.
Furthermore, the data need not be reconstructed in linear order.The compressor begins by partitioning the data into lists of addresses which point to the same data value.
Then it sorts the addresses within those lists, and sorts the partitions based on the data value.
It can then use delta compression and RLE on the transformed data.The algorithm works well in most real-world cases where there are many addresses with the same data value,
and even better if those addresses appear in contiguous blocks.
It is possible, however, (particularly with encrypted or random-like data) that the compressed version may actually be larger.