https://github.com/zenatron/pi-compress

The Future of Data Compression?

algorithms compression compression-algorithm pi rust

Host: GitHub
URL: https://github.com/zenatron/pi-compress
Owner: zenatron
Created: 2025-04-01T02:12:23.000Z (7 months ago)
Default Branch: main
Last Pushed: 2025-04-01T02:23:16.000Z (7 months ago)
Last Synced: 2025-07-05T05:07:51.159Z (4 months ago)
Topics: algorithms, compression, compression-algorithm, pi, rust
Language: Rust
Homepage:
Size: 456 KB
Stars: 1
Watchers: 1
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md

# $`\pi`$Compress: The Future of Data Compression and Encryption?

## Overview
Welcome to $`\pi`$Compress, a revolutionary data "compression" tool developed as a fun thought experiment. Leveraging Rust's speed and safety, $`\pi`$Compress uses the first million digits of $`\pi`$ to "compress" data. Like other compression algorithms, it works super fast and is super secure*. On top of that, the output is even larger than the input! Checkmate, [LZMA](https://en.wikipedia.org/wiki/LZMA)!

*Terms and Conditions apply.

## The "Algorithm"
The core idea is based on the fascinating (and mathematically unproven) conjecture that the digits of $`\pi`$ contain every possible finite sequence of digits.

$`\pi`$Compress works like this:
1. **Input:** Taking your input text as a sequence of bytes.
2. **Byte-to-Hex Encoding:** Converting each byte of the text into its two-digit hexadecimal representation (e.g., 'H' -> 0x48 becomes "48").
3. **The $`\pi`$ Search:** Searching for this resulting hexadecimal string within a stored sequence of the **first million digits of $`\pi`$** (from `pi.txt`). Note that since $`\pi`$ only contains digits 0-9, this search will only ever find hex strings that happen to *only* contain those digits (e.g., "1234" might be found, but "4a8b" never will).
4. **Greedy Longest Match:** It tries to find the longest possible chunk of your input text (starting from the beginning) whose hex representation exists in $`\pi`$.
5. **Verification:** Crucially, it verifies that the specific sequence found in $`\pi`$ can be reliably converted *back* into the original bytes (this step prevents some hilarious but incorrect results).
6. **"Compressed" Output:** The output isn't actually smaller! It's a list of instructions:
   * `Pi[index] (N bytes)`: Meaning "fetch the data representing N original bytes starting at this index in $`\pi`$".
   * `Raw[0xHH...]`: Meaning "this byte couldn't be reliably found in $`\pi`$, so here's its original hex value".
7. **Decompression:** Reverses the process by fetching the digits from the stored $`\pi`$ sequence based on the indices and converting them back to text (via hex), or using the raw bytes directly.
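
The steps above can be sketched roughly as follows. This is a minimal illustration, not the repository's actual code: the `Token` type and the function names `to_hex`, `decode_pi`, `compress`, and `decompress` are hypothetical, and the sketch assumes the digit string passed as `pi` contains only the ASCII digits after the decimal point.

```rust
/// One instruction in the "compressed" output (hypothetical type).
#[derive(Debug, PartialEq)]
enum Token {
    /// `Pi[index] (N bytes)`: 2 * len digits starting at `index` encode `len` bytes.
    Pi { index: usize, len: usize },
    /// `Raw[0xHH...]`: bytes stored verbatim because no match was found.
    Raw(Vec<u8>),
}

/// Step 2: encode each byte as two lowercase hex digits, e.g. b"H" -> "48".
fn to_hex(bytes: &[u8]) -> String {
    bytes.iter().map(|b| format!("{:02x}", b)).collect()
}

/// Read 2 * len digits at `index` and parse each pair as one hex byte.
/// Returns None if the range is out of bounds or a pair is not valid hex.
fn decode_pi(pi: &str, index: usize, len: usize) -> Option<Vec<u8>> {
    let digits = pi.get(index..index + 2 * len)?;
    (0..len)
        .map(|i| u8::from_str_radix(&digits[2 * i..2 * i + 2], 16).ok())
        .collect()
}

/// Steps 3 to 6: greedy longest match with a round-trip check.
/// Deliberately naive: every candidate chunk triggers a full search of `pi`.
fn compress(input: &[u8], pi: &str) -> Vec<Token> {
    let mut tokens = Vec::new();
    let mut pos = 0;
    while pos < input.len() {
        // Try the longest remaining chunk first, then shrink it byte by byte.
        let matched = (pos + 1..=input.len()).rev().find_map(|end| {
            let chunk = &input[pos..end];
            // Hex strings containing a-f can never occur in a decimal digit string.
            let index = pi.find(&to_hex(chunk))?;
            // Step 5: keep the match only if it decodes back to the same bytes.
            let ok = decode_pi(pi, index, chunk.len()).as_deref() == Some(chunk);
            ok.then_some((index, chunk.len()))
        });
        match matched {
            Some((index, len)) => {
                tokens.push(Token::Pi { index, len });
                pos += len;
            }
            None => {
                // No match at this position: emit the byte raw,
                // merging it into a preceding Raw token if there is one.
                match tokens.last_mut() {
                    Some(Token::Raw(bytes)) => bytes.push(input[pos]),
                    _ => tokens.push(Token::Raw(vec![input[pos]])),
                }
                pos += 1;
            }
        }
    }
    tokens
}

/// Step 7: rebuild the original bytes from the token list.
fn decompress(tokens: &[Token], pi: &str) -> Option<Vec<u8>> {
    let mut out = Vec::new();
    for token in tokens {
        match token {
            Token::Pi { index, len } => out.extend(decode_pi(pi, *index, *len)?),
            Token::Raw(bytes) => out.extend_from_slice(bytes),
        }
    }
    Some(out)
}
```

For instance, with only the first 50 decimals of $`\pi`$ as the digit string, the input `&CJ` has the hex form `26434a`. The prefix `2643` occurs at zero-based offset 20, so the first two bytes become a `Pi` token, while `J` (0x4a) contains a hex letter and falls back to a `Raw` token.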

## Why?
Mostly for fun and to explore a quirky idea. This is **not** a practical compression algorithm. In fact, the "compressed" representation is often significantly larger than the original input due to storing indices and raw data. Clone the repo and run it to see for yourself!
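
A rough back-of-the-envelope count suggests why the output cannot shrink. It assumes indices are stored in binary at minimal width, which may not match what the repo actually does:

```math
\begin{aligned}
\text{windows of length } k \text{ in } 10^6 \text{ digits} &= 10^6 - k + 1 < 10^k \quad (k \ge 6) \\
\lceil \log_2 10^6 \rceil &= 20 \text{ bits per index} \\
\frac{10^2}{16^2} = \frac{100}{256} &\approx 39\% \text{ of byte values have a digit-only hex form}
\end{aligned}
```

By the first line, not every string of 6 or more digits can appear in a million digits, so most matches cover at most about 3 bytes (24 bits). Each match then costs a 20-bit index plus a length, and roughly 61% of byte values can never match at all.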

## Disclaimer
This project is intended for educational and entertainment purposes only. Do not use it for any serious data compression needs (or do, I guess). Happy April Fools'!
