https://github.com/shivamkumar818/compresso
Efficient Text Compression
- Host: GitHub
- URL: https://github.com/shivamkumar818/compresso
- Owner: shivamkumar818
- Created: 2025-03-19T18:04:00.000Z (2 months ago)
- Default Branch: main
- Last Pushed: 2025-03-19T18:05:20.000Z (2 months ago)
- Last Synced: 2025-03-19T19:23:28.630Z (2 months ago)
- Language: C++
- Size: 8.79 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
File_Compressor
A text file compression tool based on the Huffman encoding algorithm. The tool inherits the main benefit of Huffman encoding, i.e., lossless compression. Text data can be compressed to roughly 30-40% of its original size without any loss of data, saving 60-70% of the space (actual savings depend on the input). This comes in very handy for cloud sharing.
Use Case
Imagine you have to send a 1 GB data file and you only have 500 MB of internet data. No need to worry! The Huffman encoding algorithm has you covered. After compression, the file's size becomes somewhere around 300 MB (not guaranteed), which can now be sent without any problems.
About the Algorithm
Huffman coding, an optimal prefix-free binary code, is a well-known data compression technique designed by David A. Huffman in 1952. The algorithm assigns shorter codes to frequently occurring symbols and longer codes to rare ones, which yields a more compact representation of the data. It does this by constructing a binary tree, known as the Huffman tree, in which each leaf node corresponds to a symbol. A symbol's binary code is read off the path from the root to its leaf, so more frequent symbols sit closer to the root and receive shorter codes. To build the tree, the algorithm uses a priority queue to repeatedly merge the two nodes with the lowest frequencies until a single root node remains. The resulting codes are prefix-free: no code is a prefix of another. This property simplifies decoding, because each symbol in the compressed data can be identified unambiguously without separators. Huffman coding is widely used in practice, notably in file compression formats such as ZIP (whose DEFLATE method combines LZ77 with Huffman coding) and in network protocols where efficient data transmission is crucial.
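The steps described above (count frequencies, merge the two lowest-frequency nodes through a priority queue, read codes off the tree, decode by walking the tree) can be sketched in C++ as follows. This is a minimal illustration, not the code from this repository: the names `Node`, `HuffmanTree`, `buildTree`, `assignCodes`, `encode`, and `decode` are hypothetical, and bits are kept as a string of `'0'`/`'1'` characters for readability, whereas a real tool would presumably pack them into bytes and store the tree or frequency table alongside the data.

```cpp
#include <cstddef>
#include <functional>
#include <map>
#include <queue>
#include <string>
#include <utility>
#include <vector>

// Hypothetical types for this sketch; not taken from the repository.
struct Node {
    char symbol;       // meaningful only for leaves
    std::size_t freq;  // total frequency of the symbols below this node
    int left;          // index into HuffmanTree::nodes, -1 for a leaf
    int right;         // index into HuffmanTree::nodes, -1 for a leaf
};

struct HuffmanTree {
    std::vector<Node> nodes;
    int root = -1;     // stays -1 for empty input
};

HuffmanTree buildTree(const std::string& text) {
    HuffmanTree tree;
    std::map<char, std::size_t> freq;
    for (char c : text) ++freq[c];

    // Min-heap of (frequency, node index): the lowest frequency is on top.
    using Entry = std::pair<std::size_t, int>;
    std::priority_queue<Entry, std::vector<Entry>, std::greater<Entry>> heap;
    for (const auto& kv : freq) {
        tree.nodes.push_back(Node{kv.first, kv.second, -1, -1});
        heap.push(Entry(kv.second, static_cast<int>(tree.nodes.size()) - 1));
    }
    // Merge the two lowest-frequency nodes until a single root remains.
    while (heap.size() > 1) {
        Entry a = heap.top(); heap.pop();
        Entry b = heap.top(); heap.pop();
        tree.nodes.push_back(Node{'\0', a.first + b.first, a.second, b.second});
        heap.push(Entry(a.first + b.first, static_cast<int>(tree.nodes.size()) - 1));
    }
    if (!heap.empty()) tree.root = heap.top().second;
    return tree;
}

// Walk the tree: going left appends '0', going right appends '1'.
void assignCodes(const HuffmanTree& t, int idx, const std::string& prefix,
                 std::map<char, std::string>& codes) {
    if (idx < 0) return;
    const Node& n = t.nodes[idx];
    if (n.left < 0) {
        // Leaf. If the input has one distinct symbol, the root itself is a
        // leaf, so give it a one-bit code instead of an empty one.
        codes[n.symbol] = prefix.empty() ? "0" : prefix;
        return;
    }
    assignCodes(t, n.left, prefix + "0", codes);
    assignCodes(t, n.right, prefix + "1", codes);
}

std::string encode(const std::string& text,
                   const std::map<char, std::string>& codes) {
    std::string bits;
    for (char c : text) bits += codes.at(c);
    return bits;
}

// Because the codes are prefix-free, reaching a leaf always ends one symbol.
std::string decode(const HuffmanTree& t, const std::string& bits) {
    std::string out;
    if (t.root < 0) return out;
    if (t.nodes[t.root].left < 0) {  // single-symbol input: one bit per symbol
        out.assign(bits.size(), t.nodes[t.root].symbol);
        return out;
    }
    int idx = t.root;
    for (char b : bits) {
        idx = (b == '0') ? t.nodes[idx].left : t.nodes[idx].right;
        if (t.nodes[idx].left < 0) {
            out += t.nodes[idx].symbol;
            idx = t.root;
        }
    }
    return out;
}
```

For the input `"abracadabra"`, building the tree, calling `assignCodes(tree, tree.root, "", codes)`, and then `encode` produces 23 bits, compared with 88 bits at 8 bits per character, and `decode` restores the original string. The exact bit patterns depend on how frequency ties are broken, but the total length does not.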