https://github.com/aashrafh/enwik8

An attempt to compress the enwik8 file

bwt bwtool compression compression-application enwik8 huffman linked-list lzw mtf multimedia text-compression variable-length

# About
An attempt to compress [enwik8](https://en.wikipedia.org/wiki/Hutter_Prize), the first 100 MB of an English Wikipedia dump, using LZW (Lempel–Ziv–Welch) and a BZip2-like algorithm, both with variable-length encoding.
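
As a rough illustration of the LZW idea mentioned above, here is a minimal sketch. It is not the code in `encoder.cpp`: the function names `lzw_encode` and `lzw_decode` are hypothetical, codes are kept as plain 32-bit integers, and the variable-length bit packing is left out.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical helper: turn `input` into a sequence of LZW dictionary codes.
// The dictionary starts with all 256 single-byte strings (codes 0..255)
// and gains one new entry each time a code is emitted mid-stream.
std::vector<std::uint32_t> lzw_encode(const std::string& input) {
    std::unordered_map<std::string, std::uint32_t> dict;
    for (int c = 0; c < 256; ++c) {
        dict[std::string(1, static_cast<char>(c))] = static_cast<std::uint32_t>(c);
    }
    std::uint32_t next_code = 256;

    std::vector<std::uint32_t> codes;
    std::string current;  // longest dictionary match found so far
    for (char ch : input) {
        std::string extended = current + ch;
        if (dict.count(extended) != 0) {
            current = std::move(extended);       // keep growing the match
        } else {
            codes.push_back(dict[current]);      // emit the longest match
            dict[extended] = next_code++;        // learn the new string
            current = std::string(1, ch);
        }
    }
    if (!current.empty()) codes.push_back(dict[current]);
    return codes;
}

// Hypothetical helper: rebuild the text from the codes. The decoder grows
// the same dictionary as the encoder, one step behind it.
std::string lzw_decode(const std::vector<std::uint32_t>& codes) {
    if (codes.empty()) return {};
    std::vector<std::string> dict(256);
    for (int c = 0; c < 256; ++c) dict[c] = std::string(1, static_cast<char>(c));
    if (codes[0] >= dict.size()) throw std::invalid_argument("bad first LZW code");

    std::string previous = dict[codes[0]];
    std::string output = previous;
    for (std::size_t i = 1; i < codes.size(); ++i) {
        const std::uint32_t code = codes[i];
        std::string entry;
        if (code < dict.size()) {
            entry = dict[code];
        } else if (code == dict.size()) {
            entry = previous + previous[0];      // code not yet in the table
        } else {
            throw std::invalid_argument("bad LZW code");
        }
        output += entry;
        dict.push_back(previous + entry[0]);     // same entry the encoder added
        previous = std::move(entry);
    }
    return output;
}
```

Round-tripping a string through `lzw_decode(lzw_encode(text))` should return `text` unchanged. A real encoder would also cap or reset the dictionary and pack the codes into variable-width bit fields, which is where most of the size reduction comes from.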

# Results
* LZW:
  * Compression ratio: 2.905
  * Compressed file size: 32 MB
* BZip2-Like:
  * Compression ratio: 3.855
  * Compressed file size: 24 MB
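
To relate the ratios to the file sizes, the small sketch below assumes the ratio is defined as original size divided by compressed size, and that enwik8 is exactly 10^8 bytes. The helper name `compressed_bytes` is hypothetical.

```cpp
// Compressed size in bytes implied by a compression ratio,
// assuming ratio = original size / compressed size.
double compressed_bytes(double original_bytes, double ratio) {
    return original_bytes / ratio;
}
```

Under those assumptions, `compressed_bytes(1e8, 2.905)` is about 34.4 million bytes and `compressed_bytes(1e8, 3.855)` is about 25.9 million bytes. These figures match the 32 MB and 24 MB listed above if the sizes are measured in units of 2^20 bytes and rounded down.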

# How to run
* Compression
  1. Open a terminal in the directory containing the code.
  2. Build the encoder: `g++ -o encoder.exe encoder.cpp`
  3. Run the binary: `./encoder.exe`
* Decompression
  1. Open a terminal in the directory containing the code.
  2. Build the decoder: `g++ -o decoder.exe decoder.cpp`
  3. Run the binary: `./decoder.exe`

# To Do
- [ ] A decoder for the BZip2-like algorithm
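
As a starting point for that decoder, here is a hedged sketch of two inverse stages. It assumes the encoder applies a Burrows–Wheeler transform followed by move-to-front, as bzip2 itself does, and that each block is stored with the row index of the original string (called `primary_index` here). The function names and the block layout are assumptions, not the repository's format, and the Huffman stage is omitted.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <numeric>
#include <stdexcept>
#include <string>
#include <vector>

// Undo move-to-front: each index selects a byte from a recency list,
// and the selected byte then moves to the front of that list.
std::string inverse_mtf(const std::vector<std::uint8_t>& indices) {
    std::vector<unsigned char> table(256);
    std::iota(table.begin(), table.end(), 0);  // initial order 0, 1, ..., 255
    std::string out;
    out.reserve(indices.size());
    for (std::uint8_t idx : indices) {
        const unsigned char symbol = table[idx];
        out.push_back(static_cast<char>(symbol));
        table.erase(table.begin() + idx);
        table.insert(table.begin(), symbol);
    }
    return out;
}

// Undo the Burrows-Wheeler transform given the last column of the sorted
// rotation matrix and the row that holds the original string (assumed layout).
std::string inverse_bwt(const std::string& last_column, std::size_t primary_index) {
    const std::size_t n = last_column.size();
    if (n == 0) return {};
    if (primary_index >= n) throw std::out_of_range("primary_index outside block");

    // starts[c] = number of bytes in the block that sort before byte c.
    std::array<std::size_t, 256> starts{};
    for (unsigned char c : last_column) ++starts[c];
    std::size_t sum = 0;
    for (std::size_t& s : starts) {
        const std::size_t count = s;
        s = sum;
        sum += count;
    }

    // lf[i] = row of the first column holding the same byte occurrence
    // that appears at row i of the last column.
    std::vector<std::size_t> lf(n);
    for (std::size_t i = 0; i < n; ++i) {
        const unsigned char c = static_cast<unsigned char>(last_column[i]);
        lf[i] = starts[c]++;
    }

    // Walk the mapping from the original row, filling the text back to front.
    std::string out(n, '\0');
    std::size_t row = primary_index;
    for (std::size_t k = n; k-- > 0;) {
        out[k] = last_column[row];
        row = lf[row];
    }
    return out;
}
```

For example, the sorted rotations of `banana` have last column `nnbaaa` with the original string in row 3, so `inverse_bwt("nnbaaa", 3)` returns `banana`. A full decoder would run Huffman decoding first, then `inverse_mtf`, then `inverse_bwt` on each block.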