https://github.com/aashrafh/enwik8
An attempt to compress the enwik8 file
- Host: GitHub
- URL: https://github.com/aashrafh/enwik8
- Owner: aashrafh
- License: mit
- Created: 2020-05-22T20:25:39.000Z (over 4 years ago)
- Default Branch: master
- Last Pushed: 2020-06-04T21:22:24.000Z (over 4 years ago)
- Last Synced: 2024-10-25T14:32:29.722Z (about 2 months ago)
- Topics: bwt, bwtool, compression, compression-application, enwik8, huffman, linked-list, lzw, mtf, multimedia, text-compression, variable-length
- Language: C++
- Homepage:
- Size: 427 KB
- Stars: 4
- Watchers: 2
- Forks: 1
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
# About
An attempt to compress [enwik8](https://en.wikipedia.org/wiki/Hutter_Prize), the first 100 MB of English Wikipedia, using LZW (Lempel–Ziv–Welch) and BZip2-like algorithms with variable-length encoding.

# Results
* LZW:
  * Compression ratio: 2.905
  * Compressed file size: 32 MB
* BZip2-Like:
  * Compression ratio: 3.855
  * Compressed file size: 24 MB
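
To make the LZW numbers above easier to interpret, here is a minimal sketch of textbook LZW with variable-length codes. It is not the code in `encoder.cpp`: the names `lzw_encode`, `variable_length_bits` and `compression_ratio` are hypothetical, the 9-bit starting width is an assumption, and "compression ratio" is assumed to mean original size divided by compressed size.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical helper: textbook LZW over bytes, returning dictionary codes.
std::vector<std::uint32_t> lzw_encode(const std::string& input) {
    // Seed the dictionary with all 256 single-byte strings.
    std::unordered_map<std::string, std::uint32_t> dict;
    for (int b = 0; b < 256; ++b) {
        dict[std::string(1, static_cast<char>(b))] = static_cast<std::uint32_t>(b);
    }

    std::vector<std::uint32_t> codes;
    std::string current;
    for (char ch : input) {
        const std::string extended = current + ch;
        if (dict.count(extended) != 0) {
            current = extended;  // keep growing the match
        } else {
            codes.push_back(dict[current]);  // emit the longest known prefix
            dict.emplace(extended, static_cast<std::uint32_t>(dict.size()));
            current = std::string(1, ch);
        }
    }
    if (!current.empty()) codes.push_back(dict[current]);
    return codes;
}

// Hypothetical helper: bits needed to store `code_count` codes when the code
// width starts at 9 bits (an assumption) and grows with the dictionary.
std::size_t variable_length_bits(std::size_t code_count) {
    std::size_t total_bits = 0;
    std::uint32_t dict_size = 256;
    unsigned width = 9;
    for (std::size_t i = 0; i < code_count; ++i) {
        while ((std::uint32_t{1} << width) <= dict_size) ++width;
        total_bits += width;
        ++dict_size;  // one new dictionary entry per emitted code
    }
    return total_bits;
}

// Assumed definition: ratio = original bits / compressed bits.
double compression_ratio(const std::string& input) {
    const std::size_t out_bits = variable_length_bits(lzw_encode(input).size());
    if (out_bits == 0) return 0.0;
    return static_cast<double>(input.size() * 8) / static_cast<double>(out_bits);
}
```

On the classic test string `TOBEORNOTTOBEORTOBEORNOT` this sketch emits 16 codes, which is 144 bits at 9 bits each against 192 input bits.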
# How to run
* Compression
  1. Open a terminal in the directory containing the code
  2. Generate the binary file using the command: ```g++ -o encoder.exe encoder.cpp```
  3. Run the binary file: ```./encoder.exe```
* Decompression
  1. Open a terminal in the directory containing the code
  2. Generate the binary file using the command: ```g++ -o decoder.exe decoder.cpp```
  3. Run the binary file: ```./decoder.exe```

# To Do
- [ ] A Decoder for the BZip2-Like algorithm
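
As a possible starting point for this item, here is a minimal sketch of two stages such a decoder would likely need, going by the repository topics (`bwt`, `mtf`, `huffman`): an inverse move-to-front step followed by an inverse Burrows–Wheeler transform. Everything here is an assumption rather than the project's design: the function names `inverse_mtf` and `inverse_bwt` are hypothetical, the primary index is assumed to be stored alongside each block, and the Huffman decoding that would run first is left out because the encoder's bitstream layout is not described.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>
#include <cstdint>
#include <numeric>
#include <string>
#include <vector>

// Hypothetical helper: undo move-to-front coding over the byte alphabet.
std::string inverse_mtf(const std::vector<std::uint8_t>& indices) {
    std::vector<unsigned char> table(256);
    std::iota(table.begin(), table.end(), 0);  // table starts as 0, 1, ..., 255

    std::string out;
    out.reserve(indices.size());
    for (std::uint8_t idx : indices) {
        out.push_back(static_cast<char>(table[idx]));
        // Move the symbol just read to the front of the table.
        std::rotate(table.begin(), table.begin() + idx, table.begin() + idx + 1);
    }
    return out;
}

// Hypothetical helper: undo the BWT given the last column and the row index
// of the original string in the sorted rotation matrix (assumed to be stored).
std::string inverse_bwt(const std::string& last, std::size_t primary) {
    const std::size_t n = last.size();
    if (n == 0) return {};

    // starts[c] = number of symbols in `last` that are smaller than c.
    std::array<std::size_t, 256> starts{};
    for (unsigned char c : last) ++starts[c];
    std::size_t sum = 0;
    for (auto& s : starts) {
        const std::size_t count = s;
        s = sum;
        sum += count;
    }

    // lf[i] = row whose first symbol is the occurrence found at last[i].
    std::vector<std::size_t> lf(n);
    for (std::size_t i = 0; i < n; ++i) {
        lf[i] = starts[static_cast<unsigned char>(last[i])]++;
    }

    // Walk the mapping backwards, filling the output from its end.
    std::string out(n, '\0');
    std::size_t row = primary;
    for (std::size_t k = n; k-- > 0;) {
        out[k] = last[row];
        row = lf[row];
    }
    return out;
}
```

For example, the BWT of `banana` is `nnbaaa` with the original string at row 3 of the sorted rotations, so `inverse_bwt("nnbaaa", 3)` returns `banana`.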