https://github.com/zlib-ng/corpora
Common corpora used for lossless compression testing and benchmarking.
compression corpora testing
- Host: GitHub
- URL: https://github.com/zlib-ng/corpora
- Owner: zlib-ng
- Created: 2020-07-14T16:03:45.000Z (about 5 years ago)
- Default Branch: master
- Last Pushed: 2025-03-18T21:00:40.000Z (7 months ago)
- Last Synced: 2025-05-18T05:14:53.034Z (5 months ago)
- Topics: compression, corpora, testing
- Language: HTML
- Homepage:
- Size: 94.1 MB
- Stars: 4
- Watchers: 3
- Forks: 3
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
## Corpora
This repository contains common corpora used for lossless compression testing and benchmarking.
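As a rough illustration of how corpora like these are typically used, the sketch below compresses every file under a directory with Python's standard `zlib` module and records the compression ratio and elapsed time. It is a minimal sketch, not a tool shipped with this repository: the function names `benchmark_file` and `benchmark_corpus` and the directory layout are assumptions made for the example.

```python
import time
import zlib
from pathlib import Path


def benchmark_file(path, level=6):
    """Compress one file with zlib and report its sizes, ratio, and timing."""
    data = Path(path).read_bytes()
    start = time.perf_counter()
    compressed = zlib.compress(data, level)
    elapsed = time.perf_counter() - start
    # Round-trip check: lossless compression must reproduce the input exactly.
    assert zlib.decompress(compressed) == data
    return {
        "file": str(path),
        "original_bytes": len(data),
        "compressed_bytes": len(compressed),
        # A zlib stream is never empty, so this division is safe.
        "ratio": len(data) / len(compressed),
        "seconds": elapsed,
    }


def benchmark_corpus(corpus_dir, level=6):
    """Run benchmark_file over every regular file under corpus_dir (hypothetical path)."""
    root = Path(corpus_dir)
    return [benchmark_file(p, level) for p in sorted(root.rglob("*")) if p.is_file()]
```

Calling `benchmark_corpus("path/to/corpus", level=9)` on a local checkout would return one result dictionary per file; the path here is a placeholder. A single timed run is noisy, so a serious benchmark would repeat each measurement several times.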
### Sources
Detailed descriptions of the files in each corpus are available from the sources listed below.
|Corpus|URL|Notes|
|:-|:-|:-|
|Canterbury|https://corpus.canterbury.ac.nz/|Includes the artificial, calgary, canterbury, large, and miscellaneous corpora.|
|Silesia|http://sun.aei.polsl.pl/~sdeor/index.php?page=silesia||
|Snappy|https://github.com/google/snappy|Test data, with some duplicates that were already present in other corpora removed.|
|Neuro|https://github.com/neurolabusc/zlib-bench|NIfTI format brain images.|

### License
All files are the works of their respective authors. Please see the sources above for any licensing information.