https://github.com/zlib-ng/corpora

Common corpora used for lossless compression testing and benchmarking.

## Corpora

This repository contains common corpora used for lossless compression testing and benchmarking.

### Sources

Detailed descriptions of the files in each corpus can be found at the sources listed below.

|Corpus|URL|Notes|
|:-|:-|:-|
|Canterbury|https://corpus.canterbury.ac.nz/|Includes the artificial, calgary, canterbury, large, and miscellaneous corpora.|
|Silesia|http://sun.aei.polsl.pl/~sdeor/index.php?page=silesia||
|Snappy|https://github.com/google/snappy|Test data, with duplicates of files already present in other corpora removed.|
|Neuro|https://github.com/neurolabusc/zlib-bench|NIfTI format brain images.|
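
As a rough illustration of how these files might be used, the sketch below measures a zlib compression ratio for every file under a local corpus directory. It is not part of this repository: the helper names `compression_ratio` and `benchmark_corpus` are hypothetical, it uses Python's standard `zlib` module rather than zlib-ng, and it assumes the corpus has already been checked out to a local directory.

```python
import zlib
from pathlib import Path


def compression_ratio(data: bytes, level: int = 6) -> float:
    """Return original size divided by zlib-compressed size (hypothetical helper)."""
    if not data:
        return 1.0  # Treat empty input as incompressible to avoid a meaningless ratio.
    compressed = zlib.compress(data, level)
    # Round-trip check: lossless compression must restore the input exactly.
    assert zlib.decompress(compressed) == data
    return len(data) / len(compressed)


def benchmark_corpus(corpus_dir: str, level: int = 6) -> dict[str, float]:
    """Map each regular file under corpus_dir to its compression ratio."""
    root = Path(corpus_dir)
    return {
        str(path.relative_to(root)): compression_ratio(path.read_bytes(), level)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }
```

Calling `benchmark_corpus("silesia")`, where `silesia` is an assumed local directory name, would return a mapping from relative file path to ratio. Timing measurements are left out here, since they depend heavily on the machine and the library build under test.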

### License

All files are the works of their respective authors. Please see the sources above for any licensing information.