https://github.com/mozillasecurity/corpus-replicator
A corpus generation tool
- Host: GitHub
- URL: https://github.com/mozillasecurity/corpus-replicator
- Owner: MozillaSecurity
- License: mpl-2.0
- Created: 2023-06-13T23:32:39.000Z (over 2 years ago)
- Default Branch: main
- Last Pushed: 2025-03-26T17:11:24.000Z (6 months ago)
- Last Synced: 2025-05-07T08:14:18.210Z (5 months ago)
- Topics: corpus, fuzzing, media, test
- Language: Python
- Homepage:
- Size: 42 KB
- Stars: 22
- Watchers: 6
- Forks: 3
- Open Issues: 1
Metadata Files:
- Readme: README.md
- License: LICENSE.md
- Code of conduct: CODE_OF_CONDUCT.md
- Codeowners: .github/CODEOWNERS
README
Corpus Replicator
=================
[Taskcluster CI status](https://community-tc.services.mozilla.com/api/github/v1/repository/MozillaSecurity/corpus-replicator/main/latest)
[Matrix chat](https://matrix.to/#/#fuzzing:mozilla.org)
[PyPI package](https://pypi.org/project/corpus-replicator)

Corpus Replicator is a corpus generation tool that enables the creation of multiple
unique output files based on templates. The primary intended use case is the
creation of a seed corpus that can be used by fuzzers. Support for additional output
formats can be added via the creation of `Recipes`. If a desired format is unsupported,
support can be added via the creation of a `CorpusGenerator`.

The goal is to create an efficient corpus that maximizes code coverage and minimizes
file size. Small unique files that execute quickly are preferred.

Currently four media types can be generated: `animation`, `audio`, `image` and
`video`.

Requirements
------------

Corpus Replicator relies on [FFmpeg](https://ffmpeg.org/).

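
Since FFmpeg is an external dependency, it can help to confirm it is reachable before generating a corpus. The following is a minimal sketch, not part of Corpus Replicator: the helper name `ffmpeg_available` is hypothetical, and it assumes the tool locates an executable named `ffmpeg` on `PATH`.

```python
import shutil


def ffmpeg_available(binary: str = "ffmpeg") -> bool:
    """Return True if the named executable can be found on PATH."""
    # shutil.which returns the resolved path, or None when nothing is found.
    return shutil.which(binary) is not None


if not ffmpeg_available():
    print("FFmpeg was not found on PATH; install it before generating a corpus.")
```
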
Installation
------------
```
pip install corpus-replicator
```

Example
-------

This is an example `recipe` file.

```yaml
# "base" contains required entries and default flags
base:
  codec: "h264"        # name of the codec
  container: "mp4"     # container/file extension
  library: "libx264"   # name of library
  medium: "video"      # supported medium
  tool: "ffmpeg"       # name of supported tool
  default_flags:
    encoder:           # "encoder" flag group
      ["-c:v", "libx264"]
    resolution:        # "resolution" flag group
      ["-s", "320x240"]

# variations allow flags to be added and overwritten
# one file will be generated for each entry in a flag group
variation:
  resolution:          # flag group - overwrites default flag group in "base"
    - ["-s", "640x480"]
    - ["-s", "32x18"]
    - ["-s", "64x64"]
  monochrome:          # flag group - adds new flag group
    - ["-vf", "hue=s=0"]
```

Running the recipe will generate a corpus:

```
$ corpus-replicator example.yml video -t test
Generating templates...
1 recipe(s) will be used with 1 template(s) to create 4 file(s).
Generating 4 'video/libx264/h264/mp4' file(s) using template 'test'...
Optimizing corpus, checking for duplicates...
Done.
```

Resulting corpus:

```
$ ls generated-corpus/
video-h264-libx264-test-monochrome-00.mp4
video-h264-libx264-test-resolution-01.mp4
video-h264-libx264-test-resolution-00.mp4
video-h264-libx264-test-resolution-02.mp4
```

A more complex corpus can be generated by using multiple `Recipes` and `Templates` at
once.

Recipes are stored in [src/corpus_replicator/recipes](/src/corpus_replicator/recipes/).
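
To illustrate why the example recipe yields four files, the sketch below expands its `variation` section into one flag list per entry. It is not the Corpus Replicator implementation: the `expand` function and the `RECIPE` dictionary are hypothetical, the recipe is written as plain Python data to avoid a YAML dependency, and the merge rule (start from `default_flags`, then overwrite or add a single flag group) is an assumption inferred from the comments in the recipe. The file name pattern is likewise inferred from the listing in the example.

```python
from typing import Any, Dict, Iterator, List, Tuple

# The example recipe as plain Python data (hypothetical stand-in for the YAML file).
RECIPE: Dict[str, Any] = {
    "base": {
        "codec": "h264",
        "container": "mp4",
        "library": "libx264",
        "medium": "video",
        "tool": "ffmpeg",
        "default_flags": {
            "encoder": ["-c:v", "libx264"],
            "resolution": ["-s", "320x240"],
        },
    },
    "variation": {
        "resolution": [["-s", "640x480"], ["-s", "32x18"], ["-s", "64x64"]],
        "monochrome": [["-vf", "hue=s=0"]],
    },
}


def expand(recipe: Dict[str, Any], template: str) -> Iterator[Tuple[str, List[str]]]:
    """Yield (file name, flags) for every entry in every variation flag group."""
    base = recipe["base"]
    for group, entries in recipe["variation"].items():
        for idx, entry in enumerate(entries):
            # Assumption: start from the defaults, then overwrite or add one group.
            flags = dict(base["default_flags"])
            flags[group] = entry
            name = (
                f"{base['medium']}-{base['codec']}-{base['library']}-"
                f"{template}-{group}-{idx:02d}.{base['container']}"
            )
            # Flatten the flag groups into a single argument list.
            yield name, [arg for group_flags in flags.values() for arg in group_flags]


for name, flags in expand(RECIPE, "test"):
    print(name, " ".join(flags))
```

Three `resolution` entries plus one `monochrome` entry give the four files reported in the example run. The real tool also checks for duplicates after generation, so the final set of files may differ from a plain expansion like this one.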