https://github.com/snesnopic/chisel
Cross‑platform C++ project for lossless recompression/re-encoding of files using a variety of statically compiled libraries.
- Host: GitHub
- URL: https://github.com/snesnopic/chisel
- Owner: Snesnopic
- License: mit
- Created: 2025-09-26T13:56:51.000Z (9 months ago)
- Default Branch: main
- Last Pushed: 2026-06-09T13:37:59.000Z (14 days ago)
- Last Synced: 2026-06-09T14:12:57.246Z (14 days ago)
- Topics: cli, compression, cpp, linux, macos, recompression, windows
- Language: C++
- Homepage:
- Size: 1.39 MB
- Stars: 2
- Watchers: 0
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
# chisel
**chisel** is a CLI tool and library that losslessly optimizes files.
It recursively explores folders, files inside ZIP archives, and **cover art embedded in music files** (ID3, APE tags, etc.), and supports **160+ file formats** by integrating many specialized encoders.
It does NOT change the format of files (even when it would be beneficial to do so), supports checksum verification to confirm that the raw data of files hasn't been altered, and doesn't discard metadata by default.
---
## Installation
### Quick install
The easiest way to install `chisel` is via your package manager.
Chisel is available on [homebrew](https://brew.sh):
```bash
brew update
brew tap snesnopic/tools
brew install chsl
```
It is also available on [winget](https://github.com/microsoft/winget-cli):
```powershell
winget update
winget install Snesnopic.Chisel
```
> The executable is named `chsl` because the name `chisel` is already taken in Homebrew.
### Building from source
If you prefer to compile chisel manually, please follow the build instructions below.
## Requirements
The project fetches all its dependencies automatically via Git submodules.
- **All Platforms:**
- `git` (with LFS support: run `git lfs install` once)
- `cmake` (≥ 3.20)
- `ninja` (recommended)
- Rust toolchain (required for OptiVorbis integration; install via [rustup.rs](https://rustup.rs))
- **Linux:**
- A modern C++23 compiler (GCC ≥ 11 or Clang ≥ 14)
- `build-essential`, `pkg-config`
- `autoconf`, `automake`, `libtool`, `m4`, `nasm`, `yasm` (required by some submodules)
- **macOS:**
- Xcode Command Line Tools (Clang with C++23 support)
- `pkg-config`
- `autoconf`, `automake`, `libtool`, `nasm`, `yasm` (required by some submodules)
- **Windows:**
- Visual Studio 2022 (with MSVC C++23 toolchain), `vcpkg`
---
## Installing dependencies
### Linux
This command installs only the build tools. All libraries are submodules.
```bash
sudo apt-get update
sudo apt-get install -y build-essential cmake ninja-build help2man pkg-config git \
autoconf automake libtool m4 nasm yasm ccache
curl https://sh.rustup.rs -sSf | sh
```
### macOS (Homebrew)
```bash
brew update
brew install cmake ninja pkg-config git autoconf help2man automake libtool nasm yasm
curl https://sh.rustup.rs -sSf | sh
```
### Windows
Ensure that Visual Studio 2022 (with the "Desktop development with C++" workload), Git, and vcpkg are installed and configured.
```powershell
# Download Visual Studio 2022 Community bootstrapper
Invoke-WebRequest "https://aka.ms/vs/17/release/vs_community.exe" -OutFile vs.exe
# Install "Desktop development with C++" workload
.\vs.exe --quiet --wait --norestart --nocache `
--add Microsoft.VisualStudio.Workload.NativeDesktop `
--add Microsoft.VisualStudio.Component.VC.Tools.x86.x64 `
--add Microsoft.VisualStudio.Component.Windows11SDK.22621
# Install Rust toolchain
Invoke-WebRequest https://win.rustup.rs/x86_64 -OutFile rustup-init.exe
.\rustup-init.exe -y
```
---
## Building chisel
### Clone the repository and initialize all submodules:
```bash
git clone https://github.com/Snesnopic/chisel.git
cd chisel
git lfs install
git lfs pull
git submodule update --init --recursive
```
### Configure and build with CMake:
```bash
mkdir build && cd build
cmake .. -DCMAKE_BUILD_TYPE=Release
cmake --build . --config Release
```
> If you have Ninja, you can add `-G "Ninja"` to the first command.
### Opting out of specific encoders
You can also opt out of the OptiVorbis integration (which requires Rust) with `-DENABLE_OPTIVORBIS=OFF`.
You can do the same for MKV optimizations (libmkclean specifically) with `-DENABLE_MATROSKA=OFF`.
## Usage
`./chsl <inputs>... [options]`
**Arguments:**
- `inputs...`
One or more files or directories to process.
Use `-` to read from `stdin`.
**Options:**
- `-h, --help`
Show the help message and exit.
- `--version`
Display program version information and exit.
- `-o, --output <PATH>`
Write optimized files to PATH instead of modifying them in-place.
If the input is `stdin` (-), PATH must be a file.
Otherwise, PATH must be a directory.
- `--report `
Export a final CSV report to the specified file.
- `-r, --recursive`
Recursively scan input folders.
- `-q, --quiet`
Suppress non-error console output (progress bar, results).
- `--dry-run`
Use chisel without replacing original files.
- `--no-meta`
Don't preserve file metadata. (Metadata is preserved by default.)
- `--verify-checksums`
Verify raw checksums before replacing files.
- `--threads `
Number of worker threads to use (default: half of available cores).
- `--log-level `
Set logging verbosity (ERROR, WARNING, INFO, DEBUG, NONE). Default is ERROR.
- `--log-file `
Write logs to the specified file (default: no file logging).
- `--include <PATTERN>`
Process only files matching regex PATTERN. (Can be used multiple times).
- `--exclude <PATTERN>`
Do not process files matching regex PATTERN. (Can be used multiple times).
- `--iterations `
Number of iterations for Zopfli based compression (default: 15).
- `--iterations-large `
Number of iterations for Zopfli on large images (default: 5).
- `--max-tokens `
Number of tokens for FlexiGif compression (default: 10000).
- `--mode `
Select how multiple encoders are applied to a file (`pipe` or `parallel`).
`pipe` (default): Encoders are chained; output of one is input to the next.
`parallel`: All encoders run on the original file; the smallest result is chosen.
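
The filtering and threading options above can be pictured with a small sketch. This is not chisel's actual code: `should_process` and `default_thread_count` are hypothetical helpers, and the sketch assumes that patterns are matched with `std::regex_search` against the whole path and that an `--exclude` match overrides any `--include` match.

```cpp
#include <algorithm>
#include <cassert>
#include <regex>
#include <string>
#include <thread>
#include <vector>

// Hypothetical helper (not part of chisel): decides whether a path passes
// a set of --include / --exclude patterns.
// Assumption: a pattern matches if it is found anywhere in the path.
bool should_process(const std::string& path,
                    const std::vector<std::regex>& includes,
                    const std::vector<std::regex>& excludes) {
    assert(!path.empty());  // an empty path is treated as a caller bug
    auto matches = [&path](const std::regex& re) {
        return std::regex_search(path, re);
    };
    // Assumption: an exclude match always wins over an include match.
    if (std::any_of(excludes.begin(), excludes.end(), matches)) return false;
    // With no include patterns, every non-excluded file is accepted.
    if (includes.empty()) return true;
    return std::any_of(includes.begin(), includes.end(), matches);
}

// Hypothetical helper: "half of available cores", but never less than one.
// hardware_concurrency() may return 0 when the core count is unknown.
unsigned default_thread_count() {
    return std::max(1u, std::thread::hardware_concurrency() / 2);
}
```

Under these assumptions, a run with `--include '\.png$' --exclude 'thumb'` would accept `a/b.png` and skip both `a/thumb.png` and `a/b.jpg`.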
**Examples:**
- `./chsl file.jpg dir/ --recursive --threads 4`
- `./chsl archive.zip`
- `./chsl dir/ --report report.csv`
- `cat file.png | ./chsl - -o out.png`
- `cat file.png | ./chsl - > out.png`
---
## How it works
`chisel` scans the input file(s) to understand their actual format. On Windows, detection is currently based on file extensions, while on Linux and macOS it relies on `libmagic` for accurate MIME type detection. If a relevant `Processor` is found for the input, the file goes through a pipeline with 3 phases:
**Phase 1: Extraction & discovery**
The system identifies files whose compatible `Processor`s are flagged as containers. This includes traditional archives (like ZIP or Tar), PDF documents, and even audio files (like MP3 or FLAC) that contain embedded cover art within their ID3/APE tags. These internal files are extracted to a temporary location and exposed to the pipeline recursively. This means `chisel` is perfectly capable of compressing an image inside a ZIP archive, inside another ZIP archive.
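
The recursive discovery described above can be sketched as a depth-first walk. The `Entry` type and `discover` function are hypothetical and exist only for this illustration: in chisel the children would come from a container `Processor` extracting to a temporary location, whereas here they are stored inline, and only non-container files are collected.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical in-memory stand-in for a file. Not a chisel type.
struct Entry {
    std::string name;
    bool container = false;       // true for ZIP, Tar, tagged audio, ...
    std::vector<Entry> children;  // contents, only used when container == true
};

// Depth-first discovery: every container is opened and its contents are fed
// back into the same routine, so nesting depth does not matter.
// Simplification: only non-container files are reported.
void discover(const Entry& entry, const std::string& prefix,
              std::vector<std::string>& out) {
    assert(!entry.name.empty());
    const std::string path =
        prefix.empty() ? entry.name : prefix + "/" + entry.name;
    if (!entry.container) {
        out.push_back(path);
        return;
    }
    for (const Entry& child : entry.children) {
        discover(child, path, out);
    }
}
```

Calling `discover` on an `outer.zip` that holds `inner.zip`, which in turn holds `img.png`, reports the image under the path `outer.zip/inner.zip/img.png`.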
**Phase 2: Recompression**
All discovered and extracted files are delegated to a thread pool. Each worker thread invokes the `recompress` function of the file's designated `Processor`, if one is available: not all formats are compressible, just as not all formats are containers.
If multiple processors are registered for the same file type, two modes of operation can occur, depending on the `--mode` flag:
* **PIPE (Default):** Processors are chained sequentially. The optimized output of the first processor becomes the input for the next one, in the exact order they are registered in `processor_registry.cpp`.
* **PARALLEL:** Every processor runs its `recompress` function simultaneously on a fresh copy of the original file, and the smallest resulting file is chosen. *(Note: This behavior is likely to be deprecated soon, as PIPE mode typically yields better cumulative results, and scenarios with multiple encoders for the exact same format are rare).*
If, at the end of this phase, the recompressed file is not strictly smaller than its original counterpart, the new file is discarded and the original is preserved.
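
The difference between the two modes, and the final size check, can be sketched as follows. The `Encoder` alias is a hypothetical stand-in for a processor's `recompress` step, the function names are invented for this example, and the "parallel" variant runs sequentially to keep the sketch short.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <utility>
#include <vector>

using Bytes = std::vector<std::uint8_t>;
// Hypothetical stand-in for a Processor's recompress step.
using Encoder = std::function<Bytes(const Bytes&)>;

// PIPE: each encoder consumes the previous encoder's output.
Bytes run_pipe(const Bytes& original, const std::vector<Encoder>& encoders) {
    Bytes current = original;
    for (const Encoder& encode : encoders) current = encode(current);
    return current;
}

// PARALLEL: every encoder sees the original and the smallest output wins.
// (Executed one after another here; real parallelism is omitted.)
Bytes run_parallel(const Bytes& original, const std::vector<Encoder>& encoders) {
    assert(!encoders.empty());
    Bytes best = encoders.front()(original);
    for (std::size_t i = 1; i < encoders.size(); ++i) {
        Bytes candidate = encoders[i](original);
        if (candidate.size() < best.size()) best = std::move(candidate);
    }
    return best;
}

// Final check: keep the new data only if it is strictly smaller.
Bytes keep_if_smaller(const Bytes& original, const Bytes& recompressed) {
    return recompressed.size() < original.size() ? recompressed : original;
}
```

With one encoder that drops a byte and another that halves the input, a 10-byte file shrinks to 4 bytes in pipe mode (10, then 9, then 4) but only to 5 bytes in parallel mode, which illustrates why chaining tends to give better cumulative results.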
**Phase 3: Finalization**
All files that were originally classified as containers, and whose contents were extracted in Phase 1, are now rebuilt. The `Processor` will repack the container using the newly optimized internal files, preserving the original structure.
---
## Adding a new Processor
Extending `chisel` with a new encoder or format requires just a few operations:
1. **Define the Processor:**
Create a new header in `libchisel/include/processors/`, inheriting from `IProcessor`. You must meaningfully implement the required metadata methods:
* `get_name()`
* `get_supported_mime_types()`
* `get_supported_extensions()`
* `can_recompress()`
* `can_extract_contents()`
2. **Implement the core logic:**
Write the implementation in `libchisel/src/processors/`.
* Implement `recompress()`, making sure to respect the `options.preserve_metadata` flag if applicable for your format.
* If your processor is a container, you must override `prepare_extraction()` and `finalize_extraction()`, ensuring the exact structure of the container is restored during finalization.
* *Note:* Implementing the `raw_equal()` method (used to verify that the meaningful content is bit-identical before and after compression) isn't strictly required to run the tool, but all tests run on the GitHub CI workers will execute with the `--verify-checksums` flag enabled, so it is highly recommended.
3. **Register the Processor:**
The final step is to instantiate and register your new class inside the constructor of `ProcessorRegistry` in `libchisel/src/processor_registry.cpp`.
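
The three steps above can be sketched in a self-contained form. Only the method names are taken from this README: `IProcessorSketch`, `FooProcessor`, `RegistrySketch`, `find_by_extension`, and every signature below are assumptions made for illustration, so check the real `IProcessor` header before copying anything.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

// Stand-in for chisel's IProcessor. Only the method NAMES come from the
// README; all signatures here are assumed.
struct IProcessorSketch {
    virtual ~IProcessorSketch() = default;
    virtual std::string get_name() const = 0;
    virtual std::vector<std::string> get_supported_mime_types() const = 0;
    virtual std::vector<std::string> get_supported_extensions() const = 0;
    virtual bool can_recompress() const = 0;
    virtual bool can_extract_contents() const = 0;
};

// Step 1 and 2: a hypothetical processor for an imaginary ".foo" format.
struct FooProcessor final : IProcessorSketch {
    std::string get_name() const override { return "FooProcessor"; }
    std::vector<std::string> get_supported_mime_types() const override {
        return {"application/x-foo"};
    }
    std::vector<std::string> get_supported_extensions() const override {
        return {".foo"};
    }
    bool can_recompress() const override { return true; }
    bool can_extract_contents() const override { return false; }  // not a container
};

// Step 3: stand-in for ProcessorRegistry. Processors are registered in the
// constructor and returned in registration order.
class RegistrySketch {
public:
    RegistrySketch() { processors_.push_back(std::make_unique<FooProcessor>()); }

    std::vector<const IProcessorSketch*> find_by_extension(const std::string& ext) const {
        assert(!ext.empty());
        std::vector<const IProcessorSketch*> found;
        for (const auto& processor : processors_) {
            for (const std::string& candidate : processor->get_supported_extensions()) {
                if (candidate == ext) {
                    found.push_back(processor.get());
                    break;
                }
            }
        }
        return found;
    }

private:
    std::vector<std::unique_ptr<IProcessorSketch>> processors_;
};
```

Keeping registration order in the lookup result mirrors the PIPE behavior described earlier, where processors run in the order they are registered.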
---
## Supported formats
| Category | Format | MIME | Extensions | Libraries |
|---------------|----------------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|---------------------------------------|
| Images | JPEG | image/jpeg, image/jpg | .jpg, .jpeg, .jpe, .jif, .jfif, .jfi, .thm | mozjpeg |
| Images | GIF | image/gif | .gif | gifsicle, flexigif |
| Images | JPEG XL | image/jxl | .jxl | libjxl |
| Images | WebP | image/webp, image/x-webp | .webp | libwebp |
| Images | PNG / Android 9-Patch | image/png, image/apng | .png, .apng | zopflipng |
| Images | TIFF | image/tiff, image/tiff-fx | .tif, .tiff | libtiff |
| Images | TrueVision TGA | image/x-tga, image/tga | .tga, .targa, .icb, .vda, .vst | stb |
| Images | Windows Bitmap | image/bmp, image/x-ms-bmp | .bmp, .dib | bmplib |
| Images | Windows Icon / Cursor | image/x-icon, image/vnd.microsoft.icon | .ico, .cur | internal, bmplib, zopflipng |
| Images | Apple Icon Image | image/x-icns | .icns | internal, zopflipng |
| Images | Tencent Image Container | application/x-gft | .gft | internal |
| Images | Portable Anymap | image/x-portable-anymap, image/x-portable-pixmap | .pnm, .ppm, .pgm | stb (read), internal (write) |
| Images | JPEG 2000 | image/jp2, image/jpx | .jp2, .j2k, .j2c | OpenJPEG |
| Images | MNG / JNG | video/x-mng, image/x-jng | .mng, .jng | Internal (Zopfli, mozjpeg) |
| Images | PCX / DCX | image/x-pcx, image/vnd.zbrush.pcx | .pcx, .dcx, .pcc | Internal (RLE) |
| Images | OpenRaster | image/openraster | .ora | libarchive (ZIP-based) |
| Images | SVG | image/svg+xml | .svg | pugixml |
| Documents | XML Data | application/xml, text/xml, text/xsl, application/xhtml+xml, application/vnd.google-earth.kml+xml, application/gpx+xml, model/vnd.collada+xml, application/rss+xml, application/atom+xml, application/rdf+xml | .xml, .xhtml, .kml, .gpx, .dae, .rss, .atom, .xmp, .xsl, .xslt | pugixml |
| Documents | JSON Data | application/json | .json | yyjson |
| Documents | Microsoft Office OOXML | docx: application/vnd.openxmlformats-officedocument.wordprocessingml.document, application/vnd.openxmlformats-officedocument.wordprocessingml.template<br>xlsx: application/vnd.openxmlformats-officedocument.spreadsheetml.sheet, application/vnd.openxmlformats-officedocument.spreadsheetml.template<br>pptx: application/vnd.openxmlformats-officedocument.presentationml.presentation, application/vnd.ms-powerpoint, application/vnd.openxmlformats-officedocument.presentationml.template | .docx, .docm, .dotm, .dotx, .xlsx, .xlsm, .xltm, .xltx, .pptx, .pptm, .potm, .potx, .ppsm, .ppsx | - |
| Documents | CFBF (Legacy Office / MSI) | application/x-ole-storage, application/msword, application/vnd.ms-excel, application/vnd.ms-powerpoint, application/x-msi, application/x-ms-spool | .doc, .xls, .ppt, .msi, .msp, .mst, .pub, .vsd, .vss, .vst, .adp, .mdb, .mdt, .mpd, .mpp, .mpt, .rvt, .sldasm, .slddrw, .sldprt, .snt, .thumbs.db, .spl, .dot, .xlt, .pps, .chm, .fla, .one, .ost, .rfa, .rte, .wps | - |
| Documents | OpenDocument | odt: application/vnd.oasis.opendocument.text<br>ods: application/vnd.oasis.opendocument.spreadsheet<br>odp: application/vnd.oasis.opendocument.presentation<br>odg: application/vnd.oasis.opendocument.graphics<br>odf: application/vnd.oasis.opendocument.formula<br>odb: application/vnd.oasis.opendocument.database | .odt, .ods, .odp, .odg, .odf, .odb | - |
| Documents | EPUB | application/epub+zip | .epub | libarchive (ZIP-based) |
| Documents | FictionBook | application/x-fictionbook+xml | .fb2 | pugixml |
| Documents | Comic Book | CBZ: application/vnd.comicbook+zip<br>CBT: application/vnd.comicbook+tar | .cbz, .cbt | libarchive |
| Documents | vCard (Photo only) | text/vcard, text/x-vcard | .vcf | TagLib (base64) |
| Documents | XPS | application/vnd.ms-xpsdocument, application/oxps | .xps, .oxps | libarchive (ZIP-based) |
| Documents | Portable Executable (PE) | application/x-msdownload, application/vnd.microsoft.portable-executable | .exe, .dll, .ocx, .scr, .cpl, .sys, .drv, .bpl, .icl, .rll, .vbx | Internal |
| Documents | DWFX | model/vnd.dwfx+xps | .dwfx | libarchive (ZIP-based) |
| Documents | 3MF (3D) | application/vnd.ms-package | .3mf | libarchive (ZIP-based) |
| Documents | KMZ (Google Earth) | application/vnd.google-earth.kmz | .kmz | libarchive (ZIP-based) |
| Audio | FLAC | audio/flac, audio/x-flac | .flac | libFLAC, TagLib |
| Audio | Ogg (FLAC stream) | audio/ogg, audio/oga | .ogg, .oga | libFLAC, libogg |
| Audio | Ogg Vorbis/Opus | audio/ogg, audio/vorbis, audio/opus | .ogg, .opus | OptiVorbis, TagLib (covers) |
| Audio | MP3 | audio/mpeg | .mp3 | mp3packercpp, vbrfix, TagLib (covers) |
| Audio | M4A/MP4 (Cover Art only) | audio/mp4, audio/x-m4a, video/mp4, video/quicktime, video/3gpp, video/3gpp2 | .m4a, .mp4, .m4b, .m4v, .mov, .qt, .3gp, .3g2 | TagLib (covers) |
| Audio | WAV (Cover Art only) | audio/wav, audio/x-wav | .wav | TagLib (covers) |
| Audio | AIFF (Cover Art only) | audio/x-aiff, audio/aiff | .aif, .aiff, .aifc | TagLib (covers) |
| Audio | Monkey's Audio | audio/ape, audio/x-ape | .ape | MACLib, TagLib |
| Audio | WavPack | audio/x-wavpack, audio/x-wavpack-correction | .wv, .wvp, .wvc | wavpack |
| Audio | DSF (DSD Stream File) | audio/dsf, audio/x-dsf | .dsf | TagLib (covers) |
| Audio | DSDIFF | audio/dff, audio/x-dff | .dff | TagLib (covers) |
| Audio | Musepack | audio/musepack, audio/x-musepack | .mpc, .mp+, .mpp | TagLib (covers) |
| Audio | TrueAudio | audio/tta, audio/x-tta | .tta | TagLib (covers) |
| Video / Audio | Matroska / WebM | video/x-matroska, audio/x-matroska, video/webm, audio/webm | .mkv, .mka, .webm | mkclean, TagLib (attachments) |
| Video / Audio | ASF / WMA / WMV | audio/x-ms-wma, video/x-ms-wmv, video/x-ms-asf | .wma, .wmv, .asf | TagLib (covers) |
| Video / Audio | Shockwave Flash | application/x-shockwave-flash | .swf | zlib |
| Archives | Brotli | application/x-brotli, application/brotli | .br | brotli |
| Archives | Zip | application/zip, application/x-zip-compressed | .zip, .air, .bsz, .cdr, .csl, .gallery, .gallerycollection, .galleryitem, .grs, .ita, .itz, .nbk, .notebook, .oex, .osk, .pk3, .puz, .stz, .vlt, .wal, .wba, .wmz, .wsz, .xap, .xl, .xmz, .xsn, .appx, .bar, .dwf, .easm, .rmskin, .sldx, .zipx | libarchive |
| Archives | Tar | application/x-tar | .tar | libarchive |
| Archives | GZip | application/gzip, application/x-gzip | .gz, .tgz, .deb, .ipk, .svgz | libarchive, zopfli |
| Archives | BZip2 | application/x-bzip2 | .bz2 | bzip2 |
| Archives | Xz | application/x-xz | .xz | liblzma |
| Archives | LZMA | application/x-lzma | .lzma | liblzma |
| Archives | ISO | application/x-iso9660-image | .iso | libarchive |
| Archives | CPIO | application/x-cpio | .cpio | libarchive |
| Archives | AR (Static Lib) | application/x-archive | .a, .ar, .lib | libarchive |
| Archives | Zstandard | application/zstd, application/x-zstd | .zst, .tzst, .tar.zst | libzstd, libarchive |
| Archives | JAR | application/java-archive | .jar | libarchive (ZIP-based) |
| Archives | XPI | application/x-xpinstall | .xpi | libarchive (ZIP-based) |
| Archives | APK | application/vnd.android.package-archive | .apk, .ipa, .ipsw | libarchive (ZIP-based) |
| Archives | VSIX / NuGet | application/zip | .vsix, .nupkg | libarchive (ZIP-based) |
| Archives | Java EE | application/java-archive | .war, .ear | libarchive (ZIP-based) |
| Archives | Android Bundle | application/vnd.android.package-archive | .aab | libarchive (ZIP-based) |
| Archives | Tencent Resource DB | application/x-rdb | .rdb | internal |
| Archives | Kanzi | application/x-kanzi | .knz | kanzi |
| Fonts | WOFF | font/woff | .woff | zlib |
| Fonts | WOFF2 | font/woff2 | .woff2 | woff2 |
| Scientific | MSEED | application/vnd.fdsn.mseed | .mseed | libmseed |
| Databases | SQLite | application/vnd.sqlite3, application/x-sqlite3 | .sqlite, .db | sqlite3 |
## Third-party libraries
Chisel is built on top of many third-party libraries. Here is a list of the incredible open-source projects that power its processors:
- **[TagLib](https://github.com/taglib/taglib)**: Audio metadata and cover art extraction.
- **[libarchive](https://github.com/libarchive/libarchive)**: Multi-format archive reading and writing (ZIP, Tar, GZip, etc.).
- **[mozjpeg](https://github.com/mozilla/mozjpeg)**: High-performance JPEG recompression.
- **[libwebp](https://github.com/webmproject/libwebp)**: WebP image encoding and decoding.
- **[libpng](http://www.libpng.org/pub/png/libpng.html)**: PNG image handling.
- **[zopfli](https://github.com/google/zopfli)**: Deflate/PNG optimization.
- **[libflac](https://github.com/xiph/flac)**: Free Lossless Audio Codec handling.
- **[libogg](https://github.com/xiph/ogg)**: Ogg container support.
- **[libjxl](https://github.com/libjxl/libjxl)**: JPEG XL support.
- **[wavpack](https://github.com/dbry/WavPack)**: WavPack lossless audio compressor.
- **[libebml](https://github.com/Matroska-Org/libebml)** & **[libmatroska](https://github.com/Matroska-Org/libmatroska)**: EBML and Matroska container handling.
- **[mkclean](https://github.com/Matroska-Org/libmkclean)**: Matroska/WebM optimization tool.
- **[qpdf](https://github.com/qpdf/qpdf)**: PDF transformation and optimization.
- **[zstd](https://github.com/facebook/zstd)**: ZSTD compression.
- **[xz](https://tukaani.org/xz/)**: LZMA compression.
- **[brotli](https://github.com/google/brotli)**: General-purpose lossless compression.
- **[sqlite3](https://www.sqlite.org/)**: SQLite database engine.
- **[OptiVorbis](https://github.com/OptiVorbis/OptiVorbis)**: Ogg Vorbis recompression (Rust-based).
- **[gifsicle](https://github.com/kohler/gifsicle)** & **[flexigif](https://create.stephan-brumme.com/flexigif-lossless-gif-lzw-optimization/)**: GIF optimization.
- **[libmseed](https://github.com/EarthScope/libmseed)**: miniSEED data format library.
- **[pugixml](https://github.com/zeux/pugixml)**: Light-weight, simple and fast XML parser.
- **[yyjson](https://github.com/ibireme/yyjson)**: High-performance JSON library.
- **[libtiff](http://www.simplesystems.org/libtiff/)**: TIFF image support.
- **[stb](https://github.com/nothings/stb)**: Single-file public domain libraries for images.
- **[OpenJPEG](https://github.com/uclouvain/openjpeg)**: Open-source JPEG 2000 codec.
- **[kanzi-cpp](https://github.com/flanglet/kanzi-cpp)**: Lossless data compressor port.
- **[mp3packercpp](https://github.com/Snesnopic/mp3packercpp)**: mp3packer port.
- **[corrosion](https://github.com/corrosion-rs/corrosion)**: CMake integration for Rust.
## Why?
Chisel exists because the larger, more mature projects that inspired it were each missing something I needed.
Specifically, I wanted a cross-platform utility that is small, doesn't require an interpreter or a shell script to run, and can automatically handle ID3 tags inside music files.
These are the tools I have used, both personally and as research for this project:
- https://nikkhokkho.sourceforge.io/?page=FileOptimizer
- https://github.com/JayXon/Leanify
- https://github.com/T-3B/rhefo
- https://github.com/MartinEesmaa/awesome-compopt
- https://github.com/ajslater/picopt
- https://github.com/Wdavery/minuimus.pl
- https://papas-best.com/optimizer_en