https://github.com/oelin/coco
A library for creating self-extracting archives in Python.
- Host: GitHub
- URL: https://github.com/oelin/coco
- Owner: oelin
- License: mit
- Created: 2023-02-17T12:29:55.000Z (over 3 years ago)
- Default Branch: main
- Last Pushed: 2023-06-18T09:03:33.000Z (almost 3 years ago)
- Last Synced: 2025-12-27T01:59:08.639Z (5 months ago)
- Topics: data-compression, data-science, self-extracting-archive, serialization
- Language: Python
- Homepage:
- Size: 33.2 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
# Coco
A library for creating self-extracting archives in Python.
## Overview
Coco is a data serialization library specialized for creating self-extracting archives (SEAs), i.e. compressed streams that include their own decoder. SEAs are useful for distributing highly compressed files that rely on uncommon, bespoke or proprietary codecs. The library currently uses pickle as the outermost serialization format; however, this is subject to change.
### Self-extracting Archives
A self-extracting archive is a type of compressed file that also includes the decompression program. A simple example would be an image compressed using LZW with an LZW decoder appended to the end of the file. The primary advantage of SEAs is that they allow for highly optimized, domain-specific compression while maintaining portability.
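To make the idea concrete, here is a minimal sketch of a self-extracting archive built with the standard library only. It is not how Coco works internally: `make_sea`, `extract_sea` and the JSON container layout are hypothetical names chosen for illustration, and zlib stands in for the LZW codec mentioned above.

```py
import base64
import json
import zlib

# Source code of the decoder that travels with the archive (illustrative).
DECODER_SOURCE = """
import zlib

def decode(payload):
    return zlib.decompress(payload)
"""


def make_sea(raw: bytes) -> bytes:
    """Compress `raw` and bundle it with the decoder's source code."""
    container = {
        "decoder": DECODER_SOURCE,
        "payload": base64.b64encode(zlib.compress(raw)).decode("ascii"),
    }
    return json.dumps(container).encode("utf-8")


def extract_sea(archive: bytes) -> bytes:
    """Run the bundled decoder on the bundled payload."""
    container = json.loads(archive.decode("utf-8"))
    namespace = {}
    # exec runs whatever code the archive carries, so only use trusted input.
    exec(container["decoder"], namespace)
    return namespace["decode"](base64.b64decode(container["payload"]))


original = b"self-extracting " * 64
archive = make_sea(original)
restored = extract_sea(archive)
```

The receiver only needs `extract_sea` and never has to know which codec produced the payload, which is the portability benefit described above. The `exec` call is also why extraction must be treated like running untrusted code.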
### Security
One risk of SEAs is that extracting one runs the decoder bundled in the archive, which can be arbitrary code. The risk associated with decompressing an SEA is therefore similar to the risk associated with running third-party executables. For this reason, it's good practice to extract the data in a sandboxed environment.
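Since Coco currently uses pickle as its outer format, the general pickle caveat also applies: loading a pickle can call any callable that the stream names. The sketch below uses only the standard library (not Coco) and a deliberately harmless payload; the `Innocuous` class is a hypothetical name used for illustration.

```py
import pickle


class Innocuous:
    """Illustrative class whose pickled form triggers a call when loaded."""

    def __reduce__(self):
        # On load, pickle calls eval("6 * 7") instead of rebuilding the object.
        # A hostile archive could name any other callable and arguments here.
        return (eval, ("6 * 7",))


blob = pickle.dumps(Innocuous())
# No Innocuous instance comes back; the result of the call does.
result = pickle.loads(blob)
print(result)
```

Nothing in `blob` looks executable to a casual reader, yet loading it evaluates an expression chosen by whoever produced it. This is why archives from untrusted sources should only be extracted inside a sandbox.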
## Usage
Create a decoder/extractor.
```py
import coco


@coco.decoder
def decoder(data):
    # Decompression logic...
    ...
```
Create a self-extracting archive.
```py
archive = coco.encode(data, decoder)
```
Extract the original data.
```py
data = coco.decode(archive)
```
## Installation
```sh
pip install coco
```
## API
#### `coco.encode(data: Data, decoder: Decoder) -> Data`
Takes a compressed data stream and a decoder, returns a self-extracting archive.
#### `coco.decode(data: Data) -> Data`
Takes a self-extracting archive and returns an uncompressed data stream.