https://github.com/bannsec/extractor
Universal extraction tool
- Host: GitHub
- URL: https://github.com/bannsec/extractor
- Owner: bannsec
- Created: 2017-02-14T08:18:19.000Z (almost 9 years ago)
- Default Branch: master
- Last Pushed: 2017-02-21T02:24:33.000Z (almost 9 years ago)
- Last Synced: 2025-06-21T07:42:42.469Z (6 months ago)
- Topics: python, unarchive
- Language: Python
- Size: 31.3 KB
- Stars: 3
- Watchers: 2
- Forks: 1
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# Install
From the root of the cloned repository, run `pip install .`
# Use
Make sure you're in the python environment you installed it to. Then:
```bash
$ extract -h
usage: extract [-h] [-rm] file

Universal Extractor

positional arguments:
  file        The file to extract

optional arguments:
  -h, --help  show this help message and exit
  -rm         Should we remove the source file after extract? (defualt: False)
```
# Extending
Found something that isn't handled? Adding a handler is easy. When extractor fails, it reports the MIME type it did not know how to handle:
```bash
$ extract carry.c.lzma
ERROR:extract:No handler available for type LZMA compressed data, streamed (application/x-lzma)
```
To write a handler, create a module named after the MIME type, replacing hyphens with underscores so the name is a valid Python identifier. In this case, the MIME type is `application/x-lzma`, so the handler lives at `extract/handlers/application/x_lzma/__init__.py`. This file must define a class named `handle` that extends `handleBaseClass` and exposes an `extract` method. The `extract` method must call the parent class's `extract` at the end of execution.
Example:
```python
import logging
import lzma

from extract.handlers import handleBaseClass

logger = logging.getLogger('extract.handlers.application.x_lzma')


class handle(handleBaseClass):

    def extract(self):
        config = self.config

        # Open the compressed file and write its contents next to the source
        with lzma.LZMAFile(config['fileName']) as l:
            with open(config['fileName'] + "_extracted", "wb") as f:
                f.write(l.read())

        # Call parent handler
        handleBaseClass.extract(self)
```
That's it. The handler will now be automatically discovered and called.
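The README does not show how discovery is implemented. As a rough sketch of how the naming convention above could map a MIME type to a handler module, the following uses `importlib`. The helper names `mime_to_module` and `load_handler` are hypothetical and are not part of extractor; only the package path `extract.handlers` and the class name `handle` come from the text above.

```python
import importlib


def mime_to_module(mime_type, package="extract.handlers"):
    """Map a MIME type to a dotted module name (hypothetical helper).

    Assumed convention, inferred from the README example:
    '/' separates packages and '-' becomes '_'.
    """
    parts = [part.replace("-", "_") for part in mime_type.split("/")]
    return ".".join([package] + parts)


def load_handler(mime_type):
    """Return the `handle` class for a MIME type, or None if none is found."""
    try:
        module = importlib.import_module(mime_to_module(mime_type))
    except ImportError:
        # No module exists for this MIME type (or the package is not installed)
        return None
    return getattr(module, "handle", None)


print(mime_to_module("application/x-lzma"))
```

Under this assumption, a missing module corresponds to the "No handler available" error shown earlier, and dropping a new `__init__.py` into the right directory is enough for the lookup to succeed.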
For an example of calling multiple options in sequence, check out [x-compress](extract/handlers/application/x_compress/__init__.py).
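Before wiring a new handler into extractor, it can help to check the extraction logic on its own. The sketch below repeats the decompression step of the LZMA handler as a standalone function and round-trips a small payload through a temporary directory. The function names `extract_lzma` and `round_trip` are illustrative only, and writing the test file in the legacy `.lzma` format is an assumption based on the `application/x-lzma` type in the error message.

```python
import lzma
import os
import tempfile


def extract_lzma(file_name):
    """Decompress file_name into file_name + '_extracted'; return the new path."""
    out_name = file_name + "_extracted"
    with lzma.open(file_name, "rb") as src, open(out_name, "wb") as dst:
        dst.write(src.read())
    return out_name


def round_trip(payload):
    """Compress payload to a temporary .lzma file, extract it, return the bytes."""
    with tempfile.TemporaryDirectory() as tmp:
        source = os.path.join(tmp, "carry.c.lzma")
        # FORMAT_ALONE produces the legacy .lzma container
        with lzma.open(source, "wb", format=lzma.FORMAT_ALONE) as f:
            f.write(payload)
        with open(extract_lzma(source), "rb") as f:
            return f.read()


data = b"int main(void) { return 0; }\n"
print(round_trip(data) == data)
```

Once the standalone function behaves as expected, its body can be moved into the handler's `extract` method, reading the path from `self.config['fileName']`.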