https://github.com/nolze/msoffcrypto-tool
Python tool and library for decrypting and encrypting MS Office files using passwords or other keys
command-line decryption doc docx encryption ms-offcrypto ole ooxml ppt pptx xls xlsx
- Host: GitHub
- URL: https://github.com/nolze/msoffcrypto-tool
- Owner: nolze
- License: mit
- Created: 2015-09-29T16:40:17.000Z (about 9 years ago)
- Default Branch: master
- Last Pushed: 2024-05-02T05:46:06.000Z (8 months ago)
- Last Synced: 2024-05-20T09:05:10.874Z (7 months ago)
- Topics: command-line, decryption, doc, docx, encryption, ms-offcrypto, ole, ooxml, ppt, pptx, xls, xlsx
- Language: Python
- Homepage: https://msoffcrypto-tool.readthedocs.io/
- Size: 647 KB
- Stars: 525
- Watchers: 25
- Forks: 83
- Open Issues: 7
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md
- License: LICENSE.txt
- Security: .github/SECURITY.md
# msoffcrypto-tool
[![PyPI](https://img.shields.io/pypi/v/msoffcrypto-tool.svg)](https://pypi.org/project/msoffcrypto-tool/)
[![PyPI downloads](https://img.shields.io/pypi/dm/msoffcrypto-tool.svg)](https://pypistats.org/packages/msoffcrypto-tool)
[![build](https://github.com/nolze/msoffcrypto-tool/actions/workflows/ci.yaml/badge.svg)](https://github.com/nolze/msoffcrypto-tool/actions/workflows/ci.yaml)
[![Coverage Status](https://codecov.io/gh/nolze/msoffcrypto-tool/branch/master/graph/badge.svg)](https://codecov.io/gh/nolze/msoffcrypto-tool)
[![Documentation Status](https://readthedocs.org/projects/msoffcrypto-tool/badge/?version=latest)](http://msoffcrypto-tool.readthedocs.io/en/latest/?badge=latest)

msoffcrypto-tool is a Python tool and library for decrypting and encrypting MS Office files using a password or other keys.
## Contents
* [Installation](#installation)
* [Examples](#examples)
* [Supported encryption methods](#supported-encryption-methods)
* [Tests](#tests)
* [Todo](#todo)
* [Resources](#resources)
* [Use cases and mentions](#use-cases-and-mentions)
* [Contributors](#contributors)
* [Credits](#credits)

## Installation
```
pip install msoffcrypto-tool
```

## Examples
### As CLI tool (with password)
#### Decryption
Specify the password with the `-p` flag:
```
msoffcrypto-tool encrypted.docx decrypted.docx -p Passw0rd
```

If you omit the value of the `-p` argument, you are prompted for the password:
```bash
$ msoffcrypto-tool encrypted.docx decrypted.docx -p
Password:
```

To check whether a file is encrypted, use the `-t` (`--test`) flag:
```
msoffcrypto-tool document.doc --test -v
```

It returns `1` if the file is encrypted, `0` if not.
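The decryption command shown earlier in this section can also be driven from a Python script, for example to process several files. The following is a minimal sketch, assuming `msoffcrypto-tool` is installed and on `PATH`; `build_decrypt_argv` and `decrypt_with_cli` are hypothetical helper names, not part of the package.

```python
import subprocess


def build_decrypt_argv(src, dst, password):
    """Return the argument list for `msoffcrypto-tool SRC DST -p PASSWORD`."""
    return ["msoffcrypto-tool", str(src), str(dst), "-p", password]


def decrypt_with_cli(src, dst, password):
    """Run the CLI (assumed to be on PATH) to decrypt src into dst."""
    # Passing a list without shell=True avoids shell quoting issues
    # when the password contains spaces or special characters.
    # check=True raises CalledProcessError if the exit status is non-zero.
    subprocess.run(build_decrypt_argv(src, dst, password), check=True)
```

Building the argument list separately keeps the command easy to inspect or log before anything is executed. Note that a password passed this way may be visible to other users in the process list, so the prompt mode or the library API may be preferable on shared machines.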
#### Encryption (OOXML only, experimental)
> [!IMPORTANT]
> The encryption feature is experimental. Please use it at your own risk.

To password-protect a document, use the `-e` flag along with the `-p` flag:
```
msoffcrypto-tool -e -p Passw0rd plain.docx encrypted.docx
```

### As library
The library functions support passwords as well as additional key types.
#### Decryption
Basic usage:
```python
import msoffcrypto

encrypted = open("encrypted.docx", "rb")
file = msoffcrypto.OfficeFile(encrypted)

file.load_key(password="Passw0rd")  # Use password

with open("decrypted.docx", "wb") as f:
    file.decrypt(f)

encrypted.close()
```

In-memory:
```python
import msoffcrypto
import io
import pandas as pd

decrypted = io.BytesIO()

with open("encrypted.xlsx", "rb") as f:
    file = msoffcrypto.OfficeFile(f)
    file.load_key(password="Passw0rd")  # Use password
    file.decrypt(decrypted)

df = pd.read_excel(decrypted)
print(df)
```

Advanced usage:
```python
import binascii

# Verify password before decryption (default: False)
# The ECMA-376 Agile/Standard crypto system allows one to know whether the supplied password is correct before actually decrypting the file
# Currently, the verify_password option is only meaningful for ECMA-376 Agile/Standard Encryption
file.load_key(password="Passw0rd", verify_password=True)

# Use private key
file.load_key(private_key=open("priv.pem", "rb"))

# Use intermediate key (secretKey)
file.load_key(secret_key=binascii.unhexlify("AE8C36E68B4BB9EA46E5544A5FDB6693875B2FDE1507CBC65C8BCF99E25C2562"))

# Check the HMAC of the data payload before decryption (default: False)
# Currently, the verify_integrity option is only meaningful for ECMA-376 Agile Encryption
file.decrypt(open("decrypted.docx", "wb"), verify_integrity=True)
```

Supported key types are:
- Passwords
- Intermediate keys (optional)
- Private keys used for generating escrow keys (escrow certificates) (optional)

See also ["Backdooring MS Office documents with secret master keys"](https://web.archive.org/web/20171008075059/http://secuinside.com/archive/2015/2015-1-9.pdf) for more information on the key types.
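An intermediate key is passed to `load_key(secret_key=...)` as raw bytes, so a hex string has to be decoded first. The sketch below shows one way to validate and decode it using only the standard library; `parse_secret_key` is a hypothetical helper, not part of msoffcrypto-tool, and the expected key length for a given file is not checked here.

```python
import binascii


def parse_secret_key(hex_string):
    """Decode a hex-encoded intermediate key into raw bytes."""
    cleaned = hex_string.strip()  # tolerate surrounding whitespace or newlines
    try:
        return binascii.unhexlify(cleaned)
    except ValueError as exc:  # binascii.Error is a subclass of ValueError
        raise ValueError("secret key must be an even-length hex string") from exc


# The sample key from the advanced usage example
key = parse_secret_key("AE8C36E68B4BB9EA46E5544A5FDB6693875B2FDE1507CBC65C8BCF99E25C2562")
print(len(key))  # 32 bytes, i.e. 256 bits
```

The resulting `key` can then be supplied as the `secret_key` argument in place of the inline `binascii.unhexlify(...)` call.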
#### Encryption (OOXML only, experimental)
> [!IMPORTANT]
> The encryption feature is experimental. Please use it at your own risk.

Basic usage:
```python
from msoffcrypto.format.ooxml import OOXMLFile

plain = open("plain.docx", "rb")
file = OOXMLFile(plain)

with open("encrypted.docx", "wb") as f:
    file.encrypt("Passw0rd", f)

plain.close()
```

In-memory:
```python
from msoffcrypto.format.ooxml import OOXMLFile
import io

encrypted = io.BytesIO()

with open("plain.xlsx", "rb") as f:
    file = OOXMLFile(f)
    file.encrypt("Passw0rd", encrypted)

# Do stuff with encrypted buffer; it contains an OLE container with an encrypted stream
...
```

## Supported encryption methods
### MS-OFFCRYPTO specs
* [x] ECMA-376 (Agile Encryption/Standard Encryption)
  * [x] MS-DOCX (OOXML) (Word 2007-)
  * [x] MS-XLSX (OOXML) (Excel 2007-)
  * [x] MS-PPTX (OOXML) (PowerPoint 2007-)
* [x] Office Binary Document RC4 CryptoAPI
  * [x] MS-DOC (Word 2002, 2003, 2004)
  * [x] MS-XLS ([Excel 2002, 2003, 2007, 2010](https://learn.microsoft.com/en-us/openspecs/office_file_formats/ms-xls/a3ad4e36-ab66-426c-ba91-b84433312068#Appendix_A_22)) (experimental)
  * [x] MS-PPT (PowerPoint 2002, 2003, 2004) (partial, experimental)
* [x] Office Binary Document RC4
  * [x] MS-DOC (Word 97, 98, 2000)
  * [x] MS-XLS (Excel 97, 98, 2000) (experimental)
* [ ] ECMA-376 (Extensible Encryption)
* [x] XOR Obfuscation
  * [x] MS-XLS ([Excel 2002, 2003](https://learn.microsoft.com/en-us/openspecs/office_file_formats/ms-xls/a3ad4e36-ab66-426c-ba91-b84433312068#Appendix_A_21)) (experimental)
  * [ ] MS-DOC (Word 2002, 2003, 2004?)

### Other
* [ ] Word 95 Encryption (Word 95 and prior)
* [ ] Excel 95 Encryption (Excel 95 and prior)
* [ ] PowerPoint 95 Encryption (PowerPoint 95 and prior)

PRs are welcome!
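When triaging files against the formats listed above, it can help to look at the container type first. An unencrypted OOXML file is a ZIP archive, while an encrypted OOXML file is wrapped in an OLE container, as noted in the encryption example. The sketch below classifies a file by its leading bytes; `sniff_container` is a hypothetical helper, not part of msoffcrypto-tool. It is only a hint: legacy binary formats (`.doc`, `.xls`, `.ppt`) are OLE containers whether or not they are encrypted, so this check cannot decide encryption for them.

```python
import io
import zipfile

OLE_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")  # OLE2 compound file signature
ZIP_MAGIC = b"PK\x03\x04"  # ZIP local file header signature


def sniff_container(data):
    """Classify a file by its leading bytes: 'ole', 'zip', or 'unknown'."""
    if data.startswith(OLE_MAGIC):
        return "ole"
    if data.startswith(ZIP_MAGIC):
        return "zip"
    return "unknown"


# Build a small ZIP archive in memory to stand in for a plain OOXML file
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("hello.txt", "hi")

print(sniff_container(buf.getvalue()))  # zip
```

For a file on disk, reading the first 8 bytes (`open(path, "rb").read(8)`) is enough input for this check.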
## Tests
With [coverage](https://github.com/nedbat/coveragepy) and [pytest](https://pytest.org/):
```
poetry install
poetry run coverage run -m pytest -v
```

## Todo
* [x] Add tests
* [x] Support decryption with passwords
* [x] Support older encryption schemes
* [x] Add function-level tests
* [x] Add API documents
* [x] Publish to PyPI
* [x] Add decryption tests for various file formats
* [x] Integrate with more comprehensive projects handling MS Office files (such as [oletools](https://github.com/decalage2/oletools/)?) if possible
* [x] Add the password prompt mode for CLI
* [x] Improve error types (v4.12.0)
* [ ] Add type hints
* [ ] Introduce something like `ctypes.Structure`
* [x] Support OOXML encryption
* [ ] Support other encryption
* [ ] Isolate parser
* [ ] Redesign APIs (v6.0.0)

## Resources
* "Backdooring MS Office documents with secret master keys" [http://secuinside.com/archive/2015/2015-1-9.pdf](https://web.archive.org/web/20171008075059/http://secuinside.com/archive/2015/2015-1-9.pdf)
* Technical Documents
* [MS-OFFCRYPTO] Agile Encryption
* [MS-OFFDI] Microsoft Office File Format Documentation Introduction
* LibreOffice/core
* LibreOffice/mso-dumper
* wvDecrypt
* Microsoft Office password protection - Wikipedia
* office2john.py

## Alternatives
* herumi/msoffice
* DocRecrypt
* Apache POI - the Java API for Microsoft Documents

## Use cases and mentions
### General
* (kudos to maintainers!)
### Corporate
* Workato
* Check Point

### Malware/maldoc analysis
*
*

### CTF
*
*

### In other languages
*
*
*
*

### In publications
* [Excel、データ整理&分析、画像処理の自動化ワザを完全網羅! 超速Python仕事術大全](https://books.google.co.jp/books?id=TBdVEAAAQBAJ&q=msoffcrypto) (伊沢剛, 2022)
* ["Analyse de documents malveillants en 2021"](https://twitter.com/decalage2/status/1435255507846053889), MISC Hors-série N° 24, "Reverse engineering : apprenez à analyser des binaires" (Lagadec Philippe, 2021)
* [シゴトがはかどる Python自動処理の教科書](https://books.google.co.jp/books?id=XEYUEAAAQBAJ&q=msoffcrypto) (クジラ飛行机, 2020)

## Contributors
*
## Credits
* The sample file for XOR Obfuscation is from: