Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

Awesome Lists | Featured Topics | Projects

https://github.com/m8sec/pymeta

Utility to download and extract document metadata from an organization. This technique can be used to identify: domains, usernames, software/version numbers and naming conventions.
https://github.com/m8sec/pymeta

extract-metadata information-disclosure metadata metadata-extraction pentest pentest-tool python3

Last synced: about 1 month ago
JSON representation

Utility to download and extract document metadata from an organization. This technique can be used to identify: domains, usernames, software/version numbers and naming conventions.

Awesome Lists containing this project

README

        

# PyMeta


 


 
 

PyMeta is a Python3 rewrite of the tool [PowerMeta](https://github.com/dafthack/PowerMeta), created by [dafthack](https://twitter.com/dafthack) in PowerShell. It uses specially crafted search queries to identify and download the following file types (**pdf, xls, xlsx, csv, doc, docx, ppt, pptx**) from a given domain using Google and Bing scraping.

Once downloaded, metadata is extracted from these files using Phil Harvey's [exiftool](https://sno.phy.queensu.ca/~phil/exiftool/) and added to a ```.csv``` report. Alternatively, Pymeta can be pointed at a directory to extract metadata from files manually downloaded using the ```-dir``` command line argument. See the [Usage](#Usage), or [All Options](#All-Options) section for more information.

#### Why?
Metadata is a common place for penetration testers and red teamers to find: domains, user accounts, naming conventions, software/version numbers, and more!

# Getting Started
### Prerequisites
[Exiftool](https://sno.phy.queensu.ca/~phil/exiftool/) is required and can be installed with:

    **Ubuntu/Kali** - ```apt-get install exiftool -y```

    **Mac OS** - ```brew install exiftool```

### Install:
Install the last stable release from PyPi:
```commandline
pip3 install pymetasec
```

Or, install the most recent code from GitHub:
```
git clone https://github.com/m8sec/pymeta
cd pymeta
python3 setup.py install
```

## Usage
* Search Google and Bing for files within example.com and extract metadata to a csv report:

```pymeta -d example.com```

* Extract metadata from files within the given directory and create csv report:

```pymeta -dir Downloads/```

## All Options
```
options:
-h, --help show this help message and exit
-T MAX_THREADS Max threads for file download (Default=5)
-t TIMEOUT Max timeout per search (Default=8)
-j JITTER Jitter between requests (Default=1)

Search Options:
-s ENGINE, --search ENGINE Search Engine (Default='google,bing')
--file-type FILE_TYPE File types to search (default=pdf,xls,xlsx,csv,doc,docx,ppt,pptx)
-m MAX_RESULTS Max results per type search

Proxy Options:
--proxy PROXY Proxy requests (IP:Port)
--proxy-file PROXY Load proxies from file for rotation

Output Options:
-o DWNLD_DIR Path to create downloads directory (Default: ./)
-f REPORT_FILE Custom report name ("pymeta_report.csv")

Target Options:
-d DOMAIN Target domain
-dir FILE_DIR Pre-existing directory of file
```

## Credit
- Beau Bullock [(@dafthack)](https://twitter.com/dafthack) - [https://github.com/dafthack/PowerMeta](https://github.com/dafthack/PowerMeta)
- Phil Harvey - [https://exiftool.org/](https://exiftool.org/)