# File Scraper

Scrape files for sensitive information, and generate an interactive HTML report. Based on Rabin2.

This tool is only as good as your [RegEx](https://github.com/ivan-sincek/file-scraper?tab=readme-ov-file#build-the-template--run) skills.

You can also style your own [report](https://github.com/ivan-sincek/file-scraper/blob/main/src/file_scraper/reports/default.html).

Tested on Kali Linux v2024.2 (64-bit).

Made for educational purposes. I hope it will help!

## Table of Contents

* [How to Install](#how-to-install)
    * [Install Radare2](#install-radare2)
    * [Standard Install](#standard-install)
    * [Build and Install From the Source](#build-and-install-from-the-source)
* [Build the Template & Run](#build-the-template--run)
* [Usage](#usage)
* [Images](#images)

## How to Install

### Install Radare2

On Kali Linux, run:

```bash
apt-get -y install radare2
```

---

On Windows, download and unpack [radareorg/radare2](https://github.com/radareorg/radare2/releases), then add the `bin` directory to the Windows `PATH` environment variable.

---

On macOS, run:

```bash
brew install radare2
```
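
After installing, it can be useful to confirm that the `rabin2` binary (shipped with Radare2) is reachable through `PATH`. The following is a minimal sketch using only the Python standard library; the helper name `find_tool` is hypothetical and not part of File Scraper:

```python
import shutil


def find_tool(name: str = "rabin2"):
    """Return the full path of an executable found on PATH, or None."""
    return shutil.which(name)


path = find_tool()
if path:
    print(f"rabin2 found at: {path}")
else:
    print("rabin2 not found on PATH; install Radare2 first")
```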

### Standard Install

```bash
pip3 install --upgrade file-scraper
```

### Build and Install From the Source

```bash
git clone https://github.com/ivan-sincek/file-scraper && cd file-scraper

python3 -m pip install --upgrade build

python3 -m build

python3 -m pip install dist/file_scraper-4.6-py3-none-any.whl
```
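
Whichever install route is used, the installed version can be checked from Python. This is a small sketch that assumes the distribution is registered under the name `file-scraper` (the name used in the `pip3` command above); the helper `installed_version` is hypothetical:

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package: str = "file-scraper"):
    """Return the installed version string of a distribution, or None."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


found = installed_version()
print(found if found else "file-scraper is not installed in this environment")
```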

## Build the Template & Run

Prepare a template such as [the default template](https://github.com/ivan-sincek/file-scraper/blob/main/src/file_scraper/templates/default.json):

```json
{
    "Auth.":{
        "query":"(?:basic|bearer)\\ ",
        "ignorecase":true,
        "search":true
    },
    "Variables":{
        "query":"(?:access|account|admin|auth|card|conf|cookie|cred|customer|email|history|ident|info|jwt|key|kyc|log|otp|pass|pin|priv|refresh|salt|secret|seed|session|setting|sign|token|transaction|transfer|user)[\\w\\d\\-\\_]*(?:\\\"\\ *\\:|\\ *\\=[^\\=]{1})",
        "ignorecase":true,
        "search":true
    },
    "Comments":{
        "query":"(?:(? [...]
```

## Usage

```
[...]
    [...] = decoded | files | test.exe | etc.
TEMPLATE
    File containing extraction details or a single RegEx to use
    Default: built-in JSON template file
    -t, --template = template.json | "secret\: [\w\d]+" | etc.
EXCLUDES
    Exclude all files ending with the specified extension
    Specify 'default' to load the built-in list
    Use comma-separated values
    -e, --excludes = mp3 | default,jpg,png | etc.
INCLUDES
    Include all files ending with the specified extension
    Overrides the excludes
    Use comma-separated values
    -i, --includes = java | json,xml,yaml | etc.
BEAUTIFY
    Beautify [minified] JavaScript (.js) files
    -b, --beautify
THREADS
    Number of parallel threads to run
    Default: 30
    -th, --threads = 10 | etc.
OUT
    Output file
    -o, --out = results.html | etc.
DEBUG
    Enable debug output
    -dbg, --debug
```
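
Because the results depend on the template's regular expressions, it can help to try a `query` against sample text before a full run. The sketch below loads the `Auth.` and `Variables` entries from the default template and applies them with Python's `re` module. It assumes that `ignorecase` corresponds to `re.IGNORECASE` and it does not interpret the `search` key. The helper `try_template` and the sample text are hypothetical and not part of File Scraper, whose own matching may behave differently:

```python
import json
import re

# Two entries copied from the default template; the raw string keeps
# the JSON backslash escapes intact.
TEMPLATE_JSON = r'''
{
    "Auth.": {
        "query": "(?:basic|bearer)\\ ",
        "ignorecase": true,
        "search": true
    },
    "Variables": {
        "query": "(?:access|account|admin|auth|card|conf|cookie|cred|customer|email|history|ident|info|jwt|key|kyc|log|otp|pass|pin|priv|refresh|salt|secret|seed|session|setting|sign|token|transaction|transfer|user)[\\w\\d\\-\\_]*(?:\\\"\\ *\\:|\\ *\\=[^\\=]{1})",
        "ignorecase": true,
        "search": true
    }
}
'''


def try_template(template: dict, text: str) -> dict:
    """Return {entry name: list of matched substrings} for a piece of text."""
    results = {}
    for name, entry in template.items():
        # Assumption: "ignorecase" maps to re.IGNORECASE.
        flags = re.IGNORECASE if entry.get("ignorecase") else 0
        pattern = re.compile(entry["query"], flags)
        results[name] = [match.group(0) for match in pattern.finditer(text)]
    return results


template = json.loads(TEMPLATE_JSON)
sample = 'Authorization: Bearer abc123\napi_key = "xyz"\n'
results = try_template(template, sample)
for name, matches in results.items():
    print(name, matches)
```

A query that matches too much or too little here will likely do the same in the report, so adjusting it against a few representative strings first is cheaper than re-running a scan.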

## Images

Figure 1 - Interactive Report (1)

Figure 2 - Interactive Report (2)

Figure 3 - Interactive Report (3)