https://github.com/devanshbatham/heaptruffle

Mine URLs from Browser's Heap Snapshot for fun and profit

Host: GitHub
URL: https://github.com/devanshbatham/heaptruffle
Owner: devanshbatham
License: mit
Created: 2023-08-06T20:28:56.000Z (almost 3 years ago)
Default Branch: main
Last Pushed: 2023-08-09T09:02:33.000Z (almost 3 years ago)
Last Synced: 2025-03-29T05:12:20.698Z (about 1 year ago)
Language: JavaScript
Homepage:
Size: 312 KB
Stars: 64
Watchers: 2
Forks: 16
Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE

heaptruffle

Mine URLs from Browser's Heap Snapshot for fun and profit

🏗️ Install
⛏️ Usage
💡 How it Works
⚡ Inspiration

![heaptruffle](https://github.com/devanshbatham/heaptruffle/blob/main/static/truffleheap.png?raw=true)

# Installation

Follow these steps to get `heaptruffle` up and running:

1. **Clone the Repository**:
```sh
git clone https://github.com/devanshbatham/heaptruffle
```

2. **Navigate to the Directory**:
```sh
cd heaptruffle
```

3. **Build the Docker Image**:
```sh
docker build -t heaptruffle .
```

4. **Make the script executable and move it to a directory in your PATH**:
```sh
sudo chmod +x heaptruffle
sudo mv heaptruffle /usr/local/bin/heaptruffle
```

Once done, you can invoke `heaptruffle` from any location in your terminal.

# Usage

### Using Docker:

- To run heaptruffle on a single URL:
```sh
docker run -it --rm -v "$PWD":/app/data --name heaptruffle-container heaptruffle --url http://example.com
```

- Or, to run it on a file containing URLs:
```sh
docker run -it --rm -v "$PWD":/app/data --name heaptruffle-container heaptruffle --list urls.txt
```

- Save the output to a file (output.txt):
```sh
docker run -it --rm -v "$PWD":/app/data --name heaptruffle-container heaptruffle --url http://example.com --output /app/data/output.txt
```

- Increase concurrency to fetch URLs faster:
```sh
docker run -it --rm -v "$PWD":/app/data --name heaptruffle-container heaptruffle --list urls.txt --concurrency 10
```

### Using heaptruffle alias (after installation):

- To run heaptruffle on a single URL:
```sh
heaptruffle --url https://example.com
```

- Or, to run it on a file containing URLs:
```sh
heaptruffle --list urls.txt
```

- Increase concurrency to fetch URLs faster:
```sh
heaptruffle --list urls.txt --concurrency 10
```

- Save the output to a file (output.txt):
```sh
heaptruffle --url https://example.com --output output.txt
```

- Use silent mode to suppress the ASCII banner:
```sh
heaptruffle --url https://example.com --silent
```

## Options

| Option | Alias | Type | Description |
|-----------------|-------|----------|---------------------------------------------------------------|
| `--url` | `-u` | `string` | URL address |
| `--list` | `-l` | `string` | File containing list of URLs |
| `--concurrency` | `-c` | `number` | Number of URLs to fetch concurrently (default: 5) |
| `--silent` | `-s` | `boolean`| Silent mode, does not display the ASCII banner (default: false)|
| `--output` | `-o` | `string` | File to save the output |
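
The README does not show how heaptruffle parses these flags, so the following is only a minimal sketch of the option table using Node's built-in `util.parseArgs` (Node 18.3 or later). The function name `parseCli`, the validation rules, and the requirement that either `--url` or `--list` be present are assumptions made for illustration.

```javascript
const { parseArgs } = require('node:util');

// Hypothetical helper: maps the documented flags onto a plain options object.
function parseCli(argv) {
  const { values } = parseArgs({
    args: argv,
    options: {
      url: { type: 'string', short: 'u' },
      list: { type: 'string', short: 'l' },
      // parseArgs only knows 'string' and 'boolean', so the number is converted below.
      concurrency: { type: 'string', short: 'c' },
      silent: { type: 'boolean', short: 's' },
      output: { type: 'string', short: 'o' },
    },
  });

  // Documented default: 5 URLs fetched concurrently.
  const concurrency =
    values.concurrency === undefined ? 5 : Number(values.concurrency);
  if (!Number.isInteger(concurrency) || concurrency < 1) {
    throw new Error('--concurrency must be a positive integer');
  }
  // Assumption: at least one input source is required.
  if (!values.url && !values.list) {
    throw new Error('provide --url or --list');
  }

  return {
    url: values.url,
    list: values.list,
    concurrency,
    silent: values.silent ?? false, // documented default: false
    output: values.output,
  };
}

console.log(parseCli(['--list', 'urls.txt', '--concurrency', '10']));
```

In a real script the argument array would typically be `process.argv.slice(2)`.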

# How it Works

heaptruffle uses Puppeteer, a headless browser automation library, to load web pages and capture a heap snapshot of each page's memory. These heap snapshots are then parsed using the `heapsnapshot-parser` library, allowing heaptruffle to extract URLs and endpoints from them.
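
To make the extraction step concrete, here is a rough sketch that does not use `heapsnapshot-parser` at all. It relies only on the fact that a V8 `.heapsnapshot` file is JSON with a top-level `strings` array, and it filters that array with a regular expression. The function name `extractUrls`, the regular expression, and the length cutoff are assumptions for illustration, not heaptruffle's actual logic.

```javascript
// Sketch only: scans the raw `strings` table of a parsed .heapsnapshot file.
// heaptruffle itself goes through the heapsnapshot-parser library instead.

// Matches absolute http(s) URLs or root-relative paths such as /api/v1/users.
const URL_OR_PATH =
  /^(?:https?:\/\/[^\s"'<>]+|\/[A-Za-z0-9_\-.~%]+(?:\/[A-Za-z0-9_\-.~%]*)*(?:\?[^\s"'<>]*)?)$/;

function extractUrls(snapshot) {
  const strings = Array.isArray(snapshot.strings) ? snapshot.strings : [];
  const found = new Set(); // de-duplicate repeated strings
  for (const s of strings) {
    // Skip non-strings and very long strings (arbitrary cutoff for this sketch).
    if (typeof s === 'string' && s.length < 2048 && URL_OR_PATH.test(s)) {
      found.add(s);
    }
  }
  return [...found].sort();
}

// Hypothetical sample standing in for
// JSON.parse(fs.readFileSync('page.heapsnapshot', 'utf8')).
const sample = {
  strings: [
    '<dummy>',
    'https://example.com/app.js',
    '/api/v1/users',
    'hello world',
    '/api/v1/users',
  ],
};

console.log(extractUrls(sample));
```

A filter this simple will pick up noise (any string that happens to look like a path), so real tooling usually adds further filtering on top.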

The tool takes either a single URL or a file containing a list of URLs as input. To speed things up, it processes several URLs concurrently (5 by default, configurable with `--concurrency`). For each URL, heaptruffle loads the web page, captures a heap snapshot, and analyzes the snapshot to extract relevant paths. The URLs and paths it finds are printed to the console or written to the specified output file.
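
The concurrent processing described above can be sketched as a small worker pool. This is one common way to cap the number of in-flight tasks, not necessarily how heaptruffle does it. `mapWithConcurrency` and `fakeScan` are hypothetical names, and `fakeScan` merely stands in for the "load page, take snapshot, extract paths" step.

```javascript
// Run `worker` over `items`, keeping at most `limit` calls in flight.
// Failures are recorded per item so one bad URL does not abort the batch.
async function mapWithConcurrency(items, limit, worker) {
  const results = new Array(items.length);
  let next = 0; // index of the next unclaimed item

  async function runner() {
    while (next < items.length) {
      const index = next++; // claim an item (safe: JS runs this synchronously)
      try {
        results[index] = { ok: true, value: await worker(items[index]) };
      } catch (error) {
        results[index] = { ok: false, error };
      }
    }
  }

  // Start up to `limit` runners; each pulls items until none remain.
  const count = Math.max(1, Math.min(limit, items.length));
  await Promise.all(Array.from({ length: count }, runner));
  return results;
}

// Hypothetical stand-in for the per-URL work.
async function fakeScan(url) {
  await new Promise((resolve) => setTimeout(resolve, 10));
  return `${url} scanned`;
}

mapWithConcurrency(
  ['https://example.com', 'https://example.org', 'https://example.net'],
  2,
  fakeScan
).then((results) => console.log(results));
```

Results come back in input order regardless of which task finishes first, which keeps the output stable between runs.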

# Inspiration
This tool was inspired by the project [extract-relative-url-heapsnapshot](https://github.com/smiegles/extract-relative-url-heapsnapshot) by [smiegles](https://github.com/smiegles). I built on it and extended its functionality with concurrency, support for multiple URLs, pretty output, the ability to save the results to a file, Dockerization, error handling, and an easy-to-use setup script.
