https://github.com/xorbit01/webpalm

🕸️ Crawl in the web network
crawler crawling data data-science datamining go golang hack mining osint redteam spider tool

Host: GitHub
URL: https://github.com/xorbit01/webpalm
Owner: XORbit01
License: gpl-3.0
Created: 2023-04-22T14:47:32.000Z (over 2 years ago)
Default Branch: main
Last Pushed: 2025-03-20T20:28:34.000Z (7 months ago)
Last Synced: 2025-08-02T22:15:46.467Z (2 months ago)
Topics: crawler, crawling, data, data-science, datamining, go, golang, hack, mining, osint, redteam, spider, tool
Language: Go
Homepage:
Size: 5.07 MB
Stars: 371
Watchers: 3
Forks: 39
Open Issues: 1
Metadata Files:
- Readme: README.md
- License: LICENSE

# 🌐 Advanced Web Traversal & Data Extraction 🚀


![GitHub release](https://img.shields.io/github/v/release/XORbit01/webpalm?color=blue&label=release)
![GitHub license](https://img.shields.io/github/license/XORbit01/webpalm?color=green)
![GitHub issues](https://img.shields.io/github/issues/XORbit01/webpalm?color=red)
![GitHub stars](https://img.shields.io/github/stars/XORbit01/webpalm?color=yellow)
![GitHub forks](https://img.shields.io/github/forks/XORbit01/webpalm?color=orange)
![GitHub watchers](https://img.shields.io/github/watchers/XORbit01/webpalm?color=blue)

🔍 **Crawl websites efficiently, extract structured data, and visualize connections.** 🕵️‍♂️





---

## 🗺️ Table of Contents

- [`⚡ Features`](#-features)
- [`📦 Installation`](#-installation)
- [`🚀 Usage`](#-usage)
- [`📌 Examples`](#-examples)
- [`📜 Regex Patterns`](#-regex-patterns)
- [`🤝 Contributing`](#-contributing)

---

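## ⚡ Features

- 🌳 **Structured Web-Tree Generation**: maps a site as a tree of linked pages down to a chosen depth (`-l`)
- 🕵️ **Regex-Based Data Extraction**: collects matches for named patterns from crawled pages (`--regexes`)
- ⚡ **High-Speed Multi-threading**: crawls with a configurable number of workers (`-w`)
- 📂 **Multiple Export Formats**: saves results as JSON, XML, or TXT (`-o`)
- 🎨 **Colorized Output & Robust Error Handling**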

---

## 📦 Installation

### 📥 Download Binary

👉 [Download Latest Release](https://github.com/XORbit01/webpalm/releases/latest)

### 📥 Compile from Source

```sh
git clone https://github.com/XORbit01/webpalm.git
cd webpalm
go build -o webpalm && ./webpalm
```

### 📥 Install via Go

```sh
go install github.com/XORbit01/webpalm/v2@latest
```

---

## 🚀 Usage

```sh
webpalm -h
```

### ⚙️ Common Flags

```yaml
🌎 -i, --include     # Include only specific domains (e.g., google.com, facebook.com)
🔗 -u, --url         # Target website
📏 -l, --level       # Depth of traversal
❌ -x, --exclude     # Exclude status codes (e.g., 404, 500)
💾 -o, --output      # Save results (JSON, XML, TXT)
🚀 -w, --worker      # Multi-threading workers
🔍 --regexes         # Extract data using regex
```

---

## 📌 Examples

### 🌲 Generate a Website Map

```sh
webpalm -u https://example.com -l2
```

### 💬 Extract Comments from Pages

```sh
webpalm -u https://example.com -l1 --regexes comments="\<\!--.*?-->" -o results.json
```

### 🚀 Crawl with Multi-threading

```sh
webpalm -u https://example.com -l3 -w 50
```

### 💾 Export Results

```sh
webpalm -u https://example.com -l2 -o output.xml
```

---

## 📜 Regex Patterns

| 🔍 Purpose   | 📜 Regex Pattern |
|--------------|------------------|
| 📧 Emails    | `[a-zA-Z0-9_.+-]+@[a-zA-Z0-9-]+\.[a-zA-Z0-9-.]+` |
| 💬 Comments  | `\<\!--.*?-->` |
| 🔑 Tokens    | `[a-zA-Z0-9]{32}` |
| 🔐 Passwords | `\bpassword\b.{0,10}` |

📌 *Quote each pattern on the command line and escape characters the shell would otherwise interpret, such as `!`, `*`, or `\`.*

---

## 🤝 Contributing

💡 Pull requests are welcome! Open an issue before major changes.  

📢 **Discord:** `xorbit.`

---
