https://github.com/0xlam/phishsage

PhishSage is a lightweight email triage and phishing-analysis toolkit. Extracts headers, attachments, and links, applies heuristic checks, and produces structured insights.

cybersecurity email-analysis email-security incident-response malware-analysis phishing python3 security-tools soc


Host: GitHub
URL: https://github.com/0xlam/phishsage
Owner: 0xlam
License: mit
Created: 2025-11-15T05:47:49.000Z (8 months ago)
Default Branch: main
Last Pushed: 2026-02-13T08:05:57.000Z (5 months ago)
Last Synced: 2026-02-13T19:57:30.460Z (5 months ago)
Topics: cybersecurity, email-analysis, email-security, incident-response, malware-analysis, phishing, python3, security-tools, soc
Language: Python
Homepage:
Size: 180 KB
Stars: 2
Watchers: 0
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md
- License: LICENSE


# PhishSage

PhishSage is a lightweight phishing-analysis toolkit that parses raw emails, inspects headers, analyzes links and domains with multi-layer heuristics, and outputs structured JSON findings for fast, automated investigation.

[![License: MIT](https://img.shields.io/badge/License-MIT-green.svg)]()
[![Python](https://img.shields.io/badge/Python-3.10%2B-blue.svg)]()
[![Status: Active](https://img.shields.io/badge/Project%20Status-Active-brightgreen.svg)]()

## 1. Core functionality

PhishSage is intentionally minimal and concentrates on these essential capabilities:

* **Header analysis**

  * Extracts normalized sender-related headers (From, Reply-To, Return-Path, Message-ID)
  * Parses SPF, DKIM, and DMARC results from Authentication-Results
  * Performs alignment checks across From, Reply-To, and Return-Path
  * Validates Message-ID domain consistency
  * Detects use of free email providers in Reply-To and Return-Path headers
  * Checks timestamp sanity by comparing the Date header with the first Received hop
  * Looks up WHOIS domain age and flags newly registered or soon-to-expire domains
  * Validates MX records for sender-related domains
  * Queries Spamhaus DBL for sender-related domains
  * Aggregates all findings into structured JSON with merged alerts

* **Attachment processing**

  * Lists attachments with MIME type and size
  * Extracts attachments safely (avoids overwriting existing files)
  * Computes hashes (MD5, SHA1, SHA256)
  * Optionally checks attachments against VirusTotal by SHA256
  * Scans attachments with YARA rules (single files, multiple files, or directories; directories are searched recursively for valid .yar/.yara files)
  * Verbose mode shows matched strings with offsets and hex data

* **Link / URL analysis**

  * Extracts URLs from email bodies or headers
  * Detects URLs using raw IP addresses instead of domains
  * Flags suspicious or uncommon top-level domains (TLDs)
  * Identifies excessive or nested subdomains, ignoring trivial ones (e.g., "www")
  * Recognizes shortened URLs (bit.ly, tinyurl.com, etc.)
  * Calculates Shannon entropy for domain and subdomain to spot obfuscation
  * Performs SSL/TLS certificate inspection (issuer, validity, domain match, expiration)
  * Looks up domain age via WHOIS and flags newly registered or expiring domains
  * Looks up URLs on VirusTotal for threat intelligence
  * Optionally traces redirect chains to uncover hidden destinations
  * Checks for numeric-only registrable domains
  * Detects URLs using commonly abused web platforms and services
  * Flags URLs with excessively deep paths
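
Two of the link heuristics listed above, Shannon entropy and raw-IP detection, are easy to illustrate with the standard library. The following is a minimal sketch, not PhishSage's actual implementation; the function names `shannon_entropy` and `uses_raw_ip` are hypothetical.

```python
import ipaddress
import math
from collections import Counter
from urllib.parse import urlsplit


def shannon_entropy(text: str) -> float:
    """Return the Shannon entropy of `text` in bits per character."""
    if not text:
        return 0.0
    total = len(text)
    counts = Counter(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def uses_raw_ip(url: str) -> bool:
    """Return True if the URL's host is an IPv4 or IPv6 literal."""
    host = urlsplit(url).hostname
    if host is None:
        return False
    try:
        ipaddress.ip_address(host)
    except ValueError:
        return False
    return True


if __name__ == "__main__":
    # A random-looking label scores higher than a dictionary word.
    print(round(shannon_entropy("paypal"), 3))
    print(round(shannon_entropy("x7f9q2kz"), 3))
    print(uses_raw_ip("http://192.0.2.10/login"))
    print(uses_raw_ip("https://example.com/login"))
```

Entropy on its own is a weak signal: short legitimate labels can score high and long phishing labels can score low, so a sketch like this would normally feed a threshold that is tuned alongside the other checks.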

## 2. Installation

### Base Install

Installs core functionality: header analysis and basic email parsing.
```bash
# From PyPI
pip install phishsage

# From GitHub
git clone https://github.com/0xlam/PhishSage.git
cd PhishSage
python3 -m venv venv

# Linux / macOS
source venv/bin/activate

# Windows (PowerShell)
venv\Scripts\Activate.ps1

pip install -e .
```

---

### Optional Extras

Install only what you need:
```bash
# Attachment analysis (YARA scanning, MIME detection)
pip install "phishsage[attachments]"

# Link / URL analysis
pip install "phishsage[links]"

# Everything
pip install "phishsage[all]"
```

---

### VirusTotal API Key

Required if using `--vt-scan` in any mode.
```bash
# Linux / macOS
export VIRUSTOTAL_API_KEY="your_virustotal_api_key"

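# Windows (PowerShell): current session only
$env:VIRUSTOTAL_API_KEY = "your_virustotal_api_key"

# Windows: persist for future sessions (does not affect the current one)
setx VIRUSTOTAL_API_KEY "your_virustotal_api_key"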
```

## 3. CLI Usage

PhishSage provides a command-line interface with three main modes: `headers`, `attachments`, and `links`. Each mode prints a human-readable summary by default; pass `--json` for full structured output, and add `-o FILE` to save that output to a file.

### Main Help

```bash
phishsage -h
```

**Output:**

```
usage: phishsage [-h] {headers,attachments,links} ...

PhishSage

positional arguments:
  {headers,attachments,links}
    headers             Analyze email headers for anomalies or indicators
    attachments         Analyze or extract attachments
    links               Analyze links in email content

options:
  -h, --help            show this help message and exit
```

---

### Header Analysis

```bash
phishsage headers -h
```

**Options:**

```
usage: phishsage headers [-h] -f FILE [-o FILE] [--heuristics] [--enrich [{mx,spamhaus,domain_age,all} ...]] [--json]

options:
  -h, --help            show this help message and exit
  -f, --file FILE       Email file to analyze (.eml)
  -o, --output FILE     Save JSON results to file (use with --json)
  --heuristics          Analyze headers for suspicious patterns and anomalies
  --enrich [{mx,spamhaus,domain_age,all} ...]
                        Add threat-intel enrichment to header analysis (mx, spamhaus, domain_age). Requires --heuristics.
  --json                Output full details in JSON format
```
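
To make the header checks more concrete, here is a minimal sketch of parsing SPF/DKIM/DMARC verdicts from `Authentication-Results` and comparing the From domain with Reply-To and Return-Path. It is an illustration under simplifying assumptions, not PhishSage's code: the helper names (`auth_results`, `domain_of`, `alignment`) and the sample message are hypothetical, and alignment is reduced to an exact domain match rather than an organizational-domain comparison.

```python
import re
from email import message_from_string
from email.utils import parseaddr

# Hypothetical sample message used only for this sketch.
RAW = """\
Authentication-Results: mx.example.net; spf=pass smtp.mailfrom=billing@example.com; dkim=fail header.d=example.com; dmarc=fail header.from=example.com
From: "Billing" <billing@example.com>
Reply-To: support@freemail.example
Return-Path: <bounce@mailer.example.org>
Subject: Invoice

Body
"""


def auth_results(msg):
    """Return the first SPF, DKIM, and DMARC verdicts found, or None if absent."""
    header = " ".join(msg.get_all("Authentication-Results", []))
    verdicts = {}
    for method in ("spf", "dkim", "dmarc"):
        match = re.search(rf"\b{method}=(\w+)", header, re.IGNORECASE)
        verdicts[method] = match.group(1).lower() if match else None
    return verdicts


def domain_of(value):
    """Extract the lowercase domain from an address header, or None."""
    address = parseaddr(value or "")[1]
    return address.rpartition("@")[2].lower() or None


def alignment(msg):
    """Compare Reply-To and Return-Path domains with the From domain.

    A missing header is treated as aligned in this sketch.
    """
    from_domain = domain_of(msg["From"])
    return {
        "from": from_domain,
        "reply_to_aligned": domain_of(msg["Reply-To"]) in (None, from_domain),
        "return_path_aligned": domain_of(msg["Return-Path"]) in (None, from_domain),
    }


msg = message_from_string(RAW)
print(auth_results(msg))
print(alignment(msg))
```

In the sample, both Reply-To and Return-Path point at domains other than `example.com`, so both alignment flags come back `False`.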

---

### Attachment Processing

```bash
phishsage attachments -h
```

**Options:**

```
usage: phishsage attachments [-h] -f FILE [-o FILE] [--list] [--extract DIR] [--hash] [--vt-scan] [--yara PATH [PATH ...]] [--yara-verbose] [--json]

options:
  -h, --help            show this help message and exit
  -f, --file FILE       Email file to analyze (.eml)
  -o, --output FILE     Save JSON results to file (use with --json)
  --list                List attachments only
  --extract DIR         Extract attachments to specified directory
  --hash                Compute hashes (MD5, SHA1, SHA256) for each attachment
  --vt-scan             Check attachments against VirusTotal by SHA256
  --yara PATH [PATH ...]
                        Scan attachments with YARA rules. Paths can be files or directories; directories are scanned recursively for .yar/.yara files.
  --yara-verbose        Show detailed string matches and offsets when YARA rules hit
  --json                Output full details in JSON format
```
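
As a rough illustration of what `--hash` and overwrite-safe `--extract` involve, the sketch below hashes attachment bytes and picks a destination path that never clobbers an existing file. It uses only the standard library; `compute_hashes` and `unique_path` are hypothetical helper names, and the `_1`, `_2` suffix scheme is an assumption rather than PhishSage's documented behavior.

```python
import hashlib
import tempfile
from pathlib import Path


def compute_hashes(data: bytes) -> dict:
    """Return MD5, SHA1, and SHA256 hex digests for the given bytes."""
    return {name: hashlib.new(name, data).hexdigest() for name in ("md5", "sha1", "sha256")}


def unique_path(directory: Path, filename: str) -> Path:
    """Return a path inside `directory` that does not exist yet.

    Directory components in the attachment name are dropped so that a name
    such as "../../evil.sh" cannot escape the target directory.
    """
    safe_name = Path(filename).name or "attachment"
    candidate = directory / safe_name
    stem, suffix = candidate.stem, candidate.suffix
    counter = 1
    while candidate.exists():
        candidate = directory / f"{stem}_{counter}{suffix}"
        counter += 1
    return candidate


if __name__ == "__main__":
    payload = b"example attachment bytes"
    print(compute_hashes(payload))
    with tempfile.TemporaryDirectory() as tmp:
        target = Path(tmp)
        first = unique_path(target, "invoice.pdf")
        first.write_bytes(payload)
        second = unique_path(target, "invoice.pdf")
        print(first.name, second.name)
```

Note that `Path(filename).name` only strips separators recognized by the host platform, so on Linux or macOS a Windows-style name containing backslashes would need additional sanitizing.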

---

### Link / URL Analysis

```bash
phishsage links -h
```

**Options:**

```
usage: phishsage links [-h] -f FILE [-o FILE] [--extract] [--vt-scan] [--check-redirects] [--heuristics]
                       [--enrich [{all,domain_age,certificate,virustotal,redirects} ...]] [--json]

options:
  -h, --help            show this help message and exit
  -f, --file FILE       Email file to analyze (.eml)
  -o, --output FILE     Save JSON results to file (use with --json)
  --extract             Extract URLs from the email body
  --vt-scan             Query VirusTotal for URL reputation
  --check-redirects     Follow HTTP redirects and show chain
  --heuristics          Run phishing detection heuristics (use --enrich to add extra data)
  --enrich [{all,domain_age,certificate,virustotal,redirects} ...]
                        Add extra analysis to heuristics (requires --heuristics)
  --json                Output full details in JSON format
```
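
The subdomain heuristic run by `--heuristics` can be pictured roughly as follows. This is a simplified sketch with several assumptions: the threshold value and the contents of the trivial-subdomain set are made up, the function names are hypothetical, and the registrable domain is naively taken to be the last two labels, which is wrong for suffixes such as `co.uk` (a real implementation would consult the Public Suffix List).

```python
from urllib.parse import urlsplit

# Assumed values for illustration; the project's config.toml may differ.
SUBDOMAIN_THRESHOLD = 3
TRIVIAL_SUBDOMAINS = {"www"}


def meaningful_subdomains(url: str) -> list:
    """Return subdomain labels, ignoring trivial ones such as "www"."""
    host = (urlsplit(url).hostname or "").lower()
    labels = host.split(".")
    # Naive assumption: the last two labels form the registrable domain.
    subdomain_labels = labels[:-2]
    return [label for label in subdomain_labels if label and label not in TRIVIAL_SUBDOMAINS]


def has_excessive_subdomains(url: str, threshold: int = SUBDOMAIN_THRESHOLD) -> bool:
    """Flag URLs whose meaningful subdomain count reaches the threshold."""
    return len(meaningful_subdomains(url)) >= threshold


if __name__ == "__main__":
    print(has_excessive_subdomains("https://www.login.secure.account.example.com/verify"))
    print(has_excessive_subdomains("https://www.example.com/"))
```

Whether the comparison should be `>=` or `>` is itself an assumption here; the point is only that trivial labels are filtered out before counting.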

---

## 4. Configuration

PhishSage stores configuration values in the project config (`config.toml`) or environment variables. The main items you may safely adjust are:

* `VIRUSTOTAL_API_KEY` — API key for VirusTotal scans.
* `MAX_REDIRECTS` — Maximum number of redirects to follow when checking redirect chains.
* `THRESHOLD_YOUNG`, `THRESHOLD_EXPIRING` — Domain age/expiry thresholds (in days). Domains younger than `THRESHOLD_YOUNG` or expiring within `THRESHOLD_EXPIRING` days are flagged as potentially suspicious.
* `ABUSABLE_PLATFORM_DOMAINS`, `SUSPICIOUS_TLDS`, `SHORTENERS` — Heuristic lists used in URL/link analysis.
* `SUBDOMAIN_THRESHOLD`, `TRIVIAL_SUBDOMAINS` — Used for subdomain heuristics to identify excessive or meaningful subdomains.
* `FREE_EMAIL_DOMAINS` — Free email providers that may indicate disposable or less-trusted addresses.
* `DATE_RECEIVED_DRIFT_MINUTES` — Maximum allowed difference between the `Date` header and the first `Received` hop in email headers.

*Note: Only modify thresholds or heuristic lists if you understand the potential impact on false positives and overall detection accuracy.*
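
To show how a threshold such as `DATE_RECEIVED_DRIFT_MINUTES` affects results, here is a minimal sketch of a timestamp sanity check. It is not PhishSage's implementation: the 30-minute value, the helper names, and the sample message are assumptions, and the "first Received hop" is taken to be the bottom-most `Received` header, since each relay prepends its own.

```python
from datetime import timedelta
from email import message_from_string
from email.utils import parsedate_to_datetime

# Assumed value for illustration; the project's default may differ.
DATE_RECEIVED_DRIFT_MINUTES = 30

# Hypothetical sample message used only for this sketch.
RAW = """\
Received: from mx.example.net by inbox.example.org; Mon, 10 Nov 2025 14:06:10 +0000
Received: from sender.example.com by mx.example.net; Mon, 10 Nov 2025 14:05:00 +0000
Date: Mon, 10 Nov 2025 12:00:00 +0000
From: a@example.com
Subject: test

Body
"""


def first_received_time(msg):
    """Return the timestamp of the earliest hop, or None if there is none."""
    received = msg.get_all("Received", [])
    if not received:
        return None
    # The timestamp follows the last semicolon in a Received header.
    stamp = received[-1].rpartition(";")[2].strip()
    return parsedate_to_datetime(stamp)


def date_drift_exceeded(msg, limit_minutes=DATE_RECEIVED_DRIFT_MINUTES):
    """Return True if Date and the first Received hop differ by more than the limit."""
    if msg["Date"] is None:
        return False
    sent = parsedate_to_datetime(msg["Date"])
    first_hop = first_received_time(msg)
    if first_hop is None:
        return False
    return abs(first_hop - sent) > timedelta(minutes=limit_minutes)


msg = message_from_string(RAW)
print(date_drift_exceeded(msg))                     # 125-minute gap vs. 30-minute limit
print(date_drift_exceeded(msg, limit_minutes=180))  # same gap vs. 180-minute limit
```

Raising the limit suppresses alerts for mail that was legitimately queued or delayed, while lowering it catches more forged `Date` headers at the cost of more false positives.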

---

## 5. Scope & Limitations

* **Focused functionality:** PhishSage is not a full mail forensic suite. It prioritizes heuristics, quick triage, and enrichment over deep forensic analysis.
* **Network-dependent checks:** WHOIS, VirusTotal, MX, and SSL inspections rely on external services; results may vary or fail due to connectivity issues or API limits.
* **Attachment processing:** Currently limited to listing, extraction, hashing, YARA rule scanning, and optional VirusTotal lookups by hash. Full heuristic attachment analysis will be introduced in a future release.
* **Output formats:** Human‑readable pretty output is the default. Use `--json` to obtain detailed structured data for all modes.
* **Intended use:** Designed for investigative support and enrichment. Not intended for automated blocking or enforcement in production email systems.
* **Evolving coverage:** Current checks under each section are limited; additional heuristics and enhanced analyses will be added in future releases.

---

## 6. Contributing

Contributions to PhishSage are welcome! You can help improve the project by:

* Adding or refining heuristic checks for headers, attachments, and links.
* Expanding the lists in `config.toml`.
* Improving parsing, normalization, or output handling.
* Reporting bugs or suggesting enhancements.

Before submitting changes, please ensure they are well-tested and maintain the code’s clarity, security, and reliability. Contributions that enhance detection coverage, reduce false positives, or improve usability are particularly appreciated.
