https://github.com/appiumtestdistribution/secure-hulk
Secure-Hulk is a security scanner for Model Context Protocol (MCP) servers and tools. It helps identify potential security vulnerabilities in MCP configurations, such as prompt injection, tool poisoning, cross-origin escalation, data exfiltration, and toxic agent flows.
https://github.com/appiumtestdistribution/secure-hulk
Last synced: 6 months ago
JSON representation
Secure-Hulk is a security scanner for Model Context Protocol (MCP) servers and tools. It helps identify potential security vulnerabilities in MCP configurations, such as prompt injection, tool poisoning, cross-origin escalation, data exfiltration, and toxic agent flows.
- Host: GitHub
- URL: https://github.com/appiumtestdistribution/secure-hulk
- Owner: AppiumTestDistribution
- Created: 2025-05-08T07:24:57.000Z (9 months ago)
- Default Branch: main
- Last Pushed: 2025-06-09T06:08:25.000Z (8 months ago)
- Last Synced: 2025-08-05T15:56:23.886Z (6 months ago)
- Language: TypeScript
- Homepage:
- Size: 760 KB
- Stars: 6
- Watchers: 3
- Forks: 0
- Open Issues: 0
-
Metadata Files:
- Readme: README.md
Awesome Lists containing this project
README
# Secure-Hulk
Security scanner for Model Context Protocol servers and tools.
## Overview
Secure-Hulk is a security scanner for Model Context Protocol (MCP) servers and tools. It helps identify potential security vulnerabilities in MCP configurations, such as prompt injection, tool poisoning, cross-origin escalation, data exfiltration, and toxic agent flows.
## Features
- Scan MCP configurations for security vulnerabilities
- Detect prompt injection attempts
- Identify tool poisoning vulnerabilities
- Check for cross-origin escalation risks
- Monitor for data exfiltration attempts
- **Detect toxic agent flows** - Multi-step attacks that manipulate agents into unintended actions
- **Privilege escalation detection** - Identify attempts to escalate from public to private access
- **Cross-resource attack detection** - Monitor suspicious access patterns across multiple resources
- **Indirect prompt injection detection** - Catch attacks through external content processing
- Generate HTML reports of scan results
- Whitelist approved entities
## Installation
```bash
npm install
npm run build
```
## Usage
### Scanning MCP Configurations
```bash
# Scan well-known MCP configuration paths
npm i secure-hulk
# Scan specific configuration files
secure-hulk scan /path/to/config.json
# Generate HTML report
secure-hulk scan --html report.html /path/to/config.json
# Enable verbose output
secure-hulk scan -v /path/to/config.json
# Output results in JSON format
secure-hulk scan -j /path/to/config.json
```
### Using OpenAI Moderation API for Harmful Content Detection
Secure-Hulk now supports using OpenAI's Moderation API to detect harmful content in entity descriptions. This provides a more robust detection mechanism for identifying potentially harmful, unsafe, or unethical content.
To use the OpenAI Moderation API:
```bash
secure-hulk scan --use-openai-moderation --openai-api-key YOUR_API_KEY /path/to/config.json
```
Options:
- `--use-openai-moderation`: Enable OpenAI Moderation API for prompt injection detection
- `--openai-api-key `: Your OpenAI API key
- `--openai-moderation-model `: OpenAI Moderation model to use (default: 'omni-moderation-latest')
The OpenAI Moderation API provides several advantages:
1. **More accurate detection**: The API uses advanced AI models to detect harmful content, which can catch subtle harmful content that pattern matching might miss.
2. **Categorized results**: The API provides detailed categories for flagged content (hate, harassment, self-harm, sexual content, violence, etc.), helping you understand the specific type of harmful content detected.
3. **Confidence scores**: Each category includes a confidence score, allowing you to set appropriate thresholds for your use case.
4. **Regular updates**: The API is regularly updated to detect new types of harmful content as OpenAI's policies evolve.
The API can detect content in these categories:
- Hate speech
- Harassment
- Self-harm
- Sexual content
- Violence
- Illegal activities
- Deception
If the OpenAI Moderation API check fails for any reason, Secure-Hulk will automatically fall back to pattern-based detection for prompt injection vulnerabilities.
### Using Hugging Face Safety Models for Content Detection
Secure-Hulk now supports Hugging Face safety models for advanced AI-powered content moderation. This provides additional options beyond OpenAI's Moderation API, including open-source models and specialized toxicity detection.
To use Hugging Face safety models:
```bash
secure-hulk scan --use-huggingface-guardrails --huggingface-api-token YOUR_HF_TOKEN /path/to/config.json
```
Options:
- `--use-huggingface-guardrails`: Enable Hugging Face safety models for content detection
- `--huggingface-api-token `: Your Hugging Face API token
- `--huggingface-model `: Specific model to use (default: 'unitary/toxic-bert')
- `--huggingface-threshold `: Confidence threshold for flagging content (default: 0.5)
- `--huggingface-preset `: Use preset configurations: 'toxicity', 'hate-speech', 'multilingual', 'strict'
- `--huggingface-timeout `: Timeout for API calls (default: 10000)
Available models include:
- **unitary/toxic-bert**: General toxicity detection (recommended default)
- **s-nlp/roberta_toxicity_classifier**: High-sensitivity toxicity detection
- **unitary/unbiased-toxic-roberta**: Bias-reduced toxicity detection
Preset configurations:
- `toxicity`: General purpose toxicity detection
- `strict`: High sensitivity for maximum safety
Example with multiple guardrails:
```bash
secure-hulk scan \
--use-openai-moderation --openai-api-key YOUR_OPENAI_KEY \
--use-huggingface-guardrails --huggingface-preset toxicity --huggingface-api-token YOUR_HF_TOKEN \
--use-nemo-guardrails --nemo-guardrails-config-path ./guardrails-config \
/path/to/config.json
```
The Hugging Face integration provides several advantages:
1. **Model diversity**: Choose from multiple specialized safety models
2. **Open-source options**: Use community-developed models
3. **Customizable thresholds**: Fine-tune sensitivity for your use case
4. **Specialized detection**: Models focused on specific types of harmful content
5. **Cost flexibility**: Various pricing options including free tiers
If the Hugging Face API check fails for any reason, Secure-Hulk will log the error and continue with other security checks.
### Inspecting MCP Configurations
```bash
secure-hulk inspect /path/to/config.json
```
### Managing the Whitelist
```bash
# Add an entity to the whitelist
secure-hulk whitelist tool "Calculator" abc123
# Print the whitelist
secure-hulk whitelist
# Reset the whitelist
secure-hulk whitelist --reset
```
## Configuration
### Scan Options
- `--json, -j`: Output results in JSON format
- `--verbose, -v`: Enable verbose output
- `--html `: Generate HTML report and save to specified path
- `--storage-file `: Path to store scan results and whitelist information
- `--server-timeout `: Seconds to wait before timing out server connections
- `--checks-per-server `: Number of times to check each server
- `--suppress-mcpserver-io `: Suppress stdout/stderr from MCP servers
### Whitelist Options
- `--storage-file `: Path to store scan results and whitelist information
- `--reset`: Reset the entire whitelist
- `--local-only`: Only update local whitelist, don't contribute to global whitelist
## Sponsors
Proudly sponsored by LambdaTest
## License
MIT