https://github.com/KOKOSde/localmod

Self-hosted content moderation API that edges out Amazon Comprehend on the toxicity benchmark below (0.75 vs. 0.74 balanced accuracy). Runs 100% offline, so your data never leaves your server. Covers text and image moderation.
Host: GitHub
URL: https://github.com/KOKOSde/localmod
Owner: KOKOSde
License: mit
Created: 2025-12-27T00:37:26.000Z (7 months ago)
Default Branch: main
Last Pushed: 2026-01-01T09:03:52.000Z (7 months ago)
Last Synced: 2026-01-06T01:58:44.119Z (6 months ago)
Topics: content-moderation, docker, fastapi, image-moderation, llm-security, machine-learning, nsfw-detection, offline-first, pii-detection, privacy, prompt-injection, self-hosted, spam-detection, toxicity-detection
Language: Python
Homepage:
Size: 182 KB
Stars: 9
Watchers: 0
Forks: 3
Open Issues: 7
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md
- Contributing: CONTRIBUTING.md
- License: LICENSE
- Codeowners: .github/CODEOWNERS

Awesome Lists containing this project

awesome-ai-security - LocalMod - _Self-hosted content moderation API with prompt injection detection, toxicity filtering, PII detection, and NSFW classification. Runs 100% offline._ (Defense & Security Controls / Input/Output Guardrails)

README

# LocalMod

[![Python 3.8+](https://img.shields.io/badge/python-3.8+-blue.svg)](https://www.python.org/downloads/)

[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)

[![Tests](https://img.shields.io/badge/tests-109%20passed-brightgreen.svg)]()

**Fully offline content moderation API**: free, self-hosted, and private. Your data never leaves your infrastructure.

---

## Benchmark Results

### Toxicity Detection

Benchmarked using [CHI 2025 "Lost in Moderation"](https://arxiv.org/html/2503.01623) methodology (HateXplain, Civil Comments, SBIC datasets):

| System | Balanced Accuracy | Type |
|--------|------------------|------|
| OpenAI Moderation API | 0.83 | Commercial |
| Azure Content Moderator | 0.81 | Commercial |
| **LocalMod** | **0.75** ⭐ | Open Source |
| Amazon Comprehend | 0.74 | Commercial |
| Perspective API | 0.62 | Commercial |
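
Balanced accuracy is the mean of per-class recall, so a classifier cannot score well on imbalanced data just by predicting the majority class. A minimal sketch of the metric for binary labels follows; the function name and toy data are illustrative and are not taken from LocalMod's benchmark scripts.

```python
from typing import Sequence


def balanced_accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Mean of recall on the positive class and recall on the negative class."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    # Count correct predictions separately for each class.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    pos = sum(1 for t in y_true if t == 1)
    neg = sum(1 for t in y_true if t == 0)
    if pos == 0 or neg == 0:
        raise ValueError("both classes must be present in y_true")
    return 0.5 * (tp / pos + tn / neg)


# Toy data (made up for illustration): 2 toxic and 6 benign samples.
labels = [1, 1, 0, 0, 0, 0, 0, 0]
preds = [1, 0, 0, 0, 0, 0, 0, 1]
score = balanced_accuracy(labels, preds)
print(f"Balanced accuracy: {score:.3f}")
```

On this toy data plain accuracy would be 6/8 = 0.75, while balanced accuracy is lower because half of the toxic samples were missed.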

### Spam Detection

| System | Balanced Accuracy | Dataset |
|--------|------------------|---------|
| **LocalMod** | **0.998** | [UCI SMS Spam Collection](https://archive.ics.uci.edu/ml/datasets/sms+spam+collection) |

---

## Installation

```bash
git clone https://github.com/KOKOSde/localmod.git
cd localmod
pip install -e .

# Download ML models (~3.5GB - includes image model)
python scripts/download_models.py
```

## Quick Start

```bash
# Run demo
python examples/demo.py
```

### Python Usage

```python
from localmod import SafetyPipeline

pipeline = SafetyPipeline()
report = pipeline.analyze("Check this text for safety issues")

print(f"Flagged: {report.flagged}")
print(f"Severity: {report.severity}")
```

### API Server

```bash
localmod serve --port 8000

# Test (from a second terminal, once the server is running)
curl -X POST http://localhost:8000/analyze \
  -H "Content-Type: application/json" \
  -d '{"text": "Hello world", "classifiers": ["toxicity", "pii"]}'
```
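
The same request can be made from Python with only the standard library. This is a minimal sketch that mirrors the `curl` call above; the helper names `build_analyze_request` and `analyze` are hypothetical and not part of the `localmod` package, and the base URL assumes the server was started on port 8000 as shown.

```python
import json
import urllib.request


def build_analyze_request(text, classifiers, base_url="http://localhost:8000"):
    """Build (but do not send) a POST request for the /analyze endpoint."""
    payload = {"text": text, "classifiers": list(classifiers)}
    return urllib.request.Request(
        url=f"{base_url}/analyze",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def analyze(text, classifiers, base_url="http://localhost:8000", timeout=10.0):
    """Send the request and decode the JSON response (needs a running server)."""
    request = build_analyze_request(text, classifiers, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Building the request does not touch the network, so it is safe to inspect.
request = build_analyze_request("Hello world", ["toxicity", "pii"])
print(request.get_method(), request.full_url)
```

With the server running, `analyze("Hello world", ["toxicity", "pii"])` should return a dictionary parsed from the JSON response body.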

### Docker

```bash
docker build -f docker/Dockerfile -t localmod:latest .
docker run -p 8000:8000 localmod:latest
```

### Discord Bot 🆕

```bash
# Install Discord dependency
pip install -e ".[discord]"

# Set your bot token
export DISCORD_BOT_TOKEN=your_token_here

# Run the bot
python examples/discord_bot.py
```

Features: Real-time text & image moderation, auto-delete, timeout, logging.

---

## Classifiers

### Text Moderation

| Classifier | Detects | Model |
|------------|---------|-------|
| **PII** | Emails, phones, SSNs, credit cards | Regex + Validation |
| **Toxicity** | Hate speech, harassment, threats | Weighted Ensemble (4 models) |
| **Prompt Injection** | LLM jailbreaks, instruction override | DeBERTa |
| **Spam** | Promotional content, scams | RoBERTa |
| **NSFW Text** | Sexual content, adult themes | NSFW Classifier |
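
The "Regex + Validation" approach can be illustrated with credit card numbers: a regex finds candidate digit runs, and a Luhn checksum discards candidates that cannot be real card numbers. This is a generic sketch of that technique, not LocalMod's actual patterns; the regex and the function names are assumptions made for illustration.

```python
import re

# Candidate card numbers: 13 to 19 digits, optionally separated by spaces or hyphens.
CARD_CANDIDATE = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")


def luhn_valid(digits: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for index, char in enumerate(reversed(digits)):
        value = int(char)
        if index % 2 == 1:  # double every second digit, counting from the right
            value *= 2
            if value > 9:
                value -= 9
        total += value
    return total % 10 == 0


def find_card_numbers(text: str) -> list:
    """Return regex matches that also pass Luhn validation."""
    found = []
    for match in CARD_CANDIDATE.finditer(text):
        digits = re.sub(r"[ -]", "", match.group())
        if luhn_valid(digits):
            found.append(match.group())
    return found


sample = "Card 4111 1111 1111 1111 is a test number; 1234 5678 9012 3456 is not valid."
hits = find_card_numbers(sample)
print(hits)
```

The validation step is what keeps arbitrary 16-digit strings, such as order IDs, from being reported as card numbers.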

### Image Moderation 🆕

| Classifier | Detects | Model |
|------------|---------|-------|
| **NSFW Image** | Explicit/adult images | [Falconsai/nsfw_image_detection](https://huggingface.co/Falconsai/nsfw_image_detection) (ViT) |

*The Falconsai model has 71M+ downloads on HuggingFace and is released under the Apache 2.0 license.*

### Toxicity Ensemble

| Model | Weight | Purpose |
|-------|--------|---------|
| [`unitary/toxic-bert`](https://huggingface.co/unitary/toxic-bert) | 50% | Multi-label toxicity |
| [`Hate-speech-CNERG/dehatebert-mono-english`](https://huggingface.co/Hate-speech-CNERG/dehatebert-mono-english) | 20% | Hate speech |
| [`s-nlp/roberta_toxicity_classifier`](https://huggingface.co/s-nlp/roberta_toxicity_classifier) | 15% | Toxicity |
| [`facebook/roberta-hate-speech-dynabench-r4-target`](https://huggingface.co/facebook/roberta-hate-speech-dynabench-r4-target) | 15% | Adversarial robustness |
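
The weights sum to 100%, so one natural way to combine the models is a weighted average of their per-model toxicity probabilities. The sketch below assumes each model yields a probability in [0, 1] and uses a 0.5 decision threshold; both are assumptions for illustration, and LocalMod's actual combination logic may differ.

```python
# Weights taken from the table above; they sum to 1.0.
ENSEMBLE_WEIGHTS = {
    "unitary/toxic-bert": 0.50,
    "Hate-speech-CNERG/dehatebert-mono-english": 0.20,
    "s-nlp/roberta_toxicity_classifier": 0.15,
    "facebook/roberta-hate-speech-dynabench-r4-target": 0.15,
}


def ensemble_score(model_scores: dict) -> float:
    """Weighted average of per-model toxicity probabilities (each in [0, 1])."""
    missing = set(ENSEMBLE_WEIGHTS) - set(model_scores)
    if missing:
        raise KeyError(f"missing scores for: {sorted(missing)}")
    return sum(weight * model_scores[name] for name, weight in ENSEMBLE_WEIGHTS.items())


# Hypothetical per-model outputs for a single input text.
scores = {
    "unitary/toxic-bert": 0.90,
    "Hate-speech-CNERG/dehatebert-mono-english": 0.40,
    "s-nlp/roberta_toxicity_classifier": 0.80,
    "facebook/roberta-hate-speech-dynabench-r4-target": 0.60,
}
combined = ensemble_score(scores)
flagged = combined >= 0.5  # threshold is an assumption for illustration
print(f"combined={combined:.2f} flagged={flagged}")
```

Because `unitary/toxic-bert` carries half of the total weight, its score dominates the result; the other three models mainly adjust borderline cases.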

---

## API Endpoints

### Text Moderation

| Endpoint | Method | Description |
|----------|--------|-------------|
| `/analyze` | POST | Analyze single text |
| `/analyze/batch` | POST | Analyze multiple texts |
| `/redact` | POST | Redact PII from text |

### Image Moderation 🆕

| Endpoint | Method | Description |
|----------|--------|-------------|
| `/analyze/image` | POST | Analyze image from URL |
| `/analyze/image/upload` | POST | Analyze uploaded image file |

### System

| Endpoint | Method | Description |
|----------|--------|-------------|
| `/health` | GET | Health check |
| `/classifiers` | GET | List text classifiers |
| `/classifiers/image` | GET | List image classifiers |

### Example: Text Analysis

```json
POST /analyze
{
  "text": "You are an idiot!",
  "classifiers": ["toxicity"]
}

Response:
{
  "flagged": true,
  "results": [{"classifier": "toxicity", "flagged": true, "confidence": 0.72, "severity": "high"}],
  "processing_time_ms": 85.3
}
```
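
A caller will typically want to know which classifiers fired, not just the top-level `flagged` value. The sketch below parses the response shown above and filters on confidence; the helper `flagged_classifiers` and its `min_confidence` parameter are hypothetical and not part of LocalMod.

```python
import json

# Response body copied from the example above.
RAW_RESPONSE = """
{
  "flagged": true,
  "results": [{"classifier": "toxicity", "flagged": true, "confidence": 0.72, "severity": "high"}],
  "processing_time_ms": 85.3
}
"""


def flagged_classifiers(response: dict, min_confidence: float = 0.0) -> list:
    """Names of classifiers that flagged the input at or above min_confidence."""
    return [
        result["classifier"]
        for result in response.get("results", [])
        if result.get("flagged") and result.get("confidence", 0.0) >= min_confidence
    ]


response = json.loads(RAW_RESPONSE)
names = flagged_classifiers(response)
strict = flagged_classifiers(response, min_confidence=0.9)
print(names, strict)
```

Raising `min_confidence` is one way for an application to apply a stricter policy than the server's own flagging decision without changing the server.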

### Example: Image Analysis 🆕

```json
POST /analyze/image
{
  "image_url": "https://example.com/image.jpg"
}

Response:
{
  "flagged": false,
  "results": [{"classifier": "nsfw_image", "flagged": false, "confidence": 0.02, "severity": "none"}],
  "processing_time_ms": 120.5
}
```

---

## Offline Mode

```bash
# Download models once
python scripts/download_models.py --model-dir /path/to/models

# Run offline
export LOCALMOD_MODEL_DIR=/path/to/models
export LOCALMOD_OFFLINE=1
localmod serve
```

Docker:

```bash
docker run -p 8000:8000 \
  -v /path/to/models:/models \
  -e LOCALMOD_MODEL_DIR=/models \
  -e LOCALMOD_OFFLINE=1 \
  localmod:latest
```

---

## Configuration

| Variable | Default | Description |
|----------|---------|-------------|
| `LOCALMOD_MODEL_DIR` | `~/.cache/localmod/models` | Model directory |
| `LOCALMOD_OFFLINE` | `false` | Force offline mode |
| `LOCALMOD_DEVICE` | `auto` | `cpu`, `cuda`, or `auto` |
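
To show how these variables and their defaults fit together, here is a small sketch of resolving them from an environment mapping. It is not LocalMod's own settings code: the `LocalModSettings` class, the `load_settings` helper, and the set of strings accepted as "true" for `LOCALMOD_OFFLINE` are assumptions.

```python
import os
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class LocalModSettings:
    model_dir: Path
    offline: bool
    device: str


def load_settings(env=None) -> LocalModSettings:
    """Resolve settings from an environment mapping, applying the documented defaults."""
    env = os.environ if env is None else env
    model_dir = Path(env.get("LOCALMOD_MODEL_DIR", "~/.cache/localmod/models")).expanduser()
    # Assumption: "1", "true", and "yes" all enable offline mode.
    offline = env.get("LOCALMOD_OFFLINE", "false").strip().lower() in {"1", "true", "yes"}
    device = env.get("LOCALMOD_DEVICE", "auto").strip().lower()
    if device not in {"cpu", "cuda", "auto"}:
        raise ValueError(f"unsupported LOCALMOD_DEVICE: {device!r}")
    return LocalModSettings(model_dir=model_dir, offline=offline, device=device)


# Passing a plain dict keeps the example independent of the real environment.
settings = load_settings({"LOCALMOD_MODEL_DIR": "/models", "LOCALMOD_OFFLINE": "1"})
print(settings)
```

Calling `load_settings()` with no argument reads from `os.environ` instead, which matches how the `export` commands in the Offline Mode section would be picked up.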

---

## Performance

| Classifier | CPU Latency | GPU Latency | Memory |
|------------|-------------|-------------|--------|
| PII (regex) | <1ms | <1ms | Minimal |
| Single ML model (text) | ~50-200ms | ~10-30ms | ~1GB |
| Toxicity ensemble (4 models) | ~200-500ms | ~30-80ms | ~3GB |
| NSFW Image (ViT) | ~100-300ms | ~20-50ms | ~500MB |

*Performance varies by hardware. GPU recommended for production workloads.*

---

## Development

```bash
pip install -e ".[dev]"
pytest tests/ -v
```

---

## License

MIT License — see [LICENSE](LICENSE).

## Acknowledgments

Models from [HuggingFace](https://huggingface.co/). Benchmark methodology from CHI 2025 "Lost in Moderation" (Hartmann et al.).
