{"id":28385981,"url":"https://github.com/kode-rex/webcat","last_synced_at":"2026-03-07T00:01:49.545Z","repository":{"id":224094458,"uuid":"762392470","full_name":"Kode-Rex/webcat","owner":"Kode-Rex","description":"The repo for the Web Cat MCP Server - A simple and reliable search server","archived":false,"fork":false,"pushed_at":"2025-10-02T22:39:39.000Z","size":506,"stargazers_count":1,"open_issues_count":0,"forks_count":1,"subscribers_count":2,"default_branch":"main","last_synced_at":"2025-10-03T00:24:16.965Z","etag":null,"topics":["mcp","mcp-server","search","vibe-coding"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/Kode-Rex.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2024-02-23T17:25:02.000Z","updated_at":"2025-10-02T22:39:42.000Z","dependencies_parsed_at":"2024-07-19T21:58:59.567Z","dependency_job_id":"5982a07a-2542-4a69-b3e5-4001105f2476","html_url":"https://github.com/Kode-Rex/webcat","commit_stats":null,"previous_names":["t-rav/webcat","kode-rex/webcat"],"tags_count":5,"template":false,"template_full_name":null,"purl":"pkg:github/Kode-Rex/webcat","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Kode-Rex%2Fwebcat","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Kode-Rex%2Fwebcat/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Kode-Rex%2Fwebcat/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Kode-Rex%2Fwebcat/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/Kode-Rex","download_url":"https://codeload.github.com/Kode-Rex/webcat/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Kode-Rex%2Fwebcat/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":30204109,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-03-06T19:07:06.838Z","status":"ssl_error","status_checked_at":"2026-03-06T18:57:34.882Z","response_time":250,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.6:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["mcp","mcp-server","search","vibe-coding"],"created_at":"2025-05-30T12:37:53.594Z","updated_at":"2026-03-07T00:01:49.506Z","avatar_url":"https://github.com/Kode-Rex.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# WebCat MCP Server\n\n**Web search and content extraction for AI models via Model Context Protocol (MCP)**\n\n[![Version](https://img.shields.io/badge/version-2.3.2-blue.svg)](https://github.com/Kode-Rex/webcat)\n[![License](https://img.shields.io/badge/license-MIT-green.svg)](LICENSE)\n[![Docker](https://img.shields.io/badge/docker-multi--platform-blue.svg)](https://hub.docker.com/r/tmfrisinger/webcat)\n\n## Quick Start\n\n### Docker (Recommended)\n\n```bash\n# Run with Docker (no setup required)\ndocker run -p 8000:8000 tmfrisinger/webcat:latest\n\n# With Serper API key for premium search\ndocker run -p 8000:8000 -e SERPER_API_KEY=your_key tmfrisinger/webcat:latest\n\n# With authentication enabled\ndocker run -p 8000:8000 -e WEBCAT_API_KEY=your_token tmfrisinger/webcat:latest\n```\n\n**Supports:** linux/amd64, linux/arm64 (Intel/AMD, Apple Silicon, AWS Graviton)\n\n### Local Development\n\n```bash\ncd docker\npython -m pip install -e \".[dev]\"\n\n# Start MCP server with auto-reload\nmake dev\n\n# Or run directly\npython mcp_server.py\n```\n\n## What is WebCat?\n\nWebCat is an **MCP (Model Context Protocol) server** that provides AI models with:\n- 🔍 **Web Search** - Serper API (premium) or DuckDuckGo (free fallback)\n- 📄 **Content Extraction** - Serper scrape API (premium) or Trafilatura (free fallback)\n- 🌐 **Modern HTTP Transport** - Streamable HTTP with JSON-RPC 2.0\n- 🐳 **Multi-Platform Docker** - Works on Intel, ARM, and Apple Silicon\n- 🎯 **Composite Tool** - Single SERPER_API_KEY enables both search + scraping\n\nBuilt with **FastMCP**, **Serper.dev**, and **Trafilatura** for seamless AI integration.\n\n## Features\n\n- ✅ **Optional Authentication** - Bearer token auth when needed, or run without (v2.3.1)\n- ✅ **Composite Search Tool** - Single Serper API key enables both search + scraping\n- ✅ **Automatic Fallback** - Search: Serper → DuckDuckGo | Scraping: Serper → Trafilatura\n- ✅ **Premium Scraping** - Serper's optimized infrastructure for fast, clean content extraction\n- ✅ **Smart Content Extraction** - Returns markdown with preserved document structure\n- ✅ **MCP Compliant** - Works with Claude Desktop, LiteLLM, and other MCP clients\n- ✅ **Parallel Processing** - Fast concurrent scraping\n- ✅ **Multi-Platform Docker** - Linux (amd64/arm64) support\n\n## Installation \u0026 Usage\n\n### Docker Deployment\n\n```bash\n# Quick start - no configuration needed\ndocker run -p 8000:8000 tmfrisinger/webcat:latest\n\n# With environment variables\ndocker run -p 8000:8000 \\\n  -e SERPER_API_KEY=your_key \\\n  -e WEBCAT_API_KEY=your_token \\\n  tmfrisinger/webcat:latest\n\n# Using docker-compose\ncd docker\ndocker-compose up\n```\n\n### Local Development\n\n```bash\ncd docker\npython -m pip install -e \".[dev]\"\n\n# Configure environment (optional)\necho \"SERPER_API_KEY=your_key\" \u003e .env\n\n# Development mode with auto-reload\nmake dev        # Start MCP server with auto-reload\n\n# Production mode\nmake mcp        # Start MCP server\n```\n\n## Available Endpoints\n\n| Endpoint | Description |\n|----------|-------------|\n| `http://localhost:8000/health` | 💗 Health check |\n| `http://localhost:8000/status` | 📊 Server status |\n| `http://localhost:8000/mcp` | 🛠️ MCP protocol endpoint (Streamable HTTP with JSON-RPC 2.0) |\n\n## Configuration\n\n### Environment Variables\n\n| Variable | Default | Description |\n|----------|---------|-------------|\n| `SERPER_API_KEY` | *(none)* | Serper API key for premium search (optional, falls back to DuckDuckGo if not set) |\n| `PERPLEXITY_API_KEY` | *(none)* | Perplexity API key for deep research tool (optional, get at https://www.perplexity.ai/settings/api) |\n| `WEBCAT_API_KEY` | *(none)* | Bearer token for authentication (optional, if set all requests must include `Authorization: Bearer \u003ctoken\u003e`) |\n| `PORT` | `8000` | Server port |\n| `LOG_LEVEL` | `INFO` | Logging level (DEBUG, INFO, WARNING, ERROR) |\n| `LOG_DIR` | `/tmp` | Log file directory |\n| `MAX_CONTENT_LENGTH` | `1000000` | Maximum characters to return per scraped article |\n\n### Get API Keys\n\n**Serper API (for web search + scraping):**\n1. Visit [serper.dev](https://serper.dev)\n2. Sign up for free tier (2,500 searches/month + scraping)\n3. Copy your API key\n4. Add to `.env` file: `SERPER_API_KEY=your_key`\n5. **Note:** One API key enables both search AND content scraping!\n\n**Perplexity API (for deep research):**\n1. Visit [perplexity.ai/settings/api](https://www.perplexity.ai/settings/api)\n2. Sign up and get your API key\n3. Copy your API key\n4. Add to `.env` file: `PERPLEXITY_API_KEY=your_key`\n\n### Enable Authentication (Optional)\n\nTo require bearer token authentication for all MCP tool calls:\n\n1. Generate a secure random token: `openssl rand -hex 32`\n2. Add to `.env` file: `WEBCAT_API_KEY=your_token`\n3. Include in all requests: `Authorization: Bearer your_token`\n\n**Note:** If `WEBCAT_API_KEY` is not set, no authentication is required.\n\n## MCP Tools\n\nWebCat exposes these tools via MCP:\n\n| Tool | Description | Parameters |\n|------|-------------|------------|\n| `search` | Search web and extract content | `query: str`, `max_results: int` |\n| `scrape_url` | Scrape specific URL | `url: str` |\n| `health_check` | Check server health | *(none)* |\n| `get_server_info` | Get server capabilities | *(none)* |\n\n## Architecture\n\n```\nMCP Client (Claude, LiteLLM)\n    ↓\nFastMCP Server (Streamable HTTP with JSON-RPC 2.0)\n    ↓\nAuthentication (optional bearer token)\n    ↓\nSearch Decision\n    ├─ Serper API (premium) → Serper Scrape API (premium)\n    └─ DuckDuckGo (free)    → Trafilatura (free)\n                                    ↓\n                            Markdown Response\n```\n\n**Tech Stack:**\n- **FastMCP** - MCP protocol implementation with modern HTTP transport\n- **JSON-RPC 2.0** - Standard protocol for client-server communication\n- **Serper API** - Google-powered search + optimized web scraping\n- **Trafilatura** - Fallback content extraction (removes navigation/ads)\n- **DuckDuckGo** - Free search fallback\n\n## Testing\n\n```bash\ncd docker\n\n# Run all unit tests\nmake test\n# OR\npython -m pytest tests/unit -v\n\n# With coverage report\nmake test-coverage\n# OR\npython -m pytest tests/unit --cov=. --cov-report=term --cov-report=html\n\n# CI-safe tests (no external dependencies)\npython -m pytest -v -m \"not integration\"\n\n# Run specific test file\npython -m pytest tests/unit/services/test_content_scraper.py -v\n```\n\n**Current test coverage:** 70%+ across all modules (enforced in CI)\n\n## Development\n\n```bash\n# First-time setup\nmake setup-dev   # Install all dependencies + pre-commit hooks\n\n# Development workflow\nmake dev         # Start server with auto-reload\nmake format      # Auto-format code (Black + isort)\nmake lint        # Check code quality (flake8)\nmake test        # Run unit tests\n\n# Before committing\nmake ci-fast     # Quick validation (~30 seconds)\n# OR\nmake ci          # Full validation with security checks (~2-3 minutes)\n\n# Code quality tools\nmake format-check   # Check formatting without changes\nmake security       # Run bandit security scanner\nmake audit          # Check dependency vulnerabilities\n```\n\n**Pre-commit Hooks:**\nHooks run automatically on `git commit` to ensure code quality. Install with `make setup-dev`.\n\n## Project Structure\n\n```\ndocker/\n├── mcp_server.py          # Main MCP server (FastMCP)\n├── cli.py                 # CLI interface for server modes\n├── health.py              # Health check endpoint\n├── api_tools.py           # API tooling utilities\n├── clients/               # External API clients\n│   ├── serper_client.py  # Serper API (search + scrape)\n│   └── duckduckgo_client.py  # DuckDuckGo fallback\n├── services/              # Core business logic\n│   ├── search_service.py # Search orchestration\n│   └── content_scraper.py # Serper scrape → Trafilatura fallback\n├── tools/                 # MCP tool implementations\n│   └── search_tool.py    # Search tool with auth\n├── models/                # Pydantic data models\n│   ├── domain/           # Domain entities (SearchResult, etc.)\n│   └── responses/        # API response models\n├── utils/                 # Shared utilities\n│   └── auth.py           # Bearer token authentication\n├── endpoints/             # FastAPI endpoints\n├── tests/                 # Comprehensive test suite\n│   ├── unit/             # Unit tests (mocked dependencies)\n│   └── integration/      # Integration tests (external deps)\n└── pyproject.toml         # Project config + dependencies\n```\n\n## Search Quality Comparison\n\n| Feature | Serper API | DuckDuckGo |\n|---------|------------|------------|\n| **Cost** | Paid (free tier available) | Free |\n| **Quality** | ⭐⭐⭐⭐⭐ Excellent | ⭐⭐⭐⭐ Good |\n| **Coverage** | Comprehensive (Google-powered) | Standard |\n| **Speed** | Fast | Fast |\n| **Rate Limits** | 2,500/month (free tier) | None |\n\n## Docker Multi-Platform Support\n\nWebCat supports multiple architectures for broad deployment compatibility:\n\n```bash\n# Build locally for multiple platforms\ncd docker\n./build.sh  # Builds for linux/amd64 and linux/arm64\n\n# Manual multi-platform build and push\ndocker buildx build --platform linux/amd64,linux/arm64 \\\n  -t tmfrisinger/webcat:2.3.2 \\\n  -t tmfrisinger/webcat:latest \\\n  -f Dockerfile --push .\n\n# Verify multi-platform support\ndocker buildx imagetools inspect tmfrisinger/webcat:latest\n```\n\n**Automated Releases:**\nPush a version tag to trigger automated multi-platform builds via GitHub Actions:\n```bash\ngit tag v2.3.2\ngit push origin v2.3.2\n```\n\n## Limitations\n\n- **Text-focused:** Optimized for article content, not multimedia\n- **No JavaScript:** Cannot scrape dynamic JS-rendered content (uses static HTML)\n- **PDF support:** Detection only, not full extraction\n- **Python 3.11 required:** Not compatible with 3.10 or 3.12\n- **External API limits:** Subject to Serper API rate limits (2,500/month free tier)\n\n## Contributing\n\nContributions welcome! Please:\n\n1. Fork the repository\n2. Create a feature branch\n3. Add tests for new functionality\n4. Ensure `make ci` passes\n5. Submit a Pull Request\n\nSee [CLAUDE.md](CLAUDE.md) for development guidelines and architecture standards.\n\n## License\n\nMIT License - see [LICENSE](LICENSE) file for details.\n\n## Links\n\n- **GitHub:** [github.com/Kode-Rex/webcat](https://github.com/Kode-Rex/webcat)\n- **MCP Spec:** [modelcontextprotocol.io](https://modelcontextprotocol.io)\n- **Serper API:** [serper.dev](https://serper.dev)\n\n---\n\n**Version 2.3.2** | Built with FastMCP, FastAPI, Readability, and html2text\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fkode-rex%2Fwebcat","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fkode-rex%2Fwebcat","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fkode-rex%2Fwebcat/lists"}