https://github.com/vrknetha/mcp-server-firecrawl
FireCrawl MCP Server is a powerful web scraping integration for Claude and other LLMs. It provides JavaScript rendering, batch processing, and search capabilities through a Model Context Protocol (MCP) interface. It now supports self-hosted instances and advanced features such as parallel processing, automatic retries, and content filtering.
- Host: GitHub
- URL: https://github.com/vrknetha/mcp-server-firecrawl
- Owner: vrknetha
- License: MIT
- Created: 2024-12-06T07:50:27.000Z (5 months ago)
- Default Branch: main
- Last Pushed: 2025-01-04T10:46:10.000Z (4 months ago)
- Last Synced: 2025-01-04T11:39:44.489Z (4 months ago)
- Topics: batch-processing, claude, content-extraction, data-collection, firecrawl, firecrawl-ai, javascript-rendering, llm-tools, mcp-server, model-context-protocol, search-api, web-crawler, web-scraping
- Language: JavaScript
- Homepage:
- Size: 194 KB
- Stars: 16
- Watchers: 2
- Forks: 4
- Open Issues: 0
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md
- License: LICENSE
Awesome Lists containing this project
- awesome-mcp-servers - FireCrawl - Official Firecrawl MCP Server - Adds powerful web scraping to Cursor, Claude, and other LLM clients. (Table of Contents / Browser Automation)
README
# FireCrawl MCP Server
[smithery badge](https://smithery.ai/server/mcp-server-firecrawl)
A Model Context Protocol (MCP) server implementation that integrates with FireCrawl for advanced web scraping capabilities.
## Features
- Web scraping with JavaScript rendering
- Batch scraping with parallel processing and queuing
- URL discovery and crawling
- Web search with content extraction
- Automatic retries with exponential backoff
- Credit usage monitoring for cloud API
- Comprehensive logging system
- Support for cloud and self-hosted FireCrawl instances
- Mobile/Desktop viewport support
- Smart content filtering with tag inclusion/exclusion

## Installation
### Installing via Smithery
To install FireCrawl for Claude Desktop automatically via [Smithery](https://smithery.ai/server/mcp-server-firecrawl):
```bash
npx -y @smithery/cli install mcp-server-firecrawl --client claude
```

### Manual Installation
```bash
npm install -g mcp-server-firecrawl
```

## Configuration
### Environment Variables
- `FIRE_CRAWL_API_KEY`: Your FireCrawl API key
- Required when using cloud API (default)
- Optional when using self-hosted instance with `FIRE_CRAWL_API_URL`
- `FIRE_CRAWL_API_URL` (Optional): Custom API endpoint for self-hosted instances
- Example: `https://firecrawl.your-domain.com`
- If not provided, the cloud API will be used (requires API key)

### Configuration Examples
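A hypothetical sketch of how a client might resolve these variables into an endpoint (the `resolveEndpoint` helper and the cloud URL `https://api.firecrawl.dev` are illustrative assumptions; only the variable names come from the list above):

```javascript
// Illustrative helper: picks the API base URL from the documented variables.
// FIRE_CRAWL_API_URL takes precedence (self-hosted); otherwise the cloud API
// is used and FIRE_CRAWL_API_KEY becomes mandatory.
function resolveEndpoint(env) {
  if (env.FIRE_CRAWL_API_URL) {
    // Self-hosted instance; the key is optional here.
    return { baseUrl: env.FIRE_CRAWL_API_URL, apiKey: env.FIRE_CRAWL_API_KEY ?? null };
  }
  if (!env.FIRE_CRAWL_API_KEY) {
    throw new Error("FIRE_CRAWL_API_KEY is required when using the cloud API");
  }
  // Cloud endpoint assumed here for illustration.
  return { baseUrl: "https://api.firecrawl.dev", apiKey: env.FIRE_CRAWL_API_KEY };
}
```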
For cloud API usage (default):
```bash
export FIRE_CRAWL_API_KEY=your-api-key
```

For self-hosted instance without authentication:
```bash
export FIRE_CRAWL_API_URL=https://firecrawl.your-domain.com
```

For self-hosted instance with authentication:
```bash
export FIRE_CRAWL_API_URL=https://firecrawl.your-domain.com
export FIRE_CRAWL_API_KEY=your-api-key # Optional for authenticated self-hosted instances
```

### Usage with Claude Desktop
Add this to your `claude_desktop_config.json`:
```json
{
"mcpServers": {
"mcp-server-firecrawl": {
"command": "npx",
"args": ["-y", "mcp-server-firecrawl"],
"env": {
"FIRE_CRAWL_API_KEY": "YOUR_API_KEY_HERE"
}
}
}
}
```

### System Configuration
The server includes several configurable parameters:
```typescript
const CONFIG = {
retry: {
maxAttempts: 3,
initialDelay: 1000, // 1 second
maxDelay: 10000, // 10 seconds
backoffFactor: 2,
},
batch: {
delayBetweenRequests: 2000, // 2 seconds
maxParallelOperations: 3,
},
credit: {
warningThreshold: 1000,
criticalThreshold: 100,
},
};
```

### Rate Limits
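The retry settings shown earlier produce exponentially growing delays between attempts. A minimal sketch (the `computeRetryDelay` helper is illustrative, not part of the server's API; the values mirror the `CONFIG.retry` block above):

```javascript
// Illustrative only: mirrors the retry values from the CONFIG block above.
const RETRY = { maxAttempts: 3, initialDelay: 1000, maxDelay: 10000, backoffFactor: 2 };

// Delay in ms before retry attempt `attempt` (1-based), capped at maxDelay.
function computeRetryDelay(attempt) {
  const delay = RETRY.initialDelay * Math.pow(RETRY.backoffFactor, attempt - 1);
  return Math.min(delay, RETRY.maxDelay);
}

// Attempt 1 waits 1000 ms, attempt 2 waits 2000 ms, attempt 3 waits 4000 ms.
```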
The server implements rate limiting to prevent API abuse:
- 3 requests per minute on free tier
- Automatic retries with exponential backoff
- Parallel processing for batch operations
- Higher limits available on paid plans

## Available Tools
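Each tool below is invoked through MCP's standard `tools/call` request. As a rough sketch of the JSON-RPC 2.0 envelope a client sends (the `buildToolCall` helper is illustrative; the tool names and argument shapes are documented per tool below):

```javascript
// Illustrative helper: wraps a tool name and its arguments in a JSON-RPC 2.0
// "tools/call" request, as defined by the Model Context Protocol.
function buildToolCall(id, name, args) {
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name, arguments: args },
  };
}

// Example: request a markdown scrape of a single page.
const request = buildToolCall(1, "fire_crawl_scrape", {
  url: "https://example.com",
  formats: ["markdown"],
});
```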
### 1. Scrape Tool (`fire_crawl_scrape`)
Scrape content from a single URL with advanced options.
```json
{
"name": "fire_crawl_scrape",
"arguments": {
"url": "https://example.com",
"formats": ["markdown"],
"onlyMainContent": true,
"waitFor": 1000,
"timeout": 30000,
"mobile": false,
"includeTags": ["article", "main"],
"excludeTags": ["nav", "footer"],
"skipTlsVerification": false
}
}
```

### 2. Batch Scrape Tool (`fire_crawl_batch_scrape`)
Scrape multiple URLs with parallel processing and queuing.
```json
{
"name": "fire_crawl_batch_scrape",
"arguments": {
"urls": ["https://example1.com", "https://example2.com"],
"options": {
"formats": ["markdown"],
"onlyMainContent": true
}
}
}
```

Response includes operation ID for status checking:
```json
{
"content": [
{
"type": "text",
"text": "Batch operation queued with ID: batch_1. Use fire_crawl_check_batch_status to check progress."
}
],
"isError": false
}
```

### 3. Check Batch Status (`fire_crawl_check_batch_status`)
Check the status of a batch operation.
```json
{
"name": "fire_crawl_check_batch_status",
"arguments": {
"id": "batch_1"
}
}
```

### 4. Search Tool (`fire_crawl_search`)
Search the web and optionally extract content from search results.
```json
{
"name": "fire_crawl_search",
"arguments": {
"query": "your search query",
"limit": 5,
"lang": "en",
"country": "us",
"scrapeOptions": {
"formats": ["markdown"],
"onlyMainContent": true
}
}
}
```

### 5. Crawl Tool (`fire_crawl_crawl`)
Start an asynchronous crawl with advanced options.
```json
{
"name": "fire_crawl_crawl",
"arguments": {
"url": "https://example.com",
"maxDepth": 2,
"limit": 100,
"allowExternalLinks": false,
"deduplicateSimilarURLs": true
}
}
```

## Logging System
The server includes comprehensive logging:
- Operation status and progress
- Performance metrics
- Credit usage monitoring
- Rate limit tracking
- Error conditions

Example log messages:
```
[INFO] FireCrawl MCP Server initialized successfully
[INFO] Starting scrape for URL: https://example.com
[INFO] Batch operation queued with ID: batch_1
[WARNING] Credit usage has reached warning threshold
[ERROR] Rate limit exceeded, retrying in 2s...
```

## Error Handling
The server provides robust error handling:
- Automatic retries for transient errors
- Rate limit handling with backoff
- Detailed error messages
- Credit usage warnings
- Network resilience

Example error response:
```json
{
"content": [
{
"type": "text",
"text": "Error: Rate limit exceeded. Retrying in 2 seconds..."
}
],
"isError": true
}
```

## Development
```bash
# Install dependencies
npm install

# Build
npm run build

# Run tests
npm test
```

### Contributing
1. Fork the repository
2. Create your feature branch
3. Run tests: `npm test`
4. Submit a pull request

## License
MIT License - see LICENSE file for details