{"id":23881908,"url":"https://github.com/rmncldyo/firecrawl-toolkit","last_synced_at":"2025-08-17T18:04:08.024Z","repository":{"id":269116093,"uuid":"905395228","full_name":"RMNCLDYO/firecrawl-toolkit","owner":"RMNCLDYO","description":"The Firecrawl Toolkit is the easiest way for developers to interact with web content through crawling, scraping, and mapping capabilities.","archived":false,"fork":false,"pushed_at":"2024-12-21T02:44:19.000Z","size":198,"stargazers_count":0,"open_issues_count":0,"forks_count":1,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-06-29T18:04:05.106Z","etag":null,"topics":["ai-batch-scrape","ai-crawler","ai-scraper","ai-toolkit","batch-scrape","crawl","fire-crawl","firecrawl","firecrawl-ai","map","scrape","sitemap","sitemap-crawler","sitemap-scraper","web-crawler","web-scraper"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/RMNCLDYO.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":".github/CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":".github/CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":".github/SECURITY.md","support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-12-18T18:30:45.000Z","updated_at":"2024-12-21T02:44:23.000Z","dependencies_parsed_at":"2024-12-21T03:34:48.045Z","dependency_job_id":null,"html_url":"https://github.com/RMNCLDYO/firecrawl-toolkit","commit_stats":null,"previous_names":["rmncldyo/firecrawl-toolkit"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/RMNCLDYO/firecrawl-toolkit","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/RMNCLDYO%2Ffirecrawl-toolkit","tags_url":"
https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/RMNCLDYO%2Ffirecrawl-toolkit/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/RMNCLDYO%2Ffirecrawl-toolkit/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/RMNCLDYO%2Ffirecrawl-toolkit/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/RMNCLDYO","download_url":"https://codeload.github.com/RMNCLDYO/firecrawl-toolkit/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/RMNCLDYO%2Ffirecrawl-toolkit/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":270884766,"owners_count":24662308,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-08-17T02:00:09.016Z","response_time":129,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["ai-batch-scrape","ai-crawler","ai-scraper","ai-toolkit","batch-scrape","crawl","fire-crawl","firecrawl","firecrawl-ai","map","scrape","sitemap","sitemap-crawler","sitemap-scraper","web-crawler","web-scraper"],"created_at":"2025-01-04T01:59:40.902Z","updated_at":"2025-08-17T18:04:07.982Z","avatar_url":"https://github.com/RMNCLDYO.png","language":"Python","readme":"\u003cp align=\"center\"\u003e\n    \u003ca href=\"https://firecrawl.dev/\" title=\"Go to Firecrawl homepage\"\u003e\n        \u003cimg 
src=\"https://img.shields.io/badge/Firecrawl-fafafa?style=for-the-badge\u0026logo=data:image/svg+xml;base64,PD94bWwgdmVyc2lvbj0iMS4wIiBlbmNvZGluZz0iVVRGLTgiPz4KPHN2ZyBpZD0iTGF5ZXJfMSIgeG1sbnM9Imh0dHA6Ly93d3cudzMub3JnLzIwMDAvc3ZnIiB4bWxuczp4bGluaz0iaHR0cDovL3d3dy53My5vcmcvMTk5OS94bGluayIgdmVyc2lvbj0iMS4xIiB2aWV3Qm94PSIwIDAgMTkuOCAxOS44Ij4KICA8IS0tIEdlbmVyYXRvcjogQWRvYmUgSWxsdXN0cmF0b3IgMjkuMS4wLCBTVkcgRXhwb3J0IFBsdWctSW4gLiBTVkcgVmVyc2lvbjogMi4xLjAgQnVpbGQgMTQyKSAgLS0+CiAgPGltYWdlIHdpZHRoPSIzNSIgaGVpZ2h0PSIzNSIgdHJhbnNmb3JtPSJ0cmFuc2xhdGUoLTcuMSAtOC4xKSIgeGxpbms6aHJlZj0iZGF0YTppbWFnZS9wbmc7YmFzZTY0LGlWQk9SdzBLR2dvQUFBQU5TVWhFVWdBQUFDTUFBQUFqQ0FZQUFBQWUyYk5aQUFBQUNYQklXWE1BQUFzU0FBQUxFZ0hTM1g3OEFBQURvMGxFUVZSWWhlMlYyMjlVVlJTSGYydnRjK2JNbVhhbXpJWHB0RTNCT3RNcUYydkxwWUlHalBEZ2k0a0orb1JSRzRyaWcrSURDVDRRUXJEeHBUR1JtSkJVWTBpc1lOU0VoaENER0ZLOGtOU0FvUEdHdGJTbHRIUjZtNW0yTXkyZE9YUE8zdjRMcDdib2cvTTlyL3pXbDczMzJnc29VYUpFaWY4UnROeUEzSkZETDZFdzcvbXJmdFBIV3c0Y0tDNG5pNWNyQTZkZzJIM2ZubWo0dGVkdzZvMkRnZjlVeHE2c3pwTHVNMlZ5NExnKyswZkgrSlh2Vi85ck1wMTlTUzI5TmtHN3V2ckZ4THRkVVhIejJqWlJZMExFWTBMbEpsN3hmZFIrYk9TSDIrWjlsZG4yM0dsS25UdFh2YmUzZTNObytCWStqd3pFelorNjJsVnU3R1dSS0dlOVFRT3Y5ck9jdkxNMzhFVkg4OEViZzB0K2o1cmJ3a3M3UnlMMFMrcUl5czFORTlIVm1kYW4yeUJUcjJvYmFzQVZERkFSWEtOQlRuSlEzZnB4eStCZDR3YUF3bEprWEo4TUwyUXJJUFI3dkxidXd1M3YvcXlpelBDVEhQVkRWQXBBT29EdGdBd0hZQW5BMG5QL1lGQmR5NGpHblZQeTBTZTY1L2E4UHJpcTkzUTlHREdPYWlDdkEvSktrQWtBRXNwV1VJdHFjdC9HY05uYzZHaW9zeS9wdW9mcmE4cjNkRWZKTk5zcWJsNy9wSkNjQ0xDaFBGUXV3UUVKTGllb29nWm56QWZpQmNBZmZHUlB4NzZIa0ZnM3ZQLzVGOCs4QnVSWFRPYXB3NWVZeHQ5ZWorSnNxNnJmbU9MMG1BT0RBcUlDRUVFRitEVWdBNEJNVUtBU0tqMXp5UDc1S2xGNjlPeThicC9GU3NwOFdUY2tyS0Ywbk12eVFzMzB2UVc3cUxnMlJDSnNnaW8wS0owQmx1Q0FBdzVaQUN3aDV3RzdmN3hCM1RuamRkUER0VXpoNUVsV1paYVBhZ2tJWlNIcWdxUTNsSUdqQnB4VkJxQVk3TTlEWDVNSGVTeFkxeTFJQWFoRitQTDMyUDJRdUNucU9mcXByYnptbEVvQnBFdDQxaTFBZjZBQUZXTElRQXd5bUlDTUJVQTFESzFXQXNLQnlnTHNwVW05c3RMMXZuSWxjeVZoU21yYTBTc056Mjh5VGRBaUdTQnF3WTVVUVJuYm9ZeGRjSUpiWWNjQ29MQUZFYlVCSDJ6WnRPblU5UG12WjFkVTV2
M05jWFhoemZjR3NiNzVRd0N6eWxHUUVRK1Vwd21zYlFDTE9KUm9oUEt1Z2ZTbG9ReEkxUlE5WlR5Mi8zeGoyTGV5SndNQUh3ek4yNFkvT3V3VXhKeVQwcUVvRGVJb2lHTmdMZ2R6RUVwbElWSlpPQmt4d1pHSFB6dTYvY0dNMjN3QUVHNExxeGNhVld0TFlrb1ZrbmxwVHo5dStxWU1PMXlFMHJlQ3lJUmpmUVV0ZVFMVzc0RnhaN0hsSGIxcTk4Vm5uOW05cEhXdzVEL2IvdWFZWnQyZGF1SFErQXUwTnJORFZyZUZoZGFjVTluT2Zob1p1bHhJeHkvaVdtVEEzMzdjWG1wMmlSSWxTdHh2L2dhZHJXZk9UbFFJcEFBQUFBQkpSVTVFcmtKZ2dnPT0iLz4KPC9zdmc+\" alt=\"Firecrawl\"\u003e\n    \u003c/a\u003e\n\u003c/p\u003e\n\n\u003cp align=\"center\"\u003e\n    \u003ca href=\"https://github.com/RMNCLDYO/firecrawl-toolkit\" title=\"Go to repo\"\u003e\n        \u003cimg src=\"https://img.shields.io/badge/dynamic/json?style=for-the-badge\u0026label=Firecrawl%20Toolkit\u0026query=version\u0026url=https%3A%2F%2Fraw.githubusercontent.com%2FRMNCLDYO%2Ffirecrawl-toolkit%2Fmain%2F.github%2Fversion.json\" alt=\"Firecrawl Toolkit\"\u003e\n    \u003c/a\u003e\n\u003c/p\u003e\n\n\u003cp align=\"center\"\u003e\n    \u003ca href=\".github/CHANGELOG.md\" title=\"Go to changelog\"\u003e\u003cimg src=\"https://img.shields.io/badge/maintained-yes-2ea44f?style=for-the-badge\" alt=\"maintained - yes\"\u003e\u003c/a\u003e\n    \u003ca href=\".github/CONTRIBUTING.md\" title=\"Go to contributions doc\"\u003e\u003cimg src=\"https://img.shields.io/badge/contributions-welcome-2ea44f?style=for-the-badge\" alt=\"contributions - welcome\"\u003e\u003c/a\u003e\n\u003c/p\u003e\n\n\u003cp align=\"center\"\u003e\n    \u003ca href=\"/\"\u003e\n        \u003cpicture\u003e\n          \u003csource media=\"(prefers-color-scheme: dark)\" srcset=\"https://raw.githubusercontent.com/RMNCLDYO/firecrawl-toolkit/main/.github/firecrawl-logo.png\"\u003e\n          \u003csource media=\"(prefers-color-scheme: light)\" srcset=\"https://raw.githubusercontent.com/RMNCLDYO/firecrawl-toolkit/main/.github/firecrawl-logo.png\"\u003e\n          \u003cimg alt=\"Firecrawl\" width=\"250\" 
src=\"https://raw.githubusercontent.com/RMNCLDYO/firecrawl-toolkit/main/.github/firecrawl-logo.png\"\u003e\n        \u003c/picture\u003e\n    \u003c/a\u003e\n\u003c/p\u003e\n\nThe Firecrawl Toolkit is the easiest way for developers to interact with web content through crawling, scraping, and mapping capabilities. It offers seamless integration for web crawling, content extraction, and site mapping, allowing you to process websites with advanced features like custom actions, multiple output formats, and batch processing—all in one comprehensive package with minimal dependencies.\n\n## 🚀 Features\n\n- 🕷️ **Web Crawling**: Traverse websites with customizable depth and path controls, supporting both internal and external link processing\n- 📄 **Content Extraction**: Extract content in multiple formats (Markdown, HTML, raw HTML) with smart content filtering\n- 🗺️ **Site Mapping**: Generate comprehensive site maps with advanced search and subdomain capabilities\n- 🔄 **Batch Processing**: Process multiple URLs simultaneously with unified configurations\n- 🤖 **Custom Actions**: Automate complex interactions (clicking, scrolling, form filling) during scraping\n- 📱 **Device Emulation**: Switch between mobile and desktop views with customizable headers\n- 🌎 **Geolocation**: Simulate different locations with country and language preferences\n- ⚡ **Smart Retry Logic**: Built-in retry mechanism with real-time status monitoring and webhooks\n- 🪶 **Lightweight Design**: Minimal dependencies powered by requests for easy setup and deployment\n- 🔒 **Robust Error Handling**: Comprehensive error catching and validation system\n- 🎯 **Parameter Validation**: Extensive validation for all API inputs and configurations\n- 📊 **Multiple Output Formats**: Support for various output types (Markdown, HTML, screenshots, etc.)\n\n## 📋 Table of Contents\n\n- [Installation](#-installation)\n- [API Key Configuration](#-configuration)\n- [Usage](#-usage)\n- [Advanced 
Configuration](#%EF%B8%8F-advanced-configuration)\n- [Available Formats](#-available-formats)\n- [Supported Actions](#-supported-actions)\n- [Error Handling and Safety](#-error-handling-and-safety)\n- [Contributing](#-contributing)\n- [Issues and Support](#-issues-and-support)\n- [Feature Requests](#-feature-requests)\n- [Versioning and Changelog](#-versioning-and-changelog)\n- [Security](#-security)\n- [License](#-license)\n\n## 🛠 Installation\n\n1. Clone the repository:\n   ```bash\n   git clone https://github.com/RMNCLDYO/firecrawl-toolkit.git\n   ```\n\n2. Navigate to the repository folder:\n   ```bash\n   cd firecrawl-toolkit\n   ```\n\n3. Install the required dependencies:\n   ```bash\n   pip install -r requirements.txt\n   ```\n\n## 🔑 Configuration\n1. Obtain an API key from [Firecrawl](https://www.firecrawl.dev/app/api-keys).\n2. You have three options for managing your API key:\n   \u003cdetails\u003e\n   \u003csummary\u003eClick here to view the API key configuration options\u003c/summary\u003e\n   \n   - **Setting it as an environment variable on your device (recommended for everyday use)**\n       - Navigate to your terminal.\n       - Add your API key like so:\n         ```shell\n         export FIRECRAWL_API_KEY=your_api_key\n         ```\n       This method allows the API key to be loaded automatically when using the wrapper.\n     \n   - **Using an .env file (recommended for development):**\n       - Install python-dotenv if you haven't already: `pip install python-dotenv`.\n       - Create a .env file in the project's root directory.\n       - Add your API key to the .env file like so:\n         ```makefile\n         FIRECRAWL_API_KEY=your_api_key\n         ```\n       This method allows the API key to be loaded automatically when using the wrapper.\n     \n   - **Direct Input:**\n       - If you prefer not to use a `.env` file, you can directly pass your API key as an argument to the wrapper function.\n         \n         ***Wrapper***\n         
```python\n         api_key=\"your_api_key\"\n         ```\n       With this method, you must pass your API key on every API call.\n   \u003c/details\u003e\n\n## 💻 Usage\n\n### Web Crawling\n*For traversing websites and extracting content with customizable depth and path controls.*\n\n```python\nimport firecrawl\n\n# Basic crawling\nfirecrawl.crawl(\n    url=\"https://example.com\",\n    formats=[\"markdown\", \"html\"]\n)\n```\n\n### Content Scraping\n*For extracting content from specific URLs with custom actions and formatting.*\n\n```python\nimport firecrawl\n\n# Single URL scraping\nfirecrawl.scrape(\n    url=\"https://example.com\",\n    formats=[\"markdown\", \"html\"],\n    onlyMainContent=True\n)\n```\n\n### Batch Scraping\n*For processing multiple URLs simultaneously with shared configurations.*\n\n```python\nimport firecrawl\n\n# Batch scraping\nfirecrawl.batch_scrape(\n    urls=[\"https://example.com\", \"https://sitemaps.org\"],\n    formats=[\"markdown\", \"html\"]\n)\n```\n\n### Site Mapping\n*For generating comprehensive site maps with search capabilities.*\n\n```python\nimport firecrawl\n\n# Site mapping, including subdomains\nfirecrawl.map(\n    url=\"https://example.com\",\n    includeSubdomains=True,\n    limit=1000\n)\n```\n\n## ⚙️ Advanced Configuration\n\n| Description | Parameter | Type | Example |\n|-------------|-----------|------|---------|\n| Output Formats | `formats` | List | `[\"markdown\", \"html\", \"rawHtml\"]` |\n| Main Content Only | `onlyMainContent` | Boolean | `True` |\n| Include Tags | `includeTags` | List | `[\"article\", \"main\"]` |\n| Exclude Tags | `excludeTags` | List | `[\"nav\", \"footer\"]` |\n| Custom Headers | `headers` | Dict | `{\"User-Agent\": \"Custom\"}` |\n| Wait Time (ms) | `waitFor` | Integer | `1000` |\n| Mobile View | `mobile` | Boolean | `False` |\n| Custom Actions | `actions` | List | `[{\"type\": \"click\", \"selector\": \"#btn\"}]` |\n| Location | `location` | Dict | `{\"country\": \"US\", \"languages\": [\"en-US\"]}` 
|\n\n## 📊 Available Formats\n\n- `markdown`: Formatted Markdown content\n- `html`: Clean HTML content\n- `rawHtml`: Original HTML content\n- `links`: Extracted links\n- `extract`: Custom content extraction\n- `screenshot`: Page screenshot\n- `screenshot@fullPage`: Full page screenshot\n\n## 📁 Supported Actions\n\nThe toolkit supports various page interactions:\n\n| Action Type | Description | Parameters |\n|-------------|-------------|------------|\n| `wait` | Wait for element/time | `milliseconds`, `selector` |\n| `click` | Click elements | `selector` |\n| `write` | Input text | `selector`, `text` |\n| `press` | Press keyboard keys | `key` |\n| `scroll` | Scroll page | `direction`, `amount` |\n| `screenshot` | Take screenshots | `fullPage` |\n| `scrape` | Get current page state | None |\n\n## 🔒 Error Handling and Safety\n\n| Error Type | Description | Solution |\n|------------|-------------|----------|\n| ConfigurationError | Missing or invalid configuration | Check config.yaml and API key |\n| ValidationError | Invalid request parameters | Verify parameter values |\n| APIError | API-related issues | Check error message for details |\n| NetworkError | Connection problems | Verify internet connection |\n| ResponseError | Invalid API response | Check response format expectations |\n\n## 🤝 Contributing\nContributions are welcome!\n\nPlease refer to [CONTRIBUTING.md](.github/CONTRIBUTING.md) for detailed guidelines on how to contribute to this project.\n\n## 🐛 Issues and Support\nEncountered a bug? We'd love to hear about it. Please follow these steps to report any issues:\n\n1. Check if the issue has already been reported.\n2. Use the [Bug Report](.github/ISSUE_TEMPLATE/bug_report.md) template to create a detailed report.\n3. Submit the report [here](https://github.com/RMNCLDYO/firecrawl-toolkit/issues).\n\nYour report will help us make the project better for everyone.\n\n## 💡 Feature Requests\nGot an idea for a new feature? Feel free to suggest it. Here's how:\n\n1. 
Check if the feature has already been suggested or implemented.\n2. Use the [Feature Request](.github/ISSUE_TEMPLATE/feature_request.md) template to create a detailed request.\n3. Submit the request [here](https://github.com/RMNCLDYO/firecrawl-toolkit/issues).\n\nYour suggestions for improvements are always welcome.\n\n## 🔁 Versioning and Changelog\nStay up-to-date with the latest changes and improvements in each version:\n\n- [CHANGELOG.md](.github/CHANGELOG.md) provides detailed descriptions of each release.\n\n## 🔐 Security\nYour security is important to us. If you discover a security vulnerability, please follow our responsible disclosure guidelines found in [SECURITY.md](.github/SECURITY.md). Please refrain from disclosing any vulnerabilities publicly until said vulnerability has been reported and addressed.\n\n## 📄 License\nLicensed under the MIT License. See [LICENSE](LICENSE) for details.","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Frmncldyo%2Ffirecrawl-toolkit","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Frmncldyo%2Ffirecrawl-toolkit","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Frmncldyo%2Ffirecrawl-toolkit/lists"}