{"id":29151429,"url":"https://github.com/cameronking4/nextjs-firecrawl-starter","last_synced_at":"2025-09-04T20:43:29.351Z","repository":{"id":273619476,"uuid":"916260409","full_name":"cameronking4/nextjs-firecrawl-starter","owner":"cameronking4","description":"Nextjs 15 Firecrawl app to scrape doc links for an LLM. Use it as a starter kit to build your Firecrawl app. Turn any developer documentation into a GPT knowledge base. Pre-made Github Action to crawl and commit markdown responses to repo.","archived":false,"fork":false,"pushed_at":"2025-05-14T00:12:23.000Z","size":578,"stargazers_count":13,"open_issues_count":0,"forks_count":1,"subscribers_count":2,"default_branch":"main","last_synced_at":"2025-05-14T02:16:45.231Z","etag":null,"topics":["docs","firecrawl","github-actions","llm","nextjs","rombo","shadcn-ui","web-scraping"],"latest_commit_sha":null,"homepage":"https://nextjs-firecrawl-starter.vercel.app","language":"TypeScript","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/cameronking4.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2025-01-13T18:54:58.000Z","updated_at":"2025-05-14T00:12:26.000Z","dependencies_parsed_at":"2025-03-11T01:20:31.141Z","dependency_job_id":"12291776-9448-4315-8eae-46a549f15091","html_url":"https://github.com/cameronking4/nextjs-firecrawl-starter","commit_stats":null,"previous_names":["cameronking4/nextjs-firecrawl-starter"],"tags_count":1,"template":false,"template_full_name":null,"purl":"pkg:github/cameronking4/nextjs-firecrawl-starter","repository_url":"https://repos.ecosyste.ms/
api/v1/hosts/GitHub/repositories/cameronking4%2Fnextjs-firecrawl-starter","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/cameronking4%2Fnextjs-firecrawl-starter/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/cameronking4%2Fnextjs-firecrawl-starter/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/cameronking4%2Fnextjs-firecrawl-starter/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/cameronking4","download_url":"https://codeload.github.com/cameronking4/nextjs-firecrawl-starter/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/cameronking4%2Fnextjs-firecrawl-starter/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":262870877,"owners_count":23377314,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["docs","firecrawl","github-actions","llm","nextjs","rombo","shadcn-ui","web-scraping"],"created_at":"2025-07-01T00:09:09.102Z","updated_at":"2025-07-01T00:09:19.856Z","avatar_url":"https://github.com/cameronking4.png","language":"TypeScript","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Next.js Firecrawl Starter\n![image](https://github.com/user-attachments/assets/e82a0567-6ad9-44c4-bc4b-2a99543cac1f)\n\nThis Nextjs app aims to provide a modern web interface for crawling documentation and processing it for LLM use. 
Use the output `markdown`, `xml`, or `zip` files as knowledge files for a vector database, a ChatGPT GPT, an OpenAI Assistant, Claude Artifacts, Vapi.ai, Aimdoc, or any other LLM tool.\n\n![image](https://github.com/user-attachments/assets/8d48194d-7436-4227-9919-7602688c65b7)\n\nThe Next.js app generates a .md file, an .xml file, or a .zip of markdown files ready for LLM consumption. It is inspired by the [devdocs-to-llm](https://github.com/alexfazio/devdocs-to-llm) Jupyter notebook by Alex Fazio.\n\n## Features\n\n- 🌐 Serverless architecture using `Firecrawl API` v1\n- ⚡ Real-time crawl status updates\n- 🎨 Modern UI with dark/light mode support\n- 📂 Crawl History using `Local Storage`\n- 💥 GitHub Action to manually run the crawl function and commit results to the `/knowledge_bases` folder\n\n## GitHub Action\nUse the GitHub Action template to define automations. Use GitHub Actions cron schedules to crawl a given site and commit the markdown file directly to the repo.\n### Manual Trigger\nhttps://github.com/user-attachments/assets/fdc0f1a3-1632-44fc-b382-332377d73ed6\n\n### Scheduled Crawl ([available on GitHub Marketplace](https://github.com/marketplace/actions/firecrawl-scheduled-action))\n[![Firecrawl Action](https://github.com/cameronking4/nextjs-firecrawl-starter/actions/workflows/crawl-docs.yml/badge.svg)](https://github.com/cameronking4/nextjs-firecrawl-starter/actions/workflows/crawl-docs.yml)\n\nAdd this to any GitHub repo to start crawling on a schedule. After each crawl, it automatically commits the output to a specified folder. 
The default is to crawl `Hacker News` every day at midnight UTC and store the results in the `/knowledge_bases` folder.\n\n```yaml\nname: Scheduled Crawl Action\n\n# This workflow will automatically crawl the specified URL on a schedule and commit the results to your repository.\n\non:\n  schedule:\n    - cron: '0 0 * * *'  # Replace with the cron expression for the schedule you want to use (e.g., '0 0 * * *' for daily at midnight UTC)\n  workflow_dispatch:  # Allow manual triggering\n\njobs:\n  crawl:\n    runs-on: ubuntu-latest\n    permissions:\n      contents: write\n      id-token: write\n      actions: read\n\n    steps:\n      - uses: actions/checkout@v4\n      - name: Firecrawl Scheduled Action\n        uses: cameronking4/nextjs-firecrawl-starter@v1.0.0\n        with:\n          url: 'https://news.ycombinator.com' # Replace with the URL you want to crawl regularly\n          output_folder: 'knowledge_bases' # Replace with the folder name where the output commits will be saved\n          api_url: 'https://nextjs-firecrawl-starter.vercel.app' # Replace with the API URL of your Firecrawl API endpoint; this is the default URL for the starter app.\n```\n\n## OpenAPI Spec \u0026 Custom GPT Actions\nYou can use this project to serve endpoints for your LLM tools. 
In ChatGPT, you can click `Create a GPT` and then `Create Action` to allow your GPT to call the Firecrawl API endpoints and return results in chat.\n\n![image (3)](https://github.com/user-attachments/assets/1280fc24-582b-42b3-8c76-7db66c72b004)\n\n### Quickstart\nAdd the Firecrawl actions to your GPT by copying and pasting this import URL in the Configure Tab:\n```\nhttps://nextjs-firecrawl-starter.vercel.app/api/openapi\n```\nThis URL is defined and can be edited in the `/api/openapi/route.ts` file.\n\n## Tech Stack\n\n- **Framework**: Next.js 15.1.4\n- **Styling**: Tailwind CSS\n- **UI Components**: \n  - Radix UI primitives\n  - Shadcn/ui components\n- **State Management**: React Hook Form\n- **Animations**: Framer Motion \u0026 Rombo \n- **Development**: TypeScript\n- **API Routes**: Firecrawl API Key \u0026 Next.js App Router\n\n## API Routes\n\nThe application uses Next.js App Router API routes for serverless functionality:\n\n- `/api/crawl/route.ts` - Initiates a new crawl job\n- `/api/crawl/status/[id]/route.ts` - Gets the status of an ongoing crawl\n- `/api/map/route.ts` - Generates site maps\n- `/api/scrape/route.ts` - Handles individual page scraping\n\n## Getting Started\n\n1. Clone the repository\n2. Install dependencies:\n```bash\nnpm install\n# or\nyarn install\n# or\npnpm install\n```\n\n3. Create a `.env` file with your Firecrawl API key:\n```env\nFIRECRAWL_API_KEY=your_api_key_here\n```\n\n4. Run the development server:\n```bash\nnpm run dev\n# or\nyarn dev\n# or\npnpm dev\n```\n\n5. 
Open [http://localhost:3000](http://localhost:3000) in your browser\n\n## Key Dependencies\n\n- **UI \u0026 Components**:\n  - @radix-ui/* - Headless UI components\n  - class-variance-authority - Component variants\n  - tailwind-merge - Tailwind class merging\n  - lucide-react - Icons\n  - next-themes - Theme management\n  - framer-motion - Animations\n  - rombo - Animations\n- **Forms \u0026 Validation**:\n  - react-hook-form - Form handling\n  - @hookform/resolvers - Form validation\n  - zod - Schema validation\n\n## How It Works\n\n1. **Crawling**: Uses the Firecrawl API to crawl documentation sites and generate sitemaps\n2. **Processing**: Extracts content and converts it to markdown format\n3. **Status Tracking**: Real-time updates on crawl progress\n4. **Results**: Displays processed content ready for LLM consumption\n\n## Credits\n\nThis project is a Next.js implementation inspired by the [devdocs-to-llm](https://github.com/alexfazio/devdocs-to-llm) Jupyter notebook by Alex Fazio. The original project demonstrated how to use the Firecrawl API to crawl developer documentation and prepare it for LLM use.\n\nThe original Jupyter notebook implementation provides:\n- Documentation crawling with the Firecrawl API\n- Content extraction and markdown conversion\n- Export capabilities to Rentry.co and Google Docs\n\n\nThis Next.js version builds upon these capabilities by:\n- Adding a modern web interface\n- Implementing real-time crawl status tracking\n- Providing a serverless architecture for Firecrawl API processing\n- Adding dark/light theme support\n- Making the tool more accessible through a user-friendly UI deployed on Vercel\n- Adding GitHub Actions for manual and scheduled scraping\n- Providing an OpenAPI specification for LLM tool calling\n\n## License\n\nSee the [LICENSE](LICENSE) file for 
details.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcameronking4%2Fnextjs-firecrawl-starter","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fcameronking4%2Fnextjs-firecrawl-starter","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcameronking4%2Fnextjs-firecrawl-starter/lists"}