https://github.com/devhims/weblinq
High-performance web scraping and browser automation platform built on Cloudflare
better-auth d1 durableobject honojs nextjs r2 workers workers-ai
- Host: GitHub
- URL: https://github.com/devhims/weblinq
- Owner: devhims
- License: mit
- Created: 2025-05-30T16:23:52.000Z (4 months ago)
- Default Branch: main
- Last Pushed: 2025-08-29T12:26:55.000Z (about 1 month ago)
- Last Synced: 2025-08-29T14:51:17.722Z (about 1 month ago)
- Topics: better-auth, d1, durableobject, honojs, nextjs, r2, workers, workers-ai
- Language: TypeScript
- Homepage: https://weblinq.dev
- Size: 2.57 MB
- Stars: 6
- Watchers: 0
- Forks: 1
- Open Issues: 0
Metadata Files:
- Readme: README.md
- Contributing: CONTRIBUTING.md
- License: LICENSE
- Security: SECURITY.md
- Support: docs/support/contact.mdx
# WebLinq
> High-performance web scraping and browser automation platform

## Overview
WebLinq is a web scraping and browser automation platform that cuts per-request latency through **intelligent browser session reuse**. Built on Cloudflare's edge infrastructure, it keeps browser sessions warm between requests so that web operations stay fast, reliable, and scalable.
**Perfect for:** real-time web access in chat apps, browser automation, data aggregation, competitor analysis, and market research.
### Key Features
- **Browser Session Reuse**: Intelligent architecture that reduces operation latency from ~2-3s to ~200-500ms
- **High Performance**: Built on Cloudflare Workers for global edge deployment
- **Comprehensive API**: Search, screenshot capture, Markdown / HTML extraction, PDF generation, AI data extraction
- **MCP Integration**: Model Context Protocol server for AI assistant integration
- **Enterprise Ready**: Authentication, rate limiting, and secure API key management
- **Modern Dashboard**: Full-featured web interface for API management

## Live Demo
Try WebLinq instantly with our interactive API:
```bash
# Extract markdown from any webpage
curl -X POST "https://api.weblinq.dev/v1/web/markdown" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"url": "https://example.com"}'

# Take a screenshot
curl -X POST "https://api.weblinq.dev/v1/web/screenshot" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"url": "https://example.com"}'
```

**[Get your free API key](https://weblinq.dev/dashboard/api-keys)** • **[View live documentation](https://docs.weblinq.dev)**
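
The same two requests can be issued from TypeScript. The following is a minimal sketch, assuming a runtime with a global `fetch` (Node.js 18+ or a Worker). The helper names `buildRequest` and `callWebLinq` are illustrative and not part of any WebLinq SDK, and because the response format is not described here, the result is typed as `unknown`.

```typescript
// Minimal sketch: mirror the curl examples above from TypeScript.
// Helper names are hypothetical; only the endpoints and headers come from the README.

const API_BASE = "https://api.weblinq.dev/v1/web";

type Endpoint = "markdown" | "screenshot";

interface RequestSpec {
  url: string;
  init: { method: string; headers: Record<string, string>; body: string };
}

// Build the same request the curl commands send (pure function, no network access).
function buildRequest(endpoint: Endpoint, targetUrl: string, apiKey: string): RequestSpec {
  return {
    url: `${API_BASE}/${endpoint}`,
    init: {
      method: "POST",
      headers: {
        Authorization: `Bearer ${apiKey}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({ url: targetUrl }),
    },
  };
}

// Send the request and fail loudly on non-2xx responses.
async function callWebLinq(endpoint: Endpoint, targetUrl: string, apiKey: string): Promise<unknown> {
  const { url, init } = buildRequest(endpoint, targetUrl, apiKey);
  const res = await fetch(url, init);
  if (!res.ok) {
    throw new Error(`WebLinq request failed: ${res.status} ${res.statusText}`);
  }
  // Assumption: JSON responses are parsed, anything else is returned as raw bytes.
  const contentType = res.headers.get("content-type") ?? "";
  return contentType.includes("application/json") ? res.json() : res.arrayBuffer();
}
```

Calling `callWebLinq("markdown", "https://example.com", apiKey)` corresponds to the first curl command. Keeping request construction separate from sending makes the request shape easy to inspect or unit-test without network access.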
## Directory Structure
```
weblinq/
├── backend/                 # Core API server (Cloudflare Worker)
│   ├── src/
│   │   ├── durable-objects/ # Browser session management
│   │   ├── routes/          # API endpoints
│   │   ├── lib/             # Core utilities and operations
│   │   └── middlewares/     # Authentication and CORS
│   └── scripts/             # Build and deployment scripts
├── frontend/                # Next.js 15 dashboard application
│   ├── src/
│   │   ├── app/             # App router pages
│   │   ├── components/      # Reusable UI components
│   │   └── lib/             # Client utilities
├── weblinq-mcp/             # Model Context Protocol server
│   └── src/                 # MCP implementation
├── docs/                    # Mintlify documentation site
│   ├── api-reference/       # API documentation
│   └── guides/              # User guides and examples
└── tests/                   # Integration testing suite
```

## Browser Session Reuse Innovation
WebLinq's core innovation is its **intelligent browser session reuse architecture**, powered by Cloudflare Durable Objects.
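
To make the idea concrete, here is a minimal TypeScript sketch of session pooling with proactive refresh. It uses the figures quoted in this section (at most 10 sessions, rotation after 8.5 minutes, ahead of Cloudflare's 10-minute limit). The `SessionPool` class, its method names, and the in-memory bookkeeping are illustrative assumptions, not WebLinq's actual Durable Object code. For simplicity it retires idle sessions once they reach the refresh age rather than performing a true blue-green swap.

```typescript
// Illustrative sketch only: an in-memory pool that reuses warm sessions.
// The limits come from the README; every name here is hypothetical.

const MAX_SESSIONS = 10;               // "up to 10 concurrent browser sessions"
const REFRESH_AFTER_MS = 8.5 * 60_000; // rotate at 8.5 minutes, before the 10-minute limit

interface Session {
  id: number;
  createdAt: number; // epoch milliseconds
  busy: boolean;
}

class SessionPool {
  private sessions: Session[] = [];
  private nextId = 1;

  // Reuse an idle warm session if one exists; otherwise create one on demand.
  // Returns null when all MAX_SESSIONS slots are busy.
  acquire(now: number): Session | null {
    this.retireStale(now);
    const idle = this.sessions.find((s) => !s.busy);
    if (idle) {
      idle.busy = true;
      return idle;
    }
    if (this.sessions.length >= MAX_SESSIONS) return null;
    const fresh: Session = { id: this.nextId++, createdAt: now, busy: true };
    this.sessions.push(fresh);
    return fresh;
  }

  // Hand a session back so later requests can reuse it.
  release(session: Session): void {
    session.busy = false;
  }

  // Drop idle sessions that have reached the refresh age.
  // Busy sessions are left alone until they are released.
  private retireStale(now: number): void {
    this.sessions = this.sessions.filter(
      (s) => s.busy || now - s.createdAt < REFRESH_AFTER_MS,
    );
  }

  get size(): number {
    return this.sessions.length;
  }
}
```

A request that arrives while an idle session is still younger than 8.5 minutes gets that session back from `acquire` and skips browser startup, which is where the latency saving described in this section would come from.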
### Architecture Overview
- **BrowserManagerDO**: Orchestrates up to 10 concurrent browser sessions
- **BrowserDO**: Manages individual Playwright/Puppeteer browser instances
- **Session Pooling**: Maintains warm sessions across requests
- **Blue-Green Refresh**: Zero-downtime session rotation every 8.5 minutes

### Performance Benefits
| Metric                    | Traditional                   | WebLinq                    |
| ------------------------- | ----------------------------- | -------------------------- |
| **Cold Start Latency**    | 2-3 seconds                   | 200-500ms                  |
| **Resource Efficiency**   | ❌ New browser per request    | ✅ Persistent sessions     |
| **Concurrent Operations** | Limited by startup time       | Up to 10 parallel sessions |
| **Cost Optimization**     | High browser startup overhead | Reduced slot usage         |

### How It Works
1. **Session Management**: Durable Objects maintain persistent browser sessions
2. **Intelligent Allocation**: Available sessions are reused; new ones created on-demand
3. **Proactive Refresh**: Sessions are refreshed before Cloudflare's 10-minute limit
4. **Fault Tolerance**: Automatic recovery from crashes and network issues

## Core Technologies
### Backend Stack
Built with modern, high-performance technologies:
- **[Hono.js](https://hono.dev/)** `^4.7.10` - Ultra-fast web framework
- **[Drizzle ORM](https://orm.drizzle.team/)** `^0.43.1` - Type-safe database operations
- **[Zod](https://zod.dev/)** `^3.25.28` - Runtime type validation
- **[Better Auth](https://better-auth.com/)** `^1.2.8` - Modern authentication
- **[@cloudflare/puppeteer](https://github.com/cloudflare/puppeteer)** - Browser automation

## MCP Server Integration
The **`weblinq-mcp/`** directory contains a complete [Model Context Protocol](https://modelcontextprotocol.io/) server implementation, enabling seamless integration with AI assistants like Claude Desktop and other MCP-compatible clients.
### Features
- **Direct API Integration**: Connect AI assistants to WebLinq's full API
- **Real-time Operations**: Screenshot capture, data extraction, web search
- **Secure Authentication**: API key-based access control
- **Structured Responses**: Type-safe data exchange with AI models

### Usage
```bash
cd weblinq-mcp
npm install
npm run dev # Development server
npm run deploy # Deploy to Cloudflare Workers
```

The MCP server provides AI assistants with tools for web scraping, screenshot capture, and data extraction, making WebLinq's capabilities directly accessible within AI workflows.
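
For orientation, MCP clients invoke server tools with a JSON-RPC 2.0 `tools/call` request. The sketch below shows the shape of such a message in TypeScript. The tool name `screenshot` and its `url` argument are hypothetical placeholders chosen for illustration. The tool names and input schemas the server actually exposes are not listed in this README.

```typescript
// Sketch of the JSON-RPC 2.0 message an MCP client sends to invoke a tool.
// "tools/call" is the MCP method name; the tool name and arguments used below
// are hypothetical placeholders, not confirmed WebLinq tool definitions.

interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

let nextRequestId = 1;

// Build a tools/call request with a fresh, incrementing request id.
function buildToolCall(name: string, args: Record<string, unknown>): ToolCallRequest {
  return {
    jsonrpc: "2.0",
    id: nextRequestId++,
    method: "tools/call",
    params: { name, arguments: args },
  };
}

// Hypothetical tool name; query the server's tool list for the real ones.
const request = buildToolCall("screenshot", { url: "https://example.com" });
console.log(JSON.stringify(request, null, 2));
```

In practice an MCP client such as Claude Desktop builds and sends these messages itself, so this is only useful for understanding or debugging the traffic between the client and the server.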
## Quick Start
### For Developers
```bash
# Clone and setup
git clone https://github.com/devhims/weblinq.git
cd weblinq

# Install dependencies (subshells keep you in the repo root)
(cd backend && pnpm install)
(cd frontend && pnpm install)

# Setup environment
cp backend/.env.example backend/.env
cp frontend/.env.example frontend/.env.local

# Start development (run each command in its own terminal, from the repo root)
cd backend && pnpm dev   # Backend: http://localhost:8787
cd frontend && pnpm dev  # Frontend: http://localhost:3000
```

**Requirements:** Node.js 18+, Cloudflare account with Workers/D1/Durable Objects enabled

**[Full setup guide in CONTRIBUTING.md](CONTRIBUTING.md)**
## Documentation
- **[API Documentation](./docs/)** - Complete API reference and guides
- **[Quick Start Guide](./docs/getting-started/quickstart.mdx)** - Get started in 5 minutes
- **[Developer Guide](./docs/guides/examples.mdx)** - Integration examples
- **[Authentication](./docs/getting-started/authentication.mdx)** - API key setup
- **[Security Policy](./SECURITY.md)** - Vulnerability reporting and best practices

## Contributing
We welcome contributions! Please see our [Contributing Guidelines](CONTRIBUTING.md) for detailed information on:
- Development setup and workflow
- Code style and standards
- Testing requirements
- Bug reporting process
- Feature request guidelines

## License
This project is licensed under the **MIT License** - see the [LICENSE](./LICENSE) file for details.
## Acknowledgments
- **Cloudflare** - For Workers, Durable Objects, and Browser Rendering API
- **Hono.js** - For the clean, lightning-fast web framework
- **Better Auth** - The most complete authentication framework

---

**[Documentation](./docs/) • [API Reference](./docs/api-reference/) • [Examples](./docs/guides/examples.mdx) • [Contributing](CONTRIBUTING.md)**

Made with ❤️ by the WebLinq team