https://github.com/saifyxpro/headlessx

A lightweight, self-hosted headless browser automation platform. Designed as an alternative to Browserless, built for speed, privacy, and scalability.

# 🚀 HeadlessX v1.3.0

**Advanced Anti-Detection Web Scraping API with Comprehensive Fingerprinting Control**

[![License: MIT](https://img.shields.io/badge/License-MIT-green.svg?style=for-the-badge)](https://opensource.org/licenses/MIT)
[![Version](https://img.shields.io/badge/Version-v1.3.0-blue.svg?style=for-the-badge)](https://github.com/saifyxpro/HeadlessX/releases)
[![Node.js](https://img.shields.io/badge/Node.js-18%2B-success.svg?style=for-the-badge&logo=node.js)](https://nodejs.org/)
[![Playwright](https://img.shields.io/badge/Playwright-Latest-orange.svg?style=for-the-badge&logo=playwright)](https://playwright.dev/)

[![GitHub Stars](https://img.shields.io/github/stars/saifyxpro/HeadlessX?style=for-the-badge&logo=github)](https://github.com/saifyxpro/HeadlessX/stargazers)
[![GitHub Forks](https://img.shields.io/github/forks/saifyxpro/HeadlessX?style=for-the-badge&logo=github)](https://github.com/saifyxpro/HeadlessX/network/members)
[![Docker](https://img.shields.io/badge/Docker-Ready-blue.svg?style=for-the-badge&logo=docker)](#-docker-deployment)
[![CI/CD](https://img.shields.io/github/actions/workflow/status/saifyxpro/HeadlessX/ci.yml?style=for-the-badge&logo=github-actions&label=CI%2FCD)](https://github.com/saifyxpro/HeadlessX/actions)

[![Open Source](https://img.shields.io/badge/Open%20Source-100%25-brightgreen.svg?style=for-the-badge&logo=open-source-initiative)](https://github.com/saifyxpro/HeadlessX)
[![Contributors](https://img.shields.io/github/contributors/saifyxpro/HeadlessX?style=for-the-badge&logo=github)](https://github.com/saifyxpro/HeadlessX/graphs/contributors)
[![Issues](https://img.shields.io/github/issues/saifyxpro/HeadlessX?style=for-the-badge&logo=github)](https://github.com/saifyxpro/HeadlessX/issues)
[![PRs Welcome](https://img.shields.io/badge/PRs-Welcome-brightgreen.svg?style=for-the-badge&logo=github)](http://makeapullrequest.com)

![HeadlessX Demo](assets/main.gif)

> 🎯 **Unified Solution**: Website + API on a single domain
> 🛡️ **Advanced Anti-Detection**: Canvas/WebGL/Audio spoofing, behavioral simulation
> 🧠 **Human-like Behavior**: Bezier mouse movements, keyboard dynamics, natural scrolling
> 🚀 **Deploy Anywhere**: Docker, Node.js+PM2, or Development

---

## 🗺️ **What's Coming Next?**

### 🚀 **HeadlessX v2.0** - Full-Stack AI-Powered Platform
*The future of intelligent web scraping is here*

[![Roadmap](https://img.shields.io/badge/🗺️_View_Full_Roadmap-v2.0-purple.svg?style=for-the-badge)](./docs/roadmap-v2.md)

**🎯 Revolutionary Features Coming:**
- 🤖 **AI-Powered Admin Panel** - Intelligent task management & automation
- 🎨 **Modern React Frontend** - Sleek, responsive dashboard interface
- 🧠 **Smart Automation** - AI-driven scraping strategies & optimization
- 📊 **Advanced Analytics** - Real-time insights & performance metrics
- 🔄 **Workflow Builder** - Visual scraping pipeline creation
- 🎛️ **Enterprise Controls** - Advanced user management & permissions

*Transform your web scraping experience with the next generation of HeadlessX*

---

## ✨ v1.3.0 Key Features

### 🛡️ **Advanced Anti-Detection Engine**
- **Canvas Fingerprinting Control** - Dynamic noise injection with consistent seeds
- **WebGL Spoofing** - GPU vendor/model spoofing with realistic profiles
- **Audio Context Manipulation** - Hardware audio fingerprint database
- **WebRTC Leak Prevention** - Complete IP leak protection
- **Hardware Fingerprint Spoofing** - CPU, memory, and performance masking
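
To illustrate what "noise injection with consistent seeds" can mean in practice, here is a minimal Python sketch. It is not HeadlessX's implementation: the function name `add_canvas_noise`, the `max_delta` parameter, and the idea of seeding per session are assumptions made for the example. The point is only that a seeded generator makes the same input produce the same perturbed output on every read.

```python
import random


def add_canvas_noise(pixels, seed, max_delta=2):
    """Return a copy of 8-bit channel values with small, seed-determined noise.

    Hypothetical helper for illustration only; not part of HeadlessX.
    """
    rng = random.Random(seed)  # same seed -> same noise sequence
    noisy = []
    for value in pixels:
        delta = rng.randint(-max_delta, max_delta)
        # Clamp so every channel stays a valid 8-bit value.
        noisy.append(min(255, max(0, value + delta)))
    return noisy


# Two reads with the same seed return identical "fingerprints".
first = add_canvas_noise([0, 128, 255, 64], seed="session-42")
second = add_canvas_noise([0, 128, 255, 64], seed="session-42")
print(first == second)
```

Keeping the seed stable for a session matters because a canvas hash that changes on every read is itself a detectable signal.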

### 🧠 **Human-like Behavioral Simulation**
- **Bezier Mouse Movement** - Natural acceleration and deceleration patterns
- **Keyboard Dynamics** - Realistic dwell time and flight time variations
- **Natural Scroll Patterns** - Reader, scanner, browser behavioral profiles
- **Attention Model Simulation** - Human-like focus and interaction patterns
- **Micro-movement Injection** - Sub-pixel accuracy for maximum realism
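
The Bezier mouse movement idea can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the project's code: `mouse_path`, `cubic_bezier`, and `ease_in_out` are hypothetical names, and smoothstep easing is one simple way to get slow-fast-slow motion along a curve.

```python
def ease_in_out(t):
    """Smoothstep easing: slow start, faster middle, slow finish."""
    return t * t * (3 - 2 * t)


def cubic_bezier(p0, p1, p2, p3, t):
    """Evaluate a cubic Bezier curve at parameter t in [0, 1]."""
    u = 1 - t
    x = u**3 * p0[0] + 3 * u**2 * t * p1[0] + 3 * u * t**2 * p2[0] + t**3 * p3[0]
    y = u**3 * p0[1] + 3 * u**2 * t * p1[1] + 3 * u * t**2 * p2[1] + t**3 * p3[1]
    return (x, y)


def mouse_path(start, end, control1, control2, steps=20):
    """Return steps + 1 points from start to end along an eased Bezier curve."""
    points = []
    for i in range(steps + 1):
        t = ease_in_out(i / steps)  # easing clusters points near both ends
        points.append(cubic_bezier(start, control1, control2, end, t))
    return points


path = mouse_path((0, 0), (300, 200), (100, 0), (200, 200))
print(len(path), path[0], path[-1])
```

Because the points are sampled at equal time intervals, the tighter spacing near the endpoints reads as acceleration at the start and deceleration before the target.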

### 🌐 **WAF Bypass Capabilities**
- **Cloudflare Bypass** - Advanced challenge solving and TLS fingerprinting
- **DataDome Evasion** - Resource blocking and behavioral pattern matching
- **Incapsula/Akamai** - Generic WAF bypass with adaptive techniques
- **HTTP/2 Fingerprinting** - Stream prioritization and header ordering

### 📊 **Comprehensive Device Profiles**
- **50+ Chrome Profiles** - Desktop, mobile, and tablet configurations
- **Hardware Consistency** - CPU, GPU, memory, and sensor correlation
- **Geolocation Intelligence** - Timezone, language, and locale matching
- **Profile Validation** - Real-time consistency checking and scoring
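
As a rough idea of what consistency checking and scoring could look like, the Python sketch below scores a profile dictionary against three simple rules. Everything here is hypothetical: the field names (`locale`, `languages`, `country`, `mobile`, `max_touch_points`), the rules, and the scoring formula are assumptions for illustration and do not describe the profile schema HeadlessX actually uses.

```python
def profile_consistency(profile):
    """Return (score, issues) for a hypothetical device-profile dict."""
    issues = []
    language, _, region = profile["locale"].partition("-")

    # The first advertised language should share the locale's language code.
    languages = profile["languages"]
    if not languages or languages[0].split("-")[0] != language:
        issues.append("primary language does not match locale")

    # A locale region such as "US" should agree with the claimed country.
    if region and region != profile["country"]:
        issues.append("locale region does not match country")

    # Mobile profiles are expected to report touch support.
    if profile["mobile"] and profile["max_touch_points"] == 0:
        issues.append("mobile profile reports no touch support")

    total_checks = 3
    score = (total_checks - len(issues)) / total_checks
    return score, issues


score, issues = profile_consistency({
    "locale": "en-US",
    "languages": ["en-US", "en"],
    "country": "US",
    "mobile": True,
    "max_touch_points": 5,
})
print(score, issues)
```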

---

**Choose your deployment:**

| Method | Command | Best For |
|--------|---------|----------|
| 🐳 **Docker** | `docker-compose up -d` | Production, easy deployment |
| 🔧 **Auto Setup** | `chmod +x scripts/setup.sh && sudo ./scripts/setup.sh` | VPS/Server with full control |
| 💻 **Development** | `npm install && npm start` | Local development, testing |

**Access your HeadlessX v1.3.0:**
```
🌐 Website: https://your-subdomain.yourdomain.com
🔗 API: https://your-subdomain.yourdomain.com/api
🛡️ Stealth: https://your-subdomain.yourdomain.com/api/render/stealth
🧪 Testing: https://your-subdomain.yourdomain.com/api/test-fingerprint
📱 Profiles: https://your-subdomain.yourdomain.com/api/profiles
🔧 Health: https://your-subdomain.yourdomain.com/api/health
📊 Status: https://your-subdomain.yourdomain.com/api/status?token=YOUR_AUTH_TOKEN
```

---

## 🏗️ Enhanced Anti-Detection Architecture v1.3.0

HeadlessX v1.3.0 introduces advanced anti-detection capabilities with comprehensive fingerprinting control, behavioral simulation, and WAF bypass techniques while maintaining the modular architecture from v1.2.0.

### v1.3.0 Key Enhancements:
- **🛡️ Advanced Anti-Detection**: Canvas, WebGL, Audio, WebRTC fingerprinting control
- **🎭 Behavioral Simulation**: Human-like mouse movement with Bezier curves and keyboard dynamics
- **🌐 WAF Bypass**: Cloudflare, DataDome, and advanced evasion techniques
- **📱 Device Profiling**: Comprehensive desktop and mobile device profiles with hardware spoofing
- **🧪 Testing Framework**: Comprehensive anti-detection testing and validation
- **🔧 Separation of Concerns**: Enhanced modules for fingerprinting, behavioral, and evasion services
- **🚀 Better Performance**: Optimized browser management with intelligent profile-based pooling
- **🛠️ Developer Experience**: Development tools, profile generators, and interactive testing
- **📦 Production Ready**: Enhanced error handling, detection analytics, and profile validation
- **🔒 Security**: Advanced authentication, profile management, and secure fingerprint storage
- **📊 Monitoring**: Real-time detection monitoring, success rate analytics, and performance benchmarking

### v1.3.0 Architecture Overview:
```
┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐
│ Routes          │───▶│ Controllers     │───▶│ Services        │
│ (api.js)        │    │ (rendering.js)  │    │ (browser.js)    │
│ (admin.js)      │    │ (profiles.js)   │    │ (stealth.js)    │
└─────────────────┘    │ (detection.js)  │    │ (interaction.js)│
         │             └─────────────────┘    └─────────────────┘
         ▼                      │                      │
┌─────────────────┐             ▼                      ▼
│ Middleware      │    ┌─────────────────┐    ┌─────────────────┐
│ (auth.js)       │    │ Utils           │    │ Config          │
│ (error.js)      │    │ (logger.js)     │    │ (index.js)      │
│ (analyzer.js)   │    │ (helpers.js)    │    │ (browser.js)    │
└─────────────────┘    │ (validator.js)  │    │ (profiles/)     │
         │             └─────────────────┘    └─────────────────┘
         ▼                      │                      │
┌─────────────────┐             ▼                      ▼
│ Fingerprinting  │    ┌─────────────────┐    ┌─────────────────┐
│ (canvas-spoof)  │    │ Behavioral      │    │ Evasion         │
│ (webgl-spoof)   │    │ (mouse-movement)│    │ (cloudflare)    │
│ (audio-context) │    │ (keyboard-dyn)  │    │ (datadome)      │
│ (webrtc-ctrl)   │    │ (scroll-pattern)│    │ (waf-bypass)    │
│ (hardware-noise)│    │ (attention-mod) │    │ (tls-fingerpr)  │
└─────────────────┘    └─────────────────┘    └─────────────────┘
         │                      │                      │
         ▼                      ▼                      ▼
┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐
│ Testing         │    │ Development     │    │ Profiles        │
│ (test-framework)│    │ (dev-tools)     │    │ (chrome-prof)   │
│ (detection-test)│    │ (profile-gen)   │    │ (mobile-prof)   │
│ (performance)   │    │ (fingerpr-test) │    │ (firefox-prof)  │
└─────────────────┘    └─────────────────┘    └─────────────────┘
```

**Migration from v1.2.0:**
- All v1.2.0 functionality preserved with enhanced anti-detection capabilities
- New environment variables for fingerprint control and stealth configuration
- Enhanced API endpoints for profile management and detection testing
- Backward compatible with all existing configurations and scripts

📖 **Detailed Documentation**: [MODULAR_ARCHITECTURE.md](docs/MODULAR_ARCHITECTURE.md)

---

## 🚀 Deployment Guide

### 🐳 **Docker Deployment (Recommended)**

```bash
# Install Docker (if needed)
curl -fsSL https://get.docker.com | sh
sudo usermod -aG docker $USER

# Deploy HeadlessX
git clone https://github.com/SaifyXPRO/HeadlessX.git
cd HeadlessX
cp .env.example .env
nano .env # Configure DOMAIN, SUBDOMAIN, AUTH_TOKEN

# Start services
docker-compose up -d

# Optional: Set up SSL
sudo apt install -y certbot python3-certbot-nginx
sudo certbot --nginx -d your-subdomain.yourdomain.com
```

**Docker Management:**
```bash
docker-compose ps # Check status
docker-compose logs headlessx # View logs
docker-compose restart # Restart services
docker-compose down # Stop services
```

### 🔧 **Node.js + PM2 Deployment**

```bash
# Automated setup (recommended)
git clone https://github.com/SaifyXPRO/HeadlessX.git
cd HeadlessX
cp .env.example .env
nano .env # Configure environment
chmod +x scripts/setup.sh
sudo ./scripts/setup.sh # Installs dependencies, builds website, starts PM2
```

**🌐 Nginx Configuration (Auto-handled by setup script):**

The setup script configures nginx automatically. To configure it manually instead:

```bash
# Copy and configure nginx site
sudo cp nginx/headlessx.conf /etc/nginx/sites-available/headlessx

# Replace placeholders with your actual domain
sudo sed -i 's/SUBDOMAIN.DOMAIN.COM/your-subdomain.yourdomain.com/g' /etc/nginx/sites-available/headlessx

# Enable the site
sudo ln -sf /etc/nginx/sites-available/headlessx /etc/nginx/sites-enabled/
sudo rm -f /etc/nginx/sites-enabled/default

# Test and reload nginx
sudo nginx -t && sudo systemctl reload nginx
```

**Manual setup (if not using setup script):**
```bash
sudo apt update && sudo apt upgrade -y
curl -fsSL https://deb.nodesource.com/setup_20.x | sudo -E bash -
sudo apt install -y nodejs build-essential
npm install && npm run build
sudo npm install -g pm2
npm run pm2:start
```

**PM2 Management:**
```bash
npm run pm2:status # Check status
npm run pm2:logs # View logs
npm run pm2:restart # Restart server
npm run pm2:stop # Stop server
```

### 💻 **Development Setup**

```bash
git clone https://github.com/SaifyXPRO/HeadlessX.git
cd HeadlessX
cp .env.example .env
nano .env # Set AUTH_TOKEN, DOMAIN=localhost, SUBDOMAIN=headlessx

# Make scripts executable
chmod +x scripts/*.sh

# Install dependencies
npm install
cd website && npm install && npm run build && cd ..

# Start development server
npm start # Access at http://localhost:3000
```

---

## 🌐 API Routes & Structure

```
HeadlessX Routes:
├── /favicon.ico        → Favicon
├── /robots.txt         → SEO robots file
├── /api/health         → Health check (no auth required)
├── /api/status         → Server status (requires token)
├── /api/render         → Full page rendering
├── /api/html           → HTML extraction
├── /api/content        → Clean text extraction
├── /api/screenshot     → Screenshot generation
├── /api/pdf            → PDF generation
└── /api/batch          → Batch URL processing

**🔄 Request Flow:**
1. Nginx receives the request on port 80/443
2. Nginx proxies it to the Node.js server on port 3000
3. The server routes by path:
   - `/api/*` → API endpoints
   - `/*` → Website files (built Next.js app)
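
The path-based split in the last step can be sketched as a small Python function. This is only a model of the rule as described above; `route_target` is a hypothetical name, and the real server may apply extra rules (for example for `/favicon.ico` or `/robots.txt`) that are not shown here.

```python
def route_target(path):
    """Classify a request path as 'api' or 'website' per the flow above."""
    # Match "/api" itself and anything below "/api/", but not "/apidocs".
    if path == "/api" or path.startswith("/api/"):
        return "api"
    return "website"


for example in ("/api/health", "/", "/docs/getting-started"):
    print(example, "->", route_target(example))
```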

---

## 🚀 API Examples & HTTP Integrations

### Quick Health Check (No Auth)
```bash
curl https://your-subdomain.yourdomain.com/api/health
```

### 🔧 cURL Examples

#### 🛡️ v1.3.0 Anti-Detection Rendering (Maximum Stealth)
```bash
curl -X POST "https://your-subdomain.yourdomain.com/api/render/stealth?token=YOUR_AUTH_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://example.com",
    "profile": "desktop-chrome",
    "stealthMode": "maximum",
    "behaviorSimulation": true,
    "timeout": 30000
  }'
```

#### 📱 Mobile Device Simulation
```bash
curl -X POST "https://your-subdomain.yourdomain.com/api/render?token=YOUR_AUTH_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://example.com",
    "profile": "iphone-14-pro",
    "geolocation": {"latitude": 40.7128, "longitude": -74.0060},
    "behaviorSimulation": true
  }'
```

#### 🧪 Test Anti-Detection Capabilities
```bash
curl -X POST "https://your-subdomain.yourdomain.com/api/test-fingerprint?token=YOUR_AUTH_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "profile": "desktop-chrome",
    "testCanvas": true,
    "testWebGL": true,
    "testAudio": true
  }'
```

#### 📊 Get Available Device Profiles
```bash
curl "https://your-subdomain.yourdomain.com/api/profiles?token=YOUR_AUTH_TOKEN"
```

#### 🎭 Behavioral Simulation with WAF Bypass
```bash
curl -X POST "https://your-subdomain.yourdomain.com/api/render?token=YOUR_AUTH_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://example.com",
    "profile": "desktop-firefox",
    "cloudflareBypass": true,
    "datadomeBypass": true,
    "mouseMovement": "natural",
    "keyboardDynamics": "human",
    "timeout": 45000
  }'
```

#### Extract HTML Content
```bash
curl -X POST "https://your-subdomain.yourdomain.com/api/html?token=YOUR_AUTH_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"url": "https://example.com", "timeout": 30000}'
```

#### Generate Screenshot
```bash
curl "https://your-subdomain.yourdomain.com/api/screenshot?token=YOUR_AUTH_TOKEN&url=https://example.com&fullPage=true" \
  -o screenshot.png
```
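
When the target URL carries its own query string, it has to be percent-encoded before it is placed in the `url` parameter of a GET request, otherwise its `&` and `?` characters are read as part of the HeadlessX query. A minimal Python sketch, assuming the `token`, `url`, and `fullPage` parameters shown in the cURL example; `screenshot_url` is a hypothetical helper name:

```python
from urllib.parse import urlencode


def screenshot_url(base_url, token, target_url, full_page=True):
    """Build a GET URL for /api/screenshot with a safely encoded target URL."""
    query = urlencode({
        "token": token,
        "url": target_url,  # urlencode percent-encodes ':', '/', '?', '&', '='
        "fullPage": "true" if full_page else "false",
    })
    return f"{base_url}/api/screenshot?{query}"


print(screenshot_url(
    "https://your-subdomain.yourdomain.com",
    "YOUR_AUTH_TOKEN",
    "https://example.com/?q=1&page=2",
))
```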

#### Extract Text Only
```bash
curl -X POST "https://your-subdomain.yourdomain.com/api/content?token=YOUR_AUTH_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"url": "https://example.com", "waitForSelector": "main"}'
```

#### Generate PDF
```bash
curl -X POST "https://your-subdomain.yourdomain.com/api/pdf?token=YOUR_AUTH_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"url": "https://example.com", "format": "A4"}' \
  -o document.pdf
```

### 🤖 Make.com (Integromat) Integration

**HTTP Request Module Configuration:**
```json
{
  "url": "https://your-subdomain.yourdomain.com/api/html",
  "method": "POST",
  "headers": {
    "Content-Type": "application/json"
  },
  "qs": {
    "token": "YOUR_AUTH_TOKEN"
  },
  "body": {
    "url": "{{url_to_scrape}}",
    "timeout": 30000,
    "waitForSelector": "{{optional_selector}}"
  }
}
```

### ⚡ Zapier Integration

**Webhooks by Zapier Setup:**
- **URL:** `https://your-subdomain.yourdomain.com/api/html?token=YOUR_AUTH_TOKEN`
- **Method:** POST
- **Headers:** `Content-Type: application/json`
- **Body:**
```json
{
  "url": "{{url_from_trigger}}",
  "timeout": 30000,
  "humanBehavior": true
}
```

### 🔗 n8n Integration

**HTTP Request Node:**
```json
{
  "url": "https://your-subdomain.yourdomain.com/api/html",
  "method": "POST",
  "authentication": "queryAuth",
  "query": {
    "token": "YOUR_AUTH_TOKEN"
  },
  "headers": {
    "Content-Type": "application/json"
  },
  "body": {
    "url": "={{$json.url}}",
    "timeout": 30000,
    "humanBehavior": true
  }
}
```

**Available via n8n Community Node:**
- Install: `npm install n8n-nodes-headlessx`
- [GitHub Repository](https://github.com/SaifyXPRO/n8n-nodes-headlessx)

### 🐍 Python Example
```python
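import requests

def scrape_with_headlessx(url, token):
    """Fetch rendered HTML for `url` through the /api/html endpoint."""
    response = requests.post(
        "https://your-subdomain.yourdomain.com/api/html",
        params={"token": token},
        json={
            "url": url,
            "timeout": 30000,
            "humanBehavior": True
        },
        timeout=60,  # client-side limit in seconds, above the 30 s render timeout
    )
    response.raise_for_status()  # surface HTTP errors such as a rejected token
    return response.json()

# Usage
result = scrape_with_headlessx("https://example.com", "YOUR_TOKEN")
print(result['html'])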
```

### 🟨 JavaScript/Node.js Example
```javascript
const axios = require('axios');

async function scrapeWithHeadlessX(url, token) {
  try {
    const response = await axios.post(
      `https://your-subdomain.yourdomain.com/api/html?token=${token}`,
      {
        url: url,
        timeout: 30000,
        humanBehavior: true
      }
    );
    return response.data;
  } catch (error) {
    console.error('Scraping failed:', error.message);
    throw error;
  }
}

// Usage
scrapeWithHeadlessX('https://example.com', 'YOUR_TOKEN')
  .then(result => console.log(result.html))
  .catch(error => console.error(error));
```

### 🔄 Batch Processing Example
```bash
curl -X POST "https://your-subdomain.yourdomain.com/api/batch?token=YOUR_AUTH_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "urls": [
      "https://example1.com",
      "https://example2.com",
      "https://example3.com"
    ],
    "timeout": 30000,
    "humanBehavior": true
  }'
```

### Batch Processing
```bash
curl -X POST "https://your-subdomain.yourdomain.com/api/batch?token=YOUR_AUTH_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "urls": ["https://example.com", "https://httpbin.org"],
    "format": "text",
    "options": {"timeout": 30000}
  }'
```
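
For long URL lists it can help to split the work into several `/api/batch` calls. The Python sketch below only builds the request bodies, using the `urls` and `timeout` fields from the first batch example. The `batch_size` of 10 is an arbitrary assumption, since this README does not state a per-request limit, and `batch_payloads` is a hypothetical helper name.

```python
def batch_payloads(urls, batch_size=10, timeout=30000):
    """Yield one /api/batch request body per group of at most batch_size URLs."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(urls), batch_size):
        yield {
            "urls": urls[start:start + batch_size],
            "timeout": timeout,
        }


urls = [f"https://example.com/page/{i}" for i in range(25)]
for payload in batch_payloads(urls):
    # Each payload could be sent as the JSON body of a POST to /api/batch.
    print(len(payload["urls"]), "URLs in this request")
```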

---

## 📁 Project Structure

```
HeadlessX v1.3.0 - Enhanced Anti-Detection Architecture/
├── 📂 src/                       # Modular application source
│   ├── 📂 config/                # Configuration management
│   │   ├── index.js              # Main configuration loader
│   │   └── browser.js            # Browser-specific settings
│   ├── 📂 utils/                 # Utility functions
│   │   ├── errors.js             # Error handling & categorization
│   │   ├── logger.js             # Structured logging
│   │   └── helpers.js            # Common utilities
│   ├── 📂 services/              # Business logic services
│   │   ├── browser.js            # Browser lifecycle management
│   │   ├── stealth.js            # Anti-detection techniques
│   │   ├── interaction.js        # Human-like behavior
│   │   └── rendering.js          # Core rendering logic
│   ├── 📂 middleware/            # Express middleware
│   │   ├── auth.js               # Authentication
│   │   └── error.js              # Error handling
│   ├── 📂 controllers/           # Request handlers
│   │   ├── system.js             # Health & status endpoints
│   │   ├── rendering.js          # Main rendering endpoints
│   │   ├── batch.js              # Batch processing
│   │   └── get.js                # GET endpoints & docs
│   ├── 📂 routes/                # Route definitions
│   │   ├── api.js                # API route mappings
│   │   └── static.js             # Static file serving
│   ├── app.js                    # Main application setup
│   ├── server.js                 # Entry point for PM2
│   └── rate-limiter.js           # Rate limiting implementation
├── 📂 website/                   # Next.js website (unchanged)
│   ├── app/                      # Next.js 13+ app directory
│   ├── components/               # React components
│   ├── .env.example              # Website environment template
│   ├── next.config.js            # Next.js configuration
│   └── package.json              # Website dependencies
├── 📂 scripts/                   # Deployment & management scripts
│   ├── setup.sh                  # Automated installation (updated)
│   ├── update_server.sh          # Server update script (updated)
│   ├── verify-domain.sh          # Domain verification
│   └── test-routing.sh           # Integration testing
├── 📂 nginx/                     # Nginx configuration
│   └── headlessx.conf            # Nginx proxy config
├── 📂 docker/                    # Docker deployment (updated)
│   ├── Dockerfile                # Container definition
│   └── docker-compose.yml        # Docker Compose setup
├── ecosystem.config.js           # PM2 configuration (moved to root)
├── .env.example                  # Environment template (updated)
├── package.json                  # Server dependencies (updated)
├── docs/
│   └── MODULAR_ARCHITECTURE.md   # Architecture documentation
└── README.md                     # This file
```

---

## 🛠️ Development

### Local Development
```bash
# 1. Install dependencies
npm install

# 2. Build website
cd website
npm install
npm run build
cd ..

# 3. Set environment variables
export AUTH_TOKEN="development_token_123"
export DOMAIN="localhost"
export SUBDOMAIN="headlessx"

# 4. Start server
npm start # Uses src/app.js

# 5. Access locally
# Website: http://localhost:3000
# API: http://localhost:3000/api/health
```

### Testing Integration
```bash
# Test server and website integration
bash scripts/test-routing.sh localhost

# Test with environment variables
bash scripts/verify-domain.sh
```

---

## ⚙️ Configuration

### 🌐 **Environment Variables (.env)**

Create your `.env` file from the template:
```bash
cp .env.example .env
nano .env
```

**Required configuration:**
```bash
# Security Token (Generate a secure random string)
AUTH_TOKEN=your_secure_token_here

# Domain Configuration
DOMAIN=yourdomain.com
SUBDOMAIN=headlessx

# Optional: Browser Settings
BROWSER_TIMEOUT=60000
MAX_CONCURRENT_BROWSERS=5

# Optional: Server Settings
PORT=3000
NODE_ENV=production
```
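
Before starting the server it can be useful to confirm that the required values are present and to generate a strong `AUTH_TOKEN`. The Python sketch below is an assumption-laden helper, not part of HeadlessX: `parse_env`, `missing_keys`, and `generate_token` are hypothetical names, and the parser handles only plain `KEY=VALUE` lines and `#` comments.

```python
import secrets

REQUIRED_KEYS = ("AUTH_TOKEN", "DOMAIN", "SUBDOMAIN")


def parse_env(text):
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def missing_keys(values):
    """Return the required keys that are absent or empty."""
    return [key for key in REQUIRED_KEYS if not values.get(key)]


def generate_token(num_bytes=32):
    """Return a URL-safe random token suitable for AUTH_TOKEN."""
    return secrets.token_urlsafe(num_bytes)


sample = "AUTH_TOKEN=change_me\nDOMAIN=yourdomain.com\n# SUBDOMAIN not set yet\n"
print(missing_keys(parse_env(sample)))
print(generate_token())
```

To check a real file, pass its contents, for example `parse_env(open(".env").read())`.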

### 🌐 **Nginx Domain Setup**

**Option 1: Automatic (Recommended)**
```bash
# The setup script automatically replaces domain placeholders
sudo ./scripts/setup.sh
```

**Option 2: Manual Configuration**
```bash
# Copy nginx configuration
sudo cp nginx/headlessx.conf /etc/nginx/sites-available/headlessx

# Replace domain placeholders (replace with your actual domain)
sudo sed -i 's/SUBDOMAIN.DOMAIN.COM/headlessx.yourdomain.com/g' /etc/nginx/sites-available/headlessx

# Example: If your domain is "api.example.com"
sudo sed -i 's/SUBDOMAIN.DOMAIN.COM/api.example.com/g' /etc/nginx/sites-available/headlessx

# Enable site and reload nginx
sudo ln -sf /etc/nginx/sites-available/headlessx /etc/nginx/sites-enabled/
sudo nginx -t && sudo systemctl reload nginx
```

**Your final URLs will be:**
- Website: `https://your-subdomain.yourdomain.com`
- API Health: `https://your-subdomain.yourdomain.com/api/health`
- API Endpoints: `https://your-subdomain.yourdomain.com/api/*`

---

## 📊 API Reference

### 🔧 **Core Endpoints**

| Endpoint | Method | Description | Auth Required |
|----------|--------|-------------|---------------|
| `/api/health` | GET | Health check | ❌ |
| `/api/status` | GET | Server status | ✅ |
| `/api/render` | POST | Full page rendering (JSON) | ✅ |
| `/api/html` | GET/POST | Raw HTML extraction | ✅ |
| `/api/content` | GET/POST | Clean text extraction | ✅ |
| `/api/screenshot` | GET | Screenshot generation | ✅ |
| `/api/pdf` | GET | PDF generation | ✅ |
| `/api/batch` | POST | Batch URL processing | ✅ |

### 🔑 **Authentication**
All endpoints (except `/api/health`) require a token via:
- Query parameter: `?token=YOUR_TOKEN`
- Header: `X-Token: YOUR_TOKEN`
- Header: `Authorization: Bearer YOUR_TOKEN`
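
The three options can be compared side by side with a short Python sketch that builds, but does not send, a request for each style. `authorized_request` and its `style` argument are hypothetical names chosen for this example; only the parameter and header names come from the list above.

```python
from urllib.parse import quote
from urllib.request import Request


def authorized_request(url, token, style="bearer"):
    """Build (without sending) a GET request that carries the token."""
    if style == "query":
        separator = "&" if "?" in url else "?"
        return Request(f"{url}{separator}token={quote(token, safe='')}")
    if style == "x-token":
        return Request(url, headers={"X-Token": token})
    if style == "bearer":
        return Request(url, headers={"Authorization": f"Bearer {token}"})
    raise ValueError(f"unknown auth style: {style}")


status_url = "https://your-subdomain.yourdomain.com/api/status"
for style in ("query", "x-token", "bearer"):
    request = authorized_request(status_url, "YOUR_TOKEN", style)
    print(style, request.full_url, request.header_items())
```

Header-based styles keep the token out of the URL, which is worth considering because request URLs commonly end up in proxy and access logs.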

### 📖 **Complete Documentation**
Visit your HeadlessX website for full API documentation with examples, or check:
- [GET Endpoints](docs/GET_ENDPOINTS.md)
- [POST Endpoints](docs/POST_ENDPOINTS.md)

---

## 📊 Monitoring & Troubleshooting

### 🔍 **Health Checks**
```bash
curl https://your-subdomain.yourdomain.com/api/health
curl "https://your-subdomain.yourdomain.com/api/status?token=YOUR_TOKEN"
```
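
After a deploy or restart, the health endpoint may take a few seconds to respond. A minimal Python sketch of a retry loop follows. It assumes that an HTTP 200 from `/api/health` means the service is up; the response body format is not documented in this README, so it is not inspected. `wait_until_healthy` and `health_check` are hypothetical helper names.

```python
import time
from urllib.request import urlopen


def wait_until_healthy(check, attempts=5, delay=1.0, sleep=time.sleep):
    """Call check() until it returns True; return the attempt number or None."""
    for attempt in range(1, attempts + 1):
        try:
            if check():
                return attempt
        except OSError:
            pass  # connection refused, timeout, or HTTP error: not ready yet
        if attempt < attempts:
            sleep(delay)
    return None


def health_check(url="https://your-subdomain.yourdomain.com/api/health"):
    """Return True when the health endpoint answers with HTTP 200."""
    with urlopen(url, timeout=5) as response:
        return response.status == 200


# Example (performs real HTTP requests when uncommented):
# print(wait_until_healthy(health_check, attempts=10, delay=3.0))
```

Passing the check as a callable keeps the retry logic testable without network access.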

### 📋 **Log Management**
```bash
# PM2 logs
npm run pm2:logs
pm2 logs headlessx --lines 100

# Docker logs
docker-compose logs -f headlessx

# Nginx logs
sudo tail -f /var/log/nginx/access.log
```

### 🔄 **Updates**
```bash
git pull origin main
npm run build # Rebuild website
npm run pm2:restart # PM2
# OR
docker-compose restart # Docker
```

### 🔧 **Common Issues**

**"npm ci" Error (missing package-lock.json):**
```bash
chmod +x scripts/generate-lockfiles.sh
./scripts/generate-lockfiles.sh # Generate lock files
# OR
npm install --production # Use install instead
```

**"Cannot find module 'express'":**
```bash
npm install # Install dependencies
```

**System dependency errors (Ubuntu):**
```bash
sudo apt update && sudo apt install -y \
  libatk1.0-0t64 libatk-bridge2.0-0t64 libcups2t64 \
  libatspi2.0-0t64 libasound2t64 libxcomposite1
```

**PM2 not starting:**
```bash
sudo npm install -g pm2
chmod +x scripts/setup.sh # Make script executable
pm2 start ecosystem.config.js # Config file lives in the project root
pm2 logs headlessx # Check errors
```

**Script permission errors:**
```bash
# Make all scripts executable
chmod +x scripts/*.sh

# Or use the quick setup
chmod +x scripts/quick-setup.sh && ./scripts/quick-setup.sh
```

**Playwright browser installation errors:**
```bash
# Use dedicated Playwright setup script
chmod +x scripts/setup-playwright.sh
./scripts/setup-playwright.sh

# Or install manually:
sudo apt update && sudo apt install -y \
  libgtk-3-0t64 libpangocairo-1.0-0 libcairo-gobject2 \
  libgdk-pixbuf-2.0-0 libdrm2 libxss1 libxrandr2 \
  libasound2t64 libatk1.0-0t64 libnss3

# Install only Chromium (most stable)
npx playwright install chromium

# Alternative: Use Docker (avoids dependency issues)
docker-compose up -d
```

---

## 🔐 Security Features

- **Token Authentication**: Secure API access with custom tokens
- **Rate Limiting**: Nginx-level request throttling
- **Security Headers**: XSS, CSRF, and clickjacking protection
- **Bot Protection**: Common attack vector blocking
- **SSL/TLS**: Automatic HTTPS with Let's Encrypt

---

## 🤝 Contributing

1. Fork the repository
2. Create your feature branch (`git checkout -b feature/amazing-feature`)
3. Commit your changes (`git commit -m 'Add amazing feature'`)
4. Push to the branch (`git push origin feature/amazing-feature`)
5. Open a Pull Request

---

## 📄 License

This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.

---

## 🆘 Support

- **📖 Documentation**: Visit your deployed website for full API docs
- **🐛 Issues**: [GitHub Issues](https://github.com/SaifyXPRO/HeadlessX/issues)
- **💬 Community**: [GitHub Discussions](https://github.com/SaifyXPRO/HeadlessX/discussions) (Coming Soon)

---

## 🎯 Built by SaifyXPRO

**HeadlessX v1.3.0** - The most advanced open-source browserless web scraping solution.

Made with ❤️ for the developer community.