{"id":33314515,"url":"https://github.com/novama/web-automator-js","last_synced_at":"2026-05-04T15:40:41.602Z","repository":{"id":322717402,"uuid":"1089186092","full_name":"novama/web-automator-js","owner":"novama","description":"Selenium and Playwright web-automation example in Javascript, with demo docker container setup for AWS lambda","archived":false,"fork":false,"pushed_at":"2025-11-14T03:19:06.000Z","size":81,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2025-11-14T05:29:17.391Z","etag":null,"topics":["aws-lambda","docker","javascript","playwright","playwright-automation","playwright-javascript","selenium","selenium-webdriver"],"latest_commit_sha":null,"homepage":"","language":"JavaScript","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/novama.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2025-11-04T02:11:41.000Z","updated_at":"2025-11-14T03:24:46.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/novama/web-automator-js","commit_stats":null,"previous_names":["novama/web-automator-js"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/novama/web-automator-js","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/novama%2Fweb-automator-js","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/novama%2Fweb-automator-js/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/novama%2Fweb-automator-js/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/novama%2Fweb-automator-js/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/novama","download_url":"https://codeload.github.com/novama/web-automator-js/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/novama%2Fweb-automator-js/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":285240542,"owners_count":27137943,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-11-19T02:00:05.673Z","response_time":65,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["aws-lambda","docker","javascript","playwright","playwright-automation","playwright-javascript","selenium","selenium-webdriver"],"created_at":"2025-11-19T12:01:17.202Z","updated_at":"2025-11-19T12:03:14.153Z","avatar_url":"https://github.com/novama.png","language":"JavaScript","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Web Automator JS\n\n[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)\n[![Node.js](https://img.shields.io/badge/Node.js-22.x-green.svg)](https://nodejs.org/)\n[![Docker](https://img.shields.io/badge/Docker-Supported-blue.svg)](https://www.docker.com/)\n[![AWS Lambda](https://img.shields.io/badge/AWS-Lambda%20Ready-orange.svg)](https://aws.amazon.com/lambda/)\n\nA web automation framework supporting both **Selenium** and **Playwright**, with containerized **AWS Lambda** deployment capabilities. Perfect for web scraping, E2E testing, and browser automation at scale.\n\n## Key Features\n\n- **Dual Framework Support** - Choose between Selenium and Playwright\n- **Docker Containerization** - Production-ready Lambda containers\n- **Smart Environment Detection** - Automatic browser configuration\n- **Health Monitoring** - Container health checks and auto-restart\n- **Cross-Platform** - Works on Windows, macOS, and Linux\n- **Serverless Ready** - Optimized for AWS Lambda deployment\n- **Production Stable** - Comprehensive error handling and logging\n\n## Architecture\n\n```txt\n┌─────────────────┐    ┌──────────────────┐    ┌─────────────────┐\n│   Lambda Event  │───▶│  Handler (index) │───▶│ Browser Engine  │\n└─────────────────┘    └──────────────────┘    └─────────────────┘\n                                │                        │\n                                ▼                        ▼\n                       ┌──────────────────┐    ┌─────────────────┐\n                       │ Environment      │    │ Selenium /      │\n                       │ Detection        │    │ Playwright      │\n                       └──────────────────┘    └─────────────────┘\n```\n\n## Quick Start\n\n### Prerequisites\n\n- **Node.js** 18+ (22.x recommended)\n- **Docker** (for containerized deployment)\n- **Git** (for cloning the repository)\n\n### Installation\n\n```bash\n# Clone the repository\ngit clone https://github.com/your-org/web-automator-js.git\ncd web-automator-js\n\n# Install dependencies\nnpm install\n\n# Setup automation drivers (choose one or both)\nnpm run setup:playwright  # Install Playwright browsers\nnpm run setup:selenium    # Install Selenium drivers\nnpm run setup:all         # Install everything\n```\n\n### Basic Usage\n\n#### Local Testing\n\n```bash\n# Test Playwright automation\nnpm run playwright\n\n# Test Selenium automation  \nnpm run selenium\n\n# Test Lambda handler locally\nnpm run lambda\n```\n\n#### Docker Development\n\n```bash\n# Build and start containerized Lambda\nnpm run docker:build \u0026\u0026 npm run docker:start\n\n# Test containerized Lambda\nnpm run docker:test\n\n# Monitor container health\nnpm run docker:health\n```\n\n## Documentation\n\n| Document | Description |\n|----------|-------------|\n| [Docker Lambda Setup](docs/DOCKER-LAMBDA-SETUP.md) | Complete containerization and deployment guide |\n| [AWS Deployment](docs/AWS-DEPLOYMENT.md) | AWS Lambda deployment instructions |\n| [Framework Comparison](docs/SELENIUM-VS-PLAYWRIGHT.md) | Selenium vs Playwright detailed comparison |\n| [Container Status](docs/CONTAINER-STATUS.md) | Health monitoring and troubleshooting |\n| [Lambda Handler](docs/LAMBDA-HANDLER.md) | Lambda function implementation guide |\n\n## Project Structure\n\n```text\nweb-automator-js/\n├── src/\n│   ├── automator/\n│   │   ├── playwright/          # Playwright automation drivers\n│   │   └── selenium/            # Selenium automation drivers\n│   ├── common/utils/            # Shared utilities\n│   └── examples/                # Example implementations\n├── tests/\n│   └── test-events/             # Lambda test event files\n├── config/                      # Configuration files\n├── scripts/                     # Setup and utility scripts\n├── Dockerfile                   # Production container image\n├── docker-compose.yml           # Local development environment\n├── package.json                 # Dependencies and npm scripts\n└── index.js                     # Main Lambda handler\n```\n\n## Available Commands\n\n### Development Commands\n\n```bash\nnpm run setup              # Check and install dependencies\nnpm run playwright         # Run Playwright example\nnpm run selenium           # Run Selenium example\nnpm run lambda             # Test Lambda handler locally\nnpm run clean              # Clean output directories\n```\n\n### Docker Commands\n\n```bash\nnpm run docker:build       # Build Lambda container\nnpm run docker:start       # Start Lambda service\nnpm run docker:test        # Test containerized Lambda\nnpm run docker:health      # Check container health\nnpm run docker:monitor     # Continuous health monitoring\nnpm run docker:restart     # Smart container restart\nnpm run docker:logs        # View container logs\nnpm run docker:stop        # Stop all containers\nnpm run docker:cleanup     # Clean up Docker resources\n```\n\n## Framework Comparison\n\nBoth frameworks are supported with smart environment detection:\n\n| Aspect | Selenium | Playwright |\n|--------|----------|------------|\n| **Speed** | Good | Excellent |\n| **Reliability** | Good | Excellent |\n| **AWS Lambda** | Supported | Optimized |\n| **Setup** | Manual | Automatic |\n\n**[View Detailed Comparison →](docs/SELENIUM-VS-PLAYWRIGHT.md)**\n\n## Configuration\n\n### Environment Variables\n\n```bash\n# Browser Configuration\nHEADLESS=true                   # Run browsers in headless mode\nBROWSER_TYPE=chromium           # Browser type (chromium/firefox/webkit)\n\n# AWS Lambda Detection (auto-detected)\nAWS_LAMBDA_FUNCTION_NAME        # Lambda function name\nAWS_EXECUTION_ENV               # AWS execution environment\nNODE_ENV                        # Application environment\n\n# Development Options\nDEBUG=true                      # Enable debug logging\nTIMEOUT=30000                   # Default timeout in milliseconds\n```\n\n### Custom Configuration\n\nCreate `config/config.json` for custom settings:\n\n```json\n{\n  \"browser\": {\n    \"headless\": true,\n    \"timeout\": 30000,\n    \"viewport\": {\n      \"width\": 1920,\n      \"height\": 1080\n    }\n  },\n  \"lambda\": {\n    \"timeout\": 30,\n    \"memorySize\": 1024\n  }\n}\n```\n\n## Docker Deployment\n\n### Local Development\n\n```bash\n# Start development environment\nnpm run docker:start\n\n# Test with sample event\nnpm run docker:test\n\n# Monitor health\nnpm run docker:monitor\n```\n\n### AWS Lambda Deployment\n\n```bash\n# Build production image\ndocker build -t web-automator-lambda .\n\n# Tag for ECR\ndocker tag web-automator-lambda:latest {account}.dkr.ecr.{region}.amazonaws.com/web-automator:latest\n\n# Deploy to Lambda (requires AWS CLI configured)\naws lambda update-function-code --function-name web-automator --image-uri {account}.dkr.ecr.{region}.amazonaws.com/web-automator:latest\n```\n\n**[Complete Deployment Guide →](docs/AWS-DEPLOYMENT.md)**\n\n## Testing\n\n### Unit Tests\n\n```bash\nnpm test                       # Run test suite (when implemented)\n```\n\n### Integration Tests\n\n```bash\nnpm run docker:test            # Test containerized Lambda\nnpm run docker:test:basic      # Basic functionality test\n```\n\n### Health Monitoring\n\n```bash\nnpm run docker:health          # One-time health check\nnpm run docker:monitor         # Continuous monitoring\n```\n\n## Examples\n\n### Basic Web Scraping\n\n```javascript\nconst { playwrightDriver } = require('./src/automator/playwright/drivers/playwrightDriver');\n\nasync function scrapeTitle(url) {\n    const driver = new playwrightDriver();\n    await driver.initialize();\n    \n    const page = await driver.browser.newPage();\n    await page.goto(url);\n    const title = await page.title();\n    \n    await driver.cleanup();\n    return title;\n}\n```\n\n### Lambda Handler Usage\n\n```javascript\n// Event format\nconst event = {\n    \"url\": \"https://example.com\",\n    \"selector\": \"h1\",\n    \"action\": \"getText\",\n    \"timeout\": 30000\n};\n\n// Response format\n{\n    \"statusCode\": 200,\n    \"body\": {\n        \"success\": true,\n        \"url\": \"https://example.com\",\n        \"title\": \"Example Domain\",\n        \"result\": \"Example Domain\"\n    }\n}\n```\n\n## Troubleshooting\n\n### Common Issues\n\n**Container crashes with segmentation fault:**\n\n```bash\nnpm run docker:restart          # Smart restart with health checks\nnpm run docker:logs             # Check error details\n```\n\n**Browser not found in Lambda:**\n\n- Ensure using `@sparticuz/chromium` package\n- Check AWS environment detection\n- Verify container image includes browsers\n\n**Network timeouts:**\n\n- Increase timeout values in configuration\n- Check Lambda function timeout settings\n- Verify network connectivity\n\n**[Complete Troubleshooting Guide →](docs/CONTAINER-STATUS.md)**\n\n## Performance\n\n### AWS Lambda Metrics\n\n- **Cold Start**: ~3-5 seconds (Playwright) / ~8-12 seconds (Selenium)\n- **Execution**: ~2-4 seconds per page (Playwright) / ~5-8 seconds (Selenium)\n- **Memory Usage**: 256MB+ (Playwright) / 512MB+ (Selenium)\n- **Container Size**: ~250MB (Playwright) / ~1.5GB (Selenium)\n\n### Optimization Tips\n\n- Use Playwright for better Lambda performance\n- Enable connection pooling for multiple requests\n- Implement intelligent caching strategies\n- Use appropriate Lambda memory allocation\n\n## License\n\nThis project is licensed under the **MIT License** - see the [LICENSE](LICENSE) file for details.\n\n## Acknowledgments\n\n- **[Playwright Team](https://playwright.dev/)** - Modern web automation framework\n- **[Selenium Project](https://selenium.dev/)** - Web automation standard\n- **[@sparticuz/chromium](https://github.com/Sparticuz/chromium)** - Serverless Chromium builds\n- **AWS Lambda Team** - Serverless compute platform\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fnovama%2Fweb-automator-js","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fnovama%2Fweb-automator-js","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fnovama%2Fweb-automator-js/lists"}