{"id":29604765,"url":"https://github.com/diabahmed/sykell-crawler","last_synced_at":"2026-04-07T22:31:52.792Z","repository":{"id":305021501,"uuid":"1020500704","full_name":"diabahmed/sykell-crawler","owner":"diabahmed","description":"A robust, scalable, and production-ready web crawler full stack application built with Go and Next.js","archived":false,"fork":false,"pushed_at":"2025-07-17T18:29:16.000Z","size":271,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-07-17T22:08:29.306Z","etag":null,"topics":["docker","golang","jwt-auth","mysql","nextjs","playwright","rest-api","websockets"],"latest_commit_sha":null,"homepage":"https://sykell.com","language":"TypeScript","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/diabahmed.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2025-07-16T01:19:40.000Z","updated_at":"2025-07-17T18:29:20.000Z","dependencies_parsed_at":"2025-07-18T00:24:51.353Z","dependency_job_id":"9a83229c-4dcd-4dd0-b7f2-cc291bd7481f","html_url":"https://github.com/diabahmed/sykell-crawler","commit_stats":null,"previous_names":["diabahmed/sykell-crawler"],"tags_count":null,"template":false,"template_full_name":null,"purl":"pkg:github/diabahmed/sykell-crawler","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/diabahmed%2Fsykell-crawler","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/diabahmed%2Fsykell-crawler/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/reposit
ories/diabahmed%2Fsykell-crawler/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/diabahmed%2Fsykell-crawler/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/diabahmed","download_url":"https://codeload.github.com/diabahmed/sykell-crawler/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/diabahmed%2Fsykell-crawler/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":31532224,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-07T16:28:08.000Z","status":"ssl_error","status_checked_at":"2026-04-07T16:28:06.951Z","response_time":105,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.5:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["docker","golang","jwt-auth","mysql","nextjs","playwright","rest-api","websockets"],"created_at":"2025-07-20T16:00:29.166Z","updated_at":"2026-04-07T22:31:52.777Z","avatar_url":"https://github.com/diabahmed.png","language":"TypeScript","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Sykell Web Crawler Platform\n\n![Go 
Version](https://img.shields.io/badge/Go-1.24.2-blue.svg)\n![Next.js](https://img.shields.io/badge/Next.js-15-black.svg)\n![React](https://img.shields.io/badge/React-19-blue.svg)\n![TypeScript](https://img.shields.io/badge/TypeScript-5-blue.svg)\n![License](https://img.shields.io/badge/license-MIT-green.svg)\n![Docker](https://img.shields.io/badge/Docker-Supported-blue.svg)\n![MySQL](https://img.shields.io/badge/Database-MySQL%208.0-orange.svg)\n\n\u003cimg width=\"1920\" height=\"1080\" alt=\"Screenshot 2025-07-16 103733\" src=\"https://github.com/user-attachments/assets/9ffe1416-401b-4a5b-91a5-b6f662ca7b48\" /\u003e\n\u003cimg width=\"1920\" height=\"1080\" alt=\"Screenshot 2025-07-16 103726\" src=\"https://github.com/user-attachments/assets/3f506173-9c9f-45c5-82c7-2db4ec7ee0a5\" /\u003e\n\u003cimg width=\"1920\" height=\"1080\" alt=\"Screenshot 2025-07-16 103759\" src=\"https://github.com/user-attachments/assets/69b86f55-0598-4d38-8062-cb59da50f92e\" /\u003e\n\u003cimg width=\"1920\" height=\"1080\" alt=\"Screenshot 2025-07-16 103815\" src=\"https://github.com/user-attachments/assets/685129f1-986c-4584-9dbe-532af1997076\" /\u003e\n\n\nA comprehensive, full-stack web crawling platform that provides powerful website analysis capabilities through a modern web interface. Built with a Go backend and a Next.js frontend, this platform offers real-time crawling, detailed analytics, and an exceptional user experience.\n\n\u003e **⚡ Rapid Development Achievement**: This entire full-stack application was built after learning Go in just one day! 
It showcases the power of modern development tools, clean architecture patterns, and the effectiveness of well-structured frameworks for building robust applications quickly.\n\n## 🌟 Platform Overview\n\nSykell is a multi-tenant web crawling platform that combines:\n\n- **Powerful Backend**: High-performance Go API with clean architecture\n- **Modern Frontend**: React-based dashboard with real-time updates\n- **Scalable Infrastructure**: Docker-containerized deployment ready for production\n- **Real-time Features**: WebSocket integration for live status updates\n- **Comprehensive Analytics**: Detailed website analysis and reporting\n\n## 🏗️ Architecture\n\n```\n┌─────────────────────────────────────────────────────────────┐\n│                    Frontend (Next.js)                       │\n│  ┌─────────────┐  ┌──────────────┐  ┌─────────────────────┐ │\n│  │  Dashboard  │  │  Real-time   │  │   Authentication    │ │\n│  │     UI      │  │   Updates    │  │        UI           │ │\n│  └─────────────┘  └──────────────┘  └─────────────────────┘ │\n└─────────────────────────────────────────────────────────────┘\n                              │\n                         HTTP/WebSocket\n                              │\n┌─────────────────────────────────────────────────────────────┐\n│                    Backend API (Go)                         │\n│  ┌─────────────┐  ┌──────────────┐  ┌─────────────────────┐ │\n│  │    RESTful  │  │   WebSocket  │  │   Authentication    │ │\n│  │     API     │  │     Hub      │  │   \u0026 Authorization   │ │\n│  └─────────────┘  └──────────────┘  └─────────────────────┘ │\n│  ┌─────────────┐  ┌──────────────┐  ┌─────────────────────┐ │\n│  │   Crawler   │  │   Business   │  │    Data Access      │ │\n│  │   Engine    │  │    Logic     │  │      Layer          │ │\n│  └─────────────┘  └──────────────┘  └─────────────────────┘ │\n└─────────────────────────────────────────────────────────────┘\n                              
│\n┌─────────────────────────────────────────────────────────────┐\n│                    Database (MySQL)                         │\n│  ┌─────────────┐  ┌──────────────┐  ┌─────────────────────┐ │\n│  │    Users    │  │    Crawls    │  │    Audit Logs       │ │\n│  │   Tables    │  │   Results    │  │   \u0026 Sessions        │ │\n│  └─────────────┘  └──────────────┘  └─────────────────────┘ │\n└─────────────────────────────────────────────────────────────┘\n```\n\n## 🚀 Features\n\n### 🕷️ Web Crawling\n\n- **Comprehensive Analysis**: HTML version detection, title extraction, heading structure analysis\n- **Link Analysis**: Internal vs. external link classification with broken link detection\n- **Form Detection**: Login form presence identification\n- **Performance Metrics**: Processing time tracking and optimization insights\n- **Real-time Processing**: Background job processing with live status updates\n\n### 👨‍💻 User Experience\n\n- **Multi-tenant System**: Complete user registration and authentication\n- **Modern Dashboard**: Responsive design with dark mode support\n- **Real-time Updates**: WebSocket integration for live crawl notifications\n- **Data Visualization**: Interactive tables with advanced filtering and sorting\n- **Bulk Operations**: Manage multiple crawls efficiently\n\n### 🛠️ Technical Excellence\n\n- **Clean Architecture**: Domain-driven design with clear separation of concerns\n- **Type Safety**: Full TypeScript coverage across the frontend\n- **Security**: JWT authentication with secure session management\n- **Performance**: Optimized concurrent processing and caching systems\n- **Scalability**: Docker containerization ready for production deployment\n\n## 📦 Repository Structure\n\n```\nsykell-crawler/\n├── 📁 client/                       # Next.js Frontend Application\n│   ├── 📁 app/                     # Next.js App Router\n│   ├── 📁 components/              # React Components\n│   ├── 📁 store/                   # State Management (Zustand)\n│ 
  ├── 📁 hooks/                   # Custom React Hooks\n│   ├── 📁 lib/                     # Utility Libraries\n│   ├── 📁 types/                   # TypeScript Definitions\n│   ├── 📁 tests/                   # E2E Tests (Playwright)\n│   ├── 📄 Dockerfile               # Frontend Container Config\n│   └── 📄 README.md                # Frontend Documentation\n├── 📁 server/                       # Go Backend API\n│   ├── 📁 cmd/api/                 # Application Entry Point\n│   ├── 📁 internal/                # Private Application Code\n│   │   ├── 📁 application/         # Business Logic Services\n│   │   ├── 📁 domain/              # Domain Entities \u0026 Interfaces\n│   │   ├── 📁 infrastructure/      # External Integrations\n│   │   └── 📁 presentation/        # HTTP/WebSocket Handlers\n│   ├── 📁 tests/                   # API Tests \u0026 Test Utilities\n│   ├── 📄 Dockerfile               # Backend Container Config\n│   └── 📄 README.md                # Backend Documentation\n├── 📄 docker-compose.yml           # Multi-service Orchestration\n├── 📄 .env.example                 # Environment Configuration Template\n├── 📄 LICENSE                      # MIT License\n└── 📄 README.md                    # This File\n```\n\n## 🚀 Quick Start\n\n### Prerequisites\n\n- **Docker \u0026 Docker Compose** (Recommended)\n- **Go 1.24.2+** (for local development)\n- **Node.js 20+** (for local development)\n- **MySQL 8.0** (if running locally)\n\n### 🐳 Docker Deployment (Recommended)\n\n1. **Clone the repository**\n\n   ```bash\n   git clone https://github.com/diabahmed/sykell-crawler.git\n   cd sykell-crawler\n   ```\n\n2. 
**Configure environment**\n\n   ```bash\n   cp .env.example .env\n   ```\n\n   Edit `.env` with your configuration:\n\n   ```env\n   # Database Configuration\n   DB_PASSWORD=your_secure_password\n   DB_NAME=web_crawler_db\n   DB_SOURCE=\"root:your_secure_password@tcp(db:3306)/web_crawler_db?charset=utf8mb4\u0026parseTime=True\u0026loc=Local\"\n\n   # Frontend Configuration\n   NEXT_PUBLIC_API_BASE_URL=http://localhost:8088/api/v1\n   NEXT_PUBLIC_WS_BASE_URL=ws://localhost:8088/api/v1/ws\n\n   # JWT Configuration\n   TOKEN_SYMMETRIC_KEY=\"your_32_character_secret_key_here\"\n   ACCESS_TOKEN_DURATION=\"24h\"\n   ```\n\n3. **Launch the platform**\n\n   ```bash\n   docker-compose up --build -d\n   ```\n\n4. **Access the application**\n   - **Frontend**: http://localhost:3000\n   - **Backend API**: http://localhost:8088\n   - **Database**: localhost:3306\n\n### 🔧 Local Development\n\nFor detailed local development instructions, refer to the component-specific READMEs:\n\n- **[Backend Development Guide](./server/README.md)** - Go API setup, testing, and development\n- **[Frontend Development Guide](./client/README.md)** - Next.js setup, components, and testing\n\n## 📖 Documentation\n\n### Component Documentation\n\n- **[📚 Backend API Documentation](./server/README.md)**\n\n  - Architecture overview\n  - API endpoints\n  - Database schema\n  - Configuration options\n  - Development guide\n\n- **[📚 Frontend Documentation](./client/README.md)**\n  - Component architecture\n  - State management\n  - UI components\n  - Testing strategy\n  - Performance optimizations\n\n### API Documentation\n\n- **API Endpoints**: Detailed in [Backend README](./server/README.md#-api-endpoints)\n\n## 🔐 Security\n\n### Authentication \u0026 Authorization\n\n- JWT-based authentication with HTTP-only cookies\n- Multi-tenant user isolation\n- Secure password hashing with bcrypt\n- Session management and automatic logout\n\n### API Security\n\n- Input validation and sanitization\n- CORS 
configuration\n- Rate limiting capabilities\n- SQL injection prevention via ORM\n\n### Infrastructure Security\n\n- Container security best practices\n- Secure environment variable handling\n- Network isolation with Docker\n\n### Environment Configurations\n\n- **Development**: Local development with hot reload\n- **Staging**: Production-like environment for testing\n- **Production**: Optimized for performance and security\n\n## 📄 License\n\nThis project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.\n\nFor detailed component documentation, please refer to:\n\n- [🔧 Backend Documentation](./server/README.md)\n- [🎨 Frontend Documentation](./client/README.md)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdiabahmed%2Fsykell-crawler","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fdiabahmed%2Fsykell-crawler","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdiabahmed%2Fsykell-crawler/lists"}