{"id":28713602,"url":"https://github.com/zetxtech/hellobot","last_synced_at":"2026-02-26T10:30:31.382Z","repository":{"id":298813789,"uuid":"1001208204","full_name":"zetxtech/hellobot","owner":"zetxtech","description":null,"archived":false,"fork":false,"pushed_at":"2025-06-13T02:53:34.000Z","size":20,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2025-06-13T03:43:55.630Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Vue","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/zetxtech.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2025-06-13T02:15:27.000Z","updated_at":"2025-06-13T02:53:37.000Z","dependencies_parsed_at":"2025-06-13T03:43:58.617Z","dependency_job_id":"27842af0-081e-4855-b86b-e15d4d6c7db5","html_url":"https://github.com/zetxtech/hellobot","commit_stats":null,"previous_names":["zetxtech/hellobot"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/zetxtech/hellobot","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/zetxtech%2Fhellobot","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/zetxtech%2Fhellobot/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/zetxtech%2Fhellobot/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/zetxtech%2Fhellobot/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/zetxtech","download_url":"https://codeload.github.com/zetxtech/hellobot/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/zetxtech%2Fhellobot/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":259901390,"owners_count":22929227,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2025-06-15T00:11:19.898Z","updated_at":"2025-10-30T11:20:07.400Z","avatar_url":"https://github.com/zetxtech.png","language":"Vue","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Hellobot - A Cloud-Native File Processing Service\n\n## Introduction\n\nThis project is a sample implementation of a standard, large-scale file processing workflow, inspired by the architecture you proposed. A live demo is available at: [hellobot.zetx.tech](https://hellobot.zetx.tech/)\n\nThis service provides a simple yet powerful function: for any number and size of files uploaded by the user, it adds a \"Hello\\! \" prefix to the beginning of each line.\n\nThe entire service is deployed on AWS (Amazon Web Services), leveraging a fully **Serverless** and **Cloud-Native** architecture to achieve high elasticity, availability, and cost-efficiency.\n\n## Architecture\n\n### Brief Overview\n\nThe entire system is event-driven. Core components are decoupled via S3 events and SQS messages, enabling massive scalability.\n\nYou can find the source code for all Lambda functions in the `aws/lambdas` directory of this project.\n\n\u003cdetails\u003e\n\u003csummary\u003eClick to expand/collapse the Mini Architecture Diagram\u003c/summary\u003e\n\n```mermaid\ngraph LR\n    User[\"\u003cfa:fa-user\u003e User / Client\"]\n\n    subgraph \"AWS Cloud Infrastructure\"\n        API[\"\u003cfa:fa-server\u003e API Gateway\"]\n        Engine[\"\u003cfa:fa-cogs\u003e Async Processing Engine\u003cbr\u003e(Lambda)\"]\n        S3[\"\u003cfa:fa-database\u003e S3 Storage\"]\n        DB[\"\u003cfa:fa-table\u003e Job Status Tracker\u003cbr\u003e(DynamoDB)\"]\n        Error[\"\u003cfa:fa-bug\u003e Error Handler\"]\n    end\n\n    %% --- Workflow ---\n    User -- \"Get URL / Check Status\" --\u003e API\n    User -- \"Upload File\" --\u003e S3\n    \n    API -- \"Reads / Updates\" --\u003e DB\n    API -- \"Generates URL for\" --\u003e S3\n    \n    S3 -- \"Triggers Processing\" --\u003e Engine\n    \n    Engine -- \"Processes Chunks via\" --\u003e S3\n    Engine -- \"Updates Status in\" --\u003e DB\n    Engine -- \"Writes Final Result to\" --\u003e S3\n    \n    User -- \"Download Result from\" --\u003e S3\n\n    %% --- Error Handling ---\n    Engine -- \"On Failure / Timeout\" --\u003e Error\n    Error -- \"Marks Job as FAILED in\" --\u003e DB\n\n\n    %% --- Styling ---\n    classDef user fill:#e9f5ff,stroke:#005ea2,stroke-width:2px;\n    classDef default fill:#f9f9f9,stroke:#333;\n    classDef api fill:#9C27B0,stroke:#333,stroke-width:2px,color:white;\n    classDef engine fill:#FF9900,stroke:#333,stroke-width:2px,color:white;\n    classDef storage fill:#2E73B8,stroke:#333,stroke-width:2px,color:white;\n    classDef db fill:#3F8627,stroke:#333,stroke-width:2px,color:white;\n    classDef error fill:#D82231,stroke:#333,stroke-width:2px,color:white;\n\n    class User user;\n    class API api;\n    class Engine engine;\n    class S3 storage;\n    class DB db;\n    class Error error;\n```\n\n\u003c/details\u003e\n\n**The main processing flow is as follows:**\n\n1.  **Get Upload Link**: The user's browser application calls our API Gateway to obtain a secure S3 Presigned URL for file upload.\n2.  **Upload \u0026 Trigger**: The user uploads the file directly to an S3 bucket using the presigned URL. Upon successful upload, an S3 event automatically triggers the `FileOrchestrator` Lambda function, kicking off the backend process.\n3.  **Orchestration \u0026 Dispatch**: The `FileOrchestrator` function reads the file's metadata (e.g., size), logically splits the large file into smaller chunks (e.g., 1 MB each), and sends a message for each chunk to an SQS (Simple Queue Service) queue.\n4.  **Parallel Processing**: Messages in the SQS queue trigger the `ChunkProcessor` Lambda function. Thanks to Lambda's elastic scaling, hundreds or thousands of `ChunkProcessor` instances can be invoked concurrently to process all chunks in parallel. Each function processes its data chunk and saves the result as a temporary part file in S3.\n5.  **Assembly \u0026 Cleanup**: After a `ChunkProcessor` instance completes its task, it immediately updates the job's progress in DynamoDB. It then performs a check to see if all chunks for the original file have been processed. Only when all chunks are reported as complete is the `SingleFilePackager` Lambda function invoked. This function assembles all temporary result parts into a final, complete file, updates the overall task status to \"COMPLETED,\" and finally, cleans up all temporary parts and the original uploaded file.\n6.  **Status Check \u0026 Download**: Throughout the process, the user can poll an API endpoint to check the job status. Upon completion, the user receives a download link for the final processed file.\n\n### Complete Architecture\n\nThe following diagram provides a detailed blueprint of all service components, triggers, and data flows, suitable for development and operational reference.\n\n\u003cdetails\u003e\n\u003csummary\u003eClick to expand/collapse the Complete Architecture Diagram\u003c/summary\u003e\n\n```mermaid\ngraph TD\n    %% Define styles for different components\n    classDef lambda fill:#FF9900,stroke:#333,stroke-width:2px;\n    classDef s3 fill:#2E73B8,stroke:#333,stroke-width:2px,color:white;\n    classDef sqs fill:#D82231,stroke:#333,stroke-width:2px,color:white;\n    classDef db fill:#3F8627,stroke:#333,stroke-width:2px,color:white;\n    classDef api fill:#9C27B0,stroke:#333,stroke-width:2px,color:white;\n    classDef event fill:#BDBDBD,stroke:#333,stroke-width:2px;\n    classDef user fill:#FFFFFF,stroke:#333,stroke-width:2px;\n\n    %% Main State Store\n    DynamoDB[\"\u003cfa:fa-table\u003e DynamoDB Table\u003cbr\u003e\u003ci\u003e(Tasks \u0026 Status)\u003c/i\u003e\"]:::db\n\n    subgraph \"1\\. API Layer \u0026 User Interaction\"\n        direction TB\n        User[\"\u003cfa:fa-user\u003e Client/User\"]:::user\n        APIGW[\"\u003cfa:fa-server\u003e HellobotAPI\u003cbr\u003e(API Gateway)\"]:::api\n\n        subgraph \"API-Triggered Functions\"\n            direction RL\n            L_GetUpload[\"\u003cfa:fa-bolt\u003e getUploadUrl\"]:::lambda\n            L_GetStatus[\"\u003cfa:fa-bolt\u003e getJobStatus\"]:::lambda\n            L_CreateZip[\"\u003cfa:fa-bolt\u003e CreateZipPackage\"]:::lambda\n        end\n        \n        User -- \"POST /get-upload-url\" --\u003e APIGW\n        APIGW --\u003e L_GetUpload\n        L_GetUpload -- \"1\\. Creates 'PENDING' task\" --\u003e DynamoDB\n        L_GetUpload -- \"2\\. Returns S3 Presigned URL\" --\u003e APIGW\n        User -- \"3\\. Uploads file via URL\" --\u003e S3_Upload\n\n        User -- \"GET /get-job-status\" --\u003e APIGW\n        APIGW --\u003e L_GetStatus\n        L_GetStatus -- \"Reads task\" --\u003e DynamoDB\n        \n        User -- \"POST /create-zip-package\" --\u003e APIGW\n        APIGW --\u003e L_CreateZip\n    end\n    \n    subgraph \"2\\. Asynchronous Processing Pipeline\"\n        direction TB\n        S3_Upload[\"\u003cfa:fa-database\u003e UPLOAD_BUCKET\u003cbr\u003e\u003ci\u003e(Raw user files)\u003c/i\u003e\"]:::s3\n        L_Orchestrator[\"\u003cfa:fa-bolt\u003e FileOrchestrator\"]:::lambda\n        SQS_Queue[\"\u003cfa:fa-comments\u003e SQS Queue\u003cbr\u003e\u003ci\u003e(Chunk processing jobs)\u003c/i\u003e\"]:::sqs\n        L_Processor[\"\u003cfa:fa-bolt\u003e ChunkProcessor\"]:::lambda\n        S3_Parts[\"\u003cfa:fa-database\u003e PROCESSED_PARTS_BUCKET\u003cbr\u003e\u003ci\u003e(Temporary processed chunks)\u003c/i\u003e\"]:::s3\n\n        S3_Upload -- \"4\\. S3 ObjectCreated Trigger\" --\u003e L_Orchestrator\n        L_Orchestrator -- \"Reads metadata\" --\u003e S3_Upload\n        L_Orchestrator -- \"5\\. Updates task to 'PROCESSING'\" --\u003e DynamoDB\n        L_Orchestrator -- \"6\\. Sends messages for each chunk\" --\u003e SQS_Queue\n        SQS_Queue -- \"7\\. SQS Trigger (in batches)\" --\u003e L_Processor\n        L_Processor -- \"Reads byte-range from\" --\u003e S3_Upload\n        L_Processor -- \"8\\. Writes processed part to\" --\u003e S3_Parts\n        L_Processor -- \"9\\. Increments completedChunks in\" --\u003e DynamoDB\n        L_Processor -- \"10\\. On completion, invokes...\" --\u003e L_Packager\n    end\n\n    subgraph \"3\\. Finalization \u0026 Output\"\n        direction TB\n        L_Packager[\"\u003cfa:fa-bolt\u003e SingleFilePackager\u003cbr\u003e\u003ci\u003e(Assembler \u0026 Cleaner)\u003c/i\u003e\"]:::lambda\n        S3_Individual[\"\u003cfa:fa-database\u003e PROCESSED_INDIVIDUAL_BUCKET\u003cbr\u003e\u003ci\u003e(Final processed files)\u003c/i\u003e\"]:::s3\n        S3_Packaged[\"\u003cfa:fa-database\u003e PACKAGED_RESULTS_BUCKET\u003cbr\u003e\u003ci\u003e(Zipped archives)\u003c/i\u003e\"]:::s3\n\n        L_Packager -- \"11\\. Reads all parts for task from\" --\u003e S3_Parts\n        L_Packager -- \"12\\. Writes final reassembled file to\" --\u003e S3_Individual\n        L_Packager -- \"13\\. Updates task to 'COMPLETED'\u003cbr\u003ewith presigned URL\" --\u003e DynamoDB\n        L_Packager -- \"14\\. Cleans up parts from\" --\u003e S3_Parts\n        L_Packager -- \"15\\. Cleans up original file from\" --\u003e S3_Upload\n        \n        L_CreateZip -- \"Reads individual files from\" --\u003e S3_Individual\n        L_CreateZip -- \"Writes ZIP file to\" --\u003e S3_Packaged\n        S3_Packaged -- \"Returns download URL via\" --\u003e L_CreateZip\n    end\n\n    subgraph \"4\\. Error \u0026 Timeout Handling\"\n        direction TB\n        SQS_DLQ[\"\u003cfa:fa-bug\u003e SQS Dead-Letter Queue\"]:::sqs\n        L_Failure[\"\u003cfa:fa-bolt\u003e FailureHandler\"]:::lambda\n        EventBridge[\"\u003cfa:fa-clock\u003e EventBridge Scheduler\"]:::event\n        L_StuckCleaner[\"\u003cfa:fa-bolt\u003e StuckTaskCleaner\"]:::lambda\n\n        SQS_Queue -- \"On message failure\" --\u003e SQS_DLQ\n        SQS_DLQ -- \"DLQ Trigger\" --\u003e L_Failure\n        L_Failure -- \"Updates task to 'FAILED'\" --\u003e DynamoDB\n        L_Failure -- \"Invokes for cleanup\" --\u003e L_Packager\n\n        EventBridge -- \"Scheduled trigger (e.g., every 2 hours)\" --\u003e L_StuckCleaner\n        L_StuckCleaner -- \"Queries for stuck 'PROCESSING' tasks from\" --\u003e DynamoDB\n        L_StuckCleaner -- \"Updates task to 'FAILED'\" --\u003e DynamoDB\n        L_StuckCleaner -- \"Invokes for cleanup\" --\u003e L_Packager\n    end\n```\n\n\u003c/details\u003e\n\n## Frontend\n\nWe use the **Vue.js** and **Tailwind.css** framework to build a modern and responsive user interface.\n\nThe frontend project is deployed via **AWS Amplify**, which hosts the static web application on a global CDN, providing low-latency access for users worldwide and integrating seamlessly with the backend services.\n\nYou can find the complete frontend source code in the `frontend` directory of this project.\n\n## Permissions\n\nTo adhere to the **Principle of Least Privilege**, we use a separation of permissions for Lambda functions with different responsibilities. The system primarily creates three IAM Roles:\n\n1.  **`HellobotLambdaUploadStatusRole`**:\n\n      * **Purpose**: Assigned to user-facing functions directly triggered by API Gateway, such as `getUploadUrl` and `getJobStatus`.\n      * **Permissions**: Highly restricted permissions, only allowing the creation/reading of task items in DynamoDB and the generation of S3 presigned URLs for uploads.\n\n2.  **`HellobotLambdaCreateZipRole`**:\n\n      * **Purpose**: Specifically assigned to the `CreateZipPackage` function.\n      * **Permissions**: Allows reading files from the final results S3 bucket and writing the generated ZIP archive to the packaged results S3 bucket.\n\n3.  **`HellobotLambdaRole`**:\n\n      * **Purpose**: This is the internal role assigned to the core backend processing pipeline (e.g., `FileOrchestrator`, `ChunkProcessor`, `SingleFilePackager`).\n      * **Permissions**: Possesses broader permissions required to execute the core business logic, including reading/writing to multiple S3 buckets, sending/receiving SQS messages, updating DynamoDB, and invoking other Lambda functions.","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fzetxtech%2Fhellobot","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fzetxtech%2Fhellobot","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fzetxtech%2Fhellobot/lists"}