{"id":28402399,"url":"https://github.com/ahmed-boutar/aiorchestration","last_synced_at":"2025-06-26T16:32:19.795Z","repository":{"id":295675755,"uuid":"990871655","full_name":"ahmed-boutar/AIOrchestration","owner":"ahmed-boutar","description":"AI Orchestration mini project using AWS S3, Lambdas, step functions, and cloudwatch for monitoring","archived":false,"fork":false,"pushed_at":"2025-05-30T02:08:44.000Z","size":467,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2025-06-08T14:50:43.625Z","etag":null,"topics":["ai","aws","aws-lambda","aws-s3","orchestration","state-machine","step-functions"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/ahmed-boutar.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2025-05-26T19:14:16.000Z","updated_at":"2025-05-30T02:08:47.000Z","dependencies_parsed_at":"2025-05-29T15:02:34.736Z","dependency_job_id":null,"html_url":"https://github.com/ahmed-boutar/AIOrchestration","commit_stats":null,"previous_names":["ahmed-boutar/aiorchestration"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/ahmed-boutar/AIOrchestration","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ahmed-boutar%2FAIOrchestration","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ahmed-boutar%2FAIOrchestration/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ahmed-boutar%2FAIOrchestration
/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ahmed-boutar%2FAIOrchestration/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/ahmed-boutar","download_url":"https://codeload.github.com/ahmed-boutar/AIOrchestration/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ahmed-boutar%2FAIOrchestration/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":262102463,"owners_count":23259264,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["ai","aws","aws-lambda","aws-s3","orchestration","state-machine","step-functions"],"created_at":"2025-06-01T15:37:49.220Z","updated_at":"2025-06-26T16:32:19.783Z","avatar_url":"https://github.com/ahmed-boutar.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# AI Pipeline Orchestration \n\n## Overview\n\nThis project implements an orchestrated AI pipeline using **AWS Step Functions**, **Lambda**, and **CloudWatch**, designed to automate the ingestion, preprocessing, and sentiment analysis of textual data. The system features built-in **retries**, **error handling**, and **logging**, ensuring robustness and visibility throughout the ML pipeline lifecycle. The deployment of the lambda functions was done using **AWS SAM**, which handles any underlying layers and makes the process much easier. 
\n\n---\n\n## Architecture\n\n### High-Level Workflow\n![The Architecture Diagram](images/SystemsDesignDiagram.png)\n\n## State Machine Definition\n\nThe workflow is defined in Amazon States Language (ASL). Below is a simplified breakdown:\n\n- **DataIngestion**: Fetches or receives raw data (e.g., tweets, reviews)\n- **Preprocessing**: Cleans, tokenizes, and formats text for sentiment analysis\n- **AnalyzeSentiment**: Invokes a model (or API) to predict sentiment\n- **Success**: Indicates successful pipeline completion\n- **Failure**: Centralized failure handler after retries are exhausted\n\n## Project Structure \n```\nAIOrchestration\n├── images/                 # contains the diagram\n├── src/\n│   ├── upload_dataset.py           # script used to upload the dataset to S3\n│   ├── data_ingestion                  # structure used by AWS SAM\n│   │   └── data_ingestion.py            # lambda handler \n│   │   └── requirements.txt  \n│   ├── preprocess_data\n│   │   └── preprocess_data.py            # lambda handler \n│   │   └── requirements.txt  \n│   └── sentiment_analysis\n│   │   └── sentiment_analysis.py            # lambda handler \n│   │   └── requirements.txt  \n├── ml_pipeline_step_function.json      # Step Functions definition\n├── README.md\n├── samconfig.toml                      # File generated after building with AWS SAM\n├── template.yaml                       # AWS SAM template\n└── .gitignore\n```\n\n## Monitoring \u0026 Logging\n- All Lambda functions log events to Amazon CloudWatch Logs with a unique requestId.\n- Logs include timestamps, status messages, and error traces.\n- A CloudWatch Dashboard was created to visualize invocation counts, error counts, and execution durations.\n\n## Error Handling \n- Each state has:\n    - Retry block for transient errors (up to 3 attempts)\n    - Catch block that redirects to Failure if all retries fail\n- These ensure the pipeline can handle:\n    - Temporary network failures\n    - Lambda timeouts\n    
- Unavailable external services\n\n\n## AI Usage \n- Claude 4.0 was used to create `template.yaml` and `sentiment_analysis.py`.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fahmed-boutar%2Faiorchestration","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fahmed-boutar%2Faiorchestration","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fahmed-boutar%2Faiorchestration/lists"}