https://github.com/ahmed-boutar/aiorchestration

AI orchestration mini project using AWS S3, Lambda, Step Functions, and CloudWatch for monitoring

Host: GitHub
URL: https://github.com/ahmed-boutar/aiorchestration
Owner: ahmed-boutar
Created: 2025-05-26T19:14:16.000Z (about 1 year ago)
Default Branch: main
Last Pushed: 2025-05-30T02:08:44.000Z (about 1 year ago)
Last Synced: 2025-06-08T14:50:43.625Z (about 1 year ago)
Topics: ai, aws, aws-lambda, aws-s3, orchestration, state-machine, step-functions
Language: Python
Homepage:
Size: 456 KB
Stars: 0
Watchers: 0
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md

# AI Pipeline Orchestration

## Overview

This project implements an orchestrated AI pipeline using **AWS Step Functions**, **Lambda**, and **CloudWatch** that automates the ingestion, preprocessing, and sentiment analysis of text data. Built-in **retries**, **error handling**, and **logging** keep the pipeline robust and observable at every step. The Lambda functions were deployed with **AWS SAM**, which handles packaging, including any underlying layers, and simplifies deployment.

---

## Architecture

### High-Level Workflow
![The Architecture Diagram](images/SystemsDesignDiagram.png)

## State Machine Definition

The workflow is defined in Amazon States Language (ASL). Below is a simplified breakdown:

- **DataIngestion**: Fetches or receives raw data (e.g., tweets, reviews)
- **Preprocessing**: Cleans, tokenizes, and formats text for sentiment analysis
- **AnalyzeSentiment**: Invokes a model (or API) to predict sentiment
- **Success**: Indicates successful pipeline completion
- **Failure**: Centralized failure handler after retries are exhausted
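The states listed above could be wired together roughly as follows. This is a minimal sketch, not the contents of ml_pipeline_step_function.json: the Lambda ARNs are placeholders, the strictly linear ordering is assumed from the list, and the `Error`/`Cause` strings on the Failure state are made up for illustration.

```python
import json

# Placeholder ARNs: the real function names, region, and account ID
# are not given in this README.
LAMBDA_ARNS = {
    "DataIngestion": "arn:aws:lambda:us-east-1:123456789012:function:data-ingestion",
    "Preprocessing": "arn:aws:lambda:us-east-1:123456789012:function:preprocess-data",
    "AnalyzeSentiment": "arn:aws:lambda:us-east-1:123456789012:function:sentiment-analysis",
}


def task_state(name, next_state):
    """Build a Task state that invokes a Lambda and routes any error to Failure."""
    return {
        "Type": "Task",
        "Resource": LAMBDA_ARNS[name],
        "Next": next_state,
        "Catch": [{"ErrorEquals": ["States.ALL"], "Next": "Failure"}],
    }


def build_definition():
    """Return an Amazon States Language definition as a plain dict."""
    return {
        "Comment": "Sketch of the sentiment analysis pipeline",
        "StartAt": "DataIngestion",
        "States": {
            "DataIngestion": task_state("DataIngestion", "Preprocessing"),
            "Preprocessing": task_state("Preprocessing", "AnalyzeSentiment"),
            "AnalyzeSentiment": task_state("AnalyzeSentiment", "Success"),
            "Success": {"Type": "Succeed"},
            "Failure": {
                "Type": "Fail",
                "Error": "PipelineFailed",  # hypothetical error name
                "Cause": "A pipeline step failed",
            },
        },
    }


def happy_path(definition):
    """Follow the Next pointers from StartAt until a terminal state is reached."""
    path, current = [], definition["StartAt"]
    while current is not None:
        path.append(current)
        current = definition["States"][current].get("Next")
    return path


if __name__ == "__main__":
    print(json.dumps(build_definition(), indent=2))
```

Building the definition as a dict and serializing it with `json.dumps` makes it easy to check the transitions in a unit test before deploying.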

## Project Structure
```
AIOrchestration
├── images/                            # contains the architecture diagram
├── src/
│   ├── upload_dataset.py              # script used to upload the dataset to S3
│   ├── data_ingestion/                # directory structure used by AWS SAM
│   │   ├── data_ingestion.py          # Lambda handler
│   │   └── requirements.txt
│   ├── preprocess_data/
│   │   ├── preprocess_data.py         # Lambda handler
│   │   └── requirements.txt
│   └── sentiment_analysis/
│       ├── sentiment_analysis.py      # Lambda handler
│       └── requirements.txt
├── ml_pipeline_step_function.json     # Step Functions definition
├── README.md
├── samconfig.toml                     # configuration file generated by AWS SAM
├── template.yaml                      # AWS SAM template
└── .gitignore
```

## Monitoring & Logging
- All Lambda functions log events to Amazon CloudWatch Logs with a unique requestId.
- Logs include timestamps, status messages, and error traces.
- A CloudWatch dashboard visualizes invocation counts, error counts, and execution duration.
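As an illustration of the logging described above, a handler can tag every log line with the request ID that Lambda exposes as `context.aws_request_id`. This is a minimal sketch and not the repository's handler code: the logger name, the `log_event` helper, the log field names, and the `texts` input key are all assumptions.

```python
import json
import logging
from datetime import datetime, timezone

# In the Lambda Python runtime, records sent through the logging module
# end up in the function's CloudWatch Logs log group.
logger = logging.getLogger("pipeline")  # hypothetical logger name
logger.setLevel(logging.INFO)


def log_event(request_id, status, **details):
    """Emit one JSON log line carrying the request ID, a timestamp, and a status."""
    record = {
        "requestId": request_id,
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "status": status,
        **details,
    }
    logger.info(json.dumps(record))
    return record


def lambda_handler(event, context):
    request_id = context.aws_request_id
    log_event(request_id, "started")
    try:
        texts = event["texts"]  # hypothetical input shape
        result = {"count": len(texts)}
        log_event(request_id, "succeeded", **result)
        return result
    except Exception:
        # logger.exception appends the traceback to the log entry.
        logger.exception(json.dumps({"requestId": request_id, "status": "failed"}))
        raise  # re-raise so Step Functions sees the task as failed
```

Re-raising after logging matters here: if the handler swallowed the exception, the state machine would treat the step as successful and no retry or catch logic would run.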

## Error Handling
- Each task state (DataIngestion, Preprocessing, AnalyzeSentiment) has:
  - A Retry block for transient errors (up to 3 attempts)
  - A Catch block that redirects to Failure if all retries fail
- Together, these let the pipeline handle:
  - Temporary network failures
  - Lambda timeouts
  - Unavailable external services
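The Retry and Catch blocks described above might look like the following when expressed as Python dicts that serialize to Amazon States Language. This is a sketch under assumptions: the README only states the limit of 3 attempts, so the interval, backoff rate, matched error names, and `ResultPath` are illustrative values, and the helper names are hypothetical. Note that in ASL, `MaxAttempts` counts retries and does not include the initial attempt.

```python
import copy


def retry_block(max_attempts=3, interval_seconds=2, backoff_rate=2.0):
    """Retrier for transient errors. Interval and backoff values are assumed."""
    return [{
        "ErrorEquals": ["States.TaskFailed", "States.Timeout"],  # assumed error names
        "IntervalSeconds": interval_seconds,
        "MaxAttempts": max_attempts,
        "BackoffRate": backoff_rate,
    }]


def catch_block(target="Failure"):
    """Catcher that sends any remaining error to the central failure state."""
    return [{"ErrorEquals": ["States.ALL"], "Next": target, "ResultPath": "$.error"}]


def with_error_handling(task_state):
    """Return a copy of a Task state with Retry and Catch attached."""
    state = copy.deepcopy(task_state)
    state["Retry"] = retry_block()
    state["Catch"] = catch_block()
    return state


def retry_delays(retrier):
    """Seconds waited before each retry: IntervalSeconds * BackoffRate ** i."""
    return [
        retrier["IntervalSeconds"] * retrier["BackoffRate"] ** i
        for i in range(retrier["MaxAttempts"])
    ]


if __name__ == "__main__":
    # With the assumed values, the waits before the three retries are 2, 4, and 8 seconds.
    print(retry_delays(retry_block()[0]))
```

With exponential backoff, a briefly unavailable service gets progressively more time to recover before the Catch block finally routes the execution to Failure.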

## AI Usage
- Claude 4.0 was used to create template.yaml and sentiment_analysis.py.
