{"id":50146715,"url":"https://github.com/soneylegal/cortex","last_synced_at":"2026-05-24T05:08:31.130Z","repository":{"id":358310180,"uuid":"1240216057","full_name":"soneylegal/cortex","owner":"soneylegal","description":"Production-ready serverless data pipeline on AWS focused on high-throughput event ingestion, strict schema validation, resilience patterns (DLQ, idempotency), and full local emulation via LocalStack and Terraform.","archived":false,"fork":false,"pushed_at":"2026-05-16T19:36:01.000Z","size":52,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-05-16T19:40:34.819Z","etag":null,"topics":["aws-lambda","aws-sqs","backend-engineering","clean-architecture","data-pipeline","dynamodb","event-driven","fastapi","infrastructure-as-code","localstack","observability","pydantic","pytest","python3","resilience-patterns","serverless-architecture","terraform","test-driven-development"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/soneylegal.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-05-15T22:17:40.000Z","updated_at":"2026-05-16T19:36:05.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/soneylegal/cortex","commit_stats":null,"previous_names":["soneylegal/cortex"],"tags_count":1,"template":false,"template_full_name":n
ull,"purl":"pkg:github/soneylegal/cortex","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/soneylegal%2Fcortex","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/soneylegal%2Fcortex/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/soneylegal%2Fcortex/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/soneylegal%2Fcortex/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/soneylegal","download_url":"https://codeload.github.com/soneylegal/cortex/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/soneylegal%2Fcortex/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":33422091,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-23T22:14:44.296Z","status":"online","status_checked_at":"2026-05-24T02:00:06.296Z","response_time":57,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["aws-lambda","aws-sqs","backend-engineering","clean-architecture","data-pipeline","dynamodb","event-driven","fastapi","infrastructure-as-code","localstack","observability","pydantic","pytest","python3","resilience-patterns","serverless-architecture","terraform","test-driven-development"],"created_at":"2026-05-24T05:08:21.538Z","updated_at":"2026-05-24T05:08:31.124Z","avatar_url":"https://github.com/soneylegal.png","language":"Python"
,"funding_links":[],"categories":[],"sub_categories":[],"readme":"# 🧠 Cortex — Serverless Data Pipeline\n\n![Build Status](https://img.shields.io/github/actions/workflow/status/soneylegal/cortex/ci.yml?branch=main\u0026style=flat-square)\n![Python](https://img.shields.io/badge/python-3.12+-blue.svg?style=flat-square\u0026logo=python\u0026logoColor=white)\n![Terraform](https://img.shields.io/badge/terraform-1.5+-623CE4.svg?style=flat-square\u0026logo=terraform\u0026logoColor=white)\n![LocalStack](https://img.shields.io/badge/localstack-3.8.1-brightgreen.svg?style=flat-square\u0026logo=localstack\u0026logoColor=white)\n![Semantic Release](https://img.shields.io/badge/%20%20%F0%9F%93%A6%F0%9F%9A%80-semantic--release-e10079.svg?style=flat-square)\n![License](https://img.shields.io/badge/license-Apache%202.0-blue.svg?style=flat-square)\n\n\u003e A modern, production-grade, and **100% Zero-Cost** Serverless Data Pipeline for Infrastructure Monitoring. Built with a focus on **Resilience**, **Observability**, **Data Lakehouse analytics**, and **Infrastructure as Code**.\n\n---\n\n## 🏛️ Architecture\n\nThe Cortex pipeline is designed to ingest high-throughput telemetry, process it securely, and fan out the data to both a real-time transactional database and a historical analytical Data Lake.\n\n```mermaid\nflowchart LR\n    Client([User / Agent]) -- \"JWT Auth\" --\u003e APIGW[API Gateway]\n    \n    subgraph Ingestion Layer\n        APIGW -- \"REST API\" --\u003e AuthLambda(Authorizer Lambda)\n        APIGW -- \"POST /events\" --\u003e ProdLambda(Producer Lambda)\n    end\n    \n    subgraph Routing Layer\n        ProdLambda -- \"PutEvents\" --\u003e EB[EventBridge Bus]\n        EB -- \"Rule: Main\" --\u003e SQS[SQS Queue]\n        EB -- \"Rule: Analytics\" --\u003e Firehose[Kinesis Firehose]\n    end\n\n    subgraph Transactional Layer\n        SQS -- \"Event Source Mapping\" --\u003e ConsLambda(Consumer Lambda)\n        ConsLambda -- \"Batch Persist\" --\u003e 
DDB[(DynamoDB)]\n        SQS -. \"3x Retries\" .-\u003e DLQ[Dead Letter Queue]\n    end\n    \n    subgraph Analytical Data Lake\n        Firehose -- \"Buffer \u0026 Compress\" --\u003e S3[(S3 Data Lake)]\n        S3 -. \"Schema\" .-\u003e Glue[AWS Glue]\n        Glue -. \"Query\" .-\u003e Athena[AWS Athena]\n    end\n    \n    subgraph Observability\n        DLQ -- \"Alarm\" --\u003e CW[CloudWatch Alarms]\n        CW -- \"Trigger\" --\u003e SNS[SNS Alerts]\n    end\n```\n\n## 📌 Development Status\n\n| Component | Status |\n|---|---|\n| **Pipeline Core** | ✅ API Gateway → EventBridge → SQS → Consumer → DynamoDB |\n| **Data Lake** | ✅ Kinesis Firehose → S3 Data Lake → Glue → Athena |\n| **Observability** | ✅ Powertools Logging, X-Ray Tracing, CloudWatch Alarms \u0026 SNS |\n| **CI/CD** | ✅ GitHub Actions (Lint, Mypy, Bandit, Pytest, Terraform, Semantic Release) |\n| **Local Environment** | ✅ 100% Free LocalStack 3.8.1 emulation with Zero-Cost bypass |\n\n## ⚡ Technology Stack\n\n| Layer | Technology |\n|---|---|\n| **Ingestion** | AWS API Gateway (REST API v1) + Custom JWT Authorizer |\n| **Validation** | AWS Lambda (Python 3.12) + Pydantic |\n| **Messaging** | AWS EventBridge + AWS SQS + Dead Letter Queue |\n| **Processing** | AWS Lambda (Consumer) |\n| **Persistence** | AWS DynamoDB (On-Demand, Idempotent) |\n| **Data Lake** | Amazon S3 + Kinesis Firehose + AWS Glue + AWS Athena |\n| **Observability**| AWS Lambda Powertools + AWS X-Ray + CloudWatch + SNS |\n| **IaC** | Terraform (HCL) |\n| **Dev Environment**| LocalStack 3.8.1 + Docker Compose |\n\n## 🔬 Engineering Highlights\n\n\u003cdetails\u003e\n\u003csummary\u003e\u003cstrong\u003eAdvanced Architectural Decisions \u0026 Trade-offs\u003c/strong\u003e\u003c/summary\u003e\n\n### 1. The Zero-Cost Local Environment Bypass\nTo guarantee a completely free development environment, this project enforces an infrastructure lockdown against LocalStack Pro features. 
Terraform dynamically utilizes `count = var.use_localstack ? 0 : 1` to gracefully skip Pro services (Glue and Athena) during CI/CD LocalStack emulation, while successfully deploying them when targeting the real AWS cloud. We specifically pinned LocalStack to `v3.8.1` to bypass mandatory cloud account authentications introduced in v4.\n\n### 2. Idempotency \u0026 Partial Batch Failure\nThe Consumer utilizes DynamoDB `ConditionExpression` (`attribute_not_exists`) to guarantee that messages reprocessed by SQS do not generate duplicate entries. This pairs seamlessly with SQS `ReportBatchItemFailures`, ensuring that only failing messages within a batch are retried, preventing successful messages from being needlessly reprocessed.\n\n### 3. Event-Driven Fan-Out Pattern\nMigrating from direct SQS invocation to Amazon EventBridge allows the pipeline to implement a robust fan-out architecture. A single telemetry event from the Producer is instantly routed to both the transactional pipeline (SQS -\u003e DynamoDB) and the analytical pipeline (Firehose -\u003e S3) without adding execution overhead to the Producer Lambda.\n\n### 4. Lambda Package Optimization (27MB → 5.2MB)\n`boto3` and `botocore` are pre-packaged in the standard AWS Lambda Python runtime. 
Removing them from the build packaging reduced the deployment artifact size from ~27MB to 5.2MB, significantly improving cold-start times and deployment speed.\n\u003c/details\u003e\n\n---\n\n## 📁 Repository Structure\n\n```text\ncortex/\n├── src/\n│   ├── producer/       # Lambda — validates \u0026 puts events to EventBridge\n│   ├── consumer/       # Lambda — pulls from SQS \u0026 persists to DynamoDB\n│   ├── authorizer/     # Lambda — JWT validation for API Gateway\n│   ├── read_api/       # Lambda — FastAPI microservice for querying events\n│   └── shared/         # Shared schemas, constants, logging utils\n├── terraform/          # Complete Infrastructure as Code (EventBridge, S3, Glue, etc)\n├── scripts/            # Deployment, load testing, and seeding utilities\n├── tests/\n│   ├── unit/           # Unit tests with fully mocked AWS resources (boto3 stubs)\n│   └── integration/    # E2E Tests running against LocalStack\n├── docker-compose.yml  # LocalStack container configuration\n├── Makefile            # Automation targets (make deploy, make test)\n└── pyproject.toml      # Dependency management and tool config\n```\n\n## 🚀 Quick Start\n\n### Prerequisites\n- Python 3.12+\n- Docker \u0026 Docker Compose\n- Terraform \u003e= 1.5\n\n### 1. Install Dependencies\n```bash\npip install -e \".[dev]\"\n```\n\n### 2. Deploy Locally (LocalStack)\n```bash\nmake localstack-up      # Starts the LocalStack container (v3.8.1)\nmake deploy-local       # Packages Lambdas and runs terraform apply locally\n```\n\n### 3. Run Tests\n```bash\nmake test               # Runs unit tests\nmake test-integration   # Runs E2E integration tests against LocalStack\n```\n\n### 4. 
Test the Pipeline\n```bash\n# Send a valid telemetry event\ncurl -X POST http://localhost:4566/restapis/\u003capi-id\u003e/dev/_user_request_/events \\\n  -H \"Content-Type: application/json\" \\\n  -H \"Authorization: Bearer \u003cyour-jwt-token\u003e\" \\\n  -d '{\n    \"source\": \"server-web-01\",\n    \"event_type\": \"cpu_usage\",\n    \"severity\": \"warning\",\n    \"data\": {\"cpu_percent\": 87.5, \"load_avg_1m\": 2.3}\n  }'\n```\n\n### 5. Load Testing\n```bash\nmake load-test          # 10 requests\nmake load-test-100      # 100 requests\n```\n\n## 🛡️ Resilience \u0026 Security\n\n| Feature | Implementation |\n|---|---|\n| **Dead Letter Queue (DLQ)** | Events failing \u003e3 times are diverted to a DLQ for manual inspection. |\n| **Alarms \u0026 Alerts** | CloudWatch Alarms monitor DLQ traffic and trigger SNS email alerts. |\n| **Authorization** | REST API is secured with a Custom Lambda Authorizer expecting JWTs. |\n| **Secret Management** | JWT Secrets and API Keys are securely injected via Terraform environment variables. |\n\n## 📋 Useful Commands\n\n```bash\nmake help             # List all targets\nmake lint             # Ruff check + format check\nmake typecheck        # Mypy type validation\nmake deploy           # Deploy to real AWS Account\nmake destroy-local    # Tear down LocalStack infra\nmake clean            # Remove build artifacts and caches\n```\n\n## 📄 License\nCopyright 2026 Davi Laurindo\n\nLicensed under the Apache License, Version 2.0. See [LICENSE](LICENSE) for details.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsoneylegal%2Fcortex","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fsoneylegal%2Fcortex","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsoneylegal%2Fcortex/lists"}