{"id":32701025,"url":"https://github.com/nazwright/daria","last_synced_at":"2026-04-12T03:37:40.233Z","repository":{"id":313295279,"uuid":"1050854126","full_name":"NazWright/daria","owner":"NazWright","description":"Real-time fraud detection architecture powered by AWS Kinesis, KaggleHub, and SMOTE-augmented data — the foundation of DARIA, the Detection And Risk-Intelligence Agent.","archived":false,"fork":false,"pushed_at":"2025-11-01T19:15:33.000Z","size":167,"stargazers_count":0,"open_issues_count":7,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2025-11-01T21:10:10.019Z","etag":null,"topics":["aws","evm","fraud","fraud-detection-using-machine-learning","kaggle","kinesis","machine-learning","math","numpy","pandas","python","random","web3"],"latest_commit_sha":null,"homepage":"","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/NazWright.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2025-09-05T03:44:33.000Z","updated_at":"2025-10-09T06:49:32.000Z","dependencies_parsed_at":null,"dependency_job_id":"e598cf37-64f1-4557-8d34-6da08399f2a8","html_url":"https://github.com/NazWright/daria","commit_stats":null,"previous_names":["nazwright/fraud-signal-detector-augmented-producer","nazwright/daria"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/NazWright/daria","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/NazWright%2Fdaria","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/NazWright%2Fdaria/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/NazWright%2Fdaria/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/NazWright%2Fdaria/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/NazWright","download_url":"https://codeload.github.com/NazWright/daria/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/NazWright%2Fdaria/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":282216044,"owners_count":26633413,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-11-01T02:00:06.759Z","response_time":61,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["aws","evm","fraud","fraud-detection-using-machine-learning","kaggle","kinesis","machine-learning","math","numpy","pandas","python","random","web3"],"created_at":"2025-11-01T23:01:19.819Z","updated_at":"2025-11-01T23:03:04.480Z","avatar_url":"https://github.com/NazWright.png","language":"Jupyter Notebook","funding_links":[],"categories":[],"sub_categories":[],"readme":"# 🧠 DARIA — Fraud Signal Detector ( Augmented Producer )\n\n**DARIA** — *Detection And Risk-Intelligence Agent* — begins here.  \nThis system powers real-time fraud simulation and streaming analysis across AWS + Web3 systems.\n\n---\n\n## ⚡️ Overview\n\n`DARIA` is a **real-time fraud detection architecture** built on  \n**Amazon Kinesis Data Streams** + **KaggleHub** + **Python**.\n\nIt ingests a Kaggle credit-card dataset, serializes each transaction,  \nand publishes them into **sharded Kinesis streams** for downstream analytics,  \nrisk scoring, and eventually blockchain-backed audit trails.\n\n\u003e 🧩 *This is where DARIA learns to “see” — synthetic data, real signals.*\n\n---\n\n## 🎯 Why This Exists\n\n- Showcase a **streaming-first** fraud detection architecture (not batch).  \n- Demonstrate **ordered, replayable shards** and horizontal throughput.  \n- Provide a clean, reproducible **producer pipeline** anyone can point at their own stream.  \n- Bridge **AWS ML + Web3**, enabling on-chain logging and smart-contract-based rule enforcement.\n\n---\n\n## 🧱 Core Concepts\n\n| Layer | Purpose |\n|-------|----------|\n| **Kinesis Data Streams** | Real-time event ingestion (ordered shards). |\n| **KaggleHub** | Pulls public Kaggle datasets directly into the pipeline. |\n| **Augmented Transactions** | Synthetic + SMOTE-balanced data from Tranche I. |\n| **Smart Contracts (future)** | Run fraud-rule logic and immutable logging on-chain. |\n| **DARIA** | The AI Agent orchestrating detection and risk intelligence. |\n\n---\n\n## 🚀 Quick Start\n\n```bash\n# 1.  Create and activate virtual environment\npython -m venv .venv \u0026\u0026 source .venv/bin/activate\n\n# 2.  Install dependencies\npip install -r requirements.txt\n\n# 3.  Configure environment\ncp .env.example .env      # update region / stream / creds\n\n# 4.  Provision the stream\nbash infra/create_stream.sh fraud-transactions-stream 2 us-east-1\n\n# 5.  Run the producer\npython -m src.producer      # publishes Kaggle (or augmented) transactions\n\n# 6.  Optional: test a consumer\npython -m src.consumer_demo # quick reader\n````\n\n---\n\n## 🧬 Data Lineage\n\n**Input → Augmentation → Stream**\n\n1. `creditcard.csv` from Kaggle\n2. Augmented via SMOTE + Faker (see `02_data_augmentation_with_faker.ipynb`)\n3. Serialized into JSON payloads\n4. Pushed into `fraud-transactions-stream` shards\n5. Read downstream for analytics, rule evaluation, and ML training\n\n---\n\n## 🌐 Web3 Integration (Upcoming)\n\nDARIA’s risk events will soon publish to **smart contracts** that:\n\n* Verify fraud-rule outcomes on-chain\n* Append immutable audit logs\n* Enable decentralized compliance tracing\n\n\u003e *AWS streams meet blockchain state — transparency by design.*\n\n---\n\n## 🧠 Roadmap\n\n| Phase           | Focus                                                     |\n| --------------- | --------------------------------------------------------- |\n| **Tranche I**   | Data Augmentation (SMOTE + Faker) ✅                       |\n| **Tranche II**  | Real-time Streaming Producer (AWS Kinesis) ✅              |\n| **Tranche III** | Fraud-Rule Engine + Smart Contract Logging 🧩             |\n| **Tranche IV**  | Model Serving + SageMaker Integration 🚀                  |\n| **Tranche V**   | DARIA as an Autonomous Risk Agent (AWS Bedrock + Web3) 🌌 |\n\n---\n\n## 🪞 Vision\n\n\u003e “DARIA doesn’t guess — she *knows* when something feels off.”\n\u003e — Naz Wright, DareDevTech\n\nThe goal isn’t just to detect fraud — it’s to teach machines the intuition of trust.\n\n---\n\n## 🖋️ Author\n\n**Nazere Wright (@daredevtech)**\n*Full-Stack + AWS Machine Learning Engineer*\nBuilding myth-driven, cloud-native intelligence systems.\n\n---\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fnazwright%2Fdaria","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fnazwright%2Fdaria","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fnazwright%2Fdaria/lists"}