https://github.com/nazwright/daria
Real-time fraud detection architecture powered by AWS Kinesis, KaggleHub, and SMOTE-augmented data: the foundation of DARIA, the Detection And Risk-Intelligence Agent.
- Host: GitHub
- URL: https://github.com/nazwright/daria
- Owner: NazWright
- Created: 2025-09-05T03:44:33.000Z (2 months ago)
- Default Branch: main
- Last Pushed: 2025-11-01T19:15:33.000Z (4 days ago)
- Last Synced: 2025-11-01T21:10:10.019Z (4 days ago)
- Topics: aws, evm, fraud, fraud-detection-using-machine-learning, kaggle, kinesis, machine-learning, math, numpy, pandas, python, random, web3
- Language: Jupyter Notebook
- Homepage:
- Size: 163 KB
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 7
Metadata Files:
- Readme: README.md
README
# DARIA: Fraud Signal Detector (Augmented Producer)
**DARIA** (*Detection And Risk-Intelligence Agent*) begins here.
This system powers real-time fraud simulation and streaming analysis across AWS + Web3 systems.
---
## Overview
`DARIA` is a **real-time fraud detection architecture** built on
**Amazon Kinesis Data Streams** + **KaggleHub** + **Python**.
It ingests a Kaggle credit-card dataset, serializes each transaction,
and publishes it into a **sharded Kinesis stream** for downstream analytics,
risk scoring, and eventually blockchain-backed audit trails.
> *This is where DARIA learns to "see": synthetic data, real signals.*
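
The serialize-and-publish step described above can be sketched roughly as follows. This is a minimal illustration, not the code in `src/producer`: the helper names `to_record` and `publish` and the `transaction_id` field are hypothetical, and `client` is assumed to be anything exposing the boto3 Kinesis `put_records` call.

```python
import json
from typing import Any, Iterable

MAX_BATCH = 500  # Kinesis PutRecords accepts at most 500 records per call


def to_record(txn: dict[str, Any], key_field: str = "transaction_id") -> dict[str, Any]:
    """Serialize one transaction into the record shape PutRecords expects.

    `transaction_id` is a hypothetical field name chosen for this sketch.
    """
    return {
        "Data": json.dumps(txn, separators=(",", ":")).encode("utf-8"),
        # Records that share a partition key go to the same shard, in order.
        "PartitionKey": str(txn[key_field]),
    }


def publish(client: Any, stream_name: str, txns: Iterable[dict[str, Any]]) -> int:
    """Send transactions in batches; return how many records were rejected."""
    records = [to_record(t) for t in txns]
    failed = 0
    for start in range(0, len(records), MAX_BATCH):
        batch = records[start:start + MAX_BATCH]
        response = client.put_records(StreamName=stream_name, Records=batch)
        failed += response.get("FailedRecordCount", 0)
    return failed
```

In practice `client` would likely be `boto3.client("kinesis", region_name="us-east-1")`, and a non-zero return value would signal that some records need a retry.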
---
## Why This Exists
- Showcase a **streaming-first** fraud detection architecture (not batch).
- Demonstrate **ordered, replayable shards** and horizontal throughput.
- Provide a clean, reproducible **producer pipeline** anyone can point at their own stream.
- Bridge **AWS ML + Web3**, enabling on-chain logging and smart-contract-based rule enforcement.
---
## Core Concepts
| Layer | Purpose |
|-------|----------|
| **Kinesis Data Streams** | Real-time event ingestion (ordered shards). |
| **KaggleHub** | Pulls public Kaggle datasets directly into the pipeline. |
| **Augmented Transactions** | Synthetic + SMOTE-balanced data from Tranche I. |
| **Smart Contracts (future)** | Run fraud-rule logic and immutable logging on-chain. |
| **DARIA** | The AI Agent orchestrating detection and risk intelligence. |
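
To make "ordered shards" concrete: Kinesis hashes each record's partition key with MD5 into a 128-bit integer and routes the record to the shard whose hash-key range contains it. The sketch below assumes the ranges are split evenly across shards, as on a freshly created stream that has not been resharded; `hash_key` and `shard_index` are illustrative names, not part of this repo or of any AWS SDK.

```python
import hashlib

HASH_SPACE = 2 ** 128  # Kinesis hash keys are 128-bit unsigned integers


def hash_key(partition_key: str) -> int:
    """MD5 of the partition key, read as a 128-bit integer."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big")


def shard_index(partition_key: str, shard_count: int) -> int:
    """Index of the shard holding the key, assuming an even range split."""
    return hash_key(partition_key) * shard_count // HASH_SPACE
```

Because the mapping is deterministic, every transaction that reuses a partition key (for example, a card identifier) lands on one shard and is read back in the order it was written.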
---
## Quick Start
```bash
# 1. Create and activate virtual environment
python -m venv .venv && source .venv/bin/activate
# 2. Install dependencies
pip install -r requirements.txt
# 3. Configure environment
cp .env.example .env # update region / stream / creds
# 4. Provision the stream
bash infra/create_stream.sh fraud-transactions-stream 2 us-east-1
# 5. Run the producer
python -m src.producer # publishes Kaggle (or augmented) transactions
# 6. Optional: test a consumer
python -m src.consumer_demo # quick reader
```
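
For orientation, a reader like the one in step 6 might look something like the sketch below. It is an assumption about the general shape, not the contents of `src/consumer_demo`: `read_shard` is a hypothetical helper, and `client` is assumed to expose the boto3 Kinesis calls `get_shard_iterator` and `get_records`.

```python
import json
from typing import Any, Iterator


def read_shard(client: Any, stream_name: str, shard_id: str,
               max_polls: int = 5) -> Iterator[dict[str, Any]]:
    """Yield decoded transactions from one shard, oldest first."""
    iterator = client.get_shard_iterator(
        StreamName=stream_name,
        ShardId=shard_id,
        ShardIteratorType="TRIM_HORIZON",  # start at the oldest retained record
    )["ShardIterator"]
    for _ in range(max_polls):
        if iterator is None:  # the shard was closed and fully read
            break
        response = client.get_records(ShardIterator=iterator, Limit=100)
        for record in response["Records"]:
            yield json.loads(record["Data"])  # Data arrives as JSON bytes
        iterator = response.get("NextShardIterator")
```

A longer-running consumer would normally pause between polls to stay within the per-shard read limits, and would loop over every shard ID in the stream rather than a single one.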
---
## Data Lineage
**Input → Augmentation → Stream**
1. `creditcard.csv` from Kaggle
2. Augmented via SMOTE + Faker (see `02_data_augmentation_with_faker.ipynb`)
3. Serialized into JSON payloads
4. Pushed into `fraud-transactions-stream` shards
5. Read downstream for analytics, rule evaluation, and ML training
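
The SMOTE part of step 2 works by interpolating between a minority-class (fraud) sample and one of its nearest minority neighbours. The sketch below is a simplified, standard-library-only illustration of that idea, not the notebook's code; `smote_sample` is a hypothetical name, and a library implementation such as imbalanced-learn's `SMOTE` would normally be used instead.

```python
import math
import random


def smote_sample(minority: list[list[float]], k: int = 5,
                 n_new: int = 10, seed: int = 0) -> list[list[float]]:
    """Create n_new synthetic points between minority-class neighbours.

    Requires at least two minority samples.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # The k nearest minority neighbours of the base point, excluding itself.
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from base to neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic
```

Each synthetic row lies on a line segment between two real fraud samples, which is why SMOTE adds minority examples without simply duplicating existing ones.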
---
## Web3 Integration (Upcoming)
DARIA's risk events will soon publish to **smart contracts** that:
* Verify fraud-rule outcomes on-chain
* Append immutable audit logs
* Enable decentralized compliance tracing
> *AWS streams meet blockchain state: transparency by design.*
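
Since the contracts are not built yet, the property behind "immutable audit logs" can only be illustrated off-chain. The sketch below is a plain-Python hash chain, not a smart contract and not this project's planned design; `append_entry`, `verify`, and the event fields are all hypothetical. It shows why chaining each entry to the previous hash makes later edits detectable.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def _digest(prev_hash: str, event: dict) -> str:
    """SHA-256 over the previous hash plus the canonical JSON of the event."""
    body = json.dumps(event, sort_keys=True)
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append_entry(log: list[dict], event: dict) -> dict:
    """Append an event whose hash covers the previous entry's hash."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    entry = {"event": event, "prev_hash": prev_hash,
             "hash": _digest(prev_hash, event)}
    log.append(entry)
    return entry


def verify(log: list[dict]) -> bool:
    """Recompute every hash; an edited or reordered entry breaks the chain."""
    prev_hash = GENESIS
    for entry in log:
        if entry["prev_hash"] != prev_hash:
            return False
        if entry["hash"] != _digest(prev_hash, entry["event"]):
            return False
        prev_hash = entry["hash"]
    return True
```

An on-chain version would get the same tamper evidence from the blockchain itself, with the contract emitting or storing each risk event instead of appending to a local list.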
---
## Roadmap
| Phase | Focus |
| --------------- | --------------------------------------------------------- |
| **Tranche I** | Data Augmentation (SMOTE + Faker) ✅ |
| **Tranche II** | Real-time Streaming Producer (AWS Kinesis) ✅ |
| **Tranche III** | Fraud-Rule Engine + Smart Contract Logging |
| **Tranche IV** | Model Serving + SageMaker Integration |
| **Tranche V** | DARIA as an Autonomous Risk Agent (AWS Bedrock + Web3) |
---
## Vision
> "DARIA doesn't guess; she *knows* when something feels off."
> -- Naz Wright, DareDevTech
The goal isn't just to detect fraud; it's to teach machines the intuition of trust.
---
## Author
**Nazere Wright (@daredevtech)**
*Full-Stack + AWS Machine Learning Engineer*
Building myth-driven, cloud-native intelligence systems.
---