{"id":28419061,"url":"https://github.com/sorna-fast/fraud-detection","last_synced_at":"2026-04-28T17:32:40.852Z","repository":{"id":293745756,"uuid":"985006625","full_name":"sorna-fast/fraud-detection","owner":"sorna-fast","description":"Predicts transaction fraud with classification models such as Gradient Boosting and provides a Streamlit user interface. Reported AUC-ROC: 98%","archived":false,"fork":false,"pushed_at":"2025-10-18T01:21:15.000Z","size":5260,"stargazers_count":3,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"master","last_synced_at":"2025-10-19T01:08:24.833Z","etag":null,"topics":["adaboostclassifier","eda","gradientboostingclassifier","imblearn","lgbmclassifier","matplotlib-pyplot","numpy","pandas-dataframe","pickle-file","plotly-express","randomforestclassifier","scipy-stats","seaborn-plots","sklearn-library","streamlit-webapp","xgbclassifier"],"latest_commit_sha":null,"homepage":"","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/sorna-fast.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2025-05-16T22:28:59.000Z","updated_at":"2025-10-18T01:21:19.000Z","dependencies_parsed_at":"2025-08-28T07:19:05.764Z","dependency_job_id":"880ec759-fc39-49b2-a8cc-4432a6be4109","html_url":"https://github.com/sorna-fast/fraud-detection","commit_stats":null,"previous_names":["sorna-fast/fraud-detection"],"tags_count":0,"template":false,"template_full_name":n
ull,"purl":"pkg:github/sorna-fast/fraud-detection","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sorna-fast%2Ffraud-detection","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sorna-fast%2Ffraud-detection/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sorna-fast%2Ffraud-detection/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sorna-fast%2Ffraud-detection/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/sorna-fast","download_url":"https://codeload.github.com/sorna-fast/fraud-detection/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sorna-fast%2Ffraud-detection/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":32392293,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-28T14:34:11.604Z","status":"ssl_error","status_checked_at":"2026-04-28T14:32:37.009Z","response_time":56,"last_error":"SSL_read: unexpected eof while 
reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["adaboostclassifier","eda","gradientboostingclassifier","imblearn","lgbmclassifier","matplotlib-pyplot","numpy","pandas-dataframe","pickle-file","plotly-express","randomforestclassifier","scipy-stats","seaborn-plots","sklearn-library","streamlit-webapp","xgbclassifier"],"created_at":"2025-06-04T14:14:11.388Z","updated_at":"2026-04-28T17:32:40.848Z","avatar_url":"https://github.com/sorna-fast.png","language":"Jupyter Notebook","funding_links":[],"categories":[],"sub_categories":[],"readme":"\n# Project introduction in English\n\n# Financial Fraud Detection System - Technical Documentation\n\n![GitHub](https://img.shields.io/badge/Python-3.9%2B-blue)\n![GitHub](https://img.shields.io/badge/License-MIT-green)\n\n## Table of Contents\n- [Project Overview](#project-overview)\n- [Key Features](#key-features)\n- [Installation \u0026 Setup](#installation--setup)\n- [Project Structure](#project-structure)\n- [Running the Application](#running-the-application)\n- [Technical Documentation](#technical-documentation)\n- [Requirements](#requirements)\n- [License](#license)\n\n---\n\n## Project Overview\nThis system uses the **Gradient Boosting Algorithm** to detect fraudulent financial transactions with high accuracy. 
The project covers the complete pipeline from data analysis to UI implementation, including a Streamlit-based interface for real-time processing and result visualization.\n\n![Sample Output](visualizations/roc_curve.png)\n\n---\n\n## Key Features\n- 🕵️ Exploratory Data Analysis (EDA) with 7+ professional visualizations\n- 🚀 Model with a 98% AUC-ROC score\n- 📊 Web-based UI using Streamlit\n- 🔄 Real-time data processing capability\n- 📈 Comprehensive documentation\n\n---\n\n## Installation \u0026 Setup\n\n### Prerequisites\n- Python 3.9+\n- pip\n\n### Installation Steps:\n```bash\ngit clone https://github.com/sorna-fast/fraud-detection.git\ncd fraud-detection\npip install -r requirements.txt\n```\n\n---\n\n## Project Structure\n```\nfraud-detection/\n├── apps/                  # Core application code\n│   ├── src/              # Processing modules\n│   └── data/             # Data processing \u0026 splitting\n├── model/                # Trained model\n│   └── gb_classifier.pkl\n├── notebooks/            # Data analysis notebooks\n│   ├── Fraud_Detection_EDA_Model_Training_FA.ipynb (Persian comments)\n│   └── Fraud_Detection_EDA_Model_Training_EN.ipynb (English comments)\n├── visualizations/       # Visualization outputs\n│   ├── confusion_matrix_test.png\n│   └── roc_curve.png\n        ...\n├── .gitignore\n├── app.py                # Application entry point\n├── README.md\n└── requirements.txt\n```\n\n---\n\n## Running the Application\nTo launch the web interface:\n```bash\nstreamlit run app.py\n```\n\n---\n\n## Technical Documentation\n\n### 1. Dataset\n- **File Name:** `fraud_dataset_mod.csv`\n- **Key Characteristics:**\n  - 17 numerical \u0026 categorical features\n  - 50,001 records\n  - Balanced using RandomUnderSampler\n\n### 2. Model\n- **Algorithm:** Gradient Boosting Classifier + RandomUnderSampler\n- **Performance:** 98% AUC-ROC\n- **Input:** 12 processed features\n- **Output:** Fraud probability (0-1)\n\n### 3. 
Visualizations\n| File Name | Description |\n|----------|---------|\n| `categorical_distribution.png` | Categorical feature distribution |\n| `numeric_features_boxplot.png` | Outlier analysis |\n\n\n---\n\n## Requirements\nFull requirements list available in [`requirements.txt`](requirements.txt)\n\n---\n\n## License\nThis project is licensed under the [MIT](LICENSE) License.\n\n---\n\n👋 We hope you find this project useful! 🚀\n\n## 👨‍💻 Author\n**Masoud Ghasemi**\n\n- **GitHub**: [sorna-fast](https://github.com/sorna-fast)\n- **Email**: [masudpythongit@gmail.com](mailto:masudpythongit@gmail.com)\n- **linkedin**: [masoud-ghasemi](https://www.linkedin.com/in/masoud-ghasemi-748412381)\n- **Telegram**: [@Masoud_Ghasemi_sorna_fast](https://t.me/Masoud_Ghasemi_sorna_fast)\n\n---\n\n\n\n# Project introduction in Persian\n\n# سیستم تشخیص تقلب در تراکنش‌های مالی - مستندات فنی\n\n![GitHub](https://img.shields.io/badge/Python-3.9%2B-blue)\n![GitHub](https://img.shields.io/badge/License-MIT-green)\n\n## فهرست مطالب\n- [معرفی پروژه](#معرفی-پروژه)\n- [ویژگی‌های کلیدی](#ویژگی‌های-کلیدی)\n- [نصب و راه‌اندازی](#نصب-و-راهاندازی)\n- [ساختار پروژه](#ساختار-پروژه)\n- [اجرای برنامه](#اجرای-برنامه)\n- [مستندات فنی](#مستندات-فنی)\n- [لیست نیازمندی‌ها](#لیست-نیازمندیها)\n- [مجوز](#مجوز)\n\n\n\n## معرفی پروژه\nاین سیستم با استفاده از **الگوریتم Gradient Boosting** قادر به تشخیص تراکنش‌های مالی تقلبی با دقت بالا است. 
پروژه شامل مراحل کامل از تحلیل داده تا پیاده‌سازی رابط کاربری می‌باشد و از محیط کاربری استریملیت برای نمایش نتایج و پردازش داده‌های جدید استفاده می‌کند.\n\n![نمونه خروجی](visualizations/roc_curve.png)\n\n\n\n## ویژگی‌های کلیدی\n- 🕵️ تحلیل اکتشافی داده (EDA) با ۷+ نمودار حرفه‌ای\n- 🚀 مدل با دقت 98% AUC-ROC\n- 📊 رابط کاربری تحت وب با Streamlit\n- 🔄 قابلیت پردازش بلادرنگ داده‌های جدید\n- 📈 مستندات کامل و آماده انتشار\n\n\n\n## نصب و راه‌اندازی\n\n### پیش‌نیازها\n- Python 3.9+\n- pip\n\n### مراحل نصب:\n```bash\ngit clone https://github.com/sorna-fast/fraud-detection.git\ncd fraud-detection\npip install -r requirements.txt\n```\n\n---\n\n## ساختار پروژه\n```\nfraud-detection/\n├── apps/                  # کدهای اصلی برنامه\n│   ├── src/              # ماژول‌های پردازشی\n│   └── data/             # پردازش و تقسیم داده\n├── model/                # مدل آموزش دیده\n│   └── gb_classifier.pkl\n├── notebooks/            # تحلیل‌های داده\n│   ├── Fraud_Detection_EDA_Model_Training_FA.ipynb (کامنت‌های فارسی)\n│   └── Fraud_Detection_EDA_Model_Training_EN.ipynb (کامنت‌های انگلیسی)\n├── visualizations/       # خروجی نمودارها\n│   ├── confusion_matrix_test.png\n│   └── roc_curve.png\n        ...\n├── .gitignore\n├── app.py                # نقطه ورود برنامه\n├── README.md\n└── requirements.txt\n```\n\n---\n\n## اجرای برنامه\nبرای اجرای رابط کاربری:\n```bash\nstreamlit run app.py\n```\n\n---\n\n## مستندات فنی\n\n### ۱. دیتاست\n- **نام فایل:** `fraud_dataset_mod.csv`\n- **ویژگی‌های کلیدی:**\n  - 17 ویژگی عددی و دسته‌ای\n  - 50001 رکورد \n  - متوازن‌سازی شده با RandomUnderSampler\n\n### ۲. مدل\n- **الگوریتم:** Gradient Boosting Classifier + RandomUnderSampler \n- **دقت:** ۹۸% AUC-ROC\n- **ورودی:** ۱۲ ویژگی پردازش شده\n- **خروجی:** احتمال تقلب (۰ تا ۱)\n\n### ۳. 
ویزوالایزیشن‌ها\n| نام فایل | توضیحات |\n|----------|---------|\n| `categorical_distribution.png` | توزیع ویژگی‌های دسته‌ای |\n| `numeric_features_boxplot.png` | تحلیل داده‌های پرت |\n  \n\n---\n\n## لیست نیازمندی‌ها\nمشاهده کامل نیازمندی‌ها در [`requirements.txt`](requirements.txt)\n\n---\n\n## مجوز\nاین پروژه تحت مجوز [MIT](LICENSE) منتشر شده است.\n\n\n\n👋 امیدواریم این پروژه برای شما مفید باشد! 🚀\n\n## 👨‍💻 نویسنده\n**مسعود قاسمی**\n\n- **GitHub**: [sorna-fast](https://github.com/sorna-fast)\n- **Email**: [masudpythongit@gmail.com](mailto:masudpythongit@gmail.com)\n- **linkedin**: [masoud-ghasemi](https://www.linkedin.com/in/masoud-ghasemi-748412381)\n- **Telegram**: [@Masoud_Ghasemi_sorna_fast](https://t.me/Masoud_Ghasemi_sorna_fast)\n\n---\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsorna-fast%2Ffraud-detection","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fsorna-fast%2Ffraud-detection","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsorna-fast%2Ffraud-detection/lists"}