{"id":48577795,"url":"https://github.com/amanbig/fraud_detection","last_synced_at":"2026-04-08T16:03:33.595Z","repository":{"id":316896548,"uuid":"1065256315","full_name":"Amanbig/fraud_detection","owner":"Amanbig","description":"A comprehensive machine learning-based fraud detection system built with Streamlit, Scikit-learn, and XGBoost. This application provides real-time fraud detection capabilities with an intuitive web interface.","archived":false,"fork":false,"pushed_at":"2025-09-27T11:10:11.000Z","size":51,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-03-30T02:44:33.722Z","etag":null,"topics":["sckit-learn","streamlit"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/Amanbig.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2025-09-27T11:03:48.000Z","updated_at":"2025-10-21T16:45:20.000Z","dependencies_parsed_at":"2025-09-28T15:19:08.896Z","dependency_job_id":null,"html_url":"https://github.com/Amanbig/fraud_detection","commit_stats":null,"previous_names":["amanbig/fraud_detection"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/Amanbig/fraud_detection","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Amanbig%2Ffraud_detection","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Amanbig%2Ffraud_detection/tags","releases_
url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Amanbig%2Ffraud_detection/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Amanbig%2Ffraud_detection/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/Amanbig","download_url":"https://codeload.github.com/Amanbig/fraud_detection/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Amanbig%2Ffraud_detection/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":31562697,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-08T14:31:17.711Z","status":"ssl_error","status_checked_at":"2026-04-08T14:31:17.202Z","response_time":54,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.5:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["sckit-learn","streamlit"],"created_at":"2026-04-08T16:03:31.178Z","updated_at":"2026-04-08T16:03:33.582Z","avatar_url":"https://github.com/Amanbig.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# 🔒 Fraud Detection System\n\nA comprehensive machine learning-based fraud detection system built with **Streamlit**, **Scikit-learn**, and **XGBoost**. 
This application provides real-time fraud detection capabilities with an intuitive web interface.\n\n## 📋 Table of Contents\n- [Features](#features)\n- [Dataset](#dataset)\n- [Models](#models)\n- [Installation](#installation)\n- [Usage](#usage)\n- [Project Structure](#project-structure)\n- [Model Performance](#model-performance)\n- [Screenshots](#screenshots)\n- [Contributing](#contributing)\n\n## ✨ Features\n\n### 🔍 Single Transaction Prediction\n- **Interactive Input Form**: Easy-to-use interface for entering transaction details\n- **Dual Model Predictions**: Compare results from Logistic Regression and XGBoost\n- **Real-time Results**: Instant fraud probability calculations\n- **Visual Comparisons**: Bar charts showing model confidence levels\n\n### 📊 Dataset Analytics\n- **Comprehensive Statistics**: Key metrics including fraud rate and transaction patterns\n- **Interactive Visualizations**: \n  - Fraud vs Legitimate transaction distribution\n  - Transaction amounts by fraud status\n  - Merchant category analysis\n  - Location-based insights\n\n### 📈 Model Performance Analysis\n- **Accuracy Metrics**: Performance comparison between models\n- **Probability Distributions**: Fraud score distributions for both models\n- **Feature Importance**: XGBoost feature importance visualization\n- **Confusion Matrices**: Model performance evaluation\n\n## 📊 Dataset\n\nThe system uses a fraud detection dataset with the following features:\n\n| Feature | Description | Type |\n|---------|-------------|------|\n| `TransactionID` | Unique transaction identifier | Numeric |\n| `Amount` | Transaction amount in USD | Numeric |\n| `Time` | Time since first transaction (seconds) | Numeric |\n| `Location` | Transaction location | Categorical |\n| `MerchantCategory` | Type of merchant | Categorical |\n| `CardHolderAge` | Age of cardholder | Numeric |\n| `IsFraud` | Target variable (0=Legitimate, 1=Fraud) | Binary |\n\n### Data Preprocessing\n- **Categorical Encoding**: OrdinalEncoder for 
location and merchant category\n- **Missing Value Handling**: Mean imputation for numeric features\n- **Feature Scaling**: StandardScaler for numeric features\n- **Data Validation**: Robust handling of unknown categories\n\n## 🤖 Models\n\n### 1. Logistic Regression\n- **Purpose**: Baseline model for interpretable predictions\n- **Accuracy**: ~94.4%\n- **Advantages**: Fast, interpretable, good baseline performance\n- **Use Case**: Quick fraud screening and interpretable results\n\n### 2. XGBoost Classifier\n- **Purpose**: Advanced ensemble model for complex pattern recognition\n- **Accuracy**: ~94.4%\n- **Advantages**: Handles non-linear relationships, feature importance, robust performance\n- **Use Case**: Production-ready fraud detection with high accuracy\n\n## 🚀 Installation\n\n### Prerequisites\n- Python 3.8 or higher\n- pip package manager\n\n### Quick Start\n\n1. **Clone or Download the Project**\n   ```bash\n   git clone https://github.com/Amanbig/fraud_detection\n   cd fraud_detection\n   ```\n\n2. **Install Dependencies**\n   ```bash\n   pip install -r requirements.txt\n   ```\n\n3. **Prepare Models (First Time Only)**\n   ```bash\n   python retrain_models.py\n   ```\n\n4. **Run the Application**\n   ```bash\n   streamlit run fraud_detection_app.py\n   ```\n\n5. **Open Your Browser**\n   - The app will automatically open at `http://localhost:8501`\n   - If not, navigate to the URL shown in your terminal\n\n### Manual Installation\n```bash\npip install streamlit pandas numpy scikit-learn xgboost plotly seaborn matplotlib\n```\n\n## 📖 Usage\n\n### 🔍 Making Predictions\n\n1. **Navigate to Single Prediction Page**\n2. **Enter Transaction Details**:\n   - Transaction ID (numeric)\n   - Amount in USD\n   - Time (seconds since the first transaction)\n   - Location (dropdown selection)\n   - Merchant Category (dropdown selection)\n   - Cardholder Age\n\n3. **Click \"Predict Fraud\"**\n4. 
**View Results**:\n   - Both models' predictions\n   - Fraud probability scores\n   - Visual comparison chart\n\n### 📊 Exploring Data\n\n1. **Go to Dataset Overview Page**\n2. **Review Key Metrics**:\n   - Total transactions\n   - Number of fraud cases\n   - Overall fraud rate\n   - Average transaction amount\n\n3. **Analyze Visualizations**:\n   - Transaction distributions\n   - Location patterns\n   - Merchant category insights\n\n### 📈 Model Analysis\n\n1. **Visit Model Analytics Page**\n2. **Compare Model Performance**:\n   - Accuracy scores\n   - Probability distributions\n   - Feature importance (XGBoost)\n   - Confusion matrices\n\n## 📁 Project Structure\n\n```\nfraud_detection/\n├── 📄 README.md                    # Project documentation\n├── 🐍 fraud_detection_app.py       # Main Streamlit application\n├── 🐍 retrain_models.py           # Model retraining script\n├── 📊 model-deployment.ipynb      # Original development notebook\n├── 📈 fraud-detection.csv         # Dataset file\n├── 📋 requirements.txt            # Python dependencies\n├── 🤖 logistic.pkl               # Trained Logistic Regression model\n├── 🤖 xgboost.pkl                # Trained XGBoost model\n├── ⚙️ ordinal_encoder.pkl        # Saved categorical encoder\n├── ⚙️ scaler.pkl                 # Saved feature scaler\n└── ⚙️ feature_names.pkl          # Saved feature names\n```\n\n## 📈 Model Performance\n\n### Training Results\n- **Dataset Size**: 500 transactions\n- **Fraud Rate**: ~14% (realistic imbalanced dataset)\n- **Train/Test Split**: 75/25\n\n### Performance Metrics\n| Model | Accuracy | Precision | Recall | F1-Score |\n|-------|----------|-----------|--------|----------|\n| Logistic Regression | 94.4% | High | Good | Good |\n| XGBoost | 94.4% | High | Good | Good |\n\n### Key Insights\n- Both models reach the same accuracy (~94.4%) on the test split\n- XGBoost additionally exposes feature importance scores\n- Logistic Regression offers faster predictions and interpretability\n- System handles class imbalance 
effectively\n\n## 🖼️ Screenshots\n\n### Main Dashboard\n- Clean, professional interface\n- Easy navigation between features\n- Real-time prediction results\n\n### Prediction Interface\n- User-friendly input forms\n- Dropdown selections for categorical features\n- Clear fraud/legitimate indicators\n\n### Analytics Dashboard\n- Interactive charts and visualizations\n- Model comparison tools\n- Performance metrics display\n\n## ⚙️ Configuration\n\n### Model Retraining\nTo retrain models with your own data:\n\n1. **Update Dataset**: Replace `fraud-detection.csv` with your data\n2. **Run Retraining**: `python retrain_models.py`\n3. **Restart App**: `streamlit run fraud_detection_app.py`\n\n### Customization Options\n- **Styling**: Modify CSS in the Streamlit app\n- **Features**: Add/remove input fields as needed\n- **Models**: Integrate additional ML algorithms\n- **Visualizations**: Customize charts using Plotly\n\n## 🔧 Troubleshooting\n\n### Common Issues\n\n1. **Model Loading Errors**\n   ```bash\n   python retrain_models.py  # Regenerate models\n   ```\n\n2. **Missing Dependencies**\n   ```bash\n   pip install -r requirements.txt --upgrade\n   ```\n\n3. **Dataset Not Found**\n   - Ensure `fraud-detection.csv` is in the project directory\n   - Check file encoding (use UTF-8)\n\n4. **Port Already in Use**\n   ```bash\n   streamlit run fraud_detection_app.py --server.port 8502\n   ```\n\n## 🤝 Contributing\n\n1. Fork the repository\n2. Create a feature branch (`git checkout -b feature/AmazingFeature`)\n3. Commit changes (`git commit -m 'Add AmazingFeature'`)\n4. Push to branch (`git push origin feature/AmazingFeature`)\n5. 
Open a Pull Request\n\n## 📜 License\n\nThis project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.\n\n## 🙏 Acknowledgments\n\n- **Scikit-learn**: Machine learning library\n- **XGBoost**: Gradient boosting framework\n- **Streamlit**: Web app framework\n- **Plotly**: Interactive visualizations\n- **Seaborn \u0026 Matplotlib**: Statistical plotting\n\n## 📞 Support\n\nFor questions or support, please:\n1. Check the troubleshooting section\n2. Review existing issues\n3. Create a new issue with detailed description\n\n---\n\n**Built with ❤️ for fraud detection and financial security**\n\n### Quick Commands Reference\n```bash\n# Install dependencies\npip install -r requirements.txt\n\n# Retrain models (first time or after data changes)\npython retrain_models.py\n\n# Run the application\nstreamlit run fraud_detection_app.py\n\n# Run on different port\nstreamlit run fraud_detection_app.py --server.port 8502\n```\n\n🔒 **Stay Safe, Detect Fraud!**","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Famanbig%2Ffraud_detection","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Famanbig%2Ffraud_detection","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Famanbig%2Ffraud_detection/lists"}