https://github.com/kefrankk/ml-fraud-detection
I built a predictive model to detect fraud in financial transactions.
pandas python scikit-learn
- Host: GitHub
- URL: https://github.com/kefrankk/ml-fraud-detection
- Owner: kefrankk
- Created: 2024-09-27T17:55:50.000Z (4 months ago)
- Default Branch: main
- Last Pushed: 2024-10-23T14:40:45.000Z (3 months ago)
- Last Synced: 2024-10-23T17:45:57.777Z (3 months ago)
- Topics: pandas, python, scikit-learn
- Language: Jupyter Notebook
- Homepage:
- Size: 7.57 MB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
# Fraud detection using Isolation Forest
This project uses **Isolation Forest** to detect potential fraud in transactional data. I used the **Credit Card Fraud Detection** dataset available on Kaggle, which contains 294,588 credit card transactions. Because the dataset is unlabeled, the model cannot learn from known fraud cases. Instead, Isolation Forest flags anomalies (potentially fraudulent transactions) using an estimated contamination rate, which is the expected proportion of fraud cases in the data.
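To show how the contamination rate shapes the result, here is a minimal sketch on synthetic data, not the Kaggle dataset. The 2% contamination value, the two-feature layout, and all variable names are assumptions made for illustration; the notebook's actual settings may differ.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(42)

# Synthetic stand-in for transaction features (hypothetical data):
# 980 "normal" points near the origin and 20 outliers far from it.
normal = rng.normal(loc=0.0, scale=1.0, size=(980, 2))
outliers = rng.uniform(low=6.0, high=10.0, size=(20, 2))
X = np.vstack([normal, outliers])

# contamination is the assumed share of anomalies (2% here, chosen for the demo).
model = IsolationForest(n_estimators=100, contamination=0.02, random_state=42)
labels = model.fit_predict(X)        # -1 = anomaly, 1 = normal
scores = model.decision_function(X)  # lower score = more anomalous

n_flagged = int((labels == -1).sum())
print(f"Flagged {n_flagged} of {len(X)} points as anomalies")
```

The contamination value sets the score threshold, so the model flags roughly that share of the training data whether or not those points are truly fraudulent. Treat the flagged rows as candidates for review, not confirmed fraud.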
## Project Overview
This project aims to detect suspicious transactions by leveraging the Isolation Forest model, widely used for anomaly detection in financial data. The model identifies anomalies by isolating data points that differ significantly from the majority, making it suitable for fraud detection in the absence of labeled data.

### Key Steps
1. Data Preprocessing:
   - Handle categorical data with **One-Hot Encoding** to ensure all features are numeric.
   - Normalize numerical features to improve the efficiency and accuracy of the model.
2. Model Training:
   - **Isolation Forest** is trained with an estimated contamination level based on typical fraud rates in similar datasets.
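The steps above can be sketched as a single scikit-learn pipeline. This is an illustrative outline, not the notebook's code: the column names (`amount`, `hour`, `merchant_category`), the toy rows, and the contamination value are all hypothetical, and `StandardScaler` is assumed as the normalization method.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import IsolationForest
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Hypothetical toy transactions; the real column names live in the notebook.
df = pd.DataFrame({
    "amount": [12.5, 40.0, 8.99, 15.0, 9800.0, 22.3, 31.0, 18.75],
    "hour": [9, 13, 18, 11, 3, 15, 20, 10],
    "merchant_category": ["grocery", "fuel", "grocery", "dining",
                          "electronics", "fuel", "dining", "grocery"],
})

numeric_cols = ["amount", "hour"]
categorical_cols = ["merchant_category"]

# Step 1: scale numeric features and one-hot encode categorical ones.
preprocess = ColumnTransformer([
    ("num", StandardScaler(), numeric_cols),
    ("cat", OneHotEncoder(handle_unknown="ignore"), categorical_cols),
])

# Step 2: train Isolation Forest with an assumed contamination level.
pipeline = Pipeline([
    ("preprocess", preprocess),
    ("model", IsolationForest(contamination=0.125, random_state=0)),
])

# fit_predict returns -1 for anomalies and 1 for normal rows.
df["anomaly"] = pipeline.fit_predict(df[numeric_cols + categorical_cols])
print(df[df["anomaly"] == -1])
```

Wrapping preprocessing and the model in one `Pipeline` keeps the encoding and scaling fitted on the same data the model sees, so new transactions can be scored later with `pipeline.predict` without repeating the steps by hand.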
### Key Files
- `data/`: Contains sample data for training and testing.
- `fraud_detection_model.ipynb`: Jupyter notebook detailing data exploration, preprocessing, and model training.
## Installation
1. Clone the repository:
```
git clone https://github.com/kefrankk/ml-fraud-detection
cd ml-fraud-detection
```
2. Install dependencies:
```
pip install -r requirements.txt
```