https://github.com/mehrab-kalantari/credit-card-fraud-detection
Credit card transactions fraud detection using classic algorithms
- Host: GitHub
- URL: https://github.com/mehrab-kalantari/credit-card-fraud-detection
- Owner: Mehrab-Kalantari
- Created: 2024-01-18T14:08:15.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2024-01-18T14:32:13.000Z (over 1 year ago)
- Last Synced: 2025-01-16T09:42:28.675Z (4 months ago)
- Topics: association-analysis, auc-roc-curve, correlation-analysis, credit-card-fraud-detection, feature-engineering, fraud-detection, imbalanced-learning, model-selection, roc-curve, smote-tomek
- Language: Jupyter Notebook
- Homepage:
- Size: 2.07 MB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# Credit Card Transactions Fraud Detection
[Dataset on kaggle](https://www.kaggle.com/datasets/kartik2112/fraud-detection)
## Contents
### Data Understanding
* Features
* Null values detection
* Duplicated values detection
### Data Cleaning For EDA
* Column removal
* Discretization
* Creating new features
* Feature extraction
### Feature Engineering
* Datetime feature extraction
* Credit card feature extraction
### Exploratory Data Analysis
* Univariate Analysis
  * Target
  * Categorical features
  * Numerical features
* Bivariate Analysis
  * Target analysis
  * Amount of activity analysis
  * Time analysis
### Correlation and Association Analysis
* Correlation matrix
* Association matrix
### Data Preprocessing
* Column removal
* Log transform
* Categorical encoding
  * Binary encoding
  * Weight of evidence encoding
  * Ordinal encoding
* Train-test split
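
The log transform and weight of evidence (WoE) encoding steps listed above can be illustrated with a small dependency-free sketch. It is not taken from the notebook: the function name `woe_encode`, the toy values, the additive smoothing, and the sign convention (log of the fraud share over the non-fraud share, so that positive values indicate categories over-represented among frauds) are all assumptions made for illustration. Some references use the inverse ratio.

```python
import math
from collections import Counter


def woe_encode(categories, targets, smoothing=0.5):
    """Return a {category: WoE} table.

    WoE here is ln(P(category | y=1) / P(category | y=0)).
    The sign convention and the additive smoothing (which avoids
    division by zero for categories seen in only one class) are
    assumptions of this sketch.
    """
    pos = Counter(c for c, y in zip(categories, targets) if y == 1)
    neg = Counter(c for c, y in zip(categories, targets) if y == 0)
    levels = set(categories)
    total_pos = sum(pos.values()) + smoothing * len(levels)
    total_neg = sum(neg.values()) + smoothing * len(levels)
    table = {}
    for level in levels:
        p_pos = (pos[level] + smoothing) / total_pos
        p_neg = (neg[level] + smoothing) / total_neg
        table[level] = math.log(p_pos / p_neg)
    return table


# Toy data (hypothetical, not from the Kaggle dataset).
categories = ["grocery", "grocery", "travel", "travel", "travel", "grocery"]
targets = [0, 0, 1, 1, 0, 0]
amounts = [12.5, 3.0, 950.0, 1200.0, 40.0, 7.25]

woe_table = woe_encode(categories, targets)
encoded = [woe_table[c] for c in categories]

# Log transform for a right-skewed amount column; log1p handles zero safely.
log_amounts = [math.log1p(a) for a in amounts]
```

To avoid leaking target information, a table like `woe_table` would normally be fitted on the training split only and then applied to the test split.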
### Imbalanced Learning
The target is imbalanced: fraudulent transactions are the minority class.
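
To make the imbalance concrete, the sketch below computes class weights that are inversely proportional to class frequency, using the formula `n_samples / (n_classes * class_count)`, which is the heuristic scikit-learn documents for `class_weight="balanced"`. The helper name `balanced_class_weights` and the toy label counts are assumptions for illustration and do not reflect the dataset's real class ratio.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * n) for cls, n in counts.items()}


# Toy labels (hypothetical): 995 legitimate transactions, 5 frauds.
labels = [0] * 995 + [1] * 5
weights = balanced_class_weights(labels)

# With these weights, each class contributes the same total weight.
total_weight_per_class = {
    cls: weights[cls] * n for cls, n in Counter(labels).items()
}
```

With weights like these, a misclassified fraud costs the model far more than a misclassified legitimate transaction, which is one way to counter the imbalance without resampling.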
* Methods performed
  * No changes
  * Random under sampling
  * Random over sampling
  * SMOTE-Tomek links
  * Class weights
### Feature Importance
### Modeling
1. Random Forest Classifier
2. Logistic Regression Classifier
3. Naive Bayes
4. Decision Tree Classifier
5. Support Vector Machine (SVM) Classifier
6. K-nearest neighbor (KNN) Classifier
### Evaluation
* Confusion matrix
* AUC curve
* Classification metrics
* Decision boundary

Results on random forest classifier for test data (figures not reproduced here)
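
As a rough illustration of the evaluation items in this section, the following dependency-free sketch computes a confusion matrix, precision, recall, F1, and the area under the ROC curve. It is not the notebook's code: the function names, the 0.5 threshold, and the toy labels and scores are assumptions, and the numbers are not results from the project. The AUC uses the pairwise definition (the probability that a random positive is scored above a random negative, with ties counted as one half), which is quadratic in the number of samples and only suitable for small inputs.

```python
def confusion_counts(y_true, y_pred):
    """Return (tn, fp, fn, tp) for binary labels in {0, 1}."""
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    return tn, fp, fn, tp


def precision_recall_f1(y_true, y_pred):
    """Classification metrics for the positive (fraud) class."""
    _, fp, fn, tp = confusion_counts(y_true, y_pred)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def roc_auc(y_true, scores):
    """Pairwise AUC: P(score of positive > score of negative), ties = 0.5."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one sample of each class")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy labels and predicted fraud scores (hypothetical).
y_true = [0, 0, 0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8, 0.7, 0.9]
y_pred = [1 if s >= 0.5 else 0 for s in scores]

tn, fp, fn, tp = confusion_counts(y_true, y_pred)
precision, recall, f1 = precision_recall_f1(y_true, y_pred)
auc = roc_auc(y_true, scores)
```

On imbalanced data, precision, recall, and AUC are usually more informative than plain accuracy, since a model that predicts "not fraud" for everything can still reach very high accuracy.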
### Model Selection
Results on all models for test data (figures not reproduced here)
