https://github.com/krishaa1803/bitcoin-transaction-anomaly-detection-using-unsupervised-machine-learning
Built an unsupervised Machine Learning pipeline to detect anomalies in Bitcoin transactions by selecting 19 key features from 700.
- Host: GitHub
- URL: https://github.com/krishaa1803/bitcoin-transaction-anomaly-detection-using-unsupervised-machine-learning
- Owner: krishaa1803
- License: mit
- Created: 2025-06-22T19:20:03.000Z (about 1 year ago)
- Default Branch: main
- Last Pushed: 2025-06-22T19:34:43.000Z (about 1 year ago)
- Last Synced: 2025-06-22T20:30:03.656Z (about 1 year ago)
- Topics: anomaly-detection, bitcoin, blockchain, data-science, dbscan, fraud-detection, isolation-forest, k-means, machine-learning, pca, python, tsne, unsupervised-learning
- Language: HTML
- Homepage:
- Size: 16.6 MB
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
# Bitcoin-Transaction-Anomaly-Detection-using-Unsupervised-Machine-Learning
Built an unsupervised machine learning pipeline to detect anomalies in Bitcoin transactions by selecting 19 key features from 700. The pipeline uses PCA and t-SNE for dimensionality reduction, Isolation Forest for anomaly detection, and K-Means/DBSCAN for clustering. A Hampel filter handles noise correction, silhouette scores evaluate cluster quality, and a Random Forest ranks feature importance.
---
## 🧠 Key Concepts
- **Unsupervised Learning**: No labeled data required.
- **Dimensionality Reduction**: Visualization and structure discovery.
- **Clustering & Isolation**: Identify anomalous transactions.
- **Feature Analysis**: Understand key drivers of anomalies.
---
## 🚀 Technologies & Libraries
- Python 3.x
- NumPy / Pandas
- Scikit-learn
- Matplotlib / Seaborn
- t-SNE / PCA
- Isolation Forest / DBSCAN / K-Means
- Hampel Filter for outlier preprocessing
---
## 📊 Pipeline Overview
### 1. 📂 Data Preprocessing
- Transaction data is cleaned and normalized.
- **Hampel filter** is applied to reduce noise: values that deviate too far from the local median are treated as extreme outliers and replaced with that median.
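
The Hampel filtering step can be sketched roughly as below. This is a minimal illustration, not the repository's code: the function name `hampel_filter`, the window size, the 3-sigma threshold, and the toy series are all assumptions chosen for the example.

```python
import numpy as np
import pandas as pd


def hampel_filter(series: pd.Series, window: int = 5, n_sigmas: float = 3.0) -> pd.Series:
    """Replace values far from the rolling median with that rolling median.

    Hypothetical helper for illustration; window and n_sigmas are assumed defaults.
    """
    k = 1.4826  # scales the MAD to a standard-deviation estimate for Gaussian data
    rolling = series.rolling(window=window, center=True, min_periods=1)
    median = rolling.median()
    # Median absolute deviation (MAD) computed inside each window
    mad = rolling.apply(lambda w: np.median(np.abs(w - np.median(w))), raw=True)
    is_outlier = (series - median).abs() > n_sigmas * k * mad
    # Keep original values where they are not outliers, otherwise use the median
    return series.where(~is_outlier, median)


# Toy series with one obvious spike at position 4 (synthetic data)
raw = pd.Series([1.0, 1.1, 0.9, 1.0, 50.0, 1.05, 0.95, 1.0, 1.1, 0.9])
cleaned = hampel_filter(raw, window=5)
```

Note that when the MAD of a window is zero, any nonzero deviation is flagged, so very flat series may need a small tolerance added to the threshold.
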
### 2. 🔻 Dimensionality Reduction
- **PCA** is used to reduce the feature space while retaining most of the variance.
- **t-SNE** helps in visualizing complex, high-dimensional patterns.
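
One way the two reduction steps might be chained with scikit-learn is sketched below. The synthetic matrix, the 95% variance threshold, and the t-SNE perplexity are assumptions for illustration; the README does not state the values actually used.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.manifold import TSNE
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the real data: 300 transactions x 19 selected features
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 19))

# Standardize so that no single feature dominates the principal components
X_scaled = StandardScaler().fit_transform(X)

# Keep as many components as needed to explain 95% of the variance (assumed threshold)
pca = PCA(n_components=0.95, svd_solver="full")
X_pca = pca.fit_transform(X_scaled)

# 2-D t-SNE embedding, intended for visualization rather than as model input
tsne = TSNE(n_components=2, perplexity=30, init="pca", random_state=0)
X_2d = tsne.fit_transform(X_pca)
```

Running t-SNE on the PCA output rather than on the raw features is a common design choice because it reduces noise and computation, but it is an assumption here, not a documented detail of this project.
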
### 3. 📌 Clustering for Pattern Discovery
- **K-Means Clustering** for identifying common behavior groups.
- **DBSCAN** for density-based clustering, where points outside any dense region are labeled as noise and treated as candidate anomalies.
- **Silhouette Score** is used to evaluate cluster quality.
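
The clustering and evaluation steps above could look something like the following sketch. The blob data, `n_clusters=3`, `eps=0.5`, and `min_samples=5` are illustrative assumptions; real transaction features would need their own tuning.

```python
import numpy as np
from sklearn.cluster import DBSCAN, KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for reduced transaction features
X, _ = make_blobs(n_samples=300, centers=3, n_features=5, cluster_std=0.6, random_state=0)
X = StandardScaler().fit_transform(X)

# K-Means: partition the data into a fixed number of behavior groups
kmeans_labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)
kmeans_score = silhouette_score(X, kmeans_labels)

# DBSCAN: label -1 marks points that belong to no dense region (noise)
dbscan_labels = DBSCAN(eps=0.5, min_samples=5).fit_predict(X)
noise_mask = dbscan_labels == -1

# The silhouette score needs at least two clusters, so score only clustered points
clustered = ~noise_mask
n_dbscan_clusters = len(set(dbscan_labels[clustered]))
dbscan_score = (
    silhouette_score(X[clustered], dbscan_labels[clustered])
    if n_dbscan_clusters >= 2
    else None
)
```

Silhouette values range from -1 to 1, with higher values indicating tighter, better separated clusters. Excluding DBSCAN noise points before scoring is one convention among several and affects how the two scores compare.
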
### 4. 🚨 Outlier Detection
- **Isolation Forest** detects anomalous transactions by isolating rare patterns.
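
A minimal sketch of this step with scikit-learn's `IsolationForest` is shown below. The synthetic data, the injected extreme rows, and `contamination=0.01` are assumptions made for the example and are not taken from the project.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(0)
normal = rng.normal(0, 1, size=(500, 4))        # typical transactions (synthetic)
extreme = rng.normal(0, 1, size=(5, 4)) + 8.0   # a few rows shifted far from the bulk
X = np.vstack([normal, extreme])

# contamination is the assumed share of anomalies; it sets the decision threshold
iso = IsolationForest(n_estimators=200, contamination=0.01, random_state=0)
labels = iso.fit_predict(X)          # -1 = anomaly, 1 = normal
scores = iso.decision_function(X)    # lower scores are more anomalous

anomaly_idx = np.where(labels == -1)[0]
```

Isolation Forest scores a point by how few random splits are needed to isolate it, so rare, extreme rows tend to receive the lowest scores. The `contamination` value strongly influences how many rows are flagged and usually has to be chosen from domain knowledge.
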
### 5. 📈 Feature Importance
- A **Random Forest** model ranks the most influential features post-clustering to help interpret anomaly causes (e.g., transaction value, frequency, mining difficulty, sentiment metrics).
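
One plausible reading of this step is to train a classifier on the cluster labels and inspect its feature importances, sketched below. The column names are hypothetical placeholders borrowed from the examples above, the data is synthetic, and using K-Means labels as the target is an assumption rather than a confirmed detail of the repository.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
n = 400

# Synthetic features: only transaction_value separates the two groups
df = pd.DataFrame({
    "transaction_value": np.concatenate([rng.normal(0, 1, n // 2), rng.normal(6, 1, n // 2)]),
    "frequency": rng.normal(0, 1, n),
    "mining_difficulty": rng.normal(0, 1, n),
})

# Step 1: unsupervised labels from clustering
cluster_labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(df)

# Step 2: a supervised model learns to reproduce those labels
rf = RandomForestClassifier(n_estimators=200, random_state=0).fit(df, cluster_labels)

# Step 3: rank features by impurity-based importance (values sum to 1)
importances = pd.Series(rf.feature_importances_, index=df.columns).sort_values(ascending=False)
```

The ranking describes which features best separate the clusters, which helps interpretation but does not by itself establish what causes an anomaly.
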
---