https://github.com/eeddaann/data-mining-project



# Data Mining Project
This notebook is the final project for a data-mining course.
For this project, we applied data-mining techniques using Python's scikit-learn library.
The project consists of:
1. **Data Exploration:**
- Statistics about the features
- Scatter plots for the features
- Correlation matrix
- Violin plot
2. **Feature Engineering:**
- LDA
- PCA
- Modification for PCA
- Feature Generation
3. **KNN:**
- Hyperparameter optimization
- Applied with different preprocessing
4. **Gaussian Naive Bayes:**
- Applied with different preprocessing
5. **Multilayer Perceptron:**
- Applied with different preprocessing
6. **Boosting:**
- Based on decision trees
- Based on Gaussian Naive Bayes
7. **Evaluation:**
- Confusion matrix
- Receiver operating characteristic (ROC)
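
The outline above can be illustrated with a minimal scikit-learn sketch. It is not the notebook's actual code: the dataset (`load_breast_cancer`), the parameter grid, the number of PCA components, and the helper name `evaluate` are all assumptions chosen for illustration. It tunes KNN under three preprocessing variants, boosts decision stumps and Gaussian Naive Bayes with AdaBoost, and reports a confusion matrix and ROC AUC for each model.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.ensemble import AdaBoostClassifier
from sklearn.metrics import confusion_matrix, roc_auc_score
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

# Stand-in data (assumption): the project's own dataset is not shown here.
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

results = {}


def evaluate(model, name):
    """Fit on the training split, then store the confusion matrix and ROC AUC."""
    model.fit(X_train, y_train)
    results[name] = {
        "confusion": confusion_matrix(y_test, model.predict(X_test)),
        "auc": roc_auc_score(y_test, model.predict_proba(X_test)[:, 1]),
    }
    return model


# KNN with hyperparameter optimization under different preprocessing.
preprocessors = {
    "scaled": [("scale", StandardScaler())],
    "pca": [("scale", StandardScaler()), ("pca", PCA(n_components=5))],
    "lda": [("scale", StandardScaler()), ("lda", LinearDiscriminantAnalysis())],
}
for prep_name, steps in preprocessors.items():
    pipe = Pipeline(steps + [("knn", KNeighborsClassifier())])
    # Illustrative grid (assumption): only the number of neighbors is tuned.
    search = GridSearchCV(pipe, {"knn__n_neighbors": [3, 5, 7, 9, 11]}, cv=5)
    evaluate(search, f"knn_{prep_name}")

# Boosting based on decision trees and on Gaussian Naive Bayes.
# The base learner is passed positionally so the call works across
# scikit-learn versions that name this parameter differently.
evaluate(
    AdaBoostClassifier(DecisionTreeClassifier(max_depth=1), n_estimators=50, random_state=0),
    "boost_tree",
)
evaluate(
    AdaBoostClassifier(GaussianNB(), n_estimators=20, random_state=0),
    "boost_gnb",
)

for name, res in results.items():
    print(f"{name}: AUC={res['auc']:.3f}")
    print(res["confusion"])
```

`GaussianNB()` or `MLPClassifier()` could replace the final `"knn"` step of the pipeline (without the grid) to compare those models under the same preprocessing variants.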