https://github.com/seunggihong/ml-sklearn
Simple machine learning model using scikit-learn
- Host: GitHub
- URL: https://github.com/seunggihong/ml-sklearn
- Owner: seunggihong
- License: apache-2.0
- Created: 2023-11-22T01:26:32.000Z (almost 2 years ago)
- Default Branch: main
- Last Pushed: 2023-12-14T05:46:09.000Z (almost 2 years ago)
- Last Synced: 2024-01-29T00:05:09.623Z (over 1 year ago)
- Topics: adaboost, bagging, classfication, decision-tree, discriminant-analysis, gaussian-naive-bayes, gridsearchcv, knn, machine-learning, multinomial-naive-bayes, random-forest, regression, scikit-learn, svm, voting
- Language: Python
- Homepage:
- Size: 70.3 KB
- Stars: 2
- Watchers: 3
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE


## ML-Sklearn
This repository implements regression and classification machine learning models with scikit-learn, then evaluates each model, saves the evaluation metrics, and compares them. The regression examples use Kaggle's `Red Wine Quality` dataset, and the classification examples use Kaggle's `Heart Failure Prediction` dataset. The repository also includes code showing how to find optimal hyperparameters with `GridSearchCV`.
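The overall flow (a hyperparameter grid stored as JSON, searched exhaustively with `GridSearchCV`) can be sketched as follows. This is an illustrative stand-in, not the repository's actual loading code: the synthetic data replaces the wine-quality CSV, and the grid is the decision-tree grid shown later in this README.

```python
import json

from sklearn.datasets import make_regression
from sklearn.model_selection import GridSearchCV
from sklearn.tree import DecisionTreeRegressor

# Hyperparameter grids are kept as JSON, as in the sections below.
params = json.loads(
    '{"dtree": {"max_depth": [1, 2, 3, 4, 5], "min_samples_split": [2, 3]}}'
)

# Synthetic stand-in for the Red Wine Quality regression data.
X, y = make_regression(n_samples=200, n_features=8, noise=0.1, random_state=0)

# Exhaustive search over the grid with 5-fold cross-validation.
search = GridSearchCV(DecisionTreeRegressor(random_state=0), params["dtree"], cv=5)
search.fit(X, y)

print(search.best_params_)  # the best combination found in the grid
```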
## Data


- Regression
  - [Red Wine Quality](https://www.kaggle.com/datasets/uciml/red-wine-quality-cortez-et-al-2009)
- Classification
  - [Heart Failure Prediction](https://www.kaggle.com/datasets/fedesoriano/heart-failure-prediction)
## Algorithm
- [D-Tree](#dtree)
- [RF](#rf)
- [NB](#nb)
  - Gaussian Naive Bayes
  - Multinomial Naive Bayes
- [K-NN](#knn)
- [Ada](#ada)
- [DA](#da)
  - Linear Discriminant Analysis
  - Quadratic Discriminant Analysis
- [SVM](#svm)
- [Voting](#voting)
- [Bagging](#bagging)
### D-Tree **_(Decision Tree)_**
- **_Code_** [DTree.py](https://github.com/seunggihong/ML-Sklearn/blob/main/Algorithm/DTree.py)
- **_Hyper parameters_**
```json
"dtree": { "max_depth": [1, 2, 3, 4, 5], "min_samples_split": [2, 3] }
```
- **_Usage_**
```bash
$ python3 main.py --prob={reg or class} --model=dtree
```
### RF **_(Random Forest)_**
- **_Code_** [RF.py](https://github.com/seunggihong/ML-Sklearn/blob/main/Algorithm/RF.py)
- **_Hyper parameters_**
```json
"rf": {
"n_estimators": [10, 100],
"max_depth": [6, 8, 10, 12],
"min_samples_leaf": [8, 12, 18],
"min_samples_split": [8, 16, 20]
}
```
- **_Usage_**
```bash
$ python3 main.py --prob={reg or class} --model=rf
```
### NB **_(Naive Bayes)_**
- **_Code_** [NB.py](https://github.com/seunggihong/ML-Sklearn/blob/main/Algorithm/NB.py)
**_Gaussian Naive Bayes (GNB)_**
- **_Hyper parameters_**
```json
"gnb": {
"var_smoothing": [1e-2, 1e-3, 1e-4, 1e-5, 1e-6]
}
```
- **_Usage_**
```bash
$ python3 main.py --prob=class --model=gnb
```

**_Multinomial Naive Bayes (MNB)_**
- **_Hyper parameters_**
```json
"mnb": {
"var_smoothing": [1e-2, 1e-3, 1e-4, 1e-5, 1e-6]
}
```
_Note: scikit-learn's `MultinomialNB` exposes an `alpha` smoothing parameter rather than `var_smoothing` (which belongs to `GaussianNB`); the grid above reflects the repository's config file as written._
- **_Usage_**
```bash
$ python3 main.py --prob=class --model=mnb
```
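The Gaussian variant maps directly onto scikit-learn's `GaussianNB`; a minimal sketch of searching its `var_smoothing` grid (synthetic data, not the repository's pipeline):

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.naive_bayes import GaussianNB

# Synthetic stand-in for the classification dataset.
X, y = make_classification(n_samples=300, n_features=10, random_state=0)

# var_smoothing adds a fraction of the largest feature variance to all
# variances, stabilising the Gaussian likelihood estimates.
grid = {"var_smoothing": [1e-2, 1e-3, 1e-4, 1e-5, 1e-6]}
search = GridSearchCV(GaussianNB(), grid, cv=5)
search.fit(X, y)

print(search.best_params_["var_smoothing"])
```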
### K-NN **_(K Nearest Neighbors)_**
- **_Code_** [KNN.py](https://github.com/seunggihong/ML-Sklearn/blob/main/Algorithm/KNN.py)
- **_Hyper parameters_**
```json
"knn": {
"n_neighbors": [1, 2, 3, 4, 5],
"weights": ["uniform", "distance"]
}
```
- **_Usage_**
```bash
$ python3 main.py --prob={reg or class} --model=knn
```
### Ada **_(Adaptive Boosting)_**
- **_Code_** [Ada.py](https://github.com/seunggihong/ML-Sklearn/blob/main/Algorithm/Ada.py)
- **_Hyper parameters_**
```json
"ada": {
"n_estimators": [50, 100, 150],
"learning_rate": [0.01, 0.1]
}
```
- **_Usage_**
```bash
$ python3 main.py --prob={reg or class} --model=ada
```
### DA **_(Discriminant Analysis)_**
- **_Code_** [DA.py](https://github.com/seunggihong/ML-Sklearn/blob/main/Algorithm/DA.py)
**_Linear Discriminant Analysis (LDA)_**
- **_Hyper parameters_**
```json
"lda": {
"n_components": [6, 8, 10, 12],
"learning_decay": [0.75, 0.8, 0.85]
}
```
_Note: `LinearDiscriminantAnalysis` in scikit-learn has no `learning_decay` parameter (that belongs to `LatentDirichletAllocation`); the grid above reflects the repository's config file as written._
- **_Usage_**
```bash
$ python3 main.py --prob=class --model=lda
```

**_Quadratic Discriminant Analysis (QDA)_**
- **_Hyper parameters_**
```json
"qda": {
"reg_param": [0.1, 0.2, 0.3, 0.4, 0.5]
}
```
- **_Usage_**
```bash
$ python3 main.py --prob=class --model=qda
```
### SVM **_(Support Vector Machine)_**
- **_Code_** [SVM.py](https://github.com/seunggihong/ML-Sklearn/blob/main/Algorithm/SVM.py)
- **_Hyper parameters_**
```json
"svm": {
"C": [0.1, 0.8, 0.9, 1, 1.1, 1.2, 1.3, 1.4],
"kernel": ["linear", "rbf"],
"gamma": [0.1, 0.8, 0.9, 1, 1.1, 1.2, 1.3, 1.4]
}
```
- **_Usage_**
```bash
$ python3 main.py --prob={reg or class} --model=svm
```
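Since SVMs are sensitive to feature scale, a grid search like the one above is typically run on standardized inputs. A hedged sketch with a reduced grid for speed (the pipeline and scaler are illustrative choices, not necessarily what the repository does):

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = make_classification(n_samples=200, n_features=8, random_state=0)

# Pipeline step names ("svc__...") route each grid entry to the SVC step.
grid = {
    "svc__C": [0.1, 1, 1.3],
    "svc__kernel": ["linear", "rbf"],
    "svc__gamma": [0.1, 1],  # ignored by the linear kernel
}
search = GridSearchCV(make_pipeline(StandardScaler(), SVC()), grid, cv=5)
search.fit(X, y)

print(search.best_params_)
```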
### Voting
- **_Code_** [Voting.py](https://github.com/seunggihong/ML-Sklearn/blob/main/Algorithm/Voting.py)
- **_Hyper parameters_**: not yet
- **_Usage_**
```bash
$ python3 main.py --prob={reg or class} --model=voting
```
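Since no hyperparameter grid is listed yet, here is a minimal sketch of a soft-voting ensemble in scikit-learn. The choice of base estimators here is an assumption for illustration, not the repository's actual configuration:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=300, n_features=10, random_state=0)

# "soft" voting averages the predicted class probabilities of the base models.
clf = VotingClassifier(
    estimators=[
        ("lr", LogisticRegression(max_iter=1000)),
        ("dt", DecisionTreeClassifier(random_state=0)),
        ("gnb", GaussianNB()),
    ],
    voting="soft",
)
scores = cross_val_score(clf, X, y, cv=5)
print(scores.mean())
```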
### Bagging
- **_Code_** [Bagging.py](https://github.com/seunggihong/ML-Sklearn/blob/main/Algorithm/Bagging.py)
- **_Hyper parameters_**: not yet
- **_Usage_**
```bash
$ python3 main.py --prob={reg or class} --model=bagging
```
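Likewise, with no grid listed yet, a minimal bagging sketch: each member is trained on a bootstrap resample of the data and predictions are aggregated by majority vote. `BaggingClassifier`'s default base estimator is a decision tree; whether the repository uses the default is an assumption here.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=300, n_features=10, random_state=0)

# 25 trees, each fit on a bootstrap sample; votes are aggregated at predict time.
clf = BaggingClassifier(n_estimators=25, random_state=0)
scores = cross_val_score(clf, X, y, cv=5)
print(scores.mean())
```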
## Reference
- https://scikit-learn.org/stable/user_guide.html