https://github.com/michaelzheng67/ml_classification_optimizer

Algorithm that determines the best machine learning classification model to use for a given dataset. Written in Python.

Host: GitHub
URL: https://github.com/michaelzheng67/ml_classification_optimizer
Owner: michaelzheng67
Created: 2021-05-07T01:56:11.000Z (about 5 years ago)
Default Branch: main
Last Pushed: 2021-05-08T20:21:49.000Z (about 5 years ago)
Last Synced: 2025-04-07T20:19:54.400Z (about 1 year ago)
Topics: classification, machine-learning, python, scikit-learn
Language: Python
Homepage:
Size: 129 KB
Stars: 0
Watchers: 1
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md

README

Machine Learning Classification Optimizer

Python-based application

Imports: Pandas, Scikit-learn

TL;DR: Prints which machine learning classification model works best for a given dataset.

Inspired by, and based on a tutorial from, the Udemy Machine Learning course by Kirill Eremenko. The user supplies a .csv file of data that can be grouped and classified; the script runs that data through multiple classification models and uses metric assessment to determine which model best fits the dataset. First, the .py file is configured to point at the data in the given .csv file. The data is then split into a training set and a test set, feature-scaled, and fed into seven different classification models from scikit-learn. Finally, the models are judged on multiple metrics, also taken from scikit-learn.
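The workflow described above could be sketched roughly as follows. This is a minimal illustration, not the repository's actual main.py: the function name rank_classifiers, the particular seven classifiers, the accuracy and macro-F1 metrics, the 25% test split, and the ranking by accuracy are all assumptions made for the example.

```python
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, f1_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier


def rank_classifiers(df, test_size=0.25, random_state=0):
    """Fit seven classifiers on df and return them ranked by test accuracy.

    Hypothetical helper: assumes the last column of df is the class label
    and every other column is a numeric feature.
    """
    X = df.iloc[:, :-1].values
    y = df.iloc[:, -1].values

    # Hold out part of the data for evaluation.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=test_size, random_state=random_state
    )

    # Fit the scaler on the training set only, then reuse it on the test set.
    scaler = StandardScaler()
    X_train = scaler.fit_transform(X_train)
    X_test = scaler.transform(X_test)

    # Assumed model line-up; the README only says "seven" scikit-learn models.
    models = {
        "Logistic Regression": LogisticRegression(random_state=random_state),
        "K-Nearest Neighbors": KNeighborsClassifier(n_neighbors=5),
        "Linear SVM": SVC(kernel="linear", random_state=random_state),
        "Kernel SVM (RBF)": SVC(kernel="rbf", random_state=random_state),
        "Naive Bayes": GaussianNB(),
        "Decision Tree": DecisionTreeClassifier(random_state=random_state),
        "Random Forest": RandomForestClassifier(random_state=random_state),
    }

    results = []
    for name, model in models.items():
        model.fit(X_train, y_train)
        y_pred = model.predict(X_test)
        results.append(
            {
                "model": name,
                "accuracy": accuracy_score(y_test, y_pred),
                "f1_macro": f1_score(y_test, y_pred, average="macro"),
            }
        )

    # Best model first; ties keep an arbitrary order.
    ranking = pd.DataFrame(results)
    return ranking.sort_values("accuracy", ascending=False).reset_index(drop=True)
```

Calling print(rank_classifiers(pd.read_csv("your_data.csv"))) would then print one row per model, best first; the file name here is a placeholder. Ranking on a single metric is a simplification, and a real comparison may want to weigh several metrics or use cross-validation.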

Credit to the Machine Learning course for providing the test data and the foundational code for running the models and for splitting and scaling the test data.

notes:
- The variables file that main.py imports is another .py file that stores strings used by the models
- For the algorithm to work, the independent variables (the features) must come before the dependent variable in column order. In other words, the dependent variable, the class that the classification is trying to predict, must be in the last column of the .csv file
- The Social Network Ads test data was also provided by the Udemy course. It is a csv dataset with age and estimated salary columns, plus a last column indicating whether or not that user clicked on an ad
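
The column-order requirement in the notes above can be illustrated with a short sketch. The helper names move_target_last and split_features_target, and the column names Age, EstimatedSalary and Purchased, are hypothetical and chosen only for the example; the repository may handle this differently.

```python
import pandas as pd


def move_target_last(df, target):
    """Return a copy of df with the target column moved to the last position."""
    if target not in df.columns:
        raise KeyError(f"column {target!r} not found")
    ordered = [col for col in df.columns if col != target] + [target]
    return df[ordered]


def split_features_target(df):
    """Treat every column but the last as features and the last as the label."""
    if df.shape[1] < 2:
        raise ValueError("need at least one feature column and one label column")
    return df.iloc[:, :-1], df.iloc[:, -1]


# A tiny made-up frame whose label column is in the wrong place.
raw = pd.DataFrame(
    {
        "Purchased": [0, 1, 0],
        "Age": [19, 35, 26],
        "EstimatedSalary": [19000, 20000, 43000],
    }
)

fixed = move_target_last(raw, "Purchased")
X, y = split_features_target(fixed)
print(list(X.columns))  # feature columns
print(y.name)           # label column
```

Reordering the columns once, before saving the .csv, is one way to satisfy the last-column convention without changing any code that slices by position.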
