{"id":27130969,"url":"https://github.com/michaelzheng67/ml_classification_optimizer","last_synced_at":"2026-04-29T01:33:59.609Z","repository":{"id":199605286,"uuid":"365086210","full_name":"michaelzheng67/ML_Classification_optimizer","owner":"michaelzheng67","description":"Algorithm that determines best machine learning classification model to use for a given dataset. Written in Python. ","archived":false,"fork":false,"pushed_at":"2021-05-08T20:21:49.000Z","size":132,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-04-07T20:19:54.400Z","etag":null,"topics":["classification","machine-learning","python","scikit-learn"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/michaelzheng67.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null}},"created_at":"2021-05-07T01:56:11.000Z","updated_at":"2021-05-08T20:21:51.000Z","dependencies_parsed_at":"2023-10-11T06:32:29.505Z","dependency_job_id":null,"html_url":"https://github.com/michaelzheng67/ML_Classification_optimizer","commit_stats":null,"previous_names":["michaelzheng67/ml_classification_optimizer"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/michaelzheng67/ML_Classification_optimizer","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/michaelzheng67%2FML_Classification_optimizer","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/michaelzheng67%2FML_Classification_optimizer/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/michaelzheng67%2FML_C
lassification_optimizer/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/michaelzheng67%2FML_Classification_optimizer/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/michaelzheng67","download_url":"https://codeload.github.com/michaelzheng67/ML_Classification_optimizer/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/michaelzheng67%2FML_Classification_optimizer/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":32407164,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-28T19:38:08.556Z","status":"ssl_error","status_checked_at":"2026-04-28T19:37:55.688Z","response_time":56,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["classification","machine-learning","python","scikit-learn"],"created_at":"2025-04-07T20:19:57.087Z","updated_at":"2026-04-29T01:33:59.595Z","avatar_url":"https://github.com/michaelzheng67.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"Machine Learning Classification Optimizer \n\nPython-based application\n\nImports: Pandas, Scikit-learn\n\ntldr: Prints which machine learning classification model would work best for a given dataset. \n\nInspiration and tutorial based on Udemy Machine Learning course by Kirill Eremenko. 
This algorithm takes a .csv file of data that can be grouped and classified and runs it through multiple classification models; the best model for the dataset is then determined by metric assessment. First, the .py file is configured to point at the data in a given .csv file. The data is then split into a training set and a test set, undergoes feature scaling, and is fed into seven different classification models from scikit-learn. Finally, the models are judged on multiple metrics, also from scikit-learn. \n\nCredit to the Machine Learning course for providing the test data and the foundational code for running the models and for splitting / scaling the data. \n\nnotes: \n- The variables import that main.py refers to is another .py file that stores strings that the models use \n- For the algorithm to work, the independent variables (the features) must be placed before the dependent variable in column order. This means that the dependent variable, which the classification is trying to predict, must be in the last column of the .csv file\n- The Social Network Ads test data was also provided by the Udemy course. Essentially, it's a csv dataset that has age and estimated salary columns, along with a last column of whether or not that specific user clicked on an ad\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmichaelzheng67%2Fml_classification_optimizer","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fmichaelzheng67%2Fml_classification_optimizer","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmichaelzheng67%2Fml_classification_optimizer/lists"}