{"id":19651152,"url":"https://github.com/lewis-morris/skperopt","last_synced_at":"2025-04-28T16:31:20.057Z","repository":{"id":57468141,"uuid":"234890836","full_name":"lewis-morris/Skperopt","owner":"lewis-morris","description":"A hyperopt wrapper - simplifying hyperparameter tuning with Scikit-learn style estimators.","archived":false,"fork":false,"pushed_at":"2020-04-04T04:07:28.000Z","size":2742,"stargazers_count":5,"open_issues_count":0,"forks_count":1,"subscribers_count":1,"default_branch":"master","last_synced_at":"2025-04-17T17:57:42.327Z","etag":null,"topics":["accuracy","auc","f1-score","hyperopt","hyperopt-wrapper","pandas-dataframe","randomsearch","rmse","sklearn-estimator"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/lewis-morris.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE.txt","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2020-01-19T11:44:45.000Z","updated_at":"2020-05-11T10:54:07.000Z","dependencies_parsed_at":"2022-09-19T09:30:56.479Z","dependency_job_id":null,"html_url":"https://github.com/lewis-morris/Skperopt","commit_stats":null,"previous_names":[],"tags_count":18,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lewis-morris%2FSkperopt","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lewis-morris%2FSkperopt/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lewis-morris%2FSkperopt/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lewis-morris%2FSkperopt/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owner
s/lewis-morris","download_url":"https://codeload.github.com/lewis-morris/Skperopt/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":251345884,"owners_count":21574801,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["accuracy","auc","f1-score","hyperopt","hyperopt-wrapper","pandas-dataframe","randomsearch","rmse","sklearn-estimator"],"created_at":"2024-11-11T15:05:26.445Z","updated_at":"2025-04-28T16:31:15.070Z","avatar_url":"https://github.com/lewis-morris.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"\n\u003cp align=\"center\"\u003e \n\u003cimg src=\"https://github.com/lewis-morris/Skperopt/blob/master/images/logo.png?raw=true\"\u003e\n\u003c/p\u003e\n\n# Skperopt\n \nA hyperopt wrapper - simplifying hyperparameter tuning with Scikit-learn style estimators.\n\nWorks with the classification evaluation metrics \"f1\", \"auc\" and \"accuracy\", and the regression metrics \"rmse\" and \"mse\".\n\n## Installation:\n\n```\npip install skperopt\n```\n\n## Usage:\n\nJust pass in an estimator and a parameter grid, and hyperopt will do the rest. No need to define objectives or write hyperopt-specific parameter grids. 
\n\n### Recipe (vanilla flavour):\n\n- [x]  Import skperopt\n- [x]  Initialize skperopt \n- [x]  Run skperopt.HyperSearch.search\n- [x]  Collect the results\n\nCode example below.\n\n```python\nimport skperopt as sk\n\nimport numpy as np\nimport pandas as pd\n\nfrom sklearn.datasets import make_classification\nfrom sklearn.neighbors import KNeighborsClassifier\n\n# generate classification data\ndata = make_classification(n_samples=1000, n_features=10, n_classes=2)\nX = pd.DataFrame(data[0])\ny = pd.DataFrame(data[1])\n\n# init the classifier and the parameter grid to search\nkn = KNeighborsClassifier()\nparam = {\"n_neighbors\": [int(x) for x in np.linspace(1, 60, 30)],\n         \"leaf_size\": [int(x) for x in np.linspace(1, 60, 30)],\n         \"p\": [1, 2, 3, 4, 5, 10, 20],\n         \"algorithm\": ['auto', 'ball_tree', 'kd_tree', 'brute'],\n         \"weights\": [\"uniform\", \"distance\"]}\n\n\n# search parameters\nsearch = sk.HyperSearch(kn, X, y, params=param)\nsearch.search()\n\n# gather and apply the best parameters\nkn.set_params(**search.best_params)\n\n# view run results\nprint(search.stats)\n\n\n```\n\n## HyperSearch parameters\n\n* **est** (*[sklearn estimator]* required) \n\u003e any sklearn style estimator\n\n* **X** (*[pandas DataFrame]* required) \n\u003e your training data\n\n* **y** (*[pandas DataFrame]* required) \n\u003e your training targets\n\n* **params** (*[dictionary]* required) \n\u003e a parameter search grid \n\n* **iters** (default 500 *[int]*) \n\u003e number of iterations to try before early stopping\n\n* **time_to_search** (default None *[int]*) \n\u003e time in seconds to run for before early stopping (None = no time limit)\n\n* **cv** (default 5 *[int]*) \n\u003e number of folds to use in cross-validation tests\n\n* **cv_times** (default 1 *[int]*) \n\u003e number of times to perform cross validation on a new random sample of the data - higher values decrease variance but increase run time\n\n* **randomState** (default 10 *[int]*) \n\u003e random state for the data shuffling\n\n* **scorer** 
(default \"f1\" *[str]*) \n\u003e type of evaluation metric to use - accepts classification \"f1\",\"auc\",\"accuracy\" or regression \"rmse\" and \"mse\"\n\n* **verbose** (default 1 *[int]*) \n\u003e amount of verbosity \n\n         0 = none \n         \n         1 = some \n         \n         2 = debug\n\n* **random** (default - *False*) \n\u003e whether the data should be randomized during the cross validation\n\n* **foldtype** (default \"Kfold\" *[str]*) \n\u003e type of folds to use - accepts \"KFold\", \"Stratified\"\n\n## HyperSearch methods \n\n* **HyperSearch.search()** (None) \n\u003e Used to search the parameter grid using hyperopt. No parameters need to be passed to the function. All parameters are set during initialization.\n\n\n# Testing\n\nBoth RandomSearch and Skperopt were run for 100 tests of 150 search iterations each.\n\nSkperopt (hyperopt) performed better than RandomSearch, producing a higher average f1 score with a smaller standard deviation.\n\n\n![alt chart](./images/skperopt.png \"Chart\")\n\n### Skperopt Search Results \n\nf1 score over 100 test runs:\n\n\u003e Mean **0.9340930**\n\n\u003e Standard deviation **0.0062275**\n\n\n### Random Search Results\n\nf1 score over 100 test runs:\n\n\u003e Mean **0.927461652**\n\n\u003e Standard deviation **0.0063314**\n\n\n----------------------------------------------------------------------------\n\n\n## Updates\n\n### V0.0.73\n\n* Added cv_times attr - runs the cross validation n times (i.e. cv (5x5)) each iteration, each time on a new randomly sampled data set;\n this should reduce overfitting \n\n\n### V0.0.7\n\n* Added **FIXED** RMSE eval metric \n\n* Added MSE eval metric \n         \n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Flewis-morris%2Fskperopt","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Flewis-morris%2Fskperopt","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Flewis-morris%2Fskperopt/lists"}