{"id":13906280,"url":"https://github.com/jmcarpenter2/parfit","last_synced_at":"2025-04-05T15:09:59.676Z","repository":{"id":57450792,"uuid":"111728435","full_name":"jmcarpenter2/parfit","owner":"jmcarpenter2","description":"A package for parallelizing the fit and flexibly scoring of sklearn machine learning models, with visualization routines.","archived":false,"fork":false,"pushed_at":"2024-02-13T04:16:38.000Z","size":1608,"stargazers_count":199,"open_issues_count":8,"forks_count":29,"subscribers_count":5,"default_branch":"master","last_synced_at":"2025-03-29T14:12:29.323Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/jmcarpenter2.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE.txt","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2017-11-22T20:17:51.000Z","updated_at":"2025-02-17T08:27:44.000Z","dependencies_parsed_at":"2024-06-19T05:32:26.224Z","dependency_job_id":null,"html_url":"https://github.com/jmcarpenter2/parfit","commit_stats":null,"previous_names":[],"tags_count":18,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jmcarpenter2%2Fparfit","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jmcarpenter2%2Fparfit/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jmcarpenter2%2Fparfit/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jmcarpenter2%2Fparfit/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/jmc
arpenter2","download_url":"https://codeload.github.com/jmcarpenter2/parfit/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247353749,"owners_count":20925329,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-08-06T23:01:32.600Z","updated_at":"2025-04-05T15:09:59.644Z","avatar_url":"https://github.com/jmcarpenter2.png","language":"Python","funding_links":[],"categories":["Hyperparameter optimization and AutoML"],"sub_categories":[],"readme":"# parfit\nA package for parallelizing the fitting and flexible scoring of sklearn machine learning models, with visualization routines.\n\n# This Python package is NO LONGER MAINTAINED.\n\n## Alternatives\nSeveral excellent alternatives serve the same purpose as `parfit` and do it better.\n\nBelow are a few libraries that effectively solve the problem `parfit` originally aimed to solve.\n\n### Hyper-parameter optimization\n* [Tune](https://ray.readthedocs.io/en/latest/tune.html)\n* [Scikit-Optimize](https://scikit-optimize.github.io/stable/)\n* [Scikit-learn](https://scikit-learn.org/stable/modules/grid_search.html)\n\n### Visualization of hyper-parameter optimizations\n* [Tune + Tensorboard](https://ray.readthedocs.io/en/latest/tune-usage.html#tensorboard)\n* [Scikit-Optimize plotting module](https://scikit-optimize.github.io/stable/modules/classes.html#module-skopt.plots)\n* [Examples using Scikit-learn + 
seaborn](https://towardsdatascience.com/using-3d-visualizations-to-tune-hyperparameters-of-ml-models-with-python-ba2885eab2e9)\n\n# Deprecated\n\nCURRENT VERSION == 0.220\n\nInstallation:\n```\n$ pip install parfit # first time installation\n$ pip install -U parfit # upgrade to latest version\n```\n\nand then import into your code using:\n```\nfrom parfit import bestFit # Necessary if you wish to use bestFit\n\n# Necessary if you wish to run each step sequentially\nfrom parfit.fit import *\nfrom parfit.score import *\nfrom parfit.plot import *\nfrom parfit.crossval import *\n```\n\nOnce imported, you can use bestFit() or the other functions freely.\n\n## Easy to use\n```\nfrom sklearn.ensemble import RandomForestClassifier\nfrom sklearn.metrics import roc_auc_score\nfrom sklearn.model_selection import ParameterGrid\n\n# X_train, y_train, X_val, y_val are assumed to be defined already\ngrid = {\n    'min_samples_leaf': [1, 5, 10, 15, 20, 25],\n    'max_features': ['sqrt', 'log2', 0.5, 0.6, 0.7],\n    'n_estimators': [60],\n    'n_jobs': [-1],\n    'random_state': [42]\n}\nparamGrid = ParameterGrid(grid)\n\nbest_model, best_score, all_models, all_scores = bestFit(RandomForestClassifier(), paramGrid,\n                                                    X_train, y_train, X_val, y_val, # nfolds=5 [optional, instead of validation set]\n                                                    metric=roc_auc_score, greater_is_better=True,\n                                                    scoreLabel='AUC')\n\nprint(best_model, best_score)\n```\n```\n{'max_features': 'sqrt', 'min_samples_leaf': 1, 'n_estimators': 60, 'n_jobs': -1, 'random_state': 42}\n0.9627794057231478\n```\n\n## Interpretable Visualizations\n![Alt text](/assets/scoring_grid_2D.png?raw=true)\n\n## Notes\n1. You can either use **bestFit()** to automate the whole process and optionally plot the scores over the parameter grid, OR you can run each step in order:\n\n\u003e `fitModels()` -\u003e `scoreModels()` -\u003e `plotScores()` -\u003e `getBestModel()` -\u003e `getBestScore()`\n\nor\n\n\u003e `crossvalModels()` -\u003e `plotScores()` -\u003e `getBestModel()` -\u003e `getBestScore()`\n\n2. 
Be sure to specify ALL parameters in the ParameterGrid, even the ones you are not searching over.\n\n3. For example usage, see parfit_ex.ipynb. Each function is documented in its .py file. In Jupyter Notebooks, you can view the docs by pressing Shift+Tab three times. Also, check out the complete documentation [here](docs/documentation.md) along with the [changelog](docs/changelog.md).\n\n4. This package is designed for use with sklearn machine learning models, but in theory it will work with any model that has a `.fit(X, y)` method. Likewise, the sklearn scoring metrics are typically used, but any function that takes two vectors and returns a score will work.\n\n5. The plotScores() function only works for a ParameterGrid object of up to 3 dimensions. That is, you can only view the scores of a grid that varies over 1-3 parameters. Parameters that do not vary can still be set, and you can still train and score models over a higher-dimensional grid.\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjmcarpenter2%2Fparfit","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fjmcarpenter2%2Fparfit","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjmcarpenter2%2Fparfit/lists"}