{"id":15764596,"url":"https://github.com/soodoku/optimal_classification_cutoffs","last_synced_at":"2025-04-21T04:31:52.938Z","repository":{"id":79658275,"uuid":"151788960","full_name":"soodoku/optimal_classification_cutoffs","owner":"soodoku","description":"Script for calculating the optimal cut-off for max. F1-score or accuracy","archived":false,"fork":false,"pushed_at":"2018-10-07T17:51:35.000Z","size":6,"stargazers_count":6,"open_issues_count":0,"forks_count":1,"subscribers_count":3,"default_branch":"master","last_synced_at":"2024-10-11T12:16:38.136Z","etag":null,"topics":["accuracy","calibration","f1-score"],"latest_commit_sha":null,"homepage":"","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/soodoku.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2018-10-05T23:42:59.000Z","updated_at":"2021-10-03T12:58:07.000Z","dependencies_parsed_at":null,"dependency_job_id":"69aa82c0-5273-4661-b854-a1023eb9b27e","html_url":"https://github.com/soodoku/optimal_classification_cutoffs","commit_stats":{"total_commits":2,"total_committers":1,"mean_commits":2.0,"dds":0.0,"last_synced_commit":"2c8def54b822a8a23f34bc96c129a62b2ae1f0f1"},"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/soodoku%2Foptimal_classification_cutoffs","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/soodoku%2Foptimal_classification_cutoffs/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/soodo
ku%2Foptimal_classification_cutoffs/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/soodoku%2Foptimal_classification_cutoffs/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/soodoku","download_url":"https://codeload.github.com/soodoku/optimal_classification_cutoffs/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":223849432,"owners_count":17213640,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["accuracy","calibration","f1-score"],"created_at":"2024-10-04T12:04:10.841Z","updated_at":"2024-11-09T16:03:36.548Z","avatar_url":"https://github.com/soodoku.png","language":"Jupyter Notebook","funding_links":[],"categories":[],"sub_categories":[],"readme":"## Optimal Cut-Offs\n\nProbabilities from classification models can have two problems: \n\n1. Miscalibration: A p of .9 often doesn't mean a 90% chance of 1 (assuming a dichotomous y). (You can calibrate it using isotonic regression.)\n\n2. Optimal cut-offs: For multi-class classifiers, we do not know what probability value will maximize accuracy, the F1 score, or any other metric that requires trading off FP against FN.\n\nHere we share a solution for #2. It involves running the outputs through a brute-force optimizer. We provide a simple wrapper to make it easier to use.\n\n### Function\n\nThe function `get_probability` takes the following arguments: \n\n1. `true_labs` (required): NumPy array or Pandas Series in which the true labels are stored. \n2. 
`pred_prob` (required): NumPy array or Pandas Series in which the predicted probabilities are stored.\n3. `objective` (optional): `accuracy` (default) or `f1`\n4. `verbose` (optional): `True` or `False` (default) to show/hide verbose messages.\n\nThe function returns the probability cut-off that gives the highest F1 score or the lowest FP+FN (i.e., maximum accuracy).\n\n### Usage\n\nTo use the [function](optimal_cut_offs.py), download it, put it in the local directory, and import it. \n\n```\nimport optimal_cut_offs\n\ndf = ...\n\np = optimal_cut_offs.get_probability(df.true_labs, df.pred_prob, 'accuracy')\n\n```\n\n### Illustration\n\nCheck out this [Jupyter notebook](comscore.ipynb) to see the script in action. \nFor context, the notebook underlies the outputs you see [here](https://github.com/themains/domain_knowledge/blob/master/scripts/porn.ipynb).\n\n### Authors\n\nSuriyan Laohaprapanon and Gaurav Sood\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsoodoku%2Foptimal_classification_cutoffs","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fsoodoku%2Foptimal_classification_cutoffs","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsoodoku%2Foptimal_classification_cutoffs/lists"}