{"id":15763360,"url":"https://github.com/peter-evans/soft-thresholding","last_synced_at":"2025-03-31T10:17:13.477Z","repository":{"id":77843096,"uuid":"94962406","full_name":"peter-evans/soft-thresholding","owner":"peter-evans","description":"Candidate selection using an iterative soft-thresholding algorithm","archived":false,"fork":false,"pushed_at":"2019-03-18T03:33:29.000Z","size":20,"stargazers_count":1,"open_issues_count":0,"forks_count":1,"subscribers_count":3,"default_branch":"master","last_synced_at":"2024-10-05T11:41:39.256Z","etag":null,"topics":["algorithm","python","selection-algorithms","soft-thresholding","statistics","thresholding"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/peter-evans.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2017-06-21T04:21:37.000Z","updated_at":"2022-09-20T12:57:04.000Z","dependencies_parsed_at":"2023-04-05T06:22:15.418Z","dependency_job_id":null,"html_url":"https://github.com/peter-evans/soft-thresholding","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/peter-evans%2Fsoft-thresholding","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/peter-evans%2Fsoft-thresholding/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/peter-evans%2Fsoft-thresholding/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/peter-evans%2Fsoft-
thresholding/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/peter-evans","download_url":"https://codeload.github.com/peter-evans/soft-thresholding/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":246450475,"owners_count":20779421,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["algorithm","python","selection-algorithms","soft-thresholding","statistics","thresholding"],"created_at":"2024-10-04T11:41:44.103Z","updated_at":"2025-03-31T10:17:13.431Z","avatar_url":"https://github.com/peter-evans.png","language":"Python","readme":"# Candidate Selection using Iterative Soft-Thresholding\n[\u003cimg alt=\"The blog of Peter Evans: Candidate Selection Using Iterative Soft-Thresholding\" title=\"View blog post\" src=\"https://peterevans.dev/img/blog-published-badge.svg\"\u003e](https://peterevans.dev/posts/candidate-selection-using-iterative-soft-thresholding/)\n\nThis describes one way to use soft-thresholding to select the statistically best candidates from a sorted list. This algorithm was introduced to me as an alternative to setting a hard threshold, i.e. selecting a fixed number of the best candidates. Using an iterative soft-thresholding algorithm a variable number of candidates can be selected depending on the distribution of the values.\n\nIn the following example the best candidates are selected from a sorted list. Setting a hard threshold of three will of course always select the top three candidates. 
However, it is clear from the distribution of the values that only the top two should be considered candidates. This soft-thresholding algorithm allows us to select just those candidates.\n\n![HardVsSoftThresholding](/images/hard-vs-soft-thresholding.png?raw=true)\n\n## How the algorithm works\n\nIn each iteration, the algorithm compares the mean and the median of the values remaining in the list. Any values higher than the minimum of the mean and median are discarded, so lower values are treated as better candidates. The process is repeated until the exit conditions are satisfied or until there is only one value remaining.\n\n![CompareMeanMedian](/images/compare-mean-median.png?raw=true)\n\n## Sample code\n\nThe sample Python code [here](soft_thresholding.py) is a simple example that demonstrates how iterative soft-thresholding can be implemented. The sorted list values are randomly generated on each execution of the script. Running it several times shows how the number of selected candidates varies with the distribution.\n\nOne candidate is selected:\n```bash\n~$ python soft_thresholding.py\nSorted list of candidates: [2, 10, 11, 20, 22, 23, 27, 29, 35, 39, 43, 44, 49, 57, 58, 61, 65, 66, 68, 83, 83, 91, 94, 94, 99]\nRemaining candidates: 25\nRemaining candidates: 13\nRemaining candidates: 7\nRemaining candidates: 3\n========================\nSelected candidates: [2]\n```\nTwo candidates are selected:\n```bash\n~$ python soft_thresholding.py\nSorted list of candidates: [1, 2, 11, 12, 12, 27, 32, 34, 35, 37, 38, 44, 46, 48, 50, 59, 60, 60, 62, 71, 71, 75, 77, 80, 91]\nRemaining candidates: 25\nRemaining candidates: 12\nRemaining candidates: 5\nRemaining candidates: 2\n========================\nSelected candidates: [1, 2]\n```\nThree candidates are selected:\n```bash\n~$ python soft_thresholding.py\nSorted list of candidates: [2, 3, 4, 5, 5, 6, 12, 12, 16, 17, 20, 21, 26, 27, 32, 34, 41, 53, 55, 58, 59, 61, 72, 86, 96]\nRemaining candidates: 25\nRemaining candidates: 13\nRemaining candidates: 
6\nRemaining candidates: 3\n========================\nSelected candidates: [2, 3, 4]\n```\n\n## Fine tuning\n\nThe maximum number of candidates can be modified in the sample code. The output of the algorithm will be any number of candidates up to this value.\n```python\nmax_candidates = 3\n```\n\nThe algorithm will continue to iterate until the exit conditions are satisfied. These can be fine-tuned to be less or more sensitive. In general, if the candidates are very close in value then we want to stop iterating because all of them will be good potential candidates. If the distribution is sparse then we want to keep iterating.\n\nThese are the exit conditions for asymmetrical and symmetrical distributions in the sample code.\n```python\nabs(mean - median) \u003c 0.1 * max(mean, median)\n```\n```python\nstd \u003c 0.5 * mean\n```\nThe fixed values of `0.1` and `0.5` allow the algorithm to be tuned. Decreasing these values makes the exit conditions harder to satisfy, so the algorithm will keep iterating. Increasing them will cause the algorithm to exit sooner.\n\n## License\n\nMIT License - see the [LICENSE](LICENSE) file for details\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpeter-evans%2Fsoft-thresholding","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fpeter-evans%2Fsoft-thresholding","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpeter-evans%2Fsoft-thresholding/lists"}