{"id":20647481,"url":"https://github.com/jmbhughes/crcf","last_synced_at":"2025-04-16T03:09:55.172Z","repository":{"id":56851112,"uuid":"160128553","full_name":"jmbhughes/crcf","owner":"jmbhughes","description":"Combination Robust Cut Forests: Merging Isolation Forests and Robust Random Cut Forests","archived":false,"fork":false,"pushed_at":"2023-08-09T03:59:52.000Z","size":2633,"stargazers_count":14,"open_issues_count":2,"forks_count":3,"subscribers_count":2,"default_branch":"main","last_synced_at":"2025-04-16T03:09:36.188Z","etag":null,"topics":["anomaly-detection","isolation-forest","machine-learning","robust-random-cut-forest","trees"],"latest_commit_sha":null,"homepage":"https://jmbhughes.github.io/crcf/","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/jmbhughes.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2018-12-03T03:40:43.000Z","updated_at":"2024-06-15T21:02:24.000Z","dependencies_parsed_at":"2024-11-16T16:33:10.582Z","dependency_job_id":"f33791f0-fa1a-4384-b9df-d50c26c93e58","html_url":"https://github.com/jmbhughes/crcf","commit_stats":null,"previous_names":[],"tags_count":1,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jmbhughes%2Fcrcf","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jmbhughes%2Fcrcf/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jmbhughes%2Fcrcf/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jmbhughes%2Fcrcf/manif
ests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/jmbhughes","download_url":"https://codeload.github.com/jmbhughes/crcf/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":249188426,"owners_count":21227015,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["anomaly-detection","isolation-forest","machine-learning","robust-random-cut-forest","trees"],"created_at":"2024-11-16T16:32:57.447Z","updated_at":"2025-04-16T03:09:55.159Z","avatar_url":"https://github.com/jmbhughes.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Combination Robust Cut Forests\n[![CodeFactor](https://www.codefactor.io/repository/github/jmbhughes/crcf/badge)](https://www.codefactor.io/repository/github/jmbhughes/crcf)\n[![PyPI version](https://badge.fury.io/py/crcf.svg)](https://badge.fury.io/py/crcf)\n[![codecov](https://codecov.io/gh/jmbhughes/crcf/branch/main/graph/badge.svg?token=YBZERHDU75)](https://codecov.io/gh/jmbhughes/crcf)\n\nIsolation Forests **[Liu+2008]** and Robust Random Cut Trees **[Guha+2016]** are very similar in many ways, \nas outlined in the [supporting overview](overview.pdf). Most notably, they are extremes\nof the same outlier scoring function: \n\n$$\\theta \\textrm{Depth} + (1 - \\theta) \\textrm{[Co]Disp}$$ \n\nThe combination robust cut forest allows you to combine both scores by using a theta other than 0 or 1. \n\n# Install\nYou can install through `pip install crcf`. 
Alternatively, you can download the repository and run \n`python3 setup.py install` or `pip3 install .` Please note that this package uses features from Python 3.7+\nand is not compatible with earlier Python versions. \n\n\n# Tasks\n- [X] complete basic implementation\n- [X] provide clear documentation and usage instructions\n- [ ] ensure interface allows for fitting and scoring on multiple points at the same time\n- [ ] implement a better saving method than pickling\n- [ ] use random tests with hypothesis\n- [ ] implement tree down in cython\n- [ ] accelerate forests with multi-threading\n- [ ] incorporate categorical variable support, including categorical rules\n- [ ] complete the write-up document with a benchmarking of performance\n\n# References\n- **[Liu+2008]**: [Liu, Fei Tony, Kai Ming Ting, and Zhi-Hua Zhou. \n\"Isolation forest.\" In 2008 Eighth IEEE International Conference on Data Mining, \npp. 413-422. IEEE, 2008.](https://cs.nju.edu.cn/zhouzh/zhouzh.files/publication/icdm08b.pdf?q=isolation-forest)\n- **[Guha+2016]**: [Guha, Sudipto, Nina Mishra, Gourav Roy, and Okke Schrijvers. \n\"Robust random cut forest based anomaly detection on streams.\" \nIn International conference on machine learning, pp. 2712-2721. 2016.](http://proceedings.mlr.press/v48/guha16.pdf)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjmbhughes%2Fcrcf","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fjmbhughes%2Fcrcf","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjmbhughes%2Fcrcf/lists"}