{"id":20960243,"url":"https://github.com/instance01/simplehoeffdingtree","last_synced_at":"2025-10-19T20:53:52.136Z","repository":{"id":68070264,"uuid":"279898309","full_name":"instance01/SimpleHoeffdingTree","owner":"instance01","description":"Super simple, research only","archived":false,"fork":false,"pushed_at":"2020-07-18T12:58:11.000Z","size":6,"stargazers_count":2,"open_issues_count":0,"forks_count":0,"subscribers_count":3,"default_branch":"master","last_synced_at":"2025-01-20T00:34:21.086Z","etag":null,"topics":["hoeffding","hoeffding-tree","hoeffding-trees"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/instance01.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2020-07-15T14:58:39.000Z","updated_at":"2023-03-21T00:13:42.000Z","dependencies_parsed_at":"2023-06-06T00:00:45.682Z","dependency_job_id":null,"html_url":"https://github.com/instance01/SimpleHoeffdingTree","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/instance01%2FSimpleHoeffdingTree","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/instance01%2FSimpleHoeffdingTree/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/instance01%2FSimpleHoeffdingTree/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/instance01%2FSimpleHoeffdingTree/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/instance01","download_url":"https://codeload.github.com/instance01/SimpleHoeffdingTree/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":243358211,"owners_count":20277991,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["hoeffding","hoeffding-tree","hoeffding-trees"],"created_at":"2024-11-19T01:58:09.644Z","updated_at":"2025-10-19T20:53:52.077Z","avatar_url":"https://github.com/instance01.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"## SimpleHoeffdingTree\n\nSimply calculates the splits based on the amazing Hoeffding bound.\nThis is just for research purposes.\n\nReference paper:\n\n```\n@inproceedings{domingos2000mining,\n  title={Mining high-speed data streams},\n  author={Domingos, Pedro and Hulten, Geoff},\n  booktitle={Proceedings of the sixth ACM SIGKDD international conference on Knowledge discovery and data mining},\n  pages={71--80},\n  year={2000}\n}\n```\n\n\nSample data:\n\n| Person | Time since license | Gender | Area | Risk class |\n| ------ | ------------------ | ------ | ---- | ---------- |\n| 1|1 − 2|m|urban|low |\n| 2|2 − 7|m|rural|high |\n| 3|\u003e 7  |f|rural|low |\n| 4|1 − 2|f|rural|high |\n| 5|\u003e 7  |m|rural|high |\n| 6|1 − 2|m|rural|high |\n| 7|2 − 7|f|urban|low |\n| 8|2 − 7|m|urban|low |\n\n\n\nSample output (with verbose set to True):\n\n```\n2\ndict_keys(['1-2', '2-7'])\nentropy for 1-2 :  -0.0\nentropy for 2-7 :  -0.0\nweighted_sum: 0.0\ninfogain: 1.0 - 0.0 =  1.0\ndict_keys(['m'])\nentropy for m :  1.0\nweighted_sum: 1.0\ninfogain: 1.0 - 1.0 =  0.0\ndict_keys(['urban', 'rural'])\nentropy for urban :  -0.0\nentropy for rural :  -0.0\nweighted_sum: 0.0\ninfogain: 1.0 - 0.0 =  1.0\nAll infogains: [(0.0, 'gender'), (1.0, 'area'), (1.0, 'time')]\n\n4\ndict_keys(['1-2', '2-7', '\u003e7'])\nentropy for 1-2 :  1.0\nentropy for 2-7 :  -0.0\nentropy for \u003e7 :  -0.0\nweighted_sum: 0.5\ninfogain: 1.0 - 0.5 =  0.5\ndict_keys(['m', 'f'])\nentropy for m :  1.0\nentropy for f :  1.0\nweighted_sum: 1.0\ninfogain: 1.0 - 1.0 =  0.0\ndict_keys(['urban', 'rural'])\nentropy for urban :  -0.0\nentropy for rural :  0.9182958340544896\nweighted_sum: 0.6887218755408672\ninfogain: 1.0 - 0.6887218755408672 =  0.31127812445913283\nAll infogains: [(0.0, 'gender'), (0.31127812445913283, 'area'), (0.5, 'time')]\n\n6\ndict_keys(['1-2', '2-7', '\u003e7'])\nentropy for 1-2 :  0.9182958340544896\nentropy for 2-7 :  -0.0\nentropy for \u003e7 :  1.0\nweighted_sum: 0.792481250360578\ninfogain: 0.9182958340544896 - 0.792481250360578 =  0.12581458369391152\ndict_keys(['m', 'f'])\nentropy for m :  0.8112781244591328\nentropy for f :  1.0\nweighted_sum: 0.8741854163060885\ninfogain: 0.9182958340544896 - 0.8741854163060885 =  0.044110417748401076\ndict_keys(['urban', 'rural'])\nentropy for urban :  -0.0\nentropy for rural :  0.7219280948873623\nweighted_sum: 0.6016067457394686\ninfogain: 0.9182958340544896 - 0.6016067457394686 =  0.31668908831502096\nAll infogains: [(0.044110417748401076, 'gender'), (0.12581458369391152, 'time'), (0.31668908831502096, 'area')]\n\n8\ndict_keys(['1-2', '2-7', '\u003e7'])\nentropy for 1-2 :  0.9182958340544896\nentropy for 2-7 :  0.9182958340544896\nentropy for \u003e7 :  1.0\nweighted_sum: 0.9387218755408672\ninfogain: 1.0 - 0.9387218755408672 =  0.06127812445913283\ndict_keys(['m', 'f'])\nentropy for m :  0.9709505944546686\nentropy for f :  0.9182958340544896\nweighted_sum: 0.9512050593046015\ninfogain: 1.0 - 0.9512050593046015 =  0.04879494069539847\ndict_keys(['urban', 'rural'])\nentropy for urban :  -0.0\nentropy for rural :  0.7219280948873623\nweighted_sum: 0.4512050593046014\ninfogain: 1.0 - 0.4512050593046014 =  0.5487949406953986\nAll infogains: [(0.04879494069539847, 'gender'), (0.06127812445913283, 'time'), (0.5487949406953986, 'area')]\nSplit on: area\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Finstance01%2Fsimplehoeffdingtree","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Finstance01%2Fsimplehoeffdingtree","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Finstance01%2Fsimplehoeffdingtree/lists"}