{"id":34556277,"url":"https://github.com/epanemu/fct","last_synced_at":"2026-04-19T22:02:43.404Z","repository":{"id":152517214,"uuid":"576876986","full_name":"Epanemu/FCT","owner":"Epanemu","description":"An ML model with decision trees focusing on best leaf accuracy","archived":false,"fork":false,"pushed_at":"2023-11-22T07:08:46.000Z","size":4642,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":2,"default_branch":"main","last_synced_at":"2023-11-22T08:28:09.323Z","etag":null,"topics":["decision-tree-classifier","explainable-ai","machine-learning","mixed-integer-programming"],"latest_commit_sha":null,"homepage":"","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/Epanemu.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null}},"created_at":"2022-12-11T09:26:29.000Z","updated_at":"2023-05-23T19:45:21.000Z","dependencies_parsed_at":"2023-11-22T08:37:47.878Z","dependency_job_id":null,"html_url":"https://github.com/Epanemu/FCT","commit_stats":null,"previous_names":["epanemu/fct"],"tags_count":0,"template":null,"template_full_name":null,"purl":"pkg:github/Epanemu/FCT","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Epanemu%2FFCT","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Epanemu%2FFCT/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Epanemu%2FFCT/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Epanemu%2FFCT/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/Epanemu","download_url":"https://codeload.github.com/Epanemu/FCT/tar.gz/re
fs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Epanemu%2FFCT/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":32024251,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-18T20:23:30.271Z","status":"online","status_checked_at":"2026-04-19T02:00:07.110Z","response_time":55,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["decision-tree-classifier","explainable-ai","machine-learning","mixed-integer-programming"],"created_at":"2025-12-24T08:29:35.088Z","updated_at":"2026-04-19T22:02:43.398Z","avatar_url":"https://github.com/Epanemu.png","language":"Jupyter Notebook","readme":"# FCT: Fair Classification Tree\n\nA low-depth Classification Tree that is optimized for leaf accuracy.\n\nThe leaves of the tree are then extended with further models. This implementation uses XGBoost models, but any model could be used in practice.\n\nThe hybrid tree combines the explainability of Classification Trees with the accuracy of XGBoost.\n\n## Results\nTo generate the visualizations and tables used in the paper, see `Results.ipynb`.\n\n\n## Requirements\nRequirements are listed in the `requirements.txt` file. To install them, run ```pip install -r requirements.txt```.\n\n## Datasets\nTo download the datasets, run ```python ./utils/openml_data_down.py```. This downloads the classification part of the tabular benchmark by Grinsztajn et al. 
to folders `./data/openml/categorical` and `./data/openml/numerical`.\n\n## Usage\nThe configurations that were executed are listed in `benchmark.py`.\nThat script cannot be run; it is made for use on the cluster where the experiments were executed.\n\n\n### Hybrid-trees\nA simple proof-of-concept example with a walkthrough is in `Example.ipynb`.\n\nTo run the proper optimization yourself, follow these 2 steps:\n - Run `python sklearn_warmstart.py -data path/to/data -res path/to/results`. This computes the low-depth tree for the data with default parameters, as presented in the paper. The data must be in the same format as produced by the download script. The results folder must be created in advance. For different hyperparameters, refer to the Python implementation. This creates `run0.ctx` in the `path/to/results` folder. This is the context representing the model (0 in the name is the seed used). You can choose a different optimization strategy by running a different Python script in place of `sklearn_warmstart.py`. The options are:\n    - `sklearn_warmstart.py` for the Warmstarted variant\n    - `gradual.py` for the Gradual variant\n    - `direct.py` for the Direct variant\n    - `halving.py` for the Halving variant (not used in the paper; an earlier version using bisection, described in the thesis)\n    - `oct.py` for OCT\n - Then, run `python finalize_model.py path/to/results/[model].ctx` to extend the tree stored in the `[model].ctx` file with XGBoost models in the leaves. The hybrid tree will be saved in `[model]_ext.ctx`. 
This file can then be loaded using the functions in `src/UtilityHelper.py`.\n\nTo investigate how the results were collected, see the function `retrieve_information()` in `src/UtilityHelper.py`.\n\n### CART\nTo run the optimization of CART methods, run `python find_best_trees.py`. This generates one file per configuration, containing the same information as used in `Results.ipynb`.\n\nNote that the script takes a long time to finish and can use a lot of memory as well.\n\n## Reference\n\nImproving the Validity of Decision Trees as Explanations \\\n*__Jiří Němeček__, Tomáš Pevný, Jakub Mareček* \\\n[link to the preprint](https://arxiv.org/abs/2306.06777)\n\n\n## Master's Thesis\nThis repository is a major part of Jiří Němeček's [Master's Thesis](https://dspace.cvut.cz/handle/10467/109455?locale-attribute=en) at FEE CTU in Prague.\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fepanemu%2Ffct","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fepanemu%2Ffct","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fepanemu%2Ffct/lists"}