EDiT: Interpreting Ensemble Models via Compact Soft Decision Trees (ICDM'19)
- Host: GitHub
- URL: https://github.com/leesael/EDiT
- Owner: leesael
- License: bsd-3-clause
- Created: 2019-08-20T06:17:09.000Z (over 5 years ago)
- Default Branch: master
- Last Pushed: 2019-09-25T06:55:30.000Z (over 5 years ago)
- Last Synced: 2024-08-03T19:07:58.131Z (6 months ago)
- Language: Python
- Size: 1.33 MB
- Stars: 10
- Watchers: 2
- Forks: 4
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
# EDiT
This project is a PyTorch implementation of [EDiT: Interpreting Ensemble Models via Compact Soft Decision Trees](docs/YooS19.pdf), published in the proceedings of [ICDM 2019](http://icdm2019.bigke.org/).
This paper proposes a novel approach that distills the knowledge of an ensemble model to maximize the interpretability of soft decision trees (SDT) with fewer parameters.

## Prerequisites
- Python 3.6+
- [PyTorch](https://pytorch.org/) 1.2.0+
- [NumPy](https://numpy.org)
- [scikit-learn](https://scikit-learn.org/stable/)
- [joblib](https://joblib.readthedocs.io/en/latest/)
- [pandas](https://pandas.pydata.org/)

## Usage
You should first download the datasets from [this website](http://persoal.citius.usc.es/manuel.fernandez.delgado/papers/jmlr/) and place them in `data/`.
Alternatively, you can simply run `down.sh` in `data/` in a Linux environment.
The collection contains over a hundred datasets that were used in previous works, but we use only 8 of them in our work.
The list of target datasets is described in `datasets.txt`.

Then, move to `src/` and run `python main.py` to actually run EDiT.
Currently it trains a vanilla SDT over the `abalone` dataset, but you can easily change the hyperparameters in `src/main.py`, including the dataset, sparsification technique, and training procedure.
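
For readers unfamiliar with the kind of model being trained, the following is a minimal, generic sketch of a soft decision tree forward pass in plain Python. It is not the code in `src/`: the function name `sdt_predict`, the heap-style node layout, and the toy parameters are assumptions made purely for illustration. Each internal node routes an input to its right child with a sigmoid probability, and the prediction is the average of the leaf distributions weighted by the probability of reaching each leaf.

```python
import math


def sigmoid(z):
    """Logistic function used as the soft routing gate."""
    return 1.0 / (1.0 + math.exp(-z))


def sdt_predict(x, gates, leaves):
    """Forward pass of a binary soft decision tree (illustrative only).

    gates:  list of (weights, bias), one per internal node, in heap order
            (the children of node i are nodes 2*i + 1 and 2*i + 2).
    leaves: list of class-probability lists, in the same heap order.
    """
    n_internal = len(gates)
    assert len(leaves) == n_internal + 1
    # arrival[i] is the probability that x reaches node i.
    arrival = [0.0] * (2 * n_internal + 1)
    arrival[0] = 1.0
    for i, (w, b) in enumerate(gates):
        p_right = sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)
        arrival[2 * i + 1] = arrival[i] * (1.0 - p_right)  # left child
        arrival[2 * i + 2] = arrival[i] * p_right          # right child
    leaf_arrival = arrival[n_internal:]
    n_classes = len(leaves[0])
    # Mix the leaf distributions by their arrival probabilities.
    return [
        sum(a * leaf[c] for a, leaf in zip(leaf_arrival, leaves))
        for c in range(n_classes)
    ]


# Toy depth-2 tree over 2 features and 2 classes (hypothetical values).
gates = [([1.0, 0.0], 0.0), ([0.0, 1.0], 0.0), ([0.0, -1.0], 0.0)]
leaves = [[0.9, 0.1], [0.2, 0.8], [0.6, 0.4], [0.1, 0.9]]
probs = sdt_predict([0.0, 0.0], gates, leaves)
```

Because every gate is a smooth function of its weights, such a tree can be trained with gradient descent, which is why a PyTorch implementation is a natural fit.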
To use the tree pruning technique, change `tree_threshold` from `0` to a desired threshold such as `1e-4`.

If you want to enable the knowledge distillation technique, first run `python rf.py` to train and save the random forests (RF), which are not included in this repository.
The trained RF models are saved in `out/rf/models`.
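
To make the idea of distilling an ensemble more concrete, here is a small, generic sketch in plain Python. It assumes the usual soft-label form of knowledge distillation: the class probabilities of the ensemble members are averaged into a teacher distribution, and the student is trained to minimize the cross-entropy against it. The helper names `soft_labels` and `distillation_loss` and the toy numbers are hypothetical; the exact objective used by EDiT is defined in the paper and in `src/`, not here.

```python
import math


def soft_labels(member_probs):
    """Average the class probabilities predicted by each ensemble member."""
    n_members = len(member_probs)
    n_classes = len(member_probs[0])
    return [
        sum(p[c] for p in member_probs) / n_members
        for c in range(n_classes)
    ]


def distillation_loss(student_probs, teacher_probs, eps=1e-12):
    """Cross-entropy of the student's prediction against the soft labels."""
    return -sum(
        t * math.log(max(s, eps))
        for s, t in zip(student_probs, teacher_probs)
    )


# Three hypothetical ensemble members voting on one sample with two classes.
member_probs = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
teacher = soft_labels(member_probs)

# A hypothetical student prediction for the same sample.
student = [0.7, 0.3]
loss = distillation_loss(student, teacher)
```

The loss is smallest when the student reproduces the teacher distribution exactly, so minimizing it pushes the compact tree toward the ensemble's behavior rather than toward the hard labels alone.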
Other results, such as intermediate training logs and the trained compact soft decision trees, are saved in `out/edit`.

## Reference
You can download [this bib file](docs/YooS19.bib) or copy the following information:
```
@inproceedings{YooS19,
author = {Jaemin Yoo and Lee Sael},
title = {EDiT: Interpreting Ensemble Models via Compact Soft Decision Trees},
booktitle = {IEEE International Conference on Data Mining (ICDM)},
year = {2019}
}
```