
# softdtree

[![Test Status](https://github.com/yoshoku/softdtree/actions/workflows/test.yml/badge.svg)](https://github.com/yoshoku/softdtree/actions/workflows/test.yml)
[![BSD 3-Clause License](https://img.shields.io/badge/License-BSD%203--Clause-orange.svg)](https://github.com/yoshoku/softdtree/blob/main/LICENSE.txt)
[![PyPI](https://img.shields.io/pypi/v/softdtree?color=blue)](https://pypi.org/project/softdtree/)

softdtree is a Python library that implements a classifier and a regressor based on Soft Decision Trees.

## Installation

softdtree requires Eigen3, so install it beforehand.

macOS:

```bash
$ brew install eigen cmake
```

Ubuntu:

```bash
$ sudo apt-get install libeigen3-dev cmake
```

Then, install softdtree from [PyPI](https://pypi.org/project/softdtree):

```bash
$ pip install -U softdtree
```

## Usage

The API of softdtree is compatible with [scikit-learn](https://scikit-learn.org/stable/).

Classifier:

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from softdtree import SoftDecisionTreeClassifier

X, y = load_digits(n_class=4, return_X_y=True)

clf = Pipeline([
    ("scaler", StandardScaler()),
    ("tree", SoftDecisionTreeClassifier(
        max_depth=4, eta=0.01, max_epoch=100, random_seed=42)),
])

scores = cross_val_score(clf, X, y, cv=5)
print(f"Accuracy: {scores.mean():.3f} ± {scores.std():.3f}")
```

Regressor:

```python
from sklearn.datasets import load_diabetes
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler
from softdtree import SoftDecisionTreeRegressor

X, y = load_diabetes(return_X_y=True)

reg = Pipeline([
    ("scaler", MinMaxScaler()),
    ("tree", SoftDecisionTreeRegressor(
        max_depth=4, eta=0.1, max_epoch=100, random_seed=42)),
])

scores = cross_val_score(reg, X, y, cv=5)
print(f"R^2: {scores.mean():.3f} ± {scores.std():.3f}")
```

## Parameters

- `max_depth` (int): The maximum depth of the tree. The default is `8`.
- `max_features` (float): The fraction of features used at each node. The number of features used is `max(1, min(n_features, n_features * max_features))`. The default is `1.0`.
- `max_epoch` (int): The maximum number of epochs to train. The default is `100`.
- `batch_size` (int): The number of samples used in one iteration. The default is `5`.
- `eta` (float): The learning rate. The default is `0.1`.
- `beta1` (float): The exponential decay rate for the first moment estimates. The default is `0.9`.
- `beta2` (float): The exponential decay rate for the second moment estimates. The default is `0.999`.
- `epsilon` (float): The term added to the denominator for numerical stability. The default is `1e-8`.
- `tol` (float): The tolerance for the optimization. The default is `1e-4`.
- `verbose` (int): If it is set to a value greater than `0`, the estimator outputs a log. The default is `0`.
- `random_seed` (int): The random seed. If `-1`, then it will be set to a number generated by a uniformly-distributed integer random number generator. The default is `-1`.
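
To make the `max_features` formula above concrete, here is a small sketch that evaluates it for a few inputs. The helper name `n_features_used` is hypothetical and is not part of softdtree, and truncating the product to an integer is an assumption, since the formula does not say how fractional values are rounded.

```python
def n_features_used(n_features, max_features=1.0):
    """Evaluate max(1, min(n_features, n_features * max_features)).

    Hypothetical helper for illustration only; not part of softdtree.
    Assumption: the fractional product is truncated to an integer.
    """
    return max(1, min(n_features, int(n_features * max_features)))


# The digits dataset has 64 features per sample.
full = n_features_used(64, 1.0)      # the default ratio keeps every feature
half = n_features_used(64, 0.5)      # half of the features
floor = n_features_used(64, 0.001)   # the lower bound of 1 applies
capped = n_features_used(10, 2.0)    # the upper bound of n_features applies
print(full, half, floor, capped)
```

The two bounds mean that a node always sees at least one feature and never more than the dataset provides, whatever ratio is passed.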

## References

- O. Irsoy, O. T. Yildiz, and E. Alpaydin, "Soft Decision Trees," in Proc. ICPR 2012, 2012.
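
For readers unfamiliar with the model in the reference above, the following is a minimal sketch of how a soft decision tree produces a response: each internal node computes a sigmoid gate on a linear function of the input and blends the responses of both children, rather than routing the input to only one of them. This is an illustration written for this README, not the code softdtree runs internally. The dictionary layout and the names `soft_tree_predict`, `w`, `b`, and `value` are assumptions chosen for the example.

```python
import math


def sigmoid(z):
    """Logistic function used as the gating function."""
    return 1.0 / (1.0 + math.exp(-z))


def soft_tree_predict(node, x):
    """Return the response of a soft tree for input x.

    Illustrative only: the node layout (dicts with "w", "b", "left",
    "right", or "value") is an assumption, not softdtree's internals.
    """
    if "value" in node:
        # Leaf node: constant response.
        return node["value"]
    # Internal node: gate g is the weight given to the left child.
    z = sum(w_i * x_i for w_i, x_i in zip(node["w"], x)) + node["b"]
    g = sigmoid(z)
    left = soft_tree_predict(node["left"], x)
    right = soft_tree_predict(node["right"], x)
    # Both children contribute, weighted by g and (1 - g).
    return g * left + (1.0 - g) * right


# A depth-1 tree with two leaves.
tree = {
    "w": [1.0, -1.0],
    "b": 0.0,
    "left": {"value": 1.0},
    "right": {"value": 0.0},
}
print(soft_tree_predict(tree, [0.0, 0.0]))   # gate is exactly 0.5 here
print(soft_tree_predict(tree, [10.0, 0.0]))  # gate is close to 1
```

Because the gate is differentiable, the node weights can be fitted with gradient-based optimization, which is consistent with the learning-rate and epoch parameters listed above.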

## License

softdtree is available as open source under the terms of
the [BSD-3-Clause License](https://github.com/yoshoku/softdtree/blob/main/LICENSE.txt).

## Contributing

Bug reports and pull requests are welcome on GitHub at https://github.com/yoshoku/softdtree.
This project is intended to be a safe, welcoming space for collaboration,
and contributors are expected to adhere to the [Contributor Covenant](https://contributor-covenant.org) code of conduct.