https://github.com/nok/weka-porter
Transpile trained decision trees from Weka to C, Java or JavaScript.
data-science machine-learning weka
Last synced: about 1 year ago
- Host: GitHub
- URL: https://github.com/nok/weka-porter
- Owner: nok
- License: mit
- Created: 2016-11-27T20:27:32.000Z (over 9 years ago)
- Default Branch: master
- Last Pushed: 2022-12-26T20:26:16.000Z (over 3 years ago)
- Last Synced: 2025-05-09T01:48:10.561Z (about 1 year ago)
- Topics: data-science, machine-learning, weka
- Language: Python
- Homepage:
- Size: 28.3 KB
- Stars: 7
- Watchers: 1
- Forks: 2
- Open Issues: 2
Metadata Files:
- Readme: readme.md
- License: license.txt
# weka-porter
Transpile trained decision trees from [Weka](http://www.cs.waikato.ac.nz/ml/weka/) to C, Java or JavaScript.
It's recommended for resource-constrained embedded systems and performance-critical applications.
## Benefit
The module transpiles a decision tree from the compact text representation produced by [Weka](http://www.cs.waikato.ac.nz/ml/weka/) into a target programming language. For example, the following tree is ported to the Java method shown after it:
```
outlook = sunny
|   humidity <= 75: yes (2.0)
|   humidity > 75: no (3.0)
outlook = overcast: yes (4.0)
outlook = rainy
|   windy = TRUE: no (2.0)
|   windy = FALSE: yes (3.0)
```
```java
public static String classify(String outlook, boolean windy, double humidity) {
    if (outlook.equals("sunny")) {
        if (humidity <= 75) {
            return "yes";
        }
        else if (humidity > 75) {
            return "no";
        }
    }
    else if (outlook.equals("overcast")) {
        return "yes";
    }
    else if (outlook.equals("rainy")) {
        if (windy == true) {
            return "no";
        }
        else if (windy == false) {
            return "yes";
        }
    }
    return null;
}
```
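To illustrate the idea behind the mapping above, here is a minimal Python sketch that turns the compact representation into nested `if` statements. It is not the library's implementation: the function name `tree_to_python`, the line pattern, and the choice to emit Python source (so the result can be executed directly) are assumptions made for this example, and nominal values such as `TRUE` are simply compared as strings.

```python
import re

# Assumed line shape: "<attribute> <operator> <value>[: <label> (<counts>)]"
LINE = re.compile(
    r"^(?P<attr>\w+)\s*(?P<op><=|>=|=|<|>)\s*(?P<value>[^:]+?)"
    r"(?::\s*(?P<label>\S+)\s*\(.*\))?$"
)


def tree_to_python(tree_text, method_name="classify"):
    """Emit Python source with one nested `if` per line of the tree."""
    rows = []
    for raw in tree_text.splitlines():
        if not raw.strip():
            continue
        body = raw.lstrip("| \t")
        # Each leading "|" marks one level of nesting.
        depth = raw[: len(raw) - len(body)].count("|")
        match = LINE.match(body.rstrip())
        if match is None:
            raise ValueError("Unrecognised line: %r" % raw)
        rows.append((depth, match))

    # Every attribute that is tested becomes a parameter (sorted by name).
    attrs = sorted({m.group("attr") for _, m in rows})
    out = ["def %s(%s):" % (method_name, ", ".join(attrs))]
    for depth, m in rows:
        pad = "    " * (depth + 1)
        if m.group("op") == "=":
            # Nominal test: compare against the value as a string.
            cond = "%s == %r" % (m.group("attr"), m.group("value").strip())
        else:
            # Numeric test: keep the operator and parse the threshold.
            cond = "%s %s %r" % (
                m.group("attr"), m.group("op"), float(m.group("value")))
        out.append("%sif %s:" % (pad, cond))
        if m.group("label"):
            # Leaf: the line carries a class label.
            out.append("%s    return %r" % (pad, m.group("label")))
    out.append("    return None")
    return "\n".join(out)


TREE = """\
outlook = sunny
|   humidity <= 75: yes (2.0)
|   humidity > 75: no (3.0)
outlook = overcast: yes (4.0)
outlook = rainy
|   windy = TRUE: no (2.0)
|   windy = FALSE: yes (3.0)
"""

source = tree_to_python(TREE)
namespace = {}
exec(source, namespace)  # define the generated function
classify = namespace["classify"]
print(source)
```

The generated function takes its parameters by name, for example `classify(outlook="rainy", windy="TRUE", humidity=80)`.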
## Installation
```bash
pip install weka-porter
```
## Usage
Use the porter either as an [imported module](#module) in your application or through the [command-line interface](#command-line-interface).
### Training
First, a trained decision tree is required.
```bash
# download Weka:
wget https://netcologne.dl.sourceforge.net/project/weka/weka-3-8/3.8.2/weka-3-8-2.zip
unzip weka-3-8-2.zip && cd weka-3-8-2
# train decision tree and save the result:
java -cp weka.jar weka.classifiers.trees.J48 -t data/weather.numeric.arff -v > j48.txt
```
Copy and paste the compact representation from `j48.txt` to a new file (e.g. `j48_tree.txt`):
```
outlook = sunny
|   humidity <= 75: yes (2.0)
|   humidity > 75: no (3.0)
outlook = overcast: yes (4.0)
outlook = rainy
|   windy = TRUE: no (2.0)
|   windy = FALSE: yes (3.0)
```
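The copy-and-paste step could also be scripted. The following is a rough sketch, not part of weka-porter: the helper names `extract_tree` and `save_tree` are hypothetical, and it assumes the report contains a `J48 pruned tree` (or `J48 unpruned tree`) heading followed by a dashed rule, with the tree ending before the `Number of Leaves` line. Check that against your own `j48.txt` before relying on it.

```python
import re

# Assumed heading that precedes the tree in Weka's text report.
HEADING = re.compile(r"^J48 (un)?pruned tree\s*$")


def extract_tree(report_text):
    """Return only the tree lines from a J48 text report."""
    lines = report_text.splitlines()
    # Locate the heading that precedes the tree.
    start = next(
        (i for i, line in enumerate(lines) if HEADING.match(line)), None)
    if start is None:
        raise ValueError("No J48 tree heading found")
    tree = []
    for line in lines[start + 1:]:
        stripped = line.strip()
        # Assumed end marker: the summary that follows the tree.
        if stripped.startswith("Number of Leaves"):
            break
        # Skip the dashed rule and blank lines around the tree.
        if not stripped or set(stripped) == {"-"}:
            continue
        tree.append(line.rstrip())
    return "\n".join(tree) + "\n"


def save_tree(report_path="j48.txt", tree_path="j48_tree.txt"):
    """Read the report, extract the tree and write it to a new file."""
    with open(report_path) as src:
        tree = extract_tree(src.read())
    with open(tree_path, "w") as dst:
        dst.write(tree)
    return tree
```

Calling `save_tree()` in the directory that holds `j48.txt` would then write `j48_tree.txt` next to it.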
### Module
Now the saved decision tree can be ported to Java:
```python
from weka_porter import Porter
porter = Porter(language='java')
output = porter.port('j48_tree.txt', method_name='classify')
print(output)
```
The ported [decision tree](examples/basics.py#L12-L33) matches the [original version](examples/j48_tree.txt) of the estimator.
### Command-line interface
This example shows how to port an estimator from the command line using the following command:
```
python -m weka_porter --input <input_file> [--output <output>] [--c] [--java] [--js]
python -m weka_porter -i <input_file> [-o <output>] [--c] [--java] [--js]
```
Select the target programming language with the corresponding flag:
```bash
python -m weka_porter -i j48_tree.txt --c
python -m weka_porter -i j48_tree.txt --java
python -m weka_porter -i j48_tree.txt --js
```
Finally, the following command displays all options:
```bash
python -m weka_porter --help
python -m weka_porter -h
```
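If the command-line interface needs to be driven from a Python script (for instance in a build step), the commands above can be wrapped with `subprocess`. This is only a sketch: `build_command` and `port_via_cli` are hypothetical helper names, only the `-i`, `--c`, `--java` and `--js` flags shown above are used, and it is an assumption that the ported source is written to standard output when `--output` is omitted.

```python
import subprocess
import sys

# Language flags as listed in the usage line above.
LANGUAGE_FLAGS = {"c": "--c", "java": "--java", "js": "--js"}


def build_command(tree_path, language="java"):
    """Build the argument list for `python -m weka_porter`."""
    if language not in LANGUAGE_FLAGS:
        raise ValueError("Unsupported language: %r" % language)
    return [sys.executable, "-m", "weka_porter",
            "-i", tree_path, LANGUAGE_FLAGS[language]]


def port_via_cli(tree_path, language="java"):
    """Run the CLI and return whatever it prints.

    Assumption: without --output the ported source goes to stdout.
    Requires Python 3.7+ for `capture_output`.
    """
    result = subprocess.run(
        build_command(tree_path, language),
        capture_output=True, text=True, check=True)
    return result.stdout
```

With weka-porter installed, `port_via_cli("j48_tree.txt", "js")` would run the same command as the third line of the example above.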
## Development
### Environment
Install the required environment [modules](environment.yml) by executing the script [environment.sh](scripts/environment.sh):
```bash
bash ./scripts/environment.sh
```
```bash
conda env create -n weka-porter -f environment.yml
source activate weka-porter
```
Furthermore, [Node.js](https://nodejs.org) (`>=6`), [Java](https://java.com) (`>=1.6`) and [GCC](https://gcc.gnu.org) (`>=4.2`) are required to run all tests.
### Testing
Run all [tests](tests) by executing the bash script [test.sh](scripts/test.sh):
```bash
bash ./scripts/test.sh
```
```bash
python -m unittest discover -vp '*Test.py'
```
The tests cover module functions as well as matching predictions of ported trees.
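As a simplified illustration of what "matching predictions" can mean, the sketch below checks a hand-written Python transliteration of the Java method from the Benefit section against one input per leaf of the weather tree. It does not reproduce the repository's tests, which may work differently (the Node.js, Java and GCC requirements suggest the generated code is actually compiled and run). The class name `PortedTreeTest` and the file name are hypothetical; saved as, say, `tests/PortedTreeTest.py`, it would match the `'*Test.py'` discovery pattern.

```python
import unittest


def classify(outlook, windy, humidity):
    """Python transliteration of the ported Java method (illustration)."""
    if outlook == "sunny":
        if humidity <= 75:
            return "yes"
        elif humidity > 75:
            return "no"
    elif outlook == "overcast":
        return "yes"
    elif outlook == "rainy":
        if windy:
            return "no"
        else:
            return "yes"
    return None


# One case per leaf of the tree: (outlook, windy, humidity, expected label)
LEAF_CASES = [
    ("sunny", False, 75.0, "yes"),
    ("sunny", False, 75.1, "no"),
    ("overcast", True, 90.0, "yes"),
    ("rainy", True, 80.0, "no"),
    ("rainy", False, 80.0, "yes"),
]


class PortedTreeTest(unittest.TestCase):
    def test_predictions_match_leaves(self):
        # Each leaf of the original tree must yield its own label.
        for outlook, windy, humidity, expected in LEAF_CASES:
            with self.subTest(outlook=outlook, windy=windy,
                              humidity=humidity):
                self.assertEqual(
                    classify(outlook, windy, humidity), expected)

    def test_unknown_value_returns_none(self):
        # Mirrors the trailing `return null;` of the Java method.
        self.assertIsNone(classify("foggy", False, 50.0))
```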
## License
The library is Open Source Software released under the [MIT](license.txt) license.
## Questions?
Don't be shy and feel free to contact me on [Twitter](https://twitter.com/darius_morawiec).