https://github.com/cair/regression-tsetlin-machine
Implementation of the Regression Tsetlin Machine
machine-learning regression tsetlin-machine
- Host: GitHub
- URL: https://github.com/cair/regression-tsetlin-machine
- Owner: cair
- License: mit
- Created: 2019-05-11T18:51:42.000Z (almost 6 years ago)
- Default Branch: master
- Last Pushed: 2019-05-14T07:15:23.000Z (almost 6 years ago)
- Last Synced: 2025-03-27T03:35:11.866Z (about 1 month ago)
- Topics: machine-learning, regression, tsetlin-machine
- Language: Python
- Homepage: https://arxiv.org/abs/1905.04206
- Size: 409 KB
- Stars: 9
- Watchers: 7
- Forks: 1
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# The Regression Tsetlin Machine
The inner inference mechanism of the Tsetlin Machine (https://arxiv.org/abs/1804.01508) is modified so that input patterns are transformed into a single continuous output, rather than into distinct categories.
This is achieved by:
* Using the conjunctive clauses of the Tsetlin Machine to capture arbitrarily complex patterns;
* Mapping these patterns to a continuous output through a novel voting and normalization mechanism; and
* Employing a feedback scheme that updates the Tsetlin Machine clauses to minimize the regression error.

Further details can be found in https://arxiv.org/abs/1905.04206.
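To make the voting and normalization idea concrete, the sketch below shows one way such a prediction step can look. It is not the code in this repository; the clause representation, the threshold parameter `T`, and the output-range handling are illustrative assumptions (see the paper for the exact scheme).
```
import numpy as np

def rtm_predict(x, clauses, T, y_min, y_max):
    """Illustrative Regression-Tsetlin-Machine-style prediction (sketch only).

    x       -- binary input vector (0/1 entries)
    clauses -- list of (pos, neg) index arrays; a clause is the conjunction of
               the original literals in `pos` and the negated literals in `neg`
    T       -- voting threshold used to normalize the clause sum (assumed)
    """
    votes = 0
    for pos, neg in clauses:
        # A clause fires only if all of its included literals are satisfied.
        if np.all(x[pos] == 1) and np.all(x[neg] == 0):
            votes += 1
    votes = min(votes, T)  # clip the vote count at the threshold
    # Map the clipped vote count linearly onto the continuous output range.
    return y_min + (votes / T) * (y_max - y_min)

# Example: a single clause "x0 AND NOT x1" over a 2-bit input.
clauses = [(np.array([0]), np.array([1]))]
print(rtm_predict(np.array([1, 0]), clauses, T=1, y_min=0.0, y_max=300.0))  # 300.0
```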
## Behaviour with noisy and noise-free data
Six datasets have been given in order to study the behaviour of the Regression Tsetlin Machine.
* **Dataset I** contains 2-bit feature input, and the output is 100 times the decimal value of the binary input (e.g., when the input is [1, 0], the output is 200). The training set consists of 8000 samples and the testing set of 2000 samples, both without noise
* **Dataset II** contains the same data as Dataset I, except that the output of the training data is perturbed to introduce noise
* **Dataset III** has 3-bit input without noise
* **Dataset IV** has 3-bit input with noise
* **Dataset V** has 4-bit input without noise
* **Dataset VI** has 4-bit input with noise

Different datasets can be loaded by changing the following line in **_ArtificialDataDemo.py_**:
```
df = np.loadtxt("2inNoNoise.txt").astype(dtype=np.float32)
```
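Since the datasets follow the simple rule described above, a Dataset-I-style file can be regenerated if needed. This is a sketch under assumptions: the column layout (feature bits followed by the target) and the output file name are guesses, not taken from the repository.
```
import numpy as np

# Regenerate Dataset-I-style data: 2-bit inputs whose target is
# 100 times the decimal value of the bits (e.g., [1, 0] -> 200).
rng = np.random.default_rng(0)
bits = rng.integers(0, 2, size=(10000, 2))    # 8000 training + 2000 test samples
target = 100 * (2 * bits[:, 0] + bits[:, 1])
data = np.column_stack([bits, target]).astype(np.float32)

# Assumed layout: feature columns followed by the target column,
# matching the np.loadtxt call used in ArtificialDataDemo.py.
np.savetxt("2inNoNoise_generated.txt", data, fmt="%d")
train, test = data[:8000], data[8000:]
```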
The variation of the training error for each dataset with different numbers of clauses can be seen in the following figure.
Datasets without noise can be learned perfectly with a small number of clauses:
```
Average Absolute Error on Training Data: 0.0
Average Absolute Error on Test Data: 0.0
```
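The reported metric is the average (mean) absolute error between predicted and true outputs; as a point of reference, it can be computed as in the following sketch (not the repository's own evaluation code):
```
import numpy as np

def average_absolute_error(y_pred, y_true):
    # Mean absolute deviation between predictions and targets.
    return np.mean(np.abs(np.asarray(y_pred) - np.asarray(y_true)))
```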
Training and testing errors for noisy data can be reduced by increasing the number of clauses and the number of training rounds.