https://github.com/freakwill/thomas

My Bayes algorithm, for the name of Thomas Bayes 👨‍🔬
https://github.com/freakwill/thomas

bayes-classifier naive-bayes-classifier python

Last synced: about 2 months ago
JSON representation

My Bayes algorithm, for the name of Thomas Bayes 👨‍🔬

Host: GitHub
URL: https://github.com/freakwill/thomas
Owner: Freakwill
License: mit
Created: 2018-10-06T13:10:28.000Z (over 7 years ago)
Default Branch: master
Last Pushed: 2019-05-22T12:57:33.000Z (about 7 years ago)
Last Synced: 2025-06-21T15:04:14.374Z (about 1 year ago)
Topics: bayes-classifier, naive-bayes-classifier, python
Language: Python
Homepage:
Size: 2.62 MB
Stars: 0
Watchers: 1
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE

Awesome Lists containing this project

README

          # Thomas

My Bayes algorithm, for the name of Thomas Bayes

## Features

* Cope with continuous random varaibles intellegently.

* integer random varaibles (e.g. the mass of things with integer gram) will be treated as continuous ones in some case.

## Requirement

* numpy

* pandas

* scikit-learn (in examples)

* neupy

## Install

`pip install tomas`

not `thomas` which had been registered.

## Why

For the Honor of T. Bayes

![](https://github.com/Freakwill/Thomas/blob/master/Thomas_Bayes.gif)

## Grammar

Just see the example file.

## Examples

```python

import pandas as pd

from mystat import *

from scipy.stats import chi2_contingency

from sklearn.model_selection import train_test_split

train = pd.read_excel('modeling_data.xls', encoding='utf-8')

train['批次时间'] = [_.total_seconds() for _ in train['批次完成时间'] - train['批次开始时间']]

train['G-W-T'] = list(zip(train['克重'], train['重量'], train['批次时间']))

train['L-M'] = list(zip(train['长度'], train['门幅']))

# train = train.drop(columns=['批次开始时间', '批次完成时间', '配方ID', '流程卡号', '缸号', '批次时间'])

y_train = train['质量问题']

x_train, x_test, y_train, y_test = train_test_split(train, y_train, test_size=0.2)

# seperate the data in x_train to 1 + 3 groups as x_train and z_trains

x_train = x_train[['机器', '弹力', '氨纶', '织物', '纱线', '颜色', '客户', '月份', 'L-M', 'G-W-T']]

z_trains = x_train[[s for s in train.columns if s.startswith('助')]], x_train[[s for s in train.columns if s.startswith('染')]], x_train[[s for s in train.columns if s.startswith('光')]]

from tomas import *

def nb():

    models = None # use GRNN to fit data (z_trains, y_train)

    nbc = ZeroOneHemiNaiveBayesClassifier.fromDataFrame(x_train, z_trains, y_train, models)

    y_pred = nbc.predictdf(x_test)

    scores = check(y_test, y_pred)

    print(report(scores))

nb()

# =>

   | -> C1 | -> C0|

    C1 | 128 | 43 |

    C0 | 217 | 431 |

----------------

    f-score(p) 0.4961

    f-socre(n) 0.7683

    mcc 0.3405

```

## Is it easy?

Yes

## Principle

![](https://github.com/Freakwill/Thomas/blob/master/README.png)

ecosyste.ms

Data

Tools

Indexes

Applications

Experiments

Awesome

https://github.com/freakwill/thomas

Awesome Lists containing this project

README