https://github.com/holygrease/algorithm-id3

Decision tree learning is a method for approximating discrete-valued target functions, in which the learned function is represented by a decision tree.

Host: GitHub
URL: https://github.com/holygrease/algorithm-id3
Owner: HolyGrease
Created: 2021-05-22T12:53:36.000Z (about 5 years ago)
Default Branch: main
Last Pushed: 2021-05-31T19:15:56.000Z (about 5 years ago)
Last Synced: 2025-04-13T19:18:20.389Z (about 1 year ago)
Topics: classification, classification-algorithm
Language: Python
Homepage:
Size: 25.4 KB
Stars: 0
Watchers: 1
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: Readme.md


# Example of creating Dataset object

Firstly, import Dataset:

	from dataset import Dataset

Secondly, get some data:

	data = [
		[5.1, 3.5, 1.4, 0.2, "Iris-setosa"],
		[5.0, 3.2, 1.2, 0.2, "Iris-setosa"],
		[6.4, 3.2, 4.5, 1.5, "Iris-versicolor"],
		[6.7, 3.1, 4.4, 1.4, "Iris-versicolor"],
		[6.7, 3.0, 5.2, 2.3, "Iris-virginica"]]

Thirdly, set the column (attribute) names:

	column_names = [
		"Sepal length", "Sepal width",
		"Petal length", "Petal width",
		"Class"]

Now we can create a Dataset object. Its arguments are:

- data - a list of lists, one inner list per row

- target index - the index of the target attribute, that is, the column that holds the class values

- column (attribute) names - a list of attribute names

- name - the dataset name

	iris = Dataset(data, 4, column_names, "Iris")
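
To make the role of the target index concrete, here is a minimal plain-Python sketch. It does not use the Dataset class, and split_features_and_labels is a hypothetical helper written only for this illustration. It shows how an index of 4 separates each row into attribute values and a class label.

```python
def split_features_and_labels(rows, target_index):
    """Separate each row into its attribute values and its class label.

    Hypothetical helper for illustration; not part of the repository.
    """
    features = [row[:target_index] + row[target_index + 1:] for row in rows]
    labels = [row[target_index] for row in rows]
    return features, labels


data = [
    [5.1, 3.5, 1.4, 0.2, "Iris-setosa"],
    [6.4, 3.2, 4.5, 1.5, "Iris-versicolor"],
    [6.7, 3.0, 5.2, 2.3, "Iris-virginica"],
]

# Column 4 holds the class, so the first four columns are the attributes.
features, labels = split_features_and_labels(data, 4)
```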

You can also get the Iris dataset directly by calling get_iris(). To read the dataset from a specific file, pass the path as an argument, for example:

	get_iris("data\\iris.data")

The default value of the path is

> resources\\data\\iris\\iris.data

so calling get_iris() without arguments uses that default path:

	iris = get_iris()

# Preprocessing dataset

Thresholding is the process of converting continuous values to discrete values. There are two methods:

- median - threshold at the median value of the column

- gain - threshold at the value that gives the maximum gain

The threshold method takes the column index as its first argument and the method name as its second. Here i is the index of the column to discretize:

	iris.threshold(i, "gain")
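
The two thresholding methods can be sketched in plain Python as follows. This is an illustration under assumptions, not the repository's code: it assumes "gain" means information gain, that candidate thresholds are midpoints between adjacent distinct values, and that the column has at least two distinct values. All function names here are hypothetical.

```python
import math
from collections import Counter
from statistics import median


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def information_gain(values, labels, threshold):
    """Gain from splitting rows into value <= threshold and value > threshold."""
    left = [c for v, c in zip(values, labels) if v <= threshold]
    right = [c for v, c in zip(values, labels) if v > threshold]
    remainder = sum(len(part) / len(labels) * entropy(part)
                    for part in (left, right) if part)
    return entropy(labels) - remainder


def median_threshold(values):
    """Threshold at the median of the column."""
    return median(values)


def gain_threshold(values, labels):
    """Threshold at the candidate midpoint with the highest information gain."""
    distinct = sorted(set(values))
    candidates = [(a + b) / 2 for a, b in zip(distinct, distinct[1:])]
    return max(candidates, key=lambda t: information_gain(values, labels, t))


# Petal length column and classes from the five example rows.
petal_length = [1.4, 1.2, 4.5, 4.4, 5.2]
classes = ["Iris-setosa", "Iris-setosa", "Iris-versicolor",
           "Iris-versicolor", "Iris-virginica"]

by_median = median_threshold(petal_length)
by_gain = gain_threshold(petal_length, classes)
```

On this small sample the median falls on a versicolor row, while the gain-based threshold lands in the gap between the setosa rows and the rest, which is why the two methods can discretize the same column differently.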

Shuffle dataset:

	iris = iris.shuffle()

Split the dataset into "train" and "test" parts by passing the ratio as an argument. Here the train dataset gets 80% of the original dataset and the test dataset gets the rest:

	train, test = iris.split_by_ration(0.8)
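
The shuffle and split steps can be pictured with the following plain-Python sketch. It works on a bare list of rows rather than a Dataset object; shuffle_and_split and its seed parameter are hypothetical and only serve the illustration.

```python
import random


def shuffle_and_split(rows, ratio, seed=None):
    """Return (train, test): a shuffled copy of rows cut at the given ratio."""
    shuffled = list(rows)                  # copy so the caller's list is untouched
    random.Random(seed).shuffle(shuffled)  # seeded generator for repeatable runs
    cut = int(len(shuffled) * ratio)       # number of rows that go to train
    return shuffled[:cut], shuffled[cut:]


rows = list(range(10))
train, test = shuffle_and_split(rows, 0.8, seed=0)
print(len(train), len(test))  # 8 2
```

Shuffling before splitting matters for files such as iris.data, where rows are grouped by class: an unshuffled 80/20 cut would leave one class underrepresented in the train part.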

# Creating tree

To build a tree, import the following from tree.py:

	from tree import make_tree
	from tree import Root

To create a decision tree, use make_tree. It takes one argument, the dataset, and returns a tree object.

	tree = make_tree(train)
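
For readers unfamiliar with ID3, here is a minimal sketch of the algorithm on already-discretized attributes. It is not the repository's make_tree: the nested-dict tree layout, the function names, and the toy data are all assumptions made for this illustration.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def gain(rows, labels, attr):
    """Information gain of splitting on the attribute at index attr."""
    remainder = 0.0
    for value in set(row[attr] for row in rows):
        subset = [c for row, c in zip(rows, labels) if row[attr] == value]
        remainder += len(subset) / len(labels) * entropy(subset)
    return entropy(labels) - remainder


def id3(rows, labels, attrs):
    """Return a class label (leaf) or {"attr": index, "children": {value: subtree}}."""
    if len(set(labels)) == 1:      # pure node: every row has the same class
        return labels[0]
    if not attrs:                  # no attributes left: fall back to majority class
        return Counter(labels).most_common(1)[0][0]
    best = max(attrs, key=lambda a: gain(rows, labels, a))
    remaining = [a for a in attrs if a != best]
    children = {}
    for value in set(row[best] for row in rows):
        pairs = [(r, c) for r, c in zip(rows, labels) if r[best] == value]
        children[value] = id3([r for r, _ in pairs], [c for _, c in pairs], remaining)
    return {"attr": best, "children": children}


# Toy data with discrete attribute values (illustrative, not real Iris rows).
rows = [["short", "narrow"], ["short", "wide"],
        ["long", "narrow"], ["long", "wide"]]
labels = ["setosa", "setosa", "versicolor", "virginica"]
tree = id3(rows, labels, [0, 1])
```

At each node the sketch splits on the attribute with the highest information gain and recurses until a node is pure or no attributes remain.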

# Classification

To classify an instance, use the classify method. It may return None if it cannot classify the instance.

	instance_to_classify = [4.8, 3.1, 1.6, 0.2]

	predicted_class = tree.classify(instance_to_classify)
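
One plausible reason for a None result is an attribute value that has no branch in the tree. The sketch below shows that behaviour on a hand-built nested-dict tree. Both the tree layout and the classify function here are assumptions for illustration and may differ from the repository's implementation.

```python
def classify(tree, instance):
    """Walk the tree; return the class label, or None if a value has no branch."""
    node = tree
    while isinstance(node, dict):
        value = instance[node["attr"]]
        if value not in node["children"]:
            return None  # value was not seen when the tree was built
        node = node["children"][value]
    return node


# Hand-built toy tree: inner nodes are dicts, leaves are class labels.
tree = {"attr": 0, "children": {
    "short": "setosa",
    "long": {"attr": 1, "children": {"narrow": "versicolor",
                                     "wide": "virginica"}},
}}

print(classify(tree, ["long", "wide"]))    # virginica
print(classify(tree, ["medium", "wide"]))  # None
```

Callers should therefore check for None before using the predicted class.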
