https://github.com/cgoliver/arxivdt

Full implementation of a decision tree classifier that sorts arXiv abstracts by subject.

decision-trees language-processing machine-learning python

# arXiv Decision Tree Classifier

Full Python implementation of a decision tree classifier with pruning. Classifies arXiv abstracts by topic.

Sample usage:

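```
# Show the available command-line options
python decision_tree.py -h

# Train and evaluate with 400 word features, entropy threshold 1.5, 1000 training examples
python decision_tree.py -i ../Data/train_in.csv -o ../Data/train_out.csv -f 400 -e 1.5 -s 1000
```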

A decision tree will be trained on `train_in.csv` and evaluated against `train_out.csv`. It uses 400 words as features (`-f 400`), an entropy threshold of 1.5 (`-e 1.5`), and 1000 examples to train the tree (`-s 1000`).
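
To make the feature count and entropy threshold concrete, here is a minimal sketch of how such parameters are commonly used in a decision tree over text. It is not taken from `decision_tree.py`: the function names `top_words`, `entropy`, and `should_split` are hypothetical, and treating the threshold as a "stop splitting once entropy falls to or below this value" rule is an assumption about what `-e` does.

```python
import math
from collections import Counter


def top_words(abstracts, k):
    """Return the k most frequent lowercase tokens across all abstracts.

    Hypothetical helper: the repository may select its word features differently.
    """
    counts = Counter(word for text in abstracts for word in text.lower().split())
    return [word for word, _ in counts.most_common(k)]


def entropy(labels):
    """Shannon entropy, in bits, of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels)
    # Sum of p * log2(1/p) over the classes present at this node.
    return sum((n / total) * math.log2(total / n) for n in counts.values())


def should_split(labels, threshold):
    """Assumed stopping rule: split a node only while its entropy exceeds the threshold."""
    return entropy(labels) > threshold


if __name__ == "__main__":
    # Toy data for illustration only, not from the arXiv data set.
    abstracts = ["quantum field theory", "graph theory", "quantum computing"]
    print(top_words(abstracts, 2))
    print(should_split(["math", "physics", "cs", "stat"], 1.5))
```

Under this assumed rule, a node whose examples are spread evenly over four subjects has an entropy of 2 bits and would be split at a threshold of 1.5, while a node split evenly between two subjects has 1 bit and would become a leaf.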

The run produces a file `DTmetrics.txt` containing the performance metrics of the decision tree classifier.