https://github.com/cgoliver/arxivdt
Full implementation of a decision tree classifier that sorts arXiv abstracts by subject.
- Host: GitHub
- URL: https://github.com/cgoliver/arxivdt
- Owner: cgoliver
- Created: 2017-08-26T19:01:07.000Z (almost 9 years ago)
- Default Branch: master
- Last Pushed: 2017-08-26T19:08:21.000Z (almost 9 years ago)
- Last Synced: 2025-01-22T04:31:40.251Z (over 1 year ago)
- Topics: decision-trees, language-processing, machine-learning, python
- Language: Python
- Size: 25.1 MB
- Stars: 2
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# arXiv Decision Tree Classifier
Full Python implementation of a decision tree classifier with pruning. Classifies arXiv abstracts by topic.
Sample usage:
```
python decision_tree.py -h
python decision_tree.py -i ../Data/train_in.csv -o ../Data/train_out.csv -f 400 -e 1.5 -s 1000
```
A decision tree will be trained on `train_in.csv` and evaluated against `train_out.csv`. It will use 400 words as features (`-f 400`), apply an entropy threshold of 1.5 (`-e 1.5`), and train on 1000 examples (`-s 1000`).
The run produces a file `DTmetrics.txt` containing the performance metrics of the decision tree classifier.
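
To make the feature count and entropy threshold from the sample command more concrete, here is a minimal sketch of how such options are commonly used in a decision tree. It is not taken from `decision_tree.py`: the function names (`top_words`, `entropy`, `should_stop`), the whitespace tokenization, and the rule "stop splitting once node entropy is at or below the threshold" are all assumptions made for illustration.

```python
import math
from collections import Counter


def top_words(abstracts, n_features):
    """Return the n_features most frequent lowercase tokens across all abstracts.

    Assumption: plain whitespace tokenization; the repository may tokenize differently.
    """
    counts = Counter()
    for text in abstracts:
        counts.update(text.lower().split())
    return [word for word, _ in counts.most_common(n_features)]


def entropy(labels):
    """Shannon entropy, in bits, of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    return -sum(
        (count / total) * math.log2(count / total)
        for count in Counter(labels).values()
    )


def should_stop(labels, threshold):
    """Hypothetical stopping rule: turn a node into a leaf once its
    label entropy is at or below the threshold."""
    return entropy(labels) <= threshold
```

Under these assumptions, a node holding four equally frequent subjects has an entropy of 2 bits, so a threshold of 1.5 would let the tree keep splitting it, while a node with only two equally frequent subjects (1 bit) would become a leaf.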