Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/mehuaniket/blog-classifier
blog classifier with scikit random forest.
bag-of-words blog-classifier python scikit-learn
- Host: GitHub
- URL: https://github.com/mehuaniket/blog-classifier
- Owner: mehuaniket
- Created: 2017-07-06T10:15:17.000Z (over 7 years ago)
- Default Branch: master
- Last Pushed: 2017-07-07T10:29:34.000Z (over 7 years ago)
- Last Synced: 2024-11-09T15:24:05.617Z (about 2 months ago)
- Topics: bag-of-words, blog-classifier, python, scikit-learn
- Language: Python
- Size: 995 KB
- Stars: 1
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
# Blog-Classifier
## Introduction
- This is a small tool that classifies blog posts into categories using a random forest.
- The training set is in `blogs.csv`, in the following format:
```
|----------------------|
| blog | category |
|----------------------|
```
- To predict the category of a blog post, use `use_forest_prediction.py`. You may want to change the way data is passed to the function in order to integrate it with a backend.
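The approach named above (a bag-of-words representation fed into a random forest) can be sketched with scikit-learn roughly as follows. This is a minimal illustration, not the code in `bag-of-words.py`: the sample rows are made up, the header names `blog` and `category` are assumed from the format shown above, and the vectorizer and forest settings are arbitrary choices.

```python
import csv
import io

from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import CountVectorizer

# Made-up stand-in for blogs.csv; the header row is an assumption.
SAMPLE_CSV = """blog,category
Officers screened carry-on bags at the airport checkpoint,tsa
Travelers must remove liquids before security screening,tsa
Controllers managed the runway traffic from the tower,air-traffic
The new radar system helps controllers track flights,air-traffic
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
texts = [row["blog"] for row in rows]
labels = [row["category"] for row in rows]

# Bag of words: every post becomes a vector of word counts.
# scikit-learn's built-in English stop-word list is used here for simplicity.
vectorizer = CountVectorizer(stop_words="english", max_features=5000)
features = vectorizer.fit_transform(texts)

# Fit the random forest on the count vectors.
forest = RandomForestClassifier(n_estimators=100, random_state=0)
forest.fit(features, labels)

# Classify an unseen post using the same vocabulary.
new_post = ["Security officers screened bags at the checkpoint"]
predicted = forest.predict(vectorizer.transform(new_post))
print(predicted[0])
```

With the real data, the in-memory string would be replaced by opening `blogs.csv`; a training set this small is only enough to show the flow, not to judge accuracy.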
## Installation
- Install the dependencies from `requirements.txt` using the following command:
```bash
pip install -r requirements.txt
```
- Train from `blogs.csv` (the model is saved in `forest.pickle` and `vocab.pickle`).
- Run the following command in the project folder:
```bash
python bag-of-words.py
```
- Comment out the 15th line in `bag-of-words.py` after downloading the stopwords from the popup.
## Make Prediction
- To make a prediction, run `user_forest_prediction.py`:
```bash
python user_forest_prediction.py
```
- The blog collection was gathered from the following URLs with the Scrapy framework, so arbitrary blogs will not give good results.
- tsa - https://www.dhs.gov/archive/news-releases/blog
- air-traffic - http://www.atc-network.com/atc-news
- Reference: https://www.kaggle.com/c/word2vec-nlp-tutorial/details/part-1-for-beginners-bag-of-words
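
The prediction step in this section relies on the two pickle files written during training. The sketch below shows one way such a round trip could look. It assumes that `vocab.pickle` holds a fitted `CountVectorizer`, which the README does not confirm (the repository may store a plain vocabulary instead). `save_artifacts` and `predict_category` are hypothetical helper names, and the training sentences are made up.

```python
import pickle
import tempfile
from pathlib import Path

from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import CountVectorizer


def save_artifacts(forest, vectorizer, folder):
    """Hypothetical helper: write the model and the vectorizer to disk."""
    with open(Path(folder) / "forest.pickle", "wb") as f:
        pickle.dump(forest, f)
    with open(Path(folder) / "vocab.pickle", "wb") as f:
        pickle.dump(vectorizer, f)


def predict_category(text, folder):
    """Hypothetical helper: load both pickles and classify one post."""
    with open(Path(folder) / "forest.pickle", "rb") as f:
        forest = pickle.load(f)
    with open(Path(folder) / "vocab.pickle", "rb") as f:
        vectorizer = pickle.load(f)
    # The post must be vectorized with the vocabulary used in training.
    return forest.predict(vectorizer.transform([text]))[0]


# Made-up training data in place of blogs.csv.
texts = [
    "Officers screened carry-on bags at the airport checkpoint",
    "Travelers must remove liquids before security screening",
    "Controllers managed the runway traffic from the tower",
    "The new radar system helps controllers track flights",
]
labels = ["tsa", "tsa", "air-traffic", "air-traffic"]

vectorizer = CountVectorizer()
forest = RandomForestClassifier(n_estimators=50, random_state=0)
forest.fit(vectorizer.fit_transform(texts), labels)

post = "Controllers guided flights from the tower"
with tempfile.TemporaryDirectory() as folder:
    save_artifacts(forest, vectorizer, folder)
    category = predict_category(post, folder)
print(category)
```

Wrapping the load-and-predict logic in a single function like this is one way to make the classifier callable from a backend, as the Introduction suggests. Pickle files should only be loaded from sources you trust.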