https://github.com/nouraalgohary/news-article-classifier
An ML model that uses natural language processing techniques to classify articles based on their content. The model analyzes the text of each article to predict the most relevant category, such as news, sports, or entertainment. The model's performance is validated through rigorous testing, and all relevant code and data sets are included.
- Host: GitHub
- URL: https://github.com/nouraalgohary/news-article-classifier
- Owner: NouraAlgohary
- Created: 2022-12-21T14:55:21.000Z (over 3 years ago)
- Default Branch: main
- Last Pushed: 2022-12-22T04:02:56.000Z (over 3 years ago)
- Last Synced: 2025-02-14T14:51:39.883Z (over 1 year ago)
- Topics: kaggle, machine-learning, mlproject, multinomial-naive-bayes, news-article-classifier, nlp, text-classification
- Language: Jupyter Notebook
- Homepage:
- Size: 138 KB
- Stars: 0
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- Changelog: News Article Classifier.ipynb
README
# News-Article-Classifier
A machine learning model that predicts an article's category from its text. Data from [BBC News Classification | Kaggle](https://www.kaggle.com/competitions/learn-ai-bbc/data).
## Steps:
An LSTM is not efficient for such a small data set, so it was replaced with MultinomialNB.
1. Data Loading
2. Data Cleaning
3. Data Preprocessing
4. Model Training
5. Testing
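
The steps above can be sketched end to end in a few lines. This is a minimal illustration, not the notebook's code: the README names MultinomialNB (the scikit-learn class name), while the sketch below reimplements multinomial Naive Bayes with Laplace smoothing using only the standard library so that it runs without dependencies. The `TRAIN` examples are made up for illustration, and the helper names `tokenize`, `train`, and `predict` are hypothetical.

```python
import math
import re
from collections import Counter, defaultdict

# 1. Data loading: toy, made-up examples (NOT taken from the BBC data set).
TRAIN = [
    ("Shares rose as the bank reported strong profit growth", "business"),
    ("The company cut prices after weak market sales", "business"),
    ("The striker scored twice as the team won the cup match", "sport"),
    ("The coach praised the players after the league win", "sport"),
    ("The new phone software update improves mobile security", "tech"),
    ("Engineers released faster broadband and computer chips", "tech"),
]


def tokenize(text):
    """2-3. Cleaning and preprocessing: lowercase, keep alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def train(examples, alpha=1.0):
    """4. Training: multinomial Naive Bayes with add-alpha smoothing."""
    class_docs = Counter()             # number of documents per class
    word_counts = defaultdict(Counter)  # token counts per class
    vocab = set()
    for text, label in examples:
        tokens = tokenize(text)
        class_docs[label] += 1
        word_counts[label].update(tokens)
        vocab.update(tokens)

    total_docs = sum(class_docs.values())
    model = {"vocab": vocab, "log_prior": {}, "log_likelihood": {}}
    for label, n_docs in class_docs.items():
        model["log_prior"][label] = math.log(n_docs / total_docs)
        denom = sum(word_counts[label].values()) + alpha * len(vocab)
        model["log_likelihood"][label] = {
            word: math.log((word_counts[label][word] + alpha) / denom)
            for word in vocab
        }
    return model


def predict(model, text):
    """5. Testing: pick the class with the highest log posterior.

    Tokens that never appeared in training are ignored.
    """
    tokens = [t for t in tokenize(text) if t in model["vocab"]]
    scores = {
        label: prior + sum(model["log_likelihood"][label][t] for t in tokens)
        for label, prior in model["log_prior"].items()
    }
    return max(scores, key=scores.get)


model = train(TRAIN)
prediction = predict(model, "The team won the match")
print(prediction)
```

With a real data set, the same flow would typically hold out part of the labeled articles and compare `predict` output against the true categories to measure accuracy.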
## Features:
- 1490 instances
- 3 columns (ArticleId, Article, Category)
- 5 Categories ('business', 'tech', 'politics', 'sport', 'entertainment')
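
A quick sanity check of a loaded file against the description above might look like the following sketch. It assumes a CSV whose header matches the column names listed here; the header in the downloaded Kaggle file may differ, so `EXPECTED_COLUMNS` should be adjusted accordingly. The function name `validate_rows` and the two sample rows are hypothetical and only stand in for the real file.

```python
import csv
import io

# Column names and categories as listed in this README (assumed to match the file).
EXPECTED_COLUMNS = ["ArticleId", "Article", "Category"]
EXPECTED_CATEGORIES = {"business", "tech", "politics", "sport", "entertainment"}


def validate_rows(csv_file, expected_count=None):
    """Read a CSV file object and check its columns, categories, and row count."""
    reader = csv.DictReader(csv_file)
    if reader.fieldnames != EXPECTED_COLUMNS:
        raise ValueError(f"unexpected columns: {reader.fieldnames}")

    rows = list(reader)
    unknown = {row["Category"] for row in rows} - EXPECTED_CATEGORIES
    if unknown:
        raise ValueError(f"unknown categories: {sorted(unknown)}")
    if expected_count is not None and len(rows) != expected_count:
        raise ValueError(f"expected {expected_count} rows, found {len(rows)}")
    return rows


# Made-up sample standing in for the real file.
SAMPLE = io.StringIO(
    "ArticleId,Article,Category\n"
    "1,made-up business text,business\n"
    "2,made-up sport text,sport\n"
)
rows = validate_rows(SAMPLE, expected_count=2)
print(len(rows))
```

For the full training file, passing `expected_count=1490` would confirm the instance count stated above.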