https://github.com/massimilianovisintainer/fake-news-prediction-model
Fake News Prediction Model
machine-learning nltk numpy pandas python3 sklearn
- Host: GitHub
- URL: https://github.com/massimilianovisintainer/fake-news-prediction-model
- Owner: MassimilianoVisintainer
- Created: 2024-07-25T18:04:16.000Z (4 months ago)
- Default Branch: main
- Last Pushed: 2024-07-25T18:05:54.000Z (4 months ago)
- Last Synced: 2024-10-11T08:20:41.508Z (about 1 month ago)
- Topics: machine-learning, nltk, numpy, pandas, python3, sklearn
- Language: Python
- Homepage:
- Size: 3.91 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
## Fake News Prediction with Logistic Regression
### Introduction
This repository contains the code for a Fake News prediction system using Logistic Regression. The code is based on a Jupyter notebook originally generated by Colab.
### Functionality
This code performs the following tasks:
* **Import Libraries:** Imports the required libraries (pandas, numpy, nltk, and scikit-learn) for data manipulation, text processing, and machine learning.
* **Data Preprocessing:**
    * Loads the training data (`train.csv`) into a pandas DataFrame.
    * Handles missing values by replacing them with empty strings.
    * Combines the author name and title into a single "content" column.
    * Separates the features (the "content" column) from the target label (fake or real).
    * Applies stemming to reduce words to their root form and removes stopwords (common words like "the" and "and").
    * Converts the text into numerical features using a TF-IDF vectorizer.
* **Train-Test Split:** Splits the data into training and testing sets for model evaluation.
* **Model Training:** Trains a Logistic Regression model on the training data.
* **Evaluation:**
    * Evaluates the model's accuracy on both training and testing data.
* **Prediction:**
    * Makes a prediction on a new unseen piece of text data (example from the testing set).
    * Classifies the news as Real or Fake based on the prediction.

### Running the Code
This code is intended to be run in a Jupyter Notebook environment. You can follow these steps:
1. Download the code and data files.
2. Open the `Fake_News_Prediction.ipynb` file in a Jupyter Notebook environment.
3. Run the code cells sequentially.

### Dependencies
* Python 3.x
* pandas
* numpy
* nltk
* scikit-learn
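
### Illustrative Sketch
The steps listed under Functionality can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the column names (`author`, `title`, `label`), the label encoding (1 = fake, 0 = real), the split parameters, and the toy rows standing in for `train.csv` are all assumptions. Stopwords come from scikit-learn's built-in English list so the sketch runs without downloading NLTK corpora; the notebook may well use NLTK's own stopword list instead.

```python
import re

import pandas as pd
from nltk.stem.porter import PorterStemmer
from sklearn.feature_extraction.text import ENGLISH_STOP_WORDS, TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Toy stand-in for train.csv (made-up rows; column names are assumed).
df = pd.DataFrame({
    "author": ["Jane Doe", None, "John Roe", "Anonymous",
               "Mary Major", None, "Sam Poe", "Anonymous"],
    "title": [
        "Senate passes annual budget bill",
        "Aliens secretly control the weather",
        "City council approves new park",
        "Miracle pill cures every disease overnight",
        "Local school wins science award",
        "Celebrity clone spotted on the moon",
        "Researchers publish climate report",
        "Secret society bans all vegetables",
    ],
    "label": [0, 1, 0, 1, 0, 1, 0, 1],  # assumed encoding: 1 = fake, 0 = real
})

# Replace missing values with empty strings, then merge author and title.
df = df.fillna("")
df["content"] = df["author"] + " " + df["title"]

stemmer = PorterStemmer()


def stem_text(text: str) -> str:
    """Keep letters only, lowercase, drop stopwords, and stem each word."""
    words = re.sub(r"[^a-zA-Z]", " ", text).lower().split()
    return " ".join(stemmer.stem(w) for w in words if w not in ENGLISH_STOP_WORDS)


df["content"] = df["content"].apply(stem_text)

# Separate features and target, then convert text to TF-IDF features.
X_text = df["content"].values
y = df["label"].values
vectorizer = TfidfVectorizer()
X = vectorizer.fit_transform(X_text)

# Hold out part of the data for evaluation (split parameters are assumed).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=2
)

model = LogisticRegression()
model.fit(X_train, y_train)

# Accuracy on both the training and the testing data.
train_accuracy = accuracy_score(y_train, model.predict(X_train))
test_accuracy = accuracy_score(y_test, model.predict(X_test))
print("Training accuracy:", train_accuracy)
print("Testing accuracy:", test_accuracy)

# Classify a single example taken from the test set.
prediction = model.predict(X_test[0])
verdict = "Fake" if prediction[0] == 1 else "Real"
print("The news is", verdict)
```

The sketch fits the vectorizer before splitting because that is the order in which the steps are listed. Fitting it on the training portion only would keep test-set vocabulary out of the features.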