https://github.com/ramarav/fake_news_detection
Machine learning approach for fake news detection using Scikitlearn
https://github.com/ramarav/fake_news_detection
itertools jupyter-notebook jupyter-notebooks machine-learning machine-learning-algorithms machinelearning numpy pandas passiveaggressiveclassifier python python-3 python3 scikit-learn scikitlearn-machine-learning tfidfvectorizer
Last synced: about 2 months ago
JSON representation
Machine learning approach for fake news detection using Scikitlearn
- Host: GitHub
- URL: https://github.com/ramarav/fake_news_detection
- Owner: ramarav
- License: gpl-3.0
- Created: 2020-06-13T10:09:52.000Z (almost 6 years ago)
- Default Branch: master
- Last Pushed: 2020-06-13T10:13:24.000Z (almost 6 years ago)
- Last Synced: 2025-02-16T00:24:42.970Z (over 1 year ago)
- Topics: itertools, jupyter-notebook, jupyter-notebooks, machine-learning, machine-learning-algorithms, machinelearning, numpy, pandas, passiveaggressiveclassifier, python, python-3, python3, scikit-learn, scikitlearn-machine-learning, tfidfvectorizer
- Language: Jupyter Notebook
- Size: 11.1 MB
- Stars: 4
- Watchers: 0
- Forks: 0
- Open Issues: 0
-
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
README
# ๐ Fake News Detection using ML + Generative AI
### ๐ง Detect fake vs. real news headlines and explain the reasoning using OpenAI GPT-4o-mini.
---
## ๐๏ธ Project Overview
This upgraded version of the classic Fake News Detection project adds an **LLM-powered explainability layer**.
Traditional ML models classify news articles, while **GPT-4o-mini** provides a human-readable justification of why an article is likely fake or real.
---
## โ๏ธ Tech Stack
| Layer | Technology | Purpose |
|-------|-------------|----------|
| ๐งฉ Machine Learning | **Scikit-Learn** | Core fake/real classification |
| ๐ง Generative AI | **OpenAI GPT-4o-mini** | Natural-language explanations |
| ๐ป Frontend | **HTML + CSS** | Simple, responsive web interface |
| ๐ Backend | **Flask** | Web app for serving predictions |
| ๐พ Storage | **joblib, pandas** | Model + dataset handling |
---
## ๐งฉ Features
- ๐ฐ Classifies news as **FAKE** or **REAL** using Passive Aggressive Classifier.
- ๐ Generates **explanations** via GPT-4o-mini for every prediction.
- ๐ Flask-based web app with an easy-to-use text input box.
- ๐ Confusion matrix & accuracy summary available on `/metrics`.
- ๐งฑ Modular folder structure for quick extension or retraining.
---
## ๐ Folder Structure
```
Fake_News_Detection/
โ
โโโ app.py # Flask entry point
โโโ requirements.txt # Dependencies
โโโ model/
โ โโโ fake_news_model.pkl # Trained PAC model
โ โโโ tfidf_vectorizer.pkl # TF-IDF vectorizer
โ
โโโ data/
โ โโโ news.csv # Dataset (Kaggle-style)
โ
โโโ utils/
โ โโโ gpt_explainer.py # GPT-4o-mini text explanations
โ โโโ model_loader.py # Load + predict helpers
โ โโโ preprocess.py # Text preprocessing utils
โ
โโโ templates/
โ โโโ index.html # Web interface
โ
โโโ static/
โ โโโ style.css # Styling
โ
โโโ README.md
```
---
## ๐งฐ Setup Instructions
```bash
# 1๏ธโฃ Clone the repository
git clone https://github.com/ramarav/Fake_News_Detection.git
cd Fake_News_Detection
# 2๏ธโฃ Create a virtual environment
python -m venv venv
source venv/bin/activate # on Windows use venv\Scripts\activate
# 3๏ธโฃ Install dependencies
pip install -r requirements.txt
# 4๏ธโฃ Add your OpenAI API key (for explanations)
set OPENAI_API_KEY=your_api_key_here # Windows
export OPENAI_API_KEY=your_api_key_here # macOS/Linux
# 5๏ธโฃ Run Flask app
python app.py
```
Then open [http://localhost:5000](http://localhost:5000) ๐ฏ
---
## ๐งช Sample Output
| Input | Prediction | Explanation |
|--------|-------------|--------------|
| โNASA confirms aliens discovered near Mars base.โ | **FAKE** | โThis resembles tabloid-style unverifiable claims.โ |
| โUN reports global hunger dropped by 10% in 2024.โ | **REAL** | โThe phrasing and reference to official data suggest credibility.โ |
---
## ๐งฎ Model Performance
| Metric | Value |
|---------|--------|
| Accuracy | **93.13%** |
| Classifier | PassiveAggressiveClassifier |
| Vectorizer | TF-IDF (max_df=0.7, stop_words='english') |
---
## ๐ท๏ธ Badges






---
## ๐ฆ API Endpoint Example
**POST** `/predict`
```bash
curl -X POST http://127.0.0.1:5000/predict -H "Content-Type: application/json" -d '{"text": "Breaking: New vaccine approved by WHO"}'
```
**Response:**
```json
{
"prediction": "REAL",
"explanation": "WHO approvals are verified through credible institutional sources."
}
```
---
## ๐ก Future Enhancements
- [ ] Integrate news source credibility scoring
- [ ] Add multilingual detection
- [ ] Deploy using Docker + Render
- [ ] Support voice-based input (Speech-to-Text)
---
## ๐จโ๐ป Author
**Mekala Ramarao**
AMD India
Focus: AI/ML applications in NLP, GPU analytics, and intelligent automation.
๐ง [LinkedIn](https://www.linkedin.com/in/mekala-ramarao-a2b5a562/)