https://github.com/drisskhattabi6/fake-news-detection-using-finetuned-bert

# Fake News Detection using Fine-tuned BERT

This project demonstrates how to fine-tune a pre-trained BERT model to detect fake news articles using a labeled dataset. It includes data preprocessing, model training, evaluation, and deployment via a Flask web app.

---

## 📁 Project Structure

```
├── Finetuning-BERT-Fake-News-Detection.ipynb   # Main notebook for training the model
├── Testing_BERT.ipynb                          # Notebook for testing and evaluation
├── app.py                                      # Flask web application
├── a1_True.csv                                 # Dataset of real news articles
├── a2_Fake.csv                                 # Dataset of fake news articles
├── model/
│   └── fakenews_weights.pt                     # Another saved model version
├── static/
│   └── style.css                               # Styling for the web interface
├── templates/
│   ├── index.html                              # Home page for news submission
│   └── result.html                             # Result page for prediction
├── logs.log                                    # Log file during training or testing
└── README.md                                   # This file
```

---

## 🧪 Dataset

The dataset consists of two CSV files:

- `a1_True.csv` – real news articles
- `a2_Fake.csv` – fake news articles

Each file includes the text of the news and associated metadata. The data is preprocessed and combined for training.
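
As a minimal sketch of the combining step, the snippet below reads both files with the standard library and attaches a numeric label to every row. The `label` field, the helper names, and the 0 = real / 1 = fake encoding are assumptions made for illustration; the notebook may well use pandas and a different encoding.

```python
import csv
import random


def load_labeled_rows(path, label):
    """Read one CSV file and tag every row with the given class label."""
    with open(path, newline="", encoding="utf-8") as handle:
        return [{**row, "label": label} for row in csv.DictReader(handle)]


def build_dataset(true_path="a1_True.csv", fake_path="a2_Fake.csv", seed=42):
    """Combine real and fake articles into one shuffled list of rows.

    Assumed encoding (not confirmed by the project): 0 = real, 1 = fake.
    """
    rows = load_labeled_rows(true_path, 0) + load_labeled_rows(fake_path, 1)
    random.Random(seed).shuffle(rows)  # fixed seed keeps the shuffle reproducible
    return rows
```

Shuffling before splitting keeps real and fake articles mixed in the train, validation, and test portions.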

---

## 🧠 Model

- **Model Used:** [`bert-base-uncased`](https://huggingface.co/bert-base-uncased)
- **Architecture:**
  - Pre-trained BERT encoder
  - Linear → ReLU → Dropout → Linear → LogSoftmax
- **Loss Function:** Negative Log Likelihood Loss (`NLLLoss`)
- **Optimizer:** `AdamW` with a learning rate of `1e-5`
- **Output Classes:** `Real` or `Fake`
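
The architecture listed above can be sketched in PyTorch roughly as follows. This is an illustration, not the notebook's code: the class name, the hidden size of 512, the dropout rate of 0.1, and the use of the encoder's `pooler_output` are all assumptions. The encoder is passed in as an argument so the head can be read and tested on its own.

```python
import torch
from torch import nn


class BertFakeNewsClassifier(nn.Module):
    """BERT encoder followed by Linear -> ReLU -> Dropout -> Linear -> LogSoftmax."""

    def __init__(self, encoder, encoder_dim=768, hidden_dim=512, dropout=0.1):
        super().__init__()
        self.encoder = encoder  # e.g. a Hugging Face BertModel
        self.head = nn.Sequential(
            nn.Linear(encoder_dim, hidden_dim),  # hidden_dim is an assumed size
            nn.ReLU(),
            nn.Dropout(dropout),                 # dropout rate is an assumed value
            nn.Linear(hidden_dim, 2),            # two output classes: Real, Fake
            nn.LogSoftmax(dim=1),                # log-probabilities, as NLLLoss expects
        )

    def forward(self, input_ids, attention_mask):
        outputs = self.encoder(input_ids=input_ids, attention_mask=attention_mask)
        return self.head(outputs.pooler_output)


def train_step(model, optimizer, input_ids, attention_mask, labels):
    """Run one optimization step with NLLLoss and return the loss value."""
    model.train()
    optimizer.zero_grad()
    log_probs = model(input_ids, attention_mask)
    loss = nn.NLLLoss()(log_probs, labels)
    loss.backward()
    optimizer.step()
    return loss.item()
```

With Hugging Face Transformers installed, the encoder would typically be `BertModel.from_pretrained("bert-base-uncased")`, whose hidden size is 768, and the optimizer `torch.optim.AdamW(model.parameters(), lr=1e-5)`.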

The fine-tuning process:

![Screenshot](imgs/process.png)

---

## 📈 Evaluation

After fine-tuning the BERT model on the fake news dataset, the classifier achieved the following performance metrics on the test set:

```
              precision    recall  f1-score   support

        Real       0.84      0.92      0.88      3213
        Fake       0.92      0.84      0.88      3522

    Accuracy                           0.88      6735
   Macro avg       0.88      0.88      0.88      6735
Weighted avg       0.88      0.88      0.88      6735
```

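✅ **Overall Accuracy:** 88%

📊 **Balanced Performance:** Both classes reach an F1-score of 0.88. Real news has higher recall (0.92) than precision (0.84) and fake news shows the reverse, so the model leans slightly toward labeling articles as real rather than being strongly biased toward either class.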
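
As a quick consistency check, the F1-scores and the overall accuracy can be recomputed from the per-class precision, recall, and support in the report. The inputs are already rounded to two decimals, so the results are approximate; the function and variable names are illustrative only.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Per-class values copied from the classification report.
report = {
    "Real": {"precision": 0.84, "recall": 0.92, "support": 3213},
    "Fake": {"precision": 0.92, "recall": 0.84, "support": 3522},
}

for name, row in report.items():
    print(name, round(f1_score(row["precision"], row["recall"]), 2))

# Accuracy equals the support-weighted average of the per-class recall.
total = sum(row["support"] for row in report.values())
accuracy = sum(row["recall"] * row["support"] for row in report.values()) / total
print("Accuracy", round(accuracy, 2))
```

Both F1-scores and the accuracy come out at 0.88, matching the reported figures.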

---

## 🚀 How to Use

### 🔧 1. Install Requirements

```bash
pip install torch transformers flask
```

### 🏋️ 2. Train or Load the Model

* Use the notebook `Finetuning-BERT-Fake-News-Detection.ipynb` to train and save the model.
* Or use the already fine-tuned model in `cashe/c2_new_model_weights.pt`.
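
Loading saved weights might look like the sketch below. It assumes the `.pt` file holds a state dict written with `torch.save(model.state_dict(), path)` and that a model with the same architecture has already been constructed; `load_weights` is a hypothetical helper, not a function from this project.

```python
import torch


def load_weights(model, weights_path, device="cpu"):
    """Load a saved state dict into an already constructed model."""
    state_dict = torch.load(weights_path, map_location=device)
    model.load_state_dict(state_dict)
    model.to(device)
    model.eval()  # switch off dropout for inference
    return model
```

Passing `map_location` lets weights that were saved on a GPU load on a CPU-only machine.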

### 🌐 3. Run the Flask Web App

```bash
python app.py
```

Then open your browser and visit: [http://localhost:5000](http://localhost:5000)

---

## 🖥️ Web Interface

* Paste a news article into the form.
* Click "Check".
* The app will return a prediction: ✅ **Real News** or ❌ **Fake News**

![Screenshot](imgs/img1.png)

![Screenshot](imgs/img2.png)

---

## 🧪 API Endpoint

You can also use the API with a POST request:

```bash
curl -X POST http://localhost:5000/api/predict \
  -H "Content-Type: application/json" \
  -d '{"text": "Some news content here..."}'
```

Response:

```json
{
  "prediction": "Fake News"
}
```
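
The same request can be sent from Python. The sketch below uses only the standard library and assumes the app is running locally on port 5000 and answers in the JSON shape shown above; `build_request` and `predict` are illustrative helper names.

```python
import json
import urllib.request

API_URL = "http://localhost:5000/api/predict"


def build_request(text, url=API_URL):
    """Build the JSON POST request for the prediction endpoint."""
    payload = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def predict(text, url=API_URL, timeout=10):
    """Send the article text to the API and return the predicted label."""
    with urllib.request.urlopen(build_request(text, url), timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))["prediction"]
```

Calling `predict("Some news content here...")` while the Flask app is running returns the value of the `prediction` field as a string.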

---

## 🙌 Acknowledgements

* [BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding](https://arxiv.org/abs/1810.04805)
* Hugging Face Transformers
* PyTorch