https://github.com/valaydave/fake-news-detection-han

Hierarchical Attention Neural Network For Fake News Detection
https://github.com/valaydave/fake-news-detection-han

document-classification fake-news-classification han hierarchical-attention-networks keras keras-implementations lstm neural-networks python

Last synced: 3 months ago
JSON representation

Hierarchical Attention Neural Network For Fake News Detection

Host: GitHub
URL: https://github.com/valaydave/fake-news-detection-han
Owner: valayDave
Created: 2019-11-09T09:13:00.000Z (over 5 years ago)
Default Branch: master
Last Pushed: 2019-11-10T20:38:55.000Z (over 5 years ago)
Last Synced: 2025-01-25T06:12:14.371Z (4 months ago)
Topics: document-classification, fake-news-classification, han, hierarchical-attention-networks, keras, keras-implementations, lstm, neural-networks, python
Language: Python
Homepage:
Size: 1020 KB
Stars: 0
Watchers: 2
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md

Awesome Lists containing this project

README

# FAKE NEWS DETECTION USING HIERARCHICAL ATTENTION NETWORKS

Implementation of document classification model described in Hierarchical Attention Networks for Document Classification (Yang et al., 2016) for Fake News Detection.

## Neural Networks

- There are three networks which the current script trains :
- LSTM
- Hierarchical Attention Networks(HAN) with Tokenized Sentences and Words for attention context.
- HAN with Headline tokenized vector as another layer of for additional attention context.

## Data setup

- run ```./gather_data_training.sh``` to get Embedding Vectors, Training Data Files and the Tokenized Word Index

- This will even download a Video File which displays the entire execution of the model. To run ```./gather_data_training.sh```, ```wget``` needs to be a part of the system. Ubuntu 16.04 is the ideal environment.

## Environment Setup

- ```pip install -r req.txt```

## Training :

- Dataset Used : split-3.csv : Contains 22000 Documents of Labels : FAKE_NEWS and REAL_NEWS

- Running ```python main.py``` will initate the training process for the LSTM,HAN and 3HAN models. It will save the Models in the ```models``` folder and the plots int the ```plots``` folder.

## Testing

- ```python predict.py```

# Instructions :

- The models provided in the Models folder can be run using the predict API. But there the output will not be what is provided in the presentation. To get the accuracy and the outputs mentioned in the presentation, please Train the models again using ```python main.py```. The logs display the accuracy of the model post the training processes against a test dataset.

- While we were saving the model we are facing a well known issue of Keras. Even though the model is saved it doesn't behave in the way it is supposed to. Even if passed with the same data it is trained with, the model doesn't yield the same accuracy which it yielded when the same model is run through the test dataset just after training. To validate the issues that we are proposing , Please find below the links on Keras's github thread where multiple users are reporting the same issues with the model.

- https://github.com/keras-team/keras/issues/4875

- https://github.com/keras-team/keras/issues/4904

- To View the full capabilities of the Model, Please retrain the Model and the logs contain the accuracy and the loss of the model. The video that gets downloaded via ```gather_data_training.sh``` show cases a model training session for the HAN.

## Best Results From Training :

### HAN
- ACCURACY
- ![HAN Accuracy](final_models/plots/HAN_Accuracy.png)

- LOSS
- ![HAN LOSS](final_models/plots/HAN_Loss.png)

### LSTM
- ACCURACY
- ![LSTM Accuracy](final_models/plots/LSTM_Accuracy.png)

- LOSS
- ![LSTM LOSS](final_models/plots/LSTM_Loss.png)

## Inference

The 3 LSTM is overfit, But even with regularization and drop out the overfitting doesn't seem to go. The LSTM's predictions are also less accurate than the HAN

## Authors

- [Valay Dave]([email protected])
- [Craig Ignatowski]([email protected])

ecosyste.ms

Data

Tools

Indexes

Applications

Experiments

Awesome

https://github.com/valaydave/fake-news-detection-han

Awesome Lists containing this project

README