
https://github.com/shreyansh-21/hateshield

HateShield is an AI-powered hate speech detection system using LSTM for text and ResNet for images. It analyzes social media comments, memes, and other content to identify harmful speech with high accuracy.

ai computer-vision deep-learning hate-speech-detection keras machine-learning nlp tenserflow


HateShield


Overview


HateShield is a deep learning-based system designed to detect hate speech in both text and images. It utilizes:




  • LSTM (Long Short-Term Memory) for analyzing text-based hate speech.


  • ResNet (Residual Network) for detecting hateful content in memes and images.


This repository contains the implementation of both models, trained on multiple hate speech datasets.


Table of Contents



  1. Datasets

  2. LSTM for Text-Based Hate Speech Detection

  3. ResNet for Image-Based Hate Speech Detection

  4. How to Run

  5. Results


Datasets


Text-Based Datasets




  • HateXplain: A dataset that provides explanations along with hate speech classification.

  • Dataset Location: Hugging Face


  • Twitter & YouTube Hate Comments Dataset: Contains hate speech from social media platforms.

  • Dataset Location: Kaggle

Image-Based Datasets




  • Hateful Memes Dataset: A multimodal dataset for hate speech detection in memes.

  • Dataset Location: Facebook AI


LSTM for Text-Based Hate Speech Detection


What is LSTM?


LSTM (Long Short-Term Memory) is a type of recurrent neural network (RNN) that is particularly useful for processing sequential data, such as text. Unlike traditional RNNs, LSTMs can handle long-range dependencies, making them effective in understanding the context of words in sentences.

How LSTM Helps in Text Analysis




  • Captures Context: LSTMs carry information about earlier words forward through a sentence, which helps with implicit hate speech whose meaning depends on context.


  • Handles Long Sequences: Works well with long tweets, comments, and posts where context is crucial.


  • Mitigates Vanishing Gradient Problem: Unlike simple RNNs, LSTMs use gates to selectively store and forget information.
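
The gating idea in the list above can be illustrated with a minimal sketch of a single LSTM time step on scalar values. This is not the code from Text_LSTM.ipynb; the function name `lstm_step`, the weight layout, and the weight values are hypothetical and chosen only to make the gate behaviour visible.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def lstm_step(x, h_prev, c_prev, w):
    """One LSTM time step for a scalar input and scalar state.

    `w` maps each gate name to (input weight, recurrent weight, bias).
    """
    def pre(gate):
        wx, wh, b = w[gate]
        return wx * x + wh * h_prev + b

    f = sigmoid(pre("forget"))        # how much of the old cell state to keep
    i = sigmoid(pre("input"))         # how much new information to write
    g = math.tanh(pre("candidate"))   # the new information itself
    o = sigmoid(pre("output"))        # how much of the cell state to expose

    c = f * c_prev + i * g            # additive update of the cell state
    h = o * math.tanh(c)              # hidden state passed to the next step
    return h, c

# Hypothetical weights: forget gate held open, input gate held shut,
# so the cell state should survive the whole sequence almost unchanged.
weights = {
    "forget": (0.0, 0.0, 20.0),
    "input": (0.0, 0.0, -20.0),
    "candidate": (1.0, 0.0, 0.0),
    "output": (0.0, 0.0, 20.0),
}

h, c = 0.0, 0.8
for x in [0.5, -1.0, 2.0, 0.3]:
    h, c = lstm_step(x, h, c, weights)
```

Because the cell state is updated additively (`f * c_prev + i * g`) rather than being squashed through a nonlinearity at every step, a forget gate close to 1 lets information, and gradients, pass across many time steps.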

Implementation Details




  • Preprocessing:

    • Tokenization of text.

    • Removal of stop words and special characters.

    • Padding sequences for uniform input size.
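
The three preprocessing steps above can be sketched in plain Python. This is an illustration under stated assumptions, not the notebook's code: the stop-word list is a tiny hypothetical sample, the helper names are invented here, and sequences are padded at the end with id 0.

```python
import re

# Tiny illustrative stop-word list (an assumption; real pipelines use a fuller one)
STOP_WORDS = {"a", "an", "the", "is", "are", "to", "of", "and"}

def tokenize(text):
    """Lowercase, replace special characters with spaces, split, drop stop words."""
    cleaned = re.sub(r"[^a-z0-9\s]", " ", text.lower())
    return [tok for tok in cleaned.split() if tok not in STOP_WORDS]

def build_vocab(token_lists):
    """Map each token to an integer id; 0 is reserved for padding, 1 for unknown words."""
    vocab = {"<pad>": 0, "<unk>": 1}
    for tokens in token_lists:
        for tok in tokens:
            vocab.setdefault(tok, len(vocab))
    return vocab

def encode_and_pad(tokens, vocab, max_len):
    """Convert tokens to ids, truncate to max_len, then pad with 0 at the end."""
    ids = [vocab.get(tok, 1) for tok in tokens][:max_len]
    return ids + [0] * (max_len - len(ids))

comments = ["You are the WORST!!!", "Have a nice day :)"]
token_lists = [tokenize(c) for c in comments]
vocab = build_vocab(token_lists)
padded = [encode_and_pad(t, vocab, max_len=5) for t in token_lists]
```

Every row of `padded` has the same length, which is what lets a batch of comments be fed to an embedding layer as a single rectangular array.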




  • Model Architecture:

    • Embedding Layer: Converts words into dense vectors.

    • LSTM Layer: Captures sequential dependencies.

    • Dense Layer: Classifies text as hateful or non-hateful.



Code Reference: See the Text_LSTM.ipynb notebook for the full implementation.


ResNet for Image-Based Hate Speech Detection


What is ResNet?


ResNet (Residual Network) is a deep convolutional neural network (CNN) architecture built around residual (skip) connections. These connections ease the optimization of very deep networks by mitigating vanishing gradients and the accuracy degradation that plain networks show as layers are added, which makes it practical to train very deep models.

How ResNet Helps in Image Analysis




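  • Feature Extraction: Learns visual features that can capture embedded text, symbols, and offensive imagery in memes.


  • Deep Learning Performance: Mitigates the degradation in accuracy that plain networks show as they get deeper.


  • Residual Connections: Let each block learn a correction on top of its input, which keeps very deep networks trainable where plain stacked CNNs struggle.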
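
The residual connection itself is a small idea: a block computes y = relu(F(x) + x) instead of y = F(x). The sketch below shows this on short lists of numbers, using dense layers in place of convolutions to keep it short. That substitution, and all function names, are simplifications made for this example and do not reflect ResNet_final.ipynb.

```python
def relu(v):
    return [max(0.0, x) for x in v]

def linear(v, weights, bias):
    """Dense layer on a list: `weights` holds one row per output value."""
    return [sum(w * x for w, x in zip(row, v)) + b for row, b in zip(weights, bias)]

def residual_block(v, w1, b1, w2, b2):
    """y = relu(F(x) + x), where F is two linear layers with a ReLU in between."""
    f = linear(relu(linear(v, w1, b1)), w2, b2)
    # The skip connection: add the block's input back onto its output.
    return relu([fi + xi for fi, xi in zip(f, v)])

# With all weights at zero, F(x) is zero and the block passes its input through.
zeros = [[0.0, 0.0], [0.0, 0.0]]
no_bias = [0.0, 0.0]
x = [1.5, 0.25]
y = residual_block(x, zeros, no_bias, zeros, no_bias)
```

A block whose weights are near zero therefore behaves like the identity, so adding more blocks does not force the network to relearn what earlier layers already represent.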

Implementation Details




  • Preprocessing:

    • Image resizing and normalization.

    • Data augmentation to improve model generalization.
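
As a rough illustration of the preprocessing steps above, the sketch below resizes, normalizes, and flips a tiny grayscale image stored as nested lists. It uses plain Python so that it stays self-contained; the project lists OpenCV among its prerequisites, so the notebook likely does this differently. The helper names, the nearest-neighbour resize, and the horizontal flip as the augmentation are all assumptions.

```python
def resize_nearest(img, out_h, out_w):
    """Nearest-neighbour resize of a 2D list of pixel values."""
    in_h, in_w = len(img), len(img[0])
    return [
        [img[r * in_h // out_h][c * in_w // out_w] for c in range(out_w)]
        for r in range(out_h)
    ]

def normalize(img):
    """Scale 8-bit pixel values from [0, 255] to [0.0, 1.0]."""
    return [[px / 255.0 for px in row] for row in img]

def horizontal_flip(img):
    """A simple augmentation: mirror the image left to right."""
    return [row[::-1] for row in img]

image = [
    [0, 0, 255, 255],
    [0, 0, 255, 255],
    [51, 51, 102, 102],
    [51, 51, 102, 102],
]
small = resize_nearest(image, 2, 2)
scaled = normalize(small)
flipped = horizontal_flip(scaled)
```

For memes, a horizontal flip also mirrors any embedded text, so whether a given augmentation is appropriate depends on the dataset.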




  • Model Architecture:

    • Convolutional Layers: Extract features from images.

    • Residual Blocks: Allow deeper networks to train without vanishing gradients.

    • Fully Connected Layers: Classify images as hateful or non-hateful.



Code Reference: See the ResNet_final.ipynb notebook for the full implementation.


How to Run


Prerequisites



  • Python 3.8+

  • TensorFlow

  • Keras

  • OpenCV

  • Pandas

  • scikit-learn

  • Matplotlib

Steps to Run



  1. Clone the repository:
    git clone https://github.com/shreyansh-21/HateShield.git
    
    cd HateShield


  2. Install dependencies:
    pip install -r requirements.txt


  3. Run the LSTM model:
    jupyter notebook
    
    # Open and run Text_LSTM.ipynb


  4. Run the ResNet model:
    jupyter notebook
    
    # Open and run ResNet_final.ipynb



Results


The metrics reported for each model are listed below (values are still placeholders):




  • LSTM Model:

    • Precision: xx%

    • Recall: xx%

    • F1-score: xx%




  • ResNet Model:

    • Accuracy: xx%

    • Precision: xx%

    • Recall: xx%
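
For reference, the metrics listed above can be computed from binary labels as in the sketch below. The function name and the toy labels are made up for illustration; they are not the project's predictions or results.

```python
def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall and F1 for binary labels (1 = hateful, 0 = non-hateful)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # hateful, flagged
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # non-hateful, flagged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # hateful, missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # non-hateful, not flagged

    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}

# Toy labels for illustration only
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
metrics = classification_metrics(y_true, y_pred)
```

Precision reflects how many flagged items were truly hateful, while recall reflects how many hateful items were caught. Reporting both matters for hate speech data, where the classes are often imbalanced and accuracy alone can be misleading.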