https://github.com/nyx1311/toxicity-detector-using-bilstm
What we built: An AI-powered Women's Safety & Well-Being Detector, a web app that flags multiple forms of online abuse in real time and offers tools for emotional recovery. Under the hood: BiLSTM + Word2Vec embeddings for deep, context-aware detection. Trained on 21K+ labeled comments across 7 toxicity categories. Built with Python, Tensor
epoch genism gpu keras-tensorflow matplotlib model nlp nlp-machine-learning nlpaug-textual numpy pandas pandas-library python3 streamlit tensorflow word2vec
Last synced: 2 months ago
- Host: GitHub
- URL: https://github.com/nyx1311/toxicity-detector-using-bilstm
- Owner: Nyx1311
- Created: 2025-08-31T05:47:44.000Z (10 months ago)
- Default Branch: main
- Last Pushed: 2025-08-31T05:56:36.000Z (10 months ago)
- Last Synced: 2025-08-31T07:11:09.151Z (10 months ago)
- Topics: epoch, genism, gpu, keras-tensorflow, matplotlib, model, nlp, nlp-machine-learning, nlpaug-textual, numpy, pandas, pandas-library, python3, streamlit, tensorflow, word2vec
- Language: Jupyter Notebook
- Homepage:
- Size: 36.3 MB
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# Toxicity Detector using BiLSTM
A deep learning model to classify the degree of toxicity in user comments using a **Bidirectional LSTM (BiLSTM)** network.
This project aims to automatically detect toxic, offensive, or harmful language in text, making online communities safer.
---
## ✨ Features
- Preprocessing pipeline for raw text (cleaning, tokenization, padding).
- Word embeddings for rich semantic representation.
- **BiLSTM-based classifier** for sequence modeling.
- Trained on labeled comment datasets for toxicity detection.
- Supports multi-label toxicity classification: each comment is scored independently for toxic, severe toxic, obscene, threat, insult, and identity hate.
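
The preprocessing steps listed above (cleaning, tokenization, padding) can be sketched in plain Python. This is a minimal illustration, not the notebook's actual code: the function names, the cleaning rules, the reserved ids (0 for padding, 1 for unknown tokens), and the post-padding choice are all assumptions.

```python
import re
from collections import Counter

PAD_ID, OOV_ID = 0, 1  # assumed reserved ids: padding and out-of-vocabulary


def clean_text(text):
    """Lowercase, drop URLs, and strip everything but letters, digits, apostrophes."""
    text = text.lower()
    text = re.sub(r"https?://\S+", " ", text)
    text = re.sub(r"[^a-z0-9\s']", " ", text)
    return re.sub(r"\s+", " ", text).strip()


def build_vocab(texts, max_words=20000):
    """Map the most frequent tokens to integer ids, starting after the reserved ids."""
    counts = Counter(tok for t in texts for tok in clean_text(t).split())
    return {word: i + 2 for i, (word, _) in enumerate(counts.most_common(max_words))}


def encode(text, vocab, max_len=100):
    """Tokenize on whitespace, map to ids, truncate, then pad at the end to max_len."""
    ids = [vocab.get(tok, OOV_ID) for tok in clean_text(text).split()][:max_len]
    return ids + [PAD_ID] * (max_len - len(ids))
```

Every encoded comment has the same length, which is what a batched sequence model expects as input.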
---
## ⚙️ Tech Stack
- **Python 3**
- **TensorFlow / Keras**
- **BiLSTM (Bidirectional Long Short-Term Memory)**
- **NumPy, Pandas** for data handling
- **Matplotlib / Seaborn** for visualization
- **Streamlit** for deployment as a simple web app
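
To show how the TensorFlow / Keras and BiLSTM pieces might fit together, here is a hedged sketch of a multi-label BiLSTM classifier. The layer sizes, dropout rate, and the `build_bilstm_model` name are illustrative assumptions, not values taken from the notebook; the six labels are the ones listed under Features.

```python
LABELS = ["toxic", "severe toxic", "obscene", "threat", "insult", "identity hate"]


def build_bilstm_model(vocab_size=20000, embed_dim=100, max_len=100, lstm_units=64):
    """Build a small BiLSTM classifier with one sigmoid output per label.

    All hyperparameters here are assumed defaults for illustration only.
    """
    # Imported lazily so this module can be loaded without TensorFlow installed.
    from tensorflow.keras import layers, models

    model = models.Sequential([
        layers.Input(shape=(max_len,)),
        # mask_zero=True tells downstream layers to ignore padding id 0.
        layers.Embedding(input_dim=vocab_size, output_dim=embed_dim, mask_zero=True),
        # The wrapper runs one LSTM forward and one backward, then concatenates them.
        layers.Bidirectional(layers.LSTM(lstm_units)),
        layers.Dropout(0.3),
        # Sigmoid (not softmax): labels are independent, a comment can have several.
        layers.Dense(len(LABELS), activation="sigmoid"),
    ])
    model.compile(
        optimizer="adam",
        loss="binary_crossentropy",
        metrics=["binary_accuracy"],
    )
    return model
```

Calling `build_bilstm_model().summary()` in an environment with TensorFlow installed prints the layer shapes.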
---
## How to Run
### 1. Clone the repository
```bash
git clone https://github.com/Nyx1311/Toxicity-Detector-using-BiLSTM.git
cd Toxicity-Detector-using-BiLSTM
```
### 2. Install dependencies
```bash
pip install -r requirements.txt
```
### 3. Train the model
```bash
jupyter notebook Comment_Toxicity_using_BiLSTM.ipynb
```
### 4. Run the web app
```bash
streamlit run toxicity_app_streamlit.py
```
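
The README does not show how the app turns model output into flags. One plausible approach, sketched here under assumptions, is to compare each per-label probability with a threshold. The `flag_labels` helper and the 0.5 default are hypothetical and not taken from `toxicity_app_streamlit.py`.

```python
LABELS = ["toxic", "severe toxic", "obscene", "threat", "insult", "identity hate"]


def flag_labels(probs, threshold=0.5):
    """Return the labels whose predicted probability meets the threshold."""
    if len(probs) != len(LABELS):
        raise ValueError(f"expected {len(LABELS)} probabilities, got {len(probs)}")
    return [label for label, p in zip(LABELS, probs) if p >= threshold]
```

For example, `flag_labels([0.9, 0.1, 0.7, 0.02, 0.5, 0.3])` returns `["toxic", "obscene", "insult"]`. A lower threshold for rare but serious labels such as threat may be worth considering.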
---
## Results
* Achieved strong accuracy on multi-label toxicity classification.
* A BiLSTM reads each comment in both directions, so each token's representation draws on the words before and after it, which can improve prediction quality compared to a unidirectional LSTM.
*(Insert confusion matrix, accuracy/loss plots here.)*
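
Toxicity labels are often imbalanced, so per-label precision, recall, and F1 can be more informative than overall accuracy. The following sketch is one assumed way to compute them from binary predictions; the `per_label_scores` helper is hypothetical and is not part of the repository.

```python
def per_label_scores(y_true, y_pred, labels):
    """Compute precision, recall, and F1 for each label.

    y_true and y_pred are lists of rows, each row a list of 0/1 values per label.
    """
    scores = {}
    for j, label in enumerate(labels):
        pairs = [(t[j], p[j]) for t, p in zip(y_true, y_pred)]
        tp = sum(1 for t, p in pairs if t == 1 and p == 1)
        fp = sum(1 for t, p in pairs if t == 0 and p == 1)
        fn = sum(1 for t, p in pairs if t == 1 and p == 0)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        scores[label] = {"precision": precision, "recall": recall, "f1": f1}
    return scores
```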
---
## Future Improvements
* Integrate pre-trained embeddings (e.g., GloVe, fastText).
* Explore Transformer-based models (BERT, RoBERTa).
* Deploy a production-ready API.
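
As a rough sketch of the first item, GloVe vectors ship as plain text with one token per line followed by its space-separated floats. The helpers below are hypothetical names introduced for illustration. They parse that format and build an embedding matrix aligned with a token-to-id vocabulary, leaving rows for unknown tokens at zero.

```python
def load_glove(lines):
    """Parse GloVe-style text lines ('word v1 v2 ...') into a dict of float lists."""
    vectors = {}
    for line in lines:
        parts = line.rstrip().split(" ")
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        vectors[parts[0]] = [float(x) for x in parts[1:]]
    return vectors


def build_embedding_matrix(vocab, vectors, dim):
    """Row i holds the vector for the token with id i; missing tokens stay all-zero."""
    size = max(vocab.values(), default=-1) + 1
    matrix = [[0.0] * dim for _ in range(size)]
    for word, idx in vocab.items():
        vec = vectors.get(word)
        if vec is not None and len(vec) == dim:
            matrix[idx] = vec
    return matrix
```

In practice the lines would come from an opened GloVe file, and the resulting matrix could be converted to a NumPy array to initialize the model's embedding layer.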
---
## 🧑‍💻 Author
Developed by **[Nyx1311](https://github.com/Nyx1311)**.