Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/vidhi1290/disaster-tweet-classification-using-lstm
Developed a deep learning model using LSTM networks to classify tweets as disaster-related or non-disaster-related, vital for emergency response. Explored advanced EDA techniques, visualized tweet data insights, and achieved a high F1 score of 0.74. Check out the code and results!
https://github.com/vidhi1290/disaster-tweet-classification-using-lstm
classification deep-learning disaster-tweet-prediction eda kaggle-competition lstm lstm-neural-networks nlp nlp-machine-learning visualizations
Last synced: 30 days ago
JSON representation
Developed a deep learning model using LSTM networks to classify tweets as disaster-related or non-disaster-related, vital for emergency response. Explored advanced EDA techniques, visualized tweet data insights, and achieved a high F1 score of 0.74. Check out the code and results!
- Host: GitHub
- URL: https://github.com/vidhi1290/disaster-tweet-classification-using-lstm
- Owner: Vidhi1290
- Created: 2023-11-01T14:22:06.000Z (about 1 year ago)
- Default Branch: main
- Last Pushed: 2023-11-02T10:22:56.000Z (about 1 year ago)
- Last Synced: 2024-01-28T22:37:48.742Z (11 months ago)
- Topics: classification, deep-learning, disaster-tweet-prediction, eda, kaggle-competition, lstm, lstm-neural-networks, nlp, nlp-machine-learning, visualizations
- Language: Jupyter Notebook
- Homepage:
- Size: 345 KB
- Stars: 1
- Watchers: 1
- Forks: 0
- Open Issues: 0
-
Metadata Files:
- Readme: README.md
Awesome Lists containing this project
README
# Disaster Tweet Classification using LSTM đđ
## Introduction âšī¸
In this project, we developed a deep learning model using Long Short-Term Memory (LSTM) networks to classify tweets into two categories: disaster-related and non-disaster-related. This classification is crucial for emergency response, public safety, and disaster management applications. We applied Natural Language Processing (NLP) techniques and deep learning to achieve this classification.## Data Preparation đ ī¸
We started by loading the datasets containing tweet information. The data underwent several preprocessing steps, including handling missing values, text cleaning, and combining text with keyword and location information. Text data was tokenized, padded, and handled class imbalance using Synthetic Minority Over-sampling Technique (SMOTE). The preprocessed data was split into training and validation sets for model training.## Model Architecture đ§
We built an LSTM neural network model for tweet classification. The model consisted of an embedding layer, a bidirectional LSTM layer, and a dense output layer with sigmoid activation. The model was compiled using the Adam optimizer and binary cross-entropy loss function.## Model Training and Evaluation đī¸ââī¸
The model was trained for 15 epochs with a batch size of 32. We visualized the accuracy and loss on both training and validation sets to monitor the model's performance over epochs. We evaluated the model's performance using the F1 score, a metric that considers both precision and recall. The F1 score on the validation data was approximately 0.74, indicating a good balance between precision and recall.## Advanced Exploratory Data Analysis (EDA) and Visualizations đ
We performed advanced EDA to gain insights into the tweet data. Word clouds were generated for disaster and non-disaster tweets, showcasing the most common words in each category. We also visualized the tweet length distribution for both categories and explored the most common keywords in disaster and non-disaster tweets.## Model Performance Visualization đ
We visualized the training and validation accuracy and loss over epochs to analyze the model's learning process. Additionally, we created a confusion matrix heatmap to visualize the model's performance in detail.## Conclusion đ
This project demonstrated the application of deep learning techniques for classifying disaster-related tweets. The developed LSTM model showed good performance in distinguishing between disaster and non-disaster tweets. The insights gained from advanced EDA further enhanced our understanding of the tweet data, contributing to the overall analysis.Feel free to explore the provided code and visualize the results! đ
Follow me on LinkedIn: [Vidhi Waghela](https://www.linkedin.com/in/vidhi-waghela-434663198/) \\
Kaggle: [Vidhi Kishor Waghela](https://www.kaggle.com/vidhikishorwaghela) \\
GitHub: [Vidhi1290](https://github.com/Vidhi1290)