https://github.com/shimaa83/twitter_disaster
twitter classification using classic ML models
cat-boast light-gm naive-bayes-classifier nlp random-forest tfidf word-cloud
Last synced: 11 months ago
- Host: GitHub
- URL: https://github.com/shimaa83/twitter_disaster
- Owner: shimaa83
- Created: 2024-06-01T18:54:24.000Z (about 2 years ago)
- Default Branch: main
- Last Pushed: 2024-06-01T18:57:36.000Z (about 2 years ago)
- Last Synced: 2025-01-20T04:34:44.786Z (over 1 year ago)
- Topics: cat-boast, light-gm, naive-bayes-classifier, nlp, random-forest, tfidf, word-cloud
- Language: Jupyter Notebook
- Homepage:
- Size: 576 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# The Summary
**Data Analysis:**
- Read the CSV file
- Find null values
- Remove null values
- Remove duplicate values
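The cleaning steps above can be sketched with pandas. This is a minimal illustration, not the notebook's actual code: the in-memory CSV and the column names `id`, `text`, and `target` are assumptions made for the example.

```python
import io

import pandas as pd

# Hypothetical stand-in for the tweet CSV; the column names are assumptions.
RAW_CSV = """id,text,target
1,Forest fire near the town,1
2,I love this song,0
3,,1
2,I love this song,0
4,Flood warning issued for the valley,1
"""


def load_and_clean(source):
    """Read a CSV, report nulls per column, then drop nulls and duplicates."""
    df = pd.read_csv(source)
    null_counts = df.isnull().sum()   # find null values
    df = df.dropna()                  # remove rows containing null values
    df = df.drop_duplicates()         # remove fully duplicated rows
    return df.reset_index(drop=True), null_counts


clean_df, null_counts = load_and_clean(io.StringIO(RAW_CSV))
print(null_counts)
print(clean_df)
```

Passing a file path instead of the `io.StringIO` buffer would read a CSV from disk in the same way.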
**Visualization:**
- Word cloud of the tweet text
- Histogram (histplot) of the distribution of target values
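Both plots are driven by simple counts: a word cloud scales each word by its frequency, and the histogram shows how many tweets fall into each target class. The sketch below computes only those counts with the standard library, on hypothetical sample data. The plotting itself (for example with a word-cloud library and a histogram function) is left out.

```python
from collections import Counter

# Hypothetical sample; each pair is (tweet text, target label).
samples = [
    ("fire in the forest", 1),
    ("forest fire spreading fast", 1),
    ("what a lovely day", 0),
]


def word_frequencies(texts):
    """Count how often each lowercase word appears across all texts."""
    counts = Counter()
    for text in texts:
        counts.update(text.lower().split())
    return counts


def target_distribution(labels):
    """Count how many samples carry each target label."""
    return Counter(labels)


freqs = word_frequencies(text for text, _ in samples)
dist = target_distribution(label for _, label in samples)
print(freqs.most_common(2))
print(dict(dist))
```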
**Preprocessing Steps:**
- Remove special characters
- Remove emojis
- Convert to lowercase
- Remove punctuation
- Tokenize the text into words
- Remove stop words
- Stem each word
- Finally, compute the TF-IDF vector for the words
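The preprocessing steps above can be sketched end to end with the standard library alone. Several pieces here are assumptions rather than the notebook's code: `STOP_WORDS` is a tiny illustrative list, `naive_stem` is a crude suffix stripper standing in for a real stemmer such as Porter, and the TF-IDF weighting shown (term frequency times `log(N / df)`) is one common variant among several.

```python
import math
import re
import string
from collections import Counter

# Assumption: a tiny illustrative stop-word list, not a complete one.
STOP_WORDS = {"a", "an", "the", "is", "in", "on", "of", "to", "and", "at"}


def naive_stem(word):
    """Stand-in for a real stemmer: strip a few common English suffixes."""
    for suffix in ("ing", "ed", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def preprocess(text):
    """Apply the cleaning steps in order and return a list of stemmed tokens."""
    # Remove special characters and emojis: here, anything outside printable
    # ASCII (this also drops accented letters).
    text = re.sub(r"[^\x20-\x7E]", " ", text)
    text = text.lower()
    # Remove punctuation.
    text = text.translate(str.maketrans("", "", string.punctuation))
    tokens = text.split()                                # tokenize on whitespace
    tokens = [t for t in tokens if t not in STOP_WORDS]  # remove stop words
    return [naive_stem(t) for t in tokens]               # stem


def tfidf(docs):
    """Return one {term: weight} dict per tokenized document."""
    n_docs = len(docs)
    # Document frequency: in how many documents each term occurs.
    doc_freq = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        vectors.append({
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return vectors


tweets = ["Flooding in the city!!! \U0001F30A", "The city is calm today"]
docs = [preprocess(t) for t in tweets]
vectors = tfidf(docs)
print(docs)
print(vectors)
```

Under this weighting, a term that occurs in every document gets a weight of zero, which is why many libraries apply a smoothed IDF instead.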
**Model Development:**
- We applied several machine learning algorithms, evaluated with cross-validation:
- Logistic regression, mean cross-validation score: 0.7935960591133006
- Random forest, mean cross-validation score: 0.7688013136288999
- Naive Bayes, mean cross-validation score: 0.6088669950738916
- ANN, accuracy: 0.7432698607444763
- LightGBM classifier, mean cross-validation score: 0.6252873563218391
- CatBoost regressor, mean cross-validation score: 0.24323716285103467
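A mean cross-validation score like those listed above could be obtained along the following lines with scikit-learn. This is a sketch under assumptions: the toy tweets and labels are invented, the fold count of 5 and the accuracy metric are guesses (the README states neither), and the output will not match the scores reported here.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline

# Hypothetical toy data: 1 = disaster tweet, 0 = not a disaster tweet.
texts = [
    "forest fire spreading near the town",
    "flood warning issued for the valley",
    "earthquake damages several buildings",
    "wildfire forces evacuation of homes",
    "storm floods streets and cuts power",
    "i love this new song",
    "what a lovely sunny day",
    "my exam was a total disaster lol",
    "this pizza is the bomb",
    "watching a movie with friends tonight",
]
labels = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]

# Keeping the vectorizer inside the pipeline means TF-IDF is fitted only on
# the training part of each fold, so no test-fold text leaks into training.
model = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))

# Assumption: 5 folds scored by accuracy.
scores = cross_val_score(model, texts, labels, cv=5, scoring="accuracy")
print("Mean cross-validation score:", scores.mean())
```

Swapping `LogisticRegression` for another scikit-learn compatible classifier is enough to compare models under the same folds.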
**Conclusion:**
The developed model can help automatically filter and prioritize tweets during disaster events. By accurately identifying real disaster tweets, it lets emergency response teams focus on critical information and respond more effectively.
- **Enhanced situational awareness:** Real-time monitoring and analysis of social media data can provide valuable insights into the scale, location, and nature of disasters. The model's ability to classify disaster-related tweets can contribute to a better understanding of the evolving situation on the ground.
- **Early warning systems:** By identifying early signals of disasters from social media data, authorities can initiate early warning systems and evacuation procedures, potentially saving lives and minimizing damage.
- **Resource allocation optimization:** Accurate classification of disaster-related tweets can inform resource allocation decisions, directing emergency services to the areas most in need of assistance.
- **Public engagement and communication:** Effective use of social media analysis can facilitate two-way communication between authorities and the public, enabling timely dissemination of information, instructions, and safety tips during disaster events.