https://github.com/debasish-dutta/nlp-disaster-prediction
This repo contains my NLP processing of tweets for an open Kaggle competition, classifying whether each tweet is about a real disaster or not.
- Host: GitHub
- URL: https://github.com/debasish-dutta/nlp-disaster-prediction
- Owner: debasish-dutta
- Created: 2020-07-18T03:19:06.000Z (almost 5 years ago)
- Default Branch: master
- Last Pushed: 2020-07-18T07:12:19.000Z (almost 5 years ago)
- Last Synced: 2025-01-19T07:45:18.895Z (4 months ago)
- Topics: kaggle-competition, nlp-machine-learning, sckit-learn
- Language: Jupyter Notebook
- Homepage:
- Size: 969 KB
- Stars: 1
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
# NLP-Disaster-prediction
# 1. Problem Statement
> ### This model aims to predict which tweets are about real disasters and which are not, using `natural language processing`.
# 2. Data
### This dataset was created by the company Figure Eight and originally shared on their ‘Data For Everyone’ [website](https://www.figure-eight.com/data-for-everyone/), but it is also readily available on [Kaggle](https://www.kaggle.com/c/nlp-getting-started/overview). The dataset has two CSV files: one with training data and the other with test data. The dataset contains 4 features.
# 3. Evaluation
> #### The evaluation metric for this competition is the F1 score.
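As a reminder of what the metric measures, here is a minimal sketch of F1 for the positive class (disaster tweets), computed as the harmonic mean of precision and recall. The function name `f1_score_binary` and the labels are made up for illustration; they are not taken from the notebook.

```python
def f1_score_binary(y_true, y_pred):
    """F1 for the positive class (label 1): harmonic mean of precision and recall."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    if tp == 0:
        # No correct positive predictions: define F1 as 0 to avoid dividing by zero.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Made-up labels, for illustration only: 1 = disaster tweet, 0 = not.
y_true = [1, 0, 1, 1, 0, 0]
y_pred = [1, 0, 0, 1, 1, 0]
score = f1_score_binary(y_true, y_pred)
print(round(score, 3))
```

In practice `sklearn.metrics.f1_score` computes the same quantity for binary labels.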
# 4. Features
- id: Used for submission
- keyword: Keyword of the tweet
- location: Location the tweet was sent from
- text: The Tweet itself
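To make the column layout concrete, the sketch below parses two made-up rows with Python's standard `csv` module. The rows are invented for illustration. The `target` column is the label that the Kaggle training file carries in addition to the four features, and `keyword` and `location` may be blank in the real data, so the `"unknown"` placeholder is an assumption, not the notebook's actual handling.

```python
import csv
import io

# Two made-up rows in the layout described above; the training file also
# has a `target` label column (1 = disaster, 0 = not).
sample_csv = io.StringIO(
    "id,keyword,location,text,target\n"
    "1,,,Forest fire spreading near the highway,1\n"
    "2,ablaze,London,This new album is ablaze,0\n"
)

rows = list(csv.DictReader(sample_csv))

# keyword and location may be blank, so fall back to an explicit placeholder
# (an assumed convention for this sketch).
for row in rows:
    row["keyword"] = row["keyword"] or "unknown"
    row["location"] = row["location"] or "unknown"

texts = [row["text"] for row in rows]
labels = [int(row["target"]) for row in rows]
print(rows[0]["keyword"], labels)
```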
# 5. Modelling
> #### I used a [Multinomial Naive Bayes model](https://scikit-learn.org/stable/modules/generated/sklearn.naive_bayes.MultinomialNB.html#sklearn.naive_bayes.MultinomialNB) for this project.
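A minimal sketch of how such a model can be trained with scikit-learn is shown below. It assumes bag-of-words counts from `CountVectorizer` as the input features, which may differ from the preprocessing in the notebook, and the tweets are made up for illustration.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Made-up tweets for illustration; 1 = disaster, 0 = not.
train_texts = [
    "Wildfire forces evacuation of the town",
    "Earthquake damages buildings downtown",
    "Flood waters rising after the storm",
    "I love this sunny afternoon",
    "My new phone is great",
    "Watching a movie with friends tonight",
]
train_labels = [1, 1, 1, 0, 0, 0]

# CountVectorizer turns each tweet into word counts (an assumed feature choice);
# MultinomialNB then models those counts per class.
model = make_pipeline(CountVectorizer(), MultinomialNB())
model.fit(train_texts, train_labels)

predictions = model.predict(
    ["Storm causes flood in the city", "Great movie tonight"]
)
print(predictions)
```

Wrapping both steps in one pipeline means the vocabulary learned from the training tweets is reused unchanged when predicting on the test file.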