https://github.com/abhinavsharma07/disaster-tweets-kaggle
My solution to Kaggle's Getting Started competition, "Natural Language Processing with Disaster Tweets". Uses GloVe + BiLSTM.
- Host: GitHub
- URL: https://github.com/abhinavsharma07/disaster-tweets-kaggle
- Owner: AbhinavSharma07
- License: mit
- Created: 2024-10-20T22:00:30.000Z (12 months ago)
- Default Branch: main
- Last Pushed: 2024-10-20T22:17:42.000Z (12 months ago)
- Last Synced: 2025-01-10T19:28:06.417Z (9 months ago)
- Language: Jupyter Notebook
- Size: 1.14 MB
- Stars: 1
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# Natural Language Processing with Disaster Tweets
This repo contains an approach I implemented for the Disaster Tweets competition on Kaggle. The challenge is a good starting point for data scientists who are new to Natural Language Processing, or to Kaggle in general. You can access the Kaggle competition [here](https://www.kaggle.com/c/nlp-getting-started).
## About the Competition
In this competition, you’re challenged to build a machine learning model that predicts which tweets are about real disasters and which ones aren’t. You’ll have access to a dataset of 10,000 tweets that were hand-classified.
## Data
Each sample in the train and test set has the following information:
- The `text` of a tweet
- A `keyword` from that tweet (although this may be blank!)
- The `location` the tweet was sent from (may also be blank)
### Files
- **train.csv** - the training set
- **test.csv** - the test set
- **sample_submission.csv** - a sample submission file in the correct format
### Columns
- `id` - a unique identifier for each tweet
- `text` - the text of the tweet
- `location` - the location the tweet was sent from (may be blank)
- `keyword` - a particular keyword from the tweet (may be blank)
- `target` - in train.csv only, this denotes whether a tweet is about a real disaster (1) or not (0)
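Because `keyword` and `location` may be blank, it helps to count the gaps and check the `target` balance before modelling. The following is a minimal sketch using only the Python standard library, not the notebook's actual code. The helper name `summarize_tweets` and the inline sample rows are hypothetical and made up for illustration; only the column names come from the list above.

```python
import csv
import io
from collections import Counter


def summarize_tweets(csv_file):
    """Count rows, blank optional fields, and target labels in a train.csv-style file."""
    reader = csv.DictReader(csv_file)
    rows = 0
    blanks = Counter()
    targets = Counter()
    for row in reader:
        rows += 1
        # `keyword` and `location` are documented as possibly blank.
        for column in ("keyword", "location"):
            if not (row.get(column) or "").strip():
                blanks[column] += 1
        # `target` exists only in train.csv, so skip it when absent.
        if row.get("target") not in (None, ""):
            targets[int(row["target"])] += 1
    return {"rows": rows, "blank": dict(blanks), "targets": dict(targets)}


# Tiny made-up sample using the documented column names (not real competition data).
SAMPLE = """id,keyword,location,text,target
1,,,Forest fire near the town,1
2,ablaze,London,This mixtape is ablaze,0
3,flood,,Streets are under water after the storm,1
"""

summary = summarize_tweets(io.StringIO(SAMPLE))
print(summary)
```

To run it on the real data, pass an open file instead of the sample, for example `open("train.csv", newline="", encoding="utf-8")`, assuming the file sits in the working directory.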