{"id":28712891,"url":"https://github.com/kade-one/disaster-tweets-classification","last_synced_at":"2026-04-15T09:31:25.052Z","repository":{"id":297792547,"uuid":"997756950","full_name":"kade-one/disaster-tweets-classification","owner":"kade-one","description":"This repository offers a complete machine learning pipeline for classifying tweets related to disasters. It includes data processing, model training, and an interactive dashboard for insights. 🐙📊","archived":false,"fork":false,"pushed_at":"2026-04-09T11:47:18.000Z","size":1015,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-04-09T13:26:50.348Z","etag":null,"topics":["basics","classification","cnn","data-science","deep-learning","disaster","fastai-nlp","flask-application","keras","machine-learning-pipelines","natural","nlp","pytorch","quick","start","tfx","tweets","ulmfit"],"latest_commit_sha":null,"homepage":null,"language":"Jupyter Notebook","has_issues":false,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/kade-one.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2025-06-07T05:41:25.000Z","updated_at":"2026-04-09T11:47:22.000Z","dependencies_parsed_at":"2025-12-15T08:01:19.283Z","dependency_job_id":null,"html_url":"https://github.com/kade-one/disaster-tweets-classification","commit_stats":null,"previous_names":["kade-one/disaster-tweets-classification"],"tags_count":1,"template":false,"template_full_name":null,"purl":"pk
g:github/kade-one/disaster-tweets-classification","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kade-one%2Fdisaster-tweets-classification","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kade-one%2Fdisaster-tweets-classification/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kade-one%2Fdisaster-tweets-classification/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kade-one%2Fdisaster-tweets-classification/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/kade-one","download_url":"https://codeload.github.com/kade-one/disaster-tweets-classification/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kade-one%2Fdisaster-tweets-classification/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":31834502,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-15T07:17:56.427Z","status":"ssl_error","status_checked_at":"2026-04-15T07:17:30.007Z","response_time":63,"last_error":"SSL_read: unexpected eof while 
reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["basics","classification","cnn","data-science","deep-learning","disaster","fastai-nlp","flask-application","keras","machine-learning-pipelines","natural","nlp","pytorch","quick","start","tfx","tweets","ulmfit"],"created_at":"2025-06-15T00:01:03.160Z","updated_at":"2026-04-15T09:31:25.044Z","avatar_url":"https://github.com/kade-one.png","language":"Jupyter Notebook","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Disaster Tweets Classification 🌍🌀\r\n\r\n![GitHub release](https://img.shields.io/badge/Latest_Release-v1.0.0-brightgreen)  \r\n[![GitHub Releases](https://img.shields.io/badge/Check_Releases-blue)](https://github.com/kade-one/disaster-tweets-classification/releases)\r\n\r\n## Table of Contents\r\n1. [Project Overview](#project-overview)\r\n2. [Technologies Used](#technologies-used)\r\n3. [Getting Started](#getting-started)\r\n   - [Prerequisites](#prerequisites)\r\n   - [Installation](#installation)\r\n4. [Data Sources](#data-sources)\r\n5. [Project Structure](#project-structure)\r\n6. [Analysis and Modeling](#analysis-and-modeling)\r\n   - [ETL Process](#etl-process)\r\n   - [Exploratory Data Analysis (EDA)](#exploratory-data-analysis-eda)\r\n   - [Deep Learning Models](#deep-learning-models)\r\n7. [Visualization](#visualization)\r\n8. [Usage](#usage)\r\n9. [Contributing](#contributing)\r\n10. [License](#license)\r\n11. 
[Contact](#contact)\r\n\r\n## Project Overview\r\n\r\nThe **Disaster Tweets Classification** project aims to classify tweets that are related to disasters. The project draws on data held in AWS RDS, with CSV files stored in AWS S3. The process includes an ETL (Extract, Transform, Load) pipeline, exploratory data analysis (EDA), and the implementation of deep learning models in Jupyter Notebooks. Visualizations and dashboards are created using Tableau to present the findings effectively.\r\n\r\nFor the latest updates, please check our [Releases](https://github.com/kade-one/disaster-tweets-classification/releases).\r\n\r\n## Technologies Used\r\n\r\n- **AWS RDS**: For relational database storage.\r\n- **AWS S3**: For storing CSV files.\r\n- **Python**: Primary programming language for data analysis and modeling.\r\n- **Jupyter Notebook**: For interactive coding and data exploration.\r\n- **Deep Learning**: Utilizing neural networks for classification tasks.\r\n- **Tableau**: For creating visualizations and dashboards.\r\n- **Transformers**: For advanced NLP tasks.\r\n\r\n## Getting Started\r\n\r\n### Prerequisites\r\n\r\nBefore you begin, ensure you have the following installed:\r\n\r\n- Python 3.6 or higher\r\n- Jupyter Notebook\r\n- AWS CLI configured with access to RDS and S3\r\n- Tableau Desktop or Tableau Prep Builder\r\n\r\n### Installation\r\n\r\n1. Clone the repository:\r\n   ```bash\r\n   git clone https://github.com/kade-one/disaster-tweets-classification.git\r\n   cd disaster-tweets-classification\r\n   ```\r\n\r\n2. Install the required Python packages:\r\n   ```bash\r\n   pip install -r requirements.txt\r\n   ```\r\n\r\n3. Download the necessary data files from the [Releases](https://github.com/kade-one/disaster-tweets-classification/releases) section and place them in the appropriate directories.\r\n\r\n## Data Sources\r\n\r\nThe project uses CSV files stored in AWS S3, alongside relational data in AWS RDS. These files contain tweets and their respective classifications. 
The data is sourced from Kaggle competitions, ensuring a rich dataset for analysis.\r\n\r\n## Project Structure\r\n\r\nThe repository is organized as follows:\r\n\r\n```\r\ndisaster-tweets-classification/\r\n│\r\n├── data/\r\n│   ├── raw/                 # Raw data files\r\n│   ├── processed/           # Processed data files\r\n│\r\n├── notebooks/               # Jupyter notebooks for analysis\r\n│   ├── etl_notebook.ipynb\r\n│   ├── eda_notebook.ipynb\r\n│   ├── model_notebook.ipynb\r\n│\r\n├── src/                    # Source code\r\n│   ├── etl.py              # ETL pipeline\r\n│   ├── eda.py              # EDA functions\r\n│   ├── model.py            # Deep learning models\r\n│\r\n├── visualizations/         # Tableau files\r\n│   ├── dashboard.twbx\r\n│   ├── prep_flow.tfl\r\n│\r\n├── requirements.txt        # Python dependencies\r\n└── README.md               # Project overview\r\n```\r\n\r\n## Analysis and Modeling\r\n\r\n### ETL Process\r\n\r\nThe ETL process involves extracting data from the RDS database, transforming it into a suitable format, and loading it into a new dataset for analysis. The `etl.py` script handles this process, ensuring that the data is clean and structured.\r\n\r\n### Exploratory Data Analysis (EDA)\r\n\r\nEDA is performed in `eda_notebook.ipynb`. This notebook contains visualizations and statistical summaries that help in understanding the data. Key insights include:\r\n\r\n- Distribution of tweet categories.\r\n- Frequency of words in disaster-related tweets.\r\n- Correlation between tweet length and classification.\r\n\r\n### Deep Learning Models\r\n\r\nThe modeling process is carried out in `model_notebook.ipynb`. 
Here, we implement various deep learning architectures, including:\r\n\r\n- **Convolutional Neural Networks (CNN)**: Effective for text classification.\r\n- **Recurrent Neural Networks (RNN)**: Useful for sequential data.\r\n- **Transformers**: State-of-the-art models for NLP tasks.\r\n\r\nThe models are evaluated based on accuracy, precision, and recall metrics.\r\n\r\n## Visualization\r\n\r\nVisualizations are created using Tableau, providing a clear and interactive dashboard for users. The `visualizations` folder contains Tableau files that can be opened in Tableau Desktop or Tableau Prep Builder. Key visualizations include:\r\n\r\n- Tweet distribution by category.\r\n- Word clouds for different disaster types.\r\n- Trends over time for tweet activity.\r\n\r\n## Usage\r\n\r\nTo run the analysis, open the Jupyter notebooks in the `notebooks` directory. Execute each cell step-by-step to perform the ETL process, EDA, and modeling. For visualizations, open the Tableau files in Tableau Desktop.\r\n\r\n## Contributing\r\n\r\nWe welcome contributions to improve the project. If you would like to contribute, please follow these steps:\r\n\r\n1. Fork the repository.\r\n2. Create a new branch (`git checkout -b feature-branch`).\r\n3. Make your changes and commit them (`git commit -m 'Add new feature'`).\r\n4. Push to the branch (`git push origin feature-branch`).\r\n5. Create a pull request.\r\n\r\n## License\r\n\r\nThis project is licensed under the MIT License. See the [LICENSE](LICENSE) file for details.\r\n\r\n## Contact\r\n\r\nFor questions or suggestions, feel free to reach out to the project maintainers. 
You can find their contact information in the repository.\r\n\r\nFor more updates and to download the latest files, please visit our [Releases](https://github.com/kade-one/disaster-tweets-classification/releases).","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fkade-one%2Fdisaster-tweets-classification","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fkade-one%2Fdisaster-tweets-classification","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fkade-one%2Fdisaster-tweets-classification/lists"}