https://github.com/rmkenv/climate-disaster-data-pipeline

Automated pipeline for synchronizing NOAA Billion-Dollar Disasters data via NWS and FEMA APIs.

climate-change data-engineering disaster-management fema-data noaa-api

# 🌪️ disaster-data-replicator

[![Python Version](https://img.shields.io/badge/python-3.9%2B-blue.svg)](https://www.python.org/)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
[![Data Source: NOAA](https://img.shields.io/badge/Data-NOAA_NWS-informational)](https://www.noaa.gov/)
[![Data Source: FEMA](https://img.shields.io/badge/Data-FEMA-orange)](https://www.fema.gov/)
[![PRs Welcome](https://img.shields.io/badge/PRs-welcome-brightgreen.svg)](http://makeapullrequest.com)

> **Bridging the gap between climate events and actionable insights through automated, high-fidelity data engineering.**

`disaster-data-replicator` is an ETL pipeline that synchronizes and normalizes NOAA "Billion-Dollar Disaster" data. It pulls from the National Weather Service (NWS) and FEMA APIs and merges the results into a single view of climate-driven economic impacts, so that researchers and developers can analyze disaster trends from one consistent dataset.

---

## 💡 Key Features

- **🔄 Multi-Source Synchronization:** Seamlessly merges NOAA Billion-Dollar Disaster datasets with real-time FEMA disaster declarations.
- **🛠 Automated ETL Pipeline:** Handles data extraction, schema normalization, and validation without manual intervention.
- **📉 Intelligent Rate Limiting:** Built-in backoff algorithms and request throttling to respect NWS and FEMA API constraints.
- **🧪 Data Integrity:** Uses Pydantic for strict schema enforcement, ensuring that inconsistent API responses don't break downstream analytics.
- **📂 Flexible Export:** Supports high-performance output formats including Parquet (for Big Data), CSV, and JSON.
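
The rate-limiting bullet above mentions backoff without showing it. The following is a minimal sketch of one common approach (exponential backoff with full jitter), not the project's actual implementation. The function name `call_with_backoff` and all of its parameters are hypothetical, and the `sleep` argument is injectable only so the behavior can be exercised without real waiting.

```python
import random
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def call_with_backoff(
    fetch: Callable[[], T],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    max_delay: float = 30.0,
    retry_on: Tuple[Type[Exception], ...] = (Exception,),
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fetch(), retrying failed calls with exponential backoff and jitter.

    Hypothetical helper for illustration; names and defaults are assumptions.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        try:
            return fetch()
        except retry_on:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error to the caller
            # The delay ceiling doubles each attempt (1s, 2s, 4s, ...) up to max_delay.
            ceiling = min(max_delay, base_delay * 2 ** attempt)
            # Full jitter: wait a random time in [0, ceiling] so that many
            # clients do not all retry at the same instant.
            sleep(random.uniform(0, ceiling))
    raise AssertionError("loop always returns or raises")
```

In practice `fetch` would wrap a single HTTP request, and `retry_on` would be narrowed to the transient errors worth retrying (timeouts, connection errors, HTTP 429 or 5xx responses) rather than every exception.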

---

## 🛠 Tech Stack

| Category | Tools |
| :--- | :--- |
| **Language** | ![Python](https://img.shields.io/badge/python-3670A0?style=for-the-badge&logo=python&logoColor=ffdd54) |
| **Data Handling** | ![Pandas](https://img.shields.io/badge/pandas-%23150458.svg?style=for-the-badge&logo=pandas&logoColor=white) ![Pydantic](https://img.shields.io/badge/Pydantic-E92064?style=for-the-badge&logo=pydantic&logoColor=white) |
| **APIs** | `noaa-api`, `fema-data-api`, `requests` |
| **DevOps** | ![GitHub Actions](https://img.shields.io/badge/github%20actions-%232671E5.svg?style=for-the-badge&logo=githubactions&logoColor=white) ![Docker](https://img.shields.io/badge/docker-%230db7ed.svg?style=for-the-badge&logo=docker&logoColor=white) |
| **Domain** | Climate Change, Disaster Management, Data Engineering |

---

## 🏁 Quick Start

### Prerequisites

- Python 3.9+
- API keys for NOAA/FEMA (if required for higher rate limits)

### Installation

1. **Clone the repository:**
```bash
git clone https://github.com/rmkenv/climate-disaster-data-pipeline.git
cd climate-disaster-data-pipeline
```

2. **Set up a virtual environment:**
```bash
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
```

3. **Install dependencies:**
```bash
pip install -r requirements.txt
```

### Usage

**Run the full replication pipeline:**
```bash
python main.py --start-year 2000 --output-format parquet
```
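
For readers wondering how the two flags in this command might be handled, here is a minimal `argparse` sketch. Only the flag names `--start-year` and `--output-format` and the three formats come from this README. The function names, the default year of 1980 (the first year covered by NOAA's Billion-Dollar Disasters record), and the default format are assumptions, not a description of the real `main.py`.

```python
import argparse
from typing import Optional, Sequence


def build_parser() -> argparse.ArgumentParser:
    """Build a parser for the two flags shown in the usage example."""
    parser = argparse.ArgumentParser(
        prog="main.py",
        description="Run the disaster data replication pipeline.",
    )
    parser.add_argument(
        "--start-year",
        type=int,
        default=1980,  # assumed default, not taken from the project
        help="First year of disaster data to replicate.",
    )
    parser.add_argument(
        "--output-format",
        choices=["parquet", "csv", "json"],  # formats listed under Key Features
        default="parquet",  # assumed default
        help="File format for the exported dataset.",
    )
    return parser


def parse_args(argv: Optional[Sequence[str]] = None) -> argparse.Namespace:
    """Parse argv (or sys.argv[1:] when argv is None)."""
    return build_parser().parse_args(argv)


if __name__ == "__main__":
    args = parse_args()
    print(f"start_year={args.start_year} output_format={args.output_format}")
```

Restricting `--output-format` with `choices` makes an unsupported format fail at startup with a usage message instead of after the data has been downloaded.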

**Sync specific agency data:**
```bash
# Sync only FEMA disaster declarations
python scripts/sync_fema.py --region "FL"
```
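
As a rough illustration of what a state-scoped FEMA sync could request, the sketch below builds a query URL for the public OpenFEMA `DisasterDeclarationsSummaries` dataset using its OData-style `$filter`, `$top`, and `$skip` parameters. The mapping from `--region "FL"` to a filter on the `state` field is an assumption about how `scripts/sync_fema.py` behaves, and `build_fema_url` is a hypothetical helper name.

```python
from urllib.parse import urlencode

# Public OpenFEMA endpoint for disaster declaration summaries.
OPENFEMA_DECLARATIONS_URL = (
    "https://www.fema.gov/api/open/v2/DisasterDeclarationsSummaries"
)


def build_fema_url(state: str, page_size: int = 1000, skip: int = 0) -> str:
    """Return a query URL for one page of declarations in a given state.

    Hypothetical helper: assumes the region flag is a two-letter state code.
    """
    code = state.strip().upper()
    if len(code) != 2 or not code.isalpha():
        raise ValueError(f"expected a two-letter state code, got {state!r}")
    params = {
        "$filter": f"state eq '{code}'",  # OData-style equality filter
        "$top": page_size,                # number of records per page
        "$skip": skip,                    # offset used to page through results
    }
    # urlencode percent-encodes "$", spaces, and quotes in the query string.
    return f"{OPENFEMA_DECLARATIONS_URL}?{urlencode(params)}"


if __name__ == "__main__":
    print(build_fema_url("FL"))
```

Paging would then consist of requesting the URL repeatedly with `skip` increased by `page_size` until a page comes back with fewer records than requested.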

---

## 📊 Project Architecture

```mermaid
graph LR
A[NOAA NWS API] --> E[Ingestion Engine]
B[FEMA API] --> E
E --> F{Normalization}
F --> G[Validation/Pydantic]
G --> H[(Local Storage / S3)]
H --> I[Analytics & Visualization]
```
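
The Normalization and Validation stages in the diagram can be sketched as follows. The project uses Pydantic for validation; to keep this example free of third-party dependencies it swaps in a standard-library `dataclass` with manual checks, so treat it as an outline of the flow rather than the real models. `DisasterRecord`, `normalize_fema`, and `validate_batch` are hypothetical names, and the raw field names (`disasterNumber`, `state`, `incidentBeginDate`) are modeled on the OpenFEMA declarations dataset.

```python
from dataclasses import dataclass
from datetime import date
from typing import Any, Callable, Dict, Iterable, List, Tuple


@dataclass(frozen=True)
class DisasterRecord:
    """Unified record shape shared by every source (illustrative only)."""

    source: str
    event_id: str
    state: str
    begin_date: date

    def __post_init__(self) -> None:
        # Stand-in for the schema checks a Pydantic model would perform.
        if self.source not in {"noaa", "fema"}:
            raise ValueError(f"unknown source: {self.source!r}")
        if len(self.state) != 2 or not self.state.isalpha():
            raise ValueError(f"invalid state code: {self.state!r}")


def normalize_fema(raw: Dict[str, Any]) -> DisasterRecord:
    """Map one raw FEMA-style payload onto the unified schema."""
    return DisasterRecord(
        source="fema",
        event_id=str(raw["disasterNumber"]),
        state=str(raw["state"]).upper(),
        # Keep only the YYYY-MM-DD part of an ISO 8601 timestamp.
        begin_date=date.fromisoformat(str(raw["incidentBeginDate"])[:10]),
    )


def validate_batch(
    raws: Iterable[Dict[str, Any]],
    normalize: Callable[[Dict[str, Any]], DisasterRecord],
) -> Tuple[List[DisasterRecord], List[str]]:
    """Normalize every record, collecting failures instead of aborting."""
    valid: List[DisasterRecord] = []
    errors: List[str] = []
    for index, raw in enumerate(raws):
        try:
            valid.append(normalize(raw))
        except (KeyError, TypeError, ValueError) as exc:
            errors.append(f"record {index}: {exc!r}")
    return valid, errors
```

Collecting errors per record, rather than raising on the first bad payload, is one way to keep a single inconsistent API response from halting the whole run; the rejected records can be logged and reviewed separately.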

---

## 🤝 How to Contribute

Contributions make the open-source community an amazing place to learn, inspire, and create.

1. **Fork** the Project
2. Create your **Feature Branch** (`git checkout -b feature/AmazingFeature`)
3. **Commit** your Changes (`git commit -m 'Add some AmazingFeature'`)
4. **Push** to the Branch (`git push origin feature/AmazingFeature`)
5. Open a **Pull Request**

---

## 📄 License

Distributed under the MIT License. See `LICENSE` for more information.

---

## 📧 Contact

**Your Name** - [@YourTwitter](https://twitter.com/YourHandle) - your.email@example.com

*Project Link: [https://github.com/rmkenv/climate-disaster-data-pipeline](https://github.com/rmkenv/climate-disaster-data-pipeline)*

---
*This project was developed to provide transparent access to climate-related economic data, supporting the global effort to understand and mitigate the impacts of climate change.*