{"id":23788648,"url":"https://github.com/alefrp/airbnb_dbt","last_synced_at":"2026-02-27T03:01:42.262Z","repository":{"id":270390383,"uuid":"910235355","full_name":"AlefRP/airbnb_dbt","owner":"AlefRP","description":"A dbt project for transforming and analyzing Airbnb data with staging models and advanced transformations.","archived":false,"fork":false,"pushed_at":"2025-02-02T20:52:57.000Z","size":110,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-02-21T12:23:36.864Z","etag":null,"topics":["dbt","snowflake","sql"],"latest_commit_sha":null,"homepage":"","language":null,"has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/AlefRP.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-12-30T18:49:19.000Z","updated_at":"2025-02-02T20:53:00.000Z","dependencies_parsed_at":"2025-01-07T15:31:59.404Z","dependency_job_id":null,"html_url":"https://github.com/AlefRP/airbnb_dbt","commit_stats":null,"previous_names":["alefrp/airbnb_dbt"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/AlefRP/airbnb_dbt","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/AlefRP%2Fairbnb_dbt","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/AlefRP%2Fairbnb_dbt/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/AlefRP%2Fairbnb_dbt/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/AlefRP%2Fairbnb_dbt/manifests","owner_url":"https://repos.ecosyste.ms/api/v
1/hosts/GitHub/owners/AlefRP","download_url":"https://codeload.github.com/AlefRP/airbnb_dbt/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/AlefRP%2Fairbnb_dbt/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":29883111,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-02-26T23:51:21.483Z","status":"online","status_checked_at":"2026-02-27T02:00:06.759Z","response_time":57,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["dbt","snowflake","sql"],"created_at":"2025-01-01T16:36:56.131Z","updated_at":"2026-02-27T03:01:42.247Z","avatar_url":"https://github.com/AlefRP.png","language":null,"funding_links":[],"categories":[],"sub_categories":[],"readme":"# 🏠 dbt Project: Airbnb Data Modeling with Snowflake\n\n![Data Flow: CSV → Snowflake → dbt](assets/dbt_airbnb.svg)\n\nWelcome to this **dbt** project, which demonstrates how to model **Airbnb** data in a **Data Warehouse** environment using **Snowflake**. 
\n\nThis setup is inspired by:\n\n- The [**jaffle_shop**](https://github.com/dbt-labs/jaffle_shop) example project, which outlines a simple pattern for staging, transforming, and analyzing data.\n- The Udemy course [**Complete dbt (Data Build Tool) Bootcamp: Zero to Hero**](https://www.udemy.com/course/complete-dbt-data-build-tool-bootcamp-zero-to-hero-learn-dbt/), created by the **dbt Learn Team**.\n\nThe project follows best practices for:\n\n1. **Data Loading**: Ingesting CSV data into Snowflake.\n2. **Staging**: Creating **staging** models to clean and prepare raw data.\n3. **Transforming**: Building **intermediate** and **mart** layers to enrich and aggregate data.\n4. **Analyzing**: Leveraging **dbt** to easily manage models, run tests, and generate documentation.\n\n---\n\n## 📖 Overview\n\nThe goal of this project is to:\n\n1. Ingest Airbnb data from a CSV file into Snowflake.\n2. Transform raw data into analytics-friendly models using dbt.\n3. Provide dimensional models (dimensions, facts) that can be easily queried and analyzed.\n\nKey components include:\n\n- **Raw Data**: Sourced from CSV files (e.g., `input_data.csv`).\n- **Snowflake**: Serves as the data warehouse to store both the raw data and transformed models.\n- **dbt (Data Build Tool)**: Handles data transformations, testing, and documentation within Snowflake.\n\nThis README outlines how everything is structured, following a simplified approach similar to **jaffle_shop**.\n\n---\n\n## 🏗 Project Structure\n\nThe project is organized into several types of dbt models:\n\n- **Source Models (`src_*.sql`)**  \n  These models define your data sources (CSV files loaded into Snowflake). They provide a clear reference for all your raw data tables.\n\n- **Staging Models (`stg_*.sql`)**  \n  Inspired by **jaffle_shop**, these models clean and unify data from the source layer. 
They typically handle tasks like:\n  - Renaming or casting fields.\n  - Filtering out invalid records.\n  - Standardizing date formats and keys.\n\n- **Dimension Models (`dim_*.sql`)**  \n  These are transformations of your staged data into dimensions, holding descriptive attributes. For example, `dim_hosts_cleansed.sql` represents cleansed and standardized information about Airbnb hosts.\n\n- **Fact Models (`fct_*.sql`)**  \n  These models aggregate data into tables optimized for analytics. Fact models often contain metrics or measures (e.g., a reviews fact table that stores review counts, ratings, etc.).\n\n- **Intermediate/Join Models (`dim_listings_w_hosts.sql`)**  \n  When you need to combine data from multiple dimensions (e.g., listings with their corresponding hosts), these models serve as an intermediate step, making further analysis simpler.\n\n---\n\n## 📂 File Summary\n\nBelow is a quick guide to the primary SQL models:\n\n- `src_hosts.sql` — Defines the source of host data (raw form).\n- `src_listings.sql` — Defines the source of listing data (raw form).\n- `src_reviews.sql` — Defines the source of review data (raw form).\n- `dim_hosts_cleansed.sql` — Cleans and transforms host data.\n- `dim_listings_cleansed.sql` — Cleans and transforms listing data.\n- `dim_listings_w_hosts.sql` — Joins listings and hosts for broader context.\n- `fct_reviews.sql` — Provides an aggregated fact table of reviews.\n\n---\n\n## ⚙️ Data Pipeline\n\n1. **Ingest CSV Data into Snowflake**  \n   - Upload or load your `input_data.csv` into a Snowflake table (e.g., `RAW_AIRBNB_DATA`).\n   - Create external stages or use Snowflake’s load methods (e.g., `COPY INTO`) to bring the data into your environment.\n\n2. 
**Build dbt Models**  \n   - Source models (`src_*.sql`) reference the loaded raw data in Snowflake.\n   - Staging models (`stg_*.sql`) transform and clean this data according to best practices from **jaffle_shop**.\n   - Dimension and fact models (`dim_*.sql`, `fct_*.sql`) organize the data into analysis-ready tables.\n\n3. **Analyze and Visualize**  \n   - Once your models are materialized in Snowflake, use BI or data visualization tools (e.g., Looker, Tableau, or Mode) to analyze your Airbnb metrics and KPIs.\n\n---\n\n## 🚀 Getting Started\n\n### Prerequisites\n\n1. **Snowflake** account with the necessary warehouse and database.\n2. **dbt** installed (CLI or Cloud version).\n3. A configured `profiles.yml` to connect dbt to your Snowflake environment.\n\n### Quickstart\n\n1. **Clone** this repository and navigate into the project directory:\n\n   ```bash\n   git clone https://github.com/AlefRP/airbnb_dbt.git\n   cd airbnb_dbt\n   ```\n\n2. **Install** dbt with the Snowflake adapter:\n\n   ```bash\n   pip install dbt-snowflake\n   ```\n\n3. **Run** dbt commands:\n\n   ```bash\n   dbt deps\n   dbt seed\n   dbt run\n   dbt test\n   dbt docs generate\n   ```\n\n## 🔨 Customization \u0026 Inspiration\n\n- **Adapt to Your Data**: While this project focuses on Airbnb datasets, you can customize the staging, dimension, and fact models to fit any CSV-based data.\n- **Inspired by jaffle_shop**: Check out [**jaffle_shop**](https://github.com/dbt-labs/jaffle_shop) to see how the dbt-labs team organizes source, staging, and modeling layers. 
It’s a great starting point for any new analytics project.\n- **Modular \u0026 Reusable**: Because each stage is decoupled, you can easily swap out or update specific models without affecting the rest of the pipeline.\n\n## 📄 Additional Resources\n\n- [dbt Documentation](https://docs.getdbt.com/)\n- [Snowflake SQL Reference](https://docs.snowflake.com/en/sql-reference/)\n- [jaffle_shop on GitHub](https://github.com/dbt-labs/jaffle_shop)\n\n## ⚖️ License\n\nThis project is licensed under the [MIT License](LICENSE). Feel free to use and adapt any part of this project in your own data workflows.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Falefrp%2Fairbnb_dbt","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Falefrp%2Fairbnb_dbt","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Falefrp%2Fairbnb_dbt/lists"}