https://github.com/humairarizwan/uber-ride-dataengineering-analysis
This project creates a pipeline to process data and performs data analytics on Uber data.
bigquery dataanalysis dataengineering gcp-project googlestorage looker-studio
- Host: GitHub
- URL: https://github.com/humairarizwan/uber-ride-dataengineering-analysis
- Owner: HumairaRizwan
- Created: 2025-01-16T20:54:37.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2025-01-19T21:04:38.000Z (over 1 year ago)
- Last Synced: 2025-01-19T21:31:40.596Z (over 1 year ago)
- Topics: bigquery, dataanalysis, dataengineering, gcp-project, googlestorage, looker-studio
- Language: Python
- Homepage:
- Size: 4.93 MB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# Uber Data Engineering and Analysis Project
## Introduction
This project builds a data pipeline that processes Uber trip data and runs analytics on the result. It uses Google Cloud Storage, Python, a Compute Engine instance, the Mage data pipeline tool, BigQuery, and Looker Studio.
## Dataset Used
TLC Trip Record Data
- Yellow and green taxi trip records include fields capturing pick-up and drop-off dates/times, pick-up and drop-off locations, trip distances, itemized fares, rate types, payment types, and driver-reported passenger counts.
- Website - https://www.nyc.gov/site/tlc/about/tlc-trip-record-data.page
- Data Dictionary - https://www.nyc.gov/assets/tlc/downloads/pdf/data_dictionary_trip_records_yellow.pdf
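
To make the field list above concrete, here is a minimal sketch of reading trip records and deriving a trip duration from the pick-up and drop-off timestamps. It is an illustration only, not the project's actual transformation code: the two sample rows are made up, the column names are assumed to follow the yellow taxi data dictionary, and `parse_trip` and `load_trips` are hypothetical helper names.

```python
import csv
import io
from datetime import datetime

# Made-up sample rows; column names are assumed to match the
# yellow taxi data dictionary linked above.
SAMPLE_CSV = """VendorID,tpep_pickup_datetime,tpep_dropoff_datetime,passenger_count,trip_distance,payment_type,fare_amount
1,2016-03-01 00:00:00,2016-03-01 00:07:55,1,2.50,1,9.0
2,2016-03-01 00:10:00,2016-03-01 00:40:30,2,10.20,2,31.5
"""

TIMESTAMP_FORMAT = "%Y-%m-%d %H:%M:%S"


def parse_trip(row):
    """Convert one raw CSV row (all strings) into typed values."""
    pickup = datetime.strptime(row["tpep_pickup_datetime"], TIMESTAMP_FORMAT)
    dropoff = datetime.strptime(row["tpep_dropoff_datetime"], TIMESTAMP_FORMAT)
    return {
        "pickup": pickup,
        "dropoff": dropoff,
        "passenger_count": int(row["passenger_count"]),
        "trip_distance": float(row["trip_distance"]),
        "payment_type": int(row["payment_type"]),
        "fare_amount": float(row["fare_amount"]),
        # Derived field: elapsed time between pick-up and drop-off.
        "duration_minutes": (dropoff - pickup).total_seconds() / 60,
    }


def load_trips(text):
    """Parse CSV text into a list of typed trip dictionaries."""
    return [parse_trip(row) for row in csv.DictReader(io.StringIO(text))]


trips = load_trips(SAMPLE_CSV)
for trip in trips:
    print(trip["pickup"], round(trip["duration_minutes"], 2), trip["fare_amount"])
```

The sketch uses only the standard library so it runs anywhere. For the full dataset, a dataframe library would likely be a better fit.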
## Technology Used
- Programming Language - Python
- Google Cloud Platform (GCP)
  1. Google Storage
  2. Compute Instance
  3. BigQuery
  4. Looker Studio