https://github.com/lkethridge/data_collection_and_storage_-sql-
A Data Collection and Storage project using SQL from TripleTen
- Host: GitHub
- URL: https://github.com/lkethridge/data_collection_and_storage_-sql-
- Owner: LKEthridge
- License: cc0-1.0
- Created: 2025-01-20T02:22:00.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2025-01-20T05:28:06.000Z (over 1 year ago)
- Last Synced: 2025-02-02T05:28:58.661Z (over 1 year ago)
- Topics: advanced-sql, aggregate-functions, api, converting-data-types, data-collection, data-manipulation, data-relationships, data-slice, data-storage, databases, er-diagrams, get-requests, grouping-data, json, processing-data, regular-expressions, sorting-data, subqueries-and-joins, web-mining, window-functions-in-sql
- Language: Jupyter Notebook
- Homepage:
- Size: 162 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# Data_Collection_and_Storage_-SQL-
## *This was a Data Collection and Storage (SQL) project for TripleTen. 👩🏽‍💻*
After parsing Chicago's weather records for November 2017 from a web page, I used SQL to analyze taxi trip data: I calculated ride counts for taxi companies across several time periods, identified the most popular companies, and aggregated the smaller ones into an "Other" category. I also retrieved the identifiers for two key neighborhoods (Loop and O'Hare), categorized weather conditions using CASE logic, and joined Saturday rides from the Loop to O'Hare with the parsed weather records.
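The SQL steps described above can be sketched roughly as follows. This is a minimal illustration run against an in-memory SQLite database, not the project's actual queries (those are linked under Installation & Usage). The table names (`cabs`, `trips`, `weather_records`), the column names, the sample rows, the single "popular" company, and the `'Good'`/`'Bad'` labels are all assumptions made for the example, and the Loop-to-O'Hare route filter is left out for brevity.

```python
import sqlite3

# Hypothetical schema and sample rows, invented for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE cabs (cab_id INTEGER PRIMARY KEY, company_name TEXT);
CREATE TABLE trips (trip_id INTEGER PRIMARY KEY, cab_id INTEGER,
                    start_ts TEXT, duration_seconds INTEGER);
CREATE TABLE weather_records (ts TEXT, description TEXT);

INSERT INTO cabs VALUES (1, 'Flash Cab'), (2, 'Small Co A'), (3, 'Small Co B');
INSERT INTO trips VALUES
    (1, 1, '2017-11-04 10:00:00', 1500),
    (2, 1, '2017-11-04 11:00:00', 1600),
    (3, 1, '2017-11-11 10:00:00', 2400),
    (4, 2, '2017-11-04 10:00:00', 1550),
    (5, 3, '2017-11-11 10:00:00', 2500);
INSERT INTO weather_records VALUES
    ('2017-11-04 10:00:00', 'broken clouds'),
    ('2017-11-04 11:00:00', 'clear sky'),
    ('2017-11-11 10:00:00', 'light rain');
""")

# Ride counts per company, folding every company outside the "popular"
# list into a single 'Other' bucket.
company_counts = conn.execute("""
    SELECT CASE WHEN c.company_name IN ('Flash Cab')
                THEN c.company_name ELSE 'Other' END AS company,
           COUNT(t.trip_id) AS trips_amount
    FROM cabs AS c
    JOIN trips AS t ON t.cab_id = c.cab_id
    GROUP BY company
    ORDER BY trips_amount DESC
""").fetchall()

# Saturday rides joined to the weather record for the same hour, with the
# free-text description reduced to two categories via CASE.
# strftime('%w', ...) returns '6' for Saturday in SQLite.
saturday_rows = conn.execute("""
    SELECT t.start_ts,
           CASE WHEN w.description LIKE '%rain%'
                  OR w.description LIKE '%storm%'
                THEN 'Bad' ELSE 'Good' END AS weather_conditions,
           t.duration_seconds
    FROM trips AS t
    JOIN weather_records AS w ON w.ts = t.start_ts
    WHERE strftime('%w', t.start_ts) = '6'
    ORDER BY t.trip_id
""").fetchall()

print(company_counts)
print(saturday_rows)
```

Joining on the exact hourly timestamp works here only because the sample trips start on the hour; real trip timestamps would likely need to be truncated to the hour first.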
Using Python, this project analyzed passenger preferences, competitor performance, and weather impacts to give the fictional rideshare company Zuber actionable insights for entering the competitive Chicago market. Key findings include high demand in downtown neighborhoods, the need to differentiate from strong competitors such as Flash Cab, and the effect of rain on ride duration for key routes. These insights would help Zuber optimize resource allocation, marketing, and pricing strategies to launch successfully in Chicago.
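One way to check a weather effect on ride duration is to compare the two groups of rides with a two-sample test. The sketch below is only an assumption about how such a comparison could look: it computes Welch's t statistic with the standard library rather than the `stats` module the notebook uses, and the duration values are made-up placeholders, not results from the project.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Welch's t statistic for two independent samples with unequal variances."""
    var_a, var_b = variance(sample_a), variance(sample_b)
    standard_error = sqrt(var_a / len(sample_a) + var_b / len(sample_b))
    return (mean(sample_a) - mean(sample_b)) / standard_error


# Placeholder ride durations in seconds (illustrative values only).
good_weather = [1500, 1600, 1550, 1450]
bad_weather = [2400, 2500, 2300, 2600]

t_stat = welch_t(bad_weather, good_weather)
print(f"mean (good weather): {mean(good_weather):.0f} s")
print(f"mean (bad weather):  {mean(bad_weather):.0f} s")
print(f"Welch t statistic:   {t_stat:.2f}")
```

A large positive statistic would suggest that rides in bad weather take longer on average; turning it into a p-value requires the t distribution, which a statistics library provides.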
## Skills Highlighted
💿 Data Collection and Storage
👩🏽‍💻 Advanced SQL and Working with Databases
🔪 Data Slices
➕ Aggregate Functions
⌨️ Grouping, Sorting, Processing, Converting, and Joining Data
❓ Subqueries
🪟 Window Functions
⛏️ APIs, JSON, GET Requests, and Web Mining
📆 Operators and Functions for Working with Dates
## Installation & Usage
* This project uses pandas, NumPy, pyplot (from Matplotlib), and stats. It requires Python 3.11.
* You can see the original weather data here: https://practicum-content.s3.us-west-1.amazonaws.com/data-analyst-eng/moved_chicago_weather_2017.html
* Please see the SQL code I used here: https://popsql.com/queries/-OH18RhF8cTt3uEtYNTV/sql?access_token=7d3fa95f4b0e37e5ab40a8811f513438
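Parsing a weather page like the one linked above could look roughly like this. It is a sketch that uses only the standard library and a small inline HTML sample instead of fetching the real page. The table layout, the column headers, and the sample values are assumptions for illustration, and `TableParser` is a hypothetical helper, not code from the notebook.

```python
from html.parser import HTMLParser


class TableParser(HTMLParser):
    """Collects the text of every table row as a list of cell strings."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row currently being read
        self._cell = None  # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


# Inline stand-in for the downloaded page (assumed structure, made-up values).
SAMPLE_HTML = """
<table>
  <tr><th>Date and time</th><th>Temperature</th><th>Description</th></tr>
  <tr><td>2017-11-04 10:00:00</td><td>276.15</td><td>broken clouds</td></tr>
  <tr><td>2017-11-11 10:00:00</td><td>275.40</td><td>light rain</td></tr>
</table>
"""

parser = TableParser()
parser.feed(SAMPLE_HTML)

# First row is the header; the rest become one dict per weather record.
header, *records = parser.rows
weather = [dict(zip(header, row)) for row in records]
print(weather)
```

The resulting list of dictionaries can be passed straight to `pandas.DataFrame` for further cleaning.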