https://github.com/mjshubham21/ny_yellow_taxi_python_da_project

A data analysis project on New York Yellow Taxi trip data (February 2025) using Python and its analytics libraries: NumPy, Matplotlib, pandas, and Seaborn.
data-analysis jupyter-notebook matplotlib numpy pandas python seaborn


# 🚕 New York Yellow Taxi Trip Data Analysis

**By Shubham Pawar**

---

## 📊 NYC Yellow Taxi Trips Trends & Insights

---

## 📌 Overview

This project analyzes New York City Yellow Taxi trip data for February 2025, uncovering trends in trip distances, fare amounts, passenger counts, and payment methods. The study uses Python-based data analysis techniques to identify operational insights, hourly and day-of-week demand patterns, and actionable recommendations for improving taxi service efficiency.

---

## Project Link

[NY Yellow Taxi Python Data Analysis Project](https://github.com/mjshubham21/NY_yellow_taxi_python_DA_project/blob/main/yellow_taxi_project.ipynb)

---

## 🛠️ Tools Used

Analysis was performed using:

- **Python** (Jupyter Notebook)
- **Pandas** — Data cleaning and manipulation
- **NumPy** — Numerical computations
- **Matplotlib** — Data visualization
- **Seaborn** — Statistical data visualization

---

## 📁 Dataset

All analysis was based on the following dataset:

- 🔗 [NYC Yellow Taxi Trip Records (Parquet)](https://d37ci6vzurychx.cloudfront.net/trip-data/yellow_tripdata_2025-02.parquet)
- 🔗 [NYC Yellow Taxi Trip Records All Datasets](https://www.nyc.gov/site/tlc/about/tlc-trip-record-data.page)

> **Note:** The dataset was originally provided in `.parquet` format by the NYC Taxi & Limousine Commission.
> For this project, it was **converted to `.csv` format online** and then saved locally as an Excel file for ease of analysis.

- **File used in project:** `Yellow-tripdata-2025-02.xlsx` (includes trip distance, fare amount, payment type, passenger count, and other trip attributes).
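
Reading the Parquet file directly with pandas is one way to skip the conversion step. The sketch below is not the notebook's actual loading code: `load_trips` is a hypothetical helper, and the datetime column names are taken from the TLC data dictionary and assumed to apply to this file.

```python
from pathlib import Path

import pandas as pd

# Assumption: these column names follow the TLC yellow taxi data dictionary.
DATETIME_COLUMNS = ["tpep_pickup_datetime", "tpep_dropoff_datetime"]


def load_trips(path):
    """Load trip records from a .parquet, .csv, or .xlsx file (hypothetical helper)."""
    suffix = Path(path).suffix.lower()
    if suffix == ".parquet":
        df = pd.read_parquet(path)  # requires pyarrow or fastparquet
    elif suffix == ".csv":
        df = pd.read_csv(path)
    elif suffix == ".xlsx":
        df = pd.read_excel(path)  # requires openpyxl
    else:
        raise ValueError(f"Unsupported file type: {suffix!r}")

    # CSV and Excel readers may return timestamps as text, so parse them explicitly.
    for col in DATETIME_COLUMNS:
        if col in df.columns:
            df[col] = pd.to_datetime(df[col], errors="coerce")
    return df


# Example call (path is illustrative):
# trips = load_trips("yellow_tripdata_2025-02.parquet")
```

An `.xlsx` worksheet holds at most 1,048,576 rows, so a month with more trips than that cannot fit in a single sheet. It may be worth comparing the row count of the Excel file against the Parquet source before trusting totals.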

---

## 🎯 Key Performance Indicators (KPIs)

- 🚖 **Total Number of Trips**
- 📏 **Average Trip Distance**
- 💵 **Average Fare Amount**
- 🧍 **Passenger Count Distribution**
- 💳 **Payment Method Preferences**
- ⏱️ **Trip Duration Patterns**
- 📅 **Peak Hour & Day Analysis**
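
As a rough illustration, the KPIs above could be computed with pandas along these lines. This is a minimal sketch rather than the notebook's code: `compute_kpis` is a hypothetical function, the column names are assumed from the TLC data dictionary, and the sample rows are made up purely to show the call.

```python
import pandas as pd


def compute_kpis(df: pd.DataFrame) -> dict:
    """Compute the listed KPIs from a trips DataFrame (hypothetical helper)."""
    pickup = pd.to_datetime(df["tpep_pickup_datetime"])
    dropoff = pd.to_datetime(df["tpep_dropoff_datetime"])
    duration_min = (dropoff - pickup).dt.total_seconds() / 60

    return {
        "total_trips": len(df),
        "avg_trip_distance_miles": df["trip_distance"].mean(),
        "avg_fare_amount": df["fare_amount"].mean(),
        "passenger_count_distribution": df["passenger_count"].value_counts().sort_index(),
        "payment_type_share": df["payment_type"].value_counts(normalize=True),
        "median_duration_min": duration_min.median(),
        "trips_by_hour": pickup.dt.hour.value_counts().sort_index(),
        "trips_by_weekday": pickup.dt.day_name().value_counts(),
    }


# Made-up sample rows, only to demonstrate the function.
sample = pd.DataFrame({
    "tpep_pickup_datetime": ["2025-02-03 08:00:00", "2025-02-03 08:30:00",
                             "2025-02-08 23:15:00", "2025-02-04 18:05:00"],
    "tpep_dropoff_datetime": ["2025-02-03 08:10:00", "2025-02-03 08:50:00",
                              "2025-02-08 23:45:00", "2025-02-04 18:25:00"],
    "passenger_count": [1, 2, 1, 1],
    "trip_distance": [1.0, 2.0, 9.0, 4.0],
    "fare_amount": [10.0, 14.0, 40.0, 20.0],
    "payment_type": [1, 1, 2, 1],
})
kpis = compute_kpis(sample)
print(kpis["total_trips"], kpis["avg_trip_distance_miles"], kpis["avg_fare_amount"])
```

The median is used for duration because a handful of very long or mis-recorded trips can distort the mean; swapping in `.mean()` is a one-word change if the average is preferred.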

---

## 📈 Key Insights

- **Dominance of Short Trips:** The majority of Yellow Taxi trips are under 3 miles, indicating a strong preference for short-distance commutes in NYC.
- **Fare Distribution:** Fares cluster between $10 and $20, with outliers for airport runs and longer routes.
- **Passenger Count Patterns:** Most trips carry 1–2 passengers, reflecting solo and small-group travel habits.
- **Payment Method Trends:** Card payments dominate, with cash usage declining, hinting at a shift toward digital transactions.
- **Peak Hours:** Demand is strongest during the morning and evening rush hours, with weekends showing more late-night trips.
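
The insights above can be checked as simple shares of trips. The sketch below is one possible way to do that, not the notebook's method: `summarize_insights` and its thresholds are hypothetical, the payment code labels are assumed from the TLC data dictionary, and the sample rows are invented for demonstration.

```python
import pandas as pd

# Assumption: payment_type codes as described in the TLC data dictionary.
PAYMENT_LABELS = {
    1: "Credit card", 2: "Cash", 3: "No charge",
    4: "Dispute", 5: "Unknown", 6: "Voided trip",
}


def summarize_insights(df, short_trip_miles=3.0, fare_band=(10.0, 20.0)):
    """Return the share of trips matching each headline insight (hypothetical helper)."""
    # Drop rows with non-positive distance or fare, which usually indicate
    # refunds or recording errors rather than real trips.
    clean = df[(df["trip_distance"] > 0) & (df["fare_amount"] > 0)]
    low, high = fare_band
    payment_share = (
        clean["payment_type"].map(PAYMENT_LABELS).fillna("Other")
        .value_counts(normalize=True)
    )
    return {
        "rows_kept": len(clean),
        "share_short_trips": (clean["trip_distance"] < short_trip_miles).mean(),
        "share_fares_in_band": clean["fare_amount"].between(low, high).mean(),
        "share_1_2_passengers": clean["passenger_count"].isin([1, 2]).mean(),
        "payment_share": payment_share,
    }


# Invented rows; the last one has a negative fare and is filtered out.
sample = pd.DataFrame({
    "trip_distance": [1.0, 2.5, 0.8, 12.0, 2.0],
    "fare_amount": [9.0, 15.0, 12.0, 60.0, -5.0],
    "passenger_count": [1, 2, 1, 3, 1],
    "payment_type": [1, 1, 2, 1, 4],
})
insights = summarize_insights(sample)
print(insights["share_short_trips"], insights["share_fares_in_band"])
```

Reporting shares on the cleaned rows keeps refunds and zero-distance records from skewing the percentages; the exact filter is a judgment call and can be tightened or relaxed.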

---

## 🔍 Additional Insights

- **Operational Efficiency:** Peak-hour congestion may extend trip duration and affect fare efficiency.
- **Digital Adoption:** Declining cash transactions suggest drivers should be equipped with multiple cashless payment methods.
- **Revenue Optimization:** Promotions during off-peak hours could help balance driver earnings.
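
To see whether peak-hour congestion shows up in the data, trip duration, speed, and fare per minute can be compared by pickup hour. The following is a sketch under assumptions: `hourly_efficiency` is a hypothetical helper, "fare per minute" is just one possible proxy for fare efficiency, and the column names are assumed from the TLC data dictionary.

```python
import pandas as pd


def hourly_efficiency(df: pd.DataFrame) -> pd.DataFrame:
    """Median duration, speed, and fare per minute by day type and pickup hour."""
    pickup = pd.to_datetime(df["tpep_pickup_datetime"])
    dropoff = pd.to_datetime(df["tpep_dropoff_datetime"])

    trips = pd.DataFrame({
        # dayofweek: Monday=0 ... Sunday=6, so 5 and 6 are the weekend.
        "day_type": pickup.dt.dayofweek.map(lambda d: "weekend" if d >= 5 else "weekday"),
        "hour": pickup.dt.hour,
        "duration_min": (dropoff - pickup).dt.total_seconds() / 60,
        "trip_distance": df["trip_distance"],
        "fare_amount": df["fare_amount"],
    })
    # Zero or negative durations cannot yield a meaningful speed.
    trips = trips[trips["duration_min"] > 0].copy()
    trips["speed_mph"] = trips["trip_distance"] / (trips["duration_min"] / 60)
    trips["fare_per_min"] = trips["fare_amount"] / trips["duration_min"]

    return trips.groupby(["day_type", "hour"]).agg(
        trips=("duration_min", "size"),
        median_duration_min=("duration_min", "median"),
        median_speed_mph=("speed_mph", "median"),
        median_fare_per_min=("fare_per_min", "median"),
    )


# Invented rows; the last trip has zero duration and is dropped.
sample = pd.DataFrame({
    "tpep_pickup_datetime": ["2025-02-03 08:00:00", "2025-02-03 08:10:00",
                             "2025-02-08 23:00:00", "2025-02-08 23:30:00"],
    "tpep_dropoff_datetime": ["2025-02-03 08:30:00", "2025-02-03 08:20:00",
                              "2025-02-08 23:12:00", "2025-02-08 23:30:00"],
    "trip_distance": [3.0, 2.0, 4.0, 1.0],
    "fare_amount": [15.0, 10.0, 18.0, 5.0],
})
profile = hourly_efficiency(sample)
print(profile)
```

Lower median speed in rush-hour rows than in off-peak rows would be consistent with the congestion effect described above, though it does not by itself establish the cause.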

---

## 📚 Data Story

The NYC Yellow Taxi market reflects a high reliance on short-distance urban travel, concentrated during commuter peaks. Digital payments are becoming the norm, signaling an industry shift toward cashless operations. Hourly and day-of-week variations in demand indicate opportunities for optimizing operations and marketing strategies.

**Recommendations:**

1. Encourage drivers to focus on high-demand zones during peak commute times.
2. Offer fare discounts or promotions in low-demand periods to boost ridership.
3. Expand and promote digital payment acceptance to align with rider preferences.
4. Monitor trip duration trends to improve route optimization and reduce idle time.
5. Use demand patterns to guide fleet deployment and coverage areas.

---