Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/vidushibhadana/eda-on-nyc-taxi-data
About Conducting an Exploratory Data Analysis (EDA) on New York City taxi data and visualizing it through countplots, distribution plots (displot), and histograms using Python and its libraries.
data data-visualization jupyter-notebook matplotlib numpy pandas python seaborn
- Host: GitHub
- URL: https://github.com/vidushibhadana/eda-on-nyc-taxi-data
- Owner: vidushibhadana
- Created: 2024-10-16T05:39:22.000Z (4 months ago)
- Default Branch: main
- Last Pushed: 2024-10-16T05:59:06.000Z (4 months ago)
- Last Synced: 2025-02-02T06:09:37.659Z (19 days ago)
- Topics: data, data-visualization, jupyter-notebook, matplotlib, numpy, pandas, python, seaborn
- Language: HTML
- Homepage:
- Size: 490 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# EDA-on-NYC-Taxi-Data
* Conducting an Exploratory Data Analysis (EDA) on New York City taxi data and visualizing it through countplots, distribution plots (displot), and histograms using Python and its libraries.
#### To view the project (Google Colab) -> https://colab.research.google.com/drive/19kFrs_tu3P3W4sWwYIQZpRkkkB5Vd7t2
## About the project
Exploratory Data Analysis (EDA) is a critical step in data analysis, helping us understand the structure, patterns, and characteristics of a dataset. In this project, we perform EDA on New York City taxi data using Python and various libraries, covering both descriptive and diagnostic analysis. We visualize the data using countplots, distribution plots (displot), and histograms to gain insights.
We have used the NYC taxi dataset from Kaggle for this project -> https://www.kaggle.com/competitions/nyc-taxi-trip-duration
## Libraries Used:
* NumPy
* Pandas
* Matplotlib
* Seaborn
## VISUALIZATIONS
### Plot 1. Distribution of Passenger Count
### Plot 2. Distribution of each day in a week
### Plot 3. Trip Duration Distribution
### Plot 4. Distribution of pickup timezone
### Plot 5. Distribution of active hours - pickup and drop off
### Plot 6. Distribution of pickup and dropoff months
### Plot 7. Distribution of total pickup hour
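The time-based plots listed above (day of week, active hours, months, pickup hour) all depend on features derived from the pickup and dropoff timestamps. The sketch below shows one way such features could be derived using only the standard library. It is not the notebook's code. The timestamps are illustrative, the `time_of_day` helper and its cut-off hours are hypothetical, and reading "timezone" in Plot 4 as a time-of-day bucket is an assumption the README does not confirm.

```python
from collections import Counter
from datetime import datetime

# Illustrative (pickup, dropoff) pairs in "YYYY-MM-DD HH:MM:SS" form.
rows = [
    ("2016-03-14 17:24:55", "2016-03-14 17:32:30"),
    ("2016-06-12 00:43:35", "2016-06-12 00:54:38"),
    ("2016-01-19 11:35:24", "2016-01-19 12:10:48"),
    ("2016-04-06 19:32:31", "2016-04-06 19:39:40"),
]

FMT = "%Y-%m-%d %H:%M:%S"
DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]  # index matches datetime.weekday()


def time_of_day(hour: int) -> str:
    """Hypothetical bucketing of an hour (0-23); cut-offs are illustrative."""
    if 6 <= hour < 12:
        return "Morning"
    if 12 <= hour < 16:
        return "Afternoon"
    if 16 <= hour < 22:
        return "Evening"
    return "Late night"


features = []
for pickup_raw, dropoff_raw in rows:
    pickup = datetime.strptime(pickup_raw, FMT)
    dropoff = datetime.strptime(dropoff_raw, FMT)
    features.append({
        "pickup_day": DAY_NAMES[pickup.weekday()],
        "pickup_hour": pickup.hour,
        "dropoff_hour": dropoff.hour,
        "pickup_month": pickup.month,
        "dropoff_month": dropoff.month,
        "pickup_bucket": time_of_day(pickup.hour),
        "duration_s": int((dropoff - pickup).total_seconds()),
    })

# Frequency tables like these are what a countplot draws as bars.
day_counts = Counter(f["pickup_day"] for f in features)
bucket_counts = Counter(f["pickup_bucket"] for f in features)
```

With pandas, the same fields would typically come from the `.dt` accessor on a datetime column (for example `.dt.hour`, `.dt.month`, and `.dt.day_name()`), and the resulting columns can be passed directly to a countplot.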