Data Visualization with Python
https://github.com/syed-bakhtawar-fahim/datavisualization


Host: GitHub
URL: https://github.com/syed-bakhtawar-fahim/datavisualization
Owner: Syed-Bakhtawar-Fahim
Created: 2022-10-08T21:58:54.000Z (about 2 years ago)
Default Branch: master
Last Pushed: 2023-08-23T15:39:45.000Z (about 1 year ago)
Last Synced: 2023-08-23T18:16:51.448Z (about 1 year ago)
Topics: big-data-analytics, data, data-analysis, data-analysis-python, data-science, data-visualization, pandas, pyspark
Language: Jupyter Notebook
Homepage:
Size: 2.39 MB
Stars: 0
Watchers: 1
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: Readme.md


# Projects: `Exploratory Data Analysis (EDA)`
Exploratory data analysis (EDA) is used by data scientists to analyze and investigate data sets and summarize their main characteristics, often employing data visualization methods. It helps determine how best to manipulate data sources to get the answers you need, making it easier for data scientists to discover patterns, spot anomalies, test a hypothesis, or check assumptions.

## `US Accident Exploratory Data Analysis`
This project analyzes and visualizes the key columns of the United States accident dataset to identify patterns and trends in the data.

#### Summary and Conclusion
Insights:
- No data from New York
- When cities are ranked by accident count, the number of accidents per city decreases roughly exponentially
- Fewer than 5% of cities have more than 1000 accidents per year
- Over 1200 cities have reported just one accident (this needs further investigation)
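
The city-level figures above come down to a few pandas aggregations. The following is a minimal sketch on a toy frame; the `City` column name, the helper `share_of_cities_above`, and the sample rows are assumptions for illustration, not values from the dataset or code from the notebook.

```python
import pandas as pd

# Toy stand-in for the accident records (one row per accident).
# The "City" column name is an assumption for this sketch.
accidents = pd.DataFrame(
    {"City": ["Miami", "Miami", "Miami", "Houston", "Houston", "Austin"]}
)

# Accidents per city, sorted from most to fewest.
accidents_per_city = accidents["City"].value_counts()


def share_of_cities_above(counts: pd.Series, threshold: int) -> float:
    """Return the fraction of cities whose accident count exceeds `threshold`."""
    return float((counts > threshold).mean())


# Cities that reported exactly one accident (candidates for a closer look).
single_accident_cities = accidents_per_city[accidents_per_city == 1].index.tolist()

print(accidents_per_city)
print(share_of_cities_above(accidents_per_city, 2))
print(single_accident_cities)
```

On the full dataset, a call such as `share_of_cities_above(accidents_per_city, 1000)` would give the kind of fraction quoted above, depending on how the yearly counts are derived.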

## `Car Exploratory Data Analysis`
This project analyzes and visualizes the key columns of the car dataset to extract insights.

#### Summary and Conclusion
Insights:
- Null Values Handling
- Vehicle Makes Exploration
- Regional Origin Focus
- Weight-Based Refinement
- Fuel Efficiency Augmentation
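
The README names these steps without showing them, so the following is only one plausible reading of each step in pandas. The column names (`Make`, `Origin`, `Weight`, `MPG_City`), the weight cutoff, and the MPG offset are all assumptions made for this sketch.

```python
import pandas as pd

# Toy rows; column names and values are assumed for illustration only.
cars = pd.DataFrame(
    {
        "Make": ["Acura", "Audi", "Audi", "Ford"],
        "Origin": ["Asia", "Europe", "Europe", "USA"],
        "Weight": [3500.0, 4200.0, None, 3900.0],
        "MPG_City": [17, 20, 18, 15],
    }
)

# 1. Null values handling: fill missing numeric values with the column mean.
cars["Weight"] = cars["Weight"].fillna(cars["Weight"].mean())

# 2. Vehicle makes exploration: number of rows per make.
make_counts = cars["Make"].value_counts()

# 3. Regional origin focus: keep only selected regions.
asia_europe = cars[cars["Origin"].isin(["Asia", "Europe"])]

# 4. Weight-based refinement: keep rows at or below a cutoff (4000 is arbitrary here).
light_cars = cars[cars["Weight"] <= 4000]

# 5. Fuel efficiency augmentation: add a fixed offset to city MPG (3 is arbitrary here).
cars["MPG_City_adjusted"] = cars["MPG_City"] + 3

print(make_counts)
print(asia_europe[["Make", "Origin"]])
print(light_cars[["Make", "Weight"]])
```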

## `Netflix Exploratory Data Analysis`
The Netflix dataset contains information about the TV shows and movies available on Netflix as of 2021. It was collected from Flixable, a third-party Netflix search engine, and is freely available on Kaggle.

#### Summary and Conclusion
Insights:
- Data Preprocessing and Cleaning
- Data Filtering
- Find meaningful insights about the movies, directors, and shows
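
As a rough illustration of the cleaning, filtering, and insight steps listed above, here is a small pandas sketch. The column names (`type`, `title`, `director`, `release_year`) and the sample rows are assumptions for this example and are not taken from the notebook.

```python
import pandas as pd

# Toy rows; the last row duplicates the first to demonstrate cleaning.
titles = pd.DataFrame(
    {
        "type": ["Movie", "Movie", "TV Show", "Movie", "Movie"],
        "title": ["A", "B", "C", "D", "A"],
        "director": ["Jane Doe", None, None, "Jane Doe", "Jane Doe"],
        "release_year": [2019, 2020, 2021, 2021, 2019],
    }
)

# Preprocessing and cleaning: drop exact duplicate rows, label missing directors.
clean = titles.drop_duplicates().copy()
clean["director"] = clean["director"].fillna("Unknown")

# Filtering: keep only movies released in or after 2020.
recent_movies = clean[(clean["type"] == "Movie") & (clean["release_year"] >= 2020)]

# Insights: titles per type, and the most frequent known director.
titles_per_type = clean["type"].value_counts()
known_directors = clean.loc[clean["director"] != "Unknown", "director"]
top_director = known_directors.value_counts().idxmax()

print(recent_movies["title"].tolist())
print(titles_per_type)
print(top_director)
```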

## `Weather Exploratory Data Analysis`
The weather dataset is a time series with hourly information about the weather conditions at a particular location. It records temperature, dew point temperature, relative humidity, wind speed, visibility, pressure, and weather conditions.

#### Summary and Conclusion
Insights:
- Data Preprocessing and Cleaning
- Data Filtering
- Find meaningful insights about the weather
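
For hourly data like this, a typical first pass is to parse the timestamps, filter by condition, and aggregate to daily values. The sketch below assumes a timestamp column named `Date/Time` and columns named `Temperature`, `Wind Speed`, and `Conditions`; these names and the sample readings are illustrative, not taken from the notebook.

```python
import pandas as pd

# Toy hourly readings; column names and values are assumptions for this sketch.
weather = pd.DataFrame(
    {
        "Date/Time": [
            "2012-01-01 00:00",
            "2012-01-01 01:00",
            "2012-01-02 00:00",
            "2012-01-02 01:00",
        ],
        "Temperature": [-1.8, -1.8, 1.0, 3.0],
        "Wind Speed": [4, 4, 20, 24],
        "Conditions": ["Fog", "Fog", "Clear", "Snow"],
    }
)

# Preprocessing: parse timestamps and use them as a sorted index.
weather["Date/Time"] = pd.to_datetime(weather["Date/Time"])
weather = weather.set_index("Date/Time").sort_index()

# Filtering: hours with a given condition.
foggy_hours = weather[weather["Conditions"] == "Fog"]

# Insight: daily mean temperature computed from the hourly readings.
daily_mean_temp = weather["Temperature"].resample("D").mean()

# Insight: how often each condition occurs.
condition_counts = weather["Conditions"].value_counts()

print(foggy_hours)
print(daily_mean_temp)
print(condition_counts)
```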

## PySpark: `Google Play Store Exploratory Data Analysis`
This project uses a Google Play Store dataset containing information about apps, including installs, ratings, versions, and other details, and analyzes it with PySpark.

#### `Agenda`
- Top 10 apps by number of reviews
- Top 10 most-installed apps and the distribution of app types (Free/Paid)
- Category-wise distribution of installed apps
- Top paid apps
- Top-rated paid apps

## Conclusion:
Across these projects, cleaning, filtering, and visualizing the data surfaced patterns in each dataset that can support informed decisions.