Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/aniruddhakhedkar/eda_for_chinese_automotive_company_teclov_chinese
Exploratory_Data_Analysis_Python_Project_1
datavisualization duplicate-detection imputation-methods numpy outlier-removal pandas seaborn statistical-analysis
Exploratory_Data_Analysis_Python_Project_1
- Host: GitHub
- URL: https://github.com/aniruddhakhedkar/eda_for_chinese_automotive_company_teclov_chinese
- Owner: Aniruddhakhedkar
- Created: 2024-08-14T14:21:39.000Z (3 months ago)
- Default Branch: main
- Last Pushed: 2024-09-16T05:34:02.000Z (2 months ago)
- Last Synced: 2024-09-16T06:50:45.386Z (2 months ago)
- Topics: datavisualization, duplicate-detection, imputation-methods, numpy, outlier-removal, pandas, seaborn, statistical-analysis
- Language: Jupyter Notebook
- Homepage:
- Size: 1.89 MB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# Exploratory_Data_Analysis_for_Chinese_Automotive_Company_Teclov_Chinese
## Problem Description
1) A Chinese automobile company, Teclov_chinese, aspires to enter the US market by setting up a manufacturing unit there and producing cars locally to compete with its US and European counterparts.
2) They have contracted an automobile consulting company to understand the factors on which the pricing of cars depends. Specifically, they want to understand the factors affecting the pricing of cars in the American market, since those may be very different from the factors in the Chinese market.

## Objectives of the Analysis
1) To determine which variables are significant in predicting the price of a car
2) To understand how these variables describe the price of a car

## EDA Methodology Employed
1) Cleaning the data, and assigning the proper data type to the variables
2) Descriptive statistics
3) Removal of duplicates
4) Handling missing values (Imputation)
5) Outlier removal
6) Understanding the collinearity between independent variables
7) Determining the correlation between the independent variables and the dependent variable
8) Storing the cleaned data in a new Excel file and preparing the EDA report

## Python Libraries Used
1) NumPy
2) Pandas
3) Matplotlib
4) Seaborn
5) SciPy
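
The steps in the EDA methodology above could be sketched with pandas roughly as follows. This is an illustrative sketch under stated assumptions, not the notebook's actual code: the column names (`horsepower`, `enginesize`, `fueltype`, `price`), the toy data, the helper names `clean_car_data` and `correlation_summary`, the choice of median/mode imputation, and the 1.5 × IQR outlier rule are all hypothetical choices made for the example.

```python
import numpy as np
import pandas as pd


def clean_car_data(df: pd.DataFrame, target: str = "price") -> pd.DataFrame:
    """Hypothetical helper covering steps 1, 3, 4 and 5 of the methodology."""
    df = df.copy()

    # Step 1: assign a proper numeric type to the target column;
    # entries that cannot be parsed become NaN.
    df[target] = pd.to_numeric(df[target], errors="coerce")

    # Step 3: remove exact duplicate rows.
    df = df.drop_duplicates().reset_index(drop=True)

    # Step 4 (assumed strategy): impute numeric columns with the median
    # and non-numeric columns with the most frequent value.
    for col in df.columns:
        if not df[col].isna().any():
            continue
        if pd.api.types.is_numeric_dtype(df[col]):
            df[col] = df[col].fillna(df[col].median())
        else:
            modes = df[col].mode()
            if not modes.empty:
                df[col] = df[col].fillna(modes.iloc[0])

    # Step 5 (assumed rule): drop rows whose target value falls outside
    # the 1.5 * IQR fences.
    q1, q3 = df[target].quantile([0.25, 0.75])
    iqr = q3 - q1
    keep = df[target].between(q1 - 1.5 * iqr, q3 + 1.5 * iqr)
    return df[keep].reset_index(drop=True)


def correlation_summary(df: pd.DataFrame, target: str = "price"):
    """Hypothetical helper covering steps 6 and 7 of the methodology."""
    corr = df.select_dtypes(include="number").corr()
    # Step 6: pairwise correlation among the independent variables only.
    among_predictors = corr.drop(index=target, columns=target)
    # Step 7: correlation of each independent variable with the target,
    # ordered by absolute strength.
    with_target = corr[target].drop(target).sort_values(
        key=lambda s: s.abs(), ascending=False
    )
    return among_predictors, with_target


# Toy data invented for illustration; it is not taken from the project.
raw = pd.DataFrame({
    "horsepower": [100, 100, 110, np.nan, 150, 160, 120, 500],
    "enginesize": [110, 110, 120, 130, 160, 170, 125, 400],
    "fueltype": ["gas", "gas", None, "diesel", "gas", "gas", "gas", "gas"],
    "price": ["10000", "10000", "11000", "12500",
              "16000", "17000", "12000", "90000"],
})

clean = clean_car_data(raw)
print(clean.describe())  # Step 2: descriptive statistics
among_predictors, with_target = correlation_summary(clean)
print(among_predictors)
print(with_target)
```

For step 8, the cleaned frame could then be written out with `clean.to_excel("cleaned_cars.xlsx", index=False)`; the file name is a placeholder, and pandas needs an Excel writer such as `openpyxl` installed for this call to work.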