https://github.com/surayasumona/ford_used_car_analysis

Exploratory Data Analysis with Python
data-science data-visualization matplotlib numpy pandas python seaborn

# Data Analysis and Data Visualization of Ford Used Cars
This dataset contains information about used Ford cars. The columns are described below:

**Target variable:**
* **Price:** selling price of the cars

**Features:**
* **model:** model name of the Ford car
* **year:** year the car was made
* **transmission:** type of transmission, the component that adapts the output of the internal combustion engine to the drive wheels
* **mileage:** total distance the car has already been driven
* **fuelType:** type of fuel the car uses
* **mpg:** miles the car can travel per gallon of fuel
* **engineSize:** volume of fuel and air that can be pushed through the car's cylinders

### Goal of this project:
#### Learn data visualization and predict the resale price of used cars using a machine learning algorithm
**Exploratory Data Analysis**:
* Read the data into a pandas DataFrame
* Check the data types and missing values
* Check the basic statistics of the numerical features
* Find the percentage of each unique value in the categorical variables, then reset the index, rename the columns, and round the results
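
The steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual notebook: the rows are made up, `price` is assumed to be the name of the target column, and `percent_of_cars` is a hypothetical column name chosen for the example. In practice the DataFrame would come from `pd.read_csv` on the dataset file.

```python
import pandas as pd

# Hypothetical sample rows that follow the column descriptions above;
# the real data would be loaded with pd.read_csv(...) instead.
df = pd.DataFrame({
    "model": ["Fiesta", "Focus", "Fiesta", "Kuga", "Fiesta"],
    "year": [2017, 2018, 2019, 2016, 2018],
    "price": [12000, 14000, 15500, 13500, 11000],  # assumed column name
    "transmission": ["Manual", "Manual", "Automatic", "Manual", "Manual"],
    "mileage": [15944, 9083, 5000, 30000, 22000],
    "fuelType": ["Petrol", "Petrol", "Petrol", "Diesel", "Petrol"],
    "mpg": [57.7, 57.7, 48.7, 54.3, 65.7],
    "engineSize": [1.0, 1.0, 1.0, 2.0, 1.1],
})

# Data types and number of missing values per column
print(df.dtypes)
missing = df.isnull().sum()
print(missing)

# Basic statistics of the numerical features
print(df.describe())

# Percentage of each unique value in one categorical column:
# normalized counts -> percent -> round -> reset the index and rename
model_pct = (
    df["model"]
    .value_counts(normalize=True)
    .mul(100)
    .round(2)
    .rename_axis("model")
    .reset_index(name="percent_of_cars")
)
print(model_pct)
```

The same `value_counts` chain can be repeated for `transmission` and `fuelType`.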

**Exploring the data using different data visualization plots**:
* Barplot
* Scatterplot
* Trendline or Regression plot
* Histogram
* Distribution plot
* ECDF (Empirical Cumulative Distribution Function)
* Boxplot
* Violinplot
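
Of the plots listed above, the ECDF is the one that usually has to be computed by hand before plotting. A minimal sketch, assuming NumPy and a made-up list of prices; the helper name `ecdf` is hypothetical:

```python
import numpy as np

def ecdf(values):
    """Return the sorted values x and, for each x, the fraction of
    observations that are less than or equal to it."""
    x = np.sort(np.asarray(values, dtype=float))
    y = np.arange(1, x.size + 1) / x.size
    return x, y

# Hypothetical prices, for illustration only
prices = [12000, 14000, 15500, 13500, 11000]
x, y = ecdf(prices)
print(x)
print(y)
```

The resulting `x` and `y` can be passed to `matplotlib.pyplot.plot(x, y, marker=".", linestyle="none")` to draw the curve.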

**EDA using GroupBy/Pivot_Table and Barplot, based on features such as model, transmission, and fuelType**:
* What are the top 5 selling car models in the dataset?
* What's the average selling price of the top 5 selling car models?
* What are the total sales of the top 5 selling car models?
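
One way to answer these three questions with a single `groupby` is sketched below. It assumes that "top selling" means the models with the most rows in the dataset and that the price column is named `price`; the rows and the output column names (`cars_sold`, `average_price`, `total_sales`) are made up for illustration.

```python
import pandas as pd

# Hypothetical rows; the column name "price" is an assumption
df = pd.DataFrame({
    "model": ["Fiesta", "Focus", "Fiesta", "Kuga", "Fiesta", "Focus"],
    "price": [12000, 14000, 15500, 13500, 11000, 16000],
})

# Count, mean, and sum of price per model, keeping the 5 most frequent models
summary = (
    df.groupby("model")["price"]
    .agg(cars_sold="count", average_price="mean", total_sales="sum")
    .sort_values("cars_sold", ascending=False)
    .head(5)
)
print(summary)
```

A similar table can be built with `df.pivot_table(index="model", values="price", aggfunc=["count", "mean", "sum"])`, and any column of `summary` can then be drawn as a bar plot.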

# Machine Learning Algorithms
**Supervised Learning: Linear Regression and Regression accuracy metrics**:
* Understanding the equation of a straight line
* Feature coefficient (slope, gradient, m)
* Bias coefficient (y-intercept, c)
* Loss function (also called cost function, objective function, or error function)
* Mean Absolute Error (MAE)
* Mean Absolute Percentage Error (MAPE)
* Mean Squared Error (MSE)
* Root Mean Squared Error (RMSE)
* R-squared or coefficient of determination
* Prediction result evaluation
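
The concepts above can be tied together in a short sketch: fit the straight line y = m·x + c to one feature, then compute each metric from the prediction errors. This uses NumPy's `polyfit` for the least-squares fit, which is not necessarily the library the project itself uses, and the mileage and price values are made up for illustration.

```python
import numpy as np

# Hypothetical data: mileage (in thousands of miles) and selling price
x = np.array([10, 20, 30, 40, 50], dtype=float)
y = np.array([15000, 14100, 12900, 12100, 10900], dtype=float)

# Fit y = m*x + c by ordinary least squares
# m is the feature coefficient (slope), c is the bias coefficient (intercept)
m, c = np.polyfit(x, y, deg=1)
y_pred = m * x + c

# Regression metrics computed from the prediction errors
errors = y - y_pred
mae = np.mean(np.abs(errors))              # Mean Absolute Error
mape = np.mean(np.abs(errors / y)) * 100   # MAPE in percent; undefined if any y is 0
mse = np.mean(errors ** 2)                 # Mean Squared Error
rmse = np.sqrt(mse)                        # Root Mean Squared Error
r2 = 1 - np.sum(errors ** 2) / np.sum((y - y.mean()) ** 2)  # R-squared

print(f"m={m:.2f}, c={c:.2f}")
print(f"MAE={mae:.2f}, MAPE={mape:.2f}%, MSE={mse:.2f}, RMSE={rmse:.2f}, R2={r2:.4f}")
```

MAE and RMSE are expressed in the same units as the price, MAPE is a percentage, and R-squared is unitless, so the choice of metric depends on how the errors should be communicated.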

### Dataset reference: https://www.kaggle.com/aishwaryamuthukumar/cars-dataset-audi-bmw-ford-hyundai-skoda-vw