https://github.com/muthukumar0908/cardekho_used_car_price_prediction

The project aims to build a machine learning model that lets users find current valuations for used cars.
data-analysis data-visualization datacleaning eda machine-learning python streamlit

# CarDekho_Used_Car_Price_Prediction
Technologies: Data Cleaning, Exploratory Data Analysis (EDA), Visualization, and Machine Learning

Domain: Automobile

Problem Statement:

The primary objective of this project is to create a data science solution that accurately predicts used car prices by analyzing a diverse dataset covering car model, number of owners, age, mileage, fuel type, kilometers driven, features, and location. The aim is to build a machine learning model that lets users find current valuations for used cars.

Data Understanding

The dataset contains multiple Excel files, one per city. The columns in each file give an overview of each car, its details, specifications, and available features.

Data Collected From: https://www.cardekho.com/usedCars

Dataset Link: https://drive.google.com/drive/folders/16U7OH7URsCW0rf91cwyDqEgd9UoeZAJh

Feature Description Link: https://docs.google.com/document/d/1hxW7IvCX5806H0IsG2Zg9WnVIpr2ZPueB4AElMTokGs/edit

YouTube: https://youtu.be/KdhGAjJhpTo

Approach:

- Import data from all Excel files.
- Examine the structure of each dataset component (New Car Detail, New Car Overview, etc.).
- Check for missing values, outliers, data types, and other summary statistics.
- Data Preprocessing:
  - Handle Missing Values: Impute or remove missing values appropriately.
  - Feature Engineering: Extract relevant information from features like age, mileage, and others.
  - Encode categorical variables using suitable techniques.
  - Normalization/Scaling: Scale numerical features to bring them to a comparable range.
- Exploratory Data Analysis: Create visualizations to understand the distribution of the target variable (used car prices) and the relationships between features.
- Choose regression models suitable for predicting continuous values.
- Model Evaluation: Evaluate the models with metrics suited to regression.
- Fine-tune Hyperparameters: Optimize model hyperparameters to improve performance.
- Feature Importance: Analyze feature importance to understand which features contribute most to the predictions.
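
For the evaluation step, the README does not name specific metrics. Mean absolute error (MAE), root mean squared error (RMSE), and the coefficient of determination (R²) are common choices for regression, so the sketch below assumes those three. The helper `regression_metrics` and the sample prices are hypothetical and serve only to show how the numbers are computed.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return MAE, RMSE and R^2 for two equal-length sequences of numbers."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")

    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]

    mae = sum(abs(e) for e in errors) / n
    ss_res = sum(e * e for e in errors)   # residual sum of squares
    rmse = math.sqrt(ss_res / n)

    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all true values are equal")
    r2 = 1.0 - ss_res / ss_tot

    return {"mae": mae, "rmse": rmse, "r2": r2}


# Made-up prices (in lakh rupees), purely for illustration.
actual = [3.0, 5.0, 7.0, 9.0]
predicted = [2.5, 5.5, 7.0, 8.0]

metrics = regression_metrics(actual, predicted)
print(metrics)  # mae = 0.5, rmse ≈ 0.612, r2 ≈ 0.925
```

MAE and RMSE are in the same unit as the price, which makes them easy to explain to users, while R² gives a unit-free measure of how much of the price variation the model accounts for.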