https://github.com/yuvrajsaraogi/car-price-prediction-with-machine-learning
The price of a car depends on many factors, including the reputation of the brand, the car's features, its horsepower, and its mileage. Car price prediction is a common applied problem in machine learning. If you want to learn how to train a car price prediction model, this project is for you.
- Host: GitHub
- URL: https://github.com/yuvrajsaraogi/car-price-prediction-with-machine-learning
- Owner: yuvrajsaraogi
- Created: 2025-03-15T08:03:19.000Z (3 months ago)
- Default Branch: main
- Last Pushed: 2025-03-15T08:11:19.000Z (3 months ago)
- Last Synced: 2025-03-15T09:22:15.700Z (3 months ago)
- Topics: car-price-prediction-with-machine-learning, data, data-science, deep-learning, deep-neural-networks, engineer, github, learning, machine-learning, mini-project, natural-language-processing, prediction, predictive-modeling, project, python3, sql
- Language: Jupyter Notebook
- Homepage:
- Size: 164 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# Car Price Prediction with Machine Learning


## Project Overview
This project aims to **predict the selling price of used cars** based on various features such as the car's **age, kilometers driven, fuel type, transmission, and number of previous owners**. By using **Machine Learning models**, we can help car buyers and sellers make informed pricing decisions.

**Key Features:**
- Data Preprocessing (handling categorical & numerical data)
- Exploratory Data Analysis (EDA)
- Feature Engineering & Selection
- Model Training & Evaluation

---
## Dataset Overview
The dataset contains **301 entries** with the following **9 features**:

| Feature | Description |
|---------|------------|
| `Car_Name` | Name of the car (string) |
| `Year` | Manufacturing year (integer) |
| `Selling_Price` | Price at which the car is being sold (Target variable) |
| `Present_Price` | Price of the car when it was new |
| `Driven_kms` | Kilometers driven |
| `Fuel_Type` | Type of fuel (Petrol, Diesel, CNG) |
| `Selling_type` | Seller type (Dealer or Individual) |
| `Transmission` | Manual or Automatic |
| `Owner` | Number of previous owners |

**Insights from EDA:**
- Selling price is **right-skewed** (most cars are lower-priced).
- **Present Price** has the highest correlation with **Selling Price**.
- **Fuel Type:** Petrol cars dominate, followed by Diesel.
- **Transmission Type:** Manual cars are more common than automatic.

---
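The first two EDA observations (right skew, and `Present_Price` as the strongest correlate) can be checked with a couple of pandas calls. The sketch below is only an illustration: the rows are made up for the example and are not taken from the project's dataset, so the printed numbers will differ from the notebook's.

```python
import pandas as pd

# Illustrative rows only (made up for this sketch, not the project's data).
df = pd.DataFrame({
    "Year": [2010, 2012, 2013, 2012, 2016, 2014, 2015, 2017],
    "Selling_Price": [1.0, 2.0, 2.5, 3.0, 3.5, 4.0, 5.0, 20.0],
    "Present_Price": [1.5, 3.0, 3.5, 4.5, 5.0, 6.0, 7.5, 30.0],
    "Driven_kms": [60000, 45000, 30000, 52000, 15000, 40000, 27000, 20000],
})

# Positive skewness indicates a long right tail:
# most cars are cheap, a few are much more expensive.
skewness = df["Selling_Price"].skew()

# Pearson correlation of every numeric column with the target,
# ordered by absolute strength.
correlations = (
    df.select_dtypes("number")
    .corr()["Selling_Price"]
    .drop("Selling_Price")
    .sort_values(key=abs, ascending=False)
)

print(f"Selling_Price skewness: {skewness:.2f}")
print(correlations)
```

On the real data, the same two calls would be run on the DataFrame loaded in the notebook.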
## Data Preprocessing
- One-hot encoding for categorical features.
- Feature scaling for numerical values.
- Dropped irrelevant features like `Car_Name`.
- Splitting dataset into **80% Training** and **20% Testing**.

```python
# Splitting data into train and test sets
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
```

---
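The preprocessing steps listed above could be wired together roughly as follows. This is a minimal sketch, not the notebook's actual code: the ten rows are hypothetical, and the choice of `pandas.get_dummies` and `StandardScaler` is an assumption, since the README does not say which encoder or scaler was used. The scaler is fitted on the training split only, so that test-set statistics do not leak into training.

```python
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Hypothetical rows using the README's column names (not the real dataset).
df = pd.DataFrame({
    "Car_Name": [f"car_{i}" for i in range(10)],
    "Year": [2012, 2014, 2015, 2011, 2017, 2016, 2013, 2018, 2010, 2015],
    "Selling_Price": [2.5, 4.0, 5.5, 1.8, 7.2, 6.0, 3.1, 9.0, 1.2, 4.8],
    "Present_Price": [5.0, 6.5, 8.0, 4.2, 9.5, 8.4, 5.9, 11.0, 3.8, 7.1],
    "Driven_kms": [60000, 42000, 30000, 75000, 15000,
                   22000, 51000, 8000, 90000, 35000],
    "Fuel_Type": ["Petrol", "Diesel", "Petrol", "Petrol", "Diesel",
                  "Petrol", "CNG", "Diesel", "Petrol", "Petrol"],
    "Selling_type": ["Dealer", "Dealer", "Individual", "Individual", "Dealer",
                     "Dealer", "Individual", "Dealer", "Individual", "Dealer"],
    "Transmission": ["Manual", "Manual", "Automatic", "Manual", "Automatic",
                     "Manual", "Manual", "Automatic", "Manual", "Manual"],
    "Owner": [0, 0, 1, 1, 0, 0, 2, 0, 1, 0],
})

# Drop the free-text identifier and separate the target from the features.
X = df.drop(columns=["Car_Name", "Selling_Price"])
y = df["Selling_Price"]

# One-hot encode the categorical columns.
categorical = ["Fuel_Type", "Selling_type", "Transmission"]
X = pd.get_dummies(X, columns=categorical)

# 80/20 split, as in the README.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Scale numeric columns; fit on the training split, then reuse on the test split.
numeric = ["Year", "Present_Price", "Driven_kms", "Owner"]
scaler = StandardScaler()
X_train = X_train.copy()
X_test = X_test.copy()
X_train[numeric] = scaler.fit_transform(X_train[numeric])
X_test[numeric] = scaler.transform(X_test[numeric])
```

Tree-based models such as Random Forest do not need scaled inputs, so the scaling step mainly matters for Linear Regression.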
## Model Training
We experimented with different models:
- **Linear Regression**
- **Random Forest Regressor**
- **Decision Tree**
- **XGBoost**

**Performance Metrics Used:**
- **R² Score** (the proportion of variance in selling price that the model explains; 1.0 is a perfect fit)
- **Mean Absolute Error (MAE)** (the average absolute difference between predicted and actual price)

---
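To make the two metrics concrete, here is a small from-scratch sketch of how each is computed. The function names `mae` and `r_squared` and the four price pairs are hypothetical, chosen only for illustration; in practice `sklearn.metrics.mean_absolute_error` and `sklearn.metrics.r2_score` compute the same quantities.

```python
def mae(y_true, y_pred):
    """Mean Absolute Error: average of |actual - predicted|."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r_squared(y_true, y_pred):
    """R^2 = 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    # Squared error of the model's predictions.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Squared error of always predicting the mean (the baseline).
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


# Made-up prices, in the same unit as Selling_Price.
y_true = [2.0, 4.0, 6.0, 8.0]
y_pred = [2.5, 3.5, 6.5, 7.5]

print(f"MAE: {mae(y_true, y_pred):.2f}")
print(f"R^2: {r_squared(y_true, y_pred):.2f}")
```

Here every prediction is off by 0.5, so the MAE is 0.5, while R² is about 0.95 because that error is small relative to the spread of the actual prices.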
## Results & Findings
| Model | R² Score (Test) | MAE (Test) |
|--------|-------------|-------------|
| Linear Regression | 0.86 | 1.2 Lakhs |
| Random Forest | 0.92 | 0.9 Lakhs |
| Decision Tree | 0.88 | 1.1 Lakhs |
| XGBoost | 0.94 | 0.8 Lakhs |

**Best Model:** **XGBoost**, with a test **R² of 0.94** and the lowest MAE (0.8 Lakhs). Note that R² is a goodness-of-fit measure for regression, not a classification accuracy.
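
A comparison table like the one above can be produced by fitting each model on the same split and scoring it on the held-out test set. The sketch below is an assumption-laden illustration, not the notebook's code: the data is synthetic, the hyperparameters are arbitrary defaults, and XGBoost is included only if the `xgboost` package is installed. Its scores will not match the table.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in for the preprocessed features (NOT the project's data).
rng = np.random.default_rng(42)
X = rng.uniform(0, 10, size=(300, 4))
y = 1.5 * X[:, 0] - 0.5 * X[:, 1] + rng.normal(0, 0.5, size=300)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Hyperparameters here are assumptions; the README does not list them.
models = {
    "Linear Regression": LinearRegression(),
    "Random Forest": RandomForestRegressor(n_estimators=100, random_state=42),
    "Decision Tree": DecisionTreeRegressor(random_state=42),
}

# XGBoost is an optional dependency in this sketch.
try:
    from xgboost import XGBRegressor
    models["XGBoost"] = XGBRegressor(n_estimators=200, random_state=42)
except ImportError:
    pass

# Fit every model on the training split and score it on the test split.
results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    predictions = model.predict(X_test)
    results[name] = {
        "r2": r2_score(y_test, predictions),
        "mae": mean_absolute_error(y_test, predictions),
    }

for name, scores in results.items():
    print(f"{name}: R2={scores['r2']:.2f}, MAE={scores['mae']:.2f}")
```

With only 301 rows, a single 80/20 split leaves about 60 test cars, so small differences between models may not be stable; cross-validation would give a firmer ranking.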
---
## How to Run the Project
### 1. Install Dependencies
```bash
pip install pandas numpy matplotlib seaborn scikit-learn xgboost notebook
```

### 2. Run Jupyter Notebook
```bash
jupyter notebook
```
Open `Car Price Prediction with Machine Learning.ipynb` and run all cells.

---
## Future Improvements
- Improve feature selection & engineering.
- Try Deep Learning models.
- Build a web app using **Flask / Streamlit** for real-time predictions.

---
## Conclusion
This project predicts used car prices with a good fit on held-out data using machine learning techniques. The **XGBoost model** provided the best results, with a test **R² score of 0.94**.

---
## Connect With Me
[GitHub](https://github.com/yuvrajsaraogi) | [LinkedIn](https://linkedin.com/in/yuvraj-saraogi) | [Email]([email protected])