https://github.com/kindo-tk/laptop_price_predictor

An end-to-end machine learning project to predict laptop prices based on technical specifications. Built with Python and Scikit-Learn, featuring an interactive web app deployed using Streamlit.

decision-tree-regression laptop-price-prediction linear-regression machine-learning prediction random-forest regression-models streamlit xgboost-regression

# Laptop Price Predictor

This project implements a machine learning-based system to estimate the market price of laptops based on their technical specifications. It is an end-to-end data science project, covering data cleaning, extensive feature engineering, model comparison, and deployment using a **Streamlit** web application.

---

## Overview

The laptop market spans many brands and configurations, which makes it hard for consumers to judge a fair price for a device. This project addresses that problem by predicting prices from features such as:

- **Brand** (Apple, Dell, HP, etc.)
- **Processor** (Intel i5/i7, AMD, etc.)
- **RAM & Storage** (SSD/HDD)
- **GPU** (Nvidia, Intel Iris, etc.)
- **Screen Type** (IPS, Touchscreen, Resolution)

The project follows a complete Data Science lifecycle and provides a **modern web interface** for users to get instant price estimates.

---

## Methodology & Workflow

This project was executed in the following detailed steps:

### 1. Data Cleaning & Preprocessing
The raw dataset required significant cleaning to be usable for modeling:
- **Unit Handling:** Removed non-numeric characters (e.g., "GB" from RAM, "kg" from Weight) and converted columns to numeric types.
- **Missing Values:** Handled null values in critical columns to ensure data consistency.
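
The unit-handling step can be sketched as follows. This is a minimal standard-library illustration, not the notebook's actual code: the helper name `strip_unit` and the sample values are assumptions made for the example.

```python
import re
from typing import Optional

# Matches the first integer or decimal number in a string.
_NUMBER = re.compile(r"\d+(?:\.\d+)?")


def strip_unit(value: str) -> Optional[float]:
    """Hypothetical helper: return the first number in `value`, or None."""
    match = _NUMBER.search(value)
    return float(match.group()) if match else None


# Sample values are illustrative, not rows from the project's dataset.
ram_raw = ["8GB", "16GB", "4GB"]
weight_raw = ["1.37kg", "2.1kg", "?"]

ram_gb = [strip_unit(v) for v in ram_raw]        # numeric RAM in GB
weight_kg = [strip_unit(v) for v in weight_raw]  # None marks an unparseable entry
```

With pandas, the same function could be applied to a column through `Series.map`; entries that fail to parse then surface as missing values for the null-handling step.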

### 2. Exploratory Data Analysis (EDA)
- **Target Variable:** Analyzed the distribution of the `Price` column. It was right-skewed, so a **log transformation** was applied to reduce the skew and improve regression performance.
- **Correlation:** Analyzed how features like RAM and Screen Resolution correlated with Price.
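
A small sketch of the log transformation and its inverse is shown below. It assumes the natural logarithm was used (the README does not state the base), and the prices are made-up values for illustration.

```python
import math

# Illustrative prices only; not values from the project's dataset.
prices = [30000.0, 45000.0, 60000.0, 250000.0]

# Training target: log(price) compresses the long right tail.
log_prices = [math.log(p) for p in prices]

# A model trained on log(price) predicts in log space,
# so predictions are mapped back to currency units with exp.
recovered = [math.exp(v) for v in log_prices]
```

The practical consequence is that any prediction from a model trained this way has to be exponentiated before it is shown to the user.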

### 3. Feature Engineering
This was the most critical phase, where raw text data was converted into meaningful features:
- **Screen Resolution:** Extracted detailed specs to create new binary columns for `TouchScreen` and `IPS Panel`, and calculated `PPI` (Pixels Per Inch).
- **CPU & GPU:** Parsed complex text strings to categorize processors (e.g., "Intel Core i5") and Graphics cards into generalized categories.
- **Storage:** Split mixed storage types (e.g., "128GB SSD + 1TB HDD") into separate columns for `SSD` and `HDD` capacities to capture the premium value of solid-state drives.
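
The screen and storage features above could be derived roughly as follows. This is a sketch under stated assumptions, not the notebook's implementation: the function names, the input string formats, and the 1 TB = 1000 GB conversion are all hypothetical choices for the example. PPI is computed as the diagonal pixel count divided by the diagonal size in inches.

```python
import math
import re


def screen_features(resolution: str, inches: float) -> dict:
    """Derive touchscreen/IPS flags and PPI from a resolution string.

    Assumes the string contains a "WIDTHxHEIGHT" pair such as "1920x1080".
    """
    width, height = map(int, re.search(r"(\d+)x(\d+)", resolution).groups())
    text = resolution.lower()
    return {
        "touchscreen": int("touchscreen" in text),
        "ips": int("ips" in text),
        # Diagonal resolution in pixels divided by diagonal size in inches.
        "ppi": math.hypot(width, height) / inches,
    }


def storage_features(memory: str) -> dict:
    """Split a mixed storage string into SSD and HDD capacities in GB."""
    sizes = {"ssd": 0, "hdd": 0}
    pattern = r"(\d+(?:\.\d+)?)\s*(GB|TB)\s*(SSD|HDD)"
    for amount, unit, kind in re.findall(pattern, memory, flags=re.IGNORECASE):
        # Assumption: 1 TB is counted as 1000 GB.
        gigabytes = float(amount) * (1000 if unit.upper() == "TB" else 1)
        sizes[kind.lower()] += int(gigabytes)
    return sizes


screen = screen_features("IPS Panel Full HD / Touchscreen 1920x1080", 15.6)
storage = storage_features("128GB SSD + 1TB HDD")
```

Keeping SSD and HDD as separate numeric columns lets the model weight a gigabyte of solid-state storage differently from a gigabyte of hard-disk storage.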

---

## Model Selection & Results

Multiple regression algorithms were trained and evaluated using the R² score. The table below compares their performance in our testing:

| **Model** | **R² Score** |
| :--- | :--- |
| **XGBoost Regressor** | **0.839459** |
| Random Forest Regressor | 0.819675 |
| Voting Regressor | 0.803241 |
| Linear Regression | 0.759321 |
| Ridge Regression | 0.758772 |
| Lasso Regression | 0.748790 |
| Decision Tree Regressor | 0.747561 |
| SVR | 0.737686 |
| KNN | 0.712115 |
| Gradient Boosting | 0.704107 |
| AdaBoost Regressor | 0.596751 |
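
For readers unfamiliar with the metric, R² is one minus the ratio of the residual sum of squares to the total sum of squares. The sketch below implements that standard definition in plain Python with made-up numbers; it is not the evaluation code behind the table, and the README does not state whether those scores were computed on log-transformed or raw prices.

```python
def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Illustrative targets only.
y_true = [10.0, 11.0, 12.0, 13.0]

perfect = r2_score(y_true, y_true)        # exact predictions
baseline = r2_score(y_true, [11.5] * 4)   # always predicting the mean
```

A score of 1 means the predictions match the targets exactly, and a score of 0 means the model does no better than always predicting the mean, which gives a scale for reading the values in the table.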

**XGBoost Regressor achieved the highest R² score (0.839).**

---

## Project Structure

```text
laptop_price_predictor/
│
├── app.py                              # Streamlit application
├── models/
│   └── best_laptop_price_model.pkl     # Trained ML pipeline
│
├── notebooks/
│   └── laptop_price_predictor.ipynb    # EDA, preprocessing, training & evaluation
│
├── datasets/
│   ├── laptop_data.csv                 # Original dataset
│   └── cleaned_data.csv                # Cleaned & feature-engineered dataset
│
├── .dockerignore                       # Files ignored by Docker
├── .gitignore                          # Files & folders ignored by Git
├── Dockerfile                          # Docker configuration
├── LICENSE                             # Project license
├── requirements.txt                    # Project dependencies
└── README.md                           # Project documentation
```
---

## Setup

1. **Clone the repository:**

```bash
git clone https://github.com/kindo-tk/laptop_price_predictor.git
```
2. **Navigate to the project directory:**

```sh
cd laptop_price_predictor
```

3. **Create and Activate the virtual environment:**

**Windows:**
```bash
python -m venv .venv
.venv\Scripts\activate
```

**macOS/Linux:**
```bash
python3 -m venv .venv
source .venv/bin/activate
```

4. **Install the required packages:**

```sh
pip install -r requirements.txt
```

5. **Run the Streamlit application:**

```sh
streamlit run app.py
```
---

### 🐳 Docker Usage

If you prefer using Docker, you can pull the pre-built image from Docker Hub and run it instantly:

1. **Pull the Docker Image:**

```bash
docker pull kindotk/laptop_price_predictor
```

2. **Run the Container:**

```bash
docker run -p 8501:8501 kindotk/laptop_price_predictor
```

3. **Access the App:**
Open your browser and navigate to `http://localhost:8501`

---
## Usage

1. Enter the required laptop details:

- Brand (e.g., Dell, Apple)
- Processor & GPU
- RAM & Storage configuration
- Screen Size & Resolution

2. Click **Predict Price** to see the estimated market value.

---

## Technologies Used

- Python
- Streamlit
- scikit-learn
- XGBoost
- Pandas & NumPy
> See [`requirements.txt`](requirements.txt) for the full list of dependencies.
---

## License

This project is licensed under the MIT License. See [LICENSE](LICENSE) for details.

---

## Contact

For any inquiries or feedback, please contact:

- [Tufan Kundu (LinkedIn)](https://www.linkedin.com/in/tufan-kundu-577945221/)
- Email: tufan.kundu11@gmail.com

---

## Demo

Visit the live app: Click here