https://github.com/myahninsi/housing-price-prediction-ml

Final project for Big Data Visualization for Business Communications 01 (DSMM Group 1). Analyzes housing data, identifies key price factors, and builds predictive models using machine learning. Includes Power BI dashboards for interactive visualizations and Flask for deployment.
elastic-net lasso-regression linear-regression matplotlib numpy pandas ridge-regression scikit-learn seaborn

# Housing Price Prediction with Machine Learning

## Overview
This repository contains the final project for Big Data Visualization for Business Communications 01 (DSMM Group 1). The project aims to analyze housing data, identify key factors influencing property prices, and develop predictive models using machine learning techniques. It also features advanced visualizations using Power BI and Python libraries to deliver actionable insights.

## Problem Statement
The housing market is influenced by numerous factors such as location, square footage, and property features. This project addresses the challenge of predicting house prices by:
- Identifying key factors affecting property prices.
- Building regression models for accurate price prediction.
- Creating interactive dashboards for better data visualization and decision-making.

## Features
- **Data Preprocessing**: Cleaning and preparing the data for analysis and modeling.
- **Exploratory Data Analysis (EDA)**: Analyzing trends and patterns using advanced visualization techniques.
- **Machine Learning Models**: Implementing multiple regression models for price prediction.
- **Interactive Dashboards**: Using Power BI for data presentation.
- **Deployment**: Preparing the workflow for deployment with Flask.
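
As an illustration of the preprocessing step, here is a minimal sketch of typical cleaning operations in pandas. The column names and values are made up for the example and are not taken from the project's dataset; filling missing values with the median is one possible choice, not necessarily the one used here.

```python
import pandas as pd

# Hypothetical raw data: column names and values are assumptions for illustration.
raw = pd.DataFrame({
    "id": [1, 1, 2, 3],
    "date": ["2014-10-13", "2014-10-13", "2014-12-09", "2015-02-25"],
    "bedrooms": [3.0, 3.0, None, 4.0],
    "price": ["221900", "221900", "538000", "180000"],
})

# Remove exact duplicate rows.
clean = raw.drop_duplicates().copy()

# Fill missing numeric values with the column median (one common strategy).
clean["bedrooms"] = clean["bedrooms"].fillna(clean["bedrooms"].median())

# Adjust data types: parse dates and convert prices stored as text to floats.
clean["date"] = pd.to_datetime(clean["date"])
clean["price"] = clean["price"].astype(float)
```

Dropping rows with missing values instead of imputing them is an equally valid option when only a few rows are affected.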

## Tools and Technologies
- **Programming Language**: Python
- **Libraries**:
  - Pandas, NumPy: Data manipulation and computation.
  - Matplotlib, Seaborn: Data visualization.
  - Scikit-learn: Machine learning and preprocessing.
- **Regression Models**:
  - Linear Regression
  - Lasso Regression
  - Ridge Regression
  - ElasticNet Regression
- **Data Scaling**: MinMaxScaler
- **Visualization Tools**: Power BI for interactive dashboards.
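
To show what `MinMaxScaler` does, the sketch below rescales each feature to the range [0, 1]. The feature values are invented for the example. Fitting the scaler on the training split only, as done here, is a common practice to avoid leaking test information; the README does not state how the project applied it.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Hypothetical features: [living area in square feet, number of bedrooms].
X_train = np.array([[1000.0, 2.0], [2000.0, 3.0], [3000.0, 4.0]])
X_test = np.array([[1500.0, 3.0]])

scaler = MinMaxScaler()

# Learn each column's min and max from the training data, then rescale it.
X_train_scaled = scaler.fit_transform(X_train)

# Reuse the training min and max for the test data.
X_test_scaled = scaler.transform(X_test)
```

Because the test data is scaled with the training minimum and maximum, test values outside the training range can fall below 0 or above 1.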

## Skills and Techniques
- **Data Cleaning**: Removing duplicates, handling missing values, and adjusting data types.
- **Feature Engineering**: Creating new variables such as renovation status and property age.
- **Outlier Detection**: Identifying and removing outliers using the interquartile range (IQR).
- **EDA**: Visualizing correlations and distributions to extract meaningful insights.
- **Model Evaluation**: Using metrics like R² and RMSE for performance assessment.
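
The feature engineering and outlier removal techniques listed above might look like the following sketch. The column names (`yr_built`, `yr_renovated`, `sale_year`) and the data are assumptions made for illustration, and the 1.5 × IQR fence is the conventional default rather than a value confirmed by the project.

```python
import pandas as pd

# Hypothetical data: column names and values are assumptions for illustration.
df = pd.DataFrame({
    "price": [300_000, 320_000, 310_000, 305_000, 2_500_000],
    "yr_built": [1990, 2005, 1975, 2010, 2000],
    "yr_renovated": [0, 0, 2010, 0, 2012],
    "sale_year": [2015, 2015, 2015, 2015, 2015],
})

# Feature engineering: renovation flag and property age at the time of sale.
df["renovated"] = (df["yr_renovated"] > 0).astype(int)
df["house_age"] = df["sale_year"] - df["yr_built"]

# IQR-based outlier removal on price.
q1 = df["price"].quantile(0.25)
q3 = df["price"].quantile(0.75)
iqr = q3 - q1
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr

# Keep only rows whose price lies within the fences.
df_clean = df[df["price"].between(lower, upper)]
```

In this toy data the 2,500,000 listing lies far above the upper fence and is dropped, while the four remaining rows are kept.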

## Workflow
1. **Dataset Selection and Exploration**:
   - Initial analysis of the dataset to understand its structure and key variables.
2. **Data Cleaning**:
   - Handling missing values, duplicates, and incorrect data types.
3. **Feature Engineering**:
   - Creating new variables like "Renovated" and "Age of the House."
4. **Exploratory Data Analysis (EDA)**:
   - Visualizing trends and relationships using Python libraries and Power BI.
5. **Model Development**:
   - Implementing Linear, Lasso, Ridge, and ElasticNet regression models.
   - Splitting data into training and testing sets for model evaluation.
6. **Visualization**:
   - Creating interactive dashboards using Power BI for better insights.
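
The model development step can be sketched roughly as follows. This is not the project's actual code: the data is synthetic, and the split ratio, hyperparameters (`alpha`, `l1_ratio`), and random seeds are assumptions chosen only to make the example runnable.

```python
import numpy as np
from sklearn.linear_model import ElasticNet, Lasso, LinearRegression, Ridge
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MinMaxScaler

# Synthetic stand-in for the housing data (an assumption, not the real dataset).
rng = np.random.default_rng(0)
n = 200
sqft = rng.uniform(500, 4000, n)
grade = rng.integers(1, 14, n).astype(float)
X = np.column_stack([sqft, grade])
y = 100.0 * sqft + 20_000.0 * grade + rng.normal(0, 10_000, n)

# Hold out 20% of the rows for evaluation.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Hyperparameters below are illustrative, not tuned values from the project.
models = {
    "Linear": LinearRegression(),
    "Lasso": Lasso(alpha=1.0, max_iter=10_000),
    "Ridge": Ridge(alpha=1.0),
    "ElasticNet": ElasticNet(alpha=0.01, l1_ratio=0.5, max_iter=10_000),
}

results = {}
for name, model in models.items():
    # Scale inside the pipeline so the scaler is fit on training data only.
    pipe = make_pipeline(MinMaxScaler(), model)
    pipe.fit(X_train, y_train)
    pred = pipe.predict(X_test)
    rmse = float(np.sqrt(mean_squared_error(y_test, pred)))
    results[name] = {"r2": float(r2_score(y_test, pred)), "rmse": rmse}
```

Each entry in `results` holds the test-set R² and RMSE for one model, which makes the four models easy to compare side by side.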

## Results
- Identified key predictors of housing prices, including square footage, grade, and location.
- Built regression models that achieved strong R² and RMSE scores on the test set.
- Delivered interactive Power BI dashboards for data-driven decision-making.