Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

Awesome Lists | Featured Topics | Projects

https://github.com/elfgk/kc-house-data-analysis

KC House Data Analysis
https://github.com/elfgk/kc-house-data-analysis

classification dataanalysis dataanalysis-projects dataanalysisusingpython dataanalystportfolio jupyter-notebook machine-learning regression-models xgbregressor

Last synced: 26 days ago
JSON representation

KC House Data Analysis

Awesome Lists containing this project

README

        

# KC House Data Analysis

This project focuses on analyzing house price data for King County, Washington. The goal is to explore the dataset, perform data cleaning, feature engineering, and build a machine learning model to predict house prices.

Dataset: https://www.kaggle.com/datasets/harlfoxem/housesalesprediction

## Project Overview

The dataset used in this project contains information about homes sold in King County, including house features such as square footage, number of bedrooms, location, and more. This project includes the following steps:

1. **Exploratory Data Analysis (EDA):**
- Visualizing the data to understand trends, distributions, and correlations.
- Identifying missing values and outliers in the data.

2. **Data Preprocessing:**
- Cleaning the data by handling missing values, encoding categorical variables, and scaling numerical features.

3. **Feature Engineering:**
- Creating new features that can improve model performance.

4. **Modeling:**
- Building machine learning models (such as linear regression, decision trees, etc.) to predict house prices.

5. **Model Evaluation:**
- Evaluating models using various metrics such as Mean Absolute Error (MAE), Mean Squared Error (MSE), and R-squared.

## Dataset

The dataset used in this project contains information about houses sold in King County, WA. It includes the following features:

- `id`: Unique ID for each house
- `date`: Date when the house was sold
- `price`: Price of the house
- `bedrooms`: Number of bedrooms
- `bathrooms`: Number of bathrooms
- `sqft_living`: Square footage of the house
- `sqft_lot`: Square footage of the lot
- `floors`: Number of floors in the house
- `waterfront`: Whether the house has a waterfront view (1 if true, 0 if false)
- `view`: Quality of the view (a rating from 0 to 4)
- `condition`: Condition of the house (a rating from 1 to 5)
- `grade`: Grade of the house (a rating from 1 to 13)
- `sqft_above`: Square footage of the house above ground
- `sqft_basement`: Square footage of the basement
- `yr_built`: Year the house was built
- `yr_renovated`: Year the house was renovated
- `zipcode`: ZIP code of the house's location

## Libraries Used

- `pandas`: For data manipulation and analysis.
- `numpy`: For numerical operations.
- `matplotlib` and `seaborn`: For data visualization.
- `scikit-learn`: For machine learning models and evaluation.

## Getting Started

To run the analysis on your local machine, follow these steps:

1. Clone or download the repository:

```bash
git clone https://github.com/elfgk/KC-House-Data-Analysis.git
```

2. Install the required Python libraries.
3. Open the kc-house.ipynb Jupyter notebook and follow the steps.

π“’Φ΄ΰ»‹β˜•οΈβœ§Λš ༘ ⋆

Contact MeπŸ§‘β€πŸ’»:

[![LinkedIn](https://img.shields.io/badge/LinkedIn-0A66C2?style=for-the-badge&logo=linkedin&logoColor=white)](https://www.linkedin.com/in/elfgk/)
[![Stack Overflow](https://img.shields.io/badge/StackOverflow-FE7A16?style=for-the-badge&logo=stackoverflow&logoColor=white)](https://stackoverflow.com/users/27559679/elfgk)
[![Hugging Face](https://img.shields.io/badge/HuggingFace-9C30FF?style=for-the-badge&logo=huggingface&logoColor=white)](https://huggingface.co/elfgk)
[![Kaggle](https://img.shields.io/badge/Kaggle-20BEFF?style=for-the-badge&logo=kaggle&logoColor=white)](https://www.kaggle.com/elfgkk)