Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/reyhaneh-saffar/data-analysis-on-houses
Analyzing a housing dataset to derive statistical insights and understand relationships among various features.
Last synced: 16 days ago
- Host: GitHub
- URL: https://github.com/reyhaneh-saffar/data-analysis-on-houses
- Owner: reyhaneh-saffar
- Created: 2025-01-09T20:20:42.000Z (21 days ago)
- Default Branch: main
- Last Pushed: 2025-01-09T20:26:36.000Z (21 days ago)
- Last Synced: 2025-01-09T21:31:36.027Z (21 days ago)
- Language: Jupyter Notebook
- Size: 2.04 MB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
# Data Analysis on Houses - Summary and Highlights
This project involves analyzing a housing dataset to derive statistical insights and understand relationships among various features.
## **Highlights**
### **1. Data Processing**
- **Handling Null Values**: Columns with a high share of null values, such as `MiscFeature` (96.3%) and `PoolQC` (99.52%), were removed because they offered limited analytical value.
- **Normality Checks**:
  - **Shapiro-Wilk Test**: Used to assess normality; features like `LotArea` deviated significantly from a Gaussian distribution.
  - **Anderson-Darling Test**: Confirmed non-normality in features such as `OverallQual`.
  - **Visualization**: Histograms and other plots verified the results visually.

---
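The data processing steps from section 1 can be sketched roughly as follows. This is a minimal illustration and not the notebook's actual code: the data is synthetic, and the `NULL_THRESHOLD` cutoff and the `normality_report` helper are assumptions introduced here, since the README does not state how columns were selected for removal.

```python
import numpy as np
import pandas as pd
from scipy import stats

# Synthetic stand-in for the housing data (assumption: the real data is not shown here).
rng = np.random.default_rng(0)
n = 500
df = pd.DataFrame({
    "LotArea": rng.lognormal(mean=9.0, sigma=0.5, size=n),     # right-skewed values
    "OverallQual": rng.integers(1, 11, size=n).astype(float),  # discrete 1..10 rating
    # Keep a value in only about 1% and 4% of rows; every other row becomes null.
    "PoolQC": pd.Series(["Gd"] * n).where(rng.random(n) > 0.99),
    "MiscFeature": pd.Series(["Shed"] * n).where(rng.random(n) > 0.96),
})

NULL_THRESHOLD = 0.5  # hypothetical cutoff, not taken from the project

# Share of null values per column, then drop the columns above the cutoff.
null_share = df.isna().mean()
high_null_cols = null_share[null_share > NULL_THRESHOLD].index.tolist()
df_clean = df.drop(columns=high_null_cols)


def normality_report(values):
    """Run Shapiro-Wilk and Anderson-Darling tests against a normal distribution."""
    values = np.asarray(values, dtype=float)
    values = values[~np.isnan(values)]
    _, shapiro_p = stats.shapiro(values)
    ad = stats.anderson(values, dist="norm")
    # Anderson-Darling reports critical values per significance level (in percent).
    crit_5pct = dict(zip(ad.significance_level, ad.critical_values))[5.0]
    return {
        "shapiro_p": shapiro_p,
        "shapiro_rejects_normality": shapiro_p < 0.05,
        "anderson_stat": ad.statistic,
        "anderson_rejects_normality": ad.statistic > crit_5pct,
    }


print("Dropped columns:", high_null_cols)
for column in ["LotArea", "OverallQual"]:
    print(column, normality_report(df_clean[column]))
```

Both tests are evaluated at the 5% level in this sketch; a small Shapiro-Wilk p-value or an Anderson-Darling statistic above the critical value indicates a departure from normality.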
### **2. Correlation Analysis**
- **Heatmap**: A visual representation of correlations among numerical features, highlighting the strength of relationships.
- **Pearson Test**: Revealed weak correlations between `MSSubClass` and `SalePrice`, as well as between `LotArea` and `SalePrice`.

---
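The correlation analysis in section 2 could look something like the sketch below. The data is synthetic and the `pearson_with_price` helper is a hypothetical name used only for illustration; the numbers it prints say nothing about the real dataset.

```python
import numpy as np
import pandas as pd
from scipy import stats

# Synthetic stand-in for three numerical columns (assumption, not the project's data).
rng = np.random.default_rng(1)
n = 400
lot_area = rng.lognormal(mean=9.0, sigma=0.5, size=n)
df = pd.DataFrame({
    "MSSubClass": rng.choice([20, 30, 50, 60, 120], size=n),  # illustrative class codes
    "LotArea": lot_area,
    "SalePrice": 150_000 + 2.0 * lot_area + rng.normal(0, 60_000, size=n),
})

# Pairwise Pearson correlations; this matrix is what a heatmap would display.
corr_matrix = df.corr(method="pearson")


def pearson_with_price(column):
    """Return the Pearson coefficient and p-value of `column` against SalePrice."""
    r, p_value = stats.pearsonr(df[column], df["SalePrice"])
    return float(r), float(p_value)


print(corr_matrix.round(3))
for column in ["MSSubClass", "LotArea"]:
    r, p_value = pearson_with_price(column)
    print(f"{column}: r = {r:.3f}, p = {p_value:.3g}")
```

Passing `corr_matrix` to a plotting function such as `seaborn.heatmap` would produce the heatmap view. The coefficient `r` measures the strength of the linear relationship, while the p-value only indicates whether it is distinguishable from zero.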
### **3. Statistical Tests**
- **ANOVA Test**:
  - Neighborhood significantly impacts sale prices.
  - Mean sale price also differs significantly across sale conditions.
- **T-Test**: Demonstrated a significant relationship between `SalePrice` and `2ndFlrSF`.

---
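The tests in section 3 might be run along these lines. This is a sketch on synthetic data, with two assumptions that are not stated in the README: the neighborhood labels `A`, `B`, `C` are placeholders, and the t-test compares houses with a second floor (`2ndFlrSF > 0`) against houses without one.

```python
import numpy as np
import pandas as pd
from scipy import stats

# Synthetic stand-in data with built-in group differences (assumption).
rng = np.random.default_rng(2)
n = 600
neighborhood = rng.choice(["A", "B", "C"], size=n)  # placeholder labels
has_second_floor = rng.random(n) < 0.5
base_price = pd.Series(neighborhood).map(
    {"A": 120_000, "B": 180_000, "C": 300_000}
).to_numpy()
df = pd.DataFrame({
    "Neighborhood": neighborhood,
    "2ndFlrSF": np.where(has_second_floor, rng.integers(300, 1200, size=n), 0),
    "SalePrice": base_price + 60_000 * has_second_floor + rng.normal(0, 30_000, size=n),
})

# One-way ANOVA: does mean SalePrice differ across neighborhoods?
groups = [g["SalePrice"].to_numpy() for _, g in df.groupby("Neighborhood")]
f_stat, anova_p = stats.f_oneway(*groups)

# Welch's t-test: houses with a second floor versus houses without one.
with_second = df.loc[df["2ndFlrSF"] > 0, "SalePrice"]
without_second = df.loc[df["2ndFlrSF"] == 0, "SalePrice"]
t_stat, ttest_p = stats.ttest_ind(with_second, without_second, equal_var=False)

print(f"ANOVA: F = {f_stat:.2f}, p = {anova_p:.3g}")
print(f"t-test: t = {t_stat:.2f}, p = {ttest_p:.3g}")
```

`equal_var=False` selects Welch's variant, which does not assume both groups share the same variance. The same ANOVA pattern would apply to a sale-condition column by changing the `groupby` key.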