
https://github.com/tolumie/exploratory-data-analytics-projects

Exploratory Data Analytics – A collection of projects covering data exploration, feature engineering, hypothesis testing, and predictive modeling across diverse datasets, including insurance, real estate, laptops, cars, COVID-19, and the Olympics.

data-analysis data-visualization data-wrangling exploratory-data-analysis-eda feature-engineering hypothesis-testing machine-learning matplotlib numpy pandas predictive-modeling python seaborn statistical-analysis



# 📊 Exploratory Data Analytics

## 📌 Overview
This repository contains a collection of **data analytics projects** that focus on **data exploration, feature engineering, hypothesis testing, and predictive modeling**. The datasets cover various domains, including **insurance costs, laptop and used car pricing, house sales, COVID-19 trends, and the Olympic Games**.

## 📁 Project Descriptions

### 1️⃣ **Insurance Cost Analysis**
- Investigates **insurance cost drivers** using statistical methods.
- Explores relationships between **age, BMI, smoking status, and charges**.
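
As a rough illustration of this kind of exploration, the sketch below compares mean charges by smoking status and computes correlations with charges. The rows are synthetic and the column names (`age`, `bmi`, `smoker`, `charges`) are assumptions, not taken from the project's notebooks.

```python
import pandas as pd

# Synthetic rows for illustration only; not the project's dataset.
# Column names are assumed.
df = pd.DataFrame({
    "age":     [19, 33, 45, 52, 28, 61],
    "bmi":     [27.9, 22.7, 30.1, 35.2, 24.0, 29.5],
    "smoker":  ["yes", "no", "no", "yes", "no", "yes"],
    "charges": [16800.0, 4400.0, 8200.0, 42000.0, 3900.0, 47000.0],
})

# Mean charges for smokers versus non-smokers.
mean_by_smoker = df.groupby("smoker")["charges"].mean()

# Pearson correlation of each numeric column with charges.
corr_with_charges = df[["age", "bmi", "charges"]].corr()["charges"]

print(mean_by_smoker)
print(corr_with_charges)
```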

### 2️⃣ **Laptop Pricing Analysis**
- Examines factors affecting **laptop prices**, including **brand, specifications, and market demand**.
- Builds **predictive models** for price estimation.

### 3️⃣ **Used Car Pricing Analysis**
- Uses **EDA and machine learning** to understand used car pricing trends.
- Considers factors such as **mileage, manufacturing year, and brand perception**.

### 4️⃣ **House Sales in King County (USA)**
- Analyzes **real estate trends**, identifying key features influencing **house prices**.
- Uses **regression modeling** for price prediction.
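
A minimal sketch of what a single-feature price regression could look like, using ordinary least squares in NumPy. The data is synthetic and the feature name `sqft_living` is an assumption; the notebooks may use a different library or more features.

```python
import numpy as np

# Synthetic data for illustration; the feature name is assumed.
sqft_living = np.array([1000.0, 1500.0, 2000.0, 2500.0, 3000.0])
price = np.array([200000.0, 290000.0, 410000.0, 500000.0, 600000.0])

# Design matrix with an intercept column, solved by ordinary least squares.
X = np.column_stack([np.ones_like(sqft_living), sqft_living])
coef, *_ = np.linalg.lstsq(X, price, rcond=None)
intercept, slope = coef

# Coefficient of determination (R^2) on the training data.
predicted = X @ coef
ss_res = np.sum((price - predicted) ** 2)
ss_tot = np.sum((price - price.mean()) ** 2)
r_squared = 1.0 - ss_res / ss_tot

print(f"price = {intercept:.1f} + {slope:.1f} * sqft_living, R^2 = {r_squared:.4f}")
```

Note that R² measured on the training data tends to overstate fit; a held-out split gives a more honest estimate.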

### 5️⃣ **Exploratory Data Analysis of COVID-19 in India**
- Examines the **spread, mortality, and recovery trends** of COVID-19 in India.
- Uses **time series analysis** to visualize case growth.
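
One common way to look at case growth is to turn cumulative counts into daily new cases and smooth them with a rolling mean. The sketch below uses synthetic counts, arbitrary dates, and an assumed series name `confirmed`; it is not the project's actual data or method.

```python
import pandas as pd

# Synthetic cumulative case counts; dates and the series name are assumed.
cumulative = pd.Series(
    [100, 130, 170, 220, 300, 390, 500, 620],
    index=pd.date_range("2020-04-01", periods=8, freq="D"),
    name="confirmed",
)

# Daily new cases are the day-over-day difference (first day is NaN).
daily_new = cumulative.diff()

# A 3-day rolling mean smooths out day-to-day noise.
rolling_avg = daily_new.rolling(window=3).mean()

# Day-over-day growth rate of the cumulative count.
growth_rate = cumulative.pct_change()

print(pd.DataFrame({"daily_new": daily_new, "rolling_avg": rolling_avg, "growth": growth_rate}))
```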

### 6️⃣ **Olympic Games Data Analysis**
- Investigates **medal distributions, country participation trends, and athlete performances** over time.

### 7️⃣ **Machine Learning & Statistical Analysis Labs**
- **Classification with Python:** Covers **logistic regression, decision trees, and SVM**.
- **Feature Engineering Lab:** Focuses on **data transformation and new feature creation**.
- **Hypothesis Testing Lab:** Applies **z-tests, t-tests, and chi-square tests** to test statistical hypotheses.
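
To make the hypothesis-testing idea concrete, here is a sketch of a two-sample z-test built only on the Python standard library. The function name `two_sample_z_test` and the sample values are hypothetical. A z-test assumes large samples or known variances, so for small samples like these a t-test would usually be the better choice; the data is kept tiny only for readability.

```python
from math import sqrt
from statistics import NormalDist, mean, variance


def two_sample_z_test(a, b):
    """Return (z, two-sided p-value) for the difference in means of a and b."""
    # Standard error of the difference, using the sample variances.
    standard_error = sqrt(variance(a) / len(a) + variance(b) / len(b))
    z = (mean(a) - mean(b)) / standard_error
    # Two-sided p-value under the standard normal distribution.
    p_value = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return z, p_value


# Synthetic samples for illustration only.
group_a = [10.0, 12.0, 11.0, 13.0, 12.0, 14.0, 11.0, 13.0]
group_b = [x + 5.0 for x in group_a]

z, p = two_sample_z_test(group_a, group_b)
print(f"z = {z:.2f}, p = {p:.3g}")
```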

## 🛠️ Key Techniques
- ✔ **Data Wrangling & Cleaning** – Handling missing values, outlier detection, and feature extraction.
- ✔ **Exploratory Data Analysis (EDA)** – Uncovering patterns and insights through **visualization and statistics**.
- ✔ **Feature Engineering** – Creating new features to improve model accuracy.
- ✔ **Machine Learning Models** – Regression and classification models for predictive analysis.
- ✔ **Hypothesis Testing** – Using **statistical tests to validate findings**.
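
The cleaning and feature-engineering steps above might look roughly like the sketch below: median imputation for missing values, the 1.5 × IQR rule for flagging outliers, and a log-transformed feature. The column name `price` and the values are synthetic assumptions, and the IQR rule is only one of several reasonable outlier criteria.

```python
import numpy as np
import pandas as pd

# Synthetic values for illustration; the column name is assumed.
df = pd.DataFrame({"price": [200.0, 220.0, np.nan, 210.0, 230.0, 1500.0, 215.0]})

# Handle missing values: fill with the median of the observed values.
df["price"] = df["price"].fillna(df["price"].median())

# Outlier detection: flag points more than 1.5 * IQR outside the quartiles.
q1 = df["price"].quantile(0.25)
q3 = df["price"].quantile(0.75)
iqr = q3 - q1
is_outlier = (df["price"] < q1 - 1.5 * iqr) | (df["price"] > q3 + 1.5 * iqr)

# Feature engineering: a log transform reduces the skew of price-like columns.
df["log_price"] = np.log(df["price"])

print(df.assign(is_outlier=is_outlier))
```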

## 📂 Files in This Repository
- **Data Wrangling Notebooks** – Cleaning and preprocessing datasets.
- **EDA Notebooks** – Exploring data distributions and visualizing trends.
- **Feature Engineering & Hypothesis Testing Notebooks** – Transforming data and conducting statistical analysis.
- **Machine Learning Notebooks** – Developing and evaluating predictive models.

## 🚀 Future Enhancements
- 🔹 Integrating **advanced machine learning models** for better predictions.
- 🔹 Expanding **time series forecasting** for financial and pricing trends.
- 🔹 Incorporating **more real-world datasets** for deeper insights.