https://github.com/saksham-jain177/data-analysis
A collection of data analysis and machine learning projects across various datasets. Explore predictive modeling, data visualization, and insights from real-world data. Projects include sales predictions, disease detection, customer segmentation, and more.
- Host: GitHub
- URL: https://github.com/saksham-jain177/data-analysis
- Owner: saksham-jain177
- License: apache-2.0
- Created: 2023-08-07T23:34:19.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2024-08-29T18:32:02.000Z (3 months ago)
- Last Synced: 2024-08-29T20:52:46.185Z (3 months ago)
- Topics: api, data, data-analysis, data-cleaning, data-science, data-visualization, datamodeling, dataset, datasets, exploratory-data-analysis, python, python3, web-scraping, youtube-api
- Language: Jupyter Notebook
- Homepage:
- Size: 1.59 MB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
# Data Analysis Projects
Welcome to my repository of Data Analysis projects! This repository contains a series of notebooks demonstrating different data analysis and machine learning tasks. Each project focuses on a unique dataset and problem statement, showcasing various analytical and predictive techniques.
## Table of Contents
1. [Swiggy Restaurants Data Analysis](#swiggy-restaurants-data-analysis)
2. [GeeksforGeeks Data Analysis](#geeksforgeeks-data-analysis)
3. [CarDekho Used Car Price Analysis](#cardekho-used-car-price-analysis)
4. [Sonar Mine Prediction](#sonar-mine-prediction)
5. [Big Mart Sales Prediction](#big-mart-sales-prediction)
6. [California House Price Prediction](#california-house-price-prediction)
7. [CarDekho Car Price EDA](#cardekho-car-price-eda)
8. [Credit Card Fraud Detection](#credit-card-fraud-detection)
9. [Customer Segmentation Using K-Means](#customer-segmentation-using-k-means)
10. [Fake News Prediction](#fake-news-prediction)
11. [Gold Price Prediction](#gold-price-prediction)
12. [Heart Disease Prediction](#heart-disease-prediction)
13. [House Prices: Advanced Regression Techniques](#house-prices-advanced-regression-techniques)
14. [Loan Eligibility Prediction](#loan-eligibility-prediction)
15. [Parkinson's Disease Detection](#parkinsons-disease-detection)
16. [Spam Mail Prediction](#spam-mail-prediction)
17. [Used Medical Insurance Prediction](#used-medical-insurance-prediction)
## Swiggy Restaurants Data Analysis
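To make the restaurant performance step in this section concrete, here is a minimal pandas sketch. The column names, the sample rows, and the rule used to flag a "high-performing" restaurant are all assumptions for illustration; they are not taken from the project's notebook.

```python
import pandas as pd

# Hypothetical sample rows; the real dataset's columns may be named differently.
restaurants = pd.DataFrame({
    "name": ["A", "B", "C", "D"],
    "cuisine": ["North Indian", "Chinese", "North Indian", "Chinese"],
    "rating": [4.5, 3.8, 4.1, 4.4],
    "review_count": [1200, 300, 150, 900],
})

# Average rating and total review count per cuisine, best-rated cuisine first.
cuisine_stats = (
    restaurants.groupby("cuisine")
    .agg(avg_rating=("rating", "mean"), total_reviews=("review_count", "sum"))
    .sort_values("avg_rating", ascending=False)
)

# Illustrative rule (an assumption): rating of at least 4.3 and at least 500 reviews.
top_restaurants = restaurants[
    (restaurants["rating"] >= 4.3) & (restaurants["review_count"] >= 500)
]

print(cuisine_stats)
print(top_restaurants[["name", "rating", "review_count"]])
```

Requiring a minimum review count alongside the rating keeps restaurants with only a handful of reviews from dominating the ranking.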
**Description:** This project involves analyzing restaurant data from the Swiggy food delivery platform. Key aspects include:
- **Data Collection:** Access data on restaurant names, cuisines, ratings, reviews, delivery times, and locations.
- **Data Cleansing and Preparation:** Clean and preprocess the data for analysis.
- **Restaurant Performance Analysis:** Calculate average ratings, review counts, and identify high-performing restaurants.
- **Cuisine and Menu Analysis:** Analyze cuisine distribution and popular menu items.
## GeeksforGeeks Data Analysis
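As a rough sketch of the processing step in this section, the snippet below works on a few hand-written records instead of calling the API. It assumes video lengths arrive as ISO 8601 strings such as `PT15M33S` and view counts as strings, which is how the YouTube Data API typically reports them. The helper `duration_to_seconds`, the column names, and the sample values are hypothetical.

```python
import re
import pandas as pd

# Matches durations such as "PT1H2M3S", "PT15M33S" or "PT45S".
# Durations with a day component (e.g. "P1DT2H") are not handled in this sketch.
_DURATION_RE = re.compile(r"PT(?:(\d+)H)?(?:(\d+)M)?(?:(\d+)S)?")


def duration_to_seconds(iso_duration: str) -> int:
    """Convert an ISO 8601 duration like 'PT12M30S' to a number of seconds."""
    match = _DURATION_RE.fullmatch(iso_duration)
    if match is None:
        raise ValueError(f"Unsupported duration: {iso_duration!r}")
    hours, minutes, seconds = (int(part) if part else 0 for part in match.groups())
    return hours * 3600 + minutes * 60 + seconds


# Hypothetical records standing in for fetched video details.
videos = pd.DataFrame({
    "title": ["Intro to Arrays", "Binary Search Explained", "Graph Traversal Basics"],
    "view_count": ["15000", "42000", "8000"],
    "duration": ["PT12M30S", "PT1H2M3S", "PT45S"],
})
videos["view_count"] = videos["view_count"].astype(int)
videos["duration_seconds"] = videos["duration"].map(duration_to_seconds)

total_views = int(videos["view_count"].sum())
total_seconds = int(videos["duration_seconds"].sum())
# Pearson correlation between views and length.
views_vs_length = videos["view_count"].corr(videos["duration_seconds"])

print(total_views, total_seconds, round(views_vs_length, 3))
```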
**Description:** This project involves scraping and analyzing video data from the GeeksforGeeks YouTube channel.
- **Data Gathering:** Use the YouTube Data API to fetch video details such as titles, views, upload dates, and lengths.
- **Data Processing and Analysis:** Calculate total views and lengths, identify popular topics, and analyze correlations.
- **Visualization:** Use libraries like matplotlib to create visualizations of trends and patterns.
## CarDekho Used Car Price Analysis
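The cleaning steps this section lists (standardizing text, removing duplicates, handling missing values, removing outliers) could look roughly like the sketch below. The column names and sample rows are hypothetical, and the 1.5 × IQR rule is just one common outlier filter; the notebook may use a different criterion.

```python
import pandas as pd

# Hypothetical rows with messy text, one duplicate, one missing value, one extreme price.
cars = pd.DataFrame({
    "selling_price": [300000, 300000, 450000, 520000, 610000, 9000000, 380000],
    "fuel_type": [" Petrol", " Petrol", "diesel", "Diesel ", "PETROL", "Petrol", None],
    "km_driven": [40000, 40000, 60000, 30000, 20000, 5000, 80000],
})

# Standardize the text column first so that duplicates differing only in
# whitespace or letter case are also caught.
cleaned = cars.assign(fuel_type=cars["fuel_type"].str.strip().str.title())
cleaned = cleaned.drop_duplicates().dropna(subset=["fuel_type"])

# Keep prices within 1.5 * IQR of the quartiles (an assumed outlier rule).
q1 = cleaned["selling_price"].quantile(0.25)
q3 = cleaned["selling_price"].quantile(0.75)
iqr = q3 - q1
in_range = cleaned["selling_price"].between(q1 - 1.5 * iqr, q3 + 1.5 * iqr)
cleaned = cleaned[in_range].reset_index(drop=True)

print(cleaned)
```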
**Description:** Analyze the used car dataset from CarDekho to uncover insights about factors influencing car prices.
- **Data Gathering:** The dataset includes features like selling price, vehicle age, KM driven, engine size, fuel type, seller type, and transmission type.
- **Data Cleaning and Preprocessing:** Handle missing values, remove duplicates, standardize text columns, and remove outliers.
- **Exploratory Data Analysis (EDA):** Perform univariate, bivariate, and categorical analyses to identify key trends and insights.
- **Visualization:** Use libraries like matplotlib and seaborn to create distribution plots, scatter plots, and correlation heatmaps.
- **Insights and Findings:** Analyze the impact of various factors on car prices and provide recommendations based on the analysis.
## Sonar Mine Prediction
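A minimal sketch of the model comparison described in this section, using scikit-learn. Synthetic data from `make_classification` stands in for the sonar readings, so the scores say nothing about the real dataset. The split ratio, hyperparameters, and random seeds are assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the sonar feature matrix and mine/rock labels.
X, y = make_classification(
    n_samples=208, n_features=60, n_informative=10, random_state=42
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "SVC": SVC(),
    "Decision Tree": DecisionTreeClassifier(random_state=42),
    "Random Forest": RandomForestClassifier(random_state=42),
}

# Fit each model on the training split and record its held-out accuracy.
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = accuracy_score(y_test, model.predict(X_test))

best_model = max(scores, key=scores.get)
for name, score in scores.items():
    print(f"{name}: {score:.3f}")
print("Best by accuracy:", best_model)
```

With a dataset this small, a single train/test split gives noisy accuracy estimates; cross-validation (for example `cross_val_score`) would give a steadier comparison.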
**Description:** Build a machine learning model to classify sonar signals as either mines (M) or rocks (R).
- **Data Gathering:** The dataset includes sonar readings for mines and rocks.
- **Data Cleaning and Preprocessing:** Verify and handle missing values and outliers.
- **Exploratory Data Analysis (EDA):** Analyze summary statistics and class distribution.
- **Model Building:** Create feature matrices, split data, and evaluate models such as Logistic Regression, SVC, Decision Tree, and Random Forest.
- **Model Comparison:** Compare models based on accuracy and performance metrics.
- **Insights and Findings:** Determine the best model for sonar signal classification based on accuracy.
## Big Mart Sales Prediction
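The preprocessing and regression steps in this section might be wired together as in the sketch below. The data is generated on the spot, the column names (`item_weight`, `item_mrp`, `outlet_type`, `sales`) are hypothetical, and median imputation with a plain linear model is an assumed choice; the notebook may do this differently.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder

# Generated stand-in for historical sales records.
rng = np.random.default_rng(0)
n = 200
sales_df = pd.DataFrame({
    "item_weight": rng.uniform(5, 20, n),
    "item_mrp": rng.uniform(30, 250, n),
    "outlet_type": rng.choice(["Grocery", "Supermarket"], n),
})
sales_df["sales"] = (
    10 * sales_df["item_mrp"]
    + np.where(sales_df["outlet_type"] == "Supermarket", 500, 0)
    + rng.normal(0, 50, n)
)
# Blank out 10% of the weights to simulate missing values.
missing_idx = sales_df.sample(frac=0.1, random_state=0).index
sales_df.loc[missing_idx, "item_weight"] = np.nan

# Impute numeric gaps with the median and one-hot encode the categorical column.
preprocess = ColumnTransformer([
    ("num", SimpleImputer(strategy="median"), ["item_weight", "item_mrp"]),
    ("cat", OneHotEncoder(handle_unknown="ignore"), ["outlet_type"]),
])
model = Pipeline([("prep", preprocess), ("reg", LinearRegression())])

X = sales_df.drop(columns="sales")
y = sales_df["sales"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)
model.fit(X_train, y_train)
mae = mean_absolute_error(y_test, model.predict(X_test))
print(f"Mean absolute error: {mae:.1f}")
```

Keeping imputation and encoding inside the pipeline means they are fitted on the training split only, which avoids leaking test-set statistics into the model.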
**Description:** Predict sales for Big Mart using historical sales data.
- **Data Gathering:** Use sales data from Big Mart to create predictive models.
- **Data Cleaning and Preprocessing:** Handle missing values and preprocess data for modeling.
- **Model Building:** Build and evaluate regression models to predict sales.
## California House Price Prediction
**Description:** Predict house prices in California using historical data.
- **Data Gathering:** Use historical housing data from California.
- **Data Cleaning and Preprocessing:** Clean and preprocess data for analysis.
- **Model Building:** Develop regression models to predict house prices.
## CarDekho Car Price EDA
**Description:** Perform exploratory data analysis on CarDekho's car price dataset.
- **Data Gathering:** Analyze features such as car price, model, and mileage.
- **Exploratory Data Analysis (EDA):** Identify key trends and patterns in the dataset.
## Credit Card Fraud Detection
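Fraud data is usually heavily imbalanced, so accuracy alone can be misleading. The sketch below is one possible way to build and evaluate a classifier for this section, using synthetic data with roughly 2% positive cases. Class weighting and the precision/recall metrics are assumptions on my part; the source does not say which techniques the notebook uses.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import precision_score, recall_score
from sklearn.model_selection import train_test_split

# Synthetic, imbalanced stand-in for transaction data (label 1 = fraud).
X, y = make_classification(
    n_samples=5000, n_features=20, weights=[0.98, 0.02], random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# class_weight="balanced" reweights samples inversely to class frequency,
# so the rare fraud class is not ignored during training.
clf = LogisticRegression(class_weight="balanced", max_iter=1000)
clf.fit(X_train, y_train)
y_pred = clf.predict(X_test)

fraud_precision = precision_score(y_test, y_pred, zero_division=0)
fraud_recall = recall_score(y_test, y_pred, zero_division=0)
print(f"precision={fraud_precision:.3f} recall={fraud_recall:.3f}")
```

Stratifying the split keeps the fraud ratio the same in the training and test sets, which matters when positives are this rare.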
**Description:** Build a model to detect fraudulent credit card transactions.
- **Data Gathering:** Use historical credit card transaction data.
- **Model Building:** Develop and evaluate classification models to detect fraud.
## Customer Segmentation Using K-Means
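A minimal K-Means sketch for this section, using scikit-learn. The two features (think of them as annual income and spending score), the three generated customer groups, and the choice of `n_clusters=3` are assumptions for illustration only.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Generate three well-separated groups of 50 hypothetical customers each.
rng = np.random.default_rng(0)
centers = np.array([[30, 20], [60, 50], [90, 80]])
customers = np.vstack(
    [center + rng.normal(0, 3, size=(50, 2)) for center in centers]
)

# Scale features so neither one dominates the Euclidean distance.
scaled = StandardScaler().fit_transform(customers)

kmeans = KMeans(n_clusters=3, n_init=10, random_state=0)
labels = kmeans.fit_predict(scaled)

# Number of customers assigned to each segment.
segment_sizes = np.bincount(labels)
print(segment_sizes)
```

On real data the number of clusters is not known in advance; comparing `kmeans.inertia_` across several values of `n_clusters` (the elbow method) is a common way to choose it.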
**Description:** Segment customers into different groups using K-Means clustering.
- **Data Gathering:** Use customer data for clustering.
- **Model Building:** Apply K-Means clustering to segment customers.
## Fake News Prediction
**Description:** Predict whether a news article is fake or real.
- **Data Gathering:** Use a dataset of news articles.
- **Model Building:** Develop and evaluate classification models for fake news detection.
## Gold Price Prediction
**Description:** Predict gold prices using historical data.
- **Data Gathering:** Use historical gold price data.
- **Model Building:** Develop regression models to predict future gold prices.
## Heart Disease Prediction
**Description:** Predict the likelihood of heart disease based on patient data.
- **Data Gathering:** Use health data related to heart disease.
- **Model Building:** Develop classification models to predict heart disease risk.
## House Prices: Advanced Regression Techniques
**Description:** Use advanced regression techniques to predict house prices.
- **Data Gathering:** Use historical housing data.
- **Model Building:** Apply advanced regression techniques to improve predictions.
## Loan Eligibility Prediction
**Description:** Predict loan eligibility based on applicant data.
- **Data Gathering:** Use applicant data to determine loan eligibility.
- **Model Building:** Develop classification models to predict loan approval.
## Parkinson's Disease Detection
**Description:** Build a model to detect Parkinson's disease from patient data.
- **Data Gathering:** Use health data related to Parkinson's disease.
- **Model Building:** Develop and evaluate classification models for disease detection.
## Spam Mail Prediction
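One common way to classify emails as spam or not is to turn the text into TF-IDF features and fit a linear classifier. The sketch below does that with scikit-learn on a handful of made-up messages. The pipeline choice and the toy corpus are assumptions; the source does not specify the notebook's approach, and a corpus this small is only enough to show the mechanics.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Made-up training messages; label 1 = spam, 0 = not spam.
emails = [
    "Win a free prize now, click here",
    "Limited offer, claim your free cash reward",
    "Cheap loans approved instantly, apply now",
    "Meeting moved to 3pm tomorrow",
    "Please review the attached project report",
    "Are you joining us for lunch today?",
]
labels = [1, 1, 1, 0, 0, 0]

# The vectorizer and classifier are fitted together, so raw text can be
# passed straight to predict().
spam_model = make_pipeline(
    TfidfVectorizer(stop_words="english"),
    LogisticRegression(),
)
spam_model.fit(emails, labels)

new_messages = ["Claim your free prize now", "Are we still meeting tomorrow?"]
predictions = spam_model.predict(new_messages)
for message, label in zip(new_messages, predictions):
    print(f"{message!r} -> {'spam' if label == 1 else 'not spam'}")
```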
**Description:** Predict whether an email is spam or not.
- **Data Gathering:** Use email data to classify spam and non-spam emails.
- **Model Building:** Develop classification models to detect spam emails.
## Used Medical Insurance Prediction
**Description:** Predict the likelihood of medical insurance usage based on patient data.
- **Data Gathering:** Use patient data to predict insurance usage.
- **Model Building:** Develop classification models to predict medical insurance needs.
## License
This project is licensed under the Apache License 2.0; see the `LICENSE` file for details.