https://github.com/neemiasbsilva/case-study-data-science
Welcome to some case study of data science projects - (Personal Projects).
- Host: GitHub
- URL: https://github.com/neemiasbsilva/case-study-data-science
- Owner: neemiasbsilva
- Created: 2022-08-22T17:20:17.000Z (about 3 years ago)
- Default Branch: main
- Last Pushed: 2025-01-15T20:39:52.000Z (9 months ago)
- Last Synced: 2025-03-25T15:14:20.960Z (7 months ago)
- Topics: anomaly-detection, case-study-data-science, census-income, churn-prediction, data-science, data-science-projects, decision-tree, healthcare, house-price-prediction, logistic-regression, machine-learning, pyspark, pyspark-mllib, rag-chatbot, spaceship-titanic
- Language: Jupyter Notebook
- Homepage:
- Size: 11.1 MB
- Stars: 16
- Watchers: 1
- Forks: 4
- Open Issues: 0
Metadata Files:
- Readme: README.md
# Case Studies: Data Science Projects







## Table of Contents
- [About](#about);
- [Artificial Neural Network Approach in Churn Analysis using PyTorch](#artificial-neural-network-approach-in-churn-analysis-using-pytorch);
- [House Price Prediction](#house-price-prediction);
- [Healthcare Stroke Using PySpark](#healthcare-stroke-using-pyspark);
- [Counting and Aggregating M&Ms using PySpark](#counting-and-aggregating-mms-using-pyspark).

## About
Welcome to a collection of data science case studies (personal projects). The goal of this repository is to showcase some of the projects I have developed over the course of my career.

## Artificial Neural Network Approach in Churn Analysis using PyTorch
Churn analytics is the process of measuring the rate at which customers leave a company (or product). In this case study we use a bank customer dataset to estimate the churn rate. To better understand this case study, please check the following [link](https://github.com/neemiasbsilva/case-study-data-science/tree/main/churn_analysis).

- Exploratory Data Analysis - [EDA](https://github.com/neemiasbsilva/case-study-data-science/blob/main/churn_analysis/data_analysis.ipynb);
- [Data Visualization](https://github.com/neemiasbsilva/case-study-data-science/blob/main/churn_analysis/data_analysis.ipynb);
- [Preprocessing](https://github.com/neemiasbsilva/case-study-data-science/blob/main/churn_analysis/preprocessing.ipynb);
- [ANN model](https://github.com/neemiasbsilva/case-study-data-science/blob/main/churn_analysis/ann_pytorch_model.ipynb).

## House Price Prediction
House Price Prediction is a case study based on the "Hands-On Machine Learning" book. Its goal is to build a machine learning model that predicts the price of a given house. To better understand this case study, please check the following [link](https://github.com/neemiasbsilva/case-study-data-science/tree/main/house_price_prediction).

- Exploratory Data Analysis - [EDA](https://github.com/neemiasbsilva/case-study-data-science/blob/main/house_price_prediction/end_to_end_ml_project_regression.ipynb);
- [Preprocessing](https://github.com/neemiasbsilva/case-study-data-science/blob/main/house_price_prediction/end_to_end_ml_project_regression.ipynb);
- [Linear Regression](https://github.com/neemiasbsilva/case-study-data-science/blob/main/house_price_prediction/end_to_end_ml_project_regression.ipynb);
- [Decision Tree](https://github.com/neemiasbsilva/case-study-data-science/blob/main/house_price_prediction/end_to_end_ml_project_regression.ipynb);
- [Random Forest](https://github.com/neemiasbsilva/case-study-data-science/blob/main/house_price_prediction/end_to_end_ml_project_regression.ipynb);
- [Support Vector Machine](https://github.com/neemiasbsilva/case-study-data-science/blob/main/house_price_prediction/end_to_end_ml_project_regression.ipynb);
- [ANN](https://github.com/neemiasbsilva/case-study-data-science/blob/main/house_price_prediction/end_to_end_ml_project_regression.ipynb).

## Healthcare Stroke Using PySpark
This case study implements a classifier in PySpark to estimate the probability of stroke versus no stroke. To learn more about the features and the case study, please check this [link](https://github.com/neemiasbsilva/case-study-data-science/tree/main/data_analysis_using_pyspark).

- [PySpark Configuration](https://github.com/neemiasbsilva/case-study-data-science/blob/main/data_analysis_using_pyspark/data_analysis.ipynb);
- [Exploratory Data Analysis](https://github.com/neemiasbsilva/case-study-data-science/blob/main/data_analysis_using_pyspark/data_analysis.ipynb);
- [Data Visualization](https://github.com/neemiasbsilva/case-study-data-science/blob/main/data_analysis_using_pyspark/data_analysis.ipynb);
- [Data Preprocessing](https://github.com/neemiasbsilva/case-study-data-science/blob/main/data_analysis_using_pyspark/healthcare_logistic_regression_pyspark.ipynb);
- [Feature Engineering](https://github.com/neemiasbsilva/case-study-data-science/blob/main/data_analysis_using_pyspark/healthcare_logistic_regression_pyspark.ipynb);
- [Model Selection](https://github.com/neemiasbsilva/case-study-data-science/blob/main/data_analysis_using_pyspark/healthcare_logistic_regression_pyspark.ipynb);
- [Evaluation](https://github.com/neemiasbsilva/case-study-data-science/blob/main/data_analysis_using_pyspark/healthcare_logistic_regression_pyspark.ipynb).

## Counting and Aggregating M&Ms using PySpark
This case study uses PySpark to count and aggregate M&Ms. To see the implementation, please check this [link](https://github.com/neemiasbsilva/case-study-data-science/tree/main/counting-and-aggregating-m%26ms-pyspark).

- [Build a Spark Session](https://github.com/neemiasbsilva/case-study-data-science/blob/main/counting-and-aggregating-m%26ms-pyspark/counting_aggregating_m%26ms.ipynb);
- [Load M&M Dataset](https://github.com/neemiasbsilva/case-study-data-science/blob/main/counting-and-aggregating-m%26ms-pyspark/counting_aggregating_m%26ms.ipynb);
- [Group Each State and Color and Ordering in Descending Order](https://github.com/neemiasbsilva/case-study-data-science/blob/main/counting-and-aggregating-m%26ms-pyspark/counting_aggregating_m%26ms.ipynb);
- [Aggregate for a Particular State](https://github.com/neemiasbsilva/case-study-data-science/blob/main/counting-and-aggregating-m%26ms-pyspark/counting_aggregating_m%26ms.ipynb).
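
The steps listed above amount to a group-by, a sum, and a descending sort, followed by the same aggregation restricted to one state. As a rough, dependency-free illustration of that logic, here is a sketch in plain Python. It is not the notebook's PySpark code: the sample rows, the `(state, color, count)` layout, and the function names `totals_by_state_and_color` and `totals_for_state` are all assumptions made up for this example.

```python
from collections import defaultdict

# Hypothetical sample rows in the form (state, color, count).
# The real dataset and its column names live in the linked notebook
# and may differ from what is assumed here.
ROWS = [
    ("CA", "Red", 20),
    ("CA", "Blue", 35),
    ("TX", "Red", 15),
    ("CA", "Red", 10),
    ("TX", "Green", 40),
    ("TX", "Red", 5),
]


def totals_by_state_and_color(rows):
    """Sum counts per (state, color) and sort by total, largest first."""
    totals = defaultdict(int)
    for state, color, count in rows:
        totals[(state, color)] += count
    # Sort by total descending; ties are broken by the (state, color) key
    # so the output order is deterministic.
    return sorted(totals.items(), key=lambda item: (-item[1], item[0]))


def totals_for_state(rows, state):
    """Run the same aggregation on the rows of a single state only."""
    return totals_by_state_and_color(r for r in rows if r[0] == state)


if __name__ == "__main__":
    for (state, color), total in totals_by_state_and_color(ROWS):
        print(state, color, total)
    print(totals_for_state(ROWS, "CA"))
```

In PySpark the same shape is usually expressed on a DataFrame with `groupBy`, an aggregate such as `sum` or `count`, and `orderBy`, with a `where` or `filter` applied first for the single-state case. The plain-Python version only shows what those operations compute, on a scale small enough to check by hand.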