Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/atheeralzhrani/data-science-projects
This repository contains my data science projects, in which I used tools and libraries such as Spark, Python, Pandas, NumPy, SQLite, Matplotlib, and Seaborn, and performed Exploratory Data Analysis.
data-engineering data-preprocessing data-science data-visualization exploratory-data-analysis matplotlib pandas python python-lambda seaborn spark
Last synced: 27 days ago
- Host: GitHub
- URL: https://github.com/atheeralzhrani/data-science-projects
- Owner: AtheerAlzhrani
- Created: 2024-08-02T08:11:58.000Z (3 months ago)
- Default Branch: main
- Last Pushed: 2024-09-09T07:47:12.000Z (about 2 months ago)
- Last Synced: 2024-09-28T07:01:20.554Z (about 1 month ago)
- Topics: data-engineering, data-preprocessing, data-science, data-visualization, exploratory-data-analysis, matplotlib, pandas, python, python-lambda, seaborn, spark
- Language: Jupyter Notebook
- Homepage:
- Size: 1.53 MB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# Data-Science-Projects
This repository contains my data science projects, in which I used tools and libraries such as Spark, Python, Pandas, NumPy, Matplotlib, and Seaborn for data preprocessing, Exploratory Data Analysis (EDA), visualization, and model building.
# Project2
This project analyzes the AI-powered job market. It covers data preprocessing, exploratory data analysis, visualizations, and predictive modeling on a combination of encoded categorical features and scaled salary data. The model's performance was evaluated using RMSE and visualized with a scatter plot. Next steps include fine-tuning the model and exploring additional features to improve prediction accuracy.
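
As a rough illustration of the RMSE evaluation mentioned above, here is a minimal sketch in plain Python. The function name `rmse` and the sample values are assumptions made for this example; they are not taken from the project's notebook, which may well use a library routine instead.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length sequences."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("inputs must be non-empty")
    # Square each residual, average the squares, then take the square root.
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


if __name__ == "__main__":
    # Hypothetical scaled salaries (made-up values, not project data).
    actual = [0.0, 0.5, 1.0, 0.25]
    predicted = [0.1, 0.4, 0.8, 0.25]
    print(f"RMSE: {rmse(actual, predicted):.4f}")
```

Because the salary column is scaled, an RMSE computed this way is expressed in scaled units rather than in currency, so it should be read relative to the scaled range.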
# Project1
This project analyzes a Telco customer dataset covering 7043 customers with 21 features. I used Spark for data manipulation and customer churn prediction; the Naive Bayes model achieved 70.55% accuracy and an F1 score of 0.6741.
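
To make the two reported metrics concrete, the sketch below computes accuracy and F1 from label lists in plain Python. It is an illustration only: the helper name `accuracy_and_f1` and the toy labels are hypothetical, and it assumes the binary, positive-class definition of F1. The README does not say how the F1 score was averaged, and a Spark evaluator may report a weighted variant instead.

```python
def accuracy_and_f1(y_true, y_pred, positive=1):
    """Return (accuracy, F1 for the positive class) for two label sequences."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)

    accuracy = correct / len(pairs)
    # Guard against division by zero when a class is never predicted or absent.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return accuracy, f1


if __name__ == "__main__":
    # Toy churn labels (1 = churned); made-up values, not project data.
    truth = [1, 0, 1, 1, 0, 0]
    preds = [1, 0, 0, 1, 1, 0]
    acc, f1 = accuracy_and_f1(truth, preds)
    print(f"accuracy={acc:.4f} f1={f1:.4f}")
```

Reporting both numbers is useful for churn data, where the classes are often imbalanced and accuracy alone can look better than the model's ability to find churners.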