https://github.com/markoshb/my-data-science-learning-projects
Short but illustrative notebooks to showcase data-analysis in Python
data-science matplotlib-pyplot pandas python pythorch scikit-learn tensorflow
Last synced: 3 months ago
- Host: GitHub
- URL: https://github.com/markoshb/my-data-science-learning-projects
- Owner: MarkosHB
- Created: 2025-02-14T14:49:38.000Z (9 months ago)
- Default Branch: main
- Last Pushed: 2025-03-02T15:45:40.000Z (9 months ago)
- Last Synced: 2025-03-02T16:33:25.162Z (9 months ago)
- Topics: data-science, matplotlib-pyplot, pandas, python, pythorch, scikit-learn, tensorflow
- Language: Jupyter Notebook
- Homepage:
- Size: 209 KB
- Stars: 1
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# My Data Science Learning Projects.
Welcome to my personal collection of hands-on data science projects! This repository showcases my journey exploring and mastering various data science concepts, tools, and techniques.
Stay tuned as I continue to expand this repository with more exciting projects!
> [!Note]
> Have a look at my [Machine Learning Subject](https://github.com/MarkosHB/Machine-Learning-Subject) repo, which contains even more projects written in R.
### Iris Classification.
- [Notebook](https://github.com/MarkosHB/My-Data-Science-Learning-Projects/blob/main/iris/notebook.ipynb)
- **Technologies:** Pandas, PyTorch.
- **Summary:** A classic classification problem using the Iris dataset to practice data manipulation, visualization, and building simple neural networks.
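
To give a feel for this kind of workflow, here is a minimal sketch of training a small neural network on Iris. It is not the notebook's code: loading the data from scikit-learn's bundled copy, the layer sizes, the learning rate, and the number of epochs are all assumptions picked for illustration.

```python
import torch
from torch import nn
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

torch.manual_seed(0)  # make the run repeatable

# Load Iris as a pandas DataFrame (using scikit-learn's bundled copy is an assumption).
frame = load_iris(as_frame=True).frame
features = frame.drop(columns="target")
features = (features - features.mean()) / features.std()  # standardize each column

X_train, X_test, y_train, y_test = train_test_split(
    features.to_numpy(),
    frame["target"].to_numpy(),
    test_size=0.2,
    stratify=frame["target"],
    random_state=0,
)
X_train = torch.tensor(X_train, dtype=torch.float32)
X_test = torch.tensor(X_test, dtype=torch.float32)
y_train = torch.tensor(y_train, dtype=torch.long)
y_test = torch.tensor(y_test, dtype=torch.long)

# Hypothetical architecture: 4 inputs, one hidden layer, 3 output classes.
model = nn.Sequential(nn.Linear(4, 16), nn.ReLU(), nn.Linear(16, 3))
optimizer = torch.optim.Adam(model.parameters(), lr=0.05)
loss_fn = nn.CrossEntropyLoss()

# Full-batch training loop.
for _ in range(200):
    optimizer.zero_grad()
    loss = loss_fn(model(X_train), y_train)
    loss.backward()
    optimizer.step()

# Evaluate on the held-out split.
with torch.no_grad():
    predictions = model(X_test).argmax(dim=1)
    accuracy = (predictions == y_test).float().mean().item()
print(f"test accuracy: {accuracy:.2f}")
```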
### Diabetes Prediction.
- [Notebook](https://github.com/MarkosHB/My-Data-Science-Learning-Projects/blob/main/diabetes/notebook.ipynb)
- **Technologies:** Scikit-learn, TensorFlow.
- **Summary:** Predicting the likelihood of diabetes using machine learning models, focusing on data preprocessing and model evaluation.
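
As a rough illustration of preprocessing plus model evaluation, the sketch below scales the features and scores a classifier on a held-out split. It uses Scikit-learn only and synthetic stand-in data, since the notebook's dataset and models are not described here; the eight features and the logistic regression model are assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (assumption): 8 numeric features and a binary label.
X, y = make_classification(n_samples=500, n_features=8, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Preprocessing and model live in one pipeline, so the scaler is fitted
# on the training split only and reused unchanged on the test split.
model = make_pipeline(StandardScaler(), LogisticRegression())
model.fit(X_train, y_train)

# Evaluation on data the model has not seen.
y_pred = model.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
matrix = confusion_matrix(y_test, y_pred)
print(f"accuracy: {accuracy:.2f}")
print(matrix)
```

Keeping the scaler inside the pipeline avoids leaking test-set statistics into training.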
### Breast Cancer.
- [Notebook](https://github.com/MarkosHB/My-Data-Science-Learning-Projects/blob/main/breast_cancer/notebook.ipynb)
- **Technologies:** AutoKeras, Scikit-learn.
- **Summary:** An automated approach to classifying breast cancer cases. The project leverages AutoKeras to find optimal deep learning models with minimal manual tuning.
### Wine.
- [Notebook](https://github.com/MarkosHB/My-Data-Science-Learning-Projects/blob/main/wine/notebook.ipynb)
- **Technologies:** PySpark, Pandas.
- **Summary:** Processes the Wine dataset with Apache Spark: data cleaning, exploration, and custom pandas UDFs for additional transformations.
### California Housing.
- [Notebook](https://github.com/MarkosHB/My-Data-Science-Learning-Projects/blob/main/california_housing/notebook.ipynb)
- **Technologies:** Dask, Scikit-learn.
- **Summary:** Uses Dask as an alternative to Pandas, so that dataframe manipulation can run in parallel.
### Movie Reviews.
- [Notebook](https://github.com/MarkosHB/My-Data-Science-Learning-Projects/blob/main/movie_reviews/notebook.ipynb)
- **Technologies:** NLTK, Scikit-learn.
- **Summary:** The notebook analyzes movie reviews using the NLTK library, focusing on text preprocessing, feature extraction, and sentiment classification with a Naive Bayes model.
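
The preprocessing, feature extraction, and classification steps can be sketched with NLTK's Naive Bayes classifier. The four tiny reviews below are made up for illustration, and the bag-of-words helper `word_features` is a hypothetical name; the notebook itself may extract features differently.

```python
from nltk.classify import NaiveBayesClassifier


def word_features(text):
    """Lowercase the text, split on whitespace, and mark each word as present."""
    return {word: True for word in text.lower().split()}


# Made-up training reviews (assumption), labeled by sentiment.
training_reviews = [
    ("a great and moving film", "pos"),
    ("wonderful acting and a great story", "pos"),
    ("boring plot and terrible acting", "neg"),
    ("a dull and terrible film", "neg"),
]

# NLTK expects a list of (feature dict, label) pairs.
training_set = [(word_features(text), label) for text, label in training_reviews]
classifier = NaiveBayesClassifier.train(training_set)

positive_guess = classifier.classify(word_features("great story"))
negative_guess = classifier.classify(word_features("terrible plot"))
print(positive_guess, negative_guess)
```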
### ManageYourData.
- [Repository](https://github.com/MarkosHB/ManageYourData)
- **Technologies:** Pandas, Matplotlib, FPDF, openpyxl, Streamlit.
- **Summary:** A self-made tool that generates PDF reports from data files locally.
---
### Data analysis.
- [Carprice report](https://github.com/MarkosHB/My-Data-Science-Learning-Projects/blob/main/carprice/carprice.pdf) and [Titanic report](https://github.com/MarkosHB/My-Data-Science-Learning-Projects/blob/main/titanic/titanic.pdf)
- **Technologies:** Power BI.
- **Summary:** My first two Power BI dashboards, which taught me the basics of visualizing and manipulating data.