https://github.com/mdalamin5/data-science-machine-learning-basics
This repository is a comprehensive guide to Machine Learning algorithms, Python OOP, data preprocessing, and visualization using Pandas, NumPy, Seaborn, Scikit-learn, and more. It includes hands-on Jupyter notebooks, modular Python scripts, and a structured ML pipeline for training and evaluating models. 🚀
- Host: GitHub
- URL: https://github.com/mdalamin5/data-science-machine-learning-basics
- Owner: MDalamin5
- Created: 2023-12-01T15:47:20.000Z (about 2 years ago)
- Default Branch: main
- Last Pushed: 2025-03-07T06:11:48.000Z (11 months ago)
- Last Synced: 2025-03-07T06:28:56.730Z (11 months ago)
- Topics: data-visualization, datapreprocessing, machine-learning-algorithms, object-oriented-programming
- Language: Jupyter Notebook
- Homepage: https://www.linkedin.com/in/mdalamin5/
- Size: 33.7 MB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# **Comprehensive Guide to Machine Learning & Python OOP**
## **Overview**
This repository serves as a **comprehensive resource** for understanding **Machine Learning algorithms**, **Python Object-Oriented Programming (OOP)**, **data preprocessing**, and **visualization techniques** using industry-standard tools.
## **Topics Covered**
### 🔹 **Machine Learning Algorithms**
✔ **Supervised Learning:** Linear Regression, Logistic Regression, Decision Trees, SVM, KNN
✔ **Unsupervised Learning:** K-Means Clustering, PCA, DBSCAN
✔ **Ensemble Methods:** Random Forest, Gradient Boosting
✔ **Deep Learning (Basic):** Neural Networks, CNN, RNN (Intro)
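
As an illustration of the supervised and ensemble methods listed above, the following minimal sketch trains two classifiers with scikit-learn and compares their test accuracy. It is not taken from the repository's notebooks: the Iris dataset, the 75/25 split, and the hyperparameters are assumptions chosen only to keep the example small.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Assumption: Iris is used only because it ships with scikit-learn.
X, y = load_iris(return_X_y=True)

# Hold out 25% of the rows for evaluation, keeping class proportions equal.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42, stratify=y
)

# One linear model and one ensemble model from the topic list above.
models = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=42),
}

# Fit each model on the training split and score it on the held-out split.
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = accuracy_score(y_test, model.predict(X_test))

for name, score in scores.items():
    print(f"{name}: {score:.3f}")
```

Because every scikit-learn estimator exposes the same `fit`/`predict` interface, other algorithms from the list (for example `SVC` or `KNeighborsClassifier`) can be added to the `models` dictionary without changing the loop.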
### 🔹 **Data Preprocessing Techniques**
✔ Handling **Missing Values** (Mean/Mode Imputation, Interpolation)
✔ **Feature Scaling:** Min-Max Scaling, Standardization
✔ **Categorical Encoding:** One-Hot Encoding, Label Encoding
✔ **Feature Selection:** Correlation Analysis, Recursive Feature Elimination (RFE)
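
The imputation, scaling, and encoding steps above can be combined in a single scikit-learn `ColumnTransformer`. This is a minimal sketch under stated assumptions: the toy DataFrame and its column names (`age`, `income`, `city`) are invented for illustration and do not come from the repository's datasets.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Hypothetical toy data with missing values in numeric and categorical columns.
df = pd.DataFrame({
    "age": [25.0, np.nan, 47.0, 35.0],
    "income": [40000.0, 52000.0, np.nan, 61000.0],
    "city": ["Dhaka", "Khulna", np.nan, "Dhaka"],
})

numeric_features = ["age", "income"]
categorical_features = ["city"]

# Numeric columns: mean imputation followed by standardization.
numeric_steps = Pipeline([
    ("impute", SimpleImputer(strategy="mean")),
    ("scale", StandardScaler()),
])

# Categorical columns: mode imputation followed by one-hot encoding.
categorical_steps = Pipeline([
    ("impute", SimpleImputer(strategy="most_frequent")),
    ("encode", OneHotEncoder(handle_unknown="ignore")),
])

# sparse_threshold=0 forces a dense NumPy array as output.
preprocessor = ColumnTransformer(
    [
        ("numeric", numeric_steps, numeric_features),
        ("categorical", categorical_steps, categorical_features),
    ],
    sparse_threshold=0,
)

X_processed = preprocessor.fit_transform(df)
print(X_processed.shape)
```

Min-Max scaling can be swapped in by replacing `StandardScaler` with `MinMaxScaler`. In a real workflow the transformer should be fitted on the training split only and then applied to the test split, so that test statistics do not leak into the imputation and scaling parameters.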
### 🔹 **Visualization Techniques**
✔ **Seaborn & Matplotlib:** Histograms, Pair Plots, Heatmaps
✔ **Pandas Profiling** (now published as `ydata-profiling`): Automated EDA
✔ **Plotly & Interactive Visuals:** Scatter Plots, Line Graphs, 3D Plots
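
A histogram and a correlation heatmap, two of the plots listed above, can be sketched as follows. To keep dependencies small this example uses Matplotlib alone; Seaborn's `heatmap` function produces an equivalent plot. The synthetic data, the column names, and the output file name `eda_overview.png` are assumptions made for illustration.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend so the script runs without a display

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Synthetic data: y depends on x, z is independent noise.
rng = np.random.default_rng(0)
x = rng.normal(size=200)
df = pd.DataFrame({
    "x": x,
    "y": 2 * x + rng.normal(scale=0.5, size=200),
    "z": rng.normal(size=200),
})

# Pairwise Pearson correlations between all columns.
corr = df.corr()

fig, (ax_hist, ax_heat) = plt.subplots(1, 2, figsize=(10, 4))

# Left panel: distribution of a single feature.
ax_hist.hist(df["x"], bins=20)
ax_hist.set_title("Histogram of x")

# Right panel: correlation matrix drawn as a heatmap on a fixed [-1, 1] scale.
image = ax_heat.imshow(corr.to_numpy(), vmin=-1, vmax=1, cmap="coolwarm")
ax_heat.set_xticks(range(len(corr.columns)))
ax_heat.set_xticklabels(corr.columns)
ax_heat.set_yticks(range(len(corr.columns)))
ax_heat.set_yticklabels(corr.columns)
ax_heat.set_title("Correlation heatmap")
fig.colorbar(image, ax=ax_heat)

fig.tight_layout()
fig.savefig("eda_overview.png")  # hypothetical output path
```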
### 🔹 **Python OOP in Machine Learning**
✔ **DataPreprocessor Class** (Handles missing values, encoding, scaling)
✔ **ModelTrainer Class** (Fits and evaluates ML models)
✔ **Visualizer Class** (Generates charts & plots for analysis)
✔ **Pipeline Implementation** (Combining preprocessing, training, and evaluation)
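
The sketch below shows one way the classes above could fit together. Only the class names `DataPreprocessor` and `ModelTrainer` come from this README; every method name, the `run_pipeline` helper, and the choice of dataset and model are hypothetical, and the repository's actual implementation may differ. The preprocessor here handles numeric columns only.

```python
import pandas as pd
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler


class DataPreprocessor:
    """Fills missing numeric values with training means, then standardizes."""

    def __init__(self):
        self.scaler = StandardScaler()
        self.fill_values = None

    def fit_transform(self, X: pd.DataFrame):
        # Learn column means and scaling parameters from the training data.
        self.fill_values = X.mean()
        return self.scaler.fit_transform(X.fillna(self.fill_values))

    def transform(self, X: pd.DataFrame):
        # Reuse the training statistics so no test information leaks in.
        return self.scaler.transform(X.fillna(self.fill_values))


class ModelTrainer:
    """Wraps any scikit-learn style estimator for fitting and evaluation."""

    def __init__(self, model):
        self.model = model

    def train(self, X, y):
        self.model.fit(X, y)
        return self

    def evaluate(self, X, y):
        return accuracy_score(y, self.model.predict(X))


def run_pipeline(X, y, model):
    """Hypothetical helper: split, preprocess, train, and return test accuracy."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, random_state=0, stratify=y
    )
    preprocessor = DataPreprocessor()
    trainer = ModelTrainer(model)
    trainer.train(preprocessor.fit_transform(X_train), y_train)
    return trainer.evaluate(preprocessor.transform(X_test), y_test)


# Assumption: Iris is used only because it ships with scikit-learn.
iris = load_iris(as_frame=True)
accuracy = run_pipeline(iris.data, iris.target, LogisticRegression(max_iter=1000))
print(f"Test accuracy: {accuracy:.3f}")
```

Keeping preprocessing and training in separate classes means either one can be replaced independently, for example by passing a different estimator to `ModelTrainer`.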
## **Installation**
To set up the environment, install dependencies with:
```
pip install -r requirements.txt
```
## **Future Enhancements**
🚀 Implement Deep Learning models for advanced tasks
🚀 Add more real-world datasets for hands-on learning
🚀 Expand visualization techniques with interactive tools