https://github.com/saniyaabushakimova/machine-learning-algorithms-from-scratch
- Host: GitHub
- URL: https://github.com/saniyaabushakimova/machine-learning-algorithms-from-scratch
- Owner: SaniyaAbushakimova
- Created: 2025-02-25T13:38:12.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2025-02-25T14:40:04.000Z (over 1 year ago)
- Last Synced: 2025-02-25T14:45:00.307Z (over 1 year ago)
- Topics: gmm-em, hmm-viterbi-algorithm, knn, lasso-regression, pegasos-learning-algorithm, python, sgd, splines, svm
- Language: Jupyter Notebook
- Homepage:
- Size: 4.93 MB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
# About
This repository contains implementations of various **Machine Learning and Statistical Learning algorithms from scratch**, developed as part of the **Practical Statistical Learning** and **Deep Learning for Computer Vision** courses. Each project builds its models without relying on high-level machine learning libraries, to expose their mathematical foundations and optimization procedures.
Each folder contains:
* `.ipynb` notebooks with the implementation (Python).
* Corresponding datasets.
* Instructions on implementation details.
## Implemented Algorithms and Projects
### 1. GMM-and-HMM-with-Expectation-Maximization
*Gaussian Mixture Models (GMM) and Hidden Markov Models (HMM) using the EM Algorithm*
Project completed on October 20, 2024.
* Implemented Expectation-Maximization (EM) from scratch to fit GMMs for clustering and density estimation.
* Implemented the Baum-Welch algorithm (EM for HMMs) to train Hidden Markov Models and the Viterbi algorithm to decode them.
* Applied the models to sequence modeling and probabilistic clustering.
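
To illustrate the decoding step, here is a minimal Viterbi sketch in plain Python. It is not taken from the notebooks: the function name `viterbi`, the integer encoding of states and observations, and the toy parameters are assumptions made for this example. Scores are kept in log space to avoid underflow on long sequences.

```python
import math


def _log(p):
    """Logarithm that maps probability 0 to -inf instead of raising."""
    return math.log(p) if p > 0 else float("-inf")


def viterbi(obs, start_p, trans_p, emit_p):
    """Return the most likely hidden-state path for an observation sequence.

    obs:      list of observation indices
    start_p:  start_p[s]     = P(first state is s)
    trans_p:  trans_p[r][s]  = P(next state is s | current state is r)
    emit_p:   emit_p[s][o]   = P(observation o | state s)
    """
    n_states = len(start_p)
    # Best log-score of any path that ends in each state after the first observation.
    score = [_log(start_p[s]) + _log(emit_p[s][obs[0]]) for s in range(n_states)]
    backptr = []
    for o in obs[1:]:
        prev, score, ptr = score, [], []
        for s in range(n_states):
            # Best predecessor of state s at this time step.
            best = max(range(n_states), key=lambda r: prev[r] + _log(trans_p[r][s]))
            ptr.append(best)
            score.append(prev[best] + _log(trans_p[best][s]) + _log(emit_p[s][o]))
        backptr.append(ptr)
    # Trace back from the best final state.
    state = max(range(n_states), key=lambda s: score[s])
    path = [state]
    for ptr in reversed(backptr):
        state = ptr[state]
        path.append(state)
    path.reverse()
    return path


# Toy HMM (hypothetical numbers): 2 hidden states, 3 observation symbols.
start_p = [0.6, 0.4]
trans_p = [[0.7, 0.3], [0.4, 0.6]]
emit_p = [[0.5, 0.4, 0.1], [0.1, 0.3, 0.6]]
print(viterbi([0, 1, 2], start_p, trans_p, emit_p))  # [0, 0, 1]
```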
### 2. KNN-and-Bayes-Classification
*Comparing k-Nearest Neighbors (kNN) and Bayes Rule for Classification*
Project completed on September 6, 2024.
* Implemented a custom kNN classifier with cross-validation for hyperparameter selection.
* Developed a Bayes classifier from scratch, leveraging probability distributions for decision-making.
* Conducted a simulation study to compare the kNN and Bayes decision rules under different data distributions.
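
The kNN part can be sketched as follows. This is a minimal illustration, not the notebook code: it assumes Euclidean distance, majority voting, and leave-one-out cross-validation as the particular cross-validation scheme, and the names `knn_predict`, `loocv_error`, and `select_k` are hypothetical.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, x, k=3):
    """Predict the label of x by majority vote among its k nearest neighbors."""
    neighbors = sorted(zip(train_X, train_y), key=lambda pair: math.dist(pair[0], x))
    votes = Counter(label for _, label in neighbors[:k])
    return votes.most_common(1)[0][0]


def loocv_error(X, y, k):
    """Leave-one-out misclassification rate of kNN for a given k."""
    errors = 0
    for i in range(len(X)):
        rest_X = X[:i] + X[i + 1:]
        rest_y = y[:i] + y[i + 1:]
        errors += knn_predict(rest_X, rest_y, X[i], k) != y[i]
    return errors / len(X)


def select_k(X, y, candidates):
    """Pick the candidate k with the lowest leave-one-out error (first one wins ties)."""
    return min(candidates, key=lambda k: loocv_error(X, y, k))


# Two well-separated toy clusters (made-up data).
X = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
y = [0, 0, 0, 1, 1, 1]
print(knn_predict(X, y, (0.5, 0.5), k=3))  # 0
print(select_k(X, y, [5, 3]))              # 3
```

With only three points per class, `k=5` forces every held-out point to be outvoted by the other class, which is why the selection step prefers `k=3` here.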
### 3. LOESS-RidgelessRegression-NCS
*Nonparametric Regression and Overfitting in High-Dimensional Models*
Project completed on September 30, 2024.
* Implemented LOESS (Locally Estimated Scatterplot Smoothing) for nonlinear regression.
* Explored Ridgeless Regression to analyze overfitting and the Double Descent phenomenon.
* Used Natural Cubic Splines (NCS) for time series smoothing and feature extraction.
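
As a rough illustration of the LOESS idea, the sketch below evaluates a local fit at a single query point. It assumes a local linear fit with tricube weights and no robustness iterations, which may differ from the variant used in the notebook; the function name `loess_point` and the `span` parameter are hypothetical.

```python
import math


def loess_point(xs, ys, x0, span=0.5):
    """Evaluate a local linear LOESS fit at x0 (1-D inputs)."""
    n = len(xs)
    # Number of neighbors that receive non-negligible weight.
    k = min(n, max(2, math.ceil(span * n)))
    dists = sorted(abs(x - x0) for x in xs)
    # Bandwidth: distance to the k-th nearest point, widened slightly so the
    # weights never sum to zero when several points are equidistant from x0.
    h = dists[k - 1] * 1.0001 + 1e-12
    # Tricube kernel: (1 - u^3)^3 for u < 1, and 0 otherwise.
    weights = [(1 - min(abs(x - x0) / h, 1.0) ** 3) ** 3 for x in xs]
    sw = sum(weights)
    mx = sum(w * x for w, x in zip(weights, xs)) / sw
    my = sum(w * y for w, y in zip(weights, ys)) / sw
    # Weighted least-squares slope around the weighted means.
    sxx = sum(w * (x - mx) ** 2 for w, x in zip(weights, xs))
    sxy = sum(w * (x - mx) * (y - my) for w, x, y in zip(weights, xs, ys))
    slope = sxy / sxx if sxx > 0 else 0.0
    return my + slope * (x0 - mx)


# On exactly linear data a local linear fit reproduces the line.
xs = list(range(10))
ys = [2 * x + 1 for x in xs]
print(round(loess_point(xs, ys, 4.5), 6))  # 10.0
```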
### 4. Lasso-with-Coordinate-Descent
*Sparse Regression with L1 Regularization*
Project completed on September 18, 2024.
* Implemented Lasso Regression from scratch using the Coordinate Descent algorithm.
* Compared Lasso with Ridge Regression and Principal Component Regression (PCR).
* Analyzed model sparsity and feature selection using simulated datasets.
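
A minimal coordinate descent sketch for the Lasso is shown below. It assumes the objective (1/(2n))·||y − Xβ||² + λ·||β||₁ with no intercept (that is, centered data) and a fixed number of sweeps rather than a convergence check; the notebook may use a different scaling or stopping rule. The names `soft_threshold` and `lasso_cd` are hypothetical.

```python
def soft_threshold(z, t):
    """Soft-thresholding operator: shrink z toward zero by t."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def lasso_cd(X, y, lam, n_sweeps=100):
    """Lasso via cyclic coordinate descent (no intercept, plain Python lists)."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    resid = list(y)  # residual y - X @ beta, with beta = 0 initially
    for _ in range(n_sweeps):
        for j in range(p):
            col = [row[j] for row in X]
            z = sum(v * v for v in col) / n
            if z == 0:
                continue  # all-zero column: coefficient stays at 0
            # Correlation of column j with the partial residual (feature j excluded).
            rho = sum(v * (r + v * beta[j]) for v, r in zip(col, resid)) / n
            new_bj = soft_threshold(rho, lam) / z
            delta = new_bj - beta[j]
            if delta != 0.0:
                resid = [r - v * delta for r, v in zip(resid, col)]
                beta[j] = new_bj
    return beta


# Orthogonal toy design (made-up data): the weak second feature is zeroed out.
X = [[1, 0], [-1, 0], [0, 1], [0, -1]]
y = [2, -2, 0.1, -0.1]
beta = lasso_cd(X, y, lam=0.1)
print([round(b, 6) for b in beta])  # [1.8, 0.0]
```

The exact zero in the second coefficient is the sparsity effect of the L1 penalty: its correlation with the residual (0.05) falls below λ, so soft-thresholding removes it entirely.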
### 5. SVM-with-Pegasos-Algorithm
*Support Vector Machines (SVM) using a Specialized SGD Method*
Project completed on November 12, 2024.
* Developed Support Vector Machines (SVM) from scratch, solving the primal form directly.
* Implemented Pegasos Algorithm (Primal Estimated sub-GrAdient SOlver for SVM), a variation of Stochastic Gradient Descent (SGD) optimized for large-scale datasets.
* Applied the model to binary classification tasks on MNIST subsets and evaluated generalization performance.
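
The Pegasos update can be sketched in a few lines. This is an illustrative version under stated assumptions: one randomly sampled example per step, step size 1/(λt), no bias term, and no optional projection step. The function names, default hyperparameters, and toy data are hypothetical and not taken from the notebook.

```python
import random


def pegasos(X, y, lam=0.1, n_iter=1000, seed=0):
    """Train a linear SVM (labels in {-1, +1}) with Pegasos-style sub-gradient steps."""
    rng = random.Random(seed)
    w = [0.0] * len(X[0])
    for t in range(1, n_iter + 1):
        i = rng.randrange(len(X))   # sample one training example
        eta = 1.0 / (lam * t)       # decaying step size
        margin = y[i] * sum(wj * xj for wj, xj in zip(w, X[i]))
        # Shrinkage from the L2 regularizer is applied on every step.
        w = [(1.0 - eta * lam) * wj for wj in w]
        # The hinge-loss sub-gradient is added only when the margin is violated.
        if margin < 1:
            w = [wj + eta * y[i] * xj for wj, xj in zip(w, X[i])]
    return w


def svm_predict(w, x):
    """Classify x by the sign of the linear score."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) >= 0 else -1


# Linearly separable toy data (made up for illustration).
X = [(2, 2), (3, 1), (2, 3), (-2, -2), (-3, -1), (-2, -3)]
y = [1, 1, 1, -1, -1, -1]
w = pegasos(X, y)
print([svm_predict(w, x) for x in X] == y)  # True
```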
### 6. Linear Classifiers: Perceptron, SVM, Softmax, Logistic Regression
Project completed on February 20, 2025.
* Implemented Perceptron, SVM, Softmax, and Logistic Regression classifiers from scratch in Python and applied them to the **Rice** and **Fashion-MNIST** datasets.
* Focused on understanding core concepts of linear classification, hyperparameter tuning, and performance evaluation using training/validation/test splits.
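
As a small illustration of the simplest of these classifiers, the sketch below trains a binary perceptron in plain Python. It is an assumption-laden example rather than the project code: labels are taken to be in {-1, +1}, the names `train_perceptron` and `perceptron_predict` are hypothetical, and the multi-class handling needed for Fashion-MNIST is not shown.

```python
def train_perceptron(X, y, lr=1.0, epochs=100):
    """Binary perceptron with a bias term; labels must be -1 or +1."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        mistakes = 0
        for xi, yi in zip(X, y):
            score = sum(wj * xj for wj, xj in zip(w, xi)) + b
            if yi * score <= 0:  # misclassified (or on the boundary): update
                w = [wj + lr * yi * xj for wj, xj in zip(w, xi)]
                b += lr * yi
                mistakes += 1
        if mistakes == 0:  # a full pass without errors: training data is separated
            break
    return w, b


def perceptron_predict(w, b, x):
    """Classify x by the sign of the linear score."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b > 0 else -1


# Linearly separable toy data (made up for illustration).
X = [(2, 1), (3, 2), (-1, -2), (-2, -1)]
y = [1, 1, -1, -1]
w, b = train_perceptron(X, y)
print(w, b)  # [2.0, 1.0] 1.0
```

On this data a single update on the first example already separates the classes, so training stops after the second pass.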