https://github.com/saniyaabushakimova/machine-learning-algorithms-from-scratch


gmm-em hmm-viterbi-algorithm knn lasso-regression pegasos-learning-algorithm python sgd splines svm

# About

This repository contains implementations of various **Machine Learning and Statistical Learning algorithms from scratch**, developed as part of the **Practical Statistical Learning** and **Deep Learning for Computer Vision** courses. Each project builds its models without relying on high-level machine learning libraries, which exposes the mathematics and optimization procedures behind each algorithm.

Each folder contains:
* `.ipynb` notebooks with the Python implementation.
* Corresponding datasets.
* Instructions on implementation details.

## Implemented Algorithms and Projects

### 1. GMM-and-HMM-with-Expectation-Maximization
*Gaussian Mixture Models (GMM) and Hidden Markov Models (HMM) using the EM Algorithm*

Project completed on October 20, 2024.

* Implemented Expectation-Maximization (EM) from scratch to fit GMMs for clustering and density estimation.
* Implemented the Baum-Welch algorithm (EM for HMMs) to train Hidden Markov Models and the Viterbi algorithm to decode them.
* Applied the models to sequence modeling and probabilistic clustering.
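
As an illustration of the decoding step, here is a minimal sketch of the Viterbi algorithm for a discrete HMM, computed in log space. It is not taken from the repository's notebooks: the function name `viterbi`, its argument layout, and the two-state toy model are assumptions made for this example, and it uses only the standard library.

```python
import math


def viterbi(obs, start_p, trans_p, emit_p):
    """Most likely hidden-state path for a discrete HMM (log space).

    obs     : list of observation indices
    start_p : start_p[i]    = P(state_0 = i)
    trans_p : trans_p[i][j] = P(state_{t+1} = j | state_t = i)
    emit_p  : emit_p[i][k]  = P(obs = k | state = i)
    """
    def log(p):
        return math.log(p) if p > 0 else float("-inf")

    n_states = len(start_p)
    # delta[i]: best log-probability of any path that ends in state i
    delta = [log(start_p[i]) + log(emit_p[i][obs[0]]) for i in range(n_states)]
    backptr = []

    for o in obs[1:]:
        new_delta, ptr = [], []
        for j in range(n_states):
            # Best predecessor state for landing in state j
            best_i = max(range(n_states),
                         key=lambda i: delta[i] + log(trans_p[i][j]))
            ptr.append(best_i)
            new_delta.append(delta[best_i] + log(trans_p[best_i][j])
                             + log(emit_p[j][o]))
        delta = new_delta
        backptr.append(ptr)

    # Backtrack from the best final state
    state = max(range(n_states), key=lambda i: delta[i])
    path = [state]
    for ptr in reversed(backptr):
        state = ptr[state]
        path.append(state)
    path.reverse()
    return path, max(delta)


# Hypothetical toy model: 2 hidden states, 3 observation symbols
start = [0.6, 0.4]
trans = [[0.7, 0.3], [0.4, 0.6]]
emit = [[0.5, 0.4, 0.1], [0.1, 0.3, 0.6]]
path, log_prob = viterbi([0, 1, 2], start, trans, emit)
print(path, math.exp(log_prob))
```

Working in log space avoids the numerical underflow that products of many small probabilities would cause on long sequences.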

### 2. KNN-and-Bayes-Classification
*Comparing k-Nearest Neighbors (kNN) and Bayes Rule for Classification*

Project completed on September 6, 2024.

* Implemented a custom kNN classifier with cross-validation for selecting the number of neighbors.
* Developed a Bayes classifier from scratch, leveraging probability distributions for decision-making.
* Conducted a simulation study to compare the kNN and Bayes decision rules under different data distributions.
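
A kNN classifier with cross-validated choice of `k` can be sketched roughly as follows. This is an illustrative sketch rather than the repository's code: the helper names `knn_predict` and `cv_error`, the interleaved fold assignment, and the simulated two-Gaussian data are all assumptions, and only the standard library is used.

```python
import random
from collections import Counter


def knn_predict(train_X, train_y, x, k):
    """Majority vote among the k training points closest to x."""
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(xi, x)), yi)  # squared Euclidean
        for xi, yi in zip(train_X, train_y)
    )
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]


def cv_error(X, y, k, n_folds=5):
    """Misclassification rate of kNN, averaged over n_folds held-out folds."""
    n = len(X)
    errors = 0
    for f in range(n_folds):
        test_idx = set(range(f, n, n_folds))  # interleaved fold assignment
        tr_X = [X[i] for i in range(n) if i not in test_idx]
        tr_y = [y[i] for i in range(n) if i not in test_idx]
        errors += sum(knn_predict(tr_X, tr_y, X[i], k) != y[i]
                      for i in test_idx)
    return errors / n


# Simulated data (an assumption): two 2-D Gaussian classes
rng = random.Random(0)
X = ([(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(50)]
     + [(rng.gauss(3, 1), rng.gauss(3, 1)) for _ in range(50)])
y = [0] * 50 + [1] * 50

errors = {k: cv_error(X, y, k) for k in (1, 3, 5, 7, 9)}
best_k = min(errors, key=errors.get)
print(errors, best_k)
```

Odd values of `k` are used here so that a two-class vote cannot tie.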

### 3. LOESS-RidgelessRegression-NCS
*Nonparametric Regression and Overfitting in High-Dimensional Models*

Project completed on September 30, 2024.

* Implemented LOESS (Locally Estimated Scatterplot Smoothing) for nonlinear regression.
* Explored Ridgeless Regression to analyze overfitting and the Double Descent phenomenon.
* Used Natural Cubic Splines (NCS) for time series smoothing and feature extraction.
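
To make the LOESS idea concrete, the sketch below fits a local linear model at a single point with tricube weights. It is a simplified illustration, not the notebook's implementation: the function name `loess_point`, the `span` parameter, the one-dimensional setting, and the absence of robustness iterations are assumptions for this example.

```python
import math


def loess_point(x0, xs, ys, span=0.5):
    """Local linear fit at x0 with tricube weights on the nearest span*n points."""
    n = len(xs)
    q = max(2, int(round(span * n)))
    # Bandwidth h: distance from x0 to its q-th nearest neighbour
    h = sorted(abs(x - x0) for x in xs)[q - 1] or 1e-12
    w = [(1 - min(abs(x - x0) / h, 1.0) ** 3) ** 3 for x in xs]

    # Weighted least squares for y = a + b * (x - x0); the fit at x0 is a
    sw = sum(w)
    sx = sum(wi * (x - x0) for wi, x in zip(w, xs))
    sxx = sum(wi * (x - x0) ** 2 for wi, x in zip(w, xs))
    sy = sum(wi * yv for wi, yv in zip(w, ys))
    sxy = sum(wi * (x - x0) * yv for wi, x, yv in zip(w, xs, ys))
    det = sw * sxx - sx * sx
    if abs(det) < 1e-12:
        # Degenerate neighbourhood: fall back to a (weighted) mean
        return sy / sw if sw > 0 else sum(ys) / n
    return (sxx * sy - sx * sxy) / det


# A local linear smoother reproduces exactly linear data
xs = [i / 10 for i in range(21)]
ys_linear = [2 * x + 1 for x in xs]
print(loess_point(0.55, xs, ys_linear))

# Smoothing a curved signal at every input location
ys_curve = [math.sin(x) for x in xs]
smoothed = [loess_point(x, xs, ys_curve, span=0.3) for x in xs]
```

A smaller `span` follows the data more closely, while a larger one gives a smoother but more biased curve.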

### 4. Lasso-with-Coordinate-Descent
*Sparse Regression with L1 Regularization*

Project completed on September 18, 2024.

* Implemented Lasso Regression from scratch using the Coordinate Descent algorithm.
* Compared Lasso with Ridge Regression and Principal Component Regression (PCR).
* Analyzed model sparsity and feature selection using simulated datasets.
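
The coordinate descent update for Lasso can be sketched as below. The sketch assumes the objective `(1/(2n)) * ||y - X b||^2 + lam * ||b||_1` with no intercept (centered data); that scaling, the function names `soft_threshold` and `lasso_cd`, and the toy design matrix are assumptions for illustration and may differ from the notebook.

```python
def soft_threshold(rho, lam):
    """Soft-thresholding operator S(rho, lam)."""
    if rho > lam:
        return rho - lam
    if rho < -lam:
        return rho + lam
    return 0.0


def lasso_cd(X, y, lam, n_iter=100):
    """Cyclic coordinate descent for (1/(2n))||y - Xb||^2 + lam * ||b||_1."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    resid = list(y)  # residual y - X b, starting from b = 0
    z = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]

    for _ in range(n_iter):
        for j in range(p):
            if z[j] == 0:
                continue  # all-zero column: coefficient stays at 0
            # Correlation of feature j with the partial residual
            # (the residual with feature j's own contribution added back)
            rho = sum(X[i][j] * (resid[i] + X[i][j] * beta[j])
                      for i in range(n)) / n
            new_bj = soft_threshold(rho, lam) / z[j]
            delta = new_bj - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                beta[j] = new_bj
    return beta


# Toy design with orthogonal columns: y = 3 * col0 + 0.2 * col1
X = [[1, 1], [1, -1], [-1, 1], [-1, -1]]
y = [3.2, 2.8, -2.8, -3.2]
beta = lasso_cd(X, y, lam=0.5)
print(beta)
```

With orthogonal columns each coefficient is simply the soft-thresholded least-squares estimate, so the weak second feature is set exactly to zero while the strong one is shrunk by `lam`.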

### 5. SVM-with-Pegasos-Algorithm
*Support Vector Machines (SVM) using a Specialized SGD Method*

Project completed on November 12, 2024.

* Developed Support Vector Machines (SVM) from scratch, solving the primal form directly.
* Implemented Pegasos Algorithm (Primal Estimated sub-GrAdient SOlver for SVM), a variation of Stochastic Gradient Descent (SGD) optimized for large-scale datasets.
* Applied the model to binary classification tasks on MNIST subsets and evaluated generalization performance.
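
The core Pegasos update can be sketched as follows: at step `t`, one example is sampled, the step size is `1 / (lam * t)`, the weights are shrunk by the L2 penalty, and the example is added back only if it violates the margin. This is a minimal illustration under assumptions (no bias term, no optional projection step, labels in `{-1, +1}`, a hypothetical function name `pegasos`, and a tiny hand-made dataset), not the notebook's implementation.

```python
import random


def pegasos(X, y, lam=0.01, n_iter=2000, seed=0):
    """Pegasos: stochastic sub-gradient descent on the primal SVM objective."""
    rng = random.Random(seed)
    w = [0.0] * len(X[0])
    for t in range(1, n_iter + 1):
        i = rng.randrange(len(X))
        eta = 1.0 / (lam * t)  # step-size schedule 1 / (lambda * t)
        margin = y[i] * sum(wj * xj for wj, xj in zip(w, X[i]))
        # Shrink step coming from the L2 regularizer
        w = [(1 - eta * lam) * wj for wj in w]
        # Hinge-loss sub-gradient step, only when the margin is violated
        if margin < 1:
            w = [wj + eta * y[i] * xj for wj, xj in zip(w, X[i])]
    return w


# Tiny linearly separable dataset (an assumption for illustration)
X = [(2, 2), (3, 1), (1, 3), (2.5, 2.5),
     (-2, -2), (-3, -1), (-1, -3), (-2.5, -2.5)]
y = [1, 1, 1, 1, -1, -1, -1, -1]

w = pegasos(X, y)
preds = [1 if sum(wj * xj for wj, xj in zip(w, x)) > 0 else -1 for x in X]
print(w, preds)
```

A bias term can be emulated by appending a constant feature of 1 to every input.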

### 6. Linear Classifiers: Perceptron, SVM, Softmax, Logistic Regression

Project completed on February 20, 2025.

* Implemented Perceptron, SVM, Softmax, and Logistic Regression classifiers from scratch in Python and applied them to the **Rice** and **Fashion-MNIST** datasets.
* Focused on understanding core concepts of linear classification, hyperparameter tuning, and performance evaluation using training/validation/test splits.
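
As one example from this group, a softmax classifier trained with full-batch gradient descent on the mean cross-entropy loss might look like the sketch below. The function names, the learning rate, the number of steps, and the six-point toy dataset are assumptions for illustration; the Rice and Fashion-MNIST experiments are not reproduced here.

```python
import math


def softmax(z):
    """Numerically stable softmax: subtract the max before exponentiating."""
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    s = sum(exps)
    return [e / s for e in exps]


def scores(W, x):
    """Class scores W x for a single input x (one row of W per class)."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]


def mean_cross_entropy(W, X, y):
    """Average negative log-probability of the true class."""
    return -sum(math.log(softmax(scores(W, xi))[yi])
                for xi, yi in zip(X, y)) / len(X)


def gd_step(W, X, y, lr):
    """One full-batch gradient descent step on the mean cross-entropy."""
    n_classes, n_feat = len(W), len(W[0])
    grad = [[0.0] * n_feat for _ in range(n_classes)]
    for xi, yi in zip(X, y):
        p = softmax(scores(W, xi))
        for c in range(n_classes):
            err = p[c] - (1.0 if c == yi else 0.0)  # dL/dscore_c
            for j in range(n_feat):
                grad[c][j] += err * xi[j] / len(X)
    return [[W[c][j] - lr * grad[c][j] for j in range(n_feat)]
            for c in range(n_classes)]


# Toy 3-class data; the last feature is a constant 1 acting as a bias
X = [(1, 0, 1), (0.9, 0.1, 1), (0, 1, 1),
     (0.1, 0.9, 1), (-1, -1, 1), (-0.9, -1.1, 1)]
y = [0, 0, 1, 1, 2, 2]

W = [[0.0] * 3 for _ in range(3)]
initial_loss = mean_cross_entropy(W, X, y)
for _ in range(200):
    W = gd_step(W, X, y, lr=0.5)
final_loss = mean_cross_entropy(W, X, y)
print(initial_loss, final_loss)
```

With two classes, the same model reduces to logistic regression.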