{"id":14958227,"url":"https://github.com/tirthajyoti/machine-learning-with-python","last_synced_at":"2025-05-14T20:08:02.612Z","repository":{"id":38899401,"uuid":"97429942","full_name":"tirthajyoti/Machine-Learning-with-Python","owner":"tirthajyoti","description":"Practice and tutorial-style notebooks  covering wide variety of machine learning techniques","archived":false,"fork":false,"pushed_at":"2023-05-22T22:28:39.000Z","size":101368,"stargazers_count":3191,"open_issues_count":9,"forks_count":1821,"subscribers_count":156,"default_branch":"master","last_synced_at":"2025-04-13T14:07:11.835Z","etag":null,"topics":["artificial-intelligence","classification","clustering","data-science","decision-trees","deep-learning","dimensionality-reduction","flask","k-nearest-neighbours","machine-learning","matplotlib","naive-bayes","neural-network","numpy","pandas","pytest","random-forest","regression","scikit-learn","statistics"],"latest_commit_sha":null,"homepage":"https://machine-learning-with-python.readthedocs.io/en/latest/","language":"Jupyter 
Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"bsd-2-clause","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/tirthajyoti.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":".github/FUNDING.yml","license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null},"funding":{"patreon":"tirthajyoti"}},"created_at":"2017-07-17T03:06:13.000Z","updated_at":"2025-04-12T11:52:38.000Z","dependencies_parsed_at":"2024-01-18T04:19:23.612Z","dependency_job_id":null,"html_url":"https://github.com/tirthajyoti/Machine-Learning-with-Python","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tirthajyoti%2FMachine-Learning-with-Python","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tirthajyoti%2FMachine-Learning-with-Python/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tirthajyoti%2FMachine-Learning-with-Python/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tirthajyoti%2FMachine-Learning-with-Python/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/tirthajyoti","download_url":"https://codeload.github.com/tirthajyoti/Machine-Learning-with-Python/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248724637,"owners_count":21151561,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_ur
l":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["artificial-intelligence","classification","clustering","data-science","decision-trees","deep-learning","dimensionality-reduction","flask","k-nearest-neighbours","machine-learning","matplotlib","naive-bayes","neural-network","numpy","pandas","pytest","random-forest","regression","scikit-learn","statistics"],"created_at":"2024-09-24T13:16:32.107Z","updated_at":"2025-04-13T14:07:25.387Z","avatar_url":"https://github.com/tirthajyoti.png","language":"Jupyter Notebook","readme":"[![License](https://img.shields.io/badge/License-BSD%202--Clause-orange.svg)](https://opensource.org/licenses/BSD-2-Clause)\n[![GitHub forks](https://img.shields.io/github/forks/tirthajyoti/Machine-Learning-with-Python.svg)](https://github.com/tirthajyoti/Machine-Learning-with-Python/network)\n[![GitHub stars](https://img.shields.io/github/stars/tirthajyoti/Machine-Learning-with-Python.svg)](https://github.com/tirthajyoti/Machine-Learning-with-Python/stargazers)\n[![PRs Welcome](https://img.shields.io/badge/PRs-welcome-brightgreen.svg)](https://github.com/tirthajyoti/Machine-Learning-with-Python/pulls)\n\n# Python Machine Learning Jupyter Notebooks ([ML website](https://machine-learning-with-python.readthedocs.io/en/latest/))\n\n### Dr. 
Tirthajyoti Sarkar, Fremont, California ([Please feel free to connect on LinkedIn here](https://www.linkedin.com/in/tirthajyoti-sarkar-2127aa7))\n\n![ml-ds](https://raw.githubusercontent.com/tirthajyoti/Machine-Learning-with-Python/master/Images/ML-DS-cycle-1.png)\n\n---\n\n## Also check out these super-useful Repos that I curated\n\n- ### [Highly cited and useful papers related to machine learning, deep learning, AI, game theory, reinforcement learning](https://github.com/tirthajyoti/Papers-Literature-ML-DL-RL-AI)\n\n- ### [Carefully curated resource links for data science in one place](https://github.com/tirthajyoti/Data-science-best-resources)\n\n## Requirements\n* **Python 3.6+**\n* **NumPy (`pip install numpy`)**\n* **Pandas (`pip install pandas`)**\n* **Scikit-learn (`pip install scikit-learn`)**\n* **SciPy (`pip install scipy`)**\n* **Statsmodels (`pip install statsmodels`)**\n* **MatplotLib (`pip install matplotlib`)**\n* **Seaborn (`pip install seaborn`)**\n* **Sympy (`pip install sympy`)**\n* **Flask (`pip install flask`)**\n* **WTForms (`pip install wtforms`)**\n* **Tensorflow (`pip install tensorflow\u003e=1.15`)**\n* **Keras (`pip install keras`)**\n* **pdpipe (`pip install pdpipe`)**\n\n---\n\nYou can start with this article that I wrote in Heartbeat magazine (on Medium platform): \n### [\"Some Essential Hacks and Tricks for Machine Learning with Python\"](https://heartbeat.fritz.ai/some-essential-hacks-and-tricks-for-machine-learning-with-python-5478bc6593f2)\n\u003cimg src=\"https://cookieegroup.com/wp-content/uploads/2018/10/2-1.png\" width=\"450\" height=\"300\"/\u003e\n\n## Essential tutorial-type notebooks on Pandas and Numpy\nJupyter notebooks covering a wide range of functions and operations on the topics of NumPy, Pandans, Seaborn, Matplotlib etc.\n\n* [Detailed Numpy operations](https://github.com/tirthajyoti/Machine-Learning-with-Python/blob/master/Pandas%20and%20Numpy/Numpy_operations.ipynb)\n* [Detailed Pandas 
operations](https://github.com/tirthajyoti/Machine-Learning-with-Python/blob/master/Pandas%20and%20Numpy/Pandas_Operations.ipynb)\n* [Numpy and Pandas quick basics](https://github.com/tirthajyoti/Machine-Learning-with-Python/blob/master/Pandas%20and%20Numpy/Numpy_Pandas_Quick.ipynb)\n* [Matplotlib and Seaborn quick basics](https://github.com/tirthajyoti/Machine-Learning-with-Python/blob/master/Pandas%20and%20Numpy/Matplotlib_Seaborn_basics.ipynb)\n* [Advanced Pandas operations](https://github.com/tirthajyoti/Machine-Learning-with-Python/blob/master/Pandas%20and%20Numpy/Advanced%20Pandas%20Operations.ipynb)\n* [How to read various data sources](https://github.com/tirthajyoti/Machine-Learning-with-Python/blob/master/Pandas%20and%20Numpy/Read_data_various_sources/How%20to%20read%20various%20sources%20in%20a%20DataFrame.ipynb)\n* [PDF reading and table processing demo](https://github.com/tirthajyoti/Machine-Learning-with-Python/blob/master/Pandas%20and%20Numpy/Read_data_various_sources/PDF%20table%20reading%20and%20processing%20demo.ipynb)\n* [How fast are Numpy operations compared to pure Python code?](https://github.com/tirthajyoti/Machine-Learning-with-Python/blob/master/Pandas%20and%20Numpy/How%20fast%20are%20NumPy%20ops.ipynb) (Read my [article](https://towardsdatascience.com/why-you-should-forget-for-loop-for-data-science-code-and-embrace-vectorization-696632622d5f) on Medium related to this topic)\n* [Fast reading from Numpy using .npy file format](https://github.com/tirthajyoti/Machine-Learning-with-Python/blob/master/Pandas%20and%20Numpy/Numpy_Reading.ipynb) (Read my [article](https://towardsdatascience.com/why-you-should-start-using-npy-file-more-often-df2a13cc0161) on Medium on this topic)\n\n## Tutorial-type notebooks covering regression, classification, clustering, dimensionality reduction, and some basic neural network algorithms\n\n### Regression\n* Simple linear regression with t-statistic generation\n\u003cimg 
src=\"https://slideplayer.com/slide/6053182/20/images/10/Simple+Linear+Regression+Model.jpg\" width=\"400\" height=\"300\"/\u003e\n\n* [Multiple ways to perform linear regression in Python and their speed comparison](https://github.com/tirthajyoti/Machine-Learning-with-Python/blob/master/Regression/Linear_Regression_Methods.ipynb) ([check the article I wrote on freeCodeCamp](https://medium.freecodecamp.org/data-science-with-python-8-ways-to-do-linear-regression-and-measure-their-speed-b5577d75f8b))\n\n* [Multi-variate regression with regularization](https://github.com/tirthajyoti/Machine-Learning-with-Python/blob/master/Regression/Multi-variate%20LASSO%20regression%20with%20CV.ipynb)\n\u003cimg src=\"https://upload.wikimedia.org/wikipedia/commons/thumb/f/f8/L1_and_L2_balls.svg/300px-L1_and_L2_balls.svg.png\"/\u003e\n\n* Polynomial regression using ***scikit-learn pipeline feature*** ([check the article I wrote on *Towards Data Science*](https://towardsdatascience.com/machine-learning-with-python-easy-and-robust-method-to-fit-nonlinear-data-19e8a1ddbd49))\n\n* [Decision trees and Random Forest regression](https://github.com/tirthajyoti/Machine-Learning-with-Python/blob/master/Regression/Random_Forest_Regression.ipynb) (showing how the Random Forest works as a robust/regularized meta-estimator rejecting overfitting)\n\n* [Detailed visual analytics and goodness-of-fit diagnostic tests for a linear regression problem](https://github.com/tirthajyoti/Machine-Learning-with-Python/blob/master/Regression/Regression_Diagnostics.ipynb)\n\n* [Robust linear regression using `HuberRegressor` from Scikit-learn](https://github.com/tirthajyoti/Machine-Learning-with-Python/blob/master/Regression/Robust%20Linear%20Regression.ipynb)\n\n-----\n\n### Classification\n* Logistic regression/classification ([Here is the Notebook](https://github.com/tirthajyoti/Machine-Learning-with-Python/blob/master/Classification/Logistic_Regression_Classification.ipynb))\n\u003cimg 
src=\"https://qph.fs.quoracdn.net/main-qimg-914b29e777e78b44b67246b66a4d6d71\"/\u003e\n\n* _k_-nearest neighbor classification ([Here is the Notebook](https://github.com/tirthajyoti/Machine-Learning-with-Python/blob/master/Classification/KNN_Classification.ipynb))\n\n* Decision trees and Random Forest Classification ([Here is the Notebook](https://github.com/tirthajyoti/Machine-Learning-with-Python/blob/master/Classification/DecisionTrees_RandomForest_Classification.ipynb))\n\n* Support vector machine classification ([Here is the Notebook](https://github.com/tirthajyoti/Machine-Learning-with-Python/blob/master/Classification/Support_Vector_Machine_Classification.ipynb)) (**[check the article I wrote in Towards Data Science on SVM and sorting algorithm](https://towardsdatascience.com/how-the-good-old-sorting-algorithm-helps-a-great-machine-learning-technique-9e744020254b))**\n\n\u003cimg src=\"https://docs.opencv.org/2.4/_images/optimal-hyperplane.png\"/\u003e\n\n* Naive Bayes classification ([Here is the Notebook](https://github.com/tirthajyoti/Machine-Learning-with-Python/blob/master/Classification/Naive_Bayes_Classification.ipynb))\n\n---\n\n### Clustering\n\u003cimg src=\"https://i.ytimg.com/vi/IJt62uaZR-M/maxresdefault.jpg\" width=\"450\" height=\"300\"/\u003e\n\n* _K_-means clustering ([Here is the Notebook](https://github.com/tirthajyoti/Machine-Learning-with-Python/blob/master/Clustering-Dimensionality-Reduction/K_Means_Clustering_Practice.ipynb))\n\n* Affinity propagation (showing its time complexity and the effect of damping factor) ([Here is the Notebook](https://github.com/tirthajyoti/Machine-Learning-with-Python/blob/master/Clustering-Dimensionality-Reduction/Affinity_Propagation.ipynb))\n\n* Mean-shift technique (showing its time complexity and the effect of noise on cluster discovery) ([Here is the Notebook](https://github.com/tirthajyoti/Machine-Learning-with-Python/blob/master/Clustering-Dimensionality-Reduction/Mean_Shift_Clustering.ipynb))\n\n* 
DBSCAN (showing how it can generically detect areas of high density irrespective of cluster shapes, which the k-means fails to do) ([Here is the Notebook](https://github.com/tirthajyoti/Machine-Learning-with-Python/blob/master/Clustering-Dimensionality-Reduction/DBScan_Clustering.ipynb))\n\n* Hierarchical clustering with Dendograms showing how to choose optimal number of clusters ([Here is the Notebook](https://github.com/tirthajyoti/Machine-Learning-with-Python/blob/master/Clustering-Dimensionality-Reduction/Hierarchical_Clustering.ipynb))\n\n\u003cimg src=\"https://www.researchgate.net/profile/Carsten_Walther/publication/273456906/figure/fig3/AS:294866065084419@1447312956501/Example-of-hierarchical-clustering-clusters-are-consecutively-merged-with-the-most.png\" width=\"700\" height=\"400\"/\u003e\n\n---\n\n### Dimensionality reduction\n* Principal component analysis\n\n\u003cimg src=\"https://i.ytimg.com/vi/QP43Iy-QQWY/maxresdefault.jpg\" width=\"450\" height=\"300\"/\u003e\n\n---\n\n### Deep Learning/Neural Network\n* [Demo notebook to illustrate the superiority of deep neural network for complex nonlinear function approximation task](https://github.com/tirthajyoti/Machine-Learning-with-Python/blob/master/Function%20Approximation%20by%20Neural%20Network/Polynomial%20regression%20-%20linear%20and%20neural%20network.ipynb)\n* Step-by-step building of 1-hidden-layer and 2-hidden-layer dense network using basic TensorFlow methods\n\n---\n\n### Random data generation using symbolic expressions\n* How to use [Sympy package](https://www.sympy.org/en/index.html) to generate random datasets using symbolic mathematical expressions.\n\n* Here is my article on Medium on this topic: [Random regression and classification problem generation with symbolic expression](https://towardsdatascience.com/random-regression-and-classification-problem-generation-with-symbolic-expression-a4e190e37b8d)\n\n---\n\n### Synthetic data generation techniques\n* [Notebooks 
here](https://github.com/tirthajyoti/Machine-Learning-with-Python/tree/master/Synthetic_data_generation)\n\n### Simple deployment examples (serving ML models on web API)\n* [Serving a linear regression model through a simple HTTP server interface](https://github.com/tirthajyoti/Machine-Learning-with-Python/tree/master/Deployment/Linear_regression). User needs to request predictions by executing a Python script. Uses `Flask` and `Gunicorn`.\n\n* [Serving a recurrent neural network (RNN) through a HTTP webpage](https://github.com/tirthajyoti/Machine-Learning-with-Python/tree/master/Deployment/rnn_app), complete with a web form, where users can input parameters and click a button to generate text based on the pre-trained RNN model. Uses `Flask`, `Jinja`, `Keras`/`TensorFlow`, `WTForms`.\n\n---\n\n### Object-oriented programming with machine learning\nImplementing some of the core OOP principles in a machine learning context by [building your own Scikit-learn-like estimator, and making it better](https://github.com/tirthajyoti/Machine-Learning-with-Python/blob/master/OOP_in_ML/Class_MyLinearRegression.ipynb).\n\nSee my articles on Medium on this topic.\n\n* [Object-oriented programming for data scientists: Build your ML estimator](https://towardsdatascience.com/object-oriented-programming-for-data-scientists-build-your-ml-estimator-7da416751f64)\n* [How a simple mix of object-oriented programming can sharpen your deep learning prototype](https://towardsdatascience.com/how-a-simple-mix-of-object-oriented-programming-can-sharpen-your-deep-learning-prototype-19893bd969bd)\n\n---\n### Unit testing ML code with Pytest\nCheck the files and detailed instructions in the [Pytest](https://github.com/tirthajyoti/Machine-Learning-with-Python/tree/master/Pytest) directory to understand how one should write unit testing code/module for machine learning models\n\n---\n\n### Memory and timing profiling\n\nProfiling data science code and ML models for memory footprint and computing 
time is a critical but often overlooked area. Here are a couple of Notebooks showing the ideas:\n\n* [Memory profiling using Scalene](https://github.com/tirthajyoti/Machine-Learning-with-Python/tree/master/Memory-profiling/Scalene)\n* [Time-profiling data science code](https://github.com/tirthajyoti/Machine-Learning-with-Python/blob/master/Time-profiling/cProfile.ipynb)\n","funding_links":["https://patreon.com/tirthajyoti"],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftirthajyoti%2Fmachine-learning-with-python","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ftirthajyoti%2Fmachine-learning-with-python","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftirthajyoti%2Fmachine-learning-with-python/lists"}