{"id":14958317,"url":"https://github.com/suvoooo/machine_learning","last_synced_at":"2025-04-06T09:10:28.046Z","repository":{"id":50293264,"uuid":"149387421","full_name":"suvoooo/Machine_Learning","owner":"suvoooo","description":"Some fundamental machine learning and data-analysis techniques are explained through realistic examples. ","archived":false,"fork":false,"pushed_at":"2024-09-18T14:55:05.000Z","size":54832,"stargazers_count":122,"open_issues_count":0,"forks_count":200,"subscribers_count":8,"default_branch":"master","last_synced_at":"2025-04-06T09:09:57.826Z","etag":null,"topics":["machine-learning","pandas","python3","seaborn","sklearn"],"latest_commit_sha":null,"homepage":"","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/suvoooo.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE.md","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2018-09-19T03:30:27.000Z","updated_at":"2025-03-18T01:51:33.000Z","dependencies_parsed_at":"2024-09-02T16:21:32.428Z","dependency_job_id":"89298a5a-e258-4ad7-b659-e197fb29ef2a","html_url":"https://github.com/suvoooo/Machine_Learning","commit_stats":{"total_commits":94,"total_committers":3,"mean_commits":"31.333333333333332","dds":"0.25531914893617025","last_synced_commit":"cb9f09ee7bb54a575fd7527e147709def4a27b2f"},"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/suvoooo%2FMachine_Learning","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/suvoooo%2FMachine_Learning/tags","releases_url":"ht
tps://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/suvoooo%2FMachine_Learning/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/suvoooo%2FMachine_Learning/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/suvoooo","download_url":"https://codeload.github.com/suvoooo/Machine_Learning/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247457803,"owners_count":20941906,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["machine-learning","pandas","python3","seaborn","sklearn"],"created_at":"2024-09-24T13:16:43.938Z","updated_at":"2025-04-06T09:10:28.022Z","avatar_url":"https://github.com/suvoooo.png","language":"Jupyter Notebook","funding_links":[],"categories":[],"sub_categories":[],"readme":"## Machine Learning and Data Analysis\n\n\n\n### This repo contains introduction and examples of some of the most important machine learning and data-analysis techniques.\n#### Filenames are preceded by DDMMYY. For descriptions and more check the Wiki Page. \n#### Dedicated _Deep Learning Repository_ similar to this is [here](https://github.com/suvoooo/Learn-TensorFlow). 
\n----------------------------------------------------------------------------------------------------------------------------\n\n#### Libraries\n![Python](https://img.shields.io/badge/python-3670A0?style=for-the-badge\u0026logo=python\u0026logoColor=ffdd54) ![NumPy](https://img.shields.io/badge/numpy-%23013243.svg?style=for-the-badge\u0026logo=numpy\u0026logoColor=white) ![Pandas](https://img.shields.io/badge/pandas-%23150458.svg?style=for-the-badge\u0026logo=pandas\u0026logoColor=white) ![scikit-learn](https://img.shields.io/badge/scikit--learn-%23F7931E.svg?style=for-the-badge\u0026logo=scikit-learn\u0026logoColor=white) ![TensorFlow](https://img.shields.io/badge/TensorFlow-%23FF6F00.svg?style=for-the-badge\u0026logo=TensorFlow\u0026logoColor=white) ![SciPy](https://img.shields.io/badge/SciPy-%230C55A5.svg?style=for-the-badge\u0026logo=scipy\u0026logoColor=%white) ![pymc3](https://drive.google.com/uc?export=view\u0026id=1oi-5--D8kcgJdVV_GAI-pZq-ZKr0STOX)\n\n\n-----------------------------------------------------------------------------------------------------------------------------------------\n\n*PCA_Muller.py 190818:* Principal component analysis example with the breast cancer data-set. \n\n*270918: RidgeandLin.py, LassoandLin.py:* Lasso and Ridge regression examples.     \n\n*081018: bank.csv*, data set on a Portuguese company selling products to random customers over phone call(s). The data-set description is available [here](http://archive.ics.uci.edu/ml/datasets/Bank+Marketing).\n\n*161018: gender_purchase.csv*, data-set of two columns describing customers buying a product depending on gender.\n\n*111118: winequality-red.csv*, red wine data set, where the output is the quality column, which ranges from 0 to 10.\n\n*121118: pipelineWine.py*, A simple example of applying Pipeline and GridSearchCV together using the red wine data.  \n\n*24112018: lagmult.py*, This program demonstrates a simple constrained optimization problem using figures.   
\n\n*11122018: Consumer_Complaints_short.csv*, 3 columns describing the complaints, product_label and category. The complete file can be obtained from [Govt.data](https://catalog.data.gov/dataset/consumer-complaint-database/resource/2f297213-7198-4be1-af1e-2d2623e7f6e9). \n\n*13122018: Text-classification_compain_suvo.py*, Classifies the consumer complaints data described above. \n\n*1912018: SVMdemo.py*, this program shows the effect of using an RBF kernel to map from 2D space to 3D space. The animation requires ffmpeg on Unix systems. \n\n*05032019: IBM_Python_Web_Scrapping.ipynb*, Deals with basic web scraping, string handling, and image manipulation.\n\n*06042019: datacleaning*, Folder containing files and images related to data cleaning with pandas. \n\n*08062010: DBSCAN_Complete*, Folder containing files and images related to applying the DBSCAN algorithm to cluster weather stations in Canada. \n\n*13072019: SVM_Decision_Boundary*, Pipeline + GridSearchCV are used to find the best-fit parameters for an SVM, and the decision function contours of the SVM classifier for binary classification are then plotted.      \n\n*28122019: DecsTree*, Folder contains a notebook using a decision tree classifier on the [Bank Marketing Data-Set](http://archive.ics.uci.edu/ml/datasets/Bank+Marketing).   \n\n*07032020: Conjugate Prior*, Folder contains a notebook where the concept of a conjugate prior is discussed, including an introduction to [PyMC3](https://docs.pymc.io/).   \n\n*29052020: ExMax_Algo*, Folder contains a notebook explaining the Expectation Maximization algorithm in full. \n\n*11092020: AdaptiveLoss.ipynb*, File contains a description and a simple implementation of a robust and adaptive loss function. [Original Paper by J. Barron](https://arxiv.org/pdf/1701.03077.pdf). More details on [TDS](https://medium.com/@saptashwa/the-most-awesome-loss-function-172ffc106c99).   
\n\n*31092020: pima_diabetes.ipynb*, File describes data preparation and choosing the best machine learning algorithm for a binary classification task. \nA little more detail on the [kaggle kernel](https://www.kaggle.com/suvoooo/eda-and-choosing-best-classifier-on-pima-diabetes). \n\n\n*15112020: terrorism_kaggle.ipynb*, Notebook contains detailed examples of how to think about problems and interpret large-scale data using the [Global Terrorism Database](https://www.kaggle.com/START-UMD/gtd). Apart from the Pandas Groupby and Crosstab methods, I have also used the Folium and Basemap libraries for visualizing Leaflet maps and 2D data on maps, respectively. More on [The Startup](https://medium.com/swlh/practical-data-analysis-using-pandas-global-terrorism-database-20b29009adad).     \n\n*15022021: FocalLoss_Ex.ipynb*, Notebook contains a detailed explanation of how Focal Loss works. Please read the original [Focal Loss paper](https://arxiv.org/abs/1708.02002). An example of implementing Focal Loss using TensorFlow is also shown. For more detail, check the post on [TDS](https://towardsdatascience.com/a-loss-function-suitable-for-class-imbalanced-data-focal-loss-af1702d75d75). \n\n\n*19062021: Augly_Try.ipynb*, Notebook contains examples of image augmentation using [Facebook's Augly](https://ai.facebook.com/blog/augly-a-new-data-augmentation-library-to-help-build-more-robust-ai-models/) Library. For more detail, check the notebook and the [TDS](https://towardsdatascience.com/facebook-just-launched-the-coolest-augmentation-library-augly-3910c05db505) post. \n\n*24122021: NB_LogisticReg.ipynb*, Notebook explains the connection between Gaussian Naive Bayes and Logistic Regression and determines the parameters of Logistic Regression starting from GNB. 
The notebook is self-explanatory, but you can also check the [TDS post](https://towardsdatascience.com/connecting-naive-bayes-and-logistic-regression-binary-classification-ce69e527157f).\n\n\n\n\n------------------------\n\n## License \n\nDistributed under the Apache License. Read `LICENSE.md` for details. \n\n-----------------------------\n## Contacts\n\n[Saptashwa](https://www.linkedin.com/in/saptashwa/). \n\n\n\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsuvoooo%2Fmachine_learning","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fsuvoooo%2Fmachine_learning","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsuvoooo%2Fmachine_learning/lists"}