{"id":19693170,"url":"https://github.com/breadrock1/ml-labs","last_synced_at":"2026-04-15T14:03:38.856Z","repository":{"id":171596011,"uuid":"219199169","full_name":"breadrock1/ML-labs","owner":"breadrock1","description":"Machine Learning. There is a suite of Machine Learning laboratory tasks.","archived":false,"fork":false,"pushed_at":"2021-11-28T09:07:19.000Z","size":11055,"stargazers_count":0,"open_issues_count":0,"forks_count":1,"subscribers_count":1,"default_branch":"master","last_synced_at":"2025-07-22T20:47:00.632Z","etag":null,"topics":["classification","knn","knn-classification","machine-learning","nlp","nlp-machine-learning","numpy","pandas","pandas-dataframe","python","python3","random-forest","regression","scikit-learn-python","skele","trees"],"latest_commit_sha":null,"homepage":"","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/breadrock1.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2019-11-02T18:51:13.000Z","updated_at":"2021-11-28T09:07:22.000Z","dependencies_parsed_at":null,"dependency_job_id":"726e1c34-bd79-431f-9bc9-2a08b60a1a80","html_url":"https://github.com/breadrock1/ML-labs","commit_stats":null,"previous_names":["breadrock1/ml-labs"],"tags_count":1,"template":false,"template_full_name":null,"purl":"pkg:github/breadrock1/ML-labs","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/breadrock1%2FML-labs","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/breadrock1%2FML-labs/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub
/repositories/breadrock1%2FML-labs/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/breadrock1%2FML-labs/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/breadrock1","download_url":"https://codeload.github.com/breadrock1/ML-labs/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/breadrock1%2FML-labs/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":31844333,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-15T13:28:40.153Z","status":"ssl_error","status_checked_at":"2026-04-15T13:28:29.396Z","response_time":63,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["classification","knn","knn-classification","machine-learning","nlp","nlp-machine-learning","numpy","pandas","pandas-dataframe","python","python3","random-forest","regression","scikit-learn-python","skele","trees"],"created_at":"2024-11-11T19:15:56.417Z","updated_at":"2026-04-15T14:03:38.823Z","avatar_url":"https://github.com/breadrock1.png","language":"Jupyter Notebook","readme":"# Machine Learning 
Labs\n\n![GitHub](https://badgen.net/badge/icon/github?icon=github\u0026label)\n![version](https://img.shields.io/badge/version-1.0-blue)\n[![Awesome](https://awesome.re/badge.svg)](https://awesome.re)\n\n![Python](https://img.shields.io/badge/Python-FFD43B?style=for-the-badge\u0026logo=python\u0026logoColor=darkgreen)\n\nMachine Learning. This is a set of Machine Learning laboratory tasks.\n\n## Contents\n\n- [What is ML](#whatis)\n- [Labs](#labs)\n\t* [Lab 1 - Method of nearest neighbors](#lab_1)\n\t* [Lab 2 - Pipelines and backup](#lab_2)\n\t* [Lab 3 - Linear models and regressions](#lab_3)\n\t* [Lab 4 - Trees](#lab_4)\n\t* [Lab 5 - Neural networks](#lab_5)\n\t* [Lab 6 - NLP](#lab_6)\n\n\n## \u003ca name=\"whatis\"/\u003e What is Machine Learning?\n\nMachine learning (ML) is the study of computer algorithms that can improve automatically through experience and by the use of data. It is seen as a part of artificial intelligence. Machine learning algorithms build a model based on sample data, known as training data, in order to make predictions or decisions without being explicitly programmed to do so. Machine learning algorithms are used in a wide variety of applications, such as in medicine, email filtering, speech recognition, and computer vision, where it is difficult or unfeasible to develop conventional algorithms to perform the needed tasks.\n\nA subset of machine learning is closely related to computational statistics, which focuses on making predictions using computers; but not all machine learning is statistical learning. The study of mathematical optimization delivers methods, theory and application domains to the field of machine learning. Data mining is a related field of study, focusing on exploratory data analysis through unsupervised learning. Some implementations of machine learning use data and neural networks in a way that mimics the working of a biological brain. 
In its application across business problems, machine learning is also referred to as predictive analytics.\n\nLiterature and useful links:\n\n- [Introduction to ML](https://habr.com/ru/post/448892/)\n- [ML project-walkthrough by @WillKoehrsen](https://github.com/WillKoehrsen/machine-learning-project-walkthrough)\n- [Heart Disease UCI dataset used in the labs](https://www.kaggle.com/ronitf/heart-disease-uci)\n\n\n## \u003ca name=\"labs\"/\u003e Labs\n\n### \u003ca name=\"lab_1\"/\u003e Laboratory № 1 - KNN\n\nProcess your own data set by analogy with the examples considered: data preprocessing and predictions using the k-nearest neighbors method.\n\n### \u003ca name=\"lab_2\"/\u003e Laboratory № 2 - Pipeline and backup\n\nSelect and justify a quality metric. Try several machine learning methods from sklearn and see which one performs best on the selected metric (for now without tuning hyperparameters). Optimize KNN for that metric. Wrap all previous data transformations (conversion, normalization, etc. from the first and second labs) in a Pipeline and save the resulting model with pickle.\n\n### \u003ca name=\"lab_3\"/\u003e Laboratory № 3 - Linear models and regressions\n\nTrain logistic regression on your data and select its parameters. Compare the results of L1 and L2 regularization. Inspect the feature weights and explain the obtained values. Perform feature selection using L1 regularization, select the optimal C, and explain the result.\n\n### \u003ca name=\"lab_4\"/\u003e Laboratory № 4 - Trees\n\nVerify the instability of a single tree on your data. Select the most important features with a random forest and compare the result with feature selection by a linear method with L1 regularization. Compare the performance of a random forest with and without cross-validation. Compare the quality and the training time (%time at the beginning of the cell) of the forest with gradient boosting over decision trees, choosing the optimal parameters for each. 
It is especially good if you train gradient boosting on a GPU.\n\n### \u003ca name=\"lab_5\"/\u003e Laboratory № 5 - Neural networks\n\nCollect a set of photos of your team members. Train a neural network so that it classifies new photos of the team members well.\n\n### \u003ca name=\"lab_6\"/\u003e Laboratory № 6 - NLP\n\nClassify Russian texts into several categories. It is best if the text corpus is really large. Pre-process the texts: normalization, lemmatization, etc. Compare embeddings. Try several classification methods.\n\n\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbreadrock1%2Fml-labs","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fbreadrock1%2Fml-labs","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbreadrock1%2Fml-labs/lists"}