{"id":14958332,"url":"https://github.com/daya-jin/ml_for_learner","last_synced_at":"2025-10-24T14:31:39.703Z","repository":{"id":65302302,"uuid":"162587303","full_name":"Daya-Jin/ML_for_learner","owner":"Daya-Jin","description":"Implementations of the machine learning algorithm with Python and numpy","archived":false,"fork":false,"pushed_at":"2021-10-20T10:05:27.000Z","size":8307,"stargazers_count":87,"open_issues_count":1,"forks_count":44,"subscribers_count":1,"default_branch":"master","last_synced_at":"2025-01-31T02:21:50.103Z","etag":null,"topics":["implementation","machine-learning","python3","sklearn"],"latest_commit_sha":null,"homepage":"","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/Daya-Jin.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2018-12-20T14:02:33.000Z","updated_at":"2025-01-24T09:02:35.000Z","dependencies_parsed_at":"2023-01-16T15:15:26.811Z","dependency_job_id":null,"html_url":"https://github.com/Daya-Jin/ML_for_learner","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Daya-Jin%2FML_for_learner","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Daya-Jin%2FML_for_learner/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Daya-Jin%2FML_for_learner/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Daya-Jin%2FML_for_learner/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/Daya-Jin","download_url":"https://codeload.github.com/Daya-J
in/ML_for_learner/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":237982442,"owners_count":19397257,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["implementation","machine-learning","python3","sklearn"],"created_at":"2024-09-24T13:16:47.310Z","updated_at":"2025-10-24T14:31:34.644Z","avatar_url":"https://github.com/Daya-Jin.png","language":"Jupyter Notebook","readme":"# ML_for_learner\nThis project aims to implement a mini scikit-learn-style machine learning library using numpy. Every algorithm is accompanied by a blog post explaining its theory, and many also come with a notebook that walks through the implementation details. The goal is to give algorithm learners a one-stop path from theory to implementation.\n\nSince my knowledge is limited and I have no professional Python development experience, this library is still a fairly loose collection of code. If you spot any mistakes or bugs in the blog posts, notebooks, or code, or even just awkward wording, feel free to contact me, or simply open an issue on the project page. Thank you.\n\nQQ: 435248055 \u0026ensp; | \u0026ensp; WeChat: QQ435248055 \u0026ensp; | \u0026ensp; [Blog](https://daya-jin.github.io/)\n\n---\n\nClick an algorithm's name to open the corresponding blog post on its theory; the notebook shows how to implement the algorithm step by step, and the code link points to the modularized source file.\n\nNote: unless otherwise stated, every model expects input in ```numpy.ndarray``` format; some also accept a ```List``` or nested ```List```s, but no other data formats are guaranteed. Because Python type hints do not currently support numpy, this is not annotated in the code (thanks to WeChat user @Stream for the reminder).\n\n## Supervised learning\n\n|Class|Algorithm|Implementation|Code|\n|-|-|-|-|\n|Generalized Linear Models|[Linear Regression](https://daya-jin.github.io/2018/09/23/LinearRegression/)|[notebook](https://github.com/Daya-Jin/ML_for_learner/blob/master/linear_model/LinearRegression.ipynb)|[code](https://github.com/Daya-Jin/ML_for_learner/blob/master/linear_model/LinearRegression.py)|\n||[Logistic 
regression](https://daya-jin.github.io/2018/10/02/LogisticRegression/)|[notebook](https://github.com/Daya-Jin/ML_for_learner/blob/master/linear_model/LogisticRegression.ipynb)|[code](https://github.com/Daya-Jin/ML_for_learner/blob/master/linear_model/LogisticRegression.py)|\n|[Nearest Neighbors](https://daya-jin.github.io/2018/12/29/KNearestNeighbor/)|Nearest Neighbors Classification|[notebook](https://github.com/Daya-Jin/ML_for_learner/blob/master/neighbors/KNN.ipynb)|[code](https://github.com/Daya-Jin/ML_for_learner/blob/master/neighbors/KNeighborsClassifier.py)|\n|[Naive Bayes](https://daya-jin.github.io/2018/10/04/NaiveBayes/)|Gaussian Naive Bayes|[notebook](https://github.com/Daya-Jin/ML_for_learner/blob/master/naive_bayes/Gaussian%20Naive%20Bayes.ipynb)|[code](https://github.com/Daya-Jin/ML_for_learner/blob/master/naive_bayes/GaussianNB.py)|\n|[Support Vector Machine](https://daya-jin.github.io/2018/10/17/SupportVectorMachine/)|SVC|[notebook](https://github.com/Daya-Jin/ML_for_learner/blob/master/svm/SMO.ipynb)|[code](https://github.com/Daya-Jin/ML_for_learner/blob/master/svm/SVC.py)|\n|[Decision Trees](https://daya-jin.github.io/2018/08/10/DecisionTree/)|[ID3 Classification](https://daya-jin.github.io/2018/08/10/DecisionTree/#id3)|[notebook](https://github.com/Daya-Jin/ML_for_learner/blob/master/tree/ID3_Clf.ipynb)|[code](https://github.com/Daya-Jin/ML_for_learner/blob/master/tree/ID3_Clf.py)|\n||ID3 Regression|[notebook](https://github.com/Daya-Jin/ML_for_learner/blob/master/tree/ID3_Reg.ipynb)|[code](https://github.com/Daya-Jin/ML_for_learner/blob/master/tree/ID3_Reg.py)|\n||[CART Classification](https://daya-jin.github.io/2018/08/10/DecisionTree/#classification-and-regression-tree)|[notebook](https://github.com/Daya-Jin/ML_for_learner/blob/master/tree/DecisionTreeClassifier.ipynb)|[code](https://github.com/Daya-Jin/ML_for_learner/blob/master/tree/DecisionTreeClassifier.py)|\n||CART 
Regression|[notebook](https://github.com/Daya-Jin/ML_for_learner/blob/master/tree/DecisionTreeRegressor.ipynb)|[code](https://github.com/Daya-Jin/ML_for_learner/blob/master/tree/DecisionTreeRegressor.py)|\n|[Ensemble methods](https://daya-jin.github.io/2018/08/15/EnsembleLearning/)|[Random Forests Classification](https://daya-jin.github.io/2018/08/15/EnsembleLearning/#random-forest)|[notebook](https://github.com/Daya-Jin/ML_for_learner/blob/master/ensemble/RandomForestClassifier.ipynb)|[code](https://github.com/Daya-Jin/ML_for_learner/blob/master/ensemble/RandomForestClassifier.py)|\n||Random Forests Regression|[notebook](https://github.com/Daya-Jin/ML_for_learner/blob/master/ensemble/RandomForestRegressor.ipynb)|[code]()|\n||[AdaBoosting Classification](https://daya-jin.github.io/2018/08/15/EnsembleLearning/#boosting)|[notebook](https://github.com/Daya-Jin/ML_for_learner/blob/master/ensemble/AdaBoostClassifier.ipynb)|[code](https://github.com/Daya-Jin/ML_for_learner/blob/master/ensemble/AdaBoostClassifier.py)|\n\n## Unsupervised learning\n\n|Class|Algorithm|Implementation|Code|\n|-|-|-|-|\n|Gaussian mixture models|[Gaussian Mixture](https://daya-jin.github.io/2019/03/15/Gaussian_Mixture_Models/)|[notebook](https://github.com/Daya-Jin/ML_for_learner/blob/master/mixture/GaussianMixture.ipynb)|[code](https://github.com/Daya-Jin/ML_for_learner/blob/master/mixture/GaussianMixture.py)|\n|Clustering|[K-means](https://daya-jin.github.io/2018/09/22/KMeans/)|[notebook](https://github.com/Daya-Jin/ML_for_learner/blob/master/cluster/KMeans.ipynb)|[code](https://github.com/Daya-Jin/ML_for_learner/blob/master/cluster/KMeans.py)|\n||[DBSCAN](https://daya-jin.github.io/2018/08/06/DBSCAN/)|[notebook](https://github.com/Daya-Jin/ML_for_learner/blob/master/cluster/DBSCAN.ipynb)|[code](https://github.com/Daya-Jin/ML_for_learner/blob/master/cluster/DBSCAN.py)|\n|[Association 
Rules](https://daya-jin.github.io/2018/12/30/AssociationRules/)|Apriori|[notebook](https://github.com/Daya-Jin/ML_for_learner/blob/master/rule/Apriori.ipynb)||\n|[Collaborative Filtering](https://daya-jin.github.io/2019/04/03/CollaborativeFiltering/)|[User-based](https://daya-jin.github.io/2019/04/03/CollaborativeFiltering/#user-based-collaborative-filtering)|[notebook](https://github.com/Daya-Jin/ML_for_learner/blob/master/recommend/1.%20user_based_CF.ipynb)||\n||[Item-based](https://daya-jin.github.io/2019/04/03/CollaborativeFiltering/#item-based-collaborative-filtering)|[notebook](https://github.com/Daya-Jin/ML_for_learner/blob/master/recommend/2.%20item_based_CF.ipynb)||\n||[LFM](https://daya-jin.github.io/2019/04/03/CollaborativeFiltering/#latent-factor-model)|[notebook](https://github.com/Daya-Jin/ML_for_learner/blob/master/recommend/LFM.ipynb)||\n\n## Model selection and evaluation\n\n|Class|Approach|Code|\n|-|-|-|\n|[Model Selection](https://daya-jin.github.io/2018/12/11/Model_Assessment_and_Selection/)|Dataset Split|[code](https://github.com/Daya-Jin/ML_for_learner/blob/master/model_selection/train_test_split.py)|\n||[K-Fold](https://github.com/Daya-Jin/ML_for_learner/blob/master/model_selection/KFold.ipynb)|[code](https://github.com/Daya-Jin/ML_for_learner/blob/master/model_selection/KFold.py)|\n||Stratified K-Fold|[code](https://github.com/Daya-Jin/ML_for_learner/blob/master/model_selection/StratifiedKFold.py)|\n|[Metrics](https://daya-jin.github.io/2019/03/27/Evaluation_Metircs/)|Accuracy|[code](https://github.com/Daya-Jin/ML_for_learner/blob/master/metrics/Classification.py#L4)|\n||Log 
loss|[code](https://github.com/Daya-Jin/ML_for_learner/blob/master/metrics/Classification.py#L53)|\n||F1-score|[code](https://github.com/Daya-Jin/ML_for_learner/blob/master/metrics/Classification.py#L11)|\n||[AUC](https://github.com/Daya-Jin/ML_for_learner/blob/master/metrics/AUC.ipynb)|[code](https://github.com/Daya-Jin/ML_for_learner/blob/master/metrics/Classification.py#L75)|\n||Explained Variance|[code](https://github.com/Daya-Jin/ML_for_learner/blob/master/metrics/Regression.py#L4)|\n||Mean Absolute Error|[code](https://github.com/Daya-Jin/ML_for_learner/blob/master/metrics/Regression.py#L9)|\n||Mean Squared Error|[code](https://github.com/Daya-Jin/ML_for_learner/blob/master/metrics/Regression.py#L14)|\n||R Square|[code](https://github.com/Daya-Jin/ML_for_learner/blob/master/metrics/Regression.py#L19)|\n||[Euclidean Distances](https://github.com/Daya-Jin/ML_for_learner/blob/master/metrics/pairwise/euclidean_distances.ipynb)|[code](https://github.com/Daya-Jin/ML_for_learner/blob/master/metrics/pairwise/euclidean_distances.py)|\n\n## Preprocessing data\n\n|Class|Algorithm|Implementation|Code|\n|-|-|-|-|\n|[Feature Scaling](https://daya-jin.github.io/2019/03/20/Data_Scaling/)|StandardScaler||[code](https://github.com/Daya-Jin/ML_for_learner/blob/master/preprocessing/StandardScaler.py)|\n||MinMaxScaler||[code](https://github.com/Daya-Jin/ML_for_learner/blob/master/preprocessing/MinMaxScaler.py)|\n|Unsupervised dimensionality reduction|PCA|[notebook](https://github.com/Daya-Jin/ML_for_learner/blob/master/decomposition/PCA.ipynb)|[code](https://github.com/Daya-Jin/ML_for_learner/blob/master/decomposition/PCA.py)|\n||SVD|[notebook](https://github.com/Daya-Jin/ML_for_learner/blob/master/decomposition/SVD.ipynb)|[code](https://github.com/Daya-Jin/ML_for_learner/blob/master/decomposition/TruncatedSVD.py)|\n|Supervised dimensionality reduction|[Linear Discriminant 
Analysis](https://daya-jin.github.io/2018/12/05/LinearDiscriminantAnalysis/)|[notebook](https://github.com/Daya-Jin/ML_for_learner/blob/master/discriminant_analysis/LinearDiscriminantAnalysis.ipynb)|[code](https://github.com/Daya-Jin/ML_for_learner/blob/master/discriminant_analysis/LinearDiscriminantAnalysis.py)|\n|Text Feature|Count Feature||[code](https://github.com/Daya-Jin/ML_for_learner/blob/master/feature_extraction/text.py#L6)|\n||[TF-IDF]()||[code](https://github.com/Daya-Jin/ML_for_learner/blob/master/feature_extraction/text.py#L48)|\n\n## Known Issues\n\nCode reuse across the project is low.\n\nThe random forest implementation is not parallelized.\n\nThe LDA code is missing some functionality.\n\nThe K-Fold code uses ```np.append()```, which is inefficient.","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdaya-jin%2Fml_for_learner","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fdaya-jin%2Fml_for_learner","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdaya-jin%2Fml_for_learner/lists"}