{"id":54638,"url":"https://github.com/data-folks/data-science-learning-path","name":"data-science-learning-path","description":"Data Science Learning Path - A complete guide to learn data science for beginners","projects_count":70,"last_synced_at":"2026-06-20T19:00:22.974Z","repository":{"id":46567483,"uuid":"396037354","full_name":"data-folks/data-science-learning-path","owner":"data-folks","description":"Data Science Learning Path - A complete guide to learn data science for beginners","archived":false,"fork":false,"pushed_at":"2021-10-24T02:21:32.000Z","size":122,"stargazers_count":272,"open_issues_count":5,"forks_count":61,"subscribers_count":2,"default_branch":"main","last_synced_at":"2026-06-04T01:02:35.495Z","etag":null,"topics":["awesome-list","data-science","indonesia","indonesian-language","machine-learning","python"],"latest_commit_sha":null,"homepage":"","language":null,"has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/data-folks.png","metadata":{},"created_at":"2021-08-14T14:44:39.000Z","updated_at":"2026-06-03T15:54:35.000Z","dependencies_parsed_at":"2022-09-07T04:52:20.386Z","dependency_job_id":null,"html_url":"https://github.com/data-folks/data-science-learning-path","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/data-folks/data-science-learning-path","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/data-folks%2Fdata-science-learning-path","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/data-folks%2Fdata-science-learning-path/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/data-folks%2Fdata-science-learning-path/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/data-folks%2Fdata-science-learning-path/manifests","owner_url":"ht
tps://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/data-folks","download_url":"https://codeload.github.com/data-folks/data-science-learning-path/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/data-folks%2Fdata-science-learning-path/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":34581934,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T15:22:16.424Z","status":"online","status_checked_at":"2026-06-20T02:00:06.407Z","response_time":98,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"created_at":"2024-01-18T10:54:22.547Z","updated_at":"2026-06-20T19:00:22.975Z","primary_language":null,"list_of_lists":false,"displayable":true,"categories":["Programming","Mathematics \u0026 Statistics","Machine Learning","Deep Learning","Book References","NLP \u0026 NLU","Evaluation Metrics"],"sub_categories":[],"readme":"![image](https://github.com/data-folks/data-science-learning-path/blob/main/assets/banner.jpg)\n\n![image](https://visitor-badge.laobi.icu/badge?page_id=data-folks/data-science-learning-path) [![Contributors][contributors-shield]][contributors-url] [![Forks][forks-shield]][forks-url] [![Stargazers][stars-shield]][stars-url] [![Issues][issues-shield]][issues-url] [![MIT License][license-shield]][license-url] [![LinkedIn][linkedin-shield]][linkedin-url] [![Discord][discord-shield]][discord-url] [![Medium][medium-shield]][medium-url]\n\n## 
Brief Introduction\n\nA complete guide to learn data science for beginners.\n\nThis learning path is intended for everyone who wants to learn data science and build a career in the data field, especially as a data analyst or data scientist. Each section of this guide links to resources that will help you learn, or at least get started with, each topic.\n\n## Table of Contents\n\n\u003cdetails open=\"open\"\u003e\n  \u003csummary\u003eTable of Contents\u003c/summary\u003e\n  \u003col\u003e\n    \u003cli\u003e\u003ca href=\"#programming\"\u003eProgramming\u003c/a\u003e\u003c/li\u003e\n    \u003cli\u003e\u003ca href=\"#mathematics--statistics\"\u003eMathematics \u0026 Statistics\u003c/a\u003e\u003c/li\u003e\n    \u003cli\u003e\n      \u003ca href=\"#machine-learning\"\u003eMachine Learning\u003c/a\u003e\n      \u003cul\u003e\n        \u003cli\u003e\u003ca href=\"#supervised-learning\"\u003eSupervised Learning\u003c/a\u003e\u003c/li\u003e\n        \u003cli\u003e\u003ca href=\"#unsupervised-learning\"\u003eUnsupervised Learning\u003c/a\u003e\u003c/li\u003e\n      \u003c/ul\u003e\n    \u003c/li\u003e\n    \u003cli\u003e\n      \u003ca href=\"#evaluation-metrics\"\u003eEvaluation Metrics\u003c/a\u003e\n      \u003cul\u003e\n        \u003cli\u003e\u003ca href=\"#supervised-learning-1\"\u003eSupervised Learning\u003c/a\u003e\u003c/li\u003e\n        \u003cli\u003e\u003ca href=\"#unsupervised-learning-1\"\u003eUnsupervised Learning\u003c/a\u003e\u003c/li\u003e\n      \u003c/ul\u003e\n    \u003c/li\u003e\n    \u003cli\u003e\u003ca href=\"#deep-learning\"\u003eDeep Learning\u003c/a\u003e\u003c/li\u003e\n    \u003cli\u003e\u003ca href=\"#ml-applications\"\u003eML Applications\u003c/a\u003e\u003c/li\u003e\n    \u003cli\u003e\u003ca href=\"#computer-vision\"\u003eComputer Vision\u003c/a\u003e\u003c/li\u003e\n    \u003cli\u003e\u003ca href=\"#nlp--nlu\"\u003eNLP \u0026 NLU\u003c/a\u003e\u003c/li\u003e\n    \u003cli\u003e\u003ca href=\"#speech-recognition\"\u003eSpeech 
Recognition\u003c/a\u003e\u003c/li\u003e\n    \u003cli\u003e\u003ca href=\"#model-deployment\"\u003eModel Deployment\u003c/a\u003e\u003c/li\u003e\n    \u003cli\u003e\u003ca href=\"#book-references\"\u003eBook References\u003c/a\u003e\u003c/li\u003e\n  \u003c/ol\u003e\n\u003c/details\u003e\n\n## Programming\n\n1. [Basic Python](https://www.learnpython.org/)\n2. [Object-oriented Programming](https://realpython.com/python3-object-oriented-programming/)\n3. [Intro to DBMS](https://www.omnisci.com/technical-glossary/dbms)\n4. [SQL Data Manipulation](https://mode.com/sql-tutorial/introduction-to-sql)\n5. [Git](https://git-scm.com/doc)\n6. Code Versioning Platform: [Github](https://github.com/) | [Bitbucket](https://bitbucket.org/) | [Gitlab](https://about.gitlab.com/)\n7. [Shell Script](https://dagshub.com/blog/effective-linux-bash-data-scientists/)\n8. Competitive Programming: [Hackerrank](https://www.hackerrank.com/) | [Leetcode](https://leetcode.com/) | [Kattis](https://open.kattis.com/)\n\n\u003ca href=\"#table-of-contents\"\u003e🠥🠥 Back to Table of Contents 🠥🠥\u003c/a\u003e\n\n## Mathematics \u0026 Statistics\n\n1.  [Linear Algebra](https://www.coursera.org/learn/linear-algebra-machine-learning)\n2.  [Calculus](https://www.coursera.org/learn/multivariate-calculus-machine-learning?specialization=mathematics-machine-learning)\n3.  [Descriptive Statistics](https://conjointly.com/kb/descriptive-statistics/)\n4.  [Data Distributions](https://www.analyticssteps.com/blogs/10-types-statistical-data-distribution-models)\n5.  [Statistical Testing](https://homeweb.csulb.edu/~msaintg/ppa696/696stsig.htm)\n6.  [Exploratory Data Analysis](https://medium.com/data-folks-indonesia/10-things-to-do-when-conducting-your-exploratory-data-analysis-eda-7e3b2dfbf812)\n7.  [Regression](https://www.listendata.com/2018/03/regression-analysis.html)\n8.  [TOOLBOX: Pandas](https://pandas.pydata.org/)\n9.  [TOOLBOX: Numpy](https://numpy.org/)\n10. 
[TOOLBOX: Matplotlib](https://matplotlib.org/)\n11. [TOOLBOX: Seaborn](https://seaborn.pydata.org/)\n\n\u003ca href=\"#table-of-contents\"\u003e🠥🠥 Back to Table of Contents 🠥🠥\u003c/a\u003e\n\n## Machine Learning\n\n- ### Supervised Learning\n\n1.  [K-NN (K-Nearest Neighbors)](https://towardsdatascience.com/machine-learning-basics-with-the-k-nearest-neighbors-algorithm-6a6e71d01761)\n2.  [Naive Bayes](https://jakevdp.github.io/PythonDataScienceHandbook/05.05-naive-bayes.html)\n3.  [Support Vector Machine](https://datascience.foundation/datatalk/basic-overview-of-svm-algorithm)\n4.  [Random Forest](https://www.section.io/engineering-education/introduction-to-random-forest-in-machine-learning/)\n5.  [AdaBoost](https://www.mygreatlearning.com/blog/adaboost-algorithm/)\n6.  [Gradient Boosting](https://blog.mlreview.com/gradient-boosting-from-scratch-1e317ae4587d)\n7.  [XGBoost](https://machinelearningmastery.com/gentle-introduction-xgboost-applied-machine-learning/)\n8.  [CatBoost](https://dataaspirant.com/catboost-algorithm/)\n9.  [Bagging Classifier](https://vitalflux.com/bagging-classifier-python-code-example/)\n10. [Voting Classifier](https://towardsdatascience.com/how-voting-classifiers-work-f1c8e41d30ff)\n11. [Stacking Classifier](https://bush-dev.com/introduction-to-stacking-classifier/)\n12. [TOOLBOX: Scikit Learn](https://scikit-learn.org/stable/)\n13. [TOOLBOX: statsmodels](https://www.statsmodels.org/stable/index.html)\n14. [CASE STUDY: House Pricing](https://www.kaggle.com/c/house-prices-advanced-regression-techniques)\n15. [CASE STUDY: Titanic](https://www.kaggle.com/c/titanic)\n16. [CASE STUDY: Credit Scoring](https://www.kaggle.com/sakshigoyal7/credit-card-customers)\n\n\u003ca href=\"#table-of-contents\"\u003e🠥🠥 Back to Table of Contents 🠥🠥\u003c/a\u003e\n\n- ### Unsupervised Learning\n\n1. [K-Means Clustering](https://www.kdnuggets.com/2019/05/guide-k-means-clustering-algorithm.html)\n2. 
[DBSCAN](https://www.analyticsvidhya.com/blog/2020/09/how-dbscan-clustering-works/)\n3. [Hierarchical Clustering](https://www.kdnuggets.com/2019/09/hierarchical-clustering.html)\n\n\u003ca href=\"#table-of-contents\"\u003e🠥🠥 Back to Table of Contents 🠥🠥\u003c/a\u003e\n\n## Evaluation Metrics\n\n- ### Supervised Learning\n\n1. [Confusion Matrix](https://www.analyticsvidhya.com/blog/2020/04/confusion-matrix-machine-learning/)\n2. [Accuracy](https://scikit-learn.org/stable/modules/generated/sklearn.metrics.accuracy_score.html)\n3. [Precision](https://scikit-learn.org/stable/modules/generated/sklearn.metrics.precision_score.html#sklearn.metrics.precision_score)\n4. [Recall](https://scikit-learn.org/stable/modules/generated/sklearn.metrics.recall_score.html#sklearn.metrics.recall_score)\n5. [F Score](https://scikit-learn.org/stable/modules/generated/sklearn.metrics.f1_score.html#sklearn.metrics.f1_score)\n6. [Hamming Loss](https://scikit-learn.org/stable/modules/generated/sklearn.metrics.hamming_loss.html#sklearn.metrics.hamming_loss)\n7. [ROC (Receiver Operating Characteristic)](https://scikit-learn.org/stable/modules/generated/sklearn.metrics.roc_curve.html)\n8. [ROC AUC (Area Under Curve)](https://scikit-learn.org/stable/modules/generated/sklearn.metrics.roc_auc_score.html#sklearn.metrics.roc_auc_score)\n9. [Top K Accuracy](https://scikit-learn.org/stable/modules/generated/sklearn.metrics.top_k_accuracy_score.html#sklearn.metrics.top_k_accuracy_score)\n10. [MAE](https://www.statisticshowto.com/absolute-error/)\n11. [MSE](https://www.freecodecamp.org/news/machine-learning-mean-squared-error-regression-line-c7dde9a26b93/)\n12. MRR\n13. DCG\n14. NDCG\n15. PSNR\n16. SSIM\n17. IoU\n18. Perplexity\n19. BLEU score\n\n\u003ca href=\"#table-of-contents\"\u003e🠥🠥 Back to Table of Contents 🠥🠥\u003c/a\u003e\n\n- ### Unsupervised Learning\n\n1. [Elbow Method](\u003chttps://en.wikipedia.org/wiki/Elbow_method_(clustering)\u003e)\n2. 
[Silhouette Coefficient](https://towardsdatascience.com/silhouette-coefficient-validating-clustering-techniques-e976bb81d10c)\n\n\u003ca href=\"#table-of-contents\"\u003e🠥🠥 Back to Table of Contents 🠥🠥\u003c/a\u003e\n\n## Deep Learning\n\n1. [Activation Functions](https://towardsdatascience.com/activation-functions-neural-networks-1cbd9f8d91d6)\n2. [Linear Layer](https://medium.com/datathings/linear-layers-explained-in-a-simple-way-2319a9c2d1aa)\n3. [CNN (Convolutional Neural Networks)](https://cs231n.github.io/)\n4. [RNN (Recurrent Neural Networks)](https://builtin.com/data-science/recurrent-neural-networks-and-lstm)\n5. [Optimization](https://d2l.ai/chapter_optimization/)\n6. [Loss Functions / Objective Functions](https://towardsdatascience.com/common-loss-functions-in-machine-learning-46af0ffc4d23)\n7. [Dropout](https://leonardoaraujosantos.gitbook.io/artificial-inteligence/machine_learning/deep_learning/dropout_layer)\n8. [Batchnorm](https://www.baeldung.com/cs/batch-normalization-cnn)\n9. [Learning Rate Scheduler](https://towardsdatascience.com/learning-rate-schedules-and-adaptive-learning-rate-methods-for-deep-learning-2c8f433990d1)\n10. [TOOLBOX: PyTorch](https://pytorch.org/)\n11. [TOOLBOX: Tensorflow](https://www.tensorflow.org/)\n12. [TOOLBOX: Keras](https://keras.io)\n\n\u003ca href=\"#table-of-contents\"\u003e🠥🠥 Back to Table of Contents 🠥🠥\u003c/a\u003e\n\n## ML Applications\n\n1. Timeseries\n2. Recommendation System\n3. Network Analysis\n\n\u003ca href=\"#table-of-contents\"\u003e🠥🠥 Back to Table of Contents 🠥🠥\u003c/a\u003e\n\n## Computer Vision\n\n1. Image Classification\n2. Object Detection\n3. Object Segmentation\n4. Instance Segmentation\n\n\u003ca href=\"#table-of-contents\"\u003e🠥🠥 Back to Table of Contents 🠥🠥\u003c/a\u003e\n\n## NLP \u0026 NLU\n\n1.  Tokenization\n2.  Sequence\n3.  Padding\n4.  Stemming\n5.  Lemmatization\n6.  Feature Extraction\n7.  Feature Selection\n8.  Term Weighting\n9.  Embedding\n10. Part of Speech Tagging\n11. 
Named Entity Recognition\n12. Popular NLP \u0026 NLU Architecture\n13. [CASE STUDY: News Classification](http://qwone.com/~jason/20Newsgroups/)\n14. [CASE STUDY: Sentiment Analysis](https://medium.com/data-folks-indonesia/indonesian-app-review-sentiment-analysis-using-neural-network-and-pytorch-54c0ef766c09)\n15. [CASE STUDY: Machine Translation](http://www.manythings.org/anki/)\n\n\u003ca href=\"#table-of-contents\"\u003e🠥🠥 Back to Table of Contents 🠥🠥\u003c/a\u003e\n\n## Speech Recognition\n\n\u003ca href=\"#table-of-contents\"\u003e🠥🠥 Back to Table of Contents 🠥🠥\u003c/a\u003e\n\n## Model Deployment\n\n\u003ca href=\"#table-of-contents\"\u003e🠥🠥 Back to Table of Contents 🠥🠥\u003c/a\u003e\n\n## Book References\n\n1. [Practical Deep Learning for Coders](https://course.fast.ai/)\n2. [Dive Into Deep Learning](http://d2l.ai/index.html)\n3. [Interpretable Machine Learning](https://christophm.github.io/interpretable-ml-book/)\n4. [An Introduction to Statistical Learning with Applications in R](https://web.stanford.edu/~hastie/ISLRv2_website.pdf)\n5. 
[Natural Language Processing with Python](https://www.nltk.org/book/)\n\n\u003ca href=\"#table-of-contents\"\u003e🠥🠥 Back to Table of Contents 🠥🠥\u003c/a\u003e\n\n\u003c!-- MARKDOWN LINKS \u0026 IMAGES --\u003e\n\u003c!-- https://www.markdownguide.org/basic-syntax/#reference-style-links --\u003e\n\n[contributors-shield]: https://img.shields.io/github/contributors/data-folks/data-science-learning-path.svg?flat\n[contributors-url]: https://github.com/data-folks/data-science-learning-path/graphs/contributors\n[forks-shield]: https://img.shields.io/github/forks/data-folks/data-science-learning-path.svg?flat\n[forks-url]: https://github.com/data-folks/data-science-learning-path/network/members\n[stars-shield]: https://img.shields.io/github/stars/data-folks/data-science-learning-path.svg?flat\n[stars-url]: https://github.com/data-folks/data-science-learning-path/stargazers\n[issues-shield]: https://img.shields.io/github/issues/data-folks/data-science-learning-path.svg?flat\n[issues-url]: https://github.com/data-folks/data-science-learning-path/issues\n[license-shield]: https://img.shields.io/github/license/data-folks/data-science-learning-path.svg?flat\n[license-url]: https://github.com/data-folks/data-science-learning-path/blob/master/LICENSE.txt\n[linkedin-shield]: https://img.shields.io/badge/LinkedIn-0077B5?style=flat\u0026logo=linkedin\u0026logoColor=white\n[linkedin-url]: https://www.linkedin.com/company/jakartaresearch/\n[discord-shield]: https://img.shields.io/badge/Discord-7289DA?style=flat\u0026logo=discord\u0026logoColor=white\n[discord-url]: https://bit.ly/DiscordJakartaResearch\n[medium-shield]: https://img.shields.io/badge/Medium-12100E?style=flat\u0026logo=medium\u0026logoColor=white\n[medium-url]: http://medium.com/data-folks-indonesia\n","projects_url":"https://awesome.ecosyste.ms/api/v1/lists/data-folks%2Fdata-science-learning-path/projects"}