{"id":18340008,"url":"https://github.com/mxagar/data_science_udacity","last_synced_at":"2025-04-09T20:40:48.482Z","repository":{"id":41173620,"uuid":"488980720","full_name":"mxagar/data_science_udacity","owner":"mxagar","description":"My personal notes, code and projects of the Udacity Data Science Nanodegree.","archived":false,"fork":false,"pushed_at":"2024-03-21T13:21:38.000Z","size":34156,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-02-15T12:50:56.743Z","etag":null,"topics":["dashboard","data-analysis","data-engineering","data-science","machine-learning-pipelines"],"latest_commit_sha":null,"homepage":"","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/mxagar.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2022-05-05T13:20:33.000Z","updated_at":"2022-11-16T19:43:47.000Z","dependencies_parsed_at":"2024-12-23T16:35:16.756Z","dependency_job_id":null,"html_url":"https://github.com/mxagar/data_science_udacity","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mxagar%2Fdata_science_udacity","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mxagar%2Fdata_science_udacity/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mxagar%2Fdata_science_udacity/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mxagar%2Fdata_science_udac
ity/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/mxagar","download_url":"https://codeload.github.com/mxagar/data_science_udacity/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248109633,"owners_count":21049350,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["dashboard","data-analysis","data-engineering","data-science","machine-learning-pipelines"],"created_at":"2024-11-05T20:20:32.634Z","updated_at":"2025-04-09T20:40:48.460Z","avatar_url":"https://github.com/mxagar.png","language":"Jupyter Notebook","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Udacity Data Science Nanodegree: Personal Notes\n\nThese are my personal notes taken while following the [Udacity Data Science Nanodegree](https://www.udacity.com/course/data-scientist-nanodegree--nd025).\n\nThe Nanodegree assumes basic data analysis skills with Python libraries (pandas, numpy, matplotlib, sklearn, etc.) and has 5 modules that build on those skills; each module has its corresponding folder in this repository with its guide Markdown file:\n\n1. Introduction to Data Science: [`01_Intro_Data_Science`](./01_Intro_Data_Science/DSND_Introduction.md).\n2. Software Engineering: [`02_SoftwareEngineering`](./02_SoftwareEngineering/DSND_SWEngineering.md).\n3. Data Engineering: [`03_DataEngineering`](./03_DataEngineering/DSND_DataEngineering.md).\n4. Experimental Design \u0026 Recommendations: [`04_ExperimentalDesign_RecSys`](./04_ExperimentalDesign_RecSys/DSND_ExperimentalDesign_RecSys.md).\n5. 
Data Scientist Capstone (Spark): [`05_Capstone_Project`](./05_Capstone_Project/DSND_Capstone.md).\n\nAdditionally, it is necessary to submit and pass some projects to get the certification:\n\n- Create a data science project and write a blog post: [airbnb_data_analysis](https://github.com/mxagar/airbnb_data_analysis).\n- Disaster response prediction pipeline deployed on a Flask app: [disaster_response_pipeline](https://github.com/mxagar/disaster_response_pipeline).\n- Recommender system that suggests new articles to the users of the IBM Watson Studio Platform: [recommendations_ibm](https://github.com/mxagar/recommendations_ibm).\n- Capstone project: Prediction of customer churn of a music streaming service using Spark: [sparkify_customer_churn](https://github.com/mxagar/sparkify_customer_churn).\n\nA regular Python environment with the usual data science packages should suffice (e.g., scikit-learn, pandas, matplotlib); any special or additional packages and their installation commands are introduced in the guides. 
A recipe to set up a [conda](https://docs.conda.io/en/latest/) environment with my current packages is the following:\n\n```bash\nconda create --name ds pip python=3.10\nconda activate ds\npip install -r requirements.txt\n```\n\nAs a side note, I list here some related **free Udacity courses** on several topics:\n\n- **Big Data**\n  - [Intro to Hadoop and MapReduce](https://www.udacity.com/course/intro-to-hadoop-and-mapreduce--ud617)\n  - [Deploying a Hadoop Cluster](https://www.udacity.com/course/deploying-a-hadoop-cluster--ud1000)\n  - [Real-Time Analytics with Apache Storm](https://www.udacity.com/course/real-time-analytics-with-apache-storm--ud381)\n  - [Big Data Analytics in Healthcare](https://www.udacity.com/course/big-data-analytics-in-healthcare--ud758)\n  - [Spark](https://www.udacity.com/course/learn-spark-at-udacity--ud2002)\n- **Databases and APIs**\n  - [Data Wrangling with MongoDB](https://www.udacity.com/course/data-wrangling-with-mongodb--ud032)\n  - [SQL for Data Analysis](https://www.udacity.com/course/sql-for-data-analysis--ud198)\n  - [Designing RESTful APIs](https://www.udacity.com/course/designing-restful-apis--ud388)\n- **Interview Preparation**\n  - [Data Science Interview Prep](https://www.udacity.com/course/data-science-interview-prep--ud944)\n  - [Machine Learning Interview Preparation](https://www.udacity.com/course/machine-learning-interview-prep--ud1001)\n\n\nMikel Sagardia, 2022.  \nNo guarantees.","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmxagar%2Fdata_science_udacity","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fmxagar%2Fdata_science_udacity","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmxagar%2Fdata_science_udacity/lists"}