{"id":17968679,"url":"https://github.com/leriomaggio/python-data-science","last_synced_at":"2025-10-26T06:32:59.869Z","repository":{"id":31978688,"uuid":"130033947","full_name":"leriomaggio/python-data-science","owner":"leriomaggio","description":"Lecture notes and materials for Python Data Science course ","archived":false,"fork":false,"pushed_at":"2021-11-30T12:18:15.000Z","size":96051,"stargazers_count":43,"open_issues_count":1,"forks_count":33,"subscribers_count":5,"default_branch":"main","last_synced_at":"2025-07-26T10:59:05.284Z","etag":null,"topics":["data-science","jupyter-notebooks","machine-learning","materials","python-tutorials"],"latest_commit_sha":null,"homepage":"","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/leriomaggio.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2018-04-18T08:57:54.000Z","updated_at":"2025-02-26T11:43:53.000Z","dependencies_parsed_at":"2022-08-07T17:01:13.056Z","dependency_job_id":null,"html_url":"https://github.com/leriomaggio/python-data-science","commit_stats":null,"previous_names":[],"tags_count":1,"template":false,"template_full_name":null,"purl":"pkg:github/leriomaggio/python-data-science","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/leriomaggio%2Fpython-data-science","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/leriomaggio%2Fpython-data-science/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/leriomaggio%2Fpython-data-science/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/leriomaggio%2Fpython-data-science/manifests","ow
ner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/leriomaggio","download_url":"https://codeload.github.com/leriomaggio/python-data-science/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/leriomaggio%2Fpython-data-science/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":268629740,"owners_count":24281172,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-08-03T02:00:12.545Z","response_time":2577,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["data-science","jupyter-notebooks","machine-learning","materials","python-tutorials"],"created_at":"2024-10-29T14:41:14.752Z","updated_at":"2025-10-26T06:32:59.777Z","avatar_url":"https://github.com/leriomaggio.png","language":"Jupyter Notebook","readme":"# Programming for Data Science @ FBK Academy\n\nThis is a programming tutorial aimed at researchers and practitioners with (potentially) no prior programming experience, as well as with previous programming skills. 
\n\nWe will walk through the principal programming concepts, such as _conditionals_, _functions_ and _iterations_, as well as more specialised topics like _classes_, _objects_ and what's sometimes called _defensive programming_.\n\n_If all these terms sound like [gibberish](https://en.wikipedia.org/wiki/Gibberish) to you, don't worry!_ \n\n I'll try to show everything with simple code examples: no long and complicated explanations with fancy words. At the end of this tutorial, I am sure you will master all these concepts like a _pro_ 🙌\n\n### Why Programming for _Data Science_ ?\n\nIn this tutorial we will be using **Python 3**. Python is nowadays considered **\"the\"** language of choice for Data Science. \nThere are indeed many reasons for that, and many articles have been written on the subject. \nThis [article](https://analyticsindiamag.com/heres-why-python-continues-to-be-the-language-of-choice-for-data-scientists/) is a good and clear example.\n\n#### A Few notes before we start\n\n* `Q:` _Yes, ok.. but.. is this a tutorial on Data Science?_\n* `A:` **No**. This is a **tutorial** on programming with Python. The _perspective_, though, is that of a _wanna-be_ data scientist.\n\n* `Q:` _Cool... but.. is this a tutorial on the Python Language ?_\n* `A:` **Ehm, No again. Sorry**. \nWe will focus on programming concepts _using_ Python as a language. Most of the concepts you will learn are shared by most other languages (_just the syntax will be different, ed._). That said, there is a section in the lecture materials named `Python Extras` that focuses **specifically** on features of the Python language. 
You could read it, if interested :)\n\n#### Here is what I have in mind for this course (HTH)\n\n![lecture sketch](./images/lectures_sketch.png)\n\n_I do hope that this (very simple) mind-map look-alike clarifies the perspective I chose when I thought about this course._\n\n`tl;dr` We will dive into programming focusing on two main aspects: the _Algorithmic_ perspective, that is \"what are the steps we need to implement to solve a specific problem\", and the _Data Structure_ perspective, that is \"what is the data structure that would simplify our algorithm implementation as much as possible\". Over the past decades, these two perspectives have led to two completely different approaches to programming: **Procedural** vs **Object-Oriented**, respectively.\n\nPython allows for _a lot_ of flexibility, and this flexibility will be our [Swiss Army knife](https://www.ctotech.io/blog/python/why-python3-insights-in-the-swiss-army-knife-of-coding/). In fact, Python supports _multiple programming paradigms_ at once (i.e. _imperative_, _OOP_, _functional_ [1]), and we will be (seemingly) shifting our focus between those as we go along with the lecture materials.\n\n---\n\n`1`: functional programming only for the intrepid programmers among you :) See this [video](https://www.youtube.com/watch?v=ThS4juptJjQ)\n\n## Outline of the Course (at a glance)\n\nThe course is organised into **six parts**, with the following learning path in mind: \n\n1. Python Programming (part 1): Introduction to Python Main Data Structures and Functions\n\n2. Python Programming (part 2): Advanced Data Structures and Object-Oriented Programming\n\n3. Scientific Python Programming and Data Processing: Numerical Processing with `NumPy` \u0026 Data Processing with `Pandas`\n\n4. Advanced Data Objects and Data Plotting: Introduction to `dataclasses` and `matplotlib` / `bokeh` for interactive plotting\n\n5. Introduction to Scikit-Learn (`sklearn`) and Machine Learning Modules\n\n6. 
Project team work on real-case Data Science scenarios\n\n\n## Lecture Materials\n\n_Note: The following section is currently incomplete, and will be updated throughout the rest of the course._\n\n### Introductory Readings (`intro` folder)\n\nThis part introduces the concept of computer programming and the \nvery basics of the Python programming language:\n\n1. [The Way of the Program](intro/1-the-way-of-the-program.html)\n2. [Variables, Statements and Expressions](intro/2-variables-statements-expressions.html)\n3. [Introduction to Functions](intro/3-intro-functions.html)\n4. [Setting up an editor](intro/4-setup-editor.html)\n5. [Conditional Statements](basics/5-conditionals.html)\n\nWhether or not you have programmed before, in Python or otherwise, I suggest taking a look at this introductory section anyway. You can always **skip** ahead, based on your learning pace.\n\n**Alternatively**, a good starting point would be this online course: [Intro to Python by Microsoft](https://docs.microsoft.com/en-us/learn/modules/intro-to-python/)\n\n### Programming with Python (`programming_with_python` folder)\nThis section contains the materials for the main topics that will be covered in our first two lectures. These are (in no specific order):\n\n1. [Pythonic Functions](programming_with_python/functions.ipynb)\n2. [Collections and Sequences](programming_with_python/collections.ipynb)\n3. [Dictionaries](programming_with_python/dictionaries.ipynb)\n4. [Iterators, Generators, Comprehensions](programming_with_python/iterators.ipynb)\n5. [Classes and OOP](programming_with_python/classes.ipynb)\n6. [Errors and Exceptions](programming_with_python/exceptions.ipynb)\n\n#### Python Extras (`python_extras` folder)\nThis section contains some extra notebooks you could go through to read more about some specific aspects of the Python programming language. \n\n**Note:** This is the only part of the course specifically focused on _how Python_ does things\n\n1. 
[Modules](python_extras/modules.ipynb)\n2. [Python Data Model](python_extras/data-model.ipynb)\n3. [Functions as Objects](python_extras/functions-objects.ipynb)\n4. [Magic Methods](python_extras/magic.ipynb)\n5. [Pythonic Coding Style](python_extras/pep8.ipynb)\n\n## Instructions\n\n### 1. Get the material\n\n**Option A**: `Clone` (or `fork`) the Repository using `git` (**Recommended**) \n\n⚠️ Note: `Git` must be installed in order to proceed. If you don't have `git` on your system, you need to **install git** first. \n[Instructions to Install Git](https://git-scm.com/book/en/v2/Getting-Started-Installing-Git)\n\n\n\n💡 Please also consider looking at [**Git CheatSheet**](https://education.github.com/git-cheat-sheet-education.pdf)\n\n\n\nTo acquire the lecture material, it is highly recommended to use `git` to **clone** the current repository. Since the repository will be updated after each lesson, using `git` will make it easier to keep the material in sync.\n\nTo clone the repository, type the following command in the terminal prompt: \n```bash\ngit clone https://github.com/leriomaggio/python-data-science.git\n```\n\n⚠️ Note for **Windows users**: Once `git` is installed, please make sure to run the _Git Terminal_ (or _Git Prompt_)\n\nOnce completed, this will create a new folder named `python-data-science` (_presumably in your Home folder_).\n\nWell done! Now please bear with me for a few more minutes and follow the instructions below 🙏\n\n Please now proceed to **2. Setting up your Environment**\n\n\n\n\n\n\n**Option B**: Downloading the material in a ZIP archive from GitHub (**Not Recommended**)\n\nIt is indeed possible to download the whole material from GitHub as a ZIP archive. 
\nLink [here](https://github.com/leriomaggio/python-data-science#:~:text=with%20GitHub%20Desktop-,Download%20ZIP,-Latest%20commit)\n\nHowever, this method is **not recommended**, as you would need to download the archive again every time there is an update (which means at the end of each lesson)!\n\n### 2. Setting up your Environment\n\nWe will be using [**Jupyter lab**](https://jupyter.org) as our _interactive programming environment_ for this course. \n\nThis has the great advantage of lowering the barriers to setting up the environment and installing specialised tools. If you're not familiar with _jupyter notebooks_, no worries: getting familiar with the environment is the first thing we will do!\n\nMeanwhile, it is necessary to set up the Python **Virtual Environment** to run the code contained in this repository _smoothly_ and with no _headaches_.\n\nIf you don't know what a Python [virtual environment](https://docs.python.org/3/tutorial/venv.html) is, think of it as a sandboxed Python installation on your machine that is fully controllable and fully independent from any other Python environment you may have.\n\nTo execute the notebooks in this repository, a few packages are required, but installing them in your Conda environment is super easy. \n\n**Step 1:** Download [Anaconda Python Distribution](https://www.anaconda.com/products/individual).\n\nNote for **Windows Users**:  More information is available in the [official documentation](https://docs.anaconda.com/anaconda/user-guide/getting-started/#open-nav-win)\n\n**Step 2:** Set up the virtual environment:\n\nOpen a Terminal (or **Anaconda Prompt** on Windows) and **move** to the `python-data-science` folder, i.e. the main folder of this repository. 
\n\n```bash\ncd python-data-science\n```\n\nNow create the conda environment by typing the following command:\n\n```bash\nconda env create -f pyds.yml\n```\nThis will create a **new** Conda environment named `pyds`.\n\n**Step 2.1**: If you'd like to double check that the creation of the environment completed successfully, you can type:\n\n```bash\nconda info --envs\n```\nThis will list all the virtual environments conda can find within your installation. `pyds` should appear in the list as well.\n\n**Step 3:** Activate the environment:\n\nOnce the environment is set, we need to **activate** it in order to use it.\n\n```bash\nconda activate pyds\n```\n\n🎉 You should now be ready to go!\n\nThe last bit is to run your `jupyter lab` server, and open the notebooks:\n\n```bash\njupyter lab\n```\n\n#### (Alternative) Setup Environment via `pip`\n\nThe repository also includes a `requirements.txt` file that can be used to install all the required packages using `pip`:\n\n```bash\npip install -r requirements.txt\n```\n\nHowever, this is recommended only if (A) it is not possible to install Anaconda on your machine, or (B) the setup of the Anaconda environment is unsuccessful. \n\n⚠️ **In either case**, it is important that the Python version used is `Python \u003e=3.9`\n\n## Colophon\n\n**Author**: Valerio Maggio ([`@leriomaggio`](https://twitter.com/leriomaggio)), Senior Research Associate, University of Bristol. \n\nAll the **Code** material is distributed under the terms of the GNU GPLv3 License. See [LICENSE](./LICENSE) file for additional details.\n\nAll the instructional materials in this repository are free to use, and made available under the [Creative Commons Attribution\nlicense](https://creativecommons.org/licenses/by/4.0/). 
The following is a human-readable summary of (and not a substitute for) the [full legal text of the CC BY 4.0\nlicense](https://creativecommons.org/licenses/by/4.0/legalcode).\n\nYou are free:\n\n* to **Share**---copy and redistribute the material in any medium or format\n* to **Adapt**---remix, transform, and build upon the material\n\nfor any purpose, even commercially.\n\nThe licensor cannot revoke these freedoms as long as you follow the\nlicense terms.\n\nUnder the following terms:\n\n* **Attribution**---You must give appropriate credit (mentioning that\n  your work is derived from work that is Copyright © Software\n  Carpentry and, where practical, linking to\n  http://software-carpentry.org/), provide a [link to the\n  license][cc-by-human], and indicate if changes were made. You may do\n  so in any reasonable manner, but not in any way that suggests the\n  licensor endorses you or your use.\n\n**No additional restrictions**---You may not apply legal terms or\ntechnological measures that legally restrict others from doing\nanything the license permits. \n\n### Contacts \n\nFor any questions or doubts, feel free to open an [issue](https://github.com/leriomaggio/python-data-science/issues) in the repository, or drop me an email @ `valerio.maggio_at_bristol.ac.uk`","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fleriomaggio%2Fpython-data-science","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fleriomaggio%2Fpython-data-science","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fleriomaggio%2Fpython-data-science/lists"}