{"id":29950743,"url":"https://github.com/mlund/jupyter-course","last_synced_at":"2025-08-03T11:19:11.647Z","repository":{"id":53719649,"uuid":"76114564","full_name":"mlund/jupyter-course","owner":"mlund","description":"Jupyter Course","archived":false,"fork":false,"pushed_at":"2022-12-06T10:16:14.000Z","size":54015,"stargazers_count":18,"open_issues_count":1,"forks_count":18,"subscribers_count":7,"default_branch":"master","last_synced_at":"2025-01-18T23:52:28.124Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/mlund.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2016-12-10T13:35:41.000Z","updated_at":"2024-02-08T22:23:55.000Z","dependencies_parsed_at":"2023-01-24T01:30:08.137Z","dependency_job_id":null,"html_url":"https://github.com/mlund/jupyter-course","commit_stats":null,"previous_names":[],"tags_count":5,"template":false,"template_full_name":null,"purl":"pkg:github/mlund/jupyter-course","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mlund%2Fjupyter-course","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mlund%2Fjupyter-course/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mlund%2Fjupyter-course/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mlund%2Fjupyter-course/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/mlund","download_url":"https://codeload.github.com/mlund/jupyter-course/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/reposit
ories/mlund%2Fjupyter-course/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":268531925,"owners_count":24265257,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-08-03T02:00:12.545Z","response_time":2577,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2025-08-03T11:19:05.753Z","updated_at":"2025-08-03T11:19:11.638Z","avatar_url":"https://github.com/mlund.png","language":"Jupyter Notebook","funding_links":[],"categories":[],"sub_categories":[],"readme":"[![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/mlund/jupyter-course/master)\n\n# Reproducible and Interactive Data Science\n\n- [Syllabus](#Syllabus)\n- [Credits](#Credits)\n- [Logistics](#Logistics)\n- [Program](#Program)\n- [Prerequisites](#Prerequisites)\n- [Preparation Before the First Session](#Preparation)\n- [Project Work](#Project)\n- [Notebook Requirements](#Requirements)\n- [Getting a DOI via Zenodo](#Zenodo)\n- [Create and Export Conda Environments](#Environments)\n- [Troubleshooting](#Troubleshooting)\n- [External Resources](#External)\n\n\u003ca name=\"Syllabus\"\u003e\u003c/a\u003e\n## Syllabus \n\nThe aim of this course is to introduce students to the [Jupyter Notebook](http://jupyter.org),\nan open-source application that allows you to create and share documents that contain live code, equations, visualizations, and explanatory text. 
Uses include: data cleansing and manipulation, numerical simulations, statistical modeling, machine learning, and much more. Through the notebooks, research results and the underlying analyses can be transparently reproduced as well as shared.\nAs an example, see [this Notebook](http://nbviewer.jupyter.org/github/minrk/ligo-binder/blob/master/index.ipynb) on gravitational waves published in _Physical Review Letters_.\n\nDuring three days with alternating video lectures ([Intro \u0026 Widgets](https://lu.instructuremedia.com/embed/f7dbab71-5fad-4308-a4d2-1dd0e7d7a3b7), \n[Libraries](https://www.youtube.com/playlist?list=PLto3nNV9nKZlXSWOAqmmn4J7csD4I6a2d), \nATLAS Dijet (see below)) and hands-on exercises, the participants will learn to construct well-documented electronic notebooks that perform advanced data analyses and produce publication-ready plots.\nWhile the course is based on Python, prior Python knowledge is not a prerequisite since the Jupyter Notebook supports [many programming languages](https://github.com/jupyter/jupyter/wiki/Jupyter-kernels). The name Jupyter itself is a reference to Julia, Python, and R, the main languages of data science.\n\n\u003ca name=\"Credits\"\u003e\u003c/a\u003e\n## Credits \n\n4 ECTS.\n\nThe workload is equivalent to one working week (5 full-time days) for going through the course and seminars (1.5 hp), one working week to complete the individual project and implement corrections (1.5 hp), and 2.5 working days for the peer review of other students' projects (0.5 hp).  \n\n\u003ca name=\"Logistics\"\u003e\u003c/a\u003e\n## Logistics \n\nThe course is held in \"flipped classroom\" mode: after the first introductory and get-to-know-each-other session, the students are supposed to go through the videos themselves and have Q\u0026A sessions with teachers and helpers. 
\nAll students and some of the teachers will be at LINXS, near Ideon, and other teachers will be on Zoom; the Zoom room details will be given to the participants.\nThere is also a [Discord server](https://discord.gg/3w7R7vYA). \n\n_Introductory sessions with teachers_ on **December 6, 2022** from 10:15 to 17:00 (with breaks for fika and lunch).\n\n_Q\u0026A sessions with teachers_ on **December 8 and 9, 2022** from 10:15 to 17:00. \n\n\u003ca name=\"Program\"\u003e\u003c/a\u003e\n## Program \n\nThe course consists of a taught component with alternating video lectures ([Intro \u0026 Widgets](https://lu.instructuremedia.com/embed/f7dbab71-5fad-4308-a4d2-1dd0e7d7a3b7), [Libraries](https://www.youtube.com/playlist?list=PLto3nNV9nKZlXSWOAqmmn4J7csD4I6a2d), \nATLAS Dijet (see below)) and hands-on exercises. All notebooks shown in the video lectures are available on this site in the [lectures](lectures) folder.\n\n- Day 1. Introduction\n    - [Introduction](https://lu.instructuremedia.com/embed/b76e3161-adcc-4c81-93df-3ba17e7c3a1e) and overview of the Jupyter Notebook (10')\n    - Introduction to project work and peer discussion (15')\n    - Installation and package management (Miniconda)\n    - [Binder](https://youtu.be/BqJyaejvVjQ?t=1315) and [conda environments](https://conda.io/docs/user-guide/tasks/manage-environments.html)\n    - Navigating cells, [online resources](#External), and getting help \n    - Documenting using Markdown: rich text, equations, images, tables, videos\n    - IPython magic commands \n    - Cross-language interaction (`bash`)\n    - Python [built-in functions](https://youtu.be/YpBUiEsTiEA)\n    - Storage and manipulation of [numerical arrays](https://youtu.be/2xJsNi3wk-s) (`numpy`) \n    \n- Day 2. 
Python and Data Science, Plotting\n    - (Live demo) Explore a Notebook in action: machine learning for image morphing\n    - Repeated operations and [universal functions](https://youtu.be/469ukhzwEPg) (`numpy`, `Cython`, and `fortranmagic`)\n    - [Data structures](https://youtu.be/26ZioEwRw00) and [data wrangling](https://youtu.be/pHa3uuSZh6Y) (`pandas`)\n    - [Pivot tables](https://youtu.be/ODFpGo7UomA), [grouping and aggregating](https://youtu.be/oh8UijClQoE) (`pandas`)\n    - Creating [publication-ready plots](https://youtu.be/B0iTbVySNtc) (`matplotlib`)\n    - Plotting [images, error bars, histograms, and composite plots](https://youtu.be/Xyobv9kGQxU) (`matplotlib`)\n    - Exporting figures to raster and vector formats (`matplotlib`)\n    - Plotting [categorical data](https://youtu.be/c0Bd8iWmHGw) (`matplotlib`, `pandas`, `seaborn`)\n    \n- Day 3. Visualization and Interactivity\n    - Nonlinear least squares (`scipy`, `R`, and `rpy2`)\n    - [IPython widgets](https://luplay.education.lu.se/media/MinRK-Jupyter-COMPUTE2018-widgets/0_18ipnucr)\n    - [Interactive plots](https://youtu.be/oLU5eIO7b84) (`bokeh`)\n    - Version control, sharing, and archiving (GitHub and [Zenodo](https://youtu.be/IdLSGZAdhlQ?t=266))\n    - (Full videos) Explore a Notebook in action in [_the search for new particles_](https://github.com/urania277/jupyter-dijets/tree/jupyter-course-compute-2018) ([ATLAS Dijet 1](https://lu.instructuremedia.com/embed/1041f33b-af73-4ded-91d2-fc570d99d87f), [ATLAS Dijet 2](https://lu.instructuremedia.com/embed/35624ec2-4073-4f05-9d2b-3a85c0cd8f24), [ATLAS Dijet 3](https://lu.instructuremedia.com/embed/0d0ab8ab-6c84-4a57-b5f8-e80ed76c3a3e), [ATLAS Dijet 4](https://lu.instructuremedia.com/embed/205f2870-3d0f-48f4-94fc-a7da72f3f7c6))\n    \n\u003ca name=\"Prerequisites\"\u003e\u003c/a\u003e\n## Prerequisites\n\n- No prior knowledge of Python is required, but familiarity with programming concepts is helpful.\n- A laptop connected to the internet 
(eduroam, for example), running Linux, macOS, or Windows, and with Miniconda or Anaconda installed (see below).\n- Earphones for watching lectures in your own time or re-watching them during the sessions (we will also provide breakout rooms).\n\nIf you have little experience with Python or shell programming, the following two tutorials may be helpful:\n\n- https://swcarpentry.github.io/shell-novice\n- https://swcarpentry.github.io/python-novice-inflammation\n\n\u003ca name=\"Preparation\"\u003e\u003c/a\u003e\n\n## Preparation Before the First Session\n\n1. Watch the video lectures ([Intro \u0026 Widgets](https://lu.instructuremedia.com/embed/f7dbab71-5fad-4308-a4d2-1dd0e7d7a3b7), \n[Libraries](https://www.youtube.com/playlist?list=PLto3nNV9nKZlXSWOAqmmn4J7csD4I6a2d), \nATLAS dijets (see above, Day 3)).\n2. Install [miniconda3](https://conda.io/miniconda.html) or, alternatively, the full\n   [anaconda3](https://www.anaconda.com/download) environment on your laptop (the latter is **much** larger).\n3. [Download](https://github.com/mlund/jupyter-course/archive/master.zip) the course material\n   (this GitHub repository) and unzip it.\n4. Uncomment the line with \"# - gcc # [osx]\" in the file [`environment.yml`](/environment.yml).\n5. Install and activate the `LUcompute` environment described by the file [`environment.yml`](/environment.yml)\n   by running the following in a terminal:\n\n   ```bash\n   conda env create -f environment.yml\n   conda activate LUcompute\n   ```\n\nInstructions for Windows:\n\n1. Watch the video lectures ([Intro \u0026 Widgets](https://lu.instructuremedia.com/embed/f7dbab71-5fad-4308-a4d2-1dd0e7d7a3b7), [Libraries](https://www.youtube.com/playlist?list=PLto3nNV9nKZlXSWOAqmmn4J7csD4I6a2d), ATLAS dijets (see above, Day 3)).\n2. Install [miniconda3](https://conda.io/miniconda.html).\n3. [Download](https://github.com/mlund/jupyter-course/archive/master.zip) the course material (this GitHub repository)\n   and unzip it.\n4. 
Open the `anaconda prompt` from the start menu.\n5. Navigate to the folder where the course material has been unzipped (_e.g._ using `cd` to change directory\n   and `dir` to list files in a folder).\n6. Install and activate the `LUcompute` environment described by the file [`environment.yml`](/environment.yml)\n   by running the following in the `anaconda prompt`:\n\n   ```bash\n   conda env create -f environment.yml\n   activate LUcompute\n   ```\n\n[Documentation on conda environments](https://conda.io/docs/user-guide/tasks/manage-environments.html#creating-an-environment-from-an-environment-yml-file) \n\n\u003ca name=\"Project\"\u003e\u003c/a\u003e\n## Project Work\n\nThe project work consists of five steps:\n\n1. Each student will make a Notebook project covering topics from days 1–3 with either:\n  - research, presenting the data analysis and theory behind\n    a manuscript or published paper. The Notebook should ideally be written\n    such that it can act as supporting information (SI) for a journal.\n    Here's some [inspiration](http://nbviewer.jupyter.org/github/jansoe/FUImaging/blob/master/examples/IOSsegmentation/regNMF.ipynb).\n  - _or_ a Notebook presenting a textbook topic of choice and aimed at students.\n    Here's some [inspiration](http://nbviewer.jupyter.org/github/demotu/BMC/blob/master/notebooks/Transformation2D.ipynb).\n  - **Deadline for project: January 30th**\n  \n2. Each student will upload their project to a public GitHub repository created through [GitHub Classroom](https://classroom.github.com/a/b7DO7_Ok). \nFor a brief introduction to git repositories, see [here](https://guides.github.com/activities/hello-world/#commit). Details and repositories will be made available at the end of the course. \n\n3. A peer-review process where each student reviews and writes comments on _two_ other notebooks by creating issues on the respective GitHub repositories.\n   The review should be based on the criteria listed below. 
For each point, include specific suggestions for improvements. The teachers can also add feedback on how to improve the notebook. **Deadline for review: February 15th.** \u003cbr\u003e \n   \n4. The deadline for implementing the reviewer comments on your notebook and __answering the GitHub issues__ is **March 15th**. At this point you should also have a Zenodo DOI for your project; add it as a badge to your repository or as a link in your README. You will also have to add (to your GitHub repository) a text file that explains what changes you've made, and why. This process simulates the peer review of scientific papers. \n   \n5. Save your project to your own GitHub repository when the course has finished, as we may delete it before the next course event.\n\n\u003ca name=\"Requirements\"\u003e\u003c/a\u003e\n## Notebook Requirements\n\nThis checklist summarizes the minimum requirements for the Notebook project to be approved. It should be used as a reference for both the development of the Notebook and the peer-review process.\n\n- [ ] Documentation:\n  - [ ] includes a title and abstract of the project (max 300 words)\n  - [ ] includes instructions on how to run the notebook \n  - [ ] includes the required packages in an `environment.yml` file\n  - [ ] includes a brief explanation of the reason each package/library was used \n  - [ ] includes rich documentation using Markdown (equations, tables, links, images or videos)\n  - [ ] is reproducible, _i.e._, someone else should be able to redo the steps\n- [ ] Input/Output:\n  - [ ] uses `pandas` to read large data sets or `numpy` to load data from text files\n  - [ ] uses `pandas` to save the processed or generated data to disk\n- [ ] Scientific computing/data processing:\n  - [ ] performs numerical operations (`numpy`, `scipy`, `pandas`) or manipulates, groups, and aggregates a data set (`pandas`) \n- [ ] Data visualization:\n  - [ ] includes at least one composite plot (inset or multiple panels)\n  - [ ] 
produces _publication-ready_ figures (see [here](http://dx.doi.org/10/cg2g) for an editorial guide on _Graphical Excellence_):\n    - [ ] the figures are 89 mm wide (single column) or 183 mm wide (double column)\n    - [ ] the axes are labeled\n    - [ ] the font sizes are sufficiently large\n    - [ ] the figures are saved as rasterized images (300 dpi) or vector art\n- [ ] Version control, sharing, and archiving:\n  - [ ] is archived in a repository with a digital object identifier (DOI)\n  \n\u003ca name=\"Zenodo\"\u003e\u003c/a\u003e\n## Getting a DOI via Zenodo\n\nPart of your project work will consist of adding a Digital Object Identifier ([DOI](https://en.wikipedia.org/wiki/Digital_object_identifier)) to your work through Zenodo. \nTo do that, you should watch the video listed under Day 3:\n\n- Version control, sharing, and archiving (GitHub and [Zenodo](https://youtu.be/IdLSGZAdhlQ?t=266))\n\nThe easiest and preferred way to do it is by first connecting your GitHub account to Zenodo, then enabling the repository to be seen by Zenodo, and finally making a tag in GitHub, following the instructions [here](https://guides.github.com/activities/citable-code/).\n\n\u003ca name=\"Environments\"\u003e\u003c/a\u003e\n## Create and Export Conda Environments\n\nThe command to [create a new environment](https://conda.io/projects/conda/en/latest/user-guide/tasks/manage-environments.html#creating-an-environment-with-commands) with Python x.y is \n```bash\nconda create --name myenv python=x.y\n```\nwhere `myenv` is a name of your choice for the new environment and x.y is a specific Python version (e.g. 
2.7 or 3.6).\nAfter activating the environment (`conda activate myenv`), you can install all the other packages within the environment.\n`conda list` shows the list of packages installed in the environment.\nThe command to [export the active environment](https://conda.io/projects/conda/en/latest/user-guide/tasks/manage-environments.html#sharing-an-environment) `myenv` to an environment YAML file (e.g. `myenv.yml`) is\n```bash\nconda env export \u003e myenv.yml\n```\n\n\u003ca name=\"Troubleshooting\"\u003e\u003c/a\u003e\n## Troubleshooting\n\nIf your notebook fails to connect to the kernel and reports an error similar to the lines below:\n\n   ```\n[E 12:18:57.001 NotebookApp] Uncaught exception in /api/kernels/5e16fa4b-3e35-4265-89b0-ab36bb0573f5/channels\n    Traceback (most recent call last):\n      File \"/Library/Python/2.7/site-packages/tornado-5.0a1-py2.7-macosx-10.13-intel.egg/tornado/websocket.py\", line 494, in _run_callback\n        result = callback(*args, **kwargs)\n      File \"/Library/Python/2.7/site-packages/notebook-5.2.2-py2.7.egg/notebook/services/kernels/handlers.py\", line 258, in open\n        super(ZMQChannelsHandler, self).open()\n      File \"/Library/Python/2.7/site-packages/notebook-5.2.2-py2.7.egg/notebook/base/zmqhandlers.py\", line 168, in open\n        self.send_ping, self.ping_interval, io_loop=loop,\n    TypeError: __init__() got an unexpected keyword argument 'io_loop'\n[I 12:18:58.021 NotebookApp] Adapting to protocol v5.1 for kernel 5e16fa4b-3e35-4265\n   ```\nyou should either (a) downgrade the `tornado` package or (b) change line 178 of the file \n\n```\n[your conda installation location]/miniconda3/envs/LUcompute/lib/python3.6/site-packages/notebook/base/zmqhandlers.py \n```\n\nfrom \n\n   ```\n                self.send_ping, self.ping_interval, io_loop=loop,\n   ```\n\ninto\n\n   ```\n                self.send_ping, self.ping_interval,\n   
```\n\n\nSee also: https://stackoverflow.com/questions/48090119/jupyter-notebook-typeerror-init-got-an-unexpected-keyword-argument-io-l\n\n\u003ca name=\"External\"\u003e\u003c/a\u003e\n## External Resources\n\n- Cross-language interaction is a striking feature of Jupyter notebooks: the ability to combine multiple languages in the same notebook makes it possible to use the best tools of each language in the different steps of data analysis. You can read more about it in [this post](https://blog.jupyter.org/i-python-you-r-we-julia-baf064ca1fb6).\n- The Jupyter notebook is a very popular tool for working with data in academia as well as in the private sector.\n  - These [tutorials](https://www.gw-openscience.org/tutorials/) show how the [LIGO/VIRGO collaboration](https://www.nobelprize.org/prizes/physics/2017/press-release/) extensively uses Jupyter notebooks to communicate its research.\n  - The streaming service Netflix currently uses Jupyter notebooks as the main tool for data analysis. For example, the recommendation algorithms that suggest which movies or TV series to watch next are currently run on Jupyter notebooks. You can read more about it in [this post](https://medium.com/netflix-techblog/notebook-innovation-591ee3221233).\n  - In 2017 Jupyter received the [ACM Software System Award](https://blog.jupyter.org/jupyter-receives-the-acm-software-system-award-d433b0dfe3a2), a prestigious award that it shares with projects such as Unix and the Web.\n- There are many freely available online resources to learn data science. \n  - The best resource to find help with programming and scripting is [Stack Overflow](https://stackoverflow.com), which is a question and answer website curated by software developer communities. \n  - An excellent book is \"Python Data Science Handbook\" by Jake VanderPlas, which is freely available as Jupyter notebooks at [this GitHub page](https://github.com/jakevdp/PythonDataScienceHandbook). 
On the author's webpage, you can also find a list of excellent [talks, lectures, and tutorials](http://vanderplas.com/speaking.html) and a [blog](http://jakevdp.github.io/).\n  - Yet another useful resource is the podcast [Data Skeptic](https://dataskeptic.com) which features a collection of entertaining and educational mini-lectures on data science as well as interviews with experts.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmlund%2Fjupyter-course","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fmlund%2Fjupyter-course","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmlund%2Fjupyter-course/lists"}