https://github.com/diestok/basic-machine-learning-for-bioinformatics
ML course materials for bioinformatics students following the Basic Machine Learning for Bioinformatics course at Utrecht University. Course created and taught by Dieter Stoker.
course jupyter-notebook machine-learning python supervised-machine-learning unsupervised-machine-learning
- Host: GitHub
- URL: https://github.com/diestok/basic-machine-learning-for-bioinformatics
- Owner: DieStok
- Created: 2021-10-16T12:30:14.000Z (over 3 years ago)
- Default Branch: main
- Last Pushed: 2023-02-27T09:16:31.000Z (over 2 years ago)
- Last Synced: 2023-09-19T00:43:32.884Z (almost 2 years ago)
- Topics: course, jupyter-notebook, machine-learning, python, supervised-machine-learning, unsupervised-machine-learning
- Language: Jupyter Notebook
- Homepage:
- Size: 87.1 MB
- Stars: 1
- Watchers: 1
- Forks: 1
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# Basic-Machine-Learning-for-Bioinformatics
ML course materials for bioinformatics students following the course Basic Machine Learning for Bioinformatics at UU.

# Topics covered per day (lectures and practicals)
* Day 1: Linear regression, gradient descent, introduction to linear algebra
* Day 2: Logistic regression, regularisation, ROC curve, introduction to neural networks (NNs)
* Day 3: NN Backpropagation algorithm, convolutional neural networks explained, guest speaker on deep learning in Oxford Nanopore sequencing (13:15-14:05)
* Day 4: K-means clustering, hierarchical clustering, deep dive into phylogenetics
* Day 5: Problems with high-dimensional data, Principal Component Analysis (PCA)
* Day 6: Working with scikit-learn, introduction to Keras and TensorFlow, project introduction and start

# Dependencies and running the practicals
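A quick way to confirm that the environment is ready before opening the notebooks is to check that each required package can be found. The sketch below is not part of the course material: the `REQUIRED` mapping and the `missing_packages` helper are hypothetical names, and the import names (for example `Bio` for `biopython`, `sklearn` for scikit-learn, and `pandas_plink` for `pandas-plink`) are assumed from each package's usual convention.

```python
from importlib.util import find_spec

# Install name -> import name.
# ASSUMPTION: the import names follow each package's usual convention;
# they are not taken from the course material itself.
REQUIRED = {
    "numpy": "numpy",
    "scipy": "scipy",
    "pandas": "pandas",
    "scikit-learn": "sklearn",
    "biopython": "Bio",
    "pandas-plink": "pandas_plink",
    "tensorflow": "tensorflow",
    "notebook": "notebook",
    "matplotlib": "matplotlib",
    "seaborn": "seaborn",
}


def missing_packages(required=REQUIRED):
    """Return the install names whose import module cannot be located."""
    # find_spec only looks the module up; it does not import it,
    # so heavy packages such as tensorflow are not loaded here.
    return [name for name, module in required.items() if find_spec(module) is None]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages were found.")
```

Run it with the Python interpreter of the Anaconda environment intended for the practicals; any name it reports can then be installed with `conda` or `pip` under that install name.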
The material assumes a local installation of Anaconda, including the packages `numpy`, `scipy`, `pandas`, `scikit-learn` (imported as `sklearn`), `biopython`, `pandas-plink`, `tensorflow`, `notebook`, `matplotlib`, and `seaborn`.

# More information
For more information and resources, read the [course reader](CourseReaderMLBasics2021_Final.pdf).

# Words of thanks
The course is greatly inspired by, and based on, [Andrew Ng's course on Coursera](https://www.coursera.org/learn/machine-learning/home/welcome). The PCA part is based on [Prof. Victor Lavrenko's excellent lecture series](https://www.youtube.com/watch?v=IbE0tbjy6JQ&list=PLBv09BD7ez_5_yapAg86Od6JeeypkS4YM). Many thanks are owed to [Dr. Jeroen de Ridder](https://www.umcutrecht.nl/en/research/researchers/de-ridder-jeroen-j) for expert assistance. I thank [Dr. ir. Bas van Breukelen](https://www.uu.nl/staff/BvanBreukelen) for long-term assistance and [Prof. Dr. Berend Snel](https://tbb.bio.uu.nl/snel/group.html) for comments on the phylogenetics part. Any errors remain my own (and, with your help, will hopefully be noticed and rectified soon).