https://github.com/JRaviLab/compbio-gists

bioinformatics comparative-genomics computational-biology data-science gists molecular-evolution phylogeny r shell transcriptomics

# Computational Biology & Bioinformatics Resources
_With programming resources on R, Python, Unix, Git, and Stats._
_Other non-compbio gists will be [here](https://gist.github.com/jananiravi)!_
> NOTE: When the recommendation is an online course, we recommend the *FREE* version.

## Contributors
[Janani Ravi](https://github.com/jananiravi) & [Arjun Krishnan](https://github.com/krishnanlab)

> NOTE: _You can request a gist on a particular topic by opening an [issue](https://github.com/jananiravi/compbio-gists/issues) outlining the details of the problem. Keywords of interest are in the repo description above._

## Table of Contents
* [Cheatsheets](#cheatsheets)
* [Unix](#unix)
* [R](#r)
* [Python](#python)
* [Probability & Statistics](#probability-and-statistics)
* [Biology](#biology)

## Cheatsheets
For R/RStudio, Git/GitHub, Markdown, Unix/vi, Slack, …

https://github.com/jananiravi/cheatsheets

## Unix
* [Command-line Bootcamp](http://rik.smith-unna.com/command_line_bootcamp/)
* [Command-line Guide](http://commandline.guide/) | Also interactive, just like the bootcamp.
* [Linux Journey](https://linuxjourney.com)
* A Unix workshop: [course materials](https://www.dropbox.com/s/1ltlyhtdbccymep/w1-files.zip?dl=0)
  * Day 1 - [Video](https://www.youtube.com/watch?v=liC5uM8czyo) & [Slides](https://www.dropbox.com/s/ggv7ijwateim7zt/day1_Unix.pdf?dl=0)
  * Day 2 - [Video](https://www.youtube.com/watch?v=ArbOG6YpakU) & [Slides](https://www.dropbox.com/s/xorsuvk1cugiyw8/day2_Unix.pdf?dl=0)
  * Day 3 - [Video](https://www.youtube.com/watch?v=PHmfgIuOMFQ) & [Slides](https://www.dropbox.com/s/88wu7svvfur8upw/day3_Unix.pdf?dl=0)
* Command-line refresher from [Software Carpentry](http://swcarpentry.github.io/shell-novice/)

## R
### General introduction to R
* [Swirl](http://swirlstats.com) ('R Programming' & 'Data Analysis' lessons)
* [Programming with R](http://swcarpentry.github.io/r-novice-inflammation/)
* [RStudio Education](https://education.rstudio.com/)
* [Finding Your Way To R](https://education.rstudio.com/learn/) | [Beginners](https://education.rstudio.com/learn/beginner/)
* [RStudio Essentials](https://resources.rstudio.com/)
* [R Cheatsheets](https://www.rstudio.com/resources/cheatsheets/)

#### Data Visualization
A few useful resources to use alongside tidyverse/ggplot2:

1. To pick the right kind of visualization for your data type: https://www.data-to-viz.com/
2. Graph galleries with sample code for R/Python newbies: [R Graph Gallery](https://www.r-graph-gallery.com/) | [Python Graph Gallery](https://python-graph-gallery.com/)
3. [ggplot extension gallery](https://exts.ggplot2.tidyverse.org/gallery/) | https://github.com/ggplot2-exts/gallery

### R for data science and machine learning
* [Data Science Course in a Box](https://datasciencebox.org/) - Introductory data science course covering data acquisition and wrangling, exploratory data analysis, data visualization, inference, modeling, and effective communication of results (with tidyverse, R Markdown, and version control). The course also introduces interactive visualization and reporting, text analysis, and Bayesian inference.
* [RStudio | The Essentials of Data Science](https://resources.rstudio.com/the-essentials-of-data-science)
* [R for Reproducible Scientific Analysis](http://swcarpentry.github.io/r-novice-gapminder/)

### eBooks for R
* R for Data Science | R4DS | Hadley Wickham, Garrett Grolemund |
[eBook](https://r4ds.had.co.nz/)
* Hands-On Programming with R | HOPR | Garrett Grolemund |
[eBook](https://rstudio-education.github.io/hopr/)
* Happy Git and GitHub for the useR | Jenny Bryan |
[eBook](https://happygitwithr.com/)
* [Learning Statistics with R](https://learningstatisticswithr.com/) | Danielle Navarro |
[eBook](https://learningstatisticswithr.com/book/)
* Computational Genomics with R | Altuna Akalin |
[eBook](http://compgenomr.github.io/book/) | _Work in progress_
* R Programming for Data Science | Roger Peng |
[eBook](https://leanpub.com/rprogramming)
* R Graphics Cookbook | Winston Chang |
[eBook](https://r-graphics.org/)

## Python

### General introduction to Python
* [Learn Python the Hard Way](https://learnpythonthehardway.org/book/)
* [Google Python Class](https://developers.google.com/edu/python/)
  * [Videos to follow along](https://www.youtube.com/playlist?list=PLfZeRfzhgQzTMgwFVezQbnpc1ck0I6CQl)
* Introduction to Interactive Programming in Python
  * [Part 1](https://www.coursera.org/learn/interactive-python-1)
  * [Part 2](https://www.coursera.org/learn/interactive-python-2)

### Python for data science and machine learning
* Courses to learn introductory computer science, programming, computational thinking, and data science (video lectures + notes + assignments):
  * [Introduction to Computer Science and Programming in Python](https://ocw.mit.edu/courses/electrical-engineering-and-computer-science/6-0001-introduction-to-computer-science-and-programming-in-python-fall-2016/)
  * [Introduction to Computational Thinking and Data Science](https://ocw.mit.edu/courses/electrical-engineering-and-computer-science/6-0002-introduction-to-computational-thinking-and-data-science-fall-2016/)
* [A Whirlwind Tour of Python](https://jakevdp.github.io/WhirlwindTourOfPython/): [PDF](http://www.oreilly.com/programming/free/files/a-whirlwind-tour-of-python.pdf) and [Jupyter Notebooks](https://github.com/jakevdp/WhirlwindTourOfPython)
* [Scipy Lecture Notes](http://www.scipy-lectures.org/) – Awesome document to learn numerics, science, and data with Python
* Data Wrangling:
  * [Data Wrangling in Python with Pandas - Kaggle](https://www.kaggle.com/learn/pandas)
  * [Video series on data analysis with Pandas](https://www.dataschool.io/easier-data-analysis-with-pandas/) – Excellent set of short videos
* Visualization:
  * [Data Visualization with Python - Kaggle](https://www.kaggle.com/learn/data-visualisation)
  * [Python Plotting for Exploratory Data Analysis](http://pythonplot.com/)
* Machine Learning:
  * [Introduction to ML in Python - Kaggle](https://www.kaggle.com/learn/machine-learning) (Check out both Levels 1 & 2)
  * [Another intro to ML with scikit-learn](https://www.dataschool.io/machine-learning-with-scikit-learn/) – This one contains videos and accompanying Jupyter notebooks + blog posts.
  * [A Quick Demo to ML with Scikit Learn Python Package](https://github.com/mmmayo13/scikit-learn-classifiers/blob/master/sklearn-classifiers-tutorial.ipynb) – A nice demo and tour of scikit-learn.
  * [Deep Learning with Python and TensorFlow - Kaggle](https://www.kaggle.com/learn/deep-learning)
  * [Embeddings with Python and TensorFlow - Kaggle](https://www.kaggle.com/learn/embeddings) – Build deep learning models that handle sparse categorical variables
  * [Machine Learning Explainability](https://www.kaggle.com/learn/machine-learning-explainability)
* General multi-topic resources:
  * [A Step-by-step Guide to Python for Data Science](http://www.dataschool.io/launch-your-data-science-career-with-python/)
  * Always check out the latest PyCon tutorials and talks, almost all of which are available online. [For example, here's a list from PyCon 2017](https://krishnanlab.slack.com/files/arjunkrish/F5MEK7GAK/Python_Videos_of_Interest_to_Lab).

## Probability and statistics
* [Think Stats](https://greenteapress.com/wp/think-stats-2e/) (book + code + solutions; for Python programmers).
* [Learning statistics with R](https://learningstatisticswithr.com/book/) (book + code + solutions; for R programmers).
* [Points of Significance](https://www.nature.com/collections/qghhqm/pointsofsignificance) - an awesome collection of short articles on a variety of topics in statistical data analysis.
* [OpenIntro to Probability and Statistics](https://www.openintro.org/stat/textbook.php?stat_book=os)

#### Statistical learning
> A great resource (book + videos + slides + exercises + example code + solutions) for simultaneously learning both statistical learning and R. [_Statistical learning_ is just another term for _machine learning_ done from a slightly more statistical-modeling point of view.]

* An Introduction to Statistical Learning with Applications in R | Gareth James, Daniela Witten, Trevor Hastie and Robert Tibshirani | http://www-bcf.usc.edu/~gareth/ISL/index.html
  * You can download the latest version of the book as a PDF on that site: http://www-bcf.usc.edu/~gareth/ISL/ISLR%20Seventh%20Printing.pdf
  * I would encourage watching these excellent course lecture videos (by the authors, who’re world-class scientists) that follow the book closely: http://www.dataschool.io/15-hours-of-expert-machine-learning-videos/
  * There are additional slides & videos from another good course based on this book: https://www.alsharif.info/iom530
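
To make the remark about _statistical learning_ vs. _machine learning_ a bit more concrete, here is a minimal sketch in plain Python (standard library only) of simple linear regression fit by ordinary least squares. It is not taken from the book or any of the courses listed here; the function names (`fit_simple_linear_regression`, `predict`) and the toy data are made up for illustration. The same fitted line serves both points of view: the slope and its standard error are what you would inspect when modeling, while `predict` is what you would call when you only care about predictions on new data.

```python
import math


def fit_simple_linear_regression(x, y):
    """Fit y = intercept + slope * x by ordinary least squares."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of squares of x and cross-products of x and y around their means
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual variance uses n - 2 degrees of freedom (two fitted parameters)
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    sigma2 = sum(r ** 2 for r in residuals) / (n - 2)
    slope_se = math.sqrt(sigma2 / sxx)
    return {"intercept": intercept, "slope": slope, "slope_se": slope_se}


def predict(model, new_x):
    """Predict y for each value in new_x using the fitted line."""
    return [model["intercept"] + model["slope"] * xi for xi in new_x]


# Toy data, invented for this example
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

model = fit_simple_linear_regression(x, y)
print("Modeling view: slope =", model["slope"], "+/-", model["slope_se"])
print("Prediction view: y at x = 6 is", predict(model, [6.0])[0])
```

In practice you would use `lm()` in R or scikit-learn in Python, as the resources above do; the hand-written version is only meant to show that both views come from the same fit.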

## Biology
* [Learn genetics](https://learn.genetics.utah.edu/)
* [iBiology](https://www.ibiology.org/biology-videos/)
* [DNA seen through the eyes of a coder](https://ds9a.nl/amazing-dna/) - If you have a computational/quantitative background, you'll especially love this!