https://github.com/Chris-Engelhardt/data_sci_guide
A community-sourced data science repo.
- Host: GitHub
- URL: https://github.com/Chris-Engelhardt/data_sci_guide
- Owner: Chris-Engelhardt
- Created: 2019-04-14T13:42:35.000Z (about 6 years ago)
- Default Branch: master
- Last Pushed: 2021-02-03T14:35:06.000Z (about 4 years ago)
- Last Synced: 2024-08-13T07:04:16.207Z (8 months ago)
- Size: 61.5 KB
- Stars: 545
- Watchers: 48
- Forks: 84
- Open Issues: 0
Metadata Files:
- Readme: README.md
Awesome Lists containing this project
- jimsghstars - Chris-Engelhardt/data_sci_guide - A community-sourced data science repo. (Others)
# Repo Purpose
Welcome to your community-sourced data science repo! The overarching goal here is to provide anyone interested in learning data science with a wealth of open source, industry-best learning materials and learning tracks.
> This repo is a work in progress. Please check back for updates. @momiji15, @Annu-07, and I are collaborating on the structure for this repo. If you would like to be involved in that process, please file an issue in this repo and we will add you to our Slack channel.
This repo is motivated by recent incidents. The data science community deserves better, and this repo is an attempt to provide a platform for the excellent learning resources available.
## Guided Data Science Resources
* [Dataquest](https://www.dataquest.io/)
* [Software Carpentry Lessons](https://software-carpentry.org/lessons/)
* [Data Carpentry Lessons](https://datacarpentry.org/lessons/)
* [Chromebook Data Science](http://jhudatascience.org/chromebookdatascience/cbds.html)
* [Business Science University](https://university.business-science.io/p/jumpstart-with-r)
## Direct Course Replacements
Many instructors have admirably [advocated against taking their own DataCamp courses](https://twitter.com/noamross/status/1116667602741485571). Often, these instructors have suggested other ways in which learners can access the same material. The suggested replacements for their courses are listed below:
### R Courses
[Introduction to R](https://rstudio.cloud/learn/primers)
* Also see [here](http://swcarpentry.github.io/r-novice-inflammation/)
[RYouWithMe](https://rladiessydney.org/courses/ryouwithme/)
[Ready for R](https://ready4r.netlify.app)
[Intermediate R](https://www.dataquest.io/course/intermediate-r-programming/)
[Working with the RStudio IDE (Part 1)](https://resources.rstudio.com/)
[Working with the RStudio IDE (Part 2)](https://resources.rstudio.com/)
[Cleaning Data in R](https://www.dataquest.io/course/r-data-cleaning)
* Also see [here](https://www.coursera.org/learn/data-cleaning?specialization=jhu-data-science)
[Importing & Cleaning Data in R: Case Studies](https://app.dataquest.io/course/r-data-cleaning)
* See Guided Project: NYC Schools Perceptions
[Working with Dates and Times in R](https://d-rug.github.io/blog/2014/using-times-and-dates-in-r-presentation-code)
[Categorical Data in the Tidyverse](https://r4ds.had.co.nz/factors.html)
* Also see [here](https://forcats.tidyverse.org/)
[Writing Functions in R](https://r4ds.had.co.nz/functions.html)
* Next, [learn about iteration](https://r4ds.had.co.nz/iteration.html)
[Data Manipulation in R with dplyr](https://cran.r-project.org/web/packages/dplyr/vignettes/dplyr.html)
[Data Analysis in R, the data.table Way](https://cran.r-project.org/web/packages/data.table/vignettes/datatable-intro.html)
* See [here](https://cran.r-project.org/web/packages/data.table/vignettes/datatable-reference-semantics.html) for updating
* See [here](https://cran.r-project.org/web/packages/data.table/vignettes/datatable-secondary-indices-and-auto-indexing.html) for indexing
[Building Processing Pipelines in `data.table`](https://github.com/jameslamb/teaching/tree/master/datacamp_audition)
[Developing R Packages](https://kbroman.org/pkg_primer/)
* Also see the [`usethis`](https://www.tidyverse.org/articles/2019/04/usethis-1.5.0/) documentation
[Foundations of Probability in R](https://www.coursera.org/learn/probability-intro?specialization=statistics)
* See weeks 3 and 4
* Also see [here](https://www.edx.org/course/r-probability-2)
[Dealing With Missing Data in R](https://cran.r-project.org/web/packages/naniar/vignettes/getting-started-w-naniar.html)
[Dimensionality Reduction in R](https://www.coursera.org/lecture/statistical-genomics/dimension-reduction-in-r-8-48-8AYGh)
[Advanced Dimensionality Reduction in R](https://courses.edx.org/courses/course-v1:HarvardX+PH525.4x+2T2018/course/)
[Foundations of Inference](https://www.coursera.org/learn/inferential-statistics-intro?specialization=statistics)
[Correlation and Regression](https://learningstatisticswithr.com/book/regression.html)
[Fundamentals of Bayesian Data Analysis in R](https://www.coursera.org/learn/bayesian)
[Structural Equation Modeling with lavaan in R](http://statstools.com/learn/structural-equation-modeling/)
[Introduction to Machine Learning](https://lagunita.stanford.edu/courses/HumanitiesSciences/StatLearning/Winter2016/course/)
[Supervised Machine Learning: Case Studies in R](https://supervised-ml-course.netlify.com/)
[Unsupervised Learning in R](https://lagunita.stanford.edu/courses/HumanitiesSciences/StatLearning/Winter2016/course/)
[Machine Learning Toolbox](https://lagunita.stanford.edu/courses/HumanitiesSciences/StatLearning/Winter2016/course/)
* Also see a book on the [`caret` package](http://topepo.github.io/caret/index.html)
[Differential Expression Analysis in R with limma](https://jdblischak.github.io/dc-bioc-limma/)
[Bayesian Regression Modeling with `rstanarm`](https://mc-stan.org/rstanarm/articles/index.html)
* Also see a [walkthrough article](http://www.tqmp.org/RegularArticles/vol14-2/p099/p099.pdf) and a [practical example](https://mc-stan.org/users/documentation/case-studies/tutorial_rstanarm.html)
[Forecasting Using R](https://otexts.com/fpp2/)
[Introduction to Time Series Analysis](https://www.coursera.org/learn/practical-time-series-analysis)
* Also see [here](https://otexts.com/fpp2/)
[ARIMA Modeling with R](https://otexts.com/fpp2/arima-r.html)
[Forecasting Product Demand in R](https://www.coursera.org/learn/practical-time-series-analysis)
* Also see [here](https://otexts.com/fpp2/)
[Nonlinear Modeling in R with GAMs](https://github.com/noamross/gam-resources)
[Marketing Analytics in R: Choice Modeling](http://r-marketing.r-forge.r-project.org/)
* Please see Chapter 13
[Hyperparameter Tuning in R](http://topepo.github.io/caret/model-training-and-tuning.html#model-training-and-parameter-tuning)
* Also see the [`mlr` package docs](https://mlr.mlr-org.com/articles/tutorial/tune.html) and [the `h2o` package docs](http://docs.h2o.ai/h2o/latest-stable/h2o-docs/grid-search.html)
[Exploratory Data Analysis](https://classroom.udacity.com/courses/ud651)
[Exploratory Data Analysis in R: Case Study](https://www.coursera.org/learn/exploratory-data-analysis?specialization=jhu-data-science)
* Please see week 4
[Visualization Best Practices in R](https://socviz.co/)
[Data Visualization with ggplot2 (Part 1)](https://www.dataquest.io/course/r-data-viz/)
[Data Visualization with ggplot2 (Part 2)](https://r4ds.had.co.nz/data-visualisation.html)
[Building Dashboards with `shinydashboard`](https://leanpub.com/c/shinydashboard)
[Building Dashboards with `flexdashboard`](https://rmarkdown.rstudio.com/flexdashboard/)
[Interactive Data Visualization with `rbokeh`](https://hafen.github.io/rbokeh/)
[Interactive Maps with `leaflet` in R](https://rstudio.github.io/leaflet/)
[Working with Geospatial Data in R](https://geocompr.robinlovelace.net)
[Building Web Applications in R with Shiny](https://laderast.github.io/gradual_shiny/)
[Building Web Applications in R with Shiny: Case Studies](https://shiny.rstudio.com/gallery)
[Introduction to Text Analysis in R](https://www.tidytextmining.com/)
[Text Mining with R](https://www.tidytextmining.com/)
* Also see the associated [blogs](https://juliasilge.com/blog/) and [tutorials](https://github.com/juliasilge/deming2018)
[Sentiment Analysis in R](https://www.tidytextmining.com/)
[Sentiment Analysis in R: The Tidy Way](https://www.tidytextmining.com/)
[Analyzing Election and Polling Data in R](https://www.thecrosstab.com/project/r-politics-guide/)
[Analyzing US Census Data in R](https://walkerke.github.io/tidycensus/articles/basic-usage.html)
[Single-Cell RNA-Seq Workflows in R](https://bioconductor.org/packages/release/workflows/html/simpleSingleCell.html)
[Data science for the medical and biomedical sciences (ds4biomed)](https://ds4biomed.tech/)
### Python Courses
[Introduction to Python](https://classroom.udacity.com/courses/ud170)
* Also see [here](http://swcarpentry.github.io/python-novice-inflammation/)
[Python for R Users](https://github.com/webartifex/intro-to-python)
[Python for MATLAB Users](https://github.com/webartifex/intro-to-python)
[Introduction to Data Science in Python](https://www.dataquest.io/course/python-for-data-science-fundamentals/)
[Elements of Data Science](https://allendowney.github.io/ElementsOfDataScience/)
[Intermediate Python for Data Science](https://classroom.udacity.com/courses/ud359)
[Object-Oriented Programming in Python](https://github.com/webartifex/intro-to-python)
[Writing Efficient Python Code](https://github.com/webartifex/intro-to-python)
[Analyzing Police Activity with pandas](https://www.dataschool.io/best-practices-with-pandas/)
[Interactive Data Visualization with Bokeh](https://mybinder.org/v2/gh/bokeh/bokeh-notebooks/master?filepath=tutorial%2F00%20-%20Introduction%20and%20Setup.ipynb)
[Advanced NLP with spaCy](https://course.spacy.io/)
[Intro to Python for Finance](https://github.com/webartifex/intro-to-python)
### SQL Courses
[Intro to SQL for Data Science](https://www.khanacademy.org/computing/computer-programming/sql)
* Also see [here](https://www.dataquest.io/course/sql-fundamentals/) (**requires subscription**)
[Intermediate SQL](https://mode.com/sql-tutorial/)
* Please see the Intermediate SQL section
[Intermediate SQL Server](https://www.edx.org/course/querying-data-with-transact-sql-2)
[Joining Data in SQL](https://librarycarpentry.org/lc-sql/)
* Also see [here](https://www.dataquest.io/course/sql-joins-relations/) (**requires subscription**)
### Git Courses
[Introduction to Git for Data Science](https://git-scm.com/book/en/v2)
* Also see [here](https://librarycarpentry.org/lc-git/)
* Also see [git branching](https://learngitbranching.js.org/)
### Shell Courses
[Introduction to Shell for Data Science](https://librarycarpentry.org/lc-shell/)
* Also see [here](http://swcarpentry.github.io/shell-novice/)
* Also see [here](https://www.dataquest.io/course/command-line-beginner/) (**requires subscription**)
### Spreadsheet Courses
[Spreadsheet Basics](https://datacarpentry.org/spreadsheets-socialsci/)
## Contributing
> Please feel free to submit a pull request. The full list of DataCamp (DC) courses can be found [here](https://t.co/gahfLYYY2l?amp=1).