Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/chaconnewu/free-data-science-books
Free resources for learning data science
- Host: GitHub
- URL: https://github.com/chaconnewu/free-data-science-books
- Owner: chaconnewu
- License: unlicense
- Created: 2013-11-26T21:45:11.000Z (about 11 years ago)
- Default Branch: master
- Last Pushed: 2024-06-05T11:27:34.000Z (6 months ago)
- Last Synced: 2024-10-16T06:23:34.707Z (about 2 months ago)
- Size: 50.8 KB
- Stars: 2,912
- Watchers: 363
- Forks: 1,095
- Open Issues: 3
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
- awesome-ai-data-github-repos - List of Data Science/Big Data Resources
README
List of Data Science/Big Data Resources
======================
This list contains free learning resources for data science and big data related concepts, techniques, and applications. It is inspired by [Free Programming Books](https://github.com/vhf/free-programming-books). Each entry gives the intended audience for the book (beginner, intermediate, or veteran). The rating may be subjective, but it gives some indication of how difficult the book is.
### How To Contribute
- Fork
- Edit, and add your recommendations (for beginner, intermediate, or veteran)
- Send a Pull Request

### Index
* [Data Science Introduction](#data-science-introduction)
* [Data Processing](#data-processing)
* [Data Analysis](#data-analysis)
* [Fundamentals](#fundamentals)
* [Network Analysis](#network-analysis)
* [Statistics](#statistics)
* [Data Mining](#data-mining)
* [Machine Learning](#machine-learning)
* [Data Science Application](#data-science-application)
* [Data Visualization](#data-visualization)
* [Uncategorized](#uncategorized)
* [MOOCs about Data Science](#moocs-about-data-science)

### Data Science Introduction
* [Data Science: An Introduction](http://en.wikibooks.org/wiki/Data_Science:_An_Introduction) - Wikibook - `Beginner`
* [Disruptive Possibilities: How Big Data Changes Everything](http://www.amazon.com/Disruptive-Possibilities-Data-Changes-Everything-ebook/dp/B00CLH387W) - Jeffrey Needham - `Beginner`
* [Introduction to Data Science](http://jsresearch.net/) - Jeffrey Stanton - `Beginner`
* [Real-Time Big Data Analytics: Emerging Architecture](http://www.amazon.com/Real-Time-Big-Data-Analytics-Architecture-ebook/dp/B00DO33RSW) - Mike Barlow - `Beginner`
* [The Evolution of Data Products](http://www.amazon.com/The-Evolution-Data-Products-ebook/dp/B005QEKQUY/ref=sr_1_63?s=digital-text&ie=UTF8&qid=1351898530&sr=1-63) - Mike Loukides - `Beginner`
* [The Promise and Peril of Big Data](http://www.aspeninstitute.org/sites/default/files/content/docs/pubs/The_Promise_and_Peril_of_Big_Data.pdf) - David Bollier - `Beginner`

### Data Processing
* [Data-Intensive Text Processing with MapReduce](http://lintool.github.io/MapReduceAlgorithms/MapReduce-book-final.pdf) - Jimmy Lin and Chris Dyer - `Intermediate`

### Data Analysis
#### Fundamentals
* [Fundamental Numerical Methods and Data Analysis](http://ads.harvard.edu/books/1990fnmd.book/) - George W. Collins - `Beginner`
* [Introduction to Metadata](http://www.getty.edu/research/publications/electronic_publications/intrometadata/index.html) - Murtha Baca - `Beginner`
* [Introduction to R - Notes on R: A Programming Environment for Data Analysis and Graphics](http://cran.r-project.org/doc/manuals/R-intro.pdf) - W. N. Venables, D. M. Smith, and the R Core Team - `Beginner`
* [Modeling with Data: Tools and Techniques for Scientific Computing](http://modelingwithdata.org/about_the_book.html) - Ben Klemens - `Beginner`
* [R for Data Science: Import, Tidy, Transform, Visualize, and Model Data](http://r4ds.had.co.nz/) - Hadley Wickham & Garrett Grolemund - `Beginner`
* [Advanced R](http://adv-r.had.co.nz/) - Hadley Wickham - `Intermediate`

#### Network Analysis
* [Introduction to Social Network Methods](http://faculty.ucr.edu/~hanneman/nettext/) - Robert A. Hanneman and Mark Riddle - `Intermediate`
* [Networks, Crowds, and Markets: Reasoning About a Highly Connected World](http://www.cs.cornell.edu/home/kleinber/networks-book/) - David Easley and Jon Kleinberg - `Intermediate`
* [Network Science](http://barabasilab.neu.edu/networksciencebook/downlPDF.html) - Albert-László Barabási - `Beginner`
* [The Wealth of Networks](http://www.benkler.org/Benkler_Wealth_Of_Networks.pdf) - Yochai Benkler - `Beginner`

#### Statistics
* [Advanced Data Analysis from an Elementary Point of View](http://www.stat.cmu.edu/~cshalizi/ADAfaEPoV/ADAfaEPoV.pdf) - Cosma Rohilla Shalizi - `Veteran`
* [An Introduction to R](http://cran.r-project.org/doc/manuals/R-intro.pdf) - W. N. Venables, D. M. Smith, and the R Core Team - `Beginner`
* [Analyzing Linguistic Data: A Practical Introduction to Statistics](http://www.ualberta.ca/~baayen/publications/baayenCUPstats.pdf) - R. H. Baayen - `Beginner`
* [Applied Data Science](http://columbia-applied-data-science.github.io/appdatasci.pdf) - Ian Langmore and Daniel Krasner - `Intermediate`
* [Concepts and Applications of Inferential Statistics](http://vassarstats.net/textbook/) - Richard Lowry - `Beginner`
* [Forecasting: Principles and Practice](https://www.otexts.org/fpp/) - Rob J. Hyndman and George Athanasopoulos - `Intermediate`
* [Introduction to Probability](http://www.dartmouth.edu/~chance/teaching_aids/books_articles/probability_book/pdf.html) - Charles M. Grinstead and J. Laurie Snell - `Beginner`
* [Introduction to Statistical Thought](http://www.math.umass.edu/~lavine/Book/book.pdf) - Michael Lavine - `Beginner`
* [OpenIntro Statistics - Second Edition](http://www.openintro.org/stat/textbook.php) - David M. Diez, Christopher D. Barr, and Mine Cetinkaya-Rundel - `Beginner`
* [simpleR - Using R for Introductory Statistics](http://cran.r-project.org/doc/contrib/Verzani-SimpleR.pdf) - John Verzani - `Beginner`
* [Statistics](http://upload.wikimedia.org/wikipedia/commons/8/82/Statistics.pdf) - `Beginner`
* [Think Stats: Probability and Statistics for Programmers v2.0](http://greenteapress.com/thinkstats2/thinkstats2.pdf) - Allen B. Downey - `Beginner`
* [Computer Age Statistical Inference: Algorithms, Evidence and Data Science](https://web.stanford.edu/~hastie/CASI/) - Bradley Efron and Trevor Hastie - `Intermediate`

#### Data Mining
* [Data Mining and Analysis: Fundamental Concepts and Algorithms](https://repo.palkeo.com/algo/information-retrieval/Data%20mining%20and%20analysis.pdf) - Mohammed J. Zaki and Wagner Meira Jr. - `Intermediate`
* [Data Mining and Knowledge Discovery in Real Life Applications](http://www.intechopen.com/books/data_mining_and_knowledge_discovery_in_real_life_applications) - Julio Ponce and Adem Karahoca - `Beginner`
* [Data Mining for Social Network Data](http://link.springer.com/book/10.1007%2F978-1-4419-6287-4) - Springer - `Veteran`
* [Mining of Massive Datasets](http://infolab.stanford.edu/~ullman/mmds/book.pdf) - Anand Rajaraman, Jure Leskovec, and Jeffrey D. Ullman - `Intermediate`
* [Knowledge-Oriented Applications in Data Mining](http://www.intechopen.com/books/knowledge-oriented-applications-in-data-mining) - Kimito Funatsu - `Intermediate`
* [New Fundamental Technologies in Data Mining](http://www.intechopen.com/books/new-fundamental-technologies-in-data-mining) - Kimito Funatsu - `Intermediate`
* [R and Data Mining: Examples and Case Studies](http://cran.r-project.org/doc/contrib/Zhao_R_and_data_mining.pdf) - Yanchang Zhao - `Beginner`
* [The Elements of Statistical Learning](http://statweb.stanford.edu/~tibs/ElemStatLearn/) - Trevor Hastie, Robert Tibshirani, and Jerome Friedman - `Intermediate`
* [Theory and Applications for Advanced Text Mining](http://www.intechopen.com/books/theory-and-applications-for-advanced-text-mining) - Shigeaki Sakurai - `Intermediate`

#### Machine Learning
* [A Course in Machine Learning](http://ciml.info/) - Hal Daume - `Beginner`
* [A First Encounter with Machine Learning](https://www.ics.uci.edu/~welling/teaching/273ASpring10/IntroMLBook.pdf) - Max Welling - `Beginner`
* [Bayesian Reasoning and Machine Learning](http://web4.cs.ucl.ac.uk/staff/D.Barber/textbook/031013.pdf) - David Barber - `Veteran`
* [Gaussian Processes for Machine Learning](http://www.gaussianprocess.org/gpml/chapters/) - Carl Edward Rasmussen and Christopher K. I. Williams - `Veteran`
* [Introduction to Machine Learning](http://alex.smola.org/drafts/thebook.pdf) - Alex Smola and S.V.N. Vishwanathan - `Intermediate`
* [Probabilistic Programming & Bayesian Methods for Hackers](http://camdavidsonpilon.github.io/Probabilistic-Programming-and-Bayesian-Methods-for-Hackers/) - Cam Davidson-Pilon (main author) - `Intermediate`
* [The LION Way: Machine Learning plus Intelligent Optimization](http://www.lionsolver.com/LIONbook/) - Roberto Battiti and Mauro Brunato - `Intermediate`
* [Think Bayes](http://www.greenteapress.com/thinkbayes/) - Allen B. Downey - `Beginner`
* [Sklearn Basics](http://nbviewer.ipython.org/github/jakevdp/sklearn_scipy2013/tree/master/notebooks/) - `Beginner`
* [Deep Learning](http://www.deeplearningbook.org) - Ian Goodfellow, Yoshua Bengio and Aaron Courville - `Intermediate`

### Data Science Application
#### Information Retrieval
* [Introduction to Information Retrieval](http://nlp.stanford.edu/IR-book/) - Christopher D. Manning, Prabhakar Raghavan, and Hinrich Schütze - `Intermediate`

#### Data Visualization
* [Interactive Data Visualization for the Web](http://chimera.labs.oreilly.com/books/1230000000345/index.html) - Scott Murray - `Beginner`
* [Plotting and Visualization in Python](http://nbviewer.ipython.org/urls/gist.github.com/fonnesbeck/5850463/raw/a29d9ffb863bfab09ff6c1fc853e1d5bf69fe3e4/3.+Plotting+and+Visualization.ipynb) - `Beginner`
* [ggplot2: Elegant Graphics for Data Analysis](https://github.com/hadley/ggplot2-book) - Hadley Wickham - `Beginner`

### Uncategorized
* [Data Journalism Handbook](http://datajournalismhandbook.org/1.0/en/) - Jonathan Gray, Liliana Bounegru, and Lucy Chambers - `Beginner`
* [Building Data Science Teams](http://assets.en.oreilly.com/1/eventseries/23/Building-Data-Science-Teams.pdf) - DJ Patil - `Beginner`
* [Information Theory, Inference, and Learning Algorithms](http://www.inference.phy.cam.ac.uk/itprnn/book.html) - David MacKay - `Intermediate`
* [Mathematics for Computer Science](http://ocw.mit.edu/courses/electrical-engineering-and-computer-science/6-042j-mathematics-for-computer-science-fall-2010/readings/MIT6_042JF10_notes.pdf) - Eric Lehman, Thomas Leighton, and Albert R. Meyer - `Beginner`
* [The Field Guide to Data Science](http://www.boozallen.com/media/file/The-Field-Guide-to-Data-Science.pdf) - `Beginner`

### MOOCs about Data Science
* [Data Mining with Weka](http://www.cs.waikato.ac.nz/ml/weka/mooc/dataminingwithweka/) - Ian H. Witten - `Intermediate`
* [Mining Massive Datasets](https://class.coursera.org/mmds-002) - Jeff Ullman, Jure Leskovec, Anand Rajaraman (Coursera) - `Beginner`
* [Introduction to Data Science](https://class.coursera.org/datasci-001/class) - Bill Howe (Coursera) - `Beginner`
* [Introduction to Hadoop and MapReduce](https://www.udacity.com/course/ud617) - Udacity - `Beginner`
* [Machine Learning](https://class.coursera.org/ml-003/class) - Andrew Ng (Coursera) - `Beginner`
* [Machine Learning Video Library](http://work.caltech.edu/library/#!?goback=.gde_35222_member_5810981726511443971) - Yaser Abu-Mostafa - `Intermediate`
* [Natural Language Processing](https://class.coursera.org/nlp/lecture/preview) - Dan Jurafsky and Christopher Manning (Coursera) - `Intermediate`
* [Social and Economic Networks: Models and Analysis](https://class.coursera.org/networksonline-001/class) - Matthew O. Jackson (Coursera) - `Intermediate`
* [Social Network Analysis](https://class.coursera.org/sna-003/class) - Lada Adamic (Coursera) - `Intermediate`
* [Deep Learning](https://www.coursera.org/specializations/deep-learning) - Andrew Ng (Coursera) - `Intermediate`

## License
[![CC0](http://mirrors.creativecommons.org/presskit/buttons/88x31/svg/cc-zero.svg)](https://creativecommons.org/publicdomain/zero/1.0/)