Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
awesome-learn-datascience
:chart_with_upwards_trend: Curated list of resources to help you get started with Data Science
https://github.com/siboehm/awesome-learn-datascience
- 'What is Data Science?' on Quora
- Explanation of important vocabulary - Distinguishes between Big Data, Machine Learning, and Data Science.
- Data Science for Business (Book) - An introduction to Data Science and its use as a business asset.
- Supervised vs unsupervised learning - The two most common types of Machine Learning algorithms.
- 9 important Data Science algorithms and their implementation
- Cross validation - Evaluate the performance of your algorithm / model.
- Feature engineering - Transforming the data to improve model predictions.
- Scientific introduction to 10 important Data Science algorithms
- Model ensemble: Explanation - Combine multiple models into one for better performance.
- Data Science tutorials using R
- O'Reilly Data Science from Scratch (Book) - Data processing, implementation, and visualization with example code.
- Coursera Applied Data Science - Online Course using Python that covers most of the relevant toolkits.
- YouTube tutorial series by sentdex
- Interactive Python tutorial website
- numpy
- Numpy tutorial on DataCamp
- pandas
- Introduction to pandas
- DataCamp pandas foundations - Paid course, but new accounts get 30 free days (enough to complete the course).
- Pandas cheatsheet - Quick overview of the most important functions.
- scikit-learn
- Introduction and first model application
- Rough guide for choosing estimators
- Scikit-learn complete user guide
- Model ensemble: Implementation in Python
- Jupyter Notebook
- Downloading and running first Jupyter notebook
- Example notebook for data exploration
- Seaborn data visualization tutorial - Plotting library that works well with Jupyter.
- Template folder structure for organizing Data Science projects
- Anaconda Python distribution - Contains most of the important Python packages for Data Science.
- Spacy - Open source toolkit for working with text-based data.
- LightGBM gradient boosting framework - Successfully used in many Kaggle challenges.
- Amazon AWS - Rent cloud servers for more time-consuming calculations (an r4.xlarge server is a good place to start).
- Walkthrough: House prices challenge - A guided solution to a simple challenge on house prices.
- Blood Donation Challenge - Predict if a donor will donate again.
- Titanic Challenge - Predict survival on the Titanic.
- Water Pump Challenge - Predict the operating condition of water pumps in Africa.
- Awesome Data Science
- Data Science Python
- Machine Learning Tutorials
- License: CC0
Keywords
- machine-learning (3)
- data-science (3)
- r (2)
- python (2)
- datascience (1)
- text-mining (1)
- ai (1)
- cookiecutter (1)
- cookiecutter-data-science (1)
- cookiecutter-template (1)
- data-mining (1)
- decision-trees (1)
- distributed (1)
- gbdt (1)
- gbm (1)
- gbrt (1)
- gradient-boosting (1)
- kaggle (1)
- lightgbm (1)
- microsoft (1)
- parallel (1)
- data-scientists (1)
- python-tutorial (1)
- awesome (1)
- awesome-list (1)
- deep-learning (1)
- deep-learning-tutorial (1)
- deep-neural-networks (1)
- deeplearning (1)
- list (1)
- machinelearning (1)
- neural-network (1)
- neural-networks (1)