Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/aaronwangy/Data-Science-Cheatsheet
A helpful 5-page machine learning cheatsheet to assist with exam reviews, interview prep, and anything in-between.
cheatsheet data-science machine-learning
- Host: GitHub
- URL: https://github.com/aaronwangy/Data-Science-Cheatsheet
- Owner: aaronwangy
- Created: 2021-02-05T06:01:57.000Z (almost 4 years ago)
- Default Branch: main
- Last Pushed: 2023-03-15T22:16:54.000Z (over 1 year ago)
- Last Synced: 2024-10-16T07:05:45.620Z (24 days ago)
- Topics: cheatsheet, data-science, machine-learning
- Language: TeX
- Homepage:
- Size: 4.36 MB
- Stars: 4,926
- Watchers: 149
- Forks: 709
- Open Issues: 6
Metadata Files:
- Readme: README.md
Awesome Lists containing this project
- awesome-list - Data Science Cheatsheet
README
# Data Science Cheatsheet 2.0
A helpful 5-page data science cheatsheet to assist with exam reviews, interview prep, and anything in-between. It covers over a semester of introductory machine learning, and is based on MIT's Machine Learning courses 6.867 and 15.072. The reader should have at least a basic understanding of statistics and linear algebra, though beginners may find this resource helpful as well.
Inspired by Maverick's *Data Science Cheatsheet* (hence the 2.0 in the name), located [here](https://github.com/ml874/Data-Science-Cheatsheet).
Topics covered:
- Linear and Logistic Regression
- Decision Trees and Random Forest
- SVM
- K-Nearest Neighbors
- Clustering
- Boosting
- Dimension Reduction (PCA, LDA, Factor Analysis)
- Natural Language Processing
- Neural Networks
- Recommender Systems
- Reinforcement Learning
- Anomaly Detection
- Time Series
- A/B Testing

This cheatsheet will be occasionally updated with new/improved info, so consider a follow or star to stay up to date.
Future additions (ideas welcome):
- ~~Time Series~~ Added!
- ~~Statistics and Probability~~ Added!
- Data Imputation
- Generative Adversarial Networks
- Graph Neural Networks
## Links
* [Data Science Cheatsheet 2.0 PDF](https://github.com/aaronwangy/Data-Science-Cheatsheet/blob/main/Data_Science_Cheatsheet.pdf)
## Screenshots
Here are screenshots of a couple of pages; the link to the full cheatsheet is above!
![](images/page1-1.png?raw=true)
![](images/page2-1.png?raw=true)
### Why is Python/SQL not covered in this cheatsheet?
I planned for this resource to cover mainly algorithms, models, and concepts, as these rarely change and are common throughout industries. Technical languages and data structures often vary by job function, and refreshing these skills may make more sense on keyboard than on paper.
## License
Feel free to share this resource in classes, review sessions, or to anyone who might find it helpful :)
This work is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
Images are used for educational purposes, created by me, or borrowed from my colleagues [here](https://stanford.edu/~shervine/teaching/cs-229/).
## Contact
Feel free to suggest comments, updates, and potential improvements!

Author - [Aaron Wang](https://www.linkedin.com/in/axw/)
If you'd like to support this cheatsheet, you can buy me a coffee [here](https://www.paypal.me/aaxw). I also do resume, application, and tech consulting - send me a message if interested.