https://github.com/denezt/all-things-data-science
All Things Data Science
- Host: GitHub
- URL: https://github.com/denezt/all-things-data-science
- Owner: denezt
- License: unlicense
- Created: 2023-08-27T17:55:36.000Z (about 1 year ago)
- Default Branch: main
- Last Pushed: 2023-08-27T18:52:22.000Z (about 1 year ago)
- Last Synced: 2023-08-27T19:11:33.496Z (about 1 year ago)
- Size: 8.79 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
# Data Science Cheat Sheet
## Table of Contents
- [Introduction to Data Science](#introduction-to-data-science)
- [Data Collection](#data-collection)
- [Data Preprocessing](#data-preprocessing)
- [Exploratory Data Analysis (EDA)](#exploratory-data-analysis-eda)
- [Feature Engineering](#feature-engineering)
- [Machine Learning](#machine-learning)
- [Model Evaluation](#model-evaluation)
- [Visualization](#visualization)
- [Resources](#resources)

## Introduction to Data Science
- What is Data Science?
- Data Science Process
- Importance of Domain Knowledge

[Go To Top](#table-of-contents)
## Data Collection
- Types of Data (Structured, Unstructured, Semi-Structured)
- Data Sources (Databases, APIs, Web Scraping)
- Data Quality and Cleaning

[Go To Top](#table-of-contents)
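
As a rough illustration of the data types listed above, the sketch below reads structured data (CSV) and semi-structured data (JSON) using only the Python standard library, then runs a first data-quality check. The inline strings are made-up stand-ins for a database export and an API response; every field name is an assumption for the example.

```python
import csv
import io
import json

# Structured data: fixed columns, here as CSV text
# (a made-up stand-in for a database export).
csv_text = "id,name,age\n1,Ada,36\n2,Linus,\n"
rows = list(csv.DictReader(io.StringIO(csv_text)))

# Semi-structured data: nested and optional fields, here as a JSON string
# (a made-up stand-in for an API response).
json_text = '{"users": [{"id": 1, "tags": ["admin"]}, {"id": 2}]}'
payload = json.loads(json_text)

# Join the two sources by id; users without tags get an empty list.
tags_by_id = {str(u["id"]): u.get("tags", []) for u in payload["users"]}
records = [{**row, "tags": tags_by_id.get(row["id"], [])} for row in rows]

# A first data-quality check: which (id, field) pairs are empty strings?
missing = [(r["id"], k) for r in records for k, v in r.items() if v == ""]

print(records)
print(missing)
```

Note that `csv.DictReader` returns every value as a string, so type conversion and handling of the empty `age` field are left to the preprocessing step.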
## Data Preprocessing
- Handling Missing Values
- Data Transformation (Scaling, Normalization)
- Encoding Categorical Variables
- Outlier Detection and Treatment

[Go To Top](#table-of-contents)
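
The four preprocessing steps above can be sketched with pandas on a tiny made-up table. The column names, the median imputation, the 1.5 × IQR outlier rule, and min-max scaling are illustrative choices, not the only reasonable ones.

```python
import numpy as np
import pandas as pd

# Made-up data: one missing age and one implausible age.
df = pd.DataFrame({
    "age": [25.0, 30.0, np.nan, 35.0, 120.0],
    "city": ["Berlin", "Paris", "Berlin", "Rome", "Paris"],
})

# Handling missing values: fill with the column median.
df["age"] = df["age"].fillna(df["age"].median())

# Outlier detection: flag values outside 1.5 * IQR of the quartiles.
q1, q3 = df["age"].quantile([0.25, 0.75])
iqr = q3 - q1
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
is_outlier = (df["age"] < lower) | (df["age"] > upper)

# Outlier treatment: clip to the IQR bounds instead of dropping rows.
df["age"] = df["age"].clip(lower, upper)

# Data transformation: min-max scaling to the range [0, 1].
age_range = df["age"].max() - df["age"].min()
df["age_scaled"] = (df["age"] - df["age"].min()) / age_range

# Encoding categorical variables: one-hot encode the city column.
df = pd.get_dummies(df, columns=["city"])

print(df)
```

Outliers are handled before scaling here because min-max scaling is sensitive to extreme values; a single large value would otherwise compress every other row toward zero.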
## Exploratory Data Analysis (EDA)
- Summary Statistics (Mean, Median, Variance)
- Data Visualization (Histograms, Box Plots, Scatter Plots)
- Correlation Analysis
- Distribution Analysis

[Go To Top](#table-of-contents)
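
A minimal NumPy sketch of the EDA steps above, using two made-up variables (`hours` and `score` are hypothetical names). It computes summary statistics, a Pearson correlation, and histogram bin counts as a text-only view of the distribution.

```python
import numpy as np

# Made-up observations: hours studied and exam score.
hours = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
score = np.array([52.0, 55.0, 61.0, 70.0, 72.0])

# Summary statistics; ddof=1 gives the sample variance.
summary = {
    "mean": float(np.mean(score)),
    "median": float(np.median(score)),
    "variance": float(np.var(score, ddof=1)),
}

# Correlation analysis: Pearson correlation between the two variables.
r = float(np.corrcoef(hours, score)[0, 1])

# Distribution analysis: bin counts for a 4-bin histogram.
counts, edges = np.histogram(score, bins=4)

print(summary)
print(round(r, 3))
print(counts, edges)
```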
## Feature Engineering
- Importance of Feature Engineering
- Feature Extraction (Dimensionality Reduction, PCA)
- Feature Selection (Correlation, Importance)
- Creating Interaction Features

[Go To Top](#table-of-contents)
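
The following sketch illustrates three of the ideas above with scikit-learn on synthetic data: PCA for dimensionality reduction, a correlation threshold for feature selection, and pairwise products as interaction features. The 0.95 threshold and the synthetic columns are assumptions chosen for the example.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import PolynomialFeatures, StandardScaler

rng = np.random.default_rng(0)
base = rng.normal(size=(100, 2))

# The third column is nearly a copy of the first,
# so the data is effectively two-dimensional.
noisy_copy = base[:, 0] + 0.01 * rng.normal(size=100)
X = np.column_stack([base[:, 0], base[:, 1], noisy_copy])

# Feature extraction: standardize, then project onto 2 principal components.
X_std = StandardScaler().fit_transform(X)
pca = PCA(n_components=2)
X_reduced = pca.fit_transform(X_std)
explained = float(pca.explained_variance_ratio_.sum())

# Feature selection by correlation: list column pairs with |r| > 0.95.
corr = np.corrcoef(X_std, rowvar=False)
redundant = [
    (i, j)
    for i in range(X.shape[1])
    for j in range(i + 1, X.shape[1])
    if abs(corr[i, j]) > 0.95
]

# Interaction features: append the product x0 * x1 to the two base columns.
inter = PolynomialFeatures(degree=2, interaction_only=True, include_bias=False)
X_inter = inter.fit_transform(base)

print(X_reduced.shape, round(explained, 4))
print(redundant)
print(X_inter.shape)
```

In practice, one column of each redundant pair would typically be dropped before modeling.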
## Machine Learning
- Supervised vs. Unsupervised Learning
- Types of Algorithms (Regression, Classification, Clustering)
- Model Training and Testing
- Cross-Validation

[Go To Top](#table-of-contents)
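
As a small sketch of the workflow above, the example below trains a supervised classifier with a train/test split, scores it with 5-fold cross-validation, and runs an unsupervised clustering on the same features. The choice of the Iris dataset bundled with scikit-learn, logistic regression, and k-means is an assumption for illustration.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score, train_test_split

X, y = load_iris(return_X_y=True)

# Model training and testing: hold out 25% of the rows for evaluation.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)
model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)
test_accuracy = model.score(X_test, y_test)

# Cross-validation: five train/test rounds over the full dataset.
cv_scores = cross_val_score(LogisticRegression(max_iter=1000), X, y, cv=5)

# Unsupervised learning: cluster the same features without using the labels.
cluster_labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)

print(f"held-out accuracy: {test_accuracy:.3f}")
print(f"cross-validation mean: {cv_scores.mean():.3f}")
print(f"clusters found: {len(set(cluster_labels))}")
```

Cross-validation gives a less split-dependent estimate than a single held-out set, at the cost of training the model several times.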
## Model Evaluation
- Evaluation Metrics (Accuracy, Precision, Recall, F1-Score, RMSE)
- Confusion Matrix
- Overfitting and Underfitting
- Bias-Variance Tradeoff

[Go To Top](#table-of-contents)
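
To make the metric definitions above concrete, this sketch computes a binary confusion matrix and the derived metrics in plain Python, plus RMSE for a regression example. All labels and values are made up.

```python
import math

# Made-up binary labels (1 = positive class).
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

# Confusion matrix cells.
pairs = list(zip(y_true, y_pred))
tp = sum(1 for t, p in pairs if t == 1 and p == 1)
tn = sum(1 for t, p in pairs if t == 0 and p == 0)
fp = sum(1 for t, p in pairs if t == 0 and p == 1)
fn = sum(1 for t, p in pairs if t == 1 and p == 0)

accuracy = (tp + tn) / len(pairs)
precision = tp / (tp + fp)  # of predicted positives, how many were right
recall = tp / (tp + fn)     # of actual positives, how many were found
f1 = 2 * precision * recall / (precision + recall)

# RMSE for a made-up regression example.
actual = [3.0, 5.0, 2.0]
predicted = [2.0, 5.0, 4.0]
squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
rmse = math.sqrt(sum(squared_errors) / len(actual))

print(f"TP={tp} TN={tn} FP={fp} FN={fn}")
print(accuracy, precision, recall, f1)
print(round(rmse, 3))
```

The sketch assumes at least one predicted positive and one actual positive; otherwise the precision or recall denominator is zero and needs special handling.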
## Visualization
- Matplotlib Basics
- Seaborn for Statistical Visualization
- Interactive Visualization (Plotly, Bokeh)
- Data Dashboards (Tableau, Power BI)

[Go To Top](#table-of-contents)
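
A minimal Matplotlib sketch for the basics mentioned above: a histogram and a box plot of the same synthetic sample, drawn side by side. The `Agg` backend and the in-memory buffer are assumptions so the script runs without a display; in a notebook, `plt.show()` would be the usual choice.

```python
import io

import matplotlib

matplotlib.use("Agg")  # headless backend; no window is opened
import matplotlib.pyplot as plt
import numpy as np

# Synthetic sample for illustration.
rng = np.random.default_rng(0)
values = rng.normal(loc=50, scale=10, size=200)

fig, (ax_hist, ax_box) = plt.subplots(1, 2, figsize=(8, 3))

# Histogram: shape of the distribution.
ax_hist.hist(values, bins=20)
ax_hist.set_title("Histogram")
ax_hist.set_xlabel("value")
ax_hist.set_ylabel("count")

# Box plot: median, quartiles, and potential outliers.
ax_box.boxplot(values)
ax_box.set_title("Box plot")

fig.tight_layout()

# Render to an in-memory PNG; pass a file path instead to save to disk.
buffer = io.BytesIO()
fig.savefig(buffer, format="png")
```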
## Resources
- Useful Libraries (numpy, pandas, scikit-learn)
- Online Courses and Tutorials
- Blogs and Books for Data Science
- Kaggle for Practice

[Go To Top](#table-of-contents)

**Note:** This cheat sheet provides a basic overview of data science concepts. Expand each section with more detailed information based on your needs.