https://github.com/denezt/all-things-data-science

All Things Data Science

# Data Science Cheat Sheet

## Table of Contents
- [Introduction to Data Science](#introduction-to-data-science)
- [Data Collection](#data-collection)
- [Data Preprocessing](#data-preprocessing)
- [Exploratory Data Analysis (EDA)](#exploratory-data-analysis-eda)
- [Feature Engineering](#feature-engineering)
- [Machine Learning](#machine-learning)
- [Model Evaluation](#model-evaluation)
- [Visualization](#visualization)
- [Resources](#resources)

## Introduction to Data Science
- What is Data Science?
- Data Science Process
- Importance of Domain Knowledge

[Go To Top](#table-of-contents)

## Data Collection
- Types of Data (Structured, Unstructured, Semi-Structured)
- Data Sources (Databases, APIs, Web Scraping)
- Data Quality and Cleaning
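
The difference between structured and semi-structured data, and a first data quality check, can be sketched with the Python standard library alone. The sample records and the helper names `load_csv` and `rows_with_missing` are hypothetical, chosen only for illustration.

```python
import csv
import io
import json

# Structured data: fixed columns, one record per row (hypothetical sample).
CSV_TEXT = "id,name,age\n1,Ada,36\n2,Grace,\n3,Alan,41\n"

# Semi-structured data: nested fields that may differ between records.
JSON_TEXT = '[{"id": 1, "tags": ["math"]}, {"id": 2, "profile": {"city": "NYC"}}]'


def load_csv(text):
    """Parse CSV text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text)))


def rows_with_missing(rows):
    """Return rows where at least one field is empty (a basic quality check)."""
    return [row for row in rows if any(value == "" for value in row.values())]


csv_rows = load_csv(CSV_TEXT)
json_records = json.loads(JSON_TEXT)
incomplete = rows_with_missing(csv_rows)
```

Here `incomplete` holds the one row whose `age` field is empty. For real datasets, a library such as pandas is the usual choice for loading and inspecting data.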

[Go To Top](#table-of-contents)

## Data Preprocessing
- Handling Missing Values
- Data Transformation (Scaling, Normalization)
- Encoding Categorical Variables
- Outlier Detection and Treatment
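
As a minimal sketch of the four steps above, the following uses only the Python standard library so the mechanics stay visible. The function names and sample values are assumptions made for illustration. Median imputation, min-max scaling, one-hot encoding, and the 1.5 × IQR rule are each just one common option among several.

```python
import statistics


def impute_median(values):
    """Replace None entries with the median of the observed values."""
    observed = [v for v in values if v is not None]
    median = statistics.median(observed)
    return [median if v is None else v for v in values]


def min_max_scale(values):
    """Rescale values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column carries no spread; map every value to 0.0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def one_hot(labels):
    """Encode each label as a 0/1 vector over the sorted unique categories."""
    categories = sorted(set(labels))
    return [[1 if label == c else 0 for c in categories] for label in labels]


def iqr_outliers(values, k=1.5):
    """Return values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    return [v for v in values if v < q1 - k * iqr or v > q3 + k * iqr]


ages = impute_median([20, None, 30, 40])
scaled = min_max_scale(ages)
encoded = one_hot(["red", "blue", "red"])
outliers = iqr_outliers([10, 12, 11, 13, 12, 11, 95])
```

In a real pipeline, scaling parameters are typically computed on the training data only and then reused on the test data, so that no information leaks from the held-out set.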

[Go To Top](#table-of-contents)

## Exploratory Data Analysis (EDA)
- Summary Statistics (Mean, Median, Variance)
- Data Visualization (Histograms, Box Plots, Scatter Plots)
- Correlation Analysis
- Distribution Analysis
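
The summary statistics, correlation, and distribution items above can be illustrated without any plotting library. This is a sketch under stated assumptions: the `hours` and `scores` data are made up, and `summarize`, `pearson`, and `histogram` are hypothetical helper names.

```python
import math
import statistics


def summarize(values):
    """Return the mean, median, and sample variance of a numeric sequence."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "variance": statistics.variance(values),
    }


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


def histogram(values, bin_width):
    """Count values per bin; each key is the lower edge of its bin."""
    counts = {}
    for v in values:
        edge = (v // bin_width) * bin_width
        counts[edge] = counts.get(edge, 0) + 1
    return dict(sorted(counts.items()))


hours = [1, 2, 3, 4, 5]
scores = [52, 55, 61, 64, 70]
stats = summarize(scores)
r = pearson(hours, scores)
bins = histogram(scores, bin_width=10)
```

The bin counts in `bins` are the numbers a histogram plot would draw as bar heights, and a value of `r` close to 1 indicates a strong positive linear relationship between the two columns.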

[Go To Top](#table-of-contents)

## Feature Engineering
- Importance of Feature Engineering
- Feature Extraction (Dimensionality Reduction, PCA)
- Feature Selection (Correlation, Importance)
- Creating Interaction Features
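
To make interaction features and correlation-based selection concrete, here is a small standard-library sketch. The feature table, the target, the 0.9 threshold, and the helper names are all hypothetical. PCA is left out because it is best done with a linear algebra library.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    norm = math.sqrt(
        sum((x - mean_x) ** 2 for x in xs) * sum((y - mean_y) ** 2 for y in ys)
    )
    return cov / norm


def add_interaction(features, a, b):
    """Return a copy of the feature table with the product column a*b added."""
    result = dict(features)
    result[f"{a}_x_{b}"] = [x * y for x, y in zip(features[a], features[b])]
    return result


def select_by_correlation(features, target, threshold):
    """Keep names of features whose |correlation| with the target meets the threshold."""
    return [
        name
        for name, column in features.items()
        if abs(pearson(column, target)) >= threshold
    ]


features = {
    "width": [1, 2, 3, 4],
    "height": [2, 1, 4, 3],
    "noise": [5, 1, 1, 5],
}
price = [2, 2, 12, 12]  # hypothetical target, driven by width * height

features = add_interaction(features, "width", "height")
selected = select_by_correlation(features, price, threshold=0.9)
```

In this toy data neither `width` nor `height` alone clears the threshold, but their product does, which is the situation interaction features are meant to capture.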

[Go To Top](#table-of-contents)

## Machine Learning
- Supervised vs. Unsupervised Learning
- Types of Algorithms (Regression, Classification, Clustering)
- Model Training and Testing
- Cross-Validation
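
Training, testing, and cross-validation can be sketched as follows. The example assumes a deliberately simple 1-nearest-neighbor classifier on made-up one-dimensional data. The function names are hypothetical, and the folds are contiguous without shuffling to keep the output deterministic.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) for k contiguous, near-equal folds."""
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = [i for i in range(n_samples) if i < start or i >= start + size]
        yield train, test
        start += size


def predict_1nn(train_x, train_y, x):
    """Classify x with the label of its nearest training point (1-NN)."""
    nearest = min(range(len(train_x)), key=lambda i: abs(train_x[i] - x))
    return train_y[nearest]


def cross_validate(xs, ys, k):
    """Return the accuracy of 1-NN on each of the k held-out folds."""
    scores = []
    for train, test in k_fold_indices(len(xs), k):
        train_x = [xs[i] for i in train]
        train_y = [ys[i] for i in train]
        hits = sum(predict_1nn(train_x, train_y, xs[i]) == ys[i] for i in test)
        scores.append(hits / len(test))
    return scores


# Hypothetical 1-D data, interleaved so every fold contains both classes.
xs = [1.0, 9.0, 1.5, 9.5, 2.0, 10.0]
ys = ["low", "high", "low", "high", "low", "high"]
fold_scores = cross_validate(xs, ys, k=3)
mean_accuracy = sum(fold_scores) / len(fold_scores)
```

Each sample is used for testing exactly once and never appears in the training set of its own fold. In practice the data is usually shuffled before being split into folds.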

[Go To Top](#table-of-contents)

## Model Evaluation
- Evaluation Metrics (Accuracy, Precision, Recall, F1-Score, RMSE)
- Confusion Matrix
- Overfitting and Underfitting
- Bias-Variance Tradeoff
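
The metrics listed above follow directly from the four cells of a binary confusion matrix. The sketch below computes them by hand on made-up labels. The helper names are hypothetical, and returning 0.0 when a denominator is zero is one convention among several.

```python
import math


def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, fn, tn) for a binary classification task."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, fn, tn


def classification_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall, and F1 from the confusion counts."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred, positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "accuracy": (tp + tn) / len(y_true),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


def rmse(y_true, y_pred):
    """Root mean squared error for regression predictions."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
counts = confusion_counts(y_true, y_pred)
metrics = classification_metrics(y_true, y_pred)
error = rmse([2, 4, 6, 8], [1, 5, 5, 9])
```

Accuracy, precision, recall, and F1 apply to classification, while RMSE applies to regression, so a single model is normally scored with one family or the other.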

[Go To Top](#table-of-contents)

## Visualization
- Matplotlib Basics
- Seaborn for Statistical Visualization
- Interactive Visualization (Plotly, Bokeh)
- Data Dashboards (Tableau, Power BI)

[Go To Top](#table-of-contents)

## Resources
- Useful Libraries (numpy, pandas, scikit-learn)
- Online Courses and Tutorials
- Blogs and Books for Data Science
- Kaggle for Practice

[Go To Top](#table-of-contents)

**Note:** This cheat sheet provides a basic overview of data science concepts. Expand each section with more detailed information based on your needs.