https://github.com/PavelGrigoryevDS/awesome-data-analysis
500+ curated resources for Data Analysis & Data Science: Python, SQL, Statistics, ML, AI, Visualization, Cheatsheets, Roadmaps, Interview Prep. For beginners and experts.
List: awesome-data-analysis
- Host: GitHub
- URL: https://github.com/PavelGrigoryevDS/awesome-data-analysis
- Owner: PavelGrigoryevDS
- License: cc0-1.0
- Created: 2025-07-30T11:02:44.000Z (4 months ago)
- Default Branch: main
- Last Pushed: 2025-11-13T12:38:17.000Z (6 days ago)
- Last Synced: 2025-11-13T14:29:27.763Z (6 days ago)
- Topics: ai, analytics, awesome-list, big-data, business-intelligence, dashboard, data-analysis, data-science, data-visualization, datasets, eda, jupyter-notebook, ml, numpy, pandas, python, resources, sql, statistics, tutorials
- Homepage: https://pavelgrigoryevds.github.io/awesome-data-analysis/
- Size: 1.82 MB
- Stars: 410
- Watchers: 4
- Forks: 40
- Open Issues: 1
Metadata Files:
- Readme: README.md
- Contributing: CONTRIBUTING.md
- License: LICENSE
- Code of conduct: CODE_OF_CONDUCT.md
Awesome Lists containing this project
- fucking-awesome-datascience - Awesome Data Analysis - A curated list of data analysis tools, libraries and resources. (Other Awesome Lists / Comics)
- awesome-datascience - Awesome Data Analysis - A curated list of data analysis tools, libraries and resources. (Other Awesome Lists / Comics)
README
# Awesome Data Analysis
500+ curated resources for data analysis and data science: tools, libraries, roadmaps, cheatsheets, interview guides and more.
**For comfortable reading:** [Web version](https://pavelgrigoryevds.github.io/awesome-data-analysis/)
**Want to improve?** [Suggest here](https://github.com/PavelGrigoryevDS/awesome-data-analysis/issues/16) or [Welcome to Discussions](https://github.com/PavelGrigoryevDS/awesome-data-analysis/discussions)
**Goal: 500 stars!** Join us in making data analysis learning more accessible!
Maintained with ❤️
---
## Contents
- [Awesome Data Science Repositories](#awesome-data-science-repositories)
- [Roadmaps](#roadmaps)
- [Python](#python)
  - [Resources](#python-resources)
  - [Data Manipulation with Pandas and Numpy](#python-data-manipulation-with-pandas-and-numpy)
  - [Useful Python Tools for Data Analysis](#python-useful-python-tools-for-data-analysis)
    - [Data Processing \& Transformation](#python-data-processing-transformation)
    - [Automated EDA and Visualization Tools](#python-automated-data-visualization-tools)
    - [Data Quality \& Validation](#python-data-quality-validation)
    - [Feature Engineering & Selection](#python-feature-engineering-selection)
    - [Specialized Data Tools](#python-specialized-data-tools)
- [SQL \& Databases](#sql-databases)
  - [Resources](#sql-databases-resources)
  - [Tools](#sql-databases-tools)
- [Data Visualization](#data-visualization)
  - [Resources](#data-visualization-resources)
  - [Tools](#data-visualization-tools)
- [Dashboards & BI](#dashboards)
  - [Resources](#dashboards-resources)
  - [Tools](#dashboards-tools)
  - [Software](#dashboards-software)
- [Web Scraping \& Crawling](#web-scraping-crawling)
  - [Resources](#web-scraping-crawling-resources)
  - [Tools](#web-scraping-crawling-tools)
- [Mathematics](#mathematics)
- [Statistics \& Probability](#statistics-probability)
  - [Resources](#statistics-probability-resources)
  - [Tools](#statistics-probability-tools)
- [A/B Testing](#ab-testing)
- [Time Series Analysis](#time-series-analysis)
  - [Resources](#time-series-analysis-resources)
  - [Tools](#time-series-analysis-tools)
- [Data Engineering](#data-engineering)
  - [Resources](#data-engineering-resources)
  - [Tools](#data-engineering-tools)
- [Natural Language Processing (NLP)](#natural-language-processing-nlp)
  - [Resources](#natural-language-processing-nlp-resources)
  - [Tools](#natural-language-processing-nlp-tools)
- [Machine Learning & AI](#machine-learning)
  - [Resources](#machine-learning-resources)
  - [Tools](#machine-learning-tools)
- [MLOps](#mlops)
  - [Resources](#mlops-resources)
  - [Tools](#mlops-tools)
- [AI Applications & Platforms](#ai-applications)
  - [Resources](#ai-applications-resources)
  - [Tools](#ai-applications-tools)
- [Cloud Platforms & Infrastructure](#cloud-platforms)
  - [Resources](#cloud-platform-resources)
  - [Tools](#cloud-platform-tools)
- [Productivity](#productivity)
  - [Resources](#productivity-resources)
  - [Useful Linux Tools](#productivity-useful-linux-tools)
  - [Useful VS Code Extensions](#productivity-useful-vs-code-extensions)
- [Skill Development \& Career](#skill-development-career-resources)
  - [Practice Resources](#skill-development-career-resources-practice-resources)
  - [Curated Jupyter Notebooks](#skill-development-career-resources-curated-jupyter-notebooks)
  - [Data Sources \& Datasets](#skill-development-career-resources-data-sources-datasets)
  - [Resume and Interview Tips](#skill-development-career-resources-resume-and-interview-tips)
- [Cheatsheets](#cheatsheets)
  - [GoalKicker Programming Notes](#cheatsheets-goalkicker)
  - [Python](#cheatsheets-python)
  - [Data Science \& Machine Learning](#cheatsheets-data-science-machine-learning)
  - [Linux \& Git](#cheatsheets-linux-git)
  - [Probability \& Statistics](#cheatsheets-probability-statistics)
  - [SQL \& Databases](#cheatsheets-sql-databases)
  - [Miscellaneous](#cheatsheets-miscellaneous)
- [Additional Python Libraries](#additional-python-libraries)
- [More Awesome Lists](#more-awesome-curations)
- [Additional Resources and Tools](#additional-resources)
- [Contributing](#contributing)
- [License](#license)
---
## Awesome Data Science Repositories
Curated collections of high-quality GitHub repos for inspiration and learning.
- [Awesome Data Science](https://github.com/academic/awesome-datascience) - A curated list of courses, books, tools, and resources for data science.
- [Data Science for Beginners](https://github.com/microsoft/Data-Science-For-Beginners) - Microsoft's data science curriculum.
- [OSSU Data Science](https://github.com/ossu/data-science) - Open Source Society University's self-study path.
- [Data Science Best Resources](https://github.com/tirthajyoti/Data-science-best-resources) - Carefully curated links for data science resources in one place.
- [Data Science Articles from CodeCut](https://github.com/CodeCutTech/Data-science) - A collection of articles, videos, and code related to data science.
- [Data Science Using Python](https://github.com/WillKoehrsen/Data-Analysis) - Resources for data analysis using Python.
[⬆ back to top](#contents)
---
## Roadmaps
Step-by-step guides and skill trees to master data science and analytics.
- [Data Analyst Roadmap](https://roadmap.sh/data-analyst) - Structured learning path for analysts.
- [Data Science Roadmap from A to Z](https://github.com/Moataz-Elmesmary/Data-Science-Roadmap) - Comprehensive roadmap for data science.
- [Roadmap To Learn Data Science](https://github.com/krishnaik06/Perfect-Roadmap-To-Learn-Data-Science-In-2025) - A comprehensive and updated roadmap for learning data science with modern tools and technologies.
- [66DaysOfData](https://github.com/mrankitgupta/Data-Analyst-Roadmap) - 66-day data analytics learning challenge.
- [Data Analyst Roadmap for Professionals](https://github.com/hemansnation/Data-Analyst-Roadmap) - 8-week program for analysts at all levels.
- [Data Science Roadmap Tutorials](https://github.com/MrMimic/data-scientist-roadmap) - Tutorials for the data science roadmap.
- [Data Analyst Roadmap from Zero](https://github.com/mtahiraslan/data-analyst-roadmap) - Guide to becoming a data analyst from scratch.
[⬆ back to top](#contents)
---
## Python
### Resources
A collection of resources for learning and mastering Python programming.
- [Awesome Python](https://github.com/vinta/awesome-python) - An opinionated list of awesome Python frameworks, libraries, software, and resources.
- [30 Days Of Python](https://github.com/Asabeneh/30-Days-Of-Python) - A 30-day programming challenge to learn the Python programming language.
- [Real Python Tutorials](https://realpython.com/) - Tutorials on Python from Real Python.
- [Awesome Python Data Science](https://github.com/krzjoa/awesome-python-data-science) - A curated list of Python resources for data science.
- [Python Data Science Handbook](https://github.com/jakevdp/PythonDataScienceHandbook) - Full text of the "Python Data Science Handbook" in Jupyter Notebooks.
- [Interactive Coding Challenges](https://github.com/donnemartin/interactive-coding-challenges) - 120+ interactive Python coding interview challenges.
- [Clean Code Python](https://github.com/zedr/clean-code-python) - Clean Code concepts adapted for Python.
- [Best of Python](https://github.com/ml-tooling/best-of-python) - A ranked list of awesome Python open-source libraries and tools.
- [GeeksforGeeks Python](https://www.geeksforgeeks.org/python-programming-language-tutorial/) - Python tutorial from GeeksforGeeks.
- [W3Schools Python](https://www.w3schools.com/python/) - A beginner-friendly tutorial and reference for the Python programming language.
- [Tanu N Prabhu Python](https://github.com/Tanu-N-Prabhu/Python/tree/master) - A repository for learning Python from scratch.
- [Think Python](https://github.com/AllenDowney/ThinkPython) - Jupyter notebooks and other resources for Think Python by Allen Downey.
[⬆ back to top](#contents)
---
### Data Manipulation with Pandas and Numpy
Tutorials and best practices for working with Pandas and Numpy.
- [Awesome Pandas](https://github.com/tommyod/awesome-pandas) - A curated list of resources for using the Pandas library.
- [100 data puzzles for pandas](https://github.com/ajcr/100-pandas-puzzles) - A collection of data puzzles to practice your Pandas skills.
- [Pandas Tutor](https://pandastutor.com/) - Visualize Pandas operations step-by-step (perfect for beginners).
- [Pandas Exercises](https://github.com/guipsamora/pandas_exercises) - Exercises designed to help you improve your Pandas skills.
- [Pandas Cookbook](https://github.com/jvns/pandas-cookbook) - A cookbook with various recipes for using Pandas effectively.
- [Hands-On Data Analysis with Pandas](https://github.com/stefmolin/Hands-On-Data-Analysis-with-Pandas-2nd-edition) - Materials for following along with Hands-On Data Analysis with Pandas.
- [Effective Pandas](https://github.com/TomAugspurger/effective-pandas) - A series focused on writing effective and idiomatic Pandas code.
- [From Python to Numpy](https://github.com/rougier/from-python-to-numpy) - An open-access book on vectorization and efficient numerical computing with NumPy.
- [NumPy 100 Exercises](https://github.com/rougier/numpy-100) - A collection of 100 exercises to master the NumPy library for scientific computing.
[⬆ back to top](#contents)
---
### Useful Python Tools for Data Analysis
A collection of Python libraries for efficient data manipulation, cleaning, visualization, validation, and analysis.
#### Data Processing & Transformation
- [Pandas DQ](https://github.com/AutoViML/pandas_dq) - Data type correction and automatic DataFrame cleaning.
- [Vaex](https://github.com/vaexio/vaex) - High-performance Python library for lazy Out-of-Core DataFrames.
- [Polars](https://github.com/pola-rs/polars) - Multithreaded, vectorized query engine for DataFrames.
- [Fugue](https://github.com/fugue-project/fugue) - Unified interface for Pandas, Spark, and Dask.
- [TheFuzz](https://github.com/seatgeek/thefuzz) - Fuzzy string matching (Levenshtein distance).
- [DateUtil](https://github.com/dateutil/dateutil) - Extensions for standard Python datetime features.
- [Arrow](https://github.com/arrow-py/arrow) - More convenient creation, manipulation, and formatting of dates and times.
- [Pendulum](https://github.com/sdispater/pendulum) - Alternative to datetime with timezone support.
- [Dask](https://github.com/dask/dask) - Parallel computing for arrays and DataFrames.
- [Modin](https://github.com/modin-project/modin) - Speeds up Pandas by distributing computations.
- [Pandarallel](https://github.com/nalepae/pandarallel) - Parallel operations for pandas DataFrames.
- [DataCleaner](https://github.com/rhiever/datacleaner) - Python tool for automatically cleaning and preparing datasets.
- [Pandas Flavor](https://github.com/Zsailer/pandas_flavor) - Add custom methods to Pandas.
- [Pandas DataReader](https://github.com/pydata/pandas-datareader) - Reads data from various online sources into pandas DataFrames.
- [Sklearn Pandas](https://github.com/scikit-learn-contrib/sklearn-pandas) - Bridge between Pandas and Scikit-learn.
- [CuPy](https://github.com/cupy/cupy) - A NumPy-compatible array library accelerated by NVIDIA CUDA for high-performance computing.
- [Numba](https://github.com/numba/numba) - A JIT compiler that translates a subset of Python and NumPy code into fast machine code.
- [Pandas Stubs](https://github.com/pandas-dev/pandas-stubs) - Type stubs for pandas, improves IDE autocompletion.
- [Petl](https://github.com/petl-developers/petl) - ETL tool for data cleaning and transformation.
[⬆ back to top](#contents)
---
#### Automated EDA and Visualization Tools
- [AutoViz](https://github.com/AutoViML/AutoViz) - Automatic data visualization in 1 line of code.
- [Sweetviz](https://github.com/fbdesignpro/sweetviz) - Automatic EDA with dataset comparison.
- [Lux](https://github.com/lux-org/lux) - Automatic DataFrame visualization in Jupyter.
- [YData Profiling](https://github.com/ydataai/ydata-profiling) - Data quality profiling & exploratory data analysis.
- [Missingno](https://github.com/ResidentMario/missingno) - Visualize missing data patterns.
- [Vizro](https://github.com/mckinsey/vizro) - Low-code toolkit for building data visualization apps.
- [Yellowbrick](https://github.com/DistrictDataLabs/yellowbrick) - Visual diagnostic tools for machine learning.
- [Great Tables](https://github.com/posit-dev/great-tables) - Create awesome display tables using Python.
- [DataMapPlot](https://github.com/TutteInstitute/datamapplot) - Create beautiful plots of data maps.
- [Datashader](https://github.com/holoviz/datashader) - Quickly and accurately render even the largest data.
- [PandasAI](https://github.com/sinaptik-ai/pandas-ai) - Conversational data analysis using LLMs and RAG.
- [Mito](https://github.com/mito-ds/mito) - Jupyter extensions for faster code writing.
- [D-Tale](https://github.com/man-group/dtale) - Interactive GUI for data analysis in a browser.
- [Pandasgui](https://github.com/adamerose/pandasgui) - GUI for viewing and filtering DataFrames.
- [PyGWalker](https://github.com/Kanaries/pygwalker) - Interactive UIs for visual analysis of DataFrames.
- [QGrid](https://github.com/quantopian/qgrid) - Interactive grid for DataFrames in Jupyter.
- [Pivottablejs](https://github.com/nicolaskruchten/jupyter_pivottablejs) - Interactive PivotTable.js tables in Jupyter.
[⬆ back to top](#contents)
---
#### Data Quality & Validation
- [PyOD](https://github.com/yzhao062/pyod) - Outlier and anomaly detection.
- [Alibi Detect](https://github.com/SeldonIO/alibi-detect) - Outlier, adversarial and drift detection.
- [Pandera](https://github.com/unionai-oss/pandera) - Data validation through declarative schemas.
- [Cerberus](https://github.com/pyeve/cerberus) - Data validation through schemas.
- [Pydantic](https://github.com/pydantic/pydantic) - Data validation using Python type annotations.
- [Dora](https://github.com/NathanEpstein/Dora) - Automate EDA: preprocessing, feature engineering, visualization.
- [Great Expectations](https://github.com/great-expectations/great_expectations) - Data validation and testing.
[⬆ back to top](#contents)
---
#### Feature Engineering & Selection
- [FeatureTools](https://github.com/alteryx/featuretools) - Automated feature engineering.
- [Feature Engine](https://github.com/feature-engine/feature_engine) - Feature engineering with Scikit-Learn compatibility.
- [Prince](https://github.com/MaxHalford/prince) - Multivariate exploratory data analysis (PCA, CA, MCA).
- [Fitter](https://github.com/cokelaer/fitter) - Figures out the distribution your data comes from.
- [Feature Selector](https://github.com/WillKoehrsen/feature-selector) - Tool for dimensionality reduction of machine learning datasets.
- [Category Encoders](https://github.com/scikit-learn-contrib/category_encoders) - Extensive collection of categorical variable encoders.
- [Imbalanced Learn](https://github.com/scikit-learn-contrib/imbalanced-learn) - Handling imbalanced datasets.
[⬆ back to top](#contents)
---
#### Specialized Data Tools
- [cuDF](https://github.com/rapidsai/cudf) - A GPU DataFrame library for loading, joining, and aggregating data.
- [Faker](https://github.com/joke2k/faker) - Generates fake data for testing.
- [Mimesis](https://github.com/lk-geimfari/mimesis) - Generates realistic test data.
- [Geopy](https://github.com/geopy/geopy) - Geocoding addresses and calculating distances.
- [PySAL](https://github.com/pysal/pysal) - Spatial analysis functions.
- [Factor Analyzer](https://github.com/EducationalTestingService/factor_analyzer) - A Python package for factor analysis, including exploratory and confirmatory methods.
- [Scattertext](https://github.com/JasonKessler/scattertext) - Beautiful visualizations of language differences among document types.
- [IGraph](https://github.com/igraph/igraph) - A library for creating and manipulating graphs and networks, with bindings for multiple languages.
- [Joblib](https://github.com/joblib/joblib) - A lightweight pipelining library for Python, particularly useful for saving and loading large NumPy arrays.
- [ImageIO](https://github.com/imageio/imageio) - A library that provides an easy interface to read and write a wide range of image data.
- [Texthero](https://github.com/jbesomi/texthero) - Text preprocessing, representation and visualization.
- [Geopandas](https://github.com/geopandas/geopandas) - Geographic data operations with pandas.
- [NetworkX](https://github.com/networkx/networkx) - Network analysis and graph theory.
[⬆ back to top](#contents)
---
## SQL & Databases
### Resources
SQL tutorials and database design principles.
- [SQLZoo - SQL Tutorial](https://sqlzoo.net/wiki/SQL_Tutorial) - Interactive SQL tutorial.
- [SQL Bolt - Learn SQL](https://sqlbolt.com/) - Learn SQL through interactive lessons.
- [SQL Tutorial](https://www.sqltutorial.org/) - Comprehensive SQL tutorial resource.
- [SQL Tutorial by W3Schools](https://www.w3schools.com/sql/default.asp) - Comprehensive SQL tutorial.
- [PostgreSQL Tutorial by W3Resource](https://w3resource.com/PostgreSQL/tutorial.php) - Tutorial for PostgreSQL.
- [MySQL Tutorial by W3Resource](https://www.w3resource.com/mysql/mysql-tutorials.php) - Tutorial for MySQL.
- [MongoDB Tutorial by W3Resource](https://www.w3resource.com/mongodb/nosql.php) - Tutorial for MongoDB.
- [EverSQL](https://www.eversql.com/) - AI-powered SQL query optimization and database observability tool.
- [Awesome Postgres](https://github.com/dhamaniasad/awesome-postgres) - A curated list of awesome PostgreSQL software, libraries, tools and resources.
- [Awesome MySQL](https://github.com/shlomi-noach/awesome-mysql) - A curated list of awesome MySQL software, libraries, tools and resources.
- [Awesome Clickhouse](https://github.com/korchasa/awesome-clickhouse) - A curated list of awesome ClickHouse software.
- [Awesome MongoDB](https://github.com/ramnes/awesome-mongodb) - A curated list of awesome MongoDB resources, libraries, tools, and applications.
- [Awesome SQLAlchemy](https://github.com/dahlia/awesome-sqlalchemy) - A curated list of awesome tools for SQLAlchemy.
- [Awesome SQL](https://github.com/danhuss/awesome-sql) - List of tools and techniques for working with relational databases.
- [Practice Window Functions](https://www.practicewindowfunctions.com) - Free interactive SQL tutorial site focused on mastering window functions through 80+ hands-on problems with hints and solutions.
- [AnimateSQL](https://animatesql.com/) - Interactive tool that visualizes the step-by-step execution of SQL queries.
- [SQL Tips and Tricks](https://github.com/ben-nour/SQL-tips-and-tricks) - Useful SQL techniques and optimizations for data analysis.
[⬆ back to top](#contents)
---
### Tools
A collection of libraries and drivers for seamless database access and interaction.
- [PyODBC](https://github.com/mkleehammer/pyodbc) - Python library for ODBC database access.
- [SQLAlchemy](https://github.com/sqlalchemy/sqlalchemy) - SQL toolkit and ORM for Python.
- [Psycopg2](https://github.com/psycopg/psycopg2) - PostgreSQL database adapter.
- [MySQL Connector/Python](https://github.com/mysql/mysql-connector-python) - MySQL driver for Python.
- [PonyORM](https://github.com/ponyorm/pony) - ORM for Python with dynamic query generation.
- [PyMongo](https://github.com/mongodb/mongo-python-driver) - Official MongoDB driver for Python.
- [SQLiteviz](https://github.com/lana-k/sqliteviz) - A tool for exploring SQLite databases and visualizing the results of your queries.
- [SQLite](https://github.com/sqlite/sqlite) - A C-language library that implements a small, fast, self-contained, high-reliability, full-featured SQL database engine.
- [DB Browser for SQLite](https://github.com/sqlitebrowser/sqlitebrowser) - A high quality, visual, open source tool to create, design, and edit database files compatible with SQLite.
- [DBeaver](https://github.com/dbeaver/dbeaver) - A free universal database tool and SQL client for developers, SQL programmers, and administrators.
- [Beekeeper Studio](https://github.com/beekeeper-studio/beekeeper-studio) - A modern, easy-to-use SQL client and database manager with a clean, cross-platform interface.
- [SQLFluff](https://github.com/sqlfluff/sqlfluff) - A modular SQL linter and auto-formatter designed to enforce consistent style and catch errors in SQL code.
- [PyMySQL](https://github.com/PyMySQL/PyMySQL) - A pure-Python MySQL client library for interacting with MySQL databases from Python applications.
- [Vanna.AI](https://github.com/vanna-ai/vanna) - An AI-powered tool for generating SQL queries from natural language questions.
- [SQLChat](https://github.com/sqlchat/sqlchat) - A chat-based SQL client that allows you to query databases using natural language conversations.
- [Records](https://github.com/kennethreitz-archive/records) - A simple library for running raw SQL queries against relational databases from Python.
- [Dataset](https://github.com/pudo/dataset) - JSON-like interface for working with SQL databases.
- [SQLGlot](https://github.com/tobymao/sqlglot) - A no-dependency SQL parser, transpiler, and optimizer for Python.
- [TDengine](https://github.com/taosdata/TDengine) - An open-source big data platform designed for time-series data, IoT, and industrial monitoring.
- [TimescaleDB](https://github.com/timescale/timescaledb) - An open-source time-series SQL database optimized for fast ingest and complex queries.
- [DuckDB](https://github.com/duckdb/duckdb) - In-process analytical database for fast SQL queries.
[⬆ back to top](#contents)
---
## Data Visualization
### Resources
Color theory, chart selection guides, and storytelling tips.
- [From Data to Viz](https://github.com/holtzy/data_to_viz) - A guide to choosing the right visualization based on your data.
- [Awesome DataViz](https://github.com/hal9ai/awesome-dataviz) - A curated list of awesome data visualization libraries, tools, and resources.
- [Visualization Curriculum](https://github.com/uwdata/visualization-curriculum) - Interactive notebooks designed to teach data visualization concepts.
- [Scientific Visualization Book](https://github.com/rougier/scientific-visualization-book) - Guide to creating effective scientific visualizations and plots.
- [The Python Graph Gallery](https://python-graph-gallery.com/) - A collection of Python graph examples for data visualization.
- [FlowingData](https://flowingdata.com/) - Insights on data analysis and visualization.
- [Data Visualization Catalogue](https://datavizcatalogue.com/index.html) - A comprehensive catalog of data visualization types.
- [Data Viz Project](https://datavizproject.com/) - A resource for selecting suitable visualizations.
- [Chartopedia](https://www.anychart.com/chartopedia/usage-type/) - A guide to help you select the appropriate chart types.
- [DataForVisualization](https://dataforvisualization.com/) - Tutorials and insights on data visualization techniques.
- [Truth & Beauty](https://truth-and-beauty.net/) - Exploration of the aesthetics of data visualization.
- [Cedric Scherer's DataViz Resources](https://www.cedricscherer.com/top/dataviz/) - A collection of top data visualization resources and inspiration.
- [Information is Beautiful](https://informationisbeautiful.net/) - A site dedicated to visualizations that make complex ideas clear and engaging.
- [Plottie](https://plottie.art/) - A vast library of scientific plots for visualization inspiration and ideas.
- [Friends Don't Let Friends](https://github.com/cxli233/FriendsDontLetFriends) - A collection of bad data visualization practices and better alternatives.
- [Natural Colours](https://www.c82.net/natural-colors/) - A digital archive of historical color systems and pigments.
- [Colorgorical](http://vrl.cs.brown.edu/color) - Resource for generating categorical color palettes using perceptual principles.
[⬆ back to top](#contents)
---
### Tools
Libraries for static, interactive, and 3D visualizations.
- [Matplotlib](https://matplotlib.org/stable/contents.html) - A comprehensive library for creating static, animated, and interactive visualizations in Python.
- [Seaborn](https://seaborn.pydata.org/) - A statistical data visualization library based on Matplotlib.
- [Plotly](https://plotly.com/python/) - A library for creating interactive plots and dashboards.
- [Altair](https://github.com/vega/altair) - A declarative statistical visualization library for Python.
- [Bokeh](https://docs.bokeh.org/en/latest/) - A library for creating interactive visualizations for modern web browsers.
- [HoloViews](https://holoviews.org/) - A tool for building complex visualizations easily.
- [Geopandas](https://geopandas.org/en/stable/) - An extension of Pandas for geospatial data.
- [Folium](https://python-visualization.github.io/folium/) - A library for visualizing data on interactive maps.
- [Pygal](https://pygal.org/en/stable/) - A Python SVG charting library.
- [Plotnine](https://plotnine.readthedocs.io/en/stable/) - A grammar of graphics for Python.
- [Bqplot](https://github.com/bqplot/bqplot) - A plotting library for IPython/Jupyter notebooks.
- [PyPalettes](https://github.com/JosephBARBIERDARNAL/pypalettes) - A large collection (2,500+) of color maps for Python.
- [Deck.gl](https://github.com/visgl/deck.gl) - A WebGL-powered framework for visual exploratory data analysis of large datasets.
- [Contextily](https://github.com/geopandas/contextily) - Add background basemaps to your GeoPandas plots.
- [OSMnx](https://github.com/gboeing/osmnx) - A package to easily download, model, analyze, and visualize street networks from OpenStreetMap.
- [Apache ECharts](https://github.com/apache/echarts) - A powerful, interactive charting and visualization library for browser-based applications.
- [VisPy](https://github.com/vispy/vispy) - A high-performance interactive 2D/3D data visualization library leveraging the power of OpenGL.
- [Glumpy](https://github.com/glumpy/glumpy) - A Python library for scientific visualization that is fast, scalable and beautiful, based on OpenGL.
- [Pandas-bokeh](https://github.com/PatrikHlobil/Pandas-Bokeh) - Bokeh plotting backend for Pandas.
[⬆ back to top](#contents)
---
## Dashboards & BI
### Resources
Tutorials for building and enhancing dashboards and visualizations using various tools and frameworks.
- [Awesome Dashboards](https://github.com/obazoud/awesome-dashboard) - A collection of outstanding dashboard and visualization resources.
- [Best of Streamlit](https://github.com/jrieke/best-of-streamlit) - Showcase of community-built Streamlit applications.
- [Awesome Dash](https://github.com/ucg8j/awesome-dash) - Comprehensive resources for Dash users.
- [Awesome Panel](https://github.com/awesome-panel/awesome-panel) - Resources and support for Panel users.
- [Awesome Streamlit](https://github.com/MarcSkovMadsen/awesome-streamlit) - Curated list of Streamlit resources and components.
- [Dash Enterprise Samples](https://github.com/plotly/dash-sample-apps) - Production-ready Dash apps.
- [GeeksforGeeks - Tableau Tutorial](https://www.geeksforgeeks.org/tableau-tutorial/) - Comprehensive tutorial on Tableau.
- [GeeksforGeeks - Power BI Tutorial](https://www.geeksforgeeks.org/power-bi-tutorial/) - Detailed tutorial on Power BI.
[⬆ back to top](#contents)
---
### Tools
Frameworks for building custom dashboard solutions.
- [Dash](https://github.com/plotly/dash) - Framework for creating interactive web applications.
- [Streamlit](https://github.com/streamlit/streamlit) - Simplified framework for building data applications.
- [Panel](https://panel.holoviz.org/) - Framework for creating interactive web applications.
- [Gradio](https://github.com/gradio-app/gradio) - Tool for creating and sharing machine learning applications.
- [OpenSearch Dashboards](https://github.com/opensearch-project/OpenSearch-Dashboards) - A powerful data visualization and dashboarding tool for OpenSearch data, forked from Kibana.
- [GridStack.js](https://github.com/gridstack/gridstack.js) - A library for building draggable, resizable responsive dashboard layouts.
- [Tremor](https://github.com/tremorlabs/tremor-npm) - A React library to build dashboards fast with pre-built components for charts, KPIs, and more.
- [Appsmith](https://github.com/appsmithorg/appsmith) - An open-source platform to build and deploy internal tools, admin panels, and CRUD apps quickly.
- [Grafanalib](https://github.com/weaveworks/grafanalib) - A Python library for generating Grafana dashboards configuration as code.
- [H2O Wave](https://github.com/h2oai/wave) - A Python framework for rapidly building and deploying realtime web apps and dashboards for AI and analytics.
- [Shiny for Python](https://github.com/posit-dev/py-shiny) - Python version of the popular R Shiny framework.
- [Voilà](https://github.com/voila-dashboards/voila) - Turn Jupyter notebooks into standalone web applications.
- [Reflex](https://github.com/reflex-dev/reflex) - Full-stack Python framework for building web apps.
[⬆ back to top](#contents)
---
### Software
A list of leading tools and platforms for data visualization and dashboard creation.
- [Tableau](https://www.tableau.com) - Leading data visualization software.
- [Microsoft Power BI](https://powerbi.microsoft.com) - Business analytics tool for visualizing data.
- [QlikView](https://www.qlik.com/us/products/qlikview) - Tool for data visualization and business intelligence.
- [Metabase](https://www.metabase.com) - User-friendly open-source BI tool.
- [Apache Superset](https://superset.apache.org) - Open-source data exploration and visualization platform.
- [Preset](https://preset.io/) - A platform for modern business intelligence, providing a hosted version of Apache Superset.
- [Metabase](https://github.com/metabase/metabase) - The simplest way to get analytics and business intelligence for everyone in your company.
- [Redash](https://github.com/getredash/redash) - Tool for visualizing and sharing data insights.
- [Grafana](https://grafana.com) - Dashboarding and monitoring tool.
- [Datawrapper](https://www.datawrapper.de) - User-friendly chart and map creation tool.
- [ChartBlocks](https://www.chartblocks.com) - Online chart creation platform.
- [Infogram](https://infogram.com) - Tool for creating infographics and visual content.
- [Looker Studio (formerly Google Data Studio)](https://datastudio.google.com) - Free tool for creating interactive dashboards and reports.
- [Rath](https://github.com/Kanaries/Rath) - Next-generation automated data exploratory analysis and visualization platform.
- [Kibana](https://github.com/elastic/kibana) - The official visualization and dashboarding tool for the Elastic Stack (Elasticsearch, Logstash, Beats).
[⬆ back to top](#contents)
---
## 🕸️ Web Scraping & Crawling
### Resources
A collection of valuable resources, tutorials, and libraries for web scraping with Python.
- [Awesome Web Scraping](https://github.com/lorien/awesome-web-scraping) - List of libraries, tools, and APIs for web scraping and data processing.
- [Python Scraping](https://github.com/REMitchell/python-scraping) - Code samples from the book "Web Scraping with Python".
- [Scraping Tutorial](https://github.com/Blatzar/scraping-tutorial) - Tutorial for scraping streaming sites.
- [Webscraping from 0 to Hero](https://github.com/TheWebScrapingClub/webscraping-from-0-to-hero) - An open project repository sharing knowledge and experiences about web scraping with Python.
[⬆ back to top](#contents)
---
### Tools
A list of libraries and tools for web scraping.
- [Requests](https://github.com/psf/requests) - A simple, yet elegant, HTTP library for Python.
- [BeautifulSoup](https://www.crummy.com/software/BeautifulSoup/bs4/doc/) - A library for parsing HTML and XML documents.
- [Selenium](https://github.com/SeleniumHQ/selenium) - A tool for automating web applications for testing purposes.
- [Scrapy](https://scrapy.org/) - An open-source and collaborative web crawling framework for Python.
- [Browser Use](https://github.com/browser-use/browser-use) - A library for browser automation and web scraping.
- [Gerapy](https://github.com/Gerapy/Gerapy) - Distributed Crawler Management Framework based on Scrapy, Scrapyd, Django, and Vue.js.
- [AutoScraper](https://github.com/alirezamika/autoscraper) - A smart, automatic, fast, and lightweight web scraper for Python.
- [Feedparser](https://github.com/kurtmckee/feedparser) - A library to parse feeds in Python.
- [Trafilatura](https://github.com/adbar/trafilatura) - A Python & command-line tool to gather text and metadata on the web.
- [You-Get](https://github.com/soimort/you-get) - A tiny command-line utility to download media contents (videos, audios, images) from the web.
- [MechanicalSoup](https://github.com/MechanicalSoup/MechanicalSoup) - A Python library for automating interaction with websites.
- [ScrapeGraph AI](https://github.com/ScrapeGraphAI/Scrapegraph-ai) - A Python scraper based on AI.
- [Snscrape](https://github.com/JustAnotherArchivist/snscrape) - A social networking service scraper in Python.
- [Ferret](https://github.com/MontFerret/ferret) - A web scraping system that lets you declaratively describe what data to extract using a simple query language.
- [Grab](https://github.com/lorien/grab) - A Python framework for building web scraping apps, providing a high-level API for asynchronous requests.
- [Playwright](https://github.com/microsoft/playwright-python) - Python version of the Playwright browser automation library.
- [PyQuery](https://github.com/gawel/pyquery) - A jQuery-like library for parsing HTML documents in Python.
- [Helium](https://github.com/mherrmann/helium) - High-level Selenium wrapper for easier web automation.
- [Scrapling](https://github.com/D4Vinci/Scrapling) - A framework for building web scrapers and crawlers.
- [Crawl4AI](https://github.com/unclecode/crawl4ai) - Advanced web crawling framework designed for AI and data extraction tasks.
[⬆ back to top](#contents)
---
## 🔢 Mathematics
A collection of resources for learning mathematics, particularly in the context of data science and machine learning.
- [Awesome Math](https://github.com/rossant/awesome-math) - A curated list of mathematics resources, books, and online courses.
- [MML Book](https://github.com/mml-book/mml-book.github.io) - Companion site for the book "Mathematics for Machine Learning".
- [3Blue1Brown](https://www.3blue1brown.com/) - Visual explanations of mathematical concepts through animated videos.
- [Immersive Linear Algebra](http://immersivemath.com/ila/) - Interactive resource for understanding linear algebra.
- [Hackermath](https://github.com/amitkaps/hackermath) - Resource for learning statistics and mathematics for data science.
- [Stats Maths with Python](https://github.com/tirthajyoti/Stats-Maths-with-Python) - Collection of Python scripts and notebooks for statistics and mathematics.
- [Fast.ai - Computational Linear Algebra](https://github.com/fastai/numerical-linear-algebra) - Resource for learning linear algebra computationally.
[⬆ back to top](#contents)
---
## 🎲 Statistics & Probability
### Resources
A selection of resources focused on statistics and probability, including tutorials and comprehensive guides.
- [Awesome Statistics](https://github.com/erikgahner/awesome-statistics) - A curated list of statistics resources, software, and learning materials.
- [The Elements of Statistical Learning](https://github.com/empathy87/The-Elements-of-Statistical-Learning-Python-Notebooks) - Notebooks for understanding statistical learning concepts.
- [Seeing Theory](https://github.com/seeingtheory/Seeing-Theory) - Interactive visual resource for learning probability and statistics.
- [Practical Statistics for Data Scientists](https://github.com/gedeck/practical-statistics-for-data-scientists) - Companion code for the O'Reilly book "Practical Statistics for Data Scientists".
- [Statistical Learning Theory - Stanford University](https://web.stanford.edu/class/cs229t/notes.pdf) - Lecture notes on statistical learning theory.
- [StatLect](https://www.statlect.com/) - Comprehensive online textbook covering probability and statistics concepts.
- [stanford.edu - Probabilities and Statistics](https://stanford.edu/~shervine/teaching/cs-229/refresher-probabilities-statistics) - Refresher course on probabilities and statistics from Stanford University.
- [Bayesian Methods for Hackers](https://github.com/CamDavidsonPilon/Probabilistic-Programming-and-Bayesian-Methods-for-Hackers) - Resource for learning Bayesian methods in Python.
- [Bayesian Modeling and Computation in Python](https://github.com/BayesianModelingandComputationInPython/BookCode_Edition1) - Code for the book "Bayesian Modeling and Computation in Python".
- [Stat Trek](https://stattrek.com/) - A resource for learning statistics and probability, with tutorials and tools.
- [Online Statistics Book](https://onlinestatbook.com/2/index.html) - An interactive online statistics book with simulations and demonstrations.
- [All of Statistics](https://github.com/telmo-correa/all-of-statistics) - Resource for studying statistics based on Wasserman's book.
- [Think Stats](https://github.com/AllenDowney/ThinkStats/tree/v3) - Book and code for an introduction to Probability and Statistics.
- [Think Bayes 2](https://github.com/AllenDowney/ThinkBayes2) - Book and code for Bayesian statistical methods.
- [Causal Inference: The Mixtape](https://mixtape.scunning.com/) - Practical guide to causal inference methods.
- [The Effect](https://theeffectbook.net/) - Modern introduction to causality and research design.
[⬆ back to top](#contents)
---
### Tools
A collection of tools focused on statistics and probability.
- [SciPy](https://github.com/scipy/scipy) - Fundamental library for scientific computing and statistics.
- [Statsmodels](https://github.com/statsmodels/statsmodels) - Statistical modeling, testing, and data exploration.
- [PyMC](https://github.com/pymc-devs/pymc) - A probabilistic programming library for Python that allows for flexible Bayesian modeling.
- [Pingouin](https://github.com/raphaelvallat/pingouin) - Statistical package with improved usability over SciPy.
- [scikit-posthocs](https://github.com/maximtrp/scikit-posthocs) - Post-hoc tests for statistical analysis of data.
- [Lifelines](https://github.com/CamDavidsonPilon/lifelines) - Survival analysis and event history analysis in Python.
- [scikit-survival](https://github.com/sebp/scikit-survival) - Survival analysis built on scikit-learn for time-to-event prediction.
- [Bootstrap](https://github.com/cgevans/scikits-bootstrap) - Bootstrap confidence interval estimation methods.
- [PyStan](https://github.com/stan-dev/pystan) - Python interface to Stan for Bayesian statistical modeling.
- [ArviZ](https://github.com/arviz-devs/arviz) - Exploratory analysis of Bayesian models with visual diagnostics.
- [PyGAM](https://github.com/dswah/pyGAM) - A Python library for generalized additive models with built-in smoothing and regularization.
- [NumPyro](https://github.com/pyro-ppl/numpyro) - A probabilistic programming library built on JAX for high-performance Bayesian modeling.
- [Causal Impact](https://github.com/WillianFuks/tfcausalimpact) - A Python implementation of the R package for causal inference using Bayesian structural time-series models.
- [DoWhy](https://github.com/py-why/dowhy) - A Python library for causal inference that supports explicit modeling and testing of causal assumptions.
- [Patsy](https://github.com/pydata/patsy) - A Python library for describing statistical models and building design matrices.
- [Pomegranate](https://github.com/jmschrei/pomegranate) - Fast and flexible probabilistic modeling library for Python with GPU support.
- [Pgmpy](https://github.com/pgmpy/pgmpy) - Python library for probabilistic and causal inference using graphical models.
[⬆ back to top](#contents)
---
## 🧪 A/B Testing
A collection of resources focused on A/B testing.
- [DynamicYield A/B Testing](https://www.dynamicyield.com/course/testing-and-optimization/) - An online course covering advanced testing and optimization techniques.
- [Evan's Awesome A/B Tools](https://www.evanmiller.org/ab-testing/) - A/B test calculators.
- [Experimentguide](https://experimentguide.com/) - A practical guide to A/B testing and experimentation from industry leaders.
- [Google's A/B Testing Course](https://www.udacity.com/course/ab-testing--ud257) - A free Udacity course covering the fundamentals of A/B testing.
- [So You Think You Can Test?](https://www.lukasvermeer.nl/confidence/) - Experience the challenges of A/B testing through this educational simulation.
[⬆ back to top](#contents)
---
## ⏳ Time Series Analysis
### Resources
A collection of resources for understanding time series fundamentals and analytical techniques.
- [Awesome Time Series](https://github.com/lmmentel/awesome-time-series) - A curated list of resources dedicated to time series analysis and forecasting.
- [Forecasting: Principles and Practice](https://otexts.com/fpp3/) - Comprehensive textbook on forecasting methods with practical examples.
- [NIST/SEMATECH e-Handbook](https://www.itl.nist.gov/div898/handbook/pmc/section4/pmc4.htm) - Official time series analysis guide from NIST.
- [Awesome Time Series Anomaly Detection](https://github.com/rob-med/awesome-TS-anomaly-detection) - A curated list of tools, datasets, and papers dedicated to time series anomaly detection.
- [Awesome Time Series in Python](https://github.com/MaxBenChrist/awesome_time_series_in_python) - A comprehensive list of Python tools and libraries for time series analysis.
[⬆ back to top](#contents)
---
### Tools
A collection of tools for working with temporal data.
- [Facebook Prophet](https://github.com/facebook/prophet) - A procedure for forecasting time series data based on an additive model.
- [Uber Orbit](https://github.com/uber/orbit) - A Python package for Bayesian time series forecasting and inference.
- [sktime](https://github.com/sktime/sktime) - A unified Python framework for machine learning with time series, compatible with scikit-learn.
- [GluonTS](https://github.com/awslabs/gluonts) - A Python toolkit for probabilistic time series modeling, with deep learning models based on PyTorch and MXNet.
- [Time-Series-Library](https://github.com/thuml/Time-Series-Library) - A library for deep learning-based time series analysis and forecasting.
- [TimesFM](https://github.com/google-research/timesfm) - A pretrained time series foundation model from Google Research for zero-shot forecasting.
- [PyTorch Forecasting](https://github.com/sktime/pytorch-forecasting) - A PyTorch-based library for time series forecasting with neural networks.
- [Time-series-prediction](https://github.com/LongxingTan/Time-series-prediction) - A collection of time series prediction methods and implementations.
- [PlotJuggler](https://github.com/facontidavide/PlotJuggler) - A tool to visualize and analyze time series data logs in real-time.
- [TSFresh](https://github.com/blue-yonder/tsfresh) - Automatically extracts features from time series data.
- [pmdarima](https://github.com/alkaline-ml/pmdarima) - Python library for ARIMA modeling and time series analysis.
- [Kats](https://github.com/facebookresearch/Kats) - Toolkit from Facebook Research for analyzing time series data.
[⬆ back to top](#contents)
---
## ⚙️ Data Engineering
### Resources
A collection of resources to help you build and manage robust data pipelines and infrastructure.
- [Data Engineer Handbook](https://github.com/DataExpert-io/data-engineer-handbook) - A comprehensive guide covering fundamental and advanced data engineering concepts.
- [Data Engineering Zoomcamp](https://github.com/DataTalksClub/data-engineering-zoomcamp) - Free course on data engineering fundamentals.
- [Awesome Data Engineering](https://github.com/igorbarinov/awesome-data-engineering) - A curated list of data engineering tools, software, and resources.
- [Data Engineering Cookbook](https://github.com/andkret/Cookbook) - Techniques and strategies for building reliable data platforms.
- [Awesome Pipeline](https://github.com/pditommaso/awesome-pipeline) - A curated list of pipeline toolkits for data processing and workflow management.
- [Awesome DB Tools](https://github.com/mgramin/awesome-db-tools) - A curated list of awesome database tools.
[⬆ back to top](#contents)
---
### Tools
A collection of tools for building, deploying, and managing data pipelines and infrastructure.
- [dbt-core](https://github.com/dbt-labs/dbt-core) - A framework for transforming data in your warehouse using SQL and Jinja.
- [Apache Spark](https://github.com/apache/spark) - A unified engine for large-scale data processing and analytics.
- [Apache Kafka](https://github.com/apache/kafka) - A distributed event streaming platform for building real-time data pipelines.
- [Dagster](https://github.com/dagster-io/dagster) - A data orchestrator for machine learning, analytics, and ETL.
- [Apache Airflow](https://github.com/apache/airflow) - A platform to programmatically author, schedule, and monitor workflows.
- [Apache Hive](https://github.com/apache/hive) - A data warehouse software for reading, writing, and managing large datasets in distributed storage using SQL.
- [Apache Hadoop](https://github.com/apache/hadoop) - A framework that allows for the distributed processing of large data sets across clusters of computers.
- [Luigi](https://github.com/spotify/luigi) - A Python module for building complex and batch-oriented data pipelines.
- [Apache Iceberg](https://github.com/apache/iceberg) - A high-performance table format for huge analytic datasets.
- [Apache Cassandra](https://github.com/apache/cassandra) - A highly scalable distributed NoSQL database designed for handling large amounts of data across many commodity servers.
- [Apache Flink](https://github.com/apache/flink) - A framework for stateful computations over unbounded and bounded data streams (real-time stream processing).
- [Apache Beam](https://github.com/apache/beam) - A unified model for defining both batch and streaming data-parallel processing pipelines.
- [Apache Pulsar](https://github.com/apache/pulsar) - A cloud-native, distributed messaging and streaming platform.
- [Delta Lake](https://github.com/delta-io/delta) - A storage layer that brings ACID transactions to Apache Spark and big data workloads.
- [Apache Hudi](https://github.com/apache/hudi) - An open data lakehouse platform, built on a high-performance open table format.
- [Trino](https://github.com/trinodb/trino) - A distributed SQL query engine designed for fast analytic queries against large datasets.
- [DataHub](https://github.com/datahub-project/datahub) - A metadata platform for the modern data stack.
- [OpenLineage](https://github.com/OpenLineage/OpenLineage) - An open framework for collection and analysis of data lineage.
- [Kedro](https://github.com/kedro-org/kedro) - A framework for creating reproducible, maintainable and modular data science code.
- [Apache Calcite](https://github.com/apache/calcite) - A dynamic data management framework that allows for SQL parsing, optimization, and federation.
- [Prefect](https://github.com/PrefectHQ/prefect) - Workflow orchestration for building resilient data pipelines.
- [Apache Arrow](https://github.com/apache/arrow) - Universal columnar format and multi-language toolbox for fast data interchange.
- [Kestra](https://github.com/kestra-io/kestra) - An open-source, event-driven orchestrator that simplifies data workflow management.
[⬆ back to top](#contents)
---
## Natural Language Processing (NLP)
### Resources
A selection of resources for learning and applying natural language processing in Python.
- [Awesome NLP](https://github.com/keon/awesome-nlp) - A curated list of resources dedicated to natural language processing (NLP).
- [Hugging Face NLP Course](https://huggingface.co/learn/llm-course/chapter1/1) - Official course on transformers and NLP from Hugging Face.
- [Practical NLP Code](https://github.com/practical-nlp/practical-nlp-code) - Code examples and notebooks for practical natural language processing.
- [Oxford Deep NLP Lectures](https://github.com/oxford-cs-deepnlp-2017/lectures) - Lecture materials from Oxford's Deep Natural Language Processing course.
- [NLTK Book](https://www.nltk.org/book/) - Natural Language Processing with Python.
- [NLP with Python by Susan Li](https://github.com/susanli2016/NLP-with-Python) - Jupyter notebooks demonstrating various NLP techniques and applications.
- [Hands on NLTK Tutorial](https://github.com/hb20007/hands-on-nltk-tutorial) - The hands-on NLTK tutorial for NLP in Python.
- [YSDA NLP Course](https://github.com/yandexdataschool/nlp_course) - Yandex School of Data Analysis course on Natural Language Processing.
- [The NLP Pandect](https://github.com/ivan-bilan/The-NLP-Pandect) - Comprehensive NLP guide covering theory, models, and practical implementations.
[⬆ back to top](#contents)
---
### Tools
A collection of powerful libraries and frameworks for natural language processing.
- [Natural Language Toolkit (NLTK)](https://www.nltk.org/) - A leading platform for building Python programs to work with human language data.
- [TextBlob](https://textblob.readthedocs.io/en/dev/) - A simple library for processing textual data.
- [SpaCy](https://spacy.io/) - An open-source software library for advanced NLP in Python.
- [BERT](https://github.com/google-research/bert) - A transformer-based model for NLP tasks.
- [Flair](https://github.com/flairNLP/flair) - A simple framework for state-of-the-art NLP.
- [OpenHands](https://github.com/All-Hands-AI/OpenHands) - An open-source platform for LLM-powered software development agents.
- [Stanford CoreNLP](https://github.com/stanfordnlp/CoreNLP) - A Java suite of core NLP tools providing fundamental linguistic analysis capabilities.
- [John Snow Labs Spark-NLP](https://github.com/JohnSnowLabs/spark-nlp) - A state-of-the-art Natural Language Processing library built on Apache Spark.
- [TextAttack](https://github.com/QData/TextAttack) - A Python framework for adversarial attacks, data augmentation, and model training in NLP.
- [Gensim](https://github.com/piskvorky/gensim) - Topic modeling and natural language processing library for Python.
- [Stanza](https://github.com/stanfordnlp/stanza) - Python NLP library for many human languages, from the Stanford NLP Group.
- [SentenceTransformers](https://github.com/UKPLab/sentence-transformers) - Framework for state-of-the-art sentence and text embeddings.
- [LangExtract](https://github.com/google/langextract) - Google's library for structured information extraction from text using language models.
- [Rasa](https://github.com/RasaHQ/rasa) - Open-source framework for building contextual AI assistants and chatbots.
[⬆ back to top](#contents)
---
## 🤖 Machine Learning & AI
### Resources
A collection of resources to help you learn and apply machine learning concepts and techniques.
- [Awesome Machine Learning](https://github.com/josephmisiti/awesome-machine-learning) - A curated list of awesome Machine Learning frameworks, libraries and software.
- [Machine Learning Tutorials](https://github.com/ujjwalkarn/Machine-Learning-Tutorials) - Machine learning and deep learning tutorials, articles and other resources.
- [Awesome Deep Learning](https://github.com/ChristosChristofidis/awesome-deep-learning) - A curated list of awesome Deep Learning tutorials, projects and communities.
- [Best of ML Python](https://github.com/lukasmasuch/best-of-ml-python) - A ranked list of awesome machine learning Python libraries and tools.
- [Microsoft ML for Beginners](https://github.com/microsoft/ML-For-Beginners) - A beginner-friendly introduction to machine learning concepts and practices.
- [mlcourse.ai](https://github.com/Yorko/mlcourse.ai) - Open Machine Learning Course with practical assignments and real-world applications.
- [Machine Learning Zoomcamp](https://github.com/DataTalksClub/machine-learning-zoomcamp) - A free practical machine learning course focused on building and deploying models.
- [Awesome Artificial Intelligence](https://github.com/owainlewis/awesome-artificial-intelligence) - A curated list of artificial intelligence resources.
- [Google Research](https://github.com/google-research/google-research) - Official repository for Google Research projects and publications.
- [100 Days of ML Coding](https://github.com/Avik-Jain/100-Days-Of-ML-Code) - A comprehensive coding challenge to learn machine learning over 100 days.
- [Made With ML](https://github.com/GokuMohandas/Made-With-ML) - Resource for building and deploying machine learning applications.
- [Handson-ml3](https://github.com/ageron/handson-ml3) - Hands-on guide to machine learning and deep learning using Python.
- [LLMs-from-scratch](https://github.com/rasbt/LLMs-from-scratch) - Educational repository for building LLMs from scratch.
- [Awesome Generative AI Guide](https://github.com/aishwaryanr/awesome-generative-ai-guide) - A comprehensive guide to generative AI models, tools, and applications.
- [Awesome LLM](https://github.com/Hannibal046/Awesome-LLM) - A curated list of papers, projects, and resources related to Large Language Models.
- [Machine Learning with Python by Susan Li](https://github.com/susanli2016/Machine-Learning-with-Python) - Jupyter notebooks covering various machine learning algorithms and applications.
- [Understanding Deep Learning](https://udlbook.github.io/udlbook/) - Comprehensive and accessible textbook on deep learning fundamentals.
- [Deep Learning Papers Reading Roadmap](https://github.com/floodsung/Deep-Learning-Papers-Reading-Roadmap) - Curated roadmap of seminal deep learning papers for newcomers.
[⬆ back to top](#contents)
---
### Tools
A collection of tools for developing and deploying machine learning models.
#### Machine Learning
- [Scikit-learn](https://github.com/scikit-learn/scikit-learn) - Machine learning library for classical algorithms and model building.
- [XGBoost](https://github.com/dmlc/xgboost) - Optimized distributed gradient boosting library for tree-based models.
- [LightGBM](https://github.com/microsoft/LightGBM) - Fast, distributed, high-performance gradient boosting framework.
- [CatBoost](https://github.com/catboost/catboost) - High-performance gradient boosting on decision trees with categorical features support.
- [H2O-3](https://github.com/h2oai/h2o-3) - Open-source distributed machine learning platform.
- [cuML](https://github.com/rapidsai/cuml) - GPU-accelerated machine learning algorithms from RAPIDS.
- [dlib](https://github.com/davisking/dlib) - Modern C++ toolkit containing machine learning algorithms and tools.
- [SHAP](https://github.com/shap/shap) - Game theoretic approach to explain the output of any machine learning model.
- [InterpretML](https://github.com/interpretml/interpret) - Fit interpretable models and explain blackbox machine learning.
- [Optuna](https://github.com/optuna/optuna) - Hyperparameter optimization framework.
#### Deep Learning
- [TensorFlow](https://github.com/tensorflow/tensorflow) - End-to-end open source platform for machine learning and deep learning.
- [PyTorch](https://github.com/pytorch/pytorch) - Deep learning framework with strong support for research and production.
- [PyTorch Lightning](https://github.com/Lightning-AI/pytorch-lightning) - PyTorch wrapper for high-performance AI research.
- [PyTorch Ignite](https://github.com/pytorch/ignite) - High-level library to help with training and evaluating neural networks.
- [Keras](https://github.com/keras-team/keras) - High-level, multi-backend deep learning API that runs on JAX, TensorFlow, or PyTorch.
- [Fast.ai](https://github.com/fastai/fastai) - Deep learning library simplifying training fast and accurate neural nets.
- [HuggingFace Transformers](https://github.com/huggingface/transformers) - Model-definition framework for state-of-the-art machine learning models.
- [HuggingFace Diffusers](https://github.com/huggingface/diffusers) - Library for state-of-the-art pretrained diffusion models.
- [PEFT](https://github.com/huggingface/peft) - Library for efficiently adapting large pretrained models.
- [YOLOv5](https://github.com/ultralytics/yolov5) - Real-time object detection system.
- [Ultralytics](https://github.com/ultralytics/ultralytics) - YOLOv8 and other computer vision models.
- [ONNX](https://github.com/onnx/onnx) - Open standard for machine learning interoperability.
- [PyTorch Geometric](https://github.com/pyg-team/pytorch_geometric) - Geometric deep learning extension library for PyTorch.
- [Pyro](https://github.com/pyro-ppl/pyro) - Deep universal probabilistic programming with Python and PyTorch.
- [Skorch](https://github.com/skorch-dev/skorch) - Scikit-learn compatible neural network library.
- [Sonnet](https://github.com/google-deepmind/sonnet) - DeepMind's library for building complex neural networks.
- [JAX](https://github.com/jax-ml/jax) - Composable transformations of Python+NumPy programs: differentiate, vectorize, JIT to GPU/TPU, and more.
- [TensorFlow Models](https://github.com/tensorflow/models) - Official TensorFlow repository with models and examples.
[⬆ back to top](#contents)
---
## MLOps
### Resources
Materials and curated lists for machine learning operations.
- [MLOps Zoomcamp](https://github.com/DataTalksClub/mlops-zoomcamp) - A free course focused on the practical aspects of deploying and maintaining ML systems.
- [Awesome MLOps (visenger)](https://github.com/visenger/awesome-mlops) - A curated list of references for MLOps.
- [Awesome MLOps (kelvins)](https://github.com/kelvins/awesome-mlops) - A curated list of awesome MLOps tools.
- [Awesome LLMOps](https://github.com/tensorchord/Awesome-LLMOps) - An awesome & curated list of best LLMOps tools for developers.
- [LLM Zoomcamp](https://github.com/DataTalksClub/llm-zoomcamp) - A course dedicated to Large Language Models, their architecture and applications.
- [ML Engineering Guide](https://github.com/stas00/ml-engineering) - A practical guide to machine learning engineering and MLOps best practices.
- [Awesome Production Machine Learning](https://github.com/EthicalML/awesome-production-machine-learning) - A curated list of tools for deploying, monitoring, and maintaining ML systems in production.
- [Llama Cookbook](https://github.com/meta-llama/llama-cookbook) - Official recipes and examples for working with Llama models.
[⬆ back to top](#contents)
---
### Tools
Platforms and utilities for deploying, monitoring, and maintaining ML systems.
- [ColossalAI](https://github.com/hpcaitech/ColossalAI) - High-performance distributed training framework.
- [DVC](https://github.com/iterative/dvc) - Version control system for machine learning projects.
- [Evidently](https://github.com/evidentlyai/evidently) - Tool for analyzing and monitoring data and model drift.
- [Deepchecks](https://github.com/deepchecks/deepchecks) - Validation for ML models and data.
- [Sematic](https://github.com/sematic-ai/sematic) - Tool to build, debug, and execute ML pipelines with native Python.
- [netdata](https://github.com/netdata/netdata) - Real-time performance monitoring.
- [meilisearch](https://github.com/meilisearch/meilisearch) - Fast, open-source search engine.
- [vLLM](https://github.com/vllm-project/vllm) - High-throughput and memory-efficient inference library for LLMs.
- [haystack](https://github.com/deepset-ai/haystack) - LLM framework for building search and question answering systems.
- [Kubeflow](https://github.com/kubeflow/kubeflow) - Machine learning toolkit for Kubernetes.
- [Seldon Core](https://github.com/SeldonIO/seldon-core) - Open source platform for deploying and monitoring machine learning models in production.
- [Feast](https://github.com/feast-dev/feast) - A feature store for machine learning that manages and serves ML features to models.
- [BentoML](https://github.com/bentoml/BentoML) - Framework for building, shipping, and scaling ML applications.
- [MLflow](https://github.com/mlflow/mlflow) - Open-source platform for the complete machine learning lifecycle.
- [Wandb](https://github.com/wandb/wandb) - Tool for experiment tracking, dataset versioning, and model management.
- [Opik (Comet)](https://github.com/comet-ml/opik) - Open-source platform from Comet for evaluating, tracing, and monitoring LLM applications.
- [Netflix Metaflow](https://github.com/Netflix/metaflow) - A human-friendly Python library for helping scientists and engineers build and manage real-life data science projects.
- [mindsdb](https://github.com/mindsdb/mindsdb) - Platform for integrating AI into databases and applications.
- [KServe](https://github.com/kserve/kserve) - Standardized serverless inference platform for deploying and serving machine learning models on Kubernetes.
- [SQLFlow](https://github.com/sql-machine-learning/sqlflow) - Brings machine learning capabilities to SQL, enabling model training and prediction using SQL syntax.
- [Jina AI Serve](https://github.com/jina-ai/serve) - Framework for building and deploying AI services that communicate via gRPC, HTTP and WebSockets.
- [LiteLLM](https://github.com/BerriAI/litellm) - Unified interface to call all LLM APIs (OpenAI, Anthropic, Cohere, etc.) with consistent output formatting.
[⬆ back to top](#contents)
---
## 🧠 AI Applications & Platforms
### Resources
A collection of resources focused on AI applications and platforms.
- [Awesome LLM Apps](https://github.com/Shubhamsaboo/awesome-llm-apps) - Collection of awesome LLM apps with AI Agents and RAG using OpenAI, Anthropic, Gemini and opensource models.
- [Awesome Generative AI](https://github.com/steven2358/awesome-generative-ai) - A curated list of modern Generative Artificial Intelligence projects and services.
- [Generative AI for Beginners](https://github.com/microsoft/generative-ai-for-beginners) - Course on generative AI for beginners from Microsoft.
- [Awesome AI Agents](https://github.com/e2b-dev/awesome-ai-agents) - A curated list of AI autonomous agents, environments, and frameworks.
- [AI Collection](https://github.com/ai-collection/ai-collection) - The Generative AI Landscape - A Collection of Awesome Generative AI Applications.
- [Awesome AI Apps](https://github.com/Arindam200/awesome-ai-apps) - A collection of projects showcasing RAG, agents, workflows, and other AI use cases.
- [System Prompts and Models](https://github.com/x1xhlol/system-prompts-and-models-of-ai-tools) - System Prompts, Internal Tools & AI Models from various AI applications and coding tools.
- [Awesome LangChain](https://github.com/kyrolabs/awesome-langchain) - Curated list of tools and projects built with the LangChain framework.
- [Awesome AI Tools](https://github.com/mahseema/awesome-ai-tools) - A curated list of Artificial Intelligence Top Tools.
- [Awesome LLM Security](https://github.com/corca-ai/awesome-llm-security) - A curation of awesome tools, documents and projects about LLM Security.
- [Claude Cookbooks](https://github.com/anthropics/claude-cookbooks) - Official Anthropic examples and recipes for working with Claude AI.
[⬆ back to top](#contents)
---
### Tools
A collection of frameworks, platforms, and end-user applications for building and deploying AI-powered solutions.
#### AI Agents & Automation
- [n8n](https://github.com/n8n-io/n8n) - Workflow automation platform for connecting APIs and services.
- [crewAI](https://github.com/crewAIInc/crewAI) - Framework for orchestrating role-playing AI agents.
- [autogen](https://github.com/microsoft/autogen) - Framework for building multi-agent conversational systems.
- [AutoGPT](https://github.com/Significant-Gravitas/AutoGPT) - Autonomous AI agent that can complete complex tasks.
- [LangGraph](https://github.com/langchain-ai/langgraph) - Framework for building stateful, multi-actor applications with LLMs, with cycles and control flow.
#### Development Frameworks & Tools
- [LangChain](https://github.com/langchain-ai/langchain) - Framework for developing applications powered by language models.
- [LlamaIndex](https://github.com/run-llama/llama_index) - Data framework for LLM-based applications with RAG capabilities.
- [openai-python](https://github.com/openai/openai-python) - Official Python library for OpenAI API.
- [openai-agents-python](https://github.com/openai/openai-agents-python) - Official OpenAI framework for building AI agents.
- [ragflow](https://github.com/infiniflow/ragflow) - Open-source RAG (Retrieval-Augmented Generation) workflow platform.
- [firecrawl](https://github.com/firecrawl/firecrawl) - Web crawling and data extraction service for AI applications.
- [Fabric](https://github.com/danielmiessler/Fabric) - Framework for augmenting humans using AI.
#### Code Generation & Assistance
- [gpt-engineer](https://github.com/AntonOsika/gpt-engineer) - AI-powered code generation tool.
- [gpt-pilot](https://github.com/Pythagora-io/gpt-pilot) - AI pair programmer that writes entire applications.
- [tabby](https://github.com/TabbyML/tabby) - Self-hosted AI coding assistant.
#### Model Deployment & Platforms
- [Ollama](https://github.com/jmorganca/ollama) - Tool for running large language models locally.
- [OpenLLM](https://github.com/bentoml/OpenLLM) - Open platform for operating large language models in production.
- [LocalAI](https://github.com/mudler/LocalAI) - Self-hosted, local-first AI model deployment platform.
- [dify](https://github.com/langgenius/dify) - Visual LLM application development platform.
- [LLaMA-Factory](https://github.com/hiyouga/LLaMA-Factory) - Easy-to-use LLM fine-tuning framework.
#### End-User Applications
- [open-webui](https://github.com/open-webui/open-webui) - Web interface for interacting with various LLMs.
- [ComfyUI](https://github.com/comfyanonymous/ComfyUI) - Visual node-based interface for Stable Diffusion.
- [lobe-chat](https://github.com/lobehub/lobe-chat) - Modern AI conversation interface.
- [LibreChat](https://github.com/danny-avila/LibreChat) - Open-source ChatGPT alternative.
- [quivr](https://github.com/QuivrHQ/quivr) - Personal second brain and AI assistant.
- [upscayl](https://github.com/upscayl/upscayl) - AI-powered image upscaling tool.
- [facefusion](https://github.com/facefusion/facefusion) - AI face swapping and enhancement tool.
- [DocsGPT](https://github.com/arc53/DocsGPT) - Documentation-based question answering system.
- [Whisper](https://github.com/openai/whisper) - Robust speech recognition model for transcription and translation.
[⬆ back to top](#contents)
---
## ☁️ Cloud Platforms & Infrastructure
### Resources
A collection of resources for mastering cloud-native technologies, containerization, and infrastructure management.
- [Awesome Cloud Native](https://github.com/rootsongjc/awesome-cloud-native) - A curated list of resources for cloud native technologies.
- [Awesome Kubernetes](https://github.com/ramitsurana/awesome-kubernetes) - A curated list for awesome Kubernetes resources.
- [Awesome Docker](https://github.com/veggiemonk/awesome-docker) - A curated list of Docker resources and projects.
- [AWS Well-Architected Labs](https://github.com/awslabs/aws-well-architected-labs) - Hands-on labs to help you learn about the AWS Well-Architected Framework.
- [Kubernetes The Hard Way](https://github.com/kelseyhightower/kubernetes-the-hard-way) - Tutorial for bootstrapping a Kubernetes cluster the hard way on Google Cloud Platform.
- [Awesome Compose](https://github.com/docker/awesome-compose) - A curated list of Docker Compose samples.
- [AWS EKS Best Practices](https://github.com/aws/aws-eks-best-practices) - A best practices guide for Amazon EKS.
- [Awesome Selfhosted](https://github.com/awesome-selfhosted/awesome-selfhosted) - A list of Free Software network services and web applications which can be hosted locally.
- [Awesome Selfhosted Docker](https://github.com/hotheadhacker/awesome-selfhost-docker) - A curated list of awesome selfhosted applications and solutions using Docker.
- [Awesome Kubernetes Resources](https://github.com/tomhuang12/awesome-k8s-resources) - A curated list of awesome Kubernetes tutorials, tools, and resources.
- [Awesome Cloud Security](https://github.com/4ndersonLin/awesome-cloud-security) - A curated list of awesome cloud security resources, tools, and best practices.
- [DevOps Exercises](https://github.com/bregman-arie/devops-exercises) - Linux, Jenkins, AWS, SRE, Prometheus, Docker, Python, Ansible, Git, Kubernetes, Terraform, OpenStack, SQL, and more.
[⬆ back to top](#contents)
---
### Tools
Tools for containerization, orchestration, infrastructure as code, and cloud-native development.
#### Containerization & Orchestration
- [Docker](https://github.com/docker) - Open platform for developing, shipping, and running applications in containers.
- [Docker Compose](https://github.com/docker/compose) - A tool for defining and running multi-container Docker applications.
- [Kubernetes](https://github.com/kubernetes/kubernetes) - Production-grade container orchestration system.
- [Kompose](https://github.com/kubernetes/kompose) - Conversion tool from Docker Compose to Kubernetes.
#### Infrastructure as Code
- [Terraform](https://github.com/hashicorp/terraform) - Infrastructure as Code tool.
- [OpenTofu](https://github.com/opentofu/opentofu) - Open source fork of Terraform.
- [Pulumi](https://github.com/pulumi/pulumi) - Modern IaC platform using familiar programming languages.
- [CDK8s](https://github.com/cdk8s-team/cdk8s) - Define Kubernetes apps using familiar languages.
#### CI/CD & GitOps
- [Jenkins](https://github.com/jenkinsci/jenkins) - Open source automation server.
- [Argo CD](https://github.com/argoproj/argo-cd) - Declarative GitOps continuous delivery.
- [Argo Workflows](https://github.com/argoproj/argo-workflows) - Container-native workflow engine.
- [Tekton](https://github.com/tektoncd/pipeline) - Kubernetes-native CI/CD framework.
- [Spinnaker](https://github.com/spinnaker/spinnaker) - Multi-cloud continuous delivery.
- [Dagger](https://github.com/dagger/dagger) - Portable devkit for CI/CD pipelines.
#### Service Mesh & API Gateways
- [Traefik](https://github.com/traefik/traefik) - Modern HTTP reverse proxy and load balancer.
- [Kong](https://github.com/Kong/kong) - Cloud-native API Gateway.
- [Apache APISIX](https://github.com/apache/apisix) - Dynamic API gateway.
- [Envoy Gateway](https://github.com/envoyproxy/gateway) - Manages Envoy Proxy as gateway.
- [Higress](https://github.com/alibaba/higress) - Cloud-native API gateway based on Istio.
- [Meshery](https://github.com/meshery/meshery) - Service mesh management.
#### Kubernetes Ecosystem
- [Helm](https://github.com/helm/helm) - Package manager for Kubernetes.
- [Kustomize](https://github.com/kubernetes-sigs/kustomize) - Configuration customization for Kubernetes.
- [Kubernetes Dashboard](https://github.com/kubernetes/dashboard) - Web-based UI for Kubernetes.
- [Skaffold](https://github.com/GoogleContainerTools/skaffold) - Continuous development for Kubernetes.
- [Tilt](https://github.com/tilt-dev/tilt) - Local development for Kubernetes.
- [Flagger](https://github.com/fluxcd/flagger) - Progressive delivery operator.
- [KubeVela](https://github.com/kubevela/kubevela) - Application delivery platform.
- [KubeSphere](https://github.com/kubesphere/kubesphere) - Kubernetes multi-cloud management.
#### Developer Platforms & Control Planes
- [Crossplane](https://github.com/crossplane/crossplane) - Cloud native control plane.
- [Artifact Hub](https://github.com/artifacthub/hub) - Kubernetes packages and Helm charts.
- [Devtron](https://github.com/devtron-labs/devtron) - Kubernetes dashboard.
- [Harness](https://github.com/harness/harness) - End-to-end developer platform.
[⬆ back to top](#contents)
---
## ⚡ Productivity
### Resources
A collection of resources to enhance productivity.
- [Positron](https://github.com/posit-dev/positron) - A next-generation data science IDE.
- [Nanobrowser](https://github.com/nanobrowser/nanobrowser) - An open-source AI web automation tool with multi-agent system that runs directly in your browser.
- [Best of Jupyter](https://github.com/ml-tooling/best-of-jupyter) - Ranked list of notable Jupyter Notebook, Hub, and Lab projects.
- [Deepnote](https://github.com/deepnote/deepnote) - AI native data science notebook platform compatible with Jupyter, featuring real-time collaboration, environment management, and integrations.
- [AFFiNE](https://github.com/toeverything/AFFiNE) - All-in-one workspace for notes, docs, and data visualization.
- [Marimo](https://github.com/marimo-team/marimo) - Reactive Python notebook for reproducible and interactive data science.
- [ChatGPT Data Science Prompts](https://github.com/travistangvh/ChatGPT-Data-Science-Prompts) - A collection of useful prompts for data scientists using ChatGPT.
- [Cookiecutter Data Science](https://github.com/drivendataorg/cookiecutter-data-science) - A standardized project structure for data science projects.
- [Learn Regex](https://github.com/ziishaned/learn-regex) - Comprehensive guide to learning regular expressions with examples and exercises.
- [Awesome Regex](https://github.com/aloisdg/awesome-regex) - Curated collection of regex tools, libraries, and learning resources.
- [The Markdown Guide](https://www.markdownguide.org/) - Comprehensive guide to learning Markdown.
- [Readme-AI](https://github.com/eli64s/readme-ai) - A tool to automatically generate README.md files for your projects.
- [Markdown Here](https://github.com/adam-p/markdown-here) - Extension for writing emails in Markdown and rendering them before sending.
- [MarkText](https://github.com/marktext/marktext) - Simple and elegant markdown editor for documentation.
- [QuarkDown](https://github.com/iamgio/quarkdown) - Lightweight markdown processor for fast document rendering.
- [screenshot-to-code](https://github.com/abi/screenshot-to-code) - AI tool that converts screenshots into code for various frontend stacks.
- [Codebeautify](https://codebeautify.org/) - All-in-one online code formatter and beautifier for Python, SQL, JSON, and more.
- [Notion](https://www.notion.com/) - An all-in-one workspace for note-taking and task management.
- [Trello](https://trello.com/home) - A visual project management tool.
- [Habitica](https://github.com/HabitRPG/habitica) - A habit-building and productivity app that treats your life like a role-playing game.
- [Bujo](https://bulletjournal.com/) - The Bullet Journal method, an analog system for organizing tasks, notes, and goals.
- [Parabola](https://parabola.io/) - An AI-powered workflow builder for organizing data.
- [Asana](https://asana.com/) - A project management platform for tracking work and projects.
- [Puter](https://github.com/HeyPuter/puter) - An open-source, browser-based computing environment and cloud OS.
[⬆ back to top](#contents)
---
### Useful Linux Tools
A selection of tools to enhance productivity and functionality in Linux environments.
- [tldr-pages](https://github.com/tldr-pages/tldr) - Simplified and community-driven man pages with practical examples.
- [Bat](https://github.com/sharkdp/bat) - Cat clone with syntax highlighting.
- [Exa](https://github.com/ogham/exa) - Modern replacement for ls.
- [Ripgrep](https://github.com/BurntSushi/ripgrep) - Faster grep alternative.
- [Zoxide](https://github.com/ajeetdsouza/zoxide) - Smarter cd command.
- [Peek](https://github.com/phw/peek) - Simple animated GIF screen recorder with an easy to use interface.
- [CopyQ](https://github.com/hluk/CopyQ) - Clipboard manager with advanced features.
- [Translate Shell](https://github.com/soimort/translate-shell) - Command-line translator using Google Translate, Bing Translator, Yandex.Translate, etc.
- [Espanso](https://github.com/espanso/espanso) - Cross-platform Text Expander written in Rust.
- [Flameshot](https://github.com/flameshot-org/flameshot) - Powerful yet simple to use screenshot software.
- [DrawIO Desktop](https://github.com/jgraph/drawio-desktop) - An open-source diagramming software for making flowcharts, process diagrams, and more.
- [Inkscape](https://github.com/inkscape/inkscape) - A powerful, free, and open-source vector graphics editor for creating and editing visualizations.
- [Rclone](https://rclone.org/) - A command-line program to manage files on cloud storage.
- [Rsync](https://rsync.samba.org/) - A fast and versatile file copying tool that can synchronize files and directories between two locations over a network or locally.
- [Timeshift](https://github.com/linuxmint/timeshift) - System restore tool for Linux that creates filesystem snapshots using rsync+hardlinks or BTRFS snapshots.
- [Backintime](https://github.com/bit-team/backintime) - A comfortable and well-configurable graphical frontend for incremental backups.
- [Fzf](https://github.com/junegunn/fzf) - A command-line fuzzy finder.
- [Osquery](https://github.com/osquery/osquery) - SQL powered operating system instrumentation, monitoring, and analytics.
- [GNU Parallel](https://www.gnu.org/software/parallel/) - A tool to run jobs in parallel.
- [HTop](https://htop.dev/) - An interactive process viewer.
- [Ncdu](https://dev.yorhel.nl/ncdu) - A disk usage analyzer with an ncurses interface.
- [Thefuck](https://github.com/nvbn/thefuck) - A command line tool to correct your previous console command.
- [Miller](https://github.com/johnkerl/miller) - A tool for querying, processing, and formatting data in various file formats (CSV, JSON, etc.), like awk/sed/cut for data.
- [jq](https://github.com/jqlang/jq) - Command-line JSON processor for parsing and manipulating JSON data.
- [yq](https://github.com/mikefarah/yq) - Portable command-line YAML processor (like jq for YAML and XML).
- [q](https://github.com/harelba/q) - Run SQL directly on CSV or TSV files from the command line.
- [VisiData](https://github.com/saulpw/visidata) - Interactive multitool for tabular data exploration in the terminal.
- [csvkit](https://github.com/wireservice/csvkit) - Suite of command-line tools for working with CSV data.
- [httpie](https://github.com/httpie/cli) - Modern command-line HTTP client for API testing and debugging.
- [glances](https://github.com/nicolargo/glances) - Cross-platform system monitoring tool for resource usage analysis.
- [hyperfine](https://github.com/sharkdp/hyperfine) - Command-line benchmarking tool for performance testing.
- [termgraph](https://github.com/mkaz/termgraph) - Draw basic graphs in the terminal for quick data visualization.
- [fd](https://github.com/sharkdp/fd) - Simple, fast and user-friendly alternative to 'find'.
- [dust](https://github.com/bootandy/dust) - More intuitive version of du written in Rust.
- [bottom](https://github.com/ClementTsang/bottom) - Cross-platform graphical process/system monitor.
[⬆ back to top](#contents)
---
### Useful VS Code Extensions
A collection of extensions to enhance functionality and productivity in Visual Studio Code.
- [JDBC Adapter](https://marketplace.visualstudio.com/items/?itemName=cweijan.dbclient-jdbc) - Connect to various databases using JDBC.
- [DBCode - Connect](https://marketplace.visualstudio.com/items?itemName=DBCode.dbcode) - Database client for managing and querying databases.
- [Markdown All in One](https://marketplace.visualstudio.com/items?itemName=yzhang.markdown-all-in-one) - Essential tools for Markdown editing.
- [Markdown Preview GitHub Styles](https://marketplace.visualstudio.com/items?itemName=bierner.markdown-preview-github-styles) - Changes VS Code's markdown preview to match GitHub's styling.
- [Snippington Python Pandas Basic](https://marketplace.visualstudio.com/items?itemName=snippington.snp-pandas-basic) - Basic tools for working with Pandas in Python.
- [PDF Viewer for Visual Studio Code](https://marketplace.visualstudio.com/items?itemName=mathematic.vscode-pdf) - View PDF files directly in VS Code.
- [Quick Python Print](https://marketplace.visualstudio.com/items?itemName=WeidaWang.quick-python-print) - Quickly handle print operations in Python.
- [Rainbow CSV](https://marketplace.visualstudio.com/items?itemName=mechatroner.rainbow-csv) - Highlight CSV and TSV files and run SQL-like queries.
- [Remove Blank Lines](https://marketplace.visualstudio.com/items?itemName=thamaraiselvam.remove-blank-lines) - Extension to remove empty lines in documents.
- [PDF Preview in VSCode](https://marketplace.visualstudio.com/items/tomoki1207.pdf) - Show PDF previews in VS Code.
- [CSV to Table](https://marketplace.visualstudio.com/items?itemName=phplasma.csv-to-table) - Convert CSV/TSV/PSV files to ASCII formatted tables.
- [Data Preview](https://marketplace.visualstudio.com/items?itemName=RandomFractalsInc.vscode-data-preview) - Import, view, slice, and export data.
- [Data Wrangler](https://marketplace.visualstudio.com/items?itemName=ms-toolsai.datawrangler) - Tool for cleaning and preparing tabular datasets.
- [Error Lens](https://marketplace.visualstudio.com/items?itemName=usernamehw.errorlens) - Enhances the display of errors and warnings in code.
- [Indent Rainbow](https://marketplace.visualstudio.com/items?itemName=oderwat.indent-rainbow) - Makes indentation easier to read.
- [Markdown Table Editor](https://marketplace.visualstudio.com/items?itemName=TakumiI.markdowntable) - Add features to edit Markdown tables.
- [WYSIWYG Editor for Markdown](https://marketplace.visualstudio.com/items?itemName=cweijan.vscode-office) - View Word and Excel files and edit Markdown.
- [Prettier](https://marketplace.visualstudio.com/items?itemName=esbenp.prettier-vscode) - Code formatting extension for VS Code.
- [Project Manager](https://marketplace.visualstudio.com/items?itemName=alefragnani.project-manager) - Easily switch between projects.
- [Python Indent](https://marketplace.visualstudio.com/items?itemName=KevinRose.vsc-python-indent) - Automatically indent Python code.
- [SandDance](https://marketplace.visualstudio.com/items?itemName=msrvida.vscode-sanddance) - Visually explore and present your data.
- [SQL Notebooks](https://marketplace.visualstudio.com/items?itemName=cmoog.sqlnotebook) - Open SQL files as VS Code notebooks.
- [SQL Tools](https://marketplace.visualstudio.com/items/?itemName=mtxr.sqltools) - Database management tools for VS Code.
- [Kanban Board](https://marketplace.visualstudio.com/items/?itemName=mkloubert.vscode-kanban) - A Kanban board extension for organizing tasks within VS Code.
- [Path Autocomplete](https://marketplace.visualstudio.com/items?itemName=ionutvmi.path-autocomplete) - Provides path completion for files and directories in VS Code.
- [Path Intellisense](https://marketplace.visualstudio.com/items/?itemName=christian-kohler.path-intellisense) - Autocompletes filenames in your code.
- [Python Imports Utils](https://marketplace.visualstudio.com/items?itemName=mgesbert.python-path) - Utilities for managing Python imports.
- [Workspace Dashboard](https://marketplace.visualstudio.com/items?itemName=kruemelkatze.vscode-dashboard) - Organize your workspaces in a speed-dial manner.
- [Remote Development](https://marketplace.visualstudio.com/items?itemName=ms-vscode-remote.vscode-remote-extensionpack) - Open any folder in a container, on a remote machine, or in WSL.
- [Text Power Tools](https://marketplace.visualstudio.com/items/?itemName=qcz.text-power-tools) - An all-in-one solution with 240+ commands for text manipulation.
- [Toggle Quotes](https://marketplace.visualstudio.com/items/?itemName=BriteSnow.vscode-toggle-quotes) - Toggle between single, double, and backticks for strings.
- [Comment Translate](https://marketplace.visualstudio.com/items/?itemName=intellsmi.comment-translate) - Helps translate comments, strings, and variable names in your code.
- [Text Marker](https://marketplace.visualstudio.com/items/?itemName=ryu1kn.text-marker) - Select text in your code and mark all matches with configurable highlight color.
- [Bookmarks](https://marketplace.visualstudio.com/items/?itemName=alefragnani.Bookmarks) - Mark lines in your code and jump to them easily.
- [Dendron](https://marketplace.visualstudio.com/items/?itemName=dendron.dendron) - A hierarchical note-taking tool that grows as you do.
- [Gitignore Generator](https://marketplace.visualstudio.com/items/?itemName=rubbersheep.gi) - Simplifies the process of generating .gitignore files.
- [Test Explorer UI](https://marketplace.visualstudio.com/items/?itemName=hbenl.vscode-test-explorer) - Run your tests in the sidebar of Visual Studio Code.
- [Python Test Explorer](https://marketplace.visualstudio.com/items/?itemName=LittleFoxTeam.vscode-python-test-adapter) - Run your Python tests in the sidebar of Visual Studio Code.
- [VSCode Markdownlint](https://marketplace.visualstudio.com/items?itemName=DavidAnson.vscode-markdownlint) - A VS Code extension to lint and style check markdown files.
[⬆ back to top](#contents)
---
## Skill Development & Career
### Practice Resources
A collection of resources to enhance skills and advance your career in data analysis and related fields.
- [LeetCode](https://leetcode.com/problemset/) - A platform for preparing for technical coding interviews.
- [Kaggle Competitions](https://www.kaggle.com/competitions) - Platform for participating in data analysis and machine learning competitions.
- [Makeovermonday](https://makeovermonday.co.uk/) - A platform focused on enhancing data visualization practices.
- [Workout Wednesday](https://workout-wednesday.com/) - Engage in weekly challenges to improve your visualization skills.
- [Official TidyTuesday Repository](https://github.com/rfordatascience/tidytuesday) - Repository for the TidyTuesday project, promoting data analysis.
- [DrivenData Competitions](https://www.drivendata.org/competitions/) - Data analysis competitions with a social impact focus.
- [Codecademy Data Science Path](https://www.codecademy.com/learn/paths/data-science) - Interactive courses for learning data analysis.
- [SQL Masterclass](https://github.com/datawithdanny/sql-masterclass?tab=readme-ov-file#course-content) - A course to master SQL for data analysis, complete with real-world projects.
- [Hugging Face Tasks](https://huggingface.co/tasks) - Hands-on practice with specific NLP and machine learning tasks using real models.
- [Awesome LeetCode Resources](https://github.com/ashishps1/awesome-leetcode-resources) - Collection of curated resources and strategies for LeetCode practice.
[⬆ back to top](#contents)
---
### Curated Jupyter Notebooks
A selection of curated Jupyter notebooks to support learning and exploration in data science and analysis.
- [Awesome Notebooks](https://github.com/jupyter-naas/awesome-notebooks) - Data & AI notebook templates catalog organized by tools.
- [Data Science Ipython Notebooks](https://github.com/donnemartin/data-science-ipython-notebooks) - Data science Python notebooks covering various topics.
- [Pydata Book](https://github.com/wesm/pydata-book) - Materials and IPython notebooks for "Python for Data Analysis" by Wes McKinney.
- [Spark py Notebooks](https://github.com/jadianes/spark-py-notebooks) - Apache Spark & Python tutorials for big data analysis and machine learning.
- [DataMiningNotebooks](https://github.com/eclarson/DataMiningNotebooks) - Example notebooks for data mining accompanying the course at Southern Methodist University.
- [Pythondataanalysis](https://github.com/hnawaz007/pythondataanalysis) - Python data repository with Jupyter notebooks and scripts.
- [Python For Data Analysis](https://github.com/cuttlefishh/python-for-data-analysis) - An introduction to data science using Python and Pandas with Jupyter notebooks.
- [Jdwittenauer Ipython Notebooks](https://github.com/jdwittenauer/ipython-notebooks) - A collection of IPython notebooks covering various topics.
- [DataScienceInteractivePython](https://github.com/GeostatsGuy/DataScienceInteractivePython) - A collection of interactive Python notebooks for learning data science concepts.
- [Unsloth Notebooks](https://github.com/unslothai/notebooks) - Optimized notebooks for faster AI model training and fine-tuning.
- [Huggingface Notebooks](https://github.com/huggingface/notebooks) - Official Hugging Face notebooks for NLP, vision, audio, and diffusion models.
- [Deep Learning with Python Notebooks](https://github.com/fchollet/deep-learning-with-python-notebooks) - Official Jupyter notebooks from François Chollet's Deep Learning with Python book.
[⬆ back to top](#contents)
---
### Data Sources & Datasets
A collection of resources for accessing datasets and data sources for analysis and projects.
- [Kaggle Datasets](https://www.kaggle.com/datasets) - Extensive collection of datasets for practice in data analysis.
- [Opendatasets](https://github.com/JovianHQ/opendatasets) - A Python library for downloading datasets from Kaggle, Google Drive, and other online sources.
- [Datasette](https://github.com/simonw/datasette) - An open source multi-tool for exploring and publishing data.
- [Awesome Public Datasets](https://github.com/awesomedata/awesome-public-datasets) - Curated list of high-quality open datasets.
- [Open Data Sources](https://github.com/datasciencemasters/data) - Collection of various open data sources.
- [Free Datasets for Projects](https://www.dataquest.io/blog/free-datasets-for-projects/) - Dataquest's compilation of free datasets.
- [Data World](https://data.world/) - Enterprise data catalog for discovering, governing, and sharing datasets.
- [Awesome Public Real Time Datasets](https://github.com/bytewax/awesome-public-real-time-datasets) - A list of publicly available datasets with real-time data.
- [Google Dataset Search](https://datasetsearch.research.google.com/) - A search engine for datasets from across the web.
- [NASA Open Data Portal](https://data.nasa.gov/) - A site for NASA's open data initiative, providing access to NASA's data resources.
- [The World Bank Data](https://data.worldbank.org/) - Free and open access to global development data by The World Bank.
- [Voice Datasets](https://github.com/jim-schwoebel/voice_datasets) - A collection of audio and speech datasets for voice AI and machine learning.
- [HuggingFace Datasets](https://github.com/huggingface/datasets) - A lightweight library to easily share and access datasets for audio, computer vision, and NLP.
- [TensorFlow Datasets](https://github.com/tensorflow/datasets) - A collection of ready-to-use datasets for use with TensorFlow and other Python ML frameworks.
- [NLP Datasets](https://github.com/niderhoff/nlp-datasets) - A curated list of datasets for natural language processing (NLP) tasks.
- [TorchVision Datasets](https://github.com/pytorch/vision) - The torchvision.datasets module provides many built-in computer vision datasets.
- [LLM Datasets](https://github.com/mlabonne/llm-datasets) - A collection of datasets and resources for training and fine-tuning Large Language Models (LLMs).
- [Unsplash Datasets](https://github.com/unsplash/datasets) - A collection of datasets from Unsplash, useful for computer vision and research.
- [Awesome JSON Datasets](https://github.com/jdorfman/awesome-json-datasets?tab=readme-ov-file#bitcoin) - A curated list of awesome JSON datasets that are publicly available without authentication.
[⬆ back to top](#contents)
---
### Resume and Interview Tips
A variety of resources to help you prepare for interviews and enhance your resume.
- [Data Science Interview Questions Answers](https://github.com/youssefHosni/Data-Science-Interview-Questions-Answers) - Curated list of data science interview questions and answers.
- [Data Science Interview Preparation Resources](https://github.com/youssefHosni/Data-Science-Interview-Preperation-Resources) - Resources to help you prepare for upcoming data science interviews.
- [Data Science Interviews](https://github.com/alexeygrigorev/data-science-interviews) - A comprehensive collection of data science interview questions and resources.
- [The Data Science Interview Book](https://book.thedatascienceinterviewproject.com/) - A comprehensive resource to prepare for data science and machine learning interviews.
- [Machine Learning Interviews Book](https://github.com/chiphuyen/ml-interviews-book) - A comprehensive guide to preparing for machine learning engineering interviews.
- [Interviews](https://github.com/kdn251/interviews) - Personal tech interview study guide covering algorithms and data structures.
- [Devinterview](https://devinterview.io/) - Ace your next tech interview with confidence.
- [Interviewqs](https://www.interviewqs.com/) - Ace your next data science interview.
- [Cracking Data Science Interview](https://github.com/khanhnamle1994/cracking-the-data-science-interview) - A Collection of Cheatsheets, Books, Questions, and Portfolio For DS/ML Interview Prep.
- [Interview Query](https://www.interviewquery.com/) - Platform for preparing for data science interviews.
- [Awesome Behavioral Interviews](https://github.com/ashishps1/awesome-behavioral-interviews) - Curated resources for mastering behavioral and system design interviews.
- [Enhancv Data Scientist Resumes](https://enhancv.com/resume-examples/data-scientist/) - A collection of resume examples and tips tailored for data scientists.
- [Data Science Portfolio](https://www.datascienceportfol.io/) - A platform to create and showcase your data science portfolio.
- [InterviewBit - SQL Interview Questions](https://www.interviewbit.com/sql-interview-questions/) - Collection of SQL interview questions.
- [StrataScratch](https://www.stratascratch.com/) - Platform with real data science interview questions from top companies.
- [LeetCode Patterns](https://github.com/seanprashad/leetcode-patterns) - Curated collection of coding patterns and strategies for technical interviews.
- [Bartosz Jarocki's CV](https://github.com/BartoszJarocki/cv) - Modern, open-source technical resume template and example.
- [Awesome-CV](https://github.com/posquit0/Awesome-CV) - Professional CV and resume templates built with LaTeX.
- [Reactive-Resume](https://github.com/AmruthPillai/Reactive-Resume) - Open-source resume builder with multiple templates and customization options.
[⬆ back to top](#contents)
---
## Cheatsheets
A collection of cheatsheets across various domains to aid in quick reference and learning.
### GoalKicker Programming Notes
- [Python Notes for Professionals](https://books.goalkicker.com/PythonBook/PythonNotesForProfessionals.pdf) - A massive collection of Python concepts, idioms, and best practices for all levels.
- [SQL Notes for Professionals](https://books.goalkicker.com/SQLBook/SQLNotesForProfessionals.pdf) - A definitive guide to SQL syntax, queries, and database interaction concepts.
- [PostgreSQL Notes for Professionals](https://books.goalkicker.com/PostgreSQLBook/PostgreSQLNotesForProfessionals.pdf) - A professional compendium of knowledge for PostgreSQL administration and development.
- [MySQL Notes for Professionals](https://books.goalkicker.com/MySQLBook/MySQLNotesForProfessionals.pdf) - Essential reference material for working with the MySQL database management system.
- [Oracle Database Notes for Professionals](https://books.goalkicker.com/OracleDatabaseBook/OracleDatabaseNotesForProfessionals.pdf) - A guide to Oracle Database concepts, PL/SQL, and administration tasks.
- [MongoDB Notes for Professionals](https://books.goalkicker.com/MongoDBBook/MongoDBNotesForProfessionals.pdf) - A practical guide to working with NoSQL and MongoDB for modern application development.
- [Bash Notes for Professionals](https://books.goalkicker.com/BashBook/BashNotesForProfessionals.pdf) - A comprehensive guide to shell scripting and command-line mastery.
- [Git Notes for Professionals](https://books.goalkicker.com/GitBook/GitNotesForProfessionals.pdf) - Everything you need to know about version control with Git, from basics to advanced workflows.
- [Linux Notes for Professionals](https://books.goalkicker.com/LinuxBook/LinuxNotesForProfessionals.pdf) - A deep dive into Linux system administration, commands, and environment management.
- [Microsoft SQL Server Notes for Professionals](https://books.goalkicker.com/MicrosoftSQLServerBook/MicrosoftSQLServerNotesForProfessionals.pdf) - A detailed reference for developing and administering MS SQL Server databases.
- [PowerShell Notes for Professionals](https://books.goalkicker.com/PowerShellBook/PowerShellNotesForProfessionals.pdf) - A guide to task automation and configuration management using PowerShell.
[⬆ back to top](#contents)
---
### Python
- [Python Cheat Sheet](https://vivitoa.github.io/python-cheat-sheet/) - Comprehensive Python syntax and examples.
- [Learn Python](https://github.com/trekhleb/learn-python) - Interactive Python learning.
- [Pythoncheatsheet](https://www.pythoncheatsheet.org/) - Quick reference for Python basics and advanced topics.
- [Comprehensive Python Cheatsheet](https://github.com/gto76/python-cheatsheet) - Detailed Python functions and libraries.
- [Python Cheatsheet](https://github.com/wilfredinni/python-cheatsheet) - A comprehensive cheatsheet for the Python programming language.
- [Pysheeet](https://github.com/crazyguitar/pysheeet) - Concise Python cheat sheet for quick reference and interview prep.
[⬆ back to top](#contents)
---
### Data Science & Machine Learning
- [DS Cheatsheets](https://github.com/FavioVazquez/ds-cheatsheets) - List of Data Science Cheatsheets.
- [DS Notes & Cheatsheets](https://github.com/merveenoyan/my_notes) - Cheatsheets for data science, ML, computer science and more.
- [Data Science Cheat Sheets (Math)](https://github.com/ml874/Data-Science-Cheatsheet) - Cheat sheets for quick reference in data science mathematics.
- [Pandas Cheat Sheet](https://pandas.pydata.org/Pandas_Cheat_Sheet.pdf) - Data manipulation with Pandas.
- [PySpark Cheatsheet](https://github.com/kevinschaich/pyspark-cheatsheet) - Common PySpark patterns.
- [Machine Learning Cheat Sheet](https://github.com/soulmachine/machine-learning-cheat-sheet) - Concise machine learning cheat sheets covering key concepts and equations.
[⬆ back to top](#contents)
---
### Linux & Git
- [Linux Cheatsheet](https://github.com/gto76/linux-cheatsheet) - Linux commands and shortcuts.
- [Bash Awesome Cheatsheets](https://github.com/LeCoupa/awesome-cheatsheets/blob/master/languages/bash.sh) - Bash scripting essentials.
- [Unix Commands Reference](https://github.com/AdiBro/Data-Science-Resources/blob/master/Cheat-Sheets/CL-Git/Unix-Commands-Reference.pdf) - Unix terminal basics.
- [GitHub Cheat Sheet](https://github.com/tiimgreen/github-cheat-sheet) - Git/GitHub workflows and tips.
- [Git Awesome Cheatsheets](https://github.com/LeCoupa/awesome-cheatsheets/blob/master/tools/git.sh) - Git commands and best practices.
- [Git and Git Flow Cheat Sheet](https://github.com/arslanbilal/git-cheat-sheet) - Branching strategies.
[⬆ back to top](#contents)
---
### Probability & Statistics
- [Stanford CME 106 Cheatsheets](https://github.com/shervinea/stanford-cme-106-probability-and-statistics) - Probability and statistics for engineers.
- [10-Page Probability Cheatsheet](https://github.com/wzchen/probability_cheatsheet) - In-depth probability concepts.
- [Statistics Cheatsheet](https://github.com/khanhnamle1994/cracking-the-data-science-interview/blob/master/Cheatsheets/stats_cheatsheet.pdf) - Key statistical methods.
[⬆ back to top](#contents)
---
### SQL & Databases
- [Quick SQL Cheatsheet](https://github.com/enochtangg/quick-SQL-cheatsheet) - Handy SQL reference guide.
- [PostgreSQL Cheatsheet](https://www.postgresonline.com/downloads/special_feature/postgresql83_psql_cheatsheet.pdf) - A handy reference for the most common PostgreSQL psql commands and queries.
[⬆ back to top](#contents)
---
### Miscellaneous
- [CheatSheet for CheatSheets](https://github.com/plusminuschirag/CheatSheet-for-CheatSheets) - Mega-repository of cheat sheets.
- [Dataquest - Power BI Cheat Sheet](https://www.dataquest.io/cheat-sheet/power-bi-cheat-sheet/) - A helpful resource for Power BI users.
- [Data Structures Cheat Sheet](https://www.clear.rice.edu/comp160/data_cheat.html) - A concise reference for common data structures and their properties.
- [Matplotlib Cheatsheets](https://github.com/matplotlib/cheatsheets) - Official cheatsheets for the Matplotlib plotting library in Python.
- [VSCode Awesome Cheatsheets](https://github.com/LeCoupa/awesome-cheatsheets/blob/master/tools/vscode.md) - VS Code shortcuts.
- [Markdown Cheatsheet](https://github.com/tchapi/markdown-cheatsheet) - Formatting for GitHub READMEs.
- [Emoji Cheat Sheet](https://github.com/ikatyang/emoji-cheat-sheet) - Emojis in Markdown.
- [Docker Cheat Sheet](https://github.com/wsargent/docker-cheat-sheet) - Docker commands and workflows.
- [Docker Awesome Cheatsheets](https://github.com/LeCoupa/awesome-cheatsheets/blob/master/tools/docker.sh) - Containerization basics.
[⬆ back to top](#contents)
---
## Additional Python Libraries
A collection of supplementary Python libraries that enhance development workflow, automate processes, and maintain project quality beyond core data analysis tools.
### Code Quality & Development
- [Black](https://github.com/psf/black) - Uncompromising Python code formatter.
- [Pre-commit](https://github.com/pre-commit/pre-commit) - Framework for managing pre-commit hooks.
- [Pylint](https://github.com/pylint-dev/pylint) - Python code static analysis.
- [Mypy](https://github.com/python/mypy) - Optional static typing for Python.
- [Rich](https://github.com/Textualize/rich) - Rich text and beautiful formatting in the terminal.
- [Icecream](https://github.com/gruns/icecream) - Debugging without using print.
- [Pandas-log](https://github.com/eyaltrabelsi/pandas-log) - Logs pandas operations for data transformation tracking.
- [PandasVet](https://github.com/deppen8/pandas-vet) - Code style validator for Pandas.
- [Pydeps](https://github.com/thebjorn/pydeps) - Python module dependency graphs.
- [PyForest](https://github.com/8080labs/pyforest) - Automated Python imports for data science.
[⬆ back to top](#contents)
---
### Documentation & File Processing
- [Sphinx](https://github.com/sphinx-doc/sphinx) - Documentation generator.
- [Pdoc](https://github.com/mitmproxy/pdoc) - API documentation for Python projects.
- [MkDocs](https://github.com/mkdocs/mkdocs) - Project documentation with Markdown.
- [OpenPyXL](https://openpyxl.readthedocs.io/en/stable/) - Read/write Excel files.
- [Tablib](https://github.com/jazzband/tablib) - Exports data to XLSX, JSON, CSV.
- [PyPDF2](https://github.com/py-pdf/PyPDF2) - Reads and writes PDF files.
- [Python-docx](https://github.com/python-openxml/python-docx) - Reads and writes Word documents.
- [CleverCSV](https://github.com/alan-turing-institute/CleverCSV) - Smart CSV reader for messy data.
- [Python-markdownify](https://github.com/matthewwithanm/python-markdownify) - Convert HTML to Markdown.
- [Xlwings](https://github.com/xlwings/xlwings) - Integration of Python with Excel.
- [Xmltodict](https://github.com/martinblech/xmltodict) - Converts XML to Python dictionaries.
- [MarkItDown](https://github.com/microsoft/markitdown) - Python tool for converting files and office documents to Markdown.
- [Jupyter-book](https://github.com/executablebooks/jupyter-book) - Build publication-quality books from Jupyter notebooks.
- [WeasyPrint](https://github.com/Kozea/WeasyPrint) - Convert HTML to PDF.
- [PyMuPDF](https://github.com/pymupdf/PyMuPDF) - Advanced PDF manipulation library.
- [Camelot](https://github.com/camelot-dev/camelot) - PDF table extraction library.
[⬆ back to top](#contents)
---
### Web & APIs
- [HTTPX](https://github.com/encode/httpx) - Next-generation HTTP client for Python.
- [FastAPI](https://github.com/fastapi/fastapi) - Modern web framework for building APIs.
- [Flask](https://github.com/pallets/flask) - Lightweight Python web framework for building applications and APIs.
- [Typer](https://github.com/fastapi/typer) - Library for building CLI applications.
- [Requests-cache](https://github.com/reclosedev/requests-cache) - Persistent caching for requests library.
[⬆ back to top](#contents)
---
### Miscellaneous
- [UV](https://github.com/astral-sh/uv) - An extremely fast Python package installer and resolver.
- [Funcy](https://github.com/Suor/funcy) - Fancy functional tools for Python.
- [Pillow](https://github.com/python-pillow/Pillow) - Image processing library.
- [Ftfy](https://github.com/rspeer/python-ftfy) - Fixes broken Unicode strings.
- [JMESPath](https://github.com/jmespath/jmespath.py) - Queries JSON data (SQL-like for JSON).
- [Glom](https://github.com/mahmoud/glom) - Transforms nested data structures.
- [Diagrams](https://github.com/mingrammer/diagrams) - Diagrams as code for cloud architecture.
- [Pytest](https://github.com/pytest-dev/pytest) - Framework for writing small tests.
- [Pampy](https://github.com/santinic/pampy) - Pattern matching for Python dictionaries.
- [Pygorithm](https://github.com/OmkarPathak/pygorithm) - A Python module for learning all major algorithms.
- [GitPython](https://github.com/gitpython-developers/GitPython) - A Python library used to interact with Git repositories.
- [TQDM](https://github.com/tqdm/tqdm) - Progress bars for loops and operations.
- [Loguru](https://github.com/Delgan/loguru) - Python logging made simple.
- [Click](https://github.com/pallets/click) - Beautiful command line interfaces.
- [Poetry](https://github.com/python-poetry/poetry) - Python dependency management and packaging.
- [Hydra](https://github.com/facebookresearch/hydra) - Elegant configuration management.
[⬆ back to top](#contents)
---
## More Awesome Lists
A curated list of other awesome lists on various topics and technologies.
- [Awesome](https://github.com/sindresorhus/awesome) - A curated list of awesome lists.
- [Awesome Big Data](https://github.com/oxnr/awesome-bigdata) - A curated list of awesome big data frameworks, resources, and tools.
- [Awesome Geospatial](https://github.com/sacridini/Awesome-Geospatial) - A curated list of awesome geospatial libraries, tools, and resources.
- [Awesome ChatGPT Prompts](https://github.com/f/awesome-chatgpt-prompts) - A repository for ChatGPT prompt curation.
- [Awesome Jupyter](https://github.com/markusschanta/awesome-jupyter) - Curated list of Jupyter projects, libraries, and resources.
- [Awesome Business Intelligence](https://github.com/thenaturalist/awesome-business-intelligence) - Actively curated list of awesome BI tools.
- [Awesome Prompt Engineering](https://github.com/promptslab/Awesome-Prompt-Engineering) - A curated list of resources for prompt engineering with LLMs like ChatGPT.
- [Awesome Product Design](https://github.com/ttt30ga/awesome-product-design) - A collection of bookmarks, resources, articles about product design.
- [Awesome Shell](https://github.com/alebcay/awesome-shell) - A curated list of awesome command-line frameworks, toolkits, and guides.
- [Awesome FastAPI](https://github.com/mjhea0/awesome-fastapi) - A curated list of awesome FastAPI frameworks, libraries, and resources.
- [Awesome Linux Software](https://github.com/luong-komorebi/Awesome-Linux-Software) - A list of awesome applications and tools for Linux.
- [Awesome Product Management](https://github.com/dend/awesome-product-management) - A curated list of resources for product managers and aspiring PMs.
- [Awesome Python Applications](https://github.com/mahmoud/awesome-python-applications) - A list of free software and applications written in Python.
- [Awesome AutoHotkey](https://github.com/ahkscript/awesome-AutoHotkey) - A curated list of awesome AutoHotkey libraries, scripts, and resources.
- [Awesome Productivity](https://github.com/jyguyomarch/awesome-productivity) - A curated list of delightful productivity resources.
- [Awesome Scientific Writing](https://github.com/writing-resources/awesome-scientific-writing) - A curated list of resources for scientific writing, publishing, and research.
- [Awesome LaTeX](https://github.com/egeerardyn/awesome-LaTeX) - A curated list of LaTeX resources, libraries, and tools.
- [Awesome Actions](https://github.com/sdras/awesome-actions) - A curated list of awesome GitHub Actions for automation.
- [Awesome Quarto](https://github.com/mcanouil/awesome-quarto) - A curated list of Quarto resources, including talks, tools, examples, and articles.
- [Awesome Vscode](https://github.com/viatsko/awesome-vscode) - A comprehensive list of useful VS Code extensions and resources.
- [Awesome Readme](https://github.com/matiassingers/awesome-readme) - Collection of well-crafted README files for inspiration.
- [Awesome GitHub Profile Readme](https://github.com/abhisheknaiidu/awesome-github-profile-readme) - A collection of awesome GitHub profile READMEs and resources.
- [Awesome Code Review](https://github.com/joho/awesome-code-review?tab=readme-ov-file#awesome-code-review-) - A collection of resources for code review practices.
- [Awesome Certificates](https://github.com/PanXProject/awesome-certificates) - A curated list of IT and developer certifications and learning resources.
- [Awesome Tunneling](https://github.com/anderspitman/awesome-tunneling) - A list of ngrok alternatives and tunneling software.
- [Anomaly Detection Resources](https://github.com/yzhao062/anomaly-detection-resources) - Books, papers, videos, and toolboxes related to anomaly detection.
- [Awesome Claude Prompts](https://github.com/langgptai/awesome-claude-prompts) - Collection of powerful prompts for Anthropic's Claude AI.
- [Awesome Linux](https://github.com/inputsh/awesome-linux) - Curated list of Linux applications, tools, and resources for users and developers.
- [Awesome for Beginners](https://github.com/MunGell/awesome-for-beginners) - List of beginner-friendly projects for contributing to open-source software.
[⬆ back to top](#contents)
---
## Additional Resources and Tools
A wide range of resources and tools designed to facilitate learning, development, and exploration across different domains.
- [UC Berkeley - Data 8](https://github.com/data-8/textbook) - Textbook and course materials for Data 8, UC Berkeley's Foundations of Data Science course.
- [PaddleOCR](https://github.com/PaddlePaddle/PaddleOCR) - Production-ready OCR toolkit with multilingual and document AI support.
- [A collective list of free APIs](https://github.com/public-apis/public-apis) - A comprehensive list of free APIs for various purposes.
- [arXiv.org](https://arxiv.org/) - A free distribution service and open-access archive for scholarly articles.
- [Elicit](https://elicit.com/) - An AI research assistant that helps automate parts of literature review.
- [500+ AI/ML/DL/NLP Projects](https://github.com/ashishpatel26/500-AI-Machine-learning-Deep-learning-Computer-vision-NLP-Projects-with-code) - A massive collection of AI and machine learning projects with code for learning and portfolios.
- [Full Stack FastAPI Template](https://github.com/fastapi/full-stack-fastapi-template) - Full-stack template with FastAPI, React, and PostgreSQL.
- [Kittl](https://www.kittl.com/) - Platform for creating and editing charts and data visualizations.
- [Zasper](https://github.com/zasper-io/zasper) - High-performance IDE for Jupyter Notebooks.
- [Sketch](https://www.sketch.com/) - Design toolkit built around designers' workflows.
- [Growth.Design](https://growth.design/) - A collection of product case studies and behavioral psychology insights for data-driven decision-making.
[⬆ back to top](#contents)
---
## Contributing
**We welcome your contributions!**
See [CONTRIBUTING.md](https://github.com/PavelGrigoryevDS/awesome-data-analysis/blob/main/CONTRIBUTING.md) for how to add resources.
[⬆ back to top](#contents)
---
## License
This work is dedicated to the public domain under the [CC0 1.0 Universal](https://creativecommons.org/publicdomain/zero/1.0/) license.
[⬆ back to top](#contents)