awesome-python-data-science
A curated list of Python libraries used for data science.
https://github.com/thomasjpfan/awesome-python-data-science
Feature Extraction
Time Series
- prophet - Tool for producing high-quality forecasts.
- tsfresh - Automatic extraction of relevant features from time series.
- tslearn - Machine learning toolkit dedicated to time-series data.
- pyts - A Python package for time series transformation and classification.
- luminaire - ML-driven solutions for monitoring time series data.
- NeuralProphet - A neural-network-based time series model, inspired by Facebook Prophet and AR-Net, built on PyTorch.
- sktime - A scikit-learn compatible Python toolbox for learning with time series data.
Machine Learning Frameworks
- XGBoost - Scalable, portable, and distributed gradient boosting.
- scikit-learn - Machine learning.
- statsmodels - Statistical modeling and econometrics.
- SymPy - A computer algebra system.
- dask-ml - Distributed and parallel machine learning.
- imbalanced-learn - Undersampling and oversampling for imbalanced datasets.
- lightning - Large-scale linear models.
- scikit-optimize - Sequential model-based optimization with a `scipy.optimize` interface.
- BayesianOptimization - Global optimization with Gaussian processes.
- gplearn - Genetic Programming.
- python-glmnet - glmnet package for fitting generalized linear models.
- hmmlearn - Hidden Markov Models.
- vecstack - Stacking (machine learning technique).
- deap - Evolutionary computation framework.
- civisml-extensions - scikit-learn-compatible estimators from Civis Analytics.
- hyperopt-sklearn - Hyper-parameter optimization for sklearn.
- scikit-survival - Survival analysis built on top of scikit-learn.
- dstoolbox - Tools that make working with scikit-learn and pandas easier.
- modin - Unify the way you interact with your data.
- pyomo - Python Optimization MOdels.
- BAMBI - BAyesian Model-Building Interface.
- combo - A Python Toolbox for Machine Learning Model Combination.
- fastai - The fast.ai deep learning library, lessons, and tutorials.
- pycaret - Low-code machine learning library in Python.
- river - River is a Python library for online machine learning.
- pyro - Deep universal probabilistic programming with PyTorch.
- PyMC - Probabilistic Programming.
Misc
Outlier Detection
Profiling
- memory_profiler - Monitoring memory usage of a Python program.
- mem_usage_ui - Measuring and graphing memory usage of local processes.
- viztracer - A low-overhead logging/debugging/profiling tool that can trace and visualize your Python code execution.
- py-spy - Sampling profiler for Python programs.
- line_profiler - Line-by-line profiling.
- filprofiler - Fil, a memory profiler designed for data processing applications.
- scalene - High-performance CPU and memory profiler for Python.
- python-flamegraph - Statistical profiler that outputs in a format suitable for FlameGraph.
Python Tools
- devpi - PyPI server and packaging/testing/release tool.
- sacred - Reproduce computational experiments.
- Typer - Build CLIs with type hints.
- neurtu - A Python package for parametric benchmarks.
- pyprojroot - Finding project directories in Python.
- datasette - An open source multi-tool for exploring and publishing data.
- delorean - Time Travel Made Easy.
- pip-tools - Keeps dependencies up to date.
- click - CLI package.
- sacredboard - Dashboard for sacred.
- magic-wormhole - Get things from one computer to another, safely.
Scientific
- Pandas - A library providing high-performance, easy-to-use data structures and data analysis tools.
- Numba - NumPy aware dynamic Python compiler using LLVM.
- blaze - NumPy and Pandas for databases.
- PyDy - Multibody Dynamics.
- nilearn - NeuroImaging.
- patsy - Describing statistical models using symbolic formulas.
- numexpr - Fast numerical array expression evaluator.
- dask - Parallel computing with task scheduling.
- or-tools - Google's Operations Research tools. Classical CS algorithms.
- cvxpy - Python-embedded modeling language for convex optimization problems.
- NumPy - A fundamental package for scientific computing with Python.
- astropy - Astronomy and astrophysics.
Trading
- Clairvoyant - Identify and monitor social/historical cues.
- zipline - Algorithmic Trading Library.
Visualization
- PyGWalker - Turns pandas and polars dataframes into a Tableau-like user interface for visual exploration.
- Great Tables - Absolutely Delightful Table-making in Python.
- diagrams - Diagrams lets you draw the cloud system architecture in Python code.
- bokeh - Interactive web plotting.
- dash - Framework for building interactive analytical web applications.
- altair - Declarative statistical visualization.
- folium - Leaflet.js Maps.
- geoplot - High-level geospatial data visualization.
- mplleaflet - Converts Matplotlib plots into interactive Leaflet web maps.
- matplotlib-venn - Area-weighted venn-diagrams.
- pyLDAvis - Interactive topic model visualization.
- cufflinks - Productivity Tools for Plotly + Pandas.
- scatterText - Visualizations of how language differs among document types.
- plotnine - ggplot for Python.
- mizani - Scales package for graphics.
- PtitPrince - Raincloud plots.
- dtreeviz - Decision tree visualization and model interpretation.
- ipyvolume - 3D plotting for Python in the Jupyter notebook, based on IPython widgets using WebGL.
- matplotlib - 2D plotting.
- seaborn - Visualization library.
- bqplot - Plotting library for IPython/Jupyter Notebooks.