{"id":16234776,"url":"https://github.com/maxwelllzh/python-packages-for-data-geeks","last_synced_at":"2025-03-19T15:30:34.776Z","repository":{"id":129542749,"uuid":"184765586","full_name":"MaxwellLZH/python-packages-for-data-geeks","owner":"MaxwellLZH","description":"A curated list of useful Python packages for data geeks","archived":false,"fork":false,"pushed_at":"2021-10-27T02:16:29.000Z","size":286,"stargazers_count":22,"open_issues_count":0,"forks_count":5,"subscribers_count":1,"default_branch":"master","last_synced_at":"2025-03-17T08:38:15.436Z","etag":null,"topics":["machine-learning","python","visualization"],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/MaxwellLZH.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null}},"created_at":"2019-05-03T14:04:26.000Z","updated_at":"2024-11-07T06:52:12.000Z","dependencies_parsed_at":"2023-05-26T16:00:27.717Z","dependency_job_id":null,"html_url":"https://github.com/MaxwellLZH/python-packages-for-data-geeks","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/MaxwellLZH%2Fpython-packages-for-data-geeks","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/MaxwellLZH%2Fpython-packages-for-data-geeks/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/MaxwellLZH%2Fpython-packages-for-data-geeks/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/MaxwellLZH%2Fpython-packages-for-data-geeks/manifests","owner_url":"https://repos.e
cosyste.ms/api/v1/hosts/GitHub/owners/MaxwellLZH","download_url":"https://codeload.github.com/MaxwellLZH/python-packages-for-data-geeks/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":244453715,"owners_count":20455267,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["machine-learning","python","visualization"],"created_at":"2024-10-10T13:17:03.160Z","updated_at":"2025-03-19T15:30:32.146Z","avatar_url":"https://github.com/MaxwellLZH.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# A curated list of nice packages for us data people\n\n\n\n## Time Series  \nname|owner|stars|description\n---|---|---|---\n[**AnomalyDetection**](https://github.com/twitter/AnomalyDetection)|twitter|3.1k|Anomaly detection with r\n[**stumpy**](https://github.com/TDAmeritrade/stumpy)|TDAmeritrade|879|Stumpy is a powerful and scalable python library that can be used for a variety of time series data mining tasks\n[**gluon-ts**](https://github.com/awslabs/gluon-ts)|awslabs|765|Gluonts - probabilistic time series modeling in python\n[**RobustSTL**](https://github.com/LeeDoYup/RobustSTL)|LeeDoYup|120|Unofficial implementation of robuststl: a robust seasonal-trend decomposition algorithm for long time series (aaai 2019)\n \n\n\n## Feature Engineering  \nname|owner|stars|description\n---|---|---|---\n[**featuretools**](https://github.com/FeatureLabs/featuretools)|FeatureLabs|4.8k|An open source python library for automated feature 
engineering\n[**Augly**](https://github.com/facebookresearch/AugLy)|facebookresearch|3.3k|A data augmentations library for audio, image, text, and video.\n[**great_expectations**](https://github.com/great-expectations/great_expectations)|great-expectations|2.7k|Always know what to expect from your data.\n[**categorical-encoders**](https://github.com/scikit-learn-contrib/categorical-encoding)|scikit-learn-contrib|1.1k|A library of sklearn compatible categorical variable encoders\n[**fancy-impute**](https://github.com/iskandr/fancyimpute)|iskandr|735|Multivariate imputation and matrix completion algorithms implemented in python\n[**dirty-cat**](https://github.com/dirty-cat/dirty_cat/)|dirty-cat|158|Encoding methods for dirty categorical variables\n \n\n\n## Pandas Extensions  \nname|owner|stars|description\n---|---|---|---\n[**pandas-profiling**](https://github.com/pandas-profiling/pandas-profiling)|pandas-profiling|5.9k|Create html profiling reports from pandas dataframe objects\n[**pdpipe**](https://github.com/pdpipe/pdpipe)|pdpipe|557|Easy pipelines for pandas dataframes.\n[**pydqc**](https://github.com/SauceCat/pydqc)|SauceCat|211|Python automatic data quality check toolkit\n[**pandas_flavor**](https://github.com/Zsailer/pandas_flavor)|Zsailer|186|The easy way to write your own flavor of pandas\n[**pandas-log**](https://github.com/eyaltrabelsi/pandas-log)|eyaltrabelsi|154|The goal of pandas-log is to provide feedback about basic pandas operations. 
it provides simple wrapper functions for the most common functions that add additional logs\n \n\n\n## Feature Selection  \nname|owner|stars|description\n---|---|---|---\n[**scikit-features**](https://github.com/jundongl/scikit-feature)|jundongl|845|Open-source feature selection repository in python\n[**boruta**](https://github.com/scikit-learn-contrib/boruta_py)|scikit-learn-contrib|615|Python implementations of the boruta all-relevant feature selection method.\n[**ppscore**](https://github.com/8080labs/ppscore.git)|8080labs|321|Predictive power score (pps) in python\n[**minepy**](https://github.com/minepy/minepy)|minepy|114|Minepy - maximal information-based nonparametric exploration\n[**stability-selection**](https://github.com/scikit-learn-contrib/stability-selection)|scikit-learn-contrib|94|Scikit-learn compatible implementation of stability selection.\n \n\n\n## Model Tuning  \nname|owner|stars|description\n---|---|---|---\n[**mlflow**](https://github.com/mlflow/mlflow)|mlflow|5.3k|Open source platform for the machine learning lifecycle\n[**nni**](https://github.com/microsoft/nni)|microsoft|4.5k|An open source automl toolkit for neural architecture search, model compression and hyper-parameter tuning.\n[**metaflow**](https://github.com/Netflix/metaflow)|Netflix|2k|Build and manage real-life data science projects with ease.\n[**skopt**](https://github.com/scikit-optimize/scikit-optimize)|scikit-optimize|1.6k|Sequential model-based optimization with a `scipy.optimize` interface\n[**optuna**](https://github.com/optuna/optuna)|optuna|1.5k|A hyperparameter optimization framework\n \n\n\n## AutoML  \nname|owner|stars|description\n---|---|---|---\n[**jina**](https://github.com/jina-ai/jina)|jina-ai|7.6k|Cloud-native neural search framework for 𝙖𝙣𝙮 kind of data\n[**autokeras**](https://github.com/keras-team/autokeras)|keras-team|7k|An automl system based on keras\n[**tpot**](https://github.com/EpistasisLab/tpot)|EpistasisLab|6.5k|A python automated machine learning 
tool that optimizes machine learning pipelines using genetic programming.\n[**auto-scikitlearn**](https://github.com/automl/auto-sklearn)|automl|4.1k|Automated machine learning with scikit-learn\n[**darts**](https://github.com/quark0/darts)|quark0|2.8k|Differentiable architecture search for convolutional and recurrent networks\n \n\n\n## Dimension Reduction  \nname|owner|stars|description\n---|---|---|---\n[**umap**](https://github.com/lmcinnes/umap)|lmcinnes|3.4k|Uniform manifold approximation and projection\n[**star-clustering**](https://github.com/josephius/star-clustering)|josephius|83|A clustering algorithm that automatically determines the number of clusters and works without hyperparameter fine-tuning.\n \n\n\n## Machine Learning  \nname|owner|stars|description\n---|---|---|---\n[**pattern**](https://github.com/clips/pattern)|clips|7.2k|Web mining module for python, with tools for scraping, natural language processing, machine learning, network analysis and visualization.\n[**VowpalWabbit**](https://github.com/VowpalWabbit/vowpal_wabbit)|VowpalWabbit|6.7k|Vowpal wabbit is a machine learning system which pushes the frontier of machine learning with techniques such as online, hashing, allreduce, reductions, learning2search, active, and interactive learning.\n[**xlearn**](https://github.com/aksnzhy/xlearn)|aksnzhy|2.6k|High performance, easy-to-use, and scalable machine learning (ml) package, including linear model (lr), factorization machines (fm), and field-aware factorization machines (ffm) for python and cli interface.\n[**lightning**](https://github.com/scikit-learn-contrib/lightning)|scikit-learn-contrib|1.3k|Large-scale linear classification, regression and ranking in python\n[**Metrics**](https://github.com/benhamner/Metrics)|benhamner|1.2k|Machine learning evaluation metrics, implemented in python, r, haskell, and matlab / octave\n[**mlens**](https://github.com/flennerhag/mlens)|flennerhag|553|Ml-ensemble – high performance ensemble 
learning\n[**NGBoost**](https://github.com/stanfordmlgroup/ngboost.git)|stanfordmlgroup|335|Natural gradient boosting for probabilistic prediction\n[**polylearn**](https://github.com/scikit-learn-contrib/polylearn)|scikit-learn-contrib|191|A library for factorization machines and polynomial networks for classification and regression in python.\n \n\n\n## Bayesian Statistics  \nname|owner|stars|description\n---|---|---|---\n[**pyro**](https://github.com/pyro-ppl/pyro)|pyro-ppl|6.2k|Deep universal probabilistic programming with python and pytorch\n[**pymc**](https://github.com/pymc-devs/pymc3)|pymc-devs|4.6k|Probabilistic programming in python: bayesian modeling and probabilistic machine learning with theano\n[**Edward**](https://github.com/blei-lab/edward)|blei-lab|4.6k|A probabilistic programming language in tensorflow. deep generative models, variational inference.\n \n\n\n## Deep Learning  \nname|owner|stars|description\n---|---|---|---\n[**Autograd**](https://github.com/HIPS/autograd)|HIPS|4.4k|Efficiently computes derivatives of numpy code.\n[**RAdam**](https://github.com/LiyuanLucasLiu/RAdam)|LiyuanLucasLiu|1.8k|On the variance of the adaptive learning rate and beyond\n[**einops**](https://github.com/arogozhnikov/einops)|arogozhnikov|1.6k|Deep learning operations reinvented (for pytorch, tensorflow, chainer, gluon and others)\n[**Pytorch Metric Learning**](https://github.com/KevinMusgrave/pytorch-metric-learning)|KevinMusgrave|1.3k|The easiest way to use deep metric learning in your application. modular, flexible, and extensible. 
written in pytorch.\n \n\n\n## Model Training  \nname|owner|stars|description\n---|---|---|---\n[**horovod**](https://github.com/horovod/horovod)|horovod|11.8k|Distributed training framework for tensorflow, keras, pytorch, and apache mxnet.\n[**tfx**](https://github.com/tensorflow/tfx)|tensorflow|1.2k|Tfx is an end-to-end platform for deploying production ml pipelines\n \n\n\n## Distributed  \nname|owner|stars|description\n---|---|---|---\n[**ray**](https://github.com/ray-project/ray)|ray-project|13.3k|An open source framework that provides a simple, universal api for building distributed applications. ray is packaged with rllib, a scalable reinforcement learning library, and tune, a scalable hyperparameter tuning library.\n[**dask**](https://github.com/dask/dask)|dask|7.5k|Parallel computing with task scheduling\n \n\n\n## Federated Learning  \nname|owner|stars|description\n---|---|---|---\n[**FATE**](https://github.com/WeBankFinTech/FATE)|FederatedAI|1.1k|An industrial level federated learning framework\n \n\n\n## Confident Learning  \nname|owner|stars|description\n---|---|---|---\n[**cleanlab**](https://github.com/cgnorthcutt/cleanlab)|cgnorthcutt|1.2k|Find label errors in datasets, weak supervision, and learning with noisy labels.\n \n\n\n## Causal Inference  \nname|owner|stars|description\n---|---|---|---\n[**Edward**](https://github.com/blei-lab/edward)|blei-lab|4.6k|A probabilistic programming language in tensorflow. deep generative models, variational inference.\n[**dowhy**](https://github.com/microsoft/dowhy)|microsoft|2.3k|Dowhy is a python library for causal inference that supports explicit modeling and testing of causal assumptions. 
dowhy is based on a unified language for causal inference, combining causal graphical models and potential outcomes frameworks.\n[**CausalML**](https://github.com/uber/causalml)|uber|2.1k|Uplift modeling and causal inference with machine learning algorithms\n[**EconML**](https://github.com/microsoft/EconML)|microsoft|943|Alice (automated learning and intelligence for causation and economics) is a microsoft research project aimed at applying artificial intelligence concepts to economic decision making. one of its goals is to build a toolkit that combines state-of-the-art machine learning techniques with econometrics in order to bring automation to complex causal …\n \n\n\n## NLP Preprocessing  \nname|owner|stars|description\n---|---|---|---\n[**jieba**](https://github.com/fxsjy/jieba)|fxsjy|23k|结巴中文分词\n[**HanLP**](https://github.com/hankcs/HanLP)|hankcs|19.4k|Natural language processing for the next decade. tokenization, part-of-speech tagging, named entity recognition, syntactic \u0026 semantic dependency parsing, document classification\n[**datasets**](https://github.com/huggingface/datasets)|huggingface|8.6k|🤗 the largest hub of ready-to-use nlp datasets for ml models with fast, easy-to-use and efficient data manipulation tools\n[**Chinese Word Embeddings**](https://github.com/Embedding/Chinese-Word-Vectors)|Embedding|6.6k|100+ chinese word vectors 上百种预训练中文词向量\n[**sentencepiece**](https://github.com/google/sentencepiece)|google|3.3k|Unsupervised text tokenizer for neural network-based text generation.\n[**ckiptagger**](https://github.com/ckiplab/ckiptagger)|ckiplab|1.1k|Ckip neural chinese word segmentation, pos tagging, and ner\n[**jiagu**](https://github.com/ownthink/Jiagu)|ownthink|1k|Jiagu深度学习自然语言处理工具 知识图谱关系抽取 中文分词 词性标注 命名实体识别 情感分析 新词发现 关键词 文本摘要 文本聚类\n[**TextAttack**](https://github.com/QData/TextAttack)|QData|902|Textattack 🐙 is a python framework for adversarial attacks, data augmentation, and model training in 
nlp\n[**MiNLP**](https://github.com/XiaoMi/MiNLP)|XiaoMi|494|Xiaomi natural language processing toolkits\n[**fastHan**](https://github.com/fastnlp/fastHan)|fastnlp|163|Fasthan是基于fastnlp与pytorch实现的中文自然语言处理工具，像spacy一样调用方便。\n \n\n\n## NLP Models  \nname|owner|stars|description\n---|---|---|---\n[**pytorch-transformers**](https://github.com/huggingface/pytorch-transformers)|huggingface|17.8k|🤗 transformers: state-of-the-art natural language processing for tensorflow 2.0 and pytorch.\n[**xlnet**](https://github.com/zihangdai/xlnet)|zihangdai|5.3k|Xlnet: generalized autoregressive pretraining for language understanding\n[**MatchZoo**](https://github.com/NTMC-Community/MatchZoo)|NTMC-Community|3.2k|Facilitating the design, comparison and sharing of deep text matching models.\n[**GPT2-Chinese**](https://github.com/Morizeyao/GPT2-Chinese)|Morizeyao|2.4k|Chinese version of gpt2 training code, using bert tokenizer.\n[**ALBERT**](https://github.com/brightmart/albert_zh)|brightmart|1.7k|A lite bert for self-supervised learning of language representations, 海量中文预训练albert模型\n[**bertforkeras**](https://github.com/bojone/bert4keras)|bojone|1.2k|Light reimplement of bert for keras\n[**AliceMind**](https://github.com/alibaba/AliceMind)|alibaba|820|Alibaba's collection of encoder-decoders from mind (machine intelligence of damo) lab\n[**FinBert**](https://github.com/valuesimplex/FinBERT)|valuesimplex|356|\n[**gensen**](https://github.com/Maluuba/gensen)|Maluuba|284|Learning general purpose distributed sentence representations via large scale multi-task learning\n \n\n\n## Representation Learning  \nname|owner|stars|description\n---|---|---|---\n[**sentence-transformers**](https://github.com/UKPLab/sentence-transformers)|UKPLab|3k|Sentence embeddings with bert \u0026 xlnet\n[**top2vec**](https://github.com/ddangelov/Top2Vec)|ddangelov|489|Top2vec learns jointly embedded topic, document and word vectors.\n[**glyce embedding**](https://github.com/ShannonAI/glyce)|ShannonAI|238|Code for 
neurips 2019 - glyce: glyph-vectors for chinese character representations\n \n\n\n## Image Processing  \nname|owner|stars|description\n---|---|---|---\n[**imgaug**](https://github.com/aleju/imgaug)|aleju|9k|Image augmentation for machine learning experiments.\n[**albumentations**](https://github.com/albumentations-team/albumentations)|albumentations-team|7.2k|Fast image augmentation library and easy to use wrapper around other libraries. documentation: https://albumentations.ai/docs/ paper about library: https://www.mdpi.com/2078-2489/11/2/125\n[**imagededup**](https://github.com/idealo/imagededup)|idealo|2.7k|😎 finding duplicate images made easy!\n[**imutils**](https://github.com/jrosebr1/imutils)|jrosebr1|2.6k|A series of convenience functions to make basic image processing operations such as translation, rotation, resizing, skeletonization, and displaying matplotlib images easier with opencv and python.\n \n\n\n## Object Detection  \nname|owner|stars|description\n---|---|---|---\n[**mmdetection**](https://github.com/open-mmlab/mmdetection)|open-mmlab|7.4k|Open mmlab detection toolbox and benchmark\n[**keras-YOLO3**](https://github.com/qqwweee/keras-yolo3)|qqwweee|5.8k|A keras implementation of yolov3 (tensorflow backend)\n[**Light Facial Detection**](https://github.com/Linzaer/Ultra-Light-Fast-Generic-Face-Detector-1MB)|Linzaer|3.9k|💎1mb lightweight face detection model (1mb轻量级人脸检测模型)\n[**SSD-Tensorflow**](https://github.com/balancap/SSD-Tensorflow)|balancap|3.8k|Single shot multibox detector in tensorflow\n[**detr**](https://github.com/facebookresearch/detr)|facebookresearch|3.4k|End-to-end object detection with transformers\n[**FastMaskRCNN**](https://github.com/CharlesShang/FastMaskRCNN)|CharlesShang|3k|Mask rcnn in tensorflow\n[**u2net**](https://github.com/NathanUA/U-2-Net)|NathanUA|873|\"The code for our newly accepted paper in pattern recognition 2020: \"\"u^2-net: going deeper with nested u-structure for salient object 
detection.\"\"\"\n[**TFace**](https://github.com/Tencent/TFace)|Tencent|454|A trusty face recognition research platform developed by tencent youtu lab\n \n\n\n## OCR  \nname|owner|stars|description\n---|---|---|---\n[**easyOCR**](https://github.com/JaidedAI/EasyOCR)|JaidedAI|8.3k|Ready-to-use ocr with 40+ languages supported including chinese, japanese, korean and thai\n[**chineseocr-lite**](https://github.com/ouyanghuiyu/chineseocr_lite)|ouyanghuiyu|5.7k|超轻量级中文ocr，支持竖排文字识别, 支持ncnn推理 ( dbnet(1.8m) + crnn(2.5m) + anglenet(378kb)) 总模型仅4.7m\n[**InvoiceNet**](https://github.com/naiveHobo/InvoiceNet)|naiveHobo|1.5k|Deep neural network to extract intelligent information from invoice documents.\n \n\n\n## Recommendation  \nname|owner|stars|description\n---|---|---|---\n[**recommenders**](https://github.com/microsoft/recommenders)|microsoft|6.6k|Best practices on recommendation systems\n[**DeepCTR**](https://github.com/shenweichen/DeepCTR)|shenweichen|3.3k|Easy-to-use,modular and extendible package of deep-learning based ctr models.\n[**DeepFM**](https://github.com/ChenglongChen/tensorflow-DeepFM)|ChenglongChen|1.5k|Tensorflow implementation of deepfm for ctr prediction.\n[**neural-collaborative-filtering**](https://github.com/hexiangnan/neural_collaborative_filtering)|hexiangnan|988|Neural collaborative filtering\n[**deepmatch**](https://github.com/shenweichen/DeepMatch)|shenweichen|781|A deep matching model library for recommendations \u0026 advertising. 
it's easy to train models and to export representation vectors which can be used for ann search.\n[**xDeepFM**](https://github.com/Leavingseason/xDeepFM)|Leavingseason|656|\n \n\n\n## Outlier Detection  \nname|owner|stars|description\n---|---|---|---\n[**alibi-detect**](https://github.com/SeldonIO/alibi-detect)|SeldonIO|206|Algorithms for outlier and adversarial instance detection, concept drift and metrics.\n \n\n\n## Graph  \nname|owner|stars|description\n---|---|---|---\n[**graph_nets**](https://github.com/deepmind/graph_nets)|deepmind|3.9k|Build graph nets in tensorflow\n[**dgl**](https://github.com/dmlc/dgl)|dmlc|3.4k|Python package built to ease deep learning on graph, on top of existing dl frameworks.\n[**graphSAGE**](https://github.com/williamleif/GraphSAGE)|williamleif|1.9k|Representation learning on large graphs using stochastic graph convolutions.\n[**SNAP**](https://github.com/snap-stanford/snap)|snap-stanford|1.5k|Stanford network analysis platform (snap) is a general purpose network analysis and graph mining library.\n[**stellargraph**](https://github.com/stellargraph/stellargraph)|stellargraph|1.2k|Stellargraph - machine learning on graphs\n[**plato**](https://github.com/Tencent/plato)|Tencent|874|腾讯高性能分布式图计算框架plato\n[**spektral**](https://github.com/danielegrattarola/spektral)|danielegrattarola|810|Graph neural networks with keras and tensorflow 2.\n[**simple-graph**](https://github.com/dpapathanasiou/simple-graph)|dpapathanasiou|499|\"This is a simple graph database in sqlite, inspired by \"\"sqlite as a document database\"\"\"\n \n\n\n## Searching  \nname|owner|stars|description\n---|---|---|---\n[**faiss**](https://github.com/facebookresearch/faiss)|facebookresearch|9.7k|A library for efficient similarity search and clustering of dense vectors.\n[**annoy**](https://github.com/spotify/annoy)|spotify|6.7k|Approximate nearest neighbors in c++/python optimized for memory usage and loading/saving to 
disk\n[**haystack**](https://github.com/deepset-ai/haystack)|deepset-ai|2.1k|🔍 end-to-end python framework for building natural language search interfaces to data. leverages transformers and the state-of-the-art of nlp. supports dpr, elasticsearch, hugging face’s hub, and much more!\n \n\n\n## Adversarial Learning  \nname|owner|stars|description\n---|---|---|---\n[**pytorch-CycleGAN-and-pix2pix**](https://github.com/junyanz/pytorch-CycleGAN-and-pix2pix)|junyanz|13.1k|Image-to-image translation in pytorch\n[**CycleGAN**](https://github.com/junyanz/CycleGAN)|junyanz|9.7k|Software that can generate photos from paintings, turn horses into zebras, perform style transfer, and more.\n[**GANHacks**](https://github.com/soumith/ganhacks)|soumith|8.4k|\"Starter from \"\"how to train a gan?\"\" at nips2016\"\n[**pix2pix**](https://github.com/phillipi/pix2pix)|phillipi|7.7k|Image-to-image translation with conditional adversarial nets\n[**DCGAN**](https://github.com/carpedm20/DCGAN-tensorflow)|carpedm20|6.5k|\"A tensorflow implementation of \"\"deep convolutional generative adversarial networks\"\"\"\n[**ALAE**](https://github.com/podgorskiy/ALAE)|podgorskiy|1.8k|[cvpr2020] adversarial latent autoencoders\n[**DoppelGANger**](https://github.com/fjxmlzn/DoppelGANger)|fjxmlzn|45|Using gans for sharing networked time series data: challenges, initial promise, and open questions, imc 2020\n \n\n\n## Model Interpretation  \nname|owner|stars|description\n---|---|---|---\n[**SHAP**](https://github.com/slundberg/shap)|slundberg|7.2k|A unified approach to explain the output of any machine learning model.\n[**LIME**](https://github.com/marcotcr/lime)|marcotcr|6.8k|Lime: explaining the predictions of any machine learning classifier\n[**Tensorwatch**](https://github.com/microsoft/tensorwatch)|microsoft|2.5k|Debugging, monitoring and visualization for python machine learning and data science\n[**eli5**](https://github.com/TeamHG-Memex/eli5)|TeamHG-Memex|1.8k|A library for debugging/inspecting 
machine learning classifiers and explaining their predictions\n[**PDPBox**](https://github.com/SauceCat/PDPbox)|SauceCat|382|Python partial dependence plot toolbox\n \n\n\n## Visualization  \nname|owner|stars|description\n---|---|---|---\n[**Dash**](https://github.com/plotly/dash)|plotly|10.8k|Analytical web apps for python \u0026 r. no javascript required.\n[**prettymaps**](https://github.com/marceloprates/prettymaps)|marceloprates|7.2k|A small set of python functions to draw pretty maps from openstreetmap data. based on osmnx, matplotlib and shapely libraries.\n[**Seaborn**](https://github.com/mwaskom/seaborn)|mwaskom|6.6k|Statistical data visualization using matplotlib\n[**Plotly**](https://github.com/plotly/plotly.py)|plotly|5.8k|An open-source, interactive graphing library for python (includes plotly express) ✨\n[**streamlit**](https://github.com/streamlit/streamlit)|streamlit|5.4k|Streamlit — the fastest way to build custom ml tools\n[**folium**](https://github.com/python-visualization/folium)|python-visualization|4.3k|Python data. leaflet.js maps.\n[**altair**](https://github.com/altair-viz/altair)|altair-viz|4.3k|Declarative statistical visualization library for python\n[**dash sample apps**](https://github.com/plotly/dash-sample-apps)|plotly|2k|Open-source demos hosted on dash gallery\n[**scikit-plot**](https://github.com/reiinakano/scikit-plot)|reiinakano|1.7k|An intuitive library to add plotting functionality to scikit-learn objects.\n[**CNN-Visualizer**](https://github.com/poloclub/cnn-explainer)|poloclub|1.7k|Learning convolutional neural networks with interactive visualization. 
https://poloclub.github.io/cnn-explainer/\n \n\n\n## Development Toolkit  \nname|owner|stars|description\n---|---|---|---\n[**free-apis**](https://github.com/public-apis/public-apis#geocoding)|public-apis|65.9k|A collective list of free apis for use in software and web development.\n[**bash-bible**](https://github.com/dylanaraps/pure-bash-bible#strip-pattern-from-start-of-string)|dylanaraps|23.3k|📖 a collection of pure bash alternatives to external processes.\n[**python-fire**](https://github.com/google/python-fire)|google|15.8k|Python fire is a library for automatically generating command line interfaces (clis) from absolutely any python object.\n[**black**](https://github.com/psf/black)|psf|13.7k|The uncompromising python code formatter\n[**PySnooper**](https://github.com/cool-RR/PySnooper)|cool-RR|12.9k|Never use print for debugging again\n[**poetry**](https://github.com/sdispater/poetry)|sdispater|7.1k|Python dependency management and packaging made easy.\n[**free api**](https://github.com/fangzesheng/free-api)|fangzesheng|6.5k|收集免费的接口服务,做一个api的搬运工\n[**fastapi**](https://github.com/tiangolo/fastapi)|tiangolo|6.5k|Fastapi framework, high performance, easy to learn, fast to code, ready for production\n[**playwright-python**](https://github.com/microsoft/playwright-python)|microsoft|4.9k|Python version of the playwright testing and automation library.\n[**hypothesis**](https://github.com/HypothesisWorks/hypothesis)|HypothesisWorks|4k|Hypothesis is a powerful, flexible, and easy to use library for property-based testing.\n[**modin**](https://github.com/modin-project/modin)|modin-project|3.6k|Modin: speed up your pandas workflows by changing a single line of code\n[**pyautogui**](https://github.com/asweigart/pyautogui)|asweigart|3.2k|A cross-platform gui automation python module for human beings. 
used to programmatically control the mouse \u0026 keyboard.\n[**jupytext**](https://github.com/mwouts/jupytext)|mwouts|3k|Jupyter notebooks as markdown documents, julia, python or r scripts\n[**papermill**](https://github.com/nteract/papermill/)|nteract|2.7k|📚 parameterize, execute, and analyze notebooks\n[**handcalcs**](https://github.com/connorferster/handcalcs)|connorferster|2.3k|Python library for converting python calculations into rendered latex.\n[**lark**](https://github.com/lark-parser/lark)|lark-parser|2k|Lark is a parsing toolkit for python, built with a focus on ergonomics, performance and modularity.\n[**sqlfluff**](https://github.com/sqlfluff/sqlfluff)|sqlfluff|1.8k|A sql linter and auto-formatter for humans\n[**handout**](https://github.com/danijar/handout)|danijar|1.8k|Turn python scripts into handouts with markdown and figures\n[**urwid**](https://github.com/urwid/urwid)|urwid|1.7k|Console user interface library for python (official repo)\n[**more-itertools**](https://github.com/more-itertools/more-itertools)|more-itertools|1.5k|More routines for operating on iterables, beyond itertools\n[**xarray**](https://github.com/pydata/xarray)|pydata|1.5k|N-d labeled arrays and datasets in python\n[**icecream - debugging**](https://github.com/gruns/icecream)|gruns|1.4k|🍦 sweet and creamy print debugging.\n[**pygooglenews**](https://github.com/kotartemiy/pygooglenews)|kotartemiy|816|If google news had a python library\n[**bottleneck**](https://github.com/pydata/bottleneck)|pydata|540|Fast numpy array functions written in c\n[**wily**](https://github.com/tonybaloney/wily)|tonybaloney|445|A python application for tracking, reporting on timing and complexity in python code\n \n\n\n## Tutorial  \nname|owner|stars|description\n---|---|---|---\n[**Python 100 days**](https://github.com/jackfrued/Python-100-Days)|jackfrued|70.8k|Python - 100天从新手到大师\n[**Command line tutorial in one page**](https://github.com/jlevy/the-art-of-command-line)|jlevy|66.5k|Master the 
command line, in one page\n[**Deep Learning 500 Questions**](https://github.com/scutan90/DeepLearning-500-questions)|scutan90|35.3k|深度学习500问，以问答形式对常用的概率知识、线性代数、机器学习、深度学习、计算机视觉等热点问题进行阐述，以帮助自己及有需要的读者。 全书分为18个章节，50余万字。由于水平有限，书中不妥之处恳请广大读者批评指正。 未完待续............ 如有意合作，联系scutjy2015@163.com 版权所有，违权必究 tan 2018.06\n[**Learn Regex**](https://github.com/ziishaned/learn-regex)|ziishaned|31.9k|Learn regex the easy way\n[**500 lines or less**](https://github.com/aosabook/500lines)|aosabook|23.9k|500 lines or less\n[**Data Science Tutorial notebook**](https://github.com/donnemartin/data-science-ipython-notebooks)|donnemartin|17.7k|Data science python notebooks: deep learning (tensorflow, theano, caffe, keras), scikit-learn, kaggle, big data (spark, hadoop mapreduce, hdfs), matplotlib, pandas, numpy, scipy, python essentials, aws, and various command lines.\n[**Awesome tensorflow**](https://github.com/jtoy/awesome-tensorflow)|jtoy|15.3k|Tensorflow - a curated list of dedicated resources http://tensorflow.org\n[**NLP progress**](https://github.com/sebastianruder/NLP-progress)|sebastianruder|13k|Repository to track the progress in natural language processing (nlp), including the datasets and the current state-of-the-art for the most common nlp tasks.\n[**《神经网络与深度学习》- 邱锡鹏**](https://github.com/nndl/nndl.github.io)|nndl|12k|《神经网络与深度学习》 邱锡鹏著 neural network and deep learning\n[**wtfpython-cn**](https://github.com/leisurelicht/wtfpython-cn)|leisurelicht|9.2k|Wtfpython的中文翻译/施工结束/ 能力有限，欢迎帮我改进翻译\n[**object-detection-papers**](https://github.com/hoya012/deep_learning_object_detection)|hoya012|8.1k|A paper list of object detection using deep learning.\n[**MLAlgorithms**](https://github.com/rushter/MLAlgorithms)|rushter|7.8k|Minimal and clean examples of machine learning algorithms implementations\n[**numpy-ml**](https://github.com/ddbourgin/numpy-ml)|ddbourgin|7.8k|Machine learning, in 
numpy\n[**Reinforcement-learning-introduction**](https://github.com/ShangtongZhang/reinforcement-learning-an-introduction)|ShangtongZhang|7.7k|Python implementation of reinforcement learning: an introduction\n[**deep learning drizzle**](https://github.com/kmario23/deep-learning-drizzle)|kmario23|7.1k|Drench yourself in deep learning, reinforcement learning, machine learning, computer vision, and nlp by learning from these exciting lectures!!\n[**Google Research**](https://github.com/google-research/google-research)|google-research|6k|Google ai research\n[**GNN Papers**](https://github.com/thunlp/GNNPapers)|thunlp|5.7k|Must-read papers on graph neural networks (gnn)\n[**minGPT**](https://github.com/karpathy/minGPT)|karpathy|5.3k|A minimal pytorch re-implementation of the openai gpt (generative pretrained transformer) training\n[**UGATIT**](https://github.com/taki0112/UGATIT)|taki0112|4.4k|Official tensorflow implementation of u-gat-it: unsupervised generative attentional networks with adaptive layer-instance normalization for image-to-image translation\n[**tensorflow2_tutorials_chinese**](https://github.com/czy36mengfei/tensorflow2_tutorials_chinese)|czy36mengfei|4k|Tensorflow2中文教程，持续更新(当前版本:tensorflow2.0)，tag: tensorflow 2.0 tutorials\n[**Tensorflow-2.x-Tutorials**](https://github.com/dragen1860/TensorFlow-2.x-Tutorials)|dragen1860|3.7k|Tensorflow 2.x version's tutorials and examples, including cnn, rnn, gan, auto-encoders, fasterrcnn, gpt, bert examples, etc. tf 2.0版入门实例代码，实战教程。\n[**Machine Learning Notes from Prof. 
Yida Xu**](https://github.com/roboticcam/machine-learning-notes)|roboticcam|3.5k|My continuously updated machine learning, probabilistic models and deep learning notes and demos (1500+ slides), with video links\n[**Awesome graph classification**](https://github.com/benedekrozemberczki/awesome-graph-classification)|benedekrozemberczki|2.5k|A collection of important graph embedding, classification and representation learning papers with implementations.\n[**NLP-Beginner**](https://github.com/FudanNLP/nlp-beginner)|FudanNLP|2.3k|Getting-started tutorial for NLP\n[**openNRE**](https://github.com/thunlp/OpenNRE)|thunlp|2k|An open-source package for neural relation extraction (nre)\n[**Microsoft NLP examples**](https://github.com/microsoft/nlp)|microsoft|1.9k|Natural language processing best practices \u0026 examples\n[**anomaly detection**](https://github.com/yzhao062/anomaly-detection-resources)|yzhao062|1.9k|Anomaly detection related books, papers, videos, and toolboxes\n[**Stanford Natural Language Understanding Course**](https://github.com/cgpotts/cs224u)|cgpotts|727|Code for stanford cs224u\n[**Generative Models in TF2**](https://github.com/timsainb/tensorflow2-generative-models)|timsainb|690|Implementations of a number of generative models in tensorflow 2. gan, vae, seq2seq, vaegan, gaia, spectrogram inversion. 
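Everything is self contained in a jupyter notebook for easy export to colab.\n[**Dimensional reduction algos**](https://github.com/heucoder/dimensionality_reduction_alo_codes)|heucoder|553|Python implementations of dimensionality reduction algorithms such as PCA, LDA, MDS, LLE, and t-SNE\n[**Generative Deep Learning**](https://github.com/davidADSP/GDL_code)|davidADSP|491|The official code repository for examples in the o'reilly book 'generative deep learning'\n[**reinforcement learning**](https://github.com/dalmia/David-Silver-Reinforcement-learning)|dalmia|417|Notes for the reinforcement learning course by david silver along with implementation of various algorithms.\n[**Keras Text classification**](https://github.com/yongzhuo/Keras-TextClassification)|yongzhuo|277|Chinese long-text classification, short-sentence classification, multi-label classification, and sentence-pair similarity (chinese text classification of keras nlp, multi-label classify, or sentence classify, long or short); base classes for building character, word, and sentence embedding layers (embeddings) and network layers (graph); fasttext, textcnn, charcnn, textrnn, rcnn, dcnn, dpcnn, vdcnn, crnn, bert, xlnet, albert, attention, deepmoji, han, capsule network (capsulenet), transformer-encode, seq2seq, ent, dmn,\n[**Graph neural network implementation by Microsoft**](https://github.com/microsoft/tf-gnn-samples)|microsoft|161|Tensorflow implementations of graph neural networks\n \n\n\n## Fun Stuff  \nname|owner|stars|description\n---|---|---|---\n[**funNLP**](https://github.com/fighting41love/funNLP)|fighting41love|14.9k|Chinese and English sensitive-word lists, language detection, region and carrier lookup for Chinese and international mobile/phone numbers, gender inference from names, mobile number extraction, ID card number extraction, email extraction, Chinese and Japanese name databases, Chinese abbreviation database, character decomposition dictionary, word sentiment values, stop words, reactionary word list, violence and terrorism word list, traditional/simplified Chinese conversion, English approximation of Chinese pronunciation, Wang Feng lyrics generator, job title lexicon, synonym lexicon, antonym lexicon, negation lexicon, car brand lexicon, car parts lexicon, segmentation of unspaced English text, various Chinese word vectors, company name collection, classical poetry database, IT lexicon, finance lexicon, idiom lexicon, place name lexicon, historical figure lexicon, poetry lexicon, medical lexicon, food lexicon, legal lexicon, automotive lexicon, animal lexicon, Chinese chat corpus, Chinese rumor data, Baidu Chinese question-answering dataset, collection of sentence similarity matching algorithms, bert resources, text generation \u0026 summarization tools, coconlp information extraction tool, regex matching for Chinese domestic phone numbers, Tsinghua University xlore: Chinese-English cross-lingual encyclopedic knowledge graph, Tsinghua University artificial intelligence technology…\n[**tiler**](https://github.com/nuno-faria/tiler)|nuno-faria|3.8k|👷 build images with images\n[**Hacking neural nets**](https://github.com/Kayzaks/HackingNeuralNetworks)|Kayzaks|1.9k|A small course on exploiting and defending neural 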
networks\n[**KnockKnock**](https://github.com/huggingface/knockknock)|huggingface|1.7k|🚪✊knock knock: get notified when your training ends with only two additional lines of code\n[**break-capcha**](https://github.com/zhaipro/easy12306)|zhaipro|1.4k|Automatic recognition of 12306 CAPTCHAs using machine learning algorithms\n[**GNE**](https://github.com/kingname/GeneralNewsExtractor)|kingname|908|General-purpose extractor for the main text of news web pages, alpha version.\n[**pyforest**](https://github.com/8080labs/pyforest)|8080labs|689|Pyforest - feel the bliss of automated imports\n \n\n\n## Trading  \nname|owner|stars|description\n---|---|---|---\n[**zipline**](https://github.com/quantopian/zipline)|quantopian|11.2k|Zipline, a pythonic algorithmic trading library\n[**tensortrade**](https://github.com/tensortrade-org/tensortrade)|tensortrade-org|2k|An open source reinforcement learning framework for training, evaluating, and deploying robust trading agents.\n[**mlfinlab**](https://github.com/hudson-and-thames/mlfinlab)|hudson-and-thames|1.3k|Mlfinlab helps portfolio managers and traders who want to leverage the power of machine learning by providing reproducible, interpretable, and easy to use tools.\n[**tf-quant**](https://github.com/google/tf-quant-finance)|google|773|High-performance tensorflow library for quantitative finance.\n \n\n\n\n\n## Contribution Guide\n\nAdd your favourite packages to `package.json`, and run `package_info.py` to update the page :)","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmaxwelllzh%2Fpython-packages-for-data-geeks","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fmaxwelllzh%2Fpython-packages-for-data-geeks","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmaxwelllzh%2Fpython-packages-for-data-geeks/lists"}