awesome-list
A list of useful stuff in Machine Learning, Computer Graphics, Software Development, ...
https://github.com/johnhany/awesome-list
-
Cross-Platform
-
JavaScript
- Refined GitHub - Browser extension that simplifies the GitHub interface and adds useful features.
- Foam - A personal knowledge management and sharing system for VSCode.
- Notable - The Markdown-based note-taking app that doesn't suck.
- Atom - The hackable text editor.
- Fusuma - Makes slides with Markdown easily.
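- Kilo - A text editor in less than 1000 LOC with syntax highlighting and search.
- lint-md - A command-line tool for checking Chinese Markdown writing conventions. AST-based, easy to integrate into CI, and handy for writing blogs and docs. Also supports API calls.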
- Mailspring - A beautiful, fast and fully open source mail client for Mac, Windows and Linux.
- Google Earth Enterprise - The open source release of Google Earth Enterprise, a geospatial application which provides the ability to build and host custom 3D globes and 2D maps.
- carbon - Create and share beautiful images of your source code.
- vscode-python - Python extension for Visual Studio Code.
- vscode-cpptools - Official repository for the Microsoft C/C++ extension for VS Code.
- code-server - VS Code in the browser.
- Gradle - A build tool with a focus on build automation and support for multi-language development.
- LiteIDE - A simple, open source, cross-platform Go IDE.
- YouCompleteMe - A code-completion engine for Vim.
- readme-md-generator - CLI that generates beautiful README.md files.
- pdfdiff - Command-line tool to inspect the difference between (the text in) two PDF files.
- Rufus - The Reliable USB Formatting Utility.
- projectM - Cross-platform music visualization.
- Syncthing - Open Source Continuous File Synchronization.
- PCSX2 - The PlayStation 2 Emulator.
- PPSSPP - A PSP emulator for Android, Windows, Mac and Linux, written in C++.
- PyBoy - Game Boy emulator written in Python.
- libtorrent - An efficient feature complete C++ bittorrent implementation.
- qBittorrent-Enhanced-Edition - [Unofficial] qBittorrent Enhanced, based on qBittorrent
- trackerslist - Updated list of public BitTorrent trackers.
- TrackersListCollection - A list of popular BitTorrent Trackers.
- bittorrent-tracker - Simple, robust, BitTorrent tracker (client & server) implementation.
- ShareX - A free and open source program that lets you capture or record any area of your screen and share it with a single press of a key.
- Streamlabs Desktop - Free and open source streaming software built on OBS and Electron.
- SwitchHosts - Switch hosts quickly.
- Albert - A fast and flexible keyboard launcher.
- Kindle_download_helper - A script to download all your Kindle books.
- GitHub520 - Makes you "love" GitHub by fixing broken images and slow loading when accessing it.
- Peek - Simple animated GIF screen recorder with an easy to use interface.
- GayHub - An awesome chrome extension for github.
- sindresorhus/awesome - Awesome lists about all kinds of interesting topics.
-
-
Data Format & I/O
-
For C++/C
- glog - C++ implementation of the Google logging module.
- FFmpeg - A collection of libraries and tools to process multimedia content such as audio, video, subtitles and related metadata.
- LAV Filters - Open-Source DirectShow Media Splitter and Decoders.
- OpenEXR - Provides the specification and reference implementation of the EXR file format, the professional-grade image storage format of the motion picture industry.
- spdlog - Fast C++ logging library.
- glogg - A fast, advanced log explorer.
-
For Go
- json-iterator/go - A high-performance 100% compatible drop-in replacement of "encoding/json"
- json-to-go - Translates JSON into a Go type in your browser instantly (original).
-
For Java
- fastjson - A Java library that can be used to convert Java Objects into their JSON representation.
- jackson-core - Core part of Jackson that defines Streaming API as well as basic shared abstractions.
- Okio - A modern I/O library for Android, Java, and Kotlin Multiplatform.
-
For Python
- Imageio - Python library for reading and writing image data.
- Wand - The ctypes-based simple ImageMagick binding for Python.
- VidGear - A High-performance cross-platform Video Processing Python framework powerpacked with unique trailblazing features.
- marshmallow - A lightweight library for converting complex objects to and from simple Python datatypes.
- cloudpickle - Extended pickling support for Python objects.
- dill - Extends Python's pickle module for serializing and de-serializing Python objects to the majority of the built-in Python types.
- UltraJSON - Ultra fast JSON decoder and encoder written in C with Python bindings.
- orjson - Fast, correct Python JSON library supporting dataclasses, datetimes, and numpy
- simplejson - A simple, fast, extensible JSON encoder/decoder for Python.
- jsonschema - An implementation of the JSON Schema specification for Python.
- jsonpickle - Python library for serializing any arbitrary object graph into JSON.
- MessagePack - An efficient binary serialization format. It lets you exchange data among multiple languages like JSON.
- PyYAML - Canonical source repository for PyYAML.
- StrictYAML - Type-safe YAML parser and validator.
- xmltodict - Python module that makes working with XML feel like you are working with JSON.
- csvkit - A suite of utilities for converting to and working with CSV, the king of tabular file formats.
- Tablib - Python Module for Tabular Datasets in XLS, CSV, JSON, YAML, &c.
- HDF5 for Python - The h5py package is a Pythonic interface to the HDF5 binary data format.
- validators - Python Data Validation for Humans.
- Arrow - A Python library that offers a sensible and human-friendly approach to creating, manipulating, formatting and converting dates, times and timestamps.
- dateutil - The dateutil module provides powerful extensions to the standard datetime module, available in Python.
- dateparser - Python parser for human readable dates.
- Watchdog - Python library and shell utilities to monitor filesystem events.
- uvloop - A fast, drop-in replacement of the built-in asyncio event loop.
- aiofiles - An Apache2 licensed library, written in Python, for handling local disk files in asyncio applications.
- PyFilesystem2 - Python's Filesystem abstraction layer.
- path - Object-oriented file system path manipulation.
- phonenumbers Python Library - Python port of Google's libphonenumber.
- Chardet - Python character encoding detector.
- Python Slugify - A Python slugify application that handles unicode.
- humanize - Contains various common humanization utilities, like turning a number into a fuzzy human-readable duration ("3 minutes ago") or into a human-readable size or throughput.
- XlsxWriter - A Python module for creating Excel XLSX files.
- xlwings - A Python library that makes it easy to call Python from Excel and vice versa.
- pygsheets - Google Spreadsheets Python API v4
- gdown - Download a large file from Google Drive.
- schema - A library for validating Python data structures.
- smart_open - Utils for streaming large files (S3, HDFS, gzip, bz2...).
- Pendulum - Python datetimes made easy.
-
Streaming Data Management
- protobuf - Google's language-neutral, platform-neutral, extensible mechanism for serializing structured data.
- FlatBuffers - A cross platform serialization library architected for maximum memory efficiency.
-
-
Data Management & Processing
-
Database & Cloud Management
- Redis - An in-memory database that persists on disk.
- redis-py - Redis Python client
- Node-Redis - Redis Node.js client
- Jedis - Redis Java client
- MongoDB - The MongoDB Database.
- PyMongo - The Python driver for MongoDB
- MongoDB Go Driver - The Go driver for MongoDB
- MongoDB NodeJS Driver - The Node.js driver for MongoDB
- MongoDB C# Driver - The .NET driver for MongoDB
- MongoEngine - A Python Object-Document-Mapper for working with MongoDB
- Motor - The async Python driver for MongoDB and Tornado or asyncio
- Apache Spark - A unified analytics engine for large-scale data processing.
- Presto - A distributed SQL query engine for big data.
- Google Cloud Python Client - Google Cloud Client Library for Python.
- Elasticsearch - Free and Open, Distributed, RESTful Search Engine.
- Kibana - A browser-based analytics and search dashboard for Elasticsearch
- Logstash - Transport and process your logs, events, or other data
- Beats - Lightweight shippers for Elasticsearch & Logstash
- Elastic UI Framework - A collection of React UI components for quickly building user interfaces at Elastic
- Elasticsearch Python Client - Official Elasticsearch client library for Python
- Elasticsearch DSL - High level Python client for Elasticsearch
- Elasticsearch Node.js client - Official Elasticsearch client library for Node.js
- Elasticsearch PHP client - Official PHP client for Elasticsearch
- go-elasticsearch - The official Go client for Elasticsearch
- SQLAlchemy - The Python SQL Toolkit and Object Relational Mapper.
- Alembic - A database migrations tool for SQLAlchemy
- Databases - Async database support for Python
- Apache Libcloud - A Python library which hides differences between different cloud provider APIs and allows you to manage different cloud resources through a unified and easy to use API.
- Grafana - The open and composable observability and data visualization platform. Visualize metrics, logs, and traces from multiple sources like Prometheus, Loki, Elasticsearch, InfluxDB, Postgres and many more.
- Joblib Apache Spark Backend - Provides Apache Spark backend for joblib to distribute tasks on a Spark cluster.
- PyMySQL - Pure Python MySQL Client.
- mysqlclient - MySQL database connector for Python
- Redigo - Go client for Redis.
- Tortoise ORM - Familiar asyncio ORM for Python, built with relations in mind.
- Ibis - Expressive analytics in Python at any scale.
- peewee - A small, expressive ORM that supports PostgreSQL, MySQL and SQLite.
- DB4S - DB Browser for SQLite (DB4S) is a high quality, visual, open source tool to create, design, and edit database files compatible with SQLite.
- TinyDB - A lightweight document-oriented database written in pure Python with no external dependencies.
- MyCAT - An enforced database which is a replacement for MySQL and supports transaction and ACID.
- Pony - An advanced object-relational mapper.
- dataset - Easy-to-use data handling for SQL data stores with support for implicit table creation, bulk loading, and transactions.
- Dagster - An orchestration platform for the development, production, and observation of data assets.
- Great Expectations - Helps data teams eliminate pipeline debt, through data testing, documentation, and profiling.
- dbt - Enables data analysts and engineers to transform their data using the same practices that software engineers use to build applications.
- Metabase - The simplest, fastest way to get business intelligence and analytics to everyone in your company.
- Ploomber - The fastest way to build data pipelines.
- PyHive - Python interface to Hive and Presto.
- Pypeln - A simple yet powerful Python library for creating concurrent data pipelines.
- petl - A general purpose Python package for extracting, transforming and loading tables of data.
- PySyft - Data science on data without acquiring a copy.
- Dgraph - Native GraphQL Database with graph backend.
- SQLModel - SQL databases in Python, designed for simplicity, compatibility, and robustness
-
Streaming Data Management
- Apache Beam - A unified programming model for Batch and Streaming data processing.
- Apache Kafka - Mirror of Apache Kafka.
- Apache Flink - An open source stream processing framework with powerful stream- and batch-processing capabilities.
- kafka-python - Python client for Apache Kafka.
- confluent-kafka-python - Confluent's Kafka Python Client.
- Deep Lake - Data Lake for Deep Learning. Build, manage, query, version, & visualize datasets. Stream data real-time to PyTorch/TensorFlow.
- StreamAlert - A serverless, realtime data analysis framework which empowers you to ingest, analyze, and alert on data from any environment, using datasources and alerting logic you define.
- Prometheus - The Prometheus monitoring system and time series database.
- Prometheus Python Client - Prometheus instrumentation library for Python applications
- Streamparse - Lets you run Python code against real-time streams of data via Apache Storm.
-
-
Data Processing
-
Data Management
- pandera - A light-weight, flexible, and expressive statistical data testing library.
- Kedro - A Python framework for creating reproducible, maintainable and modular data science code.
- PyFunctional - Python library for creating data pipelines with chain functional programming.
- ImageHash - An image hashing library written in Python.
- FiftyOne - An open-source tool for building high-quality datasets and computer vision models.
- Datasette - An open source multi-tool for exploring and publishing data.
- glom - Python's nested data operator (and CLI), for all your declarative restructuring needs.
- dedupe - A Python library that uses machine learning to perform fuzzy matching, deduplication and entity resolution quickly on structured data.
- datasketch - Gives you probabilistic data structures that can process and search very large amount of data super fast, with little loss of accuracy.
- Ciphey - Automatically decrypt encryptions without knowing the key or cipher, decode encodings, and crack hashes.
- pandas-profiling - Create HTML data profiling reports for pandas DataFrame.
-
Data Pre-processing & Loading
- Instagram Scraper - Scrapes an instagram user's photos and videos.
- DALI - A library for data loading and pre-processing to accelerate deep learning applications.
- AugLy - A data augmentations library for audio, image, text, and video.
- Albumentations - A Python library for image augmentation.
- Augmentor - Image augmentation library in Python for machine learning.
- Pillow - The friendly PIL fork (Python Imaging Library).
- MoviePy - Video editing with Python.
- Open3D - A Modern Library for 3D Data Processing.
- PCL - The Point Cloud Library (PCL) is a standalone, large scale, open project for 2D/3D image and point cloud processing.
- imutils - A basic image processing toolkit in Python, based on OpenCV.
- Towhee - Data processing pipelines for neural networks.
- ffcv - A drop-in data loading system that dramatically increases data throughput in model training.
- NLPAUG - Data augmentation for NLP.
- Audiomentations - A Python library for audio data augmentation.
- torch-audiomentations - Fast audio data augmentation in PyTorch, with GPU support.
- librosa - A python package for music and audio analysis.
- Pydub - Manipulate audio with a simple and easy high level interface.
- DDSP - A library of differentiable versions of common DSP functions.
- TSFRESH - Automatic extraction of relevant features from time series.
- TA - A Technical Analysis library useful to do feature engineering from financial time series datasets, based on Pandas and NumPy.
- Featuretools - An open source python library for automated feature engineering.
- Feature-engine - A Python library with multiple transformers to engineer and select features for use in machine learning models.
- img2dataset - Easily turn large sets of image urls to an image dataset.
- Faker - A Python package that generates fake data for you.
- SDV - Synthetic Data Generation for tabular, relational and time series data.
- Googletrans - Unofficial, free and unlimited Google Translate API for Python.
- OptBinning - Monotonic binning with constraints. Support batch & stream optimal binning. Scorecard modelling and counterfactual explanations.
- Scrapy - A fast high-level web crawling & scraping framework for Python.
- pyspider - A powerful spider (web crawler) system in Python.
- instaloader - Download pictures (or videos) along with their captions and other metadata from Instagram.
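- XueQiuSuperSpider - A super crawler for Xueqiu stock information.
- coordtransform - Converts between Baidu coordinates (BD09), National Bureau of Surveying coordinates (Mars coordinates, GCJ02), and the WGS84 coordinate system.
- nlp_chinese_corpus - A large-scale Chinese corpus for natural language processing.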
- imgaug - Image augmentation for machine learning experiments.
- accimage - High performance image loading and augmenting routines mimicking PIL.Image interface.
- Snorkel - A system for quickly generating training data with weak supervision.
- fancyimpute - A variety of matrix completion and imputation algorithms implemented in Python.
- Requests-HTML - Pythonic HTML Parsing for Humans.
-
Programming Languages
Categories
- Deep Learning Framework: 150
- Programming Language Tutorials: 83
- Containers & Language Extensions & Linting: 82
- Computer Vision: 79
- Data Processing: 78
- Machine Learning Framework: 72
- Data Management & Processing: 62
- Natural Language Processing: 60
- Cross-Platform: 56
- Linear Algebra / Statistics Toolkit: 53
- Data Format & I/O: 51
- Machine Learning: 46
- Data Visualization: 44
- Web Development: 43
- Desktop App Development: 42
- DevOps: 32
- Game Engines: 28
- Machine Learning Tutorials: 25
- Reinforcement Learning: 24
- Graphic Libraries & Renderers: 22
- Debugging & Profiling & Tracing: 21
- Programming Language: 21
- Mobile Development: 20
- Time-Series & Financial: 19
- Graph: 15
- Recommendation, Advertisement & Ranking: 14
- Windows: 13
- Process, Thread & Coroutine: 12
- Package Management: 12
- Other Machine Learning Applications: 11
- Causal Inference: 10
- Linux: 9
- Security: 7
- CG Tutorials: 6
- Computer Graphics: 5
- MacOS: 3
- For JavaScript: 1
Sub Categories
- Data Management: 178
- JavaScript: 175
- Others: 101
- For Python: 97
- High-Level DL APIs: 95
- C++/C Toolkit: 77
- Database & Cloud Management: 52
- General Purpose Framework: 42
- Data Pre-processing & Loading: 41
- For Scala: 39
- Deployment & Distribution: 36
- For C++/C: 34
- General Purpose NLP: 32
- General Purpose Tensor Library: 30
- Python Toolkit: 30
- Classification & Detection & Tracking: 28
- General Purpose CV: 24
- Data Representation: 22
- Conversation & Translation: 17
- For Go: 15
- OCR: 14
- Statistical Toolkit: 14
- Image / Video Generation: 13
- Streaming Data Management: 12
- For Java: 12
- C++/C: 11
- Experiment Management: 10
- Python: 10
- Hyperparameter Search & Gradient-Free Optimization: 8
- For JavaScript: 8
- Speech & Audio: 7
- Interpretability & Adversarial Training: 7
- Auto ML & Hyperparameter Optimization: 7
- Tensor Similarity & Dimension Reduction: 5
- Anomaly Detection & Others: 5
- Model Interpretation: 5
- Nearest Neighbors & Similarity: 5
- Data Similarity: 4
- Java: 2
- Anomaly Detection: 2
- Flutter: 2
- Go: 2
- Scala: 1
Keywords
- python: 354
- machine-learning: 234
- deep-learning: 187
- pytorch: 109
- data-science: 75
- tensorflow: 72
- cpp: 46
- nlp: 42
- neural-network: 38
- computer-vision: 37
- natural-language-processing: 36
- visualization: 36
- artificial-intelligence: 34
- gpu: 34
- go: 33
- javascript: 33
- c-plus-plus: 32
- ai: 32
- java: 30
- scikit-learn: 29
- golang: 28
- python3: 26
- cross-platform: 26
- keras: 26
- ml: 25
- android: 25
- numpy: 24
- linux: 24
- data-analysis: 23
- windows: 23
- data-visualization: 23
- pandas: 21
- awesome: 20
- c: 20
- neural-networks: 20
- cuda: 20
- awesome-list: 19
- mlops: 18
- opengl: 18
- react: 18
- game-development: 17
- reinforcement-learning: 17
- game-engine: 16
- deep-neural-networks: 16
- time-series: 16
- vulkan: 16
- gamedev: 15
- image-processing: 15
- database: 15
- statistics: 15