Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
awesome-stars
https://github.com/Jrebel-i/awesome-stars
Last synced: 5 days ago
-
ANTLR
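- melin/superior-sql-parser - An antlr4-based SQL parser for multiple databases that extracts metadata from SQL. Useful in several data platform scenarios: extracting metadata from DDL statements, SQL permission checks, table-level lineage, and SQL syntax validation. Supports Spark, Flink, Gauss, StarRocks, Oracle, MySQL, PostgreSQL, SQL Server, DB2, and others.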
-
Batchfile
- Prodesire/Python-Guide-CN - Python best practices guide. The Chinese translation of "Hitchhiker's Guide to Python".
- massgravel/Microsoft-Activation-Scripts - Open-source Windows and Office activator featuring HWID, Ohook, KMS38, and Online KMS activation methods, along with advanced troubleshooting.
-
C
- NVIDIA/open-gpu-kernel-modules - NVIDIA Linux open GPU kernel module source
- canonical/raft - Unmaintained C implementation of the Raft consensus protocol
- yanfeizhang/coder-kung-fu - Building a developer's core fundamentals
- jean553/c-simd-avx2-example - Simple SIMD example in C (AVX2 Vectorization)
- LearningOS/os-lectures - Slides for the Spring 2024 OS course
- edenhill/kcat - Generic command line non-JVM Apache Kafka producer and consumer
-
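C#
- leiurayer/downkyi - downkyi (哔哩下载姬), a video download tool for the Bilibili website. Supports batch downloads, 8K, HDR, and Dolby Vision, and provides a toolbox (audio/video extraction, watermark removal, etc.).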
- 2dust/v2rayN - A GUI client for Windows, support Xray core and v2fly core and others
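- OdysseusYuan/LKY_OfficeTools - A tool for one-click automated download, installation, and activation of Office.
- huiyadanli/RevokeMsgPatcher - :trollface: A hex editor for WeChat/QQ/TIM - anti-recall patch for the PC versions of WeChat/QQ/TIM (I've already seen it, so recalling it is pointless)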
- microsoft/PowerToys - Windows system utilities to maximize productivity
- Ponderfly/GoogleTranslateIpCheck - Scans for Google Translate IPs usable in mainland China
- shadowsocksrr/shadowsocksr-csharp
-
C++
- google/leveldb - LevelDB is a fast key-value storage library written at Google that provides an ordered mapping from string keys to string values.
- oceanbase/oceanbase - OceanBase is an enterprise distributed relational database with high availability, high performance, horizontal scalability, and compatibility with SQL standards.
- KDAB/hotspot - The Linux perf GUI for performance analysis.
- async-profiler/async-profiler - Sampling CPU and HEAP profiler for Java featuring AsyncGetCallTrace + perf_events
- apache/arrow - Apache Arrow is the universal columnar format and multi-language toolbox for fast data interchange and in-memory analytics
- apache/kvrocks - Apache Kvrocks is a distributed key value NoSQL database that uses RocksDB as storage engine and is compatible with Redis protocol.
- apache/thrift - Apache Thrift
- duckdb/duckdb - DuckDB is an analytical in-process SQL database management system
- facebookincubator/velox - A composable and fully extensible C++ execution engine library for data management systems.
- satanson/cpp_etudes - smart tools for source code study : cpptree.pl, calltree.pl, javatree.pl, java_calltree.pl
- topling/toplingdb - ToplingDB is a cloud native LSM Key-Value Store with searchable compression algo and distributed compaction
- ByConity/ByConity - ByConity is an open source cloud data warehouse
- bytedance/terarkdb - A RocksDB compatible KV storage engine with better performance
- krareT/terichdb - TerichDB, an open source data store based on terark engine
- redpanda-data/redpanda - Redpanda is a streaming data platform for developers. Kafka API compatible. 10x faster. No ZooKeeper. No JVM!
- facebook/rocksdb - A library that provides an embeddable, persistent key-value store for fast storage.
- rockset/rocksdb-cloud - A library that provides an embeddable, persistent key-value store for fast storage optimized for AWS
- s3fs-fuse/s3fs-fuse - FUSE-based file system backed by Amazon S3
- chdb-io/chdb - chDB is an in-process OLAP SQL Engine 🚀 powered by ClickHouse
- Light-City/CPlusPlusThings - Things about C++
-
CSS
- slippersheepig/chatgpt-web - A simple HTML web chat built on the official ChatGPT API (supports Markdown syntax, per-user session isolation, and multi-turn conversations)
-
Dart
- AppFlowy-IO/AppFlowy - Bring projects, wikis, and teams together with AI. AppFlowy is an AI collaborative workspace where you achieve more without losing control of your data. The best open source alternative to Notion.
- Notsfsssf/pixez-flutter - A third-party Pixiv Flutter client that supports direct connection without a proxy and viewing animated images
- wgh136/PicaComic - A comic app built with Flutter, supporting multiple comic sources.
- localsend/localsend - An open-source cross-platform alternative to AirDrop
-
Dockerfile
- wirelessr/flink-iceberg-playground - minio as local storage and DynamoDB as catalog
- dafei1288/flink-yarn-docker - Run flink on docker containers.
- ververica/ververica-ff-2024-dd2-sql-training - Flink Forward 2024, Deep Dives Session 2 repository
-
FreeMarker
- dromara/CloudEon - CloudEon uses Kubernetes to install and deploy open-source big data components, enabling the containerized operation of an open-source big data platform. This allows you to reduce your focus on underl
-
Go
- minio/minio - MinIO is a high-performance, S3 compatible object store, open sourced under GNU AGPLv3 license.
- hashicorp/raft - Golang implementation of the Raft consensus protocol
- serjs/socks5-server
- google/cadvisor - Analyzes resource usage and performance characteristics of running containers.
- rclone/rclone - "rsync for cloud storage" - Google Drive, S3, Dropbox, Backblaze B2, One Drive, Swift, Hubic, Wasabi, Google Cloud Storage, Azure Blob, Azure Files, Yandex Files
- 1Panel-dev/1Panel - 🔥🔥🔥 Web-based Linux server management control panel. / A modern, open-source Linux server operations and management panel.
- kopia/kopia - Cross-platform backup tool for Windows, macOS & Linux with fast, incremental backups, client-side end-to-end encryption, compression and data deduplication. CLI and GUI included.
- wailsapp/wails - Create beautiful applications using Go
- openacid/paxoskv - Naive and Basic impl of a kv-storage based on paxos; for https://blog.openacid.com/algo/paxos/
- juicedata/juicefs - JuiceFS is a distributed POSIX file system built on top of Redis and S3.
- etcd-io/etcd - Distributed reliable key-value store for the most critical data of a distributed system
- apache/incubator-answer - A Q&A platform software for teams at any scales. Whether it's a community forum, help center, or knowledge management platform, you can always count on Apache Answer.
- bytebase/bytebase - The GitHub/GitLab for database DevSecOps. World's most advanced database DevSecOps solution for Developer, Security, DBA and Platform Engineering teams.
- dvassallo/s3-benchmark - Measure Amazon S3's performance from any location.
- stulzq/azure-openai-proxy - Azure OpenAI Service Proxy. Convert OpenAI official API request to Azure OpenAI API request. Support GPT-4,Embeddings,Langchain. Adapter from OpenAI to Azure OpenAI.
- whyiyhw/chatgpt-wechat - A ChatGPT personal assistant app for safe use with WeCom/WeChat
- derailed/k9s - 🐶 Kubernetes CLI To Manage Your Clusters In Style!
- hashicorp/terraform - Terraform enables you to safely and predictably create, change, and improve infrastructure. It is a source-available tool that codifies APIs into declarative configuration files that can be shared amo
- alist-org/alist - 🗂️ A file list/WebDAV program that supports multiple storages, powered by Gin and Solidjs.
- haplone/docs - Some personal notes on understanding code
- pingcap/tidb - TiDB - the open-source, cloud-native, distributed SQL database designed for modern applications.
- intel/PerfSpect - System performance analysis and characterization tool
- AlistGo/alist - 🗂️ A file list/WebDAV program that supports multiple storages, powered by Gin and Solidjs.
-
HTML
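- doocs/technical-books - 😆 Books written by leading internet technology experts in China and abroad: computer fundamentals, networking, frontend, backend, databases, architecture, big data, deep learning...
- byoungd/English-level-up-tips - An advanced guide to learning English which might benefit you a lot 🎉. An outrageous English learning guide/tutorial.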
- EsotericSoftware/kryo - Java binary serialization and cloning: fast, efficient, automatic
- google/styleguide - Style guides for Google-originated open-source projects
- cym1102/nginxWebUI - Nginx Web page configuration tool. Use web pages to quickly configure Nginx. An Nginx web management tool for quickly configuring and managing single-node and clustered Nginx through web pages.
- Tikam02/DevOps-Guide - DevOps Guide - Development to Production all configurations with basic notes to debug efficiently.
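- PKUFlyingPig/cs-self-learning - A guide to self-learning computer science
- adams549659584/go-proxy-bingai - A Microsoft New Bing demo site built with Vue3 and Go, with a consistent UI experience and support for ChatGPT prompts; usable in mainland China.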
- ClickHouse/ClickBench - ClickBench: a Benchmark For Analytical Databases
- dibingfa/flash-linux0.11-talk - "You call this junk operating system source code": reading the Linux 0.11 core code like a novel
- jaceklaskowski/spark-workshop - Apache Spark™ and Scala Workshops
- WeNeedHome/SummaryOfLoanSuspension - A summary of loan payment suspension notices across China's provinces and cities
-
Java
- awslabs/aws-glue-catalog-sync-agent-for-hive - Enables synchronizing metadata changes (Create/Drop table/partition) from Hive Metastore to AWS Glue Data Catalog
- unitycatalog/unitycatalog - Open, Multi-modal Catalog for Data & AI
- aws-samples/amazon-managed-service-for-apache-flink-examples - Collection of code examples for Amazon Managed Service for Apache Flink
- aws-samples/aws-big-data-blog
- apache/zookeeper - Apache ZooKeeper
- apache/bigtop - Bigtop is an Apache Foundation project for Infrastructure Engineers and Data Scientists looking for comprehensive packaging, testing, and configuration of the leading open source big data components.
- trinodb/trino - Official repository of Trino, the distributed SQL query engine for big data, formerly known as PrestoSQL (https://trino.io)
- Netflix/genie - Distributed Big Data Orchestration Service
- apache/hbase - Apache HBase
- felayman/elasticsearch-full - full-scale introduce for elasticsearch
- OpenLineage/OpenLineage - An Open Standard for lineage metadata collection
- apache/calcite-avatica - Apache Calcite Avatica
- google/auto - A collection of source code generators for Java.
- apache/iceberg - Apache Iceberg
- ExpediaGroup/waggle-dance - Hive federation service. Enables disparate tables to be concurrently accessed across multiple Hive deployments.
- jaegertracing/jaeger-analytics-flink - Big data analytics for Jaeger using Apache Flink
- ben-manes/caffeine - A high performance caching library for Java
- antlr/antlr4 - ANTLR (ANother Tool for Language Recognition) is a powerful parser generator for reading, processing, executing, or translating structured text or binary files.
- williamfiset/Algorithms - A collection of algorithms and data structures
- dataease/dataease - 🔥 An open-source BI tool anyone can use; an open-source alternative to Tableau and FanRuan.
- tomfran/LSM-Tree - Log-Structured Merge Tree Java implementation
- Alluxio/alluxio - Alluxio, data orchestration for analytics and machine learning in the cloud
- nICEnnnnnnnLee/BilibiliDown - (GUI, multi-platform) Bilibili video downloader. Supports batch downloading of Watch Later lists, favorites, and uploader videos | Bilibili Video Downloader 😳
- apache/doris-flink-connector - Flink Connector for Apache Doris
- apache/hadoop - Apache Hadoop
- alibaba/otter - Alibaba's distributed database synchronization system (built for synchronization between data centers in China and the US)
- netty/netty - Netty project - an event-driven asynchronous network application framework
- debezium/debezium - Change data capture for a variety of databases. Please log issues at https://issues.redhat.com/browse/DBZ.
- apolloconfig/apollo - Apollo is a reliable configuration management system suitable for microservice configuration management scenarios.
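- zq2599/blog_demos - GitHub of programmer Xinchen (欣宸), a CSDN blog expert. Contains a detailed categorization and summary of more than six hundred original articles, with the corresponding source code, covering Java, Docker, Kubernetes, DevOps, and more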
- alibaba/nacos - an easy-to-use dynamic service discovery, configuration and service management platform for building cloud native applications.
- blossom-editor/blossom - A markdown editor that you can deploy on your own servers to achieve cloud storage and device synchronization (cloud-storage note-taking software with bidirectional links that supports private deployment)
- emichael/dslabs - Distributed Systems Labs and Framework
- uber-common/jvm-profiler - JVM Profiler Sending Metrics to Kafka, Console Output or Custom Reporter
- apache/pinot - Apache Pinot - A realtime distributed OLAP datastore
- Aiven-Open/tiered-storage-for-apache-kafka - RemoteStorageManager for Apache Kafka® Tiered Storage
- apache/incubator-xtable - Apache XTable (incubating) is a cross-table converter for lakehouse table formats that facilitates interoperability across data processing systems and query engines.
- pinpoint-apm/pinpoint - APM, (Application Performance Management) tool for large-scale distributed systems.
- TheAlgorithms/Java - All Algorithms implemented in Java
- apache/bookkeeper - Apache BookKeeper - a scalable, fault tolerant and low latency storage service optimized for append-only workloads
- flink-ci/flink-mirror
- hazelcast/hazelcast - Hazelcast is a unified real-time data platform combining stream processing with a fast data store, allowing customers to act instantly on data-in-motion for real-time insights.
- lihuigang/hive-bitmap-udf - Exact deduplication in Hive using Roaring64Bitmap
- neoremind/navi-pbrpc - A protobuf based high performance rpc framework leveraging full-duplexing and asynchronous io with netty
- google/error-prone - Catch common Java mistakes as compile-time errors
- StephenYou520/SyCep - CEP dynamic patterns
- apache/calcite - Apache Calcite
- apache/skywalking - APM, Application Performance Monitoring System
- neoremind/app-on-yarn-demo - Demo for service oriented application hosted on Hadoop YARN cluster for HA and scheduling
- linkedin/coral - Coral is a translation, analysis, and query rewrite engine for SQL and other relational languages.
- SPLWare/esProc - esProc SPL is a scripting language for data processing, with well-designed rich library functions and powerful syntax, which can be executed in a Java program through JDBC interface and computing inde
- smartloli/EFAK - An easy and high-performance monitoring system, for comprehensive monitoring and management of kafka cluster.
- apache/ratis - Open source Java implementation for Raft consensus protocol.
- shyiko/mysql-binlog-connector-java - MySQL Binary Log connector
- apache/amoro - Apache Amoro (incubating) is a Lakehouse management system built on open data lake formats.
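- alibaba/arthas - Alibaba Java Diagnostic Tool Arthas
- birdLark/LarkMidTable - LarkMidTable is a one-stop open-source data middle platform covering infrastructure, data governance, data development, monitoring and alerting, data services, and data visualization; a product that efficiently empowers the data front end and provides data services.
- 201206030/novel-plus - novel-plus is a full-featured novel CMS for multi-device reading (PC, WAP). Features include novel recommendations, search, rankings, reading, bookshelves, comments, a novel crawler, a member center, an author area, top-up and subscription, and news publishing.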
- oap-project/Gluten-Trino - Gluten: Plugin to Boost Trino's Performance
- aws-samples/amazon-kinesis-data-analytics-flink-benchmarking-utility - Amazon Managed Service for Apache Flink Benchmarking Utility helps with capacity planning, integration testing, and benchmarking of Amazon Managed Service for Apache Flink applications.
- zzzzming95/calcite-demo - Calcite practice code, including the CSV adapter and using it to run SQL queries, SQL parse and validate, and the use of RBO and CBO.
- cubefs/compass - Compass is a task diagnosis platform for bigdata
- apache/flink-cdc - Flink CDC is a streaming data integration tool
- kaori-seasons/calcite-gremlin-sql - Converts some SQL queries into Gremlin traversals that can run against graph databases supporting TinkerPop 3
- marcelmay/hadoop-hdfs-fsimage-exporter - Exports Hadoop HDFS content statistics to Prometheus
- alibaba/cost-based-incremental-optimizer
- hortonworks/hive-testbench
- realtime-storage-engine/flink-spillable-statebackend - The preview version of a spillable state backend for Apache Flink
- asdf2014/algorithm - Team up to solve problems on LeetCode together
- apache/flink-benchmarks - Benchmarks for Apache Flink
- pan3793/spark-terasort
- streaming-with-flink/examples-java - Stream Processing with Apache Flink - Java Examples
- decodableco/decodable-pipeline-sdk - An SDK for implementing Flink jobs based on Decodable
- cloudera/flink-basic-auth-handler - flink-basic-auth-handler
- ing-bank/apache-ranger-s3-plugin - Apache Ranger Plugin for S3
- apache/linkis - Apache Linkis builds a computation middleware layer to facilitate connection, governance and orchestration between the upper applications and the underlying data engines.
- duhanmin/trino-yarn - trino-yarn lets Trino run on multiple nodes on YARN
- janino-compiler/janino - Janino is a super-small, super-fast Java™ compiler.
- LB-Yu/data-systems-learning - Learning summary and examples about data systems.
- brianfrankcooper/YCSB - Yahoo! Cloud Serving Benchmark
- xiaozhch5/calcite-learning
- datavane/datasophon - The next generation of cloud-native big data management expert , Aims to help users rapidly build stable, efficient, and scalable cloud-native platforms for big data.
- apache/submarine - Submarine is Cloud Native Machine Learning Platform.
- rmetzger/tiny-flink-talk
- kingschan1204/easyCrawl - A crawler toolkit implemented in Java
- alibaba/SREWorks - Cloud Native DataOps & AIOps Platform | A cloud-native platform for data-driven and intelligent operations
- twalthr/flink-api-examples - Examples for using Apache Flink® with DataStream API, Table API, Flink SQL and connectors such as MySQL, JDBC, CDC, Kafka.
- a0x8o/flink - Scalable Batch and Stream Data Processing
- StarRocks/starrocks-connector-for-apache-flink
- mesmacosta/hive-custom-hook - Example on how to implement a hive hook
- ververica/lab-vvp-pyflink - Code examples for our blog post "Run PyFlink Jobs and UDFs in Ververica Platform"
- charles-tan/flink-state-processor-example
- alibaba/DataX - DataX is the open-source version of Alibaba Cloud DataWorks Data Integration.
- Peefy/CompileDragonBook - Compile Dragon Book + DSL book, etc.
- knaufk/flink-faker - A data generator source connector for Flink SQL based on data-faker.
- qinsql/QinSQL - An intelligent database for the AI era
- apache/flink-playgrounds - Apache Flink Playgrounds
- shusheng007/design-patterns - Design patterns presented in the plainest language, aiming to be understandable to every programmer
- melin/flink-cdc-catalog
- melin/sqlflow - Data lineage
- jeff-zou/flink-catalog-in-jdbc
- apache/seatunnel - SeaTunnel is a next-generation super high-performance, distributed, massive data integration tool.
- dafei1288/flink-connector-opengauss - flink-connector-opengauss (unofficial)
- SophiaData/Bigdata_Code_Tutorial - Flink CDC whole-database synchronization & Flink code demos
- fesh0r/fernflower - Unofficial mirror of FernFlower Java decompiler (All pulls should be submitted upstream)
- miaowenting/calcite-learning - learn calcite sql parsing
- streamnative/kop - Kafka-on-Pulsar - A protocol handler that brings native Kafka protocol to Apache Pulsar
- datahub-project/datahub - The Metadata Platform for your Data Stack
- decaywood/XueQiuSuperSpider - A super crawler for Xueqiu stock information
- apache/hbase-operator-tools - Apache HBase Operator Tools
- kaori-seasons/flink-catalog-in-jdbc - A Flink-based catalog that can create physical tables
- PlexPt/chatgpt-java - ChatGPT Java SDK. Supports the GPT-4o and GPT4 APIs. Works out of the box. An unofficial Java SDK for seamless integration with ChatGPT's GPT-3.5 and GPT-4 APIs. Ready-to-use, simple setup, and efficient for building AI-powered appli
- pierre94/minibase - An embedded KV storage engine for learning HBase
- openinx/minibase - An embedded KV storage engine for learning HBase
- eyesmoons/data-lineage-doris - A Doris table- and column-level lineage project
- prometheus/jmx_exporter - A process for exposing JMX Beans via HTTP for Prometheus consumption
- apache/shardingsphere - Empowering Data Intelligence with Distributed SQL for Sharding, Scalability, and Security Across All Databases.
- WeBankFinTech/Qualitis - Qualitis is a one-stop data quality management platform that supports quality verification, notification, and management for various datasource. It is used to solve various data quality problems cause
- felixzh2020/felixzh-flink-examples - felixzh's hands-on Flink ecosystem use cases
- raphw/byte-buddy - Runtime code generation for the Java virtual machine.
- apache/hive - Apache Hive
- apache/incubator-streampark - Make stream processing easier! Easy-to-use streaming application development framework and operation platform.
- apache/incubator-uniffle - Uniffle is a high performance, general purpose Remote Shuffle Service.
- apache/celeborn - Apache Celeborn is an elastic and high-performance service for shuffle and spilled data.
- insightlake/Ranger-Metastore-Plugin - Ranger Hive Metastore Plugin
- MyLanPangzi/flink-demo - Flink Demo
- aakashnand/ranger - Mirror of Apache Ranger
- ververica/flink-sql-benchmark
- apache/ranger - Apache Ranger - To enable, monitor and manage comprehensive data security across the Hadoop platform and beyond
- bytedance/CloudShuffleService - Cloud Shuffle Service(CSS) is a general purpose remote shuffle solution for compute engines, including Spark/Flink/MapReduce.
- apache/hudi - Upserts, Deletes And Incremental Processing on Big Data.
- confluentinc/schema-registry - Confluent Schema Registry for Kafka
- provectus/kafka-ui - Open-Source Web UI for Apache Kafka Management
- apache/doris - Apache Doris is an easy-to-use, high performance and unified analytics database.
- flink-extended/flink-remote-shuffle - Remote Shuffle Service for Flink
- godaai/flink-book-zh - Flink Tutorial Project
- apache/flink-connector-aws - Apache flink
- apache/dolphinscheduler - Apache DolphinScheduler is the modern data orchestration platform. Agile to create high performance workflow with low-code
- RealtimeCompute/ververica-cep-demo - Demo of Flink CEP with dynamic patterns
- confucianzuoyuan/mini-flink
- siddhi-io/siddhi - Stream Processing and Complex Event Processing Engine
- apache/paimon - Apache Paimon is a lake format that enables building a Realtime Lakehouse Architecture with Flink and Spark for both streaming and batch operations.
- getindata/flink-http-connector - Http Connector for Apache Flink. Provides sources and sinks for Datastream , Table and SQL APIs.
- luxiaoxun/eagle - Real time data processing system based on flink and CEP
- quxiucheng/apache-calcite-tutorial - https://blog.csdn.net/QXC1281/article/details/89070285
- HamaWhiteGG/flink-sql-lineage - The Lineage Analysis system for FlinkSQL supports advanced syntax such as Watermark, UDTF, CEP, Windowing TVFs, and CTAS.
- flowerfine/flinkful - flink endpoint for open world
- DataLinkDC/dinky - Dinky is a real-time data development platform based on Apache Flink, enabling agile data development, deployment and operation.
- jeff-zou/flink-connector-redis - Asynchronous flink connector based on the Lettuce, supporting sql join and sink, query caching and debugging.
- apache/flink - Apache Flink
- knaufk/statefun-aws-lambda-demo
- zabetak/calcite-tutorial
- zhp8341/flink-streaming-platform-web - A Flink-based web platform for real-time stream computing
- wooplevip/sedis - SQL for Redis
- apache/flink-training - Apache Flink Training Exercises
- Jrebel-i/calcite-test - Test code for apache calcite
- yuqi1129/calcite-test - Test code for apache calcite
- nexmark/nexmark - Benchmarks for queries over continuous data streams.
- KikiLetGo/CyberController - CyberController
- leonardBang/flink-sql-etl - Using Flink SQL to build ETL job
- DTStack/flinkStreamSQL - Extends the real-time SQL of open-source Flink; mainly implements joins between streams and dimension tables, and supports all native Flink SQL syntax
- Downfy/log4j-elasticsearch-java-api - Using log4j insert log info into ElasticSearch
- apache/flink-kubernetes-operator - Apache Flink Kubernetes Operator
- LinMingQiang/flink-learn - Learning Flink : Flink CEP,Flink Core,Flink SQL
- sunjincheng121/know_how_know_why - For every learner
- itinycheng/flink-connector-clickhouse - Flink SQL connector for ClickHouse. Support ClickHouseCatalog and read/write primary data, maps, arrays to clickhouse.
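- didi/KnowStreaming - A one-stop cloud-native real-time streaming data platform that builds enterprise-grade Kafka services in a zero-intrusion, plugin-based way, greatly lowering the barrier to operating, storing, and managing real-time streaming data
- Jrebel-i/LogiKM - A one-stop platform for Apache Kafka cluster metrics monitoring and operations management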
- Jrebel-i/flinkx - Based on Apache Flink. support data synchronization/integration and streaming SQL computation.
- ft20082/flink-connector-redis - A simple redis sql connector for Flink
- apache/bahir-flink - Mirror of Apache Bahir Flink
- yangyichao-mango/flink-study
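- zhisheng17/flink-learning - flink learning blog. http://www.54tianzhisheng.cn/ Covers Flink basics, concepts, principles, hands-on practice, performance tuning, and source code analysis. Includes learning examples for Flink Connector, Metrics, Library, DataStream API, Table API & SQL, and more, plus large-scale project cases of Flink in production (PV/UV, log storage, real-time deduplication of tens of billions of records,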
- taehee/flume-kinesis - Amazon Kinesis Sink and Source for Apache Flume
- bighuang624/Algorithms-notes - Notes and code for "Algorithms (Fourth Edition)"
- NiceSeason/SpringBoot_demo - Spring Boot demo, written while following Lei Fengyang's video tutorials on Bilibili; includes the basics and core parts
- StarRocks/starrocks - The world's fastest open query engine for sub-second analytics both on and off the data lakehouse. With the flexibility to support nearly any scenario, StarRocks provides best-in-class performance for
- redisson/redisson - Redisson - Valkey & Redis Java client. Complete Real-Time Data Platform. Sync/Async/RxJava/Reactive API. Over 50 Redis or Valkey based Java objects and services: Set, Multimap, SortedSet, Map, List, Q
- yanagishima/yanagishima - Web UI for Trino, Hive and SparkSQL
- awslabs/aws-glue-data-catalog-client-for-apache-hive-metastore - The AWS Glue Data Catalog is a fully managed, Apache Hive Metastore compatible, metadata repository. Customers can use the Data Catalog as a central repository to store structural and operational meta
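- alldatacenter/alldata - 🔥🔥 AllData is a definable data middle platform that uses the data platform as its foundation, the data middle platform as a bridge, the machine learning platform as a factory, and large-model applications as upstream products, providing an end-to-end digitalization solution. To purchase the commercial edition or join the technical community: https://docs.qq.com/doc/DVHlkSEtvVXVCdEFo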
- tabular-io/iceberg-kafka-connect
- databricks/iceberg-kafka-connect
- flowerfine/scaleph - Open data platform based on Kubernetes. Scaleph supports SeaTunnel, Flink, and Doris, backed by SeaTunnel on the Flink engine, Flink Kubernetes Operator, and Doris operator.
- duhanmin/bigdata-sql-parser - Data lineage: supports Spark SQL, Hive SQL, PG SQL, Presto SQL, MySQL SQL, TiDB SQL, Flink SQL, and DataX lineage, plus lineage parsing of Spark/Flink jar run commands; supports WITH syntax
- lixz3321/flink-connector-jdbc-ext - An extended flink-connector-jdbc. Compared with the official version, it adds support for ClickHouse and Phoenix, and will continue to be reworked to support more JDBC connections
- alibaba/fastjson2 - 🚄 FASTJSON2 is a Java JSON library with excellent performance.
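- HamaWhiteGG/flink-sql-security - A FlinkSQL data masking and row-level permission solution with source code. Supports user-level data masking and row-level access control, so that a given user can only access masked data or authorized rows. This is a Flink solution for the real-time domain, similar to the Row-level Filter and Column Masking features of Hive Ranger in offline data warehouses.
- zhangjun0x01/bigdata-examples - Shares hands-on big data cases from work, including Flink, Kafka, Hadoop, Presto, and more. Follow my WeChat official account 【大数据技术与应用实战】 (Big Data Technology and Applications in Practice) and grow together.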
- apache/fury - A blazingly fast multi-language serialization framework powered by JIT and zero-copy.
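- krahets/hello-algo - "Hello Algo": a data structures and algorithms tutorial with animated illustrations and one-click runnable code. Supports Python, Java, C++, C, C#, JS, Go, Swift, Rust, Ruby, Kotlin, TS, Dart code. The Simplified and Traditional Chinese editions are updated in sync; English version ongoing
- 0xqq/bigdata-sql-parser - antlr4-based parsers supporting Spark SQL, TiDB SQL, Flink SQL, and a parser for Spark/Flink jar run commands
- zhaoyachao/zdh_web - A big data collection and extraction platform. zdh_web is the visual management platform for the zdh service family, with modules for data collection, scheduling, permissions, approval workflows, private-domain marketing, and more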
- apache/gravitino - World's most powerful open data catalog for building a high-performance, geo-distributed and federated metadata lake.
-
Others
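- MoRan1607/BigDataGuide - Big data learning: learn big data from scratch, with study videos for each learning stage and interview materials
- kjfx/AppleID - Tutorial on registering a US-region Apple ID
- ihmily/ip-info-api - Free IP information query APIs; GET requests, directly accessible, no authentication required
- 0voice/linux_kernel_wiki - Linux kernel learning materials: 200+ classic kernel articles, 100+ kernel papers, 50+ kernel projects, 500+ kernel interview questions, 80+ kernel videos
- peizhe/spark-imf-DESKTOP-4DQ7P6D - To be consolidated later
- maemual/raft-zh_cn - Chinese translation of the Raft consensus algorithm paper
- madawei2699/notion-sites - Discovering great Notion sites
- YunaiV/Blog - One post a week; concise, low-key content, discussion welcome. WeChat official account: 芋道源码 (a pure source-code sharing account)
- hwanz/SSR-V2ray-Trojan - Recommendations and reviews of proxy service providers ("airports")
- TheodoreKrypton/slacking-off-tools - A collection of tools for slacking off at work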
- DataExpert-io/data-engineer-handbook - This is a repo with links to everything you'd ever want to learn about data engineering
- davidgasquez/awesome-duckdb - 🦆 A curated list of awesome DuckDB resources
- huachaohuang/awesome-dbdev - Awesome materials about database development.
- easychen/lean-side-bussiness - Lean side business: how programmers can run a side business gracefully
- pingcap/awesome-database-learning - A list of learning materials to understand databases internals
- OneSizeFitsQuorum/raft-thesis-zh_cn - Chinese translation of the Raft PhD thesis
- Sunt-ing/database-system-readings - :yum: A curated reading list about database systems
- heidihoward/distributed-consensus-reading-list - A list of papers about distributed consensus.
- Snoopy1866/LiTiaotiao-Custom-Rules
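- zlzhang0122/flink-source-zh - Flink source code reading notes, continuously documenting the process of reading the Flink source
- Archmage83/tvapk - A collection of Android TV APK apps for watching VIP content and foreign movies and TV for free. Contributions are welcome.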
- dttung2905/flink-at-scale - 📚 Tech blogs & talks by companies that run Apache Flink in production
- wangzzu/awesome - Without accumulating small steps, one cannot travel a thousand miles
- rxin/db-readings - Readings in Databases
- ByteByteGoHq/system-design-101 - Explain complex systems using visuals and simple terms. Help you prepare for system design interviews.
- maguowei/awesome-stars - My Awesome List
- AngersZhuuuu/Angerszhuuuu.github.io
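- BookaiCode/JavaRecord - "Java Learning + Interview Guide": covers the core knowledge most Java programmers need to master. Builds a Java backend knowledge system to help Java beginners grow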
- jaceklaskowski/spark-kubernetes-book - The Internals of Spark on Kubernetes
- sqlcore/presto-teach - Presto and Trino resources: development docs, source code reading, and custom development.
- firstcontributions/first-contributions - 🚀✨ Help beginners to contribute to open source projects
- k8s-club/k8s-club - K8s-club for learn, share and explore the K8s world :)
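- zijie0/HumanSystemOptimization - Learn healthily to age 150 - an incomplete guide to tuning the human system
- resumejob/awesome-resume - Resume, resume templates, example resume sentences for programmers
- lw-lin/streaming-readings - Papers and readings related to streaming systems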
- wuchong/awesome-flink - 😎 A curated list of amazingly awesome Flink and Flink ecosystem resources
- TalalAlrawajfeh/mathematics-roadmap - A Comprehensive Roadmap to Mathematics
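- justjavac/awesome-wechat-weapp - A collection of WeChat Mini Program development resources :100:
- QiuChenlyOpenSource/MusicDownload - Song downloads
- kaori-seasons/sql-segmentation-algorithm - A tool for splitting large SQL statements in different dialects under Flink SQL
- luzhouxiaobai/Big-Data-Review - Big data study/interviews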
- practical-tutorials/project-based-learning - Curated list of project-based tutorials
- zhaoyun0071/DragGAN-Windows-GUI
- japila-books/spark-sql-internals - The Internals of Spark SQL
- WarrenWen666/AI-Software-Startups - A Survey of AI startups
- eftales/ldap-videoTutorial - Materials for an LDAP video tutorial
- ept/ddia-references - Literature references for “Designing Data-Intensive Applications”
- taishi-i/awesome-ChatGPT-repositories - A curated list of resources dedicated to open source GitHub repositories related to ChatGPT
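- LangLangShanDeNanKe/chatgpt - ChatGPT site directory, sharing free and useful AI websites!
- megaease/Remembering-Haoel - Record your memories of Haoel (左耳朵耗子, Chen Hao)
- zouhuigang/book - 📕📗📘 A collection of books and PDF, PPT, and DOC materials; download links are permanently valid!
- geekan/HowToLiveLonger - A programmer's guide to live longer
- icey-zhang/miniGPT4_guide - Local reproduction of miniGPT4
- dafei1288/CalciteDocTrans - Unofficial translation of the Calcite documentation
- XIU2/Yuedu - 📚 Personal book sources (web novels) for the "Yuedu" (阅读) reading app
- tangwz/db-monthly - Categorized collection of the Alibaba Cloud Database Kernel Monthly Report (updated regularly) http://mysql.taobao.org/monthly/
- click33/chatgpt---mirror-station-summary - A summary of all ChatGPT mirror sites: free, paid, multimodal, Chinese and international large models, and more. Continuously updated... My own reach is limited and I haven't collected many, so contributions are very welcome! Many hands make light work!
- xx025/carrot - Free ChatGPT Site List. A large collection of free, easy-to-use ChatGPT mirror sites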
- kelseyhightower/kubernetes-the-hard-way - Bootstrap Kubernetes the hard way. No scripts.
- louisfb01/best_AI_papers_2022 - A curated list of the latest breakthroughs in AI (in 2022) by release date with a clear video explanation, link to a more in-depth article, and code.
- awsdocs/amazon-emr-management-guide - The open source version of the Amazon EMR Management Guide. You can submit feedback & requests for changes by submitting issues in this repo or by making proposed changes & submitting a pull request.
- wangzhiwubigdata/God-Of-BigData - Focused on big data study and interviews; start your path to big data mastery. Flink/Spark/Hadoop/Hbase/Hive...
- mtdvio/every-programmer-should-know - A collection of (mostly) technical things every software developer should know about
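- leesf/hudi-resources - A collection of Apache Hudi resources
- ruanyf/weekly - Tech Enthusiast Weekly, published every Friday
- The-Run-Philosophy-Organization/run - The globally designated official GitHub of "Run Philosophy" (润学): compiles its tenets, program, theory, and assorted real-world examples of "running" (emigrating); addresses the three big questions of why to run, where to run, and how to run; and aims to become the core religion and core belief of new Chinese people.
- huangfox/dpkb - A summary of big data topics, including distributed storage engines, distributed compute engines, and data warehouse construction. Keywords: Hadoop, HBase, ES, Kudu, Hive, Presto, Spark, Flink, Kylin, ClickHouse
- bethunebtj/datasource_architecture - Tracing the source: Flink
- tghfly/grafana-manual - Grafana best practices tutorial; for video tutorials, follow my Bilibili account itcooking (https://www.bilibili.com/video/BV1PV411k7Rz)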
- jwasham/coding-interview-university - A complete computer science study plan to become a software engineer.
- cloudlandboy/myNote - My notes
- NiceSeason/fucking-algorithm - Tearing through LeetCode problems step by step and exposing the patterns behind all kinds of algorithms. English version supported! Crack LeetCode, not only how, but also why.
- igorbarinov/awesome-data-engineering - A curated list of data engineering tools for software developers
- yvoronoy/awesome-english - A collection of awesome study resources for learners of English.
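- ddgksf2013/ddgksf2013 - Moyu (墨鱼) ad-removal project | QuantumultX ad removal | splash ad removal | app cleanup | membership unlocking | Moyu configs | app enhancement | web page optimization | cloud drive resources | module ad removal | Shadowrocket configs | Moyu rules | Clash configs | resource library | an incomplete guide
- yanue/V2rayU - V2rayU, a macOS client based on the v2ray core for circumventing internet censorship, written in Swift. Supports the trojan, vmess, shadowsocks, socks5, and other protocols, as well as subscriptions, QR code and clipboard import, manual configuration, and QR code sharing
- BigDataScholar/TheKingOfBigData - 🚀🚀🚀 High-quality past articles, frequently tested big data topics, Java knowledge points from top-tier companies, polished resume templates, a resume guide, and hundreds of technical books. Most important are my own hand-drawn mind maps of the big data ecosystem, Kafka, Spark, and Scala, downloaded thousands of times across the web... A treasure of a project you can't miss on your big data learning path!
- AZeC4/TelegramGroup - A quietly gathered 2024 collection of 10,000+ Telegram groups, plus the most fun and useful bots on the web 🤖 [Telegram Encyclopedia]
- immersive-translate/immersive-translate - Immersive bilingual web page translation extension; supports input box translation, mouse-hover translation, and translation of PDF, Epub, subtitle, and TXT files
- githubvpn007/v2rayNvpn - Censorship circumvention, free circumvention, free access to the open internet, free nodes, free proxies, free ss/ssr/v2ray/trojan nodes, Lantern, Google Play Store, circumvention proxies, overseas games, foreign games, vpn, vpn recommendations, updated daily, accessing the external internet, external internet, V2rayN, Qv2ray, V2rayW, V2RayS, Mellow, V2rayX, V2rayU, ClashX, Kitsunebi, BifrostV, i2Ray, Quantumult, Surge 4, winX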
-
JavaScript
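- qirenzhidao/tvbox18 - tvbox movie and TV box interfaces
- sxei/chrome-plugin-demo - Complete companion demo for "The Complete Guide to Chrome Extension Development"; feel free to clone and try it
- cyao2q/files - TVBox open-source edition; sharing TV box software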
- grpc/grpc-web - gRPC for Web Clients
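- kscript/markdown-download - Chrome extension: converts articles from Juejin, Zhihu, SegmentFault, Jianshu, Cnblogs, WeChat Official Accounts, OSChina, and CSDN into Markdown documents and downloads them
- LGiki/cosmos-enhanced - 🪐 A browser extension that enhances the web experience of the Xiaoyuzhou (小宇宙) podcast app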
- iptv-org/iptv - Collection of publicly available IPTV channels from all over the world
- L8426936/CleanUpWeChatZombieFans - auto.js script for Android automation that cleans up WeChat zombie followers
- anuraghazra/github-readme-stats - :zap: Dynamically generated stats for your github readmes
- iamadamdev/bypass-paywalls-chrome - Bypass Paywalls web browser extension for Chrome and Firefox.
- carteryh/big-data - big data study
- wangrongding/github-old-feed - Replace the shit💩 new feed with the old one.
- adamyi/wechrome - Chrome extension to unblock web wechat
- langren1353/GM_script - I'm just here to share scripts for fun
- melin/spark-jobserver - REST job server for Apache Spark
- Annihil/github-spray - :octocat: Draw on your GitHub contribution graph ░▒▓█
- melin/flink-jobserver - REST job server for Apache Flink
- easychen/openai-api-proxy - OpenAI/GPT API proxy deployable with a one-line Docker command; supports SSE streaming responses and Tencent Cloud Functions.
- timqian/openprompt.co - Create. Use. Share. ChatGPT prompts
- anc95/ChatGPT-CodeReview - 🐥 A code review bot powered by ChatGPT
- vaxilu/x-ui - An xray panel supporting multiple protocols and multiple users
- cloudera/hue - Open source SQL Query Assistant service for Databases/Warehouses
- work7z/CodeGen - This repo is moved to LafTools
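- zhaoolee/ChromeAppHeroes - 🌈 Guli: ChromePluginHeroes, a Chinese manual for excellent Chrome plugins, so that the Chrome plugin heroes can benefit humanity~ Updates are mirrored on the WeChat Official Account 「0加1」
- xcanwin/KeepChatGPT - A plugin that improves ChatGPT's data security and efficiency. It also shares many innovative features for free, such as auto refresh, keep-alive, data security, audit cancellation, conversation cloning, unrestricted replies, page cleanup, large-screen display, tracking interception, frequent updates, and fine-grained inspection. It makes the AI experience extremely secure, smooth, efficient, and clean.
- fanmingming/live - ✯ A directly accessible TV/radio logo library and related tools ✯ 🔕 Permanently free, directly accessible, fully open source, continuously improved channel logos, with IPv4/IPv6 dual-stack access 🔕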
-
Python
- langflow-ai/langflow - Langflow is a low-code app builder for RAG and multi-agent AI applications. It’s Python-based and agnostic to any model, API, or database.
- dropbox/PyHive - Python interface to Hive and Presto. 🐝
- Supervisor/supervisor - Supervisor process control system for Unix (supervisord)
- localstack/localstack - 💻 A fully functional local AWS cloud stack. Develop and test your cloud & Serverless apps offline
- VikParuchuri/marker - Convert PDF to markdown quickly with high accuracy
- eosphoros-ai/sqlgpt-parser - sqlgpt-parser is a Python implementation of an SQL parser that effectively converts SQL statements into Abstract Syntax Trees (AST). By leveraging AST tree comparisons between two SQL queries, it beco
- yihong0618/iBeats - Apple Watch heart-rate data collection - Your Soul, Your Beats!
- yihong0618/running_page - Make your own running home page
- alexta69/metube - Self-hosted YouTube downloader (web UI for youtube-dl / yt-dlp)
- maguowei/starred - creating your own Awesome List by GitHub stars!
- airbytehq/airbyte - The leading data integration platform for ETL / ELT data pipelines from APIs, databases & files to data warehouses, data lakes & data lakehouses. Both self-hosted and Cloud-hosted.
- meta-llama/codellama - Inference code for CodeLlama models
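- jt-zhang/XDU_CS_Learning - Experience sharing for computer science students at Xidian University :lollipop:
- jiran214/GPT-vup - GPT-vup BIliBili | Douyin | AI | Virtual streamer
- conanhujinming/comments-for-awesome-courses - A review site for open courses from top universities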
- 198808xc/Pangu-Weather - An official implementation of Pangu-Weather
- pyspark-ai/pyspark-ai - English SDK for Apache Spark
- soimort/you-get - :arrow_double_down: Dumb downloader that scrapes the web
- 1061700625/WeChat_Article - Scrapes WeChat Official Account articles
- aws/aws-emr-containers-best-practices - Best practices and recommendations for getting started with Amazon EMR on EKS.
- Vonng/ddia - Chinese translation of "Designing Data-Intensive Applications" (DDIA)
- dbiir/df-bench - Benchmark for DataFrame Systems
- jdagdelen/hyperDB - A hyper-fast local vector database for use with LLM Agents. Now accepting SAFEs at $135M cap.
- eosphoros-ai/DB-GPT - AI Native Data App Development framework with AWEL(Agentic Workflow Expression Language) and Agents
- OpenBMB/BMTools - Tool Learning for Big Models, Open-Source Solutions of ChatGPT-Plugins
- ytdl-org/youtube-dl - Command-line program to download videos from YouTube.com and other video sites
- EwingYangs/awesome-open-gpt - Collection of Open Source Projects Related to GPT 🚀, curated 🔥🔥
- xtekky/gpt4free - The official gpt4free repository | various collection of powerful language models
- AUTOMATIC1111/stable-diffusion-webui - Stable Diffusion web UI
- Vision-CAIR/MiniGPT-4 - Open-sourced codes for MiniGPT-4 and MiniGPT-v2 (https://minigpt-4.github.io, https://minigpt-v2.github.io/)
- openai/openai-python - The official Python library for the OpenAI API
- Significant-Gravitas/AutoGPT - AutoGPT is the vision of accessible AI for everyone, to use and to build on. Our mission is to provide the tools, so that you can focus on what matters.
- openai/chatgpt-retrieval-plugin - The ChatGPT Retrieval Plugin lets you easily find personal or work documents by asking questions in natural language.
- karpathy/nanoGPT - The simplest, fastest repository for training/finetuning medium-sized GPTs.
- alibaba/feathub - FeatHub - A stream-batch unified feature store for real-time machine learning
- getindata/flink-sql-runner - Framework for scheduling Flink SQL jobs on AWS Elastic MapReduce or a standalone Flink cluster.
- fivesheep/chnroutes - Scripts that help Chinese netizens who use a VPN to combat censorship by modifying the route table so that only censored IPs are routed through the VPN
- aws-samples/aws-emr-apache-ranger
- skyplane-project/skyplane - 🔥 Blazing fast bulk data transfers between any cloud 🔥
- nl8590687/ASRT_SpeechRecognition - A Deep-Learning-Based Chinese Speech Recognition System
- dsoprea/RandomUtility - Disparate tools published by Dustin.
- Jrebel-i/hadoop_onekey_deploy
- aws-samples/aws-glue-samples - AWS Glue code samples
- donnemartin/system-design-primer - Learn how to design large-scale systems. Prep for the system design interview. Includes Anki flashcards.
- polaris-catalog/polaris - The interoperable, open source catalog for Apache Iceberg
- apache/polaris - Apache Polaris, the interoperable, open source catalog for Apache Iceberg
- jupyter-incubator/sparkmagic - Jupyter magics and kernels for working with remote Spark clusters
- aws-solutions-library-samples/guidance-for-ec2-spot-placement-score-tracker - This Guidance shows how to build an Amazon Elastic Compute Cloud (Amazon EC2) Spot placement score tracker to monitor unused Amazon EC2 Spot Instance capacity.
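- gangly/datafaker - Datafaker is a large-scale test data and flow test data generation tool. Datafaker fakes data and inserts it into various data sources.
- NanmiCoder/MediaCrawler - Crawlers for Xiaohongshu notes and comments, Douyin videos and comments, Kuaishou videos and comments, Bilibili videos and comments, Weibo posts and comments, Baidu Tieba posts and comment replies, and Zhihu Q&A articles and comments
- hankcs/HanLP - Chinese word segmentation, part-of-speech tagging, named entity recognition, dependency parsing, constituency parsing, semantic dependency parsing, semantic role labeling, coreference resolution, style transfer, semantic similarity, new word discovery, keyphrase extraction, automatic summarization, text classification and clustering, pinyin and simplified/traditional conversion, natural language processing
- LC044/WeChatMsg - Extracts WeChat chat history and exports it to HTML, Word, and Excel documents for permanent storage; analyzes the history to generate an annual chat report; and uses the chat data to train a personal AI chat assistant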
-
TypeScript
- TBXark/ChatGPT-Telegram-Workers - Deploy your own Telegram ChatGPT bot on Cloudflare Workers with ease.
- mermaid-js/mermaid - Generation of diagrams like flowcharts or sequence diagrams from text in a similar manner as markdown
- yanggggjie/rising-repo
- ZuodaoTech/everyone-can-use-english - Everyone can use English
- labring/sealos - Sealos is a production-ready Kubernetes distribution. You can run any Docker image on sealos, start high availability databases like mysql/pgsql/redis/mongo, develop applications using any Programming
- toeverything/AFFiNE - There can be more than Notion and Miro. AFFiNE(pronounced [ə‘fain]) is a next-gen knowledge base that brings planning, sorting and creating all together. Privacy first, open-source, customizable and r
- conwnet/github1s - One second to read GitHub code with VS Code.
- yangshun/tech-interview-handbook - 💯 Curated coding interview preparation materials for busy software engineers
- whyour/qinglong - Timed task management platform supporting Python3, JavaScript, Shell, Typescript
- awslabs/clickstream-analytics-on-aws - Build clickstream analytics on AWS for your mobile and web applications
- hypertrons/hypertrons-crx - A browser extension for insights into GitHub projects and developers.
- aws-samples/aws-lambda-clickhouse - Run the open-source online analytics database ClickHouse in an AWS Lambda function
- excalidraw/excalidraw - Virtual whiteboard for sketching hand-drawn like diagrams
- daybrush/infinite-viewer - Infinite Viewer is Document Viewer Component with infinite scrolling.
- open-metadata/OpenMetadata - OpenMetadata is a unified metadata platform for data discovery, data observability, and data governance powered by a central metadata repository, in-depth column level lineage, and seamless team colla
- kangkaisen/olap-performance - OLAP Database Performance Tuning Guide
- RealKai42/qwerty-learner-vscode - Words learning and English muscle memory training software designed for keyboard workers, VSCode slacking-off 🐟 edition
- RealKai42/qwerty-learner - Words learning and English muscle memory training software designed for keyboard workers
- transitive-bullshit/agentic - AI agent stdlib that works with any LLM and TypeScript AI SDK.
- ilyydy/cf-openai - Deploy your chatGPT service on Cloudflare Workers and integrate with apps.
- reworkd/AgentGPT - 🤖 Assemble, configure, and deploy autonomous AI Agents in your browser.
- Bin-Huang/chatbox - User-friendly Desktop Client App for AI Models/LLMs (GPT, Claude, Gemini, Ollama...)
- teo-ma/AzureSQLChatGPTDemo
- wong2/chatgpt-google-extension - This project is deprecated. Check my new project ChatHub:
- dbeaver/cloudbeaver - Cloud Database Manager
- pingcap/ossinsight - Analysis, Comparison, Trends, Rankings of Open Source Software, you can also get insight from more than 7 billion with natural language (powered by OpenAI). Follow us on Twitter: https://twitter.com/o
- Licoy/ChatGPTs - 🍭 One click access to your own ChatGPT+numerous AI web services
- OI-wiki/OI-wiki - :star2: Wiki of OI / ICPC for everyone. (An online strategy guide for a certain massive game, featuring dazzling arithmetic magic)
- ChatGPTNextWeb/ChatGPT-Next-Web - A cross-platform ChatGPT/Gemini UI (Web / PWA / Linux / Win / MacOS). Get your own cross-platform ChatGPT/Gemini app in one click.
- Licoy/ChatAny - 🌻 One click access to your own ChatGPT+Many AI web services
- openai-translator/openai-translator - Browser extension and cross-platform desktop application for translation based on ChatGPT API.
- ChatAnyTeam/ChatAny - 🌻 One click access to your own ChatGPT+Many AI web services
-
Jinja
- awesome-kyuubi/hadoop-testing - Testing Sandbox for Hadoop Ecosystem Components
-
Jsonnet
- monitoringartist/grafana-aws-cloudwatch-dashboards - :cloud: 40+ Grafana dashboards for AWS CloudWatch metrics: EC2, Lambda, S3, ELB, EMR, EBS, SNS, SES, SQS, RDS, EFS, ElastiCache, Billing, API Gateway, VPN, Step Functions, Route 53, CodeBuild, ...
-
Jupyter Notebook
- microsoft/AI-For-Beginners - 12 Weeks, 24 Lessons, AI for All!
- suno-ai/bark - 🔊 Text-Prompted Generative Audio Model
- michaelmior/calcite-notebooks - :notebook: A series of Jupyter notebooks to demonstrate the functionality of Apache Calcite
- gkamradt/langchain-tutorials - Overview and tutorial of the LangChain Library
- langchain-ai/langchain - 🦜🔗 Build context-aware reasoning applications
- datawhalechina/llm-cookbook - An introductory LLM tutorial for developers; the Chinese edition of Andrew Ng's large language model course series
- confucianzuoyuan/deep-learning-tutorial
- bluishglc/apache-hudi-core-conceptions - A set of notebooks to explore and explain core conceptions of Apache Hudi, such as file layouts, file sizing, compaction, clustering and so on.
- Jrebel-i/DeOldify - A Deep Learning based project for colorizing and restoring old images (and video!)
- Fafa-DL/Lhy_Machine_Learning - Slides and assignments for Hung-yi Lee's Spring 2021/2022/2023 machine learning courses
- apache/gravitino-playground - A playground to experience Gravitino
- aws-samples/aws-emr-cost-toolbox
- ga642381/ML2021-Spring - **Official** Hung-yi Lee (李宏毅) Machine Learning 2021 Spring
-
Kotlin
- andygrove/how-query-engines-work - This is the companion repository for the book How Query Engines Work.
- gkd-kit/gkd - An Android APP with custom screen tapping based on Accessibility, Advanced Selectors, and Subscription Rules
- alibaba/p3c - Alibaba Java Coding Guidelines PMD implementation and IDE plugin
- kukume/kukubot - A bot.
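- pppscn/SmsForwarder - SMS forwarder: monitors SMS messages, incoming calls, and app notifications on an Android phone and forwards them according to specified rules to other phones and channels: DingTalk group custom bots, DingTalk in-enterprise bots, WeCom group bots, Feishu bots, WeCom app messages, email, bark, webhook, Telegram bots, ServerChan, PushPlus, SMS, and more. Includes active control between server and client, so you can easily send SMS, check SMS, check calls, check contacts, and check battery level remotely. (New in V3.0) PS. This APK is mainly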
-
MDX
- openai/openai-cookbook - Examples and guides for using the OpenAI API
-
Markdown
- labuladong/fucking-algorithm - Solving algorithm problems is all about patterns; labuladong is all you need! English version supported! Crack LeetCode, not only how, but also why.
- codecrafters-io/build-your-own-x - Master programming by recreating your favorite technologies from scratch.
-
PHP
- pixelfed/pixelfed - Photo Sharing. For Everyone.
- hlmd/Postman-cn - Chinese-localized edition of Postman
-
PowerShell
- AliyunContainerService/k8s-for-docker-desktop - Enable Kubernetes and Istio for Docker Desktop for Mac/Windows.
- fleschutz/PowerShell - 500+ free PowerShell scripts (.ps1) for Linux, Mac OS, and Windows.
-
Ruby
- Cute-Dress/Dress - Yay, it's women's clothing | Backup · PRs accepted
-
Rust
- apache/datafusion-comet - Apache DataFusion Comet Spark Accelerator
- cmu-db/optd - CMU-DB's Cascades optimizer framework
- tikv/raft-engine - A persistent storage engine for Multi-Raft log
- restatedev/restate - Restate is the platform for building resilient applications that tolerate all infrastructure faults w/o the need for a PhD.
- datafuselabs/openraft - rust raft with improvements
- alecmocatta/streaming_algorithms - Performant implementations of various streaming algorithms, including Count–min sketch, Top k, HyperLogLog, Reservoir sampling.
- kwai/blaze - Blazing-fast query execution engine speaks Apache Spark language and has Arrow-DataFusion at its core.
- google/comprehensive-rust - This is the Rust course used by the Android team at Google. It provides you the material to quickly teach Rust.
- apache/arrow-rs - Official Rust implementation of Apache Arrow
- apache/opendal - Apache OpenDAL: access data freely.
- tikv/tikv - Distributed transactional key-value database, originally created to complement TiDB
- sqlparser-rs/sqlparser-rs - Extensible SQL Lexer and Parser for Rust
- ArroyoSystems/arroyo - Distributed stream processing engine in Rust
- MaterializeInc/materialize - The Cloud Operational Data Store: use SQL to transform, deliver, and act on fast-changing data.
- risingwavelabs/risingwave - Best-in-class stream processing, analytics, and management. Perform continuous analytics, or build event-driven applications, real-time ETL pipelines, and feature stores in minutes. Unified streaming
- apache/datafusion-ballista - Apache DataFusion Ballista Distributed Query Engine
- apache/datafusion - Apache DataFusion SQL Query Engine
- databendlabs/openraft - rust raft with improvements
- databendlabs/databend - 𝗗𝗮𝘁𝗮, 𝗔𝗻𝗮𝗹𝘆𝘁𝗶𝗰𝘀 & 𝗔𝗜. Modern alternative to Snowflake. Cost-effective and simple for massive-scale analytics. https://databend.com
- apache/paimon-rust - Apache Paimon Rust: the Rust implementation of Apache Paimon.
- apache/datafusion-sqlparser-rs - Extensible SQL Lexer and Parser for Rust
- skyzh/mini-lsm - A tutorial of building an LSM-Tree storage engine in a week.
-
Scala
- GuoNingNing/fire-spark - Spark scaffolding project that standardizes the Spark development, deployment, and testing workflow.
- akka/akka - Build highly concurrent, distributed, and resilient message-driven applications on the JVM
- AbsaOSS/spline-spark-agent - Spline agent for Apache Spark
- streamnative/pulsar-spark - Spark Connector to read and write with Pulsar
- AbsaOSS/spline - Data Lineage Tracking And Visualization Solution
- NVIDIA/spark-rapids - Spark RAPIDS plugin - accelerate Apache Spark with GPUs
- delta-io/delta - An open-source storage framework that enables building a Lakehouse architecture with compute engines including Spark, PrestoDB, Flink, Trino, and Hive and APIs
- IBM/spark-s3-shuffle - A S3 Shuffle plugin for Apache Spark to enable elastic scaling for generic Spark workloads.
- apache/kyuubi - Apache Kyuubi is a distributed and multi-tenant gateway to provide serverless SQL on data warehouses and lakehouses.
- neoremind/kraps-rpc - A RPC framework leveraging Spark RPC module
- lhbench/lhbench - Lakehouse storage system benchmark
- apache/spark - Apache Spark - A unified analytics engine for large-scale data processing
- target/data-validator - A tool to validate data, built around Apache Spark.
- aistack/sql-booster - This is a library for SQL optimizing/rewriting including Materialized View rewrite
- lw-lin/CoolplaySpark - Coolplay Spark: Spark source code analysis, Spark libraries, and more
- alibaba/SparkCube - SparkCube is an open-source project for extremely fast OLAP data analysis. SparkCube is an extension of Apache Spark.
- pierre94/flink-notes - Flink study notes
- apache/incubator-gluten - Gluten is a middle layer responsible for offloading JVM-based SQL engines' execution to native engines.
- yhyyz/flink-cdc-msk - flink-cdc-msk
- hortonworks-spark/spark-atlas-connector - A Spark Atlas connector to track data lineage in Apache Atlas
- spark-redshift-community/spark-redshift - Performant Redshift data source for Apache Spark
- awslabs/deequ - Deequ is a library built on top of Apache Spark for defining "unit tests for data", which measure data quality in large datasets.
- yhyyz/emr-hudi-example - emr-hudi-example
- yahoo/CMAK - CMAK is a tool for managing Apache Kafka clusters
- Jrebel-i/SZT-bigdata - Shenzhen Metro big data passenger flow analysis system 🚇🚄🌟
- apache/hbase-connectors - Apache HBase Connectors
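- baolibin/Bigdata - A learning path for big data processing technologies (continuously updated...). Bigdata notes, organized bit by bit~ Covers offline processing, real-time processing, OLAP, and more, such as hadoop, spark, flink, hive, hbase, oozie..., as well as big data projects such as user profiling and data warehouses. Interested contributors are welcome...
- Shockang/spark-examples - Dedicated to providing the most practical learning guide to Spark code development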
-
Shell
- hxhwing/EMR-Managed-Ranger-Plugin
- zabetak/calcite-druid-dataset - Druid containers used for running the integration tests of Calcite Druid adapter
- apache/flink-docker - Docker packaging for Apache Flink
- aws/aws-emr-best-practices - A best practices guide for using AWS EMR. The guide will cover best practices on the topics of cost, performance, security, operational excellence, reliability and application specific best practices
- oldratlee/useful-scripts - 🐌 useful scripts for making developer's everyday life easier and happier, involved java, shell etc.
- rootsongjc/kubernetes-handbook - Kubernetes guide in Chinese / A practical handbook for cloud native application architecture
- hwdsl2/docker-ipsec-vpn-server - Docker image to run an IPsec VPN server, with IPsec/L2TP, Cisco IPsec and IKEv2
- hxhwing/EMR-Ranger-Integration
- tmcgrath/kafka-connect-examples - Kafka Connect Examples
- aws-samples/emr-spark-benchmark
- dqzboy/ChatGPT-Proxy - ChatGPT Proxy Project: one-click deployment of the go-chatgpt-api and ninja reverse-engineering projects
- apache/flink-shaded - Apache Flink shaded artifacts repository
- cucker0/docker - Using Docker
- hq450/fancyss - fancyss is a project providing tools to get across the GFW on asuswrt/merlin-based routers.
- aakashnand/trino-ranger-demo - Tutorial on how to setup Trino and Apache Ranger using docker
- aws-samples/aws-emr-utilities
- bluishglc/ranger-emr-cli-installer - This is a powerful cli tool for Apache Ranger and AWS EMR automated installation & integration with OpenLDAP & Windows AD. It supports Open-Source Ranger and EMR-Native Ranger both, supports OpenLDAP
- bluishglc/emr-edgenode-maker - This tool can easily make / build an emr cluster edge node / client node / gateway node
- OpenVPN/easy-rsa - easy-rsa - Simple shell based CA utility
- LadyForest/flink-table-store-101 - Playground for Flink Table Store with use cases and performance features
- steveloughran/winutils - Windows binaries for Hadoop versions (built from the git commit ID used for the ASF release)
- cdarlint/winutils - winutils.exe hadoop.dll and hdfs.dll binaries for hadoop windows
- zhenchao125/docker_bigdata
- 233boy/v2ray - The easiest-to-use V2Ray one-click install script & management script
- aws-samples/emr-remote-shuffle-service
- ohmyzsh/ohmyzsh - 🙃 A delightful community-driven (with 2,400+ contributors) framework for managing your zsh configuration. Includes 300+ optional plugins (rails, git, macOS, hub, docker, homebrew, node, php, python,
- confluentinc/demo-scene - 👾Scripts and samples to support Confluent Demos and Talks up until Oct '24. ⚠️ No longer maintained 👉For automated tutorials and QA'd code, see https://github.com/confluentinc/examples/
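- collabH/bigdata-growth - A big data knowledge repository covering data warehouse modeling, real-time computing, big data, data middle platforms, system design, Java, algorithms, and more.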
-
Smarty
- LinuxSuRen/open-source-best-practice - This is an open-source best practice for those who want to participate in open-source projects
- opensource-f2f/episode - Open Source Face-to-Face, connecting you with fellow open-source enthusiasts! Episodes for the open-source face-to-face talk!
-
Swift
- rileytestut/Delta - Delta is an all-in-one classic video game emulator for non-jailbroken iOS devices.
-
TeX
- Jacobbishopxy/poma-notes - Notes of Principles of Mathematical Analysis
-
Vue
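- dulaiduwang003/ScribbleHub - A lightweight tech blog mini program built on SpringBoot3; supports article publishing (with audio or video uploads), topic management, search, rendering, and article comments. No third-party OSS storage is needed; it uses the server's own storage. See the file configuration in the yml for details.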
- cfour-hi/gitstars - Github Starred Repositories Manager
- chatgpt-web-dev/chatgpt-web - A third-party ChatGPT Web UI page built with Express and Vue3, through the official OpenAI completion API.
- chatpire/chatgpt-web-share - ChatGPT Plus / OpenAI API sharing solution.
- Chanzhaoyu/chatgpt-web - A ChatGPT demo web page built with Express and Vue3
Programming Languages
Categories
- Java: 194
- Others: 80
- Python: 52
- TypeScript: 32
- Scala: 28
- Shell: 28
- JavaScript: 26
- Go: 23
- Rust: 22
- C++: 20
- Jupyter Notebook: 13
- HTML: 12
- C#: 7
- C: 6
- Kotlin: 5
- Vue: 5
- Dart: 4
- Dockerfile: 3
- License: 2
- Smarty: 2
- Markdown: 2
- Batchfile: 2
- PHP: 2
- PowerShell: 2
- ANTLR: 1
- CSS: 1
- FreeMarker: 1
- MDX: 1
- TeX: 1
- Swift: 1
- Jinja: 1
- Ruby: 1
- Jsonnet: 1
Sub Categories
Keywords
- flink: 45
- java: 45
- sql: 36
- spark: 35
- database: 32
- chatgpt: 29
- big-data: 27
- python: 25
- hadoop: 23
- bigdata: 20
- kafka: 20
- kubernetes: 19
- openai: 18
- rust: 18
- hive: 17
- distributed-systems: 16
- olap: 13
- ai: 12
- javascript: 12
- stream-processing: 12
- docker: 11
- golang: 11
- mysql: 11
- awesome: 11
- awesome-list: 10
- hbase: 10
- go: 10
- streaming: 9
- apache: 9
- cloud: 9
- wechat: 9
- gpt-4: 9
- distributed-database: 8
- postgresql: 8
- flink-sql: 8
- s3: 8
- redis: 8
- cloud-native: 8
- scala: 8
- apache-spark: 7
- raft: 7
- aws: 7
- iceberg: 7
- cpp: 7
- linux: 7
- datalake: 7
- analytics: 7
- clickhouse: 7
- lakehouse: 7
- algorithm: 7