Projects in Awesome Lists tagged with streaming-data
A curated list of projects in awesome lists tagged with streaming-data.
https://github.com/onurakpolat/awesome-bigdata
A curated list of awesome big data frameworks, resources and other awesomeness.
awesome awesome-list bigdata data data-analytics data-science data-stream data-visualization data-warehouse database distributed-database series-database stream-processing streaming-data visualize-data
Last synced: 22 Mar 2025
https://github.com/provectus/kafka-ui
Open-Source Web UI for Apache Kafka Management
apache-kafka big-data cluster-management event-streaming hacktoberfest kafka kafka-brokers kafka-client kafka-cluster kafka-connect kafka-manager kafka-producer kafka-streams kafka-ui opensource streaming-data streams web-ui
Last synced: 12 May 2025
https://github.com/johnkerl/miller
Miller is like awk, sed, cut, join, and sort for name-indexed data such as CSV, TSV, and tabular JSON
command-line command-line-tools csv csv-format data-cleaning data-processing data-reduction data-regression devops devops-tools json json-data miller statistical-analysis statistics streaming-algorithms streaming-data tabular-data tsv unix-toolkit
Last synced: 21 Feb 2026
https://github.com/redpanda-data/connect
Fancy stream processing made operationally mundane
amqp cqrs data-engineering data-ops etl event-sourcing go golang kafka logs message-bus message-queue nats rabbitmq stream-processing stream-processor streaming-data
Last synced: 19 Feb 2026
https://github.com/Jeffail/benthos
Fancy stream processing made operationally mundane
amqp cqrs data-engineering data-ops etl event-sourcing go golang kafka logs message-bus message-queue nats rabbitmq stream-processing stream-processor streaming-data
Last synced: 25 Mar 2025
https://github.com/materializeinc/materialize
Real-time Data Integration and Transformation: use SQL to transform, deliver, and act on fast-changing data.
data-store database distributed-systems kafka materialized-view operational-data-store postgresql postgresql-dialect rust sql stream-processing streaming streaming-data
Last synced: 09 Sep 2025
https://github.com/MaterializeInc/materialize
The Cloud Operational Data Store: use SQL to transform, deliver, and act on fast-changing data.
data-warehouse database distributed-systems kafka materialized-view operational-data-warehouse postgresql postgresql-dialect rust sql stream-processing streaming streaming-data
Last synced: 28 Mar 2025
https://github.com/online-ml/river
🌊 Online machine learning in Python
concept-drift data-science incremental-learning machine-learning online-learning online-machine-learning online-statistics python real-time-processing stream-processing streaming streaming-data
Last synced: 12 Dec 2025
https://github.com/readysettech/readyset
Readyset is a MySQL and Postgres wire-compatible caching layer that sits in front of existing databases to speed up queries and horizontally scale read throughput. Under the hood, Readyset caches the results of cached SELECT statements and incrementally updates these results over time as the underlying data changes.
backend cache caching caching-proxy databases mysql mysql-database postgres postgresql postgresql-database rust rust-lang sql streaming-data
Last synced: 27 Feb 2026
https://github.com/fluvio-community/fluvio
🦀 event stream processing for developers to collect and transform data in motion to power responsive data intensive applications.
cloud-native data-analytics data-flow data-integration data-pipelines distributed-systems event-driven-architecture real-time rust serverless stateful stream-processing stream-processing-engine streaming streaming-analytics streaming-data streaming-data-pipelines streaming-data-processing webassembly
Last synced: 09 Mar 2026
https://github.com/infinyon/fluvio
🦀 event stream processing for developers to stream and process data in motion to power responsive data intensive applications.
cloud-native data-analytics data-flow data-integration data-pipelines distributed-systems event-driven-architecture real-time rust serverless stateful stream-processing stream-processing-engine streaming streaming-analytics streaming-data streaming-data-pipelines streaming-data-processing webassembly
Last synced: 13 May 2025
https://github.com/piskvorky/smart_open
Utils for streaming large files (S3, HDFS, gzip, bz2...)
boto bz2 file gzip-stream hacktoberfest hdfs python s3 streaming streaming-data webhdfs
Last synced: 11 Dec 2025
https://github.com/RaRe-Technologies/smart_open
Utils for streaming large files (S3, HDFS, gzip, bz2...)
boto bz2 file gzip-stream hacktoberfest hdfs python s3 streaming streaming-data webhdfs
Last synced: 31 Mar 2025
https://github.com/memgraph/memgraph
Open-source graph database, tuned for dynamic analytics environments. Easy to adopt, scale and own.
cypher graph graph-algorithms graph-analysis graph-database kafka kafka-streams nosql opencypher stream-processing streaming-data
Last synced: 14 May 2025
https://github.com/reugn/go-streams
A lightweight stream processing library for Go
aerospike data-pipeline data-stream etl kafka kafka-streams low-code nats-streaming pipeline pulsar redis stream-processing stream-processor streaming-api streaming-data streams throttling websocket windowing workflow
Last synced: 14 May 2025
https://github.com/pravega/pravega
Pravega - Streaming as a new software defined storage primitive
data-ingestion distributed-storage real-time-data streaming streaming-data
Last synced: 13 May 2025
https://github.com/bytewax/bytewax
Python Stream Processing
data-engineering data-processing data-science dataflow machine-learning python rust stream-processing streaming-data
Last synced: 13 May 2025
https://github.com/quixio/quix-streams
Python Streaming DataFrames for Kafka
data-engineering data-intensive-applications data-science event-driven-architecture kafka machine-learning python real-time-data-processing stream-processing stream-processor streaming-data streaming-data-pipelines streaming-data-processing time-series-data
Last synced: 13 May 2025
https://github.com/python-streamz/streamz
Real-time stream processing for python
async python real-time streaming-data
Last synced: 12 Dec 2025
https://github.com/Microsoft/trill
Trill is a single-node query processor for temporal or streaming data.
Last synced: 27 Mar 2025
https://github.com/zpl-c/zpl
📐 Pushing the boundaries of simplicity
c cli coroutines cpp cross-platform csv-parser hashing header-only helper json5-parser math memory-allocation memory-management streaming-data tar thread-pool threading time timer zpl
Last synced: 15 May 2025
https://github.com/DoneDeal0/superdiff
Superdiff provides a complete and readable diff for both arrays and objects. Plus, it supports stream and file inputs for handling large datasets efficiently, is battle-tested, has zero dependencies, and is super fast.
array-comparison comparison comparison-tool deep-diff diff json-diff nodejs object-comparison object-diff objectdiff objectdifference react streaming streaming-data typescript
Last synced: 04 Apr 2025
https://github.com/joshday/onlinestats.jl
⚡ Single-pass algorithms for statistics
big-data julia julia-language julialang online-algorithms onlinestats statistics stochastic-approximation streaming-data
Last synced: 15 May 2025
https://github.com/scikit-multiflow/scikit-multiflow
A machine learning package for streaming data in Python. The other ancestor of River.
machine-learning meka moa scikit scikit-learn stream streaming-data
Last synced: 15 May 2025
https://github.com/hstreamdb/hstream
HStreamDB is an open-source, cloud-native streaming database for IoT and beyond. Modernize your data stack for real-time applications.
data-processing database distributed-database distributed-systems financial-analysis haskell hstreamdb iot iot-database kafka materialized-view real-time realtime-database scale sql stream-processing streaming streaming-data streaming-database
Last synced: 15 May 2025
https://github.com/streamdal/streamdal
Code-Native Data Privacy
astrojs data-contracts deno docker event-driven go javascript message-queues nodejs observability python reactjs rust streaming-data tail-f wasi wasm
Last synced: 14 May 2025
https://github.com/Stratio/sparta
Real Time Analytics and Data Pipelines based on Spark Streaming
analytics hdfs kafka lambda olap real-time scala spark spark-streaming sparksql sparta stratio stratio-sparta streaming streaming-data triggers workflow
Last synced: 09 May 2025
https://github.com/swimos/swim
Full stack application platform for building stateful microservices, streaming APIs, and real-time UIs
actor-model asynchronous-programming decentralized-applications distributed-systems microservices-architecture non-blocking-io real-time serverless serverless-framework stateful streaming-api streaming-data web-agent websockets
Last synced: 11 Jan 2026
https://github.com/kLabUM/rrcf
🌲 Implementation of the Robust Random Cut Forest algorithm for anomaly detection on streams
anomaly-detection detect-outliers machine-learning outliers python random-forest robust-random-cut-forest streaming-data tree
Last synced: 14 Mar 2025
https://github.com/guillermo-navas-palencia/optbinning
Optimal binning: monotonic binning with constraints. Support batch & stream optimal binning. Scorecard modelling and counterfactual explanations.
batch-processing binning counterfactual-explanations credit-scoring mdlp optimization python scorecard stream streaming-data woe woebinning
Last synced: 30 Dec 2025
https://github.com/ankur-anand/unisondb
A streaming multimodal database for Edge AI, and Edge Computing.
ai-agents database edge-computing go golang golang-database grpc grpc-go key-value multi-modal replicated row-column streaming streaming-data streaming-database unisondb wide-column-database
Last synced: 14 Jan 2026
https://github.com/lightbend/cloudflow
Cloudflow enables users to quickly develop, orchestrate, and operate distributed streaming applications on Kubernetes.
akka cloudflow flink kubernetes microservices-architectures spark streaming-applications streaming-data streaming-runtimes
Last synced: 23 Oct 2025
https://github.com/microsoft/data-accelerator
Data Accelerator for Apache Spark simplifies onboarding to Streaming of Big Data. It offers a rich, easy to use experience to help with creation, editing and management of Spark jobs on Azure HDInsights or Databricks while enabling the full power of the Spark engine.
apache-spark azure big-data cosmosdb docker eventhub hdinsight iot iothub kafka kafka-streams nodejs react servicefabric spark spark-sql spark-streaming sparksql streaming streaming-data
Last synced: 15 May 2025
https://github.com/keithknott26/datadash
Visualize and graph data in the terminal
chart charting csv go golang graph graphing graphing-application streaming-data tabular-data terminal-based terminal-ui tsv
Last synced: 09 Mar 2026
https://github.com/goodboy/tractor
A distributed, structured concurrency runtime for Python (and friends)
actor-model async-await distributed-systems multicore-programming multiprocessing rpc streaming-data structured-concurrency trio
Last synced: 15 May 2025
https://github.com/selimfirat/pysad
Streaming Anomaly Detection Framework in Python (Outlier Detection for Streaming Data)
anomaly anomaly-detection fraud-detection incremental-learning intrusion-detection machine-learning outlier-detection outliers python real-time streaming-data unsupervised-learning
Last synced: 02 Jan 2026
https://github.com/silverton-io/buz
Serverless multi-protocol + multi-destination event collection system.
analytics analytics-tracking cloudevents cloudevents-schema contracts data data-collection data-platform eventbridge jsonschema product-analytics redpanda redpanda-console schema-registry schema-validation snowplow-analytics streaming-analytics streaming-data webhook-receiver webhook-server
Last synced: 12 Apr 2025
https://github.com/ast-al/rangeless
C++ LINQ-like library of higher-order functions for data manipulation
cpp cpp11 functional functional-programming itertools lazy-evaluation linq parallel pipeline range streaming-algorithms streaming-data
Last synced: 08 May 2025
https://github.com/maraisr/meros
🪢 A fast utility that makes reading multipart responses simple
defer fetch graphql multipart multipart-mixed relay stream streaming-data
Last synced: 07 Oct 2025
https://github.com/GridProtectionAlliance/gsf
Grid Solutions Framework
communications complex-event-processing comtrade electric gsf ieee-1344 ieee-c37118 libraries macrodyne osi-pi phasor-measurement-unit pmu pqdif stream-processing stream-processing-engine streaming-data synchrophasor time-series
Last synced: 09 Apr 2025
https://github.com/evadne/packmatic
Zipping on the fly — Generate downloadable Zip streams by aggregating File or URL Sources
elixir-lang elixir-library elixir-phoenix elixir-plug phoenix streaming-data zip
Last synced: 05 Apr 2025
https://github.com/whitaker-io/machine
Machine is a workflow/pipeline library for processing data
codespaces github-actions golang golang-library golang-package golangci-lint mit-license pipeline pipeline-framework stream-processing streaming-data workflow workflow-engine
Last synced: 14 Mar 2025
https://github.com/AbubakrChan/crewai-UI-business-product-launch
Streamlit UI for crewai | crewai ui
crewai crewaiui python streaming-data streamlit terminal ui
Last synced: 29 Jul 2025
https://github.com/GridProtectionAlliance/openPDC
Open Source Phasor Data Concentrator
bpa-pdc-stream complex-event-processing hadoop iec61850 ieee-1344 ieee-c37118 naspi openpdc pdc phasor-data-concentrator phasor-measurement-unit pmu stream-processing stream-processing-engine streaming-data synchrophasor time-series
Last synced: 17 Apr 2025
https://github.com/tinybirdco/mockingbird
Mockingbird is a mock streaming data generator
data-generation data-generator fakerjs generator http kafka streaming-data tinybird typescript
Last synced: 16 May 2025
https://github.com/wso2/streaming-integrator
A stream processing runtime that allows connecting any streaming data source to any destination and act on it
cloud-native event-driven integration real-time siddhi stream-processing streaming-data streaming-integration wso2
Last synced: 28 Mar 2025
https://github.com/wso2/product-streaming-integrator
A stream processing runtime that allows connecting any streaming data source to any destination and act on it
cloud-native event-driven integration real-time siddhi stream-processing streaming-data streaming-integration wso2
Last synced: 10 Oct 2025
https://github.com/bobbyiliev/materialize-tutorials
Materialize is a streaming database for real-time analytics. This is a collection of Materialize demos and tutorials.
analytics databases materialize postgresql real-time-data sql streaming-data streaming-sql
Last synced: 23 Mar 2025
https://github.com/seznam/euphoria
Euphoria is an open source Java API for creating unified big-data processing flows. It provides an engine independent programming model which can express both batch and stream transformations.
apache-flink apache-spark batch-processing big-data hadoop hdfs java-api kafka streaming-data unified-bigdata-processing
Last synced: 21 Aug 2025
https://paragroup.github.io/WindFlow/
A C++17 Data Stream Processing Parallel Library for Multicores and GPUs
cuda gpu gpu-acceleration gpu-computing gpu-programming multi-core multicore multithreading parallel-computing parallel-patterns parallel-programming parallelism sliding-windows stream stream-api stream-processing streaming streaming-api streaming-data streams
Last synced: 14 May 2025
https://github.com/axway-streams/axway-amplify-streams-js
AMPLIFY Streams Javascript package containing SDK, documentation and sample applications
angular js nodejs server-sent-events sse streamdataio streaming-data vuejs
Last synced: 08 Mar 2026
https://github.com/ylem-co/ylem
Ylem is an open-source platform for real-time data streaming orchestration
data data-visualization dataorchestration etl etl-framework etl-pipeline ide ingestion orchestration pipelines processing real-time reverse-etl scheduler streaming streaming-data transformation workflows
Last synced: 12 Jan 2026
https://github.com/pathwaycom/pathway-benchmarks
Benchmarks for data processing systems: Pathway, Spark, Flink, Kafka Streams
benchmark-framework flink kafka-streams latency pagerank pathway spark-streaming streaming streaming-data wordcount
Last synced: 02 Aug 2025
https://github.com/karafka/karafka-web
Web UI for monitoring and managing Karafka consumers
kafka-consumer kafka-manager kafka-monitor kafka-monitoring kafka-monitoring-dashboards kafka-producer kafka-ui karafka karafka-framework rails ruby ruby-on-rails streaming streaming-data web-
Last synced: 08 Oct 2025
https://github.com/pravega/pravega-samples
Sample Applications for Pravega.
data-streaming pravega sample-app streaming-data
Last synced: 27 Apr 2025
https://github.com/systemaccounting/mxfactorial
a payment application intended for deployment by the united states treasury that replaces banking with accounting
banking bivector capital cga combinatorial-game combinatorial-optimization conservation-laws economics federal-reserve finance fintech game-theory geometric-algebra inflation mathematical-physics monetary-inflation physics price-discovery stream-processing streaming-data
Last synced: 22 Jul 2025
https://github.com/maki-nage/makinage
Stream Processing Made Easy
distributed-systems kafka machine-learning python reactive-machine-learning reactive-programming reactive-systems stream-processing streaming streaming-data
Last synced: 14 Dec 2025
https://github.com/lsds/saber
Window-Based Hybrid CPU/GPU Stream Processing Engine
gpu high-throughput hybrid multicore multicore-cpu saber sliding-windows stream stream-processing streaming streaming-data
Last synced: 24 Jun 2025
https://github.com/rxswiftcommunity/rxhttpclient
Simple HTTP client (uses RxSwift for streaming data)
nsurlsession rxswift streaming-data swift
Last synced: 16 Jun 2025
https://github.com/andrewssobral/imtsl
IMTSL - Incremental and Multi-feature Tensor Subspace Learning
background-subtraction foreground-detection matlab streaming-data subspace-learning tensor
Last synced: 13 Apr 2025
https://github.com/memgraph/twitter-network-analysis
Analyzing a network of tweets and retweets using graph algorithms
kafka kafka-streams memgraph online-pagerank pagerank pagerank-algorithm streaming streaming-data twitter
Last synced: 24 Oct 2025
https://github.com/jzo001/webapistreaming
How to get data as a stream from a WebAPI (.NET)
csharp dotnet-core iasyncenumerable streaming streaming-api streaming-data webapi-core
Last synced: 07 May 2025
https://github.com/sdpython/pandas-streaming
Streaming API for pandas applied to big datasets
numpy pandas python3 streaming-data streaming-data-processing
Last synced: 30 Jun 2025
https://github.com/joshday/onlinestatsbase.jl
Base types for OnlineStats.
big-data julia online-algorithm onlinestats statistics streaming-data
Last synced: 01 May 2025
https://github.com/marrow/cinje
A Pythonic and ultra fast template engine DSL.
cpython dsl pypy python python-2 python-3 streaming-data template-engine text-processing
Last synced: 13 Jul 2025
https://github.com/garystafford/streaming-sales-generator
Streaming Synthetic Sales Data Generator: Streaming sales data generator for Apache Kafka, written in Python
analytics apache-flink apache-kafka data kafka kafka-streams kstreams python spark-structured-streaming streaming-data
Last synced: 03 Aug 2025
https://github.com/certeu/morio
Connect - Stream - Observe - Respond | Morio provides the plumbing for your observability needs
beats cybersecurity cybersecurity-tools kafka observability stream-processing streaming-data
Last synced: 27 Jan 2026
https://github.com/byte271/6cy
High-performance, streaming-first container format with per-block codec polymorphism and robust data recoverability. Reference implementation in Rust.
codec-polymorphism compression container-format data-integrity lz4 rust specification storage-engine streaming-data zstd
Last synced: 23 Feb 2026
https://github.com/pravahio/go-mesh
Realtime data exchange platform for Smart Cities
Last synced: 14 Jan 2026
https://github.com/swimos/transit
Massively real-time city transit streaming application
actor-model actors concurrency concurrent-programming demo-app distributed-systems map mapbox nextbus public-transportation real-time realtime rest-api stateful streaming-api streaming-data
Last synced: 08 Apr 2025
https://github.com/ominibyte/richflow
A Node.js and JavaScript synchronous data pipeline processing, data sharing and stream processing library. Actionable & Transformable Pipeline data processing.
data-flow data-pipeline data-processor data-stream data-transformation flow javascript nodejs pipe-data pipeline-framework streaming-data synchronous
Last synced: 19 Feb 2026
https://github.com/saidsef/aws-kinesis-local
Build for local AWS Kinesis
amazon-kinesis-streams aws-kinesis aws-kinesis-local aws-kinesis-stream docker-container stream-processing streaming-data
Last synced: 12 Apr 2025
https://github.com/garystafford/kinesis-redshift-streaming-demo
aws kinesis-firehose redshift streaming-data
Last synced: 03 Aug 2025
https://github.com/mikejareds/hermiter
Efficient Sequential and Batch Estimation of Univariate and Bivariate Probability Density Functions and Cumulative Distribution Functions along with Quantiles (Univariate) and Nonparametric Correlation (Bivariate)
cumulative-distribution-function kendall-correlation-coefficient online-algorithms probability-density-function quantile spearman-correlation-coefficient statistics streaming-algorithms streaming-data
Last synced: 22 Oct 2025
https://github.com/simplygreatwork/godsend
A simple and eloquent workflow for streaming messages to micro-services.
broker bus filter javascript message-broker message-bus message-passing messaging microservice microservices microservices-architecture stream stream-processing streaming-api streaming-data streams
Last synced: 03 Mar 2025
https://github.com/ni/easyrdma
An easy-to-use, cross-platform, MIT-licensed RDMA library from NI
drivers measurements rdma streaming-data
Last synced: 05 Sep 2025
https://github.com/avriiil/stream-this-dataset
Code to convert static datasets into simulated data streams
dataset-generation streaming-data
Last synced: 06 May 2025
https://github.com/meroxa/turbine-go
Turbine Library for Go
data go golang stream-processing streaming-data
Last synced: 17 Jan 2026
https://github.com/qntfy/frizzle
The magic message bus
consumer golang golang-library kafka kinesis message-bus pipeline producer stream-processing streaming-data
Last synced: 12 Jan 2026
https://github.com/cpacker/graphzip
Mining graph streams using dictionary-based compression
compression graph-algorithms graph-mining streaming-data
Last synced: 09 Aug 2025
https://github.com/qntfy/frafka
Frizzle for Apache Kafka
consumer golang golang-library kafka message-bus pipeline producer stream-processing streaming-data
Last synced: 12 Jan 2026
https://github.com/irc-sphere/hyperstream
HyperStream
compute-engine hyperstream sphere stream-processing streaming-data workflow-engine
Last synced: 13 Aug 2025
https://github.com/fajarnugraha37/turborepo-nestjs
Fullstack multiple service application using turborepo typescript, nestjs, nextjs, prisma, mongodb and rabbitmq.
event-driven event-sourcing message-broker message-bus message-queue messaging microservice mongodb nestjs nextjs noodejs prisma rabbitmq react reactjs streaming-data turborepo typescript vercel
Last synced: 15 Oct 2025
https://github.com/selimhorri/kafka-boot
kafka-producer-consumer-with-spring-boot
java kafka kafka-consumer kafka-producer kafka-topic message-streaming spring-boot spring-kafka streaming-data
Last synced: 12 Apr 2025
https://github.com/microsoft/fabricrtiworkshop
How to build a Medallion design pattern using Fabric Real-Time Intelligence
analytics batch dashboard intelligence realtime streaming-data
Last synced: 29 Oct 2025
https://github.com/embetrix/bmap-writer
bmaptool alternative written in C++
bmap cpp efficiency embedded-linux flashing-tool lightweight streaming-data yocto
Last synced: 24 Apr 2025
https://github.com/mineur/twitter-stream-api
🐤 Another Twitter stream PHP library to retrieve filtered tweets on hot.
guzzle mineur php71 streaming-api streaming-data twitter-streaming-api
Last synced: 24 Feb 2026
https://github.com/kodi/splex
Streaming Log Multiplexer - combine multiple logs to one
cli javascript node stream streaming-data
Last synced: 20 Sep 2025
https://github.com/propensive/turbulence
Simple tools for working with data streams in LazyLists in Scala
multiplexing scala streaming streaming-api streaming-data
Last synced: 16 Aug 2025
https://github.com/correia-jpv/fucking-awesome-bigdata
A curated list of awesome big data frameworks, resources and other awesomeness. With repository stars⭐ and forks🍴
awesome awesome-list bigdata data data-analytics data-science data-stream data-visualization data-warehouse database distributed-database series-database stream-processing streaming-data visualize-data
Last synced: 27 Apr 2025
https://github.com/janaom/gcp-de-project-streaming-pubsub-beam-dataflow
This project demonstrates an end-to-end solution for processing and analyzing real-time conversations data from a JSON file using GCP services and infrastructure automation, showcasing data storage, streaming, processing, and analysis at scale.
apache-beam bigquery dataflow de-project gcp pubsub streaming-data
Last synced: 18 Oct 2025
https://github.com/jsa-aerial/aerobio
Extensible full DAG streaming computation server with services and jobs for RNA-Seq, Tn-Seq, WG-Seq and Term-Seq.
clojure genome-sequencing pipeline-framework pipelines rna-seq streaming-data term-seq tn-seq wg-seq
Last synced: 12 May 2025
https://github.com/benedekrozemberczki/nestedsubtreehash
A distributed implementation of "Nested Subtree Hash Kernels for Large-Scale Graph Classification Over Streams" (ICDM 2012).
data-mining data-science deepwalk distributed-machine-learning feature-extraction gensim graph-classification graph-kernel graph-mining hashing large-scale-learning machine-learning multi-scale node2vec representation-learning streaming-data streaming-processing word2vec
Last synced: 11 Apr 2025
https://github.com/jmaces/statstream
Statistics for Streaming Data
data-science numpy statistics streaming-data
Last synced: 23 Apr 2025
https://github.com/alexklibisz/meetup-viz
Real-time visualization of streaming data from the Meetup.com open events RSVP API.
meetup reactjs streaming-data visualization
Last synced: 28 Oct 2025