Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

Projects in Awesome Lists tagged with streaming-data

A curated list of projects in awesome lists tagged with streaming-data.

https://github.com/materializeinc/materialize

The Cloud Operational Data Store: use SQL to transform, deliver, and act on fast-changing data.

data-store database distributed-systems kafka materialized-view operational-data-store postgresql postgresql-dialect rust sql stream-processing streaming streaming-data

Last synced: 16 Dec 2024

https://github.com/readysettech/readyset

Readyset is a MySQL and Postgres wire-compatible caching layer that sits in front of existing databases to speed up queries and horizontally scale read throughput. Under the hood, Readyset caches the results of SELECT statements and incrementally updates those results as the underlying data changes.

backend cache caching caching-proxy databases mysql mysql-database postgres postgresql postgresql-database rust rust-lang sql streaming-data

Last synced: 17 Dec 2024

https://github.com/piskvorky/smart_open

Utils for streaming large files (S3, HDFS, gzip, bz2...)

boto bz2 file gzip-stream hacktoberfest hdfs python s3 streaming streaming-data webhdfs

Last synced: 16 Dec 2024

https://github.com/memgraph/memgraph

Open-source graph database, tuned for dynamic analytics environments. Easy to adopt, scale and own.

cypher graph graph-algorithms graph-analysis graph-database kafka kafka-streams nosql opencypher stream-processing streaming-data

Last synced: 19 Dec 2024

https://github.com/pravega/pravega

Pravega - Streaming as a new software-defined storage primitive

data-ingestion distributed-storage real-time-data streaming streaming-data

Last synced: 17 Dec 2024

https://github.com/microsoft/trill

Trill is a single-node query processor for temporal or streaming data.

streaming-data temporal-data

Last synced: 21 Dec 2024

https://github.com/python-streamz/streamz

Real-time stream processing for Python

async python real-time streaming-data

Last synced: 17 Dec 2024

https://github.com/scikit-multiflow/scikit-multiflow

A machine learning package for streaming data in Python. The other ancestor of River.

machine-learning meka moa scikit scikit-learn stream streaming-data

Last synced: 20 Dec 2024

https://github.com/hstreamdb/hstream

HStreamDB is an open-source, cloud-native streaming database for IoT and beyond. Modernize your data stack for real-time applications.

data-processing database distributed-database distributed-systems financial-analysis haskell hstreamdb iot iot-database kafka materialized-view real-time realtime-database scale sql stream-processing streaming streaming-data streaming-database

Last synced: 19 Dec 2024

https://github.com/kLabUM/rrcf

🌲 Implementation of the Robust Random Cut Forest algorithm for anomaly detection on streams

anomaly-detection detect-outliers machine-learning outliers python random-forest robust-random-cut-forest streaming-data tree

Last synced: 26 Oct 2024

https://github.com/guillermo-navas-palencia/optbinning

Optimal binning: monotonic binning with constraints. Supports batch and stream optimal binning. Scorecard modelling and counterfactual explanations.

batch-processing binning counterfactual-explanations credit-scoring mdlp optimization python scorecard stream streaming-data woe woebinning

Last synced: 13 Nov 2024

https://github.com/lightbend/cloudflow

Cloudflow enables users to quickly develop, orchestrate, and operate distributed streaming applications on Kubernetes.

akka cloudflow flink kubernetes microservices-architectures spark streaming-applications streaming-data streaming-runtimes

Last synced: 20 Dec 2024

https://github.com/microsoft/data-accelerator

Data Accelerator for Apache Spark simplifies onboarding to streaming of big data. It offers a rich, easy-to-use experience for creating, editing, and managing Spark jobs on Azure HDInsight or Databricks while enabling the full power of the Spark engine.

apache-spark azure big-data cosmosdb docker eventhub hdinsight iot iothub kafka kafka-streams nodejs react servicefabric spark spark-sql spark-streaming sparksql streaming streaming-data

Last synced: 20 Dec 2024

https://github.com/goodboy/tractor

A distributed, structured concurrent runtime for Python (and friends)

actor-model async-await distributed-systems multicore-programming multiprocessing rpc streaming-data structured-concurrency trio

Last synced: 19 Nov 2024

https://github.com/ast-al/rangeless

C++ LINQ-like library of higher-order functions for data manipulation

cpp cpp11 functional functional-programming itertools lazy-evaluation linq parallel pipeline range streaming-algorithms streaming-data

Last synced: 14 Nov 2024

https://github.com/maraisr/meros

🪢 A fast utility that makes reading multipart responses simple

defer fetch graphql multipart multipart-mixed relay stream streaming-data

Last synced: 21 Dec 2024

https://github.com/evadne/packmatic

Zipping on the fly — Generate downloadable Zip streams by aggregating File or URL Sources

elixir-lang elixir-library elixir-phoenix elixir-plug phoenix streaming-data zip

Last synced: 21 Dec 2024

https://github.com/wso2/streaming-integrator

A stream processing runtime that allows connecting any streaming data source to any destination and acting on it

cloud-native event-driven integration real-time siddhi stream-processing streaming-data streaming-integration wso2

Last synced: 18 Dec 2024

https://github.com/bobbyiliev/materialize-tutorials

Materialize is a streaming database for real-time analytics. This is a collection of Materialize demos and tutorials.

analytics databases materialize postgresql real-time-data sql streaming-data streaming-sql

Last synced: 16 Dec 2024

https://github.com/seznam/euphoria

Euphoria is an open source Java API for creating unified big-data processing flows. It provides an engine independent programming model which can express both batch and stream transformations.

apache-flink apache-spark batch-processing big-data hadoop hdfs java-api kafka streaming-data unified-bigdata-processing

Last synced: 19 Dec 2024

https://github.com/axway-streams/axway-amplify-streams-js

AMPLIFY Streams Javascript package containing SDK, documentation and sample applications

angular js nodejs server-sent-events sse streamdataio streaming-data vuejs

Last synced: 09 Nov 2024

https://github.com/pravega/pravega-samples

Sample Applications for Pravega.

data-streaming pravega sample-app streaming-data

Last synced: 11 Nov 2024

https://github.com/rxswiftcommunity/rxhttpclient

Simple HTTP client (uses RxSwift for streaming data)

nsurlsession rxswift streaming-data swift

Last synced: 11 Nov 2024

https://github.com/pathwaycom/pathway-benchmarks

Benchmarks for data processing systems: Pathway, Spark, Flink, Kafka Streams

benchmark-framework flink kafka-streams latency pagerank pathway spark-streaming streaming streaming-data wordcount

Last synced: 13 Nov 2024

https://github.com/andrewssobral/imtsl

IMTSL - Incremental and Multi-feature Tensor Subspace Learning

background-subtraction foreground-detection matlab streaming-data subspace-learning tensor

Last synced: 07 Nov 2024

https://github.com/marrow/cinje

A Pythonic and ultra fast template engine DSL.

cpython dsl pypy python python-2 python-3 streaming-data template-engine text-processing

Last synced: 04 Dec 2024

https://github.com/garystafford/streaming-sales-generator

Streaming Synthetic Sales Data Generator: Streaming sales data generator for Apache Kafka, written in Python

analytics apache-flink apache-kafka data kafka kafka-streams kstreams python spark-structured-streaming streaming-data

Last synced: 06 Dec 2024

https://github.com/sdpython/pandas-streaming

Streaming API for pandas applied to big datasets

numpy pandas python3 streaming-data streaming-data-processing

Last synced: 02 Nov 2024

https://github.com/jzo001/webapistreaming

How to get data as a stream from a WebAPI (.NET)

csharp dotnet-core iasyncenumerable streaming streaming-api streaming-data webapi-core

Last synced: 09 Nov 2024

https://github.com/ominibyte/richflow

A synchronous data pipeline, data sharing, and stream processing library for Node.js and JavaScript. Actionable and transformable pipeline data processing.

data-flow data-pipeline data-processor data-stream data-transformation flow javascript nodejs pipe-data pipeline-framework streaming-data synchronous

Last synced: 01 Nov 2024

https://github.com/MikeJaredS/hermiter

Efficient Sequential and Batch Estimation of Univariate and Bivariate Probability Density Functions and Cumulative Distribution Functions along with Quantiles (Univariate) and Nonparametric Correlation (Bivariate)

cumulative-distribution-function kendall-correlation-coefficient online-algorithms probability-density-function quantile spearman-correlation-coefficient statistics streaming-algorithms streaming-data

Last synced: 22 Nov 2024

https://github.com/fajarnugraha37/turborepo-nestjs

Full-stack multi-service application using Turborepo, TypeScript, NestJS, Next.js, Prisma, MongoDB, and RabbitMQ.

event-driven event-sourcing message-broker message-bus message-queue messaging microservice mongodb nestjs nextjs noodejs prisma rabbitmq react reactjs streaming-data turborepo typescript vercel

Last synced: 05 Dec 2024

https://github.com/cpacker/graphzip

Mining graph streams using dictionary-based compression

compression graph-algorithms graph-mining streaming-data

Last synced: 28 Oct 2024

https://github.com/kodi/splex

Streaming Log Multiplexer - combine multiple logs to one

cli javascript node stream streaming-data

Last synced: 18 Dec 2024

https://github.com/mineur/twitter-stream-api

🐤 Another Twitter stream PHP library to retrieve filtered tweets on the fly.

guzzle mineur php71 streaming-api streaming-data twitter-streaming-api

Last synced: 12 Oct 2024

https://github.com/avriiil/stream-this-dataset

Code to convert static datasets into simulated data streams

dataset-generation streaming-data

Last synced: 23 Oct 2024

https://github.com/propensive/turbulence

Simple tools for working with data streams in LazyLists in Scala

multiplexing scala streaming streaming-api streaming-data

Last synced: 28 Oct 2024

https://github.com/jsa-aerial/aerobio

Extensible full DAG streaming computation server with services and jobs for RNA-Seq, Tn-Seq, WG-Seq and Term-Seq.

clojure genome-sequencing pipeline-framework pipelines rna-seq streaming-data term-seq tn-seq wg-seq

Last synced: 18 Nov 2024

https://github.com/alexklibisz/meetup-viz

Real-time visualization of streaming data from the Meetup.com open events RSVP API.

meetup reactjs streaming-data visualization

Last synced: 11 Oct 2024

https://github.com/jmaces/statstream

Statistics for Streaming Data

data-science numpy statistics streaming-data

Last synced: 20 Oct 2024

https://github.com/neurodata/sdtf

Exploring streaming options for decision trees and random forests. Based on scikit-learn fork.

classification decision-trees machine-learning streaming-data

Last synced: 10 Nov 2024

https://github.com/garystafford/flink-kafka-demo

Apache Flink/Apache Kafka streaming data analytics demonstration using Streaming Synthetic Sales Data Generator

analytics apache-flink apache-kafka flink kafka streaming-data

Last synced: 06 Dec 2024

https://github.com/microsoft/fabricrtiworkshop

How to build a Medallion design pattern using Fabric Real-Time Intelligence

analytics batch dashboard intelligence realtime streaming-data

Last synced: 27 Nov 2024

https://github.com/lampajr/ptdc

Python Twitter Data Collector

collection dataset streaming-data tweepy-api twitter

Last synced: 14 Oct 2024

https://github.com/nastel/jesl

jKool Event Streaming Library -- telemetry collection, simulation and streaming to jKool enabled servers, jKool Cloud.

java jkool-cloud stream-events stream-processing streaming-data

Last synced: 12 Dec 2024

https://github.com/omalperera/realtime-data-streaming-simulator

Real-time IoT data simulator written in Scala to feed a Kafka producer

apache-kafka real-time simulator streaming streaming-data

Last synced: 04 Dec 2024

https://github.com/ibmstreams/sample.starter_notebooks

Notebooks showing Streams applications written in Python

db2 kafka machine-learning python streaming streaming-data

Last synced: 23 Nov 2024

https://github.com/binarskugga/tentacule

Tentacule is an uncomplicated library to deal with a pool of worker processes

multiprocessing parallel pool process python streaming-data worker-pool workers

Last synced: 19 Dec 2024

https://github.com/ekrich/exip

Efficient XML Interchange (EXI) Embeddable C API

c-lang compression-algorithm embedded-c exi exip streaming-data xml

Last synced: 17 Oct 2024

https://github.com/seddryck/streamistry

Streamistry is a lightweight library designed to support pipeline, streaming, and ETL development for data engineering and integration. Its versatility makes it an excellent tool for building robust, scalable data workflows and optimizing data processing tasks. With features such as accumulators, windows, and sinks, it efficiently handles streaming

computational-pipeline data-engineering data-integration data-pipeline etl streaming-data

Last synced: 10 Nov 2024

https://github.com/garystafford/kstreams-kafka-demo

Apache Kafka Streams streaming data analytics demonstration using Streaming Synthetic Sales Data Generator

analytics apache apache-kafka kafka kafka-streams kstreams streaming-data

Last synced: 06 Dec 2024

https://github.com/tomviering/monotone

This repository contains the code to reproduce all of the results in our paper: Making Learners (More) Monotone, T J Viering, A Mey, M Loog, IDA 2020.

learning-theory machine-learning mnist monotonicity streaming-data

Last synced: 09 Nov 2024

https://github.com/calvinlfer/alpakka-cassandra-sink-usage

A demonstration of how to use the Alpakka Cassandra Sink to push case classes through to a Cassandra Table

akka-streams cassandra streaming-data

Last synced: 10 Nov 2024