Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/felipekunzler/spark-twitter-analysis
Analyse a Twitter dataset with Spark and visualize the results on a React dashboard.
Last synced: 31 May 2024
![](https://github.com/felipekunzler.png)
https://github.com/Angel-ML/angel
A Flexible and Powerful Parameter Server for large-scale machine learning
high-dimensional machine-learning model online-learning parameter-server scala spark spark-streaming
Last synced: 31 May 2024
![](https://github.com/Angel-ML.png)
https://github.com/rezacsedu/Mining-Maximal-Frequent-Pattern-Spark
Implementation of the static mining part of "Mining maximal frequent patterns in transactional databases and dynamic data streams: A spark-based approach", Information Sciences, Volume 432, March 2018, Pages 278-300
data-mining data-stream frequent-pattern-mining java maximal-frequent-pattern spark structured-streaming
Last synced: 31 May 2024
![](https://github.com/rezacsedu.png)
https://github.com/feng-li/Distributed-Statistical-Computing
Teaching Materials for Distributed Statistical Computing (big data distributed computing teaching materials)
hadoop mapreduce pyspark-tutorial spark spark-teaching statistical-models
Last synced: 31 May 2024
![](https://github.com/feng-li.png)
https://github.com/zhonghuasheng/Tutorial
Summary of a full-stack knowledge architecture for backend development (Java, Golang)
emsp go java keepalived mongodb mqtt mysql netty redis rocketmq spark spring springboot springcloud tomcat tutorial
Last synced: 31 May 2024
![](https://github.com/zhonghuasheng.png)
https://github.com/zhisheng17/flink-learning
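Flink learning blog. http://www.54tianzhisheng.cn/ Covers Flink basics, concepts, principles, hands-on practice, performance tuning, source code analysis, and more. Includes learning examples for Flink Connector, Metrics, Library, DataStream API, Table API & SQL, and related topics, plus case studies of large-scale Flink production projects (PVUV, log storage, real-time deduplication of tens of billions of records, monitoring and alerting). Support for my column "Big Data Real-Time Computing Engine: Flink in Practice and Performance Optimization" (《大数据实时计算引擎 Flink 实战与性能优化》) is welcome.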
clickhouse elasticsearch flink hbase influxdb kafka loki mysql opentsdb rabbitmq redis rocketmq spark stream-processing streaming
Last synced: 31 May 2024
![](https://github.com/zhisheng17.png)
https://github.com/XZB-1248/Spark
✨Spark is a web-based, cross-platform and full-featured Remote Administration Tool (RAT) written in Go that allows you to monitor and control all your devices anytime, anywhere.
dashboard go golang rat remote-access-tool remote-admin-tool remote-administration-tool remote-control server-monitoring shell spark webshell
Last synced: 31 May 2024
![](https://github.com/XZB-1248.png)
https://github.com/liyupi/sql-generator
🔨 Generate structured SQL statements from JSON. Built with Vue3 + TypeScript + Vite + Ant Design + MonacoEditor; a simple project (heavy on logic, light on UI) that is well suited for practice.
ant-design bigdata hive javascript json monaco-editor mysql spark sql typescript vite vue vue3
Last synced: 30 May 2024
![](https://github.com/liyupi.png)
https://github.com/FirelyTeam/spark
Firely and Incendi's open source FHIR server
c-sharp docker dstu2 fhir fhir-api fhir-server fhir-spec fhir-specification r4 spark spark-fhir-server stu3
Last synced: 30 May 2024
![](https://github.com/FirelyTeam.png)
https://github.com/miztiik/s3-to-rds-with-glue
Extract, transform, and load data for analytic processing using AWS Glue
cdk cloud-development-kit etl glue glue-catalog glue-job miztiik-automation s3-to-rds spark
Last synced: 27 May 2024
![](https://github.com/miztiik.png)
https://github.com/alanchn31/Movalytics-Data-Warehouse
Data pipeline performing ETL to AWS Redshift using Spark, orchestrated with Apache Airflow
airflow analytics aws-redshift aws-s3 data-engineer-nanodegree data-engineering data-engineering-pipeline data-modelling data-warehouse-cloud docker movie-database movie-recommendation movie-reviews pyspark python3 redshift spark sql udacity
Last synced: 27 May 2024
![](https://github.com/alanchn31.png)
https://github.com/AuFeld/Data_Engineering_Projects
A collection of data engineering projects: data modeling, ETL pipelines, data lakes, infrastructure configuration on AWS, data warehousing, containerization, and a dashboard to monitor data pipeline KPIs
airflow aws cassandra data-engineering data-lake data-warehouse docker emr etl-pipeline infrastructure-as-code infrastructure-setup postgresql python redshift s3 spark
Last synced: 27 May 2024
![](https://github.com/AuFeld.png)
https://github.com/Qihoo360/Quicksql
A Flexible, Fast, Federated(3F) SQL Analysis Middleware for Multiple Data Sources
Last synced: 26 May 2024
![](https://github.com/Qihoo360.png)
https://github.com/dharmeshkakadia/tpch-hdinsight
TPCH benchmark for various engines
benchmarking hive llap presto spark tpch
Last synced: 26 May 2024
![](https://github.com/dharmeshkakadia.png)
https://github.com/dharmeshkakadia/tpcds-hdinsight
TPCDS benchmark for various engines
benchmarking hive llap presto spark tpcds
Last synced: 26 May 2024
![](https://github.com/dharmeshkakadia.png)
https://github.com/open-datastudio/hive-metastore
Hive metastore on Staroid
hadoop hive hive-metastore kubernetes spark staroid
Last synced: 26 May 2024
![](https://github.com/open-datastudio.png)
https://github.com/lw-lin/CoolplaySpark
CoolplaySpark: Spark source code analysis, Spark libraries, and more
apache-spark spark spark-streaming sparkcore structured-streaming
Last synced: 26 May 2024
![](https://github.com/lw-lin.png)
https://github.com/jadianes/spark-py-notebooks
Apache Spark & Python (pySpark) tutorials for Big Data Analysis and Machine Learning as IPython / Jupyter notebooks
big-data bigdata data-analysis data-science ipython ipython-notebook machine-learning mllib notebook pyspark python spark
Last synced: 26 May 2024
![](https://github.com/jadianes.png)
https://github.com/lightbend/cloudflow
Cloudflow enables users to quickly develop, orchestrate, and operate distributed streaming applications on Kubernetes.
akka cloudflow flink kubernetes microservices-architectures spark streaming-applications streaming-data streaming-runtimes
Last synced: 26 May 2024
![](https://github.com/lightbend.png)
https://github.com/angelotc/MacroDAG
A Dockerized Airflow ETL pipeline that processes macroeconomic indicators from the Federal Reserve.
Last synced: 26 May 2024
![](https://github.com/angelotc.png)
https://github.com/elasticluster/elasticluster
Create clusters of VMs on the cloud and configure them with Ansible.
ansible azure cloud cluster clustering ec2 gcp gridengine hadoop hpc python slurm spark
Last synced: 26 May 2024
![](https://github.com/elasticluster.png)
https://github.com/GaiZhenbiao/ChuanhuChatGPT
GUI for ChatGPT API and many LLMs. Supports agents, file-based QA, GPT finetuning and query with web search. All with a neat UI.
chatbot chatglm chatgpt-api claude dalle3 ernie gemini gemma inspurai llama midjourney minimax moss ollama qwen spark stablelm
Last synced: 25 May 2024
![](https://github.com/GaiZhenbiao.png)
https://github.com/yahoo/lopq
Training of Locally Optimized Product Quantization (LOPQ) models for approximate nearest neighbor search of high dimensional data in Python and Spark.
clustering lopq nearest-neighbor-search product-quantization spark
Last synced: 24 May 2024
![](https://github.com/yahoo.png)
https://github.com/radanalyticsio/spark-operator
Operator for managing Spark clusters on Kubernetes and OpenShift.
apache-spark kubernetes kubernetes-operator openshift spark
Last synced: 22 May 2024
![](https://github.com/radanalyticsio.png)
https://github.com/vmitchell85/spark-kiosk-notify
Adds a notification panel to your Laravel Spark Kiosk, allowing you to send notifications to users.
Last synced: 21 May 2024
![](https://github.com/vmitchell85.png)
https://github.com/gilbitron/spark-create-stripe-plans
A simple Laravel artisan command to create Spark plans in Stripe
laravel laravel-artisan-command spark stripe
Last synced: 21 May 2024
![](https://github.com/gilbitron.png)
https://github.com/cretueusebiu/laravel-spark-camera
Profile Photo Camera support for Laravel Spark
camera laravel laravel-spark php spark
Last synced: 21 May 2024
![](https://github.com/cretueusebiu.png)
https://github.com/cretueusebiu/laravel-spark-google2fa
Google Authenticator support for Laravel Spark
authenticator laravel laravel-spark php spark
Last synced: 21 May 2024
![](https://github.com/cretueusebiu.png)
https://github.com/leobenkel/Zparkio
Boilerplate framework to use Spark and ZIO together.
boiler-plate functional-programming helpers scala spark template zio
Last synced: 20 May 2024
![](https://github.com/leobenkel.png)
https://github.com/ing-bank/popmon
Monitor the stability of a Pandas or Spark dataframe ⚙︎
covariate-shift data-analysis data-distributions data-profiling data-science dataset-shifts drift-detection hacktoberfest ing-bank ipython jupyter mlops monitoring pandas population-monitoring python spark statistical-process-control statistical-tests statistics
Last synced: 20 May 2024
![](https://github.com/ing-bank.png)
https://github.com/databricks/koalas
Koalas: pandas API on Apache Spark
big-data data-science dataframe mlflow pandas pydata spark
Last synced: 18 May 2024
![](https://github.com/databricks.png)
https://simplexspatial.github.io/osm4scala/
Scala and Spark library focused on reading OpenStreetMap Pbf files.
gis openstreetmap openstreetmap-pbf-files osm pbf scala spark
Last synced: 17 May 2024
![](https://github.com/simplexspatial.png)
https://github.com/lucidworks/spark-solr
Tools for reading data from Solr as a Spark RDD and indexing objects from Spark into Solr using SolrJ.
Last synced: 17 May 2024
![](https://github.com/lucidworks.png)
https://github.com/locationtech/rasterframes
Geospatial Raster support for Spark DataFrames
earth-observation geotrellis image-processing machine-learning scala spark spark-ml sparksql
Last synced: 16 May 2024
![](https://github.com/locationtech.png)
https://github.com/WeBankFinTech/DataSphereStudio
DataSphereStudio is a one-stop data application development & management portal, covering scenarios including data exchange, desensitization/cleansing, analysis/mining, quality measurement, visualization, and task scheduling.
airflow atlas azkaban dataworks davinci dolphinscheduler flink governance griffin hadoop hive hue kettle linkis spark supperset tableau visualis workflow zeppelin
Last synced: 16 May 2024
![](https://github.com/WeBankFinTech.png)
https://github.com/getredash/redash
Make Your Company Data Driven. Connect to any data source, easily visualize, dashboard and share your data.
analytics athena bi bigquery business-intelligence dashboard databricks hacktoberfest javascript mysql postgresql python redash redshift spark spark-sql visualization
Last synced: 16 May 2024
![](https://github.com/getredash.png)
https://github.com/strapdata/elassandra
Elassandra = Elasticsearch + Apache Cassandra
aggregation cassandra completion elasticsearch fuzzy-search kibana logstash lucene masterless mission-critical nosql rest-api search spark
Last synced: 16 May 2024
![](https://github.com/strapdata.png)
https://github.com/dylan-profiler/visions
Type System for Data Analysis in Python
data-analysis data-science hacktoberfest numpy pandas python spark type-inference type-system
Last synced: 16 May 2024
![](https://github.com/dylan-profiler.png)
https://github.com/TIBCOSoftware/snappydata
Project SnappyData - memory optimized analytics database, based on Apache Spark™ and Apache Geode™. Stream, Transact, Analyze, Predict in one cluster
analytics memory-database scale snappydata spark stream transaction
Last synced: 16 May 2024
![](https://github.com/TIBCOSoftware.png)
https://github.com/apache/linkis
Apache Linkis builds a computation middleware layer to facilitate connection, governance and orchestration between the upper applications and the underlying data engines.
application-manager context-service engine hive hive-table impala jdbc jobserver linkis livy presto pyspark resource-manager rest-api scriptis spark sql storage thrift-server udf
Last synced: 16 May 2024
![](https://github.com/apache.png)
https://github.com/gchq/Gaffer
A large-scale entity and relation database supporting aggregation of properties
accumulo aggregation big-data graph graph-database hadoop hbase parquet spark
Last synced: 15 May 2024
![](https://github.com/gchq.png)
https://github.com/SeldonIO/seldon-server
Machine Learning Platform and Recommendation Engine built on Kubernetes
aws azure cloud deep-learning deployment docker gcp java kafka kafka-streams kubernetes machine-learning microservices prediction python recommendation-engine recommender-system seldon spark tensorflow
Last synced: 15 May 2024
![](https://github.com/SeldonIO.png)
https://github.com/Alluxio/alluxio
Alluxio, data orchestration for analytics and machine learning in the cloud
alluxio data-analysis data-orchestration hadoop memory-speed presto spark tensorflow virtual-distributed-filesystem
Last synced: 15 May 2024
![](https://github.com/Alluxio.png)
https://github.com/PipelineAI/pipeline
PipelineAI
airflow artificial-intelligence cassandra docker gpu kafka keras kubeflow kubernetes machine-learning neural-network pipelineai pytorch redis scikit-learn spark tensorflow tfx
Last synced: 14 May 2024
![](https://github.com/PipelineAI.png)
https://github.com/tfayyaz/awesome-azure-databricks
Awesome content all about Azure Databricks
awesome awesome-list azure azure-databricks delta-lake spark
Last synced: 14 May 2024
![](https://github.com/tfayyaz.png)
https://github.com/mikeroyal/Apache-Spark-Guide
Apache Spark Guide
apache-spark awesome awesome-automations awesome-list big-data data-engineering data-engineering-pipeline data-science machine-learning pyspark spark spark-streaming
Last synced: 14 May 2024
![](https://github.com/mikeroyal.png)
https://github.com/jonathandinu/spark-ray-data-science
Supporting content (slides and exercises) for the Pearson video series covering best practices for developing scalable applications with Spark and Ray in the context of a data scientist's standard workflow.
artificial-intelligence data-science distributed-computing machine-learning python ray spark
Last synced: 14 May 2024
![](https://github.com/jonathandinu.png)
https://github.com/eto-ai/rikai
Parquet-based ML data format optimized for working with unstructured data
deep-learning machine-learning pytorch spark tensorflow
Last synced: 14 May 2024
![](https://github.com/eto-ai.png)
https://github.com/purduedb/knowledgecubes
Efficient RDF Data Management over Spark
data-management filtering rdf-data spark
Last synced: 14 May 2024
![](https://github.com/purduedb.png)
https://github.com/delta-io/delta
An open-source storage framework that enables building a Lakehouse architecture with compute engines including Spark, PrestoDB, Flink, Trino, and Hive and APIs
acid analytics big-data delta-lake spark
Last synced: 14 May 2024
![](https://github.com/delta-io.png)
https://github.com/yeasy/docker_practice
Learn and understand Docker & container technologies, with real DevOps practice!
book cloud-computing container devops docker kubernetes linux mesos spark swarm
Last synced: 13 May 2024
![](https://github.com/yeasy.png)
https://github.com/SANSA-Stack/SANSA-Stack
Big Data RDF Processing and Analytics Stack built on Apache Spark and Apache Jena http://sansa-stack.github.io/SANSA-Stack/
apache-jena apache-spark distributed-computing flink rdf semantic-web spark
Last synced: 13 May 2024
![](https://github.com/SANSA-Stack.png)
https://github.com/ytsaurus/ytsaurus
YTsaurus is a scalable and fault-tolerant open-source big data platform.
big-data clickhouse distributed-database lakehouse olap-database spark sql ytsaurus
Last synced: 13 May 2024
![](https://github.com/ytsaurus.png)
https://github.com/h2oai/sparkling-water
Sparkling Water provides H2O functionality inside Spark cluster
big-data h2o integration machine-learning pyspark pysparkling rsparkling scala spark
Last synced: 13 May 2024
![](https://github.com/h2oai.png)
https://github.com/LB-Yu/data-systems-learning
Learning summary and examples about data systems.
big-data distributed-systems flink hbase spark
Last synced: 11 May 2024
![](https://github.com/LB-Yu.png)
https://github.com/datamechanics/delight
A Spark UI and Spark History Server alternative with CPU and Memory metrics! Delight is free, cross-platform, and open-source.
apache-spark cpu dashboard delight kubernetes memory monitoring netapp-public spark spark-history-server spark-ui
Last synced: 11 May 2024
![](https://github.com/datamechanics.png)
https://github.com/zero-one-group/geni
A Clojure dataframe library that runs on Spark
big-data clojure clojure-library clojure-repl data-engineering data-science dataframe distributed-computing high-performance-computing machine-learning parallel-computing spark
Last synced: 11 May 2024
![](https://github.com/zero-one-group.png)
https://github.com/FavioVazquez/ds-cheatsheets
List of Data Science Cheatsheets to rule the world
cheatsheet datascience jupyter programming python r spark
Last synced: 10 May 2024
![](https://github.com/FavioVazquez.png)
https://github.com/DataTalksClub/data-engineering-zoomcamp
Free Data Engineering course!
data-engineering dbt docker kafka prefect spark
Last synced: 10 May 2024
![](https://github.com/DataTalksClub.png)
https://github.com/salesforce/TransmogrifAI
TransmogrifAI (pronounced trăns-mŏgˈrə-fī) is an AutoML library for building modular, reusable, strongly typed machine learning workflows on Apache Spark with minimal hand-tuning
ai automated-machine-learning automl dsl einstein estimators feature-engineering features machine-learning ml pipelines salesforce scala spark sparkml structured-data transformations transformers transmogrification transmogrify
Last synced: 09 May 2024
![](https://github.com/salesforce.png)
https://github.com/uni-openai/uniai-maas
An open-source AI and model-as-a-service platform.
ai chatglm chatgpt gpt kimichat midjourney moonshot spark stability-ai uniai
Last synced: 08 May 2024
![](https://github.com/uni-openai.png)
https://github.com/archivesunleashed/twut
An open-source toolkit for analyzing line-oriented JSON Twitter archives with Apache Spark.
apache-spark spark spark-packages tweets twitter-data twitter-json
Last synced: 07 May 2024
![](https://github.com/archivesunleashed.png)
https://github.com/archivesunleashed/aut
The Archives Unleashed Toolkit is an open-source toolkit for analyzing web archives.
analysis apache-spark big-data big-data-analytics dataframe digital-humanities hadoop network-graphing pyspark python3 scala spark text-extraction webarchives
Last synced: 07 May 2024
![](https://github.com/archivesunleashed.png)
https://github.com/archivesunleashed/notebooks
Various examples of notebooks for working with web archives with the Archives Unleashed Toolkit, and derivatives generated by the Archives Unleashed Toolkit.
juypter-notebook notebooks pyspark-notebook python3 spark web-archives
Last synced: 07 May 2024
![](https://github.com/archivesunleashed.png)
https://github.com/helgeho/ArchiveSpark
An Apache Spark framework for easy data processing, extraction as well as derivation for web archives and archival collections, developed at Internet Archive.
archivespark internet-archive spark spark-framework warc web-archiving webarchive
Last synced: 07 May 2024
![](https://github.com/helgeho.png)
https://github.com/helgeho/HadoopConcatGz
A Splittable Hadoop InputFormat for Concatenated GZIP Files and *.(w)arc.gz
hadoop spark warc web-archiving webarchive
Last synced: 07 May 2024
![](https://github.com/helgeho.png)
https://github.com/Microsoft/Mobius
C# and F# language binding and extensions to Apache Spark
apache-spark bigdata csharp dataframe dataset dstream eventhubs fsharp kafka-streaming mapreduce mobius near-real-time rdd spark spark-streaming streaming
Last synced: 05 May 2024
![](https://github.com/microsoft.png)
https://github.com/ohenley/awesome-ada
A curated list of awesome resources related to the Ada and SPARK programming languages
ada ada-binding ada-framework ada-language ada-library ada-programs awesome awesome-list gnat spark spark-ada
Last synced: 05 May 2024
![](https://github.com/ohenley.png)
https://github.com/blaze-init/spark-blaze-extension
A blazing-fast query execution engine that speaks the Apache Spark language and has Arrow-DataFusion at its core.
Last synced: 03 May 2024
![](https://github.com/blaze-init.png)
https://github.com/WeBankFinTech/Scriptis
Scriptis is for interactive data analysis with script development (SQL, PySpark, HiveQL), task submission (Spark, Hive), UDF, function, resource management and intelligent diagnosis.
errorcode hive hive-table hql hue ide linkis pyspark resouce-management scala spark sql udf zeppelin
Last synced: 02 May 2024
![](https://github.com/WeBankFinTech.png)
https://github.com/AdaCore/RecordFlux
Formal specification and generation of verifiable binary parsers, message generators and protocol state machines
ada binary-parser communication-protocol formal-methods formal-specification formal-verification parser protocol-parser protocol-specification python spark
Last synced: 02 May 2024
![](https://github.com/AdaCore.png)
https://github.com/docandrew/CuBit
General-purpose, formally-verified, 64-bit operating system in SPARK/Ada for x86-64
Last synced: 02 May 2024
![](https://github.com/docandrew.png)
https://github.com/RoaringBitmap/RoaringBitmap
A better compressed bitset in Java: used by Apache Spark, Netflix Atlas, Apache Pinot, Tablesaw, and many others
bitset druid java lucene roaring-bitmaps roaringbitmap spark
Last synced: 02 May 2024
![](https://github.com/RoaringBitmap.png)
https://github.com/deeplearning4j/deeplearning4j
Suite of tools for deploying and training deep learning models using the JVM. Highlights include model import for Keras, TensorFlow, and ONNX/PyTorch, a modular and tiny C++ library for running math code, and a Java-based math library on top of the core C++ library. Also includes SameDiff: a PyTorch/TensorFlow-like library for running deep learning using automatic differentiation.
artificial-intelligence clojure deeplearning deeplearning4j dl4j gpu hadoop intellij java linear-algebra matrix-library neural-nets python scala spark
Last synced: 01 May 2024
![](https://github.com/deeplearning4j.png)
https://github.com/JohnSnowLabs/spark-nlp
State of the Art Natural Language Processing
albert bert entity-extraction language-detection language-model lemmatizer llm machine-translation named-entity-recognition natural-language-processing nlp part-of-speech-tagger pyspark question-answering sentiment-analysis spark spell-checker tensorflow text-classification transformers
Last synced: 30 Apr 2024
![](https://github.com/JohnSnowLabs.png)
https://github.com/simplexspatial/osm4scala
Scala and Spark library focused on reading OpenStreetMap Pbf files.
gis openstreetmap openstreetmap-pbf-files osm pbf scala spark
Last synced: 30 Apr 2024
![](https://github.com/simplexspatial.png)
https://github.com/jacksu/utils4s
A collection of test cases and related materials gathered while using Scala and Spark
akka breeze json4s scala scala-demo scala-spark spark spark-streaming
Last synced: 30 Apr 2024
![](https://github.com/jacksu.png)
https://github.com/frees-io/freestyle
A cohesive & pragmatic framework of FP centric Scala libraries
architectural-patterns cassandra free-monads freestyle functional-programming kafka monads redis rpc scala spark tagless-final
Last synced: 30 Apr 2024
![](https://github.com/frees-io.png)
https://github.com/apache/zeppelin
Web-based notebook that enables data-driven, interactive data analytics and collaborative documents with SQL, Scala and more.
big-data database flink java javascript nosql scala spark zeppelin
Last synced: 30 Apr 2024
![](https://github.com/apache.png)
https://github.com/Stratio/sparta
Real Time Analytics and Data Pipelines based on Spark Streaming
analytics hdfs kafka lambda olap real-time scala spark spark-streaming sparksql sparta stratio stratio-sparta streaming streaming-data triggers workflow
Last synced: 30 Apr 2024
![](https://github.com/Stratio.png)
https://github.com/galliaproject/gallia-core
A schema-aware Scala library for data transformation
data-engineering data-manipulation data-science data-transformation etl feature-engineering json nesting scala spark
Last synced: 30 Apr 2024
![](https://github.com/galliaproject.png)
https://github.com/spark-notebook/spark-notebook
Interactive and Reactive Data Science using Scala and Spark.
apache-spark data-science notebook reactive scala spark
Last synced: 30 Apr 2024
![](https://github.com/spark-notebook.png)
https://github.com/indix/sparkplug
Spark package to "plug" holes in data using SQL based rules ⚡️ 🔌
Last synced: 30 Apr 2024
![](https://github.com/indix.png)
https://github.com/indix/schemer
Schema registry for CSV, TSV, JSON, AVRO and Parquet schema. Supports schema inference and GraphQL API.
avro graphql-api json parquet schema-inference schema-registry spark tsv
Last synced: 30 Apr 2024
![](https://github.com/indix.png)
https://github.com/Clustering4Ever/Clustering4Ever
C4E, a JVM friendly library written in Scala for both local and distributed (Spark) Clustering.
ai artificial-intelligence big-data bigdata clustering clustering-algorithm clustering-evaluation scala scalability spark
Last synced: 30 Apr 2024
![](https://github.com/Clustering4Ever.png)
https://github.com/academyofdata/cassandra-zeppelin
Docker-Compose script for Cassandra + Zeppelin setup
Last synced: 30 Apr 2024
![](https://github.com/academyofdata.png)
https://github.com/nmarus/node-red-contrib-spark
Node-RED Nodes to integrate with the Cisco Webex Teams API
Last synced: 29 Apr 2024
![](https://github.com/nmarus.png)
https://github.com/brh55/generator-spark-bot
:zap: Yeoman generator that scaffolds out a Cisco Spark bot with usability and simplicity in mind
cisco cisco-spark flint nodejs scaffold spark yeoman
Last synced: 29 Apr 2024
![](https://github.com/brh55.png)
https://github.com/flint-bot/sparky
Cisco Spark API for NodeJS (deprecated in favor of https://github.com/webex/webex-bot-node-framework)
Last synced: 29 Apr 2024
![](https://github.com/flint-bot.png)