Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/markus-wa/cq
Clojure Query: A Command-line Data Processor for JSON, YAML, EDN, XML and more
cli clojure command-line csv data-processing data-transformation edn hacktoberfest json msgpack transformation xml yaml
Last synced: 24 Jun 2024
https://github.com/kuhumcst/cuphic
Transform or scrape Hiccup with a declarative DSL.
data-mining data-transformation declarative dsl hiccup html scraping sgml web-scraping xml
Last synced: 24 Jun 2024
https://github.com/zinggAI/zingg
Scalable identity resolution, entity resolution, data mastering and deduplication using ML
analytics analytics-engineering data-science data-transformation data-transformations dataengineering datalake dataquality dedupe deduplication entity-resolution etl fuzzy-matching fuzzymatch identity identity-resolution masterdata ml modern-data-stack spark
Last synced: 22 Jun 2024
https://github.com/SebKrantz/collapse
Advanced and Fast Data Transformation in R
cran data-aggregation data-analysis data-manipulation data-processing data-science data-transformation econometrics high-performance panel-data r rstats scientific-computing statistics time-series weighted weights
Last synced: 21 Jun 2024
https://github.com/devsgnr/breadroll
breadroll is a simple lightweight library for data processing operations written in Typescript and powered by Bun.
bun csv csv-parser data-engineering data-science data-transformation eda exploratory-data-analysis tsv tsv-parser
Last synced: 21 Jun 2024
https://github.com/jupyter-naas/naas
Low-code Python library to safely use notebooks in production: schedule workflows, generate assets, trigger webhooks, send notifications, build pipelines, manage secrets (Cloud-only)
ai binder data data-science data-transformation engine etl integration jupyter jupyterlab notebooks open-source pipeline
Last synced: 08 Jun 2024
https://github.com/leeper/mcode
Functions to merge and recode across multiple variables
data data-transformation r recode recoding
Last synced: 04 Jun 2024
https://github.com/nimblelearn/datapackage-m
Power Query M functions for working with Tabular Data Packages (Frictionless Data) in Power BI and Excel
csv-files data-acquisition data-analysis data-analytics data-package data-transformation data-visualisation data-visualization datapackage excel frictionlessdata json-table-schema open-data power-bi power-query powerbi tabular-data tabular-data-package
Last synced: 04 Jun 2024
https://github.com/dry-rb/dry-transformer
Data transformation toolkit
data-mapping data-transformation dry-rb function-composition functional library ruby rubygem
Last synced: 02 Jun 2024
https://github.com/ominibyte/richflow
A Node.js and JavaScript synchronous data pipeline processing, data sharing and stream processing library. Actionable & Transformable Pipeline data processing.
data-flow data-pipeline data-processor data-stream data-transformation flow javascript nodejs pipe-data pipeline-framework streaming-data synchronous
Last synced: 02 Jun 2024
https://github.com/raystack/optimus
Optimus is an easy-to-use, reliable, and performant workflow orchestrator for data transformation, data modeling, pipelines, and data quality management.
airflow analytics analytics-engineering automation bigquery business-intelligence data-modelling data-pipelines data-transformation data-warehouse dataops elt etl golang workflows
Last synced: 01 Jun 2024
https://github.com/2ndQuadrant/pglogical
Logical Replication extension for PostgreSQL 15, 14, 13, 12, 11, 10, 9.6, 9.5, 9.4 (Postgres), providing much faster replication than Slony, Bucardo or Londiste, as well as cross-version upgrades.
cdc data-transformation data-transport database-replication etl logical-decoding postgresql publish-subscribe replication subscription zero-downtime
Last synced: 01 Jun 2024
https://github.com/fastverse/fastverse
An Extensible Suite of High-Performance and Low-Dependency Packages for Statistical Computing and Data Manipulation in R
c cpp data-aggregation data-manipulation data-science data-transformation high-performance low-dependency panel-data r rstats statistical-computing time-series weights
Last synced: 19 May 2024
https://github.com/dbohdan/sqawk
Like awk but with SQL and table joins
awk cli converter csv data-transformation data-wrangling delimited-files json sql tsv
Last synced: 14 May 2024
https://github.com/strengejacke/sjmisc
Data transformation and utility functions for R
data-transformation data-wrangling labelled-data r recoding
Last synced: 14 May 2024
https://github.com/galliaproject/gallia-core
A schema-aware Scala library for data transformation
data-engineering data-manipulation data-science data-transformation etl feature-engineering json nesting scala spark
Last synced: 30 Apr 2024
https://github.com/e-alizadeh/sample_dbt_project
Companion template repo for the blog post "dbt for Data Transformation - A Hands-on Tutorial" (https://ealizadeh.com/blog/dbt-tutorial)
data-engineering data-transformation database dbt dbt-packages dbtcloud etl sql
Last synced: 28 Apr 2024
https://github.com/hi-primus/optimus
:truck: Agile Data Preparation Workflows made easy with Pandas, Dask, cuDF, Dask-cuDF, Vaex and PySpark
big-data-cleaning bigdata cudf dask dask-cudf data-analysis data-cleaner data-cleaning data-cleansing data-exploration data-extraction data-preparation data-profiling data-science data-transformation data-wrangling machine-learning pyspark spark
Last synced: 28 Apr 2024
https://github.com/dreftymac/dynamic.yaml
DEPRECATED: YAML-based data transformations
code-generation data-transformation yaml
Last synced: 24 Apr 2024
https://github.com/Azure/iot-central-compute
A simple way to do compute and data transformation on data sent to Azure IoT Central using Azure Functions and a slightly modified version of the Azure IoT Central Device Bridge.
azure-functions data-transformation iot iot-central javascript nodejs tutorial
Last synced: 23 Apr 2024
https://github.com/ScriptFUSION/Porter
:lipstick: Durable and asynchronous data imports for consuming data at scale and publishing testable SDKs.
abstraction asynchronous data-import data-transformation durability fibers framework library php-development porter scalability
Last synced: 22 Apr 2024
https://github.com/Data-Practitioner/Population-Project
This project is to analyze and understand population data using UN data.
data-analysis data-automation data-reporting data-transformation data-visualization dax excel m-language power-pivot power-query vba
Last synced: 10 Apr 2024
https://github.com/mahmoud/glom
Python's nested data operator (and CLI), for all your declarative restructuring needs. Got data? Glom it!
apis cli data data-transformation declarative dictionaries nested-structures python recursion utilities
Last synced: 26 Mar 2024
https://github.com/mattt/TransformerKit
A block-based API for NSValueTransformer, with a growing collection of useful examples.
data-transformation nsvaluetransformer objective-c swift
Last synced: 23 Mar 2024
https://github.com/SETL-Framework/setl
A simple Spark-powered ETL framework that just works
big-data data-analysis data-engineering data-science data-transformation dataset etl etl-pipeline framework machine-learning modularization pipeline scala setl spark
Last synced: 23 Mar 2024
https://github.com/shinima/temme
Concise selector to extract JSON from HTML.
css-selector data-transformation html json temme-selector
Last synced: 22 Mar 2024
https://github.com/hopsoft/pipe_envy
Elixir style pipe operator for Ruby
data-transformation elixir ruby
Last synced: 21 Mar 2024
https://github.com/aws-samples/aws-dbs-refarch-datalake
Reference Architectures for Datalakes on AWS
amazon-emr data-analytics data-catalog data-lake data-transformation emr-cluster glue hive-metastore ingest-data
Last synced: 21 Mar 2024
https://github.com/antononcube/Raku-Data-Reshapers
Raku package with data reshaping functions for different data structures (full arrays, Red tables, Text::CSV tables.)
data data-transformation data-wrangling rakulang
Last synced: 18 Mar 2024
https://github.com/assemblee-virtuelle/Semantic-Bus
object flow treatment, data transformation
data-mining data-transformation semantic-data-transformation worflows workflow-sharing
Last synced: 16 Mar 2024