An open API service indexing awesome lists of open source software.

Projects in Awesome Lists tagged with data-transformation

A curated list of projects in awesome lists tagged with data-transformation.

https://github.com/mahmoud/glom

☄️ Python's nested data operator (and CLI), for all your declarative restructuring needs. Got data? Glom it! ☄️

apis cli data data-transformation declarative dictionaries nested-structures python recursion utilities

Last synced: 16 May 2025

https://github.com/2ndQuadrant/pglogical

Logical Replication extension for PostgreSQL 17, 16, 15, 14, 13, 12, 11, 10, 9.6, 9.5, 9.4 (Postgres), providing much faster replication than Slony, Bucardo or Londiste, as well as cross-version upgrades.

cdc data-transformation data-transport database-replication etl logical-decoding postgresql publish-subscribe replication subscription zero-downtime

Last synced: 30 Mar 2025

https://github.com/bruin-data/bruin

Build data pipelines with SQL and Python, ingest data from different sources, add quality checks, and build end-to-end flows.

analytics bigquery data-analysis data-ingestion data-modeling data-pipelines data-platform data-transformation python snowflake sql

Last synced: 05 Jan 2026

https://github.com/mattt/TransformerKit

A block-based API for NSValueTransformer, with a growing collection of useful examples.

data-transformation nsvaluetransformer objective-c swift

Last synced: 22 Jul 2025

https://github.com/raystack/optimus

Optimus is an easy-to-use, reliable, and performant workflow orchestrator for data transformation, data modeling, pipelines, and data quality management.

airflow analytics analytics-engineering automation bigquery business-intelligence data-modelling data-pipelines data-transformation data-warehouse dataops elt etl golang workflows

Last synced: 16 May 2025

https://github.com/microsoft/prose

Microsoft Program Synthesis using Examples SDK is a framework of technologies for the automatic generation of programs from input-output examples. This repo includes samples and sample data for the Microsoft Program Synthesis using Examples SDK.

csharp data-transformation data-wrangling dotnet examples microsoft program-synthesis prose sdk synthesis

Last synced: 13 May 2025

https://github.com/ScriptFUSION/Porter

:lipstick: Durable and asynchronous data imports for consuming data at scale and publishing testable SDKs.

abstraction asynchronous data-import data-transformation durability fibers framework library php-development porter scalability

Last synced: 05 Apr 2025

https://github.com/dbohdan/sqawk

Like awk but with SQL and table joins

awk cli converter csv data-transformation data-wrangling delimited-files json sql tsv

Last synced: 06 Apr 2025

https://github.com/jupyter-naas/naas

Low-code Python library to safely use notebooks in production: schedule workflows, generate assets, trigger webhooks, send notifications, build pipelines, manage secrets (Cloud-only)

ai binder data data-science data-transformation engine etl integration jupyter jupyterlab notebooks open-source pipeline

Last synced: 03 Apr 2025

https://github.com/feichao93/temme

📄 Concise selector to extract JSON from HTML.

css-selector data-transformation html json temme-selector

Last synced: 07 Apr 2025

https://github.com/fastverse/fastverse

An Extensible Suite of High-Performance and Low-Dependency Packages for Statistical Computing and Data Manipulation in R

c cpp data-aggregation data-manipulation data-science data-transformation high-performance low-dependency panel-data r rstats statistical-computing time-series weights

Last synced: 12 Dec 2025

https://github.com/simongray/clojure-dsl-resources

A curated list of Clojure resources for dealing with domain-specific languages.

data-transformation domain-specific-language dsl nlp parsing

Last synced: 22 Apr 2025

https://github.com/markus-wa/cq

Clojure Query: A Command-line Data Processor for JSON, YAML, EDN, XML and more

cli clojure command-line csv data-processing data-transformation edn hacktoberfest json msgpack transformation xml yaml

Last synced: 10 May 2025

https://github.com/strengejacke/sjmisc

Data transformation and utility functions for R

data-transformation data-wrangling labelled-data r recoding

Last synced: 04 Apr 2025

https://github.com/toucantoco/weaverbird

A visual data pipeline builder with various backends

data-transformation mongodb mysql pandas postgresql redshift snowflake sql vuejs

Last synced: 12 Apr 2025

https://github.com/devsgnr/breadroll

breadroll 🥟 is a simple, lightweight library for data processing operations, written in TypeScript and powered by Bun.

bun csv csv-parser data-engineering data-science data-transformation eda exploratory-data-analysis tsv tsv-parser

Last synced: 11 Oct 2025

https://github.com/hopsoft/pipe_envy

Elixir-style pipe operator for Ruby

data-transformation elixir ruby

Last synced: 18 Jun 2025

https://github.com/bloomberg/pycsvw

A tool to read CSV files with CSVW metadata and transform them into other formats.

csv csvw data-transformation rdf

Last synced: 07 May 2025

https://github.com/ramonvermeulen/dbt-toolkit

The dbt-toolkit is an early-stage plugin designed to enhance your experience working with dbt-core projects in JetBrains IDEs.

data-transformation dbt dbt-core intellij-plugin jetbrains-plugin plugin

Last synced: 17 Sep 2025

https://github.com/tsantos84/serializer

A PHP serialization component focused on performance

data-transformation php-library php7 serialization-library

Last synced: 12 Apr 2025

https://github.com/hopsoft/field_mapper

Data mapping & transformation

data-conversion data-transformation ruby

Last synced: 12 Sep 2025

https://github.com/ominibyte/richflow

A synchronous data pipeline, data sharing, and stream processing library for Node.js and JavaScript. Actionable and transformable pipeline data processing.

data-flow data-pipeline data-processor data-stream data-transformation flow javascript nodejs pipe-data pipeline-framework streaming-data synchronous

Last synced: 30 Mar 2025

https://github.com/wayofdev/laravel-symfony-serializer

🔧 Laravel + Symfony Serializer. This package provides a bridge between Laravel and Symfony Serializer.

api data-mapper data-serialization data-transformation json laravel laravel-api laravel-serializer object-mapping php8 serialize serializer symfony-component symfony-serializer

Last synced: 20 Jul 2025

https://github.com/nicosuave/awesome-sqlmesh

A curated list of awesome SQLMesh resources

data-modeling data-transformation sqlmesh

Last synced: 16 Apr 2025

https://github.com/opportus/object-mapper

Generically maps data from a source object to a target object via extensible strategies and controls

composer-package data-transformation data-transformer dto dto-generator mapper mapping object-mapper object-mapping php transformer

Last synced: 25 Apr 2025

https://github.com/wingkwong/hk-atm-locator

:atm: Hong Kong ATM Locator :atm: Centralising Automated Teller Machine (ATM) data in Hong Kong in a well-defined, standardised format and displaying it in a web portal for public use

atm data-enrichment data-scraping data-transformation hk-atm-locator hong-kong-atm open-api

Last synced: 11 Apr 2025

https://github.com/nickforddev/vue-models

Backbone-inspired plugin for handling models in Vue.js with built-in serialization

data-transformation fetch model mongodb schema vue vue-plugin vue2 vuejs

Last synced: 26 Oct 2025

https://github.com/sigma-andex/purescript-morello

Cherry-picking 🍒 for your data

data-transformation data-validation purescript

Last synced: 02 Apr 2025

https://github.com/vasturiano/index-array-by

A utility function to index arrays by any criteria

array data-transformation index

Last synced: 26 Jul 2025

https://github.com/bagher/fast-resource

fast-resource is a data transformation layer that sits between the database and the application's users, enabling quick data retrieval. It further enhances performance by caching data using Redis and Memcached.

cache data-transformation django fastapi flask memcached python redis

Last synced: 12 Jul 2025

https://github.com/shuyib/chronic-kidney-disease-kaggle

Using machine learning models to predict if patients have chronic kidney disease based on a few features. The results of the models are also interpreted to make it more understandable to health practitioners.

data-cleaning-pipeline data-science data-transformation data-visualization diagnostics dimensionality-reduction feature-engineering feature-selection health-data-analysis health-data-science machine-learning machine-learning-algorithm machine-learning-algorithms model-interpretability preventative-medicine

Last synced: 19 Apr 2025

https://github.com/quantumudit/insurance-portfolio-analysis

This project focuses on analyzing and visualizing the insurance portfolio of an anonymous company that implemented an aggressive growth plan in 2021 across the counties of Florida using Python and Power BI

data-analytics data-transformation data-visualization etl geospatial-analysis jupyter-notebook power-bi python

Last synced: 12 Jun 2025

https://github.com/cjdoris/chevrons.jl

Your friendly >> chevron >> based syntax for piping data through multiple transformations.

data data-science data-transformation julia julia-lang julia-language macros piping repl

Last synced: 16 Oct 2025

https://github.com/glassflow/cli

GlassFlow CLI to create and manage data pipelines

cli data-pipelines data-transformation real-time stream-processing

Last synced: 13 Nov 2025

https://github.com/ronpinkas/dbbridge

dbBridge is a SQL migration tool that imports SQL databases from any supported dialect (MSSQL, MySQL, Oracle, PostgreSQL, SQLite) into any other supported dialect with just three lines of PHP code.

data-integration data-migration data-transfer data-transformation database-conversion db-migrate db-migration etl migration minimal mssql mysql open-source oracle php postgresql simple sql sqlite

Last synced: 30 Apr 2025

https://github.com/sagold/json-relationship

Transform json-data using relational concepts

data-transformation json json-relationship relation-extraction

Last synced: 15 Mar 2025

https://github.com/zzan54/execonverter

EXEConverter is a simple tool that allows you to convert any .exe file into various encoded formats (Base64, Hex, and Binary) and back.

base64 batch batch-script binary cli data-transformation decoding encoding exe file-conversion file-decoding file-encoding file-utilities hex open-source powershell scripting text-encoding utilities windows

Last synced: 10 Apr 2025

https://github.com/danielgamage/data-lathe

A set of utility functions for remapping and reshaping data (especially normalized data), inspired by DSP and shader development

data-transformation normalized-data

Last synced: 12 Sep 2025

https://github.com/lykmapipo/nyc-tlc-trip-data

Python scripts to download, process, and analyze the New York City Taxi and Limousine Commission (TLC) Trip Record Data dataset

apache-arrow apache-spark data data-engineering data-extraction data-transformation etl fsspec geopandas joblib jupyterlab lykmapipo metadata nyc nyc-taxi-dataset pandas pyarrow python s3

Last synced: 17 Sep 2025

https://github.com/aloftdata/vptstools

Python library to transfer and convert vertical profile time series data

aeroecology data-transformation oscibio python weather-radar

Last synced: 12 Apr 2025

https://github.com/quantumudit/analyzing-whiskyexchange-whisky

This project focuses on scraping data related to Japanese whisky from The Whisky Exchange website; performing necessary transformations on the scraped data and then analyzing & visualizing it using Jupyter Notebook and Power BI.

data-analysis data-science data-transformation data-visualization etl jupyter-notebook power-bi python webscraping

Last synced: 15 May 2025

https://github.com/bottleneko/dtrans

Erlang data-transformation and validation micro library

data-transformation erlang not-production-ready validation

Last synced: 10 Apr 2025

https://github.com/antononcube/raku-data-reshapers

Raku package with data reshaping functions for different data structures (full arrays, Red tables, Text::CSV tables).

data data-transformation data-wrangling rakulang

Last synced: 14 Aug 2025

https://github.com/kuhumcst/cuphic

Transform or scrape Hiccup with a declarative DSL.

data-mining data-transformation declarative dsl hiccup html scraping sgml web-scraping xml

Last synced: 10 May 2025

https://github.com/quantumudit/analyzing-cleanaway-services

This project focuses on scraping all the service locations across Australia and their associated attributes from the "Cleanaway" website; performing necessary transformations on the scraped data and then analyzing & visualizing it using Jupyter Notebook and Power BI.

data-analysis data-science data-transformation data-visualization etl jupyter-notebook power-bi python webscraping

Last synced: 09 Apr 2025

https://github.com/alimghmi/ice-client

Web scraper for ICE (Intercontinental Exchange) Markit Settlement Prices. Transforms and inserts data into an MSSQL database.

automated-data-fetch data-transformation database-integration financial-data ice markit-settlement-prices mssql python web-scraping

Last synced: 01 Mar 2025

https://github.com/josecsotomorales/dbt

Repository for testing data build tool (dbt)

business-intelligence data data-engineering data-transformation dbt dbt-packages

Last synced: 06 Jan 2026

https://github.com/nevinmathew/spring-batch-etl

The project efficiently processes user data, demonstrating key components. Explore the code for a structured approach to large-scale data transformations.

batch-processing data-processing data-transformation etl spring-batch

Last synced: 09 Apr 2025

https://github.com/python-odin/odin3

Odin for Python 3. Ground up refresh built with Python 3 (>=3.5) in mind.

data-mapping data-structures data-transformation python3 validation

Last synced: 10 Sep 2025

https://github.com/e-alizadeh/sample_dbt_project

Companion template repo for the blog post "dbt for Data Transformation - A Hands-on Tutorial" (https://ealizadeh.com/blog/dbt-tutorial)

data-engineering data-transformation database dbt dbt-packages dbtcloud etl sql

Last synced: 06 Mar 2025

https://github.com/jhd3197/tukuy

Tukuy is a robust, extensible data transformation library that leverages a flexible plugin system. It simplifies the manipulation, validation, and extraction of data across multiple formats (text, HTML, JSON, dates, numbers, and more), making it an ideal tool for building data pipelines and cleaning workflows.

data-cleaning data-transformation date-parsing plugin python text-processing

Last synced: 25 Jun 2025

https://github.com/prathameshlakawade/pipeline-genie

Pipeline-Genie is an intelligent data pipeline that processes CSV datasets, identifies their schema, and leverages LLaMA 2.0 to extract business insights. Users can select relevant business needs, triggering automated ETL transformations using Apache Spark. The final transformed dataset is stored in AWS S3 and made available for download.

apache-spark artificial-intelligence aws-s3 business-insights csv-processing data-pipeline data-transformation etl-pipeline fastapi generative-ai llama2 machine-learning mongodb-atlas python react

Last synced: 27 Jul 2025

https://github.com/quantumudit/zomato-restaurants-analysis

This project focuses on analyzing and visualizing restaurants listed on Zomato across the city of Bengaluru, India, using Python and Power BI

data-analytics data-transformation data-visualization etl geospatial-analysis jupyter-notebook power-bi python

Last synced: 24 Apr 2025

https://github.com/leeper/mcode

Functions to merge and recode across multiple variables

data data-transformation r recode recoding

Last synced: 16 May 2025

https://github.com/findinpath/dbt_jaffle_shop_historized

Proof of concept on how to historize entity changes on a database with dbt

data-transformation dbt dbt-macros dtspec historized snowflake

Last synced: 05 Jan 2026

https://github.com/Azure/iot-central-compute

A simple way to do compute and data transformation on data sent to Azure IoT Central using Azure Functions and a slightly modified version of the Azure IoT Central Device Bridge.

azure-functions data-transformation iot iot-central javascript nodejs tutorial

Last synced: 08 May 2025

https://github.com/azurespheredev/microsoftfabric-exploratorium

A comprehensive educational resource hub dedicated to mastering Microsoft Fabric, offering in-depth tutorials, real-world use cases, and hands-on guides for seamless end-to-end analytics

analytics data-science data-transformation lakehouse microsoft-fabric one-lake powerbi real-time-analytics spark warehouse

Last synced: 23 Aug 2025

https://github.com/tibcosoftware/catalystml

CatalystML is an open source specification for real-time feature processing, purpose built to transform data for machine learning models.

ai artificial-intelligence data-science data-transformation feature-prep machine-learning

Last synced: 19 Jul 2025

https://github.com/globaldothealth/adtl

Another data transformation language

data-transformation json parser python

Last synced: 15 Jul 2025

https://github.com/quantumudit/uk-elections-2019-analysis

This project focuses on analyzing and visualizing the 2019 United Kingdom election results using Python & Power BI.

data-analytics data-transformation data-visualization etl geospatial-analysis jupyter-notebook power-bi python

Last synced: 15 May 2025

https://github.com/quantumudit/alteryx-weekly-challenges

This repository contains Alteryx solutions to the weekly challenges published in the Alteryx Community

alteryx alteryx-workflow data-analysis data-science data-transformation data-visualization etl

Last synced: 01 Nov 2025

https://github.com/quantumudit/uk-student-accommodation-analysis

This project focuses on scraping data on student properties from the UK Student Accommodation website; performing necessary transformations on the scraped data and then analyzing & visualizing it using Jupyter Notebook and Power BI.

data-analysis data-science data-transformation data-visualization etl jupyter-notebook power-bi python webscraping

Last synced: 01 Nov 2025

https://github.com/quantumudit/analyzing-suez-services

This project focuses on scraping all the service locations across Australia & New Zealand and their associated attributes from the "Suez" website; performing necessary transformations on the scraped data and then analyzing & visualizing it using Jupyter Notebook and Power BI.

data-analysis data-science data-transformation data-visualization etl jupyter-notebook power-bi python webscraping

Last synced: 01 Nov 2025

https://github.com/quantumudit/analyzing-yell-cafes

This project focuses on scraping data related to cafes and coffee shops in London, England from the Yellow Pages (Yell.com) website; performing necessary transformations on the scraped data and then analyzing & visualizing it using Jupyter Notebook and Power BI.

data-analysis data-science data-transformation data-visualization etl jupyter-notebook power-bi python webscraping

Last synced: 15 May 2025

https://github.com/quantumudit/analyzing-gamerevolution-games

This project focuses on scraping data related to video games from the GameRevolution website; performing necessary transformations on the scraped data and then analyzing & visualizing it using Jupyter Notebook and Power BI.

data-analysis data-science data-transformation data-visualization etl jupyter-notebook power-bi python webscraping

Last synced: 15 May 2025

https://github.com/quantumudit/analyzing-quotes

This project focuses on scraping all the quotes and their related data from the "Quotes To Scrape" website; performing necessary transformations on the scraped data and then analyzing & visualizing it using Jupyter Notebook and Power BI.

data-analysis data-science data-transformation data-visualization etl jupyter-notebook power-bi python webscraping

Last synced: 15 May 2025

https://github.com/quantumudit/thereyougo-store-analysis

This project focuses on scraping all the products and their related info from the "There You Go" website; performing necessary transformations on the scraped data and then analyzing & visualizing it using Jupyter Notebook and Power BI.

data-analysis data-science data-transformation data-visualization etl jupyter-notebook power-bi python webscraping

Last synced: 26 Aug 2025

https://github.com/quantumudit/analyzing-goodreads-famous-quotes

This project focuses on scraping famous quotes and their related data from the GoodReads website; performing necessary transformations on the scraped data and then analyzing & visualizing it using Jupyter Notebook and Power BI.

data-analysis data-science data-transformation data-visualization etl jupyter-notebook power-bi python webscraping

Last synced: 31 Aug 2025

https://github.com/bmarsaud/calendar-shaper

🗓️ iCalendar proxy reshaping the data for your needs

calendar data-transformation icalendar proxy

Last synced: 03 Oct 2025