Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
Projects in Awesome Lists tagged with data-transformation
A curated list of projects in awesome lists tagged with data-transformation.
https://github.com/mahmoud/glom
☄️ Python's nested data operator (and CLI), for all your declarative restructuring needs. Got data? Glom it! ☄️
apis cli data data-transformation declarative dictionaries nested-structures python recursion utilities
Last synced: 07 Jan 2025
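The idea of declaratively reaching into nested data can be illustrated with a small standard-library sketch. This is not glom's API or implementation; `get_path` is a hypothetical helper, and it only handles dotted paths through dictionaries and list indices.

```python
from functools import reduce
from typing import Any


def get_path(target: Any, path: str) -> Any:
    """Follow a dotted path such as 'a.b.0.c' through nested dicts and lists."""

    def step(current: Any, key: str) -> Any:
        # Lists are indexed by integer position; anything else is treated as a mapping.
        if isinstance(current, list):
            return current[int(key)]
        return current[key]

    return reduce(step, path.split("."), target)


# Made-up sample data for illustration only.
data = {"user": {"name": "Ada", "langs": [{"id": "py"}, {"id": "r"}]}}
print(get_path(data, "user.name"))
print(get_path(data, "user.langs.1.id"))
```

A missing key or out-of-range index raises the usual `KeyError` or `IndexError`; a real tool would likely offer defaults and richer specs.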
https://github.com/hi-primus/optimus
🚚 Agile Data Preparation Workflows made easy with Pandas, Dask, cuDF, Dask-cuDF, Vaex and PySpark
big-data-cleaning bigdata cudf dask dask-cudf data-analysis data-cleaner data-cleaning data-cleansing data-exploration data-extraction data-preparation data-profiling data-science data-transformation data-wrangling machine-learning pyspark spark
Last synced: 07 Jan 2025
https://github.com/2ndQuadrant/pglogical
Logical Replication extension for PostgreSQL 15, 14, 13, 12, 11, 10, 9.6, 9.5, 9.4 (Postgres), providing much faster replication than Slony, Bucardo or Londiste, as well as cross-version upgrades.
cdc data-transformation data-transport database-replication etl logical-decoding postgresql publish-subscribe replication subscription zero-downtime
Last synced: 01 Nov 2024
https://github.com/zinggai/zingg
Scalable identity resolution, entity resolution, data mastering and deduplication using ML
analytics analytics-engineering data-science data-transformation data-transformations dataengineering datalake dataquality dedupe deduplication entity-resolution etl fuzzy-matching fuzzymatch identity identity-resolution masterdata ml modern-data-stack spark
Last synced: 09 Jan 2025
https://github.com/mattt/transformerkit
A block-based API for NSValueTransformer, with a growing collection of useful examples.
data-transformation nsvaluetransformer objective-c swift
Last synced: 05 Jan 2025
https://github.com/mattt/TransformerKit
A block-based API for NSValueTransformer, with a growing collection of useful examples.
data-transformation nsvaluetransformer objective-c swift
Last synced: 29 Nov 2024
https://github.com/raystack/optimus
Optimus is an easy-to-use, reliable, and performant workflow orchestrator for data transformation, data modeling, pipelines, and data quality management.
airflow analytics analytics-engineering automation bigquery business-intelligence data-modelling data-pipelines data-transformation data-warehouse dataops elt etl golang workflows
Last synced: 03 Jan 2025
https://github.com/microsoft/prose
Microsoft Program Synthesis using Examples SDK is a framework of technologies for the automatic generation of programs from input-output examples. This repo includes samples and sample data for the Microsoft Program Synthesis using Examples SDK.
csharp data-transformation data-wrangling dotnet examples microsoft program-synthesis prose sdk synthesis
Last synced: 03 Jan 2025
https://github.com/ScriptFUSION/Porter
💄 Durable and asynchronous data imports for consuming data at scale and publishing testable SDKs.
abstraction asynchronous data-import data-transformation durability fibers framework library php-development porter scalability
Last synced: 05 Nov 2024
https://github.com/scriptfusion/porter
💄 Durable and asynchronous data imports for consuming data at scale and publishing testable SDKs.
abstraction asynchronous data-import data-transformation durability fibers framework library php-development porter scalability
Last synced: 04 Jan 2025
https://github.com/sebkrantz/collapse
Advanced and Fast Data Transformation in R
cran data-aggregation data-analysis data-manipulation data-processing data-science data-transformation econometrics high-performance panel-data r rstats scientific-computing statistics time-series weighted weights
Last synced: 03 Jan 2025
https://github.com/SebKrantz/collapse
Advanced and Fast Data Transformation in R
cran data-aggregation data-analysis data-manipulation data-processing data-science data-transformation econometrics high-performance panel-data r rstats scientific-computing statistics time-series weighted weights
Last synced: 11 Nov 2024
https://github.com/dbohdan/sqawk
Like awk but with SQL and table joins
awk cli converter csv data-transformation data-wrangling delimited-files json sql tsv
Last synced: 07 Jan 2025
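The general idea of querying delimited text with SQL can be sketched with Python's standard library alone. This is an illustration under stated assumptions, not how sqawk itself works: `query_delimited` is a hypothetical helper, the table name `t` is arbitrary, and the sample rows are made up.

```python
import csv
import io
import sqlite3


def query_delimited(text: str, sql: str, table: str = "t", delimiter: str = ",") -> list:
    """Load delimited text (first row = header) into in-memory SQLite and run one query."""
    rows = list(csv.reader(io.StringIO(text), delimiter=delimiter))
    header, body = rows[0], rows[1:]
    conn = sqlite3.connect(":memory:")
    try:
        # Columns are created without a declared type, so values are stored as text.
        cols = ", ".join(f'"{name}"' for name in header)
        marks = ", ".join("?" for _ in header)
        conn.execute(f'CREATE TABLE "{table}" ({cols})')
        conn.executemany(f'INSERT INTO "{table}" VALUES ({marks})', body)
        return conn.execute(sql).fetchall()
    finally:
        conn.close()


# Made-up sample data for illustration only.
people = "name,age\nAda,36\nLinus,28\nGrace,45\n"
result = query_delimited(
    people,
    "SELECT name FROM t WHERE CAST(age AS INTEGER) > 30 ORDER BY name",
)
print(result)
```

Because every value is loaded as text, numeric comparisons need an explicit `CAST`, as in the query above.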
https://github.com/jupyter-naas/naas
Low-code Python library to safely use notebooks in production: schedule workflows, generate assets, trigger webhooks, send notifications, build pipelines, manage secrets (Cloud-only)
ai binder data data-science data-transformation engine etl integration jupyter jupyterlab notebooks open-source pipeline
Last synced: 04 Nov 2024
https://github.com/feichao93/temme
📄 Concise selector to extract JSON from HTML.
css-selector data-transformation html json temme-selector
Last synced: 08 Jan 2025
https://github.com/shinima/temme
📄 Concise selector to extract JSON from HTML.
css-selector data-transformation html json temme-selector
Last synced: 21 Dec 2024
https://github.com/fastverse/fastverse
An Extensible Suite of High-Performance and Low-Dependency Packages for Statistical Computing and Data Manipulation in R
c cpp data-aggregation data-manipulation data-science data-transformation high-performance low-dependency panel-data r rstats statistical-computing time-series weights
Last synced: 05 Jan 2025
https://github.com/mahmoudparsian/data-algorithms-with-spark
O'Reilly Book: [Data Algorithms with Spark] by Mahmoud Parsian
algorithms bigdata data data-abstractions data-algorithms data-transformation dataframes design design-patterns machine-learning mappers mapreduce monoid partitioning-algorithms pyspark python rdd reducers spark transformations
Last synced: 08 Jan 2025
https://github.com/simongray/clojure-dsl-resources
A curated list of Clojure resources for dealing with domain-specific languages.
data-transformation domain-specific-language dsl nlp parsing
Last synced: 10 Dec 2024
https://github.com/SETL-Framework/setl
A simple Spark-powered ETL framework that just works 🍺
big-data data-analysis data-engineering data-science data-transformation dataset etl etl-pipeline framework machine-learning modularization pipeline scala setl spark
Last synced: 08 Nov 2024
https://github.com/setl-framework/setl
A simple Spark-powered ETL framework that just works 🍺
big-data data-analysis data-engineering data-science data-transformation dataset etl etl-pipeline framework machine-learning modularization pipeline scala setl spark
Last synced: 08 Jan 2025
https://github.com/markus-wa/cq
Clojure Query: A Command-line Data Processor for JSON, YAML, EDN, XML and more
cli clojure command-line csv data-processing data-transformation edn hacktoberfest json msgpack transformation xml yaml
Last synced: 16 Nov 2024
https://github.com/strengejacke/sjmisc
Data transformation and utility functions for R
data-transformation data-wrangling labelled-data r recoding
Last synced: 03 Jan 2025
https://github.com/mahmoudparsian/big-data-mapreduce-course
Big Data Modeling, MapReduce, Spark, PySpark @ Santa Clara University
algorithms apache-hadoop apache-spark big-data data-algorithms data-analysis data-engineering data-partition data-transformation glossary mapreduce mapreduce-algorithm mapreduce-python monoid partitioning-algorithms pyspark pyspark-algorithms-book santa-clara-university spark-dataframes spark-rdd
Last synced: 06 Jan 2025
https://github.com/jim-schwoebel/allie
🤖 An automated machine learning framework for audio, text, image, video, or .CSV files (50+ featurizers and 15+ model trainers). Python 3.6 required.
autokeras automl autopytorch data-augmentation data-cleaning data-cleaning-pipeline data-transformation data-visualization datasets deep-learning ludwig machine-learning machine-learning-api machine-learning-library machine-learning-models model-compression model-deployment tpot voice-computing
Last synced: 19 Dec 2024
https://github.com/toucantoco/weaverbird
A visual data pipeline builder with various backends
data-transformation mongodb mysql pandas postgresql redshift snowflake sql vuejs
Last synced: 08 Jan 2025
https://github.com/aws-samples/aws-dbs-refarch-datalake
Reference Architectures for Datalakes on AWS
amazon-emr data-analytics data-catalog data-lake data-transformation emr-cluster glue hive-metastore ingest-data
Last synced: 13 Nov 2024
https://github.com/dry-rb/dry-transformer
Data transformation toolkit
data-mapping data-transformation dry-rb function-composition functional library ruby rubygem
Last synced: 05 Jan 2025
https://github.com/devsgnr/breadroll
breadroll 🥟 is a simple lightweight library for data processing operations written in Typescript and powered by Bun.
bun csv csv-parser data-engineering data-science data-transformation eda exploratory-data-analysis tsv tsv-parser
Last synced: 09 Dec 2024
https://github.com/bruin-data/bruin
Build data pipelines with SQL and Python, ingest data from different sources, add quality checks, and build end-to-end flows.
analytics bigquery data-analysis data-modeling data-pipelines data-transformation python snowflake sql
Last synced: 09 Jan 2025
https://github.com/assemblee-virtuelle/semantic-bus
object flow treatment, data transformation
data-mining data-transformation semantic-data-transformation worflows workflow-sharing
Last synced: 06 Nov 2024
https://github.com/assemblee-virtuelle/Semantic-Bus
object flow treatment, data transformation
data-mining data-transformation semantic-data-transformation worflows workflow-sharing
Last synced: 04 Nov 2024
https://github.com/nilportugues/php-serializer
Serialize PHP variables, including objects, in any format. Supports unserializing them too.
api array-transformer data-transformation hal hal-api jsend-transformer json json-api json-transformation jsonapi marshaller php php7 serialization transformer xml xml-transformation yaml yaml-transformer yml
Last synced: 03 Jan 2025
https://github.com/hopsoft/pipe_envy
Elixir style pipe operator for Ruby
data-transformation elixir ruby
Last synced: 09 Jan 2025
https://github.com/bloomberg/pycsvw
A tool to read CSV files with CSVW metadata and transform them into other formats.
csv csvw data-transformation rdf
Last synced: 09 Nov 2024
https://github.com/nimblelearn/datapackage-m
Power Query M functions for working with Tabular Data Packages (Frictionless Data) in Power BI and Excel
csv-files data-acquisition data-analysis data-analytics data-package data-transformation data-visualisation data-visualization datapackage excel frictionlessdata json-table-schema open-data power-bi power-query powerbi tabular-data tabular-data-package
Last synced: 04 Dec 2024
https://github.com/tsantos84/serializer
A PHP serialization component focused on performance
data-transformation php-library php7 serialization-library
Last synced: 23 Dec 2024
https://github.com/hopsoft/field_mapper
Data mapping & transformation
data-conversion data-transformation ruby
Last synced: 09 Jan 2025
https://github.com/ominibyte/richflow
A synchronous data pipeline, data sharing and stream processing library for Node.js and JavaScript. Actionable and transformable pipeline data processing.
data-flow data-pipeline data-processor data-stream data-transformation flow javascript nodejs pipe-data pipeline-framework streaming-data synchronous
Last synced: 01 Nov 2024
https://github.com/8080labs/bamboolib_binder_template
bamboolib - template for creating your own binder notebook
binder-jupyter-notebook data-exploration data-science data-transformation data-visualisation data-visualization data-viz docker
Last synced: 07 Nov 2024
https://github.com/opportus/object-mapper
Generically maps data from a source object to a target object via extensible strategies and controls
composer-package data-transformation data-transformer dto dto-generator mapper mapping object-mapper object-mapping php transformer
Last synced: 10 Nov 2024
https://github.com/wingkwong/hk-atm-locator
🏧 Hong Kong ATM Locator 🏧 Centralising Automated Teller Machine (ATM) data in Hong Kong in a well-defined, standardised format and displaying it in a web portal for public use
atm data-enrichment data-scraping data-transformation hk-atm-locator hong-kong-atm open-api
Last synced: 07 Nov 2024
https://github.com/wayofdev/laravel-symfony-serializer
🔧 Laravel + Symfony Serializer. This package provides a bridge between Laravel and Symfony Serializer.
api data-mapper data-serialization data-transformation json laravel laravel-api laravel-serializer object-mapping php8 serialize serializer symfony-component symfony-serializer
Last synced: 20 Dec 2024
https://github.com/tsantos84/serializer-benchmark
A PHP benchmark application to compare PHP serializer libraries
benchmark blackfire data-transformation jms-serializer overhead performance-analysis php7 serialization-library simple-serializer symfony-serializer tsantos-serializer
Last synced: 23 Dec 2024
https://github.com/ramonvermeulen/dbt-toolkit
The dbt-toolkit is an early-stage plugin designed to enhance your experience working with dbt-core projects in JetBrains IDEs.
data-transformation dbt dbt-core intellij-plugin jetbrains-plugin plugin
Last synced: 17 Nov 2024
https://github.com/nickforddev/vue-models
Backbone inspired plugin for handling models in Vue.js with built-in serialization
data-transformation fetch model mongodb schema vue vue-plugin vue2 vuejs
Last synced: 11 Oct 2024
https://github.com/sigma-andex/purescript-morello
Cherry-picking 🍒 for your data
data-transformation data-validation purescript
Last synced: 14 Dec 2024
https://github.com/bagher/fast-resource
fast-resource is a data transformation layer that sits between the database and the application's users, enabling quick data retrieval. It further enhances performance by caching data using Redis and Memcached.
cache data-transformation django fastapi flask memcached python redis
Last synced: 21 Nov 2024
https://github.com/shuyib/chronic-kidney-disease-kaggle
Using machine learning models to predict if patients have chronic kidney disease based on a few features. The results of the models are also interpreted to make it more understandable to health practitioners.
data-cleaning-pipeline data-science data-transformation data-visualization diagnostics dimensionality-reduction feature-engineering feature-selection health-data-analysis health-data-science machine-learning machine-learning-algorithm machine-learning-algorithms model-interpretability preventative-medicine
Last synced: 28 Nov 2024
https://github.com/quantumudit/insurance-portfolio-analysis
This project focuses on analyzing and visualizing the insurance portfolio of an anonymous company that implemented an aggressive growth plan in 2021 across the counties of Florida using Python and Power BI
data-analytics data-transformation data-visualization etl geospatial-analysis jupyter-notebook power-bi python
Last synced: 26 Dec 2024
https://github.com/danielgamage/data-lathe
a set of utility functions for remapping and reshaping data (esp normalized), inspired by DSP and shader development
data-transformation normalized-data
Last synced: 08 Nov 2024
https://github.com/sagold/gson-relationship
Transform json-data using relational concepts
data-transformation json json-relationship relation-extraction
Last synced: 26 Oct 2024
https://github.com/bottleneko/dtrans
Erlang data-transformation and validation micro library
data-transformation erlang not-production-ready validation
Last synced: 12 Oct 2024
https://github.com/antononcube/Raku-Data-Reshapers
Raku package with data reshaping functions for different data structures (full arrays, Red tables, Text::CSV tables.)
data data-transformation data-wrangling rakulang
Last synced: 07 Nov 2024
https://github.com/kuhumcst/cuphic
Transform or scrape Hiccup with a declarative DSL.
data-mining data-transformation declarative dsl hiccup html scraping sgml web-scraping xml
Last synced: 16 Nov 2024
https://github.com/antononcube/raku-data-reshapers
Raku package with data reshaping functions for different data structures (full arrays, Red tables, Text::CSV tables.)
data data-transformation data-wrangling rakulang
Last synced: 15 Dec 2024
https://github.com/enram/vptstools
Python library to transfer and convert vertical profile time series data
aeroecology data-transformation oscibio python weather-radar
Last synced: 14 Nov 2024
https://github.com/nevinmathew/spring-batch-etl
The project efficiently processes user data, demonstrating key components. Explore the code for a structured approach to large-scale data transformations.
batch-processing data-processing data-transformation etl spring-batch
Last synced: 28 Dec 2024
https://github.com/quantumudit/zomato-restaurants-analysis
This project focuses on analyzing and visualizing restaurants listed on Zomato across the city of Bengaluru, India, using Python and Power BI
data-analytics data-transformation data-visualization etl geospatial-analysis jupyter-notebook power-bi python
Last synced: 26 Dec 2024
https://github.com/dataform-co/dataform-example-project
Example project on Dataform
data-analysis data-pipeline data-transformation elt sql sqlx
Last synced: 13 Nov 2024
https://github.com/quantumudit/analyzing-cleanaway-services
This project focuses on scraping all the service locations across Australia and their associated attributes from the "Cleanaway" website; performing necessary transformations on the scraped data and then analyzing & visualizing it using Jupyter Notebook and Power BI.
data-analysis data-science data-transformation data-visualization etl jupyter-notebook power-bi python webscraping
Last synced: 06 Nov 2024
https://github.com/josecsotomorales/dbt
Repository for testing data build tool (dbt)
business-intelligence data data-engineering data-transformation dbt dbt-packages
Last synced: 05 Dec 2024
https://github.com/quantumudit/analyzing-whiskyexchange-whisky
This project focuses on scraping data related to Japanese whisky from The Whisky Exchange website; performing necessary transformations on the scraped data and then analyzing & visualizing it using Jupyter Notebook and Power BI.
data-analysis data-science data-transformation data-visualization etl jupyter-notebook power-bi python webscraping
Last synced: 26 Dec 2024
https://github.com/e-alizadeh/sample_dbt_project
Companion template repo for the blog post "dbt for Data Transformation - A Hands-on Tutorial" (https://ealizadeh.com/blog/dbt-tutorial)
data-engineering data-transformation database dbt dbt-packages dbtcloud etl sql
Last synced: 15 Nov 2024
https://github.com/lykmapipo/python-spark-log-analysis
Python scripts to process, and analyze log files using PySpark.
apache-arrow apache-spark apache-spark-sql data-analysis data-extraction data-processing data-transformation log-analysis log-analyzer log-monitor lykmapipo pandas pyarrow pyspark python seaborn spark-ml spark-nlp sparkml-pipelines sql
Last synced: 27 Oct 2024
https://github.com/quantumudit/alteryx-weekly-challenges
This repository contains Alteryx solutions to the weekly challenges published in Alteryx Community
alteryx alteryx-workflow data-analysis data-science data-transformation data-visualization etl
Last synced: 26 Dec 2024
https://github.com/bmarsaud/calendar-shaper
🗓️ iCalendar proxy reshaping the data for your needs
calendar data-transformation icalendar proxy
Last synced: 20 Dec 2024
https://github.com/zzan54/execonverter
EXEConverter is a simple tool that allows you to convert any .exe file into various encoded formats (Base64, Hex, and Binary) and back.
base64 batch batch-script binary cli data-transformation decoding encoding exe file-conversion file-decoding file-encoding file-utilities hex open-source powershell scripting text-encoding utilities windows
Last synced: 13 Oct 2024
https://github.com/quantumudit/uk-student-accommodation-analysis
This project focuses on scraping student property data from the UK Student Accommodation website; performing necessary transformations on the scraped data and then analyzing & visualizing it using Jupyter Notebook and Power BI.
data-analysis data-science data-transformation data-visualization etl jupyter-notebook power-bi python webscraping
Last synced: 26 Dec 2024
https://github.com/quantumudit/analyzing-goodreads-famous-quotes
This project focuses on scraping famous quotes and their related data from the GoodReads website; performing necessary transformations on the scraped data and then analyzing & visualizing it using Jupyter Notebook and Power BI.
data-analysis data-science data-transformation data-visualization etl jupyter-notebook power-bi python webscraping
Last synced: 26 Dec 2024
https://github.com/quantumudit/analyzing-quotes
This project focuses on scraping all the quotes and their related data from the "Quotes To Scrape" website; performing necessary transformations on the scraped data and then analyzing & visualizing it using Jupyter Notebook and Power BI.
data-analysis data-science data-transformation data-visualization etl jupyter-notebook power-bi python webscraping
Last synced: 26 Dec 2024
https://github.com/quantumudit/analyzing-suez-services
This project focuses on scraping all the service locations across Australia & New Zealand and their associated attributes from the "Suez" website; performing necessary transformations on the scraped data and then analyzing & visualizing it using Jupyter Notebook and Power BI.
data-analysis data-science data-transformation data-visualization etl jupyter-notebook power-bi python webscraping
Last synced: 26 Dec 2024
https://github.com/quantumudit/uk-elections-2019-analysis
This project focuses on analyzing and visualizing the United Kingdom elections-2019 results using Python & Power BI.
data-analytics data-transformation data-visualization etl geospatial-analysis jupyter-notebook power-bi python
Last synced: 26 Dec 2024
https://github.com/quantumudit/analyzing-gamerevolution-games
This project focuses on scraping data related to video games from the GameRevolution website; performing necessary transformations on the scraped data and then analyzing & visualizing it using Jupyter Notebook and Power BI.
data-analysis data-science data-transformation data-visualization etl jupyter-notebook power-bi python webscraping
Last synced: 26 Dec 2024
https://github.com/lykmapipo/nyc-tlc-trip-data
Python scripts to download, process, and analyze the New York City Taxi and Limousine Commission (TLC) Trip Record Data dataset
apache-arrow apache-spark data data-engineering data-extraction data-transformation etl fsspec geopandas joblib jupyterlab lykmapipo metadata nyc nyc-taxi-dataset pandas pyarrow python s3
Last synced: 27 Oct 2024
https://github.com/quantumudit/thereyougo-store-analysis
This project focuses on scraping all the products and their related info from the "There You Go" website; performing necessary transformations on the scraped data and then analyzing & visualizing it using Jupyter Notebook and Power BI.
data-analysis data-science data-transformation data-visualization etl jupyter-notebook power-bi python webscraping
Last synced: 26 Dec 2024
https://github.com/quantumudit/analyzing-yell-cafes
This project focuses on scraping data related to cafes and coffee shops in London, England from the Yellow Pages (Yell.com) website; performing necessary transformations on the scraped data and then analyzing & visualizing it using Jupyter Notebook and Power BI.
data-analysis data-science data-transformation data-visualization etl jupyter-notebook power-bi python webscraping
Last synced: 26 Dec 2024
https://github.com/danieljancar/soccer-fields
Fictional match results of European leagues transformed and displayed appropriately (season 2023/2024)
angular angular-material data-transformation tableview
Last synced: 12 Nov 2024
https://github.com/djthorpe/data
Data extraction, transformation, processing and visualisation
canvas csv data data-extraction data-transformation dom golang svg visualization
Last synced: 11 Dec 2024
https://github.com/Azure/iot-central-compute
A simple way to do compute and data transformation on data sent to Azure IoT Central using Azure Functions and a slightly modified version of the Azure IoT Central Device Bridge.
azure-functions data-transformation iot iot-central javascript nodejs tutorial
Last synced: 15 Nov 2024
https://github.com/azure/iot-central-compute
A simple way to do compute and data transformation on data sent to Azure IoT Central using Azure Functions and a slightly modified version of the Azure IoT Central Device Bridge.
azure-functions data-transformation iot iot-central javascript nodejs tutorial
Last synced: 31 Dec 2024
https://github.com/leeper/mcode
Functions to merge and recode across multiple variables
data data-transformation r recode recoding
Last synced: 19 Nov 2024
https://github.com/azurespheredev/microsoftfabric-exploratorium
A comprehensive educational resource hub dedicated to mastering Microsoft Fabric, offering in-depth tutorials, real-world use cases, and hands-on guides for seamless end-to-end analytics
analytics data-science data-transformation lakehouse microsoft-fabric one-lake powerbi real-time-analytics spark warehouse
Last synced: 12 Nov 2024
https://github.com/andryadsm/ibrd-statement-loans
🏦 Project IBRD Statement of Loans (Python, SQL, Excel, Power BI, Tableau)
bank bank-loans dashboard data-analysis data-transformation data-visualization database-management excel finance international-development loans mssqlserver mysql powerbi python sql tableau
Last synced: 14 Dec 2024
https://github.com/gattiharishkumar/blinkit-sales-analysis-dashboard
This project presents a comprehensive sales analysis dashboard for Blinkit, an Indian last-minute delivery app. The dashboard was created using Power BI and provides a detailed overview of the company's sales performance across various outlets and product categories.
dashboard data-analysis data-transformation data-visualization ms-excel-data-analytics power-query powerbi powerbi-visuals
Last synced: 29 Dec 2024
https://github.com/gattiharishkumar/employee-attendance-leaves-analytics-dashboard
This project showcases a Power BI dashboard created to analyze employee attendance and leaves over a three-month period. The data was sourced from Excel datasets available on the Codebasics website.
dashboards data-analysis data-cleaning data-transformation data-visualization power-query-editor powerbi
Last synced: 29 Dec 2024
https://github.com/aneeshmurali-n/ml_bangalore_house_price_analysis
This project focuses on statistical analysis to understand and prepare data for potential machine learning applications. The dataset house_price.csv contains property prices in Bangalore. The analysis aims to perform exploratory data analysis (EDA), detect and handle outliers, check data distribution and normality, and analyze correlations.
box-plot correlation data-distribution data-transformation exploratory-data-analysis heatmap histplot iqr-method log-transformation matplotlib numpy outlier-detection outlier-handling pandas percentile-method python scatter-plot scipy seaborn z-score-method
Last synced: 06 Dec 2024
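Two of the techniques named in the tags above, the IQR method for outlier detection and the log transformation, can be sketched with the standard library. This is a minimal illustration rather than the repository's code: `iqr_bounds` is a hypothetical helper, the 1.5 multiplier is the conventional default, and the prices are made-up values, not taken from house_price.csv.

```python
import math
import statistics


def iqr_bounds(values, k=1.5):
    """Return (low, high) fences: Q1 - k*IQR and Q3 + k*IQR."""
    # statistics.quantiles with n=4 returns the three quartile cut points.
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


# Made-up prices for illustration only.
prices = [40.0, 45.0, 50.0, 52.0, 55.0, 60.0, 65.0, 400.0]

low, high = iqr_bounds(prices)
outliers = [p for p in prices if p < low or p > high]
print(outliers)

# A log transformation compresses the long right tail of skewed, positive data.
log_prices = [math.log(p) for p in prices]
print(round(max(log_prices) - min(log_prices), 3))
```

Quartile conventions differ between libraries, so the fences computed here may not match those from pandas or NumPy exactly.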
https://github.com/moindalvs/learn_simple_linear_regression
Learn about Simple Linear Regression for Data Science
box-cox-transformation correlation data-science data-transformation log-transformation model-validation ols-regression prediction-model simple-linear-regression sklearn statsmodels
Last synced: 17 Nov 2024
https://github.com/svetlanam/pycon-csv-to-firebase
Convert data from CSV and upload it to Firebase Cloud Database
data-transformation firebase-database import json mobile-app pycon python
Last synced: 13 Nov 2024
https://github.com/open-portfolio/finporter
Data transformation tool for investing data
allocation brokerage csv data-transformation finance financial-data investing investment-analysis net-worth portfolio-management swift-cli swift-lang swift-language swift-library tabular-data
Last synced: 18 Dec 2024
https://github.com/shridhar1504/foreign-exchange-rate-time-series-datascience-project
This project will use time series analysis to forecast the exchange rate between the euro and the US dollar. The project will use a variety of statistical techniques, such as ARIMA, to model the data and forecast the exchange rate.
data-analysis data-preprocessing data-science data-transformation data-visualization eda exploratory-data-analysis foreign-exchange-rates machine-learning model-fitting predictive-modeling python3 time-series time-series-analysis
Last synced: 23 Dec 2024
https://github.com/abdelhakim-gh/pfa-process-mining-fraud-detection
New Frontiers in the Fight against Fraud : The Contribution of Process Mining
celonis data-processing data-transformation eda jupyter-notebook machine-learning python
Last synced: 03 Dec 2024
https://github.com/findinpath/dbt_jaffle_shop_historized
Proof of concept on how to historize entity changes on a database with dbt
data-transformation dbt dbt-macros dtspec historized snowflake
Last synced: 02 Dec 2024
https://github.com/antononcube/wl-datareshapers-paclet
Wolfram Language (aka Mathematica) paclet with data reshaping functions, such as conversion to long and wide form, cross tabulation, etc.
contingency-table cross-tabulation data-analysis data-transformation long-form wide-form
Last synced: 15 Dec 2024
https://github.com/richardwarepam16/etl-data_pipelining_project_using_awsservice
Streamline your data flow with AWS Data Pipelining - a reliable and scalable solution for seamless data ingestion, processing, and storage
airflow amazon-data aws-ec2 aws-s3 data-extraction data-loading data-pipepline data-transformation etl-pipeline spotify-api
Last synced: 21 Nov 2024
https://github.com/yash22222/data-analysis-with-python
This repository provides a practical introduction to data acquisition and analysis using Pandas. It covers loading datasets, exploring data, manipulating data, and gaining insights through statistical summaries. Ideal for beginners, it offers code examples and explanations to enhance your data manipulation skills using Pandas for Python.
binning data data-acquisition data-analysis data-binning data-cleaning data-formatting data-integration data-normalization data-preprocessing data-science data-transformation data-wrangling dataframe description numpy pandas pandas-dataframe python python3
Last synced: 05 Jan 2025
https://github.com/moindalvs/learn_feature_engineering
Data set: House Prices: Advanced Regression Techniques. Feature engineering with 80+ features.
data-science data-transformation handling-missing-value label-encoding log-transformation minmaxscaling missing-values
Last synced: 17 Nov 2024
https://github.com/moindalvs/learn_feature_engg_time_series
Feature Engineering on Time Series Dataset (Flight Price Prediction)
data-science data-structures data-transformation feature-engineering feature-extraction feature-selection label-encoder onehot-encoding time-series-analysis
Last synced: 17 Nov 2024
https://github.com/moindalvs/learn_eda_for_data_science
Univariate, Bivariate and Multi-variate Analysis
bivariate-analysis correlation-analysis data-science data-transformation data-type-conversion data-types-and-structures data-visualization duplicates-removal exploratory-data-analysis imputation missing-values multi-variate-analysis normalization outlier-detection pandas-profiling standardization univariate-analysis
Last synced: 17 Nov 2024