Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

https://github.com/CleverInsight/cognito

🚀🤖 Cognito - Simplifies AutoML Data Preprocessing.

automl data-munging data-preperation data-preprocessing data-wrangling

Last synced: 24 Jun 2024

https://github.com/datacarpentry/R-ecology-lesson

Data Analysis and Visualization in R for Ecologists - the version at https://github.com/datacarpentry/R-ecology-lesson-alternative will be merged on 8th July 2024

carpentries data-carpentry data-visualisation data-visualization data-wrangling ecology english lesson r stable

Last synced: 10 Jun 2024

https://github.com/dlab-berkeley/R-Fundamentals-Legacy

D-Lab's 12-hour introduction to R Fundamentals. Learn how to create variables and functions, manipulate data frames, make visualizations, use control flow structures, and more, using R in RStudio.

automation data-science data-visualization data-wrangling r

Last synced: 10 Jun 2024

https://github.com/r-rudra/tidycells

Automatic transformation of untidy spreadsheet-like data into tidy form

cran data-wrangling heuristic heuristic-algorithm r r-package r-stats spreadsheets tabular-data tidy

Last synced: 10 Jun 2024

https://github.com/r-hyperspec/hyperSpec

hyperSpec: Tools for Spectroscopy (R package)

data-wrangling hyperspectral imaging infrared nmr r-package raman spectroscopy uv-vis xrf

Last synced: 10 Jun 2024

https://github.com/uc-r/uc-r.github.io

Main repository for R programming courses @ University of Cincinnati, courses and tutorials that focus on data wrangling, exploration, visualization, and analysis with R.

classroom data-science data-wrangling machine-learning r tutorial tutorial-code visualization

Last synced: 31 May 2024

https://github.com/chris-prener/qualmap

R package for working with semi-structured qualitative GIS data

data-management data-wrangling gis mapping package qualitative qualitative-analysis qualitative-gis r rstats

Last synced: 20 May 2024

https://github.com/OpenRefine/OpenRefine

OpenRefine is a free, open source power tool for working with messy data and improving it

data-analysis data-science data-wrangling datacleaning datacleansing datajournalism datamining java journalism opendata reconciliation wikidata

Last synced: 15 May 2024

https://github.com/dbohdan/sqawk

Like awk but with SQL and table joins

awk cli converter csv data-transformation data-wrangling delimited-files json sql tsv

Last synced: 14 May 2024

https://github.com/strengejacke/sjmisc

Data transformation and utility functions for R

data-transformation data-wrangling labelled-data r recoding

Last synced: 14 May 2024

https://github.com/TomFevrier/kiwis

A Pandas-inspired data wrangling toolkit in JavaScript

data data-manipulation data-wrangling pandas

Last synced: 07 May 2024

https://github.com/tomwright/dasel

Select, put and delete data from JSON, TOML, YAML, XML and CSV files with a single tool. Supports conversion between formats and can be used as a Go package.

cli config configuration data-processing data-structures data-wrangling devops-tools go golang json json-processing parser query selector toml update xml yaml yaml-processor

Last synced: 29 Apr 2024

https://github.com/Desbordante/desbordante-core

Desbordante is a high-performance data profiler that can discover many different patterns in data using various algorithms. It also supports running data cleaning scenarios with these algorithms. Desbordante has a console version and an easy-to-use web application.

anomaly-detection correlations data-analytics data-cleaning data-cleansing data-engineering data-exploration data-mining data-mining-algorithms data-preprocessing data-profiling data-science data-wrangling exploratory-data-analysis feature-engineering feature-extraction feature-selection knowledge-discovery spreadsheets tabular-data

Last synced: 21 Apr 2024

https://github.com/kjam/data-cleaning-101

Data Cleaning Libraries with Python

data-validation data-wrangling python teaching

Last synced: 19 Apr 2024

https://github.com/brimdata/zui

Zui is a powerful desktop application for exploring and working with data. The official front-end to the Zed lake.

csv data data-analytics data-viz data-wrangling electron-app json-inspector keyword-search super-structured-data table-view type-system zed zng zq zui

Last synced: 17 Apr 2024

https://github.com/christianbors/OpenRefineQualityMetrics

MetricDoc is an interactive visual exploration environment for assessing data quality

data-profiling data-quality data-quality-checks data-wrangling interactive-visualizations quality-metrics visual-analytics

Last synced: 08 Apr 2024

https://github.com/audiomuze/tagminder

Import, maintain and export tag metadata to/from audio files and a dynamically created SQLite table. Automates incremental tag cleanup, enrichment and standardisation for your digital audio library at scale using pre-scripted SQL queries, achieving quality and consistency throughout your music collection in a manner not possible with a tagger.

audio-metadata data-enrichment data-wrangling flac metadata-editing metadata-extraction music-library music-metadata music-tagging musicbrainz rym-capitalisation sqlite3

Last synced: 05 Apr 2024

https://github.com/ContextLab/hypertools

A Python toolbox for gaining geometric insights into high-dimensional data

data-visualization data-wrangling high-dimensional-data python text-vectorization time-series topic-modeling visualization

Last synced: 27 Mar 2024

https://github.com/antononcube/Raku-Data-Reshapers

Raku package with data reshaping functions for different data structures (full arrays, Red tables, Text::CSV tables).

data data-transformation data-wrangling rakulang

Last synced: 18 Mar 2024

https://github.com/asavinov/prosto

Prosto is a data processing toolkit radically changing how data is processed by heavily relying on functions and operations with functions - an alternative to map-reduce and join-groupby

business-intelligence data-preparation data-preprocessing data-processing data-science data-wrangling feature-engineering map-reduce olap pandas python spark workflow

Last synced: 18 Mar 2024

https://github.com/shawnbrown/datatest

Tools for test driven data-wrangling and data validation.

data-wrangling pytest-plugin python quality-assurance testing unittest

Last synced: 16 Mar 2024