An open API service indexing awesome lists of open source software.

data

Individual facts, statistics, or items of information, often numeric. In a technical sense, data are a set of values of qualitative or quantitative variables about one or more persons or objects. (https://en.wikipedia.org/w/index.php?title=Data&oldid=1093674723, released under CC BY-SA 3.0)

https://github.com/hadley/neiss

Data from National Electronic Injury Surveillance System

data r

Last synced: 19 Oct 2025

https://github.com/leonawicz/rtrek

R package for Star Trek datasets and related R functions.

data r-package stapi star-trek

Last synced: 06 Sep 2025

https://github.com/fityannugroho/idn-area-data

Provides the administrative areas data of Indonesia based on the latest official sources 🇮🇩

data data-sources hacktoberfest idn-area indonesia island javascript open-data pulau wilayah wilayah-indonesia

Last synced: 06 Apr 2025

https://github.com/leeper/csvy

Import and Export CSV Data With a YAML Metadata Header

cran csv csvy data r yaml

Last synced: 17 Mar 2025

https://github.com/cylondata/twister2

A composable framework for fast and scalable data analytics

batch big-data data graph iterative streaming

Last synced: 29 May 2026

https://github.com/dxxxxy/EssentialCosmeticsUnlocker

Client-side only patch that allows you to unlock ALL cosmetics (+ emotes) in the Essential mod. Works on every version of Essential MC (1.8.9 - 1.20.6).

client compatibility config cosmetics data dump emotes essential fabric forge free inject minecraft mixin mod patch side universal unlocker

Last synced: 09 Aug 2025

https://github.com/vijinho/iso-country-data

ISO Country data in JSON and CSV format.

country-data currencies data iso-country json-data

Last synced: 21 Jan 2026

https://github.com/paiml/ruchy

Ruchy, a systems-oriented scripting language that transpiles to Rust.

data notebooks repl ruchy rust science scripting wasm

Last synced: 01 Apr 2026

https://github.com/emberexperts/ember-await

Await component for Ember Applications. Resolve your data on demand, just when needed.

await data ember-addon ember-await emberjs javascript loading

Last synced: 12 Apr 2025

https://github.com/kevinwang15/treebox

An interactive TreeMap visualization - Please star if you like this project

canvas canvas2d data javascript treemap visualization

Last synced: 13 May 2025

https://github.com/jonschlinkert/cache-base

Basic object store with methods like get/set/extend/omit

cache config data dot-notation emit extend inherit javascript node nodejs object store

Last synced: 05 Apr 2025

https://github.com/mysociety/yournextrepresentative

A website for crowd-sourcing structured election candidate data

civic-tech data elections politics

Last synced: 02 Aug 2025

https://github.com/turbot/steampipe-sqlite

Steampipe SQLite is a zero-ETL engine for SQLite. Virtual tables translate queries into live API calls for cloud services and APIs. Hundreds of plugins with thousands of documented examples.

aws azure data devsecops etl gcp golang kubernetes security sql sqlite steampipe steampipe-engine zero-etl

Last synced: 28 Jul 2025

https://github.com/ropensci/rsnps

Wrapper to a number of SNP web APIs

data r r-package rstats snps web-api

Last synced: 22 Feb 2026

https://github.com/tg12/script-toolbox

This repository contains a collection of scripts and tools that I have written to solve various problems that I have come across.

data exif exif-metadata exif-reader exiftool scripts scripts-collection toolbox toolkit toolkits

Last synced: 27 Jul 2025

https://github.com/alinski29/stonks.jl

Julia library for standardizing financial data retrieval and storage from multiple APIs.

data data-mining data-science dataframe finance julia trading trading-algorithms

Last synced: 06 May 2025

https://github.com/ioos/bio_data_guide

Standardizing Marine Biological Data Working Group - An open community to facilitate the mobilization of biological data to OBIS.

darwin-core data data-management marine-biology marine-data obis tutorials

Last synced: 30 Oct 2025

https://github.com/datakitchen/dataops-testgen

DataOps Data Quality TestGen is part of DataKitchen's Open Source Data Observability. DataOps TestGen delivers simple, fast data quality test generation and execution through data profiling, new dataset hygiene review, AI generation of data quality validation tests, ongoing testing of data refreshes, and continuous anomaly monitoring.

data data-engineering data-observability data-quality data-science data-testing datachecker dataops dataprofiling dataquality datavalidation mssql postgresql python redshift self-hosted snowflake

Last synced: 25 Feb 2026

https://github.com/zachowj/xfinity-data-usage

Fetch Xfinity data usage and serve it via an HTTP endpoint, publish it to MQTT or post it to a URL.

data home-assistant mqtt usage xfinity xfinity-data

Last synced: 09 Mar 2025

https://github.com/modm-io/modm-devices

Curated device data for all AVR and ARM Cortex-M devices

avr cortex-m data microcontroller modm nrf sam stm32

Last synced: 06 Apr 2026

https://github.com/zbrookle/sql_to_ibis

A Python package that parses SQL and converts it to ibis expressions

data databases dataframes etl hacktoberfest ibis sql

Last synced: 14 Apr 2025

https://github.com/mcarlucci/decky-storage-cleaner

A Decky Loader plugin for tidying up your Steam Deck's storage. Quickly visualize, select and clear shader cache and compatibility data.

cache cleaner compat compatibility data decky decky-loader plugin shader steam steamdeck storage utility

Last synced: 03 Mar 2025

https://github.com/spratiher9/sparkora

Powerful rapid automatic EDA and feature engineering library with a very easy to use API 🌟

apache apache-spark data data-analysis data-analysis-python data-analytics easy-to-use eda exploratory-data-analysis open-source opensource pyspark python python3 toolkit

Last synced: 10 Jul 2025

https://github.com/OElesin/querypal

Web UI for Amazon Athena

analytics aws aws-athena data data-lake sql

Last synced: 30 Jul 2025

https://github.com/bpbond/srdb

Global soil respiration database

carbon-cycle data global-database science soil soil-respiration

Last synced: 07 Apr 2025

https://github.com/quasilyte/gocorpus

The code used to serve gocorpus application

analysis corpus data go gogrep golang query search statistics syntax

Last synced: 21 Apr 2025

https://github.com/tony-xlh/animatedqrcodereader

Animated QR code reader

animated data qrcode transfer

Last synced: 23 Feb 2026

https://github.com/moscarde/junior_zone

Junior job openings updated daily. Telegram and online spreadsheet.

data python scraping telegram

Last synced: 14 Apr 2025

https://github.com/expressapp/construct

Library for dealing with data structures

data elixir elixir-construct elixir-lang types validation

Last synced: 11 Dec 2025

https://github.com/jcparkyn/phetch

A small Blazor library for handling async query state, in the style of React Query

async blazor data fetch query

Last synced: 27 Mar 2025

https://github.com/tobilg/aws-iam-data

This repository contains the full dataset of AWS IAM data (services, actions, resource types and conditions keys). It's updated on a daily basis at 4AM UTC.

aws data git-scraping iam

Last synced: 07 Apr 2025

https://github.com/iesahin/xvc

A robust (🐢) and fast (🐇) MLOps tool for managing data and pipelines in Rust (🦀)

command-line-tool data data-engineering data-pipelines data-science devops machine-learning machine-learning-engineering mlops rust

Last synced: 28 Jun 2025

https://github.com/ahmed-mohamed-sn/olliePy

OlliePy is a python package which can help data scientists in exploring their data and evaluating and analysing their machine learning experiments by utilising the power and structure of modern web applications. The data scientist only needs to provide the data and any required information and OlliePy will generate the rest.

ai analytics charts dashboard data data-analytics data-science data-scientist eda error-analysis exploratory-data-analysis machine-learning python visualization

Last synced: 08 May 2025

https://github.com/3c7/common-osint-model

Converting data from services like Censys and Shodan to a common data model

analysis censys data infrastructure model osint shodan

Last synced: 22 Feb 2026

https://github.com/siberiacancode/mock-config-server

🎉 A tool that easily and quickly imitates server operation; create a full fake API in a few steps

api config data database fake graphql mock mock-server mocking rest rest-api server

Last synced: 14 Jul 2025

https://github.com/olist/work-at-olist-data

Apply for a job at Olist's Data Team: https://olist.gupy.io/

analytics data dataengineering datascience dataset julia machinelearning pandas python r sql

Last synced: 25 Jun 2025

https://github.com/distributedsystemsgroup/zoe

Zoe: Container Analytics as a Service -- mirror of https://gitlab.eurecom.fr/zoe/main/

analytics containers data jupyter python spark

Last synced: 23 Oct 2025

https://github.com/dig-eds-cat/digeds_cat

This research seeks to examine best practice in the field of digital editions by collating relevant evidence in a detailed catalogue of extant digital projects.

catalogue data digital-edition digital-humanities library-database open-access open-data open-source

Last synced: 08 Apr 2025

https://github.com/gmh5225/ida-find-.data-ptr

A simple IDAPython script to find .data pointers

anti anti-cheat cheat data data-ptr driver ida idapython plugin ptr windows

Last synced: 17 May 2026

https://github.com/datakitchen/dataops-observability

DataOps Observability is part of DataKitchen's Open Source Data Observability. DataOps Observability monitors every data journey from data source to customer value, from any team development environment into production, across every tool, team, environment, and customer so that problems are detected, localized, and understood immediately.

data data-engineering data-observability data-science dataops pipleine-monitoring

Last synced: 01 Apr 2026

https://github.com/lucasferreira/react-async-fetcher

React component for asynchronously loading/fetching online data

asynchronous component data fetch loader react react-native reactjs

Last synced: 28 Jan 2026

https://github.com/michabirklbauer/instagram_json_viewer

Transforms Instagram's *.json / backup data - that you get via the Data Download Tool - to a readable format!

backup converter data download html instagram json python viewer

Last synced: 18 Jul 2025

https://github.com/datacoon/undatum

undatum: a command-line tool for data processing. Brings CSV simplicity to NDJSON, BSON, XML and other data files

bson cli command-line csv data dataset json jsonl jsonlines parquet

Last synced: 18 Jan 2026

https://github.com/jkanev/treetime

TreeTime is a data organisation, management and analysis tool. A tree is a hierarchical structure that arranges information in units and sub-units. TreeTime uses linked trees (one data item can be part of different distinct trees) to store and organise any general purpose data.

ancestry data data-organisation data-organizer hierarchical-data information-management information-management-system information-manager linked-trees ontology project-management time-management to-do-list tree tree-editor tree-structure

Last synced: 14 Dec 2025

https://github.com/jamesleesaunders/d3-ez

D3-EZ Easy Reusable Charts

chart d3 data dataviz graph svg visualization

Last synced: 16 Mar 2025

https://github.com/ysichov/Simple-Data-Explorer

Simple Data Explorer

abap data erp hcm sap sde se16n viewer

Last synced: 04 May 2025

https://github.com/paulknysh/sym

A Mathematica package for generating symbolic models from data

data generation mathematica model symbolic

Last synced: 08 Jul 2025

https://github.com/intake/akimbo

For when your data won't fit in your dataframe

awkward-array cudf data dataframe pandas polars python

Last synced: 15 Jun 2026

https://github.com/ysichov/simple-data-explorer

Simple Data Explorer

abap data erp hcm sap sde se16n viewer

Last synced: 13 Mar 2026

https://github.com/r-geoflow/geoflow

Orchestrate Geospatial (Meta)Data Management Workflows and Manage FAIR Services

data dataverse fair geospatial inspire iso metadata ocs ogc orchestrator postgis r spatial workflow zenodo

Last synced: 22 Oct 2025

https://github.com/chrisvwn/Rnightlights

R package to extract data from satellite nightlights.

data dmsp-ols extraction nightlights noaa package r satellite snpp-viirs

Last synced: 13 Jul 2025

https://github.com/streamingfast/streamingfast

The dfuse Blockchain Data Platform

blockchain data eosio ethereum platform

Last synced: 30 Apr 2025

https://github.com/0xb10c/bitcoin-development-history

Data and an example for an open source timeline of the history of Bitcoin development

bitcoin data history json timeline

Last synced: 14 Apr 2025

https://github.com/raphaelmansuy/digital_palace

My Digital Palace - A Personal Journal for Reflection - A place to store all my thoughts

ai data data-engineering

Last synced: 16 Mar 2025

https://github.com/vesparny/brcast

Tiny data broadcaster with 0 dependencies

broadcast data emitter event pubsub

Last synced: 13 Apr 2025

https://github.com/ropensci/rdataretriever

R interface to the Data Retriever

data data-science database datasets r r-package rstats science

Last synced: 22 Oct 2025

https://github.com/atolcd/pentaho-gis-plugins

🗺 GIS plugins for Pentaho Data Integration

data dxf etl geojson gpx java mif-mid pentaho-data-integration shp spatialite svg

Last synced: 23 Jan 2026

https://github.com/jmboehm/douglass.jl

Stata-like toolkit for data wrangling on Julia DataFrames

data data-frames economics julia stata tabular-data

Last synced: 31 Oct 2025

https://github.com/thantthet/YBS-Data

Yangon Bus Service data

bus data public transit yangon ybs

Last synced: 14 Mar 2025

https://github.com/bluewave-labs/maskwise

Maskwise detects, redacts, masks, and anonymizes sensitive data across text, images, and structured data in training datasets for LLM systems. Powered by Microsoft Presidio

data data-anonymization data-redaction data-scanning gdpr-compliance hipaa-compliance pii-anonymization pii-detection sensitive-data-masking

Last synced: 20 Jan 2026

https://github.com/utrechtuniversity/yoda

A system for reliable, long-term storage and archiving of large amounts of research data during all stages of a study.

ansible automated-deployment data irods research utrecht-university yoda

Last synced: 07 Apr 2025

https://github.com/wjakethompson/taylor

A comprehensive resource for data on Taylor Swift songs, and ggplot2 helper functions

color-palettes data genius-lyrics ggplot2-themes lyrics r spotify spotify-api taylor-swift

Last synced: 06 Apr 2025

https://github.com/cdnjs/cf-stats

📈 Monthly usage statistics from Cloudflare for the cdnjs.cloudflare.com domain - The #1 free and open source CDN built to make life easier for developers.

cdnjs cloudflare data data-analysis statistics stats usage usage-data usage-reports

Last synced: 06 Jul 2025

https://github.com/opendatasoft/semantic-bot

A Semi-Automatic Tool to generate RDF mappings for Opendatasoft's datasets

data dbpedia linked lov ontology rdf rdf-mapping rdfs rml semantic yago yarrrml

Last synced: 01 May 2025

https://github.com/quickbirdeng/datakit

A Swift library to easily read and write binary formatted data using a modern, declarative interface.

binary-data ble bluetooth bluetooth-le bluetooth-low-energy data declarative declarative-programming decoding dsl encoding network resultbuilder swift swift5

Last synced: 23 Jun 2025

https://github.com/cgsecurity/testdisk_documentation

Documentation for TestDisk & PhotoRec

data photorec recovery testdisk

Last synced: 03 Sep 2025

https://github.com/jqnpm/jqnpm

A package manager built for the command-line JSON processor jq.

command-line-tool data data-processing jq json package-manager

Last synced: 21 Jul 2025

https://github.com/optixal/cryptoinscriber

📈 A live cryptocurrency historical trade data blotter. Download live historical trade data from any cryptoexchange, be it for machine learning, backtesting/visualizing trading strategies or for Quantopian/Zipline.

backtest bot cryptocurrency data downloader exchange feeds historical historical-data learning live machine poll strategy trade transactions

Last synced: 21 Mar 2025

https://github.com/Optixal/CryptoInscriber

📈 A live cryptocurrency historical trade data blotter. Download live historical trade data from any cryptoexchange, be it for machine learning, backtesting/visualizing trading strategies or for Quantopian/Zipline.

backtest bot cryptocurrency data downloader exchange feeds historical historical-data learning live machine poll strategy trade transactions

Last synced: 22 Mar 2025

https://github.com/asem000/pytreeclass

Visualize, create, and operate on pytrees in the most intuitive way possible.

data dataclasses deep-learning jax machine-learning pipelines pytorch pytree tensorflow

Last synced: 07 Apr 2025

https://github.com/timokoerber/laravel-json-seeder

Create and use JSON files to seed your database in your Laravel applications

data database json laravel seed seeder seeder-table seeders

Last synced: 19 Oct 2025

https://github.com/doktormike/dammmdatagen

Marketing Mix Modeling Data Generator

benchmark data data-generator marketing-mix-modeling

Last synced: 29 Jul 2025

https://github.com/arthurhd/immosheets

Tired of searching with your mouse? Let's automate the process. 🚀

automation data google-sheets google-sheets-api immosheets leboncoin open-source orpi pypi python real-estate seloger sheets-api

Last synced: 06 Apr 2026

https://github.com/kuda-io/kuda

Kubernetes-native data delivery platform

cloud-native data golang hdfs kubernetes storage

Last synced: 14 Jan 2026

https://github.com/data-solution-automation-engine/virtual-data-warehouse

The Virtual Data Warehouse is a code generation and template management tool. It is part of the data solution automation ecosystem - the 'engine' for data solution automation.

codegeneration data datavault datavault20 datawarehouse datawarehouseautomation etl etl-automation virtual virtual-data-warehouse

Last synced: 14 Feb 2026

https://github.com/airbytehq/write-for-the-community

Contribute and collaborate on educational content for the Airbyte Community.

airbyte articles data open-source showcases tutorial videos

Last synced: 02 Mar 2026

https://github.com/benthosdev/benthos-captain

A Kubernetes Operator to orchestrate Benthos pipelines

benthos data data-engineering gitops go golang helm kubernetes kustomize pipelines stream-processing

Last synced: 22 Jan 2026

https://github.com/yzfly/mcp-excel-server

The Excel MCP Server is a powerful tool that enables natural language interaction with Excel files through the Model Context Protocol (MCP). It provides a comprehensive set of capabilities for reading, analyzing, visualizing, and writing Excel data.

claude claude-mcp data excel mcp mcp-excel-server

Last synced: 06 Jul 2025

https://github.com/DoktorMike/dammmdatagen

Marketing Mix Modeling Data Generator

benchmark data data-generator marketing-mix-modeling

Last synced: 06 May 2025