An open API service indexing awesome lists of open source software.

data

Individual facts, statistics, or items of information, often numeric. In a technical sense, data are a set of values of qualitative or quantitative variables about one or more persons or objects. (https://en.wikipedia.org/w/index.php?title=Data&oldid=1093674723, released under CC BY-SA 3.0)

https://github.com/ptiger10/pd

A fast, tested, and predictable way to clean, aggregate, and transform data

analytics data go spreadsheet

Last synced: 12 Jan 2026

https://github.com/jeffcore/covid-19-usa-by-state

CSV files of COVID-19 total daily confirmed cases and deaths in the USA by state and county. All data from Johns Hopkins & NYT.

confirmed-cases coronavirus coronavirus-tracking county covid-19 covid19 csv csv-files daily-files data deaths johns-hopkins nyt state usa

Last synced: 16 Jan 2026

https://github.com/rbren/vizzy

Data Visualization with LLMs

chatgpt data data-visualization llm

Last synced: 07 May 2025

https://github.com/ivailop7/healthkit-influxdb-grafana

Publish your Apple HealthKit data via Python Flask HTTP endpoint to InfluxDB to plot in Grafana

analytics apple autoexport chart data flask grafana health healthkit http influxdb linux local mac plot python selfquant visualization windows workouts

Last synced: 30 Apr 2025

https://github.com/sungchun12/airflow-dbt-cloud

Examples of dbt Cloud pipelines in Airflow

airflow data dbt dbt-cloud schedule scheduler workflow-engine

Last synced: 04 Sep 2025

https://github.com/kristijorgji/goseeder

Go database seeder inspired by Laravel/Lumen seeder and more

data database go seeder seeders table test-seeds testing

Last synced: 14 May 2025

https://github.com/FrediBach/Blowson

Blow up JSON like sample data in an awesomely realistic way!

data database extender graphql json

Last synced: 02 Aug 2025

https://github.com/travishorn/csval

Check CSV files against a set of validation rules.

cli csv data json-schema parser validation

Last synced: 09 Apr 2025

https://github.com/vijinho/epl_mysql_db

Free/open English Premier League results database from 1993-2017. Dump formats are MySQL and SQLite.

data dataset epl football-data mysql premierleague soccer

Last synced: 20 Mar 2025

https://github.com/tradewelltech/protarrow

Convert from protobuf to arrow and back

apache-arrow data protobuf python

Last synced: 16 Jan 2026

https://github.com/vincentauriau/tennis-prediction

Predicts the winner of a tennis match with machine learning

atp data data-science machine-learning tennis

Last synced: 22 Apr 2025

https://github.com/microsoft/reconner

ReconNER: debug annotated Named Entity Recognition (NER) data for inconsistencies and get insights on improving the quality of your data.

ai data ner-data nlp

Last synced: 31 Oct 2025

https://github.com/ocamlpro/directories

directories is an OCaml library that provides configuration, cache and data paths (and more!) following the suitable conventions on Linux, macOS and Windows. The following conventions are used: XDG Base Directory Specification and xdg-user-dirs on Linux, Known Folders on Windows, Standard Directories on macOS.

basedir cache config conventions data directories knownfolders linux macos ocaml standard standarddirectories windows xdg

Last synced: 12 Jun 2025

https://github.com/juliadata/dataapi.jl

A data-focused namespace for packages to share functions

data julia julialang

Last synced: 11 Sep 2025

https://github.com/fredibach/blowson

Blow up JSON like sample data in an awesomely realistic way!

data database extender graphql json

Last synced: 23 Apr 2025

https://github.com/julianfaraway/faraway

R package, scripts and documentation supporting R books by Julian Faraway

data r

Last synced: 21 Feb 2026

https://github.com/holunda-io/camunda-bpm-data

Beautiful process data handling for Camunda 7 Platform.

api bpm camunda-7 data process

Last synced: 14 Jan 2026

https://github.com/streamr-dev/hub

Streamr Hub frontend

data real-time streamr streams web3

Last synced: 03 Jul 2025

https://github.com/guanguans/laravel-api-response

Normalize and standardize Laravel API response data structures.

api data json laravel normalize response rest restful standardize structure

Last synced: 26 Mar 2025

https://github.com/iamphytan/rosbag-tools

A ROS-agnostic toolbox for common rosbag operations

data data-management python python3 robotics ros1 ros2 rosbag

Last synced: 14 Apr 2025

https://github.com/juliaferraioli/opensource-timeline

This repository aims to collect events in open source history.

data history opendata opensource

Last synced: 10 Feb 2026

https://github.com/eidoslab/unitopatho

Dataset of 9536 H&E-stained patches for colorectal polyps classification and adenomas grading | ICIP21 https://doi.org/10.1109/ICIP42928.2021.9506198

cancer data health histopathological-image histopathology histopathology-images medical-image-processing medical-images neural-networks

Last synced: 12 Aug 2025

https://github.com/canclid/canto-filter

Cantonese text filter for corpus data

cantonese cantonese-language corpus corpus-data data nlp

Last synced: 27 Oct 2025

https://github.com/aiven/aiven-operator

Provision and manage Aiven Services from your Kubernetes cluster.

automation data databases kubernetes operator

Last synced: 09 Apr 2026

https://github.com/flother/rio2016

Data on the 11,500+ athletes and 306 events at the Rio Olympics. Includes medal tallies.

athletes data medals olympic-games olympics rio-de-janeiro rio2016

Last synced: 16 Mar 2026

https://github.com/mwouts/world_trade_data

World Integrated Trade Solution (WITS) API in Python

data statistics trade worldbank

Last synced: 03 Apr 2025

https://github.com/rxavier/econuy

Wrangling Uruguayan economic data so you don't have to.

data economy python uruguay

Last synced: 17 Jan 2026

https://github.com/ctjacobs/git-rdm

A research data management plugin for the Git version control system.

curation data datasets git open-data open-science publishing research-data-management version-control

Last synced: 21 Jan 2026

https://github.com/stefen-taime/iceberg-dbt-trino-hive-modern-open-source-data-stack

A detailed walkthrough of the workflow and benefits of each component of a modern, open-source data stack (Iceberg, dbt, Trino, and Hive) within a music streaming platform.

data dbt hive iceberg modern trinodb

Last synced: 20 Oct 2025

https://github.com/tompollard/sammon

Sammon mapping in Python

data visualization

Last synced: 29 Oct 2025

https://github.com/z3z1ma/target-bigquery

target-bigquery is a Singer target for BigQuery. It supports storage write, GCS, streaming, and batch load methods. Built with the Meltano SDK.

bigquery data meltano pipelines singer

Last synced: 25 Oct 2025

https://github.com/webankblockchain/data-stash

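Data-Stash is a data warehouse component based on FISCO-BCOS. By parsing a node's binlog, it generates a full backup of that node's state, allowing the node to separate hot and cold data and to prune data.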

blockchain consortium data data-governance data-separation webank-blockchain

Last synced: 23 Jul 2025

https://github.com/fluhus/gostuff

Convenience packages for data science in Go.

data data-science data-structures go golang

Last synced: 12 Jan 2026

https://github.com/ihrke/pypillometry

Pupillometry and eyetracking with python

data data-analysis eye-tracking eyetracking pupillometry

Last synced: 10 Oct 2025

https://github.com/hodur-org/hodur-datomic-schema

Hodur is a domain modeling approach and collection of libraries for Clojure. By using Hodur you can define your domain model as data, parse and validate it, and then either consume your model via an API or use one of the many plugins to help you achieve mechanical results faster and in a purely functional manner.

clojure data database datomic modeling schema

Last synced: 12 Dec 2025

https://github.com/countly/countly-sdk-cpp

Countly C++ SDK for Windows, macOS and Linux

analytics data linux mac mobile

Last synced: 10 Jun 2025

https://github.com/ropensci/weatherOz

An API Client for Australian Weather and Climate Data Resources

api-client australia climate data r rainfall rstats weather weather-api weather-forecast

Last synced: 20 Jul 2025

https://github.com/z3z1ma/cdf

A framework to manage data, continuously

data framework pipelines transformation

Last synced: 17 Mar 2025

https://github.com/matrix-msu/kora

The easiest way to manage and publish your data. Open-source, database-driven, online digital repository application for complex multimedia objects (text, images, audio, video). kora stores, manages, and delivers digital objects with corresponding metadata that enhances the research and educational value of the objects.

archive collections data laravel management matrix metadata msu mysql php repository schema

Last synced: 11 Jan 2026

https://github.com/inphyt/covid19-italy-integrated-surveillance-data

COVID-19 integrated surveillance data provided by the Italian Institute of Health and processed via UnrollingAverages.jl to deconvolve the weekly moving averages.

covid-19 covid19-data data data-analysis data-structures data-visualization data-wrangling database dataset epidemiological-data epidemiology italy italy-data italy-dataset open-data surveillance surveillance-data time-series time-series-analysis

Last synced: 26 Jul 2025

https://github.com/webankblockchain/data-reconcile

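Data-Reconcile is a blockchain-based reconciliation component. It provides a general-purpose data reconciliation solution built on blockchain smart-contract ledgers, along with a dynamically extensible reconciliation framework that supports custom development.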

blockchain consortium data data-governance reconcile webank-blockchain

Last synced: 09 Jul 2025

https://github.com/pkmn/smogon

Wrapper around Smogon's analyses and usage statistics

data git-scraping pokemon smogon

Last synced: 09 Apr 2025

https://github.com/ropenspain/infoelectoral

infoelectoral is an R library that helps retrieve and analyze official electoral results for Spain from the Ministry of the Interior. It allows you to download the results of general, European and municipal elections of any year at the polling station and municipality level.

data elecciones elections electoral infoelectoral r spain

Last synced: 14 Apr 2025

https://github.com/wakataw/pyproc

SPSE (Sistem Pengadaan Secara Elektronik) Python API Wrapper

data e-procurement lkpp lpse pengadaan python sedot spse

Last synced: 17 Jan 2026

https://github.com/EIDOSLAB/UNITOPATHO

Dataset of 9536 H&E-stained patches for colorectal polyps classification and adenomas grading | ICIP21 https://doi.org/10.1109/ICIP42928.2021.9506198

cancer data health histopathological-image histopathology histopathology-images medical-image-processing medical-images neural-networks

Last synced: 06 May 2025

https://github.com/brightway-lca/brightway2-io

Importing and exporting for the Brightway LCA framework

bw2 data life-cycle-assessment python

Last synced: 04 Apr 2025

https://github.com/pinecone-io/pinecone-datasets

An open-source dataset library for pre-embedded dataset: create your own data catalog, or use Pinecone's public datasets.

data database embeddings vector

Last synced: 29 Apr 2025

https://github.com/iboxdb/db4o-gpl

New Db4o GPL source code for Java7+ & .netstandard2.0 Android Xamarin..., the best database project to help you learn how to make databases

data database db4o embaddable java netstandard oodb

Last synced: 14 Jan 2026

https://github.com/audeering/audb

Manage audio and video datasets

annotation audio data mlops

Last synced: 10 Jun 2025

https://github.com/pawel-0/xdg-unused-data

A simple way to identify unused application data in user directories such as ~/.config and ~/.cache.

bash data linux unused xdg xdg-basedir

Last synced: 04 Sep 2025

https://github.com/pennlabs/penn-sdk-python

A Python module for the various services of Penn OpenData. Validated API token required.

data opendata python university-of-pennsylvania

Last synced: 31 Jul 2025

https://github.com/reymond-group/lore

WebGL engine for (big) data visualization.

3d-engine data data-science interactive visualization webgl

Last synced: 06 Mar 2026

https://github.com/htrgouvea/harpoon

[W.I.P] An ecosystem of crawlers for detecting leaks, sensitive data exposure, and attempted data exfiltration

bing data detect exfiltrate leak notify pastebin perl sensitive-data uranus

Last synced: 01 Mar 2026

https://github.com/tniedbala/secdatatools

Simple Python utility that downloads and extracts SEC financial statement data sets.

accounting analysis csv data dataset finance financial-statements securities tsv utility

Last synced: 23 Jan 2026

https://github.com/svrnm/exceldatatables

Replace a worksheet within an Excel workbook (.xlsx) without changing any other properties of the file.

data datatable excel php xlsx

Last synced: 07 May 2025

https://github.com/rodabt/vduckdb

A blazing-fast DuckDB wrapper built with the V language, making it easier to leverage its power in your projects.

data duckdb vlang wrapper-library

Last synced: 09 Aug 2025

https://github.com/pkmn/randbats

Pokémon Showdown's Random Battle sets

data git-scraping pokemon pokemon-showdown

Last synced: 29 Jul 2025

https://github.com/juliagraphics/namedcolors.jl

More color names than you ever knew you wanted

color color-palette data

Last synced: 10 Sep 2025

https://github.com/aws-samples/data-for-saas-patterns

A collection of samples, best practices and reference architectures for implementing SaaS applications on AWS for databases and data services.

aws data databases saas

Last synced: 14 Apr 2025

https://github.com/climatewatch-vizzuality/climate-watch

Climate Watch: Data for Climate Action

climate data postgresql rails react

Last synced: 08 May 2025

https://github.com/mrpaulandrewltd/Microsoft-Data-Integration-Pipeline-Training

Training workshop content on Azure Data Factory and Azure Synapse Analytics Data Integration Pipelines

azure data data-factory integration pipelines procfwk synapse-analytics

Last synced: 31 Mar 2025

https://github.com/suchjs/such

A powerful fake data library, expandable, configurable, generate data exactly as you want.

data fake faker generation generator javascript json json-data mock mocking nodejs simulate simulation typescript

Last synced: 14 Apr 2025

https://github.com/datawithbaraa/sql-data-analytics-project

This repository contains a collection of SQL scripts demonstrating various analytical techniques, such as change-over-time, cumulative, performance, data segmentation, and part-to-whole analysis.

analytics business-analytics business-intelligence data data-analysis data-analyst data-analytics data-engineering data-science data-scientist database datascience query reporting sql sql-queries sql-query sql-server window-functions window-functions-in-sql

Last synced: 15 Apr 2025

https://github.com/ckan/ckanext-validation

CKAN extension for validating Data Packages using Table Schema.

ckan ckanext data validation

Last synced: 06 Apr 2025

https://github.com/oobianom/shinyStorePlus

An R package that provides in-browser storage of persistent, synchronized Shiny input data using IndexedDB. Transfer browser link parameters to Shiny input or output values.

cran data data-structures r r-package shiny

Last synced: 05 Oct 2025

https://github.com/randomfractals/observable-data-tools

Repository of web and code editor friendly Observable Data Tools 🛠️ and Notebooks 📚 in .js, .nb.json, .ojs, .omd, .html and .qmd document formats for Data Previews in a browser and in VSCode IDE with Observable JS extension, Quarto extension, and new Quarto publishing tools.

data data-notebooks data-tools diagrams editor jsnotebooks notebook quarto quartopub query sql summary tabular

Last synced: 01 Mar 2026

https://github.com/jnmclarty/validada

Another library for defensive data analysis.

checkset data data-analysis data-validation decorators pandas slice validation

Last synced: 24 Jan 2026

https://github.com/RealityBending/TemplateResults

A template for a data analysis folder that can be easily exported as a webpage or as Supplementary Materials

data open-science open-source pdf r reproducible rmarkdown scripts share statistics submit supplementary-material template webpage website word

Last synced: 30 Jul 2025

https://github.com/gher-uliege/physocean.jl

Utility functions for physical oceanography (properties of seawater, air-sea heat fluxes,...)

data density fluxes julia physical-oceanography sea-water

Last synced: 13 Oct 2025

https://github.com/pepijn-devries/CopernicusMarine

Subset and download marine data from EU Copernicus Marine Service Information. Import data on the oceans' physical and biogeochemical state from Copernicus into R without the need for external software.

data spatial

Last synced: 20 Jul 2025

https://github.com/nbremer/datasketches

A monthly collaboration project between Shirley & Nadieh

d3 d3js data data-art data-visualization

Last synced: 14 Aug 2025

https://github.com/arm-university/rpi-pico-projects-for-schools

Raspberry Pi Pico Projects for Schools: Explore cutting-edge topics in Computing, including Machine Learning and Internet of Things. Ages 16-18.

ai data datascience iot ml pico python raspberry-pi rpi

Last synced: 23 Apr 2025

https://github.com/garystafford/streaming-sales-generator

Streaming Synthetic Sales Data Generator: Streaming sales data generator for Apache Kafka, written in Python

analytics apache-flink apache-kafka data kafka kafka-streams kstreams python spark-structured-streaming streaming-data

Last synced: 03 Aug 2025

https://github.com/ssamadgh/ModelAssistant

Elegant library to manage the interactions between view and model in Swift

collectionview controller core coredata data datasource interactor manager model mvc mvp mvvm swift tableview view viewmodel viper

Last synced: 06 Aug 2025

https://github.com/cahyadsn/db_rajaongkir

Province, city/regency, and district code data for RajaOngkir

data kabupaten kecamatan kode kota provinsi rajaongkir sql

Last synced: 07 Apr 2026

https://github.com/ropensci/weatheroz

An API Client for Australian Weather and Climate Data Resources

api-client australia climate data r rainfall rstats weather weather-api weather-forecast

Last synced: 09 Apr 2026

https://github.com/favstats/uaconflict_equipmentloss

This repo scrapes Oryxspioenkop (daily) to document and visualize equipment losses in the Russia-Ukraine war. https://www.oryxspioenkop.com/2022/02/attack-on-europe-documenting-equipment.html

conflict data data-visualization ukraine-invasion ukrainewar war

Last synced: 13 Aug 2025

https://github.com/randomfractals/pro-data-tools

Pro Data Tools 🛠️ for VS Code IDE 🧙‍♂️: DuckDB Pro Tools, PRQL Code Lens, new Markdown SQL Pro Tools, upcoming Data Notebooks 📚 Pro Tools docs and demos, etc.

data duckdb markdown notebooks prql sql tools vscode

Last synced: 22 Mar 2025

https://github.com/khive-ai/pydapter

adapt data to and from every format

ai data database pydantic schema vector

Last synced: 24 Apr 2026

https://github.com/ahuang11/ahlive

animate your data to life

ahlive animate animation data gif matplotlib xarray

Last synced: 17 Mar 2025

https://github.com/xability/maidr-legacy

[DEPRECATED prototype] Multimodal Access and Interactive Data Representation

ai blind braille chart data description image impairments llm low-vision multimodality plot representation science sonification tactile visual visualization

Last synced: 12 Feb 2026

https://github.com/lolleko/mesh-data-synthesizer

Uses Unreal Engine & Cesium to generate large synthetic datasets from 3D meshes. Enables machine learning tasks like Visual Place Recognition. Read more in our paper: https://meshvpr.github.io

cesium data geospatial machine-learning mesh place-recognition synthesis synthesizer ue5 unreal-engine

Last synced: 28 Apr 2025

https://github.com/ssamadgh/modelassistant

Elegant library to manage the interactions between view and model in Swift

collectionview controller core coredata data datasource interactor manager model mvc mvp mvvm swift tableview view viewmodel viper

Last synced: 29 Apr 2025

https://github.com/feup-infolab/dendro

"Open-source Dropbox" with added description features. It is a data storage and description platform designed to help researchers and other users to describe their data files, built on Linked Open Data and ontologies. Users can use Dendro to publish data to CKAN, Zenodo, DSpace or EUDAT's B2Share and others.

data dendro dendro-platform infolab invenio linked-data rdm research

Last synced: 13 Jul 2025

https://github.com/airframesio/data

Centralization of source data for Airframes/Acars projects

acars airframes csv data database json sql vdl vdl2 xml

Last synced: 16 Jan 2026