An open API service indexing awesome lists of open source software.

data

Individual facts, statistics, or items of information, often numeric. In a technical sense, data are a set of values of qualitative or quantitative variables about one or more persons or objects. (https://en.wikipedia.org/w/index.php?title=Data&oldid=1093674723, released under CC BY-SA 3.0)

https://github.com/xefi/faker-php

Generate fake data on demand.

data fake faker php

Last synced: 04 Apr 2025

https://github.com/rspeele/rezoom

Implements a resumption monad for .NET targeting data access with automatic batching and caching.

batching caching data dot-net dotnet monad

Last synced: 01 Aug 2025

https://github.com/leeper/data-versioning

Collecting thoughts about data versioning

data data-citation data-versioning metadata unf version-control

Last synced: 15 Feb 2026

https://github.com/robjhyndman/fpp2

All data sets required for the examples and exercises in the book "Forecasting: principles and practice" (2nd ed, 2018) by Rob J Hyndman and George Athanasopoulos <http://OTexts.org/fpp2/>. All packages required to run the examples are also loaded.

cran data forecasting r

Last synced: 04 Mar 2026

https://github.com/International-Data-Spaces-Association/idsa

This is the main repository of the International Data Spaces Association on GitHub, where you can find a general overview and useful information on the IDS landscape.

cybersecurity data data-sharing data-spaces dataeconomy dataexchange datasharing datasovereignty dataspace

Last synced: 04 Apr 2025

https://github.com/hyriver/hyriver.github.io

A Python software stack for retrieving hydroclimate data from web services.

climate data hydrology python webservice

Last synced: 04 Oct 2025

https://github.com/datainsider-co/rocket-bi

A free, open-source, web-based self-service BI tool tailor-made for ClickHouse, Google BigQuery, MySQL, PostgreSQL, and Vertica

analytics bigdata bigquery bussiness-intelligence clickhouse dashboard data etl hacktoberfest hacktoberfest2023 ingestion mysql postgresql vertica

Last synced: 05 Apr 2025

https://github.com/robjhyndman/fpp2-package

All data sets required for the examples and exercises in the book "Forecasting: principles and practice" (2nd ed, 2018) by Rob J Hyndman and George Athanasopoulos <http://OTexts.org/fpp2/>. All packages required to run the examples are also loaded.

cran data forecasting r

Last synced: 13 Jul 2025

https://github.com/pancake-llc/foundation

Game Mobile Foundation (Android + iOS) Using Unity3D

android base binary code-base core data engine foundation framework ios mobile package ui unity unity3d

Last synced: 05 Apr 2025

https://github.com/scienceverse/faux

R functions for simulating factorial datasets

data simulation

Last synced: 21 Feb 2026

https://github.com/juliacomputing/tableview.jl

A Tables.jl compatible table viewer based on ag-grid

data julia table web

Last synced: 02 Sep 2025

https://github.com/pfython/cleverdict

A JSON-friendly data structure which allows both object attributes and dictionary keys and values to be used simultaneously and interchangeably.

alias attributes auto-save data dictionary keyword object orm

Last synced: 10 Apr 2025

https://github.com/domosekai/tripreader-data

Public data for the “读卡识途” project

card china data metro nfc t-union

Last synced: 06 Oct 2025

https://github.com/centerforopenscience/share

SHARE is building a free, open data set about research and scholarly activities across their life cycle.

data elasticsearch harvest-data metadata openscience python scholarly-communication science

Last synced: 05 Apr 2025

https://github.com/debruine/faux

R functions for simulating factorial datasets

data simulation

Last synced: 28 Aug 2025

https://github.com/bredele/datastore

:hamster: Bloat free and flexible interface for data store and database access.

async asynchronous data database database-access datastore model store

Last synced: 13 Apr 2025

https://github.com/JGCRI/CEDS

Community Emissions Data System (CEDS)

data emissions

Last synced: 07 May 2025

https://github.com/itachi-uchiha581/auto-data

Auto Data is a library designed for quick and effortless creation of datasets tailored for fine-tuning Large Language Models (LLMs).

ai data finetuning-large-language-models finetuning-llms generative-ai llm llm-training python python3

Last synced: 20 Sep 2025

https://github.com/ERDDAP/erddap

ERDDAP is a scientific data server that gives users a simple, consistent way to download subsets of gridded and tabular scientific datasets in common file formats and make graphs and maps. ERDDAP is a Free and Open Source (Apache and Apache-like) Java Servlet from NOAA NMFS SWFSC Environmental Research Division (ERD).

data environmental erddap noaa scientific server

Last synced: 08 May 2025

https://github.com/purarue/google_takeout_parser

A library/CLI tool to parse data out of your Google Takeout (History, Activity, YouTube, Locations, etc.)

backup data export google google-location-history google-takeout

Last synced: 14 Jun 2025

https://github.com/zbrookle/dataframe_sql

A Python package that parses SQL and interprets it as methods that act upon existing pandas (or other types of) DataFrames that have been declared and registered

data dataframes pandas python sql

Last synced: 20 Aug 2025

https://github.com/joelgmsec/fakedatagen

Full Valid Fake Data Generator

data fake full generator valid

Last synced: 24 Apr 2025

https://github.com/1n3/powerexfil

A collection of data exfiltration scripts for Red Team assessments.

data exfil exfiltration hacking powershell redteam redteaming script scripts tool tools

Last synced: 08 Aug 2025

https://github.com/ralyodio/humanparser

Parse a human name string into salutation, first name, middle name, last name, suffix.

data es6 javascript parsing scraping

Last synced: 02 Apr 2026

https://github.com/saschagobel/legislatoR

Interface to the Comparative Legislators Database

data dataset legislators parliament political-science politicians politics r wikipedia

Last synced: 13 Jul 2025

https://github.com/bukalapak/ktpextractor

This is a service that takes a KTP image as input and extracts the data in the KTP as output. It is part of an open source project by the Data Scientists of Bukalapak.

data datascience

Last synced: 01 Aug 2025

https://github.com/jason89521/daxus

Daxus is a server state management library for React that provides full control over data, leading to a better user experience.

cache data dedupe hook react revalidate server-state-management user-experience

Last synced: 23 Jun 2025

https://github.com/geostatsguy/geodatasets

Synthetic datasets for geoscience (geo)statistical modeling

data database spatial-data

Last synced: 26 Oct 2025

https://github.com/gabrieldim/advanced-programming

Generic programming, generic classes, maps, sets, abstract data types and so on.

abstarct class data data-type data-types generic generic-programming generics interface interfaces map set

Last synced: 10 Jul 2025

https://github.com/vr-25/migrator

A backup solution and data migration utility for Android

android appdata backup data factoryreset magisk migation migrate titaniumbackup

Last synced: 08 Jul 2025

https://github.com/neurosnap/cofx

A Node.js and JavaScript library that helps developers describe side-effects as data in a declarative, flexible API.

asynchronous cofx data javascript node promise side-effects yield

Last synced: 14 Apr 2025

https://github.com/cncf/surveys

📝📊 CNCF Survey Data

cncf data surveys

Last synced: 23 Feb 2026

https://github.com/anthonybudd/S4

S4 is 100% S3 compatible storage, accessed through Tor and distributed using IPFS.

data docker ipfs object-storage s3 s4 storage

Last synced: 07 Apr 2025

https://github.com/anthonybudd/s4

S4 is 100% S3 compatible storage, accessed through Tor and distributed using IPFS.

data docker ipfs object-storage s3 s4 storage

Last synced: 12 Apr 2025

https://github.com/mattphillips/jest-each

A parameterised testing library for Jest. https://www.npmjs.com/package/jest-each 🏃

data each jest parameterised test

Last synced: 13 Apr 2025

https://github.com/jobehi/isthistechdead

The place where your favourite framework will be resting

data metrics tech

Last synced: 19 Jun 2025

https://github.com/synthesized-io/fairlens

Identify bias and measure fairness of your data

bias data data-analysis data-science fairness ml pandas python statistics

Last synced: 24 Jun 2025

https://github.com/adzz/data_schema

Declarative schemas for data transformations.

data data-parsing elixir functional-programming types validation

Last synced: 20 Jul 2025

https://github.com/yuxqiu/modern-poetry

The most comprehensive database of modern Chinese poetry and foreign poetry

data json poems poetry translation

Last synced: 16 Jan 2026

https://github.com/nucs/cryptocurrency-ticks-data

590 days of trade ticks on BTC/ETH/LTC/NEO to USDT

crypto data stock-data ticks

Last synced: 12 Feb 2026

https://github.com/googlecloudplatform/dlp-dataflow-deidentification

Multi Cloud Data Tokenization Solution By Using Dataflow and Cloud DLP

beam bigquery data dataflow dlp pii tokenization

Last synced: 11 Apr 2025

https://github.com/contextdata/vectoretl

Build super simple end-to-end data & ETL pipelines for your vector databases and Generative AI applications

cohere data datapipeline etl etl-framework etl-pipeline openai pinecone python qdrant qdrant-vector-database unstructured vector-database weaviate

Last synced: 09 Apr 2025

https://github.com/saschagrunert/rain

Visualize vertical data inside your terminal 💦

data log logger rain rust terminal

Last synced: 09 Apr 2025

https://github.com/tmcw/simpleopendata

simple guidelines for publishing open data in useful formats

copleft copyright data formats government licensing open

Last synced: 03 Mar 2026

https://github.com/empower-ai/sql-agent

AI agent that helps you do data analytics with natural language.

analytics bigquery chatgpt chatgpt-bot data data-analytics data-science mysql postgresql slack slack-bot slackbot

Last synced: 11 Apr 2025

https://github.com/phyphox/phyphox-arduino

The phyphox BLE library to connect Arduino projects with the phyphox app to display data on the phone or use the phone's sensors on the Arduino

arduino ble bluetooth bluetooth-low-energy data phyphox sensors

Last synced: 16 Jan 2026

https://github.com/Azure/azure-data-labs-modules

A list of Terraform modules to build your Azure Data IaC templates.

analytics azure data github github-actions labs terraform terraform-modules

Last synced: 06 May 2025

https://github.com/leinstay/steamdb

JSON file of all games available on Steam with prices and additional data from Steam Spy, GameFAQs, Metacritic, IGDB and HLTB.

data gamefaqs games history hltb igdb json steam steamspy

Last synced: 22 Apr 2025

https://github.com/josephrp/datatonic

🌟 DataTonic: A data-capable, AGI-style agent builder of agents that creates swarms, runs commands, and securely processes and creates datasets, databases, visualisations, and analyses.

agent-builder agi autogen azure chroma data data-science data-visualization database memgpt semantic-kernel semantic-memory taskweaver

Last synced: 11 Oct 2025

https://github.com/aws-solutions/automated-data-analytics-on-aws

The Automated Data Analytics on AWS solution provides an end-to-end data platform for ingesting, transforming, managing and querying datasets. This helps analysts and business users manage and gain insights from data without deep technical experience using Amazon Web Services (AWS).

analytics automated aws data

Last synced: 17 Apr 2025

https://github.com/JujuAdams/SNAP

Data format converters for GameMaker LTS 2022

array data gamemaker gamemaker-studio-2 gms2 ini json messagepack struct xml

Last synced: 01 Apr 2025

https://github.com/jujuadams/snap

Data format converters for GameMaker LTS 2022

array data gamemaker gamemaker-studio-2 gms2 ini json messagepack struct xml

Last synced: 06 Apr 2025

https://github.com/volorf/paster

Paste text data from the clipboard directly into Sketch text layers [Sketch plugin]

clipboard data plugin sketch sketch-plugin text

Last synced: 21 Mar 2025

https://github.com/joaocarmo/react-smart-data-table

A smart data table component for React meant to be configuration free

data data-table data-visualization plug-and-play react

Last synced: 13 Apr 2025

https://github.com/stanfordnlp/edu-convokit

Edu-ConvoKit: An Open-Source Framework for Education Conversation Data

data data-analysis data-science education language natural-language-processing

Last synced: 15 Apr 2025

https://github.com/malloydata/publisher

Publisher is the open-source semantic model server for the Malloy data language. It lets you define semantic models once — and use them everywhere.

analytics business-intelligence data data-modeling data-transformation data-visualization database semantic-modeling transformation

Last synced: 06 May 2026

https://github.com/richienb/ros-data-waster

The easiest way to waste your data.

data html waste

Last synced: 19 Jun 2025

https://github.com/wildflowai/platform

Model natural ecosystems 🌎🪸🐳

ai biodiversity conservation data ocean restoration

Last synced: 11 Apr 2026

https://github.com/azure/azure-data-labs-modules

A list of Terraform modules to build your Azure Data IaC templates.

analytics azure data github github-actions labs terraform terraform-modules

Last synced: 05 Jul 2025

https://github.com/slowkow/tftargets

:dart: Human transcription factor target genes from 6 databases in convenient R format.

bioinformatics data rstats transcription-factors

Last synced: 14 Apr 2025

https://github.com/ropensci/opentripplanner

An R package to set up and use OpenTripPlanner (OTP) as a local or remote multimodal trip planner.

data isochrones java opentripplanner otp public-transport r routing transport transportation-planning

Last synced: 08 Oct 2025

https://github.com/Baukebrenninkmeijer/table-evaluator

Evaluate real and synthetic datasets against each other

data data-evaluation evaluation generation synthetic synthetic-data table-evaluator

Last synced: 02 May 2025

https://github.com/Hebilicious/vue-query-nuxt

A lightweight, 0 config Nuxt Module for Vue Query.

data data-fetching fetch nuxt react-query tanstack tanstack-query vue vue-query

Last synced: 02 Aug 2025

https://github.com/JoelGMSec/FakeDataGen

Full Valid Fake Data Generator

data fake full generator valid

Last synced: 12 Jul 2025

https://github.com/drivy/checker_jobs

Regression testing for data

data regression-testing ruby sidekiq

Last synced: 06 Apr 2025

https://github.com/ContextData/VectorETL

Build super simple end-to-end data & ETL pipelines for your vector databases and Generative AI applications

cohere data datapipeline etl etl-framework etl-pipeline openai pinecone python qdrant qdrant-vector-database unstructured vector-database weaviate

Last synced: 22 Sep 2025

https://github.com/hebilicious/vue-query-nuxt

A lightweight, 0 config Nuxt Module for Vue Query.

data data-fetching fetch nuxt react-query tanstack tanstack-query vue vue-query

Last synced: 04 Apr 2025

https://github.com/open-discourse/open-discourse

Open Discourse is the first fully comprehensive corpus of the plenary proceedings of the federal German Parliament (Bundestag).

bundestag corpus data hacktoberfest

Last synced: 14 Mar 2025

https://github.com/fityannugroho/idn-area-map

The map of Indonesia's administrative areas 🇮🇩🌏

data hacktoberfest idn-area indonesia island map nextjs tailwindcss wilayah

Last synced: 07 Apr 2025

https://github.com/torkleyy/nitric

[ABANDONED] General-purpose data processing library. Mirror of https://gitlab.com/nitric/nitric

data ecs entity-component processing

Last synced: 20 Aug 2025

https://github.com/jbzoo/data

Extended implementation of ArrayObject: a useful collection for any config in your system (write, read, store, change, validate, convert to other formats, etc.).

arrayobject config converts data filters ini jbzoo php yml

Last synced: 05 Apr 2025

https://github.com/uwdata/flechette

Fast, lightweight access to Apache Arrow data.

arrow data interchange

Last synced: 04 Apr 2025

https://github.com/opensource-observer/oss-directory

A curated directory of open source software (OSS) projects and their associated artifacts

data github open-source public-goods research

Last synced: 08 Oct 2025

https://github.com/ngxs-labs/data

NGXS Persistence API

data entity ngxs ngxs-persistence-api

Last synced: 24 Apr 2025

https://github.com/finos/datahub

DataHub - Synthetic data library

data library pandas python sklearn synthetic

Last synced: 30 Sep 2025

https://github.com/zhangyoujia/hd_write_verify

LBA tools (hd_write_verify & hd_write_verify_dump) are very useful for testing storage stability and verifying data consistency; they are much better than the verification functions of FIO and vdbench. For example: physical disk: ide/sata/scsi/ssd/iscsi/fc/raid/...; virtual disk: loop/nbd/lvm/soft raid/...; VM disk: ide/sata/scsi/virtio-blk/virtio-scsi/...

consistency data filesystem migration physical-disk snapshot stability storage testing verifying virtual-disk vm-backup vm-disk

Last synced: 27 Feb 2026

https://github.com/spine-tools/Spine-Toolbox

Spine Toolbox is an open source Python package to manage data, scenarios and workflows for modelling and simulation. You can have your local workflow, but work as a team through version control and SQL databases.

anaconda data energy miniconda python simulation-model spine-toolbox workflow

Last synced: 07 May 2025

https://github.com/visgl/deck.gl-data

Data for the data visualization library deck.gl examples (https://uber.github.io/deck.gl/#/)

data data-science data-visualization uber

Last synced: 12 Jun 2025

https://github.com/ashvin27/react-datatable

React-datatable is a component that lets you create a multifunctional table from a single component, like jQuery DataTables. It is fully customizable and easy to integrate into any React component. Bootstrap compatible.

data datatables datatables-plugin react react-data-table react-datagrid react-datatable react-table table

Last synced: 13 May 2025

https://github.com/smappnyu/youtube-data-api

A Python client for collecting and parsing public data from the YouTube Data API

api api-wrapper data python python-client research research-tool youtube youtube-api-v3 youtube-search

Last synced: 28 Oct 2025

https://github.com/tirthajyoti/synthetic-data-gen

Various methods for generating synthetic data for data science and ML

classification data data-science machine-learning python regression symbolic-computation time-series

Last synced: 30 Apr 2025

https://github.com/mainakrepositor/datasets

A collection of some 200 datasets. You can call it mini-Kaggle :)

csv data data-science database datasets image-files mini-kaggle ml nlp-machine-learning tsv

Last synced: 01 Mar 2025

https://github.com/luanborelli/ipeadatapy

ipeadatapy is a data and metadata extraction package written in Python that uses the official API of the Ipeadata database. In essence, it is an API wrapper.

api api-wrapper brazil dados-abertos dados-historicos data data-analysis datasets econometrics economic-data economics geographic-data geography ipea ipeadata wrapper

Last synced: 07 Apr 2026