An open API service indexing awesome lists of open source software.

data

Individual facts, statistics, or items of information, often numeric. In a technical sense, data are a set of values of qualitative or quantitative variables about one or more persons or objects. (https://en.wikipedia.org/w/index.php?title=Data&oldid=1093674723, released under CC BY-SA 3.0)

https://github.com/rmax/databrewer

The missing datasets manager. Like Homebrew but for datasets. A CLI tool to search and discover datasets!

command-line data datasets discovery python

Last synced: 20 Mar 2025

https://github.com/maicius/universityrecruitment-ssurvey

Using serious data to answer the question: "What kinds of companies recruit at what kinds of universities?"

analysis beautifulsoup crawler data redis university

Last synced: 28 Apr 2025

https://github.com/ipeagit/flightsbr

R Package to Download Flight and Airport Data from Brazil

aviation-data brazil data r rstats rstats-package

Last synced: 02 May 2025

https://github.com/webankblockchain/data-export

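Data-Export supports exporting on-chain data to storage media such as MySQL and ES that are convenient for big data processing, solving the problems of complex querying, analysis, visualization, and processing of blockchain data.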

blockchain consortium data data-governance export webank-blockchain

Last synced: 09 Jul 2025

https://github.com/Maicius/UniversityRecruitment-sSurvey

Using serious data to answer the question: "What kinds of companies recruit at what kinds of universities?"

analysis beautifulsoup crawler data redis university

Last synced: 06 Mar 2025

https://github.com/prioritizr/wdpar

Interface to the World Database on Protected Areas

biodiversity conservation cran data database protected-areas r r-package rstats spatial

Last synced: 01 Jul 2025

https://github.com/tradewelltech/beavers

Python stream processing for analytics

analytics apache-arrow data kafka pandas python realtime stream-processing

Last synced: 14 Jan 2026

https://github.com/junyuan-chen/readstattables.jl

Read and write Stata, SAS and SPSS data files with Julia tables

data dataframe datasets julia sas spss stata statistics tables tabular-data

Last synced: 27 Jan 2026

https://github.com/yiuman/data-visulaization

:scream_cat: Data visualization: draggable data visualization views and data-fetching configuration

data data-visualization datavalidation draggable vcharts-echarts visual visualization vue

Last synced: 19 Mar 2025

https://github.com/ashleydavis/sql-to-mongodb

A Node.js script to convert an SQL table to a MongoDB database.

convert-sql-to-mongodb data database javascript mongodb mongodb-database nodejs nosql sql sql-table

Last synced: 24 Oct 2025

https://github.com/Link-/uber_data

Uber web interface crawler / scraper - Convert the trips table into a CSV file

analysis data jupyter uber-crawler uber-data

Last synced: 04 May 2025

https://github.com/sghall/chord-transitions

Transitioning Chord Diagram Demo with Angular/D3

angularjs d3js data visualisation

Last synced: 22 Mar 2025

https://github.com/curran/d3-in-motion

Code examples and references for the course "D3.js in Motion"

chart d3js data dataviz html5 programming teaching visualization web

Last synced: 07 Jul 2025

https://github.com/juliaclimate/climatebase.jl

Tools to analyze and manipulate climate (spatiotemporal) data. Also used by ClimateTools and ClimatePlots

analysis climate data hacktoberfest julia spatiotemporal

Last synced: 11 Apr 2025

https://github.com/m-ld/m-ld-js

m-ld JavaScript engine

crdt data graph rdf

Last synced: 01 May 2025

https://github.com/JuliaClimate/ClimateBase.jl

Tools to analyze and manipulate climate (spatiotemporal) data. Also used by ClimateTools and ClimatePlots

analysis climate data hacktoberfest julia spatiotemporal

Last synced: 20 Jul 2025

https://github.com/iodepo/odis-arch

Development of the Ocean Data and Information System (ODIS) architecture

catalogue data interoperability knowledge-graph metadata ocean ogc-services rdf sharing

Last synced: 20 Jul 2025

https://github.com/TomFevrier/kiwis

A Pandas-inspired data wrangling toolkit in JavaScript

data data-manipulation data-wrangling pandas

Last synced: 15 Mar 2025

https://github.com/fsolt/swiid

Standardized World Income Inequality Database

data dataset

Last synced: 29 Oct 2025

https://github.com/synthead/timex_datalink_client

Write data to Timex Datalink devices with an optical sensor

150 150s beepwear data data-link datalink dsi e-brain fl90 fl95 ironman link pro royal ruby sync timex triathlon upload watch

Last synced: 07 Apr 2025

https://github.com/rsquaredacademy/xplorerr

Shiny apps for interactive data analysis, visualization and modeling.

data exploration r rstats shiny-apps statistics visualization

Last synced: 02 Jul 2025

https://github.com/dodoex/dodoex_v2_subgraph

Subgraphs to index data for DODOEX V2

assembly blockchain-technology data subgraph typescript

Last synced: 11 Apr 2025

https://github.com/47degrees/org

Easily create a webpage with your organization's open source projects

clojure clojurescript data github graphql react rum

Last synced: 11 Apr 2025

https://github.com/epiforecasts/covidregionaldata

An interface to subnational and national level COVID-19 data. For all countries supported, this includes a daily time-series of cases. Wherever available we also provide data on deaths, hospitalisations, and tests. National level data is also supported using a range of data sources as well as linelist data and links to intervention data sets.

covid-19 data open-science r6 regional-data rstats

Last synced: 14 Jun 2025

https://github.com/fahad19/tydel

Typed Models & Collections for JavaScript data structure

data immutable javascript models structure

Last synced: 19 Apr 2025

https://github.com/rafzamb/sknifedatar

sknifedatar is a package that serves primarily as an extension to the modeltime 📦 ecosystem, in addition to providing some functionality for spatial data and visualization.

data data-analysis data-science data-visualization forecasting r statistics time-series

Last synced: 22 Oct 2025

https://github.com/mathiasbynens/unicode-tr51

Emoji data extracted from Unicode Technical Report #51.

data emoji unicode

Last synced: 20 Jun 2025

https://github.com/daoodaba975/galsenify

A comprehensive library for Senegalese data that offers a lot of information about the country of Teranga 💫

data made-in-senegal npm-package

Last synced: 26 Jun 2025

https://github.com/sircryptic/autoexif

Want to remove sensitive data from photos, or even view it? autoexif helps you do that easily: no more remembering syntax with this user-friendly tool.

data data-analysis exif-data exif-data-extraction exif-interface exif-metadata exif-reader exif-remover exiftool image meta metadata osint osint-tool viewer

Last synced: 14 Apr 2025

https://github.com/ropenspain/spanishoddata

Access national high-quality and open-access datasets on movement patterns derived from mobile telephone datasets.

cdr data data-package mobile-telephone-data mobility origin-destination rstats

Last synced: 28 Apr 2025

https://github.com/itext/itext-pdfocr-java

pdfOCR is an iText add-on to recognize and extract text in scanned documents and images. It can also convert them into fully ISO-compliant PDF or PDF/A-3u files that are accessible, searchable, and suitable for archiving

archival character data diacritic extractable glyphs hindi image iso-compliant ligatures mandarin ocr optical pdf portuguese recognition scan searchable spanish tesseract

Last synced: 09 Jan 2026

https://github.com/smarie/pytest-patterns

A couple of examples showing how pytest and its plugins can be combined to solve real-world needs.

benchmark case concerns data decorator design file fixture incremental modular parameter parametrize pattern pytest result separate share state step test

Last synced: 20 Mar 2025

https://github.com/artigraph/artigraph

Batteries included toolkit for data engineering.

data python

Last synced: 14 Jan 2026

https://github.com/tk3369/saslib.jl

Julia library for reading SAS7BDAT data sets

data julia reader sas sas7bdat

Last synced: 06 Jul 2025

https://github.com/ssbuild/aigc_data

Share data, prompt data, pretraining data

aigc-data data instruct llm open open-data pretraining prompt

Last synced: 24 Apr 2025

https://github.com/theronione/cleaner.jl

A toolbox of simple solutions for common data cleaning problems.

data data-cleaning julia

Last synced: 24 Oct 2025

https://github.com/sungchun12/airflow-dbt-cloud

Examples of dbt Cloud pipelines in Airflow

airflow data dbt dbt-cloud schedule scheduler workflow-engine

Last synced: 04 Sep 2025

https://github.com/nagix/ukraine-livecams

Ukraine live camera 3D map

data mapping open-data ukraine-invasion

Last synced: 02 Mar 2025

https://github.com/rbren/vizzy

Data Visualization with LLMs

chatgpt data data-visualization llm

Last synced: 07 May 2025

https://github.com/kristijorgji/goseeder

Go database seeder inspired by the Laravel/Lumen seeder and more

data database go seeder seeders table test-seeds testing

Last synced: 14 May 2025

https://github.com/SirCryptic/autoexif

Want to remove sensitive data from photos, or even view it? autoexif helps you do that easily: no more remembering syntax with this user-friendly tool.

data data-analysis exif-data exif-data-extraction exif-interface exif-metadata exif-reader exif-remover exiftool image meta metadata osint osint-tool viewer

Last synced: 04 Mar 2025

https://github.com/jeffcore/covid-19-usa-by-state

CSV files of COVID-19 total daily confirmed cases and deaths in the USA by state and county. All data from Johns Hopkins & NYT.

confirmed-cases coronavirus coronavirus-tracking county covid-19 covid19 csv csv-files daily-files data deaths johns-hopkins nyt state usa

Last synced: 16 Jan 2026

https://github.com/stefan-schroedl/tabulator

A set of Unix shell command line tools for quick and convenient batch processing of tabular text files (a.k.a., tab-delimited, tsv, csv, or flat data file format) with a header line. Provides column reference by name, automatic delimiter and compression detection for per-line transformations, sql-like group-by operation and relational join.

comma-separated-values command-line csv csv-files data delimited-files join tab-separated tsv unix

Last synced: 13 Apr 2025

https://github.com/ptiger10/pd

A fast, tested, and predictable way to clean, aggregate, and transform data

analytics data go spreadsheet

Last synced: 12 Jan 2026

https://github.com/ivailop7/healthkit-influxdb-grafana

Publish your Apple HealthKit data via Python Flask HTTP endpoint to InfluxDB to plot in Grafana

analytics apple autoexport chart data flask grafana health healthkit http influxdb linux local mac plot python selfquant visualization windows workouts

Last synced: 30 Apr 2025

https://github.com/albar965/atools

atools is a static library extending Qt for exception handling, a log4j-like logging framework, Flight Simulator related utilities like a BGL reader, and more.

compiler data flight fsx map prepar3d simulator x-plane

Last synced: 09 Apr 2025

https://github.com/bothub-it/bothub

Bothub is an open platform for predicting, training and sharing NLP datasets in multiple languages

bothub bots chatbot data database docker ilhasoft issue-tracker multiple-languages nlp nlp-datasets push python sharing-nlp-datasets webapp

Last synced: 07 May 2025

https://github.com/juliadata/dataapi.jl

A data-focused namespace for packages to share functions

data julia julialang

Last synced: 11 Sep 2025

https://github.com/fredibach/blowson

Blow up JSON-like sample data in an awesomely realistic way!

data database extender graphql json

Last synced: 23 Apr 2025

https://github.com/ocamlpro/directories

directories is an OCaml library that provides configuration, cache and data paths (and more!) following the suitable conventions on Linux, macOS and Windows. The following conventions are used: XDG Base Directory Specification and xdg-user-dirs on Linux, Known Folders on Windows, Standard Directories on macOS.

basedir cache config conventions data directories knownfolders linux macos ocaml standard standarddirectories windows xdg

Last synced: 12 Jun 2025

https://github.com/tradewelltech/protarrow

Convert from protobuf to arrow and back

apache-arrow data protobuf python

Last synced: 16 Jan 2026

https://github.com/microsoft/reconner

ReconNER: debug annotated Named Entity Recognition (NER) data for inconsistencies and get insights on improving the quality of your data.

ai data ner-data nlp

Last synced: 31 Oct 2025

https://github.com/vijinho/epl_mysql_db

Free/open English Premier League results database from 1993-2017. Dump formats are MySQL and SQLite.

data dataset epl football-data mysql premierleague soccer

Last synced: 20 Mar 2025

https://github.com/travishorn/csval

Check CSV files against a set of validation rules.

cli csv data json-schema parser validation

Last synced: 09 Apr 2025

https://github.com/FrediBach/Blowson

Blow up JSON-like sample data in an awesomely realistic way!

data database extender graphql json

Last synced: 02 Aug 2025

https://github.com/vincentauriau/tennis-prediction

Predicts the winner of a tennis match with machine learning

atp data data-science machine-learning tennis

Last synced: 22 Apr 2025

https://github.com/holunda-io/camunda-bpm-data

Beautiful process data handling for Camunda 7 Platform.

api bpm camunda-7 data process

Last synced: 14 Jan 2026

https://github.com/flother/rio2016

Data on the 11,500+ athletes and 306 events at the Rio Olympics. Includes medal tallies.

athletes data medals olympic-games olympics rio-de-janeiro rio2016

Last synced: 29 Dec 2025

https://github.com/streamr-dev/hub

Streamr Hub frontend

data real-time streamr streams web3

Last synced: 03 Jul 2025

https://github.com/eidoslab/unitopatho

Dataset of 9536 H&E-stained patches for colorectal polyps classification and adenomas grading | ICIP21 https://doi.org/10.1109/ICIP42928.2021.9506198

cancer data health histopathological-image histopathology histopathology-images medical-image-processing medical-images neural-networks

Last synced: 12 Aug 2025

https://github.com/canclid/canto-filter

Cantonese corpus filter (Cantonese text filter)

cantonese cantonese-language corpus corpus-data data nlp

Last synced: 27 Oct 2025

https://github.com/ctjacobs/git-rdm

A research data management plugin for the Git version control system.

curation data datasets git open-data open-science publishing research-data-management version-control

Last synced: 21 Jan 2026

https://github.com/mwouts/world_trade_data

World Integrated Trade Solution (WITS) API in Python

data statistics trade worldbank

Last synced: 03 Apr 2025

https://github.com/guanguans/laravel-api-response

Normalize and standardize Laravel API response data structures.

api data json laravel normalize response rest restful standardize structure

Last synced: 26 Mar 2025

https://github.com/aiven/aiven-operator

Provision and manage Aiven Services from your Kubernetes cluster.

automation data databases kubernetes operator

Last synced: 09 Apr 2025

https://github.com/iamphytan/rosbag-tools

A ROS-agnostic toolbox for common rosbag operations

data data-management python python3 robotics ros1 ros2 rosbag

Last synced: 14 Apr 2025

https://github.com/rxavier/econuy

Wrangling Uruguayan economic data so you don't have to.

data economy python uruguay

Last synced: 17 Jan 2026

https://github.com/fluhus/gostuff

Convenience packages for data science in Go.

data data-science data-structures go golang

Last synced: 12 Jan 2026

https://github.com/hodur-org/hodur-datomic-schema

Hodur is a domain modeling approach and collection of libraries for Clojure. By using Hodur you can define your domain model as data, parse and validate it, and then either consume your model via an API or use one of the many plugins to help you achieve mechanical results faster and in a purely functional manner.

clojure data database datomic modeling schema

Last synced: 12 Dec 2025

https://github.com/webankblockchain/data-stash

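Data-Stash is a data warehouse component based on FISCO-BCOS. By parsing a node's binlog, it generates a full backup of that node's state, enabling the node to separate hot and cold data and to prune data.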

blockchain consortium data data-governance data-separation webank-blockchain

Last synced: 23 Jul 2025

https://github.com/ropensci/weatherOz

An API Client for Australian Weather and Climate Data Resources

api-client australia climate data r rainfall rstats weather weather-api weather-forecast

Last synced: 20 Jul 2025

https://github.com/ihrke/pypillometry

Pupillometry and eye tracking with Python

data data-analysis eye-tracking eyetracking pupillometry

Last synced: 10 Oct 2025

https://github.com/countly/countly-sdk-cpp

Countly C++ SDK for Windows, macOS and Linux

analytics data linux mac mobile

Last synced: 10 Jun 2025

https://github.com/z3z1ma/target-bigquery

target-bigquery is a Singer target for BigQuery. It supports storage write, GCS, streaming, and batch load methods. Built with the Meltano SDK.

bigquery data meltano pipelines singer

Last synced: 25 Oct 2025

https://github.com/z3z1ma/cdf

A framework to manage data, continuously

data framework pipelines transformation

Last synced: 17 Mar 2025

https://github.com/matrix-msu/kora

The easiest way to manage and publish your data. Open-source, database-driven, online digital repository application for complex multimedia objects (text, images, audio, video). kora stores, manages, and delivers digital objects with corresponding metadata that enhances the research and educational value of the objects.

archive collections data laravel management matrix metadata msu mysql php repository schema

Last synced: 11 Jan 2026

https://github.com/stefen-taime/iceberg-dbt-trino-hive-modern-open-source-data-stack

To provide a deeper understanding of how the modern, open-source data stack consisting of Iceberg, dbt, Trino, and Hive operates within a music streaming platform, let’s delve into the detailed workflow and benefits of each component.

data dbt hive iceberg modern trinodb

Last synced: 20 Oct 2025

https://github.com/tompollard/sammon

Sammon mapping in Python

data visualization

Last synced: 29 Oct 2025

https://github.com/data-fair/data-fair

Findable, Accessible, Interoperable and Reusable Data. A complete open-source solution for your open and private data needs. French only for the time being, internationalization coming soon.

api data datasets docker nocode nocodeapi nodejs open-data openapi3

Last synced: 10 Nov 2025