An open API service indexing awesome lists of open source software.

data

Individual facts, statistics, or items of information, often numeric. In a technical sense, data are a set of values of qualitative or quantitative variables about one or more persons or objects. (https://en.wikipedia.org/w/index.php?title=Data&oldid=1093674723, released under CC BY-SA 3.0)

https://github.com/countly/countly-sdk-cpp

Countly C++ SDK for Windows, macOS and Linux

analytics data linux mac mobile

Last synced: 10 Jun 2025

https://github.com/inphyt/covid19-italy-integrated-surveillance-data

COVID-19 integrated surveillance data provided by the Italian Institute of Health and processed via UnrollingAverages.jl to deconvolve the weekly moving averages.

covid-19 covid19-data data data-analysis data-structures data-visualization data-wrangling database dataset epidemiological-data epidemiology italy italy-data italy-dataset open-data surveillance surveillance-data time-series time-series-analysis

Last synced: 26 Jul 2025

https://github.com/ropenspain/infoelectoral

infoelectoral is an R library that helps retrieve and analyze official electoral results for Spain from the Ministry of the Interior. It allows you to download the results of general, European and municipal elections of any year at the polling station and municipality level.

data elecciones elections electoral infoelectoral r spain

Last synced: 14 Apr 2025

https://github.com/pkmn/smogon

Wrapper around Smogon's analyses and usage statistics

data git-scraping pokemon smogon

Last synced: 09 Apr 2025

https://github.com/wakataw/pyproc

SPSE (Sistem Pengadaan Secara Elektronik) Python API Wrapper

data e-procurement lkpp lpse pengadaan python sedot spse

Last synced: 17 Jan 2026

https://github.com/EIDOSLAB/UNITOPATHO

Dataset of 9536 H&E-stained patches for colorectal polyps classification and adenomas grading | ICIP21 https://doi.org/10.1109/ICIP42928.2021.9506198

cancer data health histopathological-image histopathology histopathology-images medical-image-processing medical-images neural-networks

Last synced: 06 May 2025

https://github.com/webankblockchain/data-reconcile

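Data-Reconcile is a blockchain-based reconciliation component. It provides a general-purpose data reconciliation solution built on blockchain smart-contract ledgers, along with a dynamically extensible reconciliation framework that supports custom development.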

blockchain consortium data data-governance reconcile webank-blockchain

Last synced: 09 Jul 2025

https://github.com/pinecone-io/pinecone-datasets

An open-source dataset library for pre-embedded datasets: create your own data catalog, or use Pinecone's public datasets.

data database embeddings vector

Last synced: 29 Apr 2025

https://github.com/brightway-lca/brightway2-io

Importing and exporting for the Brightway LCA framework

bw2 data life-cycle-assessment python

Last synced: 04 Apr 2025

https://github.com/climatewatch-vizzuality/climate-watch

Climate Watch: Data for Climate Action

climate data postgresql rails react

Last synced: 08 May 2025

https://github.com/svrnm/exceldatatables

Replace a worksheet within an Excel workbook (.xlsx) without changing any other properties of the file.

data datatable excel php xlsx

Last synced: 07 May 2025

https://github.com/pkmn/randbats

Pokémon Showdown's Random Battle sets

data git-scraping pokemon pokemon-showdown

Last synced: 29 Jul 2025

https://github.com/juliagraphics/namedcolors.jl

More color names than you ever knew you wanted

color color-palette data

Last synced: 10 Sep 2025

https://github.com/aws-samples/data-for-saas-patterns

A collection of samples, best practices and reference architectures for implementing SaaS applications on AWS for databases and data services.

aws data databases saas

Last synced: 14 Apr 2025

https://github.com/rodabt/vduckdb

A blazing-fast DuckDB wrapper built with the V language, making it easier to leverage its power in your projects.

data duckdb vlang wrapper-library

Last synced: 09 Aug 2025

https://github.com/htrgouvea/harpoon

[W.I.P] An ecosystem of crawlers for detecting leaks, sensitive data exposure and data exfiltration attempts

bing data detect exfiltrate leak notify pastebin perl sensitive-data uranus

Last synced: 10 Apr 2025

https://github.com/pennlabs/penn-sdk-python

A Python module for the various services of Penn OpenData. Validated API token required.

data opendata python university-of-pennsylvania

Last synced: 31 Jul 2025

https://github.com/audeering/audb

Manage audio and video datasets

annotation audio data mlops

Last synced: 10 Jun 2025

https://github.com/pawel-0/xdg-unused-data

A simple way to identify unused application data in user directories such as ~/.config and ~/.cache.

bash data linux unused xdg xdg-basedir

Last synced: 04 Sep 2025

https://github.com/tniedbala/secdatatools

Simple Python utility that downloads and extracts SEC financial statement data sets.

accounting analysis csv data dataset finance financial-statements securities tsv utility

Last synced: 23 Jan 2026

https://github.com/iboxdb/db4o-gpl

New Db4o GPL source code for Java 7+ and .NET Standard 2.0, Android, Xamarin..., the best database project to help you learn how to make databases

data database db4o embaddable java netstandard oodb

Last synced: 14 Jan 2026

https://github.com/jnmclarty/validada

Another library for defensive data analysis.

checkset data data-analysis data-validation decorators pandas slice validation

Last synced: 24 Jan 2026

https://github.com/gher-uliege/physocean.jl

Utility functions for physical oceanography (properties of seawater, air-sea heat fluxes,...)

data density fluxes julia physical-oceanography sea-water

Last synced: 13 Oct 2025

https://github.com/pepijn-devries/CopernicusMarine

Subset and download marine data from EU Copernicus Marine Service Information. Import data on the oceans' physical and biogeochemical state from Copernicus into R without the need for external software.

data spatial

Last synced: 20 Jul 2025

https://github.com/randomfractals/observable-data-tools

Repository of web and code editor friendly Observable Data Tools 🛠️ and Notebooks 📚 in .js, .nb.json, .ojs, .omd, .html and .qmd document formats for Data Previews in a browser and in VSCode IDE with Observable JS extension, Quarto extension, and new Quarto publishing tools.

data data-notebooks data-tools diagrams editor jsnotebooks notebook quarto quartopub query sql summary tabular

Last synced: 21 Mar 2025

https://github.com/suchjs/such

A powerful, expandable, configurable fake data library that generates data exactly as you want.

data fake faker generation generator javascript json json-data mock mocking nodejs simulate simulation typescript

Last synced: 14 Apr 2025

https://github.com/oobianom/shinyStorePlus

An R package that uses in-browser storage (IndexedDB) to keep data from Shiny inputs persistent and synchronized. It can also transfer browser link parameters to Shiny input or output values.

cran data data-structures r r-package shiny

Last synced: 05 Oct 2025

https://github.com/RealityBending/TemplateResults

A template for a data analysis folder that can be easily exported as a webpage or as Supplementary Materials

data open-science open-source pdf r reproducible rmarkdown scripts share statistics submit supplementary-material template webpage website word

Last synced: 30 Jul 2025

https://github.com/datawithbaraa/sql-data-analytics-project

This repository contains a collection of SQL scripts demonstrating various analytical techniques, such as change-over-time, cumulative, performance, data segmentation, and part-to-whole analysis.

analytics business-analytics business-intelligence data data-analysis data-analyst data-analytics data-engineering data-science data-scientist database datascience query reporting sql sql-queries sql-query sql-server window-functions window-functions-in-sql

Last synced: 15 Apr 2025

https://github.com/ckan/ckanext-validation

CKAN extension for validating Data Packages using Table Schema.

ckan ckanext data validation

Last synced: 06 Apr 2025

https://github.com/mrpaulandrewltd/Microsoft-Data-Integration-Pipeline-Training

Training workshop content on Azure Data Factory and Azure Synapse Analytics Data Integration Pipelines

azure data data-factory integration pipelines procfwk synapse-analytics

Last synced: 31 Mar 2025

https://github.com/nbremer/datasketches

A monthly collaboration project between Shirley & Nadieh

d3 d3js data data-art data-visualization

Last synced: 14 Aug 2025

https://github.com/favstats/uaconflict_equipmentloss

This repo scrapes Oryxspioenkop (daily) to document and visualize equipment losses in the Russia-Ukraine war. https://www.oryxspioenkop.com/2022/02/attack-on-europe-documenting-equipment.html

conflict data data-visualization ukraine-invasion ukrainewar war

Last synced: 13 Aug 2025

https://github.com/randomfractals/pro-data-tools

Pro Data Tools 🛠️ for VS Code IDE 🧙‍♂️: DuckDB Pro Tools, PRQL Code Lens, new Markdown SQL Pro Tools, upcoming Data Notebooks 📚 Pro Tools docs and demos, etc.

data duckdb markdown notebooks prql sql tools vscode

Last synced: 22 Mar 2025

https://github.com/garystafford/streaming-sales-generator

Streaming Synthetic Sales Data Generator: Streaming sales data generator for Apache Kafka, written in Python

analytics apache-flink apache-kafka data kafka kafka-streams kstreams python spark-structured-streaming streaming-data

Last synced: 03 Aug 2025

https://github.com/ssamadgh/ModelAssistant

Elegant library to manage the interactions between view and model in Swift

collectionview controller core coredata data datasource interactor manager model mvc mvp mvvm swift tableview view viewmodel viper

Last synced: 06 Aug 2025

https://github.com/ropensci/weatheroz

An API Client for Australian Weather and Climate Data Resources

api-client australia climate data r rainfall rstats weather weather-api weather-forecast

Last synced: 09 Apr 2025

https://github.com/airframesio/data

Centralization of source data for Airframes/Acars projects

acars airframes csv data database json sql vdl vdl2 xml

Last synced: 16 Jan 2026

https://github.com/feup-infolab/dendro

"Open-source Dropbox" with added description features. It is a data storage and description platform designed to help researchers and other users to describe their data files, built on Linked Open Data and ontologies. Users can use Dendro to publish data to CKAN, Zenodo, DSpace or EUDAT's B2Share and others.

data dendro dendro-platform infolab invenio linked-data rdm research

Last synced: 13 Jul 2025

https://github.com/lolleko/mesh-data-synthesizer

Uses Unreal Engine & Cesium to generate large synthetic datasets from 3D meshes. Enables machine learning tasks like Visual Place Recognition. Read more in our paper: https://meshvpr.github.io

cesium data geospatial machine-learning mesh place-recognition synthesis synthesizer ue5 unreal-engine

Last synced: 28 Apr 2025

https://github.com/arm-university/rpi-pico-projects-for-schools

Raspberry Pi Pico Projects for Schools: Explore cutting-edge topics in Computing, including Machine Learning and Internet of Things. Ages 16-18.

ai data datascience iot ml pico python raspberry-pi rpi

Last synced: 23 Apr 2025

https://github.com/cahyadsn/db_rajaongkir

Province, city/regency and district code data for RajaOngkir

data kabupaten kecamatan kode kota provinsi rajaongkir sql

Last synced: 21 Feb 2025

https://github.com/ahuang11/ahlive

animate your data to life

ahlive animate animation data gif matplotlib xarray

Last synced: 17 Mar 2025

https://github.com/faviovazquez/odsc_india_2018

My presentation at ODSC India 2018 about Deep Learning with Apache Spark

data datascience deeplearning optimus pyspark spark

Last synced: 30 Apr 2025

https://github.com/jonschlinkert/read-yaml

Very thin wrapper around js-yaml for directly reading in YAML files.

data file json yaml

Last synced: 30 Apr 2025

https://github.com/charity-base/charity-base-api

CharityBase GraphQL API

api charity data graphql

Last synced: 05 May 2025

https://github.com/randomfractals/vscode-data-table

Data Table 🈸 , Flat Data Grid 中 & Data Summary 🈷️ Renderers for VSCode Notebook 📓 Cell ⌗ Data Outputs

cell data data-summary flat-data geo notebook output renderer runbooks runme table view viewer vscode

Last synced: 20 Jul 2025

https://github.com/tsolomko/BitByteData

Read and write bits and bytes in Swift.

bits bytes data swift

Last synced: 02 May 2025

https://github.com/achyutjoshi/Flight-Prices-Scraper

Automated script to scrape flight prices from any website into CSV format

collection data flight prices scraper

Last synced: 07 Apr 2025

https://github.com/eigenfoo/cryptics

A dataset of cryptic crossword clues, collected from various blogs and digital archives.

cryptic-crossword-clues cryptic-crosswords cryptics data dataset

Last synced: 22 Mar 2025

https://github.com/airbloc/airbloc-go

Airbloc Core Implementation using Go

airbloc blockchain data data-exchange golang

Last synced: 17 Jan 2026

https://github.com/waylonwalker/kedro-static-viz

kedro cli plugin for generating a static kedro viz site (html, css, js) that can be deployed on many serverless tools.

data dataengineering datapipeline kedro kedro-plugin python

Last synced: 05 May 2025

https://github.com/Energy-Sparks/energy-sparks

Source for the EnergySparks website

bath data energy rails ruby ruby-on-rails school

Last synced: 07 May 2025

https://msune.github.io/libcdada/

Basic data structures in C: list, set, map/hashtable, queue... (libstdc++ wrapper)

bitmap c cdada data data-container data-structures data-structures-and-algorithms hashmap hashtable library libstdc libstdcxx linked-list list map queue set stack string struct

Last synced: 18 Nov 2025

https://github.com/ludvigolsen/groupdata2

R-package: Methods for dividing data into groups. Create balanced partitions and cross-validation folds. Perform time series windowing and general grouping and splitting of data. Balance existing groups with up- and downsampling or collapse them to fewer groups.

balance cross-validation data data-frame fold group-factor groups participants partition rstats split staircase

Last synced: 12 Oct 2025

https://github.com/xdrop/jrand

A Java library to generate random data for all sorts of things. Java random data faker

data faker java random random-generation random-number-generators random-string randomization

Last synced: 14 Jun 2025

https://github.com/daveebbelaar/ai-fundamentals

Learn the fundamentals of building AI solutions using Python.

ai data python tutorials

Last synced: 06 Sep 2025

https://github.com/github/transparency

Structured data files for topics covered by GitHub's Transparency Report

data dataset open-data transparency

Last synced: 19 Oct 2025

https://github.com/tocreator/tostore

A high-performance relational database for Dart with multi-space architecture. Features smart caching, file/local storage, SQL & key-value persistent store.

cache data database db schema sql storage

Last synced: 31 Jan 2026

https://github.com/msune/libcdada

Basic data structures in C: list, set, map/hashtable, queue... (libstdc++ wrapper)

bitmap c cdada data data-container data-structures data-structures-and-algorithms hashmap hashtable library libstdc libstdcxx linked-list list map queue set stack string struct

Last synced: 24 Apr 2025

https://github.com/stefan-niedermann/nextcloud-tables

📊 Android client for nextcloud tables app

android data nextcloud nextcloud-tables tables

Last synced: 16 Mar 2025

https://github.com/Ana06/medical-data-android

Android app to collect data to be analyzed for medical purposes.

android bipolar-disorder-patients data medical prototype ucm user-testing

Last synced: 12 Jul 2025

https://github.com/gamemann/the-dpdk-stats

A simple DPDK application that calculates stats for dropped and forwarded packets depending on the command line.

counter data development dpdk dropped forward kit packet plane stats

Last synced: 18 Mar 2025

https://github.com/CedricBonjour/nanocell-csv

A free csv file viewer & editor

csv-files data editor visualization

Last synced: 25 Feb 2025

https://github.com/theengineeringworld/python-data-science

Python Data Science has all the data sets and Jupyter notebook files for the YouTube course at http://youtube.com/theengineeringworld under the name "Python Data Science Course".

data data-analysis data-mining data-science data-visualization jupyter-notebook jupyter-notebooks machine-learning python python27

Last synced: 17 Nov 2025

https://github.com/marcusschiesser/intraday

Download and cache intraday finance market data using yfinance

api cache data finance market yfinance

Last synced: 02 Oct 2025

https://github.com/morellodev/react-test-attributes

React library to add data-* attributes to DOM elements.

attributes cypress data dom e2e javascript reactjs selenium test testing typescript

Last synced: 23 Nov 2025

https://github.com/vikashpr/daml

Data Analytics and Machine Learning 2024 is a national-level workshop organised by the SRM Analytics Society of India Student Chapter. This project builds a data analytics dashboard in Google Sheets and deploys it to a personal website using GitHub Pages.

data data-analytics data-analytics-bootcamp github-pages google-sheets

Last synced: 05 Apr 2025

https://github.com/chalk-ai/examples

Curated examples and patterns for using Chalk. Use these to build your feature pipelines.

chalk data data-science ml ml-ops pipeline python

Last synced: 17 Jan 2026

https://github.com/fdonnet/dapper-sql-generator

Tool that uses an MS-SQL project (.dacpac) to generate stored procedures, entities and Dapper DbContext (async, ready for .NET Core and .NET 6)... Extensible to a lot of use cases.

configurator dal dapac dapper dapper-donet-core data generator mssql net6 stored-procedures

Last synced: 27 Feb 2025

https://github.com/naturalsolutions/ecosecrets

ecoSecrets is a web application that enables users to manage their camera trap data

biodiversity camera-traps data opensource picture python react wildlife

Last synced: 22 Jan 2026

https://github.com/mtes-mct/parcours-r

Teaching kit for R training

data dplyr formation ggplot2 markdown r shiny statistics tidyverse

Last synced: 10 Oct 2025

https://github.com/kamath/nfl-data-hacking

Algorithmically draft NFL players for your fantasy league!

algorithm data hacking

Last synced: 28 Feb 2025

https://github.com/kodedninja/datta

A readable plain text data structure

data data-structures javascript json nodejs

Last synced: 30 Apr 2025

https://github.com/nuhmanpk/webtrench

A powerful and easy-to-use web scraper for collecting data from the web. Supports scraping of images, text, videos, metadata, and more. Ideal for machine learning and deep learning engineers. Download and extract data with just one line of code.

audio-datasets data data-collection data-science dataset-generation deep-learning image-data-generator machine-learning python scarper text-datasets

Last synced: 21 Mar 2025

https://github.com/pycroscopy/pyusid

Framework for storing, visualizing, and processing Universal Spectroscopic and Imaging Data (USID)

data hdf5 imaging parallel-computing spectroscopy

Last synced: 11 Apr 2025