An open API service indexing awesome lists of open source software.

data

Individual facts, statistics, or items of information, often numeric. In a technical sense, data are a set of values of qualitative or quantitative variables about one or more persons or objects. (https://en.wikipedia.org/w/index.php?title=Data&oldid=1093674723, released under CC BY-SA 3.0)

https://github.com/datawookie/data-diaspora

Various datasets used in tutorials and workshops.

data datasets

Last synced: 20 Mar 2025

https://github.com/carpentries-incubator/indigenous-data-sovereignty

Introduces the concepts and framework of Indigenous Data Sovereignty and Governance.

data english lesson pre-alpha

Last synced: 24 Jan 2026

https://github.com/mutasim77/dbt-analytics

Repo for analytics engineering with dbt, transforming raw data into actionable insights.

big-query data data-analysis dbt warehouse

Last synced: 25 Feb 2026

https://github.com/legopitstop/addons

All legopitstop's Bedrock add-ons in one place.

add-on assets behaviorpack data hacktoberfest minecraft mods modtoberfest resroucepack vanilla

Last synced: 06 Feb 2026

https://github.com/0xdir/htcds_dart

Human Trafficking Case Data Standard (HTCDS v0.2) objects, for easy creation, storage and transmission of case data related to human trafficking.

data humanitarian schema standards

Last synced: 24 Oct 2025

https://github.com/andyfratello/dbd

🗄️ Database Design (DBD) exercises, Q2 - UPC FIB

data database dbd dbd-fib dbeaver fib-upc nosql nosql-database oracle sql sql-database

Last synced: 10 Feb 2026

https://github.com/desultory/pycpio

Python library for CPIO manipulation

cpio cpio-archives data initramfs pypi-package python python-3 python3

Last synced: 04 Feb 2026

https://github.com/sneels/parkds

Connect all your Data Sources via 1 process (Cross-Domain + Single-Domain)

cross-domain data database datasource datasources javascript source

Last synced: 24 Feb 2026

https://github.com/floriancassayre/nicknames-datasets

Open source nickname sets with information about the data origin(s).

data data-mining dataset

Last synced: 08 Feb 2026

https://github.com/reubano/ckanutils

A Python library for interacting with CKAN instances

ckan data library open-data

Last synced: 10 Feb 2026

https://github.com/tosun-si/world-cup-qatar-team-stats-kotlin-midgard

This application shows a full Apache Beam pipeline with Kotlin and the Midgard library. The use case works on data from the last Qatar FIFA World Cup and calculates player statistics per team. This application will be presented at Beam Summit 2023 in New York.

apache-beam beam-summit data kotlin midgard world-cup-2022

Last synced: 01 Feb 2026

https://github.com/iondv/metrics

IONDV. Framework application: Metrics collects and displays metrics data.

collecting data data-analysis iondv iondv-app metrics

Last synced: 10 Feb 2026

https://github.com/vrm-piyush/python-projects

Open source Python projects. Feel free to contribute!

data dataanalysis games open-source pygame-games python python-app

Last synced: 26 Feb 2026

https://github.com/critocrito/data-scores-in-the-uk

Investigate the uses of data analytics and algorithms in public services in the UK.

clojure data data-investigation data-preservation javascript social-sciences sugarcube uk

Last synced: 18 Oct 2025

https://github.com/cmudig/mosaic-profiler

A data profiler built with Mosaic

data jupyter visualization

Last synced: 25 Oct 2025

https://github.com/reala10n/simplejsondb

Create a simple JSON database with just one line of code!

data database db easy json python simple

Last synced: 27 Oct 2025

https://github.com/hadro/brewery-guides

The data for guides to breweries across the United States from 1896 to 1918

brewers brewery-guides brewing brewing-history data dataset digital-collections digital-humanities hocr nypl open-data

Last synced: 16 Mar 2026

https://github.com/jsdhami/python-for-research

"Python-For-Research" Event Organized By Tri-Chandra Research Group, Ghantaghar, Kathmandu

analysis colab data jupyter matplotlib numpy panda physics python research visualization

Last synced: 27 Oct 2025

https://github.com/imagodata/filter_mate

FilterMate is a QGIS plugin, an everyday companion that allows you to easily filter your vector layers

data exploratory-data-analysis filter geospatial ogr postgis qgis qgis-plugin qgis3 qgis3-plugin spatialite sql vector-database

Last synced: 29 Apr 2026

https://github.com/suh1z/rakkauttify_fullstack

CS2 Data and Statistics Dashboard - full-stack project

analytics data expressjs gaming mongo nodejs react redux

Last synced: 24 Oct 2025

https://github.com/as/worm

Worm provides write-once read-many log-structured storage semantics

data log record storage worm

Last synced: 31 Jan 2026

https://github.com/open-i18n/data-unicode-cldr

Git mirror for Unicode Common Locale Data Repository (CLDR) data

cldr data open-i18n unicode unicode-consortium

Last synced: 07 Feb 2026

https://github.com/udityamerit/python-librearies-for-data-science

Python libraries for data science enable efficient data manipulation, analysis, and modeling. Key libraries include NumPy for numerical computing, pandas for data handling, Matplotlib for visualization, Scikit-learn for machine learning, TensorFlow for deep learning, and BeautifulSoup/requests for web scraping. These libraries simplify complex data

beautifulsoup data data-science data-science-libraries machine-learning matplotlib numpy pandas requests scikit-learn scikitlearn-machine-learning tensorflow

Last synced: 06 Feb 2026

https://github.com/josechirif/reviews-and-satisfaction-analysis-of-airbnb-brazil-and-mexico-from-june-2010-to-february-2021

This project analyzes the reviews and satisfaction of customers who used Airbnb services. It also studies whether there is a relationship with other variables.

data data-analysis data-visualization powerbi sql-server

Last synced: 25 Feb 2026

https://github.com/audeering/emodb

Publishes Berlin Database of Emotional Speech with audb

audb data emotion

Last synced: 19 Oct 2025

https://github.com/cptpiepmatz/tabledatamerge

🔀 Merge plain text tables together.

cli data format latex table tdm

Last synced: 24 Feb 2026

https://github.com/ballerina-platform/module-ballerina-data.csv

The Ballerina CSV Data Library is a comprehensive toolkit designed to facilitate the handling and manipulation of CSV data within Ballerina applications. It streamlines the process of converting CSV data to native Ballerina data types, enabling developers to work with CSV content seamlessly and efficiently.

ballerina ballerina-csv csv csv-data data

Last synced: 29 Jan 2026

https://github.com/squareslab/probabilisticmodel_saner2018

Paper and supporting materials for the Probabilistic Model paper accepted to SANER 2018

code data mausotog published replication

Last synced: 26 Oct 2025

https://github.com/yorkulibraries/vendorpol

URLs for vendor privacy policies and terms of use.

data libraries privacy-policy

Last synced: 15 Oct 2025

https://github.com/liamross/use-data

A React hook for async fetching of data, data manipulation, and take-latest vs. take-every functionality.

async data hook hooks react

Last synced: 22 Jan 2026

https://github.com/jaldekoa/nyfedapi

A Python wrapper to easily retrieve data from the Federal Reserve Bank of New York (FRBoNY) official API in pandas format.

api api-wrapper banking data finance pandas python united-states

Last synced: 08 Feb 2026

https://github.com/flrd/standardlastprofile

R Data Package for BDEW Standard Load Profiles in Electricity

data electricity germany r

Last synced: 16 Mar 2026

https://github.com/cerema/groum

Command-line utility for converting traffic order data

data traffic

Last synced: 06 Feb 2026

https://github.com/drkenreid/introductory-data-science

Hands-on machine learning tutorials in Google Colab, covering various algorithms and techniques for learners at different levels.

cnn data data-science deep-learning learning-datascience learning-machine-learning learning-python neural-network neural-networks regression rnn science tutorial tutorial-exercises tutorials

Last synced: 28 Jan 2026

https://github.com/nononoexe/setariaviridis

🌾 Field-collected data of green foxtail

data data-science dataset rpackage

Last synced: 27 Feb 2026

https://github.com/banbord/data-vis-tornados

This repository includes data files, processing scripts, visualization code, and documentation for our tornado data visualization project. It aims to provide insights into tornado patterns across the United States using interactive and informative visual representations.

d3-visualization d3js data javascript json visualization

Last synced: 24 Feb 2026

https://github.com/olegegoism/datagenerator

Django web application for managing database connections and generating test data.

app application big-data csv data database dataset db django fake generator schema teable work

Last synced: 26 Oct 2025

https://github.com/p32929/use-megamind

A simple react hook for managing asynchronous function calls with ease on the client side

async asynchronous-tasks axios client-side-javascript data data-fetching easy fetch generics hooks javascript npm painless promise query react rest simple small typescript

Last synced: 23 Jan 2026

https://github.com/d3oxy/country-state-data

A comprehensive JSON dataset containing countries, states, cities, regions, and languages with TypeScript support. Perfect for building location-based dropdowns, address forms, and geographical applications.

address cities countries currency data dropdown geographical iso json languages location regions states typescript

Last synced: 24 Jan 2026

https://github.com/asirihewage/simplest-xpath-web-scraper

Simplest web scraper created using Python3 and MongoDB

data data-mining python3 scraper web webscrping

Last synced: 29 Jan 2026

https://github.com/mrnazu/eth-data-library

eth-data-library is a Node.js library that provides tools for accessing and processing data on the Ethereum blockchain.

blockchain data ethereum nodejs smart-contracts web3

Last synced: 28 Jan 2026

https://github.com/codecentric/reedelk-bookingintegrationservice

Example service for the blog post series about Reedelk

api api-gateway data integration integration-flow

Last synced: 16 Oct 2025

https://github.com/pjt3591oo/exchange-crawler

Upbit and Coinone crawler

crawler data exchange python

Last synced: 27 Oct 2025

https://github.com/plabayo/datapoints.earth

Earth data liberation for and by its citizens.

data foss free scrape

Last synced: 15 Mar 2026

https://github.com/sapienzanlp/exploring-srl

Repository for the paper "Exploring Non-Verbal Predicates in Semantic Role Labeling: Challenges and Opportunities"

acl acl2023 conllu data dataset natural-language-processing nlp semantic-role-labeling srl

Last synced: 31 Jan 2026

https://github.com/dhimmel/het.io-rep-data

Data from Project Rephetio for the het.io website

browser data datatables drug-repurposing rephetio

Last synced: 07 Feb 2026

https://github.com/peterdavehello/nrd-list-archive

📂 A collection of past NRD lists to explore, perfect for fun, research, or just plain curiosity! 🎉✨

archive data nrd

Last synced: 17 Mar 2026

https://github.com/stdlib-js/ndarray-base-dtype-str2enum

Return the enumeration constant associated with an ndarray data type string.

array data dtype dtypes enum javascript multidimensional ndarray node node-js nodejs stdlib types util utilities utility utils

Last synced: 15 Mar 2026

https://github.com/wonderium/browser-releases

This repository contains release dates for browser versions.

browsers data json releases wonderium

Last synced: 31 Jan 2026

https://github.com/a3r0id/lightshot-data-miner

A random idea I had a while back to make a data miner for lightshot. Never released this but after a friend sent me a post about lightshot's transparency I figured it'd be a good time to release this. I've included some output from a run before making the repo. I am not responsible for the imagery or its contents.

brute-force bruteforce data dataset face-recognition image-processing lightshot mining scraper scraping text-recognition

Last synced: 19 Oct 2025

https://github.com/sermetpekin/evdscpp

evdscpp is a C++ library for fast, efficient, and user-friendly interaction with the EVDS API Server. Designed with performance in mind, it provides built-in caching, an Excel export option, and an intuitive user interface for configuring and retrieving data. evdscpp can be extended for integration with other C++ projects and offers options for use

cbrt central-bank cpp data edds evds evds-api evdscpp tcmb tcmb-api

Last synced: 07 Sep 2025

https://github.com/antoineaugusti/purchasing-power

Archive daily data about purchasing power parity: how much goods should cost in various countries

archive data purchasing-power-parity

Last synced: 28 Oct 2025

https://github.com/headless-start/data-augmentation-impact

This repository demonstrates the effect of data augmentation of the training set during model training.

augmented-images cuda data gpu keras matplotlib mnist opencv-python python3 tensorflow training-data

Last synced: 05 Apr 2026

https://github.com/amethyst-php/address

The place where a person or organization can be found or communicated with. Contains fields such as: street, postal code, city, country etc... Can be used for example as a shipment address or as an invoice address.

address amethyst amethyst-package api data laravel

Last synced: 13 Aug 2025

https://github.com/null-none/jwlibrary2json

Handler for jwlibrary to json

data json jwlibrary library parser

Last synced: 14 May 2026

https://github.com/thekartikeyamishra/data_cleaning_project

Welcome to the Data Cleaning and Visualization project! This repository demonstrates how to clean messy data and create insightful visualizations using Python with Pandas and Matplotlib.

data dataanalysis matplotlib matplotlib-pyplot pandas python

Last synced: 02 May 2026

https://github.com/courtois-neuromod/anat

Anatomical sub-dataset of Courtois-Neuromod project.

data raw

Last synced: 17 Jan 2026

https://github.com/kom-senapati/ghw-data-hacks

🌍 Global Hack Week data projects, 📊 focused on exploration, manipulation, and analysis...

data ghw

Last synced: 12 Mar 2025

https://github.com/dmnsgn/airports-data

Airports data: static, dynamic and custom dump.

airport airports cli data database json

Last synced: 30 Apr 2025

https://github.com/imadsaddik/bodmaghdataset

BoDmagh dataset is a Supervised Fine-Tuning (SFT) dataset for the Darija language

arabic-llm arabic-nlp darija-llm darija-nlp data dataset fine-tuning llm nlp sft

Last synced: 03 Apr 2025

https://github.com/sheweny/discord-resolve

This module groups together functions to retrieve data from different types of arguments.

data discord discord-js mentions resolver sheweny utility

Last synced: 29 Oct 2025

https://github.com/schbenedikt/datamining

Heise (https://heise.de) News Crawler

data data-science heise postgresql web-crawler

Last synced: 10 Apr 2025

https://github.com/seregpie/vuefort

The state management for Vue.

data model state store vue

Last synced: 13 Apr 2026

https://github.com/thealphadollar/messiah

Messiah: The Mighty Son Of God Is Here To Help You Through Times Of Calamity

azure backend data data-analysis flask frontend materialize natural-disasters

Last synced: 19 Jan 2026

https://github.com/godeltech/godeltech.data.entityframeworkcore

Library to access database with Unit of Work, Repository and Entity classes for Entity Framework Core.

data entity entity-framework-core repository unitofwork

Last synced: 30 Apr 2025

https://github.com/godeltech/godeltech.data

.NET library to access data storage with Unit of Work, Repository and Entity classes

data entity repository unitofwork

Last synced: 30 Apr 2025

https://github.com/polina-prokofieva/viewjson

A class for convenient visualization of JSON with configurable settings.

data data-visualization es5 es6 javascript json

Last synced: 15 May 2026

https://github.com/woctezuma/download-steam-banners-data

Data consisting of Steam banners.

data steam steam-api

Last synced: 06 Jan 2026

https://github.com/heikomuller/histore

Library for maintaining snapshots of evolving tabular data sets

data version-control

Last synced: 10 Apr 2025

https://github.com/felixklauke/atomizer

Playing around with Butter Knife, Android bindings and RxJava.

binding butterknife data java react rx rxjava

Last synced: 15 May 2026

https://github.com/danlsn/causality

A Personal Data Platform and the culmination of years of curiosity and learning in the Data Engineering space.

data data-engineering datawarehousing personal-data quantified-self

Last synced: 06 Mar 2026

https://github.com/satyam4229/college-predictor-system

The college predictor system is a Python-based application that utilizes a machine learning model to predict colleges and their corresponding degree programs and branches based on a student's JEE (Joint Entrance Examination) score.

data data-science jupyter-notebook kaggle prediction python

Last synced: 06 Apr 2026

https://github.com/shysolocup/aepl

A Node.js multi-layered class creation package with built-in parenting systems that let you get info from classes above, plus better function and property makers for easier-to-read development, and modding support inspired by Roblox's Studio API.

aepl backend classes data framework game-development game-framework javascript js js-class js-framework lightweight nodejs package

Last synced: 28 Oct 2025

https://github.com/antvis/create-antv-demo

A simple CV-dashboard framework for practicing how to use AntV.

antv cv dashboard data resume resume-template resume-website visualization

Last synced: 09 Apr 2025

https://github.com/radekbednarik/data_generator

Random data generator using Python. Generate data files with random strings, floats, ints, and dates via console or TOML files.

csv data generator python python3 random test-data-generator

Last synced: 13 Dec 2025

https://github.com/andrei-vataselu/data-science-snippets

🧰 Essential EDA and Data Cleaning Helpers for Any DataFrame. This collection of functions is designed to accelerate exploratory data analysis (EDA), quickly surface data quality issues, and offer high-level insights into the structure and content of your dataset.

artificial-intelligence data data-science eda feature-engineering hyperparamater-tunning library loading model-evaluation modeling preprocessing python snippets text-processing time-series visualization

Last synced: 10 Mar 2026

https://github.com/memair/apps

App Store for Memair

apps appstore data data-science quantified-self

Last synced: 06 Apr 2026

https://github.com/utrechtuniversity/dataprivacysurvey

Code for analysing data from the Data Privacy Survey (2022)

data gdpr open-science privacy rdm research research-data-management survey utrecht-university

Last synced: 16 Jun 2025

https://github.com/ciscorn/tinybufr

A Rust library for decoding BUFR meteorological observation data format

bufr data meteorology rust weather wmo

Last synced: 11 Jan 2026

https://github.com/johntocci/nullaxe

Nullaxe is a powerful and user-friendly Python library designed for cleaning and preprocessing data. It works seamlessly with both pandas and polars DataFrames, making it a versatile tool for data scientists and developers.

data data-analysis data-science datacleaning pandas polars python

Last synced: 06 Apr 2026

https://github.com/dhruvldrp9/simpledht

A Python-based Distributed Hash Table (DHT) implementation enabling cross-network key-value storage, automatic node discovery, and data replication with a simple CLI and library interface.

cross-network-node-communation data data-replication data-synchronization dht dht-python distributed-hash-table key-value-storage nat netowork node-discovery peer-to-peer peer-to-peer-network python sha-256 simple udp udp-socket-communication

Last synced: 28 Feb 2026

https://github.com/marlenezw/speech-to-text

Turn any video or audio recording into a written transcript using Python

data data-science python speech speech-recognition speech-synthesis speech-to-text

Last synced: 27 Apr 2026

https://github.com/zarr-developers/cookiecutter-zarr-store

Cookiecutter for Zarr store implementations

chunked data n-dimensional zarr

Last synced: 16 Jun 2025

https://github.com/hmeleiro/alquilermad

Housing rent map in Comunidad de Madrid / Mapa del alquiler en la Comunidad de Madrid

data data-science data-visualization datascience housing-location-visualization rent renting

Last synced: 13 Sep 2025

https://github.com/ahmedkhalf/arabic-keyword-scraper

Stop wasting your time! Obtain Arabic definitions without having to look them up.

arabic data definitions scraper sentences wordsearch

Last synced: 12 Mar 2025

https://github.com/eosdis-nasa/earthdata-pub-dashboard

Front-end Dashboard for Earthdata Pub

data earthdata edpub publication

Last synced: 15 Jan 2026