An open API service indexing awesome lists of open source software.

data

Individual facts, statistics, or items of information, often numeric. In a technical sense, data are a set of values of qualitative or quantitative variables about one or more persons or objects. (https://en.wikipedia.org/w/index.php?title=Data&oldid=1093674723, released under CC BY-SA 3.0)

https://github.com/woctezuma/download-steam-banners-data

Data consisting of Steam banners.

data steam steam-api

Last synced: 06 Jan 2026

https://github.com/sharmadhiraj/free-json-datasets

Collection of free JSON data that are scraped and parsed from different websites.

collection crawler data data-scraping datasets json sports statistics web-scraping

Last synced: 28 Mar 2025

https://github.com/minightdev/paperclip

Paperclip is a powerful privacy-focused data breach search engine that empowers users to swiftly and securely investigate breaches using email addresses and phone numbers. Our robust search engine delivers real-time results while prioritizing the privacy and security of user queries.

beaches data database pwn pwned search-engine

Last synced: 22 Mar 2025

https://github.com/heikomuller/histore

Library for maintaining snapshots of evolving tabular data sets

data version-control

Last synced: 10 Apr 2025

https://github.com/alexandregazagnes/scikit-res

Very basic package to store the results of ML models. Grid search results are hard to exploit; this package aims to store them in a more convenient way.

data machine-learning mlops mlops-workflow results scikit-learn

Last synced: 20 Jan 2026

https://github.com/philhawksworth/netlify-plugin-trello-lists

A plugin to fetch the JSON data of a public Trello board and stash the data for each list in a JSON file before your build runs, making the data available to your static site generator at build time.

api data eleventy netlify plugin trello

Last synced: 20 Jan 2026

https://github.com/ugurcanerdogan/cross-validation-with-imbalanced-dataset

BBM467*SDSP - Small Data Science Project - Things to consider in cross-validation and resampling when dealing with imbalanced data: What is the right way?

bbm467 cross-validation data data-science kfold-cross-validation logistic-regression machine-learning oversampling sdsp smote

Last synced: 21 Jun 2025

https://github.com/lemmotresto/migrational

A data migration library

data java migration versioning

Last synced: 30 Oct 2025

https://github.com/johnmackintosh/simd2016_tmap

Mapping SIMD with tmap - static & interactive

data data-science data-visualization mapping r visualisation

Last synced: 20 Mar 2025

https://github.com/masesgroup/datadistributionmanager

A reliable subsystem to distribute data across multiple datacenters using multiple languages (C/C++, .NET, JVM enabled languages) over multiple technologies (e.g. Apache Kafka, OpenDDS, etc)

apachekafka availability business-solutions businesscontinuity data durability opendds reliability softwareneverlockdown

Last synced: 14 Apr 2025

https://github.com/robjg/dido

Data In/Data Out in many formats

csv-parser data etl java json-parser

Last synced: 11 Jan 2026

https://github.com/felixklauke/atomizer

Playing around with Butter Knife, Android bindings, and RxJava.

binding butterknife data java react rx rxjava

Last synced: 15 May 2026

https://github.com/antoineaugusti/purchasing-power

Archive daily data about purchasing power parity: how much goods should cost in various countries

archive data purchasing-power-parity

Last synced: 28 Oct 2025

https://github.com/nazar-pc/fixed-size-multiplexer

A tiny library for multiplexing data chunks into blocks of fixed size and vice versa

chunk data demultiplex demux fixed multiplex mux size

Last synced: 31 Oct 2025

https://github.com/marcosvidolin/firestore-bulk-loader

A simple tool to load data to Cloud Firestore.🔥

bulk-loader cloud data database firebase firestore import load loader tools

Last synced: 23 Jun 2025

https://github.com/writetome51/big-dataset-paginator

A TypeScript/JavaScript class for pagination in a real-world web app.

app data javascript pagination paginator typescript

Last synced: 17 May 2026

https://github.com/max-tonny8/android_web3

An Android library for fetching data from a node on the Ethereum or Solana chain.

android blockchain coroutines coroutines-android data eth-call ethereum kotlin ktx retrofit rpc smart-contracts solana web3 web3j

Last synced: 27 Mar 2025

https://github.com/danlsn/causality

A Personal Data Platform and the culmination of years of curiosity and learning in the Data Engineering space.

data data-engineering datawarehousing personal-data quantified-self

Last synced: 06 Mar 2026

https://github.com/headless-start/data-augmentation-impact

This repository demonstrates the effect of data augmentation of the training set during model training.

augmented-images cuda data gpu keras matplotlib mnist opencv-python python3 tensorflow training-data

Last synced: 05 Apr 2026

https://github.com/satyam4229/college-predictor-system

The college predictor system is a Python-based application that utilizes a machine learning model to predict colleges and their corresponding degree programs and branches based on a student's JEE (Joint Entrance Examination) score.

data data-science jupyter-notebook kaggle prediction python

Last synced: 06 Apr 2026

https://github.com/longzheng/southeastwater-usage-scraper

Extract hourly water usage data for digital water meters from the South East Water portal website

australia data iot playwright southeastwater victoria water

Last synced: 06 Feb 2026

https://github.com/shysolocup/aepl

A Node.js multi-layered class creation package with built-in parenting systems that let you get info from classes above, plus better function and property makers for easier-to-read development and modding support. Inspired by Roblox's Studio API.

aepl backend classes data framework game-development game-framework javascript js js-class js-framework lightweight nodejs package

Last synced: 28 Oct 2025

https://github.com/jesusgraterol/bitcoin-lightning-network-stats-dataset-builder

The dataset builder script extracts Bitcoin's Lightning Network statistics through Mempool.space's public API. The data is stored in a .csv file, facilitating its use in data science and machine learning projects.

bitcoin blockchain blockchain-technology data data-science dataset dataset-generation lightning-network machine-learning

Last synced: 16 May 2026

https://github.com/mzazakeith/puppetmaster

Puppeteer & Crawl4AI microservice for web automation, scraping, and AI processing with Bull queues

agent ai automation bull bullmq chrome crawl4ai crawler data data-extraction extraction gemini llm llms openai playwright puppeteer web-automation

Last synced: 13 May 2025

https://github.com/kodie/migrate-acf-field-data-to-repeater

A WordPress plugin that migrates field metadata for ACF fields that have been moved inside of a repeater

acf acf-field acf-fields advance-custom-field data data-migration data-migration-tool wordpress wordpress-plugin

Last synced: 19 May 2026

https://github.com/antvis/create-antv-demo

A simple CV-dashboard framework for practicing how to use AntV.

antv cv dashboard data resume resume-template resume-website visualization

Last synced: 09 Apr 2025

https://github.com/radekbednarik/data_generator

Random data generator using Python. Generate data files with random strings, floats, ints, and dates via console or TOML files.

csv data generator python python3 random test-data-generator

Last synced: 13 Dec 2025

https://github.com/exaluc/webhookcatcher

Catch your webhooks like a dream

api catcher data webhook webhook-callbacks webhooks-catcher

Last synced: 14 Apr 2025

https://github.com/ange007/jquery.mydata

jQuery.myData - Small jQuery & Zepto plugin for two-way data binding.

data data-binding jquery jquery-plugin zepto zepto-plugin zeptojs

Last synced: 19 May 2026

https://github.com/mohammadkarbalaee/introduction-to-data-science-sbu

Reports and full documentation of the introduction to data science course held at SBU

data data-science python shahid-beheshti-university

Last synced: 27 Mar 2025

https://github.com/zenwor/table_editor

A simple table data editor, with easily scalable functions and operations & a nice GUI

data data-science formula java parser parsing preprocessing swing tokenizer

Last synced: 22 Jun 2025

https://github.com/andrei-vataselu/data-science-snippets

🧰 Essential EDA and Data Cleaning Helpers for Any DataFrame. This collection of functions is designed to accelerate exploratory data analysis (EDA), quickly surface data quality issues, and offer high-level insights into the structure and content of your dataset.

artificial-intelligence data data-science eda feature-engineering hyperparamater-tunning library loading model-evaluation modeling preprocessing python snippets text-processing time-series visualization

Last synced: 10 Mar 2026

https://github.com/memair/apps

App Store for Memair

apps appstore data data-science quantified-self

Last synced: 06 Apr 2026

https://github.com/Ekey/ER.DATA.Tool

Tool for extracting data archives from the mobile game Earth Revival (Project Arrival)

data earth-revival idx project-arrival

Last synced: 19 May 2026

https://github.com/hyper63/copy

hyper copy tool

copy data hyper

Last synced: 20 Jul 2025

https://github.com/utrechtuniversity/dataprivacysurvey

Code for analysing data from the Data Privacy Survey (2022)

data gdpr open-science privacy rdm research research-data-management survey utrecht-university

Last synced: 16 Jun 2025

https://github.com/wfamous/fiv_update-data

This project automates the retrieval, processing, and publishing of digital product data for our Shopify store. It integrates Google Cloud Platform (GCP), Amazon Web Services (AWS), Terraform (Tofu), Python, Bash, Ansible, and GitHub Actions to manage data pipelines efficiently.

ansible aws bash data data-analysis data-science devops gcp python pythonpackage shopify terraform tofu

Last synced: 17 Feb 2026

https://github.com/rrighart/rrighart.github.io

A webpage about data science, programming, statistics and related topics

analyses data data-mining programming statistics

Last synced: 20 Jan 2026

https://github.com/petermeissner/statsgrokse

R-API-binding to stats.grok.se server providing Wikipedia page view statistics for 2008 up to 2015

api binding data pageviews r wikipedia

Last synced: 17 May 2026

https://github.com/aaronmeder/social-history

A quick look into your history on social media. Drop in the archives you've downloaded from Facebook and Instagram and see some stats about your time on the networks.

archives data facebook instagram statistics stats

Last synced: 27 Mar 2025

https://github.com/olegegoism/datagenerator

Django web application for managing database connections and generating test data.

app application big-data csv data database dataset db django fake generator schema teable work

Last synced: 26 Oct 2025

https://github.com/acaciaman/db-autotest

Database test automation. This Python package lets you create a database object structure and load data from a database.

data database test-automation

Last synced: 05 May 2026

https://github.com/xxczaki/parsify-plugin-covid19

Parsify plugin that adds COVID-19-related variables 🦠

confirmed coronavirus covid19 data deaths fun math parser parsify parsify-plugin plugin variable variables

Last synced: 13 Mar 2026

https://github.com/ayoub-amzil/offline-globe

Offline country data for PHP Laravel framework. Over 200 countries, capitals, flags, languages, currencies. No internet needed.

composer data internet laravel offline php

Last synced: 09 May 2026

https://github.com/planarnetwork/feeds.planar.network

GTFS feeds for bus, train and plane

data feeds gtfs transit transportation

Last synced: 11 Feb 2026

https://github.com/asidlo/po

Data science library for manipulating data in Go using the familiar DataFrame and Series constructs from the Python Pandas library.

data dataframe go pandas series

Last synced: 14 Jan 2026

https://github.com/d3oxy/country-state-data

A comprehensive JSON dataset containing countries, states, cities, regions, and languages with TypeScript support. Perfect for building location-based dropdowns, address forms, and geographical applications.

address cities countries currency data dropdown geographical iso json languages location regions states typescript

Last synced: 24 Jan 2026

https://github.com/pawelzny/vo

DDD Value Object implementation

data ddd-patterns object python3 value

Last synced: 15 Feb 2026

https://github.com/jaldekoa/nyfedapi

A Python wrapper to easily retrieve data from the Federal Reserve Bank of New York (FRBoNY) official API in pandas format.

api api-wrapper banking data finance pandas python united-states

Last synced: 08 Feb 2026

https://github.com/dhimmel/het.io-rep-data

Data from Project Rephetio for the het.io website

browser data datatables drug-repurposing rephetio

Last synced: 07 Feb 2026

https://github.com/a3r0id/lightshot-data-miner

A random idea I had a while back to make a data miner for Lightshot. I never released this, but after a friend sent me a post about Lightshot's transparency I figured it'd be a good time to release it. I've included some output from a run before making the repo. I am not responsible for the imagery or its contents.

brute-force bruteforce data dataset face-recognition image-processing lightshot mining scraper scraping text-recognition

Last synced: 19 Oct 2025

https://github.com/tomwhite/chernoff

A visual mood indicator. One of the first Java programs I ever wrote.

chernoff-faces data visualization

Last synced: 20 Apr 2026

https://github.com/sapienzanlp/exploring-srl

Repository for the paper "Exploring Non-Verbal Predicates in Semantic Role Labeling: Challenges and Opportunities"

acl acl2023 conllu data dataset natural-language-processing nlp semantic-role-labeling srl

Last synced: 31 Jan 2026

https://github.com/zgbjgg/quetzal-examples

Examples using Quetzal :rocket: :bird:

analytics dashboard data data-visualization elixir erlang plotly web-app

Last synced: 24 Apr 2026

https://github.com/kefniark/kaaya

JS Library for State management and Data synchronization between Applications

data game kaaya mutation network serialization state-management

Last synced: 06 Jun 2026

https://github.com/openpeeps/zxc-nim

Bindings to the ZXC compression library, an LZ77-based compressor optimized for high decompression speed

archive compression compressor data decompression game-assets lossless lossless-compression lz77 nim nim-bindings nim-package nim-wrapper openpeeps zxc

Last synced: 07 Jun 2026

https://github.com/csadorf/pydata-ann-arbor-2018

Slides and notebooks demonstrating signac for PyData Ann Arbor Meetup 2018

data data-management jupyter signac workflow

Last synced: 04 Jun 2026

https://github.com/gauravkoradiya/tensorflow-data-and-deployement

This repository demonstrates the use of data and deployment pipelines in TensorFlow.

data deployment machine-learning-algorithms pipline tensorflowjs

Last synced: 06 Oct 2025

https://github.com/stdlib-js/ndarray-base-dtype-str2enum

Return the enumeration constant associated with an ndarray data type string.

array data dtype dtypes enum javascript multidimensional ndarray node node-js nodejs stdlib types util utilities utility utils

Last synced: 15 Mar 2026

https://github.com/vutran/yahoo-stocks-cli

Fetch stock data from the CLI

cli data finance stocks yahoo

Last synced: 08 Jun 2026

https://github.com/stefen-taime/real-time-data-pipeline-snake-game

Dynamic Snake Game: Unleashing Real-Time Streaming Analytics with Redis, Kafka, Flink, ClickHouse & Chart.js in an Online Snake Game via Flask API

chartjs clickhouse confluent-cloud data flask kafka-streams pipeline redis

Last synced: 04 May 2026

https://github.com/1sumer/sql

This repository contains SQL scripts and data for various analytical and database management tasks. The project is designed to demonstrate SQL capabilities in handling complex queries, data analysis, and database design. It includes datasets related to e-commerce and streaming services, with a focus on real-world scenarios and use cases.

analytics data data-analysis data-storage sql vscode

Last synced: 19 Jan 2026

https://github.com/yanpitangui/iteminfoconverter

Application that converts Ragnarok legacy data files to iteminfo.lua

data itemdbconf iteminfo luafiles ragnarok

Last synced: 12 Oct 2025

https://github.com/vyahello/fake-employee-api

👨‍🔧 Simple mock employees data parser (responder + heroku + pytest + github/travis CI)

data employee employer mock responder rest-api

Last synced: 09 Jun 2026

https://github.com/bdpedigo/neuropull

A (soon to be) lightweight Python package for accessing single-cell connectome networks with metadata.

connectome connectomes connectomics data dataset networks networks-biology

Last synced: 05 Oct 2025

https://github.com/yorkulibraries/vendorpol

URLs for vendor privacy policies and terms of use.

data libraries privacy-policy

Last synced: 15 Oct 2025

https://github.com/amethyst-php/customer

A person or an organization that pays for goods or services

amethyst amethyst-package api customer data laravel

Last synced: 11 May 2026

https://github.com/dantesc03/uberpool-case-study

This project was designed to understand the statistical effects of longer wait times on Uber rides, particularly on the user and driver experience with the Uber Pool system.

analysis data excel jupyter jupyternotebooks learn python seaborn statistics t-tests uber visualization

Last synced: 16 Apr 2026

https://github.com/bastianolea/siedu_indicadores_urbanos

Data from the Sistema de Indicadores y Estándares de Desarrollo Urbano (System of Urban Development Indicators and Standards), with commune-level data on topics such as transportation, urban planning, basic services, quality of life, and more.

ambiental app chile ciudad comunas data estado social

Last synced: 19 Feb 2026

https://github.com/aleklukanen/chapterhousedb-v1

Allows you to create simple data streaming warehouses written in Golang using Apache Parquet and Arrow.

arrow data database event golang ingestion parquet pipeline processing stream

Last synced: 27 Feb 2026

https://github.com/ium101/files-and-folders-lister-z

Files and Folders Lister Z is a utility for listing the contents of directories on your computer. It provides both a command-line and a graphical user interface (GUI) for easy use.

application application-code brasil brazil cmd command data database databases exe filemanagement filesystem linux lowcode macos python sh tool utility windows

Last synced: 09 Oct 2025

https://github.com/sabujxi/python-scraper-and-data-analysts-admin-panel-in-django

A data scraper for a Texas government site and a companion web app for managing, reviewing, and editing the data

analyst data data-analysis data-entry data-scraper django django-application python python-scraper real-estate regex scraper texas

Last synced: 30 Apr 2026

https://github.com/mohasarc/treeviz

The best tree data-structures visualization tool

data structures visualization visualization-tools

Last synced: 25 Apr 2026

https://github.com/oliverhennhoefer/shiny-template-interactive-table

Example of interactively adding rows / deleting rows by selecting directly in a data.table (DT) in Shiny

button data delete dt r select selection server shiny shiny-applications shiny-apps shiny-r shinyapps table ui userinterface

Last synced: 16 Apr 2026

https://github.com/zig-utils/zig-faker

A high-performance, lightweight fake data generator. Generate realistic fake data for testing, prototyping, and development.

data faker library mocker zig

Last synced: 01 Apr 2026

https://github.com/bastianolea/sinim_info_municipal

Database of the Sistema Nacional de Información Municipal (National Municipal Information System), which includes commune-level data on municipal finances, human resources, education, health, pensions, social organizations, and more.

chile comunas data estado laboral politica social tiempo

Last synced: 26 Oct 2025

https://github.com/audeering/emodb

Publishes the Berlin Database of Emotional Speech with audb

audb data emotion

Last synced: 19 Oct 2025