An open API service indexing awesome lists of open source software.

data

Individual facts, statistics, or items of information, often numeric. In a technical sense, data are a set of values of qualitative or quantitative variables about one or more persons or objects. (https://en.wikipedia.org/w/index.php?title=Data&oldid=1093674723, released under CC BY-SA 3.0)

https://github.com/aruneshbasak/python-dsa-problems-geeksforgeeks-160-days

I will upload my daily Python DSA problems solved on GeeksforGeeks and post them here!

algorithms-and-data-structures and data data-structures dsa python python3 structure

Last synced: 08 May 2025

https://github.com/qeeqbox/data-lifecycle-management

Data Lifecycle Management (DLM) is a policy-based model for managing data in an organization

data data-lifecycle-management infosecsimplified lifecycle management qeeqbox

Last synced: 07 Mar 2026

https://github.com/sixarm/sixarm_ruby_fab

SixArm.com → Ruby → Fab gem to fabricate sample data for testing

data fabrication factory fake gem mock ruby

Last synced: 24 Jul 2025

https://github.com/oya163/corteva

Corteva Data Ingestion Pipeline

corteva data engineering etl

Last synced: 25 Jul 2025

https://github.com/shysolocup/stews

Stews is a Node.js package meant to make storing data easier by mixing parts from common data types.

aepl array arrays data datatypes html javascript js json map maps nodejs object objects package set sets stews

Last synced: 25 Jul 2025

https://github.com/stonecharioteer/renfield

Synchronize and Search through Hard Drives

catalogue data search storage synchronization

Last synced: 09 Feb 2026

https://github.com/public-health-scotland/waiting_times_clinical_prioritisation

This repository contains the Reproducible Analytical Pipeline (RAP) to produce the quarterly statistics on clinical prioritisation, part of the Stage of Treatment (SoT) publication.

data healthcare nhs public-health scotland shiny shiny-app treatment waiting-time

Last synced: 26 Jul 2025

https://github.com/akatrevorjay/helm-nuke

Nukes all helm releases as well as tiller-owned k8s objects that may be left lying around.

all data destroy helm plugin

Last synced: 19 Sep 2025

https://github.com/velocitatem/cellviz

Cellular Automata inspired by live-data visualization, designed to handle multidimensional and high-throughput data efficiently.

cellular-automata conways-game-of-life data economics

Last synced: 29 Jul 2025

https://github.com/rrwen/twitter2pg-cli

Command line tool for extracting Twitter data to PostgreSQL databases

api cli cmd command data database geo interface line location media pg postgres postgresql rest social stream tool tweet twitter

Last synced: 12 Apr 2026

https://github.com/connectomicslab/cmtklib-data

Datalad dataset that stores all data resources of the cmtklib module of Connectome Mapper 3 (https://github.com/connectomicslab/connectomemapper3).

brain data parcellation resources software

Last synced: 16 Jan 2026

https://github.com/rohancyberops/rp1

This project performs an analysis of Starbucks (SBUX) stock returns using R. The analysis includes both simple returns and continuously compounded returns (CC returns) for a period of one month. It also calculates the growth of $1 invested in SBUX and provides visual insights through various plots.

analysis cc data r rlanguage sbux

Last synced: 15 Mar 2025
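
As an illustration of the quantities named in the rp1 entry above, here is a minimal Python sketch (the project itself uses R). It assumes the usual textbook definitions: a simple return of P_t / P_{t-1} - 1, a continuously compounded return of ln(P_t / P_{t-1}), and the growth of $1 as the running product of (1 + simple return). The function names and the price series are hypothetical and are not taken from the repository.

```python
import math


def simple_returns(prices):
    """Simple return for each period: P_t / P_{t-1} - 1."""
    return [p1 / p0 - 1.0 for p0, p1 in zip(prices, prices[1:])]


def cc_returns(prices):
    """Continuously compounded return for each period: ln(P_t / P_{t-1})."""
    return [math.log(p1 / p0) for p0, p1 in zip(prices, prices[1:])]


def growth_of_one_dollar(prices):
    """Value over time of $1 invested at the first price."""
    value = 1.0
    growth = []
    for r in simple_returns(prices):
        value *= 1.0 + r  # compound the simple return of this period
        growth.append(value)
    return growth


if __name__ == "__main__":
    prices = [100.0, 110.0, 99.0]  # hypothetical closing prices, not SBUX data
    print(simple_returns(prices))
    print(cc_returns(prices))
    print(growth_of_one_dollar(prices))
```

Continuously compounded returns add up across periods, so their sum equals the log of the final growth factor, which is one reason analyses often report both kinds.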

https://github.com/mitevpi/vue-d3-bar-chart

Reusable, reactive, animated bar chart using D3 + Vue.js. Written in idiomatic Vue, rather than D3 syntax.

d3 data data-visualization frontend interactive svg vue web

Last synced: 18 May 2026

https://github.com/kingabzpro/makefile-actions

GitHub Actions and MakeFile tutorial and project for beginners.

actions analytics automation data data-science makefile

Last synced: 18 Apr 2026

https://github.com/seguradevinn/data-project

A healthcare data audit demo using CMS SynPUF and DuckDB, showing how raw claims are cleaned, validated, and transformed into a 2009 cohort with descriptives and a RADV-style chase list.

auditing cms data duckdb sql

Last synced: 02 Sep 2025

https://github.com/lookininward/data-formatter-demo

You have directories containing data files and specification files. The specification files describe the structure of the data files. Write an app that reads format definitions from specification files. Use these definitions to convert the parsed files to NDJSON files.

csv data demo files json ndjson python txt unittest

Last synced: 27 Apr 2026

https://github.com/nesterenko-kv/object-id

ObjectIDs are a special type of identifier mainly used in MongoDB to uniquely identify documents within a collection. They consist of a 12-byte binary value that includes a timestamp, a machine identifier, a process identifier, and a counter.

c-sharp data id net object-id unique-identifier

Last synced: 16 May 2025
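
The 12-byte layout described in the object-id entry above can be sketched in Python as follows. This is only an illustration: it assumes the split of 4 bytes for the timestamp, 3 for the machine identifier, 2 for the process identifier, and 3 for the counter, and the function names are hypothetical. It does not reproduce the repository's C# API.

```python
import struct


def pack_object_id(timestamp: int, machine: bytes, pid: int, counter: int) -> bytes:
    """Pack the four fields into 12 bytes (assumed 4 + 3 + 2 + 3 layout)."""
    if len(machine) != 3:
        raise ValueError("machine identifier must be exactly 3 bytes")
    return (
        struct.pack(">I", timestamp)               # 4-byte big-endian Unix seconds
        + machine                                  # 3-byte machine identifier
        + struct.pack(">H", pid & 0xFFFF)          # 2-byte process identifier
        + (counter & 0xFFFFFF).to_bytes(3, "big")  # 3-byte counter
    )


def unpack_object_id(raw: bytes) -> dict:
    """Split a 12-byte identifier back into its four fields."""
    if len(raw) != 12:
        raise ValueError("an ObjectID must be exactly 12 bytes")
    (timestamp,) = struct.unpack(">I", raw[0:4])
    (pid,) = struct.unpack(">H", raw[7:9])
    return {
        "timestamp": timestamp,
        "machine": raw[4:7],
        "pid": pid,
        "counter": int.from_bytes(raw[9:12], "big"),
    }


if __name__ == "__main__":
    oid = pack_object_id(1700000000, b"abc", 4242, 7)  # hypothetical field values
    print(oid.hex())
    print(unpack_object_id(oid))
```

Because the timestamp occupies the leading bytes in big-endian order, identifiers packed this way sort roughly by creation time when compared as raw bytes.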

https://github.com/emnetdegafe/allesoverfilm-backend

AllesOverFilm-backend is part of the AllesOverFilm mobile app development project and contains the database structure, server query scripts, and Sequelize-cli database structures.

backend data data-model express postgresql sequelize-cli

Last synced: 11 Apr 2026

https://github.com/cintia0528/data_science-ab_testing

Conduct a 5-way AB Test on Montana State University Library's website, comparing the original "Interact" button with new versions ("Learn," "Help," "Connect," "Services") to boost user engagement.

abtesting bonferroni chisquare-test data data-science datacleaning datavisualization hypothesis-testing mde statistics

Last synced: 31 Mar 2025

https://github.com/tsvikas/covid-19-israel-data

Unofficial GitHub repository with the data published by the Israeli Ministry of Health regarding the coronavirus disease

coronavirus-disease covid-19 csv daily-reports data health israel

Last synced: 05 Jan 2026

https://github.com/ttitcombe/timekeep

Defensive timeseries analysis in Python

data data-science sklearn time-series time-series-analysis timeseries

Last synced: 05 Jan 2026

https://github.com/bileljegham/api-sport-cli

CLI for https://api-sports.io/. Retrieve data and convert it to an SQL file

cli data database match nodejs sports sports-analytics

Last synced: 08 May 2026

https://github.com/mtingers/opacify

Opacify reads a file and builds a manifest of external sources to rebuild said file.

backup data obfuscation python

Last synced: 18 May 2026

https://github.com/andygeiss/pipeline

Build your own data pipeline to gather, organize and transform data by using protobuf as an intermediate format.

data data-pipeline data-science go golang machine-learning protobuf protobuf-compiler

Last synced: 31 Mar 2025

https://github.com/dataship/beam

Get collimate'd data into Frame, in Node or the Browser

column-store data data-science

Last synced: 27 Apr 2026

https://github.com/davidgamero/gatech-covid-data-scraper

Utility for scraping GATech Exposure Alert Information into a CSV file with automated case number extraction and aggregation

covid data gatech georgia scraper

Last synced: 31 Mar 2025

https://github.com/lamden/merk

A concise implementation of a merkle tree in Python.

crypto data hash merkle structure tree

Last synced: 27 May 2026
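
For readers unfamiliar with the structure named in the merk entry above, here is a minimal Python sketch of computing a Merkle root. It is not the repository's code: it assumes SHA-256 as the hash and duplicates the last node on odd-sized levels, which is one common convention among several. The function name is hypothetical.

```python
import hashlib


def _sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_root(leaves):
    """Return the root hash of a Merkle tree built over a list of byte strings."""
    if not leaves:
        raise ValueError("at least one leaf is required")
    level = [_sha256(leaf) for leaf in leaves]  # hash every leaf first
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # assumed convention: duplicate the odd node
        # hash each adjacent pair to form the next level up
        level = [_sha256(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


if __name__ == "__main__":
    print(merkle_root([b"a", b"b", b"c"]).hex())
```

Changing any single leaf changes the root, which is what makes the root usable as a compact fingerprint of the whole data set.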

https://github.com/abdul-rafay19/youngdevinterns_machine-learning_tasks

This internship offers hands-on exposure to real-world Machine Learning applications — from data visualization and preprocessing to model development, evaluation, and deployment. It focuses on real ML workflows, problem-solving, neural networks, and hyperparameter tuning — all within a collaborative, remote, and growth-oriented environment.

ai artificial-intelligence artificial-intelligence-algorithms artificial-neural-networks data data-visualization internship machine-learning machine-learning-algorithms machinelearning ml model model-development neural-network preprocessing programming-language python task tasks youngdevintern

Last synced: 29 Apr 2026

https://github.com/benmaier/boarding_school_sir

Fit SIR dynamics to the prevalence curve of an H1N1 outbreak of a British boarding school in 1978.

boarding data disease epidemiology modeling school spreading

Last synced: 31 Mar 2025

https://github.com/trstringer/pywave2

:ocean: Get swell buoy data

data ocean python

Last synced: 31 Mar 2025

https://github.com/gbowne1/jsonhelix

This is an X11 GUI application for editing, debugging, and converting JSON, schemas, and API data.

api data gui gui-application json x11

Last synced: 10 Jun 2025

https://github.com/willdev12/rjson

Encryptable JSON file format for .NET projects!

csharp csharp-library data dotnet json json-data json-plugin variables vbdotnet vbnet

Last synced: 11 Apr 2026

https://github.com/tatey/list_of_countries

A list of countries, states, and cities in Ruby

cities countries data ruby states

Last synced: 11 Nov 2025

https://github.com/spectrochempy/spectrochempy_data

Test and example data repository for SpectroChemPy

data

Last synced: 04 Apr 2025

https://github.com/jorgeatgu/apaga-luz

💡 How much does electricity cost? 💶

data data-visualization flat-data

Last synced: 04 Feb 2026

https://github.com/SAP-archive/signavio-qualtrics-di

Setup an SAP Data Intelligence data pipeline to connect Qualtrics surveys data to SAP Signavio Process Intelligence via Ingestion API.

data intelligence process-intelligence qualtrics sample sap-data-intelligence sap-signavio-process-intelligence signavio

Last synced: 09 May 2025

https://github.com/danish-foundation-models/dfm-processing

Toolkit for processing data in the Danish Foundation Models project.

data text-processing

Last synced: 02 Jul 2025

https://github.com/insolite/react-data-frame

Table for huge data sets

data react table

Last synced: 14 May 2026

https://github.com/diegoperea20/own_dataset_segmentation_yolov8

Object segmentation and detection with YOLOv8 on a custom dataset of a 2023 Colombian 200-peso coin.

coins colombia data opencv own python segmentation tensorflow yolov8

Last synced: 12 Apr 2026

https://github.com/wooldoughnut310/xboxgamertag

Python module to get data from www.xboxgamertag.com

data gamertag html python3 requests xbox

Last synced: 24 Mar 2025

https://github.com/dev-owdenmag/dataflow-manager

A dynamic and versatile web application for managing, collecting, and presenting data with an integrated printing feature.

data data-management data-management-platform data-visualization python

Last synced: 30 Mar 2025

https://github.com/metriccoders/metriccoders_datasets

This is the Metric Coders repository containing all the datasets for machine learning.

data datasets machine-learning natural-language-processing scikit-learn

Last synced: 08 Apr 2025

https://github.com/gher-uliege/bluecloud-plankton

Spatial interpolation of plankton data using a neural network

data data-analysis data-visualization neural-network oceanography

Last synced: 30 Mar 2025

https://github.com/igorskyflyer/npm-adblock-header-extract

✂️ Parse and extract ad-block filter list headers with ease. Works on strings or files, trims whitespace, and returns clean metadata for tooling and automation. 📃

adblock back-end biome data filter header igorskyflyer javascript js metadata node nodejs npm string ts typescript utility

Last synced: 11 Mar 2026

https://github.com/stefanbohacek/exploring-the-mapping-police-violence-dataset

Using my Gutenberg Data Visualization plugin to explore police violence against civilians.

data dataviz police police-brutality police-misconduct

Last synced: 03 Dec 2025

https://github.com/zituocn/dean

Task flow framework for data processing

data golang task

Last synced: 18 Jan 2026

https://github.com/bijx/firestore-data-fetcher

A simple Python script to fetch documents from a Firebase Firestore collection and save them to a local `.json` file.

automation data database downloader exporter fetcher firebase firestore open-source script

Last synced: 12 Apr 2026

https://github.com/cqllum/schema2dwh

⚡ Automatically produce a data model for your database from its information schema, using GenAI.

ai data data-structures dataengineering datawarehousing dwh gemini gemini-api genai reporting reporting-tool schema-design

Last synced: 13 Mar 2025

https://github.com/castelao/bufr

BUFR binary data format from WMO

binary data format meteorology oceanography wmo

Last synced: 13 Jul 2025

https://github.com/agahkarakuzu/datavis_edu

Presented in BrainHack School 2019-2020, QBIN SciComm 2021

binder dashboard data notebooks repo2docker visualization

Last synced: 01 Apr 2025

https://github.com/datenoio/internacia-db

Public registry of intergovernmental organizations, country groups, and countries. Available as JSONL, Parquet, YAML, and DuckDB database datasets.

countries data datasets international international-trade reference

Last synced: 29 May 2026

https://github.com/lmuffato/project-job-insights-trybe

Job Insights project: a graded Trybe project for Block 32, Introduction to Python

data data-science data-transformation filter python

Last synced: 12 Jun 2025

https://github.com/gbv/cocoda-mappings

Concordances, mappings, and conversion scripts to create JSKOS mappings

coli-conc data jskos

Last synced: 28 Oct 2025

https://github.com/lane-romuald/iot-irrigation-data-collection-system

An IoT-based data collection system using the ESP32 microcontroller programmed with Arduino to monitor environmental conditions for smart irrigation. The system measures soil moisture, temperature, air temperature, humidity, and rain probability. Data is stored locally on an SD card and uploaded to the ThingSpeak platform.

arduino cloud data data-collection esp32 openweather openweathermap thingspeak wi-fi

Last synced: 12 Apr 2026

https://github.com/vapourismo/binary-io

Read and write values of types that implement Binary from and to Handles

data haskell haskell-library io parsing

Last synced: 28 Mar 2025

https://github.com/cleanzr/restaurant

Restaurant data set for entity resolution

data linkage

Last synced: 11 Mar 2026

https://github.com/fiskeben/meetjescraper

HTTP proxy for Meet je stad project

api data go iot meetjestad proxy scraper weather

Last synced: 29 May 2026

https://github.com/quasilyte/phpcorpus

A collection of various PHP code; useful for PHP tool writers who want some insight into what "real-world" PHP code looks like

analysis corpus data php php-corpus

Last synced: 04 Jul 2025

https://github.com/melinteflxrin/softserve-bigdata-project

End-to-end data warehousing project integrating APIs, ETL workflows, and PostgreSQL for analytics and reporting.

analytics api bigdata data datawarehousing externalapi pipeline postgres postgresql python warehouse

Last synced: 26 Jan 2026

https://github.com/thiagopanini/datadelivery

An open source Terraform module that provides a complete infrastructure toolkit so that users can start exploring Analytics services on AWS.

analytics athena aws catalog crawler data datamesh glue s3 terraform

Last synced: 29 Nov 2025

https://github.com/whitehathackerpr/data-visualization-tool

This is a Python-based web application that allows users to upload datasets, analyze data, and create visualizations interactively. The tool is designed for ease of use and provides a simple interface to perform basic data analysis and generate visualizations.

data data-analysis data-visualization python python3

Last synced: 05 Sep 2025

https://github.com/xpotify/scraper

Scraper designed for Xpotify's client to gather information from websites🌟

axios cheerio data javascript scraper webscraper

Last synced: 07 Jul 2025

https://github.com/desininja/data-engineer-interview-questions

This repository contains all the Data Engineer Interview Questions asked by interviewers.

data data-engineer-interview-questions

Last synced: 31 Mar 2025

https://github.com/stdlib-js/ndarray-base-to-reversed

Return a new ndarray where the order of elements of an input ndarray is reversed along each dimension.

base data flip javascript matrix ndarray node node-js nodejs reverse slice stdlib structure to-reversed types vector view

Last synced: 12 Apr 2026

https://github.com/agavitalis/sample-c-codes

A collection of small projects I carried out on Arduino as an electronic engineering student, despite falling in love with website development.

ageteller atm binary data gpcalculator logging

Last synced: 09 Apr 2025

https://github.com/shawnduong/pacman-digest

Generate a digest of package space usage for Linux systems using pacman.

arch data pacman

Last synced: 13 May 2026

https://github.com/himel-sarder/web-scraping-it-jobs-dataset

This project is a Python-based web scraping tool that collects job listings from TimesJobs for IT-related positions. It extracts job titles, company names, locations, and experience requirements, and saves the data into a CSV file. The tool uses BeautifulSoup and Pandas for web scraping and data manipulation.

data datascience dataset kaggle-dataset machine-learning machinelearning ml web-scraping

Last synced: 22 Feb 2026

https://github.com/stdlib-js/ndarray-slice-dimension-from

Return a read-only shifted view of an input ndarray along a specific dimension.

copy data javascript matrix ndarray node node-js nodejs shift slice stdlib structure truncate types vector view

Last synced: 24 Apr 2025

https://github.com/yasenstar/powerbi_tutorial

Based on the "PowerBI Tutorial" book, this provides step-by-step video demos for learning and mastering the Power BI tool

analytics data microsoft powerbi tutorial visualization

Last synced: 07 Jan 2026

https://github.com/sefakcmn00/tensorflow_car_price_analysis

In this project, after extracting the data sets as CSV, we tried to represent the car prices graphically and schematically by using data analysis and data visualization methods. We checked the connection of the car prices we analyzed with other data, then we created a 4-layer and 12-neuron system.

data datatrain keras machine-learning matplotlib-pyplot pandas seaborn sklearn tensorflow

Last synced: 14 Apr 2026

https://github.com/alexscigalszky/palabras-aleatorias-data

This package has a set of datasets of random words, animals, colors, jokes, onomatopoeias, and types

aleatorias data palabras random words

Last synced: 04 Oct 2025

https://github.com/san089/black-friday-sales-analysis

This project gives insight into a few statistics related to Black Friday sales.

custom data dataanalysis insights sales statistics

Last synced: 13 Jul 2025

https://github.com/camara94/introduction-to-data-engineering

Describe the different entities that form a modern data ecosystem. Describe and differentiate between the role and responsibilities of Data Engineers, Data Scientists, Data Analysts, Business Analysts, and Business Intelligence Analysts. Explain what Data Engineering is. List the tasks that need to be performed in a typical data engineering lifecycle. Describe what a day in the life of a Data Engineer looks like.

business-analytics business-intelligence data dataingestion dataintegration datascience machinelearning python statistical-analysis

Last synced: 09 Apr 2025

https://github.com/tarantinoarchive/dec

Developer-Easy CMS

cms data easy ejs js json simple

Last synced: 11 Mar 2026

https://github.com/goncaloperes/datavisualization

Here I will share some of my data visualizations using a variety of datasets, technologies and tools.

d3js data dataset datavisualization dataviz ggplot matplotlib rawgraphs seaborn tableau visualization yellowbrick

Last synced: 04 Feb 2026

https://github.com/tylerben/data-spring

Easily generate a dummy dataset based on a provided config

data data-spring datagenerator fake-data generator javascript typescript

Last synced: 27 May 2026

https://github.com/stdlib-js/array-zero-to

Generate a linearly spaced numeric array whose elements increment by 1 starting from zero.

array data float32array float64array int16array int32array javascript matrix ndarray node node-js nodejs stdlib structure typed typed-array types uint32array vector

Last synced: 08 Jan 2026

https://github.com/luminati-io/Twitter-X-dataset-samples

A sample dataset of over 1000 Twitter (X) posts, extracted using the Bright Data API, ideal for trend discovery, brand monitoring, and competitive insights.

api data dataset twitter twitter-api twitter-scraper web-scraping x

Last synced: 09 Apr 2025

https://github.com/dbriane208/omdena-apprenticeship-project

This is part of my contribution to the Omdena apprenticeship program.

data data-science feature-engineering machine-learning

Last synced: 14 Mar 2026

https://github.com/stdlib-js/ndarray-slice

Return a read-only view of an input ndarray.

copy data javascript matrix ndarray node node-js nodejs slice stdlib structure types vector view

Last synced: 10 Mar 2026

https://github.com/stdlib-js/array-base-to-deduped

Copy elements to a new generic array after removing consecutive duplicated values.

array compress copy data dedupe deduplicate deduplication duplicate generic javascript node node-js nodejs stdlib structure types uniq unique

Last synced: 14 Jun 2025
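
The operation described in the array-base-to-deduped entry above, copying elements while dropping consecutive duplicates, can be sketched in Python as below. This is an illustration of the idea only, not the stdlib-js API; the function name is hypothetical and plain `!=` comparison is an assumption.

```python
def dedupe_consecutive(values):
    """Return a new list with runs of equal adjacent values collapsed to one."""
    out = []
    for v in values:
        # keep the value only if it differs from the last one kept
        if not out or out[-1] != v:
            out.append(v)
    return out


if __name__ == "__main__":
    print(dedupe_consecutive([1, 1, 2, 2, 1, 3, 3]))
```

Only adjacent repeats are removed, so a value that reappears later in the input is kept; the input list itself is left unmodified.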

https://github.com/neelravi/fairtool

A CLI tool for FAIR processing of computational materials science data.

computational data data-analytics fair management materials physics python science

Last synced: 14 Jan 2026