An open API service indexing awesome lists of open source software.

data

Individual facts, statistics, or items of information, often numeric. In a technical sense, data are a set of values of qualitative or quantitative variables about one or more persons or objects. (https://en.wikipedia.org/w/index.php?title=Data&oldid=1093674723, released under CC BY-SA 3.0)

https://github.com/bohnacker/data-manipulation

Some JavaScript and Python scripts to manipulate (large) CSV files and JSON data.

data data-mining data-structures javascript python

Last synced: 18 May 2026

https://github.com/strmprivacy/docs

With STRM Privacy you can easily build privacy-by-design data pipelines and define data contracts to encode privacy inside your data. Data streams are pseudonymised or anonymised in real-time or batch. These are our docs.

data documentation docusaurus privacy privacy-enhancing-technologies

Last synced: 12 Jul 2025

https://github.com/olajideolagunju/gcp_mage_data_pipeline

An end-to-end data engineering pipeline that processes and analyzes Maintenance Work Orders using Mage, Docker, Google BigQuery, MariaDB, and Looker Studio. It features a seamless integration of cloud and open-source tools for scalable data storage, transformation, and visualization.

automation bigquery cloud compute-engine data data-engineering database database-schema docker-compose excel gcp mage-ai maintenance mariadb orchestration python sql virtual-machine visualization-dashboard work-orders

Last synced: 07 Mar 2025

https://github.com/jujuadams/ini-to-json

JSON+buffer replacement for native GameMaker INI functions.

data gamemaker gamemaker-studio-2 gms2 ini json save

Last synced: 21 Jul 2025

https://github.com/njraladdin/newspapers-com-scraper

A Node.js scraper for extracting article data from Newspapers.com based on keywords, dates, and locations.

archive data newspapers scraper scraper-api scraping

Last synced: 06 Apr 2025

https://github.com/csengupta1101/dig-student-files

This repository will contain all student submissions in one place.

data datascience education machine-learning python students visualization

Last synced: 17 Jul 2025

https://github.com/weecology/ratdat

R package version of Portal Project Teaching Database

data database ecology teaching teaching-data

Last synced: 17 Feb 2026

https://github.com/priyanka7411/dataspark-electronics-retail-analytics

DataSpark is a data analysis project using Python, SQL, and Power BI to analyze global electronics retail sales, focusing on customer behavior, sales performance, product profitability, and store performance to optimize sales strategies.

analytics-providers business-intelligence customer-segmentation data data-analysis electronics-industry global-sales pandas powerbi powerbi-visuals product-profitability python retail-analytics sales-performance sql store-analysis visualization

Last synced: 10 Jul 2025

https://github.com/DataHerb/dataherb-python

Python Package for DataHerb: create, search, and load datasets.

data data-analysis data-mining database dataset python

Last synced: 08 May 2025

https://github.com/kawai-senpai/potatodb

PotatoDB is a lightweight, file-based NoSQL database for Python projects, designed for easy setup and use in small-scale applications. Ideal for developers seeking simple data persistence without the complexity of traditional databases.

data database easy-to-use file-based json key-value lightweight nosql nosql-database persistence python simple

Last synced: 23 Oct 2025

https://github.com/muneeb1030/finetune-tiny-llama

Fine-tuning the Tiny Llama model to mimic my professor's writing style using the Llama Factory. The project involves data collection, preprocessing, preparation, fine-tuning, and evaluation.

data data-preparation data-preprocessing finetuning llama-factory llm pymupdf selenium-python spacy tinyllama webscraping

Last synced: 08 Apr 2026

https://github.com/techiaith/brawddegau-tagiedig

A corpus of CC0 sentences in JSONL format, with the tokens (words etc.) tagged with Universal Dependencies part-of-speech tags.

annotated cc0 commonvoice data nlp welsh

Last synced: 17 Jan 2026

https://github.com/vasturiano/data-bind-mapper

Bind data arrays with any type of JS objects

bind data digest joins mapper performance

Last synced: 26 Jul 2025

https://github.com/xtlsoft/xdo

[DEPRECATED] XDO is a fast, light PHP Data Object. Includes DB, Cache, Upload.

cache data database php upload web

Last synced: 05 Apr 2025

https://github.com/faster-games/whiskey

Data and Events framework for Unity. 🥃⚡

data events framework unity3d

Last synced: 19 May 2026

https://github.com/farhadrezvani/warframe-drops-pwa

A Warframe app that finds the best place to farm any in-game item by looking through the official drop tables published by Digital Extremes.

data drop-data game preact pwa vite warframe

Last synced: 11 Oct 2025

https://github.com/FCC/contours-api-node

Enterprise Contours Node API

api contours data data-visualization geospatial gis map

Last synced: 27 Jul 2025

https://github.com/codiepp/elykseer-base

cryptographic data archive; written in F#; envisaged to stay another 10 years

archive cli cryptography data distributed-storage dotnet fsharp longterm-storage

Last synced: 19 May 2026

https://github.com/justunsix/debezium-tests

Testing different Debezium development environment setups

azure capture cdc change data debezium kafka mssql openshift sql streaming

Last synced: 19 May 2026

https://github.com/pytroll/pygac-fdr

Python package for creating a Fundamental Data Record (FDR) of AVHRR GAC data using pygac

avhrr climate closember data gac hacktoberfest metop noaa pygac record satellite tiros

Last synced: 12 Apr 2025

https://github.com/yashika-malhotra/data-exploration-and-visualization-for-streaming-platform

Data analysis and visualization for a streaming platform, providing insights and recommendations to grow its user base.

colab-notebook data data-visualization jupyter-notebook matplotlib numpy pandas python seaborn

Last synced: 18 Apr 2026

https://github.com/paladique/azuresample-guestbook

Guestbook using MySQL and Cosmos DB on Azure

cosmosdb data mysql spa websockets

Last synced: 30 Apr 2026

https://github.com/desultory/pycpio

Python library for CPIO manipulation

cpio cpio-archives data initramfs pypi-package python python-3 python3

Last synced: 04 Feb 2026

https://github.com/legopitstop/addons

All legopitstop's Bedrock add-ons in one place.

add-on assets behaviorpack data hacktoberfest minecraft mods modtoberfest resroucepack vanilla

Last synced: 06 Feb 2026

https://github.com/datalayer/desktop

Ξ 🖥️ Datalayer Desktop.

ai data data-analysis data-science datalayer desktop electron

Last synced: 25 Oct 2025

https://github.com/cnayan/q-server

Gives API for back-end server connectivity; MS SQL Server connector provided.

data database provider q-server query query-engine

Last synced: 09 Oct 2025

https://github.com/suh1z/rakkauttify_fullstack

CS2 Data and Statistics Dashboard - full-stack project

analytics data expressjs gaming mongo nodejs react redux

Last synced: 24 Oct 2025

https://github.com/lilingxi01/bloark

Blocks Architecture (BloArk) project package for building Blocks-0 dataset and way beyond.

architecture bloark data revision-based

Last synced: 05 Apr 2026

https://github.com/udityamerit/python-librearies-for-data-science

Python libraries for data science enable efficient data manipulation, analysis, and modeling. Key libraries include NumPy for numerical computing, pandas for data handling, Matplotlib for visualization, Scikit-learn for machine learning, TensorFlow for deep learning, and BeautifulSoup/requests for web scraping. These libraries simplify complex data

beautifulsoup data data-science data-science-libraries machine-learning matplotlib numpy pandas requests scikit-learn scikitlearn-machine-learning tensorflow

Last synced: 06 Feb 2026

https://github.com/louisbrulenaudet/legalkit-pipeline

Publication pipeline for French legal codes on 🤗 Datasets from LegiFrance with concurrent upload and dynamic README.md.

data datasets huggingface huggingface-datasets legal legaltech legifrance open-source parquet piste-api python

Last synced: 17 Mar 2025

https://github.com/justfairdev/Web-Stack-Query

🤖 Powerful asynchronous state management, server-state utilities and data fetching for the web. TS/JS, React Query, Solid Query, Svelte Query and Vue Query.

async cache data fetch graphql hooks query react resources rest

Last synced: 14 Oct 2025

https://github.com/0xdir/htcds_dart

Human Trafficking Case Data Standard (HTCDS v0.2) objects, for easy creation, storage and transmission of case data related to human trafficking.

data humanitarian schema standards

Last synced: 24 Oct 2025

https://github.com/mutasim77/dbt-analytics

🍉 Repo for analytics engineering with dbt, transforming raw data into actionable insights.

big-query data data-analysis dbt warehouse

Last synced: 25 Feb 2026

https://github.com/sneels/parkds

Connect all your Data Sources via 1 process (Cross-Domain + Single-Domain)

cross-domain data database datasource datasources javascript source

Last synced: 24 Feb 2026

https://github.com/tosun-si/world-cup-qatar-team-stats-kotlin-midgard

This application shows a full Apache Beam pipeline with Kotlin and the Midgard library. The use case works on data from the last FIFA World Cup in Qatar and calculates player statistics per team. This application will be presented at Beam Summit 2023 in New York.

apache-beam beam-summit data kotlin midgard world-cup-2022

Last synced: 01 Feb 2026

https://github.com/vrm-piyush/python-projects

Open source Python projects. Feel free to contribute!

data dataanalysis games open-source pygame-games python python-app

Last synced: 26 Feb 2026

https://github.com/qeeqbox/data-compliance

Data compliance is the process of following various regulations and standards to ensure that sensitive digital assets (data) are guarded against loss, theft, and misuse.

compliance data data-compliance infosecsimplified qeeqbox

Last synced: 19 Mar 2026

https://github.com/binarybardakshat/suryanayan

Suryanayan AI is a project aimed at using drone technology and artificial intelligence for monitoring and detecting issues in solar panels. This project is inspired by the Indian government's initiative to promote solar energy by providing subsidies on solar panels.

data drone nlp python solar

Last synced: 10 Oct 2025

https://github.com/baaziznasser/qurani

A Quran program with a simple interface and remarkable features, with large databases for the Holy Quran and its interpretation (tafsir).

base data i3rab json quran qurani sql tafsir

Last synced: 12 Feb 2026

https://github.com/jderstd/spec

A standard for JSON responses

data error jder json response specification structure

Last synced: 13 May 2026

https://github.com/cmudig/mosaic-profiler

A data profiler built with Mosaic

data jupyter visualization

Last synced: 25 Oct 2025

https://github.com/justfairdev/web-stack-query

🤖 Powerful asynchronous state management, server-state utilities and data fetching for the web. TS/JS, React Query, Solid Query, Svelte Query and Vue Query.

async cache data fetch graphql hooks query react resources rest

Last synced: 09 May 2026

https://github.com/joamag/pandas

Loads of pandas data from China with awesome data

data data-analysis jupyter notebook pandas

Last synced: 25 Apr 2026

https://github.com/robertmyles/riscobrasil

An R package to download 'Brazil Risk' data :chart_with_upwards_trend:

brazil data finance r

Last synced: 08 Apr 2025

https://github.com/imagodata/filter_mate

FilterMate is a QGIS plugin, an everyday companion that lets you easily filter your vector layers.

data exploratory-data-analysis filter geospatial ogr postgis qgis qgis-plugin qgis3 qgis3-plugin spatialite sql vector-database

Last synced: 29 Apr 2026

https://github.com/fforres/webpack-plugin-dx-metrics

Webpack plugin to track webpack behaviour in Datadog

data datadog developer-experience typescript visualization webpack

Last synced: 13 Feb 2026

https://github.com/mabel-dev/opteryx-catalog

📚 Opteryx Cloud Catalog

catalog data python sql

Last synced: 27 Feb 2026

https://github.com/akuzko/use-stash

React hooks for app-wide data access and manipulation

action actions data hook hooks react store

Last synced: 09 May 2026

https://github.com/tommasoazz/collaborative-location-activity-recommendations

Project for the course Scalable and Cloud Programming

data map mapreduce scala spark

Last synced: 16 Apr 2026

https://github.com/anthonykrivonos/ts-algo-masterclass

👾 Giant TypeScript algorithm and data structure masterclass to be constantly updated with important CS concepts.

algorithm class-project computer concepts data data-structures fundamentals giant library masterclass science structures typescript

Last synced: 11 May 2026

https://github.com/definetlynotai/llm_data

Source code from a bunch of very famous repos, in Python, as pure localdocs, all in this repo to train code AI.

c code-examples cpp cuda data data-dum jupyter-notebook llm llm-code llm-datasets programming-data programming-data-sets python3

Last synced: 08 Oct 2025

https://github.com/onaio/gisida-react

React Dashboard library for Gisida.

dashboard data gisida map react visualization

Last synced: 28 Apr 2025

https://github.com/reala10n/simplejsondb

Create a simple JSON database with just one line of code!

data database db easy json python simple

Last synced: 27 Oct 2025

https://github.com/huangcongqing/ranking-list

Data !important | A collection of all kinds of rankings and leaderboard data, for the era in which data is king.

data rank ranking

Last synced: 15 Feb 2026

https://github.com/cicerops/monitoring-check-grafana

Monitor a Grafana datasource against data becoming stale to detect data loss or other dropout conditions.

data database freshness grafana grafana-datasource icinga2 icinga2-plugin influxdb monitoring stale

Last synced: 08 May 2026

https://github.com/abrudz/parsing

Dyalog APL expressions to parse common and unusual data formats from text files

apl csv data data-format dyalog-apl dyalogapl parsing

Last synced: 20 Mar 2026

https://github.com/as/worm

Worm provides write-once read-many log-structured storage semantics

data log record storage worm

Last synced: 31 Jan 2026

https://github.com/woctezuma/steam-reviews-data

Data available to compute statistics of Steam reviews.

data steam steam-reviews

Last synced: 19 Mar 2026

https://github.com/floriancassayre/nicknames-datasets

Open source nickname sets with information about the origin(s) of the data.

data data-mining dataset

Last synced: 08 Feb 2026

https://github.com/jsdhami/python-for-research

"Python-For-Research" Event Organized By Tri-Chandra Research Group, Ghantaghar, Kathmandu

analysis colab data jupyter matplotlib numpy panda physics python research visualization

Last synced: 27 Oct 2025

https://github.com/yashika-malhotra/exploratory-data-analysis-for-multinational-retail-corporation

Analysis via CLT and visualization of a multinational retail corporation's data, providing insights and recommendations to grow its user base.

colab-notebook data jupyter-notebook matplotlib numpy pandas python seaborn stats

Last synced: 11 Feb 2026

https://github.com/dkxce/osm2shp

Flexible OSM to SHP Converter (convert .osm & .pbf files to ESRI Shape .shp files). OSM to Shape.

converter data dbf dkxce earth esri map maps openseamap openstreetmap osm pbf routes shape shapes shp

Last synced: 26 Apr 2026

https://github.com/healthyregions/oeps

Opioid Environment Policy Scan - data explorer and backend management

data data-visualization public-health

Last synced: 21 Apr 2026

https://github.com/luminovrym/pbo-biodata

A simulation of how to input data with OOP

data oop-in-php php-native

Last synced: 18 Jun 2026

https://github.com/potch/whizzy

A prototype rich data editor for GitHub

csv csvconf data github

Last synced: 01 May 2026

https://github.com/gadenbuie/crantrack

Hourly snapshots of CRAN's incoming packages folder

cran data r-packages

Last synced: 12 Mar 2026

https://github.com/pranavpandey/dynamic-backup

Backup and restore app data on Android.

android app backup data library restore storage

Last synced: 07 Sep 2025

https://github.com/d2hydro/fewspy

A Python API for the Deltares FEWS PI REST Web Service

data geopandas hydrology hydrometrics pandas python

Last synced: 23 Apr 2026

https://github.com/anicolaspp/mapr-data-gen

Data generator for MapR Data Platform

data mapr mapr-db mapr-es mapr-streams maprdb parquet scala spark

Last synced: 29 Apr 2026

https://github.com/geocollections/emaapou

eMaapõu: database of Estonia's subsurface geology

data database estonia geology portal

Last synced: 05 Feb 2026

https://github.com/hadro/brewery-guides

The data for guides to breweries across the United States from 1896 to 1918

brewers brewery-guides brewing brewing-history data dataset digital-collections digital-humanities hocr nypl open-data

Last synced: 16 Mar 2026

https://github.com/geopython/pygeoapi-examples

Example pygeoapi deployment patterns and configurations

api data geospatial ogc ogc-api osgeo pygeoapi

Last synced: 11 Oct 2025

https://github.com/sparkpost/event-data

self-hosted message events

api aws data email webhooks

Last synced: 29 Apr 2026

https://github.com/jmsallan/esdata

An R package to bring Spanish economic databases into the R environment

data datasets ine inflation spain unemployment-data

Last synced: 18 Jan 2026

https://github.com/carpentries-incubator/indigenous-data-sovereignty

Introduces the concepts and framework of Indigenous Data Sovereignty and Governance.

data english lesson pre-alpha

Last synced: 24 Jan 2026

https://github.com/chompfoods/sdk-csharp

C# SDK for the Chomp Food & Recipe Database API. Use our API to get high-quality data on recipes and 875,000+ branded/grocery foods plus raw ingredients.

api branded chomp csharp csharp-sdk data database dll food grocery ingredients nuget nutrition raw recipes recipes-api restsharp sdk swagger

Last synced: 06 May 2026