An open API service indexing awesome lists of open source software.

data

Individual facts, statistics, or items of information, often numeric. In a technical sense, data are a set of values of qualitative or quantitative variables about one or more persons or objects. (https://en.wikipedia.org/w/index.php?title=Data&oldid=1093674723, released under CC BY-SA 3.0)

https://github.com/timpulver/netlabel-list

A list of active and inactive netlabels in JSON format

data data-set json label list music netaudio netlabel

Last synced: 12 May 2025

https://github.com/yisaienkov/tinysets

The project aims to collect various datasets for tasks such as classification, clustering, object detection... The purpose of these datasets is to quickly check the performance of models and algorithms.

algorithms classification data data-science dataset datasets kaggle kaggle-dataset lego lego-minifigures lego-sets object-detection pypi python regression text-classification tinysets

Last synced: 14 Apr 2025

https://github.com/strmprivacy/cli

This is the STRM Privacy Command Line Interface, to define and manage your privacy streams, data schemas, event contracts and much more.

cli data data-pipeline data-privacy data-privacy-compliance data-processing privacy

Last synced: 23 Jun 2025

https://github.com/prakaa/mms-monthly-cli

Source code and CLI tool to query and download data from the Australian Energy Market Operator's Monthly Data Archive

aemo australia data energy national-electricity-market nem nemweb python

Last synced: 30 Oct 2025

https://github.com/khaouitiabdelhakim/etl-real-example

This repository contains a real example of an Extract, Transform, Load (ETL) process using SQL Server Management Studio (SSMS), SQL Server Integration Services (SSIS), and AdventureWorks2012 data. The objective is to load data into our LightAdventureDW data warehouse.

data database database-management sql sql-server ssis ssms warehouse

Last synced: 18 Mar 2025

https://github.com/bluegrams/periodic-table-data

Data of all chemical elements in the periodic table

chemistry csharp data dotnet elements periodic-table

Last synced: 18 Mar 2025

https://github.com/wmarquardt/cassandra-csv

A simple way to export Cassandra query results to CSV format

cassandra csv data python

Last synced: 18 Mar 2025

https://github.com/onmyway133/computerscienceswift

👨‍💻 Practice computer science in Swift

algorithm data design pattern puzzle structure

Last synced: 17 Oct 2025

https://github.com/plateformeio/plateforme

The Python framework for Data Applications

api app asgi async data db fastapi plateforme pydantic python restx services sqlalchemy

Last synced: 01 May 2025

https://github.com/heysupratim/android-app-categories

A JSON file containing 19K Android package name entries with their Play Store categories. Useful for people looking to build things based on app categories, e.g. a smart launcher.

android crawled-data data json

Last synced: 26 Mar 2025

https://github.com/mrchypark/gomsubtitledata

Code for collecting GOM TV (곰tv) subtitle data

data drama korean movies r subtitles text text-data

Last synced: 17 Oct 2025

https://github.com/cmpadden/dagster-pipes-rust

Dagster pipes implementation in Rust

dagster data integrations orchestration rust

Last synced: 11 Oct 2025

https://github.com/abcnews/data-australian-political-donations

A data package about political donations in Australia.

data data-package

Last synced: 27 Jan 2026

https://github.com/fdmorison/tiozin

Tiozin, your friendly ETL framework

data declarative etl framework pipeline

Last synced: 26 Apr 2026

https://github.com/bastianolea/temperaturas_chile

Minimum and maximum temperature data for Chile by weather station, from 1950 to 2024

ambiental chile data meses social tiempo

Last synced: 08 Apr 2025

https://github.com/buabaj/byte

A modulator-demodulator model for data transmission over sound

data python3 sound

Last synced: 08 Oct 2025

https://github.com/datasets/genome-sequencing-costs

Costs associated with DNA sequencing since 2001

data data-science genome

Last synced: 19 Oct 2025

https://github.com/synzen/Discord.Stats

Data visualization for Discord server activities

charts data discord statistics tracking visualization

Last synced: 12 Oct 2025

https://github.com/earthinversion/geospatial-data-visualization-using-pygmt

Example script to visualize topographic data, earthquake data, and tomographic data on a map

data geophysics pygmt python3 seismology visualization

Last synced: 10 Apr 2025

https://github.com/simonsfoundation/spectrum-drug-tracker

Python files and datasets underlying the Spectrum Drug Tracker.

autism data data-visualization python python3

Last synced: 27 Feb 2026

https://github.com/koddachad/dq_tester

A lightweight, simple data quality testing tool.

data database dataengineering dataquality dataqualitycheck

Last synced: 08 Oct 2025

https://github.com/szczyglis-dev/ultimate-chain-parser

[PHP] Advanced, extendable, and configurable text data parsing and processing toolkit working in a chain-based flow. The concept of the application is based on processing in subsequent iterations using configurable data processing modules in a configured manner. Each element in the execution chain accesses the output of the previous element.

composer-library csv csv-parser data json-parser parsing plugin-architecture processing rearrange-array recordset regex regex-match regex-pattern repack repair-processes reparse text text-generation text-processing yaml-parser

Last synced: 08 Oct 2025

https://github.com/m-dadej/downloading-and-aggregating-stocks

Scripts for downloading WSE/GPW stock prices. Allows downloading historical prices for every stock into a single dataset.

data finance gpw historical-data stock

Last synced: 29 Apr 2025

https://github.com/mckraqs/dataride

Lightning-fast data platform setup toolkit for small projects and PoCs

data data-engineering python terraform

Last synced: 24 Oct 2025

https://github.com/paulgrammer/ug-locale

Uganda districts, sub-counties, counties, parishes and villages

data districts nodejs npm-package uganda

Last synced: 02 Mar 2026

https://github.com/filippobovo/betfair_data

Simple script to collect market data from Betfair.

betfair betfair-api collection data python

Last synced: 27 Feb 2026

https://github.com/dbt-labs/jaffle-shop-mesh-finance

A ✨ meshified ✨ open source sandbox project exploring dbt workflows via a fictional sandwich shop's data. This is a domain-focused node in the mesh focused on finance models, built on the jaffle-shop-mesh-platform project.

analytics analytics-engineering data data-engineering dbt dbt-cloud

Last synced: 05 Mar 2026

https://github.com/anthonykrivonos/nba-ml

🏀 Hardcoded ML classifiers from scratch to create predictive models on the outcomes of NBA games!

basketball classifiers data fromscratch hardcoded machine-learning ml nba python science sports

Last synced: 08 Oct 2025

https://github.com/andrewrporter/goiex

A Go interface for accessing IEX financial information

data fetch finance golang iex iex-api iextrading

Last synced: 28 Apr 2025

https://github.com/frederickgeek8/lyql

📈 Free realtime stock data. Streamed straight from Yahoo.

data data-mining finance realtime stocks stream stream-api yahoo

Last synced: 05 Mar 2026

https://github.com/insightsoftwareconsortium/rirewebsite

Website sources for The Retrospective Image Registration Evaluation Project (RIRE)

data grand-challenge imaging open-access open-science registeration

Last synced: 12 Oct 2025

https://github.com/artemvlas/veretino

Folder integrity checker

checksums cpp crypto data digest folder hash integrity qt

Last synced: 05 Feb 2026

https://github.com/matheusfelipeog/filometro

Get data on COVID-19 vaccination sites in São Paulo

coronavirus covid-19 data de-olho-na-fila filometro python sao-paulo vacina vacinasampa wrapper

Last synced: 07 Oct 2025

https://github.com/route1io/route1io-python-connectors

Connectors for interacting with popular APIs and services used in marketing analytics via clean and concise Python code.

analytics api api-connector data data-engineering marketing marketing-analytics python python3

Last synced: 13 Apr 2026

https://github.com/zq99/optionsview

This library downloads option chain data for a given symbol from Yahoo Finance in a trader-friendly format.

data options options-trading trading yahoo-finance

Last synced: 14 Jan 2026

https://github.com/giscience/osm-transform

Filter, enrich and prepare your OSM data for openrouteservice 🚙

cleanup data elevation enrichment filter graphs openrouteservice openstreetmap pbf routing

Last synced: 01 Apr 2026

https://github.com/gematik/spec-isip

FHIR resources for information technology systems in nursing care (ISiP – Informationstechnische Systeme in der pflegerischen Versorgung) are determined through the affirmative action process of the same name. Through ISiP, open and standardized interfaces are defined for the interoperable exchange of health data in care.

data isik specification

Last synced: 03 Mar 2026

https://github.com/iondv/report

IONDV. Framework: the Report module generates analytical reports.

analytics businessintelligence css data data-analysis data-visualization iondv iondv-module reporting

Last synced: 12 Mar 2026

https://github.com/siongui/7rsk9vjkm4p8z5xrdtqc

Pāli chanting resources and dhammatalk books

data pali

Last synced: 19 Jan 2026

https://github.com/cafali/pathscan

PathScan exports information about the contents of directories and hard drives. With a single click, you can create a complete list of all files and paths within a specific folder or across an entire hard drive.

backup command-line data data-analysis data-migration data-mining data-recovery directory folder-management folders forensics hard-drive keyword-extraction logging pathfinding recovery string-search tools utility windows

Last synced: 10 Oct 2025

https://github.com/caerbannogwhite/aargh

A library that helps you out of data nightmares in Go. 🧙‍♂️

csv data data-science data-wrangling dataframe go golang html json linq statistics stats xlsx xpt

Last synced: 14 Jan 2026

https://github.com/dathere/qsvpro.dathere.com

🌐 Promo website for qsv pro, a spreadsheet data wrangling desktop app. Includes download links for Windows, macOS, & Linux. Website built with Astro as a static site.

astro ckan csv data data-wrangling framer-motion javascript product qsv react saas tailwindcss website

Last synced: 28 Feb 2026

https://github.com/koffisani/coding-data-togo

Data on the programming languages and development tools used or in demand in Togo

data python python3 scrapy scrapy-crawler

Last synced: 26 Mar 2025

https://github.com/freeipcc/freedatascrm

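Business registration data, phone-based customer acquisition, intelligent customer relationship management, data-driven marketing, automated sales leads, B2B marketing, customer insight analysis, precision marketing!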

ai bigdata bigdataanalytics data scrm

Last synced: 08 Feb 2026

https://github.com/jetsly/ddrx

A lightweight front-end framework based on rxjs. (Inspired by camel)

data rx rxjs store

Last synced: 13 Oct 2025

https://github.com/gematik/spec-templateforsimplifierprojects

Template for creating gematik FHIR profiles

data fhir fsh miscellaneous template

Last synced: 25 Feb 2026

https://github.com/cmstatr/cmstatr

An R Package for Statistical Analysis of Composite Material Data

composite-material-data cran data materials-science r statistical-analysis statistics

Last synced: 22 Oct 2025

https://github.com/claudiucreanga/data-science

Data Science notebooks

competitions data kaggle science

Last synced: 14 Oct 2025

https://github.com/spatialcurrent/go-math

Math functions that support varied types

big-data bigdata data

Last synced: 29 Jan 2026

https://github.com/zeybek/node-matlab

NodeJS Package for MATLAB

algebra analytics data matlab matrix signal-processing

Last synced: 13 Mar 2026

https://github.com/franloza/running-races-insights

Web application created with Evidence and DuckDB to share stats about the running races in Cuenca.

data dataengineering duckdb elt evidence markdown netlify running sql visualization

Last synced: 23 Jun 2026

https://github.com/apache/incubator-devlake-playground

Apache DevLake is an open-source dev data platform to ingest, analyze, and visualize the fragmented data from DevOps tools, extracting insights for engineering excellence, developer experience, and community growth.

dashboard-friendly data data-analysis data-engineering data-integration data-transfers devops domain-layer dora etl hacktoberfest integration jira open-source python user-friendly

Last synced: 19 Oct 2025

https://github.com/orfium/s3-parquetifier

This is a tool that takes a file from an S3 bucket and transforms it to Parquet format

data missing-codeowners

Last synced: 12 Apr 2025

https://github.com/navchandar/file-convertor-utils

Set of custom Python Utilities to convert one file format into another. Filetypes supported: Excel, Images, PDF, GIF, MP4, XML, etc.

conversions convertor-utils data dataconversion excel file-conversion fileconversion fileformats image pdf python video xml

Last synced: 21 Sep 2025

https://github.com/mongodb-developer/mongo-resilient-evolvability-demo

Demonstrates MongoDB best practices for building resilient yet evolvable shared data applications, using Rust as an example

agile data database flexible fluid mongodb resilient robust rust

Last synced: 07 Apr 2025

https://github.com/unicef/magasin

Cloud native open-source end-to-end data / AI / ML platform

cloud dagster data data-pipelines data-science data-visualization helm-charts kubernetes magasin

Last synced: 21 Apr 2025

https://github.com/amamenko/nypd-arrest-map

Data visualization application for year-to-date NYPD arrests

age arrest borough crime data deck-gl graph new new-york nyc nypd police race trends visualization websockets york

Last synced: 12 Apr 2025

https://github.com/longnguyen010203/ecommerce-elt-pipeline

🌄📈📉 A Data Engineering Project 🌈 that implements an ELT data pipeline using Dagster, Docker, dbt, Polars, Snowflake, and PostgreSQL. Data from the Kaggle website 🔥

dagster data data-engineering dbt docker docker-compose dockerfile elt elt-pipeline extract kaggle load polars postgresql raw-data relational-databases snowflake transform

Last synced: 27 Feb 2026

https://github.com/bkuhlmann/lode

A monadic store of marshaled objects.

data objects persistence pstore storage transactions value

Last synced: 29 Jul 2025

https://github.com/r-js/mangos

🥭's is a monorepo collecting data wrangling and data validation utilities

counterculture data data-wrangling fold functional isomorphism javascript json lens optics schema traversal validation

Last synced: 22 Feb 2026

https://github.com/jesusgraterol/binance-futures-dataset-builder

The dataset builder script extracts the most relevant market data straight from Binance's API and builds a series of datasets that can be used in data science and machine learning projects.

bitcoin blockchain blockchain-technology data datascience datascience-machinelearning dataset dataset-generation futures futures-long-short futures-market machine-learning

Last synced: 06 Mar 2026

https://github.com/sckott/splister

match species list against reference list

data r r-package rstats taxonomy

Last synced: 23 Apr 2025

https://github.com/amol-/datapyground

Easy-to-study Data Platform for fun and profit

compute-engine data data-engineering database python

Last synced: 28 Jul 2025

https://github.com/stefen-taime/kafka-pipeline

In the following post, we will learn how to build a data pipeline using a combination of open-source software (OSS), including Debezium, Apache Kafka, and Kafka Connect.

bash data docker elasticsearch etl-pipeline k kafka kafka-connect kafka-streams kafka-topic kibana ksqldb masking mongodb mysql pii pipeline postgresql

Last synced: 15 Apr 2025

https://github.com/axsaucedo/scalable-data-science

Scalable Data Science: The state of DataOps / MLOps in 2018

data dataops learning machine ml mlops scalable science

Last synced: 25 Oct 2025

https://github.com/arescentral/antares-data

Data needed by Antares

antares data scenario

Last synced: 21 Aug 2025

https://github.com/effect-deprecated/morphic

Domain Modelling and Structural Derivation (port of morphic-ts)

data domain functional typeclasses

Last synced: 29 Jun 2025

https://github.com/spsanderson/steveondata

Repository for mainly R tips and tricks for my blog. I also include some VBA, SQL, C and Linux Usage.

ai blog c data data-science linux machinelearning-r ml ms-sql r sql time-series tipoftheday vba vba-excel

Last synced: 07 Apr 2025

https://github.com/mooxphp/data

[READ-ONLY] Static Language Data for Filament

countries currencies data filament languages laravel static timezones

Last synced: 20 Feb 2026

https://github.com/juliaearth/geoartifacts.jl

Artifacts (e.g., datasets) for Geospatial Data Science

artifacts data geospatial

Last synced: 10 Apr 2026

https://github.com/dprokop/querier

Simple declarative data layer for React apps

data declarative react typescript

Last synced: 23 Mar 2025

https://github.com/arindal1/striversdsasheet

Solutions of all the problems in Striver's A2Z DSA Sheet

cpp data datastructures datastructures-algorithms striver strivers-sde-sheet

Last synced: 04 Apr 2025

https://github.com/hitsz-ids/dbmasker

DBMasker is an open-source Java project for mainstream database systems that aims to provide a unified and secure access interface.

data data-security database mask sdk security

Last synced: 26 Apr 2025

https://github.com/samashi47/ml-toolkit-project

A general-purpose toolkit for data preprocessing, machine learning modeling, and visualization.

classification data data-preprocessing machine-learning python3 visualization

Last synced: 30 Jul 2025

https://github.com/ejetar/laravel-formatter

A package that enables you to convert your data into various formats such as JSON, XML, CSV and YAML. Based on FuelPHP's 💧 formatter class.

conversion convert csv data format formatter json laravel output php response xml yaml

Last synced: 13 Jan 2026

https://github.com/bjascob/pythondataserve

A module for serving up python data in a stand-alone process.

client-server data python

Last synced: 23 Apr 2025

https://github.com/flintsh/outlier-tools

A collection of free open-source tools to help you better understand your Outlier account, entirely handled in-browser.

data outlier rlhf

Last synced: 27 Feb 2025

https://github.com/sjefvanleeuwen/rqlite-dotnet

A lightweight database HTTP API client for rqlite. rqlite is a lightweight, distributed relational database, which uses RAFT and SQLite as its storage engine.

cluster data database distributed distributed-computing distributed-database distributed-systems dotnet raft rqlite

Last synced: 12 May 2025

https://github.com/mozahran/data-mapper

A data mapping tool that helps you map JSON with configuration files (JSON structure transformation). It also supports if conditions, casting, and mutators (custom or built-in functions).

data json mapper mappings mutator transformer

Last synced: 13 Jan 2026

https://github.com/pottekkat/bulldozer-prize-predictions

Predict the auction sale price for a piece of heavy equipment to create a "blue book" for bulldozers.

bluebook bulldozer data data-science jupyter-notebook kaggle-competition machine-learning

Last synced: 20 Jun 2026

https://github.com/Scetrov/FrontierSharp

C# / .NET API clients for EVE Frontier: an API client for the static data exposed by CCP's HTTP API, plus an HTTP client tuned to the specific API design patterns implemented by CCP.

api data eve-frontier static-data

Last synced: 30 May 2026