An open API service indexing awesome lists of open source software.

data

Individual facts, statistics, or items of information, often numeric. In a technical sense, data are a set of values of qualitative or quantitative variables about one or more persons or objects. (https://en.wikipedia.org/w/index.php?title=Data&oldid=1093674723, released under CC BY-SA 3.0)

https://github.com/expressapp/construct

Library for dealing with data structures

data elixir elixir-construct elixir-lang types validation

Last synced: 11 Dec 2025

https://github.com/quasilyte/gocorpus

The code used to serve gocorpus application

analysis corpus data go gogrep golang query search statistics syntax

Last synced: 21 Apr 2025

https://github.com/moscarde/junior_zone

Junior job openings updated daily. Telegram and online spreadsheet

data python scraping telegram

Last synced: 14 Apr 2025

https://github.com/bpbond/srdb

Global soil respiration database

carbon-cycle data global-database science soil soil-respiration

Last synced: 07 Apr 2025

https://github.com/tony-xlh/animatedqrcodereader

Animated QR code reader

animated data qrcode transfer

Last synced: 23 Apr 2025

https://github.com/iesahin/xvc

A robust (🐢) and fast (🐇) MLOps tool for managing data and pipelines in Rust (🦀)

command-line-tool data data-engineering data-pipelines data-science devops machine-learning machine-learning-engineering mlops rust

Last synced: 28 Jun 2025

https://github.com/ahmed-mohamed-sn/olliePy

OlliePy is a Python package that helps data scientists explore their data and evaluate and analyse their machine learning experiments by utilising the power and structure of modern web applications. The data scientist only needs to provide the data and any required information, and OlliePy will generate the rest.

ai analytics charts dashboard data data-analytics data-science data-scientist eda error-analysis exploratory-data-analysis machine-learning python visualization

Last synced: 08 May 2025

https://github.com/distributedsystemsgroup/zoe

Zoe: Container Analytics as a Service -- mirror of https://gitlab.eurecom.fr/zoe/main/

analytics containers data jupyter python spark

Last synced: 23 Oct 2025

https://github.com/tobilg/aws-iam-data

This repository contains the full dataset of AWS IAM data (services, actions, resource types and conditions keys). It's updated on a daily basis at 4AM UTC.

aws data git-scraping iam

Last synced: 07 Apr 2025

https://github.com/siberiacancode/mock-config-server

🎉 A tool that easily and quickly imitates server operation; create a full fake API in a few steps

api config data database fake graphql mock mock-server mocking rest rest-api server

Last synced: 14 Jul 2025

https://github.com/olist/work-at-olist-data

Apply for a job at Olist's Data Team: https://olist.gupy.io/

analytics data dataengineering datascience dataset julia machinelearning pandas python r sql

Last synced: 25 Jun 2025

https://github.com/digitalghost-dev/poke-cli

A hybrid CLI/TUI tool written in Go for viewing Pokémon data from the terminal!

charm charmbracelet cli data go pokemon terminal terminal-based tui

Last synced: 23 Jan 2026

https://github.com/jcparkyn/phetch

A small Blazor library for handling async query state, in the style of React Query

async blazor data fetch query

Last synced: 27 Mar 2025

https://github.com/michabirklbauer/instagram_json_viewer

Transforms Instagram's *.json / backup data - that you get via the Data Download Tool - to a readable format!

backup converter data download html instagram json python viewer

Last synced: 18 Jul 2025

https://github.com/dig-eds-cat/digeds_cat

This research seeks to examine best practice in the field of digital editions by collating relevant evidence in a detailed catalogue of extant digital projects.

catalogue data digital-edition digital-humanities library-database open-access open-data open-source

Last synced: 08 Apr 2025

https://github.com/gmh5225/ida-find-.data-ptr

A simple ida python script to find .data ptr

anti anti-cheat cheat data data-ptr driver ida idapython plugin ptr windows

Last synced: 21 Mar 2025

https://github.com/datacoon/undatum

undatum: a command-line tool for data processing. Brings CSV simplicity to NDJSON, BSON, XML and other data files

bson cli command-line csv data dataset json jsonl jsonlines parquet

Last synced: 18 Jan 2026

https://github.com/jkanev/treetime

TreeTime is a data organisation, management and analysis tool. A tree is a hierarchical structure that arranges information in units and sub-units. TreeTime uses linked trees (one data item can be part of different distinct trees) to store and organise any general purpose data.

ancestry data data-organisation data-organizer hierarchical-data information-management information-management-system information-manager linked-trees ontology project-management time-management to-do-list tree tree-editor tree-structure

Last synced: 14 Dec 2025

https://github.com/jamesleesaunders/d3-ez

D3-EZ Easy Reusable Charts

chart d3 data dataviz graph svg visualization

Last synced: 16 Mar 2025

https://github.com/ysichov/Simple-Data-Explorer

Simple Data Explorer

abap data erp hcm sap sde se16n viewer

Last synced: 04 May 2025

https://github.com/paulknysh/sym

A Mathematica package for generating symbolic models from data

data generation mathematica model symbolic

Last synced: 08 Jul 2025

https://github.com/streamingfast/streamingfast

The dfuse Blockchain Data Platform

blockchain data eosio ethereum platform

Last synced: 30 Apr 2025

https://github.com/r-geoflow/geoflow

Orchestrate Geospatial (Meta)Data Management Workflows and Manage FAIR Services

data dataverse fair geospatial inspire iso metadata ocs ogc orchestrator postgis r spatial workflow zenodo

Last synced: 22 Oct 2025

https://github.com/0xb10c/bitcoin-development-history

Data and an example for an open source timeline of the history of Bitcoin development

bitcoin data history json timeline

Last synced: 14 Apr 2025

https://github.com/intake/akimbo

For when your data won't fit in your dataframe

awkward-array cudf data dataframe pandas polars python

Last synced: 26 Aug 2025

https://github.com/3c7/common-osint-model

Converting data from services like Censys and Shodan to a common data model

analysis censys data infrastructure model osint shodan

Last synced: 30 Dec 2025

https://github.com/chrisvwn/Rnightlights

R package to extract data from satellite nightlights.

data dmsp-ols extraction nightlights noaa package r satellite snpp-viirs

Last synced: 13 Jul 2025

https://github.com/raphaelmansuy/digital_palace

My Digital Palace - A Personal Journal for Reflection - A place to store all my thoughts

ai data data-engineering

Last synced: 16 Mar 2025

https://github.com/vesparny/brcast

Tiny data broadcaster with 0 dependencies

broadcast data emitter event pubsub

Last synced: 13 Apr 2025

https://github.com/ropensci/rdataretriever

R interface to the Data Retriever

data data-science database datasets r r-package rstats science

Last synced: 22 Oct 2025

https://github.com/thantthet/YBS-Data

Yangon Bus Service data

bus data public transit yangon ybs

Last synced: 14 Mar 2025

https://github.com/jmboehm/douglass.jl

Stata-like toolkit for data wrangling on Julia DataFrames

data data-frames economics julia stata tabular-data

Last synced: 31 Oct 2025

https://github.com/atolcd/pentaho-gis-plugins

🗺 GIS plugins for Pentaho Data Integration

data dxf etl geojson gpx java mif-mid pentaho-data-integration shp spatialite svg

Last synced: 23 Jan 2026

https://github.com/utrechtuniversity/yoda

A system for reliable, long-term storage and archiving of large amounts of research data during all stages of a study.

ansible automated-deployment data irods research utrecht-university yoda

Last synced: 07 Apr 2025

https://github.com/opendatasoft/semantic-bot

A Semi-Automatic Tool to generate RDF mappings for Opendatasoft's datasets

data dbpedia linked lov ontology rdf rdf-mapping rdfs rml semantic yago yarrrml

Last synced: 01 May 2025

https://github.com/cdnjs/cf-stats

📈 Monthly usage statistics from Cloudflare for the cdnjs.cloudflare.com domain - The #1 free and open source CDN built to make life easier for developers.

cdnjs cloudflare data data-analysis statistics stats usage usage-data usage-reports

Last synced: 06 Jul 2025

https://github.com/quickbirdeng/datakit

A Swift library to easily read and write binary formatted data using a modern, declarative interface.

binary-data ble bluetooth bluetooth-le bluetooth-low-energy data declarative declarative-programming decoding dsl encoding network resultbuilder swift swift5

Last synced: 23 Jun 2025

https://github.com/bluewave-labs/maskwise

Maskwise detects, redacts, masks, and anonymizes sensitive data across text, images, and structured data in training datasets for LLM systems. Powered by Microsoft Presidio

data data-anonymization data-redaction data-scanning gdpr-compliance hipaa-compliance pii-anonymization pii-detection sensitive-data-masking

Last synced: 20 Jan 2026

https://github.com/wjakethompson/taylor

A comprehensive resource for data on Taylor Swift songs, and ggplot2 helper functions

color-palettes data genius-lyrics ggplot2-themes lyrics r spotify spotify-api taylor-swift

Last synced: 06 Apr 2025

https://github.com/doktormike/dammmdatagen

Marketing Mix Modeling Data Generator

benchmark data data-generator marketing-mix-modeling

Last synced: 29 Jul 2025

https://github.com/kuda-io/kuda

A Kubernetes-native data delivery platform

cloud-native data golang hdfs kubernetes storage

Last synced: 14 Jan 2026

https://github.com/optixal/cryptoinscriber

📈 A live cryptocurrency historical trade data blotter. Download live historical trade data from any crypto exchange, whether for machine learning, backtesting/visualizing trading strategies, or for Quantopian/Zipline.

backtest bot cryptocurrency data downloader exchange feeds historical historical-data learning live machine poll strategy trade transactions

Last synced: 21 Mar 2025

https://github.com/jqnpm/jqnpm

A package manager built for the command-line JSON processor jq.

command-line-tool data data-processing jq json package-manager

Last synced: 21 Jul 2025

https://github.com/cgsecurity/testdisk_documentation

Documentation for TestDisk & PhotoRec

data photorec recovery testdisk

Last synced: 03 Sep 2025

https://github.com/asem000/pytreeclass

Visualize, create, and operate on pytrees in the most intuitive way possible.

data dataclasses deep-learning jax machine-learning pipelines pytorch pytree tensorflow

Last synced: 07 Apr 2025

https://github.com/timokoerber/laravel-json-seeder

Create and use JSON files to seed your database in your Laravel applications

data database json laravel seed seeder seeder-table seeders

Last synced: 19 Oct 2025

https://github.com/yzfly/mcp-excel-server

The Excel MCP Server is a powerful tool that enables natural language interaction with Excel files through the Model Context Protocol (MCP). It provides a comprehensive set of capabilities for reading, analyzing, visualizing, and writing Excel data.

claude claude-mcp data excel mcp mcp-excel-server

Last synced: 06 Jul 2025

https://github.com/daveebbelaar/df-data-science-template

This template is provided by Datalumina and is based on the Cookiecutter Data Science template

ai data python

Last synced: 06 Sep 2025

https://github.com/greenelab/pubtator

Retrieve and process PubTator annotations

data nlp pubmed pubtator snorkel text-mining tool

Last synced: 05 May 2025

https://github.com/benthosdev/benthos-captain

A Kubernetes Operator to orchestrate Benthos pipelines

benthos data data-engineering gitops go golang helm kubernetes kustomize pipelines stream-processing

Last synced: 22 Jan 2026

https://github.com/albar965/navdatareader

Navdatareader is a command line tool that uses the atools fs/bgl and fs/writer to store a full flight simulator scenery database in a relational database such as SQLite or MySQL.

compiler data flight fsx map navigation prepar3d simulator x-plane

Last synced: 02 May 2025

https://github.com/davegomez/silky-charts

A silky smooth D3/React library

charts d3 data dataviz graphs react utils

Last synced: 05 Jul 2025

https://github.com/j535d165/cbsodata

Unofficial Statistics Netherlands (CBS) open data API client for Python

census-api census-data data national-statistics netherlands open-data python-library

Last synced: 05 Apr 2025

https://github.com/cdcgov/cdc-open-viz

CDC OpenViz is a library of React packages for data visualization.

data react visualization visualization-library

Last synced: 04 Apr 2025

https://github.com/nasdaq/hackathons

Nasdaq's realtime streaming stock market data for hackathons.

data hackathon market market-data nasdaq real-time realtime stock-market streaming

Last synced: 18 Oct 2025

https://github.com/adieuadieu/japan-train-data

🇯🇵 🚂 A circular object of train data for Japan including translations & station geocoding and a tool to generate it.

data eki japan nihon train translations

Last synced: 18 Mar 2025

https://github.com/guocaoyi/meituan-spider

Meituan™ web-scraping practice project (regions, POIs, shops, products)

china-city data learning meituan meituan-pois poi puppeteer reptile reptile-nodejs

Last synced: 17 Aug 2025

https://github.com/ethicnology/ophois

Creates street graph from OpenStreetMap

data graph network openstreetmap osm street

Last synced: 11 Oct 2025

https://github.com/vida-nyu/data-polygamy

Data Polygamy is a topology-based framework that allows users to query for statistically significant relationships between spatio-temporal data sets.

data data-science nyucds

Last synced: 10 Apr 2025

https://github.com/adamhl8/inspectarr

A CLI tool for querying and inspecting the media in your Radarr and Sonarr instances

data inspect query radarr servarr sonarr

Last synced: 24 Oct 2025

https://github.com/parafoxia/analytix

A simple yet powerful SDK for the YouTube Analytics API.

analytical-information api-wrapper arrow data excel google pandas polars python service utility youtube youtube-api

Last synced: 06 Apr 2025

https://github.com/jiro4989/faker

Faker is a Nim package that generates fake data for you.

cli data faker generator lib nim

Last synced: 08 May 2025

https://github.com/xefi/faker-php-symfony

Symfony integration of the xefi\faker-php package

data fake faker php symfony symfony-bundle

Last synced: 18 Mar 2025

https://github.com/datakitchen/dataops-observability

DataOps Observability is part of DataKitchen's Open Source Data Observability. DataOps Observability monitors every data journey from data source to customer value, from any team development environment into production, across every tool, team, environment, and customer so that problems are detected, localized, and understood immediately.

data data-engineering data-observability data-science dataops pipleine-monitoring

Last synced: 09 Apr 2025

https://github.com/itext/itext-pdfocr-dotnet

pdfOCR is an iText 7 add-on to recognize and extract text in scanned documents and images. It can also convert them into fully ISO-compliant PDF or PDF/A-3u files that are accessible, searchable, and suitable for archiving

archival character data diacritic extractable glyphs hindi image iso-compliant ligatures mandarin ocr optical pdf portuguese recognition scan searchable spanish tesseract

Last synced: 08 Jan 2026

https://github.com/airbytehq/write-for-the-community

Contribute and collaborate on educational content for the Airbyte Community.

airbyte articles data open-source showcases tutorial videos

Last synced: 24 Feb 2025

https://github.com/the-alchemists-of-arland/gray-matter-rs

A tool for easily extracting front matter out of a string. It is a fast Rust implementation of gray-matter. Parses YAML, JSON, and TOML, with support for custom parsers. Use it and let me know by giving it a star!

data front-matter front-matter-parsers frontmatter gray-matter gray-matter-rs gray-matter-rust markdown matter parse rust rust-crate yaml

Last synced: 10 Apr 2025

https://github.com/jpmorganchase/py-avro-schema

Generate Apache Avro schemas for Python types including standard library data-classes and Pydantic data models.

avro data dataclasses deserialization generate jpmorganchase kafka messaging pydantic python schema serialization types

Last synced: 28 Jun 2025

https://github.com/jobovy/apogee

Tools for dealing with APOGEE data

astronomy astrophysics data data-analysis python spectroscopy

Last synced: 02 Oct 2025

https://github.com/paezha/idealista18

Open data product with real estate listings from Idealista. The datasets are for three major cities in Spain and the year 2018. https://doi.org/10.1177/23998083241242844

data open-data-products packages r real-estate spain spatial

Last synced: 29 Jun 2025

https://github.com/PatrickCuba-zz/thedatamustflow

Visio stencils and artefacts related to data vault guru

data data-vault stencil vault visio

Last synced: 20 Jul 2025

https://github.com/tombarr/open-source-words

Visualization of the most frequent words used in open source projects

d3 data data-visualization javascript python

Last synced: 13 Apr 2025

https://github.com/alir3z4/django-databrowse

Databrowse is a Django application that lets you browse your data.

data database django

Last synced: 11 Apr 2025

https://github.com/rmax/databrewer

The missing datasets manager. Like Homebrew but for datasets. A CLI tool to search and discover datasets!

command-line data datasets discovery python

Last synced: 20 Mar 2025

https://github.com/maicius/universityrecruitment-ssurvey

Using serious data to answer the question "What kinds of companies recruit at what kinds of universities?"

analysis beautifulsoup crawler data redis university

Last synced: 28 Apr 2025

https://github.com/ipeagit/flightsbr

R Package to Download Flight and Airport Data from Brazil

aviation-data brazil data r rstats rstats-package

Last synced: 02 May 2025