An open API service indexing awesome lists of open source software.

data

Individual facts, statistics, or items of information, often numeric. In a technical sense, data are a set of values of qualitative or quantitative variables about one or more persons or objects. (https://en.wikipedia.org/w/index.php?title=Data&oldid=1093674723, released under CC BY-SA 3.0)

https://github.com/tirthajyoti/synthetic-data-gen

Various methods for generating synthetic data for data science and ML

classification data data-science machine-learning python regression symbolic-computation time-series

Last synced: 30 Apr 2025

https://github.com/apple/dnikit

A Python toolkit for analyzing machine learning models and datasets.

ai bias compression data data-duplication fairness fairness-ml introspection machine-learning ml python

Last synced: 19 Oct 2025

https://github.com/turbot/steampipe-postgres-fdw

The Steampipe foreign data wrapper (FDW) is a zero-ETL product that provides Postgres foreign tables which translate queries into API calls to cloud services and APIs. It's bundled with Steampipe and also available as a set of standalone extensions for use in your own Postgres database.

aws azure data devsecops gcp golang hacktoberfest kubernetes postgres postgresql postgresql-fdw security sql steampipe steampipe-engine

Last synced: 07 May 2025

https://github.com/luanborelli/ipeadatapy

ipeadatapy is a data and metadata extraction package written in Python that uses the official Ipeadata database API. In essence, it is an API wrapper.

api api-wrapper brazil dados-abertos dados-historicos data data-analysis datasets econometrics economic-data economics geographic-data geography ipea ipeadata wrapper

Last synced: 07 Apr 2026

https://github.com/textileio/textile-facebook

[DEPRECATED] simple parsing tool to get your data out of a facebook export

data exporters photography privacy

Last synced: 05 Jan 2026

https://github.com/purarue/hpi

Human Programming Interface - a way to unify, access and interact with all of my personal data [my modules]

data gdpr history lifelogging personal-api quantified-self

Last synced: 04 Jul 2025

https://github.com/trailheadapps/coral-cloud

Sample application that showcases Data Cloud, Agents and Prompts.

agents ai cloud data prompt salesforce

Last synced: 05 Apr 2025

https://github.com/piquette/qtrn

A cli tool to streamline financial markets data analysis :wrench:

cli data data-science finance go golang options quotes scraper stock stock-analysis stock-market

Last synced: 15 May 2025

https://github.com/melroy89/metacritic_api

PHP Metacritic API - Mirror from my GitLab

api crawler data metacritic parser php scores scraper webscraping

Last synced: 13 May 2025

https://github.com/cipherstash/jseql

Encrypt and protect data using industry standard algorithms, field level encryption, a unique data key per record, bulk encryption operations, and decryption level identity verification.

data data-security encryption javascript postgres postgresql security typescript

Last synced: 09 Apr 2025

https://github.com/JoasASantos/ironclaw

Your own personal AI assistant. But with security by design. Support for numerous operating systems. Any platform.

ai ai-agents ai-assistant data openclaw own-your-data personal personal-assistant zeroclaw

Last synced: 16 Jun 2026

https://github.com/mydataharbor/mydataharbor

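:cn: MyDataHarbor is a distributed, highly scalable, high-performance, transaction-level data synchronization middleware for syncing data from any data source to any other data source. It helps users perform near-real-time incremental synchronization or scheduled full synchronization of massive data sets reliably, quickly, and stably. It is aimed primarily at real-time transaction systems and can also be used for big-data synchronization (the ETL domain).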

data data-sync elasticsearch etl java jdbc kafka mysql pipeline redis

Last synced: 19 Apr 2025

https://github.com/visivo-io/visivo

✨ Build dashboards with end-to-end version control. 🔋 CLI w/ batteries included, no infra required. Develop on your laptop for instant results, deploy changes safely (with automated checks), and keep every report trustworthy for stakeholders, analysts and agents 🤖

analytics bi bi-analytics bi-as-code business-intelligence data data-analysis data-visualization duckdb plotlyjs pydantic python reactjs sql

Last synced: 16 Oct 2025

https://github.com/countries/countries-data-json

ISO 3166 country information in JSON format to be included in other projects.

countries currency data iso-3166-1 iso-3166-2 iso-4217 json

Last synced: 18 Jan 2026

https://github.com/rsheftel/raccoon

Python DataFrame with fast insert and appends

data dataframe frame pandas

Last synced: 03 Apr 2025

https://github.com/capitalone/dataCompareR

dataCompareR is an R package that allows users to compare two datasets and view a report on the similarities and differences.

compare-data data data-analysis data-science r

Last synced: 30 Jul 2025

https://github.com/kaustubhhiware/facebook-archive

Just some fun you can have with facebook's archive data

data data-visualization facebook python

Last synced: 15 Apr 2025

https://github.com/pdil/usmap

🗺 Create US maps including Alaska and Hawaii in R

counties data fips geodata mapping r states usa

Last synced: 07 Apr 2025

https://github.com/infoculture/datatasks

Tasks for volunteers, interns, and anyone else interested in working with open and big data, as well as any other tasks related to crowdsourcing, plain language, and digital archiving.

data infoculture opendata opengov russian-data

Last synced: 19 Jul 2025

https://github.com/geonetwork/geonetwork-ui

GeoNetwork UI is a suite of Applications made to provide a modern facade to your GeoNetwork 4 catalog. It also provides Web Components to embed various parts of your data catalog in third party websites.

angular data geonetwork gis ui webcomponents

Last synced: 10 Apr 2025

https://github.com/bosun-ai/swiftide

Fast, streaming indexing and query library for AI (RAG) applications, written in Rust

ai data indexing llm llmops ml rag

Last synced: 14 Jul 2025

https://github.com/virtadpt/exocortex-halo

Various and sundry additional pieces of software I've written to incorporate into my exocortex.

bots conversation data exocortex interactive parser python rest-api

Last synced: 09 Apr 2025

https://github.com/hi-folks/data-block

PHP Package for handling, querying, filtering, and setting nested data structures

data data-structure hacktoberfest json-data php

Last synced: 04 Jan 2026

https://github.com/spine-tools/spine-toolbox

Spine Toolbox is an open source Python package to manage data, scenarios and workflows for modelling and simulation. You can have your local workflow, but work as a team through version control and SQL databases.

anaconda data energy miniconda python simulation-model spine-toolbox workflow

Last synced: 04 Apr 2025

https://github.com/dkandalov/activity-tracker

Plugin for IntelliJ IDEs to track and record user activity

data intellij plugin

Last synced: 11 Sep 2025

https://github.com/glynnbird/countriesgeojson

Countries of the world as GeoJSON

data geography geojson geospatial json

Last synced: 21 Mar 2025

https://github.com/stephanakkerman/tensortrade-extras

Discover a curated list of projects complementing TensorTrade, distinct from those mentioned in its official documentation. Contributions are welcome; if you spot a missing project, please submit a pull request!

crypto cryptocurrency cryptocurrency-trading-bot data finance live-trading openai-gym reinforcement-learning reinforcement-learning-agent stock-trading stocks technical-analysis tensortrade

Last synced: 18 Mar 2025

https://github.com/stonecypher/flocks.js

A radically simpler alternative to Flux - opinionated React state and rendering management

data flux javascript js layer react react-js reactjs redux screw-flux simple simplicity

Last synced: 02 May 2025

https://github.com/anish-agnihotri/blog-effective-nft-launches-data

Data+code for NFT launch guide blogpost.

data exploiting fairness launches nft

Last synced: 14 Apr 2025

https://github.com/9b/netinfo

Simple IP enrichment service and API wrapping PyASN and MaxMind GeoIP.

cybersecurity data devops enrichment ip-address-lookup network osint python3 webservice

Last synced: 25 Jan 2026

https://github.com/anandchowdhary/bookshelf-action

📚 Track your reading using GitHub Actions

api books data generator github-actions reading tracker

Last synced: 05 Apr 2025

https://github.com/alibaba/dimbin

High-performance serialization for multi-dimension arrays (a high-performance serialization scheme for massive data sets)

binary csharp data serialization typescript

Last synced: 14 Oct 2025

https://github.com/rishit-dagli/cppe-dataset

Code for our paper CPPE - 5 (Medical Personal Protective Equipment), a new challenging object detection dataset

artificial-intelligence computer-vision cppe5 data dataset deep-learning machine-learning models object-detection pretrained-models pytorch tensorflow vision

Last synced: 15 Jun 2025

https://github.com/publici/us-polling-places

Standardized data on historical general election polling places in the United States.

data elections open-data open-elections polling-locations polling-places

Last synced: 03 Feb 2026

https://github.com/kirankunigiri/Apple-Family

A simple framework that brings Apple devices together - like a family

bluetooth connectivitiy data ios macos usb wifi

Last synced: 30 Jul 2025

https://github.com/visual-layer/visuallayer

Simplify Your Visual Data Ops. Find and visualize issues with your computer vision datasets such as duplicates, anomalies, data leakage, mislabels and others.

cleaning computer computer-vision data data-science dataset datasets-preparation generative machine-learning python vision

Last synced: 19 Apr 2025

https://github.com/enviodev/hyperindex

📖 Blazing-fast multi-chain indexer

blockchain dapp data envio evm framework fuel indexer rescript

Last synced: 03 Mar 2026

https://github.com/tensorflow/tfjs-data

Simple APIs to load and prepare data for use in machine learning models

data deep-learning javascript machine-learning neural-network tensorflow tfjs

Last synced: 30 Sep 2025

https://github.com/caduandrade/davi_flutter

A fully customized data view that builds the cells on demand. Focused on Web/Desktop Applications. Bidirectional scroll bars.

data dataview flutter grid layout table widget

Last synced: 06 Apr 2025

https://github.com/doodlewind/bumpover

🚧 Async data transforming with simple rules.

data json markup migration tree validation xml

Last synced: 11 Sep 2025

https://github.com/malloydata/malloy-composer

Malloy Composer is a simple application to build dashboards or run ad-hoc queries using an existing Malloy model

business-analytics business-intelligence data data-modeling data-visualization malloy semantic-modeling

Last synced: 08 May 2025

https://github.com/patch/i18n-testing

International data for testing and QA

data i18n qa testing unicode

Last synced: 06 Jan 2026

https://github.com/matheusfelipeog/worldometer

Get live, population, geography, projected, and historical data from around the world 🌍

api data historical historical-data live livedata metrics mit-license projected pypi python scraping world worldometer worldometer-api worldometer-scraping worldometers

Last synced: 12 Apr 2025

https://github.com/binste/dbt-ibis

Write your dbt models using Ibis

data dbt ibis

Last synced: 06 Apr 2025

https://github.com/0xplaygrounds/subgrounds

An intuitive Python library for interfacing with subgraphs and GraphQL

analytics blockchain data decentralized graph graphql pandas python subgraphs substreams

Last synced: 12 Sep 2025

https://github.com/jneidel/job-titles

Normalized dataset of 70k job titles

data dataset jobs json opendata

Last synced: 27 Jul 2025

https://github.com/ArtesiaWater/hydropandas

Module for loading observation data into custom DataFrames

data groundwater hydrology observations pandas timeseries

Last synced: 20 Jul 2025

https://github.com/raystack/compass

Compass is an enterprise data catalog that makes it easy to find, understand, and govern data.

data dataops discovery lineage metadata

Last synced: 02 Apr 2026

https://github.com/ngnjs/ngn

A systems development platform. A revamped portal is coming soon at:

browser data data-model deno event-driven eventemitter javascript middleware ngn nodejs queue systems web

Last synced: 16 May 2025

https://github.com/paintedbicycle/sketch-data-faker

A Sketch plugin providing 130+ types of smart placeholder content for your mockups from Faker.js and other sources.

api content data fakerjs javascript json mock-data plugin sketch sketch-plugin sketchapp

Last synced: 26 Jun 2025

https://github.com/pedrokehl/caminho

Tool for creating efficient data pipelines in a JavaScript environment

backpressure concurency data dataprocessing functional javascript parallel pipeline reactive typescript

Last synced: 06 Apr 2025

https://github.com/whitfin/jen

A fast utility to generate fake/test documents based on a template

data dataset generator json template templating

Last synced: 22 Aug 2025

https://github.com/datasets/football-datasets

Data for the major European leagues (England, Spain, Italy, Germany and France)

csv data datasets football open soccer

Last synced: 27 Mar 2026

https://github.com/altaurog/pdfforms

Populate fillable pdf forms from csv data file

data forms pdf

Last synced: 12 May 2025

https://github.com/disease-sh/node-api

A JavaScript API Wrapper for NovelCOVID/API

coronavirus data javascript library open-source wrapper

Last synced: 21 Apr 2025

https://github.com/octoenergy/timeserio

Better `keras` models for time series and beyond

data data-science

Last synced: 24 Jun 2025

https://github.com/blazejkustra/dynamode

Dynamode is a modeling tool for Amazon's DynamoDB

amazon aws data database datastore db document dynamo dynamodb model nosql schema

Last synced: 14 Sep 2025

https://github.com/greenelab/adage

Data and code related to the paper "ADAGE-Based Integration of Publicly Available Pseudomonas aeruginosa..." Jie Tan, et al · mSystems · 2016

autoencoders data dataset denoising-autoencoders gene-expression machine-learning manuscript methodology neural-networks paper pseudomonas-aeruginosa research supplement

Last synced: 07 Mar 2026

https://github.com/alannikos/edg4llm

A unified tool to generate fine-tuning datasets for LLMs, including questions, answers, and dialogues. ✨🤖📚💬

chagpt chatglm data deepseek fine-tuning generation internlm llm

Last synced: 14 Jan 2026

https://github.com/nom-tam-fits/nom-tam-fits

A full featured 100% Java library for reading and writing FITS files

astronomy data fits fits-files fits-image fitsio java open-source scientific

Last synced: 11 Jan 2026

https://github.com/514-labs/moose

The developer framework for your data & analytics stack

analytics data dataengineering deployment framework insights metrics python rust typescript

Last synced: 05 Apr 2025

https://github.com/bluzi/name-db

:rocket: A multilingual collection of names from around the world

data language names translations

Last synced: 09 Oct 2025

https://github.com/hswick/jutsu

Graphing tool for Clojure built with the web and interactivity in mind

clojure data plotlyjs visualization

Last synced: 09 Sep 2025

https://github.com/z3z1ma/alto

Alto is a versatile data integration tool that allows you to easily run Singer plugins, build and cache PEX files encapsulating those plugins, and create a data reservoir whereby you can extract once and replay to as many destinations as you want.

data data-pipeline etl meltano singer

Last synced: 17 Mar 2025

https://github.com/micromata/generator-http-fake-backend

Yeoman generator for building a fake backend by providing the content of JSON files or JavaScript objects through configurable routes.

api backend data fake fake-data http http-server json mock mocking mocking-server mocks node nodejs rest rest-api restful restful-api yeoman yeoman-generator

Last synced: 14 Jan 2026

https://github.com/rolyatmax/citibike-trips

Visualizing citibike trips with webgl

citibike data javascript regl visualization webgl

Last synced: 14 Apr 2025

https://github.com/nikitastupin/orgs-data

Mapping from bug bounty and vulnerability disclosure programs to respective GitHub organizations

bug-bounty data github reconnaissance vulnerability-disclosure

Last synced: 09 Apr 2025

https://github.com/queryverse/excelreaders.jl

ExcelReaders is a package that provides functionality to read Excel files.

data excel julia queryverse

Last synced: 12 Apr 2025

https://github.com/geirolz/fly4s

A lightweight, simple and functional wrapper of Flyway using cats effect.

cats cats-effect data database database-migrations db flyway flyway-migrations flywaydb functional-programming persistence scala

Last synced: 11 Apr 2025

https://github.com/unsw-ceem/nemosis

NEMOSIS: NEM Open-source information service. A Python package for downloading historical data published by the Australian Energy Market Operator (AEMO)

aemo australia data energy national-electricity-market nem nemweb python

Last synced: 04 Apr 2025

https://github.com/noopeeks/datanvim

A fully-featured batteries-included Neovim distribution for the world of Data Science. Prepared to run code and interact with Jupyter Notebooks without ever leaving your terminal.

data data-science distribution jupyter-notebook machine-learning neovim nvim nvim-config text-editor vim

Last synced: 06 Oct 2025

https://github.com/fillmula/jsonclasses

🌎 The Modern Declarative Data Flow Framework for the AI Empowered Generation.

data json python validation

Last synced: 07 Sep 2025

https://github.com/worldbank/llm4data

LLM4Data is a Python library designed to facilitate the application of large language models (LLMs) and artificial intelligence for development data and knowledge discovery.

ai data development-data gpt gpt-4 indicators llm llm4data metadata sql wdi world-development-indicators

Last synced: 07 Apr 2025

https://github.com/digao-dalpiaz/dztalkapp

Delphi non-visual component to communicate between applications

applications communication component data delphi pascal

Last synced: 12 Jul 2025

https://github.com/cylondata/twister2

A composable framework for fast and scalable data analytics

batch big-data data graph iterative streaming

Last synced: 29 May 2026

https://github.com/kristoferjoseph/redeux

Minimal unidirectional data flow utility library

data flux reducer store unidirectional utility

Last synced: 06 May 2025

https://github.com/leonawicz/rtrek

R package for Star Trek datasets and related R functions.

data r-package stapi star-trek

Last synced: 06 Sep 2025