An open API service indexing awesome lists of open source software.

data

Individual facts, statistics, or items of information, often numeric. In a technical sense, data are a set of values of qualitative or quantitative variables about one or more persons or objects. (https://en.wikipedia.org/w/index.php?title=Data&oldid=1093674723, released under CC BY-SA 3.0)

https://github.com/evilpegasus/real-estate-price-prediction

Predicting NYC real estate sale prices using neural networks (1st place Berkeley SAAS Kaggle Competition Fall 2020)

data ipynb kaggle nyc

Last synced: 28 Oct 2025

https://github.com/stas00/reddit-to-threads

Convert arctic_shift Reddit data dumps into thread-view documents

conversion data dump reddit

Last synced: 20 Mar 2025

https://github.com/midaef/custom-memory-cache

C.M.C is a simple in-memory [key:value] cache implementation in Go with time to live (TTL)

cache data go golang memory midaef

Last synced: 14 Jan 2026

https://github.com/purarue/hpi-template

A cookiecutter template for creating a HPI repository

data lifelogging quantified-self

Last synced: 18 Mar 2025

https://github.com/mps9506/rattains

:droplet: Access EPA ATTAINS data in R :droplet:

data epa r r-package rstats water-quality

Last synced: 12 Apr 2025

https://github.com/killovsky/trendings

Repository for the scraper module that retrieves Twitter trending topics.

api countries data hashtag information module network nodejs scraper scraping social social-network tendencia topics trending trends trends24 twitter world

Last synced: 07 Jul 2025

https://github.com/pysat/pysatmodels

Interface for model analysis and model-data comparisons within the pysat ecosystem

comparison data dineof model pysat python sami2 tie-gcm validation

Last synced: 22 Jul 2025

https://github.com/pnnl/constrain

ConStrain is a data-driven knowledge-integrated framework that automatically verifies that building system controls function as intended.

bms building commissioning data hvac simulation verification

Last synced: 12 Apr 2025

https://github.com/deiu/solidproxy

Proxy server with authentication (using WebID-TLS delegation)

data delegation linked proxy-server solid webid webid-tls-delegation

Last synced: 11 Jan 2026

https://github.com/instafluff/coronavirus

COVID-19 Coronavirus Data Tracker

2019-ncov coronavirus covid-19 data ncov ncov-2019 sars-cov-2 wuhan

Last synced: 29 Oct 2025

https://github.com/cloudposse/terraform-aws-dms

Terraform modules for provisioning and managing AWS DMS resources

data dms dynamodb migration mysql oracle postgres postgresql s3 sql sql-server

Last synced: 09 Sep 2025

https://github.com/malloydata/malloy-cli

A command-line interface for executing Malloy and SQL

data data-modeling transformation

Last synced: 13 Jul 2025

https://github.com/upsonic/server

Self-Driven Autonomous Python Libraries

data data-science gpt-4o library-management ml mlops python

Last synced: 22 Aug 2025

https://github.com/openscm/scmdata

Handling of Simple Climate Model data

climate data data-visualization

Last synced: 27 Jul 2025

https://github.com/seanbreckenridge/HPI-template

A cookiecutter template for creating a HPI repository

data lifelogging quantified-self

Last synced: 29 Jul 2025

https://github.com/ttimbers/canlang

R Data package for Canadian language data collected via the Canadian Census.

census data language package r

Last synced: 04 Apr 2025

https://github.com/effect-deprecated/monocle

Optics for your data (port of monocle-ts)

data functional optics

Last synced: 23 Apr 2025

https://github.com/polyfrost/datastorage

Static data storage for Polyfrost projects.

data data-storage hypixel hypixel-api minecraft minecraft-mod

Last synced: 25 Feb 2025

https://github.com/gagolews/clustering-data-v1

A framework for benchmarking clustering algorithms – Benchmark suite, version 1

benchmark benchmark-datasets clustering data dataset datasets machine-learning

Last synced: 19 Apr 2025

https://github.com/cmpadden/dagster.nvim

Control the Dagster orchestrator directly from Neovim

dagster data neovim plugins

Last synced: 15 Mar 2025

https://github.com/node/circos

Mirror of Circos. "Circos is a software package for visualizing data and information. It visualizes data in a circular layout — this makes Circos ideal for exploring relationships between objects or positions."

circos data visualization

Last synced: 19 Apr 2025

https://github.com/jgraving/deepposekit-data

Example datasets for DeepPoseKit

data deepposekit pose-estimation posture

Last synced: 24 Sep 2025

https://github.com/markpflug/sylvan.tools.etl

Extract, transform, load (ETL) command-line tools.

csv data etl sqlite sqlserver

Last synced: 24 Apr 2025

https://github.com/0x8b/datamatrix

Library that enables programs to write Data Matrix barcodes of the modern ECC200 variety.

cli code data data-matrix datamatrix ecc ecc200 elixir generator matrix

Last synced: 26 Sep 2025

https://github.com/hyparam/hightable

A dynamic windowed scrolling table component for React

component data dataset grid javascript react scrolling table virtualized windowed

Last synced: 14 May 2025

https://github.com/p0dalirius/windowsbuilds

This repository contains the list of Windows builds as parsable JSON files.

builds data json windows

Last synced: 03 Sep 2025

https://github.com/jcwieme/data-scripts-star-wars

Useful repo for data visualization, for example, containing the scripts of the Star Wars movies as well as refined versions with only the dialogue.

characters count csv-files data data-visualization dialogues durations film json movies speaker star star-wars wars

Last synced: 14 Apr 2025

https://github.com/colinianking/sluice

Sluice is a program that reads input on stdin and outputs on stdout at a specified data rate.

data rate-limiting transfer

Last synced: 25 Aug 2025

https://github.com/michaelwitting/metabolomics2018

Scripts & Data for XCMS Workshop, Metabolomics 2018 in Seattle

analysis data metabolomics xcms

Last synced: 06 Mar 2025

https://github.com/ghodsizadeh/household-survey-data-iran

Household Survey Data Iran, but in sqlite3 (to work with the data easily and everywhere)

data iran iran-data

Last synced: 04 Oct 2025

https://github.com/smac-group/imudata

:battery: This package is meant to serve as a data collection tool for IMU data. This data can be used as a means to assess and test methods designed to analyse IMU error signals (i.e. long and complex autocorrelated signals). An example method used for this kind of data is implemented in the GMWM R package which can also model the latent models that often characterize this data.

accelerometer data gyroscope imu mems-imu-dataset r

Last synced: 22 Apr 2025

https://github.com/ikstream/dalec

Dalec is a project that aims to provide a privacy-preserving data collection method. It uses DNS for client/server separation and transmits the data encrypted.

collection data data-collection dns exfiltration shell

Last synced: 11 Aug 2025

https://github.com/HowTheyVote/data

Weekly updated data on roll-call-votes in the European Parliament, as collected by https://howtheyvote.eu/.

data ep eu european parliament plenary votes

Last synced: 12 Aug 2025

https://github.com/marcoradocchia/microxdg

An XDG Base Directory Specification Rust library that aims to be conservative on memory allocation and overall memory footprint.

appdir basedir cache config data directories directory executable file home path runtime rust rust-lang state xdg xdg-basedir xdg-compliance xdg-user-dirs

Last synced: 23 Mar 2025

https://github.com/anandchowdhary/everything

⏳ Everything Everywhere All at Once

api data github-actions json

Last synced: 30 Jul 2025

https://github.com/half0wl/datagovsg_api

Unofficial Python API wrapper for public APIs at developers.data.gov.sg

api-wrapper data government-data python singapore

Last synced: 13 Aug 2025

https://github.com/pravj/github-dynamics

Source code for "Information Dynamics on the GitHub Network"

data open-source visualization

Last synced: 30 Jul 2025

https://github.com/tyson-swetnam/emsi

ecosystem moisture stress index

data google-earth-engine javascript python rmarkdown

Last synced: 20 Sep 2025

https://github.com/recodehive/recode-website

recodehive helps you learn and master data skills, and encourages you to contribute code to open source.

data data-science dataengineering opensource python sql tutorials website

Last synced: 27 Jul 2025

https://github.com/linwin-cloud/linwin-db-server

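In the vast ocean of modern big data, computers are deeply bound to information and data, and database software is what carries these hundreds of millions of records. Linwin Data Server is a high-performance database developed in China and written in Java. It supports Chinese domestic and Linux operating systems as well as multi-user operation. It uses a NoSQL structure and a self-developed database language, mys, that is simpler, more convenient, and more efficient. All create, read, update, and delete operations on user data run in memory, while disk reads and writes are handled by dedicated threads so that they do not interfere.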

data data-science database hashmap http java javascript key-value linux programming-language python server typescript webserver website

Last synced: 17 Jun 2025

https://github.com/markpflug/sylvan.data.xbase

The fastest .NET library for reading xBase (.dbf) data files.

data dbase dbf dotnet xbase

Last synced: 24 Apr 2025

https://github.com/josechirif/2018-house-price-estimation---melbourne-australia

The project proposes to calculate the price of a Melbourne house according to its characteristics.

data data-science python

Last synced: 14 Apr 2025

https://github.com/mstgnz/data-structures

Data Structures With Go

data data-structures go golang

Last synced: 22 Sep 2025

https://github.com/dp6/raft-suite-hub

The Hub is the solution responsible for centralizing data consolidation in BigQuery, the tool chosen to serve as the data warehouse for raft-suite.

bigquery data data-quality google-cloud google-cloud-functions hacktoberfest

Last synced: 28 Jun 2025

https://github.com/marcusschiesser/vectorstores

Vectorstores is a framework for using vector databases in your AI applications

ai ai-agents data database embeddings vector vector-database

Last synced: 13 Jan 2026

https://github.com/itwars/golang-scraping-colly

Examples of retrieving unstructured data with the Golang framework COLLY

bigdata colly crawler crawling data forecast golang scraper scraping sports

Last synced: 17 May 2025

https://github.com/data-engineering-community/data-engineering-meetup-in-a-box

A collection of guides, resources, and support for DE meetup organizers.

data data-analysis data-engineering data-mining data-structures database meetups

Last synced: 02 Aug 2025

https://github.com/capturr/jsonld-extract

A damn simple tool to extract JSON-LD metadata from a webpage using a jQuery-like API (jQuery, Cheerio, CashDom ...).

cashdom cheerio crawler crawling data extract extractor javascript jquery json jsonld metadata nodejs parser scraper scraping spider typescript

Last synced: 24 Mar 2025

https://github.com/mquezada/uchile-cc5206

Introduction to Data Mining course [DCC UChile]

association-rules chile classification clustering course data eda jupyter mining python r science uchile

Last synced: 15 Mar 2025

https://github.com/jldbc/auctionhouse

See how much advertisers are paying for your attention https://chrome.google.com/webstore/detail/auctionhouse/hmjofiljabjmompfgllkpkbkfbpbpkcp

adtech chrome-extension data prebid

Last synced: 26 Oct 2025

https://github.com/eneko/data-repository

Data files used mainly for testing

data json testing-tools word-list words

Last synced: 29 Sep 2025

https://github.com/iq2i/data-importer

A PHP library to easily manage and import large data files

async csv csv-files csvreader data dto file import json php processor reader xml xmlreader

Last synced: 06 Apr 2025

https://github.com/darothen/experiment

Organizing numerical model experiment output

climate data model science

Last synced: 03 Oct 2025

https://github.com/lvlyke/ngxs-synchronizers

Easily keep your app's local state synchronized with your backend, databases and more! ngxs-synchronizers simplifies synchronizing your NGXS-based application state with external data sources.

async asynchronous backend data external ngx ngxs remote rxjs state store synchronization synchronize synchronizers

Last synced: 22 Aug 2025

https://github.com/bluebrain/data-validation-framework

Simple framework to create data validation workflows.

data data-analysis python validation validation-tool

Last synced: 14 May 2025

https://github.com/kurisubrooks/sherlock

modular server with static page serving, modular data retrieval and functional api endpoints

api backend data modular server sherlock static

Last synced: 25 Aug 2025

https://github.com/mabel-dev/mabel

😊 mabel is a platform for authoring data processing systems.

data pipelines processing python

Last synced: 27 Aug 2025

https://github.com/pazzo83/noaadata.jl

Wrapper of the NOAA Climate Data API in Julia

climate data julia

Last synced: 06 Apr 2025

https://github.com/eurobios-mews-labs/acrocord

This package provides some useful tools for interacting with a PostgreSQL server using pandas DataFrames

data data-science database pandas-dataframe postgresql psycopg2 python python3 sqlalchemy table-factory

Last synced: 15 Apr 2025

https://github.com/jakarto3d/jakarnotator

The Jakarnotator is an annotation tool for creating your own database for instance segmentation problems.

annotations computer-vision data database deep-learning detectron instance-segmentation mscoco training-data

Last synced: 15 May 2025

https://github.com/jujuadams/extendingjson

Human-writeable JSON-like data formats for GameMaker

data gamemaker gamemaker-s gms2 json yaml

Last synced: 15 Apr 2025

https://github.com/lVlyke/ngxs-synchronizers

Easily keep your app's local state synchronized with your backend, databases and more! ngxs-synchronizers simplifies synchronizing your NGXS-based application state with external data sources.

async asynchronous backend data external ngx ngxs remote rxjs state store synchronization synchronize synchronizers

Last synced: 24 Apr 2025

https://github.com/chalk-ai/chalk-go

Go client for Chalk

data data-pipeline feature-engineering

Last synced: 03 Feb 2026

https://github.com/fearlesssolutions/engineering-practice-domains

A mono-repo for the Engineering Practice Domains of Development, Data, Infrastructure, Testing, and Platforms

data data-engineering data-science database-design devops drupal end-to-end-testing engineering infrastructure machine-learning salesforce security testing web-development

Last synced: 26 Oct 2025

https://github.com/forcedotcom/comdagen

COMmerce DAta GENerator will build a Commerce Cloud site import file tailored to your specification

commerce-cloud data

Last synced: 19 Jun 2025

https://github.com/tushar2704/best-ever-streamlit-applications

101 Super Streamlit Applications: This interactive web application collection serves as a showcase of my data science and machine learning projects. With a passion for data-driven insights and a knack for creating engaging data applications, I am excited to present this portfolio as a demonstration of my skills and expertise.

data datascience machinelearning python streamlit streamlit-tushar2704 tushar2704

Last synced: 10 Oct 2025

https://github.com/simeononsecurity/track-helium-mobile-wifi

A collection of scripts and tools that tracks the availability of helium mobile wifi networks in the wild from the Wigle Dataset and Helium API. Updates every 24 hours.

automation bigdata carrieroffload data data-analysis dataset dataset-generation helium openroaming passpoint python

Last synced: 23 Apr 2025

https://github.com/greenelab/scihub-browser-data

Data for the Sci-Hub Stats Browser

data journals piracy sci-hub supplement webapp

Last synced: 09 Oct 2025

https://github.com/flother/thecounted

Copy of the data from The Counted, a Guardian project to count people killed by US law enforcement agencies in 2015 and 2016

data guardian police police-killings

Last synced: 09 Apr 2025

https://github.com/enspirit/monolens

Declarative data transformations as data

data data-engineering homoiconic json yaml

Last synced: 12 Apr 2025

https://github.com/prismadic/tractor-beam

high-efficiency text & file scraper with smart tracking, client/server networking for building language model datasets fast

botnet cluster data file-downloader llm llm-finetuning llm-training mass-downloader scraping

Last synced: 07 May 2025

https://github.com/mackysoft/unidata

[PREVIEW] Data Management for Unity. It is also useful for implementing Achievements, Quests, etc.

achievements c-sharp csharp data database input-ouput quest unity unity-editor

Last synced: 09 Apr 2025

https://github.com/naupio/pical

(Work in progress) pita is a general distributed computation system written in Erlang and based on the DAG model. This project is inspired by DouBan's DPark and Apache Spark.

big-data bigdata dag data distributed distributed-computing distributed-systems erlang erlang-otp flink spark

Last synced: 19 Oct 2025

https://github.com/shrayasr/globalairports.net

The .Net client for http://www.partow.net/miscellaneous/airportdatabase/

airports cross-platform data dotnet dotnet-standard wrapper

Last synced: 14 Jan 2026

https://github.com/pkx8326/google_coursera_cyclistic

A data analysis case study capstone from Google Data Analytics Professional Certificate course on Coursera

analysis analytics capstone coursera cyclistic data google project r

Last synced: 27 Oct 2025

https://github.com/grzegorzme/data-toolz

simple python library for handling data-io tasks

data data-wrangling filesystem python python3 s3 tooling

Last synced: 22 Jan 2026

https://github.com/thoughtspot/visual-embed-sdk

Customizable analytics components for your app, powered by ThoughtSpot's AI

ai data web-embed

Last synced: 18 Jan 2026

https://github.com/the-pew-inc/the-pew

ThePew is an advanced system of records that enables enterprises to detect trends and patterns from questions to drive marketing and business decisions toward their goals.

data data-science docker javascript machine-learning postgresql rails ruby

Last synced: 06 Oct 2025

https://github.com/cjdoris/chevrons.jl

Your friendly >> chevron >> based syntax for piping data through multiple transformations.

data data-science data-transformation julia julia-lang julia-language macros piping repl

Last synced: 16 Oct 2025

https://github.com/0x9ef/go-wiper

Safely wiping your secure data in Golang

data gutmann-method safely secure utility wiping

Last synced: 30 Apr 2025

https://github.com/zehracakir/verimadenciliginotlarim

My notes and my own studies in the Data Mining course in the computer engineering department of Süleyman Demirel University

classifying clustering data data-mining data-science linear-regression machine-learning pandas python

Last synced: 18 Jun 2025

https://github.com/Codeblin/ObjectPreference

Fast and easy Shared Preferences managing with object mapping annotations for simple or complex class structures

android code-generation data dataclasses easy-to-use localstorage mapping-annotations sharedpreferences sharedpreferences-easy sharedpreferences-helper sharedpreferences-manager

Last synced: 12 Apr 2025

https://github.com/geus-glaciology-and-climate/mass_balance

Greenland ice sheet mass balance from 1840 through next week

data grass-gis greenland org-mode publication python research science

Last synced: 12 Apr 2025

https://github.com/brightway-lca/bw_processing

Tools to create structured arrays in a common format

bw3 data life-cycle-assessment python

Last synced: 05 May 2025