An open API service indexing awesome lists of open source software.

data

Individual facts, statistics, or items of information, often numeric. In a technical sense, data are a set of values of qualitative or quantitative variables about one or more persons or objects. (https://en.wikipedia.org/w/index.php?title=Data&oldid=1093674723, released under CC BY-SA 3.0)

https://github.com/sammcj/mcp-data-extractor

A model context protocol server to migrate data out of code (ts/js) into config (json)

data javascript js json llm mcp tool ts typescript

Last synced: 08 Mar 2026

https://github.com/n1ghtf1re/map-of-emergency-incidents

Emergency Map lets you effectively visualize multi-dimensional information through an intuitive interface. The code is easy to modify for use in a variety of areas. Color mixing enhances the perception and analysis of information.

big-data big-data-analytics big-data-visualization bigdata color-mixing colors data data-analytics data-science data-visualization data-visualization-challenges data-visualization-simpler mysql open-source-project php student-project

Last synced: 18 Mar 2025

https://github.com/michaelwitting/metabolomics2018

Scripts & Data for XCMS Workshop, Metabolomics 2018 in Seattle

analysis data metabolomics xcms

Last synced: 07 Mar 2026

https://github.com/deiu/solidproxy

Proxy server with authentication (using WebID-TLS delegation)

data delegation linked proxy-server solid webid webid-tls-delegation

Last synced: 11 Jan 2026

https://github.com/instafluff/coronavirus

COVID-19 Coronavirus Data Tracker

2019-ncov coronavirus covid-19 data ncov ncov-2019 sars-cov-2 wuhan

Last synced: 25 Feb 2026

https://github.com/glassflow/glassflow-python-sdk

GlassFlow Python SDK to publish and consume data to your pipelines at Glassflow.dev

data data-processing datastreaming python real-time sdk stream-processing

Last synced: 18 Feb 2026

https://github.com/modmuss50/cursemapper

A tool to make graphs and stuff from downloads on curse

curseforge data google-charts gradle graph javafx js kotlin php

Last synced: 18 Jul 2025

https://github.com/motasimfoad/emr

“EMR” is a platform built using leading-edge web technologies and APIs to help doctors, patients, hospitals, and pharmacies better deal with medical documentation.

apollo data doctor emr graphcool graphql hospital medical patients pharmacy reactjs record yarn

Last synced: 10 Apr 2025

https://github.com/smac-group/imudata

🔋 This package is meant to serve as a data collection tool for IMU data. This data can be used as a means to assess and test methods designed to analyse IMU error signals (i.e. long and complex autocorrelated signals). An example method used for this kind of data is implemented in the GMWM R package, which can also model the latent models that often characterize this data.

accelerometer data gyroscope imu mems-imu-dataset r

Last synced: 22 Apr 2025

https://github.com/Evan-Kim2028/subgraph-query-portal

A collection of reusable public goods subgraph queries.

data messari polars python query subgraphs

Last synced: 09 May 2025

https://github.com/jcwieme/data-scripts-star-wars

Useful repo for data-viz (e.g.) which contains the scripts of the Star Wars movies as well as refined versions with only the dialogues.

characters count csv-files data data-visualization dialogues durations film json movies speaker star star-wars wars

Last synced: 14 Apr 2025

https://github.com/mps9506/rattains

💧 Access EPA ATTAINS data in R 💧

data epa r r-package rstats water-quality

Last synced: 12 Apr 2025

https://github.com/akeneo/transporteo

Migration Tool for Akeneo PIM from 1.7 to 2.0

akeneo akeneo-pim data migration php symfony

Last synced: 29 Jul 2025

https://github.com/cecoeco/htmltables.jl

Julia package for reading and writing HTML tables using the Tables.jl interface

css data html http julia table tables

Last synced: 11 Apr 2025

https://github.com/evilpegasus/real-estate-price-prediction

Predicting NYC real estate sale prices using neural networks (1st place Berkeley SAAS Kaggle Competition Fall 2020)

data ipynb kaggle nyc

Last synced: 28 Oct 2025

https://github.com/malloydata/malloy-cli

A command-line interface for executing Malloy and SQL

data data-modeling transformation

Last synced: 13 Jul 2025

https://github.com/purarue/HPI-template

A cookiecutter template for creating an HPI repository

data lifelogging quantified-self

Last synced: 01 May 2025

https://github.com/caltechlibrary/caltechdata

The CaltechDATA InvenioRDM source code

data inveniordm repository

Last synced: 13 Apr 2025

https://github.com/SamEdwardes/pydatafaker

A python package to create fake data with relationships between tables.

data data-science fake-data python

Last synced: 09 Jul 2025

https://github.com/datapreprocessing/datacleaning

Data Cleaning is a python package for data preprocessing. This cleans the CSV file and returns the cleaned data frame. It does the work of imputation, removing duplicates, replacing special characters, and many more.

data data-cleaning data-cleansing data-preprocessing data-wrangling imputation python threshold

Last synced: 14 Dec 2025

https://github.com/erhathaway/router-primitives

A framework-agnostic application router. Declarative routing by way of layout primitives 🌄

custom-primitives data feature layout layout-primitives link router router-actions router-primitives router-state scene stack

Last synced: 20 Jan 2026

https://github.com/iq2i/data-importer

A PHP library to easily manage and import large data files

async csv csv-files csvreader data dto file import json php processor reader xml xmlreader

Last synced: 06 Apr 2025

https://github.com/bluebrain/data-validation-framework

Simple framework to create data validation workflows.

data data-analysis python validation validation-tool

Last synced: 14 May 2025

https://github.com/zehracakir/verimadenciliginotlarim

My notes and my own studies in the Data Mining course in the computer engineering department of Süleyman Demirel University

classifying clustering data data-mining data-science linear-regression machine-learning pandas python

Last synced: 18 Jun 2025

https://github.com/rofl0r/filesync

syncs two directories, with the possibility of creating incremental backups

backup c data directories lightweight synchronization

Last synced: 22 Mar 2025

https://github.com/pinsjs/pinsjs

Pin, Discover and Share Datasets

cache data datascience datasets download s3 upload

Last synced: 30 Apr 2025

https://github.com/avencera/covid

COVID tracking, data visualization, and API for COVID-19 (novel coronavirus)

countries covid data graphql johns-hopkins-university restapi

Last synced: 11 Apr 2025

https://github.com/5tefan/ncagg

Aggregate NetCDF time series data.

aggregation concatenate data netcdf python time-series utility

Last synced: 13 Apr 2025

https://github.com/dp6/raft-suite-hub

The Hub is the solution responsible for centralizing data consolidation in BigQuery, the tool chosen to serve as the data warehouse for raft-suite.

bigquery data data-quality google-cloud google-cloud-functions hacktoberfest

Last synced: 28 Jun 2025

https://github.com/aminya/varstructs.jl

Variable Julia Structs with dispatching

data dispatch hacktoberfest julia macro polymorphism struct variable

Last synced: 07 May 2025

https://github.com/hubshashwat/letmelive

Finding safe food in India shouldn't be hard. LetMeLive helps you verify if your protein, ghee, or supplements have passed lab tests for heavy metals and adulteration (all in one open-source list)

aggregator data food foodsafety health india open-source

Last synced: 15 Jan 2026

https://github.com/jupyter-naas/docs

Documentation for building AI Networks as a Service with the Naas platform.

ai analytics automation blog data website

Last synced: 17 Jan 2026

https://github.com/lfenzo/impostor.jl

The highly versatile synthetic data generator

data dataframes datasets generator julia synthetic synthetic-data

Last synced: 17 Jul 2025

https://github.com/cjdoris/chevrons.jl

Your friendly >> chevron >> based syntax for piping data through multiple transformations.

data data-science data-transformation julia julia-lang julia-language macros piping repl

Last synced: 07 Mar 2026

https://github.com/brightway-lca/bw_processing

Tools to create structured arrays in a common format

bw3 data life-cycle-assessment python

Last synced: 05 May 2025

https://github.com/lacerbi/visvest-causinf

Bayesian comparison of causal inference strategies in multisensory heading perception

data modeling multisensory-integration visuo-vestibular-interaction

Last synced: 30 Apr 2025

https://github.com/ryanve/ssv

Space Separated Values. JavaScript library for spaced data. Fun and fast for classnames and beyond 💕

blink-182 classes classlist classname classnames css-classes data fleek javascript library opensource set-theory spaced spaces ssv strings values

Last synced: 10 Apr 2025

https://github.com/timlrx/browser-data-processing-benchmarks

Benchmark of data processing libraries on the browser including Arquero, Sqlite WASM and Duckdb WASM

benchmark data duckdb javascript sqlite wasm

Last synced: 12 May 2025

https://github.com/mattcox/Pack

A Swift package to serialize and deserialize various data types into an external representation.

binary data ios macos swift utitlity

Last synced: 28 Mar 2025

https://github.com/bradmartin/nativescript-android-sensors

NativeScript plugin for using Android device sensors on background thread.

accelerometer android data nativescript sensor

Last synced: 08 May 2025

https://github.com/jim-schwoebel/youtube_scrape

📹 Library for making playlists and scraping youtube videos - alternative to pafy, pytube, and youtube-dl.

audio data database pafy scraper voice youtube-dl

Last synced: 11 Apr 2025

https://github.com/yonet/d3-v4-slides

D3.js version 4.0 slides for the Modern Web/AngularJS meetup at Google.

d3 d3js d3v4 data data-visualization slides

Last synced: 18 Mar 2025

https://github.com/pazzo83/noaadata.jl

Wrapper of the NOAA Climate Data API in Julia

climate data julia

Last synced: 06 Apr 2025

https://github.com/reubano/pycon17-tute

code for "Using Functional Programming for efficient Data Processing and Analysis" PyCon '17 tutorial

data functional-programming meza pycon python riko tutorial

Last synced: 12 Apr 2025

https://github.com/Codeblin/ObjectPreference

Fast and easy Shared Preferences managing with object mapping annotations for simple or complex class structures

android code-generation data dataclasses easy-to-use localstorage mapping-annotations sharedpreferences sharedpreferences-easy sharedpreferences-helper sharedpreferences-manager

Last synced: 12 Apr 2025

https://github.com/geus-glaciology-and-climate/mass_balance

Greenland ice sheet mass balance from 1840 through next week

data grass-gis greenland org-mode publication python research science

Last synced: 12 Apr 2025

https://github.com/0x9ef/go-wiper

Safely wiping your secure data in Golang

data gutmann-method safely secure utility wiping

Last synced: 30 Apr 2025

https://github.com/Gudsfile/tracksy

👀 tracksy - Visualize your data

astro data duckdb duckdb-wasm hacktoberfest privacy-first

Last synced: 22 May 2026

https://github.com/killovsky/4devs

Repository for the module that generates fake data based on the 4Devs website.

4devs age banco bank cartao cep cnpj cpf credito dados data fake fakedata generator gerador gerar idade password rg senha

Last synced: 05 Apr 2025

https://github.com/mocnik-science/giscience.net-data

Open Data about Geographical Information Science (GIScience)

conferences data giscience journals open

Last synced: 30 Oct 2025

https://github.com/synacktraa/pylib

This is a C library which provides Python-like data reading and handling functions. (WIP)

c data library python

Last synced: 13 Apr 2025

https://github.com/alanmarazzi/mepcheck

Python package to retrieve MEPs voting data from Votewatch.eu. Check what your MEPs are doing in a few simple commands.

data data-retrieval europe european-parliament mep meps politics python scraper voting-data

Last synced: 20 Mar 2025

https://github.com/worldbank/ai4data

Collection of products from the AI4Data - Data4AI program from the Development Data Group and Office of the Chief Statistician

ai ai4data data data4ai llm

Last synced: 02 Jun 2026

https://github.com/enspirit/monolens

Declarative data transformations as data

data data-engineering homoiconic json yaml

Last synced: 12 Apr 2025

https://github.com/forcedotcom/comdagen

COMmerce DAta GENerator will build a Commerce Cloud site import file tailored to your specification

commerce-cloud data

Last synced: 19 Jun 2025

https://github.com/unicolab/keras-data-processor

Data preprocessing model based on Keras preprocessing layers that can be used as a standalone model or incorporated into a Keras model as its first layers.

data keras layers preprocessing tensorflow

Last synced: 13 Feb 2026

https://github.com/alimehasin/dunna

Generate random data joyfully

data fake javascript nodejs typescript

Last synced: 04 Mar 2026

https://github.com/grzegorzme/data-toolz

simple python library for handling data-io tasks

data data-wrangling filesystem python python3 s3 tooling

Last synced: 22 Jan 2026

https://github.com/tushar2704/best-ever-streamlit-applications

101 Super Streamlit Applications: this interactive web application collection serves as a showcase of my data science and machine learning projects. With a passion for data-driven insights and a knack for creating engaging data applications, I am excited to present this portfolio as a demonstration of my skills and expertise.

data datascience machinelearning python streamlit streamlit-tushar2704 tushar2704

Last synced: 10 Oct 2025

https://github.com/recodehive/recode-website

recodehive helps you learn and master data skills, and encourages you to contribute code to open source.

data data-science dataengineering opensource python sql tutorials website

Last synced: 15 Mar 2026

https://github.com/thomas-chauvet/names_transliteration

Neural Machine Translation (NMT) applied to transliterating names from Arabic characters to Latin characters (romanization).

arabic characters cli data dataset deep-learning latin neural-network nlp nmt romanization seq2seq translation transliteration typer-cli

Last synced: 06 Feb 2026

https://github.com/greenelab/scihub-browser-data

Data for the Sci-Hub Stats Browser

data journals piracy sci-hub supplement webapp

Last synced: 09 Oct 2025

https://github.com/pkx8326/google_coursera_cyclistic

A data analysis case study capstone from Google Data Analytics Professional Certificate course on Coursera

analysis analytics capstone coursera cyclistic data google project r

Last synced: 27 Oct 2025

https://github.com/the-pew-inc/the-pew

ThePew is an advanced system of records that enables enterprises to detect trends and patterns from questions to drive marketing and business decisions toward their goals.

data data-science docker javascript machine-learning postgresql rails ruby

Last synced: 06 Oct 2025

https://github.com/fearlesssolutions/engineering-practice-domains

A mono-repo for the Engineering Practice Domains of Development, Data, Infrastructure, Testing, and Platforms

data data-engineering data-science database-design devops drupal end-to-end-testing engineering infrastructure machine-learning salesforce security testing web-development

Last synced: 26 Oct 2025

https://github.com/naupio/pical

(Work in progress) pita is a general distributed computation system written in Erlang and based on the DAG model. This project is inspired by Douban's DPark and Apache Spark.

big-data bigdata dag data distributed distributed-computing distributed-systems erlang erlang-otp flink spark

Last synced: 19 Oct 2025

https://github.com/linwin-cloud/linwin-db-server

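In the vast ocean of modern big data, computers are deeply bound to information and data, and database software is what carries these hundreds of millions of records. Linwin Data Server is a high-performance database developed in China and built on Java. It supports Chinese domestic and Linux operating systems as well as multi-user operation. It uses a NoSQL structure and a self-developed database operation language, mys, which makes it simpler, more convenient, and more efficient. All create, delete, update, and query operations on user data happen in memory, while reads from and writes to disk are handled by dedicated threads so that they do not get in the way.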

data data-science database hashmap http java javascript key-value linux programming-language python server typescript webserver website

Last synced: 05 Mar 2026

https://github.com/litert/type-guard

An easy and powerful data validation code generator for JavaScript.

check data types validation

Last synced: 04 Apr 2026

https://github.com/lVlyke/ngxs-synchronizers

Easily keep your app's local state synchronized with your backend, databases and more! ngxs-synchronizers simplifies synchronizing your NGXS-based application state with external data sources.

async asynchronous backend data external ngx ngxs remote rxjs state store synchronization synchronize synchronizers

Last synced: 24 Apr 2025

https://github.com/josechirif/2018-house-price-estimation---melbourne-australia

The project proposes to calculate the price of a Melbourne house according to its characteristics.

data data-science python

Last synced: 14 Apr 2025

https://github.com/tyson-swetnam/emsi

ecosystem moisture stress index

data google-earth-engine javascript python rmarkdown

Last synced: 20 Sep 2025

https://github.com/eneko/data-repository

Data files used mainly for testing

data json testing-tools word-list words

Last synced: 29 Sep 2025

https://github.com/darothen/experiment

Organizing numerical model experiment output

climate data model science

Last synced: 03 Oct 2025

https://github.com/mquezada/uchile-cc5206

Introduction to Data Mining course [DCC UChile]

association-rules chile classification clustering course data eda jupyter mining python r science uchile

Last synced: 15 Mar 2025

https://github.com/shrayasr/globalairports.net

The .Net client for http://www.partow.net/miscellaneous/airportdatabase/

airports cross-platform data dotnet dotnet-standard wrapper

Last synced: 14 Jan 2026

https://github.com/jakarto3d/jakarnotator

The Jakarnotator is an annotation tool for creating your own database for instance segmentation problems.

annotations computer-vision data database deep-learning detectron instance-segmentation mscoco training-data

Last synced: 15 May 2025

https://github.com/compositejs/datasense

A JavaScript library of observables, events, and advanced models.

bindings data data-flows event observables tasks

Last synced: 10 Mar 2026

https://github.com/thoughtspot/visual-embed-sdk

Customizable analytics components for your app, powered by ThoughtSpot's AI

ai data web-embed

Last synced: 18 Jan 2026

https://github.com/ai-readi/fairhub-app

Web platform for easily managing, curating, and sharing FAIR and AI-ready clinical and biomedical research data

biomedical cloud curation data sharing

Last synced: 03 Apr 2026

https://github.com/simeononsecurity/track-helium-mobile-wifi

A collection of scripts and tools that track the availability of Helium Mobile Wi-Fi networks in the wild using the Wigle dataset and the Helium API. Updates every 24 hours.

automation bigdata carrieroffload data data-analysis dataset dataset-generation helium openroaming passpoint python

Last synced: 23 Apr 2025

https://github.com/marcusschiesser/vectorstores

Vectorstores is a framework for using vector databases in your AI applications

ai ai-agents data database embeddings vector vector-database

Last synced: 13 Jan 2026

https://github.com/jujuadams/extendingjson

Human-writeable JSON-like data formats for GameMaker

data gamemaker gamemaker-s gms2 json yaml

Last synced: 15 Apr 2025