An open API service indexing awesome lists of open source software.

data

Individual facts, statistics, or items of information, often numeric. In a technical sense, data are a set of values of qualitative or quantitative variables about one or more persons or objects. (https://en.wikipedia.org/w/index.php?title=Data&oldid=1093674723, released under CC BY-SA 3.0)

https://github.com/giscience/measures-rest-oshdb-docker

Scripts for starting measures for geospatial datasets in a Docker container, using the OSHDB

data dggs docker geospatial mesure openstreetmap rest

Last synced: 18 Apr 2026

https://github.com/speakeasy-sdks/fivetran-python-sdk

Python SDK for accessing Fivetran API.

api connector data fivetran fivetran-connector python sdk

Last synced: 01 Jul 2025

https://github.com/denko5/sales-analysis

A complete SQL-based sales analysis project covering Africa, showcasing data cleaning, exploratory analysis, insights, and lessons learned. The project highlights sales trends, regional performances, and marketing effectiveness across multiple platforms.

africa data data-analysis data-science exploratory-data-analysis insights kenya sales sql

Last synced: 24 Jan 2026

https://github.com/DataHerb/dataherb-flora

DataHerb Flora: The core of DataHerb

data data-mining data-science datascience dataset datasets

Last synced: 08 May 2025

https://github.com/mheadd/SamDotNet

:office: A C# wrapper for the SAM.gov API.

api business client data gov-api government

Last synced: 30 Apr 2025

https://github.com/pbinkley/tweets-libraries-covid19

A twarc harvest of tweets related to libraries during the COVID-19 outbreak, starting 2020-03-02

data social

Last synced: 06 Mar 2026

https://github.com/bacross/datamunger

Python package for handling NaNs and outliers

data data-frame datamunger knn nan outliers python scikit-learn

Last synced: 17 May 2026

https://github.com/hughrawlinson/github-data-scripts

Scripts to grab data about repos of interest to compare

data github-graphql github-repo-organizer graphql scripts typescript

Last synced: 09 Jul 2025

https://github.com/12joan/not-analytics

don't be creepy.

data metrics privacy

Last synced: 30 Apr 2025

https://github.com/0xleif/onionstash

Store Onions 🧅

data swift

Last synced: 05 Apr 2025

https://github.com/avto-dev/data-migrations-laravel

Package for database data migrations

data database laravel migrations package

Last synced: 12 Jul 2025

https://github.com/flownrecords/flightTracker

A mobile app built to record essential flight data for post-flight review and debriefing.

aviation data gps tracking

Last synced: 23 Jun 2025

https://github.com/epogrebnyak/business-conditions-digest-2017

Replicate an illustration from Business Conditions Digest

data economics

Last synced: 22 Mar 2025

https://github.com/davorg/data-tree

Perl library for handling trees

data perl tree

Last synced: 02 Apr 2025

https://github.com/rafalwrzeszcz-wrzasqpl/pl.wrzasq.commons

General-purpose data structures and routines.

aws data data-structures library rust

Last synced: 10 Apr 2025

https://github.com/ushkinaz/cbn-data

Automated game data extraction and processing for Cataclysm: Bright Nights. Provides JSON mirrors, WebP asset conversion, and unified translation data.

cataclysm-bn data wiki

Last synced: 07 Mar 2026

https://github.com/sambacha/yearn-finance-data

data repo for proposed YIP-DATA

cryptocurrency data erc20 ethereum exchange yearn yip yyip

Last synced: 18 May 2026

https://github.com/amazingtest/data4test

Test data construction generator; you can get useful data here for software testing

data test-automation testdata testdatabuilder testing testing-tools

Last synced: 16 Jan 2026

https://github.com/jvrck/australianpayphones

Get Australian payphone data in GeoJSON format.

australia data geojson geojson-data scraper

Last synced: 04 Apr 2025

https://github.com/coral/ddp

Distributed Display Protocol (DDP) in Go

data ddp distributed golang led pixel protocol wled

Last synced: 26 Jun 2025

https://github.com/oya163/corteva

Corteva Data Ingestion Pipeline

corteva data engineering etl

Last synced: 25 Jul 2025

https://github.com/reiiyuki/once-data-manager

Once Data Manager is a temporary data management utility kit for Unity.

data manager playerprefs preference scene temporary unity

Last synced: 17 May 2026

https://github.com/passidel/weedmap

Cannabis consumption ban

cannabis data map visual

Last synced: 14 Mar 2025

https://github.com/antoineaugusti/antennes-free

History of Free Mobile cell towers under maintenance or out of service

data free-mobile free-mobile-operator mobile-networks

Last synced: 30 Jul 2025

https://github.com/swarchal/morar

Processing phenotypic screening data

biology data data-analysis drug-discovery hts phenotypic

Last synced: 19 Jun 2025

https://github.com/wamphlett/smart-data-objects

An easy solution for capturing and validating data into usable DTOs

data dto forms php php7 validation

Last synced: 17 May 2026

https://github.com/evoluteur/madeleinology

Playing with data science by taking a look at the proportions of flour, sugar, butter, and eggs in 147 Madeleine recipes (the traditional French sponge cake).

baking cake cooking cooking-recipes data data-science data-visualization dessert exploratory-analysis exploratory-data-analysis exploratory-data-visualizations food histogram longtail madeleine recipe visualization

Last synced: 23 Jun 2025

https://github.com/prioritizr/prioritizrdata

Conservation planning data sets

data r spatial-data

Last synced: 19 Jul 2025

https://github.com/instafluff/acdb

Animal Crossing Database API

animal api crossing data database json open villagers

Last synced: 28 Apr 2026

https://github.com/camilajaviera91/dbt-transformations-sql-mock-data

This repository contains the transformations and documentation for the data model generated in sql-mock-data.

data dbt postgresql sql

Last synced: 02 Feb 2026

https://github.com/codenoid/storial.co-database

A Storial.co database, collected by Hofesh Bot (scraper)

data database

Last synced: 28 Mar 2025

https://github.com/katerynazakharova/common-ml

Creating this lib for ML tasks, because I'm bored of copy-pasting the same functions for different projects.

data data-processing deep-learning lib machi

Last synced: 26 Mar 2025

https://github.com/stdlib-js/array-base-count-same-value

Count the number of elements that are equal to a given value in an array.

array count countif data javascript node node-js nodejs same stdlib structure sum summation total types

Last synced: 21 Apr 2026

https://github.com/alpheustangs/jder

A standardized structure for JSON responses

api data error json response specification structure

Last synced: 26 Mar 2025

https://github.com/greatwoman23/market-basket-analysis

Unlock the power of data-driven sales optimization with Market Basket Analysis. Explore frequent itemsets and association rules to strategically enhance product placement, design targeted promotions, and adapt to seasonal trends. Elevate your business strategy with insights tailored for boosting sales and engaging customers effectively.

analysis analytics analytics-product data data-science jupyter medium-articles notebook-jupyter python

Last synced: 28 Apr 2026

https://github.com/sksubhadeep/nashville-housing-data-cleaning-project-using-sql

SQL Data Cleaning Project on Nashville Housing Dataset

data datacleaning sql

Last synced: 19 Mar 2026

https://github.com/williamzebrowski/assistant-api

OpenAI Assistant API integrated with Elasticsearch, Logstash & Kibana

ai chatapp chatgpt conversational-ai data elasticsearch kibana llm-inference llms openai rag

Last synced: 16 Feb 2026

https://github.com/denisecase/nw-network-data-analytics

Network for those earning a NW Masters of Applied Data Science

analytics data

Last synced: 02 Feb 2026

https://github.com/finnspartronics/orpheus

A tool for looking at FRC (FIRST Robotics Competition) scouting data

data first-robotics-competition scouting scouting-data spartronics

Last synced: 28 Mar 2025

https://github.com/openfoodfacts/openfoodfacts-corrector

Ruby script to correct and enhance data on OpenFoodFacts

correction data food ruby

Last synced: 24 Apr 2026

https://github.com/cainmi/easy-pull-from-repository

A repository to pull code and files from. It may be used to store page data, links, code, etc. Mainly used for Python for now.

data html javascript python schema

Last synced: 04 Apr 2025

https://github.com/fairspec/fairspec-extension

Fairspec Extension is a Git repository template for rapid Fairspec extension development

ckan csv data dataset excel fair json ods polars python quality schema sqlite tabl typescript validation zenodo

Last synced: 20 Jan 2026

https://github.com/puzzlef/graph-openmp

Design of a high-performance parallel graph interface supporting efficient dynamic batch updates.

data digraph directed graph in mtx openmp parallel structure undirected weighted

Last synced: 06 Apr 2025

https://github.com/priyanshubiswas-tech/ev-data-analysis-dashboard

An interactive dashboard analyzing EV trends, including total vehicles, BEV vs. PHEV breakdown, model popularity, state-wise distribution, and CAFV eligibility. Visualizes key insights for data-driven decisions in the EV industry. 📊

dashboard data data-analysis data-science data-visualization tableau tableau-public

Last synced: 17 Feb 2026

https://github.com/puzzlef/hybrid-csr

Comparing space usage of regular vs hybrid CSR.

csr data graph hybrid regular space structure usage

Last synced: 06 Apr 2025

https://github.com/cobluestars/dataherd-raika

Dataherd-Raika is a library designed to simulate large-scale user behavior datasets. It takes a single user event (like a click or keyword input) and, by applying simple probability distributions and custom variables, expands it into a vast dataset.

big-data data data-generation data-generator data-science front-end javascript machine-learning npm-package simulator statistics typescript user-behavior user-experience

Last synced: 02 Jan 2026

https://github.com/gbburleigh/quick-seeders

Generate realistic test data quickly with Quick-Seeders, a Python library offering a wide range of data types and schema definitions. Control data variance, probabilities, and output formats, including SQL. Simplify your data seeding process and improve testing efficiency.

data dataset faker generator python seeder sql test

Last synced: 03 Apr 2025

https://github.com/erictleung/2017-new-coder-survey

:beginner: Code to help clean and format the 2017 New Coder Survey by freeCodeCamp

coder-survey data data-cleaning dplyr freecodecamp

Last synced: 03 Apr 2025

https://github.com/rrwen/slides-covid19-geosocial-db

Presentation titled "A Real-time Geo-social Media Database for Large-scale Coronavirus Disease 2019 (COVID-19) Research" for my second research seminar at Ryerson University

covid covid-19 covid19 data database disease geo gis index media ncov-2019 ncov19 postgres postgresql presentation research seminar slides social virus

Last synced: 18 May 2026

https://github.com/anuraganalog/365-data-science

A repository containing lecture notes, exercises, and solutions

365 data exercises ipynb lecture notes pdfs python python3 science solutions sql

Last synced: 15 May 2026

https://github.com/marians/tour-tracker

Track the general classification development of the Tour de France, stage by stage

cycling data sports statistics

Last synced: 24 Jun 2025

https://github.com/jrdnbradford/google-sheet-color-sort

Google Sheet-bound script that assists with sorting Google Sheet rows by background fill color

data excel google-apps google-apps-script google-sheet google-sheets javascript microsoft-excel sort-rows

Last synced: 14 Apr 2025

https://github.com/ubc-library-rc/ggplot2_intro_workshop

Workshop about data visualization with ggplot2 in R

data featured workshop

Last synced: 01 Jul 2026

https://github.com/prdktntwcklr/weatherman

A simple web app displaying environmental data from an SQLite database.

dashboard data flask sensor sqlite

Last synced: 19 May 2026

https://github.com/ubc-library-rc/intro_to_tidyverse

Introductory workshop about the tidyverse package

data workshop

Last synced: 01 Jul 2026

https://github.com/bredalis/seaborn

📊 Library to create graphics 📊

data graphics-programming librery python seaborn seaborn-plots

Last synced: 04 Mar 2025

https://github.com/jimbrig/jimstaskviews

CRAN Task Views and Shiny App https://jimstaskviews.jimbrig.com

cran data docs rstats shiny-app submodules task-views

Last synced: 06 Mar 2026

https://github.com/chompfoods/sdk-typescript-fetch

Fetch TypeScript SDK for the Chomp Food & Recipe Database API. Use our API to get high-quality data on recipes and 875,000+ branded/grocery foods plus raw ingredients.

api branded chomp data database fetch food grocery ingredients nutrition raw recipe-api recipes sdk typescript

Last synced: 03 May 2026

https://github.com/aruneshbasak/python-dsa-problems-geeksforgeeks-160-days

I will post my daily Python DSA problems solved on GeeksforGeeks here!

algorithms-and-data-structures and data data-structures dsa python python3 structure

Last synced: 08 May 2025

https://github.com/abhaysingh71/india-censes-data-analysis

This repo is an India census data analysis across many domains

data data-science data-visualization dataanalysis streamlit

Last synced: 15 May 2026

https://github.com/shgysk8zer0/schema

A PHP implementation of schema.org structured data objects

data microdata schema seo structured-data

Last synced: 24 Jun 2025

https://github.com/dostuffthatmatters/circadian-scp-upload

Resumable, interruptible, SCP upload client for any files or directories generated day by day

checksum daily data directories files library python scp ssh synchronization time-series upload utilities

Last synced: 24 Jun 2025

https://github.com/alexandregazagnes/ghisa

ghisa - GitHub Import Statistic Analyzer is a free and open-source app and Python package that helps you analyze the import statistics of your GitHub repositories.

analytics data dependencies git github github-api import package pypi python skills tool

Last synced: 27 Jun 2025

https://github.com/adrian-pasek-prv/data-modeling-with-cassandra

Create a data model in Apache Cassandra for a music streaming app

apache-cassandra data data-engineering data-modeling python

Last synced: 02 Jan 2026

https://github.com/muhammad-fiaz/ason

ASON: Adaptive Structured Object Notation - Python library for dynamic data serialization, providing flexibility and simplicity.

adaptive-structure-object-notation api ason cli client data file file-format file-sharing file-upload json json-data json-parser open-source opensource parser parsing python python3

Last synced: 02 Feb 2026

https://github.com/ate329/nsl-kdd-feature-extractor

Python-based tool designed to process network traffic packets and extract features compliant with the NSL-KDD dataset format.

cyber-security cybersecurity data data-science extractor feature-extraction machine-learning network-analysis nsl-kdd nsl-kdd-dataset

Last synced: 30 Oct 2025

https://github.com/glassflow/pipelines-push-action

This GitHub Action lets you automate GlassFlow pipeline deployments as code

data data-processing datastreaming deployment github-actions glassflow python real-time stream-processing

Last synced: 19 May 2026

https://github.com/shysolocup/stews

Stews is a Node.js package meant to make storing data easier by mixing parts from common data types.

aepl array arrays data datatypes html javascript js json map maps nodejs object objects package set sets stews

Last synced: 25 Jul 2025

https://github.com/soulyma/web_crawler

A focused web crawler to extract and structure Arabic content from web pages. Designed for researchers, data analysts, and developers working on Arabic language datasets.

beautifulsoup4 crawler csv data json python structured-data

Last synced: 15 May 2026

https://github.com/beangreen247/osfetch-old.sh

Script that fetches system information and displays it to the user

247 bash bean beangreen247 data fetch green information neofetch neofetch-clone os script sh shell storage system tem zsh

Last synced: 02 Nov 2025

https://github.com/lisakey/convert-csv-to-sav

We used Python 🐍 to convert a CSV file into a SAV file with all the modifications needed to open it in IBM SPSS and analyze our data.

analysis chardet convert csv data databases ibm os pandas pyreadstat python sav spss sys transformations

Last synced: 08 May 2026

https://github.com/ibz-04/data-encryption

Encrypting and decrypting hospital patient data such as audio and image files

data decryption encryption

Last synced: 23 Jul 2025

https://github.com/mustika-putri-m/-tableu-laporan-data-karyawan-growian

I am currently pursuing a data analysis certification at GROWIA, where I've learned to use tools such as Python, SQL, Google BigQuery, Google Data Studio, Advanced Microsoft Excel, and Tableau. This course has enhanced my ability to analyze data using KPIs and business metrics, enabling me to solve business problems more effectively.

data data-visualization tableau

Last synced: 17 Feb 2026

https://github.com/mundra-ankur/msw_ai_pipeline

Municipal solid waste (MSW) characterization: an AI and data pipeline that characterizes solid waste in real time into different buckets using YOLO

artificial-intelligence data datapipeline solid-waste-segregation yolo

Last synced: 11 Apr 2025

https://github.com/incubrain/awesome-maharashtra-data

A collection of datasets specific to Maharashtra, India. WIP

ai artificial-intelligence data data-analysis data-science datasets maharashtra marathi

Last synced: 23 May 2026

https://github.com/clabe45/kaz

Minimalistic local storage CLI

cli data minimalistic storage utility

Last synced: 17 Jul 2025

https://github.com/lmuffato/project-mysql-one-for-all-trybe

MySQL One For All project - graded Trybe project for Block 21: Database Normalization and Modeling

back-end data database database-modeling mysql mysqlworkbench query sql trybe-projects

Last synced: 08 May 2026

https://github.com/hoaihuongbk/lakeops

A modern data lake operations toolkit working with multiple table formats (Delta, Iceberg, Parquet) and engines (Spark, Polars) via the same APIs.

data data-operations dataengineering datalake

Last synced: 07 Mar 2026

https://github.com/cpanse/tartare

Raw file collection recorded on Thermo Fisher Scientific mass spectrometers for extended unit testing

bioconductor blob data r unittesting

Last synced: 03 Apr 2025

https://github.com/stdlib-js/ndarray-base-dtype-resolve-str

Return the data type string associated with a supported ndarray data type value.

array data dtype dtypes enum javascript multidimensional ndarray node node-js nodejs stdlib types util utilities utility utils

Last synced: 06 Mar 2026

https://github.com/sergkash7/fdc-facade

Facade for the FoodData Central API.

api center data food usda

Last synced: 15 May 2026

https://github.com/nafisalawalidris/elfeenah

Configuration files for my GitHub profile. Welcome to my GitHub profile! I'm Nafisa Lawal Idris, a passionate Data Scientist with a strong interest in blockchain technology. Explore my GitHub portfolio to delve into the exciting world where data science and blockchain converge.

artificial-intelligence bitcoin blockchain config data data-science-portfolio data-science-projects datascience datascientist deep-learning github-config machinelearning

Last synced: 11 Sep 2025

https://github.com/whatheheckisthis/pwc_project-

Successfully completed a PwC virtual case, advancing Power BI skills to address cybersecurity and cloud architecture requirements. Developed comprehensive dashboards that effectively communicated key performance indicators (KPIs), showcasing proficiency in data visualization and deliver

case-study data data-science dataanalytics databases datavisualization powerbi virtual

Last synced: 05 Apr 2025

https://github.com/panda-official/driftcli

CLI Client for Drift Platform

cli click command-line data

Last synced: 17 Feb 2026

https://github.com/wahyuwsslah/salary_prediction-aiml

Salary prediction using machine learning with 3 models: Linear Regression, Decision Tree, and Random Forest

ai analytics data data-science datascience machine-learning python python3

Last synced: 19 May 2026