An open API service indexing awesome lists of open source software.

data

Individual facts, statistics, or items of information, often numeric. In a technical sense, data are a set of values of qualitative or quantitative variables about one or more persons or objects. (https://en.wikipedia.org/w/index.php?title=Data&oldid=1093674723, released under CC BY-SA 3.0)

https://github.com/raigu/ordered-lists-sync

Library for synchronizing ordered data with the minimum of insert and delete operations. Suitable for large data sets in isolated environments.

data lists ordering sync syncrhonization update

Last synced: 12 Jan 2026

https://github.com/charliecm/meteorite-landings

Data visualization of meteorite landings on Earth.

astronomy d3 data data-visualization mapbox space visualization

Last synced: 18 Apr 2026

https://github.com/asuozzo/medicare-data-analysis

An analysis of Medicare Part D data in Vermont

data python

Last synced: 04 May 2026

https://github.com/mouneshgouda/learn_dsa

This repository explores fundamental data structures and their implementations. Learn how to organize and manipulate data efficiently for various programming tasks.

c data queue sorting-algorithms stack structured-data

Last synced: 30 Jul 2025

https://github.com/visenger/prada

Profiling Datasets

cleaning data dataset profiling

Last synced: 24 Aug 2025

https://github.com/derrickbaruga7/python-data-analysis

This project analyzes ORU’s off-season sewer usage using Python, with `pandas` for data handling, histograms and line plots for exploration, and a `scipy`-based model for prediction. Pearson’s correlation and visualizations help reveal key trends and relationships.

analytics data data-science visualization

Last synced: 31 Jul 2025

https://github.com/dannyben/datamix

DSL for manipulating tabular data

csv data data-analysis data-engineering gem ruby tabular-data

Last synced: 31 Jul 2025

https://github.com/flowsynx/plugin-postgresql

FlowSynx plugin that interfaces with PostgreSQL for CRUD operations. Supports JSONB, full-text search, and advanced query features.

data database flowsynx postgresql postgresql-database sql

Last synced: 09 May 2026

https://github.com/chandraprakash-bathula/keywords_prediction-machine-learning-integration

Keyword prediction model, built by: data cleaning, removing stopwords, constructing Word2vec, and advancing to TF-IDF weighted Word2vec.

algori artifici data machine-learning tf-idf weighted-word2vec word2vec

Last synced: 08 Nov 2025

https://github.com/tonykipkemboi/ens_subgraph_data

Query On-Chain Data from Subgraphs by The Graph Protocol using Python

data subgraphs thegraphprotocol web3

Last synced: 17 Sep 2025

https://github.com/stephaniehicks/flowsorted.blood.wgbs.blueprint

A Bioconductor ExperimentHub data package for flow sorted purified whole blood cell types measured using DNA methylation on WGBS platform from BLUEPRINT

bioconductor bioconductor-package bisulfite-sequencing blood data dna-methylation flowsort wgbs

Last synced: 25 Sep 2025

https://github.com/chalk-ai/roadmap

Chalk public roadmap

chalk data data-science mlops pipeline python

Last synced: 17 Jan 2026

https://github.com/simranjeet97/leetcode_practice

Practicing LeetCode problems for competitive programming

algorithms amazon coding competitive-programming data data-structures facebook google leetcode python

Last synced: 03 Aug 2025

https://github.com/theryston/db-mycro

A Node module providing a JSON database that saves data in a specific directory, similar to SQLite, but in JSON

base crud data database db db-mycro javascript json jsondatabase nodejs nosql typescript

Last synced: 09 Apr 2026

https://github.com/tiaanduplessis/country-currency-data

Data about currencies of countries

countries currencies data symbols

Last synced: 08 Aug 2025

https://github.com/dav009/bqt

Local unit tests for your BigQuery queries

bigquery bq data test unittest

Last synced: 11 Feb 2026

https://github.com/woctezuma/download-steam-screenshots-data

Data consisting of Steam screenshots.

data steam steam-api

Last synced: 19 Feb 2026

https://github.com/isaac-lal/english-arabic-dictionary

This is a dictionary website with a search feature that accepts a word in either English or Arabic and returns its translation in the other language.

data db javascript react web-development

Last synced: 09 Apr 2026

https://github.com/ddeutils/ddedocs

📖 Data Developer & Engineer Documents and Hands-On

blogs data data-engineering documents hands-on

Last synced: 08 Aug 2025

https://github.com/rubenhortas/python_examples

Examples of Python code and DSA (data structures and algorithms).

algorithm algorithms data dsa examples python python-3 python3 samples snippets structures

Last synced: 03 Oct 2025

https://github.com/tpgillam/teafiles.jl

Tea file support for Julia

data julia time-series

Last synced: 03 Oct 2025

https://github.com/semibran/img-data

Easily read from and write to ImageData instances

canvas data image img

Last synced: 11 Aug 2025

https://github.com/skygenesisenterprise/aether-calendar

Aether Calendar is a lightweight, open-source client built for privacy, speed, and seamless integration within the Aether Office ecosystem

applications calendar capacitorjs data javascript linux macos nextjs typescript windows

Last synced: 12 Apr 2026

https://github.com/shysolocup/fndt

JavaScript package allowing you to see function data like body and arguments from outside of the function

aepl data fndt functions javascript javascript-tools js js-function js-functions lightweight nodejs nodejs-modules package stews

Last synced: 30 Apr 2026

https://github.com/huspacy/huspacy-resources

Resources for building and evaluating huspacy

data huspacy

Last synced: 21 Mar 2025

https://github.com/rsc-labs/see-open-data

Shows www.dane.gov.pl in a user-friendly format. Generates Flourish data or other data visualizations.

data data-visualization flourish government poland

Last synced: 04 Apr 2025

https://github.com/eslamdyab21/apara-data-gui

Custom application for Apara's data wrangling scripts. Technologies used are Qt Designer and PyQt5 for the GUI, and pandas and NumPy for the data work.

csv data data-analysis data-wrangling gui pandas pyqt5-desktop-application qt5-gui

Last synced: 17 May 2026

https://github.com/jefking/copyblobs

Copies all files in a container to another container, in another storage account.

aci arm azcopy azure blob container copy data file files from instant move one-time simple storage sync template to transfer

Last synced: 27 Mar 2025

https://github.com/akesling/csvb

Have CSV? Use CSVB!

analytics csv data database

Last synced: 02 Feb 2026

https://github.com/edjoukou/human_resources

A data analysis project using MySQL Server database

analysis data mysql powerbi sql visualization

Last synced: 25 Sep 2025

https://github.com/eloyhere/semantic-java

Semantic-Java is a modern Java stream processing framework with zero dependencies, available as a Maven dependency. It elegantly blends the fluency of Java Streams, the laziness of JavaScript generators, and intelligent index-based control inspired by database indexing, making it well suited to time-series, event streams, and high-performance data pipelines.

data functional functional-programming java pipeline stream

Last synced: 07 Apr 2026

https://github.com/hackolade/yugabytedb-ysql

Hackolade (https://hackolade.com) plugin for the cloud-native Yugabyte database with the YSQL API

data data-modeling entity-relationship-diagram schema-design ysql yugabyte yugabytedb

Last synced: 30 Apr 2025

https://github.com/nushratjabenaurnima/cse_477_data_mining

A collection of labs, reports, Jupyter notebooks, and project outputs for the CSE 477 Data Mining course. This repository tracks my learning journey through data preprocessing, association rules, clustering, classification, and real-world data analysis with Python.

data data-analysis data-mining data-science google-colab-notebook jupyter-notebook machine-learning python python-3

Last synced: 09 Apr 2026

https://github.com/yourdataarchitect/french-realestate-data-pipeline

This repository contains a fully automated data pipeline built with Apache Airflow to extract, clean, analyze, and report real estate listings from Seloger. It pushes data to MongoDB, Elasticsearch, and Google Sheets, with real-time Slack alerts for monitoring.

airlfow data datanalysis datapipeline market-intelligence real-estate

Last synced: 31 Dec 2025

https://github.com/coderooz/hr-dashboard

The goal of this project is to create a Power BI dashboard to showcase the attrition data within the company.

data data-analytics power-bi

Last synced: 07 Jan 2026

https://github.com/ranjeetj06/insighthub

InsightHub is a data analytics project that helps automate the entire process of preparing, analyzing, and reporting on CSV data.

analysis begineer data springboot

Last synced: 17 May 2026

https://github.com/ellisvalentiner/legislation-embeddings

Embeddings for U.S. Congress legislation

data embeddings machine-learning nlp python

Last synced: 12 Aug 2025

https://github.com/sharmadhiraj/plot-pi

Graphical Representation of PI

data data-visualization html javascript js mathematics plot

Last synced: 28 Mar 2025

https://github.com/pyrustic/jayson

Intuitive interaction with JSON files [DEPRECATED, check the project Shared]

data json pyrustic python

Last synced: 17 May 2026

https://github.com/plurid/datasign

Single Source of Truth Data Contract Specifier

data file-format

Last synced: 08 Nov 2025

https://github.com/fliplet/fliplet-widget-data-source-query

Data Source Query Provider

data provider widget

Last synced: 11 Apr 2025

https://github.com/boettiger-lab/taxadb-cache

Cache for taxadb files

data

Last synced: 19 May 2026

https://github.com/pulipulichen/pts-local-news-dataset

A dataset containing local news from Public Television Service.

data dataset

Last synced: 27 Mar 2026

https://github.com/hivesolutions/repos

Modular repository management system

data python repos storage system

Last synced: 14 May 2026

https://github.com/ciscorn/japanmesh-rs

A Rust library for handling Japanese Grid Square Code (JIS X 0410:2002 地域メッシュコード)

census data geospatial japan rust

Last synced: 11 Jan 2026

https://github.com/encoreshao/data-science

Data analysis examples using Jupyter Notebook and Python.

data dataanalysis encore jupyter-notebook

Last synced: 29 Mar 2025

https://github.com/aguven6/inmemory-data-processor

Converts tabular data to indexed columnar data. The aim is to process huge data sets faster, especially in aggregation operations.

columnar-storage data data-structures parallel-computing parallel-programming processing

Last synced: 17 May 2026

https://github.com/bhojpur/dlm

The Bhojpur DLM is a software-as-a-service product used for Data Lifecycle Management based on Bhojpur.NET Platform for data delivery.

data lifecycle-management

Last synced: 19 Feb 2026

https://github.com/pulgamecanica/d3examples

https://www.oreilly.com/library/view/d3-for-the/9781492046783/

d3 d3-visualization d3js d3v4 data javascript

Last synced: 19 May 2026

https://github.com/kameronbrooks/datalys2-reporting

Datalys2 Reports allows you to create rich, interactive reports by simply defining a JSON configuration embedded in your HTML. It handles the layout, data visualization, and interactivity, so you don't need to write custom React code for every report.

data data-visualization html react

Last synced: 08 Apr 2026

https://github.com/shivamsharma32/ipl-2022-analysis

The IPL 2022 Analysis project is a data-driven exploration of the Indian Premier League (IPL) 2022 cricket tournament. The analysis focuses on utilizing Python programming and various libraries to analyze and visualize the performance of teams, players, and key metrics in the IPL 2022 season.

data dataana dataanalytics datavi matplotlib python

Last synced: 17 May 2026

https://github.com/weecology/updating-data

Hugo website for instructions on how to make a regularly updating data pipeline

continuous-analysis continuous-integration data gh-actions living-data netlify travis-ci

Last synced: 17 Feb 2026

https://github.com/amethyst-php/post

A comment, a note, a post, a pseudo-chat. Can be really anything

amethyst amethyst-package api data laravel post

Last synced: 17 May 2026

https://github.com/shahules786/titanic-analysis

Different analyses of the Titanic accident (data from Kaggle)

analyze data titanic-kaggle

Last synced: 26 Jun 2025

https://github.com/kuanjiahong/covid19-analysis

A simple project to familiarize myself with data analysis

data data-science data-visualization pandas python

Last synced: 02 Apr 2025

https://github.com/jigyasag18/financial-risk-analysis-project

The Credit Card Financial Risk Analysis Dashboard is a real-time Power BI tool designed to provide insights into credit card transactions and customer demographics. It features interactive visualizations, efficient data processing, and actionable insights to support decision-making. Utilizing data from a SQL database, the dashboard tracks key metrics

data dataanalysis database datacleaning datapreprocessing dataprocessing datavisualization financial-analysis financialriskanalysis mysql powerbi sql statistical-analysis

Last synced: 06 Mar 2026

https://github.com/henryssondaniel/teacup-java-report-file

Report Teacup data to a file

data file logs reports teacup

Last synced: 22 Jul 2025

https://github.com/toofancodes/h1b-dashboard-insights

An interactive Tableau dashboard that visualizes H1B visa data from the USCIS Employer Data Hub, offering insights into application trends, top employers, and geographic distributions. Showcases advanced data visualization, analytics, and business intelligence skills.

analysis analytics business-intelligence dashboard data data-visualization h1b h1b-visa interactive-data tableau

Last synced: 20 Jan 2026

https://github.com/ericmaddox/nyc-crime-analytics

Analyzes and visualizes crime data from the NYC Police Department using interactive maps and heatmaps, leveraging the NYC Open Data API.

crime-analysis crimedata data datavisualization esri folium heatmap nycopendata python python3 rtcc

Last synced: 24 Jun 2025

https://github.com/adadalshabab/machine-predictive-maintenance-classification

This repository hosts a machine predictive maintenance classification project, aimed at predicting the maintenance needs of industrial machinery before they fail. By leveraging machine learning algorithms, this project seeks to enhance operational efficiency and reduce downtime by identifying potential maintenance requirements proactively.

data data-science datanalysis datanalytics machine-learning machine-learning-algorithms matplotlib-pyplot pandas

Last synced: 17 May 2026

https://github.com/plurid/delog

Cloud Service for Centralized Logging

cloud data logging

Last synced: 08 Nov 2025

https://github.com/antoninpvr/battery-logger

Simple scripts to record data from my laptop battery

bash-script battery data

Last synced: 17 May 2026

https://github.com/plurid/defocus

Apophatic User Content Resolution [Desearch Concept]

data

Last synced: 08 Nov 2025

https://github.com/basinghse/covid19simulator

Real Time Assessment and Simulation of COVID-19 - showing current numbers of cases, deaths and treated patients globally.

coronavirus covid-19 data real-time simulation visualisation visualisation-data-ingester

Last synced: 05 Apr 2025

https://github.com/hidayathamir/telegram-group-data

Data on 1,865,827 messages from a Telegram group: text, identity, datetime.

bahasa-indonesia data python3 scrape telegram telethon

Last synced: 17 May 2026

https://github.com/amarlearning/exploring-the-evolution-of-linux

Data analysis of the development of the Linux operating system by exploring its Git repository history.

cleaning-data data data-analysis data-wrangling datacamp first-commit git-history linux

Last synced: 12 May 2026

https://github.com/stdlib-js/array-base-fill-by

Fill all elements within a portion of an array according to a callback function.

accessor array data fill generic javascript map node node-js nodejs set stdlib structure transform typed types

Last synced: 14 May 2026

https://github.com/saksham-jain177/data-analysis

A collection of data analysis and machine learning projects across various datasets. Explore predictive modeling, data visualization, and insights from real-world data. Projects include sales predictions, disease detection, customer segmentation, and more.

api data data-analysis data-cleaning data-science data-visualization datamodeling dataset datasets exploratory-data-analysis python python3 web-scraping youtube-api

Last synced: 01 May 2026

https://github.com/eyluldursun/data-science-project

This project involves a data science analysis conducted on the Obesity Data Set. The study explores factors influencing obesity, includes data visualization, and develops predictive models. The goal of the project is to gain insights to help prevent obesity.

data data-science obesity r rmarkdown

Last synced: 26 Jun 2025

https://github.com/meta-llama/synthetic-data-kit

Tool for generating high-quality synthetic datasets

data generation llm python synthetic

Last synced: 08 May 2025

https://github.com/ericgio/history-of-jazz

Data and visualizations based on Ted Gioia's "The History of Jazz"

data jazz

Last synced: 28 Mar 2025

https://github.com/anct-cartographie-nationale/mednum-cli

✨ Command-line interface for transforming data on digital mediation locations, collected in a non-standard format, into the mednum schema and publishing it on data.gouv

anct betagouv data donnees gouvernement mediation-numerique nodejs open-data transformation

Last synced: 02 Aug 2025

https://github.com/robsteranium/user2022-ldf-talk

Slides from my useR! 2022 talk about the Linked-Data Frames package

data data-frame linked-data r rdf

Last synced: 19 Apr 2025

https://github.com/sumansuhag/wasserstoff-aiinterntask

Welcome to the AI Pipeline for Image Segmentation and Object Analysis project – a state-of-the-art solution designed to process, segment, identify, and analyze objects within images. This AI-powered pipeline is engineered to deliver precise insights by extracting, mapping, and summarizing data from each segmented object.

artificial-intelligence cdn data data-science modeling pipline

Last synced: 28 Mar 2025

https://github.com/sumansuhag/prediction_model

This repository features a collection of Jupyter notebooks designed to showcase the practical applications of machine learning, data preprocessing, feature engineering, and recommendation systems. These notebooks enable users to explore, analyze, and predict business events.

algotithms artificial-intelligence data logistic-regression machine-learning-algorithms science sckiit-learn

Last synced: 28 Mar 2025

https://github.com/thedevreda/jadaerospace

A real-life project showing how to improve aircraft parts sales and help sellers focus on more effective products at JadAero

data data-analysis data-cleaning data-visualization jupyter-notebook powerbi python

Last synced: 02 Aug 2025

https://github.com/reubano/pyconza-tutorial

Jupyter notebooks and data for "Data Mining and Processing for fun and profit" PyConZA16 tutorial

data functional-programming jupyter-notebook meza pycon python tutorial

Last synced: 17 May 2026

https://github.com/chompfoods/sdk-scala

Scala SDK for the Chomp Food & Recipe Database API. Use our API to get high-quality data on recipes and 875,000+ branded/grocery foods plus raw ingredients.

api branded chomp data database food grocery ingredients nutrition raw recipe-api recipes scala sdk

Last synced: 17 May 2026

https://github.com/namescode/hub_harvester

A Python script to gather data on a user's or organisation's Git repos

data github nix nix-flake python python3 sqlite

Last synced: 08 Apr 2026