An open API service indexing awesome lists of open source software.

data

Individual facts, statistics, or items of information, often numeric. In a technical sense, data are a set of values of qualitative or quantitative variables about one or more persons or objects. (https://en.wikipedia.org/w/index.php?title=Data&oldid=1093674723, released under CC BY-SA 3.0)

https://github.com/arcticsnow/climatepy

Collection of tools to perform timeseries analysis on climate data (Observation and Downscaled)

climate data era5 meteorological-data noaa-data pandas timeseries weather wmo xarray

Last synced: 05 Feb 2026

https://github.com/sdhutchins/jxn-open-data-api

Access Jackson, MS open government data using a Python API wrapper.

api data jackson jxn mississippi open-gov

Last synced: 08 Apr 2025

https://github.com/sanand0/texas-deathrow

Texas death row inmate data

data

Last synced: 04 Sep 2025

https://github.com/andrew-johnson-4/misspeller

Take correctly spelled words and return common spelling mistakes

common-mistakes data language natural nlp processing rust

Last synced: 30 Apr 2025

https://github.com/andrewrporter/my-analytics

Analyzes Firefox browsing history with modern Python 3 features and libraries

analytics data firefox matplotlib python python3 sqlite3

Last synced: 28 Apr 2026

https://github.com/utrechtuniversity/dataprivacyproject

This is the repository underlying the landing page for the Data Privacy Project @UtrechtUniversity, the Netherlands.

data gdpr open-science privacy rdm research research-data-management utrecht-university

Last synced: 10 Oct 2025

https://github.com/marek-jakub/monitoring

A university project concerning field data management for bird ringers.

bird data fieldwork management ringing

Last synced: 24 Jun 2026

https://github.com/14richa/patient-readmission-analysis

This project focuses on predictive modeling to foresee hospital readmissions of diabetic patients within 30 days post-discharge. By leveraging a dataset spanning a decade (1999-2008) and covering records from 130 US hospitals, the aim is to enhance healthcare management and patient outcomes.

analytics data jupyter-notebook numpy

Last synced: 29 Apr 2026

https://github.com/asidlo/po

Data science library for manipulating data in Go using the familiar DataFrame and Series constructs from the Python Pandas library.

data dataframe go pandas series

Last synced: 14 Jan 2026

https://github.com/alexandregazagnes/unilasalle-public-resources

UniLaSalle-Public-Ressources: this public repository contains the notebooks and the data used for both 2nd Year - Practical Statistical Tests and 4th Year - Data Analysis with Python.

data data-analysis data-analytics data-cleaning data-storytelling education educational exploratory-data-analysis python python3 r r-programming rstudio statistics visualization

Last synced: 28 Apr 2026

https://github.com/wonderium/browser-feature-compatibility

This repository contains browser support details for HTML, CSS, JS and SVG features.

browsers compatability css data html js json releases support svg wonderium

Last synced: 27 Jan 2026

https://github.com/williamwutq/bllist

Durable, crash-safe, checksummed block-based linked list allocators stored in a single file

data data-storage data-structure database file-based linkedlist

Last synced: 25 Jun 2026

https://github.com/stdlib-js/datasets-harrison-boston-house-prices-corrected

A (corrected) dataset derived from information collected by the US Census Service concerning housing in Boston, Massachusetts (1978).

boston data dataset datasets house housing javascript linear-regression node node-js nodejs prediction prices statistics stats stdlib value

Last synced: 15 Feb 2026

https://github.com/neomutt/sample-data

📚 Lists of things. Useful for developing and testing.

data list sample

Last synced: 19 Mar 2026

https://github.com/m-rishab/stock_trend-analysis-power-bi-project-

In this project, I've harnessed the robust capabilities of Power BI to analyse, visualize, and uncover the story behind HUL's stock performance.

data datavisualization datavisualization-project powerbi

Last synced: 19 Mar 2026

https://github.com/stefanbohacek/exploring-the-mapping-police-violence-dataset

Using my Gutenberg Data Visualization plugin to explore police violence against civilians.

data dataviz police police-brutality police-misconduct

Last synced: 03 Dec 2025

https://github.com/stdlib-js/array-base-none-by-right

Test whether all elements in an array fail a test implemented by a predicate function, iterating from right to left.

all array data every generic javascript node node-js nodejs none predicate stdlib structure test types validate

Last synced: 01 Mar 2026
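
The check described in the array-base-none-by-right entry above can be sketched in a few lines. This is a minimal Python illustration of the described behaviour only, not the stdlib-js implementation or its API; the function name `none_by_right` and its signature are hypothetical.

```python
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")


def none_by_right(arr: Sequence[T], predicate: Callable[[T], bool]) -> bool:
    """Return True if no element passes `predicate`, scanning right to left.

    Hypothetical helper written for illustration; it stops at the first
    element (counted from the end) that passes the predicate.
    """
    for i in range(len(arr) - 1, -1, -1):
        if predicate(arr[i]):
            return False
    return True


def is_negative(x: float) -> bool:
    return x < 0


print(none_by_right([1, 2, 3], is_negative))   # True: no element is negative
print(none_by_right([1, -2, 3], is_negative))  # False: -2 passes the predicate
```

Scanning from the right only changes which passing element is found first; the boolean result is the same as for a left-to-right scan unless the predicate has side effects.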

https://github.com/skywardai/paper_gallery

Gallery of papers on applying LLM capabilities to datasets

ai data data-science llm medicine neural-network research security

Last synced: 19 Mar 2026

https://github.com/anthonybench/datapeek

Peek at a summary of a data file in a succinct, opinionated manner.

cli data data-analysis

Last synced: 02 Mar 2026

https://github.com/stdlib-js/array-base-every-by

Test whether all elements in an array pass a test implemented by a predicate function.

all array data every generic javascript node node-js nodejs predicate stdlib structure test types validate

Last synced: 03 Mar 2026

https://github.com/mg380/ibm-applied-data-science-capstone

This Capstone is the 10th (final) course in the IBM Data Science Professional Certificate specialization. It summarises, in the form of a project, all the material learned during the specialization.

capstone data data-analysis data-science datascience ibm machine-learning plotly python scikit-learn sql

Last synced: 05 Mar 2026

https://github.com/sadmanca/uoft-pey-coop-job-postings

Code for parsing approximately 1.8k HTML pages of UofT PEY co-op job postings (from September 2023 to May 2024) to a single sqlite3 database file.

co-op data html python singlefile sqlite sqlite3 uoft uoft-pey

Last synced: 17 Apr 2026

https://github.com/gallo13/neuralnetworks-deeplearning-stats-classification

Descriptive Statistics, Classification and Analysis Using Python & Python Libraries (Assignment 1)

analysis data datasets deep-learning jupyter-notebook matplotlib neural-networks numpy pandas plotting python seaborn

Last synced: 17 Apr 2026

https://github.com/sogful/archive

you will NEVER believe what this repository contains

archive crawl data scrapes sites

Last synced: 03 Jun 2026

https://github.com/timmymatten/spikeball-stat-tracker

Spikeball stat tracking web app built with Streamlit and Python, designed to easily log and analyze player performance over multiple games.

data data-analysis data-visualization dataset matplotlib-pyplot multipage python spikeball statistics streamlit

Last synced: 18 Apr 2026

https://github.com/shivam1808/data-cleaning-project

We take raw housing data and transform it in SQL Server to make it more usable for analysis.

analysis data datacleaning sql sqlserver

Last synced: 29 May 2026

https://github.com/dataspoclab/dataspoc-lens

Virtual warehouse — SQL + Jupyter + AI over cloud Parquet via DuckDB

cli data data-engineering data-lake duckdb etl parquet python singer sql

Last synced: 20 Apr 2026

https://github.com/sinedied/htf-data

CLI tool to process Hadra Trance Festival database export into valid data for the app

cleaner cli data database hadra tool

Last synced: 20 Apr 2026

https://github.com/mishra-krishna/analysis-and-optimization-of-supply-chain-operations

Analyzed supply chain data to identify trends and key factors. Visualized sales, defect rates, lead times, and costs. Used a Decision Tree Regressor to find the top features impacting product costs and lead times.

data dataanalytics datavisualization supplychain supplychainanalytics

Last synced: 20 Apr 2026

https://github.com/cicerotcv/br-gen

A browser extension for generating Brazilian placeholder data.

chrome data extension generation hacktoberfest

Last synced: 21 Apr 2026

https://github.com/agahkarakuzu/datavis_edu

Presented in BrainHack School 2019-2020, QBIN SciComm 2021

binder dashboard data notebooks repo2docker visualization

Last synced: 01 Apr 2025

https://github.com/lmuffato/project-ting-trybe

Ting project - Trybe assessment project for Block 37: Data Structures II: Lists, Queues, and Stacks

data data-analysis python queue read-file stack trybe trybe-projects

Last synced: 12 Jun 2025

https://github.com/toransahu/metoffice

Data visualisation - MetOffice

data metoffice uk visualization weather

Last synced: 25 Mar 2025

https://github.com/stefen-taime/myubereats_datapipeline

Building a Modern Uber Eats Data Pipeline

airflow api data datawarehouse mongodb pipeline powerbi snowflake

Last synced: 22 Apr 2026

https://github.com/ofelipelucca/cdc-kafka-debezium-pipeline

A real-time event-driven social network API built with CDC (Change Data Capture), Kafka, Debezium, PostgreSQL and MongoDB implementing CQRS-style architecture with streaming data pipelines.

cdc data data-engineering data-integration data-pipeline debezium event-driven fastapi kafka kafka-connect microservices mongodb postgresql python sqlalchemy

Last synced: 05 Jun 2026

https://github.com/howtoquitvivek/ai-crop-yeild-prediction

AI-driven crop yield prediction and agricultural optimization system (SIH 2025)

2025 2026 ai crop-yeild data minor-project ml predcition python science sih

Last synced: 23 Apr 2026

https://github.com/mattqdev/koalaz

Why not use koalas as mock data? With this npm package you can!

data koala lorem-ipsum meme mock placeholder

Last synced: 13 Jan 2026

https://github.com/yord/klp-core

A plugin with basic operations for klp (Kelpie), the small, fast, and magical command-line data processor.

csv data deserializer dsv json kelpie klp marshaller parser serializer ssv tsv

Last synced: 24 Apr 2026

https://github.com/zalweny26/open_data_unipa

Project for the Algorithms Laboratory exam 23-24, UniPa, Computer Science L-31

data open project python

Last synced: 26 Apr 2026

https://github.com/aero-db/airports

A public and free dataset of all airports in the world

airports aviation csv data dataset json

Last synced: 27 Apr 2026

https://github.com/aidenellis/connectmp

🍰 ConnectMP - An easy way to share data between processes in Python.

aidenellis connectmp data data-sharing multiprocessing process sharing

Last synced: 27 Apr 2026

https://github.com/mbolam/DSWS_OpenRefine

Cleaning and Linking Data with OpenRefine

cleaning data metadata openrefine

Last synced: 07 Apr 2025

https://github.com/rrwen/twitter2pg-cli

Command line tool for extracting Twitter data to PostgreSQL databases

api cli cmd command data database geo interface line location media pg postgres postgresql rest social stream tool tweet twitter

Last synced: 12 Apr 2026

https://github.com/nightroman/farnet.fsharp.data

FSharp.Data package for FarNet.FSharpFar

data farmanager farnet fsharp

Last synced: 27 Apr 2026

https://github.com/bolajiolayinka/graph-api-automation

End-to-end automation from Facebook Business to data visualization of campaigns

data data-science

Last synced: 07 May 2025

https://github.com/jtpio/data-playground

Experiments using public APIs and data

data experiments python

Last synced: 28 Apr 2026

https://github.com/saulojoab/crato-ce-json

In this repository I will store all the neighborhoods (and more information, in the future) of Crato-CE in JSON.

data database geolocation json json-api localization

Last synced: 28 Apr 2026

https://github.com/ahmetcansolak/developer-insights

New project of ClubRockers from Sarıyer Hills

bitbucket data data-science data-visualization github python3

Last synced: 28 Apr 2026

https://github.com/thiagopanini/datadelivery

An open source Terraform module that provides a complete infrastructure toolkit so that users can begin their journeys exploring Analytics services on AWS.

analytics athena aws catalog crawler data datamesh glue s3 terraform

Last synced: 29 Nov 2025

https://github.com/jackosheadev/databasetechproject

This is a repo for a database project which involves creating tables, populating them, viewing data with selects and finally simulating a transaction

data database mssql sql

Last synced: 18 May 2026

https://github.com/reubano/ckanny

A Python command line interface (CLI) for interacting with CKAN instances

ckan cli data featured open-data

Last synced: 28 Apr 2026

https://github.com/the-aerospace-corporation/pivt

PIVT is an analytics tool to help software development teams visualize the life cycle and behavior of their software factory.

analytics dashboards data devops jenkins pipeline python splunk visualization

Last synced: 29 Apr 2026

https://github.com/xpotify/scraper

Scraper designed for Xpotify's client to gather information from websites🌟

axios cheerio data javascript scraper webscraper

Last synced: 07 Jul 2025

https://github.com/cainmi/data-page-project

A repository to pull code and files from; it may be used to store page data links, code, etc. Mainly used for Python for now.

data html javascript python schema

Last synced: 21 Oct 2025

https://github.com/eve-ning/osumania_data

Processed osu!mania data from the osu!API

data osu rhythm-game vsrg

Last synced: 24 Feb 2026

https://github.com/aidanjuma/ankideckextractor

A CLI tool written in Python that extracts Anki flashcard decks (.apkg) into separate JSON notes and media files. Perfect for developers building custom learning applications or repurposing Anki content programmatically.

anki apkg cli data decompression extraction flashcards learning python zip

Last synced: 29 Apr 2026

https://github.com/sodascience/open_supply_hub

Processing supply chain data obtained from Open Supply Hub

data global-supply-chain open-supply-hub python

Last synced: 29 Apr 2026

https://github.com/v-mayya/python-sales-data-analysis

Group project with another team member, held by CFG, to analyze spreadsheets of fake sales data using Python

analysis data matplotlib numpy python

Last synced: 29 Apr 2026

https://github.com/himel-sarder/web-scraping-it-jobs-dataset

This project is a Python-based web scraping tool that collects job listings from TimesJobs for IT-related positions. It extracts job titles, company names, locations, and experience requirements, and saves the data into a CSV file. The tool uses BeautifulSoup and Pandas for web scraping and data manipulation.

data datascience dataset kaggle-dataset machine-learning machinelearning ml web-scraping

Last synced: 22 Feb 2026

https://github.com/dalikewara/typego

typego provides a custom type that can be used to construct information (such as success data, error data, etc.)

custom data golang helper type typego

Last synced: 09 Apr 2025

https://github.com/chrnthnkmutt/theartofstatistic_python

This repository implements Python visualizations based on David Spiegelhalter's book The Art of Statistics

data data-science data-visualization machine-learning statistics

Last synced: 08 Jun 2026

https://github.com/kevinsames/microsoft-fabric-data-platform-template

A GitHub starter repository for building modern Data Engineering, ML, and AI solutions on Microsoft Fabric. Includes medallion architecture (Bronze → Silver → Gold), Spark Notebooks, dbt, MLflow, GitHub Actions CI/CD, and arc42-based documentation.

data dbt fabric microsoft python spark

Last synced: 29 Apr 2026

https://github.com/yasenstar/powerbi_tutorial

Based on the "PowerBI Tutorial" book; provides step-by-step video demos for learning and mastering the Power BI tool

analytics data microsoft powerbi tutorial visualization

Last synced: 07 Jan 2026

https://github.com/chompfoods/stub-asp-net-core

ASP.NET Core server stub for the Chomp Food & Recipe Database API. Use our API to get high-quality data on recipes and 875,000+ branded/grocery foods plus raw ingredients.

api asp asp-net-core aspnetcore branded chomp data database food grocery ingredients nutrition raw recipe-api recipes server stub stub-server

Last synced: 30 Apr 2026

https://github.com/bukalapak/bukadata

Data supplier plugin for populating design with real data.

data plugin sketch sketch-plugin

Last synced: 05 Jul 2025

https://github.com/bilalmehrban/data-log-monitor

A simple yet elegant desktop C# application based on a 3-tier architecture, designed for viewing the logs stored in the database using NLog or other logging frameworks.

csharp data desktop-app logging

Last synced: 14 Mar 2025

https://github.com/alrza2003/alrza2003.github.io

This repository contains the source files for my personal portfolio website. It highlights my background as a data analyst and radiology student, and showcases real-world projects, tools I use, and ways to connect with me. The site is based on a pre-built template that I customized to reflect my profile and experience.

data data-analysis data-visualization portfolio portfolio-website python

Last synced: 30 Apr 2026

https://github.com/jigyasag18/gold-price-prediction-project-using-machine-learning

This repository contains a machine learning project focused on predicting gold prices (GLD) using historical stock market data, including indicators such as SPX, USO, SLV, and EUR/USD. The project implements a Random Forest Regressor for accurate price forecasting, complete with data visualization, correlation analysis, and model evaluation metrics

data dataset jupyter-notebook jupyter-notebooks machine-learning machinelearing machinelearningalgorithms machinelearningmodel machinelearningprojects matplotlib mlproject numpy pandas randomforestregressor seaborn

Last synced: 23 Jul 2025

https://github.com/scarblase/salary-comparison

Submission for the DataCamp Salary Competition (1 level). 🏆

data data-analysis data-science data-visualization engineering python sql structured-data

Last synced: 01 May 2026

https://github.com/lucien-loua/libgn

Manipulate geographical and administrative data about Guinea.

data guinea

Last synced: 08 Jun 2026

https://github.com/stdlib-js/array-zero-to-like

Generate a linearly spaced numeric array whose elements increment by 1 starting from zero and having the same length and data type as a provided input array.

array data float32array float64array int16array int32array javascript matrix ndarray node node-js nodejs stdlib structure typed typed-array types uint32array vector

Last synced: 07 Jan 2026
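
As a rough illustration of what the array-zero-to-like entry above describes, the sketch below builds a zero-based, step-1 sequence matching the length and element type of an input array. It uses Python's standard `array` module as a stand-in for typed arrays; the name `zero_to_like` is hypothetical and this is not the stdlib-js API.

```python
from array import array


def zero_to_like(x: array) -> array:
    """Return a new array with the same typecode and length as `x`,
    filled with 0, 1, 2, ... (hypothetical helper for illustration).

    Assumes a numeric typecode whose range can hold len(x) - 1.
    """
    return array(x.typecode, range(len(x)))


src = array("d", [5.5, 6.5, 7.5])
out = zero_to_like(src)
print(out)  # array('d', [0.0, 1.0, 2.0])
```

The input values are ignored; only the length and typecode of the input are carried over to the result.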

https://github.com/gdhhgnbnvbn/f1-2025-ai-predict

Fully generated by Claude 3.5 Sonnet via Windsurf IDE. Not a single line written.

agent-based-modeling claude csv data f1 gpt machine-learning model prediction predictive-modeling python rainforest streamlit vibe

Last synced: 01 May 2026

https://github.com/athari22/house_sales_in_king_count_usa

The idea of the project is to perform a data analysis for a Real Estate Investment Trust. The Trust would like to start investing in residential real estate.

analysis data data-science data-visualization ibm ibm-watson linearregression machine-learning matplotlib numpy pandas sklearn-library

Last synced: 01 May 2026

https://github.com/ggeop/multiple-fields-management

Field management from/to different data sources. 💡

data data-engineering data-organization data-retrieval data-science pandas python

Last synced: 01 May 2026

https://github.com/danielgiljam/orbit-utils

A collection of utility packages for Orbit.js.

data inference orbit orbitjs schema synchronization type typescript validation zod

Last synced: 01 May 2026

https://github.com/kuro337/scalamono

Scala Monorepo Tooling for Kafka, Opensearch, Spark, Redpanda, Hadoop - and Lang Reference.

data database duckdb hadoop kafka redpanda sdala spark

Last synced: 13 Apr 2026

https://github.com/nodef/infoods

Kit for International Network of Food Data Systems (INFOODS).

component data food identifier infoods international network systems tagnames

Last synced: 11 Mar 2026

https://github.com/rbruinier/mysqlbulkimportbenchmark

Benchmarking some methods to import big data sets into MySQL tables

benchmark data database mysql php

Last synced: 02 May 2026