An open API service indexing awesome lists of open source software.

data

Individual facts, statistics, or items of information, often numeric. In a technical sense, data are a set of values of qualitative or quantitative variables about one or more persons or objects. (https://en.wikipedia.org/w/index.php?title=Data&oldid=1093674723, released under CC BY-SA 3.0)

https://github.com/ahmad-ali-rafique/comment-generation-tool

This repository hosts a Jupyter Notebook-based Comment Generation Tool exploring advanced NLP techniques for automated, contextually relevant comment generation from input data. Ideal for developers and researchers in NLP and automated text generation.

ai aitools artificial-intelligence content-based-recommendation data datascience jupyter-notebook machine-learning

Last synced: 07 Oct 2025

https://github.com/patrickdavies100/datapipeline37

Some Data Science practice using datasets available online. Currently the test data is similar to this dataset: https://www.kaggle.com/datasets/asaniczka/amazon-uk-products-dataset-2023, but the plan is to expand.

data data-science pandas-dataframe python3

Last synced: 08 Oct 2025

https://github.com/scienxlab/datasets

Some small datasets for demos, courses, testing, etc.

data open-data sample-data teaching-resources

Last synced: 09 Oct 2025

https://github.com/east-empire-trading-company/eetc-data-client

Client library for retrieving data managed by EETC Data Hub.

client-library data data-science finance library python

Last synced: 31 May 2026

https://github.com/famarks/grafarg

Grafarg is an interactive data analytics and graphical data visualization application. A progressive fork of Grafana 7.5.17, Grafarg remains available under the open source Apache 2.0 License.

analytics charts data data-analysis data-science data-visualization grafana grafarg graph

Last synced: 19 Jan 2026

https://github.com/mccarthy-m-g/alda

An R data package for the book "Applied longitudinal data analysis: Modeling change and event occurrence" by Singer and Willett (2003).

data growth-curves longitudinal-data mixed-models nonlinear-mixed-models r r-package structural-equation-modeling survival-analysis time-to-event

Last synced: 19 Jan 2026

https://github.com/player29879/neum-ai

Neum AI is a best-in-class framework to manage the creation and synchronization of vector embeddings at large scale.

ai chatgpt data data-engineering database embeddings etl llm llmops mlops ops pipeline python rag retrieval vector-database vectors

Last synced: 18 Apr 2026

https://github.com/nnavales/desafios-data-engineer

In this project we tackle common challenges in the Data Engineer role using modern technologies.

data data-engineering database dataengineering docker minio scrapping spark

Last synced: 01 Jun 2026

https://github.com/open-i18n/data-iso-15924

Git mirror of the data for ISO 15924, Codes for the representation of names of scripts

data iso iso-15924 iso15924 open-i18n scripts unicode unicode-data writing-systems

Last synced: 14 Mar 2026

https://github.com/bishtrishu/pizza_sales_data_analysis_sql

This project is a comprehensive data analysis of pizza sales, aimed at uncovering key insights and trends to inform business decisions. Using a combination of SQL, Python, and data visualization tools, the project analyzes sales data to understand customer preferences, peak sales periods, and the most popular pizza types.

cloud data data-analysis data-science data-visualization dataanalytics database mysql oracle-database

Last synced: 14 Apr 2026

https://github.com/mscbuild/analysis

🎢 This collection of data analysis projects demonstrates techniques for extracting, transforming, analyzing, and visualizing data. Data Analytics Projects for Beginners 📈 ⚡

anallysis analysis chart csv dashboard data data-science data-science-projects excel google html5 mashine-learning portfolio pyton

Last synced: 19 Oct 2025

https://github.com/priyanshubiswas-tech/pwc-power-bi-task-1-2

Power BI dashboards analyzing Phonenow's call center performance and customer retention. Task 1 focuses on KPIs like satisfaction rating, call count, and agent efficiency. Task 2 analyzes retention trends and customer behavior to enhance loyalty. Built using Power BI, DAX, and Excel.

dashboard data data-analysis dax-measures excel powerbi powerbidashboard

Last synced: 23 Jan 2026

https://github.com/doziestar/datavinci

DataVinci enables you to visualize data from various sources, generate insights, analyze data with AI models, and receive real-time updates on anomalies

data golang logs pipeline

Last synced: 23 Jan 2026

https://github.com/capire/xtravels-java

Travel booking app using master data from xflights built with CAP Java

cap cds data federation flights java reuse

Last synced: 23 Jan 2026

https://github.com/aleenprd/docbt

Documentation Build Tool - Generate YAML documentation for dbt models with optional AI assistance. Built with Streamlit for an intuitive and familiar web interface.

ai analytics-engineering bigquery data data-modeling data-science dbt docker llm lmstudio ollama openai snowflake sql streamlit

Last synced: 11 Nov 2025

https://github.com/city-of-helsinki/drupal-helfi-tyollisyyspalvelut-manuaali

Manual for the service information repository of the local government pilots on employment

data drupal drupal-9 unemployment

Last synced: 24 Jan 2026

https://github.com/desktopcleaner/naturemagazinescraper

Scrapes open-access Nature magazine articles and stores them as txt files.

data nature-magazine python scrapper word-frequency

Last synced: 06 Feb 2026

https://github.com/simranjeet97/quotes-analysis

Analysis and visualization of a Kaggle quotes dataset with Python, pandas, and Matplotlib in a Jupyter Notebook.

data data-science datavisualization jupyter-notebook kaggle kaggle-dataset machine-learning matplotlib-pyplot numpy pandas python quotes quotes-application

Last synced: 15 Apr 2026

https://github.com/openearth/rws-viewer

This viewer was created by Deltares in cooperation with Voorhoede under the OpenEarth GPL License. It can be used via several RWS websites: https://www.informatiehuismarien.nl/, https://waterinfo-extra.rws.nl/ and https://basismonitoringwadden.waddenzee.nl/.

data mapbox-gl-js ogc-services viewer

Last synced: 01 Feb 2026

https://github.com/jeanmanguy/milk-sci-fi

Census of every mention of milk in sci-fi works.

data milk sci-fi

Last synced: 26 Feb 2026

https://github.com/3squared/smoulder

Smoulder is a really good data pipe

composition data facade-pattern forge-framework object-oriented

Last synced: 25 Apr 2026

https://github.com/stdlib-js/ndarray-base-fliplr

Return a view of an input ndarray in which the order of elements along the last dimension is reversed.

base data flip javascript matrix ndarray node node-js nodejs reverse slice stdlib structure types vector view

Last synced: 11 Feb 2026
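
The `ndarray-base-fliplr` entry above describes a reversed view rather than a reversed copy. A minimal sketch of that idea in plain Python follows; it is not the stdlib-js implementation, the `FlipLRView` class and its method names are hypothetical, and it assumes a two-dimensional list of lists.

```python
class FlipLRView:
    """Hypothetical view over a 2-D list with the last dimension reversed.

    No data is copied: column indices are remapped on every access.
    """

    def __init__(self, rows):
        self._rows = rows  # shared with the caller, not copied

    def get(self, i, j):
        row = self._rows[i]
        return row[len(row) - 1 - j]

    def set(self, i, j, value):
        row = self._rows[i]
        row[len(row) - 1 - j] = value

    def to_list(self):
        # Materialize the flipped contents as a new nested list.
        return [row[::-1] for row in self._rows]


data = [[1, 2, 3], [4, 5, 6]]
view = FlipLRView(data)
flipped = view.to_list()
view.set(0, 0, 99)  # writes through to data[0][2]
```

Because the view shares storage with `data`, the write through `view.set` is visible in the original list.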

https://github.com/tushar2704/applied-ai-playground

This repository serves as a comprehensive collection of resources and projects for Applied Artificial Intelligence (AI). Whether you're an AI enthusiast, a data scientist, or a developer looking to explore practical applications of AI, this repository aims to provide you with valuable materials and hands-on projects to deepen your understanding.

artificial-intelligence data data-science machine-learning machine-learning-algorithms

Last synced: 12 Feb 2026

https://github.com/tushard48/analyzing-usa-market-trends-a-financial-overview

In-depth analysis of US market trends, encompassing economic indicators, industry performance, and financial data

data data-visualization powerbi

Last synced: 19 Mar 2026

https://github.com/garcane/beverage-sales-analytics

This project provides an in-depth analysis of beverage sales and delivery across different states using Power BI.

data data-visualization powerbi powerbi-report powerbi-visuals

Last synced: 19 Mar 2026

https://github.com/nikhilash45/power-bi-vsualisation-of-joins

In this Power BI report, users can visualize joins themselves, which makes joins easy to understand.

business-analytics business-intelligence data data-analysis data-visualization joins powerbi sql visualization

Last synced: 19 Mar 2026

https://github.com/stdlib-js/datasets-harrison-boston-house-prices-corrected

A (corrected) dataset derived from information collected by the US Census Service concerning housing in Boston, Massachusetts (1978).

boston data dataset datasets house housing javascript linear-regression node node-js nodejs prediction prices statistics stats stdlib value

Last synced: 15 Feb 2026

https://github.com/ghonimo/diode-pn-junction-characterization-psu-ece515

A detailed analysis of the I-V characteristics of a PN junction diode (1N4148) under different temperatures, utilizing Excel for graphical analysis and parameter extraction. This study was conducted as part of the ECE 515: Fundamentals of Semiconductor Devices course at Portland State University.

analysis characterization data device diode diodes excel mosfet-transistor pn-junction

Last synced: 28 Feb 2026

https://github.com/stdlib-js/array-base-none-by-right

Test whether all elements in an array fail a test implemented by a predicate function, iterating from right to left.

all array data every generic javascript node node-js nodejs none predicate stdlib structure test types validate

Last synced: 01 Mar 2026
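
The behavior described for `array-base-none-by-right` above can be sketched in Python. This is an illustration only, not the stdlib-js code: the function name `none_by_right` and the `(value, index)` predicate signature are assumptions made for this example.

```python
def none_by_right(items, predicate):
    """Return True if every element fails `predicate`, scanning right to left."""
    for index in range(len(items) - 1, -1, -1):
        if predicate(items[index], index):
            return False  # stop at the first element that passes
    return True


def is_positive(value, index):
    return value > 0


all_fail = none_by_right([-1, -2, -3], is_positive)
some_pass = none_by_right([-1, 2, -3], is_positive)

# Record the visiting order to show the right-to-left iteration.
visited = []


def record_and_fail(value, index):
    visited.append(index)
    return False


none_by_right([10, 20, 30], record_and_fail)
```

The iteration direction only changes which element is tested first, which matters when the predicate has side effects or when an early exit is expected near the end of the array.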

https://github.com/docusign/extension-app-data-io-reference-implementation

Extension App for Data IO Reference Implementation for the Docusign IAM Platform

apps data extension

Last synced: 02 Mar 2026

https://github.com/agnosticeng/cli

Agnostic magic is now at your fingertips.

cli clickhouse data datalake datalakehouse

Last synced: 03 Mar 2026

https://github.com/droduit/grand-comics-database

EPFL course project to manage a huge database containing hundreds of millions of data entries, and to optimize the queries to create a smooth user interface experience.

big-data data database epfl sql

Last synced: 16 Apr 2026

https://github.com/denisecase/datakit-lite

Helpful utilities for Python data projects

analysis data education kit lite utils

Last synced: 04 Mar 2026

https://github.com/rousan/weshare

An application that transfers files between devices

c-sharp data dot-net file lan phone share transfer-data weshare wifi

Last synced: 17 Apr 2026

https://github.com/timmymatten/spikeball-stat-tracker

Spikeball stat tracking web app built with Streamlit and Python, designed to easily log and analyze player performance over multiple games.

data data-analysis data-visualization dataset matplotlib-pyplot multipage python spikeball statistics streamlit

Last synced: 18 Apr 2026

https://github.com/mrpudn/maltrends

(mirror) MyAnimeList.net manga and anime trend data.

anime data json jsonl jsonlines manga myanimelist

Last synced: 20 Apr 2026

https://github.com/jinsyin/dataorigin

The Source of Data | A data source management framework

data data-source datasource

Last synced: 21 Apr 2026

https://github.com/aravind-selvam/bikeshare-company-analysis

Capstone project for the Google Data Analytics Professional Certificate program, analyzing a bike-sharing company

analytics business-analytics business-intelligence data data-analysis data-visualization dataanalytics google-data-analytics postgresql sql sql-server

Last synced: 22 Apr 2026

https://github.com/ofelipelucca/cdc-kafka-debezium-pipeline

A real-time event-driven social network API built with CDC (Change Data Capture), Kafka, Debezium, PostgreSQL and MongoDB implementing CQRS-style architecture with streaming data pipelines.

cdc data data-engineering data-integration data-pipeline debezium event-driven fastapi kafka kafka-connect microservices mongodb postgresql python sqlalchemy

Last synced: 05 Jun 2026

https://github.com/andygol/osm-diff-state

CLI tool to search OSM diff state files

custom data openstreetmap planet replication

Last synced: 24 Apr 2026

https://github.com/zalweny26/open_data_unipa

Project for the Algorithms Laboratory exam 23-24, UniPa, Computer Science L-31

data open project python

Last synced: 26 Apr 2026

https://github.com/aero-db/airports

A public and free dataset of all airports in the world

airports aviation csv data dataset json

Last synced: 27 Apr 2026

https://github.com/saulojoab/crato-ce-json

In this repository I will store all the neighborhoods (and more information, in the future) of Crato-CE in JSON.

data database geolocation json json-api localization

Last synced: 28 Apr 2026

https://github.com/rdjarbeng/rdjarbeng

Richard Djarbeng's GitHub profile: computer engineer specializing in web development, machine learning, and IoT devices. New web posts have moved to the website below.

data jekyll machine-learning ruby website

Last synced: 28 Apr 2026

https://github.com/player29879/sketch

AI code-writing assistant that understands data content

ai codex data dataframe dats-science df ds gpt3 pandas python sketchs

Last synced: 28 Apr 2026

https://github.com/aidanjuma/ankideckextractor

A CLI tool written in Python that extracts Anki flashcard decks (.apkg) into separate JSON notes and media files. Perfect for developers building custom learning applications or repurposing Anki content programmatically.

anki apkg cli data decompression extraction flashcards learning python zip

Last synced: 29 Apr 2026

https://github.com/chrnthnkmutt/theartofstatistic_python

This repository implements Python visualizations based on David Spiegelhalter's book The Art of Statistics.

data data-science data-visualization machine-learning statistics

Last synced: 08 Jun 2026

https://github.com/kevinsames/microsoft-fabric-data-platform-template

A GitHub starter repository for building modern Data Engineering, ML, and AI solutions on Microsoft Fabric. Includes medallion architecture (Bronze → Silver → Gold), Spark Notebooks, dbt, MLflow, GitHub Actions CI/CD, and arc42-based documentation.

data dbt fabric microsoft python spark

Last synced: 29 Apr 2026

https://github.com/chompfoods/sdk-php

PHP SDK for the Chomp Food & Recipe Database API. Use our API to get high-quality data on recipes and 875,000+ branded/grocery foods plus raw ingredients.

api branded chomp data database food grocery ingredients php raw recipe-api recipes sdk

Last synced: 30 Apr 2026

https://github.com/athari22/house_sales_in_king_count_usa

The project performs a data analysis for a Real Estate Investment Trust that would like to start investing in residential real estate.

analysis data data-science data-visualization ibm ibm-watson linearregression machine-learning matplotlib numpy pandas sklearn-library

Last synced: 01 May 2026

https://github.com/dominhduy09/my-links

All the links and websites I have created, saved in one place

data database link linked-list linktree list save storage website

Last synced: 25 Jun 2026

https://github.com/rrwen/twitter2mongodb

Module for extracting Twitter data to MongoDB databases

api data database geo get location mdb media mongo mongod mongodb oauth post rest sample social stream token tweet twitter

Last synced: 06 May 2026

https://github.com/satur-io/estoraje

Estoraje is the simplest distributed system for key-value storage, in less than 800 lines of code. It is temporary consistent, highly available, lightweight, and scalable, and it performs well.

data database distributed go golang key-value performance training

Last synced: 07 May 2026

https://github.com/geo-y20/loan-approval-automation-using-mongodb-and-pymongo

This project demonstrates the implementation of a loan approval system that utilizes MongoDB for distributed data storage and management, and PyMongo for database operations. The project aims to automate the assessment of loan eligibility using customer details from online applications.

crud-application data data-analysis data-science data-visualization deployment jupyter-notebook loan-default-prediction loan-prediction-analysis machine-learning machine-learning-algorithms matplotlib mongodb pymongo streamlit web

Last synced: 08 May 2026

https://github.com/themuhd/world-cup-analysis

Analysis of the FIFA World Cup from its inception to the recently completed tournament in 2023

data data-science data-visualization dataanalysis matplotlib matplotlib-pyplot notebook python

Last synced: 08 May 2026

https://github.com/raynardj/r_notes

Notebooks for learning R

data docker guru99 jupyter learning r

Last synced: 09 May 2026

https://github.com/sathyasris27/data-analysis-on-adult-smoking-patterns-in-the-uk

The aim of this analysis is to understand the smoking patterns among adults in the UK.

data data-analysis data-visualization python3

Last synced: 09 May 2026

https://github.com/lmuffato/project-mysql-vocabulary-booster-trybe

MySQL Vocabulary Booster project - graded Trybe project for Block 20: SQL Functions, Joins, and Subqueries

back-end crud data database mysql mysqlworkbench query sql trybe-projects

Last synced: 10 May 2026

https://github.com/carlossilva2/pybase

An easy-to-use database built with Python and JSON

data database json python3 storage

Last synced: 11 May 2026

https://github.com/scarblase/russian-military-losses-analysis

This repository provides an in-depth analysis of Russian equipment losses using PySpark and data visualization techniques.

data data-science data-visualization jyputer-notebook matplotlib pyspark python3 seaborn seaborn-plots ukraine ukraine-invasion

Last synced: 12 May 2026

https://github.com/dmitriiweb/tr-data-getter

Tool to get market data from bitstamp.ne

cryptocurrency data

Last synced: 14 May 2026

https://github.com/iotchulindrarai/reactlearning

Learning React concepts such as passing data with useState and props, both from child to parent and from parent to child

data passing props react usestate-hook

Last synced: 14 May 2026

https://github.com/brandonhimpfen/data-size-parser

A tiny, practical parser for human-readable data sizes.

data data-size data-sizes npm open-source web-design web-development

Last synced: 12 Jun 2026
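
As an illustration of what a parser for human-readable data sizes (like the `data-size-parser` entry above) might do, here is a small Python sketch. It does not reflect that package's API. The function name `parse_size`, the accepted unit spellings, and the choice of decimal multipliers for KB/MB/GB/TB versus binary multipliers for KiB/MiB/GiB/TiB are all assumptions.

```python
import re

# Assumed unit table: decimal for "kb"-style units, binary for "kib"-style units.
_UNITS = {
    "b": 1,
    "kb": 1000, "mb": 1000**2, "gb": 1000**3, "tb": 1000**4,
    "kib": 1024, "mib": 1024**2, "gib": 1024**3, "tib": 1024**4,
}
_PATTERN = re.compile(r"^\s*(\d+(?:\.\d+)?)\s*([a-z]*)\s*$", re.IGNORECASE)


def parse_size(text):
    """Convert a string such as '1.5 MB' into a whole number of bytes."""
    match = _PATTERN.match(text)
    if match is None:
        raise ValueError(f"unrecognized size: {text!r}")
    number, unit = match.groups()
    unit = unit.lower() or "b"  # a bare number is treated as bytes
    if unit not in _UNITS:
        raise ValueError(f"unknown unit: {unit!r}")
    return round(float(number) * _UNITS[unit])


megabytes = parse_size("1.5 MB")
kibibytes = parse_size("2KiB")
plain = parse_size("512")
```

A real parser has to decide how to treat ambiguous units such as "KB"; this sketch picks one convention and rejects anything it does not recognize.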

https://github.com/erwan-simon/aws-serverless-notebook-platform

A self-hosted, serverless platform offering an intuitive UI to manage, schedule, and execute Jupyter notebooks on AWS.

aws data docker notebook python serverless terraform webapp

Last synced: 13 Jun 2026

https://github.com/mwmorale/storingencryptiondata

Welcome! Here, I am working with some very basic encryption. This is a work in progress and, for now, is only compatible with Windows. Using a password, a user can easily encrypt their “notes” file after writing it, then decrypt it later to view or edit the notes. This is hiding information in plain sight. Eventually, this project will be merged with my folder locker so that an encrypted file can be stored in a "locked" directory/folder. Avoid personal use, because I am releasing the encryption key and/or “cipher solution” in my code. To use it, run the file called “RUN_ME.py”.

cipher ciphertext data decryption encryption filesystem graphical gui gui-application notes privacy rotation-encryption secure security-tools user-interface whitehat

Last synced: 21 Jun 2026

https://github.com/cliffano/birthmap

Mapping birth places of groups of prominent people

birthmap data maps

Last synced: 22 Jun 2026

https://github.com/michalwols/awesome-data-curation

🗑️ ✨ 📊 Awesome things related to data collection, annotation, cleaning and management.

active-learning annotation cleaning-data data data-science deep-learning machine-learning

Last synced: 24 Jun 2026

https://github.com/matusf/glasgow_wifi

Script that plots Wi-Fi access points on a map and labels them by their protection

data data-visualization folium python python3

Last synced: 24 Jun 2026

https://github.com/williamwutq/mappedpages

A fixed-size page provider backed by memory mapping, intended for building higher-level allocators and storage systems

allocation allocator data data-storage database file memory-mapping mmap page rust rust-crate rust-library storage

Last synced: 25 Jun 2026
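
The `mappedpages` entry above is a Rust crate; the general idea of a fixed-size page provider backed by a memory-mapped file can be sketched with Python's standard `mmap` module. Everything here is an assumption for illustration: the `PageFile` class, its methods, and the 4096-byte page size are hypothetical and do not mirror the crate's API.

```python
import mmap
import os
import tempfile

PAGE_SIZE = 4096  # assumed page size for this sketch


class PageFile:
    """Hypothetical provider of fixed-size pages over a memory-mapped file."""

    def __init__(self, path, page_count):
        self.page_count = page_count
        self._file = open(path, "r+b")
        self._file.truncate(PAGE_SIZE * page_count)  # size the backing file
        self._map = mmap.mmap(self._file.fileno(), PAGE_SIZE * page_count)

    def _offset(self, index):
        if not 0 <= index < self.page_count:
            raise IndexError(f"page {index} out of range")
        return index * PAGE_SIZE

    def read_page(self, index):
        start = self._offset(index)
        return self._map[start:start + PAGE_SIZE]  # returns a bytes copy

    def write_page(self, index, data):
        if len(data) > PAGE_SIZE:
            raise ValueError("data does not fit in one page")
        start = self._offset(index)
        self._map[start:start + len(data)] = data

    def close(self):
        self._map.flush()
        self._map.close()
        self._file.close()


fd, path = tempfile.mkstemp()
os.close(fd)
pages = PageFile(path, page_count=4)
pages.write_page(2, b"hello")
first_bytes = pages.read_page(2)[:5]
untouched = pages.read_page(0)
pages.close()
file_size = os.path.getsize(path)
os.remove(path)
```

Higher-level allocators would typically sit on top of such a provider and track which page indices are free; that bookkeeping is left out here.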

https://github.com/diddypod/crop-data-converter

A Python script to convert crop data from .txt to .xlsx format

converter crop data openpyxl python

Last synced: 29 Jun 2026

https://github.com/anuveyatsu/cloudflare-data-fabric

Cloudflare Data Fabric: Use Cloudflare's global infrastructure to build a flexible, resilient framework for data solutions.

cloudflare data data-lake fabric lakehouse mesh

Last synced: 29 Jun 2026

https://github.com/ddeutils/ddedocs

📖 Data Developer & Engineer Documents and Hands-On

blogs data data-engineering documents hands-on

Last synced: 08 Aug 2025

https://github.com/alrza2003/alrza2003.github.io

This repository contains the source files for my personal portfolio website. It highlights my background as a data analyst and radiology student, and showcases real-world projects, tools I use, and ways to connect with me. The site is based on a pre-built template that I customized to reflect my profile and experience.

data data-analysis data-visualization portfolio portfolio-website python

Last synced: 30 Apr 2026

https://github.com/s-raza/csvio

Wrapper for conveniently processing CSV files

csv data file processing wrapper

Last synced: 14 Jan 2026

https://github.com/avto-dev/static-references-data

Data for static references

data references static

Last synced: 05 Oct 2025

https://github.com/chompfoods/stub-asp-net-core

ASP.NET Core server stub for the Chomp Food & Recipe Database API. Use our API to get high-quality data on recipes and 875,000+ branded/grocery foods plus raw ingredients.

api asp asp-net-core aspnetcore branded chomp data database food grocery ingredients nutrition raw recipe-api recipes server stub stub-server

Last synced: 30 Apr 2026

https://github.com/dixslyf/nbparts

Unpack a Jupyter notebook into its sources, outputs and metadata.

data haskell jupyter jupyter-notebook nix nix-flake

Last synced: 05 Oct 2025
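
The `nbparts` entry above is written in Haskell; the basic split it describes can be sketched in Python, since a Jupyter notebook is a JSON document. The function name `unpack_notebook` and the shape of its return value are assumptions for this example and say nothing about how nbparts stores its output.

```python
import json


def unpack_notebook(notebook):
    """Split a parsed notebook into sources, outputs, and metadata."""
    sources = []
    outputs = []
    metadata = {"notebook": notebook.get("metadata", {}), "cells": []}
    for cell in notebook.get("cells", []):
        source = cell.get("source", "")
        if isinstance(source, list):  # source may be a list of lines
            source = "".join(source)
        sources.append({"cell_type": cell.get("cell_type"), "source": source})
        outputs.append(cell.get("outputs", []))  # only code cells have outputs
        metadata["cells"].append(cell.get("metadata", {}))
    return sources, outputs, metadata


raw = json.dumps({
    "nbformat": 4,
    "nbformat_minor": 5,
    "metadata": {"kernelspec": {"name": "python3"}},
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": ["# Title\n", "text"]},
        {
            "cell_type": "code",
            "metadata": {"tags": ["demo"]},
            "source": "print(1)",
            "execution_count": 1,
            "outputs": [{"output_type": "stream", "name": "stdout", "text": ["1\n"]}],
        },
    ],
})
sources, outputs, metadata = unpack_notebook(json.loads(raw))
```

Keeping outputs separate from sources is what makes notebooks easier to diff and review, because outputs tend to change on every run.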

https://github.com/ginga1402/chinook_database

Microsoft SQL Server Management Studio

business-query data sql-server

Last synced: 30 Mar 2025