An open API service indexing awesome lists of open source software.

data

Individual facts, statistics, or items of information, often numeric. In a technical sense, data are a set of values of qualitative or quantitative variables about one or more persons or objects. (https://en.wikipedia.org/w/index.php?title=Data&oldid=1093674723, released under CC BY-SA 3.0)

https://github.com/saisriramkamineni/e-commerce-sales-analysis-excel-

Conducted an in-depth sales analysis for an e-commerce platform, leveraging Excel for data preprocessing and Power BI for visualization. Identified key sales trends, customer purchasing behavior, and revenue growth patterns to optimize business performance.

analysis analytics data excel sales

Last synced: 14 Feb 2026

https://github.com/grycap/cdmi-client-go

A basic Go library to perform CDMI core operations

cdmi cloud data go

Last synced: 21 Jan 2026

https://github.com/keanteng/nextjs-directory

🌐 A Draft Website For Data Catalogue Using Next.js

catalogue climate-change css data directory html javascript nextjs website

Last synced: 09 May 2026

https://github.com/cintia0528/data_science-ab_testing

Conduct a 5-way AB Test on Montana State University Library's website, comparing the original "Interact" button with new versions ("Learn," "Help," "Connect," "Services") to boost user engagement.

abtesting bonferroni chisquare-test data data-science datacleaning datavisualization hypothesis-testing mde statistics

Last synced: 31 Mar 2025
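
The entry above is tagged with `bonferroni`, which usually refers to dividing the significance level by the number of comparisons. Below is a minimal sketch of that correction, assuming each of the four new button labels is compared against the original. The function names and p-values are hypothetical and are not taken from the repository.

```python
from typing import List, Sequence


def bonferroni_alpha(alpha: float, num_comparisons: int) -> float:
    """Return the per-comparison significance level under Bonferroni."""
    if num_comparisons < 1:
        raise ValueError("num_comparisons must be at least 1")
    return alpha / num_comparisons


def significant(p_values: Sequence[float], alpha: float = 0.05) -> List[bool]:
    """Flag each p-value that stays significant after the correction."""
    threshold = bonferroni_alpha(alpha, len(p_values))
    return [p < threshold for p in p_values]


# Hypothetical p-values, one per variant-versus-original comparison.
example_p_values = [0.01, 0.02, 0.2, 0.0124]
flags = significant(example_p_values)
```

With four comparisons and an overall level of 0.05, each individual test has to clear 0.0125, so a p-value of 0.02 no longer counts as significant.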

https://github.com/spiceai/datasets

Spice AI curated dataset definitions for Spice.ai

ai bitcoin blockchain data ethereum polygon

Last synced: 20 Apr 2026

https://github.com/yeshunit/walmart-product-customer-sales-sql-analysis

This project explores Walmart sales data to understand the top-performing branches and products, the sales trends of different products, and customer behaviour. The aim is to study how sales strategies can be improved and optimized. The dataset was obtained from Kaggle.

data database mysql sql walmart

Last synced: 24 Feb 2026

https://github.com/bolajiolayinka/graph-api-automation

An End to End Automation from Facebook Business to Data Visualization of Campaigns

data data-science

Last synced: 07 May 2025

https://github.com/bastianolea/comisarias_chile

Database of the police stations (comisarías), outposts (retenes), substations (tenencias), and other facilities of Carabineros

chile data estado social

Last synced: 23 Jun 2025

https://github.com/garcane/british-airways-analysis

This project focuses on analyzing and visualising travel data from British Airways using Tableau. The goal is to extract insights and present them in an interactive and visually appealing manner.

data data-analysis data-visualization tableau

Last synced: 19 Mar 2026

https://github.com/kunalshelke90/predict-bank-credit-risk-using-south-german-credit-data

This is an end-to-end ML project that develops a classification model to assign a given customer profile to one of two risk categories (safe or not safe). The final classifier used for this project is the CatBoost classifier. Deployed on AWS.

aws cassandra catboost-classifier classification credit-risk data data-science dataanalysis dockerfile finance financial-analysis flask github-actions logging machine-learning mlflow numpy pandas python

Last synced: 03 Jan 2026

https://github.com/cdcgov/importsurvey

Import survey: Import data into R, with an application to the National Center for Health Statistics (NCHS)

data import r sas survey survey-data

Last synced: 19 Jun 2026

https://github.com/anuveyatsu/cloudflare-data-fabric

Cloudflare Data Fabric: Use Cloudflare's global infrastructure to build a flexible, resilient framework for data solutions.

cloudflare data data-lake fabric lakehouse mesh

Last synced: 29 Jun 2026

https://github.com/rafaelfloressouza/Covid-19-Dashboard

Python web application to display COVID-19 data from around the world using Plotly and Dash

bootstrap covid-19 css data datavisualization plotly-dash python3

Last synced: 10 Mar 2025

https://github.com/alechash/rndmzr

Randomizer is a random data generator.

data data-science random random-generation random-number-generators

Last synced: 10 Jun 2026

https://github.com/sathyasris27/data-analysis-on-adult-smoking-patterns-in-the-uk

The aim of this analysis is to understand the smoking patterns among adults in the UK.

data data-analysis data-visualization python3

Last synced: 09 May 2026

https://github.com/williamwutq/bllist

Durable, crash-safe, checksummed block-based linked list allocators stored in a single file

data data-storage data-structure database file-based linkedlist

Last synced: 25 Jun 2026

https://github.com/gallo13/neuralnetworks-deeplearning-stats-classification

Descriptive Statistics, Classification and Analysis Using Python & Python Libraries (Assignment 1)

analysis data datasets deep-learning jupyter-notebook matplotlib neural-networks numpy pandas plotting python seaborn

Last synced: 17 Apr 2026

https://github.com/lmuffato/project-mysql-vocabulary-booster-trybe

MySQL Vocabulary Booster project: graded Trybe project for Block 20 (SQL Functions, Joins, and Subqueries)

back-end crud data database mysql mysqlworkbench query sql trybe-projects

Last synced: 10 May 2026

https://github.com/dimitryzub/walmart-stores-coffee-analysis

Walmart Coffee Exploratory Data Analysis. Data Extracted with SerpApi 🧡

analysis analytics data data-visualization matplotlib pandas python pythonanalysis seaborn

Last synced: 10 May 2026

https://github.com/williamwutq/bblock

Persistent checksummed blocks built on top of bstack's allocators

allocation binary block data data-structures database rust rust-crate rust-library serialization

Last synced: 25 Jun 2026

https://github.com/souvik09-tech/adventure-works-kpi-dashboard

This repository contains a complete Business Intelligence solution for AdventureWorks, a global manufacturing company specializing in cycling equipment and accessories. Built using Power BI Desktop, this project helps track KPIs, analyze product performance, compare regional data, and identify high-value customers.

analysis data kpi powerbi visualization

Last synced: 27 Jan 2026

https://github.com/sbdk-dev/sbdk.dev

A complete reference implementation of a local-first ecosystem for AI-powered analytics. This repository contains the source code for the SBDK.dev website, the central hub for the SBDK suite of open-source tools.

ai-powered-analytics data data-engineering data-engineeringlocal-first data-pipeline-automation data-pipelines dbt dlt duckdb elt etl-pipeline llm local-first machine-learning pipeline sbdk semantic-layer

Last synced: 27 May 2026

https://github.com/jaldekoa/fiscaldataapi

A Python wrapper to easily retrieve data from the Fiscal Data (US Treasury) official API in pandas format.

api api-wrapper banking data finance pandas python united-states

Last synced: 27 Jan 2026

https://github.com/782e616c6d/covid-d.a

Academic project, using Apache Spark for ETL and Data Studio for data analysis.

academic analytics automation cluster covid-19 data database etl python spark sql

Last synced: 10 May 2026

https://github.com/ispyhumanfly/prowler

Query the web, extract data from the results, and transform that data into a format you can use.

ai analytics business cryptocurrency data extract-data machine-learning mining scraping web

Last synced: 06 Sep 2025

https://github.com/mishra-krishna/analysis-and-optimization-of-supply-chain-operations

Analyzed supply chain data to identify trends and key factors. Visualized sales, defect rates, lead times, and costs. Used Decision Tree Regressor to find top features impacting product costs and lead times.

data dataanalytics datavisualization supplychain supplychainanalytics

Last synced: 20 Apr 2026

https://github.com/ahmad-ali-rafique/comment-generation-tool

This repository hosts a Jupyter Notebook-based Comment Generation Tool exploring advanced NLP techniques for automated, contextually relevant comment generation from input data. Ideal for developers and researchers in NLP and automated text generation.

ai aitools artificial-intelligence content-based-recommendation data datascience jupyter-notebook machine-learning

Last synced: 07 Oct 2025

https://github.com/wooldoughnut310/xboxgamertag

Python module to get data from www.xboxgamertag.com

data gamertag html python3 requests xbox

Last synced: 24 Mar 2025

https://github.com/atymri/linqsimulator

LINQ Simulator is an interactive C# console application designed to let you experiment with LINQ queries in real time.

console csharp data data-analysis linq query sql

Last synced: 23 Oct 2025

https://github.com/purarue/git_doc_history

copy/track file history in git, with python bindings to traverse and extract history/files/lines at some date

data git

Last synced: 17 May 2026

https://github.com/trstringer/pywave2

🌊 Get swell buoy data

data ocean python

Last synced: 31 Mar 2025

https://github.com/nikoshet/rust-dms-cdc-operator

The rust-dms-cdc-operator is a Rust-based utility for comparing the state of a list of tables in an Amazon RDS database with data stored in Parquet files on Amazon S3, particularly useful for change data capture (CDC) scenarios.

aws cdc data dms parquet pgdatadiff polars postgres rds rust s3 validation

Last synced: 18 Jan 2026

https://github.com/stdlib-js/array-base-none-by-right

Test whether all elements in an array fail a test implemented by a predicate function, iterating from right to left.

all array data every generic javascript node node-js nodejs none predicate stdlib structure test types validate

Last synced: 01 Mar 2026
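
The behaviour described for the entry above can be sketched in a few lines. This is a Python illustration of the idea, not the stdlib-js API. The name `none_by_right` and the single-argument predicate are assumptions made for the example.

```python
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")


def none_by_right(items: Sequence[T], predicate: Callable[[T], bool]) -> bool:
    """Return True if no element passes `predicate`, scanning right to left."""
    for index in range(len(items) - 1, -1, -1):
        if predicate(items[index]):
            # One passing element is enough to answer False, so stop early.
            return False
    return True


def is_negative(value: int) -> bool:
    return value < 0


all_fail = none_by_right([1, 2, 3], is_negative)
one_passes = none_by_right([1, -2, 3], is_negative)
```

The result is the same as a left-to-right scan. The direction only changes which elements are visited before an early exit, which matters when the predicate is costly or has side effects.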

https://github.com/ibilalkayy/covid-tracking-app

This repository contains the code of a COVID-19 tracking app that shows COVID-19 data on Google Maps.

covid-19 data google-maps

Last synced: 14 Oct 2025

https://github.com/carlossilva2/pybase

An easy-to-use database using Python and JSON

data database json python3 storage

Last synced: 11 May 2026

https://github.com/yord/klp-core

A plugin with basic operations for klp (Kelpie), the small, fast, and magical command-line data processor.

csv data deserializer dsv json kelpie klp marshaller parser serializer ssv tsv

Last synced: 24 Apr 2026

https://github.com/suryavamsi-p/conflict-nlp-topic-modeling-sentiment-analysis-using-llms

Extracts insights from 26K+ protest events using BERTopic, Top2Vec, and LLMs for real-world applications like crisis monitoring, policy research, and social unrest analysis.

all-mpnet-base-v2 bertopic conflict-data data data-science lda llama2 llms machine-learning mistral-7b nlp nltk protest-analysis pyldavis python3 top2vec topic-modeling transformers visualization

Last synced: 11 May 2026

https://github.com/ilejuxepwaduzd/structured-data-extractor

🛠️ Extract structured data from messy texts using Chain-of-Thought prompting to improve processing of customer support and technical issues.

cdp chrome-fetcher data document-extraction ecommerce golang-library headless metadata-extraction ocr open-source pdf pdf-converter pdf-extractor ruby scraper shopify spider structured-data

Last synced: 10 Apr 2026

https://github.com/seanowenhayes/recipe-scraper

A simple scraper that uses Puppeteer to scrape recipes and more from the web

crawler crawling data recipes scraping

Last synced: 22 Feb 2026

https://github.com/farzai/geonames-php

This package provides a simple way to download Geonames data and format it for friendly use.

countries country-codes data geography geonames

Last synced: 24 Oct 2025

https://github.com/mascanho/ruddit

CLI to interact with Reddit's API to programmatically retrieve data

cli data marketing rust rust-lang rustlang sales

Last synced: 19 Aug 2025

https://github.com/mewmix/drivehound

magic file signatures + python drive recovery magic

data disk file-signatures harddrive python recovery recovery-tool

Last synced: 08 Oct 2025

https://github.com/pharo-ai/data-imputers

This project contains transformers for missing value imputation

ai data data-science imputer pharo pharo-smalltalk smalltalk

Last synced: 18 Jan 2026

https://github.com/ginga1402/chinook_database

Microsoft SQL Server Management Studio

business-query data sql-server

Last synced: 30 Mar 2025

https://github.com/liyakhathshaik/datascout.jl

This is a Julia package

data datascout julia

Last synced: 09 Oct 2025

https://github.com/scienxlab/datasets

Some small datasets for demos, courses, testing, etc.

data open-data sample-data teaching-resources

Last synced: 09 Oct 2025

https://github.com/ayushverma135/sas-health-metrics-analysis-bmi-categorization-and-gender-insights

Using SAS, this project processes Excel data on individual statistics and health metrics. It calculates BMI, categorizes health status, and visualizes distributions through pie charts.

analytics data excel sas sasprogramming statistical-analysis

Last synced: 24 Feb 2026
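
The BMI step mentioned in the entry above follows a standard formula: weight in kilograms divided by the square of height in metres. The project itself uses SAS, so the Python sketch below only illustrates the calculation. The function names are hypothetical, and the cut-offs are the commonly cited adult thresholds (18.5, 25, 30), which may differ from the categories the repository uses.

```python
def bmi(weight_kg: float, height_m: float) -> float:
    """Body mass index: weight (kg) divided by height (m) squared."""
    if height_m <= 0:
        raise ValueError("height_m must be positive")
    return weight_kg / height_m ** 2


def bmi_category(value: float) -> str:
    """Map a BMI value to a category using the common adult cut-offs (assumed)."""
    if value < 18.5:
        return "underweight"
    if value < 25:
        return "normal"
    if value < 30:
        return "overweight"
    return "obese"


# Hypothetical person: 70 kg, 1.75 m tall.
example_bmi = bmi(70, 1.75)
example_category = bmi_category(example_bmi)
```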

https://github.com/cmda-tt/course-24-25

🎓 tech track · 2024-2025 · curriculum and syllabus 📊

d3 data datavis datavisualization es6 functional javascript programming svelte

Last synced: 28 Jan 2026

https://github.com/jtpio/data-playground

Experiments using public APIs and data

data experiments python

Last synced: 28 Apr 2026

https://github.com/imahdimir/githubdata

A very simple Python package to easily download from and manage a GitHub "Data Repository"

data data-repository python-package

Last synced: 23 Jan 2026

https://github.com/efler/microservice-data-bus

Data bus based on Apache Kafka and consisting of separate components [copied from own private repos]

data data-bus deduplication enrichment filtering kafka microservice mongodb postgresql redis

Last synced: 16 Apr 2026

https://github.com/idea2app/public-meta-data

HTTP API for Public Meta Data, written in TypeScript & designed for CDN.

api cdn data http meta public typescript

Last synced: 15 Mar 2025

https://github.com/reubano/ckanny

A Python command line interface (CLI) for interacting with CKAN instances

ckan cli data featured open-data

Last synced: 28 Apr 2026

https://github.com/bilalmehrban/data-log-monitor

A simple yet elegant desktop C# application based on a 3-tier architecture, designed for viewing logs stored in the database by NLog or other logging frameworks.

csharp data desktop-app logging

Last synced: 14 Mar 2025

https://github.com/ishaansathaye/cpe202-datastructalgos

CPE 202 Data Structures and Algorithms Winter 2022 Freshman at Cal Poly

algorithm binary binary-search-tree data graph hash heap python queue stack structures

Last synced: 12 May 2026

https://github.com/mihaiconstantin/lavot

A `React` application that allows users to indicate how votes will be redistributed among candidates for the second round of Romanian presidential elections.

data data-visualization elections react sankey typescript

Last synced: 06 Feb 2026

https://github.com/h2lsoft/validator

A multilanguage library of value validators with CSRF protection

csrf csrf-protection data form php validator

Last synced: 04 Feb 2026

https://github.com/brianali-codes/github-searcher

A website for API experimentation that uses the GitHub API to search for different users and some of their (public) information

api data github user

Last synced: 21 May 2026

https://github.com/avto-dev/static-references-data

Data for static references

data references static

Last synced: 05 Oct 2025

https://github.com/medz/block

A flexible and efficient binary data block handling library for Dart.

binary blob block data streams

Last synced: 24 Feb 2026

https://github.com/aidanjuma/ankideckextractor

A CLI tool written in Python that extracts Anki flashcard decks (.apkg) into separate JSON notes and media files. Perfect for developers building custom learning applications or repurposing Anki content programmatically.

anki apkg cli data decompression extraction flashcards learning python zip

Last synced: 29 Apr 2026

https://github.com/patrikmasiar/algorythm-of-the-night

Awesome list of algorithms that help you 🚀 Feel free to contribute 👨🏻‍💻

algorithms data interview-questions logic logic-programming math mathematics science

Last synced: 27 Oct 2025

https://github.com/iguptashubham/walmart-eda

Imagine diving into the fascinating world of Walmart with just a few lines of code! This project lets you do that using MySQL, a powerful tool for data analysts. You can clean up messy data like a detective, uncovering hidden patterns and trends. Data scientists can take it further,.

analysis data dataset eda mysql portfolio-project python sql

Last synced: 10 Apr 2026

https://github.com/open-i18n/data-iso-15924

Git mirror for ISO 15924, Codes for the representation of names of scripts data

data iso iso-15924 iso15924 open-i18n scripts unicode unicode-data writing-systems

Last synced: 14 Mar 2026

https://github.com/marabesi/d3-visualization

Different visualizations using data and d3.js

charts css d3js data html js json timeline-chart visualization

Last synced: 01 May 2026

https://github.com/kevinsames/microsoft-fabric-data-platform-template

A GitHub starter repository for building modern Data Engineering, ML, and AI solutions on Microsoft Fabric. Includes medallion architecture (Bronze → Silver → Gold), Spark Notebooks, dbt, MLflow, GitHub Actions CI/CD, and arc42-based documentation.

data dbt fabric microsoft python spark

Last synced: 29 Apr 2026

https://github.com/droduit/grand-comics-database

EPFL course project to manage a huge database containing hundreds of millions of data items, and to optimize the queries for a smooth user-interface experience.

big-data data database epfl sql

Last synced: 16 Apr 2026

https://github.com/city-of-helsinki/drupal-helfi-tyollisyyspalvelut-manuaali

Manual for the service data repository of the local government employment pilots

data drupal drupal-9 unemployment

Last synced: 24 Jan 2026

https://github.com/jorgeatgu/apaga-luz

💡 How much does electricity cost? 💶

data data-visualization flat-data

Last synced: 04 Feb 2026

https://github.com/zoekelepiri/ota_observatory

A front-end web application that provides detailed information about the boundaries and statistical data of the regions and prefectures of Greece.

backend data database spring-boot

Last synced: 06 Feb 2026

https://github.com/scarblase/salary-comparison

Submission for the DataCamp Salary Competition (1 level). 🏆

data data-analysis data-science data-visualization engineering python sql structured-data

Last synced: 01 May 2026

https://github.com/ultrasage-danz/scikit-learn-ml

Machine Learning with scikit-learn by Data School

ai data data-school machine-learning macos ml scikit-learn ultrasage-dan

Last synced: 13 May 2026

https://github.com/jessielw/parse-fel-master-data

Simple CLI to parse Dolby Vision master data via the RPU/MediaInfo and output data needed for x265

data dolby fel master mediainfo mi parse rpu vision

Last synced: 26 Aug 2025

https://github.com/danielgiljam/orbit-utils

A collection of utility packages for Orbit.js.

data inference orbit orbitjs schema synchronization type typescript validation zod

Last synced: 01 May 2026

https://github.com/lamden/merk

A concise implementation of a merkle tree in Python.

crypto data hash merkle structure tree

Last synced: 27 May 2026
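
For readers unfamiliar with the data structure named in the entry above, here is a minimal sketch of computing a Merkle root. It does not reproduce the repository's code. SHA-256 and the rule of duplicating the last node on odd-sized levels are assumptions chosen for illustration, and `merkle_root` is a hypothetical name.

```python
import hashlib
from typing import List


def sha256(data: bytes) -> bytes:
    """Hash helper; SHA-256 is an assumed choice for this sketch."""
    return hashlib.sha256(data).digest()


def merkle_root(leaves: List[bytes]) -> bytes:
    """Hash the leaves, then hash adjacent pairs until one node remains."""
    if not leaves:
        raise ValueError("at least one leaf is required")
    level = [sha256(leaf) for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2 == 1:
            # Assumed convention: duplicate the last node on odd-sized levels.
            level.append(level[-1])
        level = [
            sha256(level[i] + level[i + 1])
            for i in range(0, len(level), 2)
        ]
    return level[0]


root = merkle_root([b"a", b"b", b"c"])
tampered_root = merkle_root([b"a", b"b", b"x"])
```

Changing any leaf changes the root, which is what lets a single hash stand in for the integrity of the whole set.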

https://github.com/jmcanterafonseca/leaflet-context-information

A Leaflet plugin + infrastructure for getting access to Context Information (i.e. data) exposed through FIWARE NGSIv2

context data fiware information leaflet map open visualization web

Last synced: 21 Apr 2026

https://github.com/alejo1630/titanic_kaggle

This Python Notebook is a proposal to analyse the Titanic dataset for the Kaggle Competition, using several data science techniques and concepts.

data data-science jupyter-notebook notebook python titanic-survival-prediction

Last synced: 03 May 2026