An open API service indexing awesome lists of open source software.

data

Individual facts, statistics, or items of information, often numeric. In a technical sense, data are a set of values of qualitative or quantitative variables about one or more persons or objects. (https://en.wikipedia.org/w/index.php?title=Data&oldid=1093674723, released under CC BY-SA 3.0)

https://github.com/mohibmirza-py/email-verifier-script

Streamlit app to verify emails in bulk

ai analysis data streamlit

Last synced: 29 Apr 2026

https://github.com/beriberikix/senml-zephyr

A codec for encoding and decoding Sensor Measurement Lists (SenML) for Zephyr

codec data iot senml sensor zephyr-rtos

Last synced: 24 Mar 2025

https://github.com/merekat/flight-delay-prediction

This project focuses on predicting flight delays using historical data from a Tunisian airline. We analyzed patterns in airport operations and flight schedules to build a machine learning model that can forecast potential delays.

aviation data data-science machine-learning machine-learning-algorithms machinelearning prediction predictive-modeling

Last synced: 08 Apr 2025

https://github.com/robertoostenveld/dcn.dsc_62002071_01_114_v1

Simon task M/EEG data [Data set].

data datalad open-data

Last synced: 23 Jan 2026

https://github.com/andrewl/danelaw

Geopackage containing the boundary of the Danelaw

data geospatial medieval viking

Last synced: 23 Jan 2026

https://github.com/jigyasag18/bird-strikes-in-aviation-project

This project analyzes over a decade of U.S. bird strike data (2000–2011) to evaluate safety risks, damage trends, and cost implications in aviation. Using PostgreSQL for database management and Power BI for dashboard visualization, it uncovers critical insights into when, where, and how wildlife impacts aircraft. Key findings inform strategically.

bird-strike-prevention bird-strike-prevention-in-real-airport data data-analysis data-analysis-project data-visualisation data-visualization data-visualization-project data-visualizations database dataset dax-query postgresql postgresql-database powerbi powerbi-desktop powerbi-report powerbi-visuals sql sql-database

Last synced: 09 May 2026

https://github.com/arthurdanjou/studies

💼 This is the repository containing all my projects done during my studies in Python and R.

ai data data-science data-visualization jupyter jupyter-notebook ml python r

Last synced: 08 Apr 2025

https://github.com/jsanz/kart-test

Testing Kart repository

data geospatial kart

Last synced: 26 Jan 2026

https://github.com/remcostoeten/github-and-vercel-api-showcase-dashboard

Showcases data fetched from the GitHub and Vercel APIs, built entirely in vanilla JS.

api-rest da data express-js github-api nodejs vercel-api

Last synced: 07 Mar 2026

https://github.com/harmanveer-2546/reducing-data-entries

A way to delete data entries from a CSV or Excel file. For an Excel file, use excel instead of csv in the code.

csv data data-entry delete-data excel numpy pandas python

Last synced: 05 May 2026

https://github.com/shahsuvarli/election-voters-data-analysis-pandas

Educational project analyzing Azerbaijan voter demographics with pandas, focusing on data cleaning, grouping, and visualization.

cleaning data grouping matplotlib numpy pandas python visualization

Last synced: 12 Apr 2026

https://github.com/mumtaz4118/nlp-course

Programming assignments and lectures for Stanford's CS224n: Natural Language Processing with Deep Learning

course data data-analysis data-analytics data-science data-visualization deep-learning education machine-learning natural-language-processing neural-network transfer-learning

Last synced: 24 Nov 2025

https://github.com/cpietsch/breitband

Development repo for breitband-berlin

d3js data threejs visualization

Last synced: 02 May 2026

https://github.com/ztgx/muvera

MUVERA: Making multi-vector retrieval as fast as single-vector search

algorithms data google muvera retrieval rust search structure vector

Last synced: 25 Oct 2025

https://github.com/prajjwol09/power-bi-project

The Data Survey Breakdown is an interactive Power BI dashboard designed to present insights gathered from a survey of professionals and enthusiasts in the data industry.

dashboard data interactive powerbi survey

Last synced: 15 Mar 2026

https://github.com/uznetdev/smoking-prediction

This project focuses on analyzing the "Smoking" dataset and building a predictive model for smoking status based on various health metrics. The goal is to identify factors influencing smoking behavior and develop a reliable model for prediction.

ai classification data data-science kaggle-competition machine-learning ml roc-auc sklearn smoking

Last synced: 17 Apr 2026

https://github.com/petzi53/repair

R Datasets of the Open Repair Alliance (ORA).

data r repair repair-cafe

Last synced: 19 May 2026

https://github.com/merekat/hb-oil-assets

An analysis of asset performance in connection with shock-like increases in the oil price since Brent crude entered the market in 1986.

analyze asset data datajournalism oil python

Last synced: 16 Mar 2026

https://github.com/tomquirk/sunshine-coast-council-rates-data

Rates data for the Sunshine Coast, Australia

australia data property rates real-estate

Last synced: 24 Feb 2026

https://github.com/noedemange/orderedheatmapanalysis

OrderedHeatMapAnalysis (OHMA) is a direct data analysis framework for simultaneously visualizing and analyzing the structure of complex datasets. An optimized seriation of the rows and columns of the input data table maps the whole dataset into an ordered heatmap.

analysis bi-seriation data dataanalysis heatmap r rstats seriation shiny shiny-apps

Last synced: 27 Feb 2025

https://github.com/0xHericles/ufcg-geojson

GeoJSON file containing the blocks and buildings of the Federal University of Campina Grande.

data data-visualization geojson map open-source ufcg university

Last synced: 24 Mar 2025

https://github.com/justintime50/dad-python

Dummy Address Data (DAD) - Retrieve real addresses from all around the world. (Python Client Library)

address addresses country dad data dummy python real state world

Last synced: 24 Jan 2026

https://github.com/prateekmaj21/tableau-public-links

Tableau work as part of Data Visualization [AI&DS_205]

data data-visualization dataanalytics tableau-public

Last synced: 24 Jan 2026

https://github.com/mfurmanczyk/wh-sales

E-commerce analytics data warehouse ETL made with Apache Spark.

airflow data data-engineering data-warehouse kotlin python spark

Last synced: 24 Jan 2026

https://github.com/sahraiidle/email-spam-detector

Email/SMS spam detector with a Flask UI/API, tuned ML models (TF‑IDF + SVM/LogReg/NB), and a ready-to-run web form plus JSON endpoint for predictions.

data machine-learning numpy pandas python randomforest scikit-learn spam-classifier spam-detection svm

Last synced: 24 Jan 2026

https://github.com/robertoostenveld/dccn.dsc_3015055.00_583_v1

The FieldTrip-SimBio Pipeline for EEG Forward Solutions [Data set].

data datalad open-data

Last synced: 24 Jan 2026

https://github.com/woctezuma/hidden-gems-data

Data available to compute regional rankings of hidden gems.

data hidden-gems steam steam-reviews

Last synced: 06 Feb 2026

https://github.com/atharvapathak/twitter_sentiment_analysis_project

Twitter sentiment analysis is the process of analyzing tweets posted on the Twitter platform to determine the overall sentiment expressed within them. It involves using natural language processing (NLP) and machine learning techniques to classify tweets.

api bag-of-words bert cnn data gbm nltk rnn spacy twitter

Last synced: 28 Jan 2026

https://github.com/cmdrvl/rvl

rvl reveals the smallest set of numeric changes that explain what actually changed between two datasets — or confidently tells you nothing changed.

cli csv data data-quality data-validation diff finance numerical-analysis open-source ops rust tooling

Last synced: 25 Feb 2026

https://github.com/maxisoft/yahoo-finance-data-downloader

Automate downloading historical and recent stock data from Yahoo Finance.

data stock-market yahoo-finance

Last synced: 29 Jan 2026

https://github.com/spatialcurrent/go-counter

Simple library and command line program for generating frequency distributions.

big-data bigdata data

Last synced: 29 Jan 2026

https://github.com/nasa-pds/nucleus

Nucleus is a software platform used to create workflows for the Planetary Data System (PDS).

data ingestion pds planetary workflow

Last synced: 06 Feb 2026

https://github.com/apoorv74/njdg-stats

Tracking data from the National Judicial Data Grid's (NJDG) district courts portal

data git-scraping judiciary law

Last synced: 29 Jan 2026

https://github.com/chenxingqiang/modeling_tabular_data

modeling_tabular_data: a repository focused on modeling tabular data.

data modeling tabular

Last synced: 30 Jan 2026

https://github.com/abhijeetdasbakshi/ecommerce-insights

A Dockerized end-to-end project that combines unsupervised machine learning for customer segmentation with scalable data pipelines. It uses MongoDB for data ingestion, Scikit-learn for clustering, Airflow for orchestration, and Streamlit for interactive visualization — enabling actionable insights into e-commerce

airflow airflow-dags ci-cd-pipeline clustering dags data data-pipelines docker docker-compose docker-container dockerfile git great-expectations kafka mongodb pca-analysis postgresql pyspark t-sne umap-learn

Last synced: 04 Apr 2026

https://github.com/jigyasag18/aircraft-data-management

This repository offers a comprehensive simulation of global military air deployments involving 10 countries, aircraft models, mission types, and strategic zones. It analyzes air power distribution, mission intent (offensive, defensive, support), and geopolitical positioning. The project provides structured insights into regional & zone level threat

aircraft-data aircraft-performance data data-analysis data-visualization database database-management dataset datavisualisation mysql powerbi powerbi-report powerbi-visuals sql

Last synced: 04 Feb 2026

https://github.com/mreshboboyev/elastic-search-dotnet

A powerful and easy-to-use .NET library for integrating Elasticsearch, enabling fast full-text search, scalable indexing, and advanced data analytics in your applications.

analytics c-sharp data dotnet-core elastic-search full-text indexing open-source scalable search

Last synced: 30 Jan 2026

https://github.com/lut-ful/pizza-sales-report

This Pizza Sales Report provides valuable insights into sales performance through detailed analysis and visualizations. By leveraging Power BI and SQL Server

data data-wrangling microsoft-sql-server power-bi power-bi-dax python

Last synced: 30 Jan 2026

https://gitlab.com/pommalabs/htmlark

HtmlArk packs a webpage into a single HTML file: https://htmlark-docs.pommalabs.xyz/

audios css data embed fonts html images javascript uri videos

Last synced: 03 Sep 2025

https://github.com/denisecase/dc-texter

Send a text message using Python

alerts data python sms-messages streaming

Last synced: 08 Feb 2026

https://github.com/opendatach/alds

A collaborative list of resources and ideas to enable "Amt Local Data Stewards" to manage the (open) data of their respective federal offices

awesome-list data datagovernance dataliteracy datamanagement datastewardship opendata opengovernmentdata

Last synced: 31 Jan 2026

https://github.com/azmag/spm-dashboard

System Performance Measures are a selection of criteria used by the Department of Housing and Urban Development (HUD) to evaluate how local Continua of Care are performing.

data human-services spm

Last synced: 31 Jan 2026

https://github.com/okieraised/rke2-deployment

Single-node RKE2 deployment

data helm helm-charts helm-deployment rke2

Last synced: 17 Mar 2026

https://github.com/drostlab/biodbretrievr

Retrieve and efficiently index entire biological sequence databases

biological-data biological-sequences data databasestoring retrieval

Last synced: 26 Feb 2026

https://github.com/mahtabranjbar/onlineshopping_analysis_dashboard

This project analyzes online shopper behavior using various machine learning models and EDA techniques.

dashboard data dataanalysis eda machine-learning streamlit

Last synced: 08 Feb 2026

https://github.com/tanyagarg25/project_covidanalysis

This repository is a project for analyzing COVID-19 data using SQL and visualizing it with Tableau. Technologies used include SQL for querying and Tableau for data visualization.

analysis dashboard data data-visualization sql tableau

Last synced: 08 Feb 2026

https://github.com/smaug6739/data-bit

This project is a module for converting a structured dataset into a number that can be stored in a database taking up little space.

bits data nodejs

Last synced: 14 May 2026

https://github.com/rahult18/atmo-flow

AtmoFlow is a robust data engineering pipeline built on Google Cloud Platform (GCP) that processes and analyzes weather and air quality data in both batch and streaming modes

airflow data data-modeling data-science data-visualization dataengineering gcp-bigquery gcp-cloud-composer gcp-cloud-functions pyspark

Last synced: 23 Jun 2026

https://github.com/bishtrishu/netflix_movies_dashboard

This project is a comprehensive dashboard for analyzing Netflix movies and shows. Using a combination of Power BI, Python, and Excel, this dashboard provides insights into various aspects of Netflix's content library.

ai artifical-intelligense dashboard data dataanalysis dataanalyst dataanalytics datacleaning datahandling datascience datavisualization excel machine-learning msexcel powerbi report

Last synced: 09 Feb 2026

https://github.com/samaalharbi2/project-recommendation-system

This project focuses on building a Recommendation System using real interaction data from IBM's Watson Studio platform.

clustering data ibm-watson kmeans nlp python rec svd udacity-nanodegree

Last synced: 09 Feb 2026

https://github.com/myles-parfeniuk/esp32_sdlogger

C++ esp-idf driver component for SD cards interfaced via SPI. WIP

card data esp-idf esp32 logger sd sdcard sdmmc sdspi spi

Last synced: 09 Feb 2026

https://github.com/metapsy-project/data-panic-psyctr

Database of psychotherapy for panic disorder compared to control conditions

data

Last synced: 18 Mar 2026

https://github.com/ludreinsalvador/global-covid-19-data-analysis

Contains Power BI dashboards that visualize and analyze global COVID-19 cases, deaths, and vaccination trends using data from the World Health Organization (WHO). The project aims to provide insights into the pandemic’s impact and vaccination progress worldwide through dynamic reports and advanced analytics.

analytics covid-19 covid19-data data data-analysis data-collection data-transformation data-visualization

Last synced: 26 Feb 2026

https://github.com/davidkhala/ai

GenAI index

data dify huggingface

Last synced: 27 Feb 2026

https://github.com/kena0ki/dddl

Generates test data from DDL.

data database db ddl generator sql table test

Last synced: 30 Apr 2026

https://github.com/os-climate/data-requests

This repo is used to track issues related to new Data Requests

data data-engineering dataset

Last synced: 27 Feb 2026

https://github.com/utrechtuniversity/momentum-dataflow

Repository for publishing website about data management practices of the Momentum project

data datageneration datamanagement

Last synced: 27 Feb 2026

https://github.com/faster-games/dynamic-components

Dynamic Runtime Components for Unity3D

data framework unity3d

Last synced: 11 Apr 2026

https://github.com/bastianolea/sicvir_indicadores_rurales

Rural Quality of Life Indicators System (Sistema de Indicadores de Calidad de Vida Rural, Sicvir)

chile comunas data estado rural social

Last synced: 27 Feb 2026

https://github.com/sweta-kaundilya/power-bi-learning-projects

This repository contains exercises completed while learning Power BI

data datavisualization dax powerbi powerquery

Last synced: 27 Feb 2026

https://github.com/praveendecode/retail-revenue-forecasting

Designed an end-to-end ML model pipeline, forecasting department-wide sales by accounting for holiday markdown effects, spanning data collection to inferencing.

azure collection data datapreprocessing docker exploratory-data-analysis feature-engineering featureimportance model modelbuilding modeldeployment modelselction python report tableau

Last synced: 16 Apr 2026

https://github.com/kunalthakur204/visualization-on-flower

🌸 Flower Dataset Visualization: visualizing patterns and relationships in flower data through charts and plots. Perfect for exploring floral characteristics and trends! 📊

data data-visualization dataanalysis flowerdataset python

Last synced: 16 Apr 2026

https://github.com/khalyomede/request

Function to validate request data for V.

data function request validate vlang

Last synced: 12 Feb 2026

https://github.com/kirillsemyonkin/lsd

LSD (Less Syntax Data) configuration/data transfer format.

configuration data java parsing rust

Last synced: 27 Feb 2026

https://github.com/soenneker/soenneker.dtos.requestdataoptions

A flexible request options object for paging, sorting, and filtering queryable data, similar to OData-style parameters.

controller coordinator csharp data dotnet dto dtos http manager object odata options request requestdataoptions

Last synced: 12 Mar 2026

https://github.com/namratha2301/sales-orders-analysis

Wanted to experiment with Looker. This dashboard visualizes sales trends across regions, customer segments, and product categories.

business-analytics dashboard data dataanalysis datavisualization excel looker looker-studio

Last synced: 13 Feb 2026

https://github.com/j0a0m4/olympics

Final Project for Data Engineering Accelerated LATAM

data olympics spark

Last synced: 13 Feb 2026

https://github.com/infinitode/pywebscrapr

An open-source Python web scraping tool. Supports both image scraping and text scraping.

data data-collection data-science open-source pip scraping web-scraper

Last synced: 14 Feb 2026

https://github.com/sanand0/iss-location

Tracks the International Space Station position. A demo of how to use GitHub Actions to schedule commits weekly.

data

Last synced: 14 Feb 2026

https://github.com/imartinezl/madrid-challenge

Madrid Route Optimization Challenge 🚚♻️🚚

challenge city data optimization routing-algorithm traffic

Last synced: 28 Feb 2026

https://github.com/turner-kendall/turner-kendall

Turner Kendall - dev, opps, sec.

config data github-config go rust security

Last synced: 31 Oct 2025

https://github.com/e-kotov/albofr-data-archive

Data on tiger mosquito colonisation in France

aedes-albopictus colonisation data france tiger-mosquito

Last synced: 23 May 2026

https://github.com/gusenov/open-data-scripts

Scripts to explore public datasets and work with open data.

charts data data-visualisation data-visualization datavisualization highcharts kazakhstan open-data opendata qazaqstan

Last synced: 28 Feb 2026

https://github.com/madhuresh2011/genai-powered-data-analytics-by-tata

I recently participated in Tata iQ's job simulation on the Forage platform, and it was incredibly useful to understand what it might be like to be on a data analytics team in an AI transformation consulting role.

chatgpt data dataanalytics eda excel gemini generative-ai internships powerpoint presentation

Last synced: 14 Feb 2026

https://github.com/florianreuth/pit

pit - the private information tracker

data java passwords security vault

Last synced: 28 Feb 2026

https://github.com/mochsyahrizal/jkfkjabar_studycase

First Data Analytics Study Case

data datanalytics studycase

Last synced: 15 Feb 2026

https://github.com/pbinkley/mfmcollections

Project to distill data about published collections of microfilms from library lists

data research retro

Last synced: 28 May 2026

https://github.com/amethyst-php/activity

Someone just did something, should we save who did this and when?

activity amethyst amethyst-package api data laravel

Last synced: 17 May 2026

https://github.com/soenneker/soenneker.attributes.mapto

A C# attribute for generic data mapping translation

attributes columns csharp data datatables dotnet mapping mapto maptoattribute object

Last synced: 02 Mar 2026