An open API service indexing awesome lists of open source software.

data

Individual facts, statistics, or items of information, often numeric. In a technical sense, data are a set of values of qualitative or quantitative variables about one or more persons or objects. (https://en.wikipedia.org/w/index.php?title=Data&oldid=1093674723, released under CC BY-SA 3.0)

https://github.com/eharshit/end-to-end-vendor-insights

End-to-end analysis of vendor performance for wholesale/retail businesses, featuring data ingestion, cleaning, insights, and interactive Power BI dashboards.

analysis analysis-algorithms analytics dashboard data data-analysis datascience jupyter jupyter-notebook pandas powerbi powerbi-report retail wholesale

Last synced: 07 Oct 2025

https://github.com/pythoncoderunicorn/startrek

a repo for Star Trek data from Technical Manuals

data klingon-language star-trek vulcan

Last synced: 07 Oct 2025

https://github.com/ournet/weather-data

Ournet weather data module

data forecast ournet storage weather

Last synced: 07 Oct 2025

https://github.com/slavos1/covid_data

Use Public Health England daily data to show daily cases.

covid covid-19 covid19 data pandas python3

Last synced: 14 Apr 2026

https://github.com/shubhamsoni98/classification-with-random-forest-1

To classify sales into categories (Low, Moderate, High) using Random Forests to inform strategic decisions and optimize marketing strategies.

algorithms anaconda data data-science datacleaning eda jupyter-notebook machine-learning pyhton random-forest scikit-learn visualization

Last synced: 18 Jan 2026

https://github.com/loaiwalid07/automation_data_overviwe

This is Streamlit app that gives an overview for a dataset you upload

automation data data-analysis data-exploration data-science data-transformation data-visualization

Last synced: 19 May 2026

https://github.com/badranalyst/data-professional-survey-breakdown-power-bi-dashboard

This project presents an interactive Power BI dashboard analyzing data professionals' insights. Key focus areas include job satisfaction, challenges in entering the data field, career priorities, demographics, and more. The visualization helps uncover trends and factors impacting data professionals globally.

charts dashboard dashboards data data-cleaning data-visualization dataset dax power-bi powerbi

Last synced: 23 Feb 2026

https://github.com/dumkydewilde/mcp-memory-layer

A template for building your own BI MCP with dbt, LLMs and multi-user corrections

bi data dbt llm mcp-server

Last synced: 13 Mar 2026

https://github.com/alexmcvay/uber-data

UBER sql clone

data data-visualization sql

Last synced: 19 Jan 2026

https://github.com/mr-chang95/udacity-starbucks-challenge

Data Science Project for Udacity's Data Scientist Program. Using Python in Jupyter Notebook.

data data-science data-visualization numpy pandas sklearn

Last synced: 14 Apr 2026

https://github.com/equinor/sumo-wrapper-python

Thin python wrapper to interact with Sumo API

analytics data fmu python subsurface sumo

Last synced: 19 Jan 2026

https://github.com/elimu-ai/analytics

📊 Android application which collects, provides and uploads learning event data

csv data data-science dataset edtech egma egra infrastructural learning-analytics

Last synced: 12 Oct 2025

https://github.com/adadalshabab/data-engineering-gcp-project

An end-to-end modern data engineering project, including deployment of ETL pipeline on Google Cloud Platform, using BigQuery for data analysis and leveraging Looker to generate an insight dashboard.

bigquery data data-science data-visualization databases dataengineering-a engineering etl-pipeline looker-studio powerbi

Last synced: 19 Jan 2026

https://github.com/digital-media/cv_data

Datasets used for courses/tutorials at the Digital Media Department

computer-vision data image-processing images

Last synced: 14 Oct 2025

https://github.com/cemc-oper/nmc-typhoon-db-client

A CLI client for NMC Typhoon Database.

data database-client nmc

Last synced: 01 Jun 2026

https://github.com/mohibmirza-py/email-verifier-script

Streamlit app to verify emails in bulk

ai analysis data streamlit

Last synced: 29 Apr 2026

https://github.com/andrewl/danelaw

Geopackage containing the boundary of the Danelaw

data geospatial medieval viking

Last synced: 23 Jan 2026

https://github.com/brianlesko/r_data_science_stat5730

Written by Brian Lesko, the repository contains R Scripts demonstrating data science topics largely originating from study at Ohio State. Contents are written in R studio using the R markdown file. As of 1/21/23 Future projects concerning data science, statistics, and machine learning will be in python in my machine learning Repository

data data-analysis flight-data ggplot2 olympics-data r-markdown tidyverse

Last synced: 23 Jan 2026

https://github.com/knowcnu12/metamask-wallet-recovery-funds-phrase-data-seed-token

This repository provides tools and guidelines for securely recovering MetaMask Wallet funds using recovery phrases, seed data, and tokens. It ensures safe and reliable methods for recovering access to your wallet and managing your cryptocurrency assets.

bitcoin blockchain cryptocurrencies cryptocurrency data ethereum funds metamask metamask-bot metamask-desktop metamask-extension metamask-plugin metamask-snap metamask-wallet phrase recovery seed token wallet wallet-security

Last synced: 08 Mar 2026

https://github.com/moscatellimarco/webscrap-tinydeal

"WebScrap-TinyDeal" is a Scrapy-powered 🕷️ tool for harvesting product information 🏷️ from TinyDeal. It outputs structured CSV data 📁, ready for analysis. Explore the scripts 👨‍💻 for an interactive scraping adventure or leverage the data for competitive pricing strategies 📈.

css data datascience html pandas python scrapy web webscraper webscraping

Last synced: 14 Apr 2026

https://github.com/ztgx/muvera

MUVERA: Making multi-vector retrieval as fast as single-vector search

algorithms data google muvera retrieval rust search structure vector

Last synced: 25 Oct 2025

https://github.com/byndyusoft/byndyusoft.data.relational

Relational abstractions for Byndyusoft.Data.Relational.

byndyusoft data dataaccess db relational-databases

Last synced: 25 Oct 2025

https://github.com/uznetdev/smoking-prediction

This project focuses on analyzing the "Smoking" dataset and building a predictive model for smoking status based on various health metrics. The goal is to identify factors influencing smoking behavior and develop a reliable model for prediction.

ai classification data data-science kaggle-competition machine-learning ml roc-auc sklearn smoking

Last synced: 17 Apr 2026

https://github.com/miriswisdom/coral.bells

Guiding and Reassuring Safety, Holistically and Empathetically

civic community data engagement govhack open safety

Last synced: 28 Jan 2026

https://github.com/johndelatto/automate-your-job-search-ai-applies-to-1000-positions

Automate Your Job Search: AI Applies to 1000 Positions Overnight & Get 100+ Interviews! In today’s fast-paced and highly competitive job market, finding and securing your dream job can be both time-consuming and exhausting.

ai data non-profit open-ai open-source

Last synced: 28 Jan 2026

https://github.com/alsult/alsult

Aliia Sultanova Portfolio

data datascience programming python

Last synced: 23 Jan 2026

https://github.com/emanoelcampos/power-bi-fundamentals

Datacamp's Power BI Fundamentals Skill Track

data data-analyst data-analyst-power-bi datacamp power-bi powerbi

Last synced: 24 Jan 2026

https://github.com/semcod/code2llm

Python Code Flow Analysis Tool - Static analysis for control flow graphs (CFG), data flow graphs (DFG), and call graph extraction

ast cfg code code2data code2logic code2process data dfg diagram flow graphs llm

Last synced: 01 Jun 2026

https://github.com/spatialcurrent/go-pipe

go-pipe is a simple library for piping objects from iterators to writers.

big-data bigdata concurrency data

Last synced: 29 Jan 2026

https://github.com/audeering/datasets

Data cards for public audb datasets

audb audio data management

Last synced: 29 Jan 2026

https://github.com/aimin-nur/data-analyst-model-predictive

Sebuah Project data analyst yang bertujuan untuk mengindentifikasi karakteristik customer untuk menerima penawaran campaign marketing.

analyst data mechine-learning visualization

Last synced: 29 Jan 2026

https://github.com/dfsp-spirit/neuroimaging_testdata

Contains test data for unit tests, used in developing neuroimaging software. Ignore this. Licenses in the individual archives.

data unittesting

Last synced: 25 Feb 2026

https://github.com/bearaujus/bdatamatrix

Structured Tabular Data Management in Go

data go golang matrix

Last synced: 30 Jan 2026

https://github.com/chompfoods/stub-scala-akka-http-server

Scala Akka HTTP server stub for the Chomp Food & Recipe Database API. Use our API to get high-quality data on recipes and 875,000+ branded/grocery foods plus raw ingredients.

akka api branded chomp data database food grocery ingredients raw recipe-api recipes scala server stub stub-server

Last synced: 15 Apr 2026

https://github.com/bubblymaps/bubblymaps

The open source bubbler map. Mapping the world's water fountains. Open Code, Open Data.

bubbler bubbly-maps data fountain map open-source water

Last synced: 31 Jan 2026

https://github.com/opendatach/alds

a colaborative list of resources and ideas to enable "Amt Local Data Stewards" to manage the (open) data of their respective federal office

awesome-list data datagovernance dataliteracy datamanagement datastewardship opendata opengovernmentdata

Last synced: 31 Jan 2026

https://github.com/nits2612/data-science-projects

Portfolio of data science projects completed by me during PGP AI/ML, self learning, and hobby purposes.

data data-science dataanalysis deep deep-learning keras machine-learning matplotlib numpy opencv pandas python scikit-learn seaborn surprise-python tensorflow transfer-learning

Last synced: 01 Feb 2026

https://github.com/pythoncoderunicorn/jamesbeardaward

a repo for James Beard Award data

data dataset jamesbeard

Last synced: 07 Feb 2026

https://github.com/rissh/titanicsurvivalpredictionusingml

Predicting Titanic passenger survival through machine learning. This project includes data preprocessing, exploratory data analysis, feature engineering, and model training using Python. 🚢

data data-analysis data-science data-visualization dataanalysis jupiter-notebook machine-learning machine-learning-algorithms machinelearning matplotlib numpy pandas prediction prediction-model python python3 seaborn tenserflow tflearn titanic

Last synced: 01 Feb 2026

https://github.com/ymorsi7/quranicvisualization

A visual exploration tool for the Holy Quran using D3.js treemaps.

css d3 d3js data data-visualization html islam islamic javascript js quran quranic treemaps visualization

Last synced: 15 Apr 2026

https://github.com/suhanyujie/ai-driving-data

some AI driving data

ai-driving-car data

Last synced: 08 Feb 2026

https://github.com/bishtrishu/netflix_movies_dashboard

This project is a comprehensive dashboard for analyzing Netflix movies and shows. Using a combination of Power BI, Python, and Excel, this dashboard provides insights into various aspects of Netflix's content library.

ai artifical-intelligense dashboard data dataanalysis dataanalyst dataanalytics datacleaning datahandling datascience datavisualization excel machine-learning msexcel powerbi report

Last synced: 09 Feb 2026

https://github.com/samaalharbi2/project-recommendation-system

This project focuses on building a Recommendation System using real interaction data from IBM's Watson Studio platform.

clustering data ibm-watson kmeans nlp python rec svd udacity-nanodegree

Last synced: 09 Feb 2026

https://github.com/myles-parfeniuk/esp32_sdlogger

C++ esp-idf driver component for SD cards interfaced via SPI. WIP

card data esp-idf esp32 logger sd sdcard sdmmc sdspi spi

Last synced: 09 Feb 2026

https://github.com/metapsy-project/data-panic-psyctr

Database of psychotherapy for panic disorder compared to control conditions

data

Last synced: 18 Mar 2026

https://github.com/neurazum-ai-department/tumor-stages-dataset---v1

Synthetic MRI data generated by the ‘HF’ and 'Vbai' models based on real data.

brain data dataset datasets image mri neuroscience tumor tumor-segmentation

Last synced: 18 Mar 2026

https://github.com/haroontrailblazer/machine_learning

About This Repository A curated resource hub for learning machine learning, featuring tutorials, code examples, datasets, and hands-on projects to build foundational skills and explore real-world applications.

data data-analysis data-visualization database dataset gradient-descent machine-learning pandas python3 random-forest sklearn statistics

Last synced: 16 Apr 2026

https://github.com/dysnomia-studio/achieve-games-dump

Dump parts of achieve.games database to public including Steam Games List

data dump games steam steam-api steam-game steam-games

Last synced: 27 Feb 2026

https://github.com/enescidem/twitter-topic-modeling

Topic modeling is an unsupervised method to identify topics in text. This project analyzes tweets from prominent Turkish accounts to uncover underlying themes in their shared content.

data data-science machine-learning nlp topic-modeling twitter x

Last synced: 10 Feb 2026

https://github.com/softloud/spunk

Nutritional interventions for male infertility: a systematic review and meta-analysis

cochrane data evisynth living

Last synced: 18 Mar 2026

https://github.com/os-climate/data-requests

This repo is used to track issues related to new Data Requests

data data-engineering dataset

Last synced: 27 Feb 2026

https://github.com/vatshayan/songs-datasets

Datasets for Songs and Music for Dancing, Emotional, Happy and scenic view

1000dataset classfication csv data datapackage datapackages dataset datasets excel free freedata freedatasets genre machine music sgenre song songs

Last synced: 18 Mar 2026

https://github.com/utrechtuniversity/momentum-dataflow

Repository for publishing website about data management practices of the Momentum project

data datageneration datamanagement

Last synced: 27 Feb 2026

https://github.com/bastianolea/sicvir_indicadores_rurales

Sistema de Indicadores de Calidad de Vida Rural (Sicvir)

chile comunas data estado rural social

Last synced: 27 Feb 2026

https://github.com/sweta-kaundilya/power-bi-learning-projects

This repository contains completed exercises while learning Power BI

data datavisualization dax powerbi powerquery

Last synced: 27 Feb 2026

https://github.com/ppabam/eda-bam

Navigating data from one thing to another.

cli data eda python

Last synced: 11 Feb 2026

https://github.com/praveendecode/retail-revenue-forecasting

Designed an end-to-end ML model pipeline, forecasting department-wide sales by accounting for holiday markdown effects, spanning data collection to inferencing.

azure collection data datapreprocessing docker exploratory-data-analysis feature-engineering featureimportance model modelbuilding modeldeployment modelselction python report tableau

Last synced: 16 Apr 2026

https://github.com/pbinkley/tweets-national-emergency-library

A twarc harvest of tweets related to Internet Archive's National Emergency Library (2020-03-23 to 2021-02-13)

data social

Last synced: 11 Feb 2026

https://github.com/khalyomede/request

Function to validate request data for V.

data function request validate vlang

Last synced: 12 Feb 2026

https://github.com/kirillsemyonkin/lsd

LSD (Less Syntax Data) configuration/data transfer format.

configuration data java parsing rust

Last synced: 27 Feb 2026

https://github.com/bishtrishu/super_store_sales_dashboard

This repository contains a comprehensive sales analysis dashboard for a Superstore, created using Power BI. The objective is to contribute to the success of a business by utilizing data analysis technique, specially focusing on time series analysis, to provide valuable insights and accurate sales forecasting.

analytics data data-science dataanalysis dataanalyst datacleaning datascience datavisualization-project excel microsoft-azure microsoft-excel powerbi report sql

Last synced: 28 Feb 2026

https://github.com/jeswr/blog

My personal blog

ai blog data semantics solid web

Last synced: 13 Feb 2026

https://github.com/j0a0m4/olympics

Final Project for Data Engineering Accelerated LATAM

data olympics spark

Last synced: 13 Feb 2026

https://github.com/krishkumar/scrobbles

all the music 🎸

data music scrobble

Last synced: 13 Feb 2026

https://github.com/e-kotov/albofr-data-archive

Tiger Mosquito Colonisation in France data

aedes-albopictus colonisation data france tiger-mosquito

Last synced: 23 May 2026

https://github.com/sunnahboy/checkfake_true_news

Building data structures using Linked lists and arrays and find best algorithms for implementing a system for detecting Fake News

algorithms data level low programming structure

Last synced: 28 Feb 2026

https://github.com/gusenov/open-data-scripts

Scripts to explore public datasets. Скрипты для работы с открытыми данными.

charts data data-visualisation data-visualization datavisualization highcharts kazakhstan open-data opendata qazaqstan

Last synced: 28 Feb 2026

https://github.com/florianreuth/pit

pit - the private information tracker

data java passwords security vault

Last synced: 28 Feb 2026

https://github.com/mochsyahrizal/jkfkjabar_studycase

First Data Analytics Study Case

data datanalytics studycase

Last synced: 15 Feb 2026

https://github.com/gourab337/karnataka-health-visualizer

Visualizer for Karnataka's district-wise healthcare info built using PHP

analytics data

Last synced: 19 Mar 2026

https://github.com/nagar2nd/ml-regressionmodel---cardekho-price-prediction

This repository features a machine learning model for predicting used car prices using data from CarDekho.com. The project leverages exploratory data analysis and regression techniques to empower sellers and buyers with actionable insights in the Indian used car market.

analytics cleaning-data data linear-regression machine-learning matplotlib numpy pandas python seaborn

Last synced: 16 Apr 2026

https://github.com/arnocan/yapydata

The yapydata provides miscellaneous low-level Python data access APIs.

data datastructures ini json properties python python2 python3 xml yaml

Last synced: 16 Feb 2026

https://github.com/soenneker/soenneker.attributes.mapto

A C# attribute for generic data mapping translation

attributes columns csharp data datatables dotnet mapping mapto maptoattribute object

Last synced: 02 Mar 2026

https://github.com/coderjolly/spotify-api-data-analysis

The project leverages Apache Airflow for automating Spotify API data analysis, focusing on user activity. Extracting, transforming, and loading data efficiently, it provides insights via PowerBI dashboards.

airflow airflow-dags data data-engineering etl etl-pipeline microsoft-sql-server power-bi python scripting sql

Last synced: 27 Mar 2026

https://github.com/shubhamsoni98/excel-practice

Excel-Practice-Questions

analysis data excel formula raw-data xlsx

Last synced: 03 Mar 2026