An open API service indexing awesome lists of open source software.

data

Individual facts, statistics, or items of information, often numeric. In a technical sense, data are a set of values of qualitative or quantitative variables about one or more persons or objects. (https://en.wikipedia.org/w/index.php?title=Data&oldid=1093674723, released under CC BY-SA 3.0)

https://github.com/ayushman0511/data-analytics-project1

This repository contains a collection of SQL scripts demonstrating various analytical techniques, such as change-over-time, cumulative, performance, data segmentation, and part-to-whole analysis.

analytics busine data data-anal data-enginee data-sci data-scien database datascien query reporting sql sql-query sql-server window-func

Last synced: 17 Jun 2026

https://github.com/bubblymaps/bubblymaps

The open source bubbler map. Mapping the world's water fountains. Open Code, Open Data.

bubbler bubbly-maps data fountain map open-source water

Last synced: 31 Jan 2026

https://github.com/denisecase/dc-texter

Send a text message using Python

alerts data python sms-messages streaming

Last synced: 08 Feb 2026

https://github.com/jrmi/pyfiles

Big file collection manager

big-data data opendata

Last synced: 31 Jan 2026

https://github.com/noedemange/orderedheatmapanalysis

OrderedHeatMapAnalysis (OHMA) is a direct data analysis framework for simultaneously visualizing and analyzing the structure of complex datasets. An optimized seriation of the rows and columns of the input data table is performed, mapping the whole dataset into an ordered heatmap.

analysis bi-seriation data dataanalysis heatmap r rstats seriation shiny shiny-apps

Last synced: 27 Feb 2025

https://github.com/okieraised/rke2-deployment

Single-node RKE2 deployment

data helm helm-charts helm-deployment rke2

Last synced: 17 Mar 2026

https://github.com/abhijeetdasbakshi/ecommerce-insights

A Dockerized end-to-end project that combines unsupervised machine learning for customer segmentation with scalable data pipelines. It uses MongoDB for data ingestion, Scikit-learn for clustering, Airflow for orchestration, and Streamlit for interactive visualization — enabling actionable insights into e-commerce

airflow airflow-dags ci-cd-pipeline clustering dags data data-pipelines docker docker-compose docker-container dockerfile git great-expectations kafka mongodb pca-analysis postgresql pyspark t-sne umap-learn

Last synced: 04 Apr 2026

https://github.com/drostlab/biodbretrievr

Retrieve and efficiently index entire biological sequence databases

biological-data biological-sequences data databasestoring retrieval

Last synced: 26 Feb 2026

https://github.com/mahtabranjbar/onlineshopping_analysis_dashboard

This project analyzes online shopper behavior using various machine learning models and EDA techniques.

dashboard data dataanalysis eda machine-learning streamlit

Last synced: 08 Feb 2026

https://github.com/jigyasag18/aircraft-data-management

This repository offers a comprehensive simulation of global military air deployments involving 10 countries, aircraft models, mission types, and strategic zones. It analyzes air power distribution, mission intent (offensive, defensive, support), and geopolitical positioning. The project provides structured insights into regional & zone level threat

aircraft-data aircraft-performance data data-analysis data-visualization database database-management dataset datavisualisation mysql powerbi powerbi-report powerbi-visuals sql

Last synced: 04 Feb 2026

https://github.com/michaelfromyeg/lyrics

Lyric-store and API hosted on Git.

data lyrics

Last synced: 08 Feb 2026

https://github.com/haroontrailblazer/machine_learning

A curated resource hub for learning machine learning, featuring tutorials, code examples, datasets, and hands-on projects to build foundational skills and explore real-world applications.

data data-analysis data-visualization database dataset gradient-descent machine-learning pandas python3 random-forest sklearn statistics

Last synced: 16 Apr 2026

https://github.com/turner-kendall/turner-kendall

Turner Kendall - dev, ops, sec.

config data github-config go rust security

Last synced: 31 Oct 2025

https://github.com/os-climate/data-requests

This repo is used to track issues related to new Data Requests

data data-engineering dataset

Last synced: 27 Feb 2026

https://github.com/pbinkley/mfmcollections

Project to distill data about published collections of microfilms from library lists

data research retro

Last synced: 28 May 2026

https://github.com/miozilla/snowden

snowden ⛄🎮: VR Game # Snowflake # Data Engineering # ELT

data elt engineering snowflake sql vr-game

Last synced: 11 Feb 2026

https://github.com/ppabam/eda-bam

Navigating data from one thing to another.

cli data eda python

Last synced: 11 Feb 2026

https://github.com/kunalthakur204/visualization-on-flower

🌸 Flower Dataset Visualization: visualizing patterns and relationships in flower data through charts and plots. Perfect for exploring floral characteristics and trends! 📊

data data-visualization dataanalysis flowerdataset python

Last synced: 16 Apr 2026

https://github.com/mikeqfu/network-rail-track-fixity-layer

This project develops a data mining tool for analysing and predicting track movements using asset data, environmental factors and track design knowledge to model key parameters and generate fixity values for the GB rail network.

data data-integration data-mining data-science information-management knowledge-discovery point-cloud rail rail-alignment rail-track track-fixity

Last synced: 02 Sep 2025

https://github.com/bablukumarjha/startup-funding-revenue-analysis-by-sql-and-pandas

SQL project analyzing startup funding, revenue, and founder data to extract business insights using Python and MySQL.

data data-analysis data-platform data-science dataanalysisusingpython dataanalytics pandas-dataframe pandas-library python sql sql-server sqlalchemy sqldatabase

Last synced: 18 May 2026

https://github.com/ibttf/bayborhood

Interactive map to find the ideal neighborhood in San Francisco based on data.

data data-analysis data-visualization gis mapbox react

Last synced: 18 Jun 2026

https://github.com/pawamoy/keycut-data

Keyboard shortcuts data stored in YAML files

data keyboard-shortcuts

Last synced: 12 Feb 2026

https://github.com/foundationallm/.github

A platform accelerating delivery of secure, trustworthy enterprise copilots.

agent ai data enterprise generative-ai large-language-model llm ml tool

Last synced: 12 Feb 2026

https://github.com/namratha2301/sales-orders-analysis

Wanted to experiment with Looker. This dashboard visualizes sales trends across regions, customer segments, and product categories.

business-analytics dashboard data dataanalysis datavisualization excel looker looker-studio

Last synced: 13 Feb 2026

https://github.com/plnech/never2late

Never 2 Late - a reinterpretation of Everest Pipkin's 'i've never picked a protected flower'

dada dada-science data generative-art glitch-art installation nlp poetry spacy vector-similarity wallpaper

Last synced: 10 Jun 2025

https://github.com/krishkumar/scrobbles

all the music 🎸

data music scrobble

Last synced: 13 Feb 2026

https://github.com/dushansenadheera/web_scraper

web scraper using Python along with BeautifulSoup and Selenium

beautifulsoup data python selenium web-scraping

Last synced: 19 Jun 2026

https://github.com/juanandres-montero/dataanalysis

Dedicated to data analysis.

costa-rica data

Last synced: 10 Aug 2025

https://github.com/imartinezl/madrid-challenge

Madrid Route Optimization Challenge 🚚♻️🚚

challenge city data optimization routing-algorithm traffic

Last synced: 28 Feb 2026

https://github.com/nolanbconaway/rollercoaster-tycoon-data

Every roller coaster I have built in RCT2 for iPad

data roller-coaster-tycoon

Last synced: 24 Mar 2025

https://github.com/sunnahboy/checkfake_true_news

Building data structures using linked lists and arrays, and finding the best algorithms for implementing a fake news detection system

algorithms data level low programming structure

Last synced: 28 Feb 2026

https://github.com/mwelwankuta/image-match

A multi-threaded tool for batch-renaming images based on their appearance and their match in a data source

data openai typescript worker-threads

Last synced: 09 Mar 2025

https://github.com/d4niee/exifpy

A simple console tool to view image metadata

data exif image meta python

Last synced: 23 Mar 2025

https://github.com/mbagalman/lattice-doe

Python code to create experimental designs optimized to meet statistical power targets

abtesting data datascience designofexperiments experimentaldesign statistics

Last synced: 19 Jun 2026

https://github.com/lablnet/alibaba_scraper

This is a robust web scraper that extracts data from the Alibaba website. It's multi-threaded and utilizes Playwright to efficiently scrape data from the website. This script is capable of scraping the entire Alibaba site, which would take approximately 4-6 months to complete.

alibaba data ecom mit-license open-source products scraper

Last synced: 15 Mar 2025

https://github.com/gourab337/karnataka-health-visualizer

Visualizer for Karnataka's district-wise healthcare info built using PHP

analytics data

Last synced: 19 Mar 2026

https://github.com/omarcodex/data_analysis

My repository of past and present research and data-driven projects.

data ecodev ecology science sustainability yale

Last synced: 18 Jan 2026

https://github.com/sakan811/show-leaving-soon-tracker-website

This is a Vue.js application that displays shows that are leaving each platform soon, featuring a countdown timer for each title based on the user's local timezone.

data hbo hbomax netflix shows streaming tv-shows vue vuejs web webapp website

Last synced: 18 Mar 2025

https://github.com/vishwas-chakilam/hr-dashboard

This project involves creating an interactive HR Dashboard using Power BI for visualization and MySQL for data cleaning and analysis. It provides insights into employee performance, attrition, salary distribution, and hiring trends.

dashboard data datac datacleaning datavisualization mysql powerbi

Last synced: 23 Mar 2025

https://github.com/soenneker/soenneker.attributes.mapto

A C# attribute for generic data mapping translation

attributes columns csharp data datatables dotnet mapping mapto maptoattribute object

Last synced: 02 Mar 2026

https://github.com/j2kun/terrorism-usa-post-9-11

A copy of the terror data published by NewAmerica

data politics terrorism transparency

Last synced: 02 Mar 2026

https://github.com/boratechlife/tensorflow-questions-datasets

TensorFlow question datasets to help you practice machine learning and train models

data datapreprocessing datasets machinelearning modeltrain questions tensorflow

Last synced: 23 Mar 2025

https://github.com/inzhenerka/scooters_data_generator

Generate data of scooter trips for analysis

data dbt generator

Last synced: 02 Jun 2026

https://github.com/jillmpla/kaggle_notebooks

Kaggle-based data analysis, data science, and data visualization.

data data-science data-visualization kaggle machine-learning

Last synced: 16 Apr 2026

https://github.com/anuraganalog/onyx-data

BI visualizations for the problems on the website. All the visualizations can be found at the link below

data onyx public tableau viz

Last synced: 02 Apr 2026

https://github.com/rylan12/apscores

A quick way to visualize how the AP score distributions have changed from year to year.

advanced-placement analysis ap-exam data scores

Last synced: 19 Jun 2026

https://github.com/dhimmel/adeptus

ADEPTUS -- differential gene expression signatures of disease

adeptus data differential-expression disease gene-expression genes rephetio

Last synced: 05 Jan 2026

https://github.com/jigyasag18/power-bi-dashboard-project

The Ecommerce Sales Analysis Dashboard project uses Power BI to provide detailed insights into ecommerce sales data, enabling stakeholders to track key performance metrics and uncover trends. This interactive dashboard lets users explore the data in real time, with features such as drill-down capabilities and customizable filters.

dashboard data data-visualization datacleaning datanalysis datanalytics datapreprocessing powerbi visulaization

Last synced: 04 Mar 2026

https://github.com/zelon88/motorized_bike_data

A repo to contain data in various formats related to motorized bicycle configurations.

bicycle bikes data data-set engine w

Last synced: 05 Mar 2026

https://github.com/suryadev99/stream_processing_website_click_data

Stream Processing of website click data using Kafka and monitored and visualised using Prometheus and Grafana

clickdata data dataengineering docker flink-kafka flink-metrics flink-stream-processing git grafana kafka kafka-streams kafka-topic prometheus psql python

Last synced: 10 Mar 2026

https://github.com/amethyst-php/collection

As simple as the name suggests, this package allows you to create collections of other models.

amethyst amethyst-package api collection data laravel

Last synced: 17 Apr 2026

https://github.com/gkannan-codes/habitableexos

With Earth’s habitability under strain, we ask: which known exoplanets could humans live on? Using NASA’s Exoplanet Archive, we score planets 0–1 (1 ≈ Earth) from five Earth-normalized features to rank top candidates.

data html kaggle matplotlib-pyplot numpy pandas plotly python seaborn visualization

Last synced: 11 Apr 2026

https://github.com/mecha-cms/x.route

Custom route files.

custom data extension file folder path route url

Last synced: 23 Mar 2025

https://github.com/bastianolea/cut_comunas

Updated version of the unique territorial codes (CUT) for the country's communes and regions.

chile comunas data estado

Last synced: 24 Jun 2026

https://github.com/ashfaqalizardariofficial/databasehelper

A C# database helper library for connecting to a database server and performing insert, update, delete, select, and multiple-select actions on the database.

ashfaq-ali-zardari ashfaq-ali-zardari-official data database delete helper insert ms-sql-server multiple select-data server sql-server update

Last synced: 02 Apr 2026

https://github.com/ehvenga/data.driven.modeling

Repository to practice data-driven modelling

data data-modeling

Last synced: 23 Mar 2025

https://github.com/equinor/fmu-sumo

Interaction with Sumo in the FMU context

analytics data fmu python subsurface sumo visualization

Last synced: 01 May 2025

https://github.com/bastianolea/mineduc_personal_academico

Data on academic staff in the higher education system, from 2008 to 2024.

chile data educacion meses tiempo

Last synced: 19 Jun 2026

https://github.com/yuvrajsaraogi/sales-prediction-using-python

Sales prediction involves estimating future product sales based on factors like advertising spend, target audience, and platform. Businesses rely on data scientists to forecast sales and optimize advertising costs. Machine learning in Python can be used for this task.

data data-analysis data-science data-visualization machine-learning matplotlib natural-language-processing numpy pandas prediction python sales-prediction-using-python sql

Last synced: 19 Apr 2026

https://github.com/zurd46/zurdsynthdatagen

This Electron project uses the OpenAI ChatCompletion API to generate synthetic datasets in either German (DE) or English (EN).

data data-structures dataset electron json jsonl nodejs openai synthetic

Last synced: 04 Apr 2026

https://github.com/artcc/coredatademo

Demo for CoreDataGenericModule implementation

core coredata coredata-model data encrypted encrypted-data encryption persist

Last synced: 19 Jun 2026

https://github.com/jigyasag18/iit-guhawati-final-capstone-project

Smart Dynamic Parking Price Optimization System that adjusts parking fees in real-time based on demand, traffic, and competition. It employs adaptive pricing models and rerouting logic to enhance parking utilization and reduce congestion. The system is visualized via an interactive Streamlit dashboard, enabling users to simulate dynamic pricing.

bokeh bokeh-server bokehplots capstone-project data dataset deployment machine-learning machine-learning-algorithms matplotlib matplotlib-pyplot mlproject normalisation numpy pandas pathway python streamlit

Last synced: 05 Apr 2026

https://github.com/stimulsoft/samples-dashboards.web-for-blazor-webassembly

Blazor WebAssembly (Wasm) samples for Reports.BLAZOR embedded components, Visual Studio C# projects, .NET 6, .NET 7, .NET 8 dashboards tool

blazor client-side converter dashboard data data-analysis data-sources database datagrid designer diagram dimension json net presentation print runtime viewer wasm webassembly

Last synced: 18 Apr 2026

https://github.com/mksingh431/free-data-science-courses

Data science is a rapidly growing tech field that’s transforming business decision-making. To break into this field, you need the right skills. Fortunately, top institutions like Harvard and IBM offer free online courses. These courses cover everything from basic programming to advanced machine learning.

course data data-analysis data-science data-visualization free freecou python

Last synced: 19 Apr 2026

https://github.com/istinnew/etl-pipeline-ganz-project

End-to-end ETL pipeline project for collecting, transforming, and loading data into a cloud-based database using Python, MySQL, and Google Cloud Analytics

cloud cloud-engineering cloud-services data data-science dataanalytics database database-schema googlecloud mysql mysql-database python python-lambda

Last synced: 20 Apr 2026

https://github.com/hormcodes/data

Terraform configuration for public data storage hosted on data.horm.codes

aws cloudfront content-management data github-actions s3-bucket terraform

Last synced: 20 Apr 2026

https://github.com/nikoheikkila/maps

A TypeScript collection of specialized map implementations

data javascript maps typescript

Last synced: 20 Apr 2026

https://github.com/mozzo1000/web-analytics

Website analysis tools and data

analysis analytics data website

Last synced: 21 Apr 2026

https://github.com/charon25/weatherdata

17 000 weather measurements collected by a weather station created for a college project.

csv data dataset datasets json measurements strasbourg weather weather-data

Last synced: 16 Jan 2026

https://github.com/zawaung7791/streamlit-data-viewer

Data previewer using streamlit, plotly and python

data plotly python streamlit

Last synced: 21 Apr 2026

https://github.com/schijioke-uche/data-analysis-with-python-an-spss-model

With this Python notebook, you can use an SPSS Model notebook to build machine learning pipelines that let you iterate rapidly during the model-building process in data analysis. Whether you're trying to find the right algorithm or experimenting with different ways of preparing your data, you can create reproducible research, with a hypothesis definition, that any member of your team can easily understand.

anova cp4a cp4d cp4i cp4s data ibm ibm-cloud jeffrey-chijioke-uche jeffrey-solomon-chijioke-uche openshift python python3 redhat t-test

Last synced: 22 Apr 2026

https://github.com/grimen/python-humanizer

A human/developer friendly value humanizer - for Python.

data debug debugging format formatting humanize humanizer log logging print printing value

Last synced: 05 Jun 2026

https://github.com/syed-nihaal/car-price-prediction-and-performance-analysis

A data science notebook project focused on analyzing car features and building a model for car price prediction.

data data-analysis data-visualization jupyter-notebook python

Last synced: 23 Apr 2026

https://github.com/ppatrzyk/heatmap

Display CSV as a heatmap in terminal

csv data data-visualization terminal

Last synced: 24 Apr 2026

https://github.com/yuvrajsaraogi/-iris-flower-classification

The iris flower has three species (setosa, versicolor, and virginica), which differ in their measurements. Assume that you have the measurements of iris flowers by species; the task is to train a machine learning model that learns from those measurements and classifies the species.

classification data data-analysis data-science data-visualization flower flower-classification iris iris-classification iris-flower iris-flower-classification knn knn-classification machine-learning machine-learning-algorithms ml natural-language-processing nlp python

Last synced: 24 Apr 2026

https://github.com/hruth-vik/sales-analysis-report

SalesScope is a powerful sales analytics dashboard that extracts insights, reveals trends, and drives strategy from raw data.

analytics data powerbi-report powerbi-visuals python

Last synced: 24 Apr 2026

https://github.com/kuanhungchen/spring-2019-data-structures

📦 Some programming assignments about basic data structures.

data data-structures

Last synced: 25 Feb 2025

https://github.com/rubix982/product-quality-classification

This is an implementation for the CIKM AnalytiCup 2017 on the topic of "Product Title Quality". The goal is to take SKUs and rank the clarity and conciseness of their titles. Referenced papers are attached to this repository. The aim is to craft ensemble models that either replicate results or find new methods for classification.

data data-analysis information-retrieval jupyter-notebook machine-learning nlp python spacy-nlp

Last synced: 25 Apr 2026

https://github.com/carlos-levi/twitterbots_analise_redesneurais

Project for an AI course: exploratory analysis and application of machine learning techniques to detect automated accounts (bots) on the 𝕏 (Twitter) platform

data machine-learning twitter-bot

Last synced: 06 Jun 2026

https://github.com/dms-codes/scrape-kesaintblanc-id

Kesaintblanc Data Scraper: a Python script designed to scrape product data from the Kesaintblanc website. It collects information about products, including product name, URL, price, image URLs, status, stock, and more. The scraped data is saved to a CSV file for further analysis.

data kesaintblanc python webscraper

Last synced: 27 May 2026

https://github.com/f-ssemwanga/pandas-numpy-repo

This repo contains extensive work I did with the pandas and NumPy modules during the advanced programming module

cleaning-data-in-python data numpy-arrays pandas visualization

Last synced: 27 Apr 2026

https://github.com/tsbarr/citi-bikes-challenge

Citibikes NYC Data Analysis: Uncover insights from over a decade of ride data. Jupyter notebook for data aggregation/cleaning & Tableau dashboards for interactive visualization.

data data-visualization pandas-python python tableau

Last synced: 27 Apr 2026