An open API service indexing awesome lists of open source software.

data

Individual facts, statistics, or items of information, often numeric. In a technical sense, data are a set of values of qualitative or quantitative variables about one or more persons or objects. (https://en.wikipedia.org/w/index.php?title=Data&oldid=1093674723, released under CC BY-SA 3.0)

https://github.com/mikeqfu/network-rail-track-fixity-layer

This project develops a data mining tool for analysing and predicting track movements using asset data, environmental factors and track design knowledge to model key parameters and generate fixity values for the GB rail network.

data data-integration data-mining data-science information-management knowledge-discovery point-cloud rail rail-alignment rail-track track-fixity

Last synced: 02 Sep 2025

https://github.com/randomgamingdev/randomgamingdev.github.io.data

The data for RandomGamingDev.github.io (feel free to build your own website off of mine :D)

blog custom data projects projects-list

Last synced: 02 Jan 2026

https://github.com/bdr-pro/graphyml

A powerful, interactive Streamlit application to explore, edit, visualize, and query a graph-based database of YAML nodes — ideal for movie metadata, research articles, or structured knowledge graphs.

data database yaml yml

Last synced: 23 Jul 2025

https://github.com/akashlogics/street-data-tracking

Detect, Track and Count number of persons walking across the path(s) making use of YOLO. This Python project tracks people moving across predefined street zones

analysis data excel newdataset object-detection opencv python python3 yolo

Last synced: 19 May 2026

https://github.com/amethyst-php/activity

Someone just did something, should we save who did this and when?

activity amethyst amethyst-package api data laravel

Last synced: 17 May 2026

https://github.com/jameshenderson12/chatbot-utils

Generic data and elements that can be reused or repurposed for chatbot development.

boilerplate chatbot data development elements intents template utterances

Last synced: 04 Mar 2026

https://github.com/dushansenadheera/web_scraper

web scraper using Python along with BeautifulSoup and Selenium

beautifulsoup data python selenium web-scraping

Last synced: 19 Jun 2026

https://github.com/turner-kendall/turner-kendall

Turner Kendall - dev, opps, sec.

config data github-config go rust security

Last synced: 31 Oct 2025

https://github.com/jigyasag18/financial-risk-analysis-project

The Credit Card Financial Risk Analysis Dashboard is a real-time Power BI tool designed to provide insights into credit card transactions and customer demographics. It features interactive visualizations, efficient data processing, and actionable insights to support decision-making. Utilizing data from SQL database, the dashboard tracks key metrics

data dataanalysis database datacleaning datapreprocessing dataprocessing datavisualization financial-analysis financialriskanalysis mysql powerbi sql statistical-analysis

Last synced: 06 Mar 2026

https://github.com/shahules786/titanic-analysis

different analysis of titanic accident (data from kaggle)

analyze data titanic-kaggle

Last synced: 26 Jun 2025

https://github.com/kameronbrooks/datalys2-reporting

Datalys2 Reports allows you to create rich, interactive reports by simply defining a JSON configuration embedded in your HTML. It handles the layout, data visualization, and interactivity, so you don't need to write custom React code for every report.

data data-visualization html react

Last synced: 08 Apr 2026

https://github.com/rahult18/atmo-flow

AtmoFlow is a robust data engineering pipeline built on Google Cloud Platform (GCP) that processes and analyzes weather and air quality data in both batch and streaming modes

airflow data data-modeling data-science data-visualization dataengineering gcp-bigquery gcp-cloud-composer gcp-cloud-functions pyspark

Last synced: 23 Jun 2026

https://github.com/smaug6739/data-bit

This project is a module for converting a structured dataset into a number that can be stored in a database taking up little space.

bits data nodejs

Last synced: 14 May 2026

https://github.com/encoreshao/data-science

Data analyze examples, using Jupyter notebook and Python!!!

data dataanalysis encore jupyter-notebook

Last synced: 29 Mar 2025

https://github.com/gustavonav/daily-youtube-extraction

Projeto que completa a criação de um ambiente para extração, armazenamento e processamento de dados do Youtube

airflow data minio python3 spark

Last synced: 21 Feb 2026

https://github.com/seldszar/piccha

Another tree data structure

data tree

Last synced: 16 Jul 2025

https://github.com/fliplet/fliplet-widget-data-source-query

Data Source Query Provider

data provider widget

Last synced: 11 Apr 2025

https://gitlab.com/pommalabs/htmlark

HtmlArk packs a webpage into a single HTML file: https://htmlark-docs.pommalabs.xyz/

audios css data embed fonts html images javascript uri videos

Last synced: 03 Sep 2025

https://github.com/jigyasag18/aircraft-data-management

This repository offers a comprehensive simulation of global military air deployments involving 10 countries, aircraft models, mission types, and strategic zones. It analyzes air power distribution, mission intent (offensive, defensive, support), and geopolitical positioning. The project provides structured insights into regional & zone level threat

aircraft-data aircraft-performance data data-analysis data-visualization database database-management dataset datavisualisation mysql powerbi powerbi-report powerbi-visuals sql

Last synced: 04 Feb 2026

https://github.com/coderooz/hr-dashboard

The goal of this project is to create a power bi dashboard to showcase the attrition data within the company.

data data-analytics power-bi

Last synced: 07 Jan 2026

https://github.com/ailixter/gears-dictionary

The project, which Gears Dictionary

arrays data dictionaries dictionary php struct utilities

Last synced: 19 Jul 2025

https://github.com/abhijeetdasbakshi/ecommerce-insights

A Dockerized end-to-end project that combines unsupervised machine learning for customer segmentation with scalable data pipelines. It uses MongoDB for data ingestion, Scikit-learn for clustering, Airflow for orchestration, and Streamlit for interactive visualization — enabling actionable insights into e-commerce

airflow airflow-dags ci-cd-pipeline clustering dags data data-pipelines docker docker-compose docker-container dockerfile git great-expectations kafka mongodb pca-analysis postgresql pyspark t-sne umap-learn

Last synced: 04 Apr 2026

https://github.com/ffatahillah7/snowflake-data-governance-warehouses

Welcome to the Powered by Tasty Bytes - Zero to Snowflake Quickstart focused on Data Governance! Within this Quickstart we will learn about Snowflake Roles, Role Based Access Control and deploy both Column and Row Level Security that can scale with your business.

data data-governance snowflake

Last synced: 06 Jan 2026

https://github.com/0xHericles/SpamDetector

:email: A Simple Python Spam Detector with Scikit-Learn

data ham machine-learning python sklearn spam

Last synced: 24 Mar 2025

https://github.com/germanpaul12/automating-hacker-news-and-weather-mails

Project for my Raspberry Pi to send me mails when it rains and to inform with hot tech news

beautifulsoup beautifulsoup4 data hacker-news openweather-api raspberry-pi requests

Last synced: 05 May 2026

https://github.com/heitang/fcu-classid

逢甲大學:學院 ID 、 系所 ID 和班級 ID

data fcu project

Last synced: 30 Mar 2025

https://github.com/0xHericles/ufcg-geojson

GeoJSON file containing the blocks and buildings of the Federal University of Campina Grande.

data data-visualization geojson map open-source ufcg university

Last synced: 24 Mar 2025

https://github.com/doppelgunner/baby

A program for storing data just for fun

data doppelgunner java note storing

Last synced: 12 Jun 2026

https://github.com/noedemange/orderedheatmapanalysis

OrderedHeatMapAnalysis (OHMA) is a direct data analysis framework allowing to simultaneously visualize and analyze the structure of complex datasets. An optimized seriation of rows and columns of the input data table is performed, resulting in a mapping of the whole dataset into an ordered heatmap.

analysis bi-seriation data dataanalysis heatmap r rstats seriation shiny shiny-apps

Last synced: 27 Feb 2025

https://github.com/richardlitt/bird-watching

My birdwatching list and repo

birding data ebird

Last synced: 26 Jan 2026

https://github.com/nika2811/new-york-city-taxi-fare-prediction

About In this project using New York dataset we will predict the fare price of next trip. The dataset can be downloaded from https://www.kaggle.com/kentonnlp/2014-new-york-city-taxi-trips The dataset contains 8 features along with GPS coordinates of pickup and dropoff

data data-preprocessing data-visualization decision-trees feature-engineering kaggle kaggle-competition linear-regression machine-learning neural-network nyc polynomial-regression ridge-regression scikit-learn taxi taxi-data tensorflow xgboost

Last synced: 06 Apr 2025

https://github.com/fatihilhan42/hollywood-theatrical-market-synopsis-1995-to-2021

In this project, the data of hollywood film production companies from 1995 to 2021 were examined. Significant tables and graphs were created using data visualization algorithms, with the tickets sold divided into categories.

data data-analysis data-science data-visualization

Last synced: 23 Mar 2025

https://github.com/petzi53/repair

R Datasets of the Open Repair Alliance (ORA).

data r repair repair-cafe

Last synced: 19 May 2026

https://github.com/cpietsch/breitband

developer repo of breitband-berlin

d3js data threejs visualization

Last synced: 02 May 2026

https://github.com/rylan12/apscores

A quick way to visualize how the AP score distributions have changed from year to year.

advanced-placement analysis ap-exam data scores

Last synced: 19 Jun 2026

https://github.com/sauravsrivastav/githubreposearcher

GitHub Repo Searcher 🔍 is a Streamlit web application designed to help you search for GitHub repositories based on a query and view the results in a tabular format. You can also download the results in CSV or Excel format for further analysis. 📊📈

data data-export excel github-api python repository-searcher streamlit webapp

Last synced: 20 Jan 2026

https://github.com/frnt-end/ts-context-items-list

⚛️ React Typescript project - Fetch data and display it as a list of 10 items in 10 (pagination) pages. click on each item leads to more details page- using axios, Context and Styled Components.

api axios context context-api data fetch list pagination router router-dom styled-components typescript

Last synced: 19 May 2026

https://github.com/mumtaz4118/nlp-course

Programming Assignments and Lectures for Stanford's CS 224: Natural Language Processing with Deep Learning

course data data-analysis data-analytics data-science data-visualization deep-learning education machine-learning natural-language-processing neural-network transfer-learning

Last synced: 24 Nov 2025

https://github.com/shahsuvarli/election-voters-data-analysis-pandas

Educational project analyzing Azerbaijan voter demographics with pandas, focusing on data cleaning, grouping, and visualization.

cleaning data grouping matplotlib numpy pandas python visualization

Last synced: 12 Apr 2026

https://github.com/jsanz/kart-test

Testing Kart repository

data geospatial kart

Last synced: 26 Jan 2026

https://github.com/arthurdanjou/studies

💼 This is the repository containing all my projects done during my studies in Python and R.

ai data data-science data-visualization jupyter jupyter-notebook ml python r

Last synced: 08 Apr 2025

https://github.com/infinitode/pyautoplot

PyAutoPlot is an open-source Python library designed to make dataset analysis much easier by generating helpful detailed plots using matplotlib. It automatically generates appropriate plots based on the dataset you feed it.

analysis automatic csv data dataset dataset-analysis generation matplotlib pandas plots plotting-in-python plotting-library python

Last synced: 16 Mar 2025

https://github.com/merekat/flight-delay-prediction

This project focuses on predicting flight delays using historical data from a Tunisian airline. We analyzed patterns in airport operations and flight schedules to build a machine learning model that can forecast potential delays.

aviation data data-science machine-learning machine-learning-algorithms machinelearning prediction predictive-modeling

Last synced: 08 Apr 2025

https://github.com/jigyasag18/movie-recommendation-system-project

This repository features a personalized movie recommendation system that offers tailored suggestions to users. It leverages a dataset of 5,000 English-language films and utilizes data processing, feature engineering, and a cosine similarity algorithm to analyze user preferences. The system includes an intuitive user interface for easy navigation.

data datacleaning datapreprocessing machine-learning machine-learning-algorithms python streamlit streamlit-webapp

Last synced: 28 May 2026

https://github.com/beriberikix/senml-zephyr

A codec for encoding and decoding Sensor Measurement Lists (SenML) for Zephyr

codec data iot senml sensor zephyr-rtos

Last synced: 24 Mar 2025

https://github.com/ot-code/sql-sabor-y-tradicion

A SQL-driven project that integrates menu and order data to reveal insights on dish performance, customer preferences, and spending trends. It informs pricing strategies, menu adjustments, and targeted promotions, ultimately enhancing the overall customer experience and driving business growth.

analytical-queries data data-aggregation data-analysis database-design join-queries mysql order-analytics relational-databases restaurant-data sql sql-script

Last synced: 08 Apr 2025

https://github.com/michaelfromyeg/data

Data set dump.

data data-set

Last synced: 16 Jan 2026

https://github.com/gsmithun4/expressjs-field-validator

Plugin for validating JSON request, middleware for expressjs

data express-js expressjs json-request middleware nodejs request rest-api validation

Last synced: 06 Mar 2026

https://github.com/fordinand45/bdp_a_kelompok_3

Project Big Data Python yang diadakan oleh Digitalent Kominfo. Berikut adalah yang ikut serta pada project, yaitu : Dhian Prameswari, Fordinand Pasaribu, dan Muhdad Alfaris Bachmid

data data-analytics data-science linear-regression python3

Last synced: 12 Apr 2026

https://github.com/svetlanam/etl-transformation

ETL data cleaning and transformation for specific use case in own Keboola project

cleaning data etl keboola python rest-api transformation

Last synced: 20 Jun 2026

https://github.com/raghavendranhp/attrition-alchemy

This project uses machine learning to predict and analyze employee attrition in Company.By developing three predictive models,it identifies key factors influencing turnover,providing actionable insights to mitigate attrition challenges.The analysis focuses on enhancing job satisfaction,work-life balance and career growth opportunities.

data datawrangling decision-trees eda gradient-boosting logistic-regression macine-learning pandas preprocessing random-forest-classifier skicit-learn svm

Last synced: 18 May 2026

https://github.com/keanteng/kaggledata

📊Data Source For Program Testing

data dataset excel

Last synced: 24 Mar 2025

https://github.com/rid17pawar/friendscircle

Friends Circle is a console based application developed in cpp using Graph Data Structure.

cpp data graph graph-algorithms oop

Last synced: 08 Jun 2026

https://github.com/giuleo129/dataanalysis

This folder contains two projects focused on data analysis and statistical learning using R, covering exploratory data analysis, modeling, and predictive techniques.

data data-analysis data-science statistical-learning

Last synced: 25 Jan 2026

https://github.com/jigyasag18/amazon-power-bi-dashboard

The Amazon Power BI Dashboard Project repository provides an interactive analytics dashboard for visualizing and analyzing sales performance across various product categories within Amazon's ecosystem. Utilizing comprehensive sales data, it empowers stakeholders with actionable insights to enhance decision-making and improve business strategies.

data data-visualization dataanalysis dataanalytics dataset datasets datavisualization-project powerbi powerbi-report powerbi-visuals powerbidashboard

Last synced: 07 Mar 2026

https://github.com/ashishsingh789/quantium_data-analysis-_virtual-internship

Completed a job simulation focused on Data Analytics and Commercial Insights for the data science team. Developed expertise in data preparation and customer analytics, utilizing transaction datasets to extract valuable insights and deliver data-driven commercial recommendations

data datawrangling matplotlib pandas pandas-dataframe presentation programming python python-library

Last synced: 07 Apr 2026

https://github.com/natanast/euroleaguebasketball

An R package providing data on Euroleague Basketball

data data-science package r

Last synced: 01 Apr 2025

https://github.com/darkogamerz/dhis2heat

A Comprehensive data management and Health Equity Assessment and Analysis platform that fetches data from DHIS2, optimize, calculate, clean and visualize inequality data.

analytics data data-science dhis2 equality equity health heat inequality r shiny shinydashboard visualization

Last synced: 01 Apr 2025

https://github.com/v-mayya/quantitative-analysis-data-dashboard

Quantitative survey data analysis using R

data data-analysis data-visualization flourish r

Last synced: 01 Apr 2025

https://github.com/ludwing-mj/manipulacion_ej

Ejercicio utilizado en la seccion numero ocho del manual para ejemplificar las herramientas proporcionadas por el tydyverse para la manipulacion de datos.

data manipulate-data package r

Last synced: 01 Apr 2025

https://github.com/suchi25sathavara/data-wrangling-with-r

Analyzing Road Accidents in Victoria, Australia

data r reporting rstudio wrangling-data

Last synced: 01 Apr 2025

https://github.com/suchi25sathavara/r-projects

R projects in Real world Scenerios for Data Analysis

data data-analysis datavisualization r

Last synced: 01 Apr 2025

https://github.com/wraith13/systematic-metasyntactic-variables

This is a list for that you can express the existence of different serieses when using metasyntax variables.

data

Last synced: 14 Jun 2025

https://github.com/armand-sauzay/datasets

Datasets for machine learning

ai data datasets machine-learning ml

Last synced: 18 Jan 2026

https://github.com/dhruvil-26/powerbi-projects

This repository contains Power BI projects showcasing data analysis and interactive dashboards. Each project includes detailed visualizations and insights on diverse topics such as loan analysis, sales performance, and customer behavior.

customer-behavior-analysis data data-analysis interactive-dashboards loan-analysis powerbi sales-performance visualization

Last synced: 04 Feb 2026

https://github.com/h-sutiwas/r2de-2025

This repository is related to the Road To Data Engineer Bootcamp by DataTH. It contains all related coursework, some mini projects and other resources within the field of Data Engineering.

data data-engineering data-visualization docker gcp pipeline spark

Last synced: 30 Apr 2026

https://github.com/tompollard/data

Repository to hold sample datasets etc

data

Last synced: 05 Jan 2026

https://github.com/pchaparro/search-engine

Full stack search-engine created from youtube videos obtained using "web-scraping"

data opensearch python python3 react scraper scraping scraping-websites search search-engine semantic-search sentence-transformers typescript website

Last synced: 17 Apr 2026

https://github.com/wolfchamane/data-sandbox

Sandbox tool for Front-end developments.

data database front-end nodejs npm rest sandbox tool

Last synced: 28 Oct 2025

https://github.com/shubhamsoni98/classification-with-decision-tree

This project predicts iPhone purchases using demographic data (gender, age, salary). A Decision Tree Classifier was used, achieving 88.16% accuracy. Insights from the model can refine marketing strategies, optimize product offerings, and boost sales by targeting key customer segments.

algorithms anaconda classification data data-science descision-tree jupyter-notebook machine-learning prediction python

Last synced: 19 Jan 2026

https://github.com/afeiship/data-pagination

Raw data(items) pagination.

data next page pagination previous total

Last synced: 18 May 2026

https://github.com/yadavkaushal/datascience-e-commerce-shopping-details

This project analyzes customer purchase data including details such as location, company, credit card usage, browser info, job roles and purchase price. It explores patterns in payment methods, spending behavior and online transactions. Using Pandas, Matplotlib and Seaborn, we clean analyze and visualize key trends to derive actionable insights.

data datacleaning dataframe datapreprocessing dataset libraries matplotlib numpy pandas plots visulaization

Last synced: 06 May 2026

https://github.com/makcymal/silvera

My researches on ML and statistics, optimization methods, CS algoritms and numerical methods

algorithms data data-structures machine-learning numerical-methods statistics

Last synced: 01 Apr 2025

https://github.com/inist-cnrs/ws-data

Modèles et données pour les web services

data dvc models

Last synced: 03 Sep 2025

https://github.com/johnelliott/wb-web

Moved —> https://github.com/johnelliott/waybot

arduino browser data iot raspberry-pi web

Last synced: 12 Apr 2026