An open API service indexing awesome lists of open source software.

data

Individual facts, statistics, or items of information, often numeric. In a technical sense, data are a set of values of qualitative or quantitative variables about one or more persons or objects. (https://en.wikipedia.org/w/index.php?title=Data&oldid=1093674723, released under CC BY-SA 3.0)

https://github.com/cljoly/data

📊 Data sets to populate some parts of my website (mostly https://cj.rs/open-source/).

data open-source sqlite wip

Last synced: 03 May 2026

https://github.com/codegouvfr/codegouvfr-data

🧢 Data for code.gouv.fr

bluehats codegouvfr data

Last synced: 05 Mar 2026

https://github.com/powersyang/visualization

Data visualization templates

data templates visualization

Last synced: 24 Mar 2025

https://github.com/ournet/videos-data

Ournet videos data module

data ournet video videos

Last synced: 04 Apr 2025

https://github.com/prajakta1321/streetml-a-cityscape-traffic-volume-prognostication

StreetML leverages machine learning techniques to revolutionize urban traffic prediction through precise volume prognostication, aiming to enhance cityscape mobility through data-driven insights.

catboostregressor data datavisualisation exploratory-data-analysis lightgbm-regressor linearregression machine-learning machine-learning-algorithms predictive-analytics random-forest-regression xgboost-regression

Last synced: 08 Apr 2025

https://github.com/thingston/extractor

Collection of PHP classes to extract data from HTML pages.

data html php

Last synced: 14 Jan 2026

https://github.com/zulfachafidz/telco_churn_insight_customer_loss_prediction_with_random_forest_and_decision_tree-algorithms

The main problem in the business world is customer churn, or losing customers, especially in the telecommunications industry, which experiences very tight competition. To overcome this problem, an analysis was carried out to help the company understand how many customers have the potential to switch providers.

data data-science data-visualization dataanalysis dataanalyst dataanalytics datadrivenwithdataprovider decision-tree decision-tree-classifier decision-trees random-forest random-forest-classifier

Last synced: 01 May 2026

https://github.com/inist-cnrs/ws-data

Models and data for web services

data dvc models

Last synced: 03 Sep 2025

https://github.com/musamairshad/dsa-python

This repository contains all the material related to Data Structures and Algorithms implemented in Python.

algorithms data datastructures efficiency python searching-algorithms sorting-algorithms

Last synced: 25 Mar 2025

https://github.com/zeptosec/bpscrapper

Shows history of oil prices

data data-visualization database nodejs scraper

Last synced: 13 Apr 2026

https://github.com/taquece/goals-per-match

Basic script to calculate average football goals per match from a CSV file

beginner csv data football nodejs python sports-analytics

Last synced: 09 May 2026

https://github.com/h-sutiwas/r2de-2025

This repository is related to the Road To Data Engineer Bootcamp by DataTH. It contains all related coursework, some mini projects and other resources within the field of Data Engineering.

data data-engineering data-visualization docker gcp pipeline spark

Last synced: 30 Apr 2026

https://github.com/deliprofesor/breast-cancer-detection-using-svm-with-smote-and-model-optimization

This project analyzes health and lifestyle factors influencing heart attack risk using statistical methods and machine learning, with Ridge Regression identified as the best predictive model.

classification data data-preprocessing data-science data-visualization gridsearchcv machine-learning python roc-curve smote svm

Last synced: 10 Apr 2025

https://github.com/luminati-io/Google-Maps-dataset-samples

A sample dataset of over 1000 Google Maps businesses, extracted using the Bright Data API, ideal for competitor analysis, location-based marketing, and market strategies.

api data dataset google-maps maps web-scraping

Last synced: 09 Apr 2025

https://github.com/wraith13/systematic-metasyntactic-variables

A list that lets you express the existence of different series when using metasyntactic variables.

data

Last synced: 14 Jun 2025

https://github.com/braiso-22/ejercicio-seguro-medico

An introductory exercise in working with data to make predictions

data data-science dataset ia insurance jupyter-notebook ml python python3

Last synced: 24 Apr 2026

https://github.com/deliprofesor/virtual-reality-in-education-impact-analysis-and-insights

This project examines the impact of Virtual Reality (VR) on education, focusing on its effects on student engagement, learning outcomes, and creativity. It uses data analysis techniques like descriptive statistics, correlation analysis, and clustering to assess VR's effectiveness in enhancing learning.

clustering data data-analysis data-science data-visualization exploratory-data-analysis hypothesis-testing machine-learning python regression-analysis virtual-reality

Last synced: 14 Jun 2025

https://github.com/suchi25sathavara/r-projects

R projects in real-world scenarios for data analysis

data data-analysis datavisualization r

Last synced: 01 Apr 2025

https://github.com/smac-group/smacdata

Data sets used in various packages.

data r

Last synced: 02 Apr 2025

https://github.com/suchi25sathavara/data-wrangling-with-r

Analyzing Road Accidents in Victoria, Australia

data r reporting rstudio wrangling-data

Last synced: 01 Apr 2025

https://github.com/darkogamerz/dhis2heat

A comprehensive data management and health equity assessment and analysis platform that fetches data from DHIS2, then optimizes, calculates, cleans, and visualizes inequality data.

analytics data data-science dhis2 equality equity health heat inequality r shiny shinydashboard visualization

Last synced: 01 Apr 2025

https://github.com/giuleo129/dataanalysis

This folder contains two projects focused on data analysis and statistical learning using R, covering exploratory data analysis, modeling, and predictive techniques.

data data-analysis data-science statistical-learning

Last synced: 25 Jan 2026

https://github.com/afolabi022/getting-and-cleaning-data-course-project

Tidy Dataset Creation for Human Activity Recognition. This repository contains the code and files for cleaning and transforming the Human Activity Recognition Using Smartphones dataset into a tidy format. The project demonstrates data wrangling skills in R, including merging datasets

data data-science datacleaning r

Last synced: 25 Mar 2025

https://github.com/elkingarcia11/mlb-gameday-obp-odds

Small Python script that pulls MLB team on-base percentage (OBP) for the current season, loads today’s schedule, and writes CSV files that list each team’s OBP edge against its opponent for the day. It also labels each side of a game as betting favorite, not favorite, or equal using American moneylines from ESPN’s public game data.

api csv data http https json mlb mlb-stats-api moneyline odds python rest sports urllib

Last synced: 30 May 2026

https://github.com/beriberikix/senml-zephyr

A codec for encoding and decoding Sensor Measurement Lists (SenML) for Zephyr

codec data iot senml sensor zephyr-rtos

Last synced: 24 Mar 2025

https://github.com/infinitode/pyautoplot

PyAutoPlot is an open-source Python library designed to make dataset analysis much easier by generating helpful detailed plots using matplotlib. It automatically generates appropriate plots based on the dataset you feed it.

analysis automatic csv data dataset dataset-analysis generation matplotlib pandas plots plotting-in-python plotting-library python

Last synced: 16 Mar 2025

https://github.com/arthurdanjou/studies

💼 This is the repository containing all my projects done during my studies in Python and R.

ai data data-science data-visualization jupyter jupyter-notebook ml python r

Last synced: 08 Apr 2025

https://github.com/primetdmomega/webscraper

A data web scraper that looks for jobs on Glassdoor.com

data python web-scraper

Last synced: 25 Mar 2025

https://github.com/fatihilhan42/hollywood-theatrical-market-synopsis-1995-to-2021

This project examines data on Hollywood film production companies from 1995 to 2021. Significant tables and graphs were created using data visualization algorithms, with the tickets sold divided into categories.

data data-analysis data-science data-visualization

Last synced: 23 Mar 2025

https://github.com/richardlitt/bird-watching

My birdwatching list and repo

birding data ebird

Last synced: 26 Jan 2026

https://github.com/heitang/fcu-classid

Feng Chia University: college IDs, department IDs, and class IDs

data fcu project

Last synced: 30 Mar 2025

https://github.com/soenneker/soenneker.constants.data

A set of commonly used constants related to various types of data

constants csharp data dotnet

Last synced: 12 Mar 2026

https://github.com/fcoagz/rate-reader-epv

pyDolarVenezuela API utilities, image processing (EnParaleloVzla) to extract currency exchange rates from specific platforms, validating content against expected patterns

data finance json processing-images pydolarvenezuela

Last synced: 14 Jun 2025

https://github.com/yuvrajsaraogi/car-price-prediction-with-machine-learning

The price of a car depends on a lot of factors like the goodwill of the brand of the car, features of the car, horsepower and the mileage it gives and many more. Car price prediction is one of the major research areas in machine learning. So, if you want to learn how to train a car price prediction model then this project is for you.

car-price-prediction-with-machine-learning data data-science deep-learning deep-neural-networks engineer github learning machine-learning mini-project natural-language-processing prediction predictive-modeling project python3 sql

Last synced: 15 Apr 2026

https://github.com/0xHericles/SpamDetector

📧 A simple Python spam detector with Scikit-Learn

data ham machine-learning python sklearn spam

Last synced: 24 Mar 2025

https://github.com/ffatahillah7/snowflake-data-governance-warehouses

Welcome to the Powered by Tasty Bytes - Zero to Snowflake Quickstart focused on Data Governance! Within this Quickstart we will learn about Snowflake Roles, Role Based Access Control and deploy both Column and Row Level Security that can scale with your business.

data data-governance snowflake

Last synced: 06 Jan 2026

https://github.com/seldszar/piccha

Another tree data structure

data tree

Last synced: 16 Jul 2025

https://github.com/lefuturiste/npm-api

Search for or get an npm package

api data npm php

Last synced: 14 May 2026

https://github.com/smaug6739/data-bit

This project is a module for converting a structured dataset into a number that can be stored in a database taking up little space.

bits data nodejs

Last synced: 14 May 2026

https://github.com/dcmox/algorithms

General purpose data structures and algorithms

algorithms binary data hash linked list structures tree

Last synced: 10 Jun 2026

https://github.com/davitshahnazaryan3/data-management-web

Explore datasets with ease using taxonomy filtering, allowing you to quickly identify the specific experimental datasets you need and download them effortlessly

data environmental experiments filtering-data seismic taxonomy

Last synced: 17 Jan 2026

https://github.com/robthree/cfnreader

Provides a simple way to read FNIRSI's CFN files (*.cfn) produced by the FNIRSI UsbMeter tool

cfn csv data fnirsi usb usb-tester

Last synced: 01 Mar 2025

https://github.com/boytchev/coursedataviz

Supplementary materials for "Data Visualization" course

data fmi su visualization

Last synced: 16 Mar 2025

https://github.com/faster-games/dynamic-components

Dynamic Runtime Components for Unity3D

data framework unity3d

Last synced: 11 Apr 2026

https://github.com/nmelgar/healthy_child_dataviz

Data visualization project to analyze what a healthy child is.

analysis data data-analysis data-science data-visualization dataviz research tableau visualization

Last synced: 23 Feb 2026

https://github.com/vidushibhadana/covid19-data-exploration-using-sql

Deployed diverse SQL techniques to analyze COVID-19 data for an improved understanding of the pandemic's regression.

data database database-management sql

Last synced: 19 Aug 2025

https://github.com/stoyank7/football-prediction

This is my Semester 7 Project for my "AI for Society" minor at Fontys University of Applied Sciences.

ai betting data football machine-learning university-project

Last synced: 25 Mar 2025

https://github.com/flowsynx/plugin-base64

FlowSynx plugin that provides encoding and decoding of Base64 strings, allowing workflows to handle Base64 content transformations efficiently.

base64 base64-decoding base64-encoding data data-platform decoding encoding flowsynx flowsynx-plugins

Last synced: 10 Mar 2026

https://github.com/shudhanshusaurabh001/super_market-data-analysis-using-python

This project focuses on analyzing supermarket sales data using Python. The goal is to extract meaningful insights from the dataset, such as sales trends, customer purchasing behavior, and product performance.

analysis csv data insights matplotlib numpy pandas project python seaborn

Last synced: 06 Apr 2026

https://github.com/stupidcucumber/elephant-crawler

System for mining texts from websites.

data data-mining-python python

Last synced: 25 Apr 2026

https://github.com/henryssondaniel/teacup-service-report-mysql-java

Connect your Teacup report data to a MySQL database

data logs mysql reports teacup

Last synced: 13 Apr 2026

https://github.com/romaintailhurat/dagster-playground

Playing with Dagster 🐙

data pipelines python3

Last synced: 14 Jun 2025

https://github.com/naitiknayak196/tech-layoffs-cleaning-sql-vs-python

This project cleans and analyzes a tech layoffs dataset using MySQL and Python (Pandas) to compare their efficiency in data processing. It provides business insights into workforce trends, industry stability, and economic impacts to support data-driven decision-making.

data datacleaning dataset jyputer-notebook layoffdata layoffs mysql python sql

Last synced: 09 May 2026

https://github.com/bdr-pro/graphyml

A powerful, interactive Streamlit application to explore, edit, visualize, and query a graph-based database of YAML nodes — ideal for movie metadata, research articles, or structured knowledge graphs.

data database yaml yml

Last synced: 23 Jul 2025

https://github.com/mikeqfu/network-rail-track-fixity-layer

This project develops a data mining tool for analysing and predicting track movements using asset data, environmental factors and track design knowledge to model key parameters and generate fixity values for the GB rail network.

data data-integration data-mining data-science information-management knowledge-discovery point-cloud rail rail-alignment rail-track track-fixity

Last synced: 02 Sep 2025

https://github.com/corneliustanui/personal_quarto_website

This repo contains source files for my personal Quarto-based website.

data netlify programming quarto r rbind websites

Last synced: 02 Apr 2025

https://github.com/albanecoiffe/jo2024_visualization

Streamlit dashboard on the Paris 2024 Olympic Games.

data streamlit visualization

Last synced: 30 Apr 2026

https://github.com/mohamedbilal1800/olympic_history_data_analysis

This project delves into the 120 Years of Olympic History: Athletes and Results dataset, analyzing athlete demographics, medal achievements, and country performances across the Summer and Winter Olympics from 1896 to 2016.

analysis data eda matplotlib-pyplot pandas python seaborn visulaization

Last synced: 09 May 2026

https://github.com/samhollings/nhs_data_cleansing

A repo of reusable functions for cleansing data

cleansing data data-cleaning data-cleansing preprocessing pyspark python python3

Last synced: 05 Oct 2025

https://github.com/nolanbconaway/rollercoaster-tycoon-data

Every roller coaster I have built in RCT2 for iPad

data roller-coaster-tycoon

Last synced: 24 Mar 2025

https://github.com/affan005-ai/tesla-stock-prediction

This project analyzes Tesla stock data and builds machine learning models to predict and classify stock movements. The analysis includes EDA, feature correlation, moving averages, and two models

data data-analysis data-science data-visualization-project eda machine-learning matplotlib pandas predictive-analytics predictive-modeling python scikit-learn

Last synced: 05 Oct 2025

https://github.com/mevlutcelik/turkey-cities-data

📍 City data package for the cities of Türkiye: includes license plate codes, coordinates (lat/lon), population (2024 ADNKS), and geographic region information.

cities coordinates data json nufus plaka turkey turkiye typescript

Last synced: 10 Mar 2026

https://github.com/mohammad-malik/covid-visualizations-d3

This project provides a dashboard with five different perspectives on the pandemic, from patient-infection relationships to regional trends and hierarchical distributions. This was developed as part of a project for the course Data Analysis and Visualization (DS3001).

covid-19 d3 d3-visualization d3js data data-analysis data-analytics data-science visualization

Last synced: 28 May 2026

https://github.com/tsbarr/belly-button-challenge

Using front-end development tools (JavaScript, HTML, and CSS), I built an interactive dashboard to explore the Belly Button Biodiversity dataset, which catalogs the microbes that colonize human navels.

data data-visualization javascript

Last synced: 04 Mar 2026

https://github.com/abdullahashfaqvirk/earth-engine-data-scraper

A Python-based web scraper designed to extract and organize dataset metadata from the Google Earth Engine Datasets Catalog for research and analysis purposes.

beautifulsoup data data-science python requests scraper web-scraping

Last synced: 10 May 2026

https://github.com/andykee/aurora

A lightweight tool for indexing, cataloging, and browsing data.

catalog data data-catalog data-discovery indexing metadata metadata-extraction search-and-discovery

Last synced: 17 Jan 2026

https://github.com/yashkp1234/movie-recommendation-engine

My project on analyzing the movie data set, and creating a recommendation engine using that analysis.

analysis data notebook python recommendation-engine

Last synced: 04 May 2025

https://github.com/eharshit/end-to-end-vendor-insights

End-to-end analysis of vendor performance for wholesale/retail businesses, featuring data ingestion, cleaning, insights, and interactive Power BI dashboards.

analysis analysis-algorithms analytics dashboard data data-analysis datascience jupyter jupyter-notebook pandas powerbi powerbi-report retail wholesale

Last synced: 07 Oct 2025

https://github.com/prajjwol09/sql_retail_analysis_project

This project demonstrates SQL-based data cleaning, exploration, and business analysis on a retail sales dataset. It involves setting up a database, removing null values, performing EDA, and using SQL queries to extract key insights such as top customers, best-selling categories, and monthly sales trends.

data data-analysis datacleaning dataexploration pgadmin4 sql

Last synced: 15 Feb 2026

https://github.com/openwashdata/ugabore

Borehole repair data from central Uganda associated with a project report completed by Joseph Lwere for the “data science for openwashdata” course

analysis borehole data open-data r uganda wash water

Last synced: 17 Jan 2026

https://github.com/machinecyc/lotteryinsight

Uses a crawler to collect Taiwan Lotto data and saves it to a local MySQL server.

crawler data docker lottery mysql-database python3 taiwan

Last synced: 09 May 2026

https://github.com/82luli02/sakila_dvd_rental_database_analysis

Analysis of the Sakila DVD Rental database using SQL

data data-analysis data-science data-visualization sql

Last synced: 10 Mar 2026

https://github.com/rahul1582/bank-loan-classification

Classifying whether or not a person takes a personal loan, using all the classification algorithms.

algorithm analysis classi data

Last synced: 08 Oct 2025

https://github.com/shubhamsoni98/classification-with-random-forest-1

To classify sales into categories (Low, Moderate, High) using Random Forests to inform strategic decisions and optimize marketing strategies.

algorithms anaconda data data-science datacleaning eda jupyter-notebook machine-learning pyhton random-forest scikit-learn visualization

Last synced: 18 Jan 2026

https://github.com/sakan811/show-leaving-soon-tracker-website

This is a Vue.js application that displays shows that are leaving each platform soon, featuring a countdown timer for each title based on the user's local timezone.

data hbo hbomax netflix shows streaming tv-shows vue vuejs web webapp website

Last synced: 18 Mar 2025

https://github.com/cburmeister/disc-golf-courses

All the disc golf courses I've played at. Maintained with http://geojson.io/.

data geojson

Last synced: 21 Jan 2026

https://github.com/miniql/miniql-inline

A MiniQL query resolver for inline data.

data query query-language

Last synced: 27 May 2026

https://github.com/thanh-wutan/chess-opening-comparator

Interactive web app using R to visualize and compare chess opening performance and popularity.

chess-openings data databases datavisualisation r

Last synced: 09 May 2026

https://github.com/boratechlife/tensorflow-questions-datasets

TensorFlow question datasets to help you practice machine learning and train models

data datapreprocessing datasets machinelearning modeltrain questions tensorflow

Last synced: 23 Mar 2025

https://github.com/nel-zi/nuga_bank

Developed an automated data exploration and cleaning pipeline for Nuga Bank to streamline data preparation, ensure consistent data quality, and normalize datasets into structured databases for efficient analysis and reporting.

data data-automation data-visualization datacleaning datatransformation etl-automation etl-pipeline

Last synced: 16 May 2025