An open API service indexing awesome lists of open source software.

data

Individual facts, statistics, or items of information, often numeric. In a technical sense, data are a set of values of qualitative or quantitative variables about one or more persons or objects. (https://en.wikipedia.org/w/index.php?title=Data&oldid=1093674723, released under CC BY-SA 3.0)

https://github.com/jstafford5380/provausio.testing.generators

Generate fake data for testing and/or mocking

data fake-data generator testing

Last synced: 14 Jan 2026

https://github.com/allanotieno254/powerbi-dax-filter-context

This repository contains a Power BI project that explores **DAX Filter Context**, a crucial concept in DAX calculations. The project focuses on **Bank Loan Analysis**, demonstrating how different filter contexts affect DAX formulas.

business-intelligence data data-analysis dax dax-functions powerbi powerbi-visuals visualization

Last synced: 08 Jan 2026

https://github.com/elkingarcia11/mlb-gameday-obp-odds

Small Python script that pulls MLB team on-base percentage (OBP) for the current season, loads today’s schedule, and writes CSV files that list each team’s OBP edge against its opponent for the day. It also labels each side of a game as betting favorite, not favorite, or equal using American moneylines from ESPN’s public game data.

api csv data http https json mlb mlb-stats-api moneyline odds python rest sports urllib

Last synced: 30 May 2026
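
The two calculations this entry describes can be sketched as follows. This is an illustrative sketch, not the repository's code: the function names `obp_edge` and `label_side` are hypothetical, and it assumes the usual American moneyline convention that the lower number marks the favorite (for example, -150 against +130).

```python
def obp_edge(team_obp: float, opponent_obp: float) -> float:
    """Return the team's OBP minus its opponent's, rounded to three places."""
    return round(team_obp - opponent_obp, 3)


def label_side(own_moneyline: int, other_moneyline: int) -> str:
    """Label one side of a game from the two American moneylines.

    Assumption: the lower moneyline is the betting favorite.
    """
    if own_moneyline == other_moneyline:
        return "equal"
    return "favorite" if own_moneyline < other_moneyline else "not favorite"


# Hypothetical values for one game, not real data.
print(obp_edge(0.330, 0.310))
print(label_side(-150, 130))
```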

https://github.com/microsoftcloudessentials-learninghub/demosscenarios-techtalks

This repository showcases demonstrations and scenarios using Microsoft Cloud technologies. Please note that these demos are intended as a guide and are based on my personal experiences.

ai analytics azure copilot data data-science fabric m365 microsoft-general ml powerapps powerbi privatebot security sharepoint

Last synced: 14 Mar 2026

https://github.com/shadeglare/genum

The ES Next tools to process data in a LINQ manner

data linq processing typescript

Last synced: 13 Apr 2026

https://github.com/primetdmomega/webscraper

A data web scraper that looks for jobs on Glassdoor.com

data python web-scraper

Last synced: 25 Mar 2025

https://github.com/meokullu/prefill

PreFill adds desired characters onto output values to increase their legibility.

alignment data data-analysis data-engineering data-science legibility

Last synced: 17 Jan 2026
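
One way to read the idea of adding characters to output values for legibility is left-padding to a common width. The sketch below assumes that reading; the function `prefill` and its parameters are hypothetical and are not taken from the PreFill package.

```python
def prefill(values, fill="0", width=None):
    """Left-pad each value's text with `fill` so all results share one width.

    Assumptions: padding goes on the left, `fill` is a single character,
    and the default width is the length of the longest value.
    """
    texts = [str(v) for v in values]
    if width is None:
        width = max((len(t) for t in texts), default=0)
    return [t.rjust(width, fill) for t in texts]


print(prefill([7, 42, 123]))
```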

https://github.com/fiedsch/data_util

Miscellaneous utilities for data files, such as variable name lists

data helper management php

Last synced: 14 Jun 2025

https://github.com/mladen/ds-ml-and-ai-experiments

🔢 My Data Science, Machine Learning, and Artificial Intelligence experiments and projects

data data-mining data-science datascience dataset

Last synced: 09 Jun 2026

https://github.com/soenneker/soenneker.constants.data

A set of commonly used constants related to various types of data

constants csharp data dotnet

Last synced: 12 Mar 2026

https://github.com/vatshayan/pokemon-analysis

Visualization, Analysis & Predicting the accuracy of finding Pokemon power, attack & speed through Machine Learning

artificial-intelligence data data-analysis data-science data-visualization dataset machine-learning machine-learning-algorithms pokemon scikit-learn

Last synced: 30 May 2026

https://github.com/fcoagz/rate-reader-epv

pyDolarVenezuela API utilities, image processing (EnParaleloVzla) to extract currency exchange rates from specific platforms, validating content against expected patterns

data finance json processing-images pydolarvenezuela

Last synced: 14 Jun 2025

https://github.com/2kabhishek/pybank

Data Analysis for the silliest Bank 💰🏦

csv data data-science learning pandas python topic1 topic2

Last synced: 12 May 2026

https://github.com/newrelic-experimental/newrelic-java-atomikos

Gives status of Atomikos Data Sources since this information is unavailable via JMX

atomikos data instrumentation java nrlabs nrlabs-data nrlabs-java-verify nrlabs-odp observability-data

Last synced: 30 May 2026

https://github.com/elijah-1994/pre-process-e-commerce-dataset

Importing, Cleaning, and Pre-Processing E-Commerce Data for Analysis Using MySQL.

analytics data dataanalytics datacleaning dataprocessing mysql mysql-database sql

Last synced: 11 Mar 2025

https://github.com/yeti-robotics/past-scouting-data

❄️ Scouting Data from Previous Events/Seasons ❄️

data first frc

Last synced: 06 Jan 2026

https://github.com/grimen/js-humanizer

A human/developer friendly value humanizer - for JavaScript/Node.

data debug debugging format formatting humanize humanizer log logging print printing value

Last synced: 13 Jun 2026

https://github.com/nafisalawalidris/nafisalawalidris

Configuration files for my GitHub profile. Welcome to my GitHub profile! I'm Nafisa Lawal Idris, a passionate Data Scientist with a strong interest in blockchain technology. Explore my GitHub portfolio to delve into the exciting world where data science and Bitcoin converge.

artifical-intelligence bitcoin config data data-science developer github-config github-pages machine-learning

Last synced: 16 May 2026

https://github.com/dcmox/algorithms

General purpose data structures and algorithms

algorithms binary data hash linked list structures tree

Last synced: 10 Jun 2026

https://github.com/rajkumarbestha/nsedataextractor

NSEDataExtractor

data python python3

Last synced: 26 Mar 2025

https://github.com/fehmitahsindemirkan/web-scrapper

Professional and high performance web scraping project.

data ecommerce emailsender fileexplorer logging python web webscraping

Last synced: 10 Jan 2026

https://github.com/robthree/cfnreader

Provides a simple way to read FNIRSI's CFN files (*.cfn) produced by the FNIRSI UsbMeter tool

cfn csv data fnirsi usb usb-tester

Last synced: 01 Mar 2025

https://github.com/mightymetrika/holi

holi: Higher Order Likelihood Inference Web Applications

data data-science r statistics

Last synced: 10 Feb 2026

https://github.com/anthonybench/convert

A quick way to convert data, document, and image formats.

cli converter data documents images

Last synced: 14 Jan 2026

https://github.com/boytchev/coursedataviz

Supplementary materials for "Data Visualization" course

data fmi su visualization

Last synced: 16 Mar 2025

https://github.com/welli7ngton/mysql-server-formacao-alura

Repository for storing SQL code written for courses in Alura's MySQL Server training track

data database mysql

Last synced: 19 Apr 2026

https://github.com/paezha/bsantiago

A data package with the results of a travel and well-being survey conducted in Santiago in 2016

data equity package r santiago survey travel well-being

Last synced: 18 Mar 2025

https://github.com/sandysanthosh/aspose-doc-to-pdf

Document & Browser object model

aspose build data doc java pdf

Last synced: 04 Jun 2026

https://github.com/programmer-rd-ai/competitive-programming-solutions

A collection of my solutions to various competitive programming problems from platforms like LeetCode. This repository serves as a personal archive of my problem-solving journey, covering a range of algorithms, data structures, and problem-solving techniques.

algorithm algorithms algorithms-and-data-structures data datastructures dsa javascript pandas python structures

Last synced: 01 Mar 2025

https://github.com/vidushibhadana/covid19-data-exploration-using-sql

Applied a range of SQL techniques to analyze COVID-19 data for an improved understanding of the pandemic's regression.

data database database-management sql

Last synced: 19 Aug 2025

https://github.com/stoyank7/football-prediction

This is my Semester 7 Project for my "AI for Society" minor at Fontys University of Applied Sciences.

ai betting data football machine-learning university-project

Last synced: 25 Mar 2025

https://github.com/roshaka/samplr

Samplr is a Python decorator for selecting a subset of items from a list, with options for customisation and informative console printouts.

data data-analysis data-engineering decorators list python sampling

Last synced: 14 Jan 2026
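
A decorator of this kind can be sketched as below. This is a minimal illustration under stated assumptions, not Samplr's actual API: the name `sample`, its `n` and `seed` parameters, and the choice to sample the decorated function's returned list are all hypothetical.

```python
import functools
import random


def sample(n, seed=None):
    """Hypothetical decorator: keep at most `n` randomly chosen items
    from the list returned by the decorated function."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            items = list(func(*args, **kwargs))
            k = min(n, len(items))
            # A fresh seeded generator makes repeated calls reproducible.
            chosen = random.Random(seed).sample(items, k)
            print(f"{func.__name__}: kept {k} of {len(items)} items")
            return chosen
        return wrapper
    return decorator


@sample(3, seed=0)
def load_ids():
    # Stand-in data source for the example.
    return list(range(10))


subset = load_ids()
```

Passing a fixed `seed` keeps the subset stable between runs, which is convenient when the sampled list feeds a test.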

https://github.com/ankitrai259/sales_insight_dashboard

Sales Insight: Using SQL for data cleaning and Power BI to build an interactive dashboard

dashboard data data-visualization datacleaning postgresql powerbi sql

Last synced: 17 Mar 2025

https://github.com/austinhartzheim/career-fair-backend

Backend for ECS Career Fair app

data django python

Last synced: 13 Apr 2026

https://github.com/stupidcucumber/elephant-crawler

System for mining texts from websites.

data data-mining-python python

Last synced: 25 Apr 2026

https://github.com/romaintailhurat/dagster-playground

Playing with Dagster 🐙

data pipelines python3

Last synced: 14 Jun 2025

https://github.com/deliprofesor/health-score-prediction-model-the-impact-of-lifestyle-and-demographic-factors

A machine learning project predicting health scores based on lifestyle and demographic factors like age, BMI, diet, and exercise. Techniques include Random Forest, Polynomial Regression, and Linear Regression, with a focus on model performance and actionable health insights.

cross-validation data data-science data-visualization feature-engineering linear-regression machine-learning polynomial-regression random-forest

Last synced: 10 Apr 2025

https://github.com/krakozaure/pyzzy

Set of packages to simplify development in Python

configuration data formats json library logging logs python3 toml utils yaml

Last synced: 14 Jan 2026

https://github.com/zevio/acl

ACL Anthology corpus sample

data dataset scholarly-articles

Last synced: 01 Mar 2026

https://github.com/samhollings/nhs_data_cleansing

A repo of reusable functions for cleansing data

cleansing data data-cleaning data-cleansing preprocessing pyspark python python3

Last synced: 05 Oct 2025

https://github.com/flyconnectome/hnf

Documentation for the hierarchical neuron format

annotations data dotprops hdf5 mesh neurons skeleton storage

Last synced: 17 Jan 2026

https://github.com/marielachirinosr/analysis-urgencias-hospital-pitalito

This project involves analyzing emergency room admission data from the E.S.E Hospital Departamental de Pitalito using a star schema model.

bigquery data data-analysis etl-pipeline tableau

Last synced: 21 Jan 2026

https://github.com/rysteq/abstract-data-structures

This repository contains two programs written in C about the stack and queue ADTs

abstract-data-structures c data queue stack

Last synced: 06 Oct 2025

https://github.com/vim89/flowforge

Let's be honest - most data pipeline frameworks treat types as suggestions. Config files are strings. Schemas are "validated" at runtime. Data quality is an afterthought. So let's do it differently.

archetype data data-contracts data-engineering data-pipelines data-quality data-science database dataengineering datapipeline etl etl-framework pipelines scala scalability spark spark-sql spark-streaming

Last synced: 14 Apr 2026

https://github.com/prajjwol09/sql_retail_analysis_project

This project demonstrates SQL-based data cleaning, exploration, and business analysis on a retail sales dataset. It involves setting up a database, removing null values, performing EDA, and using SQL queries to extract key insights such as top customers, best-selling categories, and monthly sales trends.

data data-analysis datacleaning dataexploration pgadmin4 sql

Last synced: 15 Feb 2026

https://github.com/openwashdata/ugabore

Borehole repair data from central Uganda associated with a project report completed by Joseph Lwere for the “data science for openwashdata” course

analysis borehole data open-data r uganda wash water

Last synced: 17 Jan 2026

https://github.com/pythoncoderunicorn/startrek

a repo for Star Trek data from Technical Manuals

data klingon-language star-trek vulcan

Last synced: 07 Oct 2025

https://github.com/jacob-pitsenberger/python-electronics-inventory-management-system-object-oriented-programming-project

Welcome to the Python Electronics Inventory Management System project repository! This project is a demonstration of Object-Oriented Programming (OOP) principles in Python for managing an electronic parts inventory.

data data-structures dictionary exception-handling file-io filesystem input-output inventory-management-system management-system modules oop pickle python user-interface

Last synced: 08 Oct 2025

https://github.com/danieljdufour/fast-b64

Quickly Convert between B64 and Binary Strings

b64 base64 base64-decoding base64-encoding binary bits compression data

Last synced: 08 Oct 2025
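
The conversion this entry names can be sketched with Python's standard `base64` module. The sketch assumes that "binary string" means a text string of `0` and `1` characters; the function names are hypothetical, and this is not the fast-b64 implementation.

```python
import base64


def b64_to_bits(b64: str) -> str:
    """Decode Base64 text and render each byte as eight '0'/'1' characters."""
    raw = base64.b64decode(b64)
    return "".join(f"{byte:08b}" for byte in raw)


def bits_to_b64(bits: str) -> str:
    """Pack a '0'/'1' string (length a multiple of 8) into Base64 text."""
    if len(bits) % 8 != 0:
        raise ValueError("bit string length must be a multiple of 8")
    raw = bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))
    return base64.b64encode(raw).decode("ascii")


print(b64_to_bits("QQ=="))      # "QQ==" encodes the single byte 0x41
print(bits_to_b64("01000001"))
```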

https://github.com/djdhairya/whatsapp-chat-analysis

WhatsApp chat analysis is a multidimensional process that delves into the content, structure, and dynamics of conversations within the platform. It provides valuable insights for personal reflection, organizational decision-making, and improving communication strategies.

data data-science dataanalytics datapreprocessing machine-learning ml

Last synced: 08 Oct 2025

https://github.com/shubhamsoni98/classification-with-random-forest-1

To classify sales into categories (Low, Moderate, High) using Random Forests to inform strategic decisions and optimize marketing strategies.

algorithms anaconda data data-science datacleaning eda jupyter-notebook machine-learning pyhton random-forest scikit-learn visualization

Last synced: 18 Jan 2026

https://github.com/anarya22/e-commerce_analysis

E-Commerce_Analysis is a data analysis project performed on the Superstore_USA dataset. It explores various aspects of e-commerce performance, including sales trends, customer demographics, product categories, and regional performance. The analysis includes data cleaning, visualizations, and insights on factors influencing sales and profitability.

analysis analytics cleaning-data data

Last synced: 09 Oct 2025

https://github.com/psyteachr/sdg-data

Data relevant to the UN Sustainable Development Goals

data

Last synced: 09 Oct 2025

https://github.com/sillyash/untappd-viz

A data visualisation page using public datasets and HTML/CSS/JS with D3.js.

beer beer-statistics data data-analysis data-visualization kaggle kaggle-dataset public-dataset school-project

Last synced: 18 May 2026

https://github.com/j-sephb-lt-n/joes_giant_toolbox

A large collection of general python functions and classes that I use in my daily work

ascii browser classifier data dataviz gcp mime nlp python regex search statistics supervised web-scraping

Last synced: 10 Oct 2025

https://github.com/bastianolea/minsal_suicidios

Cases of attempted suicide and completed suicide in Chile

chile comunas data genero salud tiempo

Last synced: 19 Jan 2026

https://github.com/azkarmoulana/winter-of-data-2019

❄️ ⛄ Winter of Data is coming... 🐺

data data-science machine-learning mathematics

Last synced: 05 Feb 2026

https://github.com/chowington/bg-counter-tools

A set of tools that can pull data from Biogents BG-Counter smart mosquito traps and convert them into a Darwin Core compliant format.

bg-counter biogents darwin-core data internet-of-things mosquito-prevalence population-dynamics

Last synced: 10 Oct 2025

https://github.com/alexmcvay/uber-data

Uber SQL clone

data data-visualization sql

Last synced: 19 Jan 2026

https://github.com/nukopian/shell-series

Extract columns from tabular text

automation data shell

Last synced: 11 Oct 2025

https://github.com/dhruvil-26/tableau-projects

This repository contains Tableau visualization projects focused on data analysis across different domains. Projects include: 1. IPL Visualization - Insights into IPL match, team, and player statistics. 2. EV Analysis - Visualizations exploring the adoption of electric vehicles. 3. Road Accident Analysis - Analysis of road accident patterns.

analysis data data-analysis data-analytics electric-vehicles ipl road-accident-analysis tableau tableau-public

Last synced: 19 Jan 2026

https://github.com/laguer/jupyterdatascienceworkflow

Jupyter Notebook dedicated to studying Agriculture and AMI analytics

agriculture amis corn data fao jupyter maize oecd rice science soja

Last synced: 11 Oct 2025

https://github.com/mr-chang95/udacity-starbucks-challenge

Data Science Project for Udacity's Data Scientist Program. Using Python in Jupyter Notebook.

data data-science data-visualization numpy pandas sklearn

Last synced: 14 Apr 2026

https://github.com/madhuresh2011/daily-sql-from-hackerrank

Welcome to my SQL Series, where I tackle SQL problems from HackerRank on a daily basis.

data dataanalysis database question-answering sql

Last synced: 19 Jan 2026

https://github.com/drzax/light-up-brisbane

Where, what and why various public places in Brisbane are lit up.

brisbane data git-scraping

Last synced: 19 Jan 2026

https://github.com/adadalshabab/data-engineering-gcp-project

An end-to-end modern data engineering project, including deployment of ETL pipeline on Google Cloud Platform, using BigQuery for data analysis and leveraging Looker to generate an insight dashboard.

bigquery data data-science data-visualization databases dataengineering-a engineering etl-pipeline looker-studio powerbi

Last synced: 19 Jan 2026

https://github.com/tyriek-cloud/nyc-dca-etl

Created an ETL pipeline to merge two CSV files (converted to JSON) into a Parquet file using Azure Data Factory. The data was extracted from NYC Open Data (https://opendata.cityofnewyork.us/), and I created a Blob Container within an existing storage account.

azure azure-data-factory blob-storage data data-engineering etl-pipeline

Last synced: 21 Jan 2026

https://github.com/mikeschinkel/go-testdata-defaulter

Simple package for Go to set table-driven test data defaults so that tables in tests only need to include data that differs from the defaults.

data defaults package testing tests

Last synced: 13 Oct 2025

https://github.com/deepanshkhurana/facebook-birthdays

Python script to create a .csv from Facebook's Event Data to list Birthdays.

data facebook python

Last synced: 14 Oct 2025

https://github.com/digital-media/cv_data

Datasets used for courses/tutorials at the Digital Media Department

computer-vision data image-processing images

Last synced: 14 Oct 2025

https://github.com/soenneker/soenneker.data.email.disposables

Simply adds a list of compiled disposable/temporary email domains, updated daily (if available)

csharp data disposable disposables domain dotnet email mailinator

Last synced: 29 May 2026

https://github.com/isandyawan/simplelinearregression

An application for analyzing data using simple linear regression. It can build a regression model from variables and advise the user if the model violates regression assumptions.

data linear r regression rstudio shiny statistic

Last synced: 14 Oct 2025

https://github.com/mominurr/fire-gas-leak-detection-system

A real-time fire prevention system integrating IoT sensors and computer vision to trigger evacuations.

ai computer-vision data datascience machine-learning ml python yolo

Last synced: 27 Jan 2026

https://github.com/datamine/yelp-date

Does being on a date impact the score on a yelp review? Let's find out!

data ipython ipython-notebook pandas python python-2 yelp yelp-reviews

Last synced: 14 Apr 2026

https://github.com/intersystems-ib/workshop-smart-data-fabric

Learn the main ideas involved in developing a Smart Data Fabric using InterSystems IRIS

analytics data datafabric interoperability smart

Last synced: 14 Apr 2026

https://github.com/yagoluiz/enem-analise-extracao

[PT-BR] Extraction and analysis of performance data for the Central-West region

analysis data extraction python3 r

Last synced: 17 Apr 2026

https://github.com/j-sephb-lt-n/personal-projects

A history of my personal projects and professional development

ai api auth cloud data llms personal-development web

Last synced: 24 Jan 2026

https://github.com/bdr-pro/streamlint

Ultra-cool Streamlit app, where you can interact with widgets, see data in action, and even upload and download files

data streamlit

Last synced: 14 Apr 2026

https://github.com/saboye/sales-performance-analysis

A dashboard that presents monthly sales performance by product segment and product category to help clients identify the segments and categories that have met or exceeded their sales targets, as well as those that have not.

dashboard data data-science eda tableau visualization

Last synced: 27 Jan 2026

https://github.com/mat06mat/matbot

My discord bot code

data discord-bot discord-py py-cord

Last synced: 17 Oct 2025

https://github.com/ronknight/user-data-dashboard

📈 A data visualization tool for analyzing user data using an Excel-based data source.

dashboard data excel ga4 screenshot

Last synced: 17 Oct 2025

https://github.com/enoch208/eventmaster

A user-friendly application that helps you easily record and play back your keyboard and mouse actions. With its modern design using `tkinter` and `ttkthemes`, it provides a smooth and easy-to-use interface. The app combines reliable technical features to give you a great experience.

automation data key keylogging-python replay spy tools

Last synced: 01 Jun 2026

https://github.com/analyst-amitbisht/pizza-sales-report-

It's a guided project to practice tools like SSMS + Power BI, as well as skills like data cleaning, data exploration, data analysis, data visualization, etc.

analytics data data-visualization powerbi sql-server

Last synced: 18 Oct 2025

https://github.com/meokullu/colorizenumber

ColorizeNumber - Bodrum Papatya, visualizes numeric data as colors, producing an image.

color colorize colors data data-visualization visualization vizualize-data

Last synced: 01 Jun 2026