An open API service indexing awesome lists of open source software.

data

Individual facts, statistics, or items of information, often numeric. In a technical sense, data are a set of values of qualitative or quantitative variables about one or more persons or objects. (https://en.wikipedia.org/w/index.php?title=Data&oldid=1093674723, released under CC BY-SA 3.0)

https://github.com/rohancyberops/rp1

This project performs an analysis of Starbucks (SBUX) stock returns using R. The analysis includes both simple returns and continuously compounded returns (CC returns) for a period of one month. It also calculates the growth of $1 invested in SBUX and provides visual insights through various plots.

analysis cc data r rlanguage sbux

Last synced: 15 Mar 2025
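
As a rough illustration of the quantities named in the entry above (simple returns, continuously compounded returns, and the growth of $1), here is a minimal Python sketch. The repository itself uses R; the function names and the price series below are hypothetical and are not taken from the project.

```python
import math


def simple_returns(prices):
    """Simple return per period: R_t = P_t / P_{t-1} - 1."""
    return [curr / prev - 1.0 for prev, curr in zip(prices, prices[1:])]


def cc_returns(prices):
    """Continuously compounded return per period: r_t = ln(P_t / P_{t-1})."""
    return [math.log(curr / prev) for prev, curr in zip(prices, prices[1:])]


def growth_of_one(returns):
    """Value over time of $1 invested, compounding each simple return."""
    value, path = 1.0, []
    for r in returns:
        value *= 1.0 + r
        path.append(value)
    return path


# Hypothetical closing prices, NOT real SBUX data.
prices = [100.0, 102.0, 99.96, 104.958]
simple = simple_returns(prices)
cc = cc_returns(prices)
growth = growth_of_one(simple)
```

CC returns add up across periods, so their sum equals the log of the total price ratio, while simple returns have to be compounded multiplicatively.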

https://github.com/mitevpi/vue-d3-bar-chart

Reusable, reactive, animated bar chart using D3 + Vue.js. Written in idiomatic Vue, rather than D3 syntax.

d3 data data-visualization frontend interactive svg vue web

Last synced: 18 May 2026

https://github.com/oefenweb/python-untraceables

Randomizes IDs for a given set of tables making them untraceable across environments

anonymize data database mysql privacy python python2 python3 randomization

Last synced: 03 Feb 2026

https://github.com/kingabzpro/makefile-actions

GitHub Actions and Makefile tutorial and project for beginners.

actions analytics automation data data-science makefile

Last synced: 18 Apr 2026

https://github.com/ahmadjamil888/facial-recognition-ai-model

A facial recognition AI model powered by a CNN and trained on thousands of images.

ai cnn data data-science facial facial-recognition recognition

Last synced: 30 Jun 2025

https://github.com/antononcube/raku-data-cryptocurrencies

Raku package for cryptocurrency data retrieval.

crypto cryptocurrency data

Last synced: 02 Apr 2025

https://github.com/ishanoshada/matplot3dex

A Matplotlib 3D Extension package for enhanced data visualization

data data-science matplotlib python-packages scikit-learn

Last synced: 05 Jan 2026

https://github.com/mbolam/DSWS_OpenRefine

Cleaning and Linking Data with OpenRefine

cleaning data metadata openrefine

Last synced: 07 Apr 2025

https://github.com/lookininward/data-formatter-demo

You have directories containing data files and specification files. The specification files describe the structure of the data files. Write an app that reads format definitions from specification files. Use these definitions to convert the parsed files to NDJSON files.

csv data demo files json ndjson python txt unittest

Last synced: 27 Apr 2026

https://github.com/nesterenko-kv/object-id

ObjectIDs are a special type of identifier mainly used in MongoDB to uniquely identify documents within a collection. They consist of a 12-byte binary value that includes a timestamp, a machine identifier, a process identifier, and a counter.

c-sharp data id net object-id unique-identifier

Last synced: 16 May 2025

https://github.com/sbdk-dev/sbdk.dev

A complete reference implementation of a local-first ecosystem for AI-powered analytics. This repository contains the source code for the SBDK.dev website, the central hub for the SBDK suite of open-source tools.

ai-powered-analytics data data-engineering data-engineeringlocal-first data-pipeline-automation data-pipelines dbt dlt duckdb elt etl-pipeline llm local-first machine-learning pipeline sbdk semantic-layer

Last synced: 27 May 2026

https://github.com/spine-tools/metreload

Python application for downloading meteorological reanalysis data

data python reanalysis

Last synced: 01 Jul 2025

https://github.com/cintia0528/data_analytics_and_visualization-sql_tableau

Evaluate Magist as a strategic partner for Eniac's Brazilian expansion. Use SQL to analyze growth, tech accessory sales potential, delivery times, and customer satisfaction in Magist's database.

data dataanalysis datavisualization sql strategy tableau

Last synced: 31 Mar 2025

https://github.com/idea2app/public-meta-data

HTTP API for Public Meta Data, written in TypeScript & designed for CDN.

api cdn data http meta public typescript

Last synced: 15 Mar 2025

https://github.com/humbertocg18/pucrs-alest-i-2.3-2023.24

Assignments, projects, exercises, and classes completed in Java for the Algorithms and Data Structures 1 course, a second-semester subject.

beecrowd beecrowd-solution-in-js beecrowd-solutions-in-java data data-structures datastructures-algorithms hashmap hashtable java-8 leetcode leetcode-javascript leetcode-solutions leetcodepra pucrs sorting-algorithms

Last synced: 29 Mar 2025

https://github.com/bileljegham/api-sport-cli

CLI for https://api-sports.io/. Retrieves data and converts it to an SQL file.

cli data database match nodejs sports sports-analytics

Last synced: 08 May 2026

https://github.com/mtingers/opacify

Opacify reads a file and builds a manifest of external sources to rebuild said file.

backup data obfuscation python

Last synced: 18 May 2026

https://github.com/andygeiss/pipeline

Build your own data pipeline to gather, organize and transform data by using protobuf as an intermediate format.

data data-pipeline data-science go golang machine-learning protobuf protobuf-compiler

Last synced: 31 Mar 2025

https://github.com/davidgamero/gatech-covid-data-scraper

Utility for scraping GATech Exposure Alert Information into a CSV file with automated case number extraction and aggregation

covid data gatech georgia scraper

Last synced: 31 Mar 2025

https://github.com/lamden/merk

A concise implementation of a merkle tree in Python.

crypto data hash merkle structure tree

Last synced: 27 May 2026
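
For readers unfamiliar with the structure named in the entry above, a Merkle root can be computed in a few lines. This is a generic sketch, not the merk package's implementation. The choice of SHA-256 and the convention of duplicating the last node on odd-sized levels are assumptions made for illustration.

```python
import hashlib


def sha256(data):
    """Return the raw SHA-256 digest of a byte string."""
    return hashlib.sha256(data).digest()


def merkle_root(leaves):
    """Return the Merkle root of a non-empty list of byte strings."""
    if not leaves:
        raise ValueError("at least one leaf is required")
    level = [sha256(leaf) for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2 == 1:
            # Assumed convention: duplicate the last node on odd-sized levels.
            level.append(level[-1])
        # Hash each adjacent pair to build the next level up.
        level = [sha256(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


root = merkle_root([b"a", b"b", b"c"])
```

Changing any leaf changes the root, which is what lets a single hash stand in for the whole data set.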

https://github.com/abdul-rafay19/youngdevinterns_machine-learning_tasks

This internship offers hands-on exposure to real-world Machine Learning applications, from data visualization and preprocessing to model development, evaluation, and deployment. It focuses on real ML workflows, problem-solving, neural networks, and hyperparameter tuning, all within a collaborative, remote, and growth-oriented environment.

ai artificial-intelligence artificial-intelligence-algorithms artificial-neural-networks data data-visualization internship machine-learning machine-learning-algorithms machinelearning ml model model-development neural-network preprocessing programming-language python task tasks youngdevintern

Last synced: 29 Apr 2026

https://github.com/peterampazzo/padova-opendata

A list of open data resources of the City of Padova

data json opendata padova xml

Last synced: 31 Mar 2025

https://github.com/richardschoen/sshnetibmi

This .NET/.NET Core class library interfaces with the existing IBM i database, program calls, CL commands, service programs, and data queues via the PASE-based xmlservice-cli command program or regular qsh/bash commands. The qsh/bash commands can interface with any qsh/PASE-based utility, such as the IBM i db2util utility.

as400 cl command csharp data db2 ddm dotnet drda ibm ibmi os400 pase program qcmdexc qcmdexec queue rpg xmlservice xmlservice-cli

Last synced: 04 Feb 2026

https://github.com/fritzrehde/asciibar

A CLI tool to print percentages as ASCII bar charts

cli data percentage visualization

Last synced: 31 Oct 2025
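
The idea behind the entry above, turning a percentage into a fixed-width text bar, can be sketched in Python. The function name, default width, and fill characters are hypothetical and do not reflect asciibar's actual options.

```python
def ascii_bar(percent, width=20, fill="#", empty="."):
    """Render a percentage (0 to 100) as a fixed-width ASCII bar."""
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    # Number of filled cells, rounded to the nearest whole cell.
    filled = round(width * percent / 100)
    return fill * filled + empty * (width - filled)


bar = ascii_bar(50, width=10)
```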

https://github.com/h2lsoft/validator

A multilingual library of value validators with CSRF protection

csrf csrf-protection data form php validator

Last synced: 04 Feb 2026

https://github.com/waylonwalker/exceltocsv

A useful tool to convert Excel spreadsheets to CSV files without launching Excel

csv-converter csv-files data excel python spreadsheet

Last synced: 05 May 2025

https://github.com/stdlib-js/datasets-cdc-nchs-us-births-1969-1988

US birth data from 1969 to 1988, as provided by the Centers for Disease Control and Prevention's National Center for Health Statistics.

america babies births data dataset datasets javascript node node-js nodejs stdlib time-series timeseries united-states us usa

Last synced: 19 Apr 2025

https://github.com/flowsynx/plugin-csv

FlowSynx plugin that reads and writes CSV files, enabling easy batch data import/export operations and integration with spreadsheet-based data workflows.

comma-separated-values csv data data-platform flowsynx

Last synced: 10 Mar 2026

https://github.com/flowsynx/plugin-json

FlowSynx plugin that loads and parses local JSON files. Supports transformation, extraction, and mapping of hierarchical data structures in workflows.

data data-platform flowsynx json

Last synced: 10 Mar 2026

https://github.com/tatey/list_of_countries

A list of countries, states, and cities in Ruby

cities countries data ruby states

Last synced: 11 Nov 2025

https://github.com/gmersy/data-carbon

Repository accompanying the paper: Toward a Life Cycle Assessment for the Carbon Footprint of Data

carbon-emissions carbon-footprint climate-change data data-science sustainability sustainable-software

Last synced: 31 Mar 2025

https://github.com/inc44/raqua

Raqua 💧, a set of Python scripts and a Rust program, is designed to scan an ocean of disk copies and retrieve files lacking conventional signatures by creating an overflowing cache

cli console data data-recovery files linux macos python python3 recovery rust search terminal tool windows

Last synced: 11 Apr 2026

https://github.com/spectrochempy/spectrochempy_data

Test and example data repository for SpectroChemPy

data

Last synced: 04 Apr 2025

https://github.com/jorgeatgu/apaga-luz

💡 How much does electricity cost? 💶

data data-visualization flat-data

Last synced: 04 Feb 2026

https://github.com/ginga1402/chinook_database

Microsoft SQL Server Management Studio

business-query data sql-server

Last synced: 30 Mar 2025

https://github.com/insolite/react-data-frame

Table for huge data sets

data react table

Last synced: 14 May 2026

https://github.com/victorowinoke/after-work-data-science-project-showcase-eda

You work for Lublu as a Data Science Consultant and you have been tasked to perform analysis on pricing, product and assortment of Adidas and Nike. Create a descriptive analysis report, making relevant observations and recommendations that will help Lublu in the launch of such similar products.

adidas analysis data deliverables nike pythonanalysis ranges

Last synced: 28 May 2026

https://github.com/rafaelfloressouza/Covid-19-Dashboard

Python web application to display COVID19 data from the world using Plotly and Dash

bootstrap covid-19 css data datavisualization plotly-dash python3

Last synced: 10 Mar 2025

https://github.com/diegoperea20/own_dataset_segmentation_yolov8

Object segmentation and detection with a custom dataset using YOLOv8. The custom dataset features a 200 Colombian peso coin from 2023.

coins colombia data opencv own python segmentation tensorflow yolov8

Last synced: 12 Apr 2026

https://github.com/wooldoughnut310/xboxgamertag

Python module to get data from www.xboxgamertag.com

data gamertag html python3 requests xbox

Last synced: 24 Mar 2025

https://github.com/dev-owdenmag/dataflow-manager

A dynamic and versatile web application for managing, collecting, and presenting data with an integrated printing feature.

data data-management data-management-platform data-visualization python

Last synced: 30 Mar 2025

https://github.com/metriccoders/metriccoders_datasets

This is the Metric Coders repository containing all the datasets for machine learning.

data datasets machine-learning natural-language-processing scikit-learn

Last synced: 08 Apr 2025

https://github.com/igorskyflyer/npm-adblock-header-extract

✂️ Parse and extract ad-block filter list headers with ease. Works on strings or files, trims whitespace, and returns clean metadata for tooling and automation. 📃

adblock back-end biome data filter header igorskyflyer javascript js metadata node nodejs npm string ts typescript utility

Last synced: 11 Mar 2026

https://github.com/GiveMePseudonyms/PiVisualisations

A way to visualise millions of digits of Pi. Written in Python using Pygame and Tkinter.

data data-visualization pi pygame python self-organising-criticality tkinter

Last synced: 08 Apr 2025

https://github.com/stefanbohacek/exploring-the-mapping-police-violence-dataset

Using my Gutenberg Data Visualization plugin to explore police violence against civilians.

data dataviz police police-brutality police-misconduct

Last synced: 03 Dec 2025

https://github.com/bijx/firestore-data-fetcher

A simple Python script to fetch documents from a Firebase Firestore collection and save them to a local `.json` file.

automation data database downloader exporter fetcher firebase firestore open-source script

Last synced: 12 Apr 2026

https://github.com/cqllum/schema2dwh

⚡ Automatically produce a data model for your database from its information schema using GenAI.

ai data data-structures dataengineering datawarehousing dwh gemini gemini-api genai reporting reporting-tool schema-design

Last synced: 13 Mar 2025

https://github.com/shivam1808/data-cleaning-project

We take raw housing data and transform it in SQL Server to make it more usable for analysis.

analysis data datacleaning sql sqlserver

Last synced: 29 May 2026

https://github.com/fredhutch/gdscnsoilsites

Homepage for BioDIGS Project. Learn about the project and download data.

biodigs data metagenomics student-research

Last synced: 25 Mar 2025

https://github.com/agahkarakuzu/datavis_edu

Presented in BrainHack School 2019-2020, QBIN SciComm 2021

binder dashboard data notebooks repo2docker visualization

Last synced: 01 Apr 2025

https://github.com/datenoio/internacia-db

Public registry of intergovernmental organizations, country groups, and countries. Available as JSONL, Parquet, YAML, and DuckDB database datasets

countries data datasets international international-trade reference

Last synced: 29 May 2026

https://github.com/lmuffato/project-ting-trybe

Ting project: a graded Trybe project from Block 37, Data Structures II: Lists, Queues, and Stacks

data data-analysis python queue read-file stack trybe trybe-projects

Last synced: 12 Jun 2025

https://github.com/toransahu/metoffice

Data visualisation - MetOffice

data metoffice uk visualization weather

Last synced: 25 Mar 2025

https://github.com/lane-romuald/iot-irrigation-data-collection-system

An IoT-based data collection system using the ESP32 microcontroller programmed with Arduino to monitor environmental conditions for smart irrigation. The system measures soil moisture, temperature, air temperature, humidity, and rain probability. Data is stored locally on an SD card and uploaded to the ThingSpeak platform.

arduino cloud data data-collection esp32 openweather openweathermap thingspeak wi-fi

Last synced: 12 Apr 2026

https://github.com/osiota10/alx-low_level_programming

C Low Level Programming - Data Structures, Linux/Unix System Programming and Algorithms with ALX Software Engineering

algorithms assembly c data data-structures linux shell unix

Last synced: 25 Jun 2025

https://github.com/edugmenes/azure-data-engineering

This repository contains my first end-to-end Data Engineering project, built using Microsoft Azure Cloud and Azure Databricks with PySpark.

azure cloud data data-engineering data-lakehouse data-structures databricks delta-lake etl-pipelines lakehouse lakehouse-architectures medallion-architecture microsoft-azure pyspark spark

Last synced: 29 Jan 2026

https://github.com/bastianolea/campamentos_chile

Data from the 2024 national registry of informal settlements (Catastro de Campamentos), from the Ministry of Housing and Urbanism (Ministerio de Vivienda y Urbanismo)

chile comunas data pobreza social

Last synced: 24 Aug 2025

https://github.com/cleanzr/restaurant

Restaurant data set for entity resolution

data linkage

Last synced: 11 Mar 2026

https://github.com/codeforafrica/ckanext-followy

[ARCHIVED] A CKAN extension to show the datasets a user is following.

ckan ckan-extension ckanext-followy data dataset followy-extension open-data

Last synced: 16 Mar 2025

https://github.com/rrwen/twitter2pg-cli

Command line tool for extracting Twitter data to PostgreSQL databases

api cli cmd command data database geo interface line location media pg postgres postgresql rest social stream tool tweet twitter

Last synced: 12 Apr 2026

https://github.com/bolajiolayinka/graph-api-automation

An End to End Automation from Facebook Business to Data Visualization of Campaigns

data data-science

Last synced: 07 May 2025

https://github.com/melinteflxrin/softserve-bigdata-project

End-to-end data warehousing project integrating APIs, ETL workflows, and PostgreSQL for analytics and reporting.

analytics api bigdata data datawarehousing externalapi pipeline postgres postgresql python warehouse

Last synced: 26 Jan 2026

https://github.com/thiagopanini/datadelivery

An open source Terraform module that provides a complete infrastructure toolkit so that users can begin exploring Analytics services on AWS.

analytics athena aws catalog crawler data datamesh glue s3 terraform

Last synced: 29 Nov 2025

https://github.com/tether/tether-schema

Custom protocol buffer schema for data validation

data protocol schema validation

Last synced: 09 Apr 2025

https://github.com/whitehathackerpr/data-visualization-tool

This is a Python-based web application that allows users to upload datasets, analyze data, and create visualizations interactively. The tool is designed for ease of use and provides a simple interface to perform basic data analysis and generate visualizations

data data-analysis data-visualization python python3

Last synced: 05 Sep 2025

https://github.com/akhi07rx/f1-statistics-dashboard

A comprehensive command-line tool for analyzing Formula 1 race data using the FastF1 library.

akhi07rx cli cli-tools data f1 f1-score f1cli f1dashboard f1stats fastf1 formula1 opensource race race-analytics

Last synced: 23 May 2026

https://github.com/bredalis/datastructure

📚 Data Structures in Python

algorithms data data-structure python

Last synced: 12 Apr 2026

https://github.com/stdlib-js/ndarray-base-to-reversed

Return a new ndarray where the order of elements of an input ndarray is reversed along each dimension.

base data flip javascript matrix ndarray node node-js nodejs reverse slice stdlib structure to-reversed types vector view

Last synced: 12 Apr 2026
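
To make "reversed along each dimension" in the entry above concrete, here is a small Python sketch that uses nested lists in place of an ndarray. It is not the stdlib API: the function name is hypothetical, and this version copies the data rather than working on stdlib's ndarray type.

```python
def reverse_all_dims(nested):
    """Return a copy of a nested list with every dimension reversed."""
    if isinstance(nested, list):
        # Reverse this dimension, then recurse into each sub-list.
        return [reverse_all_dims(item) for item in reversed(nested)]
    return nested  # a scalar element: nothing to reverse


matrix = [[1, 2, 3], [4, 5, 6]]
flipped = reverse_all_dims(matrix)
```

For a 2-D input this flips both the row order and the column order, which is the same as rotating the matrix by 180 degrees.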

https://github.com/agavitalis/sample-c-codes

A collection of small projects I carried out on Arduino as an electronic engineering student, despite falling in love with website development.

ageteller atm binary data gpcalculator logging

Last synced: 09 Apr 2025

https://github.com/devlive-community/mockaroo

A lightweight HTTP mock server for quickly building mock data APIs, suited to front-end and back-end development and API testing scenarios.

data mock

Last synced: 08 Jul 2025

https://github.com/himel-sarder/web-scraping-it-jobs-dataset

This project is a Python-based web scraping tool that collects job listings from TimesJobs for IT-related positions. It extracts job titles, company names, locations, and experience requirements, and saves the data into a CSV file. The tool uses BeautifulSoup and Pandas for web scraping and data manipulation.

data datascience dataset kaggle-dataset machine-learning machinelearning ml web-scraping

Last synced: 22 Feb 2026

https://github.com/stdlib-js/ndarray-slice-dimension-from

Return a read-only shifted view of an input ndarray along a specific dimension.

copy data javascript matrix ndarray node node-js nodejs shift slice stdlib structure truncate types vector view

Last synced: 24 Apr 2025

https://github.com/dalikewara/typego

typego provides custom type that can be used to construct information (such as success data, error data, etc)

custom data golang helper type typego

Last synced: 09 Apr 2025

https://github.com/ourouimed/github-profile

Simple GitHub profile page in HTML, CSS, and JS using GitHub API data

api css data github html js json

Last synced: 13 Apr 2026

https://github.com/seanowenhayes/recipe-scraper

A simple scraper that uses Puppeteer to scrape recipes and more from the web

crawler crawling data recipes scraping

Last synced: 22 Feb 2026

https://github.com/bukalapak/bukadata

Data supplier plugin for populating design with real data.

data plugin sketch sketch-plugin

Last synced: 05 Jul 2025

https://github.com/jigyasag18/gold-price-prediction-project-using-machine-learning

This repository contains a machine learning project focused on predicting gold prices (GLD) using historical stock market data, including indicators such as SPX, USO, SLV, and EUR/USD. The project implements a Random Forest Regressor for accurate price forecasting, complete with data visualization, correlation analysis, and model evaluation metrics

data dataset jupyter-notebook jupyter-notebooks machine-learning machinelearing machinelearningalgorithms machinelearningmodel machinelearningprojects matplotlib mlproject numpy pandas randomforestregressor seaborn

Last synced: 23 Jul 2025

https://github.com/so-cool/uobrain

My solution to the University of Bristol PURE Data Challenge

competition data modeling

Last synced: 09 Sep 2025

https://github.com/alexscigalszky/palabras-aleatorias-data

This package has a set of datasets of random words, animals, colors, jokes, onomatopoeias, and types

aleatorias data palabras random words

Last synced: 04 Oct 2025

https://github.com/san089/black-friday-sales-analysis

This project gives insight into a few statistics related to Black Friday sales.

custom data dataanalysis insights sales statistics

Last synced: 13 Jul 2025

https://github.com/nikhilash45/live_ipl_report

This repository hosts the source code for an interactive IPL (Indian Premier League) Dashboard built using PowerBI. The dashboard provides real-time updates on ongoing matches, including live scores, batting and bowling statistics for both teams, and the points table.

analysts cleaning-data cricket-data dashboard data data-analysis data-visualization dax powerbi

Last synced: 19 Mar 2026

https://github.com/camara94/introduction-to-data-engineering

Describe the different entities that form a modern data ecosystem. Describe and differentiate between the roles and responsibilities of Data Engineers, Data Scientists, Data Analysts, Business Analysts, and Business Intelligence Analysts. Explain what Data Engineering is. List the tasks that need to be performed in a typical data engineering lifecycle. Describe what a day in the life of a Data Engineer looks like.

business-analytics business-intelligence data dataingestion dataintegration datascience machinelearning python statistical-analysis

Last synced: 09 Apr 2025

https://github.com/hardwario/cloud-fetch

HARDWARIO Cloud Fetch - Data Extraction Tool

cli cloud data excel python

Last synced: 07 Feb 2026

https://github.com/kuro337/scalamono

Scala Monorepo Tooling for Kafka, Opensearch, Spark, Redpanda, Hadoop - and Lang Reference.

data database duckdb hadoop kafka redpanda sdala spark

Last synced: 13 Apr 2026

https://github.com/gkapfham/ast2016-paper

Source Code of and Supporting Files for a Paper Published at AST 2016

data latex-document paper research

Last synced: 19 Oct 2025

https://github.com/nodef/infoods

Kit for International Network of Food Data Systems (INFOODS).

component data food identifier infoods international network systems tagnames

Last synced: 11 Mar 2026

https://github.com/danreynolds/data_batcher

Data batcher batches and de-dupes data fetched in the same task of the event loop.

batching data flutter hacktoberfest

Last synced: 19 May 2026