An open API service indexing awesome lists of open source software.

data

Individual facts, statistics, or items of information, often numeric. In a technical sense, data are a set of values of qualitative or quantitative variables about one or more persons or objects. (https://en.wikipedia.org/w/index.php?title=Data&oldid=1093674723, released under CC BY-SA 3.0)

https://github.com/gbv/cocoda-mappings

concordances, mappings and conversion scripts to create JSKOS mappings

coli-conc data jskos

Last synced: 28 Oct 2025

https://github.com/mundra-ankur/msw_ai_pipeline

Municipal solid waste (MSW) characterization: an AI and data pipeline that characterizes solid waste in real time into different buckets using YOLO

artificial-intelligence data datapipeline solid-waste-segregation yolo

Last synced: 11 Apr 2025

https://github.com/osiota10/alx-low_level_programming

C Low Level Programming - Data Structures, Linux/Unix System Programming and Algorithms with ALX Software Engineering

algorithms assembly c data data-structures linux shell unix

Last synced: 25 Jun 2025

https://github.com/michellepellon/jobx

A modern, powerful job scraper for LinkedIn, Indeed and beyond.

compensation data data-analysis indeed indeed-scraping jobs jobsearch linkedin linkedin-scraper

Last synced: 17 Jan 2026

https://github.com/MikeBairdRocks/Fluky

[floo-kee]: obtained by chance rather than skill.

data framework mock netcore netstandard nuget random vscode

Last synced: 02 Apr 2025

https://github.com/jvrck/australianpayphones

Get Australian payphone data in GeoJSON format.

australia data geojson geojson-data scraper

Last synced: 04 Apr 2025

https://github.com/edugmenes/azure-data-engineering

This repository contains my first end-to-end Data Engineering project, built using Microsoft Azure Cloud and Azure Databricks with PySpark.

azure cloud data data-engineering data-lakehouse data-structures databricks delta-lake etl-pipelines lakehouse lakehouse-architectures medallion-architecture microsoft-azure pyspark spark

Last synced: 29 Jan 2026

https://github.com/2kabhishek/pokemon-stats

Gotta stat 'em all 🖲🐭

d3 data emoji pokemon rollup statistics

Last synced: 14 May 2026

https://github.com/eugenedakin/caesarcipher

Native Xojo code for the Caesar Cipher algorithm with an example program

caesar-cipher data decryption encryption xojo

Last synced: 07 Jan 2026

https://github.com/agavitalis/sample-c-codes

A collection of small projects I carried out on Arduino as an electronic engineering student, despite falling in love with website development.

ageteller atm binary data gpcalculator logging

Last synced: 09 Apr 2025

https://github.com/luminati-io/pinterest-dataset-samples

Two sample datasets of over 1000 Pinterest profiles and posts, extracted using the Bright Data API, ideal for market research, influencer marketing, and product development.

data data-extraction data-mining database datasets pinterest pinterest-api structured-data web-scraping

Last synced: 17 Mar 2025

https://github.com/devlive-community/mockaroo

A lightweight HTTP mock server for quickly building mock data endpoints, suited to front-end and back-end development and API testing scenarios.

data mock

Last synced: 08 Jul 2025

https://github.com/stdlib-js/array-base-filled4d-by

Create a filled four-dimensional nested array according to a provided callback function.

alloc allocate array callback data fill filled foreach generic javascript map matrix multidimensional node node-js nodejs stdlib strided structure types

Last synced: 07 Sep 2025

https://github.com/evoluteur/madeleinology

Playing with data science by taking a look at the proportions of flour, sugar, butter, and eggs in 147 Madeleine recipes (the traditional French sponge cake).

baking cake cooking cooking-recipes data data-science data-visualization dessert exploratory-analysis exploratory-data-analysis exploratory-data-visualizations food histogram longtail madeleine recipe visualization

Last synced: 23 Jun 2025

https://github.com/shuklayash02/complete_data_analysis_project

A full data analysis project in which sales data is taken through the ask, prepare, process, analyze, share, and act phases of the data analysis process

data data-visualization dataanalysis database datacleaning powerbi sql

Last synced: 16 Jul 2025

https://github.com/plurid/deserve

Own Your Data · Control The Code

data owner

Last synced: 16 Jul 2025

https://github.com/wamphlett/smart-data-objects

An easy solution for capturing and validating data into usable DTOs

data dto forms php php7 validation

Last synced: 17 May 2026

https://github.com/cleanzr/restaurant

Restaurant data set for entity resolution

data linkage

Last synced: 11 Mar 2026

https://github.com/dineshpinto/geist-finance-subgraph

Subgraph for the Geist Finance protocol on the Fantom blockchain.

assemblyscript blockchain data fantom graphql typescript

Last synced: 17 May 2026

https://github.com/cainmi/easy-pull-from-repository

A repository to pull code and files from; it may be used to store page data links, code, etc. Mainly used for Python for now

data html javascript python schema

Last synced: 04 Apr 2025

https://github.com/aikuyun/flinkx

Some modifications to flinkx

data flink

Last synced: 04 Apr 2025

https://github.com/fiskeben/meetjescraper

HTTP proxy for the Meet je stad project

api data go iot meetjestad proxy scraper weather

Last synced: 29 May 2026

https://github.com/himel-sarder/web-scraping-it-jobs-dataset

This project is a Python-based web scraping tool that collects job listings from TimesJobs for IT-related positions. It extracts job titles, company names, locations, and experience requirements, and saves the data into a CSV file. The tool uses BeautifulSoup and Pandas for web scraping and data manipulation.

data datascience dataset kaggle-dataset machine-learning machinelearning ml web-scraping

Last synced: 22 Feb 2026

https://github.com/grycap/cdmi-client-go

A basic Go library to perform CDMI core operations

cdmi cloud data go

Last synced: 21 Jan 2026

https://github.com/rohancyberops/rp1

This project performs an analysis of Starbucks (SBUX) stock returns using R. The analysis includes both simple returns and continuously compounded returns (CC returns) for a period of one month. It also calculates the growth of $1 invested in SBUX and provides visual insights through various plots.

analysis cc data r rlanguage sbux

Last synced: 15 Mar 2025

https://github.com/quasilyte/phpcorpus

A collection of various PHP code; useful for PHP tool writers to get some insight into what "real-world" PHP code looks like

analysis corpus data php php-corpus

Last synced: 04 Jul 2025

https://github.com/erinaldi/bmn2-lattice

Data analysis of lattice Monte Carlo simulations of quantum matrix models.

data data-science data-visualisation lattice

Last synced: 27 Mar 2025

https://github.com/millengustavo/salarios-data-science

Streamlit app for exploring the data from the Data Science market survey conducted by Data Hackers

brasil brazil ciencia-de-dados data data-science heroku salarios salary

Last synced: 07 Oct 2025

https://github.com/mitevpi/vue-d3-bar-chart

Reusable, reactive, animated bar chart using D3 + Vue.js. Written in idiomatic Vue, rather than D3 syntax.

d3 data data-visualization frontend interactive svg vue web

Last synced: 18 May 2026

https://github.com/stdlib-js/array-base-to-reversed

Return a new array with elements in reverse order.

array data generic javascript node node-js nodejs rev reverse stdlib structure swap types

Last synced: 11 Apr 2025

https://github.com/LisaKey/convert-csv-to-sav

We used Python 🐍 to convert a CSV file into a SAV file with all the modifications needed to open it in IBM SPSS and analyse our data.

analysis chardet convert csv data databases ibm os pandas pyreadstat python sav spss sys transformations

Last synced: 03 Mar 2025

https://github.com/sottey/shon

SHON (Structured Human-Optimized Notation) is a data serialization format designed for readability, schema support, and practical use in modern systems. Version 0.6 introduces advanced types and syntax improvements.

data golang json spec specification

Last synced: 18 May 2026

https://github.com/avahoffman/dataplay

🤸‍♂️ Load data to play with

data data-package r r-package rstats

Last synced: 25 Mar 2025

https://github.com/rrwen/twitter2pg-cli

Command line tool for extracting Twitter data to PostgreSQL databases

api cli cmd command data database geo interface line location media pg postgres postgresql rest social stream tool tweet twitter

Last synced: 12 Apr 2026

https://github.com/sergkash7/fdc-facade

Facade for The FoodData Central API.

api center data food usda

Last synced: 15 May 2026

https://github.com/astrid-project/cb-manager

APIs to interact with the Context Broker's database. Through a REST Interface, it exposes data and events stored in the internal storage system in a structured way. It provides uniform access to the capabilities of monitoring agents.

agent beats control data ebpf elasticsearch log logstash management programmability security

Last synced: 30 Jun 2025

https://github.com/mohsinali08000/myportfolio

I’m Mohsin Ali, a passionate software engineer with over 2 years of experience in developing robust software solutions. Currently transitioning into the field of data science.

css data data-science html

Last synced: 22 Apr 2026

https://github.com/bolajiolayinka/graph-api-automation

An End to End Automation from Facebook Business to Data Visualization of Campaigns

data data-science

Last synced: 07 May 2025

https://github.com/saboye/web-scraping-with-python

A web scraping project using Python's "Requests" and "BeautifulSoup" libraries to extract structured data from one or more websites. This project involves sending HTTP requests to the target website(s), retrieving the HTML content of the website(s), and parsing this content to extract the desired data in a usable format.

beautifulsoup csv data data-harvesting data-mining python request web webscraping

Last synced: 18 Jul 2025

https://github.com/makosai/covid19datachart

A basic chart for checking corona data. Written in a single HTML file for convenience. Grab the single file and run it anywhere. Or visit the webpage.

chart chartjs corona coronavirus coronavirus-analysis covid-19 covid-2019 covid19 covid19-data data data-analysis datasets

Last synced: 23 Feb 2026

https://github.com/melinteflxrin/softserve-bigdata-project

End-to-end data warehousing project integrating APIs, ETL workflows, and PostgreSQL for analytics and reporting.

analytics api bigdata data datawarehousing externalapi pipeline postgres postgresql python warehouse

Last synced: 26 Jan 2026

https://github.com/jimbrig/jimstaskviews

CRAN Task Views and Shiny App https://jimstaskviews.jimbrig.com

cran data docs rstats shiny-app submodules task-views

Last synced: 06 Mar 2026

https://github.com/derstimmler/aokexporter

Exporter for data from the statutory health insurance company AOK

aok cocona console csharp data dotnet export polly

Last synced: 15 May 2026

https://github.com/thiagopanini/datadelivery

An open source Terraform module that provides a complete infrastructure toolkit so that users can begin exploring Analytics services on AWS.

analytics athena aws catalog crawler data datamesh glue s3 terraform

Last synced: 29 Nov 2025

https://github.com/jub0t/Eso

An application to manage all your Encryption & Decryption keys and other related tools.

data encryption encryption-decryption hacking hacking-tool keys pgp privacy private

Last synced: 10 May 2025

https://github.com/inzhenerka/scooters_data_uploader

Loading data into PostgreSQL as part of the dbt course from Инженерка.Тех

data dbt postgresql

Last synced: 04 May 2026

https://github.com/kingabzpro/makefile-actions

GitHub Actions and Makefile tutorial and project for beginners.

actions analytics automation data data-science makefile

Last synced: 18 Apr 2026

https://github.com/benji-lewis/archivord

An archival bot for Discord servers designed to retain as much data as possible to show future generations how we communicated.

archive data data-mining discord discord-bot typescript

Last synced: 16 May 2026

https://github.com/ahmadjamil888/facial-recognition-ai-model

A facial recognition AI model powered by a CNN and trained on thousands of images.

ai cnn data data-science facial facial-recognition recognition

Last synced: 30 Jun 2025

https://github.com/mikebairdrocks/fluky

[floo-kee]: obtained by chance rather than skill.

data framework mock netcore netstandard nuget random vscode

Last synced: 17 May 2026

https://github.com/fjc0k/vue-merge-data

Intelligently merge data for Vue render functions.

data merge-data render-functions vue

Last synced: 17 May 2026

https://github.com/shgysk8zer0/schema

A PHP implementation of schema.org structured data objects

data microdata schema seo structured-data

Last synced: 24 Jun 2025

https://github.com/muhammad-fiaz/ason

ASON: Adaptive Structured Object Notation - Python library for dynamic data serialization, providing flexibility and simplicity.

adaptive-structure-object-notation api ason cli client data file file-format file-sharing file-upload json json-data json-parser open-source opensource parser parsing python python3

Last synced: 02 Feb 2026

https://github.com/cont-limno/lagosus-reservoir

Data module classifying lakes as natural lakes or reservoirs in the conterminous U.S.

data module

Last synced: 17 Jan 2026

https://github.com/dostuffthatmatters/circadian-scp-upload

Resumable, interruptible, SCP upload client for any files or directories generated day by day

checksum daily data directories files library python scp ssh synchronization time-series upload utilities

Last synced: 24 Jun 2025

https://github.com/DataHerb/dataherb-flora

DataHerb Flora: The core of DataHerb

data data-mining data-science datascience dataset datasets

Last synced: 08 May 2025

https://github.com/giscience/measures-rest-sparql

A SPARQL endpoint for the Measures REST OSHDB App framework.

data osm quality semantics sparql sparql-endpoints

Last synced: 24 Jun 2025

https://github.com/seguradevinn/data-project

A healthcare data audit demo using CMS SynPUF and DuckDB, showing how raw claims are cleaned, validated, and transformed into a 2009 cohort with descriptives and a RADV-style chase list.

auditing cms data duckdb sql

Last synced: 02 Sep 2025

https://github.com/ate329/nsl-kdd-feature-extractor

Python-based tool designed to process network traffic packets and extract features compliant with the NSL-KDD dataset format.

cyber-security cybersecurity data data-science extractor feature-extraction machine-learning network-analysis nsl-kdd nsl-kdd-dataset

Last synced: 30 Oct 2025

https://github.com/erictleung/2017-new-coder-survey

:beginner: Code to help clean and format the 2017 New Coder Survey by freeCodeCamp

coder-survey data data-cleaning dplyr freecodecamp

Last synced: 03 Apr 2025

https://github.com/0xleif/onionstash

Store Onions 🧅

data swift

Last synced: 05 Apr 2025

https://github.com/denko5/sales-analysis

A complete SQL-based sales analysis project covering Africa, showcasing data cleaning, exploratory analysis, insights, and lessons learned. The project highlights sales trends, regional performances, and marketing effectiveness across multiple platforms.

africa data data-analysis data-science exploratory-data-analysis insights kenya sales sql

Last synced: 24 Jan 2026

https://github.com/snegovoy98/data-storage

This is a test version of data storage

data of storage test version

Last synced: 19 Jul 2025

https://github.com/prioritizr/prioritizrdata

Conservation planning data sets

data r spatial-data

Last synced: 19 Jul 2025

https://github.com/bastianolea/fonasa_beneficiarios

Data on beneficiaries of the Fondo Nacional de Salud (National Health Fund), by system tier, age, age bracket, sex, and commune.

chile comunas data estado genero salud social

Last synced: 27 Feb 2026

https://github.com/redodo/shipper

Hide encrypted data in files.

audio data images python steganography

Last synced: 26 Mar 2025

https://github.com/denisecase/nw-network-data-analytics

Network for those earning a NW Masters of Applied Data Science

analytics data

Last synced: 02 Feb 2026

https://github.com/bacross/datamunger

Python package for handling NaNs and outliers

data data-frame datamunger knn nan outliers python scikit-learn

Last synced: 17 May 2026

https://github.com/hughrawlinson/github-data-scripts

Scripts to grab data about repos of interest to compare

data github-graphql github-repo-organizer graphql scripts typescript

Last synced: 09 Jul 2025

https://github.com/antononcube/raku-data-cryptocurrencies

Raku package of cryptocurrency data retrieval.

crypto cryptocurrency data

Last synced: 02 Apr 2025

https://github.com/ishanoshada/matplot3dex

A Matplotlib 3D Extension package for enhanced data visualization

data data-science matplotlib python-packages scikit-learn

Last synced: 05 Jan 2026

https://github.com/definetlynotai/test_generator

A tool to create datasets based on configurations from a CSV file. This tool can be used as a skeleton for other software.

algorithim csv data development dynamic exam generator huge nirt powerful python skeleton test tools

Last synced: 21 Jul 2025

https://github.com/bayer-group/cmc-ontologies

This is a submodule of cmc-knowledge-graph-setup. It contains ontologies and relevant data graph files

data ontologies owl turtle

Last synced: 16 Jun 2025

https://github.com/ayush585/fireducksblog

BLOG: Unlocking AI Efficiency: How FireDucks Revolutionizes Data Preprocessing

data processing

Last synced: 28 Apr 2026

https://github.com/bytraembedded/Laptop-Price-Prediction-with-Machine-Learning

The Laptop Price Prediction with Machine Learning project provides a system to predict the price of laptops based on various features such as processor type, RAM size, storage capacity, and more.

airflow data data-science data-visualization fastapi heroku-deployment machine-learning-algorithms matplotlib-pyplot numpy pandas python reactjs seaborn

Last synced: 30 Dec 2025

https://github.com/mbolam/DSWS_OpenRefine

Cleaning and Linking Data with OpenRefine

cleaning data metadata openrefine

Last synced: 07 Apr 2025

https://github.com/danieljdufour/fast-bin

Quickly Convert an Array of Numbers into their Minimal Binary Representations

array binarize binary bits data nbits numbers unbinarize

Last synced: 13 Apr 2025

https://github.com/finnspartronics/orpheus

A tool for looking at FRC (FIRST Robotics Competition) scouting data

data first-robotics-competition scouting scouting-data spartronics

Last synced: 28 Mar 2025