An open API service indexing awesome lists of open source software.

data

Individual facts, statistics, or items of information, often numeric. In a technical sense, data are a set of values of qualitative or quantitative variables about one or more persons or objects. (https://en.wikipedia.org/w/index.php?title=Data&oldid=1093674723, released under CC BY-SA 3.0)

https://github.com/fairdataihub/fair-amd-oct-paper-code

Code associated with the paper on FAIR assessment of AMD-related datasets containing OCT data

amd biomedical data eye fair oct

Last synced: 03 Apr 2025

https://github.com/nia-cloud-official/datascript

DataScript: A Hypothetical Data Scripting Language. DataScript is designed to simplify data manipulation and analysis tasks. It serves as a scripting language tailored specifically to handling data operations efficiently.

data data-scripting scripting-language

Last synced: 22 Jun 2025

https://github.com/elazar/pycopyql

Exports a subset of data from a relational database.

data database export relational tool utility

Last synced: 16 May 2026

https://github.com/jackokring/www

Generic www flask server with phinka module

compression data flask phinka python

Last synced: 16 Jan 2026

https://github.com/kevinsames/spark-fuse

spark-fuse is an open-source toolkit for PySpark — providing utilities, connectors, and tools to fuse your data workflows together.

data databricks fabric pyspark python spark

Last synced: 08 May 2026

https://github.com/bredalis/seaborn

📊 Library to create graphics 📊

data graphics-programming librery python seaborn seaborn-plots

Last synced: 04 Mar 2025

https://github.com/thomd/git-scrape-hacker-news

Scrape Hacker News metadata for data analysis

data data-science git-scraping hacker-news

Last synced: 16 Sep 2025

https://github.com/nafisalawalidris/dr.-semmelweis-and-the-discovery-of-handwashing

Uncover the revolutionary impact of handwashing on mortality rates in healthcare. Explore the story of Dr. Semmelweis and his groundbreaking findings.

data data-analysis handwashing healthcare-analysis medical-breakthrough mortality-rates

Last synced: 13 Jul 2025

https://github.com/jonsafari/toy-data

Embeddable submodule of parallel/monolingual text data, for use in testing code and sanity checks

data language-data machine-translation nlp sanity-checks toy-data

Last synced: 06 Nov 2025

https://github.com/epogrebnyak/business-conditions-digest-2017

Replicate illustration from Business Conditions Digest

data economics

Last synced: 22 Mar 2025

https://github.com/prdktntwcklr/weatherman

A simple web app displaying environmental data from an SQLite database.

dashboard data flask sensor sqlite

Last synced: 19 May 2026

https://github.com/soulyma/web_crawler

A focused web crawler to extract and structure Arabic content from web pages. Designed for researchers, data analysts, and developers working on Arabic language datasets.

beautifulsoup4 crawler csv data json python structured-data

Last synced: 15 May 2026

https://github.com/aruneshbasak/python-dsa-problems-geeksforgeeks-160-days

I will upload the Python DSA problems I solve daily on GeeksforGeeks here!

algorithms-and-data-structures and data data-structures dsa python python3 structure

Last synced: 08 May 2025

https://github.com/qeeqbox/data-security

Safeguarding your personal information (How your info is protected)

data data-security infosecsimplified qeeqbox security

Last synced: 19 Mar 2026

https://github.com/kerlossony/nested-formdata

Nested-FormData is a function designed to handle nested form data structures in a simple and efficient way. It helps manage complex form data, making it easier to work with forms that require hierarchical data.

data forms javascript nested-structures nextjs reactjs typescript

Last synced: 08 Mar 2026

https://github.com/priyanshubiswas-tech/aws-etl-pipeline-on-cloud-using-glue-athena-lambda-and-redshift

Serverless ETL pipeline on AWS using Glue, Lambda, Athena, and Redshift — automates data ingestion, transformation, and analytics with scalable, event-driven architecture.

athena aws aws-glue data data-engineering etl etl-pipeline lambda redshift

Last synced: 02 May 2026

https://github.com/canelmas/data-producer

Fake data producer for Kafka, console and http endpoints

data fake-content fake-data fakerjs kafka kafka-producer

Last synced: 05 Apr 2025

https://github.com/kingsley-ezenwaka/app-profile-data-analysis

A Python data analysis project that aims to propose an app profile based on analysis of Google Playstore dataset.

analysis data jupyter-notebook matplotlib pandas python seaborn

Last synced: 29 Apr 2026

https://github.com/umbaji/yodi

This is the official repository for Yodi, a speech recognition model for 8 words in Ewè. The yodi package is also useful for rapid inference on speech data, especially on the mini_speech datasets.

data data-visualization keras python3 speech-recognition tensorflow

Last synced: 12 Jan 2026

https://github.com/erictleung/2017-new-coder-survey

:beginner: Code to help clean and format the 2017 New Coder Survey by freeCodeCamp

coder-survey data data-cleaning dplyr freecodecamp

Last synced: 03 Apr 2025

https://github.com/lagden/injection

Inject data into file

data file inject nodejs

Last synced: 24 Apr 2026

https://github.com/sixarm/sixarm_ruby_fab

SixArm.com → Ruby → Fab gem to fabricate sample data for testing

data fabrication factory fake gem mock ruby

Last synced: 24 Jul 2025

https://github.com/jigyasag18/sonar-rock-vs-mine-prediction-ml-project

This repository contains a machine learning project that classifies SONAR reading data to distinguish between rocks and mines. It implements various classification models, evaluates their performance, and features a user-friendly web application deployed with Streamlit for real-time predictions. The project aims to support safe marine operations.

classification data dataset machine-learning machine-learning-algorithms machinelearning machinelearning-python machinelearningmodel machinelearningproject machinelearningprojects modelevaluation modeltraining prediction-model streamlit streamlit-webapp

Last synced: 18 May 2026

https://github.com/sermetpekin/perse

Perse is an experimental Python package that combines some of the most widely used functionality from the powerhouse libraries Pandas, Polars, and DuckDB into a single, unified DataFrame object. The goal of Perse is to provide a streamlined and efficient interface that leverages the strengths of these libraries for versatile data handling.

data data-science data-structures duckdb pandas polars

Last synced: 09 May 2026

https://github.com/concaption/ksa-lawyers-data

Scraped data on KSA lawyers and law firms

data lawyers

Last synced: 03 Apr 2025

https://github.com/clabe45/kaz

Minimalistic local storage cli

cli data minimalistic storage utility

Last synced: 17 Jul 2025

https://github.com/abhaysingh71/india-censes-data-analysis

This repo is an India census data analysis across many domains

data data-science data-visualization dataanalysis streamlit

Last synced: 15 May 2026

https://github.com/shuklayash02/complete_data_analysis_project

A full data analysis project in which sales data is taken through the ask, prepare, process, analyze, share, and act phases of the data analysis process

data data-visualization dataanalysis database datacleaning powerbi sql

Last synced: 16 Jul 2025

https://github.com/shysolocup/stews

Stews is a Node.JS package meant to make storing data easier by mixing parts from common data types.

aepl array arrays data datatypes html javascript js json map maps nodejs object objects package set sets stews

Last synced: 25 Jul 2025

https://github.com/randomfractals/chicago-transport

Exploratory data analysis of public Chicago transportation datasets.

chicago data data-tools duckdb sql transportation

Last synced: 01 May 2026

https://github.com/ferhatgec/tuc

TinyURL CLI: generate short links from the terminal.

data little python3 request script

Last synced: 18 Feb 2026

https://github.com/ornella-gigante/wildlife-data-analysis-toolkit-ml

A data-driven exploration of Canis lupus signatus (Iberian) and Canis lupus labradorius (Labrador) subspecies, leveraging Jupyter Notebook and pandas to analyze weight distributions (25-56 kg), geographic patterns, and reproductive behaviors. Features size-weight correlations and NaN-handling workflows for robust ecological insights

analysis data datasets jupyter-notebook pandas-dataframe python

Last synced: 15 May 2026

https://github.com/pbinkley/tweets-libraries-covid19

A twarc harvest of tweets related to libraries during the COVID-19 outbreak, starting 2020-03-02

data social

Last synced: 06 Mar 2026

https://github.com/stonecharioteer/renfield

Synchronize and Search through Hard Drives

catalogue data search storage synchronization

Last synced: 09 Feb 2026

https://github.com/finnspartronics/orpheus

A tool for looking at FRC (FIRST Robotics Competition) scouting data

data first-robotics-competition scouting scouting-data spartronics

Last synced: 28 Mar 2025

https://github.com/jensz12/uhc

Datapack for Minecraft 1.13+ UHC

data minecraft pack

Last synced: 21 Sep 2025

https://github.com/patelabhi574/hotel_reservation_analysis

Analyzes data collected by a hotel to help the owner predict which segments are most profitable, and to identify the patterns and trends seen over past years in bookings at different times of the year and in website price setting at peak times according to the availability index.

data data-visualization datamodeling looker-studio powerbi reporting sql-query sql-server

Last synced: 19 Feb 2026

https://github.com/tobinchilongo/oop-school-library

This project consists of a Ruby script for the school library app. I implemented encapsulation and inheritance in Ruby by creating classes to represent students and teachers in the school.

data database gemfile input-output preserve rspec-testing rubocop unit-test

Last synced: 02 May 2026

https://github.com/tezcatlipoca0000/db-helper_sf

A program tailored for my workplace; it analyzes, visualizes, and manipulates a Firebird 2.0 database

data data-visualization fdb firebird jupyter-notebook pandas python3

Last synced: 09 Apr 2025

https://github.com/tezcatlipoca0000/ayudante

Mainly a program for a store to manage its product data

data javascript scraping self-taught web

Last synced: 09 Apr 2025

https://github.com/r-mahesh45/hr---resume-text-classification

Text Classification for Resumes: Conducted Exploratory Data Analysis (EDA) on a vast collection of resumes. Organized the data using Bag of Words (BoW) and TF-IDF techniques. Built and evaluated multiple models, with Logistic Regression delivering standout performance. Created Word Clouds and Histograms.

data datacleaning extract-transform-load feature-extraction nlp nltk-tokenizer text-mining text-processing

Last synced: 12 Sep 2025

https://github.com/themost-framework/centroid

MOST Web Framework for deno.js

api api-rest data deno odata orm

Last synced: 18 May 2026

https://github.com/ushkinaz/cbn-data

Automated game data extraction and processing for Cataclysm: Bright Nights. Provides JSON mirrors, WebP asset conversion, and unified translation data.

cataclysm-bn data wiki

Last synced: 07 Mar 2026

https://github.com/akatrevorjay/helm-nuke

Nukes all helm releases as well as tiller-owned k8s objects that may be left lying around.

all data destroy helm plugin

Last synced: 19 Sep 2025

https://github.com/mundra-ankur/msw_ai_pipeline

Municipal solid waste (MSW) characterization: an AI and data pipeline to characterize solid waste in real time into different buckets using YOLO

artificial-intelligence data datapipeline solid-waste-segregation yolo

Last synced: 11 Apr 2025

https://github.com/stdlib-js/ndarray-base-dtype-resolve-str

Return the data type string associated with a supported ndarray data type value.

array data dtype dtypes enum javascript multidimensional ndarray node node-js nodejs stdlib types util utilities utility utils

Last synced: 06 Mar 2026

https://github.com/jbdesbas/custom-scripts

Custom SQL functions or scripts

data database sql

Last synced: 23 Feb 2025

https://github.com/ayushai/salesfoce-hospital-management

A custom Salesforce-based Hospital Management System with powerful dashboards and data analysis tools. It provides real-time insights into patient care, appointment scheduling, and inventory management, optimizing healthcare operations and decision-making.

analytics dashboard data salesforce-developers visualization

Last synced: 22 Feb 2026

https://github.com/diegoperea20/own_dataset_segmentation_yolov8

Object segmentation and detection with a custom dataset using YOLOv8; the custom dataset is of a 2023 Colombian 200-peso coin.

coins colombia data opencv own python segmentation tensorflow yolov8

Last synced: 12 Apr 2026

https://github.com/wooldoughnut310/xboxgamertag

Python module to get data from www.xboxgamertag.com

data gamertag html python3 requests xbox

Last synced: 24 Mar 2025

https://github.com/dev-owdenmag/dataflow-manager

A dynamic and versatile web application for managing, collecting, and presenting data with an integrated printing feature.

data data-management data-management-platform data-visualization python

Last synced: 30 Mar 2025

https://github.com/rayenfathallah/students_analysis

This project contains an analysis of the different factors affecting students' performance in their final exams. The project uses D3.js to create interactive dashboards that are compelling and easy to interpret.

analysis d3 data education javascript python students

Last synced: 12 Apr 2026

https://github.com/stefanbohacek/exploring-the-mapping-police-violence-dataset

Using my Gutenberg Data Visualization plugin to explore police violence against civilians.

data dataviz police police-brutality police-misconduct

Last synced: 03 Dec 2025

https://github.com/khalyomede/fetch

Quickly retrieve your PHP data

config configuration data fetch php php7

Last synced: 15 Mar 2025

https://github.com/zituocn/dean

Task flow framework for data processing

data golang task

Last synced: 18 Jan 2026

https://github.com/bijx/firestore-data-fetcher

A simple Python script to fetch documents from a Firebase Firestore collection and save them to a local `.json` file.

automation data database downloader exporter fetcher firebase firestore open-source script

Last synced: 12 Apr 2026

https://github.com/cqllum/schema2dwh

⚡ Automatically produce a data model on your database using its information schema using GenAI.

ai data data-structures dataengineering datawarehousing dwh gemini gemini-api genai reporting reporting-tool schema-design

Last synced: 13 Mar 2025

https://github.com/castelao/bufr

BUFR binary data format from WMO

binary data format meteorology oceanography wmo

Last synced: 13 Jul 2025

https://github.com/shivam1808/data-cleaning-project

We take raw housing data and transform it in SQL Server to make it more usable for analysis.

analysis data datacleaning sql sqlserver

Last synced: 29 May 2026

https://github.com/fredhutch/gdscnsoilsites

Homepage for BioDIGS Project. Learn about the project and download data.

biodigs data metagenomics student-research

Last synced: 25 Mar 2025

https://github.com/datenoio/internacia-db

Public registry of intergovernmental organizations, country groups, and countries. Available as JSONL, Parquet, YAML, and DuckDB database datasets

countries data datasets international international-trade reference

Last synced: 29 May 2026

https://github.com/lmuffato/project-ting-trybe

Ting project - Trybe assessment project for Block 37: Data Structures II: Lists, Queues and Stacks

data data-analysis python queue read-file stack trybe trybe-projects

Last synced: 12 Jun 2025

https://github.com/azrunguraya/kabyle-corpus-dataset

In the world of Natural Language Processing, access to diverse, well-annotated datasets is essential for developing high-performing models. This project aims to fill this specific gap for Taqbaylit, a Berber language spoken mainly in Kabylia

ber berber berber-dataset corpus data dataset ia kabyle kabyle-art kb machine-learning nlp nlp-machine-learning python taqbaylit text words

Last synced: 31 Jul 2025

https://github.com/gbv/cocoda-mappings

concordances, mappings and conversion scripts to create JSKOS mappings

coli-conc data jskos

Last synced: 28 Oct 2025

https://github.com/eugenedakin/caesarcipher

Native Xojo code for the Caesar Cipher algorithm with an example program

caesar-cipher data decryption encryption xojo

Last synced: 07 Jan 2026

https://github.com/vapourismo/binary-io

Read and write values of types that implement Binary from and to Handles

data haskell haskell-library io parsing

Last synced: 28 Mar 2025

https://github.com/grycap/cdmi-client-go

A basic Go library to perform CDMI core operations

cdmi cloud data go

Last synced: 21 Jan 2026

https://github.com/fastbolt/excel-writer

Excel-Writer component

data excel excel-export

Last synced: 14 Apr 2025

https://github.com/codeforafrica/ckanext-followy

[ARCHIVED] A CKAN extension to show the datasets a user is following.

ckan ckan-extension ckanext-followy data dataset followy-extension open-data

Last synced: 16 Mar 2025

https://github.com/avahoffman/dataplay

🤸‍♂️ Load data to play with

data data-package r r-package rstats

Last synced: 25 Mar 2025

https://github.com/rrwen/twitter2pg-cli

Command line tool for extracting Twitter data to PostgreSQL databases

api cli cmd command data database geo interface line location media pg postgres postgresql rest social stream tool tweet twitter

Last synced: 12 Apr 2026

https://github.com/whitehathackerpr/data-visualization-tool

This is a Python-based web application that allows users to upload datasets, analyze data, and create visualizations interactively. The tool is designed for ease of use and provides a simple interface to perform basic data analysis and generate visualizations

data data-analysis data-visualization python python3

Last synced: 05 Sep 2025

https://github.com/desininja/data-engineer-interview-questions

This repository contains all the Data Engineer Interview Questions asked by interviewers.

data data-engineer-interview-questions

Last synced: 31 Mar 2025

https://github.com/eve-ning/osumania_data

processed osu!mania data from osu!API

data osu rhythm-game vsrg

Last synced: 24 Feb 2026

https://github.com/stdlib-js/ndarray-base-to-reversed

Return a new ndarray where the order of elements of an input ndarray is reversed along each dimension.

base data flip javascript matrix ndarray node node-js nodejs reverse slice stdlib structure to-reversed types vector view

Last synced: 12 Apr 2026

https://github.com/agavitalis/sample-c-codes

A collection of small projects I carried out on Arduino as an electronic engineering student, despite falling in love with website development.

ageteller atm binary data gpcalculator logging

Last synced: 09 Apr 2025

https://github.com/devlive-community/mockaroo

A lightweight HTTP mock server for quickly building mock data endpoints, suited to front-end and back-end development and API testing scenarios.

data mock

Last synced: 08 Jul 2025

https://github.com/himel-sarder/web-scraping-it-jobs-dataset

This project is a Python-based web scraping tool that collects job listings from TimesJobs for IT-related positions. It extracts job titles, company names, locations, and experience requirements, and saves the data into a CSV file. The tool uses BeautifulSoup and Pandas for web scraping and data manipulation.

data datascience dataset kaggle-dataset machine-learning machinelearning ml web-scraping

Last synced: 22 Feb 2026

https://github.com/ourouimed/github-profile

Simple GitHub profile page in HTML, CSS, and JS using GitHub API data

api css data github html js json

Last synced: 13 Apr 2026

https://github.com/yasenstar/powerbi_tutorial

Based on the "PowerBI Tutorial" book, provides step-by-step video demos for learning and mastering the Power BI tool

analytics data microsoft powerbi tutorial visualization

Last synced: 07 Jan 2026

https://github.com/seanowenhayes/recipe-scraper

A simple scraper that uses Puppeteer to scrape recipes and more from the web

crawler crawling data recipes scraping

Last synced: 22 Feb 2026

https://github.com/geo-y20/uber-rides-data-analysis

This project aims to analyze Uber ride data to understand various aspects of ride usage, such as the distribution of rides across different categories, purposes, months, days, and times.

dashboard dashboard-templates data data-analysis data-analysis-python data-analytics data-visualization pandas powerbi python recommendation-system rides uber

Last synced: 13 Apr 2026