An open API service indexing awesome lists of open source software.

data

Individual facts, statistics, or items of information, often numeric. In a technical sense, data are a set of values of qualitative or quantitative variables about one or more persons or objects. (https://en.wikipedia.org/w/index.php?title=Data&oldid=1093674723, released under CC BY-SA 3.0)

https://github.com/aidanjuma/ankideckextractor

A CLI tool written in Python that extracts Anki flashcard decks (.apkg) into separate JSON notes and media files. Perfect for developers building custom learning applications or repurposing Anki content programmatically.

anki apkg cli data decompression extraction flashcards learning python zip

Last synced: 29 Apr 2026

https://github.com/taiwotman/pysparkstreaming

Demonstrates PySpark structured programming using the template design pattern

analytics data pyspark streaming wordcount

Last synced: 18 Mar 2026

https://github.com/kevinsames/microsoft-fabric-data-platform-template

A GitHub starter repository for building modern Data Engineering, ML, and AI solutions on Microsoft Fabric. Includes medallion architecture (Bronze → Silver → Gold), Spark Notebooks, dbt, MLflow, GitHub Actions CI/CD, and arc42-based documentation.

data dbt fabric microsoft python spark

Last synced: 29 Apr 2026

https://github.com/elissorokin/data-analyst-portfolio-rus

This is a repository where I demonstrate my skills, share projects, and track my progress in data analysis and Data Science.

ab-testing data data-analysis datalense matplotlib numpy pandas plotly portfolio postgresql python scipy seaborn sql statistical-analysis

Last synced: 25 Feb 2026

https://github.com/CentralFloridaAttorney/ComfyUI-ZMongo

An easy-to-use database framework and parameter library for ComfyUI. Centralize node presets, capture workflow logic, manage structured image collections, and build document-driven text automation pipelines on an offline Local File Store or BusinessProcessApplications.com.

api comfy comfy-ui comfyui comfyui-custom-node comfyui-custom-nodes comfyui-manager comfyui-node comfyui-nodes comfyui-workflow data database

Last synced: 21 Jun 2026

https://github.com/ryanjoy0000/yt-notifier

YouTube Notifier (Telegram Bot) - A real-time data processing pipeline

data go kafka-streams real-time telegram-api youtube-api

Last synced: 14 Jan 2026

https://github.com/cworld1/novel-data

The data repository of novel analysis

analysis data novel

Last synced: 01 Feb 2026

https://github.com/bileljegham/api-sport-cli

CLI for https://api-sports.io/. Retrieve data and convert it to an SQL file.

cli data database match nodejs sports sports-analytics

Last synced: 08 May 2026

https://github.com/lisakey/datacamp-data-analyst-python-sql-projects

Several projects completed during my Data Analyst 📊 training on the DataCamp platform with Python 🐍 and SQL 🗃️. Each project addresses real-world challenges using modern analytical tools and techniques.

analysis cleaning-data data dataanalysis dataanalyst matplotlib pandas python seaborn sql transformation visuali

Last synced: 19 Apr 2026

https://github.com/pharo-ai/data-imputers

This project contains transformers for missing value imputation

ai data data-science imputer pharo pharo-smalltalk smalltalk

Last synced: 18 Jan 2026

https://github.com/hardwario/cloud-fetch

HARDWARIO Cloud Fetch - Data Extraction Tool

cli cloud data excel python

Last synced: 07 Feb 2026

https://github.com/connectomicslab/cmtklib-data

Datalad dataset that stores all data resources of the cmtklib module of Connectome Mapper 3 (https://github.com/connectomicslab/connectomemapper3).

brain data parcellation resources software

Last synced: 16 Jan 2026

https://github.com/leomsgit/extrator-de-parametros-analise-hemograma-e-bioquimico

Python software that scans PDF files and extracts parameters directly into an Excel file

analysis data excel excel-export google-colab hemogram jupyter-notebook pdf pdf-document-processor pdf-viewer python python3

Last synced: 01 May 2026

https://github.com/asuozzo/medicare-data-analysis

An analysis of Medicare Part D data in Vermont

data python

Last synced: 04 May 2026

https://github.com/stdlib-js/ndarray-slice

Return a read-only view of an input ndarray.

copy data javascript matrix ndarray node node-js nodejs slice stdlib structure types vector view

Last synced: 10 Mar 2026

https://github.com/raymondcm/strawberrydata

Tool suite for a fast multi-camera strawberry data collection project. The standards document houses cross-compatibility/purpose implementation details.

camera cpp data intel multi-camera

Last synced: 08 Feb 2026

https://github.com/garcane/cookie-company-visual-dashboard

This Excel-based interactive dashboard provides a comprehensive overview of the Cookie Company's sales performance and key metrics.

dashboard data data-visualization excel microsoft-excel

Last synced: 09 Feb 2026

https://github.com/3squared/smoulder

Smoulder is a really good data pipe

composition data facade-pattern forge-framework object-oriented

Last synced: 25 Apr 2026

https://github.com/liuliqiang/laueagle

YAML/JSON Lints and Converters

converter data formater json linter python serialization yaml

Last synced: 02 May 2026

https://github.com/danielbello7/nosql-json-database

A simple, quick database to help speed up the development process

data database json json-database models nosql nosql-database nosql-json-database schema

Last synced: 09 May 2026

https://github.com/kuro337/scalamono

Scala Monorepo Tooling for Kafka, Opensearch, Spark, Redpanda, Hadoop - and Lang Reference.

data database duckdb hadoop kafka redpanda sdala spark

Last synced: 13 Apr 2026

https://github.com/carlossilva2/pybase

An easy-to-use database using Python and JSON

data database json python3 storage

Last synced: 11 May 2026

https://github.com/prajwalsinha/unveiling-climate-change-dynamics-through-earth-surface-temperature-analysis

Climate change analysis through global surface temperature data. Includes data preprocessing, statistical analysis, visualizations, and forecasting. Python-based project using Pandas, Matplotlib, and Scikit-learn.

data dataanalysis dynamic-mapping pyplot python scikit-learn seaborn

Last synced: 10 Feb 2026

https://github.com/themost-framework/memory

MOST Web Framework in-memory data adapter for testing environments

adapter data orm

Last synced: 06 Mar 2025

https://github.com/scottleechua/data

Public datasets under CC-BY-4.0 license.

data public-data

Last synced: 18 Mar 2026

https://github.com/walderlansena/datastructureinc

:battery: Data structure algorithms in C++

c cplusplus data fila list lista pilha stack struct structure structured-data

Last synced: 03 May 2026

https://github.com/marcelo-earth/h5n8-data

🔢🦠 Confirmed cases of H5N8 in humans - Feel free to open Pull Requests with new data.

csv data h5n8 h5n8-cases h5n8-virus russia

Last synced: 19 Jan 2026

https://github.com/double-o-z/powershell-json-lightweight-serializer-deserializer

Simple PowerShell functions to convert to and from JSON. Very lightweight and supported by every PowerShell version. No dependencies.

convert converter data data-science deserialize json lightweight powershell serializer

Last synced: 04 May 2026

https://github.com/mews-labs/dataframe-memory

This tool aims to provide a simple solution for saving memory when using pandas DataFrames.

data data-science memory-usage pandas-dataframe python3

Last synced: 22 May 2026

https://github.com/vtalks/youtube_data_api3

A Python 3 library to interact with the YouTube Data API.

api client data library python python3 youtube

Last synced: 09 Apr 2026

https://github.com/yord/klp-dsv

A delimiter-separated values plugin for klp (Kelpie), the small, fast, and magical command-line data processor.

csv data deserializer dsv json kelpie klp marshaller parser serializer ssv tsv

Last synced: 14 May 2026

https://github.com/skygenesisenterprise/aether-account

Your cloud hub to securely manage all Aether services, profiles, and preferences in one unified dashboard. Fully open-source, fully cloud.

account data javascript nextjs platform service sso-service typescript user-interface

Last synced: 16 Apr 2026

https://github.com/kucingkode/dmerge

Small JavaScript library to help you merge identically formatted data in a string

cithak data data-merge javascript library lightweight lightweight-javascript-library merge open-source

Last synced: 04 May 2026

https://github.com/svetlanam/twitter-ads

Get data about campaigns from Twitter Ads API

api data keboola keboola-extractor twitter twitter-ads twitter-api

Last synced: 12 Jun 2026

https://github.com/waylonwalker/exceltocsv

A useful tool to convert Excel spreadsheets to CSV files without launching Excel

csv-converter csv-files data excel python spreadsheet

Last synced: 05 May 2025

https://github.com/intersystems-ib/workshop-healthcare-interop

Learn the basics of healthcare interoperability using InterSystems IRIS for Health

data fhir health hl7 interoperability

Last synced: 14 Apr 2026

https://github.com/igorskyflyer/npm-adblock-header-extract

✂️ Parse and extract ad-block filter list headers with ease. Works on strings or files, trims whitespace, and returns clean metadata for tooling and automation. 📃

adblock back-end biome data filter header igorskyflyer javascript js metadata node nodejs npm string ts typescript utility

Last synced: 11 Mar 2026

https://github.com/michalwols/awesome-data-curation

🗑️ ✨ 📊 Awesome things related to data collection, annotation, cleaning and management.

active-learning annotation cleaning-data data data-science deep-learning machine-learning

Last synced: 24 Jun 2026

https://github.com/lmuffato/project-mongodb-dataflights-trybe

MongoDB Dataflights project - Trybe graded project for Block 23: Introduction to MongoDB

back-end crud data database filter mongo mongodb query trybe-projects

Last synced: 16 Apr 2026

https://github.com/matusf/glasgow_wifi

Script that plots Wi-Fi access points on a map and labels them by their protection

data data-visualization folium python python3

Last synced: 24 Jun 2026

https://github.com/jayantur13/kountry

Node module variant of the Country API

api data jsdelivr kountry nodejs npm npm-module npm-package unpkg yarn

Last synced: 26 Jan 2026

https://github.com/lookininward/data-formatter-demo

You have directories containing data files and specification files. The specification files describe the structure of the data files. Write an app that reads format definitions from specification files. Use these definitions to convert the parsed files to NDJSON files.

csv data demo files json ndjson python txt unittest

Last synced: 27 Apr 2026
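
The data-formatter-demo description above outlines a spec-driven conversion to NDJSON. A minimal sketch of that idea follows. It is not the repository's code: the spec layout (a CSV with `column name`, `width`, `datatype` columns), the fixed-width data format, the sample values, and the helper names `read_spec` and `to_ndjson` are all assumptions made for illustration.

```python
import csv
import io
import json

# Hypothetical spec layout: one row per field with name, width, and datatype.
SPEC_TEXT = """column name,width,datatype
name,10,TEXT
valid,1,BOOLEAN
count,3,INTEGER
"""

# Hypothetical fixed-width data matching the spec above (10 + 1 + 3 characters).
DATA_TEXT = "Foonyor   1  1\nBarzane   0-12\n"

# Assumed datatype names mapped to converters for the raw field text.
CASTS = {
    "TEXT": lambda s: s.strip(),
    "BOOLEAN": lambda s: s.strip() == "1",
    "INTEGER": lambda s: int(s.strip()),
}


def read_spec(spec_text):
    """Return a list of (name, width, cast) tuples parsed from a CSV spec."""
    rows = csv.DictReader(io.StringIO(spec_text))
    return [(r["column name"], int(r["width"]), CASTS[r["datatype"]]) for r in rows]


def to_ndjson(data_text, spec):
    """Convert fixed-width lines to NDJSON: one JSON object per output line."""
    out = []
    for line in data_text.splitlines():
        record, pos = {}, 0
        for name, width, cast in spec:
            # Slice the field by its declared width, then convert its type.
            record[name] = cast(line[pos:pos + width])
            pos += width
        out.append(json.dumps(record))
    return "\n".join(out) + "\n"


ndjson = to_ndjson(DATA_TEXT, read_spec(SPEC_TEXT))
print(ndjson, end="")
```

Keeping the spec as data rather than code means a new file layout only needs a new specification file, which is the design the description asks for.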

https://github.com/brandonhimpfen/data-size-parser

A tiny, practical parser for human-readable data sizes.

data data-size data-sizes npm open-source web-design web-development

Last synced: 12 Jun 2026

https://github.com/tushard48/analyzing-usa-market-trends-a-financial-overview

In-depth analysis of US market trends, encompassing economic indicators, industry performance, and financial data

data data-visualization powerbi

Last synced: 19 Mar 2026

https://github.com/gkapfham/ast2016-paper

Source Code of and Supporting Files for a Paper Published at AST 2016

data latex-document paper research

Last synced: 19 Oct 2025

https://github.com/m0nica/datalogues-outdated

Programming blog focused on data with an emphasis on exploration in Python. Has been migrated from Pelican to Jekyll

data pelican pelican-blog pelican-theme

Last synced: 28 Feb 2026

https://github.com/montanaz0r/imdb-ratings-auto-inserter

A Python script that enables auto-inserting movie ratings into the IMDB profile.

data data-science dataanalysis imdb movies pandas pandas-dataframe python3 selenium selenium-webdriver webscraping

Last synced: 07 May 2026

https://github.com/chompfoods/stub-python-flask

Flask (Python) server stub for the Chomp Food & Recipe Database API. Use our API to get high-quality data on recipes and 875,000+ branded/grocery foods plus raw ingredients.

api branded chomp data database flask flask-server food grocery ingredients nutrition python raw recipe-api recipes server stub stub-server

Last synced: 07 May 2026

https://github.com/jmcanterafonseca/leaflet-context-information

A Leaflet plugin + infrastructure for getting access to Context Information (i.e. data) exposed through FIWARE NGSIv2

context data fiware information leaflet map open visualization web

Last synced: 21 Apr 2026

https://github.com/stdlib-js/ndarray-base-dtype-enum2str

Return the data type string associated with an ndarray data type enumeration constant.

array data dtype dtypes enum javascript multidimensional ndarray node node-js nodejs stdlib types util utilities utility utils

Last synced: 13 Oct 2025

https://github.com/fairspec/fairspec-standard

Fairspec is a data exchange format compatible with DataCite for metadata and JSON Schema for structured data

ckan csv data dataset excel fair fairspec json ods polars python quality schema sqlite table typescript validation zenodo

Last synced: 16 Jun 2026

https://github.com/garcane/beverage-sales-analytics

This project provides an in-depth analysis of beverage sales and delivery across different states using Power BI.

data data-visualization powerbi powerbi-report powerbi-visuals

Last synced: 19 Mar 2026

https://github.com/nodef/infoods

Kit for International Network of Food Data Systems (INFOODS).

component data food identifier infoods international network systems tagnames

Last synced: 11 Mar 2026

https://github.com/bastianolea/campamentos_chile

Data from the 2024 national register of campamentos (informal settlements), from Chile's Ministerio de Vivienda y Urbanismo

chile comunas data pobreza social

Last synced: 24 Aug 2025

https://github.com/cleanzr/restaurant

Restaurant data set for entity resolution

data linkage

Last synced: 11 Mar 2026

https://github.com/undistraction/grid-model

A small API for creating a grid and accessing the positions of the cells, rows and columns within it.

2d calculations cells data grid layout model

Last synced: 04 Aug 2025

https://github.com/simranjeet97/leetcode_practice

Practicing LeetCode problems for competitive programming

algorithms amazon coding competitive-programming data data-structures facebook google leetcode python

Last synced: 03 Aug 2025

https://github.com/varbrad/mindb

🗄 🔍 ⚡️ Schema-less document-oriented collection model data-store for Node & Browsers.

browser data datastore db document javascript json-schema mongo mongodb nodejs nosql query schema

Last synced: 13 Apr 2026

https://github.com/bishtrishu/pizza_sales_data_analysis_sql

This project is a comprehensive data analysis of pizza sales, aimed at uncovering key insights and trends to inform business decisions. Using a combination of SQL, Python, and data visualization tools, the project analyzes sales data to understand customer preferences, peak sales periods, and the most popular pizza types.

cloud data data-analysis data-science data-visualization dataanalytics database mysql oracle-database

Last synced: 14 Apr 2026

https://github.com/alexscigalszky/palabras-aleatorias-data

This package has a set of datasets of random words, animals, colors, jokes, onomatopoeias, and types

aleatorias data palabras random words

Last synced: 04 Oct 2025

https://github.com/kunalshelke90/predict-bank-credit-risk-using-south-german-credit-data

This is an end-to-end ML project that aims to develop a classification model for assigning a given customer profile to one of two risk categories (safe or not safe). The final classifier used for this project is the CatBoost classifier. Deployed on AWS.

aws cassandra catboost-classifier classification credit-risk data data-science dataanalysis dockerfile finance financial-analysis flask github-actions logging machine-learning mlflow numpy pandas python

Last synced: 03 Jan 2026

https://github.com/toransahu/metoffice

Data visualisation - MetOffice

data metoffice uk visualization weather

Last synced: 25 Mar 2025

https://github.com/jopanel/factual-scraper

Data scraper for Factual v2 API

data

Last synced: 15 Feb 2026

https://github.com/mscbuild/analysis

🎢 This collection of data analysis projects demonstrates techniques for extracting, transforming, analyzing, and visualizing data. Data Analytics Projects for Beginners 📈 ⚡

anallysis analysis chart csv dashboard data data-science data-science-projects excel google html5 mashine-learning portfolio pyton

Last synced: 19 Oct 2025

https://github.com/programmer-rd-ai/library-management-system-oraclesql

The Library Management System project, part of the CI6320 Advanced Data Modelling coursework, features comprehensive SQL scripts utilizing OracleSQL to facilitate efficient data modeling and management.

adm advanced ci6320 cw data icw library management modelling oracle oraclesql report sql system

Last synced: 29 Oct 2025

https://github.com/rrwen/twitter2pg-cli

Command line tool for extracting Twitter data to PostgreSQL databases

api cli cmd command data database geo interface line location media pg postgres postgresql rest social stream tool tweet twitter

Last synced: 12 Apr 2026

https://github.com/kouisamine/data-uri-to-image

Convert Data URIs into image (png, jpeg, webp, gif, svg, ...) files.

conversion convert converter data datauri datauri-to-image image js online php script source-code tools uri

Last synced: 10 May 2026
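
For readers unfamiliar with the data URI conversion described in the data-uri-to-image entry above, here is a rough Python sketch of the general technique. It is not taken from that repository (which is tagged js and php); the function names `decode_data_uri` and `suggested_filename`, the extension table, and the sample URIs are hypothetical.

```python
import base64
from urllib.parse import unquote_to_bytes

# Hypothetical mapping from media type to file extension.
EXTENSIONS = {
    "image/png": ".png",
    "image/jpeg": ".jpg",
    "image/webp": ".webp",
    "image/gif": ".gif",
    "image/svg+xml": ".svg",
}


def decode_data_uri(uri):
    """Split a data URI into (media_type, raw_bytes)."""
    if not uri.startswith("data:"):
        raise ValueError("not a data URI")
    header, sep, payload = uri[len("data:"):].partition(",")
    if not sep:
        raise ValueError("data URI has no comma separating header and payload")
    if header.endswith(";base64"):
        media_type = header[:-len(";base64")]
        data = base64.b64decode(payload)
    else:
        # Without ";base64" the payload is percent-encoded.
        media_type = header
        data = unquote_to_bytes(payload)
    # RFC 2397 default when the media type is omitted.
    return media_type or "text/plain;charset=US-ASCII", data


def suggested_filename(stem, media_type):
    """Pick an extension from the media type, falling back to .bin."""
    return stem + EXTENSIONS.get(media_type.split(";")[0], ".bin")


gif_type, gif_bytes = decode_data_uri("data:image/gif;base64,R0lGODlhAQABAAAAACw=")
svg_type, svg_bytes = decode_data_uri(
    "data:image/svg+xml,%3Csvg%20xmlns=%22http://www.w3.org/2000/svg%22/%3E"
)
print(suggested_filename("pixel", gif_type), len(gif_bytes))
print(suggested_filename("icon", svg_type), svg_bytes.decode())
```

Writing the result to disk is then a matter of opening the suggested filename in binary mode and writing the returned bytes.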

https://github.com/luminati-io/twitter-x-dataset-samples

A sample dataset of over 1000 Twitter (X) posts, extracted using the Bright Data API, ideal for trend discovery, brand monitoring, and competitive insights.

api data dataset twitter twitter-api twitter-scraper web-scraping x

Last synced: 19 Mar 2026

https://github.com/lmuffato/project-job-insights-trybe

Job Insights project - Trybe graded project for Block 32: Introduction to Python

data data-science data-transformation filter python

Last synced: 12 Jun 2025

https://github.com/keanteng/nextjs-directory

🌐 A draft website for a data catalogue using Next.js

catalogue climate-change css data directory html javascript nextjs website

Last synced: 09 May 2026

https://github.com/agahkarakuzu/datavis_edu

Presented in BrainHack School 2019-2020, QBIN SciComm 2021

binder dashboard data notebooks repo2docker visualization

Last synced: 01 Apr 2025

https://github.com/skywardai/paper_gallery

Gallery of papers on applying LLM capabilities to datasets

ai data data-science llm medicine neural-network research security

Last synced: 19 Mar 2026

https://github.com/jorgeatgu/apaga-luz

💡 How much does electricity cost? 💶

data data-visualization flat-data

Last synced: 04 Feb 2026

https://github.com/xpotify/scraper

Scraper designed for Xpotify's client to gather information from websites 🌟

axios cheerio data javascript scraper webscraper

Last synced: 07 Jul 2025