An open API service indexing awesome lists of open source software.

data

Individual facts, statistics, or items of information, often numeric. In a technical sense, data are a set of values of qualitative or quantitative variables about one or more persons or objects. (https://en.wikipedia.org/w/index.php?title=Data&oldid=1093674723, released under CC BY-SA 3.0)

https://github.com/agahkarakuzu/datavis_edu

Presented at BrainHack School 2019-2020, QBIN SciComm 2021

binder dashboard data notebooks repo2docker visualization

Last synced: 01 Apr 2025

https://github.com/flowsynx/plugin-json

FlowSynx plugin that loads and parses local JSON files. Supports transformation, extraction, and mapping of hierarchical data structures in workflows.

data data-platform flowsynx json

Last synced: 10 Mar 2026

https://github.com/aymane-maghouti/mobile-data-hive-insights

This project demonstrates extracting data from a MySQL database, transferring it with Apache Sqoop, storing it in a Hive data warehouse (the data is actually stored in the Hadoop Distributed File System (HDFS)), and analyzing it with Hive Query Language (HiveQL), a language close to SQL. The data is then visualized in Power BI.

apache-sqoop data data-integration data-visualization hadoop-hdfs hivedb hiveql powerbi

Last synced: 09 Mar 2026

https://github.com/mascanho/ruddit

CLI to interact with Reddit's API to programmatically retrieve data

cli data marketing rust rust-lang rustlang sales

Last synced: 19 Aug 2025

https://github.com/gmersy/data-carbon

Repository accompanying the paper: Toward a Life Cycle Assessment for the Carbon Footprint of Data

carbon-emissions carbon-footprint climate-change data data-science sustainability sustainable-software

Last synced: 31 Mar 2025

https://github.com/tylerben/data-spring

Easily generate a dummy dataset based on a provided config

data data-spring datagenerator fake-data generator javascript typescript

Last synced: 27 May 2026

https://github.com/iwconfig/svtplay-data

Daily JSON backup of content metadata from SVTPlay

data metadata streamlink svtplay svtplay-dl youtube-dl

Last synced: 24 Oct 2025

https://github.com/cainmi/data-page-project

A repository to pull code and files from; it may also be used to store page data links, code, etc. Mainly used for Python for now.

data html javascript python schema

Last synced: 21 Oct 2025

https://github.com/mikezange/laravel-encryptable

A simple encryptable trait for encrypting model fields in Laravel

data encrypt field gdpr laravel model trait

Last synced: 16 May 2026

https://github.com/freddy03h/immutable-data-structure

Normalize and Merge your application's data store using Immutable.JS objects

data immutable redux store

Last synced: 05 Oct 2025

https://github.com/jahilldev/immutable-parsejs

Parse a JS object or array/map into an Immutable collection. Makes use of ImmutableJs List and Record primitives.

data immutablejs javascript json nodejs parse typescript

Last synced: 13 Apr 2026

https://github.com/zonggen/data-structure

Course notes on data structures and analysis (CSC263)

algo clrs csc263 data

Last synced: 23 Mar 2025

https://github.com/rnabla/cuda-des

Bruteforcing DES using CUDA

bruteforce cuda data des encryption gpu parallel standard

Last synced: 27 Oct 2025

https://github.com/anobaka/insidecollector

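A tool that sits between Excel and a plain record-keeping app: you can freely create all kinds of lists, link them together using various rules, and create custom views to help you better understand your data.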

collection data excel-like list list-manager table

Last synced: 19 Jan 2026

https://github.com/SAP-archive/signavio-qualtrics-di

Set up an SAP Data Intelligence data pipeline to connect Qualtrics survey data to SAP Signavio Process Intelligence via the Ingestion API.

data intelligence process-intelligence qualtrics sample sap-data-intelligence sap-signavio-process-intelligence signavio

Last synced: 09 May 2025

https://github.com/kingabzpro/makefile-actions

GitHub Actions and MakeFile tutorial and project for beginners.

actions analytics automation data data-science makefile

Last synced: 18 Apr 2026

https://github.com/R-Mahesh45/HR---Resume-Text-Classification

Text Classification for Resumes: Conducted Exploratory Data Analysis (EDA) on a vast collection of resumes. Organized the data using Bag of Words (BoW) and TF-IDF techniques. Built and evaluated multiple models, with Logistic Regression delivering standout performance. Created Word Clouds and Histograms.

data datacleaning extract-transform-load feature-extraction nlp nltk-tokenizer text-mining text-processing

Last synced: 13 Oct 2025

https://github.com/hyperversal-blocks/averveil

Averveil is OpenSea for Data.

blockchain data golang iot privacy zero-knowledge zkp

Last synced: 14 Jan 2026

https://github.com/miniql/miniql-express-mongodb-example

A MiniQL example for querying a MongoDB database through an Express REST API.

data database mongodb query query-language

Last synced: 19 Apr 2026

https://github.com/giorgiosavastano/process

processing-chain provides a convenient way to seamlessly set up processing chains for large amounts of data.

big-data data data-science parallel parallel-computing process processing processing-chain rust

Last synced: 05 Oct 2025

https://github.com/insolite/react-data-frame

Table for huge data sets

data react table

Last synced: 14 May 2026

https://github.com/mihaiconstantin/lavot

A `React` application that allows users to indicate how votes will be redistributed among candidates for the second round of Romanian presidential elections.

data data-visualization elections react sankey typescript

Last synced: 06 Feb 2026

https://github.com/cleanzr/restaurant

Restaurant data set for entity resolution

data linkage

Last synced: 11 Mar 2026

https://github.com/husna-poyraz/titanic-machine-learning

Use machine learning to create a model that predicts which passengers survived the Titanic shipwreck.

data data-analysis data-science data-visualization deep-learning machine-learning missing-data outlier-detection python titanic

Last synced: 10 May 2026

https://github.com/stdlib-js/ndarray-base-empty-like

Create an uninitialized ndarray having the same shape and data type as a provided ndarray.

base data empty javascript matrix ndarray node node-js nodejs stdlib structure types vector

Last synced: 09 Mar 2026

https://github.com/oefenweb/python-untraceables

Randomizes IDs for a given set of tables making them untraceable across environments

anonymize data database mysql privacy python python2 python3 randomization

Last synced: 03 Feb 2026

https://github.com/iamgmujtaba/github-python-daily-trending

This repository provides an automated, daily-updated list of the top trending Python repositories on GitHub. Using a GitHub Actions workflow, it scrapes data from GitHub's trending page, sorts the results by total stars, and generates a clean, well-structured README file

data data-scraping github-actions tranding tranding-bot

Last synced: 13 Oct 2025

https://github.com/codenoid/webtoons.com-database

A Webtoons.com database, collected by Hofesh Bot (scraper)

data database

Last synced: 28 Mar 2025

https://github.com/lookininward/data-formatter-demo

You have directories containing data files and specification files. The specification files describe the structure of the data files. Write an app that reads format definitions from specification files. Use these definitions to convert the parsed files to NDJSON files.

csv data demo files json ndjson python txt unittest

Last synced: 27 Apr 2026

https://github.com/Lemniscate-world/StratAI

This project analyzes financial assets using a Hidden Markov Model (HMM) to identify different market regimes and patterns. The analysis includes calculating daily returns, rolling volatility, and volume changes, and visualizing the hidden states identified by the HMM.

ai assets data data-science data-visualization finance financial-analysis fintech hmm-model hmmlearn machine-learning trading

Last synced: 13 Oct 2025

https://github.com/jayantur13/kountry

Node module variant of the Country API

api data jsdelivr kountry nodejs npm npm-module npm-package unpkg yarn

Last synced: 26 Jan 2026

https://github.com/aleenprd/docbt

Documentation Build Tool - Generate YAML documentation for dbt models with optional AI assistance. Built with Streamlit for an intuitive and familiar web interface.

ai analytics-engineering bigquery data data-modeling data-science dbt docker llm lmstudio ollama openai snowflake sql streamlit

Last synced: 11 Nov 2025

https://github.com/gematik/app-fhir-snapshots-package-generator

The repository contains a library and a console application to generate snapshots for StructureDefinitions in FHIR-packages.

data fhir miscellaneous

Last synced: 05 Oct 2025

https://github.com/connectaman/deepseek-ocr-multigpu-infer

Efficient multi-GPU OCR inference framework leveraging parallel processes for accelerated token throughput and faster batch processing. Designed for scalable, high-performance optical character recognition workloads using PyTorch. Supports dynamic GPU assignment, optimized resource utilization, and easy integration for large-scale image datasets.

agentic-extraction data deepseek document-parser extraction extractor gpu image-parser llm multigpu nvidia ocr parallel-computing parser pdf-parser vlm

Last synced: 22 Jan 2026

https://github.com/tonykipkemboi/ens_subgraph_data

Query On-Chain Data from Subgraphs by The Graph Protocol using Python

data subgraphs thegraphprotocol web3

Last synced: 17 Sep 2025

https://github.com/ajsalemo/python-pandas-datalib

Testing and experimenting with some simple Pandas functionality using Flask to serve the parsed data.

csv data flask json pandas pandas-dataframe pandas-series python tabular tabular-data terminal

Last synced: 09 Apr 2026

https://github.com/arif-miad/heart-attack-risk-prediction

This dataset explores key factors influencing heart attack risk, such as age, cholesterol, blood pressure, and lifestyle habits, using machine learning models.

classification data data-science matplotlib ml pandas-python seaborn visualization

Last synced: 18 Aug 2025

https://github.com/xdrokra/road-accident-analytics

A data visualization project that maps and analyzes road accidents across major Italian municipalities in 2023

analytics data design italy javascript

Last synced: 30 Aug 2025

https://github.com/lagden/injection

Inject data into file

data file inject nodejs

Last synced: 24 Apr 2026

https://github.com/luminati-io/pinterest-dataset-samples

Two sample datasets of over 1000 Pinterest profiles and posts, extracted using the Bright Data API, ideal for market research, influencer marketing, and product development.

data data-extraction data-mining database datasets pinterest pinterest-api structured-data web-scraping

Last synced: 17 Mar 2025

https://github.com/luminati-io/crunchbase-dataset-samples

A sample of 1001 Crunchbase companies with key data points, extracted using the Bright Data API.

crunchbase crunchbase-api crunchbase-scraper data database datasets webscraper-api webscraping

Last synced: 17 Mar 2025

https://github.com/sermetpekin/perse

Perse is an experimental Python package that combines some of the most widely used functionality from Pandas, Polars, and DuckDB into a single, unified DataFrame object. Its goal is to provide a streamlined and efficient interface that leverages the strengths of these libraries for versatile data handling.

data data-science data-structures duckdb pandas polars

Last synced: 09 May 2026

https://github.com/m-muecke/isocountry

R package containing ISO codes for countries and currencies

country-codes currency-codes data iso-3166-1 iso-4217 r r-package

Last synced: 20 Mar 2025

https://github.com/stdlib-js/array-base-count-same-value

Count the number of elements that are equal to a given value in an array.

array count countif data javascript node node-js nodejs same stdlib structure sum summation total types

Last synced: 21 Apr 2026

https://github.com/bluecolor/lauda

Cross database data transfer tool

data database etl extract jdbc load

Last synced: 02 May 2026

https://github.com/redodo/shipper

Hide encrypted data in files.

audio data images python steganography

Last synced: 26 Mar 2025

https://github.com/jimbrig/jimstaskviews

CRAN Task Views and Shiny App https://jimstaskviews.jimbrig.com

cran data docs rstats shiny-app submodules task-views

Last synced: 06 Mar 2026

https://github.com/millengustavo/salarios-data-science

Streamlit app for exploring data from the Data Science market survey conducted by Data Hackers

brasil brazil ciencia-de-dados data data-science heroku salarios salary

Last synced: 07 Oct 2025

https://github.com/stdlib-js/array-base-filled4d-by

Create a filled four-dimensional nested array according to a provided callback function.

alloc allocate array callback data fill filled foreach generic javascript map matrix multidimensional node node-js nodejs stdlib strided structure types

Last synced: 07 Sep 2025

https://github.com/2kabhishek/pokemon-stats

Gotta stat 'em all 🖲🐭

d3 data emoji pokemon rollup statistics

Last synced: 14 May 2026

https://github.com/MikeBairdRocks/Fluky

[floo-kee]: obtained by chance rather than skill.

data framework mock netcore netstandard nuget random vscode

Last synced: 02 Apr 2025

https://github.com/michellepellon/jobx

A modern, powerful job scraper for LinkedIn, Indeed and beyond.

compensation data data-analysis indeed indeed-scraping jobs jobsearch linkedin linkedin-scraper

Last synced: 17 Jan 2026

https://github.com/tillahoffmann/idxhound

🐶 Track indices across one or more numpy selections.

data numpy scientific-computing

Last synced: 14 May 2026

https://github.com/mvuorre/psyarxivdb

Datasette serving PsyArXiv preprint metadata

data datasette open-science preprints psyarxiv

Last synced: 14 May 2026

https://github.com/hoangsonww/fred-banking-data-analysis

💸 AI-powered banking data explorer that combines FRED API insights with vector search, regression analysis, and interactive chat via OpenAI, Claude, and Gemini. Built with TypeScript, React, and Express for seamless full-stack performance.

anthropic chartjs claude-ai data data-analysis data-analytics data-science data-visualization fred fred-api gemini google-generative-ai logistic-regression multiple-regression openai pinecone react regression typescript vector-database

Last synced: 09 Apr 2025

https://github.com/qetdr/names-genders

Surnames, genders, and gender probabilities data extraction script and dataset

data python

Last synced: 01 May 2026

https://github.com/soulyma/web_crawler

A focused web crawler to extract and structure Arabic content from web pages. Designed for researchers, data analysts, and developers working on Arabic language datasets.

beautifulsoup4 crawler csv data json python structured-data

Last synced: 15 May 2026

https://github.com/williamzebrowski/assistant-api

OpenAI Assistant API integrated with Elasticsearch, Logstash & Kibana

ai chatapp chatgpt conversational-ai data elasticsearch kibana llm-inference llms openai rag

Last synced: 16 Feb 2026

https://github.com/alpheustangs/jder

A standardized structure for JSON responses

api data error json response specification structure

Last synced: 26 Mar 2025

https://github.com/katerynazakharova/common-ml

Creating this lib for ML tasks, because I'm bored of copy-pasting the same functions for different projects.

data data-processing deep-learning lib machi

Last synced: 26 Mar 2025

https://github.com/davorg/data-tree

Perl library for handling trees

data perl tree

Last synced: 02 Apr 2025

https://github.com/avto-dev/data-migrations-laravel

Package for database data migrations

data database laravel migrations package

Last synced: 12 Jul 2025

https://github.com/12joan/not-analytics

don't be creepy.

data metrics privacy

Last synced: 30 Apr 2025

https://github.com/mheadd/SamDotNet

🏢 A C# wrapper for the SAM.gov API.

api business client data gov-api government

Last synced: 30 Apr 2025

https://github.com/realabbas/instagram-user-meta-data

Script that fetches Instagram user metadata 📷 as an easy-to-use JSON object for displaying Instagram cards.

data instagram javascript metadata nodejs profile user xray

Last synced: 10 May 2026

https://github.com/fairdataihub/fair-amd-oct-paper-code

Code associated with the paper on FAIR assessment of AMD-related datasets containing OCT data

amd biomedical data eye fair oct

Last synced: 03 Apr 2025

https://github.com/concaption/ksa-lawyers-data

Scraped data on KSA lawyers and law firms

data lawyers

Last synced: 03 Apr 2025

https://github.com/ornella-gigante/wildlife-data-analysis-toolkit-ml

A data-driven exploration of Canis lupus signatus (Iberian) and Canis lupus labradorius (Labrador) subspecies, leveraging Jupyter Notebook and pandas to analyze weight distributions (25-56 kg), geographic patterns, and reproductive behaviors. Features size-weight correlations and NaN-handling workflows for robust ecological insights

analysis data datasets jupyter-notebook pandas-dataframe python

Last synced: 15 May 2026

https://github.com/tezcatlipoca0000/db-helper_sf

A program tailored for my workplace; it analyzes, visualizes, and manipulates a Firebird 2.0 database

data data-visualization fdb firebird jupyter-notebook pandas python3

Last synced: 09 Apr 2025

https://github.com/tezcatlipoca0000/ayudante

Mainly a program for a store to manage its product data

data javascript scraping self-taught web

Last synced: 09 Apr 2025

https://github.com/rob-med/data-visualizations-for-python

A collection of useful snippets for clean data visualizations in Python (with matplotlib)

academic-publishing data data-science data-visualization dataviz matplotlib python scientific-publications storytelling visualization

Last synced: 08 May 2026

https://github.com/shukkkur/py_dash

Assignment for ETL Course - Dashboard (plotly & dash)

dash dashboard data data-visualization plotly

Last synced: 06 Oct 2025

https://github.com/seafloor-geodesy/gnatss-test-data

Repository to host test data for GNATSS software

data testing

Last synced: 06 Apr 2026

https://github.com/sstendahl/giscan

Simple tool to read and analyze existing GISAXS data

cbf data diffraction diffraction-analysis gisans gisaxs physics reflectivity scattering xray

Last synced: 30 Jun 2026

https://github.com/cpanse/tartare

Raw file collection recorded on Thermo Fisher Scientific mass spectrometers for extended unit testing

bioconductor blob data r unittesting

Last synced: 03 Apr 2025

https://github.com/chompfoods/sdk-typescript-fetch

Fetch TypeScript SDK for the Chomp Food & Recipe Database API. Use our API to get high-quality data on recipes and 875,000+ branded/grocery foods plus raw ingredients.

api branded chomp data database fetch food grocery ingredients nutrition raw recipe-api recipes sdk typescript

Last synced: 03 May 2026

https://github.com/anuraganalog/365-data-science

A repository containing lecture notes, exercises, and solutions

365 data exercises ipynb lecture notes pdfs python python3 science solutions sql

Last synced: 15 May 2026

https://github.com/Greatwoman23/Market-Basket-Analysis

Unlock the power of data-driven sales optimization with Market Basket Analysis. Explore frequent itemsets and association rules to strategically enhance product placement, design targeted promotions, and adapt to seasonal trends. Elevate your business strategy with insights tailored for boosting sales and engaging customers effectively.

analysis analytics analytics-product data data-science jupyter medium-articles notebook-jupyter python

Last synced: 04 May 2025

https://github.com/agustinmusanti/sqlchallenge-2

This repository contains my solutions to a SQL challenge using MySQL, centered around a fictional retail company called TechMarket. The challenge covers various SQL tasks such as data retrieval, manipulation, and analysis, simulating real-world scenarios within a retail business environment.

challenge data mysql

Last synced: 03 Apr 2025

https://github.com/kingtous/bots_task_result

Result of the Barcelona OpenMP Tasks Suite (BOTS) using ompTG

data openmp

Last synced: 09 Jul 2025

https://github.com/hmeleiro/r_dataviz

Data visualization projects with R

data dataviz r rmd-files social-science survey-data

Last synced: 21 Jun 2026