An open API service indexing awesome lists of open source software.

data

Individual facts, statistics, or items of information, often numeric. In a technical sense, data are a set of values of qualitative or quantitative variables about one or more persons or objects. (https://en.wikipedia.org/w/index.php?title=Data&oldid=1093674723, released under CC BY-SA 3.0)

https://github.com/melinteflxrin/softserve-bigdata-project

End-to-end data warehousing project integrating APIs, ETL workflows, and PostgreSQL for analytics and reporting.

analytics api bigdata data datawarehousing externalapi pipeline postgres postgresql python warehouse

Last synced: 26 Jan 2026

https://github.com/iwconfig/svtplay-data

Daily JSON backup of content metadata from SVTPlay

data metadata streamlink svtplay svtplay-dl youtube-dl

Last synced: 24 Oct 2025

https://github.com/hyperversal-blocks/averveil

Averveil is OpenSea for Data.

blockchain data golang iot privacy zero-knowledge zkp

Last synced: 14 Jan 2026

https://github.com/igorwastaken/math-problems

Solve math problems easily with this utility library.

algorithm area data demography geography javascript math npm package population school typescript util utils

Last synced: 23 Feb 2026

https://github.com/deva-246/data-analytics-with-walmart-dataset

Power BI dashboard construction, data cleaning using Python, and data projection in SQL.

dashboard data dataquery powerbi python sql

Last synced: 06 Oct 2025

https://github.com/desininja/data-engineer-interview-questions

This repository contains all the Data Engineer Interview Questions asked by interviewers.

data data-engineer-interview-questions

Last synced: 31 Mar 2025

https://github.com/helins/ex.clj

Java exceptions as Clojure data

clojure data exception java java-exceptions

Last synced: 12 Dec 2025

https://github.com/skygenesisenterprise/aether-account

Your cloud hub to securely manage all Aether services, profiles, and preferences in one unified dashboard. Fully open-source, fully cloud.

account data javascript nextjs platform service sso-service typescript user-interface

Last synced: 16 Apr 2026

https://github.com/stdlib-js/ndarray-base-fliplr

Return a view of an input ndarray in which the order of elements along the last dimension is reversed.

base data flip javascript matrix ndarray node node-js nodejs reverse slice stdlib structure types vector view

Last synced: 11 Feb 2026

https://github.com/wangshouh/cryptofinancedata

An ipynb file containing data acquisition of futures, options and other financial derivatives

data financial-data

Last synced: 05 Oct 2025

https://github.com/outofbedlam/tine

TINE is a data pipeline runner.

data pipeline

Last synced: 05 Oct 2025

https://github.com/cworld1/novel-data

The data repository of novel analysis

analysis data novel

Last synced: 01 Feb 2026

https://github.com/stdlib-js/array-base-fancy-slice-assign

Assign element values from a broadcasted input array to corresponding elements in an output array.

array assign assignment copy data fancy generic javascript node node-js nodejs shallow slice stdlib structure subseq subsequence types

Last synced: 06 Oct 2025

https://github.com/cqllum/schema2dwh

⚡ Automatically produce a data model for your database from its information schema using GenAI.

ai data data-structures dataengineering datawarehousing dwh gemini gemini-api genai reporting reporting-tool schema-design

Last synced: 13 Mar 2025

https://github.com/aiwithqasim/competitive-programming

I will add all the material I have worked through, or will work through in the future, to sharpen my programming skills and become a competitive programmer.

c-plus-plus code data java programming structured-data

Last synced: 20 May 2026

https://github.com/zituocn/dean

Task flow framework for data processing

data golang task

Last synced: 18 Jan 2026

https://github.com/fairspec/fairspec-typescript

Fairspec TypeScript is a fast data management framework built on top of the Fairspec standard and Polars DataFrames

ckan csv data dataframe dataset excel fair json ods polars quality schema sqlite table typescript validation zenodo

Last synced: 09 Feb 2026

https://github.com/benmaier/boarding_school_sir

Fit SIR dynamics to the prevalence curve of an H1N1 outbreak at a British boarding school in 1978.

boarding data disease epidemiology modeling school spreading

Last synced: 31 Mar 2025

https://github.com/scottleechua/data

Public datasets under CC-BY-4.0 license.

data public-data

Last synced: 18 Mar 2026

https://github.com/williamwutq/mappedpages

A fixed-size page provider backed by memory mapping, intended for building higher-level allocators and storage systems

allocation allocator data data-storage database file memory-mapping mmap page rust rust-crate rust-library storage

Last synced: 25 Jun 2026

https://github.com/shivam1808/data-cleaning-project

We take raw housing data and transform it in SQL Server to make it more usable for analysis.

analysis data datacleaning sql sqlserver

Last synced: 29 May 2026

https://github.com/matusf/glasgow_wifi

Script that plots Wi-Fi access points on a map and labels them by their protection.

data data-visualization folium python python3

Last synced: 24 Jun 2026

https://github.com/alejo1630/titanic_kaggle

This Python Notebook is a proposal to analyse the Titanic dataset for the Kaggle Competition, using several data science techniques and concepts.

data data-science jupyter-notebook notebook python titanic-survival-prediction

Last synced: 03 May 2026

https://github.com/michalwols/awesome-data-curation

🗑️ ✨ 📊 Awesome things related to data collection, annotation, cleaning and management.

active-learning annotation cleaning-data data data-science deep-learning machine-learning

Last synced: 24 Jun 2026

https://github.com/flowsynx/plugin-json

FlowSynx plugin that loads and parses local JSON files. Supports transformation, extraction, and mapping of hierarchical data structures in workflows.

data data-platform flowsynx json

Last synced: 10 Mar 2026

https://github.com/cainmi/data-page-project

A repository to pull code and files from. It may also be used to store page data links, code, etc. Mainly used for Python for now.

data html javascript python schema

Last synced: 21 Oct 2025

https://github.com/mwmorale/storingencryptiondata

Welcome! Here, I am working with some very basic encryption. This is a work in progress and, for now, is only compatible with Windows. Using a password, a user can easily encrypt their "notes" file after writing it, then decrypt it later to view or edit their notes. This is hiding information in plain sight. Eventually, this project will be merged with my folder locker so that an encrypted file can be stored in a "locked" directory/folder. Avoid personal use, because I am releasing the encryption key and/or "cipher solution" in my code. To use it, run the file called "RUN_ME.py".

cipher ciphertext data decryption encryption filesystem graphical gui gui-application notes privacy rotation-encryption secure security-tools user-interface whitehat

Last synced: 21 Jun 2026

https://github.com/unownone/spenddy-link

Simple Privacy Friendly chrome extension to track your spends and more!

analytics data extension link

Last synced: 12 Mar 2026

https://github.com/fredhutch/gdscnsoilsites

Homepage for BioDIGS Project. Learn about the project and download data.

biodigs data metagenomics student-research

Last synced: 25 Mar 2025

https://github.com/cdcgov/importsurvey

Import survey: Import data into R, with an application to the National Center for Health Statistics (NCHS)

data import r sas survey survey-data

Last synced: 19 Jun 2026

https://github.com/gsmith257-cyber/bit3434cve

BIT3434 project on data mining CVEs and exploits

cve data data-mining exploits research-project

Last synced: 17 Jun 2026

https://github.com/fairspec/fairspec-standard

Fairspec is a data exchange format compatible with DataCite for metadata and JSON Schema for structured data

ckan csv data dataset excel fair fairspec json ods polars python quality schema sqlite table typescript validation zenodo

Last synced: 16 Jun 2026

https://github.com/mewmix/drivehound

magic file signatures + python drive recovery magic

data disk file-signatures harddrive python recovery recovery-tool

Last synced: 08 Oct 2025

https://github.com/patrickdavies100/datapipeline37

Some Data Science practice using datasets available online. Currently test data is similar to this dataset: https://www.kaggle.com/datasets/asaniczka/amazon-uk-products-dataset-2023 but the plan is to expand.

data data-science pandas-dataframe python3

Last synced: 08 Oct 2025

https://github.com/pharo-ai/data-imputers

This project contains transformers for missing value imputation

ai data data-science imputer pharo pharo-smalltalk smalltalk

Last synced: 18 Jan 2026

https://github.com/sandipbera35/blogapp.spring.boot

A proof-of-concept blog application in Java Spring Boot and Spring Data JPA with MySQL and MinIO object storage. It integrates with a JWT authservice project (written in Go).

data java jpa jpa-entity-manager jpa-hibernate mysql mysql-server postman postmanapi spring-boot

Last synced: 13 Apr 2026

https://github.com/xpotify/scraper

Scraper designed for Xpotify's client to gather information from websites🌟

axios cheerio data javascript scraper webscraper

Last synced: 07 Jul 2025

https://github.com/liyakhathshaik/datascout.jl

This is a Julia package.

data datascout julia

Last synced: 09 Oct 2025

https://github.com/basemax/buskool.com-data

This repository contains the collected product data from the Buskool website (باسکول). The data is stored in 20k+ JSON files, each containing detailed information about products available on the website.

buskool buskoolcom data farsi information ir iran json persian

Last synced: 03 Apr 2025

https://github.com/svetlanam/twitter-ads

Get data about campaigns from Twitter Ads API

api data keboola keboola-extractor twitter twitter-ads twitter-api

Last synced: 12 Jun 2026

https://github.com/scienxlab/datasets

Some small datasets for demos, courses, testing, etc.

data open-data sample-data teaching-resources

Last synced: 09 Oct 2025

https://github.com/iotchulindrarai/reactlearning

Learning React: passing data with useState and props, both from child to parent and from parent to child.

data passing props react usestate-hook

Last synced: 14 May 2026

https://github.com/stdlib-js/ndarray-slice-dimension-to

Return a read-only truncated view of an input ndarray along a specific dimension.

copy data javascript matrix ndarray node node-js nodejs slice stdlib structure truncate types vector view

Last synced: 29 Jun 2026

https://github.com/kingsley-ezenwaka/app-profile-data-analysis

A Python data analysis project that aims to propose an app profile based on analysis of Google Playstore dataset.

analysis data jupyter-notebook matplotlib pandas python seaborn

Last synced: 29 Apr 2026

https://github.com/canelmas/data-producer

Fake data producer for Kafka, console and http endpoints

data fake-content fake-data fakerjs kafka kafka-producer

Last synced: 05 Apr 2025

https://github.com/davidgamero/gatech-covid-chart

Line chart showing COVID19 cases per day at Georgia Tech

covid covid19 data gatech

Last synced: 28 Oct 2025

https://github.com/tushar2704/interview-quest

Interview-Quest is a comprehensive collection of interview questions and answers that can help you prepare for technical interviews. Whether you're a seasoned developer looking to brush up on your skills or a job seeker preparing for your next big opportunity, this repository aims to provide valuable resources to enhance your interview readiness.

artificial-intelligence data data-science interview interview-questions machine-learning

Last synced: 23 Jan 2026

https://github.com/tupizz/data-processing-pipeline-aws

This project is a serverless application built with the Serverless Framework, TypeScript, and AWS services. It provides an enrichment service that processes contact information and enriches it with additional data.

aws data pipeline serverless typescript

Last synced: 13 May 2026

https://github.com/tbrowder/classfactory

Provides tools to create a data collection with classes to manipulate the persistent data.

class data persistent raku

Last synced: 04 Apr 2025

https://github.com/lunastev/wson-rust

WSON data serialization parser

data parser serialization

Last synced: 07 Apr 2025

https://github.com/sandravizz/global_inequality_story

Dataviz Project about Global Inequality

data data-visualization inequality

Last synced: 03 Jul 2025

https://github.com/kevinsames/spark-fuse

spark-fuse is an open-source toolkit for PySpark — providing utilities, connectors, and tools to fuse your data workflows together.

data databricks fabric pyspark python spark

Last synced: 08 May 2026

https://github.com/thomd/git-scrape-hacker-news

scrape hacker news metadata for data analysis

data data-science git-scraping hacker-news

Last synced: 16 Sep 2025

https://github.com/oya163/corteva

Corteva Data Ingestion Pipeline

corteva data engineering etl

Last synced: 25 Jul 2025

https://github.com/shysolocup/stews

Stews is a Node.js package meant to make storing data easier by mixing parts from common data types.

aepl array arrays data datatypes html javascript js json map maps nodejs object objects package set sets stews

Last synced: 25 Jul 2025

https://github.com/discindo/natochak

Analysis of bicycle accidents in Macedonia using Rmarkdown and ggplot2

cycling data macedonia

Last synced: 19 Feb 2026

https://github.com/fairspec/fairspec-application

Fairspec Application is a visual tool for managing and validating tabular and structured data

ckan csv data dataset excel fair fairspec json ods polars python quality schema sqlite table typescript validation zenodo

Last synced: 23 May 2026

https://github.com/derrickbaruga7/python-data-analysis

This project analyzes ORU’s off-season sewer usage using Python, with `pandas` for data handling, histograms and line plots for exploration, and a `scipy`-based model for prediction. Pearson’s correlation and visualizations help reveal key trends and relationships.

analytics data data-science visualization

Last synced: 31 Jul 2025

https://github.com/tonykipkemboi/ens_subgraph_data

Query On-Chain Data from Subgraphs by The Graph Protocol using Python

data subgraphs thegraphprotocol web3

Last synced: 17 Sep 2025

https://github.com/theryston/db-mycro

A node module with a json database that saves data in a specific directory, similar to sqlite, but in JSON

base crud data database db db-mycro javascript json jsondatabase nodejs nosql typescript

Last synced: 09 Apr 2026

https://github.com/tiaanduplessis/country-currency-data

Data about currencies of countries

countries currencies data symbols

Last synced: 08 Aug 2025

https://github.com/dav009/bqt

Local unit tests for your BigQuery queries

bigquery bq data test unittest

Last synced: 11 Feb 2026

https://github.com/ddeutils/ddedocs

📖 Data Developer & Engineer Documents and Hands-On

blogs data data-engineering documents hands-on

Last synced: 08 Aug 2025

https://github.com/qeeqbox/data-classification

Data classification defines and categorizes data according to its type, sensitivity, and value

classification data data-classification infosecsimplified qeeqbox

Last synced: 09 Mar 2026

https://github.com/jorgeatgu/casa-caida-bot

Twitter bot about depopulation in Aragón

aragon bot data data-viz despoblacion twitter-bot

Last synced: 11 Aug 2025

https://github.com/helosantosdesousa/analise-previsao-de-rotatividade-ml

Final project of the Data Girls 2025 Bootcamp, analyzing employee turnover using machine learning. Based on the IBM HR Analytics Attrition dataset, the project identifies the main risk factors and builds predictive models (SVC and Random Forest) with up to 89% accuracy to anticipate departures and support strategic HR decisions.

analise-de-dados analise-exploratoria bootcamp ciencia-de-dados colab-notebook dados data data-analysis data-science dataanalytics dataframe eda machine-learning machine-learning-algorithms pandas python random-forest svc

Last synced: 16 Apr 2026

https://github.com/stdlib-js/ndarray-base-zeros-like

Create a zero-filled ndarray having the same shape and data type as a provided ndarray.

base data fill filled javascript matrix ndarray node node-js nodejs stdlib structure types vector zeros

Last synced: 04 Oct 2025

https://github.com/rishabh-agarwal/datastructuremachineproblem

Data Structure MP - Clemson University (Language C)

273 alogrithms clemson data ece structure university

Last synced: 26 Oct 2025

https://github.com/pradeep221b/turbofan_predictive_maintenance

An R project for predicting turbofan engine RUL using {targets} and {tidymodels}.

data data-science-portfolio machine-learning nasa preditive-maintaince r rstats targets-pipeline tidymodels

Last synced: 04 Oct 2025

https://github.com/garcane/income-prediction-ml

This is a machine learning project aimed at predicting whether an individual's annual income exceeds $50,000 based on their demographic and personal information.

data data-science machine-learning ml numpy pandas python random-forest scikit-learn

Last synced: 08 Apr 2026

https://github.com/DefinetlyNotAI/VulnScan_Data

Logicytics VulnScan Module's Training Data and old model archive

ai data logicytics ml models pytorch sensitive-files text-processing tfidf-text-analysis training-data

Last synced: 17 Aug 2025

https://github.com/freddy03h/immutable-data-structure

Normalize and Merge your application's data store using Immutable.JS objects

data immutable redux store

Last synced: 05 Oct 2025

https://github.com/gematik/app-fhir-snapshots-package-generator

The repository contains a library and a console application to generate snapshots for StructureDefinitions in FHIR-packages.

data fhir miscellaneous

Last synced: 05 Oct 2025

https://github.com/stdlib-js/ndarray-base-empty-like

Create an uninitialized ndarray having the same shape and data type as a provided ndarray.

base data empty javascript matrix ndarray node node-js nodejs stdlib structure types vector

Last synced: 09 Mar 2026

https://github.com/giorgiosavastano/process

processing-chain provides a convenient way to seamlessly set up processing chains for large amounts of data.

big-data data data-science parallel parallel-computing process processing processing-chain rust

Last synced: 05 Oct 2025

https://github.com/mascanho/ruddit

CLI to interact with Reddit's API to programmatically retrieve data

cli data marketing rust rust-lang rustlang sales

Last synced: 19 Aug 2025

https://github.com/gusenov/qazaqstan-geography-data

:world_map: Geographic data of Kazakhstan.

data geographic-data geography json kazakhstan qazaqstan regions

Last synced: 20 Feb 2026

https://github.com/carlotta94c/sql4datascientistsdemo

Demo material for Microsoft Reactor session "Getting Started with Databases: SQL and Data Visualizations"

analysis data r sqlite tidyverse visualisation

Last synced: 18 Apr 2026

https://github.com/labwhatever/leetcode

Collection of LeetCode questions to ace the coding interview!

data data-structures-and-algorithms dsa leetcode-cpp leetcode-solutions structure structure-learning

Last synced: 22 Aug 2025

https://github.com/jerryfzhang/rockets

A Node + React App that displays space launch missions around the world.

bootstrap data expressjs less momentjs nodejs react reactjs reactstrap

Last synced: 10 Apr 2026

https://github.com/jessielw/parse-fel-master-data

Simple CLI to parse Dolby Vision master data via the RPU/MediaInfo and output data needed for x265

data dolby fel master mediainfo mi parse rpu vision

Last synced: 26 Aug 2025

https://github.com/xdrokra/road-accident-analytics

A data visualization project that maps and analyzes road accidents across major Italian municipalities in 2023

analytics data design italy javascript

Last synced: 30 Aug 2025

https://github.com/tatey/list_of_baby_names

A list of baby names given to tiny humans in Ruby

data names ruby

Last synced: 11 Nov 2025