An open API service indexing awesome lists of open source software.

data

Individual facts, statistics, or items of information, often numeric. In a technical sense, data are a set of values of qualitative or quantitative variables about one or more persons or objects. (https://en.wikipedia.org/w/index.php?title=Data&oldid=1093674723, released under CC BY-SA 3.0)

https://github.com/seanowenhayes/recipe-scraper

A simple scraper that uses Puppeteer to scrape recipes and more from the web

crawler crawling data recipes scraping

Last synced: 22 Feb 2026

https://github.com/chriseaton/sample-database

A long-term supported sample dataset for file and database unit testing and validation. Simple, straightforward, raw data shared across formats.

data database examples flat-file samples schema unit-testing

Last synced: 25 Apr 2026

https://github.com/theryston/db-mycro

A Node module with a JSON database that saves data in a specific directory, similar to SQLite, but in JSON

base crud data database db db-mycro javascript json jsondatabase nodejs nosql typescript

Last synced: 09 Apr 2026

https://github.com/vincentlaucsb/csv-data

A curated repository of real and fake CSV data for use in testing suites

csv data test testing

Last synced: 08 Mar 2026

https://github.com/avitai/datarax

A Differentiable Data Pipeline Framework for JAX

autograd data data-analysis data-science differentiable flax-nnx jax jit machine-learning xla

Last synced: 25 Apr 2026

https://github.com/ahmad-ali-rafique/pyviznotebook

PyVizNotebook is a collection of Matplotlib visualizations demonstrating a wide range of plot types and techniques for data visualization. Whether you're a beginner looking to learn or an experienced developer seeking inspiration, this repository offers a diverse set of examples to explore.

analytics colab-notebook data data-science data-visualization dataanalytics matplotlib-python plots seaborn-python visualization

Last synced: 06 Jun 2026

https://github.com/desktopcleaner/naturemagazinescraper

Scrapes open-access Nature magazine articles and stores them as txt files.

data nature-magazine python scrapper word-frequency

Last synced: 06 Feb 2026

https://github.com/s-raza/csvio

Wrapper for conveniently processing CSV files

csv data file processing wrapper

Last synced: 14 Jan 2026

https://github.com/outofbedlam/tine

TINE, a data pipeline runner.

data pipeline

Last synced: 05 Oct 2025

https://github.com/sycho9/populater

:elephant: PHP script that populates your database tables with fake data using fzaninotto/faker

composer data database fake packagist php populate

Last synced: 13 Apr 2026

https://github.com/wangshouh/cryptofinancedata

An ipynb file containing data acquisition of futures, options and other financial derivatives

data financial-data

Last synced: 05 Oct 2025

https://github.com/zalweny26/open_data_unipa

Project for the Laboratorio di Algoritmi (Algorithms Laboratory) exam, 23-24, UniPa, Computer Science L-31

data open project python

Last synced: 26 Apr 2026

https://github.com/grkndev/twitcher

A great library that will allow you to use the Twitch API service. All you need to do is use your Token and Client Id information.

api clip clipr data javascript nodejs npm npm-package npmjs streamers streaming twitch twitch-api twitch-bot twitchtv twtich-clip user

Last synced: 09 Mar 2026

https://github.com/stdlib-js/ndarray-base-output-policy-str2enum

Return the enumeration constant associated with an output ndarray data type policy string.

array data dtype dtypes enum javascript multidimensional ndarray node node-js nodejs policy stdlib types util utilities utility utils

Last synced: 15 Apr 2026

https://github.com/aero-db/airports

A public and free dataset of all airports in the world

airports aviation csv data dataset json

Last synced: 27 Apr 2026

https://github.com/petermartens98/nba-analytics-streamlit-app-with-langchain-agent

Interactive NBA Analytics app with Streamlit and a LangChain conversational agent connected to extracted data. Explore player, team, and game stats, track injuries, run simulations, visualize trends, and get AI-powered insights. Ongoing development, open to collaboration.

agentic-ai analysis data deepseek langchain nba python streamlit visualization

Last synced: 08 May 2026

https://github.com/karthikmprakash/github_repos_scraper

A tool to extract the names of any user's GitHub repos

automation bs4 data github python repositories requests webscraping

Last synced: 27 Apr 2026

https://github.com/gmersy/data-carbon

Repository accompanying the paper: Toward a Life Cycle Assessment for the Carbon Footprint of Data

carbon-emissions carbon-footprint climate-change data data-science sustainability sustainable-software

Last synced: 31 Mar 2025

https://github.com/iguptashubham/walmart-eda

Imagine diving into the fascinating world of Walmart with just a few lines of code! This project lets you do that using MySQL, a powerful tool for data analysts. You can clean up messy data like a detective, uncovering hidden patterns and trends. Data scientists can take it further,.

analysis data dataset eda mysql portfolio-project python sql

Last synced: 10 Apr 2026

https://github.com/priyanshubiswas-tech/deloitte-daikibo-forensic-analysis-task-2

Forensic pay equity analyzer for Deloitte. Processes compensation data to classify gender equality scores into Fair/Unfair/Discriminative tiers. Outputs modified Excel with 3-tier evaluation system.

data data-analysis deloitte excel forensic-analysis

Last synced: 06 Feb 2026

https://github.com/anobaka/insidecollector

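This software sits between Excel and a pure record-keeping tool. You can freely create all kinds of lists, link them together using various rules, and create custom views to help you better understand your data.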

collection data excel-like list list-manager table

Last synced: 19 Jan 2026

https://github.com/nightroman/farnet.fsharp.data

FSharp.Data package for FarNet.FSharpFar

data farmanager farnet fsharp

Last synced: 27 Apr 2026

https://github.com/R-Mahesh45/HR---Resume-Text-Classification

Text Classification for Resumes: Conducted Exploratory Data Analysis (EDA) on a vast collection of resumes. Organized the data using Bag of Words (BoW) and TF-IDF techniques. Built and evaluated multiple models, with Logistic Regression delivering standout performance. Created Word Clouds and Histograms.

data datacleaning extract-transform-load feature-extraction nlp nltk-tokenizer text-mining text-processing

Last synced: 13 Oct 2025

https://github.com/stdlib-js/datasets-herndon-venus-semidiameters

Fifteen observations of the vertical semidiameter of Venus, made by Lieutenant Herndon, with the meridian circle at Washington, in the year 1846.

astronomy data dataset datasets grubbs herndon javascript node node-js nodejs outlier outliers sample statistics stats stdlib venus

Last synced: 09 Oct 2025

https://github.com/jinsyin/datagovernance

WeChat official account: 「数据之道」 (The Way of Data)

data data-governance datagovernance governance

Last synced: 30 Jan 2026

https://github.com/jtpio/data-playground

Experiments using public APIs and data

data experiments python

Last synced: 28 Apr 2026

https://github.com/ahmetcansolak/developer-insights

New project of ClubRockers from Sarıyer Hills

bitbucket data data-science data-visualization github python3

Last synced: 28 Apr 2026

https://github.com/tatey/list_of_countries

A list of countries, states, and cities in Ruby

cities countries data ruby states

Last synced: 11 Nov 2025

https://github.com/oneblack333/pizza_sales_analysis

The project involves transforming raw pizza sales data into actionable business intelligence through analysis and visualization. This enables pizza business owners to make data-driven decisions on inventory, staffing, and marketing, ultimately improving performance and profitability.

data data-structures data-visualization excel mysql powerbi

Last synced: 20 Jun 2026

https://github.com/sandk21/etude_eau_potable_monde

Study of access to water around the world - dashboards built with Tableau

analysis data tableau tableau-public visualization

Last synced: 19 Mar 2026

https://github.com/helins/ex.clj

Java exceptions as Clojure data

clojure data exception java java-exceptions

Last synced: 12 Dec 2025

https://github.com/reubano/ckanny

A Python command line interface (CLI) for interacting with CKAN instances

ckan cli data featured open-data

Last synced: 28 Apr 2026

https://github.com/tiaanduplessis/country-currency-data

Data about currencies of countries

countries currencies data symbols

Last synced: 08 Aug 2025

https://github.com/pferreirafabricio/data-immersion

🏊🏻‍♂️ Activities and exercises from the 'Imersão Dados' event

data data-analysis data-science dataset jupiter-notebook python

Last synced: 14 May 2026

https://github.com/Lemniscate-world/StratAI

This project analyzes financial assets using a Hidden Markov Model (HMM) to identify different market regimes and patterns. The analysis includes calculating daily returns, rolling volatility, and volume changes, and visualizing the hidden states identified by the HMM.

ai assets data data-science data-visualization finance financial-analysis fintech hmm-model hmmlearn machine-learning trading

Last synced: 13 Oct 2025

https://github.com/willdev12/rjson

Encryptable JSON file format for .NET projects!

csharp csharp-library data dotnet json json-data json-plugin variables vbdotnet vbnet

Last synced: 11 Apr 2026

https://github.com/aadityatamrakar/futures_spread_chart

Cash Market & Futures Daily Spread Chart - NSE Stocks

data data-analysis data-mining expressjs nodejs requests

Last synced: 10 Apr 2026

https://github.com/connectaman/deepseek-ocr-multigpu-infer

Efficient multi-GPU OCR inference framework leveraging parallel processes for accelerated token throughput and faster batch processing. Designed for scalable, high-performance optical character recognition workloads using PyTorch. Supports dynamic GPU assignment, optimized resource utilization, and easy integration for large-scale image datasets.

agentic-extraction data deepseek document-parser extraction extractor gpu image-parser llm multigpu nvidia ocr parallel-computing parser pdf-parser vlm

Last synced: 22 Jan 2026

https://github.com/castdrian/kdapi

A TypeScript library that scrapes K-pop idol and group information from online sources to create comprehensive JSON datasets.

api data kpop scraper typescript

Last synced: 15 May 2025

https://github.com/cintia0528/data_analytics_and_visualization-sql_tableau

Evaluate Magist as a strategic partner for Eniac's Brazilian expansion. Use SQL to analyze growth, tech accessory sales potential, delivery times, and customer satisfaction in Magist's database.

data dataanalysis datavisualization sql strategy tableau

Last synced: 31 Mar 2025

https://github.com/simranjeet97/quotes-analysis

Kaggle Dataset on Quotes Analysis and Visualization With Python, Pandas and MatplotLib Using Jupyter Notebook.

data data-science datavisualization jupyter-notebook kaggle kaggle-dataset machine-learning matplotlib-pyplot numpy pandas python quotes quotes-application

Last synced: 15 Apr 2026

https://github.com/iotchulindrarai/reactlearning

Learning React concepts such as passing data with useState and props, both from child to parent and from parent to child

data passing props react usestate-hook

Last synced: 14 May 2026

https://github.com/stdlib-js/strided-base-dtype-resolve-str

Return the data type string associated with a supported strided array data type value.

array data dtype dtypes enum javascript node node-js nodejs stdlib strided types util utilities utility utils

Last synced: 13 Oct 2025

https://github.com/mbolam/DSWS_OpenRefine

Cleaning and Linking Data with OpenRefine

cleaning data metadata openrefine

Last synced: 07 Apr 2025

https://github.com/athul64/powerbi

Financial Reports Dashboard. This repository showcases a Financial Reporting Dashboard that visualizes key financial metrics and performance insights. The dashboard contains Monthly and Annual reports, allowing users to switch between the two views to analyze data at different intervals.

data data-an data-visualization dax dax-expression powerbi

Last synced: 23 Feb 2026

https://github.com/sodascience/open_supply_hub

Processing supply chain data obtained from Open Supply Hub

data global-supply-chain open-supply-hub python

Last synced: 29 Apr 2026

https://github.com/yord/klp-dsv

A delimiter-separated values plugin for klp (Kelpie), the small, fast, and magical command-line data processor.

csv data deserializer dsv json kelpie klp marshaller parser serializer ssv tsv

Last synced: 14 May 2026

https://github.com/gusenov/qazaqstan-geography-data

:world_map: Geographic data of Kazakhstan.

data geographic-data geography json kazakhstan qazaqstan regions

Last synced: 20 Feb 2026

https://github.com/dbriane208/omdena-apprenticeship-project

This is part of my contribution to the Omdena apprenticeship program.

data data-science feature-engineering machine-learning

Last synced: 14 Mar 2026

https://github.com/openearth/rws-viewer

This viewer is created by Deltares in cooperation with Voorhoede under OpenEarth GPL License. The viewer can be used via several RWS websites, please visit https://www.informatiehuismarien.nl/, https://waterinfo-extra.rws.nl/ and https://basismonitoringwadden.waddenzee.nl/.

data mapbox-gl-js ogc-services viewer

Last synced: 01 Feb 2026

https://github.com/elissorokin/data-analyst-portfolio-rus

This is a repository where I demonstrate my skills, share projects, and track my progress in data analysis and data science.

ab-testing data data-analysis datalense matplotlib numpy pandas plotly portfolio postgresql python scipy seaborn sql statistical-analysis

Last synced: 25 Feb 2026

https://github.com/abdul-rafay19/youngdevinterns_machine-learning_tasks

This internship offers hands-on exposure to real-world Machine Learning applications — from data visualization and preprocessing to model development, evaluation, and deployment. It focuses on real ML workflows, problem-solving, neural networks, and hyperparameter tuning — all within a collaborative, remote, and growth-oriented environment.

ai artificial-intelligence artificial-intelligence-algorithms artificial-neural-networks data data-visualization internship machine-learning machine-learning-algorithms machinelearning ml model model-development neural-network preprocessing programming-language python task tasks youngdevintern

Last synced: 29 Apr 2026

https://github.com/iwconfig/svtplay-data

Daily JSON backup of content metadata from SVTPlay

data metadata streamlink svtplay svtplay-dl youtube-dl

Last synced: 24 Oct 2025

https://github.com/milandjurdjevic/discriminalizer

.NET library designed for seamless JSON deserialization of objects with complex discrimination requirements, built on top of System.Text.Json.

data deserialization dotnet json

Last synced: 15 Apr 2025

https://github.com/neelravi/data-management

A data management plan for computational chemists/physicists and material scientists for a FAIR storage of raw data

data dmp fair management workflows

Last synced: 16 Jan 2026

https://github.com/chompfoods/sdk-php

PHP SDK for the Chomp Food & Recipe Database API. Use our API to get high-quality data on recipes and 875,000+ branded/grocery foods plus raw ingredients.

api branded chomp data database food grocery ingredients php raw recipe-api recipes sdk

Last synced: 30 Apr 2026

https://github.com/cosmos-loops/cosmos-dapper

Cosmos.Dapper is a part of Cosmos.Data, an inline project of COSMOS LOOPS PROGRAMME. This repository provides a package of StackExchange.Dapper to improve development efficiency.

dapper data mysql mysqlconnector oracle postgresql sql-query sqlite sqlkata sqlserver

Last synced: 11 Apr 2026

https://github.com/alrza2003/alrza2003.github.io

This repository contains the source files for my personal portfolio website. It highlights my background as a data analyst and radiology student, and showcases real-world projects, tools I use, and ways to connect with me. The site is based on a pre-built template that I customized to reflect my profile and experience.

data data-analysis data-visualization portfolio portfolio-website python

Last synced: 30 Apr 2026

https://github.com/jahilldev/immutable-parsejs

Parse a JS object or array/map into an Immutable collection. Makes use of the ImmutableJs List and Record primitives.

data immutablejs javascript json nodejs parse typescript

Last synced: 13 Apr 2026

https://github.com/lamden/merk

A concise implementation of a merkle tree in Python.

crypto data hash merkle structure tree

Last synced: 27 May 2026

https://github.com/mascanho/ruddit

CLI to interact with Reddit's API to programmatically retrieve data

cli data marketing rust rust-lang rustlang sales

Last synced: 19 Aug 2025

https://github.com/raynardj/r_notes

Learning notebooks of R

data docker guru99 jupyter learning r

Last synced: 09 May 2026

https://github.com/divanny/academixbackend

🧑‍🎓 Academix is a comprehensive academic management system designed to streamline and enhance the educational experience for both students and professors. This repository contains the backend codebase for the Academix system, responsible for handling data processing, authentication, and API endpoints.

backend csharp data net webapi

Last synced: 07 Jun 2026

https://github.com/giorgiosavastano/process

processing-chain provides a convenient way to seamlessly set up processing chains for large amounts of data.

big-data data data-science parallel parallel-computing process processing processing-chain rust

Last synced: 05 Oct 2025

https://github.com/connectaman/c-and-data-structure

Programs, notes, and explanations on data structures using C++

cpp data data-structures sorting-algorithms

Last synced: 14 Mar 2025

https://github.com/gdhhgnbnvbn/f1-2025-ai-predict

Fully generated by Claude 3.5 Sonnet via Windsurf IDE. Not a single line was written by hand.

agent-based-modeling claude csv data f1 gpt machine-learning model prediction predictive-modeling python rainforest streamlit vibe

Last synced: 01 May 2026

https://github.com/athari22/house_sales_in_king_count_usa

The idea of the project is to perform a data analysis for a Real Estate Investment Trust. The Trust would like to start investing in residential real estate.

analysis data data-science data-visualization ibm ibm-watson linearregression machine-learning matplotlib numpy pandas sklearn-library

Last synced: 01 May 2026

https://github.com/ggeop/multiple-fields-management

Fields management from/to different data sources. :bulb:

data data-engineering data-organization data-retrieval data-science pandas python

Last synced: 01 May 2026

https://github.com/danielgiljam/orbit-utils

A collection of utility packages for Orbit.js.

data inference orbit orbitjs schema synchronization type typescript validation zod

Last synced: 01 May 2026

https://github.com/henrylin03/china-gdp

Analysis and visualisation of China GDP data using Python.

data data-analysis data-visualisation dataset kaggle pandas

Last synced: 01 May 2026

https://github.com/stdlib-js/ndarray-slice-dimension-to

Return a read-only truncated view of an input ndarray along a specific dimension.

copy data javascript matrix ndarray node node-js nodejs slice stdlib structure truncate types vector view

Last synced: 29 Jun 2026

https://github.com/antononcube/raku-data-cryptocurrencies

Raku package of cryptocurrency data retrieval.

crypto cryptocurrency data

Last synced: 02 Apr 2025

https://github.com/mwmorale/storingencryptiondata

Welcome! Here, I am working with some very basic encryption. This is a work in progress and, for now, is only compatible with Windows OS. Using a password, a user can easily encrypt their “notes” file after writing it and later decrypt it when desired in order to view or edit their notes. This is hiding information in plain sight. Eventually, this project will be merged with my folder locker so that an encrypted file can be stored in a "locked" directory/folder. Avoid personal use, because I am releasing the encryption key and/or “cipher solution” in my code. To use it, run the file called “RUN_ME.py”.

cipher ciphertext data decryption encryption filesystem graphical gui gui-application notes privacy rotation-encryption secure security-tools user-interface whitehat

Last synced: 21 Jun 2026

https://github.com/arif-miad/heart-attack-risk-prediction

This dataset explores key factors influencing heart attack risk, such as age, cholesterol, blood pressure, and lifestyle habits, using machine learning models.

classification data data-science matplotlib ml pandas-python seaborn visualization

Last synced: 18 Aug 2025

https://github.com/brianali-codes/github-searcher

A website for API experimentation that uses the GitHub API to search for different users and some of their (public) information

api data github user

Last synced: 21 May 2026

https://github.com/stdlib-js/array-base-none-by

Test whether all elements in an array fail a test implemented by a predicate function.

all array data every generic javascript node node-js nodejs predicate stdlib structure test types validate

Last synced: 15 Apr 2026

https://github.com/liuliqiang/laueagle

YAML/JSON Lints and Converters

converter data formater json linter python serialization yaml

Last synced: 02 May 2026

https://github.com/raymondcm/strawberrydata

Tool suite for fast multi-camera strawberry data collection project. The standards document houses cross compatibility/purpose implementation details.

camera cpp data intel multi-camera

Last synced: 08 Feb 2026

https://github.com/freddy03h/immutable-data-structure

Normalize and Merge your application's data store using Immutable.JS objects

data immutable redux store

Last synced: 05 Oct 2025