An open API service indexing awesome lists of open source software.

data

Individual facts, statistics, or items of information, often numeric. In a technical sense, data are a set of values of qualitative or quantitative variables about one or more persons or objects. (https://en.wikipedia.org/w/index.php?title=Data&oldid=1093674723, released under CC BY-SA 3.0)

https://github.com/tpgillam/teafiles.jl

Tea file support for Julia

data julia time-series

Last synced: 03 Oct 2025

https://github.com/qeeqbox/data-classification

Data classification defines and categorizes data according to its type, sensitivity, and value

classification data data-classification infosecsimplified qeeqbox

Last synced: 09 Mar 2026

https://github.com/stdlib-js/ndarray-base-empty

Create an uninitialized ndarray having a specified shape and data type.

base data empty javascript matrix ndarray node node-js nodejs stdlib structure types vector

Last synced: 19 Feb 2026

https://github.com/pradeep221b/turbofan_predictive_maintenance

An R project for predicting turbofan engine RUL using {targets} and {tidymodels}.

data data-science-portfolio machine-learning nasa preditive-maintaince r rstats targets-pipeline tidymodels

Last synced: 04 Oct 2025

https://github.com/garcane/income-prediction-ml

This is a machine learning project aimed at predicting whether an individual's annual income exceeds $50,000 based on their demographic and personal information.

data data-science machine-learning ml numpy pandas python random-forest scikit-learn

Last synced: 08 Apr 2026

https://github.com/arif-miad/heart-attack-risk-prediction

This dataset explores key factors influencing heart attack risk, such as age, cholesterol, blood pressure, and lifestyle habits, using machine learning models.

classification data data-science matplotlib ml pandas-python seaborn visualization

Last synced: 18 Aug 2025

https://github.com/giorgiosavastano/process

processing-chain provides a convenient way to seamlessly set up processing chains for large amounts of data.

big-data data data-science parallel parallel-computing process processing processing-chain rust

Last synced: 05 Oct 2025

https://github.com/mascanho/ruddit

CLI to interact with Reddit's API to programmatically retrieve data

cli data marketing rust rust-lang rustlang sales

Last synced: 19 Aug 2025

https://github.com/jerryfzhang/rockets

A Node + React App that displays space launch missions around the world.

bootstrap data expressjs less momentjs nodejs react reactjs reactstrap

Last synced: 10 Apr 2026

https://github.com/miniql/miniql-express-mongodb-example

A MiniQL example for querying a MongoDB database through an Express REST API.

data database mongodb query query-language

Last synced: 19 Apr 2026

https://github.com/jessielw/parse-fel-master-data

Simple CLI to parse Dolby Vision master data via the RPU/MediaInfo and output data needed for x265

data dolby fel master mediainfo mi parse rpu vision

Last synced: 26 Aug 2025

https://github.com/kunalshelke90/predict-bank-credit-risk-using-south-german-credit-data

This is an end-to-end ML project that aims to develop a classification model for assigning a given customer profile to one of two risk categories (safe or not safe). The final classifier used for this project is the CatBoost classifier. Deployed on AWS.

aws cassandra catboost-classifier classification credit-risk data data-science dataanalysis dockerfile finance financial-analysis flask github-actions logging machine-learning mlflow numpy pandas python

Last synced: 03 Jan 2026

https://github.com/tatey/list_of_baby_names

A list of baby names given to tiny humans in Ruby

data names ruby

Last synced: 11 Nov 2025

https://github.com/nafisalawalidris/sales-performance-dashboard

Sales Performance Dashboard: Analyze and visualize sales data using Power BI. Gain insights into trends, customer segments, product performance, and geographic distribution. Make data-driven decisions to optimize sales strategies and maximize revenue.

analytics-revenue dashboard-power-bi data data-analysis intelligence-sales optimization performance sales visualization-business

Last synced: 03 Feb 2026

https://github.com/ngambip/priscilla

About my work and experience

accounting analytics data finance-management

Last synced: 03 Feb 2026

https://github.com/gappeah/global-shipping-analytics-dashboard

This Tableau project provides a comprehensive visual analysis of global sales, shipping costs, and quality metrics across different regions and countries.

data data-analysis data-analyst data-visualization metrics tableau

Last synced: 25 Feb 2025

https://github.com/francescodisalesgithub/data-for-developers

Simple SQL database of problems and solutions found on Stack Overflow, in documentation, or via ChatGPT

chatgpt data database developer hacker hacking knowledge solutions sql targets

Last synced: 22 Mar 2025

https://github.com/codenoid/webtoons.com-database

A Webtoons.com database, collected by Hofesh Bot (scraper)

data database

Last synced: 28 Mar 2025

https://github.com/alja7dali/swift-bits

A bite-sized library for dealing with bytes.

binary bit bits byte bytes comprehension data manipulation swift

Last synced: 09 Jun 2026

https://github.com/makepath/medaprep

medaprep is a data preparation and feature engineering toolkit for geospatial applications.

data data-science datacleaning eda exploratory-data-analysis xarray

Last synced: 29 Jun 2025

https://github.com/rremple/intervalidus

For all your interval-based data needs.

data intervals

Last synced: 21 Feb 2026

https://github.com/jayantur13/kountry

Node module variant of the Country API

api data jsdelivr kountry nodejs npm npm-module npm-package unpkg yarn

Last synced: 26 Jan 2026

https://github.com/themost-framework/memory

MOST Web Framework in-memory data adapter for testing environments

adapter data orm

Last synced: 01 Jul 2026

https://github.com/coqui123/tradegpt

TradeGPT is a full-stack cryptocurrency trading application that combines a modern Fresh (Deno) frontend with a Python (FastAPI) backend for Coinbase integration and Azure AI Services for intelligent trading analysis. 💹

analytics automation cryptocurrency data deno fastapi fresh numpy python trading-algorithms trading-strategies tradingbot typescript

Last synced: 11 Apr 2026

https://github.com/rohancyberops/rp1

This project performs an analysis of Starbucks (SBUX) stock returns using R. The analysis includes both simple returns and continuously compounded returns (CC returns) for a period of one month. It also calculates the growth of $1 invested in SBUX and provides visual insights through various plots.

analysis cc data r rlanguage sbux

Last synced: 15 Mar 2025

https://github.com/ahmadjamil888/facial-recognition-ai-model

A facial recognition AI model powered by a CNN and trained on thousands of images.

ai cnn data data-science facial facial-recognition recognition

Last synced: 30 Jun 2025

https://github.com/sbdk-dev/sbdk.dev

A complete reference implementation of a local-first ecosystem for AI-powered analytics. This repository contains the source code for the SBDK.dev website, the central hub for the SBDK suite of open-source tools.

ai-powered-analytics data data-engineering data-engineeringlocal-first data-pipeline-automation data-pipelines dbt dlt duckdb elt etl-pipeline llm local-first machine-learning pipeline sbdk semantic-layer

Last synced: 27 May 2026

https://github.com/cosmos-loops/cosmos-dapper

Cosmos.Dapper is a part of Cosmos.Data, an inline project of COSMOS LOOPS PROGRAMME. This repository provides a package of StackExchange.Dapper to improve development efficiency.

dapper data mysql mysqlconnector oracle postgresql sql-query sqlite sqlkata sqlserver

Last synced: 11 Apr 2026

https://github.com/cintia0528/data_analytics_and_visualization-sql_tableau

Evaluate Magist as a strategic partner for Eniac's Brazilian expansion. Use SQL to analyze growth, tech accessory sales potential, delivery times, and customer satisfaction in Magist's database.

data dataanalysis datavisualization sql strategy tableau

Last synced: 31 Mar 2025

https://github.com/ttitcombe/timekeep

Defensive time series analysis in Python

data data-science sklearn time-series time-series-analysis timeseries

Last synced: 05 Jan 2026

https://github.com/mtingers/opacify

Opacify reads a file and builds a manifest of external sources to rebuild said file.

backup data obfuscation python

Last synced: 18 May 2026

https://github.com/dataship/beam

Get collimate'd data into Frame, in Node or the Browser

column-store data data-science

Last synced: 27 Apr 2026

https://github.com/lamden/merk

A concise implementation of a merkle tree in Python.

crypto data hash merkle structure tree

Last synced: 27 May 2026

https://github.com/rayenfathallah/students_analysis

This project contains an analysis of the different factors affecting students' performance in their final exams. The project uses D3.js to create interactive dashboards that are compelling and easy to interpret.

analysis d3 data education javascript python students

Last synced: 12 Apr 2026

https://github.com/igorskyflyer/npm-adblock-header-extract

✂️ Parse and extract ad-block filter list headers with ease. Works on strings or files, trims whitespace, and returns clean metadata for tooling and automation. 📃

adblock back-end biome data filter header igorskyflyer javascript js metadata node nodejs npm string ts typescript utility

Last synced: 11 Mar 2026

https://github.com/zituocn/dean

Task flow framework for data processing

data golang task

Last synced: 18 Jan 2026

https://github.com/bijx/firestore-data-fetcher

A simple Python script to fetch documents from a Firebase Firestore collection and save them to a local `.json` file.

automation data database downloader exporter fetcher firebase firestore open-source script

Last synced: 12 Apr 2026

https://github.com/cqllum/schema2dwh

⚡ Automatically produce a data model for your database from its information schema using GenAI.

ai data data-structures dataengineering datawarehousing dwh gemini gemini-api genai reporting reporting-tool schema-design

Last synced: 13 Mar 2025

https://github.com/ntia/compound_radar_waveforms-data

Data used by NTIA/ITS TR-23-566 Examining the Effects of Resolution Bandwidth when Measuring Compound Radar Waveforms.

bandwidth data measurement p0n q3n radar resolution stepped waveform

Last synced: 27 Jan 2026

https://github.com/datenoio/internacia-db

Public registry of intergovernmental organizations, country groups, and countries. Available as JSONL, Parquet, YAML, and DuckDB database datasets.

countries data datasets international international-trade reference

Last synced: 29 May 2026

https://github.com/toransahu/metoffice

Data visualisation - MetOffice

data metoffice uk visualization weather

Last synced: 25 Mar 2025

https://github.com/edugmenes/azure-data-engineering

This repository contains my first end-to-end Data Engineering project, built using Microsoft Azure Cloud and Azure Databricks with PySpark.

azure cloud data data-engineering data-lakehouse data-structures databricks delta-lake etl-pipelines lakehouse lakehouse-architectures medallion-architecture microsoft-azure pyspark spark

Last synced: 29 Jan 2026

https://github.com/eugenedakin/caesarcipher

Native Xojo code for the Caesar Cipher algorithm with an example program

caesar-cipher data decryption encryption xojo

Last synced: 07 Jan 2026

https://github.com/bastianolea/campamentos_chile

Data from the 2024 national registry of campamentos (informal settlements), from the Ministerio de Vivienda y Urbanismo (Chile's Ministry of Housing and Urban Planning)

chile comunas data pobreza social

Last synced: 24 Aug 2025

https://github.com/cleanzr/restaurant

Restaurant data set for entity resolution

data linkage

Last synced: 11 Mar 2026

https://github.com/fiskeben/meetjescraper

HTTP proxy for Meet je stad project

api data go iot meetjestad proxy scraper weather

Last synced: 29 May 2026

https://github.com/grycap/cdmi-client-go

A basic Go library to perform CDMI core operations

cdmi cloud data go

Last synced: 21 Jan 2026

https://github.com/quasilyte/phpcorpus

A collection of various PHP code; useful for PHP tool writers to get some insight into what "real-world" PHP code looks like

analysis corpus data php php-corpus

Last synced: 04 Jul 2025

https://github.com/tether/tether-schema

Custom protocol buffer schema for data validation

data protocol schema validation

Last synced: 09 Apr 2025

https://github.com/cainmi/data-page-project

A repository to pull code and files from; it may also be used to store page data links, code, etc. Mainly used for Python for now.

data html javascript python schema

Last synced: 21 Oct 2025

https://github.com/stdlib-js/ndarray-base-to-reversed

Return a new ndarray where the order of elements of an input ndarray is reversed along each dimension.

base data flip javascript matrix ndarray node node-js nodejs reverse slice stdlib structure to-reversed types vector view

Last synced: 12 Apr 2026

https://github.com/devlive-community/mockaroo

A lightweight HTTP mock server for quickly building mock data APIs, suited to front-end and back-end development and API testing scenarios.

data mock

Last synced: 08 Jul 2025

https://github.com/himel-sarder/web-scraping-it-jobs-dataset

This project is a Python-based web scraping tool that collects job listings from TimesJobs for IT-related positions. It extracts job titles, company names, locations, and experience requirements, and saves the data into a CSV file. The tool uses BeautifulSoup and Pandas for web scraping and data manipulation.

data datascience dataset kaggle-dataset machine-learning machinelearning ml web-scraping

Last synced: 22 Feb 2026

https://github.com/yasenstar/powerbi_tutorial

Based on the "PowerBI Tutorial" book; provides step-by-step video demos for learning and mastering the Power BI tool

analytics data microsoft powerbi tutorial visualization

Last synced: 07 Jan 2026

https://github.com/geo-y20/uber-rides-data-analysis

This project aims to analyze Uber ride data to understand various aspects of ride usage, such as the distribution of rides across different categories, purposes, months, days, and times.

dashboard dashboard-templates data data-analysis data-analysis-python data-analytics data-visualization pandas powerbi python recommendation-system rides uber

Last synced: 13 Apr 2026

https://github.com/bukalapak/bukadata

Data supplier plugin for populating design with real data.

data plugin sketch sketch-plugin

Last synced: 05 Jul 2025

https://github.com/nikhilash45/live_ipl_report

This repository hosts the source code for an interactive IPL (Indian Premier League) Dashboard built using PowerBI. The dashboard provides real-time updates on ongoing matches, including live scores, batting and bowling statistics for both teams, and the points table.

analysts cleaning-data cricket-data dashboard data data-analysis data-visualization dax powerbi

Last synced: 19 Mar 2026

https://github.com/varbrad/mindb

🗄 🔍 ⚡️ Schema-less document-oriented collection model data-store for Node & Browsers.

browser data datastore db document javascript json-schema mongo mongodb nodejs nosql query schema

Last synced: 13 Apr 2026

https://github.com/gkapfham/ast2016-paper

Source Code of and Supporting Files for a Paper Published at AST 2016

data latex-document paper research

Last synced: 19 Oct 2025

https://github.com/ispyhumanfly/prowler

Query the web, extract data from the results, and transform that data into a format you can use.

ai analytics business cryptocurrency data extract-data machine-learning mining scraping web

Last synced: 06 Sep 2025

https://github.com/cintia0528/data_cleaning_and_analytics-python

Evaluate if aggressive discounting benefits Eniac long-term, considering differing views on customer acquisition and brand positioning. Focus on data cleaning for informed decision-making.

colab-notebook data data-analysis datacleaning dataquality jupyter-notebook matplotlib pandas python seaborn

Last synced: 08 Jan 2026

https://github.com/jrcichra/ingestd

HTTP server that easily ingests data into a database

data gin hacktoberfest ingest ingestion restful-api

Last synced: 28 Apr 2026

https://github.com/garcane/global-shipping-analytics-dashboard

This Tableau project provides a comprehensive visual analysis of global sales, shipping costs, and quality metrics across different regions and countries.

data data-analysis data-analyst data-visualization metrics tableau

Last synced: 01 Mar 2026

https://github.com/josephtlyons/prefix_tree

A rusty implementation of a prefix tree.

data prefix rust structure tree

Last synced: 21 Jun 2025

https://github.com/jmcanterafonseca/leaflet-context-information

A Leaflet plugin + infrastructure for getting access to Context Information (i.e. data) exposed through FIWARE NGSIv2

context data fiware information leaflet map open visualization web

Last synced: 21 Apr 2026

https://github.com/programmer-rd-ai/moviedatascraper

Explore the cinematic universe with our IMDb web scraping project! Dive into movie data with ease, uncovering insights from cast to critical reviews. With dynamic visualizations and reliable data, let's journey through the world of movies like never before. Lights, camera, analysis!

beautifulsoup beautifulsoup4 data data-analysis jupyter-notebook matplotlib numpy pandas programming python python3 scraping seaborn software web

Last synced: 01 Mar 2025

https://github.com/basemax/buskool.com-data

This repository contains the collected product data from the Buskool website (باسکول). The data is stored in 20k+ JSON files, each containing detailed information about products available on the website.

buskool buskoolcom data farsi information ir iran json persian

Last synced: 03 Apr 2025

https://github.com/unownone/spenddy-link

Simple privacy-friendly Chrome extension to track your spending and more!

analytics data extension link

Last synced: 12 Mar 2026

https://github.com/ncgl-git/eriparse

Python code to parse the cost-of-living HTML from erieri.com, i.e. https://www.erieri.com/cost-of-living/united-states/illinois/chicago

cost-of-living crime crime-data data economic-research-institute erieri webscraper

Last synced: 14 Jan 2026

https://github.com/aiwithqasim/competitive-programming

I will add all the material I have worked through, or will work through in the future, to improve my programming skills and become a competitive programmer

c-plus-plus code data java programming structured-data

Last synced: 20 May 2026

https://github.com/vincentlaucsb/csv-data

A curated repository of real and fake CSV data for use in testing suites

csv data test testing

Last synced: 08 Mar 2026

https://github.com/stdlib-js/array-base-fancy-slice-assign

Assign element values from a broadcasted input array to corresponding elements in an output array.

array assign assignment copy data fancy generic javascript node node-js nodejs shallow slice stdlib structure subseq subsequence types

Last synced: 06 Oct 2025

https://github.com/outofbedlam/tine

TINE: a data pipeline runner.

data pipeline

Last synced: 05 Oct 2025

https://github.com/wangshouh/cryptofinancedata

An ipynb file containing data acquisition of futures, options and other financial derivatives

data financial-data

Last synced: 05 Oct 2025

https://github.com/helins/ex.clj

Java exceptions as Clojure data

clojure data exception java java-exceptions

Last synced: 12 Dec 2025

https://github.com/igorwastaken/math-problems

Solve math problems easily with this utility library.

algorithm area data demography geography javascript math npm package population school typescript util utils

Last synced: 23 Feb 2026

https://github.com/iwconfig/svtplay-data

Daily JSON backup of content metadata from SVTPlay

data metadata streamlink svtplay svtplay-dl youtube-dl

Last synced: 24 Oct 2025

https://github.com/tushar2704/insurance-cross-sell

This project harnesses the power of cutting-edge technologies including H2O AutoML, MLflow, FastAPI, and Streamlit to enhance cross-selling campaigns and boost efficiency.

data datascience h20automl machine-learning mlflow python streamlit-tushar2704

Last synced: 08 Oct 2025

https://github.com/patrickdavies100/datapipeline37

Some Data Science practice using datasets available online. Currently test data is similar to this dataset: https://www.kaggle.com/datasets/asaniczka/amazon-uk-products-dataset-2023 but the plan is to expand.

data data-science pandas-dataframe python3

Last synced: 08 Oct 2025

https://github.com/scienxlab/datasets

Some small datasets for demos, courses, testing, etc.

data open-data sample-data teaching-resources

Last synced: 09 Oct 2025

https://github.com/alexandregazagnes/rica-analysis

This repository contains the code to download, analyse, and model the RICA dataset from the French Ministry of Agriculture.

analysis argiculture business data data-analysis data-analytics food python

Last synced: 29 Apr 2026

https://github.com/definetlynotai/vulnscan_data

Logicytics VulnScan Module's Training Data and old model archive

ai data logicytics ml models pytorch sensitive-files text-processing tfidf-text-analysis training-data

Last synced: 11 Oct 2025