An open API service indexing awesome lists of open source software.

data

Individual facts, statistics, or items of information, often numeric. In a technical sense, data are a set of values of qualitative or quantitative variables about one or more persons or objects. (https://en.wikipedia.org/w/index.php?title=Data&oldid=1093674723, released under CC BY-SA 3.0)

https://github.com/dataspoclab/dataspoc-lens

Virtual warehouse — SQL + Jupyter + AI over cloud Parquet via DuckDB

cli data data-engineering data-lake duckdb etl parquet python singer sql

Last synced: 20 Apr 2026

https://github.com/mrpudn/maltrends

(mirror) MyAnimeList.net manga and anime trend data.

anime data json jsonl jsonlines manga myanimelist

Last synced: 20 Apr 2026

https://github.com/mishra-krishna/analysis-and-optimization-of-supply-chain-operations

Analyzed supply chain data to identify trends and key factors. Visualized sales, defect rates, lead times, and costs. Used Decision Tree Regressor to find top features impacting product costs and lead times.

data dataanalytics datavisualization supplychain supplychainanalytics

Last synced: 20 Apr 2026

https://github.com/cicerotcv/br-gen

A browser extension for generating Brazilian placeholder data.

chrome data extension generation hacktoberfest

Last synced: 21 Apr 2026

https://github.com/aravind-selvam/bikeshare-company-analysis

Capstone project for the Google Data Analytics Professional Certificate program, analyzing a bike-sharing company.

analytics business-analytics business-intelligence data data-analysis data-visualization dataanalytics google-data-analytics postgresql sql sql-server

Last synced: 22 Apr 2026

https://github.com/tkonopka/makealive

Dynamic web content through controlled JavaScript

conversion-functions d3 data data-science javascript visualization

Last synced: 22 Apr 2026

https://github.com/ofelipelucca/cdc-kafka-debezium-pipeline

A real-time event-driven social network API built with CDC (Change Data Capture), Kafka, Debezium, PostgreSQL and MongoDB implementing CQRS-style architecture with streaming data pipelines.

cdc data data-engineering data-integration data-pipeline debezium event-driven fastapi kafka kafka-connect microservices mongodb postgresql python sqlalchemy

Last synced: 05 Jun 2026

https://github.com/yord/klp-core

A plugin with basic operations for klp (Kelpie), the small, fast, and magical command-line data processor.

csv data deserializer dsv json kelpie klp marshaller parser serializer ssv tsv

Last synced: 24 Apr 2026

https://github.com/zalweny26/open_data_unipa

Project for the Algorithms Laboratory exam, 2023-24, UniPa, Computer Science (L-31)

data open project python

Last synced: 26 Apr 2026

https://github.com/astrid-project/cb-manager

APIs to interact with the Context Broker's database. Through a REST Interface, it exposes data and events stored in the internal storage system in a structured way. It provides uniform access to the capabilities of monitoring agents.

agent beats control data ebpf elasticsearch log logstash management programmability security

Last synced: 30 Jun 2025

https://github.com/mohsinali08000/myportfolio

I’m Mohsin Ali, a passionate software engineer with over 2 years of experience in developing robust software solutions. Currently transitioning into the field of data science.

css data data-science html

Last synced: 22 Apr 2026

https://github.com/mitevpi/vue-d3-bar-chart

Reusable, reactive, animated bar chart using D3 + Vue.js. Written in idiomatic Vue, rather than D3 syntax.

d3 data data-visualization frontend interactive svg vue web

Last synced: 18 May 2026

https://github.com/rohancyberops/rp1

This project performs an analysis of Starbucks (SBUX) stock returns using R. The analysis includes both simple returns and continuously compounded returns (CC returns) for a period of one month. It also calculates the growth of $1 invested in SBUX and provides visual insights through various plots.

analysis cc data r rlanguage sbux

Last synced: 15 Mar 2025
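
The return measures named in the entry above can be sketched in a few lines. The repository itself uses R; this is a minimal Python illustration under stated assumptions, not the project's code. The function names and the sample prices are hypothetical.

```python
import math


def simple_returns(prices):
    """Simple return per period: P_t / P_{t-1} - 1."""
    return [p1 / p0 - 1.0 for p0, p1 in zip(prices, prices[1:])]


def cc_returns(prices):
    """Continuously compounded return per period: ln(P_t / P_{t-1})."""
    return [math.log(p1 / p0) for p0, p1 in zip(prices, prices[1:])]


def growth_of_one_dollar(returns):
    """Cumulative value of $1 invested, given a list of simple returns."""
    value = 1.0
    path = []
    for r in returns:
        value *= 1.0 + r
        path.append(value)
    return path


# Made-up closing prices, for illustration only.
prices = [100.0, 110.0, 99.0]
print(simple_returns(prices))
print(cc_returns(prices))
print(growth_of_one_dollar(simple_returns(prices)))
```

Continuously compounded returns add across periods, so their sum equals the log of the last price over the first. Simple returns compound multiplicatively instead.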

https://github.com/oefenweb/python-untraceables

Randomizes IDs for a given set of tables making them untraceable across environments

anonymize data database mysql privacy python python2 python3 randomization

Last synced: 03 Feb 2026

https://github.com/kingabzpro/makefile-actions

GitHub Actions and MakeFile tutorial and project for beginners.

actions analytics automation data data-science makefile

Last synced: 18 Apr 2026

https://github.com/ahmadjamil888/facial-recognition-ai-model

A facial recognition AI model powered by a CNN and trained on thousands of images.

ai cnn data data-science facial facial-recognition recognition

Last synced: 30 Jun 2025

https://github.com/sap-samples/security-research-codegraphsmote

Data augmentation strategy that can be applied to code graphs for learning-based vulnerability discovery.

augmentation data detection learning machine research sample security vulnerability

Last synced: 07 Jun 2026

https://github.com/seguradevinn/data-project

A healthcare data audit demo using CMS SynPUF and DuckDB, showing how raw claims are cleaned, validated, and transformed into a 2009 cohort with descriptives and a RADV-style chase list.

auditing cms data duckdb sql

Last synced: 02 Sep 2025

https://github.com/karthikmprakash/github_repos_scraper

A tool to extract the names of any user's GitHub repos

automation bs4 data github python repositories requests webscraping

Last synced: 27 Apr 2026

https://github.com/antononcube/raku-data-cryptocurrencies

Raku package of cryptocurrency data retrieval.

crypto cryptocurrency data

Last synced: 02 Apr 2025

https://github.com/yuvrajsaraogi/car-price-prediction-with-machine-learning

A car's price depends on many factors, such as brand goodwill, features, horsepower, and mileage. Car price prediction is a major research area in machine learning. If you want to learn how to train a car price prediction model, this project is for you.

car-price-prediction-with-machine-learning data data-science deep-learning deep-neural-networks engineer github learning machine-learning mini-project natural-language-processing prediction predictive-modeling project python3 sql

Last synced: 15 Apr 2026

https://github.com/iankitnegi/statistically_speaking

Explore diverse projects showcasing statistical techniques with real-world data, comprehensive docs, and interactive visualizations.

data excel statistical-analysis statistics

Last synced: 09 Feb 2026

https://github.com/kashifkhan7/cleaning-analysis_cli

Analyze sales data easily with our CLI app. Gain insights on revenue trends and visualize results using Python, Pandas, and Matplotlib. 🚀📊

conditional-statements css data datacleaning exception-handling exiftool html json matplotlib-pyplot metadata metadata-extraction pandas-python python sales-analysis seaborn-python speech-to-text transcription youtube

Last synced: 13 Apr 2026

https://github.com/fiddlydigital/anonimizer

A lib to replace and rehydrate sensitive data in text

anonimize anonymize data data-security prompt sanitize string string-manipulation text

Last synced: 15 Mar 2025
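
The replace-and-rehydrate idea in the entry above could look roughly like this. It is a sketch only and does not show the library's actual API: the function names, the `<EMAIL_n>` placeholder format, and the choice to handle only email addresses are all assumptions made for illustration.

```python
import re

# Deliberately simple email pattern; an assumption for this sketch, not a full validator.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")


def anonymize(text):
    """Replace each distinct email with a placeholder; return the text and the mapping."""
    mapping = {}  # placeholder -> original value

    def substitute(match):
        value = match.group(0)
        # Reuse the placeholder if this value was already seen.
        for token, original in mapping.items():
            if original == value:
                return token
        token = f"<EMAIL_{len(mapping) + 1}>"
        mapping[token] = value
        return token

    return EMAIL_RE.sub(substitute, text), mapping


def rehydrate(text, mapping):
    """Put the original values back in place of their placeholders."""
    for token, original in mapping.items():
        text = text.replace(token, original)
    return text


source = "Mail alice@example.com or bob@example.org, then alice@example.com again."
masked, mapping = anonymize(source)
print(masked)
print(rehydrate(masked, mapping) == source)
```

Keeping the mapping outside the masked text means the masked text can be passed to an untrusted consumer, such as a prompt, and the originals restored locally afterwards.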

https://github.com/srindot/average_flightdata_collection_fwuaav

This repository is designed for collecting average data for a flapping wing UAV. The script acg_coeff_data_collection.py runs the necessary data collection, and the resulting data is saved into a CSV file called AverageFlightData.csv.

data flaping-uav

Last synced: 18 Aug 2025

https://github.com/rse/nebulize

Nebulize Security-Sensitive Information

data dsgvo gdpr information nebulize security sensitive

Last synced: 16 Mar 2025

https://github.com/mochsyahrizal/jkfkjabar_studycase

First Data Analytics Study Case

data datanalytics studycase

Last synced: 15 Feb 2026

https://github.com/mikeqfu/network-rail-track-fixity-layer

This project develops a data mining tool for analysing and predicting track movements using asset data, environmental factors and track design knowledge to model key parameters and generate fixity values for the GB rail network.

data data-integration data-mining data-science information-management knowledge-discovery point-cloud rail rail-alignment rail-track track-fixity

Last synced: 02 Sep 2025

https://github.com/laguer/jupyt-nb

Mathematical and Physical Constants ratios in Cosmology and micro physics

analysis constants cosmology data dimensional julia mathematical micro notebook physical physics python ratios science

Last synced: 13 Apr 2026

https://github.com/kalaspuff/ready

🎟 [not yet built] Take control of the event loop with simplified task management, queueing and data loading.

asyncio data dataloading event futures python python3 resolver tasks

Last synced: 10 May 2026

https://github.com/lefuturiste/npm-api

Search for or get an npm package

api data npm php

Last synced: 14 May 2026

https://github.com/grimen/js-humanizer

A human/developer friendly value humanizer - for JavaScript/Node.

data debug debugging format formatting humanize humanizer log logging print printing value

Last synced: 13 Jun 2026

https://github.com/moeabbas6/bq_data_loader

A Python script for executing and logging batch SQL commands in Google BigQuery. Includes tracking of execution times, unique job and statement IDs, and automated logging to a specified BigQuery table.

bigquery data python

Last synced: 24 Mar 2025

https://github.com/kfrural/customer-churn-prediction

Customer churn prediction using machine learning. The project follows CRISP-DM and KDD methodologies, including data preprocessing, feature engineering, modeling, and evaluation. It also features an interactive dashboard for visualizing results.

crisp-dm data jupyter kdd python

Last synced: 29 Apr 2026

https://github.com/purarue/HPI-personal

Personal HPI modules/scripts

data history lifelogging

Last synced: 30 Mar 2025

https://github.com/tomcardoso/journalism-data-intersection

A talk on working at the intersection of journalism and data science

data data-journalism journalism

Last synced: 15 May 2025

https://github.com/wlgs/got-dialogues-data-stats

Game of Thrones dialogues data statistics processed with R and SQLite. Project for Probability and Statistics course 21/22 at AGH UST. The project was about manipulating data and getting many pieces of information from it in addition to visualizing these results.

data game-of-thrones got r statistics stats

Last synced: 22 May 2026

https://github.com/nafisalawalidris/nafisalawalidris

Configuration files for my GitHub profile. Welcome to my GitHub profile! I'm Nafisa Lawal Idris, a passionate Data Scientist with a strong interest in blockchain technology. Explore my GitHub portfolio to delve into the exciting world where data science and Bitcoin converge.

artifical-intelligence bitcoin config data data-science developer github-config github-pages machine-learning

Last synced: 16 May 2026

https://github.com/sirmaxx/log_manager

log manager services for microservices

data fastapi logging microservice mongodb

Last synced: 09 Apr 2026

https://github.com/luminati-io/jupyter-notebooks-web-scraping

Perform web scraping interactively using Jupyter Notebooks, integrating coding, data analysis, and visualization into one seamless workflow.

beautifulsoup4 data jupyter jupyter-notebook pandas python requests seaborn virtual-environment web-scraper web-scraping

Last synced: 13 Apr 2026

https://github.com/lucavallin/uppa

Unsplash Photo Performance Analyzer.

analyzer data photo python unsplash

Last synced: 05 Oct 2025

https://github.com/officialxviid/gloogia

👓 Make your big ideas come true by building real projects using real data 🌎

api build data gloogia projects xviid

Last synced: 05 Jan 2026

https://github.com/rajkumarbestha/nsedataextractor

NSEDataExtractor

data python python3

Last synced: 26 Mar 2025

https://github.com/fehmitahsindemirkan/web-scrapper

Professional and high performance web scraping project.

data ecommerce emailsender fileexplorer logging python web webscraping

Last synced: 10 Jan 2026

https://github.com/robthree/cfnreader

Provides a simple way to read FNIRSI's CFN files (*.cfn) produced by the FNIRSI UsbMeter tool

cfn csv data fnirsi usb usb-tester

Last synced: 01 Mar 2025

https://github.com/guardias-eu/reasin

Interface to the European Alien Species Information Network API

api biodiversity biodiversity-data biodiversity-informatics data invasive-species oscibio r r-package

Last synced: 04 Oct 2025

https://github.com/anthonybench/convert

A quick way to convert data, document, and image formats.

cli converter data documents images

Last synced: 14 Jan 2026

https://github.com/bablukumarjha/startup-funding-revenue-analysis-by-sql-and-pandas

SQL project analyzing startup funding, revenue, and founder data to extract business insights using Python and MySQL.

data data-analysis data-platform data-science dataanalysisusingpython dataanalytics pandas-dataframe pandas-library python sql sql-server sqlalchemy sqldatabase

Last synced: 18 May 2026

https://github.com/sulujulianto/population-data-retrieval-and-analysis

I created a simple program that can be used to search for global population data or population data from various countries using Python.

data population world

Last synced: 09 Mar 2026

https://github.com/anisimov-anthony/data_forest

Implementation of various types of trees

algorithms-and-data-structures data lib rust tree

Last synced: 28 Apr 2025

https://github.com/yorkearwaker/data

Data things; representation, transformation, pipelines, governance,

actuality data epistemology information knowledge ontology

Last synced: 07 Apr 2025

https://github.com/reubano/devcraft-workshop

Materials for the DevCraft workshop on stream processing

data functional-programming meza python riko stream-processing tutorial

Last synced: 04 May 2026

https://github.com/paezha/bsantiago

A data package with the results of a travel and well-being survey conducted in Santiago in 2016

data equity package r santiago survey travel well-being

Last synced: 18 Mar 2025

https://github.com/sandysanthosh/aspose-doc-to-pdf

Document & Browser object model

aspose build data doc java pdf

Last synced: 04 Jun 2026

https://github.com/lamouchi-bayrem/data-matrix-scanner

A dual-interface tool that leverages AI to detect and decode QR codes and Data Matrix codes from images using computer vision

data datamatrix-scanner decoder flask qrcode scanner tkinter-gui webapp

Last synced: 30 Apr 2026

https://github.com/plnech/never2late

Never 2 Late - a reinterpretation of Everest Pipkin's 'i've never picked a protected flower'

dada dada-science data generative-art glitch-art installation nlp poetry spacy vector-similarity wallpaper

Last synced: 10 Jun 2025

https://github.com/bertrand31/one-billion-rows-challenge

🌪️ Pushing Scala to its limits to aggregate a billion rows' worth of data in 2.42 seconds

competitive-programming competitive-programming-contests data data-engineering data-processing performance scala

Last synced: 05 Sep 2025

https://github.com/juanandres-montero/dataanalysis

Dedicated to data analysis.

costa-rica data

Last synced: 10 Aug 2025

https://github.com/stoyank7/football-prediction

This is my Semester 7 Project for my "AI for Society" minor at Fontys University of Applied Sciences.

ai betting data football machine-learning university-project

Last synced: 25 Mar 2025

https://github.com/e-panourgia/big-data

Big Data Management Systems course assignments

analytics azure bigdata data hadoop json latex mrjob neo4j python redis stream

Last synced: 11 Apr 2026

https://github.com/nia-cloud-official/influx-agents

Influx-CRD is a web application designed to facilitate data collection, recovery, and distribution for agents uploading data to a centralized database. It provides an intuitive interface for managing data collection from various sources and recovering lost or corrupted data.

broker collection data data- influx influx-agent

Last synced: 30 Jul 2025

https://github.com/mtalhaofc/nutrition_system

A simple AI-powered web app built using Streamlit that provides personalized weekly meal plans and nutrition recommendations based on user demographics, health goals, and nutritional preferences.

cosine-similarity data data-science food machine-learning model nutrition pandas python streamlit

Last synced: 29 Apr 2026

https://github.com/arkanovicz/skorm

Simple Kotlin Object Relational Mapping

data database model orm sql

Last synced: 19 Apr 2026

https://github.com/rijkvanzanten/ds-fa-1

The first final assignment for the data structures class

assignment data final map now parsons structures thenewschool

Last synced: 04 Oct 2025

https://github.com/quonverbat/ordner

A simple, customizable and cross-platform data tracker.

data datatracker javafx management

Last synced: 07 Jul 2025

https://github.com/jprando/mattkillua

A study of .NET Core

data dbcontext domain efcore netcore

Last synced: 23 Mar 2025

https://github.com/roshaka/samplr

Samplr is a Python decorator for selecting a subset of items from a list, with options for customisation and informative console printouts.

data data-analysis data-engineering decorators list python sampling

Last synced: 14 Jan 2026
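
A decorator that samples a subset from a list, as the entry above describes, might be sketched as follows. This is not Samplr's real interface: the decorator name `sample`, its `k` and `seed` parameters, and the choice to sample the decorated function's return value are assumptions for illustration.

```python
import functools
import random


def sample(k, seed=None):
    """Hypothetical decorator: return at most k randomly chosen items from the result."""

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            items = list(func(*args, **kwargs))
            if k >= len(items):
                return items  # nothing to trim
            rng = random.Random(seed)  # a fixed seed gives a repeatable sample
            return rng.sample(items, k)

        return wrapper

    return decorator


@sample(3, seed=42)
def load_rows():
    # Stand-in for an expensive data load.
    return list(range(10))


print(load_rows())
```

Sampling at the decorator level leaves the loading function unchanged, so the full dataset is restored by removing one line.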

https://github.com/austinhartzheim/career-fair-backend

Backend for ECS Career Fair app

data django python

Last synced: 13 Apr 2026

https://github.com/blueheron786/quranic-universal-library-mushaf-layouts

The Quranic Universal Library (QUL)'s Qur'an mushaf 15-line layouts (madini, uthmani)

data database layout mushaf quran sqlite uthmani uthmani-quran

Last synced: 13 Apr 2026

https://github.com/stdlib-js/array-base-last-index-of-same-value

Return the index of the last element which equals a provided search element according to the same value algorithm.

array data find generic index javascript locate node node-js nodejs same scan search stdlib structure types

Last synced: 13 Apr 2026
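
The "same value" comparison mentioned in the entry above differs from ordinary equality in two cases: NaN is treated as equal to NaN, and +0 is distinguished from -0. The package is JavaScript; the Python sketch below only illustrates the idea for floats, with hypothetical function names, and is not the package's implementation.

```python
import math


def same_value(a, b):
    """Approximate the SameValue comparison for Python floats."""
    if isinstance(a, float) and isinstance(b, float):
        if math.isnan(a) and math.isnan(b):
            return True  # NaN matches NaN
        if a == 0.0 and b == 0.0:
            # Distinguish +0.0 from -0.0 by their sign bit.
            return math.copysign(1.0, a) == math.copysign(1.0, b)
    return a == b


def last_index_of_same_value(items, target):
    """Scan from the end; return the last matching index, or -1 if none."""
    for i in range(len(items) - 1, -1, -1):
        if same_value(items[i], target):
            return i
    return -1


nan = float("nan")
print(last_index_of_same_value([1.0, nan, 2.0, nan], nan))
print(last_index_of_same_value([0.0, -0.0, 0.0], -0.0))
```

An equality-based search such as `list.index` cannot reliably find NaN and cannot tell the two zeros apart, which is the gap this comparison fills.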

https://github.com/soenneker/soenneker.attributes.mapto

A C# attribute for generic data mapping translation

attributes columns csharp data datatables dotnet mapping mapto maptoattribute object

Last synced: 02 Mar 2026

https://github.com/nolanbconaway/rollercoaster-tycoon-data

Every roller coaster I have built in RCT2 for iPad

data roller-coaster-tycoon

Last synced: 24 Mar 2025

https://github.com/abirsaha111/ipl-2022-analysis

The IPL 2022 Analysis project is a data-driven exploration of the Indian Premier League (IPL) 2022 cricket tournament. The analysis focuses on utilizing Python programming and various libraries to analyze and visualize the performance of teams, players, and key metrics in the IPL 2022 season.

data dataana dataanalytics datavi matplotlib python

Last synced: 07 Jun 2026

https://github.com/j2kun/terrorism-usa-post-9-11

A copy of the terror data published by NewAmerica

data politics terrorism transparency

Last synced: 02 Mar 2026

https://github.com/deliprofesor/health-score-prediction-model-the-impact-of-lifestyle-and-demographic-factors

A machine learning project predicting health scores based on lifestyle and demographic factors like age, BMI, diet, and exercise. Techniques include Random Forest, Polynomial Regression, and Linear Regression, with a focus on model performance and actionable health insights.

cross-validation data data-science data-visualization feature-engineering linear-regression machine-learning polynomial-regression random-forest

Last synced: 10 Apr 2025

https://github.com/krakozaure/pyzzy

Set of packages to simplify development in Python

configuration data formats json library logging logs python3 toml utils yaml

Last synced: 14 Jan 2026

https://github.com/seqeralabs/ffq-api

A minimal wrapper to make ffq searches available via a REST API.

api data fastq fetch-fastq ffq genomics

Last synced: 15 Aug 2025

https://github.com/supremkc05/global-job-market-analytics

Scrape jobs from websites like Indeed/LinkedIn, extract skills using NLP, then visualize hiring trends.

beautifulsoup data machine-learning nlp pandas scrapping

Last synced: 14 Aug 2025