An open API service indexing awesome lists of open source software.

data

Individual facts, statistics, or items of information, often numeric. In a technical sense, data are a set of values of qualitative or quantitative variables about one or more persons or objects. (https://en.wikipedia.org/w/index.php?title=Data&oldid=1093674723, released under CC BY-SA 3.0)

https://github.com/yourdataarchitect/french-realestate-data-pipeline

This repository contains a fully automated data pipeline built with Apache Airflow to extract, clean, analyze, and report real estate listings from Seloger. It pushes data to MongoDB, Elasticsearch, and Google Sheets, with real-time Slack alerts for monitoring.

airlfow data datanalysis datapipeline market-intelligence real-estate

Last synced: 31 Dec 2025

https://github.com/coderooz/hr-dashboard

The goal of this project is to create a Power BI dashboard that showcases attrition data within the company.

data data-analytics power-bi

Last synced: 07 Jan 2026

https://github.com/huspacy/huspacy-resources

Resources for building and evaluating huspacy

data huspacy

Last synced: 21 Mar 2025

https://github.com/pyrustic/jayson

Intuitive interaction with JSON files [DEPRECATED, check the project Shared]

data json pyrustic python

Last synced: 17 May 2026

https://github.com/h4fide/politicalcompassbot

This Python project allows you to take a quiz and find out where you fit on the political compass. Give it a try and see where you stand!

bot data greedy-algorithms politics python python3 sql telegram

Last synced: 19 Aug 2025

https://github.com/boettiger-lab/taxadb-cache

Cache for taxadb files

data

Last synced: 19 May 2026

https://github.com/eslamdyab21/apara-data-gui

Custom application for Apara's data wrangling scripts. Technologies used are Qt Designer and PyQt5 for the GUI, and pandas and NumPy for the data work.

csv data data-analysis data-wrangling gui pandas pyqt5-desktop-application qt5-gui

Last synced: 17 May 2026

https://github.com/encoreshao/data-science

Data analysis examples using Jupyter Notebook and Python.

data dataanalysis encore jupyter-notebook

Last synced: 29 Mar 2025

https://github.com/questionlp/wwdtm_uniquedates

Script that lists out the unique months and days of months that Wait Wait... Don't Tell Me! shows have aired

data python python3 script wwdtm

Last synced: 17 May 2026

https://github.com/kameronbrooks/datalys2-reporting

Datalys2 Reports allows you to create rich, interactive reports by simply defining a JSON configuration embedded in your HTML. It handles the layout, data visualization, and interactivity, so you don't need to write custom React code for every report.

data data-visualization html react

Last synced: 08 Apr 2026

https://github.com/jefking/copyblobs

Copies all files in a container to another container, in another storage account.

aci arm azcopy azure blob container copy data file files from instant move one-time simple storage sync template to transfer

Last synced: 27 Mar 2025

https://github.com/akesling/csvb

Have CSV? Use CSVB!

analytics csv data database

Last synced: 02 Feb 2026

https://github.com/henryssondaniel/teacup-java-report-file

Report Teacup data to a file

data file logs reports teacup

Last synced: 22 Jul 2025

https://github.com/eloyhere/semantic-java

Semantic-Java is a modern Java stream processing framework with zero dependencies, distributed as a Maven dependency. It elegantly blends the fluency of Java Streams, the laziness of JavaScript generators, and intelligent index-based control inspired by database indexing, making it a fit for time-series, event streams, and high-performance data pipelines.

data functional functional-programming java pipeline stream

Last synced: 07 Apr 2026

https://github.com/ranjeetj06/insighthub

InsightHub is a data analytics project that helps automate the entire process of preparing, analyzing, and reporting on CSV data.

analysis begineer data springboot

Last synced: 17 May 2026

https://github.com/ellisvalentiner/legislation-embeddings

Embeddings for U.S. Congress legislation

data embeddings machine-learning nlp python

Last synced: 12 Aug 2025

https://github.com/sharmadhiraj/plot-pi

Graphical representation of pi

data data-visualization html javascript js mathematics plot

Last synced: 28 Mar 2025

https://github.com/dina-hosny/calculate-installments-dates-and-amounts-plsql

PL/SQL project to calculate installment dates and amounts for contracts

data plsql sql toad trigger

Last synced: 06 Mar 2026

https://github.com/pulipulichen/pts-local-news-dataset

A dataset containing local news from Public Television Service.

data dataset

Last synced: 27 Mar 2026

https://github.com/eyluldursun/data-science-project

This project involves a data science analysis conducted on the Obesity Data Set. The study explores factors influencing obesity, includes data visualization, and develops predictive models. The goal of the project is to gain insights to help prevent obesity.

data data-science obesity r rmarkdown

Last synced: 26 Jun 2025

https://github.com/ciscorn/japanmesh-rs

A Rust library for handling Japanese Grid Square Code (JIS X 0410:2002 地域メッシュコード)

census data geospatial japan rust

Last synced: 11 Jan 2026

https://github.com/ngupta23/data_prep_helper

A helper package for preparing and combining data from a variety of sources

data data-science dataprep datapreparation dataprocessing helpers python

Last synced: 03 Apr 2025

https://github.com/aguven6/inmemory-data-processor

Converts tabular data to indexed columnar data. The aim is to process large datasets faster, especially in aggregation operations.

columnar-storage data data-structures parallel-computing parallel-programming processing

Last synced: 17 May 2026

https://github.com/shivamsharma32/ipl-2022-analysis

The IPL 2022 Analysis project is a data-driven exploration of the Indian Premier League (IPL) 2022 cricket tournament. The analysis focuses on utilizing Python programming and various libraries to analyze and visualize the performance of teams, players, and key metrics in the IPL 2022 season.

data dataana dataanalytics datavi matplotlib python

Last synced: 17 May 2026

https://github.com/injamul3798/cpp_stl-discussion

As we know, the STL is one of the most used tools in competitive programming.

data list map set structure vector

Last synced: 02 Apr 2025

https://github.com/weecology/updating-data

Hugo website for instructions on how to make a regularly updating data pipeline

continuous-analysis continuous-integration data gh-actions living-data netlify travis-ci

Last synced: 17 Feb 2026

https://github.com/akashlogics/street-data-tracking

Detect, track, and count the number of people walking across the path(s) using YOLO. This Python project tracks people moving across predefined street zones.

analysis data excel newdataset object-detection opencv python python3 yolo

Last synced: 19 May 2026

https://github.com/buildinamsterdam/contentful-graphql

Contentful GraphQL connection

contentful data graphql

Last synced: 05 Jan 2026

https://github.com/talitalobo/statistics-with-python

Repo about statistical concepts and (not always) their Python implementation.

data data-science machine-learning statistics

Last synced: 11 Jan 2026

https://github.com/amethyst-php/post

A comment, a note, a post, a pseudo-chat. It can be really anything.

amethyst amethyst-package api data laravel post

Last synced: 17 May 2026

https://github.com/toofancodes/h1b-dashboard-insights

An interactive Tableau dashboard that visualizes H1B visa data from the USCIS Employer Data Hub, offering insights into application trends, top employers, and geographic distributions. Showcases advanced data visualization, analytics, and business intelligence skills.

analysis analytics business-intelligence dashboard data data-visualization h1b h1b-visa interactive-data tableau

Last synced: 20 Jan 2026

https://github.com/ezeparziale/analisis-uso-bicicletas-caba

:biking_man: Analysis of how the pandemic affected bicycle use in CABA.

data data-science data-visualization

Last synced: 14 Mar 2025

https://github.com/ezeparziale/analisis-data-delitos

:gun: Analysis of crimes in CABA

data data-science

Last synced: 14 Mar 2025

https://github.com/emna-chebbi/student-performance

Predictive model for student exam scores based on student performance factors

ai computer-vision data kaggle machine-learning ml mse regression regression-models

Last synced: 15 May 2026

https://github.com/ericmaddox/nyc-crime-analytics

Analyzes and visualizes crime data from the NYC Police Department using interactive maps and heatmaps, leveraging the NYC Open Data API.

crime-analysis crimedata data datavisualization esri folium heatmap nycopendata python python3 rtcc

Last synced: 24 Jun 2025

https://github.com/ohspc89/better_call_jin

A repository containing mentoring materials for a Ph.D. student in Neuroscience

data matlab spss-statistics visualization visualization-tools wrangling-data

Last synced: 03 Jul 2026

https://github.com/moscatellimarco/webscrap-imdb

🎬 Python scraper for IMDB: Extract movie/TV details for 📊 analysis & 🗃️ storage. Easy setup, 🔧 customizable, with 🖥️ CLI.

css data datascience html movies python scrapy scrapy-crawler scrapy-spider web web-scraping webdata webscraping

Last synced: 15 May 2026

https://github.com/adadalshabab/machine-predictive-maintenance-classification

This repository hosts a machine predictive maintenance classification project, aimed at predicting the maintenance needs of industrial machinery before they fail. By leveraging machine learning algorithms, this project seeks to enhance operational efficiency and reduce downtime by identifying potential maintenance requirements proactively.

data data-science datanalysis datanalytics machine-learning machine-learning-algorithms matplotlib-pyplot pandas

Last synced: 17 May 2026

https://github.com/antoninpvr/battery-logger

Simple scripts to record data from my laptop battery

bash-script battery data

Last synced: 17 May 2026

https://github.com/basinghse/covid19simulator

Real Time Assessment and Simulation of COVID-19 - showing current numbers of cases, deaths and treated patients globally.

coronavirus covid-19 data real-time simulation visualisation visualisation-data-ingester

Last synced: 05 Apr 2025

https://github.com/luminati-io/linkedin-dataset-samples

Sample dataset of 1001 LinkedIn companies, extracted via Bright Data API, featuring essential data points for competitive analysis and market insights.

data database dataset linkedin linkedin-api linkedin-data linkedin-dataset linkedin-scraper sample web-scraping

Last synced: 17 Mar 2025

https://github.com/hidayathamir/telegram-group-data

Data for 1,865,827 messages in a Telegram group: text, identity, and datetime.

bahasa-indonesia data python3 scrape telegram telethon

Last synced: 17 May 2026

https://github.com/saksham-jain177/data-analysis

A collection of data analysis and machine learning projects across various datasets. Explore predictive modeling, data visualization, and insights from real-world data. Projects include sales predictions, disease detection, customer segmentation, and more.

api data data-analysis data-cleaning data-science data-visualization datamodeling dataset datasets exploratory-data-analysis python python3 web-scraping youtube-api

Last synced: 01 May 2026

https://github.com/md-emranhossen/leetcode-practice

This repository stores my solutions to LeetCode problems, organized by problem number and title.

cpp data datastructures-algorithms leetcode-solutions

Last synced: 26 Jun 2025

https://github.com/nonsignificantp/enfermedades-inmunoprevenibles

Analysis of the effect of vaccines and the incidence of vaccine-preventable disease cases in the City of Buenos Aires between 1995 and 2016

a analysis argentina buenosaires data hepatitis science vaccination

Last synced: 18 Jun 2026

https://github.com/meta-llama/synthetic-data-kit

Tool for generating high-quality synthetic datasets

data generation llm python synthetic

Last synced: 08 May 2025

https://github.com/ericgio/history-of-jazz

Data and visualizations based on Ted Gioia's "The History of Jazz"

data jazz

Last synced: 28 Mar 2025

https://github.com/majorcluster/clj-data-adapter

A Clojure library designed to convert data

clojure data lib library

Last synced: 12 Jul 2025

https://github.com/robsteranium/user2022-ldf-talk

Slides from my useR! 2022 talk about the Linked-Data Frames package

data data-frame linked-data r rdf

Last synced: 19 Apr 2025

https://github.com/dsietz/daas-workshop

Workshop for building a Data as a Service platform using the DaaS SDK.

archconf daas daas-pattern data dataprivacy nfjs rust rust-lang

Last synced: 20 May 2026

https://github.com/sumansuhag/wasserstoff-aiinterntask

Welcome to the AI Pipeline for Image Segmentation and Object Analysis project – a state-of-the-art solution designed to process, segment, identify, and analyze objects within images. This AI-powered pipeline is engineered to deliver precise insights by extracting, mapping, and summarizing data from each segmented object.

artificial-intelligence cdn data data-science modeling pipline

Last synced: 28 Mar 2025

https://github.com/bcodmo/workshop_bios_oceanographic_data

Repository holding a lesson on Data Management Basics. See the webpage for a rendered view: https://bcodmo.github.io/workshop_bios_oceanographic_data/

bco-dmo data datamanagement fair workshop

Last synced: 08 Apr 2026

https://github.com/jigyasag18/orders-sales-analysis-report-using-power-bi

This repository analyzes and visualizes office supply sales data to improve profitability. It examines sales performance by various factors, using charts to provide insights and actionable recommendations for sales optimization, market research, and product mix.

data dataanalysis dataanalytics dataset powerbi powerbi-dashboards powerbi-report powerbi-reports powerbi-visuals powerbidashboard

Last synced: 18 Feb 2026

https://github.com/shysolocup/fndt

JavaScript package allowing you to see function data like body and arguments from outside of the function

aepl data fndt functions javascript javascript-tools js js-function js-functions lightweight nodejs nodejs-modules package stews

Last synced: 30 Apr 2026

https://github.com/sumansuhag/prediction_model

This repository features a collection of Jupyter notebooks designed to showcase the practical applications of machine learning, data preprocessing, feature engineering, and recommendation systems. These notebooks enable users to explore, analyze, and predict business events.

algotithms artificial-intelligence data logistic-regression machine-learning-algorithms science sckiit-learn

Last synced: 28 Mar 2025

https://github.com/wolfchamane/amjs-data-types

Data types for your OOP JavaScript project

cjs data javascript modules nodejs oop types

Last synced: 20 May 2026

https://github.com/circlexo/circlexo

Open-source project to seamlessly integrate and manage your business workflow, connecting Jira, GitHub, Discord, Stripe, RevenueCat, and OpenAI all in one intuitive platform.

bussiness-intelligence data discord-bot forge github google jira kpis ploi revenuecat stripe vapor

Last synced: 20 May 2026

https://github.com/furkankarakuz/turkey_earthquake

This project focuses on analyzing and visualizing earthquake data specific to Turkey. It aims to provide insightful visualizations on topics such as earthquake frequency, location, and magnitude using data obtained from Boğaziçi University Kandilli Observatory and Earthquake Research Institute.

api data data-visualization earthquake python python3 request streamlit turkey turkey-earthquake

Last synced: 20 May 2026

https://github.com/reubano/pyconza-tutorial

Jupyter notebooks and data for "Data Mining and Processing for fun and profit" PyConZA16 tutorial

data functional-programming jupyter-notebook meza pycon python tutorial

Last synced: 17 May 2026

https://github.com/clagiordano/marketplaces-data-export

Library that shares a common interface and provides adapters for online marketplace services

adapter amazon api clagiordano data ebay ebay-api export marketplaces mws mws-api rest soap

Last synced: 22 Mar 2025

https://github.com/chompfoods/sdk-scala

Scala SDK for the Chomp Food & Recipe Database API. Use our API to get high-quality data on recipes and 875,000+ branded/grocery foods plus raw ingredients.

api branded chomp data database food grocery ingredients nutrition raw recipe-api recipes scala sdk

Last synced: 17 May 2026

https://github.com/rsc-labs/see-open-data

Shows www.dane.gov.pl in a user-friendly format. Generates Flourish data or other data visualizations.

data data-visualization flourish government poland

Last synced: 04 Apr 2025

https://github.com/hackolade/yugabytedb-ysql

Hackolade (https://hackolade.com) plugin for the cloud-native Yugabyte database with the YSQL API

data data-modeling entity-relationship-diagram schema-design ysql yugabyte yugabytedb

Last synced: 30 Apr 2025

https://github.com/hivesolutions/repos

Modular repository management system

data python repos storage system

Last synced: 14 May 2026

https://github.com/kuanjiahong/covid19-analysis

A simple project to familiarize myself with data analysis

data data-science data-visualization pandas python

Last synced: 02 Apr 2025

https://github.com/jigyasag18/fake-news-prediction-project

The Fake News Prediction App repository offers a machine learning project that classifies news articles as fake or real. It uses a dataset of 20,000 articles and employs methods such as TF-IDF vectorization and the Porter stemming algorithm, achieving around 97% classification accuracy with a logistic regression model.

data datapreprocessing logistic-regression machine-learning machine-learning-algorithms numpy pandas prediction stemming vectorization

Last synced: 08 Jun 2026

https://github.com/namescode/hub_harvester

A Python script to gather data on a user's or organisation's Git repos

data github nix nix-flake python python3 sqlite

Last synced: 08 Apr 2026

https://github.com/UznetDev/Smoking-Prediction

This project focuses on analyzing the "Smoking" dataset and building a predictive model for smoking status based on various health metrics. The goal is to identify factors influencing smoking behavior and develop a reliable model for prediction.

ai classification data data-science kaggle-competition machine-learning ml roc-auc sklearn smoking

Last synced: 28 Mar 2025

https://github.com/pyfig/s21_data-science-bootcamp

School21 Bootcamp Data Science

data data-science numpy pandas python school21

Last synced: 26 Jun 2025

https://github.com/stdlib-js/array-base-fill-by

Fill all elements within a portion of an array according to a callback function.

accessor array data fill generic javascript map node node-js nodejs set stdlib structure transform typed types

Last synced: 14 May 2026

https://github.com/sajjadanwar0/booking.com-scraping

Scraping booking.com using Selenium and Beautiful Soup

crawler data python scraping selenium

Last synced: 18 Oct 2025

https://github.com/sharoonjoseph321/social_media_eda

Data analysis of social media apps using pandas, Python, and Matplotlib.

data data-analysis data-science data-visualization matplotlib programming-language project python pythonprojects

Last synced: 03 Mar 2025

https://github.com/danielrosehill/ghg-ebitda-correlations

Streamlit data visualisation examining correlation between emissions & profitability

data sustainability sustainability-data

Last synced: 14 Mar 2025

https://github.com/zshn1248/pyfilecrypto

PyFileCrypto is a Python module for easy encryption and decryption of files using the cryptography library. It provides a simple interface to generate encryption keys, encrypt files, and decrypt files securely.

data decryption encryption file security-tools

Last synced: 07 Apr 2026

https://github.com/deliprofesor/cardiac-data-analysis-exploring-cholesterol-and-heart-rate

This project analyzes a heart disease dataset to explore the relationship between cholesterol, heart rate, and chest pain type. It includes normality tests, outlier detection, correlation analysis, MANOVA, post-hoc tests, and VIF analysis, with visualizations using histograms, heatmaps, and boxplots.

correlation-analysis data data-cleaning data-visualization machine-learning manova post-hoc-analysis python tukey-hsd vif

Last synced: 17 May 2026

https://github.com/joseluisq/input-verifier

Some useful functions to check common data input.

data input utils validation

Last synced: 19 Jul 2025

https://github.com/gui-sitton/carsells

In this project I am an analyst on the Crankshaft List. Hundreds of free vehicle advertisements are published on the site every day. I need to study the data collected over the last few years and determine which factors influence the price of a vehicle.

data data-analysis data-analysis-python data-science data-visualization python

Last synced: 20 May 2026

https://github.com/avijeetpandey/quizzez

Implementation of the Quizzez application using Kotlin

android data kotlin viewmodel

Last synced: 20 May 2026