An open API service indexing awesome lists of open source software.

data

Individual facts, statistics, or items of information, often numeric. In a technical sense, data are a set of values of qualitative or quantitative variables about one or more persons or objects. (https://en.wikipedia.org/w/index.php?title=Data&oldid=1093674723, released under CC BY-SA 3.0)

https://github.com/hyfi06/unam-careers

A utility package for retrieving career information from UNAM.

career data npm-package unam

Last synced: 16 May 2026

https://github.com/badranalyst/data-cleaning-and-exploratory-data-analysis-project

This project uses SQL to clean and analyze a layoffs dataset. Data cleaning tasks include removing duplicates, standardizing values, and handling missing data. Exploratory analysis is performed to identify trends in layoffs across companies, industries, and time periods.

cleaning-data data database dataset mysql mysql-database sql

Last synced: 07 Apr 2025

https://github.com/svenruppert/_data_for_demos

Data used for demos

data datasets images ruppert sven

Last synced: 25 Jan 2026

https://github.com/webianks/anotech-android

Android application that deals with various anomalous behaviours that occur in server data.

anomaly-detection data server

Last synced: 13 Apr 2025

https://github.com/srindot/average_flightdata_collection_fwuav

This repository is designed for collecting average data for a flapping wing UAV. The script acg_coeff_data_collection.py runs the necessary data collection, and the resulting data is saved into a CSV file called AverageFlightData.csv.

data flaping-uav

Last synced: 18 Sep 2025

https://github.com/paulveillard/cybersecurity-analytics

An ongoing collection of awesome software, libraries, learning tutorials, documents and books, technical resources and cool stuff about Analytics Engineering in Cybersecurity.

analytics bigdata bigquery cybernetics cybersecurity data data-engineering data-science encryption encryption-decryption seo seo-friendly seo-optimization

Last synced: 28 Mar 2025

https://github.com/jonathanstowe/databulous

Abstraction for tabular data

data perl6 table tabular

Last synced: 02 Apr 2025

https://github.com/miroslavvidovic/distribution-graphs

Creating ASCII graphical histograms in the terminal with https://github.com/philovivero/distribution

ascii data graph histogram python terminal

Last synced: 24 Apr 2026

https://github.com/lorinczakos/sql-projects

A collection of SQL scripts that I wrote, and that were approved, during the GoIT Romania Data Analyst course.

bigquery cte data data-analysis dbeaver marketing-analytics postgresql project-repository sql vscode

Last synced: 16 May 2026

https://github.com/cmutel/jester

Import data from the olca-schema JSON-LD format into the HESTIA JSON-LD schema

agriculture data json-ld life-cycle-assessment ontology

Last synced: 26 Jul 2025

https://github.com/mattpap/pycon-2017-bokeh

Bokeh tutorial at PyCon.PL 2017

bokeh data tutorial visualization

Last synced: 17 Mar 2025

https://github.com/ranjeetj06/insighthub

InsightHub is a data analytics project that helps automate the entire process of preparing, analyzing, and reporting on CSV data.

analysis begineer data springboot

Last synced: 17 May 2026

https://github.com/naufalbasara/superstores-pipeline

Data Pipeline on Dummy E-commerce with Apache Airflow

airflow data data-engineering data-pipeline data-warehouse postgresql

Last synced: 16 May 2026

https://github.com/dms-codes/www.usu.ac.ididdirektori

Faculty and Docent Data Retrieval Script. The faculty_and_docent_data_retrieval.py script is a Python script for retrieving faculty and docent data from a university website using Selenium. It includes functions to extract faculty names and docent profiles, and uses multithreading to fetch data for multiple faculty-docent pairs.

data python scrape

Last synced: 26 May 2026

https://github.com/mysociety/sync-ep-to-jkan

Syncs EveryPolitician data to mySociety's data portal.

data everypolitician jkan politicians

Last synced: 27 Jul 2025

https://github.com/gunn/covid-19-scripts

Scripts for processing COVID-19 data - e.g. converting from absolute to per capita numbers, adding fine-grained data from more countries

covid-19 data geography typescript

Last synced: 17 May 2026
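
The absolute-to-per-capita conversion mentioned in the gunn/covid-19-scripts description can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the repository's own code (which is tagged typescript): the function name `per_capita`, the per-100,000 scaling, and the sample figures are all hypothetical.

```python
def per_capita(absolute_counts, populations, per=100_000):
    """Scale absolute counts by population (hypothetical helper).

    Regions with a missing or zero population are skipped rather than guessed.
    """
    rates = {}
    for region, count in absolute_counts.items():
        population = populations.get(region)
        if not population:
            continue  # no usable population figure for this region
        rates[region] = count * per / population
    return rates


# Made-up figures, for illustration only.
cases = {"A": 500, "B": 1200, "C": 40}
people = {"A": 1_000_000, "B": 4_000_000}
rates = per_capita(cases, people)
```

Multiplying before dividing keeps the arithmetic exact for whole-number inputs like these; region "C" is dropped because no population is known for it.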

https://github.com/uzinfocom-org/archive

📦 | Archived projects that aren't used anymore

archive archive-data data notused

Last synced: 01 Sep 2025

https://github.com/srgchrksv/articles

My articles about coding, data, etc.

article coding data learning medium python

Last synced: 18 Jun 2026

https://github.com/par7133/xsltmaster

Dynamically load data from multiple XML/XSLT in webpages

data dynamic load webpages xml xslt

Last synced: 02 Mar 2025

https://github.com/kenjyco/mongo-helper

Helper funcs and tools for working with MongoDB

aggregation-pipeline data database kenjyco mongo mongodb python

Last synced: 28 Jan 2026

https://github.com/ellisvalentiner/legislation-embeddings

Embeddings for U.S. Congress legislation

data embeddings machine-learning nlp python

Last synced: 12 Aug 2025

https://github.com/josecsotomorales/dataform

Repository for testing dataform

cli data data-engineering data-transformation

Last synced: 27 Mar 2025

https://github.com/cleanzr/cd

CD dataset for Entity Resolution

data linkage

Last synced: 10 Mar 2026

https://github.com/piyushkumar2025/india-general-elections-2024_data-analyst

Analyzed election data for 540+ constituencies and 100+ parties using SQL. Calculated state-wise seat distributions, classified 30+ parties into alliances, identified top 10 candidates by EVM votes, calculated victory margins, and analyzed voting patterns for 300+ candidates to uncover key insights.

analytics data database mysql sql statistics

Last synced: 22 May 2026

https://github.com/aminnairi/node-decode

Check that your data meet your expectations

check data decode expectations schema

Last synced: 22 Apr 2026

https://github.com/erictleung/tidytuesdays

📈 My attempts at #tidytuesday

data data-science data-visualization r rstats tables tidytuesday tidyverse

Last synced: 19 Sep 2025

https://github.com/weskal/vexus_pipeline

Automated pipeline for generating, ingesting, and validating realistic data, designed to simulate real-world workflows with scheduling, data quality checks, and version control.

airflow data pipeline python sqlserver workflow

Last synced: 20 Jan 2026

https://github.com/sharmadhiraj/plot-pi

Graphical Representation of PI

data data-visualization html javascript js mathematics plot

Last synced: 28 Mar 2025

https://github.com/rd-uk/rduk-data-sqlite

SQLite Data Provider implementation for rduk-data

data rduk sqlite

Last synced: 16 May 2026

https://github.com/mapaor/horaris-rodalies

Website that uses the Rodalies de Catalunya API to display train timetables in a more fun way

adif api ave barcelona bordils catalunya dades data distancia generalitat girona horaris md r11 regional renfe rodalies sants tren viajes

Last synced: 16 May 2026

https://github.com/lancewalk87/cls-cloud-sync-ruby-on-rails

Software | SQL Database with automated Cloud Sync for mitigating lost data across distributed servers. Managed by Ruby on Rails.

cloud-computing cloud-storage data database ruby ruby-application ruby-on-rails server sql

Last synced: 24 Jul 2025

https://github.com/ember-nexus/reference-dataset

Ember Nexus API backup containing different standardized scenarios

backup data ember-nexus

Last synced: 25 Jan 2026

https://github.com/questionlp/wwdtm_uniquedates

Script that lists out the unique months and days of months that Wait Wait... Don't Tell Me! shows have aired

data python python3 script wwdtm

Last synced: 17 May 2026

https://github.com/stkisengese/numpy-data-fundamentals

A comprehensive collection of NumPy exercises covering array manipulation, slicing, broadcasting, random data generation, and real-world data analysis applications.

data data-analysis numpy pre-processing

Last synced: 16 May 2026

https://github.com/ngupta23/data_prep_helper

A helper package for preparing and combining data from a variety of sources

data data-science dataprep datapreparation dataprocessing helpers python

Last synced: 03 Apr 2025

https://github.com/maulanakavaldo/tri-hita-karana

Project Tri Hita Karana - Future Knowledge G20 Bali. DTS Kominfo x Binar Academy.

bali data data-science g20 science

Last synced: 02 Mar 2025

https://github.com/chubek/pyramid-dashboard

A Dashboard to Show Data Made Using Plotly Dash

dash data docker ml plotly plotly-dash python

Last synced: 19 May 2026

https://github.com/gui-sitton/prepaid

In this project I work as an analyst for the telecommunications company Megaline. The company offers its customers two prepaid plans, Surf and Ultimate. The sales department wants to know which plan brings in more revenue in order to adjust the advertising budget.

data data-analysis data-analysis-python data-science data-visualization python

Last synced: 22 May 2026

https://github.com/team-hydrogen/2025-adc-data

All files relating to the computation of the data provided

data jupyter-notebook nasa-app-development-challenge

Last synced: 11 Apr 2025

https://github.com/tks18/xl-pq-handler

A Pythonic Power Query (.pq) File Manager for Excel & Power BI Automation

analytics automation data excel power-query powerbi python xlwings

Last synced: 20 Jan 2026

https://github.com/pulipulichen/pts-local-news-dataset

A dataset containing local news from Public Television Service.

data dataset

Last synced: 27 Mar 2026

https://github.com/ciscorn/japanmesh-rs

A Rust library for handling Japanese Grid Square Code (JIS X 0410:2002 地域メッシュコード)

census data geospatial japan rust

Last synced: 11 Jan 2026

https://github.com/praveendecode/data-analysis

Implemented data analysis projects with interactive Streamlit UI for user-friendly data exploration and insights presentation

data data-science dataanalysis exploratory-data-analysis insights python streamlit-dashboard tableau tableau-public

Last synced: 04 Apr 2025

https://github.com/denisecase/buzzline-04-case

Adding live visualizations to streaming data applications

animation data kafka matplotlib python streaming

Last synced: 11 Apr 2025

https://github.com/denisecase/cintel-03-data

Getting started with interactive data analytics in Python

analytics data interactive python shiny

Last synced: 11 Apr 2025

https://github.com/0xkibh/datamining-algo

This repository consists of example implementations of data mining algorithms in Python

apriori-algorithm data datamining fp-growth python

Last synced: 19 May 2026

https://github.com/fabsdevx/files-to-database-loader-handout

Data Engineering project for learning purposes. Credits to itversity

csv data data-engineering database json pandas python

Last synced: 09 Apr 2026

https://github.com/srindot/fwuav-average-flight-data-collection

This repository is designed for collecting average data for a flapping wing UAV. The script acg_coeff_data_collection.py runs the necessary data collection, and the resulting data is saved into a CSV file called AverageFlightData.csv.

data flaping-uav

Last synced: 10 Aug 2025

https://github.com/domarps/grad-project-reports

Write-ups of a few key semester-long projects I worked on during my Master's

circuit data deeplearning graph-algorithms matlab question-answering

Last synced: 26 Mar 2025

https://github.com/aguven6/inmemory-data-processor

Converts tabular data to indexed columnar data. The aim is to process very large datasets faster, especially in aggregation operations.

columnar-storage data data-structures parallel-computing parallel-programming processing

Last synced: 17 May 2026
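
The idea in the aguven6/inmemory-data-processor description (row-oriented data converted to columns plus an index, so that aggregations touch less data) can be sketched roughly as below. This is an illustrative Python sketch only, not that project's implementation; the helper names `to_columnar`, `build_index`, and `sum_by` and the sample rows are hypothetical, and every row is assumed to have the same keys.

```python
from collections import defaultdict


def to_columnar(rows):
    """Turn a list of row dicts into a dict of column lists."""
    columns = defaultdict(list)
    for row in rows:
        for name, value in row.items():
            columns[name].append(value)
    return dict(columns)


def build_index(column):
    """Map each distinct value to the row positions where it occurs."""
    index = defaultdict(list)
    for position, value in enumerate(column):
        index[value].append(position)
    return dict(index)


def sum_by(columns, key, target):
    """Sum `target` grouped by `key`, reading only those two columns."""
    index = build_index(columns[key])
    values = columns[target]
    return {group: sum(values[i] for i in positions)
            for group, positions in index.items()}


# Made-up rows, for illustration only.
rows = [
    {"city": "Oslo", "sales": 10},
    {"city": "Rome", "sales": 7},
    {"city": "Oslo", "sales": 5},
]
columns = to_columnar(rows)
totals = sum_by(columns, "city", "sales")
```

In this layout an aggregation reads two contiguous lists instead of walking every field of every row, which is the usual argument for columnar storage in aggregation-heavy workloads.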

https://github.com/ahmad-ali-rafique/heart-disease-detection-model

A comprehensive project for detecting heart disease using machine learning, including data processing, model training, and evaluation metrics with AUC curve analysis.

artificial-intelligence data datascience heart-disease machine-learning modeling prediction-model

Last synced: 11 Aug 2025

https://github.com/jor-/measurements

Python functions to handle, statistically analyze and plot measurement data.

data measurements python

Last synced: 17 Mar 2025

https://github.com/ashita-ai/ashita-ai.github.io

Ashita AI - The island of misfit data tools

ai data

Last synced: 19 Feb 2026

https://github.com/injamul3798/cpp_stl-discussion

As we know, the STL is one of the most used tools in competitive programming.

data list map set structure vector

Last synced: 02 Apr 2025

https://github.com/amethyst-php/setting

Give users the ability to configure their own settings

amethyst amethyst-package api data laravel setting

Last synced: 19 May 2026

https://github.com/andrii04/andreamonforte-bi-assignment

Automated Data Pipeline that ingests daily GA4-formatted CSV files from a private Google Cloud Storage bucket, validates and loads them into BigQuery, and prepares analysis-ready views. The solution is built for deployment as a Cloud Function triggered by Cloud Scheduler and uses Python with the Google Cloud Storage and BigQuery client libraries.

automation bigquery cloud cloudfunctions data data-analysis data-engineering etl etlpipeline gcp google googlecloudplatform pipeline python sql

Last synced: 09 Nov 2025

https://github.com/snimmagadda1/luigi-etl-example

🔍 Example of an ETL pipeline using Spotify's Luigi

data luigi luigi-pipeline python spotify

Last synced: 30 Mar 2025

https://github.com/dolanmiu/mclaren-task

A front-end assessment task for McLaren

angular data observable observables rxjs

Last synced: 16 May 2026

https://github.com/talitalobo/statistics-with-python

Repo about statistical concepts and (not always) their python implementation.

data data-science machine-learning statistics

Last synced: 11 Jan 2026

https://github.com/shivamsharma32/ipl-2022-analysis

The IPL 2022 Analysis project is a data-driven exploration of the Indian Premier League (IPL) 2022 cricket tournament. The analysis focuses on utilizing Python programming and various libraries to analyze and visualize the performance of teams, players, and key metrics in the IPL 2022 season.

data dataana dataanalytics datavi matplotlib python

Last synced: 17 May 2026

https://github.com/emna-chebbi/student-performance

Predictive model for student exam scores based on student performance factors

ai computer-vision data kaggle machine-learning ml mse regression regression-models

Last synced: 15 May 2026

https://github.com/weecology/updating-data

Hugo website for instructions on how to make a regularly updating data pipeline

continuous-analysis continuous-integration data gh-actions living-data netlify travis-ci

Last synced: 17 Feb 2026

https://github.com/rameshaditya/dynamic-hybrid-data-grid

Facilitates faster read-and-write of large ordered collections of data.

algorithms data data-structures storage

Last synced: 23 Feb 2025

https://github.com/kadirlofca/unity-csvmaker

Quick and easy way to create and export .csv files from Unity.

csharp data database unity

Last synced: 09 Apr 2026

https://github.com/amethyst-php/post

A comment, a note, a post, a pseudo-chat. Can be really anything

amethyst amethyst-package api data laravel post

Last synced: 17 May 2026

https://github.com/aaisha-nexus/sql_company_insights

A beginner-friendly SQL project for managing employee records, departments, and sales transactions. Includes table creation, optimized queries, stored procedures, and window functions to extract business insights.

business-analytics data data-analysis dataanalysis-projects dataanalytics database-schema mssql-database query relational-databases sql sql-query ssms

Last synced: 12 Aug 2025

https://github.com/nel-zi/zipco_foods

Developed an automated ETL pipeline using Python and Apache Airflow to consolidate fragmented CSV sales data into a normalized Azure SQL database for Zipco Foods.

airflow apache-spark data dataengineering etl pyspark wsl

Last synced: 03 May 2026

https://github.com/carlosrs14/parallel-data-preprocessig-system

A parallel data preprocessing system using threads and synchronization mechanisms (barrier, busy-waiting, condition variables) to clean and prepare data for AI training.

barrier-method c condition-variable data operative-systems parallel-computing posix preprocessing synchronization threads

Last synced: 24 Jul 2025

https://github.com/terracrow/tml

Easy to use data manipulation package using YAML.

data database db node npm tml yml

Last synced: 26 Feb 2025

https://github.com/toofancodes/h1b-dashboard-insights

An interactive Tableau dashboard that visualizes H1B visa data from the USCIS Employer Data Hub, offering insights into application trends, top employers, and geographic distributions. Showcases advanced data visualization, analytics, and business intelligence skills.

analysis analytics business-intelligence dashboard data data-visualization h1b h1b-visa interactive-data tableau

Last synced: 20 Jan 2026

https://github.com/madhuresh2011/kulturehire-internship

☺️ Hi folks, during my internship at KultureHire, I completed a real-world Data Analyst project. I created an interactive dashboard using pivot tables, conducted a thorough analysis, and provided actionable recommendations. I'm excited to share my work and the insights I discovered.

data data-analytics data-cleaning data-standardization data-visualization excel excel-pivot-charts excel-pivot-tables genz-aspirations my-sql

Last synced: 17 Feb 2026

https://github.com/the-tech-idea/beep.winform.sample

Application for managing your different data sources. Still in alpha; please be patient.

application data data-science database dataset integeration mysql nosql oracle postgres sqlite sqlserver workflow-engine workflows

Last synced: 08 Jul 2025

https://github.com/moscatellimarco/webscrap-imdb

🎬 Python scraper for IMDB: Extract movie/TV details for 📊 analysis & 🗃️ storage. Easy setup, 🔧 customizable, with 🖥️ CLI.

css data datascience html movies python scrapy scrapy-crawler scrapy-spider web web-scraping webdata webscraping

Last synced: 15 May 2026

https://github.com/nmelgar/birthday_sports_dataviz

We analyze how the Matthew effect has influenced professional sports players.

analysis csv data data-analysis data-science data-visualization datavisualization dataviz probability research tableau

Last synced: 08 Jan 2026

https://github.com/als8446/tripleten-data-science-projects

Projects Overview: projects made in the Data Scientist course from TripleTen LatAm

data data-analysis hypothesis-tests machine matplotlib numpy pandas python scipy sklearn

Last synced: 10 Apr 2026

https://github.com/bocchilorenzo/hugginginfo

Unofficial library to retrieve information from the HuggingFace website.

api data huggingface scrape

Last synced: 03 Apr 2026

https://github.com/shysolocup/fndt

JavaScript package allowing you to see function data like body and arguments from outside of the function

aepl data fndt functions javascript javascript-tools js js-function js-functions lightweight nodejs nodejs-modules package stews

Last synced: 30 Apr 2026