An open API service indexing awesome lists of open source software.

data

Individual facts, statistics, or items of information, often numeric. In a technical sense, data are a set of values of qualitative or quantitative variables about one or more persons or objects. (https://en.wikipedia.org/w/index.php?title=Data&oldid=1093674723, released under CC BY-SA 3.0)

https://github.com/dataglyder/Data-Analysis-Tools-to-Get-You-Started

This repository describes a few tools for a beginner Data Analyst.

analytics data python r sql

Last synced: 29 Jul 2025

https://github.com/goutam1511/real-time-covid-19-tracker-for-slack

This automated tracker follows the spread of COVID-19 in real time by scraping data from the Ministry of Health and Family Welfare and posting updates to Slack.

covid-19 data python slack-bot web-scraping

Last synced: 30 Aug 2025

https://github.com/agusk/ilmudata-book-excel-analytics

Hallo Microsoft Excel: Mastering Data Analytics

analytics data data-analytics excel power-query-editor

Last synced: 06 Jan 2026

https://github.com/citizenlabsgr/data.world

Work with data sets prior to uploading to data.world

data data-structures

Last synced: 26 Mar 2025

https://github.com/kwame-mintah/ml-data-copy-to-aws-s3

Automatically copy new data to an AWS S3 bucket for Machine Learning.

aws aws-actions aws-s3 data

Last synced: 14 May 2026

https://github.com/andypicke/ev_station_explorer

Shiny App to visualize electric-vehicle charging station data

data electric-vehicles r shiny-apps visualization

Last synced: 29 Jul 2025

https://github.com/pooja-manjunatha/nyc_parking_violations_dbt

This project uses dbt to transform NYC parking violations data through a layered architecture: Bronze (raw ingested data), Silver (cleaned and enriched data), and Gold (aggregated tables for analytics). Using DuckDB as the warehouse backend, it ensures data quality with tests and documentation. The project enables reliable analysis of parking violations.

data data-analysis data-engineering dbt duckdb python sql

Last synced: 14 May 2026
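
The Bronze, Silver, and Gold layering described in the entry above can be illustrated with a minimal plain-Python sketch. This is not the repository's dbt code: the sample rows, field names, and the helper functions `to_silver` and `to_gold` are hypothetical, chosen only to show how each layer builds on the previous one.

```python
# Bronze: hypothetical raw rows, kept exactly as ingested (strings, stray spaces, gaps).
bronze = [
    {"summons_id": "1", "violation_code": " 21 ", "fine": "65"},
    {"summons_id": "2", "violation_code": "21", "fine": "65"},
    {"summons_id": "3", "violation_code": "14", "fine": ""},     # missing fine
    {"summons_id": "2", "violation_code": "21", "fine": "65"},   # duplicate of id 2
    {"summons_id": "4", "violation_code": "14", "fine": "115"},
]


def to_silver(rows):
    """Silver: trim codes, cast fines to int, drop incomplete and duplicate rows."""
    seen = set()
    cleaned = []
    for row in rows:
        if not row["fine"] or row["summons_id"] in seen:
            continue
        seen.add(row["summons_id"])
        cleaned.append({
            "summons_id": row["summons_id"],
            "violation_code": row["violation_code"].strip(),
            "fine": int(row["fine"]),
        })
    return cleaned


def to_gold(rows):
    """Gold: aggregate the cleaned rows into per-violation-code totals."""
    summary = {}
    for row in rows:
        entry = summary.setdefault(row["violation_code"], {"count": 0, "total_fine": 0})
        entry["count"] += 1
        entry["total_fine"] += row["fine"]
    return summary


silver = to_silver(bronze)
gold = to_gold(silver)
print(gold)
```

In a dbt project each layer would typically be a set of SQL models rather than Python functions; the sketch only mirrors the data flow from raw to cleaned to aggregated.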

https://github.com/valyaevgeorgiy/r_basic

Working with the basics of the R environment, and in doing so learning a new programming language directly related to data analysis and to building graphs and charts.

coding data data-analysis r rstudio

Last synced: 12 Dec 2025

https://github.com/charlieroth/exoexplo

Exploring NASA Exoplanet Archive Data

data exoplanets julia nasa

Last synced: 03 Apr 2025

https://github.com/push-protocol/push-google-bigquery

The Power of Web3 Big Data: A Guide to Using Google BigQuery and Push Protocol for Data Communication and Analysis

bigquery data push push-notifications web3

Last synced: 26 Mar 2025

https://github.com/jacopodl/jcollections

Common data structures for the C language

c collections data data-structures jcollections

Last synced: 30 Jul 2025

https://github.com/rajesh9943/web-scraping-analysis-of-top-us-company-revenue-growth-in-2023

Explore the landscape of US business growth in 2023 with our dynamic project, 'Web Scraping for US 2023 Revenue Growth.' Utilizing advanced web scraping techniques, we unveil insights into the top companies driving economic expansion.

cleaning-data data data-analysis data-visualization manipulation numpy pandas pre-fill

Last synced: 16 Aug 2025

https://github.com/RedInfinityPro/ScientificSharp

Rating: (5/10) The code is a Windows Forms application for a basic scientific calculator, allowing users to perform mathematical operations such as addition, subtraction, multiplication, division, trigonometric functions, and logarithms.

componentmodel cryptography data drawing forms generic linq system tasks text

Last synced: 30 Sep 2025

https://github.com/8hrsk/ranger

Package for generating fake userdata to work with.

data factory faker generator npm

Last synced: 30 Apr 2026

https://github.com/luminati-io/zoominfo-dataset-samples

A sample dataset of over 1000 ZoomInfo companies, extracted using the Bright Data API, ideal for market growth, lead generation, and market analysis.

b2b business companies data data-extraction database dataset datasets web-scraping zoominfo

Last synced: 17 Mar 2025

https://github.com/vladandreitoma/igisol_jyvaskyla_xept_experimental_campaign

A simulation toolkit together with data analysis for the Xe&Pt Exotic Nuclei Generation experiment at Jyvaskyla, December 2022. Helping Dr. Paul Constantin with simulation development. The simulation uses Geant4, provided by CERN, and the data analysis uses ROOT, also by CERN; both are C++ based. Job distributors for running the simulation are written in Perl.

analysis architecture-design cplusplus data oop oop-principles pearl simulations

Last synced: 05 Sep 2025

https://github.com/ethenkem/PyGraphSurvey

A Python-based web app that provides graphical analysis of data collected from surveys. The system has its own built-in form filling: an admin can set questions and send a link to the form, and the system then provides analysis of the collected data. Form features include selection options, range values, file inputs, etc.

data

Last synced: 30 Apr 2025

https://github.com/jonathanstowe/databulous

Abstraction for tabular data

data perl6 table tabular

Last synced: 02 Apr 2025

https://github.com/mattpap/pycon-2017-bokeh

Bokeh tutorial at PyCon.PL 2017

bokeh data tutorial visualization

Last synced: 17 Mar 2025

https://github.com/par7133/xsltmaster

Dynamically load data from multiple XML/XSLT in webpages

data dynamic load webpages xml xslt

Last synced: 02 Mar 2025

https://github.com/octoenergy/tentaclio-gdrive

A python project containing all the dependencies for the gdrive tentaclio schema

data

Last synced: 24 Jun 2025

https://github.com/octoenergy/tentaclio-databricks

Module to give tentaclio support to databricks

data

Last synced: 24 Jun 2025

https://github.com/octoenergy/tentaclio-s3

A python project containing all the dependencies for s3 tentaclio schema.

data

Last synced: 24 Jun 2025

https://github.com/octoenergy/tentaclio-athena

A python project containing all the dependencies for awsathena+rest tentaclio schema.

data

Last synced: 24 Jun 2025

https://github.com/octoenergy/tentaclio-postgres

A python project containing all the dependencies for postgresql tentaclio schema.

data

Last synced: 24 Jun 2025

https://github.com/octoenergy/tentaclio-gs

A python project containing all the dependencies for gs tentaclio schema.

data

Last synced: 24 Jun 2025

https://github.com/huspacy/huspacy-resources

Resources for building and evaluating huspacy

data huspacy

Last synced: 21 Mar 2025

https://github.com/kaizadp/bbwm_moisture

HOBO data for soil moisture - Bear Brook Watershed in Maine

data hobo-data soil-moisture

Last synced: 17 May 2026

https://github.com/eslamdyab21/apara-data-gui

Custom application for Apara's data wrangling scripts. Technologies used are Qt Designer and PyQt5 for the GUI, and Pandas and NumPy for the data work.

csv data data-analysis data-wrangling gui pandas pyqt5-desktop-application qt5-gui

Last synced: 17 May 2026

https://github.com/jefking/copyblobs

Copies all files in a container to another container, in another storage account.

aci arm azcopy azure blob container copy data file files from instant move one-time simple storage sync template to transfer

Last synced: 27 Mar 2025

https://github.com/wilcotomassen/lorem-datum-core

Java based data generator for data simulation

data dataset generator java lorem-ipsum simulated-data

Last synced: 11 Jan 2026

https://github.com/maximkrouk/storage

Lightweight framework for storing data (beta)

cache data keychain memmory storage swift swift5-1 userdefaults

Last synced: 30 Oct 2025

https://github.com/ahmad-ali-rafique/logistic-regression-modeling

An in-depth exploration of logistic regression models, including data cleaning, model building, and performance evaluation on various datasets.

accuracy confusion-matrix data dataanalytics logistic-regression logistic-regression-classifier machine-learning-algorithms mlmodels model modelling regression-models

Last synced: 11 Sep 2025

https://github.com/lukaszkn/data-software-engineering-interview-questions

Data and Software engineering interview questions

data engineering interview-questions python

Last synced: 20 Jul 2025

https://github.com/akesling/csvb

Have CSV? Use CSVB!

analytics csv data database

Last synced: 02 Feb 2026

https://github.com/iankitnegi/statistically_speaking

Explore diverse projects showcasing statistical techniques with real-world data, comprehensive docs, and interactive visualizations.

data excel statistical-analysis statistics

Last synced: 09 Feb 2026

https://github.com/eloyhere/semantic-java

Semantic-Java is a modern Java stream processing framework with zero dependencies, distributed as a Maven dependency. It blends the fluency of Java Streams, the laziness of JavaScript generators, and index-based control inspired by database indexing, making it suited to time-series, event streams, and high-performance data pipelines.

data functional functional-programming java pipeline stream

Last synced: 07 Apr 2026

https://github.com/mightymetrika/scdtb

Single Case Design Toolbox

data math r science statistics

Last synced: 04 Jan 2026

https://github.com/0xbitx/dedsec_pastebin-cli

Lets you manage your pastes directly from the terminal.

code data paste pastebin payload

Last synced: 25 Jan 2026

https://github.com/ramtinsoltani/safe-cli

A simple Command-line Interface which encrypts and decrypts UTF-8 files using AES-256.

aes-256 cli data data-hook decryption encryption generator handlebars hooks markup partial partial-decryption password safe swap temp temporary tool

Last synced: 16 Apr 2026

https://github.com/ranjeetj06/insighthub

InsightHub is a data analytics project that helps automate the entire process of preparing, analyzing, and reporting on CSV data.

analysis begineer data springboot

Last synced: 17 May 2026

https://github.com/ellisvalentiner/legislation-embeddings

Embeddings for U.S. Congress legislation

data embeddings machine-learning nlp python

Last synced: 12 Aug 2025

https://github.com/pawlo77/messenger-analyser

Repo for Data Visualization project, part of IAD study program at Faculty of Mathematics and Information Science, Warsaw University of Technology

data visualization

Last synced: 17 May 2026

https://github.com/sharmadhiraj/plot-pi

Graphical Representation of PI

data data-visualization html javascript js mathematics plot

Last synced: 28 Mar 2025

https://github.com/nathanieliskandar26/data-analysis-project

This project demonstrates my ability to clean and analyze data using Python and SQL so far. The dataset used for this analysis focuses on general customer information. Through this project, I aimed to uncover meaningful insights and trends by cleaning the data and performing structured queries.

analysis data data-cleaning jupyter-notebook mysql mysql-database python

Last synced: 19 Apr 2026

https://github.com/apparaomulpuri/readline

Explains the usage of the readLine function in Swift.

data fromkeyboard keyboard reading readline swift

Last synced: 29 Mar 2025

https://github.com/vin20777/drone-data-layer

Drone Project Data Layer

csharp data drone layer software-design

Last synced: 18 May 2026

https://github.com/bodfdaf/api

api data service provider

api data detail instagram lazada shopee tiktok video

Last synced: 11 Mar 2025

https://github.com/maulanakavaldo/tri-hita-karana

Project Tri Hita Karana - Future Knowledge G20 Bali. DTS Kominfo x Binar Academy.

bali data data-science g20 science

Last synced: 02 Mar 2025

https://github.com/pedelriomarron/spanish-api-covid19

Data from Spain of COVID-19 (by Datadista) as a service

api covid-19 covid-19-spain data now spain zeit

Last synced: 12 Mar 2025

https://github.com/solrikk/vargen

VarGen (Variation Generator) is a user-friendly desktop application designed to simplify the creation of product variations from CSV files.

csv-files csv-format csv-parser data data-engineering excel excelparser python

Last synced: 29 Mar 2025

https://github.com/pulipulichen/pts-local-news-dataset

A dataset containing local news from Public Television Service.

data dataset

Last synced: 27 Mar 2026

https://github.com/gui-sitton/prepaid

In this project I work as an analyst for the telecommunications company Megaline. The company offers its customers prepaid plans, Surf and Ultimate. The sales department wants to know which plans bring in the most revenue in order to adjust the advertising budget

data data-analysis data-analysis-python data-science data-visualization python

Last synced: 22 May 2026

https://github.com/ciscorn/japanmesh-rs

A Rust library for handling Japanese Grid Square Code (JIS X 0410:2002 地域メッシュコード)

census data geospatial japan rust

Last synced: 11 Jan 2026
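
For readers unfamiliar with the Grid Square Code named in the entry above, the following is a minimal sketch of how an 8-digit third-level code is commonly derived from latitude and longitude. It is not the API of the Rust library listed here; the function name `grid_square_code` is hypothetical, and the arithmetic assumes coordinates within Japan (roughly latitude 20 to 46, longitude 122 to 154).

```python
import math


def grid_square_code(lat, lon):
    """Return the 8-digit third-level grid square code for a point in Japan.

    Hypothetical helper for illustration; assumes lat/lon in decimal degrees.
    """
    # First level: cells of 40 minutes of latitude by 1 degree of longitude.
    lat_minutes = lat * 60.0
    p, lat_rest = divmod(lat_minutes, 40.0)
    u = math.floor(lon - 100.0)
    lon_minutes = (lon - 100.0 - u) * 60.0

    # Second level: each first-level cell is split 8 x 8
    # (5 minutes of latitude by 7.5 minutes of longitude).
    q, lat_rest = divmod(lat_rest, 5.0)
    v, lon_rest = divmod(lon_minutes, 7.5)

    # Third level: each second-level cell is split 10 x 10
    # (30 seconds of latitude by 45 seconds of longitude).
    r = math.floor(lat_rest * 60.0 / 30.0)
    w = math.floor(lon_rest * 60.0 / 45.0)

    return f"{int(p):02d}{u:02d}{int(q)}{int(v)}{r}{w}"


# Approximate coordinates of a point near Tokyo Station.
print(grid_square_code(35.681236, 139.767125))  # prints 53394611
```

The first four digits identify the first-level cell, the next two the second-level cell within it, and the last two the third-level cell.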

https://github.com/aguven6/inmemory-data-processor

Converts tabular data to indexed columnar data. The aim is to process large datasets faster, especially in aggregation operations.

columnar-storage data data-structures parallel-computing parallel-programming processing

Last synced: 17 May 2026
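
The idea in the entry above, turning row-oriented records into indexed columns so that aggregations touch only the columns they need, can be sketched as follows. This is an illustration under assumptions, not the repository's implementation: the function names `to_columnar`, `build_index`, and `sum_by` and the sample rows are hypothetical, and every row is assumed to have the same keys.

```python
from collections import defaultdict


def to_columnar(rows):
    """Pivot a list of row dicts into a dict of column lists."""
    columns = defaultdict(list)
    for row in rows:
        for name, value in row.items():
            columns[name].append(value)
    return dict(columns)


def build_index(column):
    """Map each distinct value in a column to the row positions where it occurs."""
    index = defaultdict(list)
    for position, value in enumerate(column):
        index[value].append(position)
    return dict(index)


def sum_by(columns, key, value):
    """Sum the `value` column grouped by the `key` column, using the index."""
    index = build_index(columns[key])
    values = columns[value]
    return {group: sum(values[i] for i in positions)
            for group, positions in index.items()}


rows = [
    {"region": "north", "sales": 10},
    {"region": "south", "sales": 5},
    {"region": "north", "sales": 7},
]
columns = to_columnar(rows)
print(sum_by(columns, "region", "sales"))  # prints {'north': 17, 'south': 5}
```

Because the aggregation reads only the two columns involved, other columns in a wide table are never scanned, which is the usual motivation for a columnar layout.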

https://github.com/rellyson/data-engineering-tools

This repository holds examples and documentation about the most used tools in the data engineering ecosystem.

apache-airflow apache-spark data data-engineering jupyter-notebook python tools

Last synced: 17 Jan 2026

https://github.com/a-poor/taro

A package for repeatable rectangular data transformations in Python.

data data-science data-transformation pipeline pypi-package python

Last synced: 13 Oct 2025

https://github.com/yash22222/olympic-games-analytics-using-apache-spark

The "Olympic Games Analytics Using Apache Spark Databricks" project explores data from the Olympic Games (1896-2016) to identify trends and insights. Using Apache Spark for big data processing and Databricks for visualization, the project analyzes key factors like top-performing countries and athlete attributes, showcasing real-world analytics.

apache apache-kafka apache-spark big-data-analytics csv data data-analytics data-visualization databricks excel mysql olympics regions

Last synced: 03 May 2026

https://github.com/hallmx/mx_utils

Utility scripts for software development in data science

colaboratory data development nbdev python science scripts software utlities

Last synced: 19 May 2026

https://github.com/gsmith257-cyber/BIT3434CVE

BIT 3434 project on data mining CVEs and exploits

cve data data-mining exploits research-project

Last synced: 10 Mar 2025

https://github.com/domarps/grad-project-reports

Write-ups of a few key semester-long projects I worked on during my Master's

circuit data deeplearning graph-algorithms matlab question-answering

Last synced: 26 Mar 2025

https://github.com/shivamsharma32/ipl-2022-analysis

The IPL 2022 Analysis project is a data-driven exploration of the Indian Premier League (IPL) 2022 cricket tournament. The analysis focuses on utilizing Python programming and various libraries to analyze and visualize the performance of teams, players, and key metrics in the IPL 2022 season.

data dataana dataanalytics datavi matplotlib python

Last synced: 17 May 2026

https://github.com/pratik-codes/zomato_data_eda

Cleaned and analysed messy data, then created a predictive model with an accuracy of 93% using a tree regressor algorithm

bengaluru data data-cleaning data-science famous-restaurants restaurants-delivering-online restraunts

Last synced: 27 Mar 2025

https://github.com/weecology/updating-data

Hugo website for instructions on how to make a regularly updating data pipeline

continuous-analysis continuous-integration data gh-actions living-data netlify travis-ci

Last synced: 17 Feb 2026

https://github.com/encelo/nctracer-data

Data files for the ncTracer project

data icons ncine

Last synced: 15 Jan 2026

https://github.com/rameshaditya/dynamic-hybrid-data-grid

Facilitates faster read-and-write of large ordered collections of data.

algorithms data data-structures storage

Last synced: 23 Feb 2025

https://github.com/amethyst-php/post

A comment, a note, a post, a pseudo-chat. Can be really anything

amethyst amethyst-package api data laravel post

Last synced: 17 May 2026

https://github.com/toofancodes/h1b-dashboard-insights

An interactive Tableau dashboard that visualizes H1B visa data from the USCIS Employer Data Hub, offering insights into application trends, top employers, and geographic distributions. Showcases advanced data visualization, analytics, and business intelligence skills.

analysis analytics business-intelligence dashboard data data-visualization h1b h1b-visa interactive-data tableau

Last synced: 20 Jan 2026

https://github.com/phette23/nces-ipeds-archive

download NCES IPEDS data

data datarescue ipeds nces

Last synced: 30 Oct 2025

https://github.com/ericmaddox/nyc-crime-analytics

Analyzes and visualizes crime data from the NYC Police Department using interactive maps and heatmaps, leveraging the NYC Open Data API.

crime-analysis crimedata data datavisualization esri folium heatmap nycopendata python python3 rtcc

Last synced: 24 Jun 2025

https://github.com/adadalshabab/machine-predictive-maintenance-classification

This repository hosts a machine predictive maintenance classification project, aimed at predicting the maintenance needs of industrial machinery before they fail. By leveraging machine learning algorithms, this project seeks to enhance operational efficiency and reduce downtime by identifying potential maintenance requirements proactively.

data data-science datanalysis datanalytics machine-learning machine-learning-algorithms matplotlib-pyplot pandas

Last synced: 17 May 2026

https://github.com/antoninpvr/battery-logger

Simple scripts to record data from my laptop battery

bash-script battery data

Last synced: 17 May 2026

https://github.com/basinghse/covid19simulator

Real Time Assessment and Simulation of COVID-19 - showing current numbers of cases, deaths and treated patients globally.

coronavirus covid-19 data real-time simulation visualisation visualisation-data-ingester

Last synced: 05 Apr 2025

https://github.com/gaemapiracicaba/norma_dec_8468-76

Quality and effluent discharge standards for inland waters

data python

Last synced: 19 Apr 2026

https://github.com/hidayathamir/telegram-group-data

Data for 1,865,827 messages in a Telegram group: text, identity, datetime.

bahasa-indonesia data python3 scrape telegram telethon

Last synced: 17 May 2026

https://github.com/annaanastasy/mushroom-binary-classification-eda-ml

Explored and modeled a competition dataset of mushroom species, focusing on data cleaning, exploratory data analysis, and building machine learning models for accurate classification of edible and poisonous mushrooms.

binary-classification data data-cleaning-and-preprocessing data-science exploratory-data-analysis machine-learning-algorithms xgboost-classifier

Last synced: 29 Mar 2025

https://github.com/saksham-jain177/data-analysis

A collection of data analysis and machine learning projects across various datasets. Explore predictive modeling, data visualization, and insights from real-world data. Projects include sales predictions, disease detection, customer segmentation, and more.

api data data-analysis data-cleaning data-science data-visualization datamodeling dataset datasets exploratory-data-analysis python python3 web-scraping youtube-api

Last synced: 01 May 2026

https://github.com/meta-llama/synthetic-data-kit

Tool for generating high-quality synthetic datasets

data generation llm python synthetic

Last synced: 08 May 2025

https://github.com/kammarah/studentdata

I created & deployed a Streamlit app to store, manage & analyze student data. 📊🎓

connection data data-analysis data-visualization deploy deployments libraries python streamlit streamlit-webapp webapp

Last synced: 18 May 2026

https://github.com/mkshah605/personal-brand-development

A data-driven approach to a personal brand development project.

branding data data-science growth music personal

Last synced: 12 Sep 2025