An open API service indexing awesome lists of open source software.

data

Individual facts, statistics, or items of information, often numeric. In a technical sense, data are a set of values of qualitative or quantitative variables about one or more persons or objects. (https://en.wikipedia.org/w/index.php?title=Data&oldid=1093674723, released under CC BY-SA 3.0)

https://github.com/eloyhere/semantic-java

Semantic-Java is a modern Java stream processing framework with zero dependencies, distributed as a Maven dependency. It blends the fluency of Java Streams, the laziness of JavaScript generators, and index-based control inspired by database indexing, and is aimed at time-series, event streams, and high-performance data pipelines.

data functional functional-programming java pipeline stream

Last synced: 07 Apr 2026

https://github.com/chrisrobertsjr/chrisrobertsjr

Welcome to my GitHub Profile!

data data-analysis java r sql statistics

Last synced: 03 May 2026

https://github.com/danicaalana/breast-cancer-random-forest

This project was developed as part of Digital Skill Fair (DSF) 35.0 - Data Science by Dibimbing. I am using the Wisconsin Breast Cancer Diagnostic Dataset from scikit-learn, which is a classic and very easy binary classification dataset.

breast-cancer-classification breast-cancer-wisconsin data eda machine-learning-algorithms python random-forest-classifier

Last synced: 16 May 2026

https://github.com/debruine/faux.jl

Julia version of faux for data simulation

data julia simulation

Last synced: 28 Mar 2025

https://github.com/shreedata/data-analysis-using-python-libraries-

The COVID-19 pandemic has significantly impacted India, necessitating a detailed analysis of the virus’s spread within the country. In this project, we explore an India-specific COVID-19 dataset, leveraging Python libraries such as Pandas, NumPy, Matplotlib, and Seaborn.

covid-19 data data-cleaning data-visualization datana kaggle-dataset matplotlib numpy pandas-python python3 pythonlibrarires scikit seaborn

Last synced: 28 Mar 2025

https://github.com/muneeb1030/webscrapper_politifact

This initiative seeks to extract and analyze fact-checking data from Politifact.com, providing valuable insights into political statements, rulings, and the evolving information landscape.

data data-collection dataanalysis python3 scrapy scrapy-spider webscraping

Last synced: 09 Sep 2025

https://github.com/tadiusfrank2001/data_mining_projects_labs_cs145

A collection of data mining course assignments to implement advanced predictive statistical analysis models

algorithms data data-mining data-science deep-learning predictive-modeling python3 wide-learning

Last synced: 16 May 2026

https://github.com/ishansurdi/data-visualisation-empowering-business-with-effective-insights

The following tasks are completed for Data Visualization: Empowering Business with Effective Insights on Forage in October 2024. It is important to note that this should not be interpreted as an endorsement.

chart communicating-insights-and-analysis dashboard data data-analysis forage powerbi powerbi-visuals tableau tata tata-group virtual-internship visual visualization

Last synced: 17 Feb 2026

https://github.com/umstek/sampler

Generate elaborate random data instantly.

data faker javascript json sample

Last synced: 20 Jul 2025

https://github.com/mx51/data-dictionary-action

GitHub Action for generating and checking freshness of data dictionaries

action analytics data

Last synced: 17 Jan 2026

https://github.com/akesling/csvb

Have CSV? Use CSVB!

analytics csv data database

Last synced: 02 Feb 2026

https://github.com/jefking/copyblobs

Copies all files in a container to another container, in another storage account.

aci arm azcopy azure blob container copy data file files from instant move one-time simple storage sync template to transfer

Last synced: 27 Mar 2025

https://github.com/eslamdyab21/apara-data-gui

Custom application for Apara's data wrangling scripts. Technologies used are Qt Designer and PyQt5 for the GUI, and Pandas and NumPy for the data work.

csv data data-analysis data-wrangling gui pandas pyqt5-desktop-application qt5-gui

Last synced: 17 May 2026

https://github.com/huspacy/huspacy-resources

Resources for building and evaluating huspacy

data huspacy

Last synced: 21 Mar 2025

https://github.com/octoenergy/tentaclio-gdrive

A python project containing all the dependencies for the gdrive tentaclio schema

data

Last synced: 24 Jun 2025

https://github.com/tuscanicz/doctrine-data-applier

Symfony bundle for Doctrine Migrations of data using doctrine entities

data database doctrine entity migrations symfony symfony-bundle

Last synced: 02 Feb 2026

https://github.com/octoenergy/tentaclio-databricks

Module that adds Databricks support to tentaclio

data

Last synced: 24 Jun 2025

https://github.com/octoenergy/tentaclio-s3

A python project containing all the dependencies for s3 tentaclio schema.

data

Last synced: 24 Jun 2025

https://github.com/octoenergy/tentaclio-athena

A python project containing all the dependencies for awsathena+rest tentaclio schema.

data

Last synced: 24 Jun 2025

https://github.com/octoenergy/tentaclio-postgres

A Python project containing all the dependencies for the postgresql tentaclio schema.

data

Last synced: 24 Jun 2025

https://github.com/octoenergy/tentaclio-gs

A python project containing all the dependencies for gs tentaclio schema.

data

Last synced: 24 Jun 2025

https://github.com/prcharan592/olympic-insights-historical-data-analytics-in-r

This project analyzes 120 years of Olympic history (1896–2016), uncovering trends and insights from the data

data data-analytics data-science data-visualization kaggle r-programming

Last synced: 03 Apr 2025

https://github.com/kaizadp/bbwm_moisture

HOBO data for soil moisture - Bear Brook Watershed in Maine

data hobo-data soil-moisture

Last synced: 17 May 2026

https://github.com/theduardomaciel/cc-pe

Course materials, R scripts, and datasets used during the Probability and Statistics course.

data probability r statistics

Last synced: 27 Mar 2025

https://github.com/vijaykumar1303/sales-data-analysis-and-dashboard-development

To analyze sales data to uncover insights into sales performance, trends, and patterns, and to develop an interactive dashboard that provides a comprehensive view of sales metrics and KPIs.

data dataanalysis datacleaning datavisualisation dax-query powerbi powerquery sql sqldataanalysis

Last synced: 11 Feb 2026

https://github.com/wilcotomassen/lorem-datum-core

Java based data generator for data simulation

data dataset generator java lorem-ipsum simulated-data

Last synced: 11 Jan 2026

https://gitlab.com/sean-c/pdf_rules

Turn PDFs into CSVs by defining rules

Data Cleaning automation data data parsing

Last synced: 14 Apr 2025

https://github.com/maximkrouk/storage

Lightweight framework for storing data (beta)

cache data keychain memmory storage swift swift5-1 userdefaults

Last synced: 30 Oct 2025

https://github.com/ahmad-ali-rafique/logistic-regression-modeling

An in-depth exploration of logistic regression models, including data cleaning, model building, and performance evaluation on various datasets.

accuracy confusion-matrix data dataanalytics logistic-regression logistic-regression-classifier machine-learning-algorithms mlmodels model modelling regression-models

Last synced: 11 Sep 2025

https://github.com/lukaszkn/data-software-engineering-interview-questions

Data and Software engineering interview questions

data engineering interview-questions python

Last synced: 20 Jul 2025

https://github.com/campiohe/geomask

A very simple lib for creating geometric masks from spatial data using regular grids.

climate data gis weather

Last synced: 30 Dec 2025

https://github.com/mightymetrika/scdtb

Single Case Design Toolbox

data math r science statistics

Last synced: 04 Jan 2026

https://github.com/ramtinsoltani/safe-cli

A simple Command-line Interface which encrypts and decrypts UTF-8 files using AES-256.

aes-256 cli data data-hook decryption encryption generator handlebars hooks markup partial partial-decryption password safe swap temp temporary tool

Last synced: 16 Apr 2026

https://github.com/chocolateboy/corrigenda

Corrections, addenda, and deltas for data that's wrong on the Internet

addenda api corrections corrigenda data json json-data

Last synced: 27 Mar 2025

https://github.com/pawlo77/messenger-analyser

Repo for Data Visualization project, part of IAD study program at Faculty of Mathematics and Information Science, Warsaw University of Technology

data visualization

Last synced: 17 May 2026

https://github.com/nathanieliskandar26/data-analysis-project

This project demonstrates my ability to clean and analyze data using Python and SQL so far. The dataset used for this analysis focuses on general customer information. Through this project, I aimed to uncover meaningful insights and trends by cleaning the data and performing structured queries.

analysis data data-cleaning jupyter-notebook mysql mysql-database python

Last synced: 19 Apr 2026

https://github.com/apparaomulpuri/readline

Explains the usage of the readLine function in Swift.

data fromkeyboard keyboard reading readline swift

Last synced: 29 Mar 2025

https://github.com/vin20777/drone-data-layer

Drone Project Data Layer

csharp data drone layer software-design

Last synced: 18 May 2026

https://github.com/bodfdaf/api

api data service provider

api data detail instagram lazada shopee tiktok video

Last synced: 11 Mar 2025

https://github.com/pedelriomarron/spanish-api-covid19

COVID-19 data for Spain (by Datadista) as a service

api covid-19 covid-19-spain data now spain zeit

Last synced: 12 Mar 2025

https://github.com/solrikk/vargen

VarGen (Variation Generator) is a user-friendly desktop application designed to simplify the creation of product variations from CSV files.

csv-files csv-format csv-parser data data-engineering excel excelparser python

Last synced: 29 Mar 2025

https://github.com/jorgeatgu/dataset-elecciones-28a

Datasets generated from the El País general elections dataset

28a data elecciones2019 elections spain

Last synced: 16 May 2026

https://github.com/a-poor/taro

A package for repeatable rectangular data transformations in Python.

data data-science data-transformation pipeline pypi-package python

Last synced: 13 Oct 2025

https://github.com/yash22222/olympic-games-analytics-using-apache-spark

The "Olympic Games Analytics Using Apache Spark Databricks" project explores data from the Olympic Games (1896-2016) to identify trends and insights. Using Apache Spark for big data processing and Databricks for visualization, the project analyzes key factors like top-performing countries and athlete attributes, showcasing real-world analytics.

apache apache-kafka apache-spark big-data-analytics csv data data-analytics data-visualization databricks excel mysql olympics regions

Last synced: 03 May 2026

https://github.com/hallmx/mx_utils

Utility scripts for software development in data science

colaboratory data development nbdev python science scripts software utlities

Last synced: 19 May 2026

https://github.com/gsmith257-cyber/BIT3434CVE

BIT 3434 project on data mining CVEs and exploits

cve data data-mining exploits research-project

Last synced: 10 Mar 2025

https://github.com/nabilaagha/chest-x-ray-medical-diagnosis-using-deep-learning

This project uses deep learning to classify chest X-ray images for disease detection. It involves data preprocessing, pre-trained CNN models, and the ChestX-ray8 dataset to enhance medical diagnostics with AI.

computer-vision data data-processing deep-learning juypter-notebook medical-image-processing x-ray-images

Last synced: 15 Dec 2025

https://github.com/pratik-codes/zomato_data_eda

Cleaned and analysed messy data, then created a predictive model with an accuracy of 93% using a tree regressor algorithm

bengaluru data data-cleaning data-science famous-restaurants restaurants-delivering-online restraunts

Last synced: 27 Mar 2025

https://github.com/encelo/nctracer-data

Data files for the ncTracer project

data icons ncine

Last synced: 15 Jan 2026

https://github.com/cemoktra/data_series

time series handling

data lazy-evaluation time-series

Last synced: 29 Oct 2025

https://github.com/luminati-io/zoominfo-dataset-samples

A sample dataset of over 1000 ZoomInfo companies, extracted using the Bright Data API, ideal for market growth, lead generation, and market analysis.

b2b business companies data data-extraction database dataset datasets web-scraping zoominfo

Last synced: 17 Mar 2025

https://github.com/noorkhokhar99/text-to-speech-demo

Text to Speech Demo

data python roboflow

Last synced: 27 Mar 2025

https://github.com/ahabdel/amazon-web-scraper

Amazon Web Scraper to scrape pricing adjustments and provide updates on a day to day basis

data web-scraping

Last synced: 29 Oct 2025

https://github.com/takamoso/umami

Cross browser compatibility data.

browser compat compatibility data dataset json

Last synced: 27 Mar 2025

https://github.com/annaanastasy/mushroom-binary-classification-eda-ml

Explored and modeled a competition dataset of mushroom species, focusing on data cleaning, exploratory data analysis, and building machine learning models for accurate classification of edible and poisonous mushrooms.

binary-classification data data-cleaning-and-preprocessing data-science exploratory-data-analysis machine-learning-algorithms xgboost-classifier

Last synced: 29 Mar 2025

https://github.com/kammarah/studentdata

I created & deployed a Streamlit app to store, manage & analyze student data. 📊🎓

connection data data-analysis data-visualization deploy deployments libraries python streamlit streamlit-webapp webapp

Last synced: 18 May 2026

https://github.com/mkshah605/personal-brand-development

A data-driven approach to a personal brand development project.

branding data data-science growth music personal

Last synced: 12 Sep 2025

https://github.com/yugoff/ml-kaggle-regression-with-a-mohs-hardness-dataset

Your Goal: For this Episode of the Series, your task is to use regression to predict the Mohs hardness of a mineral, given its properties

data gradient-boosting kaggle kaggle-competition regression-models

Last synced: 18 May 2026

https://github.com/ahadly/sql-data-analytics-project

This repository contains a collection of SQL scripts demonstrating various analytical techniques, such as change-over-time, cumulative, performance, data segmentation, and part-to-whole analysis.

analytics business-analytics business-intelligence data data-analysis data-analyst data-analytics data-engineering data-science data-scientist database datascience query reporting sql sql-queries sql-query sql-server window-functions window-functions-in-sql

Last synced: 18 May 2026

https://github.com/samridhisainii/airbnb-data-analysis

Data analysis of airbnb dataset

analysis data data-visualization eda models

Last synced: 16 May 2026

https://github.com/shrutakeerti/crime-filex

Crime FileX: the mission to trace crime and make this a crime-free world

ai aiml analysis crime-data css data html ics js ml

Last synced: 19 Apr 2026

https://github.com/jigyasag18/ibm-power-bi-dashboard-project

IBM Power BI Dashboard Project is a data-driven analysis of employees using IBM's comprehensive dataset, providing insights into key factors contributing to employee turnover and enabling organizations to strategize effectively towards improved employee retention and satisfaction.

data data-visualization dataanalysis dataanalytics dataset datavisualisation datavisualization-project powerbi powerbi-dashboards powerbi-report powerbi-visuals powerbidashboard

Last synced: 07 Mar 2026

https://github.com/juniorreisx/movelo-logstica

Movelo is a lightweight logistics simulator built with TypeScript that provides mock order and delivery data for developing and testing UIs, dashboards, and backend features without external APIs.

data hooks lucide-react react tailwindcss typescript

Last synced: 12 Apr 2025

https://github.com/e22m4u/ts-projection

A module for working with data projection in TypeScript

data projection typescript

Last synced: 12 Apr 2025

https://github.com/theleopard65/isa-imitation

This repository contains a simple C++ implementation of a Von-Neumann architecture simulator. The program mimics the behavior of a basic computer architecture that uses a single memory space for both instructions and data. Users can load programs, execute them, and view the current state of the memory and registers.

32-bit 64-bit ac architecture c-plus-plus data executable explained implementation ir isa mar mdr memory pc registers simulation von-neumann x64 x86

Last synced: 18 Mar 2025

https://github.com/nanvenomous/sizable

A generic interface to mongo go driver

data driver generic generics go golang mongodb

Last synced: 15 May 2026

https://github.com/eryks1999/data-collection-project_python

This project allowed me to practice classes, populating json files as well as extracting data.

data git json python

Last synced: 16 Apr 2026

https://github.com/ezmiller/boe-election-data

CSV files containing parsed NYC Bureau of Elections data for 2009 and 2013

data elections nyc

Last synced: 18 Oct 2025

https://github.com/ebrizzzz/data-visualization-project-using-tableau

A data visualization project for the Visual Data Analysis course (Spring Term 2025) at the University of Skövde. This project explores the factors influencing national happiness scores across different global regions from 2005 to 2022.

analytics data data-analysis data-science data-visualization python regression tableau

Last synced: 16 Jun 2025

https://github.com/styd/sd_struct

Searchable Deep Struct

activesupport data gem openstruct rails ruby structure

Last synced: 18 May 2026

https://github.com/fastbolt/entity-importer

Entity importing library for importing data from files (CSV and Excel currently) or API into doctrine.

data doctrine2 excel excel-import

Last synced: 17 Feb 2026

https://github.com/mekramy/ircity

Iran province, county and city data in json format.

data iran-city json mekramy

Last synced: 05 Apr 2025

https://github.com/naithikjorapur/practive-tanstacktsx

Practice TanStack with React, Vite, and TypeScript to build fast, type-safe apps. Leverage tools like TanStack Query for data management and Vite for a streamlined development experience.

data exercise fetching html-css-javascript json learning-by-doing practice query router tsx

Last synced: 05 Apr 2025

https://github.com/mrk214/bible-data-es-spa

The Bible in JSON format

api bible biblia data god jesus json spanish

Last synced: 05 Apr 2025

https://github.com/bakangmonei/is_final_assignment

My intelligent systems assignment

data data-science intelligent-systems python

Last synced: 02 May 2026

https://github.com/metapsy-project/data-depression-psiloctr

Database of psilocybin-assisted therapies for adults with depression versus control conditions.

data

Last synced: 01 Mar 2026

https://github.com/luminati-io/google-search-api

Two methods to collect real Google SERP data—a free scraper for basic use and the enterprise-grade Bright Data API for high-volume demands.

data google-scraper html python serp-api web-scraping

Last synced: 25 Jun 2025

https://github.com/amethyst-php/email-subscription

Subscribe your email to our mailing list; we promise no spam will be delivered.

amethyst amethyst-package api data email-subscription laravel

Last synced: 17 Mar 2025