An open API service indexing awesome lists of open source software.

data

Individual facts, statistics, or items of information, often numeric. In a technical sense, data are a set of values of qualitative or quantitative variables about one or more persons or objects. (https://en.wikipedia.org/w/index.php?title=Data&oldid=1093674723, released under CC BY-SA 3.0)

https://github.com/liyakhathshaik/datascout.jl

This is a Julia package.

data datascout julia

Last synced: 09 Oct 2025

https://github.com/scienxlab/datasets

Some small datasets for demos, courses, testing, etc.

data open-data sample-data teaching-resources

Last synced: 09 Oct 2025

https://github.com/varun-khorgade/sentimentscope-e-commerce-review-analyzer

Analyzed customer reviews and purchase data to extract sentiment and behavioral insights. Built SQL-based ETL for data preparation and visualized results using Python and Power BI dashboards for actionable business decisions.

analytics customer-beheviour dashboard data data-visualization dataextraction natural-language-processing nlp pandas powerbi python sentiment-analysis sql textblob

Last synced: 17 Apr 2026

https://github.com/stdlib-js/ndarray-base-zeros-like

Create a zero-filled ndarray having the same shape and data type as a provided ndarray.

base data fill filled javascript matrix ndarray node node-js nodejs stdlib structure types vector zeros

Last synced: 04 Oct 2025

https://github.com/helosantosdesousa/analise-previsao-de-rotatividade-ml

Final project of the Bootcamp Data Girls 2025, analyzing employee turnover with machine learning. Based on the IBM HR Analytics Attrition dataset, the project identifies the main risk factors and builds predictive models (SVC and Random Forest) with up to 89% accuracy to anticipate departures and support strategic HR decisions.

analise-de-dados analise-exploratoria bootcamp ciencia-de-dados colab-notebook dados data data-analysis data-science dataanalytics dataframe eda machine-learning machine-learning-algorithms pandas python random-forest svc

Last synced: 16 Apr 2026

https://github.com/east-empire-trading-company/eetc-data-client

Client library for retrieving data managed by EETC Data Hub.

client-library data data-science finance library python

Last synced: 31 May 2026

https://github.com/jorgeatgu/casa-caida-bot

Twitter bot about depopulation in Aragón

aragon bot data data-viz despoblacion twitter-bot

Last synced: 11 Aug 2025

https://github.com/strata/data

Tools to help you read data from a range of different data providers.

api data data-integration

Last synced: 27 Jan 2026

https://github.com/qeeqbox/data-classification

Data classification defines and categorizes data according to its type, sensitivity, and value

classification data data-classification infosecsimplified qeeqbox

Last synced: 09 Mar 2026

https://github.com/vikjam/ui-policy

Unemployment policy at the state level

data government government-data

Last synced: 13 Feb 2026

https://github.com/semibran/img-data

Easily read from and write to ImageData instances

canvas data image img

Last synced: 11 Aug 2025

https://github.com/rohancyberops/r-language

R Language Projects directory. This repository contains various projects, scripts, and experiments developed using R, a powerful statistical computing and data visualization language.

caret cran data dplyr ggplot2 rlanguage rstudio shiny tidyverse

Last synced: 12 Oct 2025

https://github.com/genert/metis

Asynchronous data sender library

analytics asynchronous data dependency-free typescript

Last synced: 27 Jan 2026

https://github.com/tpgillam/teafiles.jl

Tea file support for Julia

data julia time-series

Last synced: 03 Oct 2025

https://github.com/rubenhortas/python_examples

Examples of Python code and DSA (data structures and algorithms).

algorithm algorithms data dsa examples python python-3 python3 samples snippets structures

Last synced: 03 Oct 2025

https://github.com/ddeutils/ddedocs

📖 Data Developer & Engineer Documents and Hands-On

blogs data data-engineering documents hands-on

Last synced: 08 Aug 2025

https://github.com/orisai/nette-data-sources

Orisai Data Sources integration for Nette

data decoder encoder file-format files json neon nette orisai parser php yaml

Last synced: 05 Feb 2026

https://github.com/carlossilva2/pybase

An easy-to-use database using Python and JSON

data database json python3 storage

Last synced: 11 May 2026

https://github.com/isaac-lal/english-arabic-dictionary

This is a dictionary website that implements a search feature which allows input for a word in either English or Arabic and returns the alternative translation.

data db javascript react web-development

Last synced: 09 Apr 2026

https://github.com/open-i18n/data-iso-15924

Git mirror for ISO 15924, Codes for the representation of names of scripts data

data iso iso-15924 iso15924 open-i18n scripts unicode unicode-data writing-systems

Last synced: 14 Mar 2026

https://github.com/woctezuma/download-steam-screenshots-data

Data consisting of Steam screenshots.

data steam steam-api

Last synced: 19 Feb 2026

https://github.com/akv3sic/cryptocurrency-charts

Cryptocurrency API data visualizations 📈 with Matplotlib.

cryptocurrency data data-visualization matplotlib python

Last synced: 16 Oct 2025

https://github.com/ireddragonicy/wascrub

Easily clean WhatsApp chat exports.

chat clean data meta whatsapp

Last synced: 03 May 2026

https://github.com/dav009/bqt

Local unit tests for your BigQuery queries

bigquery bq data test unittest

Last synced: 11 Feb 2026

https://github.com/bishtrishu/pizza_sales_data_analysis_sql

This project is a comprehensive data analysis of pizza sales, aimed at uncovering key insights and trends to inform business decisions. Using a combination of SQL, Python, and data visualization tools, the project analyzes sales data to understand customer preferences, peak sales periods, and the most popular pizza types.

cloud data data-analysis data-science data-visualization dataanalytics database mysql oracle-database

Last synced: 14 Apr 2026

https://github.com/nicolasbizzozzero/datagenerator

Randomly generate various commonly used data

data data-generation data-generator data-science

Last synced: 18 Oct 2025

https://github.com/mscbuild/analysis

🎢 This collection of data analysis projects demonstrates techniques for extracting, transforming, analyzing, and visualizing data. Data Analytics Projects for Beginners 📈 ⚡

anallysis analysis chart csv dashboard data data-science data-science-projects excel google html5 mashine-learning portfolio pyton

Last synced: 19 Oct 2025

https://github.com/divithraju/divith-aju-hadoop-pyspark-pipeline

This project demonstrates the creation of a scalable data processing pipeline for handling and analyzing log data from a hypothetical e-commerce platform. Leveraging Hadoop and PySpark, the pipeline is designed to process large volumes of log files, providing meaningful insights into user behavior, system performance, and sales metrics.

apache-hadoop-framework apache-spark bigdata client data database dataengineering dataingestionframework datapreprocessing documentation ecommerce-platform hdfs pipeline project project-repository pyspark python3 software-engineering

Last synced: 27 Jan 2026

https://github.com/marcelo-earth/H5N8-Data

🔢🦠 Confirmed cases of H5N8 in humans - Feel free to open Pull Requests with new data.

csv data h5n8 h5n8-cases h5n8-virus russia

Last synced: 20 Oct 2025

https://github.com/tiaanduplessis/country-currency-data

Data about currencies of countries

countries currencies data symbols

Last synced: 08 Aug 2025

https://github.com/rodekruis/510-data-catalog

A CKAN-based data catalog portal for 510

catalog ckan data opendata

Last synced: 23 Jan 2026

https://github.com/purarue/git_doc_history

copy/track file history in git, with python bindings to traverse and extract history/files/lines at some date

data git

Last synced: 17 May 2026

https://github.com/suryavamsi-p/conflict-nlp-topic-modeling-sentiment-analysis-using-llms

Extracts insights from 26K+ protest events using BERTopic, Top2Vec, and LLMs for real-world applications like crisis monitoring, policy research, and social unrest analysis.

all-mpnet-base-v2 bertopic conflict-data data data-science lda llama2 llms machine-learning mistral-7b nlp nltk protest-analysis pyldavis python3 top2vec topic-modeling transformers visualization

Last synced: 11 May 2026

https://github.com/doziestar/datavinci

DataVinci enables you to visualize data from various sources, generate insights, analyze data with AI models, and receive real-time updates on anomalies

data golang logs pipeline

Last synced: 23 Jan 2026

https://github.com/rnabla/cuda-des

Bruteforcing DES using CUDA

bruteforce cuda data des encryption gpu parallel standard

Last synced: 27 Oct 2025

https://github.com/aleenprd/docbt

Documentation Build Tool - Generate YAML documentation for dbt models with optional AI assistance. Built with Streamlit for an intuitive and familiar web interface.

ai analytics-engineering bigquery data data-modeling data-science dbt docker llm lmstudio ollama openai snowflake sql streamlit

Last synced: 11 Nov 2025

https://github.com/medz/block

A flexible and efficient binary data block handling library for Dart.

binary blob block data streams

Last synced: 24 Feb 2026

https://github.com/maccccd/wsoa3029a_2444372

This website serves as an extension of my portfolio work. It focuses specifically on showcasing my understanding of D3.js, a JavaScript library used to create interactive data visualizations. The visualizations here provide insights into two types of cybersecurity attacks: phishing and ransomware.

d3js data hacking visualization

Last synced: 24 Jan 2026

https://github.com/jayantur13/data-bharat

Get states with their capitals and districts, UTs, and other useful information

data js node npmjs package yarn

Last synced: 28 Jan 2026

https://github.com/undistraction/grid-model

A small API for creating a grid and accessing the positions of the cells, rows and columns within it.

2d calculations cells data grid layout model

Last synced: 04 Aug 2025

https://github.com/ariqf1/learn_data

Currently learning and building projects related to data pipelines, ETL processes, and data processing using Python. Passionate about scalable data solutions and modern data stack tools.

data data-engineering mysql

Last synced: 15 Apr 2026

https://github.com/theryston/db-mycro

A Node module with a JSON database that saves data in a specific directory, similar to SQLite but in JSON

base crud data database db db-mycro javascript json jsondatabase nodejs nosql typescript

Last synced: 09 Apr 2026

https://github.com/simranjeet97/leetcode_practice

Practice solutions to LeetCode problems for competitive programming

algorithms amazon coding competitive-programming data data-structures facebook google leetcode python

Last synced: 03 Aug 2025

https://github.com/fairspec/fairspec-typescript

Fairspec TypeScript is a fast data management framework built on top of the Fairspec standard and Polars DataFrames

ckan csv data dataframe dataset excel fair json ods polars quality schema sqlite table typescript validation zenodo

Last synced: 09 Feb 2026

https://github.com/alejo1630/titanic_kaggle

This Python Notebook is a proposal to analyse the Titanic dataset for the Kaggle Competition, using several data science techniques and concepts.

data data-science jupyter-notebook notebook python titanic-survival-prediction

Last synced: 03 May 2026

https://github.com/chalk-ai/roadmap

Chalk public roadmap

chalk data data-science mlops pipeline python

Last synced: 17 Jan 2026

https://github.com/michalwols/awesome-data-curation

🗑️ ✨ 📊 Awesome things related to data collection, annotation, cleaning and management.

active-learning annotation cleaning-data data data-science deep-learning machine-learning

Last synced: 24 Jun 2026

https://github.com/noahweasley/node-user-settings

A universal but simple Node library for implementing user settings, built to work with Electron.js with little or no configuration

app data electronjs json nodejs persist settings storage sync user

Last synced: 08 Feb 2026

https://github.com/garcane/cookie-company-visual-dashboard

This Excel-based interactive dashboard provides a comprehensive overview of the Cookie Company's sales performance and key metrics.

dashboard data data-visualization excel microsoft-excel

Last synced: 09 Feb 2026

https://github.com/3squared/smoulder

Smoulder is a really good data pipe

composition data facade-pattern forge-framework object-oriented

Last synced: 25 Apr 2026

https://github.com/jhpoelen/rats

Self-replicating data publication related to rat (Rattus sp.) specimens.

biodiversity data natural-history-collections provenance

Last synced: 18 Mar 2026

https://github.com/mchenryspagg/hng-hire-data-model

The project involves creating a data model for HNG Hire, implementing it in MySQL, and building a Power BI dashboard to display hiring statistics.

dashboard data database datamodeling dimensional-modeling mysql mysql-database powerbi starschema

Last synced: 11 Feb 2026

https://github.com/v6ntage/sql-sales_data-analytics-project

This repository contains SQL scripts demonstrating analytical techniques.

analytics business-analytics data data-analysis database query sql sql-server

Last synced: 12 Apr 2026

https://github.com/tushar2704/applied-ai-playground

This repository serves as a comprehensive collection of resources and projects for Applied Artificial Intelligence (AI). Whether you're an AI enthusiast, a data scientist, or a developer looking to explore practical applications of AI, this repository aims to provide you with valuable materials and hands-on projects to deepen your understanding.

artificial-intelligence data data-science machine-learning machine-learning-algorithms

Last synced: 12 Feb 2026

https://github.com/lmuffato/project-mongodb-dataflights-trybe

MongoDB Dataflights project: a Trybe assessment project for Block 23, Introduction to MongoDB

back-end crud data database filter mongo mongodb query trybe-projects

Last synced: 16 Apr 2026

https://github.com/ishaansathaye/cpe202-datastructalgos

CPE 202 Data Structures and Algorithms, Winter 2022, freshman year at Cal Poly

algorithm binary binary-search-tree data graph hash heap python queue stack structures

Last synced: 12 May 2026

https://github.com/stephaniehicks/flowsorted.blood.wgbs.blueprint

A Bioconductor ExperimentHub data package for flow sorted purified whole blood cell types measured using DNA methylation on WGBS platform from BLUEPRINT

bioconductor bioconductor-package bisulfite-sequencing blood data dna-methylation flowsort wgbs

Last synced: 25 Sep 2025

https://github.com/obsidianplusplus/5e_play_cs-go

Python tool to analyze your 5EPlay CS:GO match data. Fetches, analyzes, filters, and exports.

5eplay analysis api automation csgo data esports excel json match pandas performance player python reporting scraping stats team

Last synced: 13 Feb 2026

https://github.com/garcane/beverage-sales-analytics

This project provides an in-depth analysis of beverage sales and delivery across different states using Power BI.

data data-visualization powerbi powerbi-report powerbi-visuals

Last synced: 19 Mar 2026

https://github.com/scarblase/russian-military-losses-analysis

This repository provides an in-depth analysis of Russian equipment losses using PySpark and data visualization techniques.

data data-science data-visualization jyputer-notebook matplotlib pyspark python3 seaborn seaborn-plots ukraine ukraine-invasion

Last synced: 12 May 2026

https://github.com/tonykipkemboi/ens_subgraph_data

Query On-Chain Data from Subgraphs by The Graph Protocol using Python

data subgraphs thegraphprotocol web3

Last synced: 17 Sep 2025

https://github.com/nikhilash45/power-bi-vsualisation-of-joins

In this Power BI report, users can visualize joins themselves, which makes joins easier to understand.

business-analytics business-intelligence data data-analysis data-visualization joins powerbi sql visualization

Last synced: 19 Mar 2026

https://github.com/stdlib-js/array-base-assert-is-complex-floating-point-data-type

Test if an input value is a supported array complex-valued floating-point data type.

array assert base check data dtype is javascript node node-js nodejs stdlib test types util utilities utility utils valid validate

Last synced: 14 Feb 2026

https://github.com/ajsalemo/python-pandas-datalib

Testing and experimenting with some simple Pandas functionality using Flask to serve the parsed data.

csv data flask json pandas pandas-dataframe pandas-series python tabular tabular-data terminal

Last synced: 09 Apr 2026

https://github.com/jopanel/factual-scraper

Data scraper for Factual v2 API

data

Last synced: 15 Feb 2026

https://github.com/neomutt/sample-data

📚 Lists of things. Useful for developing and testing.

data list sample

Last synced: 19 Mar 2026

https://github.com/m-rishab/stock_trend-analysis-power-bi-project-

In this project, I've harnessed the robust capabilities of Power BI to analyse, visualize, and uncover the story behind HUL's stock performance.

data datavisualization datavisualization-project powerbi

Last synced: 19 Mar 2026

https://github.com/luminati-io/twitter-x-dataset-samples

A sample dataset of over 1000 Twitter (X) posts, extracted using the Bright Data API, ideal for trend discovery, brand monitoring, and competitive insights.

api data dataset twitter twitter-api twitter-scraper web-scraping x

Last synced: 19 Mar 2026

https://github.com/linx-software/file-import-to-rest-api

Import a CSV file and make the data available via a REST API.

csv data linx low-code

Last synced: 19 Mar 2026

https://github.com/chandraprakash-bathula/keywords_prediction-machine-learning-integration

Keyword prediction model, built in these steps: data cleaning, removing stopwords, constructing Word2vec, and advancing to TF-IDF weighted Word2vec.

algori artifici data machine-learning tf-idf weighted-word2vec word2vec

Last synced: 08 Nov 2025

https://github.com/stdlib-js/ndarray-slice-dimension

Return a read-only view of an input ndarray when sliced along a specified dimension.

copy data javascript matrix ndarray node node-js nodejs select slice stdlib structure types vector view

Last synced: 01 Mar 2026

https://github.com/skywardai/paper_gallery

A gallery of papers on applying LLM capabilities to datasets

ai data data-science llm medicine neural-network research security

Last synced: 19 Mar 2026

https://github.com/mohamedhany99/human-voice-identifier-counter

An application developed in Kivy that identifies the users imported into the dataset, based on a support vector machine training model. It has two features: importing new voices, and detection, which detects human voices and counts them.

android android-app android-application automation automation-framework data data-analysis data-mining data-science data-visualization datascience kivy kivy-framework machine-learning python

Last synced: 27 Mar 2026