data
Individual facts, statistics, or items of information, often numeric. In a technical sense, data are a set of values of qualitative or quantitative variables about one or more persons or objects. (https://en.wikipedia.org/w/index.php?title=Data&oldid=1093674723, released under CC BY-SA 3.0)
- GitHub: https://github.com/topics/data
- Wikipedia: https://en.wikipedia.org/wiki/Data
- Related Topics: datum
- Last updated: 2026-07-02 00:07:45 UTC
https://github.com/divithraju/divith-aju-hadoop-pyspark-pipeline
This project demonstrates the creation of a scalable data processing pipeline for handling and analyzing log data from a hypothetical e-commerce platform. Leveraging Hadoop and PySpark, the pipeline is designed to process large volumes of log files, providing meaningful insights into user behavior, system performance, and sales metrics.
apache-hadoop-framework apache-spark bigdata client data database dataengineering dataingestionframework datapreprocessing documentation ecommerce-platform hdfs pipeline project project-repository pyspark python3 software-engineering
Last synced: 27 Jan 2026
https://github.com/rambodrahmani/covid19-behind-the-numbers
COVID-19: Behind the Numbers.
apriori-algorithm apriori-algorithm-python clustering clustering-algorithm clustering-analysis covid covid-19 covid19-data data data-mining data-science datamining fpgrowth machine-learning machine-learning-algorithms python python-machine-learning
Last synced: 20 Aug 2025
https://github.com/nesterenko-kv/object-id
ObjectIDs are a special type of identifier mainly used in MongoDB to uniquely identify documents within a collection. They consist of a 12-byte binary value that includes a timestamp, a machine identifier, a process identifier, and a counter.
c-sharp data id net object-id unique-identifier
Last synced: 16 May 2025
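The 12-byte layout described above can be sketched as follows. This is only an illustration, not code from the object-id repository: the field widths (4-byte timestamp, 3-byte machine identifier, 2-byte process identifier, 3-byte counter) are an assumption based on the classic ObjectID layout, and the function names are hypothetical.

```python
import struct


def pack_object_id(timestamp: int, machine: bytes, pid: int, counter: int) -> bytes:
    """Pack the four fields into 12 bytes (assumed 4 + 3 + 2 + 3 layout)."""
    return (
        struct.pack(">I", timestamp)             # 4-byte big-endian seconds since epoch
        + machine[:3].ljust(3, b"\x00")          # 3-byte machine identifier, zero padded
        + struct.pack(">H", pid & 0xFFFF)        # 2-byte process identifier
        + (counter & 0xFFFFFF).to_bytes(3, "big")  # 3-byte counter
    )


def unpack_object_id(raw: bytes) -> tuple:
    """Split a 12-byte value back into (timestamp, machine, pid, counter)."""
    if len(raw) != 12:
        raise ValueError("an ObjectID must be exactly 12 bytes")
    timestamp = struct.unpack(">I", raw[:4])[0]
    machine = raw[4:7]
    pid = struct.unpack(">H", raw[7:9])[0]
    counter = int.from_bytes(raw[9:12], "big")
    return timestamp, machine, pid, counter
```

Because the timestamp occupies the leading bytes, identifiers packed this way sort roughly by creation time when compared as raw bytes.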
https://github.com/arif-miad/titanic-analysis
artificial-intelligence data data-science deep-neural-networks
Last synced: 09 Jun 2026
https://github.com/unicef/magicbox-download-shapefiles
Downloads shapefiles for each country from gadm.org and unzips them.
data data-science docker downloads-shapefiles emergency-response gadm geospatial geospatial-data humanitarian javascript magicbox nodejs shapefile unicef
Last synced: 02 May 2026
https://github.com/ozlerhakan/eda
Exploratory Data Analysis Samples
data data-analysis data-virtualization eda exploratory-data-analysis matplotlib plotly python seaborn
Last synced: 16 Apr 2026
https://github.com/suryavamsi-p/conflict-nlp-topic-modeling-sentiment-analysis-using-llms
Extracts insights from 26K+ protest events using BERTopic, Top2Vec, and LLMs for real-world applications like crisis monitoring, policy research, and social unrest analysis.
all-mpnet-base-v2 bertopic conflict-data data data-science lda llama2 llms machine-learning mistral-7b nlp nltk protest-analysis pyldavis python3 top2vec topic-modeling transformers visualization
Last synced: 11 May 2026
https://github.com/undistraction/grid-model
A small API for creating a grid and accessing the positions of the cells, rows and columns within it.
2d calculations cells data grid layout model
Last synced: 04 Aug 2025
https://github.com/jcasbin/jcasbin-menu-permission
Casbin Menu Permission Example (Based on jCasbin)
abac acl auth authorization authz casbin data go java jcasbin menu permission rbac spring springboot
Last synced: 11 Jul 2025
https://github.com/izaaccoding36/dados-dinamicos
This repository presents a website built with an API for creating charts, reporting on social media usage on a global scale.
api data redes-sociais social-media website
Last synced: 26 Mar 2025
https://github.com/mtingers/opacify
Opacify reads a file and builds a manifest of external sources to rebuild said file.
backup data obfuscation python
Last synced: 18 May 2026
https://github.com/ahmad-ali-rafique/comment-generation-tool
This repository hosts a Jupyter Notebook-based Comment Generation Tool exploring advanced NLP techniques for automated, contextually relevant comment generation from input data. Ideal for developers and researchers in NLP and automated text generation.
ai aitools artificial-intelligence content-based-recommendation data datascience jupyter-notebook machine-learning
Last synced: 07 Oct 2025
https://github.com/shuklayash02/excel_complete_vrindastore_dataanalysis
Complete analysis: data cleaning, processing, and data analysis with an interactive dashboard.
analysis data data-visualization datacleaning excel excel-vba
Last synced: 19 Mar 2026
https://github.com/makepath/medaprep
medaprep is a data preparation and feature engineering toolkit for geospatial applications.
data data-science datacleaning eda exploratory-data-analysis xarray
Last synced: 29 Jun 2025
https://github.com/double-o-z/powershell-json-lightweight-serializer-deserializer
Simple PowerShell functions to convert to and from JSON. Very lightweight, supported on every PowerShell version. No dependencies.
convert converter data data-science deserialize json lightweight powershell serializer
Last synced: 04 May 2026
https://github.com/jrmedd/emojinal
An experimental API for determining emoji sentiment, based on research from Institut "Jožef Stefan", Slovenia.
data emojis sentiment user-research ux
Last synced: 19 Jan 2026
https://github.com/saroshfarhan/kaggle-playground-s4e12
Kaggle competition first attempt
analytics data data-analysis-python data-science
Last synced: 12 Oct 2025
https://github.com/simranjeet97/leetcode_practice
Practicing LeetCode problems for competitive programming
algorithms amazon coding competitive-programming data data-structures facebook google leetcode python
Last synced: 03 Aug 2025
https://github.com/ishaansathaye/data40x-1_2_3
Fall 2025 Cal Poly Data 401 Data Science Process and Ethics, 402 Mathematical Foundations of Data Science, 403 Projects Lab
capstone-prep data data-science ethics lab python
Last synced: 04 May 2026
https://github.com/alja7dali/swift-bits
A bite-sized library for dealing with bytes.
binary bit bits byte bytes comprehension data manipulation swift
Last synced: 09 Jun 2026
https://github.com/seabbs/estzoonotictb
Explore, Visualise and Estimate the Global Zoonotic Tuberculosis Burden
bovine-tb data estimation package rstats tuberculosis visualisation zoonotic-tb
Last synced: 28 Feb 2026
https://github.com/aranfononi/h4x0r-news-section-17-project
A SwiftUI-powered app that displays top stories from Hacker News. Users can open articles directly within the app, utilizing SwiftUI’s NavigationLink and custom WebView integration.
app-development data data-binding data-binding-library ios swift swiftui xcode
Last synced: 18 May 2026
https://github.com/perceptronv/miscellaneous
A huge variety of materials, mostly training data for AI. Not a lot of source code yet.
data gan machine-learning nlp text-generation
Last synced: 04 May 2026
https://github.com/kucingkode/dmerge
Small JavaScript library to help you merge identically formatted data in a string
cithak data data-merge javascript library lightweight lightweight-javascript-library merge open-source
Last synced: 04 May 2026
https://github.com/kenmwaura1/nuvo-data-cleaning-functions
Collection of scripts and functions to clean and preprocess data using Nuvo SDK.
Last synced: 04 May 2026
https://github.com/pferreirafabricio/data-immersion
🏊🏻‍♂️ Activities and exercises from the 'Imersão Dados' event
data data-analysis data-science dataset jupiter-notebook python
Last synced: 14 May 2026
https://github.com/flowsynx/plugin-json
FlowSynx plugin that loads and parses local JSON files. Supports transformation, extraction, and mapping of hierarchical data structures in workflows.
data data-platform flowsynx json
Last synced: 10 Mar 2026
https://github.com/n0nag0n/flee-intercom
For those of you who like to keep your money after Intercom jacks up the prices year after year, but want to keep an export of your data.
again-and-again api data database export exporter flee high-prices intercom mysql php price run save saver year-over-year
Last synced: 09 May 2026
https://github.com/rapter1990/data-visualization-examples
Data Visualization Examples
data data-analysis data-visualization folium matplotlib plot plotly python seaborn visualization
Last synced: 13 Apr 2026
https://github.com/diegoperea20/own_dataset_segmentation_yolov8
Object segmentation and detection with YOLOv8 on a custom dataset of a Colombian 200-peso coin from 2023.
coins colombia data opencv own python segmentation tensorflow yolov8
Last synced: 12 Apr 2026
https://github.com/stdlib-js/array-base-every-by-right
Test whether all elements in an array pass a test implemented by a predicate function, iterating from right to left.
all array data every generic javascript node node-js nodejs predicate stdlib structure test types validate
Last synced: 13 Feb 2026
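The right-to-left test described above can be sketched in Python as follows. This is not the stdlib-js implementation; the function name and the single-argument predicate are assumptions made for illustration.

```python
def every_by_right(items, predicate):
    """Return True if every element passes `predicate`, visiting elements from last to first."""
    for i in range(len(items) - 1, -1, -1):
        if not predicate(items[i]):
            # Stop at the first failure found from the right.
            return False
    # Reached only if no element failed (including the empty case).
    return True
```

The result is the same as a left-to-right check; the iteration order only matters when the predicate has side effects or when a failing element near the end lets the loop exit early.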
https://github.com/colour-science/colour-checker-detection-tests-datasets
Colour - Checker Detection - Tests Datasets
color color-checker color-science color-space color-spaces colorspace colorspaces colour colour-checker colour-science colour-space colour-spaces colourspace colourspaces data dataset datasets raw
Last synced: 19 Mar 2026
https://github.com/obsidianplusplus/5e_play_cs-go
Python tool to analyze your 5EPlay CS:GO match data. Fetches, analyzes, filters, and exports.
5eplay analysis api automation csgo data esports excel json match pandas performance player python reporting scraping stats team
Last synced: 13 Feb 2026
https://github.com/frictionlessdata/cardealerdp
Cardealer DP (Car Dealer Data Package) is a data exchange format for car dealerships. It is developed on top of the Data Package standard
car data datapackage dealer exchange extension format
Last synced: 13 Feb 2026
https://github.com/dkosarevsky/db_cp
DB course project
data database db postgres postgresql postgresql-database postgressql
Last synced: 05 May 2026
https://github.com/frictionlessdata/extensiondp
Extension DP (Data Package Extension Template) is a Git repository template for rapid Data Package extension development
data datapackage exchange extension format
Last synced: 13 Feb 2026
https://github.com/jaldekoa/fiscaldataapi
A Python wrapper to easily retrieve data from the Fiscal Data (US Treasury) official API in pandas format.
api api-wrapper banking data finance pandas python united-states
Last synced: 27 Jan 2026
https://github.com/vagnerbellacosa/029_analisededadoscompythonpandas
This lab introduces the Pandas library, an open-source Python library for data analysis. It gives Python the ability to work with spreadsheet-like data, making it possible to load, manipulate, and combine data quickly, among other functions.
data digital-innovation-one dio jupiter-notebook labs ms-excel panda python
Last synced: 14 May 2026
https://github.com/keanteng/nextjs-directory
🌐 A Draft Website For Data Catalogue Using Next.js
catalogue climate-change css data directory html javascript nextjs website
Last synced: 09 May 2026
https://github.com/igorskyflyer/npm-adblock-header-extract
✂️ Parse and extract ad-block filter list headers with ease. Works on strings or files, trims whitespace, and returns clean metadata for tooling and automation. 📃
adblock back-end biome data filter header igorskyflyer javascript js metadata node nodejs npm string ts typescript utility
Last synced: 11 Mar 2026
https://github.com/simonbernarding/ml_project_simonbernarding
This project focuses on predicting flight delays using historical data from a Tunisian airline. We analyzed patterns in airport operations and flight schedules to build a machine learning model that can forecast potential delays.
data data-science flight-delay-prediction machine-learning machinelearning prediction
Last synced: 12 Oct 2025
https://github.com/codenoid/webtoons.com-database
A Webtoons.com database, collected by Hofesh Bot (scraper)
Last synced: 28 Mar 2025
https://github.com/michalwols/awesome-data-curation
🗑️ ✨ 📊 Awesome things related to data collection, annotation, cleaning and management.
active-learning annotation cleaning-data data data-science deep-learning machine-learning
Last synced: 24 Jun 2026
https://github.com/nikhilash45/power-bi-vsualisation-of-joins
In this Power BI report, users can visualize joins themselves, which makes joins easy to understand.
business-analytics business-intelligence data data-analysis data-visualization joins powerbi sql visualization
Last synced: 19 Mar 2026
https://github.com/yord/klp-dsv
A delimiter-separated values plugin for klp (Kelpie), the small, fast, and magical command-line data processor.
csv data deserializer dsv json kelpie klp marshaller parser serializer ssv tsv
Last synced: 14 May 2026
https://github.com/freddy03h/immutable-data-structure
Normalize and Merge your application's data store using Immutable.JS objects
Last synced: 05 Oct 2025
https://github.com/blacksujit/shikshamitra
Shiksha Mitra is an innovative MVP designed to reshape the way students learn through gamification. Our platform transforms the traditional approach to education by making learning engaging, interactive, and rewarding. As an MVP, Shiksha Mitra focuses on delivering core features that showcase the value of gamified learning,
ai data gamified-learning hackathon lms ml mlflow mlops mlops-workflow mvp pipeline platforn
Last synced: 28 Feb 2026
https://github.com/jmcanterafonseca/leaflet-context-information
A Leaflet plugin + infrastructure for getting access to Context Information (i.e. data) exposed through FIWARE NGSIv2
context data fiware information leaflet map open visualization web
Last synced: 21 Apr 2026
https://github.com/neelravi/data-management
A data management plan for computational chemists/physicists and material scientists for a FAIR storage of raw data
data dmp fair management workflows
Last synced: 16 Jan 2026
https://github.com/6km/islamic-data-repository
Islamic Data Repository - a list of resources that may help programmers develop applications and websites.
data fonts hadeeth json quran quran-json
Last synced: 06 May 2026
https://github.com/mickfrog/uace-analysis
UACE ANALYSIS FOR 2011 - 2015
data data-science data-visualization folium-maps geocoder jupyter-notebook pandas python3
Last synced: 14 Feb 2026
https://github.com/montanaz0r/imdb-ratings-auto-inserter
A Python script that enables auto-inserting movie ratings into the IMDB profile.
data data-science dataanalysis imdb movies pandas pandas-dataframe python3 selenium selenium-webdriver webscraping
Last synced: 07 May 2026
https://github.com/dylanhogg/cloud-products
A package for getting cloud products and product descriptions from a cloud provider website.
aws cloud-products crawler data text-processing
Last synced: 05 Oct 2025
https://github.com/mattqdev/koalaz
Why not use koalas as mock data? With this npm package you can!
data koala lorem-ipsum meme mock placeholder
Last synced: 13 Jan 2026
https://github.com/sivas-2/coffee-sales-visualization
This repository contains data visualization scripts and notebooks analyzing coffee sales data from a vending machine, sourced from Kaggle. The visualizations explore sales trends, customer preferences, and product popularity over time.
data data-analysis data-science data-visualization python visualization
Last synced: 07 May 2026
https://github.com/chompfoods/stub-python-flask
Flask (Python) server stub for the Chomp Food & Recipe Database API. Use our API to get high-quality data on recipes and 875,000+ branded/grocery foods plus raw ingredients.
api branded chomp data database flask flask-server food grocery ingredients nutrition python raw recipe-api recipes server stub stub-server
Last synced: 07 May 2026
https://github.com/ishanoshada/matplot3dex
A Matplotlib 3D Extension package for enhanced data visualization
data data-science matplotlib python-packages scikit-learn
Last synced: 05 Jan 2026
https://github.com/bileljegham/api-sport-cli
CLI for https://api-sports.io/. Retrieves data and converts it to an SQL file.
cli data database match nodejs sports sports-analytics
Last synced: 08 May 2026
https://github.com/garcane/income-prediction-ml
This is a machine learning project aimed at predicting whether an individual's annual income exceeds $50,000 based on their demographic and personal information.
data data-science machine-learning ml numpy pandas python random-forest scikit-learn
Last synced: 08 Apr 2026
https://github.com/husna-poyraz/titanic-machine-learning
Use machine learning to create a model that predicts which passengers survived the Titanic shipwreck.
data data-analysis data-science data-visualization deep-learning machine-learning missing-data outlier-detection python titanic
Last synced: 10 May 2026
https://github.com/neomutt/sample-data
📚 Lists of things. Useful for developing and testing.
Last synced: 19 Mar 2026
https://github.com/genert/metis
Asynchronous data sender library
analytics asynchronous data dependency-free typescript
Last synced: 27 Jan 2026
https://github.com/fiddlydigital/fastmap
A simple 2D map that is optimized for speed.
Last synced: 23 Oct 2025
https://github.com/lemniscate-world/stratai
This project analyzes financial assets using a Hidden Markov Model (HMM) to identify different market regimes and patterns. The analysis includes calculating daily returns, rolling volatility, and volume changes, and visualizing the hidden states identified by the HMM.
ai assets data data-science data-visualization finance financial-analysis fintech hmm-model hmmlearn machine-learning trading
Last synced: 23 Oct 2025
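The feature calculations named in the stratai description (daily returns and rolling volatility) can be sketched in plain Python as below. This is only an illustration under stated assumptions: the function names, the window size, and the choice of sample standard deviation are hypothetical and not taken from the repository, and the HMM fitting step itself is not shown.

```python
from statistics import stdev


def daily_returns(prices):
    """Simple return between consecutive closing prices: p[t] / p[t-1] - 1."""
    return [curr / prev - 1.0 for prev, curr in zip(prices, prices[1:])]


def rolling_volatility(returns, window):
    """Sample standard deviation of returns over a sliding window (assumed definition)."""
    if window < 2:
        raise ValueError("window must cover at least two returns")
    return [
        stdev(returns[i - window + 1 : i + 1])
        for i in range(window - 1, len(returns))
    ]
```

The same ratio formula used in `daily_returns` could be applied to a volume series to obtain volume changes; the resulting columns would then serve as observations for a regime model.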
https://github.com/cisagov/cyhy-feeds
Tools to create and retrieve Cyber Hygiene (CyHy) data extracts
Last synced: 23 Oct 2025
https://github.com/garcane/global-shipping-analytics-dashboard
This Tableau project provides a comprehensive visual analysis of global sales, shipping costs, and quality metrics across different regions and countries.
data data-analysis data-analyst data-visualization metrics tableau
Last synced: 01 Mar 2026
https://github.com/stdlib-js/array-base-none-by-right
Test whether all elements in an array fail a test implemented by a predicate function, iterating from right to left.
all array data every generic javascript node node-js nodejs none predicate stdlib structure test types validate
Last synced: 01 Mar 2026
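A Python sketch of the "none pass" check described above, again iterating from right to left. It is not the stdlib-js code; the function name and the single-argument predicate are assumptions for illustration.

```python
def none_by_right(items, predicate):
    """Return True if every element fails `predicate`, visiting elements from last to first."""
    for i in range(len(items) - 1, -1, -1):
        if predicate(items[i]):
            # One passing element is enough to answer False.
            return False
    # No element passed (this includes the empty case).
    return True
```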
https://github.com/whitehathackerpr/data-visualization-tool
This is a Python-based web application that allows users to upload datasets, analyze data, and create visualizations interactively. The tool is designed for ease of use and provides a simple interface to perform basic data analysis and generate visualizations
data data-analysis data-visualization python python3
Last synced: 05 Sep 2025
https://github.com/tylerben/data-spring
Easily generate a dummy dataset based on a provided config
data data-spring datagenerator fake-data generator javascript typescript
Last synced: 27 May 2026
https://github.com/dicook/tutorial_effective_data_plots
Materials for WOMBAT 2024 tutorial
data graphics inference statistics tidyverse visualisation
Last synced: 23 Jan 2026
https://github.com/thiagopanini/datadelivery
An open-source Terraform module that provides a complete infrastructure toolkit so users can begin exploring Analytics services on AWS.
analytics athena aws catalog crawler data datamesh glue s3 terraform
Last synced: 29 Nov 2025
https://github.com/melinteflxrin/softserve-bigdata-project
End-to-end data warehousing project integrating APIs, ETL workflows, and PostgreSQL for analytics and reporting.
analytics api bigdata data datawarehousing externalapi pipeline postgres postgresql python warehouse
Last synced: 26 Jan 2026
https://github.com/bolajiolayinka/graph-api-automation
An End to End Automation from Facebook Business to Data Visualization of Campaigns
Last synced: 07 May 2025
https://github.com/nikoshet/rust-dms-cdc-operator
The rust-dms-cdc-operator is a Rust-based utility for comparing the state of a list of tables in an Amazon RDS database with data stored in Parquet files on Amazon S3, particularly useful for change data capture (CDC) scenarios.
aws cdc data dms parquet pgdatadiff polars postgres rds rust s3 validation
Last synced: 18 Jan 2026
https://github.com/avahoffman/dataplay
🤸‍♂️ Load data to play with
data data-package r r-package rstats
Last synced: 25 Mar 2025
https://github.com/rishabh-agarwal/datastructuremachineproblem
Data Structure MP - Clemson University (Language C)
273 alogrithms clemson data ece structure university
Last synced: 26 Oct 2025
https://github.com/efler/microservice-data-bus
Data bus based on Apache Kafka and consisting of separate components [copied from own private repos]
data data-bus deduplication enrichment filtering kafka microservice mongodb postgresql redis
Last synced: 16 Apr 2026
https://github.com/purarue/git_doc_history
copy/track file history in git, with python bindings to traverse and extract history/files/lines at some date
Last synced: 17 May 2026
https://github.com/eve-ning/osumania_data
processed osu!mania data from osu!API
Last synced: 24 Feb 2026
https://github.com/stdlib-js/ndarray-base-to-reversed
Return a new ndarray where the order of elements of an input ndarray is reversed along each dimension.
base data flip javascript matrix ndarray node node-js nodejs reverse slice stdlib structure to-reversed types vector view
Last synced: 12 Apr 2026
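Reversing along each dimension, as described above, can be illustrated with nested Python lists standing in for an ndarray. This is a sketch only: stdlib-js works on its own ndarray type, and the function name here is hypothetical.

```python
def to_reversed(nested):
    """Return a new nested list with the element order reversed along every dimension."""
    if isinstance(nested, list):
        # Reverse this dimension, then recurse into each sub-list.
        return [to_reversed(item) for item in reversed(nested)]
    # Scalars are returned unchanged.
    return nested
```

For a 2-by-3 input, both the row order and the order within each row are flipped, and the input itself is left unmodified.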
https://github.com/lmuffato/project-mysql-vocabulary-booster-trybe
MySQL vocabulary booster project - Trybe graded project for Block 20: SQL Functions, Joins, and Subqueries
back-end crud data database mysql mysqlworkbench query sql trybe-projects
Last synced: 10 May 2026
https://github.com/stdlib-js/array-one-to-like
Generate a linearly spaced numeric array whose elements increment by 1 starting from one and having the same length and data type as a provided input array.
array data float32array float64array int16array int32array javascript matrix ndarray node node-js nodejs stdlib structure typed typed-array types uint32array vector
Last synced: 20 Feb 2026
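The "one-to-like" behaviour described above can be approximated in Python with the standard `array` module, where the type code plays the role of the data type. This is a sketch under that assumption, not the stdlib-js API; the function name is hypothetical.

```python
from array import array


def one_to_like(x):
    """Return 1, 2, ..., len(x) in a new array with the same type code as `x`."""
    # Reusing x.typecode keeps the element type; len(x) fixes the length.
    return array(x.typecode, range(1, len(x) + 1))
```

Calling it with a three-element array of type code `'d'` yields a float array holding 1.0, 2.0, 3.0, while an integer input yields integers.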
https://github.com/agavitalis/sample-c-codes
A collection of small projects I carried out on Arduino as an electronic engineering student, despite falling in love with website development.
ageteller atm binary data gpcalculator logging
Last synced: 09 Apr 2025
https://github.com/devlive-community/mockaroo
A lightweight HTTP mock server for quickly building mock data APIs, suited to front-end and back-end development and API testing scenarios.
Last synced: 08 Jul 2025
https://github.com/vvipjain/bike-sales-dashboard
Bike Sales Dashboard
dashboards data data-analysis data-cleaning data-normalisation data-visualization excel pivot-chart pivot-tables
Last synced: 04 Feb 2026
https://github.com/flowsynx/plugin-csv
FlowSynx plugin that reads and writes CSV files, enabling easy batch data import/export operations and integration with spreadsheet-based data workflows.
comma-separated-values csv data data-platform flowsynx
Last synced: 10 Mar 2026