data
Individual facts, statistics, or items of information, often numeric. In a technical sense, data are a set of values of qualitative or quantitative variables about one or more persons or objects. (https://en.wikipedia.org/w/index.php?title=Data&oldid=1093674723, released under CC BY-SA 3.0)
- GitHub: https://github.com/topics/data
- Wikipedia: https://en.wikipedia.org/wiki/Data
- Related Topics: datum
- Last updated: 2026-06-20 00:07:41 UTC
https://juliaearth.github.io/geospatial-data-science-with-julia/
Geospatial Data Science with Julia
book computational data geo geometry geospatial geostatistics julia science statistics
Last synced: 07 May 2025
https://github.com/leeper/data-versioning
Collecting thoughts about data versioning
data data-citation data-versioning metadata unf version-control
Last synced: 15 Feb 2026
https://github.com/robjhyndman/fpp2
All data sets required for the examples and exercises in the book "Forecasting: principles and practice" (2nd ed, 2018) by Rob J Hyndman and George Athanasopoulos <http://OTexts.org/fpp2/>. All packages required to run the examples are also loaded.
Last synced: 04 Mar 2026
https://github.com/anitaa1990/wifi-connect
A library project to connect two devices using Wifi-Direct
android android-library android-sdk broadcastreceiver data java receiver sender share wifi-connection wifi-direct wifip2pconnectioncallback-interface wrapper
Last synced: 26 Jun 2025
https://github.com/International-Data-Spaces-Association/idsa
This is the main repository of International Data Spaces Association on GitHub, where you can find general overview and useful information on IDS Landscape.
cybersecurity data data-sharing data-spaces dataeconomy dataexchange datasharing datasovereignty dataspace
Last synced: 04 Apr 2025
https://github.com/hyriver/hyriver.github.io
A Python software stack for retrieving hydroclimate data from web services.
climate data hydrology python webservice
Last synced: 04 Oct 2025
https://github.com/datainsider-co/rocket-bi
A free, open-source, web-based self-service BI tool tailor-made for ClickHouse, Google BigQuery, MySQL, PostgreSQL, and Vertica
analytics bigdata bigquery bussiness-intelligence clickhouse dashboard data etl hacktoberfest hacktoberfest2023 ingestion mysql postgresql vertica
Last synced: 05 Apr 2025
https://github.com/robjhyndman/fpp2-package
All data sets required for the examples and exercises in the book "Forecasting: principles and practice" (2nd ed, 2018) by Rob J Hyndman and George Athanasopoulos <http://OTexts.org/fpp2/>. All packages required to run the examples are also loaded.
Last synced: 13 Jul 2025
https://github.com/christophhagen/binarycodable
A binary encoder for Swift Codable types
binary codable data decoder decoding encoder encoding protobuf protobuf3 protocol-buffers serialization swift
Last synced: 07 Sep 2025
https://github.com/anna-geller/prefect-deployment-patterns
Code examples showing flow deployment to various types of infrastructure
automation aws data data-engineering data-engineering-infrastructure data-engineering-pipeline data-engineering-team data-products data-science dataflow dataflow-ops orchestration pipeline prefect python serverless serverless-framework
Last synced: 22 Jun 2025
https://github.com/scienceverse/faux
R functions for simulating factorial datasets
Last synced: 21 Feb 2026
https://github.com/juliacomputing/tableview.jl
A Tables.jl compatible table viewer based on ag-grid
Last synced: 02 Sep 2025
https://github.com/ahmetfurkandemir/data-engineering-project-with-hdfs-and-kafka
Data Engineering Project with Hadoop HDFS and Kafka
data data-engineer data-engineering data-engineering-pipeline docker docker-compose hadoop hadoop-filesystem hadoop-hdfs hdfs hdfs-client hdfs-dfs kafka kafka-consumer kafka-producer kafka-ui kafkaui pipline python python-hdfs-client
Last synced: 15 Apr 2025
https://github.com/pfython/cleverdict
A JSON-friendly data structure which allows both object attributes and dictionary keys and values to be used simultaneously and interchangeably.
alias attributes auto-save data dictionary keyword object orm
Last synced: 10 Apr 2025
https://github.com/centerforopenscience/share
SHARE is building a free, open data set about research and scholarly activities across their life cycle.
data elasticsearch harvest-data metadata openscience python scholarly-communication science
Last synced: 05 Apr 2025
https://github.com/debruine/faux
R functions for simulating factorial datasets
Last synced: 28 Aug 2025
https://github.com/bredele/datastore
🐹 Bloat-free and flexible interface for data store and database access.
async asynchronous data database database-access datastore model store
Last synced: 13 Apr 2025
https://github.com/itachi-uchiha581/auto-data
Auto Data is a library designed for quick and effortless creation of datasets tailored for fine-tuning Large Language Models (LLMs).
ai data finetuning-large-language-models finetuning-llms generative-ai llm llm-training python python3
Last synced: 20 Sep 2025
https://github.com/ERDDAP/erddap
ERDDAP is a scientific data server that gives users a simple, consistent way to download subsets of gridded and tabular scientific datasets in common file formats and make graphs and maps. ERDDAP is a Free and Open Source (Apache and Apache-like) Java Servlet from NOAA NMFS SWFSC Environmental Research Division (ERD).
data environmental erddap noaa scientific server
Last synced: 08 May 2025
https://github.com/ecsim/pem-dataset1
Proton Exchange Membrane (PEM) Fuel Cell Dataset
activation-procedure chemistry data data-science dataset electrochemistry energy fuel-cell impedance mea nafion open-science open-source pem physics polarization power proton-exchange-membrane science science-research
Last synced: 15 May 2025
https://github.com/purarue/google_takeout_parser
A library/CLI tool to parse data out of your Google Takeout (History, Activity, YouTube, Locations, etc.)
backup data export google google-location-history google-takeout
Last synced: 14 Jun 2025
https://github.com/zbrookle/dataframe_sql
A Python package that parses SQL and interprets it as methods that act upon existing pandas (or other types of) DataFrames that have been declared and registered
data dataframes pandas python sql
Last synced: 20 Aug 2025
https://github.com/1n3/powerexfil
A collection of data exfiltration scripts for Red Team assessments.
data exfil exfiltration hacking powershell redteam redteaming script scripts tool tools
Last synced: 08 Aug 2025
https://github.com/ralyodio/humanparser
Parse a human name string into salutation, first name, middle name, last name, suffix.
data es6 javascript parsing scraping
Last synced: 02 Apr 2026
https://github.com/saschagobel/legislatoR
Interface to the Comparative Legislators Database
data dataset legislators parliament political-science politicians politics r wikipedia
Last synced: 13 Jul 2025
https://github.com/bukalapak/ktpextractor
A service that takes a KTP image as input and extracts the data in the KTP as output. It is part of an open source project by the data scientists of Bukalapak.
Last synced: 01 Aug 2025
https://github.com/jason89521/daxus
Daxus is a server state management library for React that provides full control over data, leading to a better user experience.
cache data dedupe hook react revalidate server-state-management user-experience
Last synced: 23 Jun 2025
https://github.com/geostatsguy/geodatasets
Synthetic datasets for geoscience (geo)statistical modeling
Last synced: 26 Oct 2025
https://github.com/blueapron/kafka-connect-protobuf-converter
Protobuf converter plugin for Kafka Connect
data data-platform jar kafka kafka-connect protobuf protobuf-converter protocol-buffers
Last synced: 03 May 2025
https://github.com/gabrieldim/advanced-programming
Generic programming, generic classes, maps, sets, abstract data types and so on.
abstarct class data data-type data-types generic generic-programming generics interface interfaces map set
Last synced: 10 Jul 2025
https://github.com/vr-25/migrator
A backup solution and data migration utility for Android
android appdata backup data factoryreset magisk migation migrate titaniumbackup
Last synced: 08 Jul 2025
https://github.com/neurosnap/cofx
A node and javascript library that helps developers describe side-effects as data in a declarative, flexible API.
asynchronous cofx data javascript node promise side-effects yield
Last synced: 14 Apr 2025
https://github.com/anthonybudd/S4
S4 is 100% S3 compatible storage, accessed through Tor and distributed using IPFS.
data docker ipfs object-storage s3 s4 storage
Last synced: 07 Apr 2025
https://github.com/manifoldfinance/mev-corpus
MEV Data Corpus
blockchain corpus data ethereum flashbots mev miner-extracted-value
Last synced: 21 Jan 2026
https://github.com/andreiduca/use-async-resource
A custom React hook for simple data fetching with React Suspense
async cache custom-hook data data-fetching fetch hooks react react-cache react-hook react-hooks react-suspense reactjs suspense
Last synced: 12 Jan 2026
https://github.com/anthonybudd/s4
S4 is 100% S3 compatible storage, accessed through Tor and distributed using IPFS.
data docker ipfs object-storage s3 s4 storage
Last synced: 12 Apr 2025
https://github.com/dagshub/client
DagsHub client libraries
ai data data-science data-streaming dvc hacktoberfest hacktoberfest2023 keras machine-learning machinelearning mlops python pytorch tensorflow
Last synced: 16 May 2025
https://github.com/mattphillips/jest-each
A parameterised testing library for Jest. https://www.npmjs.com/package/jest-each 🏃
data each jest parameterised test
Last synced: 13 Apr 2025
https://github.com/jobehi/isthistechdead
The place where your favourite framework will be resting
Last synced: 19 Jun 2025
https://github.com/synthesized-io/fairlens
Identify bias and measure fairness of your data
bias data data-analysis data-science fairness ml pandas python statistics
Last synced: 24 Jun 2025
https://github.com/adzz/data_schema
Declarative schemas for data transformations.
data data-parsing elixir functional-programming types validation
Last synced: 20 Jul 2025
https://github.com/yuxqiu/modern-poetry
The most comprehensive database of modern Chinese poetry and foreign poetry 最全的中国近现代诗以及外国诗数据库
data json poems poetry translation
Last synced: 16 Jan 2026
https://github.com/nucs/cryptocurrency-ticks-data
590 days of trade ticks on BTC/ETH/LTC/NEO to USDT
Last synced: 12 Feb 2026
https://github.com/mhanberg/schematic
📐 schematic
data elixir specification validation
Last synced: 12 Apr 2025
https://github.com/googlecloudplatform/dlp-dataflow-deidentification
Multi Cloud Data Tokenization Solution By Using Dataflow and Cloud DLP
beam bigquery data dataflow dlp pii tokenization
Last synced: 11 Apr 2025
https://github.com/contextdata/vectoretl
Build super simple end-to-end data & ETL pipelines for your vector databases and Generative AI applications
cohere data datapipeline etl etl-framework etl-pipeline openai pinecone python qdrant qdrant-vector-database unstructured vector-database weaviate
Last synced: 09 Apr 2025
https://github.com/tmcw/simpleopendata
simple guidelines for publishing open data in useful formats
copleft copyright data formats government licensing open
Last synced: 03 Mar 2026
https://github.com/empower-ai/sql-agent
AI agent that helps you do data analytics with natural language.
analytics bigquery chatgpt chatgpt-bot data data-analytics data-science mysql postgresql slack slack-bot slackbot
Last synced: 11 Apr 2025
https://github.com/phyphox/phyphox-arduino
The phyphox BLE library to connect Arduino projects with the phyphox app to display data on the phone or use the phone's sensors on the Arduino
arduino ble bluetooth bluetooth-low-energy data phyphox sensors
Last synced: 16 Jan 2026
https://github.com/Azure/azure-data-labs-modules
A list of Terraform modules to build your Azure Data IaC templates.
analytics azure data github github-actions labs terraform terraform-modules
Last synced: 06 May 2025
https://github.com/josephrp/datatonic
🌟 DataTonic: A Data-Capable AGI-style Agent Builder of Agents that creates swarms, runs commands, and securely processes and creates datasets, databases, visualisations, and analyses.
agent-builder agi autogen azure chroma data data-science data-visualization database memgpt semantic-kernel semantic-memory taskweaver
Last synced: 11 Oct 2025
https://github.com/aws-solutions/automated-data-analytics-on-aws
The Automated Data Analytics on AWS solution provides an end-to-end data platform for ingesting, transforming, managing and querying datasets. This helps analysts and business users manage and gain insights from data without deep technical experience using Amazon Web Services (AWS).
Last synced: 17 Apr 2025
https://github.com/JujuAdams/SNAP
Data format converters for GameMaker LTS 2022
array data gamemaker gamemaker-studio-2 gms2 ini json messagepack struct xml
Last synced: 01 Apr 2025
https://github.com/jujuadams/snap
Data format converters for GameMaker LTS 2022
array data gamemaker gamemaker-studio-2 gms2 ini json messagepack struct xml
Last synced: 06 Apr 2025
https://github.com/volorf/paster
Paste text data from the clipboard directly into Sketch text layers [Sketch plugin]
clipboard data plugin sketch sketch-plugin text
Last synced: 21 Mar 2025
https://github.com/joaocarmo/react-smart-data-table
A smart data table component for React meant to be configuration free
data data-table data-visualization plug-and-play react
Last synced: 13 Apr 2025
https://github.com/stanfordnlp/edu-convokit
Edu-ConvoKit: An Open-Source Framework for Education Conversation Data
data data-analysis data-science education language natural-language-processing
Last synced: 15 Apr 2025
https://github.com/malloydata/publisher
Publisher is the open-source semantic model server for the Malloy data language. It lets you define semantic models once — and use them everywhere.
analytics business-intelligence data data-modeling data-transformation data-visualization database semantic-modeling transformation
Last synced: 06 May 2026
https://github.com/richienb/ros-data-waster
The easiest way to waste your data.
Last synced: 19 Jun 2025
https://github.com/wildflowai/platform
Model natural ecosystems 🌎🪸🐳
ai biodiversity conservation data ocean restoration
Last synced: 11 Apr 2026
https://github.com/packtworkshops/the-data-visualization-workshop
A New, Interactive Approach to Learning Data Visualization
bokeh data data-visualization data-wrangling figures geoplotlib matplotlib numpy pandas plots python seaborn
Last synced: 15 Apr 2025
https://github.com/azure/azure-data-labs-modules
A list of Terraform modules to build your Azure Data IaC templates.
analytics azure data github github-actions labs terraform terraform-modules
Last synced: 05 Jul 2025
https://github.com/slowkow/tftargets
🎯 Human transcription factor target genes from 6 databases in convenient R format.
bioinformatics data rstats transcription-factors
Last synced: 14 Apr 2025
https://github.com/runprism/prism
Prism is the easiest way to develop, orchestrate, and execute data pipelines in Python.
bigquery data data-analysis data-engineering data-integration data-orc data-science dbt etl etl-pipeline machine-learning orchestration pipeline postgres python redshift snowflake trino
Last synced: 19 May 2026
https://github.com/ropensci/opentripplanner
An R package to set up and use OpenTripPlanner (OTP) as a local or remote multimodal trip planner.
data isochrones java opentripplanner otp public-transport r routing transport transportation-planning
Last synced: 08 Oct 2025
https://github.com/Baukebrenninkmeijer/table-evaluator
Evaluate real and synthetic datasets against each other
data data-evaluation evaluation generation synthetic synthetic-data table-evaluator
Last synced: 02 May 2025
https://github.com/Hebilicious/vue-query-nuxt
A lightweight, 0 config Nuxt Module for Vue Query.
data data-fetching fetch nuxt react-query tanstack tanstack-query vue vue-query
Last synced: 02 Aug 2025
https://github.com/knime/knime-python
KNIME Python Integration
arrow data flatbuffer integration knime python science workflow
Last synced: 21 Jan 2026
https://github.com/drivy/checker_jobs
Regression testing for data
data regression-testing ruby sidekiq
Last synced: 06 Apr 2025
https://github.com/ContextData/VectorETL
Build super simple end-to-end data & ETL pipelines for your vector databases and Generative AI applications
cohere data datapipeline etl etl-framework etl-pipeline openai pinecone python qdrant qdrant-vector-database unstructured vector-database weaviate
Last synced: 22 Sep 2025
https://github.com/hebilicious/vue-query-nuxt
A lightweight, 0 config Nuxt Module for Vue Query.
data data-fetching fetch nuxt react-query tanstack tanstack-query vue vue-query
Last synced: 04 Apr 2025
https://github.com/open-discourse/open-discourse
Open Discourse is the first fully comprehensive corpus of the plenary proceedings of the federal German Parliament (Bundestag).
bundestag corpus data hacktoberfest
Last synced: 14 Mar 2025
https://github.com/fityannugroho/idn-area-map
The map of Indonesia's administrative areas 🇮🇩🌏
data hacktoberfest idn-area indonesia island map nextjs tailwindcss wilayah
Last synced: 07 Apr 2025
https://github.com/torkleyy/nitric
[ABANDONED] General-purpose data processing library. Mirror of https://gitlab.com/nitric/nitric
data ecs entity-component processing
Last synced: 20 Aug 2025
https://github.com/mahmoudparsian/pyspark-algorithms
PySpark Algorithms Book: https://www.amazon.com/dp/B07X4B2218/ref=sr_1_2
algorithms big-data data data-abstractions data-science dataframe distributed-computing graphframes mapreduce monoid nosql partitioning pyspark pyspark-algorithms python rdd spark transformations
Last synced: 07 Apr 2025
https://github.com/f1lt3r/bitcoin-scraper
💲 bitcoin chart history scraper
bitcoin bitcoin-scraper chart data graph javascript json logarithmic moore scrapes-bitcoin-data
Last synced: 09 Mar 2026
https://github.com/jbzoo/data
Extended implementation of ArrayObject: a useful collection for any config in your system (write, read, store, change, validate, convert to other formats, etc.).
arrayobject config converts data filters ini jbzoo php yml
Last synced: 05 Apr 2025
https://github.com/uwdata/flechette
Fast, lightweight access to Apache Arrow data.
Last synced: 04 Apr 2025
https://github.com/opensource-observer/oss-directory
A curated directory of open source software (OSS) projects and their associated artifacts
data github open-source public-goods research
Last synced: 08 Oct 2025
https://victorcouste.github.io/data-tools/
Data Tools Subjective List
awesome awesome-list data data-architecture data-tools datatools list modern open-source opensource tools
Last synced: 10 May 2025
https://github.com/ngxs-labs/data
NGXS Persistence API
data entity ngxs ngxs-persistence-api
Last synced: 24 Apr 2025
https://github.com/zhangyoujia/hd_write_verify
The LBA tools (hd_write_verify and hd_write_verify_dump) are useful for testing storage stability and verifying data consistency; the author describes them as much better than the verification functions of FIO and vdbench. Examples: physical disk: ide/sata/scsi/ssd/iscsi/fc/raid/...; virtual disk: loop/nbd/lvm/soft raid/...; VM disk: ide/sata/scsi/virtio-blk/virtio-scsi/...
consistency data filesystem migration physical-disk snapshot stability storage testing verifying virtual-disk vm-backup vm-disk
Last synced: 27 Feb 2026
https://github.com/spine-tools/Spine-Toolbox
Spine Toolbox is an open source Python package to manage data, scenarios and workflows for modelling and simulation. You can have your local workflow, but work as a team through version control and SQL databases.
anaconda data energy miniconda python simulation-model spine-toolbox workflow
Last synced: 07 May 2025
https://github.com/visgl/deck.gl-data
Data for the data visualization library deck.gl examples (https://uber.github.io/deck.gl/#/)
data data-science data-visualization uber
Last synced: 12 Jun 2025
https://github.com/ashvin27/react-datatable
React-datatable is a component that provides the ability to create a multifunctional table using a single component, like jQuery DataTables. It is fully customizable and easy to integrate into any React component. Bootstrap compatible.
data datatables datatables-plugin react react-data-table react-datagrid react-datatable react-table table
Last synced: 13 May 2025
https://github.com/smappnyu/youtube-data-api
A Python client for collecting and parsing public data from the YouTube Data API
api api-wrapper data python python-client research research-tool youtube youtube-api-v3 youtube-search
Last synced: 28 Oct 2025
https://github.com/tirthajyoti/synthetic-data-gen
Various methods for generating synthetic data for data science and ML
classification data data-science machine-learning python regression symbolic-computation time-series
Last synced: 30 Apr 2025
https://github.com/mainakrepositor/datasets
A bunch of some 200 datasets. You can call it mini-kaggle :)
csv data data-science database datasets image-files mini-kaggle ml nlp-machine-learning tsv
Last synced: 01 Mar 2025
https://github.com/luanborelli/ipeadatapy
ipeadatapy is a data and metadata extraction package written in Python that uses the official API of the Ipeadata database. In essence, it is an API wrapper.
api api-wrapper brazil dados-abertos dados-historicos data data-analysis datasets econometrics economic-data economics geographic-data geography ipea ipeadata wrapper
Last synced: 07 Apr 2026